A workshop co-located with NoDaLiDa/Baltic-HLT in Tallinn, Estonia, on March 5th, 2025.
More information to come in the following days!
Note: the deadline has been extended to December 19th 2024!
Quick Links
- Venue
- Registration
- Shared Task
- Invited Speakers
- Description of the Workshop
- Submission Information
- Important Dates
- Workshop Organizers
- Sponsor Information
Venue
This year's NLP4CALL workshop will be organized as a hybrid event. The workshop will take place physically in Tallinn, Estonia, on March 5th, 2025, and online via Zoom. The Zoom link will be sent out to registered participants.
The physical location will be Hestia Hotel Europa (Address: Paadi 5, Tallinn, Estonia) [View on Google Maps]. For on-site participation, the links and maps will be provided on the main conference website: venue.
Registration
Registration is done via the NoDaLiDa/Baltic-HLT registration website, which will be made available in December.
Shared Task
This year we are offering the MultiGEC shared task on multilingual grammatical error correction for L2 language learners. There are 12 target languages covered, namely Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish, and Ukrainian. This shared task is organized by the Computational SLA working group.
For more information, please see the Shared Task website: https://github.com/spraakbanken/multigec-2025/.
Invited Speakers
This year we are pleased to announce two invited speakers:
Andrew Caines: The Potential and the Pitfalls of Very Large Language Models for Language Learning Applications
Bio: Andrew Caines is a Senior Research Associate based in the Computer Laboratory at the University of Cambridge, U.K. He has been a member of the Institute for Automated Language Teaching & Assessment (ALTA) since its inception in 2013. His research interests relate to education technology for language learning, including corpus creation, automated essay scoring, grammatical error detection and correction, adaptive learning, content creation and the training of smaller, domain-specific language models.
More info and an abstract for the talk will come soon.
Peter Uhrig: AI-assisted (Pedagogical) Constructicography – Opportunities and Challenges
Peter Uhrig comes from Friedrich-Alexander-Universität in Germany.
Abstract: With the rise of large machine-readable computer corpora, lexicographers found themselves in the (un-)comfortable situation of having enough data on words and their combinations, but often at such a volume that it became difficult to select the most relevant information to include in the dictionary. We have shown in previous work (Uhrig & Proisl 2012, Evert et al. 2017, Uhrig et al. 2018) how the application and the understanding of NLP and statistics can improve the extraction of collocation candidates from corpora and thus provide the lexicographer with more accurate and relevant lists of collocation candidates. Given that current large language models distill linguistic knowledge from corpora that are much larger than the ones we used in our previous research, it appears only logical to turn to them for even more improved input on collocations. In this talk, I will explore prompting strategies on standard models and show how fine-tuning can be used to turn such models into constructicographers that write full collocations dictionary entries. I will include a cursory evaluation of the resulting entries and a brief discussion about the usefulness (or lack thereof) of collocations dictionaries in the age of LLMs.
References:
- Evert et al. 2017
- Uhrig, P., & Proisl, T. (2012). Less hay, more needles – using dependency-annotated corpora to provide lexicographers with more accurate lists of collocation candidates. Lexicographica, 28, 141 - 180.
- Uhrig et al. 2018
Description of the Workshop
The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on integrating Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. The latter includes, among others, the integration of insights from Second Language Acquisition (SLA) research, and the promotion of “Computational SLA” through setting up Second Language research infrastructures.
The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This has given the area of research its name: Intelligent CALL, or ICALL for short. As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories, and vice versa, where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools. The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.
We welcome papers:
- that describe research directly aimed at ICALL;
- that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application, or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
- that discuss challenges and/or research agenda for ICALL;
- that describe empirical studies on language learner data; or
- that explore the use of LLMs and Generative AI to develop ICALL tools.
In this edition of the workshop a special focus is given to:
- Grammatical error correction, with a special track for the MultiGEC shared task.
- The use of pedagogically oriented constructicographic resources (constructicons), with an emphasis on their practical application in ICALL. By constructicographic resources, we refer to resources that describe various types of constructions associated with specific meanings or functions, ranging from fully schematic and semi-schematic constructions (e.g., those with both fixed and variable elements) to specific lexical expressions.
We particularly encourage software demonstrations showcasing the potential use of existing Language and Speech Technologies or resources in ICALL applications for Nordic and Finno-Ugric languages.
Submission Information
We accept both short and long papers, as well as demo papers. The submissions must describe original and unpublished work.
Paper length:
Papers sent to the NLP4CALL workshop must adhere to the following page limits:
- Short and demo papers must be between 4 and 7 pages.
- Long papers must be between 8 and 12 pages.
- Shared task papers must be between 4 and 12 pages.
Other considerations regarding the paper length are as follows:
- Papers can have an unlimited number of pages for references.
- Appendices are allowed and are not counted in the page count. However, the main body of the paper has to be self-contained. That is, reviewers are not expected to read the appendices.
- Camera-ready versions of accepted papers will be given an additional page to address reviewer comments.
Papers should describe original unpublished work or work-in-progress and will be peer-reviewed by at least two members of the program committee in a double-blind fashion. All accepted papers will be collected into a proceedings volume to be published both in the NEALT Proceedings Series and through the ACL Anthology.
Submissions are made through EasyChair: https://easychair.org/my/conference?conf=nlp4call2025
The links to the LaTeX and Word templates can be found here: https://github.com/NLP4CALL/current/blob/website/_includes/other_info/submission_information.md
Important Dates
All deadlines are Anywhere on Earth (AoE).
- Submission date: December ~~16th~~ 19th, 2024
- Acceptance notification: January 20th, 2025
- Camera-ready papers: February 2nd, 2025
- Workshop date: March 5th, 2025
Workshop Organizers
- Ricardo Muñoz Sánchez, Språkbanken Text, University of Gothenburg, Sweden
- David Alfter, Gothenburg Research Infrastructure in Digital Humanities (GRIDH), University of Gothenburg, Sweden
- Elena Volodina, Språkbanken Text, University of Gothenburg, Sweden
- Jelena Kallas, Institute of the Estonian Language, Estonia
Information about Sponsors and Other Supporters
This workshop is supported jointly by:
- The project Expanding the scope of a multi-purpose lexicographic resource to grammar and L2 competence, funded by the Estonian Research Council grant (PRG 1978).
- The project Grandma Karl is 27 years old: Automatic pseudonymization of research data, funded by the Swedish Research Council (grant number 2022-02311).
- The research infrastructure Språkbanken, jointly funded by its 10 partner institutions and the Swedish Research Council (2018–2024; dnr 2017-00626).