NLP4CALL 2023 & MultiGED Shared Task
Nodalida workshop, Tórshavn, Faroe Islands, May 22, 2023
Quick links
- COVID information
- Venue
- Registration
- Program
- Shared task
- Invited speakers
- Workshop description
- Submission information
- Important dates
- Program committee
- Workshop organizers
- Related links
- ICALL-related mailing lists
Zoom link for online participation
The Zoom link for online participation is here: Join on Zoom
Proceedings
The proceedings are available now: NLP4CALL 2023 proceedings
Sponsor information
The workshop is co-sponsored by the project Grandma Karl is 27 years old (funded by the Swedish Research Council) and Språkbanken Text through its support for the organization of the MultiGED-2023 shared task.
COVID information
Concerning COVID-related issues, we follow the official recommendations. For up-to-date information, please see https://korona.fo/travel
Venue
This year's NLP4CALL workshop will be organized as a hybrid event. The workshop will take place on May 22, 2023, on site in Tórshavn, Faroe Islands, and online via Zoom. The Zoom link will be sent out to registered participants.
The physical location will be Sjóvinnuhúsið (Address: Vestara Bryggja 15, Tórshavn) [View on Google Maps].
For on-site participation, the links and maps are provided on the main conference website: venue.
Registration
Registration is done via the Nodalida registration website: Registration
Program
Please note that all times are given in UTC+1.
09:00 - 09:10 | Opening session |
09:10 - 10:00 | Invited talk 1: TELL: Tasks Engaging Language Learners | Marije Michel (Chair: Elena Volodina) |
Session 1 | Chair: Elena Volodina |
10:00 - 10:20 | On the Relevance and Learner Dependence of Co-text Complexity for Exercise Difficulty | Tanja Heck and Detmar Meurers |
10:20 - 10:40 | Coffee break |
Session 2 | Chair: Ricardo Muñoz Sanchez |
10:40 - 11:00 | MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection [Slides] | Elena Volodina, Christopher Bryant, Andrew Caines, Orphée De Clercq, Jennifer-Carmen Frey, Elizaveta Ershova, Alexandr Rosen and Olga Vinogradova |
11:00 - 11:15 | Two Neural Models for Multilingual Grammatical Error Detection | Phuong Le-Hong, The Quyen Ngo and Thi Minh Huyen Nguyen |
11:15 - 11:30 | A distantly supervised Grammatical Error Detection/Correction system for Swedish | Murathan Kurfalı and Robert Östling |
11:30 - 11:45 | The NTNU System in MultiGED-2023: Contextual Flair Embeddings for Multilingual Grammatical Error Detection | Lars Bungum, Björn Gambäck and Arild Brandrud Næss |
11:45 - 12:00 | EliCoDe at MultiGED2023: fine-tuning XLM-RoBERTa for multilingual grammatical error detection | Davide Colla, Matteo Delsanto and Elisa Di Nuovo |
12:00 - 13:00 | Lunch |
13:00 - 13:50 | Invited talk 2: Privacy-enhancing NLP: a primer | Pierre Lison (Chair: Hercules Dalianis) |
Session 3 | Chair: Arianna Masciolini |
13:50 - 14:05 | Automated Assessment of Task Completion in Spontaneous Speech for Finnish and Finland Swedish Language Learners | Ekaterina Voskoboinik, Yaroslav Getman, Ragheb Al-Ghezi, Mikko Kurimo and Tamas Grosz |
14:05 - 14:20 | Speech Technology to Support Phonics Learning for Kindergarten Children at Risk of Dyslexia | Stine Fuglsang Engmose and Peter Juel Henrichsen |
14:20 - 14:40 | Coffee break |
Session 4 | Chair: Ekaterina Voskoboinik |
14:40 - 14:55 | DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish [Slides] | Elena Volodina, Yousuf Ali Mohammed, Aleksandrs Berdicevskis, Gerlof Bouma and Joey Öhman |
14:55 - 15:10 | Experiments on Automatic Error Detection and Correction for Uruguayan Learners of English | Romina Brown, Santiago Paez, Gonzalo Herrera, Luis Chiruzzo and Aiala Rosá |
15:10 - 15:25 | Manual and Automatic Identification of Similar Arguments in EFL Learner Essays | Ahmed Mousa, Ronja Laarmann-Quante and Andrea Horbach |
15:25 - 15:45 | Sequence Tagging in EFL Email Texts as Feedback for Language Learners | Yuning Ding, Ruth Trüb, Johanna Fleckenstein, Stefan Keller and Andrea Horbach |
15:45 - 15:50 | Closing session |
17:00 - ... | Conference welcome reception at Müllers Pakkhús [Directions from the workshop venue on Google Maps] |
Shared Task
NEW for this year is the MultiGED shared task on token-level error detection for L2 Czech, English, German, Italian and Swedish, organized by the Computational SLA working group.
For more information, please see the Shared Task website: https://github.com/spraakbanken/multiged-2023.
Invited speakers
This year we have the pleasure of welcoming two invited speakers:
Marije Michel (PhD Applied Linguistics, University of Amsterdam) is chair of Language Learning at Groningen University in the Netherlands. Her research and teaching focus on second language acquisition and processing with specific attention to task-based language pedagogy, digitally-mediated interaction and writing in a second language.
TELL: Tasks Engaging Language Learners
In a task-based approach to language teaching, learning and assessment (TBLT), the basic unit of second language (L2) instruction is a task. Tasks are (pedagogic) activities that adhere to specific criteria (e.g., there needs to be a communicative gap; Skehan, 1998) in order to ensure that learners engage in meaningful language use during task performance. In the long run, only tasks engaging students in authentic language use may lead to L2 processes that have the potential to support L2 acquisition. In this presentation, I will review the most important principles of designing engaging learning tasks, highlight examples of practice-induced L2 research using digital tools, and showcase some of my own work on task design for L2 learning during digitally mediated communication and L2 writing. In doing so, I will discuss the NLP measures we use to evaluate task-based performance, formulating TBLT desiderata for the future of NLP4CALL.
Pierre Lison is a senior researcher at the Norwegian Computing Center, a research institute located in Oslo and conducting research in computer science, statistical modelling and machine learning. Pierre’s research interests include privacy-enhancing NLP, spoken dialogue systems, multilingual corpora and weak supervision. Pierre currently leads the CLEANUP project on data-driven models for text sanitization. He also holds a part-time position as associate professor at the University of Oslo.
Privacy-enhancing NLP: a primer
Text documents often contain personal data in some form – either related to the authors themselves or to some other individuals mentioned in the text. This raises privacy concerns, especially when those documents are to be published online or included as training data for NLP models. For Computer-Assisted Language Learning, this problem is compounded by the presence of various lexical and grammatical errors that may provide additional cues as to the identity of the author. Fortunately, privacy-enhancing techniques can be applied to provide at least some level of privacy, both for the author of the text (or linguistic production) and for the other individuals that may be referred to in it. In this talk, I’ll review some of those techniques, such as text sanitization, text rewriting, and privacy-preserving training. I’ll also describe our own work on data-driven text sanitization based on explicit measures of privacy risks, and present how such methods can be evaluated using our recently released Text Anonymization Benchmark (TAB).
Description of the workshop
The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. This work includes, among other things, drawing on insights from Second Language Acquisition (SLA) research, on the one hand, and promoting the development of “Computational SLA” by setting up Second Language research infrastructure(s), on the other.
The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This has given the area of research its name: Intelligent CALL (ICALL). As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories, and vice versa, where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools.
The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.
We welcome papers:
- that describe research directly aimed at ICALL;
- that demonstrate actual or discuss the potential use of existing Language and Speech Technologies or resources for language learning;
- that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
- that discuss challenges and/or research agendas for ICALL;
- that describe empirical studies on language learner data.
This year a special focus is given to work done on error detection/correction and feedback generation.
We encourage paper presentations and software demonstrations describing the above-mentioned themes primarily, but not exclusively, for the Nordic languages. This workshop follows a series of workshops on NLP for CALL organized by a Special Interest Group in Intelligent Computer-Assisted Language Learning (SIG-ICALL) of NEALT; see the Related links section.
The workshop series has been previously financed by the Centre for Language Technology (University of Gothenburg), the SweLL project (University of Gothenburg), the Swedish Research Council's conference grant, Språkbanken-Text, the L2 profiling project, the Center for Natural Language Processing (Cental) at the Université catholique de Louvain (UCLouvain), and itec at the Katholieke Universiteit Leuven (KUL).
For the past eleven years we have successfully co-located NLP4CALL with the two major Language Technology events in Scandinavia, SLTC and NoDaLiDa, thus making this workshop an annual event. We intend to continue this tradition of organizing the workshop around NoDaLiDa and SLTC. Through this workshop, we intend to profile ICALL research in Europe and in Nordic countries in particular, and to provide a dissemination venue for researchers active in this area.
Submission information
We will be using the NLP4CALL templates for the workshop this year.
IMPORTANT: For licensing reasons, all camera-ready papers must include the following sentence as an unmarked (unnumbered) footnote on the first page of the paper: This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/. NEW: Please note that the footnote will automatically be added to the final version for the LaTeX template (and Overleaf template once approved).
Submissions that do not adhere to the author guidelines will be rejected without review.
The footnote can be added by placing the following code before the abstract:
\let\thefootnote\relax\footnotetext{This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/.}
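As a rough illustration of where that code sits in a paper, here is a minimal sketch. It assumes the standard article class as a stand-in for the NLP4CALL template, and the title, author and body text are placeholders rather than anything prescribed by the workshop.

```latex
% Minimal sketch: the standard article class stands in for the
% NLP4CALL template (an assumption, not the official setup).
\documentclass{article}

\title{Placeholder Title}  % hypothetical title
\author{Author Name}       % hypothetical author

\begin{document}
\maketitle

% Licence footnote from the submission instructions, placed before the abstract.
% \let\thefootnote\relax suppresses the footnote mark, so the note is unnumbered.
\let\thefootnote\relax\footnotetext{This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/.}

\begin{abstract}
Placeholder abstract text.
\end{abstract}

Placeholder body text.

\end{document}
```

Note that \let\thefootnote\relax stays in effect for the rest of the document in this sketch, so any later footnotes would also lose their marks. If the paper uses numbered footnotes, one option is to wrap the line in \begingroup ... \endgroup to limit its scope.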
Authors are invited to submit long papers (8-12 pages) or, alternatively, short or demo papers (4-7 pages); page counts do not include references. Please indicate one relevant paper type at submission time. Only PDF files will be accepted. Submissions will be managed through the electronic conference management system (link will be added soon). Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.
Papers should describe original unpublished work or work-in-progress. Every paper will be reviewed by at least 2 members of the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/soundness, meaningful comparison, significance and impact of ideas or results.
Papers describing systems for the MultiGED-2023 shared task should be submitted to a special track and should follow the same page restrictions as the NLP4CALL papers (in the range of 4-12 pages). System descriptions will be reviewed by the MultiGED organizers.
All accepted papers will be collected into a proceedings volume to be submitted for publication in the Nodalida volume (see Nodalida page for details) and, additionally, double-published through the ACL Anthology, following experiences from previous workshops, e.g. the 11th NLP4CALL.
Important dates
26 January 2023 | First call for papers |
20 February 2023 | Second call for papers |
20 March 2023 | Third call for papers |
27 March 2023 | Final call for papers |
03 April 2023 | Paper submission deadline (short, long and demo) |
21 April 2023 | Notification of acceptance |
01 May 2023 | Camera-ready papers for publication |
22 May 2023 | Workshop date |
Program committee (preliminary)
- David Alfter, Université catholique de Louvain, Belgium and University of Gothenburg, Sweden
- Serge Bibauw, Universidad Central del Ecuador, Ecuador
- Claudia Borg, University of Malta, Malta
- António Branco, Universidade de Lisboa, Portugal
- Andrew Caines, University of Cambridge, UK
- Xiaobin Chen, Universität Tübingen, Germany
- Frederik Cornillie, University of Leuven, Belgium
- Kordula de Kuthy, Universität Tübingen, Germany
- Piet Desmet, University of Leuven, Belgium
- Thomas François, Université catholique de Louvain, Belgium
- Thomas Gaillat, Université Rennes 2, France
- Johannes Graën, University of Zurich, Switzerland
- Andrea Horbach, FernUniversität Hagen, Germany
- Arne Jönsson, Linköping University, Sweden
- Ronja Laarmann-Quante, FernUniversität Hagen, Germany
- Herbert Lange, University of Hamburg, Germany
- Peter Ljunglöf, University of Gothenburg, Sweden and Chalmers Institute of Technology, Sweden
- Margot Mieskes, University of Applied Sciences Darmstadt, Germany
- Lionel Nicolas, EURAC research, Italy
- Ulrike Pado, Hochschule für Technik Stuttgart, Germany
- Magali Paquot, Université catholique de Louvain, Belgium
- Evelina Rennes, Linköping University, Sweden
- Egon Stemle, EURAC research, Italy
- Francis M. Tyers, Indiana University Bloomington, US
- Sowmya Vajjala, National Research Council, Canada
- Elena Volodina, University of Gothenburg, Sweden
- Zarah Weiss, Universität Tübingen, Germany
- Torsten Zesch, FernUniversität Hagen, Germany
- Ramon Ziai, Universität Tübingen, Germany
- Robert Östling, Stockholm University, Sweden
Workshop organizers
- David Alfter, Gothenburg Research Infrastructure in Digital Humanities (GRIDH), University of Gothenburg, Gothenburg, Sweden; david dot alfter at gu dot se (Organizing chair)
- Elena Volodina, Språkbanken Text, Department of Swedish, Multilingualism, Language Technology, University of Gothenburg, Sweden; elena dot volodina at svenska dot gu dot se
- Thomas François, CENTAL, Institute for Language and Communication, Université catholique de Louvain, Belgium
- Arne Jönsson, Department of Computer and Information Science, Linköping University, Sweden
- Evelina Rennes, Department of Computer and Information Science, Linköping University, Sweden
Financial acknowledgements (TBA)
For the past eleven years we have successfully co-located NLP4CALL with the two major Language Technology events in Scandinavia, SLTC and NoDaLiDa, thus making this workshop an annual event. We intend to continue this tradition. Through this workshop, we intend to profile ICALL research in Nordic countries and to provide a dissemination venue for researchers active in this area.
Related links
- 11th workshop on NLP4CALL
- 10th workshop on NLP for CALL
- 9th workshop on NLP for CALL
- 8th workshop on NLP for CALL
- 7th workshop on NLP for CALL
- Joint 6th workshop on NLP for CALL and 2nd workshop on NLP for Research on Language Acquisition
- Joint 5th workshop on NLP for CALL and 1st workshop on NLP for Research on Language Acquisition
- 4th workshop on NLP for CALL
- 3rd workshop on NLP for CALL
- 2nd workshop on NLP for CALL
- 1st workshop on NLP for CALL
- SIG-ICALL, NEALT
ICALL-relevant mailing lists
One mailing list distributes ICALL-relevant information; it is run by the BEA workshop organizers (bea.nlp.workshop@gmail.com). We encourage you to join it to stay up to date on events, publications and discussions in the area.
To join the BEA list, contact Andrea Horbach (andrea.horbach@uni-due.de). The BEA mailing list distributes information in digest form approximately four times a year.
For NLP4CALL inquiries, please email David Alfter (david dot alfter at gu dot se).