Språkbanken Text is a department within Språkbanken.

12th NLP4CALL

NLP4CALL 2023 & MultiGED Shared Task

Nodalida workshop, Tórshavn, Faroe Islands, May 22 2023

Quick links

Zoom link for online participation

The Zoom link for online participation is here: Join on Zoom

Proceedings

The proceedings are available now: NLP4CALL 2023 proceedings

Sponsor information

The workshop is co-sponsored by the project Grandma Karl is 27 years old (funded by the Swedish Research Council) and Språkbanken Text through its support for the organization of the MultiGED-2023 shared task.

COVID information

Concerning COVID-related issues, we follow the official recommendations. For up-to-date information, please see https://korona.fo/travel

Venue

This year's NLP4CALL workshop will be organized as a hybrid event. The workshop will physically take place in Tórshavn, Faroe Islands on May 22, 2023 and online on Zoom. The Zoom link will be sent out to registered participants.

The physical location will be Sjóvinnuhúsið (Address: Vestara Bryggja 15, Tórshavn) [View on Google Maps].
For on-site participation, the links and maps are provided on the main conference website: venue.

Registration

Registration is done via the Nodalida registration website: Registration

Program

Please note that all times are given in UTC+1.

09:00 - 09:10   Opening session
09:10 - 10:00   Invited talk 1
TELL: Tasks Engaging Language Learners
Marije Michel
Chair: Elena Volodina
    Session 1
    Chair: Elena Volodina
10:00 - 10:20   On the Relevance and Learner Dependence of Co-text Complexity for Exercise Difficulty
Tanja Heck and Detmar Meurers
10:20 - 10:40   Coffee break
    Session 2
    Chair: Ricardo Muñoz Sanchez
10:40 - 11:00   MultiGED-2023 shared task at NLP4CALL: Multilingual Grammatical Error Detection [Slides]
Elena Volodina, Christopher Bryant, Andrew Caines, Orphée De Clercq, Jennifer-Carmen Frey, Elizaveta Ershova, Alexandr Rosen and Olga Vinogradova
11:00 - 11:15   Two Neural Models for Multilingual Grammatical Error Detection
Phuong Le-Hong, The Quyen Ngo and Thi Minh Huyen Nguyen
11:15 - 11:30   A distantly supervised Grammatical Error Detection/Correction system for Swedish
Murathan Kurfalı and Robert Östling
11:30 - 11:45   The NTNU System in MultiGED-2023: Contextual Flair Embeddings for Multilingual Grammatical Error Detection
Lars Bungum, Björn Gambäck and Arild Brandrud Næss
11:45 - 12:00   EliCoDe at MultiGED2023: fine-tuning XLM-RoBERTa for multilingual grammatical error detection
Davide Colla, Matteo Delsanto and Elisa Di Nuovo
12:00 - 13:00   Lunch
13:00 - 13:50   Invited talk 2
Privacy-enhancing NLP: a primer
Pierre Lison
Chair: Hercules Dalianis
    Session 3
    Chair: Arianna Masciolini
13:50 - 14:05   Automated Assessment of Task Completion in Spontaneous Speech for Finnish and Finland Swedish Language Learners
Ekaterina Voskoboinik, Yaroslav Getman, Ragheb Al-Ghezi, Mikko Kurimo and Tamas Grosz
14:05 - 14:20   Speech Technology to Support Phonics Learning for Kindergarten Children at Risk of Dyslexia
Stine Fuglsang Engmose and Peter Juel Henrichsen
14:20 - 14:40   Coffee break
    Session 4
    Chair: Ekaterina Voskoboinik
14:40 - 14:55   DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish [Slides]
Elena Volodina, Yousuf Ali Mohammed, Aleksandrs Berdicevskis, Gerlof Bouma and Joey Öhman
14:55 - 15:10   Experiments on Automatic Error Detection and Correction for Uruguayan Learners of English
Romina Brown, Santiago Paez, Gonzalo Herrera, Luis Chiruzzo and Aiala Rosá
15:10 - 15:25   Manual and Automatic Identification of Similar Arguments in EFL Learner Essays
Ahmed Mousa, Ronja Laarmann-Quante and Andrea Horbach
15:25 - 15:45   Sequence Tagging in EFL Email Texts as Feedback for Language Learners
Yuning Ding, Ruth Trüb, Johanna Fleckenstein, Stefan Keller and Andrea Horbach
15:45 - 15:50   Closing session
17:00 - ...   Conference welcome reception at Müllers Pakkhús
[Directions from the workshop venue on Google Maps]

Shared Task

MultiGED 2023

NEW for this year is the MultiGED shared task on token-level error detection for L2 Czech, English, German, Italian and Swedish, organized by the Computational SLA working group.

For more information, please see the Shared Task website: https://github.com/spraakbanken/multiged-2023.

Invited speakers

This year we have the pleasure of welcoming two invited speakers:


Marije Michel

Marije Michel (PhD Applied Linguistics, University of Amsterdam) is chair of Language Learning at Groningen University in the Netherlands. Her research and teaching focus on second language acquisition and processing with specific attention to task-based language pedagogy, digitally-mediated interaction and writing in a second language.

TELL: Tasks Engaging Language Learners
Taking a task-based approach to language teaching, learning and assessment (TBLT), the basic unit of second language (L2) instruction is a task. Tasks are (pedagogic) activities that adhere to specific criteria (e.g., there needs to be a communicative gap, Skehan, 1998) in order to ensure that learners engage in meaningful language use during task performance. In the long run, only tasks engaging students in authentic language use may lead to L2 processes that have the potential to support L2 acquisition. In this presentation, I will review the most important principles of designing engaging learning tasks, highlight examples of practice-induced L2 research using digital tools, and showcase some of my own work on task design for L2 learning during digitally mediated communication and L2 writing. In doing so, I will discuss the NLP measures we use to evaluate task-based performance, formulating TBLT desiderata for the future of NLP4CALL.

Pierre Lison

Pierre Lison is a senior researcher at the Norwegian Computing Center, a research institute located in Oslo and conducting research in computer science, statistical modelling and machine learning. Pierre’s research interests include privacy-enhancing NLP, spoken dialogue systems, multilingual corpora and weak supervision. Pierre currently leads the CLEANUP project on data-driven models for text sanitization. He also holds a part-time position as associate professor at the University of Oslo.

Privacy-enhancing NLP: a primer
Text documents often contain personal data in some form – either related to the authors themselves or to some other individuals mentioned in the text. This raises privacy concerns, especially when those documents are to be published online or included as training data for NLP models. For Computer-Assisted Language Learning, this problem is compounded by the presence of various lexical and grammatical errors that may provide additional cues as to the identity of the author. Fortunately, privacy-enhancing techniques can be applied to provide at least some level of privacy, both for the author of the text (or linguistic production) and for the other individuals that may be referred to in it. In this talk, I’ll review some of those techniques, such as text sanitization, text rewriting, and privacy-preserving training. I’ll also describe our own work on data-driven text sanitization based on explicit measures of privacy risks, and present how such methods can be evaluated using our recently released Text Anonymization Benchmark (TAB).

Description of the workshop

The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. The latter include, among others, drawing on insights from Second Language Acquisition (SLA) research, on the one hand, and promoting the development of “Computational SLA” by setting up Second Language research infrastructure(s), on the other.

The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This has given the area of research its name: Intelligent CALL (ICALL). As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories, and vice versa, where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools.

The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.

We welcome papers:

  • that describe research directly aimed at ICALL;
  • that demonstrate actual or discuss the potential use of existing Language and Speech Technologies or resources for language learning;
  • that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
  • that discuss challenges and/or a research agenda for ICALL;
  • that describe empirical studies on language learner data.

This year a special focus is given to work done on error detection/correction and feedback generation.

We encourage paper presentations and software demonstrations describing the above-mentioned themes primarily, but not exclusively, for the Nordic languages. This workshop follows a series of workshops on NLP for CALL organized by a Special Interest Group in Intelligent Computer-Assisted Language Learning (SIG-ICALL) of NEALT, see: .

The workshop series has been previously financed by the Centre for Language Technology (University of Gothenburg), the SweLL project (University of Gothenburg), the Swedish Research Council's conference grant, Språkbanken-Text, the L2 profiling project, the Center for Natural Language Processing (Cental) at the Université catholique de Louvain (UCLouvain), and itec at the Katholieke Universiteit Leuven (KUL).

For the past eleven years we have successfully co-located NLP4CALL with the two major Language Technology events in Scandinavia, SLTC and NoDaLiDa, thus making this workshop an annual event. We intend to continue this tradition of organizing the workshop around NoDaLiDa and SLTC. Through this workshop, we intend to profile ICALL research in Europe and in Nordic countries in particular, and to provide a dissemination venue for researchers active in this area.

Submission information

We will be using the NLP4CALL templates for the workshop this year.

IMPORTANT: For licensing reasons, all camera-ready papers must include the following sentence as an unmarked (unnumbered) footnote on the first page of the paper: This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/. NEW: Please note that the footnote will automatically be added to the final version for the LaTeX template (and Overleaf template once approved).

Submissions that do not adhere to the author guidelines will be rejected without review.

The footnote can be added by placing the following piece of code before the abstract: \let\thefootnote\relax\footnotetext{This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/.}
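
As a rough illustration of where this snippet might sit, the following is a minimal sketch assuming a plain article-class document rather than the actual NLP4CALL template; the title, author and abstract text are placeholders. The \begingroup/\endgroup wrapper is an addition of this sketch and not part of the snippet given by the organizers; it is there so that the change to \thefootnote stays local and later footnotes keep their usual marks.

```latex
\documentclass{article}

% Placeholder metadata; the real template defines its own title block.
\title{Paper Title}
\author{Anonymous}

\begin{document}
\maketitle

% Unnumbered licence footnote, placed before the abstract.
% The group keeps the change to \thefootnote local (an assumption of this sketch).
\begingroup
\let\thefootnote\relax
\footnotetext{This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/.}
\endgroup

\begin{abstract}
Placeholder abstract.
\end{abstract}

A later footnote\footnote{This one keeps its usual mark.} is unaffected.

\end{document}
```

Authors using the LaTeX template should not need this, since the footnote is added automatically to the final version there.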

Authors are invited to submit long papers (8-12 pages) or, alternatively, short or demo papers (4-7 pages); the page count does not include references. Please indicate one relevant paper type at submission time. Only PDF files will be accepted. Submissions will be managed through the electronic conference management system (link will be added soon). Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.

Make a submission

Papers should describe original unpublished work or work-in-progress. Every paper will be reviewed by at least 2 members of the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/soundness, meaningful comparison, significance and impact of ideas or results.

Papers describing systems for the MultiGED-2023 shared task should be submitted to a special track and should follow the same page restrictions as the NLP4CALL papers (in the range of 4-12 pages). System descriptions will be reviewed by the MultiGED organizers.

All accepted papers will be collected into a proceedings volume to be submitted for publication in the Nodalida volume (see Nodalida page for details) and, additionally, double-published through the ACL Anthology, following experiences from previous workshops, e.g. the 11th NLP4CALL.

Important dates

26 January 2023 First call for papers
20 February 2023 Second call for papers
20 March 2023 Third call for papers
27 March 2023 Final call for papers
03 April 2023 Paper submission deadline (short, long and demo)
21 April 2023 Notification of acceptance
01 May 2023 Camera-ready papers for publication
22 May 2023 Workshop date

Program committee (preliminary)

  • David Alfter, Université catholique de Louvain, Belgium and University of Gothenburg, Sweden
  • Serge Bibauw, Universidad Central del Ecuador, Ecuador
  • Claudia Borg, University of Malta, Malta
  • António Branco, Universidade de Lisboa, Portugal
  • Andrew Caines, University of Cambridge, UK
  • Xiaobin Chen, Universität Tübingen, Germany
  • Frederik Cornillie, University of Leuven, Belgium
  • Kordula de Kuthy, Universität Tübingen, Germany
  • Piet Desmet, University of Leuven, Belgium
  • Thomas François, Université catholique de Louvain, Belgium
  • Thomas Gaillat, Université Rennes 2, France
  • Johannes Graën, University of Zurich, Switzerland
  • Andrea Horbach, FernUniversität Hagen, Germany
  • Arne Jönsson, Linköping University, Sweden
  • Ronja Laarmann-Quante, FernUniversität Hagen, Germany
  • Herbert Lange, University of Hamburg, Germany
  • Peter Ljunglöf, University of Gothenburg, Sweden and Chalmers University of Technology, Sweden
  • Margot Mieskes, University of Applied Sciences Darmstadt, Germany
  • Lionel Nicolas, EURAC research, Italy
  • Ulrike Pado, Hochschule für Technik Stuttgart, Germany
  • Magali Paquot, Université catholique de Louvain, Belgium
  • Evelina Rennes, Linköping University, Sweden
  • Egon Stemle, EURAC research, Italy
  • Francis M. Tyers, Indiana University Bloomington, US
  • Sowmya Vajjala, National Research Council, Canada
  • Elena Volodina, University of Gothenburg, Sweden
  • Zarah Weiss, Universität Tübingen, Germany
  • Torsten Zesch, FernUniversität Hagen, Germany
  • Ramon Ziai, Universität Tübingen, Germany
  • Robert Östling, Stockholm University, Sweden

Workshop organizers

  • David Alfter, Gothenburg Research Infrastructure in Digital Humanities (GRIDH), University of Gothenburg, Gothenburg, Sweden; david dot alfter at gu dot se (Organizing chair)
  • Elena Volodina, Språkbanken Text, Department of Swedish, multilingualism, language technology, University of Gothenburg, Sweden; elena dot volodina at svenska dot gu dot se
  • Thomas François, CENTAL, Institute for Language and Communication, Université catholique de Louvain, Belgium
  • Arne Jönsson, Department of Computer and Information Science, Linköping University, Sweden
  • Evelina Rennes, Department of Computer and Information Science, Linköping University, Sweden

Financial acknowledgements (TBA)

For the past eleven years we have successfully co-located NLP4CALL with the two major Language Technology events in Scandinavia, SLTC and NoDaLiDa, thus making this workshop an annual event. We intend to continue this tradition. Through this workshop, we intend to profile ICALL research in Nordic countries and to provide a dissemination venue for researchers active in this area.

ICALL-relevant mailing lists

There is one mailing list for ICALL-relevant information, run by the BEA workshop organizers (bea.nlp.workshop@gmail.com). We encourage you to join it to stay updated on events, publications and discussions in the area.

To join the BEA list, contact Andrea Horbach (andrea.horbach@uni-due.de). The BEA mailing list distributes information in digest form approximately four times a year.

For NLP4CALL inquiries, please email David Alfter (david dot alfter at gu dot se).