9th NLP4CALL, SLTC, Gothenburg, Sweden

SLTC workshop, Gothenburg, Sweden, November 25, 2020

The proceedings are published: [LINK]

Venue

The NLP4CALL workshop is co-located with SLTC 2020 in Gothenburg, Sweden.
IMPORTANT: Just like the main conference, the workshop will be organized as an online-only event.

Registration information

At least one author of each accepted paper should be registered for the workshop. To register, please visit the SLTC page and follow the instructions there.

Program

Location: Online

Please note that all times are given in CET (UTC+1).


09:00 - 09:05		Opening session
09:05 - 10:00		Invited talk 1: What is an NLP NLP? Considerations from an L2 Assessment Perspective. Mark Brenchley. Chair: Elena Volodina
09:05 - 10:00		Replacement: Pseudonymization of learner essays. Elena Volodina
10:00 - 10:30		Coffee break
		Session 1
		Chair: Therese Lindström Tiedemann
10:30 - 11:00		Show, Don't Tell: Visualising Finnish Word Formation in a Browser-Based Reading Assistant. Frankie Robertson
11:00 - 11:30		The Teacher-Student Chatroom Corpus. Andrew Caines, Helen Yannakoudakis, Helena Edmondson, Helen Allen, Pascual Pérez-Paredes, Bill Byrne, Paula Buttery
11:30 - 12:00		Polygloss - A conversational agent for language practice. Etiene da Cruz Dalcol, Massimo Poesio
12:00 - 13:05		Lunch
13:05 - 14:00		Invited talk 2: Crowdsourcing as a means to democratize access to L2 enriched data: the case of L2 proficiency. Magali Paquot. Chair: Elena Volodina
		Session 2
		Chair: Herbert Lange
14:00 - 14:30		Substituto - A Synchronous Educational Language Game for Simultaneous Teaching and Crowdsourcing. Marianne Grace Araneta, Gülşen Eryiğit, Alexander König, Ji-Ung Lee, Ana Luís, Verena Lyding, Lionel Nicolas, Christos Rodosthenous, Federico Sangati
		Best paper voting
14:30 - 15:00		Coffee break
15:00 - 15:30		Organizer talk: Experts versus non-experts in crowdsourcing. Preliminary results. David Alfter, Therese Lindström Tiedemann, Elena Volodina
15:30 - 17:00		Research notes session and discussion. Discussion leaders: Torsten Zesch, Johannes Graën
		Ethical Collaborative Construction of High-Quality CALL Content with LARA. Branislav Bédi, Cathy Chua, Hanieh Habibi, Manny Rayner
		What is a right answer? Challenges in Classifying Learner Answers to Listening Comprehension Tasks: The case of orthographic variance. Ronja Laarman-Quante, Andrea Horbach, Torsten Zesch
		Parallel Corpora as a resource for Data-driven Language Learning. (Demo) Johannes Graën
		Discussion
17:00 - 17:10		ICALL SIG meeting and election of new SIG chair.
17:10 - 17:30		Best paper/best presentation award and closing session

Invited speakers

This year we have the pleasure of welcoming two invited speakers:

Magali Paquot

Dr Magali Paquot is an FNRS research associate at the Centre for English Corpus Linguistics, UCLouvain. She specializes in the use of learner corpora to study key topics in SLA and is particularly interested in methodological issues. She is co-editor in chief of the International Journal of Learner Corpus Research and one of the founding members of the Learner Corpus Research Association.

Her most recent publications appeared in the International Journal of Learner Corpus Research, Language Assessment Quarterly and Second Language Research. She also co-edited A Practical Handbook of Corpus Linguistics (with S. Th. Gries; Springer, in press) and the Routledge Handbook of Second Language Acquisition and Corpora (with N. Tracy-Ventura, submitted).

Title: Crowdsourcing as a means to democratize access to L2 enriched data: the case of L2 proficiency

The success of Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) projects is often dependent on the quality of its primary L2 data. One variable that is essential for the development of CALL systems, yet very time-consuming and costly, is foreign language proficiency (Ballier et al. 2020). In Second Language Acquisition, its measurement has also not always received the attention it deserves, and practices of proficiency level assignment have been the subject of continued criticism (e.g. Hulstijn et al. 2010). In this talk, I will report on the first results of the Crowdsourcing Language Assessment Project (CLAP), which aims to investigate whether crowdsourcing can offer practical solutions to the time and cost difficulties often associated with foreign language proficiency assessment. More specifically, CLAP explores whether and how a crowd of people can be used to assess learner texts reliably and validly. In its current design, the project relies on an adaptive comparative judgement task (Pollitt, 2012).

Mark Brenchley

Dr Mark Brenchley is Senior Research Manager at Cambridge Assessment English. Mark manages research supporting the development and validation of Cambridge English products in the areas of speaking and writing, as well as vocabulary and grammar more broadly. He specialises in the application of corpus-based methodologies and is responsible for maintaining and developing the company’s internal corpus architecture, including the Cambridge Learner Corpus. His current work, in particular, focuses on the development and validation of auto-marking technologies.

Mark holds a PhD in Education from the University of Exeter, where he explored the development of spoken and written syntax within the English education system. Following his PhD, he co-developed the Growth in Grammar Corpus, a novel corpus of student writing that covers the primary and secondary phases of the English education system.

Title: What is an NLP NLP? Considerations from an L2 Assessment Perspective

Recent years have witnessed what feels like an exponential development in the scope and performance of NLP-approaches to human language. This is no less the case regarding the field of second language assessment, where NLP techniques seem likely to become ever more essential to, and integrated with, the assessment process. Indeed, surveying the recent progress of NLP, it seems hard to think of an assessment area where such techniques would not have genuine practical value. From an NLP-perspective, in other words, the future of NLP-informed assessment looks extremely bright. At the same time, it remains important to keep taking stock, especially where there is always a chance that techniques and applications will advance at a faster rate than our ability to properly conceptualise them. With that in mind, this talk offers a more philosophical perspective on the role of NLP in second language assessment, focusing on the question of what it might actually mean for something to be an "NLP NLP"; that is, a natural language processed, natural language profile. In general, it will explore the relationship between NLP and L2 profiles with regard to the wider notion of validity as a key assessment concept (Messick, 1989; Bachman & Palmer, 1996; Weir, 2005; Kane, 2006). In particular, it will do so with regard to the specific validation framework utilised at Cambridge English (e.g. Shaw & Weir, 2007), and with reference to some of our current principles and practice.

Description of the workshop

The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (NLP4CALL) is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. The latter include, among others, drawing on insights from Second Language Acquisition (SLA) research, on the one hand, and promoting the development of “Computational SLA” through setting up Second Language research infrastructure(s), on the other.

The intersection of Natural Language Processing (or Language Technology / Computational Linguistics) and Speech Technology with Computer-Assisted Language Learning (CALL) brings “understanding” of language to CALL tools, thus making CALL intelligent. This has given the area of research its name: Intelligent CALL, or ICALL. As the definition suggests, apart from having excellent knowledge of Natural Language Processing and/or Speech Technology, ICALL researchers need good insights into second language acquisition theories and practices, as well as knowledge of second language pedagogy and didactics. This workshop therefore invites a wide range of ICALL-relevant research, including studies where NLP-enriched tools are used for testing SLA and pedagogical theories, and vice versa, where SLA theories, pedagogical practices or empirical data are modeled in ICALL tools.

The NLP4CALL workshop series is aimed at bringing together competences from these areas for sharing experiences and brainstorming around the future of the field.

We welcome papers:

that describe research directly aimed at ICALL;
that demonstrate actual or discuss the potential use of existing Language and Speech Technologies or resources for language learning;
that describe the ongoing development of resources and tools with potential usage in ICALL, either directly in interactive applications, or indirectly in materials, application or curriculum development, e.g. learning material generation, assessment of learner texts and responses, individualized learning solutions, provision of feedback;
that discuss challenges and/or research agenda for ICALL;
that describe empirical studies on language learner data.

This year a special focus is given to work done on second language vocabulary and grammar profiling, as well as the use of crowdsourcing for creating, collecting and curating data in NLP projects.

We encourage paper presentations and software demonstrations describing the above-mentioned themes primarily, but not exclusively, for the Nordic languages.

Submission information

We will be using the NAACL-HLT 2019 templates for the workshop this year.

IMPORTANT: For licensing reasons, all camera-ready papers must include the following sentence as an unmarked (unnumbered) footnote on the first page of the paper: This work is licensed under a Creative Commons Attribution 4.0 International Licence. Licence details: http://creativecommons.org/licenses/by/4.0/.

Submissions that do not adhere to the author guidelines will be rejected without review.

Authors are invited to submit long papers (8-12 pages) or, alternatively, short or demo papers (4-7 pages); the page count does not include references. Please indicate one relevant paper type at submission time. Only PDF files will be accepted. Submissions will be managed through the electronic conference management system EasyChair. Final camera-ready versions of accepted papers will be given an additional page to address reviewer comments.

Make a submission

Papers should describe original unpublished work or work-in-progress. Every paper will be reviewed by at least 2 members of the program committee. As reviewing will be blind, please ensure that papers are anonymous. Self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ...", should be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ...". Submissions will be judged on appropriateness, clarity, originality/innovativeness, correctness/soundness, meaningful comparison, significance and impact of ideas or results.

All accepted papers will be collected into a proceedings volume to be submitted for publication in the NEALT Proceedings Series (Linköping Electronic Conference Proceedings) and, additionally, double-published through the ACL Anthology, following experiences from previous workshops, e.g. the 8th NLP4CALL.

Important dates

15 June, Monday: first call for papers
3 August, Monday: second call for papers
26 August, Wednesday: third call for papers
16 September, Wednesday: final call for papers
30 September, Wednesday (extended from 23 September, Wednesday): paper submission deadline (long, short and demo)
21 October, Wednesday: notification of acceptance
11 November, Wednesday: camera-ready papers for publication
25 November, Wednesday: workshop date

Program committee (preliminary)

Lars Ahrenberg, Linköping University, Sweden
David Alfter, University of Gothenburg, Sweden
Claudia Borg, University of Malta, Malta
António Branco, Universidade de Lisboa, Portugal
Mark Brenchley, Cambridge Assessment English, UK
Jill Burstein, Educational Testing Service, US
Andrew Caines, University of Cambridge, UK
Xiaobin Chen, Universität Tübingen, Germany
Kordula de Kuthy, Universität Tübingen, Germany
Simon Dobnik, University of Gothenburg, Sweden
Thomas François, Université catholique de Louvain, Belgium
Johannes Graën, University of Gothenburg, Sweden and Universitat Pompeu Fabra, Spain
Andrea Horbach, University of Duisburg-Essen, Germany
Herbert Lange, University of Gothenburg, Sweden and Chalmers University of Technology, Sweden
Peter Ljunglöf, University of Gothenburg, Sweden and Chalmers University of Technology, Sweden
Verena Lyding, EURAC research, Italy
Beata Megyesi, Uppsala University, Sweden
Detmar Meurers, Universität Tübingen, Germany
Margot Mieskes, University of Applied Sciences Darmstadt, Germany
Lionel Nicolas, EURAC research, Italy
Ulrike Pado, Hochschule für Technik Stuttgart, Germany
Magali Paquot, Université catholique de Louvain, Belgium
Ildikó Pilán, Norwegian Computing Center, Norway
Robert Reynolds, Brigham Young University, US
Gerold Schneider, University of Zurich, Switzerland
Egon Stemle, EURAC research, Italy
Anaïs Tack, Université catholique de Louvain, Belgium and KU Leuven, Belgium
Irina Temnikova, Mitra Translations, Bulgaria
Francis M. Tyers, Indiana University Bloomington, US and Higher School of Economics Moscow, Russia
Sowmya Vajjala, National Research Council, Canada
Elena Volodina, University of Gothenburg, Sweden
Zarah Weiss, Universität Tübingen, Germany
Mats Wirén, Stockholm University, Sweden
Torsten Zesch, University of Duisburg-Essen, Germany
Ramon Ziai, Universität Tübingen, Germany
Robert Östling, Stockholm University, Sweden

Workshop organizers

David Alfter, Språkbanken, Department of Swedish, University of Gothenburg; david dot alfter at svenska dot gu dot se (Organizing chair)
Elena Volodina, Språkbanken, Department of Swedish, University of Gothenburg; elena dot volodina at svenska dot gu dot se
Ildikó Pilán, Norwegian Computing Center, Norway
Herbert Lange, Department of Computer Science and Engineering, University of Gothenburg and Chalmers University of Technology, Sweden; herbert dot lange at cse dot gu dot se
Lars Borin, Språkbanken, Department of Swedish, University of Gothenburg; lars dot borin at svenska dot gu dot se

The workshop series has been previously financed by the Centre for Language Technology (University of Gothenburg), the SweLL project (University of Gothenburg) and the Swedish Research Council's conference grant. Currently the funding comes from Språkbanken-Text and the L2 profiling project.

For the past eight years we have successfully co-located NLP4CALL with the two major Language Technology events in Scandinavia, SLTC and NoDaLiDa, thus making this workshop an annual event. We intend to continue this tradition. Through this workshop, we intend to profile ICALL research in Nordic countries and to provide a dissemination venue for researchers active in this area.

ICALL-relevant mailing lists

There are two mailing lists that spread ICALL-relevant information: one run by the EuroCALL/CALICO SIG-ICALL group (nlpcall@artsservices.uwaterloo.ca // nlpcall@watarts.uwaterloo.ca) and the other run by the BEA workshop organizers (bea.nlp.workshop@gmail.com). We encourage you to join them to stay updated on events, publications and discussions in the area.

To join the EuroCALL/CALICO list, contact Mathias Schulze (mschulze@uwaterloo.ca). You can freely write to the EuroCALL/CALICO list when you want to disseminate a call for papers or other information, or to ask questions.
To join the BEA list, contact Ekaterina Kochmar (Ekaterina.Kochmar@cl.cam.ac.uk). The BEA mailing list spreads information in digest form approximately four times a year.

For NLP4CALL inquiries, please email David Alfter (david dot alfter at svenska dot gu dot se).