Språkbanken at LREC 2026

Submitted by Maria Irena Szawerna on 2026-06-05

This post was written in collaboration with Dana Dannélls, Dimitrios Kokkinakis, Céline Leuzinger, Ricardo Muñoz Sánchez, and Arianna Masciolini.

The Fifteenth biennial Language Resources and Evaluation Conference (LREC2026) attracted record-breaking crowds, not least from our very own Språkbanken (and Sweden at large was among the top 10 most common affiliations!). Organized by the ELRA Language Resources Association and taking place in Palma de Mallorca, Spain, the conference spanned a total of 6 days: 3 workshop/tutorial days and 3 main conference days. Despite its length and staggering size (2400+ participants, 900+ papers, 58 co-located events), the conference maintained an air of friendliness and a sense of community that was appreciated by both newcomers and more experienced conference-goers among our ranks.

Members of Språkbanken at GU in particular were involved in organizing 3 out of the 46 workshops accepted to LREC2026, and co-authored a number of main conference and workshop publications. Additionally, Elena Volodina was involved in organizing the conference itself and served as one of its Publicity Co-Chairs and a board member of ELRA, the association in charge of the conference.

Overall impressions

Our two more senior attendees, Dana and Dimitrios, highlighted the interdisciplinarity and collaborativeness of the conference as its most important aspects. Dana described the conference as a meeting point for friends and colleagues both old and new, the location of which fostered an engaging and collaborative atmosphere. Dimitrios emphasized that LREC2026 acted like a melting pot for research and ideas from computational linguistics, digital humanities, AI, and other fields, with a special focus on reliability, transparency, and usefulness.

The community of doctoral students at Språkbanken at GU was represented by three young researchers, all at different stages in their PhD journeys.

Céline said, with great joy, that LREC 2026 was a great first conference experience for her. "At first, the huge crowds were a little overwhelming, but once I got to speak to people individually, I felt very welcomed and at ease!" she confessed, and added that she is looking forward to LREC 2028!

Maria, at the halfway point in her PhD journey, shared that this conference made her feel truly welcome in the community and gave her a sense of belonging. She asserted that the new acquaintances she made at the conference were very friendly and that friends from before made each conference day better!

"LREC as a conference has marked three key stages of my PhD," says Ricardo. The first conference he attended in person was the 2022 edition in Marseille, while the 2024 edition coincided with the midpoint of his studies. This year’s edition is the last conference he will have attended as part of his PhD. "I have met so many lovely people each time while learning a lot from them," he confesses, and adds, "Each edition being two years apart from the previous one has also allowed me to look at how much I have evolved throughout this journey."

RESOURCEFUL 2026

The fourth international workshop on the role of resources in the age of large language models, RESOURCEFUL-2026, was a full-day workshop that took place on the first workshop day of the LREC2026 conference. It was co-organized by Dana Dannélls and attracted around 40 participants interested in multilingual and low-resource NLP, evaluation, benchmarking, and annotation methodologies for language technology. Three invited speakers, Mark Fišel, Tiago Torrent, and Maria Gavriilidou, gave inspiring talks about these topics and contributed to an interesting panel discussion about pressing challenges and future directions in low-resource languages and community benefits. The workshop content and proceedings can be found here.


Fig. 1: The workshop and conference venue at Palau de Congressos de Palma, from the left: workshop room, poster session hall, lunch on the entrance floor surrounded by food trucks, and the coffee break terrace. Photos courtesy of Dana Dannélls.

RaPID-6-MENTAL.ai

Dimitrios Kokkinakis served as workshop chair for the sixth edition of the RaPID workshop series, organised this year in cooperation with the MENTAL.ai project in France. The half-day workshop brought together researchers working on language, speech, social media and multimodal data from people with cognitive, psychiatric and developmental conditions, including dementia, autism, Parkinson’s disease and depression. The programme included two keynote speakers, one invited speaker, four oral presentations, and eight poster presentations (one of which was a paper by Dimitrios Kokkinakis, Herb Lange and Ricardo Muñoz Sánchez). The workshop highlighted how computational linguistics and NLP can support earlier diagnosis, risk identification, monitoring of disease progression, and improved patient care through non-invasive, data-driven methods. A blog post about the workshop can be found here, while the proceedings of RaPID-6@MENTAL.ai can be found here and will soon also be available in the ACL Anthology.


Fig. 2: LREC entrance; one of the RaPID-6 keynote speakers (prof. B. MacWhinney); closing ceremony; exhibit from the Miró museum; the city of Sóller; and a tourist shop in Palma. Photos courtesy of Dimitrios Kokkinakis.

LEGAL2026 & CALD-pseudo 2026

On the second workshop day, the joint LEGAL2026 & CALD-pseudo 2026 workshop took place, serving as a hub for those in the NLP community interested in de-identification and other legal and ethical issues, and for those in the legal community interested in NLP. The workshop spanned the whole day, and its CALD-pseudo track, focusing on pseudonymization and anonymization, was co-organized by Maria Irena Szawerna, Ricardo Muñoz Sánchez, and Elena Volodina. Maria also presented a short paper at the workshop. The workshop featured fascinating talks from Maja Bogataj Jančič and Ivan Habernal. A more detailed report can be found here, and the workshop proceedings are, as of now, available here.


Fig. 3: From the left: workshop group picture (courtesy of the conference center staff); our invited speaker, prof. Ivan Habernal; Maria; and one of the presenters, Gabriel Loiseau (the latter three photos courtesy of Maria Irena Szawerna).

Overview of the publications from the conference

At her LEGAL2026 & CALD-pseudo 2026 joint workshop, Maria Irena Szawerna presented her and Simon Dobnik’s (FLoV) work on exploring how language models’ meaning representations align with human perceptions of personal information. A day later she was at the main conference’s first poster session with Jacob Lee Suchardt (Leipzig University). The two talked about using language models to generate surrogates that replace personal information, and about how (not) to evaluate the results.

Niklas Deworetzki (CSE) and Christina Klironomou (FLoV) both presented joint work with Arianna Masciolini. Niklas and Arianna’s paper, Syntactic Sugar for Syntactic Queries, introduces a tool for automatically translating high-level tree queries into CQL, the Corpus Query Language used in, among other platforms, Språkbanken’s Korp (more on that on December 14 at the Higher Seminar!). Christina’s paper, written in collaboration with colleagues from the Universities of Crete and Thessaloniki, describes a brand new treebank of learner Greek, freshly released as part of Universal Dependencies 2.18.

At the 5th NLPerspectives workshop, Céline Leuzinger and Ricardo Muñoz Sánchez presented joint work with Emilie Francis and Lee Gauthier. Their paper explores the impact that LLM pre-annotation has on annotator subjectivity.

As mentioned previously, Dimitrios Kokkinakis and Ricardo Muñoz Sánchez presented a poster at the RaPID-6-MENTAL.ai workshop, co-authored with Herb Lange, on the use of Whisper models to help transcribe interviews with patients with dementia.

Finally, at RESOURCEFUL 2026, Dana Dannélls's paper on various OCR methods for historical Swedish was presented.

Bibliography

Niklas Deworetzki and Arianna Masciolini (2026). “Syntactic Sugar for Syntactic Queries: Sequential Representations for Dependency Queries”. In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026), pages 11669–11678. ELRA.

Emilie Francis, Céline Leuzinger, Ricardo Muñoz Sánchez, Lee Gauthier (2026). “ChatGPT, why can’t anyone afford a house? On the Effects of LLM pre-annotation on Annotator Subjectivity”. In Proceedings of NLPerspectives @ LREC 2026, pages 98–111. ELRA.

Martin Johansson, Selma Waginder, Dana Dannélls (2026). “Exploring the similarities and differences between VLM-driven and traditional OCR for Historical Swedish Data”. In Proceedings of The Fourth Workshop on the Role of Resources in the Age of Large Language Models (RESOURCEFUL 2026), pages 193–199. ELRA.

Christina Klironomou, Thelka Pasparaki, Arianna Masciolini, Alexandros Tantos, Despoina Ourania Touriki, Konstantinos Tsiotskas and Eleni Tsourilla (2026). “Towards Universal Dependencies for L2 Learners of Modern Greek: Annotation and Challenges”. In Proceedings of the Ninth Workshop on Universal Dependencies @ LREC2026, pages 103–108. ELRA.

Maria Irena Szawerna, Simon Dobnik (2026). “Birds of a Feather: Do Embedding Representations of Personal Information Flock Together?” In Proceedings of the Joint Workshop on Legal and Ethical Issues in Human Language Technologies and Computational Approaches to Language Data Pseudonymization, Anonymization, De-identification, and Data Privacy (LEGAL2026 and CALD-pseudo 2026) @ LREC 2026, pages 62–72. ELRA.

Maria Irena Szawerna, Jacob Lee Suchardt (2026). “Fill-in-the-Blanks: Automatic Generation and Evaluation of Language Models’ Pseudonyms for English and Swedish Texts”. In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026), pages 1155–1169. ELRA.

Dimitrios Kokkinakis, Herbert Lange and Ricardo Muñoz Sánchez (2026). “Disfluencies and ASR Performance on Swedish Spontaneous Speech from the ‘Trip to Stockholm’ Discourse Narrative Task”. In Proceedings of the RaPID-6@MENTAL.ai @ LREC 2026, pages 24–33. ELRA.