Blog posts

Approaches to corpus searches - a mini-workshop at Språkbanken Text

1. Workshop in a nutshell

On April 23, 2024, Språkbanken Text and Huminfra organized a well-attended mini-workshop, Approaches to corpus searches. The workshop showcased new perspectives on corpus searches from three angles: technical, user-oriented, and visualization.

Language learning hypotheses with the Swedish Word Family: from simple to complex

Authors: Elena Volodina, Yousuf Ali Mohammed, Therese Lindström Tiedemann

In our previous blog on the Swedish Word Family we described how morphologically annotated resources can be used for analysis of texts for cultural aspects, namely, how different holidays are represented in second language corpora.

SwedishFromScratch: a mini-course for Ukrainians

Sweden has received a new wave of refugees: Ukrainians. They have sound school backgrounds and good university education, are ambitious, and want to work. The Swedish government makes sure that Ukrainian refugees get a temporary residence permit and a work permit through a fast application process, which at the moment takes about one month from the application date to the decision.

PRAO at the Department of Swedish


Our names are Ebba and Anastasia. 

God Jul from the Swedish Word Family

Lexical resources for Natural Language Processing (NLP), Second Language Acquisition (SLA) and other applied disciplines differ in the choice of the lexical unit they use as their main entry. The most widespread choice is the lemma, i.e. the base form of a word, or the lemgram, i.e. the base form plus its part of speech (POS); cf. François et al. (2016) and Kilgarriff et al. (2014).

Swedish derivational morphology with CoDeRooMor

This blog is based on joint work by Elena Volodina, Therese Lindström Tiedemann and Yousuf Ali Mohammed within the RJ-funded project L2 profiles. Three annotators have contributed to this work: Stellan Petersson (University of Gothenburg), Beatrice Silén (University of Helsinki) and Maisa Lauriala (University of Helsinki).

Pseudonymization of learner essays as a way to meet GDPR requirements

This blog is based on the author's (Elena Volodina's) joint research with Yousuf (Samir) Ali Mohammed, Arild Matsson, Beáta Megyesi and Sandra Derbring.

How reliable is sense disambiguation in texts by native and non-native speakers?

(This blog is based on joint research and a publication in collaboration with David Alfter, Therese Lindström Tiedemann, Maisa Lauriala and Daniela Piipponen.)

Korp searches in Second Language data

Korp offers a lot of different corpus collections for various types of search (and research). Swedish as a Second Language (L2) is one of the subcategories of the language that can be studied with the help of Korp. At the moment, Korp provides access to five L2 corpora through its interface:

Common Pitfalls in the Development of ICALL Applications

This blog is an opinion piece in which I sketch the process of developing NLP-based applications for second language learning and look at that process from the point of view of typical (mis)conceptions and challenges, as I have experienced them. Are we over-trusting the potential of NLP? Are teachers by definition reluctant to use NLP-based solutions in classrooms? How, if at all, can universities ensure the sustainability of the applications they develop?