Jupiter Descending

Those interested in astronomy may have noticed that Jupiter is currently in opposition. But what does that mean? And what does “Jupiter” mean? And what does all of this have to do with corpus linguistics, Elizabeth Taylor, and Tuesdays?

Which word is best?

In the previous blog post we read about crosswords, a popular pastime during the summer months. A related hobby, if you want to be a bit more social, is of course playing Scrabble, also known by its Swedish name Alfapet and available in various digital versions, including Wordfeud. But how many points can you actually score with a single word? Which Scrabble word is best?

Make your own crossword!

The holidays are coming up, and crosswords are a classic holiday pastime. Especially now, in times of isolation when we should not be socializing anyway, what could be better than sitting in the garden swing with a well-sharpened pencil, a good eraser, the SAOL app, and a crossword? There are many magazines to buy with crosswords of varying difficulty, for those who enjoy solving them. But it is a little harder if you want to make your very own crossword. Until now: as part of Språkbanken's service …

Swedish derivational morphology with CoDeRooMor

This blog post is based on joint work by Elena Volodina, Therese Lindström Tiedemann and Yousuf Ali Mohammed within the RJ-funded project L2 profiles. Three annotators have contributed to this work: Stellan Petersson (University of Gothenburg), Beatrice Silén (University of Helsinki) and Maisa Lauriala (University of Helsinki). Do you know how many prefixes or suffixes the Swedish language has? Which ones? Different sources state different numbers, e.g. Thorell (1984) lists approx. 90 derivational suffixes and about 50 derivational prefixes; Hultman (2003) …

Reflections from SLTC 2020

On 25-27 November, the eighth edition of SLTC, the Swedish Language Technology Conference, took place at Humanisten here in Gothenburg. Or rather, it would have, had a certain virus not put a stop to it. Instead, like everyone else, we had to switch to a fully digital edition, but that worked too. We set a record for the number of registrations: 193 participants from 34 different countries! (The majority, 60%, came from Sweden, though.) Not everyone showed up, of course: for one thing, registration was free, and for another …

Pseudonymization of learner essays as a way to meet GDPR requirements

This blog post is based on the author’s (Elena Volodina’s) joint research with Yousuf (Samir) Ali Mohammed, Arild Matsson, Beáta Megyesi and Sandra Derbring. Access to language data is an obvious prerequisite for research in digital humanities in general, and for the development of NLP-based tools in particular. However, making data accessible becomes challenging where personal data is involved. This is very true of language learner data where tasks are often phrased so that they, directly or indirectly, elicit explicit personal information, …

Korp searches in Second Language data

Korp offers many different corpus collections for various types of search (and research). Swedish as a Second Language (L2) is one of the subcategories of the language that can be studied with the help of Korp. At the moment, Korp provides access to five L2 corpora through its interface: ASU – Andraspråksutveckling; SpIn – texts from the centre for Språkintroduktion; SW1203 – texts from a preparatory course for university students; SweLL – Swedish Learner Language – adult-written essays from a …

Common Pitfalls in the Development of ICALL Applications

This blog post is an opinion piece in which I sketch the process of developing NLP-based applications for second language learning and look at that process from the point of view of typical (mis)conceptions and challenges, as I have experienced them. Are we over-trusting the potential of NLP? Are teachers by definition reluctant to use NLP-based solutions in classrooms? How, if at all, can universities ensure sustainability of the developed applications? 1 Introduction Natural Language Processing (NLP) and Language Technology (LT) deal …

A multilingual annotated corpus of world’s natural language descriptions

Shafqat Mumtaz Virk, Harald Hammarström, Markus Forsberg, Søren Wichmann. The diversity of the world's 7000 languages represents an irreplaceable and abundant resource for understanding the unique communication system of our species (Evans and Levinson, 2009). All comparison and analysis of languages starts from language descriptions, that is, publications that contain facts about particular languages. The typical examples of this genre are grammars and dictionaries (Hammarström and Nordhoff, 2011). Until recently, language descriptions were available in paper form only, with indexes as the …