
Blog

The blog is listed in reverse chronological order. You can also view all tags to find every post of a given type.

God Jul from the Swedish Word Family

- Elena Volodina

Lexical resources for Natural Language Processing (NLP), Second Language Acquisition (SLA) and other applied disciplines differ in the lexical unit they use as their main entry. The most widespread choice is the lemma, i.e. the base form of a word, or the lemgram, i.e. the base form plus its part of speech (POS); cf. François et al. (2016) and Kilgarriff et al. (2014).

Change is key! 6-year RJ Program funded!

- Nina Tahmasebi
In the RJ-funded program Change is Key!, we will develop tools to turn text into a story of our language, our societies, and our cultures, and how these have changed over time. The program spans six years (2022-2027) and has 11 participating researchers. Read more here!

Jupiter Descending

- Niklas Zechner
Those interested in astronomy may have noticed that Jupiter is currently in opposition. But what does that mean? And what does "Jupiter" mean? And what does all of this have to do with corpus linguistics, Elizabeth Taylor, and Tuesdays?

Make your own crossword!

- Peter Ljunglöf

The holidays are coming soon, and crosswords are a holiday classic. Especially now, in times of isolation when we should not be socializing anyway, what could be better than sitting in the garden swing with a well-sharpened pencil, a good eraser, the SAOL app, and a crossword?

Documentation: a (fictional) sad story with a (real) happy ending

- Aleksandrs (Sasha) Berdicevskis

This post is based on joint work with Gerlof Bouma. Illustrations by Jan and Julija.

Here's a sad story (it's fictional, but sad nonetheless).

Swedish derivational morphology with CoDeRooMor

- Elena Volodina

This post is based on joint work by Elena Volodina, Therese Lindström Tiedemann and Yousuf Ali Mohammed within the RJ-funded project L2 profiles. Three annotators contributed to this work: Stellan Petersson (University of Gothenburg), Beatrice Silén (University of Helsinki) and Maisa Lauriala (University of Helsinki).

A Swedish COVID-19 (sv-COVID-19) corpus and its exploration ... smorgasbord

- Dimitrios Kokkinakis

As COVID-19 became a pandemic in March 2020, a massive amount of (time-stamped written) data about the virus and its symptoms, prevention, management and transmission became available: news and newspaper reports, scientific articles, social media posts (e.g. blogs and Twitter), surveys and other information. Such data contained valid, reliable information and relevant facts from trusted sources, but also rumors, conspiracy theories and misinformation from unofficial ones.

The SwedishGLUE project

- Yvonne Adesam

Artificial intelligence systems dealing with (human) natural language rely on language models, which predict which words occur together. To better understand how such models work, and where they fail, when applied to Swedish texts, we need Swedish test data. A collection of test data addressing various aspects of understanding and generating text allows us to evaluate and compare models.

Reflections from SLTC 2020

- Peter Ljunglöf

On 25-27 November, the eighth edition of SLTC, the Swedish Language Technology Conference, took place at Humanisten here in Gothenburg. Or rather, it would have, had a certain virus not put a stop to it.

How native and non-native speakers talk to each other

- Aleksandrs (Sasha) Berdicevskis

We at Språkbanken Text have just released a new corpus of native (L1) and non-native (L2) speech in four languages: English, Spanish, French and Italian. The corpus contains more than 170 million words produced by more than 97 thousand speakers, though its size varies considerably across the four languages.