Blog posts

Behind the scenes: The secret life of Språkbanken

2025-09-08

Schematic representation of the Språkbanken Text infrastructure with the role of the analysis group highlighted

Bugs, guidance, collaborations: How Språkbanken Text answers questions

2024-12-18

Cassandra: a toolset for analyzing and visualizing language change

2022-12-07

Within the Cassandra project we are using Korp to analyze numerous instances of language change: not one, not two, but dozens (and in the future, potentially hundreds). At this scale, it is impossible to run the searches and process their results manually. Fortunately, Korp has an API that makes it possible to automate this process.

Documentation: a (fictional) sad story with a (real) happy ending

2021-05-28

This post is based on joint work with Gerlof Bouma. Illustrations by Jan and Julija.

Here's a sad story (it's fictional, but sad nonetheless).

How native and non-native speakers talk to each other

2020-12-09

We at Språkbanken Text have just released a new corpus of native (L1) and non-native (L2) speech in four languages: English, Spanish, French and Italian. The corpus contains more than 170 million words produced by more than 97 thousand speakers (size varies a lot across the four languages, though).

The five lives of Talbanken

2020-06-09

This post is about Talbanken, one of the most important and widely used Swedish corpora. At least five versions of this treebank exist, and the name "Talbanken" is ambiguous between them, which sometimes leads to confusion. The purpose of this post is to reduce that ambiguity: I am going to list the five versions, explain the basic differences between them and suggest unambiguous version names.

Grym and häftig word change

2019-10-04

Words can change their meanings. You don't need a doctorate in linguistics to notice that grym in (1) does not mean the same thing as in (2).

Page maintainer: sb-webb