The SwedishGLUE project

Artificial intelligence systems dealing with (human) natural language rely on language models, which predict which words tend to occur together. To better understand how such models work, and where they fail, when applied to Swedish texts, we need Swedish test data. A collection of test data addressing various aspects of understanding and generating text allows us to evaluate and compare models. During the autumn of 2020 we started developing evaluation data for Swedish language models at Språkbanken Text. This …

Argumentation Mining

What if you could find all arguments in a text without having to read it? Or, what if you could search a database for a controversial topic and immediately get arguments for and against it, gathered from texts all around the internet? Or imagine that, when writing an essay, you automatically got an estimate of how persuasive your arguments are. Scenarios such as these could be possible with techniques developed in the field of argumentation mining. The aim of this relatively new …

The Kubhist corpus of Swedish newspapers

Among Språkbanken’s many historical resources we find the Kubhist corpus – a diachronic collection of historical newspaper texts – in two versions: Kubhist 1, spanning 1750–1950, and Kubhist 2, spanning 1645–1926. Historical corpora of this kind, especially when available in a searchable format, are valuable sources of information for learning about our history, language and culture. They are especially appealing to researchers in the digital humanities who study history, literature, linguistics, sociology …