Language resources

On this page you can browse and search our corpora and lexicons. Click on a resource name to see what files are available for download. You can go directly to the search interface by clicking on the Korp or Karp logo.
Resource Type Language Access
Blingbring
Blingbring, an enhanced and modernized version of Bring's thesaurus (1930)
Lexicon Swedish
Collection
Blog mix
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 1998
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 1999
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2000
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2001
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2002
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2003
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2004
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2005
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2006
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2007
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2008
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2009
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2010
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2011
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2012
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix 2013
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Blog mix unknown date
Material from a selection of Swedish blogs. Regularly updated.
Corpus Swedish
Bring
A digital version of Bring's thesaurus (1930)
Lexicon Swedish
CoDeRooMor, v.01
Morphological dataset (word-building morphology), Swedish L2 profiles project
Lexicon Swedish
Eukalyptus Treebank of Written Swedish
A treebank with written Swedish data, with parts-of-speech, TIGER-style syntax, multiword expressions and sense annotation
Corpus Swedish
InterFra
Corpus of French spoken by Swedish students
Corpus French
InterFra Swedish
Intended to promote research on the acquisition of French as a second language (L2) from developmental, interactional and variationist perspectives. The HLP (High Level Proficiency in Second language use) project also investigates learners of other L2s such as Swedish, Spanish, English and Italian.
Corpus Swedish
InterFra tagged
Intended to promote research on the acquisition of French as a second language (L2) from developmental, interactional and variationist perspectives. The HLP (High Level Proficiency in Second language use) project also investigates learners of other L2s such as Swedish, Spanish, English and Italian.
Corpus French
Lemmatization model: Stanza
Pretrained model for lemmatization.
Model Swedish
NyLLex v2
A lexical resource derived from books published by Sweden's largest publisher of easy language texts. The entries are annotated with frequency counts distributed over six reading proficiency levels.
Lexicon Swedish
POS-tagging model: Flair
Pretrained models for POS-tagging.
Model Swedish
POS-tagging model: Marmot
Pretrained models for POS-tagging.
Model Swedish
POS-tagging model: Stanza
Pretrained models for POS-tagging.
Model Swedish
Pretrained embeddings
A list of pretrained embeddings for Swedish
Model Swedish
SUC 2.0
Stockholm-Umeå corpus 2.0
Corpus Swedish
SUC 3.0
Stockholm-Umeå corpus 3.0
Corpus Swedish
SUC Novels (StorSUC)
Stockholm-Umeå corpus
Corpus Swedish
SUCX 2.0
Stockholm-Umeå corpus 2.0 scrambled
Corpus Swedish
SUCX 3.0
Stockholm-Umeå corpus 3.0 scrambled
Corpus Swedish
sv-COVID-19
A compilation of various articles related to the COVID-19 pandemic
Corpus Swedish
SweDiagnostics
Swedish version of (Super)GLUE Diagnostic
Corpus Swedish
Swedish ABSAbank-Imm 1.1
An annotated Swedish corpus for aspect-based sentiment analysis (a version of Absabank)
Corpus Swedish
Swedish analogy 2.0
Swedish semantic and syntactic similarity
Corpus Swedish
Swedish treebank
A Swedish treebank built from recycled language resources
Corpus Swedish
SweFAQ 2.0
Frequently asked questions from Swedish authorities' websites with shuffled answers
Corpus Swedish
SweWiC 2.0
A Swedish Word-in-Context dataset
Corpus Swedish
SweWinogender 2.0
A Swedish dataset for coreference and gender bias
Corpus Swedish
Syntag treebank
A Swedish treebank with syntactic analysis of 158 articles from Press-65.
Corpus Swedish
TalbankenSBX
Talbanken is a Swedish treebank. This is the Språkbanken Text version of Talbanken.
Corpus Swedish
TalbankenSTB
Talbanken is a Swedish treebank.
Corpus Swedish