Language resources

On this page you can browse and search our corpora and lexicons. Click on a resource name to see what files are available for download. You can go directly to the search interface by clicking on the Korp or Karp logo.
Resource Type Language Access
SAOB1950
Scanned books from 1950 to 2007 that are used as source material for updating SAOB, with a selection that reflects the Swedish vocabulary during the 20th century.
Corpus Swedish
ScandiSent
Sentiment corpus for Swedish, Norwegian, Danish, Finnish and English, crawled from Trustpilot.
Corpus Swedish, Norwegian Bokmål, Danish, English, Finnish
Schlyter
Dictionary of Old Swedish
Lexicon Swedish
SemEval2020 Task 1
Swedish Test Data for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection (extracts from Kubhist v2)
Corpus Swedish
SenSALDO
SenSALDO, SALDO entries and text word forms with sentiment information (prior polarity)
Lexicon Swedish
Sentiment Lexicon
Sentiment lexicon for Swedish based on SALDO
Lexicon Swedish
Sibirian-German
Transcribed Siberian German, a variety spoken by about 36 000 people in the Krasnoyarsk region of Siberia (Russia).
Corpus Swedish
Sibirientyska kvinnor
Dialogs between four women born between 1927 and 1937 in the Soviet Volga Republic
Corpus Swedish
SIC2 - Stockholm Internet Corpus
The Stockholm Internet Corpus (SIC2) contains Swedish blog posts, annotated with part of speech, morphological features, and named entities.
Corpus Swedish
Simple lexicon
The Swedish SIMPLE Lexicon - A language technology resource with access to semantic information in Swedish
Lexicon Swedish
Simple+
The Swedish SIMPLE Lexicon - A language technology resource with access to semantic information in Swedish, connected to SALDO senses
Lexicon Swedish
SKBL
The Biographical Dictionary of Swedish Women
Lexicon Swedish, English
Smittskydd
The newspaper Smittskydd by Smittskyddsinstitutet (Swedish Institute for Communicable Disease Control) 2002–2010
Corpus Swedish
SNP 1978–79
Swedish parliament proceedings 1978–1979
Corpus Swedish
Söderwall
Dictionary of Old Swedish
Lexicon Swedish
Söderwall Supplement
Dictionary of Old Swedish
Lexicon Swedish
Collection
Somali corpora
A collection of Somali corpora
Corpus Somali
Somali Wikipedia
Corpus of Somali Wikipedia
Corpus Somali
Somali: Af Soomaali 1971-79
Corpus Somali
Somali: Af-Soomaali 2001 Somaliland
Corpus Somali
Somali: Af-Soomaali 2001 Soomaaliya
Corpus Somali
Somali: Afka Hooyo 2010–19 Iswiidhan
Corpus Somali
Somali: Caafimaad 1972–79
Corpus Somali
Somali: Cilmi-Afeed
Corpus Somali
Somali: Cilmiga Bulshada 1971–1980
Corpus Somali
Somali: Cilmiga Bulshada 2001-03 Soomaaliya
Corpus Somali
Somali: Cilmiga Bulshada 2016 Somaliland
Corpus Somali
Somali: Kitaabka Quduuska Ah
Corpus Somali
Somali: Maaddooyinka Kale 1972–79
Corpus Somali
Somali: Raadiyaha Denmark 2014
Corpus Somali
Somali: Raadiyaha Iswiidhan 2014
Corpus Somali
Somali: Saynis 1980–89
Corpus Somali
Somali: Sheekooyin Carruureed
Corpus Somali
Somali: Sheekooyin Carruureed (Turjuman)
Corpus Somali
Somali: Sheekooyin Gaagaaban
Corpus Somali
Somali: Suugaan
Corpus Somali
Somali: Suugaan (Turjuman)
Corpus Somali
Somali: Suugaan 2
Corpus Somali
Somali: Taariikh iyo Dhaqan (Turjuman)
Corpus Somali
Somali: Xisaab 2001 Soomaaliya
Corpus Somali
Somali: Xisaab 2016 Somaliland
Corpus Somali
SpIn v1
256 essays collected from mid-term exams in a Language Introduction course for newly arrived refugees. Some students appear more than once.
Corpus Swedish
Sports anglicisms
English loan-words in the Swedish sports press
Lexicon Swedish
Språkprov SO 2009
The just over 94 000 language examples are taken from Svensk ordbok, published by Svenska Akademien (2009). The examples serve to support the dictionary definitions and to provide information about the phraseology of the headwords. For access, contact Emma Sköldberg (emma.skoldberg@svenska.gu.se).
Corpus Swedish
SUC 2.0
Stockholm-Umeå corpus 2.0
Corpus Swedish
SUC 3.0
Stockholm-Umeå corpus 3.0
Corpus Swedish
SUC Novels (StorSUC)
Stockholm-Umeå corpus
Corpus Swedish
SUCX 2.0
Stockholm-Umeå corpus 2.0 scrambled
Corpus Swedish
SUCX 3.0
Stockholm-Umeå corpus 3.0 scrambled
Corpus Swedish
Collection
SuperLim 2
A standardized suite for evaluation and analysis of Swedish natural language understanding systems.
Corpus Swedish
SuperSim (repackaged for Superlim) 2.0
A dataset for word similarity and relatedness in Swedish
Corpus Swedish
sv-COVID-19
A compilation of various articles related to the COVID-19 pandemic
Corpus Swedish
Svensk Tidskrift
27 annual volumes of the conservative journal Svensk Tidskrift, from 1891 to 1940
Corpus Swedish
Collection
SVT news
News texts from svt.se
Corpus Swedish
SVT news 2004
News texts from svt.se
Corpus Swedish
SVT news 2005
News texts from svt.se
Corpus Swedish
SVT news 2006
News texts from svt.se
Corpus Swedish
SVT news 2007
News texts from svt.se
Corpus Swedish
SVT news 2008
News texts from svt.se
Corpus Swedish
SVT news 2009
News texts from svt.se
Corpus Swedish
SVT news 2010
News texts from svt.se
Corpus Swedish
SVT news 2011
News texts from svt.se
Corpus Swedish
SVT news 2012
News texts from svt.se
Corpus Swedish
SVT news 2013
News texts from svt.se
Corpus Swedish
SVT news 2014
News texts from svt.se
Corpus Swedish
SVT news 2015
News texts from svt.se
Corpus Swedish
SVT news 2016
News texts from svt.se
Corpus Swedish
SVT news 2017
News texts from svt.se
Corpus Swedish
SVT news 2018
News texts from svt.se
Corpus Swedish
SVT news 2019
News texts from svt.se
Corpus Swedish
SVT news 2020
News texts from svt.se
Corpus Swedish
SVT news 2021
News texts from svt.se
Corpus Swedish
SVT news 2022
News texts from svt.se
Corpus Swedish
SVT news 2023
News texts from svt.se
Corpus Swedish
SVT news unknown date
News texts from svt.se
Corpus Swedish
SW1203-essays
Essays written by L2 Swedish language learners, university courses
Corpus Swedish
Swe-NERC
A resource for training and evaluation of Named Entity Recognition for Swedish.
Corpus Swedish
Swedberg's Swensk Ordabok
Swedberg's Swensk Ordabok
Lexicon Swedish, Latin
Swedberg's Swensk Ordabok (morphology, rudimentary)
Swedberg's Swensk Ordabok (morphology, rudimentary)
Lexicon Swedish
SweDiagnostics
Swedish version of (Super)GLUE Diagnostic
Corpus Swedish
Swedish ABSAbank
An annotated Swedish corpus for aspect-based sentiment analysis
Corpus Swedish
Swedish ABSAbank-Imm 1.1
An annotated Swedish corpus for aspect-based sentiment analysis (a version of Absabank)
Corpus Swedish
Swedish analogy 2.0
Swedish semantic and syntactic similarity
Corpus Swedish
Swedish Bible 1873
Swedish translation of the Bible from 1873
Corpus Swedish
Swedish Bible 1917
Official Swedish translation of the Bible from 1917
Corpus Swedish
Swedish Code of Statutes
Swedish Code of Statutes 1880-01-01 – 2023-12-15
Corpus Swedish
Swedish Diachronic Word Embeddings
Swedish Diachronic Word Embedding Models Trained on Historical Newspaper Data
Model Swedish
Swedish EAT: question classification
A translated version of the QAQC dataset for expected-answer-type classification.
Corpus Swedish
Swedish fraktur 1626-1816
A selection of fraktur texts printed between 1626 and 1816 from the collections of the University Library of the University of Gothenburg (UB). For OCR analysis.
Corpus Swedish
Swedish FrameNet (SweFN)
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This part of the resource contains the frames and the manually annotated semantic content.
Lexicon Swedish
Swedish framenet (SweFN)
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This part of the resource contains the corpus examples, automatically enriched with linguistic information.
Corpus Swedish
Swedish newspapers 1818-1870
A selection of Swedish newspapers printed between 1818 and 1870 from the collections of Kungliga biblioteket (KB). For OCR analysis.
Corpus Swedish
Swedish newspapers 1871-1906
A selection of Swedish newspapers printed between 1871 and 1906 from the collections of Kungliga biblioteket (KB). For OCR analysis.
Corpus Swedish
Swedish party programs and election manifestos
Swedish political party programs and election manifestos 1887–2024
Corpus Swedish
Swedish Prose Fiction 1800–1900
All Swedish fiction published for the first time during the years 1800, 1820, 1840, 1860, 1880 and 1900
Corpus Swedish
Swedish treebank
A Swedish treebank built from recycled language resources
Corpus Swedish
Swedish Twitter 2015
Material collected from a selection of Swedish-speaking Twitter users from 2015
Corpus Swedish
Swedish Twitter 2016
Material collected from a selection of Swedish-speaking Twitter users from 2016
Corpus Swedish
Swedish Twitter 2017
Material collected from a selection of Swedish-speaking Twitter users from 2017
Corpus Swedish
Swedish Wikipedia
Corpus of Swedish Wikipedia
Corpus Swedish