Språkbanken Text is a part of Språkbanken.

Language resources

On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
Resource Type Language Access
Collection
SweLL
SweLL (Swedish Learner Language) is a collection of SweLL corpora and resources derived from them. SweLL corpora consist of texts written by learners whose mother tongue is not Swedish. All texts were collected in test situations; none are home-written tasks.
Corpus Swedish, Multiple languages
SweLL v1 original
Corpus Swedish
SweLL v1 target
Corpus Swedish
SweLL-gold
Essays written by adult learners of Swedish, manually pseudonymized and annotated with corrections. The corpus contains both the original learner text and a corrected version of each essay. Collection period 2017-2020.
Corpus Swedish
SweLL-gold original
Corpus Swedish
SweLL-gold target
Corpus Swedish
Collection
SweLL-pilot
Essays written by adult learners of Swedish, manually labeled with CEFR levels (a European scale of language proficiency). Collection period 2006-2015.
Corpus Swedish
SweLLex
SweLLex is a lexicon of productive vocabulary for Swedish as a second language
Lexicon Swedish
SweNLI 1.0
A Swedish NLI dataset
Corpus Swedish
SweParaphrase 2.0
Semantic Textual Similarity reference data (STS Benchmark).
Corpus Swedish
SweSAT Swedish Scholastic Aptitude Test Synonyms 1.1
Swedish Scholastic Aptitude Test Synonyms
Lexicon Swedish
Swesaurus
A Swedish WordNet
Lexicon Swedish
SweWiC 2.0
A Swedish Word-in-Context dataset
Corpus Swedish
SweWinogender 2.0
A Swedish dataset for coreference and gender bias
Corpus Swedish
SweWinograd 2.0
A Swedish dataset for pronoun resolution
Corpus Swedish
Syntag treebank
A Swedish treebank with syntactic analysis of 158 articles from Press-65.
Corpus Swedish
Sæmundaredda
Ancient Icelandic poetry collection also known as The King's Book
Corpus Old Norse
TalbankenSBX
Talbanken is a Swedish treebank. This is the Språkbanken Text version of Talbanken.
Corpus Swedish
TalbankenSTB
Talbanken is a Swedish treebank.
Corpus Swedish
The English-Swedish Parallel Corpus (ESPC)
ESPC is a combined comparable and parallel corpus suitable for different types of cross-language research.
Corpus Swedish, English
The Riksdag's open data - Debates
Debates from the Swedish parliament in the period 1993/94-2017/18
Corpus Swedish
The Swedish Culturomics Gigaword Corpus
One billion Swedish words from 1950 and onwards. Code to extract data from the corpus, as well as usage instructions, can be downloaded from https://svn.spraakbanken.gu.se/sb-arkiv/tools/gigaword/
Corpus Swedish
The Swedish Literature Bank: Free Works
E-texts and searchable facsimiles from the Swedish Literature Bank (litteraturbanken.se)
Corpus Swedish
The Swedish Literature Bank: Restricted Works
E-texts and searchable facsimiles from the Swedish Literature Bank (litteraturbanken.se)
Corpus Swedish
The Swedish PoliGraph
An extensible knowledge graph with information on members of the Swedish parliament
Lexicon Swedish
Tiden
30 annual volumes of the socialist journal Tiden, 1909–1940
Corpus Swedish
TISUS texts
Essays written by L2 Swedish learners as part of a TISUS exam
Corpus Swedish
TISUS v1
Corpus Swedish
TISUS-texter v2
Corpus Swedish
Twitter Mix
Material from a selection of Swedish Twitter users. Regularly updated.
Corpus Swedish
Twitter: Party Leader Debate June 2013
Material from Twitter, collected during the party leader debate on June 12th 2013 and a few days before and after
Corpus Swedish
Twitter: Party Leader Debate May 2014
Material from Twitter, collected during the party leader debate on May 4th 2014 and a few days before and after
Corpus Swedish
Twitter: Party Leader Debate October 2013
Material from Twitter, collected during the party leader debate on October 6th 2013 and a few days before and after
Corpus Swedish
UNSC-Graph
An extensible knowledge graph for the UNSC corpus, detailing participants and debates from the UN Security Council 1995-2020
Lexicon English
Ur Dagens Krönika
Eight annual volumes of the cultural journal Ur Dagens Krönika, 1881–1890
Corpus Swedish
Vocation list
A list of vocations in Swedish
Lexicon Swedish
Collection
Web News
News from Swedish newspapers' websites
Corpus Swedish
Web News 2001
News from Swedish newspapers' websites
Corpus Swedish
Web News 2002
News from Swedish newspapers' websites
Corpus Swedish
Web News 2003
News from Swedish newspapers' websites
Corpus Swedish
Web News 2004
News from Swedish newspapers' websites
Corpus Swedish
Web News 2005
News from Swedish newspapers' websites
Corpus Swedish
Web News 2006
News from Swedish newspapers' websites
Corpus Swedish
Web News 2007
News from Swedish newspapers' websites
Corpus Swedish
Web News 2008
News from Swedish newspapers' websites
Corpus Swedish
Web News 2009
News from Swedish newspapers' websites
Corpus Swedish
Web News 2010
News from Swedish newspapers' websites
Corpus Swedish
Web News 2011
News from Swedish newspapers' websites
Corpus Swedish
Web News 2012
News from Swedish newspapers' websites
Corpus Swedish
Web News 2013
News from Swedish newspapers' websites
Corpus Swedish
Wexjöbladet 1820's
Part of the collection Kubhist2
Corpus Swedish
Word Embeddings trained on English Wikipedia
Word Embeddings trained on English Wikipedia
Model English
WordNet-SALDO
A linking between SALDO senses and Core WordNet
Lexicon Swedish, English
WordReference
A large corpus of native and non-native written speech in four languages.
Corpus English, Spanish, French, Italian
Written production in learner French
This corpus contains student texts written by Swedish learners of French
Corpus French