Language resources

On this page you can browse and search our corpora and lexicons. Click on a resource name to see what files are available for download. You can go directly to the search interface by clicking on the Korp or Karp logo.
Resource Tokens Language Access
8 Sidor
News articles from 8 SIDOR.
4,998,634 Swedish
Academic texts: Humanities
A corpus with academic texts
14,454,573 Swedish
Academic texts: Social science
A corpus with academic texts
10,855,954 Swedish
Af Soomaali 1993-94
9,247 Somali
Af-Soomaali 2016 Somaliland
51,236 Somali
Aftonbladet 1830's
Part of the collection Kubhist2
29,870,739 Swedish
Agriculture
Agricultural manuals: "Engelska Åker-Mannen" and "En Grundelig Kundskap Om Swenska Åkerbruket"
90,767 Swedish
Applications
Anonymised job applications. The corpus is protected, contact Lena Rogström (lena.rogstroem@svenska.gu.se) for more information and access.
26,228 Swedish
Argumentation sentences 1.0
A translated corpus for classifying sentence stance in relation to a topic.
Swedish
Collection
ASPAC
The Amsterdam Slavic Parallel Aligned Corpus
Swedish, Belarusian, Bulgarian, Czech, German, Lower Sorbian, Modern Greek (1453-), English, Spanish, French, Croatian, Upper Sorbian, Latin, Macedonian, Dutch, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Serbian, Slavomolisano, Turkmen, Ukrainian
ASPAC: Swedish
The Swedish part of The Amsterdam Slavic Parallel Aligned Corpus
773,703 Swedish
ASPAC: Swedish-Belarussian
Part of The Amsterdam Slavic Parallel Aligned Corpus
401,158 Swedish, Belarusian
ASPAC: Swedish-Bulgarian
Part of The Amsterdam Slavic Parallel Aligned Corpus
667,092 Swedish, Bulgarian
ASPAC: Swedish-Croatian
Part of The Amsterdam Slavic Parallel Aligned Corpus
992,471 Swedish, Croatian
ASPAC: Swedish-Czech
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,438,880 Swedish, Czech
ASPAC: Swedish-Dutch
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,549,106 Swedish, Dutch
ASPAC: Swedish-English
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,516,943 Swedish, English
ASPAC: Swedish-French
Part of The Amsterdam Slavic Parallel Aligned Corpus
341,914 Swedish, French
ASPAC: Swedish-German
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,580,660 Swedish, German
ASPAC: Swedish-Greek
Part of The Amsterdam Slavic Parallel Aligned Corpus
303,518 Modern Greek (1453-), Swedish
ASPAC: Swedish-Italian
Part of The Amsterdam Slavic Parallel Aligned Corpus
91,166 Swedish, Italian
ASPAC: Swedish-Latin
Part of The Amsterdam Slavic Parallel Aligned Corpus
134,180 Swedish, Latin
ASPAC: Swedish-Lower Sorbian
Part of The Amsterdam Slavic Parallel Aligned Corpus
36,551 Swedish, Lower Sorbian
ASPAC: Swedish-Macedonian
Part of The Amsterdam Slavic Parallel Aligned Corpus
602,313 Swedish, Macedonian
ASPAC: Swedish-Molise Slavik
Part of The Amsterdam Slavic Parallel Aligned Corpus
35,279 Slavomolisano, Swedish
ASPAC: Swedish-Polish
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,467,390 Swedish, Polish
ASPAC: Swedish-Portuguese
Part of The Amsterdam Slavic Parallel Aligned Corpus
270,241 Swedish, Portuguese
ASPAC: Swedish-Romanian
Part of The Amsterdam Slavic Parallel Aligned Corpus
93,861 Swedish, Romanian
ASPAC: Swedish-Russian
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,466,745 Swedish, Russian
ASPAC: Swedish-Serbian (cyrillic)
Part of The Amsterdam Slavic Parallel Aligned Corpus
577,094 Swedish, Serbian
ASPAC: Swedish-Serbian (latin)
Part of The Amsterdam Slavic Parallel Aligned Corpus
505,216 Swedish, Serbian
ASPAC: Swedish-Slovak
Part of The Amsterdam Slavic Parallel Aligned Corpus
554,510 Swedish, Slovak
ASPAC: Swedish-Slovene
Part of The Amsterdam Slavic Parallel Aligned Corpus
579,527 Swedish, Slovenian
ASPAC: Swedish-Spanish
Part of The Amsterdam Slavic Parallel Aligned Corpus
61,931 Swedish, Spanish
ASPAC: Swedish-Turkmen
Part of The Amsterdam Slavic Parallel Aligned Corpus
31,397 Swedish, Turkmen
ASPAC: Swedish-Ukrainian
Part of The Amsterdam Slavic Parallel Aligned Corpus
453,836 Swedish, Ukrainian
ASPAC: Swedish-Upper Sorbian
Part of The Amsterdam Slavic Parallel Aligned Corpus
85,146 Swedish, Upper Sorbian
ASU
Structural development of the second language
643,949 Swedish
August Strindberg's letters
Part of the collected works of August Strindberg
1,507,958 Swedish
August Strindberg's novels
Part of the collected works of August Strindberg
4,309,037 Swedish
Bellman
Collected works of C.M. Bellman
452,030 Swedish
Betänkande ang. läroböcker (1882)
A report from 1882, digitized by the Gothenburg University Library
41,521 Swedish
Biblioteksbladet
The earliest volumes of "Biblioteksbladet: Organ för Sveriges allmänna biblioteksförening" from 1916–1940, digitized by Project Runeberg
4,595,593 Swedish
Collection
Bicameral Riksdag
Collection of textual documents from the Swedish bicameral parliament
Swedish
Bicameral riksdag: Government official investigations
Part of the data set "Bicameral Riksdag"
59,266,835 Swedish
Bicameral riksdag: Letters of the Riksdag
Part of the data set "Bicameral Riksdag"
29,775,566 Swedish
Bicameral riksdag: Motions
Part of the data set "Bicameral Riksdag"
73,189,180 Swedish
Bicameral riksdag: Narratives and accounts
Part of the data set "Bicameral Riksdag"
61,348,401 Swedish
Bicameral riksdag: Propositions and letters
Part of the data set "Bicameral Riksdag"
319,201,218 Swedish
Bicameral riksdag: Protocols
Part of the data set "Bicameral Riksdag"
327,554,657 Swedish
Bicameral riksdag: Register
Part of the data set "Bicameral Riksdag"
23,323,395 Swedish
Bicameral riksdag: Regulations
Part of the data set "Bicameral Riksdag"
2,628,009 Swedish
Bicameral riksdag: Reports, memorandums and opinions
Part of the data set "Bicameral Riksdag"
195,467,124 Swedish
Bicameral riksdag: The constitution of the Riksdag
Part of the data set "Bicameral Riksdag"
83,964 Swedish
Collection
Blog mix
Material from a selection of Swedish blogs. Regularly updated.
Swedish
Blog mix 1998
Material from a selection of Swedish blogs. Regularly updated.
30,939 Swedish
Blog mix 1999
Material from a selection of Swedish blogs. Regularly updated.
604,019 Swedish
Blog mix 2000
Material from a selection of Swedish blogs. Regularly updated.
188,779 Swedish
Blog mix 2001
Material from a selection of Swedish blogs. Regularly updated.
326,659 Swedish
Blog mix 2002
Material from a selection of Swedish blogs. Regularly updated.
242,723 Swedish
Blog mix 2003
Material from a selection of Swedish blogs. Regularly updated.
271,877 Swedish
Blog mix 2004
Material from a selection of Swedish blogs. Regularly updated.
638,967 Swedish
Blog mix 2005
Material from a selection of Swedish blogs. Regularly updated.
4,800,032 Swedish
Blog mix 2006
Material from a selection of Swedish blogs. Regularly updated.
8,106,551 Swedish
Blog mix 2007
Material from a selection of Swedish blogs. Regularly updated.
19,096,258 Swedish
Blog mix 2008
Material from a selection of Swedish blogs. Regularly updated.
43,703,790 Swedish
Blog mix 2009
Material from a selection of Swedish blogs. Regularly updated.
75,113,677 Swedish
Blog mix 2010
Material from a selection of Swedish blogs. Regularly updated.
97,435,693 Swedish
Blog mix 2011
Material from a selection of Swedish blogs. Regularly updated.
100,591,617 Swedish
Blog mix 2012
Material from a selection of Swedish blogs. Regularly updated.
80,041,223 Swedish
Blog mix 2013
Material from a selection of Swedish blogs. Regularly updated.
62,098,899 Swedish
Blog mix 2014
Material from a selection of Swedish blogs. Regularly updated.
40,133,589 Swedish
Blog mix unknown date
Material from a selection of Swedish blogs. Regularly updated.
35,028,559 Swedish
Bloggmix 2015
Material from a selection of Swedish blogs. Regularly updated.
27,835,518 Swedish
Bloggmix 2016
Material from a selection of Swedish blogs. Regularly updated.
17,699,703 Swedish
Bloggmix 2017
Material from a selection of Swedish blogs. Regularly updated.
1,669,477 Swedish
Bonnier novels I (1976–77)
A corpus of 69 Bonnier novels from 1976–77
6,578,675 Swedish
Bonniers novels II (1980–81)
A corpus of 60 Bonnier novels from 1980–81
4,304,271 Swedish
Caafimaad 1983
1,521 Somali
COCTAILL
Corpus of coursebooks used for teaching L2 Swedish. Annotated manually for text structure and pedagogical/didactical categories; automatically linguistically annotated.
710,251 Swedish
Corpus word statistics
Accumulated word statistics from many of our modern Swedish corpora
Dagens Arena
News texts from dagensarena.se
10,908,463 Swedish
DaLAJ-GED-SuperLim 2.0
Dataset for Linguistic Acceptability Judgments (and more), v.2.0
Swedish
Dalin: Then Swänska Argus 1732-1734
Manual transcription of Then Swänska Argus by Olof von Dalin, Stockholm, 1732–1734. For OCR analysis.
213,399 Swedish
Dalpilen 1860's
Part of the collection Kubhist2
8,984,628 Swedish
Databank of 1977 Spanish Press
Text from two Spanish newspapers from 1977. Part of SOL - Spanish Online.
2,166,383 Spanish
Databank of Eleven Spanish Novels 1951–1971
Corpus consisting of 11 Spanish novels. Part of SOL - Spanish Online.
1,248,184 Spanish
Detective Department
Data from the Detective Department at the Gothenburg police, from late 1800s to early 1900s.
Swedish
Detektiva avdelningen
1,343,709 Swedish
DiabetologNytt (1996–1999)
The paper DiabetologNytt (Diabetologynews) 1996-1999
228,313 Swedish
Diverse tidningar
Fourteen annual volumes of eight different periodicals (1810–1933) digitized by Project Runeberg
5,358,564 Swedish
DN 1987
Dagens Nyheter 1987
5,129,248 Swedish
Dramawebben (demo)
Texts from Dramawebben, a digital archive of free Swedish drama.
790,456 Swedish
DReaM
A multilingual corpus of linguistic descriptions of the world's natural languages.
75,027,790 English
DReaM-Copyright-Protected
A multilingual corpus of linguistic descriptions of the world's natural languages.
225,617,801 English
Ekeblad's letters
Ekeblad's letter corpus is based on the digital edition Breven till Claes 1639–1655 by Sture Allén.
52,253 Swedish
Ethnological question lists
Question lists of the Nordic Museum
57,606,576 Swedish
Eukalyptus Treebank of Written Swedish
A treebank with written Swedish data, with parts-of-speech, TIGER-style syntax, multiword expressions and sense annotation
99,913 Swedish
Collection
Europarl
European Parliament Proceedings Parallel Corpus
Swedish, Danish, German, Modern Greek (1453-), English, Spanish, Finnish, French, Italian, Dutch, Portuguese
Europarl: Swedish
The Swedish part of European Parliament Proceedings Parallel Corpus
33,406,922 Swedish