Språkbanken Text is a part of Språkbanken.
Language resources
On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
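Most of the downloadable corpus files listed below end in `.xml.bz2`, that is, bzip2-compressed XML. As a minimal sketch, assuming such a file has already been downloaded locally, the following Python function streams the file and counts how often each XML element name occurs. The function name `count_elements` is made up for illustration, and nothing is assumed about the element names used in any particular corpus.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Stream a bzip2-compressed XML file and count elements per tag name."""
    counts = Counter()
    # bz2.open decompresses on the fly, so the file is never unpacked on disk.
    with bz2.open(path, "rb") as handle:
        # iterparse yields each element once its closing tag has been read.
        for _event, element in ET.iterparse(handle, events=("end",)):
            counts[element.tag] += 1
            # Discard the element's children, attributes and text after counting.
            element.clear()
    return counts
```

For example, `count_elements("attasidor.xml.bz2")` would return a `Counter` keyed by element name; the path is a placeholder for wherever the file was saved.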
All (1323)
Collections (30)
Corpora (1198)
Lexicons (62)
Training and evaluation data (15)
Models (48)
Languages: Swedish, Albanian, Belarusian, Blissymbols, Bosnian, Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finland Swedish, Finnish, French, German, Icelandic, Iranian Persian, Italian, Kele (Papua New Guinea), Kurdish, Latin, Latvian, Lower Sorbian, Macedonian, Modern Greek (1453-), Multiple languages, Norwegian, Norwegian Bokmål, Old English (ca. 450-1100), Old High German (ca. 750-1050), Old Norse, Old Saxon, Polish, Portuguese, Romanian, Russian, Serbian, Slavomolisano, Slovak, Slovenian, Somali, Spanish, Turkish, Turkmen, Ukrainian, Upper Sorbian, Xhosa
Resource
Number of tokens
Language
Access
8 Sidor
News articles from 8 SIDOR.
4,998,634
Swedish
Dataset:
attasidor.xml.bz2
2024-03-07 – 150.01 MB – CC BY 4.0
Word statistics:
stats_attasidor.csv
2024-02-13 – 4.76 MB – CC BY 4.0
Academic texts: Humanities
A corpus of academic texts in the humanities
14,454,573
Swedish
Dataset:
sweachum.xml.bz2
2017-05-19 – 208.67 MB – CC BY 4.0
Word statistics:
stats_SWEACHUM.txt
2017-05-21 – 25.63 MB – CC BY 4.0
Academic texts: Social science
A corpus of academic texts in the social sciences
10,855,954
Swedish
Dataset:
sweacsam.xml.bz2
2017-06-07 – 157.41 MB – CC BY 4.0
Word statistics:
stats_SWEACSAM.txt
2017-05-21 – 18.43 MB – CC BY 4.0
Af Soomaali 1993-94
9,247
Somali
Dataset:
somali-1993-94.xml.bz2
2024-01-04 – 19.45 KB – CC BY 4.0
Af-Soomaali 2016 Somaliland
51,236
Somali
Dataset:
somali-as-2016.xml.bz2
2024-01-04 – 109.54 KB – CC BY 4.0
Aftonbladet 1830's
Part of the collection Kubhist2
29,870,739
Swedish
Dataset:
kubhist2-aftonbladet-1830.xml.bz2
2024-01-14 – 1.02 GB – CC BY 4.0
Word statistics:
stats_kubhist2-aftonbladet-1830.csv
2024-01-11 – 88.16 MB – CC BY 4.0
Agriculture
Agricultural manuals: "Engelska Åker-Mannen" and "En Grundelig Kundskap Om Swenska Åkerbruket"
90,767
Swedish
Dataset:
akerbruk.xml.bz2
2015-05-19 – 898.54 KB – CC BY 4.0
Word statistics:
stats_AKERBRUK.txt
2014-04-29 – 931.32 KB – CC BY 4.0
Argumentation sentences 1.0
A translated corpus for classifying sentence stance in relation to a topic.
Swedish
Dataset:
argumentation-sentences.zip
2023-03-30 – 827.04 KB – CC BY 4.0
Collection
ASPAC
The Amsterdam Slavic Parallel Aligned Corpus
Swedish, Belarusian, Bulgarian, Czech, German, Lower Sorbian, Modern Greek (1453-), English, Spanish, French, Croatian, Upper Sorbian, Latin, Macedonian, Dutch, Polish, Portuguese, Romanian, Russian, Kele (Papua New Guinea), Slovak, Slovenian, Serbian, Slavomolisano, Turkmen, Ukrainian
See 27 collected resources
ASPAC: Swedish
The Swedish part of The Amsterdam Slavic Parallel Aligned Corpus
773,703
Swedish
Dataset:
aspacsv.xml.bz2
2021-07-08 – 14.28 MB – CC BY 4.0
Word statistics:
stats_aspacsv.csv
2021-07-09 – 3.01 MB – CC BY 4.0
ASPAC: Swedish-Belarussian
Part of The Amsterdam Slavic Parallel Aligned Corpus
401,158
Swedish, Belarusian
Dataset:
aspacsvbe-sv.xml.bz2
2016-11-03 – 2.33 MB – CC BY 4.0
Dataset:
aspacsvbe-be.xml.bz2
2016-11-03 – 772.78 KB – CC BY 4.0
Word statistics:
stats_ASPACSVBE-SV.txt
2016-11-06 – 1.05 MB – CC BY 4.0
Word statistics:
stats_ASPACSVBE-BE.txt
2016-11-15 – 1.21 MB – CC BY 4.0
ASPAC: Swedish-Bulgarian
Part of The Amsterdam Slavic Parallel Aligned Corpus
667,092
Swedish, Bulgarian
Dataset:
aspacsvbg-sv.xml.bz2
2016-11-02 – 4.08 MB – CC BY 4.0
Dataset:
aspacsvbg-bg.xml.bz2
2016-11-02 – 1.83 MB – CC BY 4.0
Word statistics:
stats_ASPACSVBG-SV.txt
2016-11-15 – 1.66 MB – CC BY 4.0
Word statistics:
stats_ASPACSVBG-BG.txt
2016-11-15 – 1.42 MB – CC BY 4.0
ASPAC: Swedish-Croatian
Part of The Amsterdam Slavic Parallel Aligned Corpus
992,471
Swedish, Croatian
Dataset:
aspacsvhr-sv.xml.bz2
2016-11-02 – 6.08 MB – CC BY 4.0
Dataset:
aspacsvhr-hr.xml.bz2
2016-11-03 – 1.88 MB – CC BY 4.0
Word statistics:
stats_ASPACSVHR-SV.txt
2016-11-06 – 2.38 MB – CC BY 4.0
Word statistics:
stats_ASPACSVHR-HR.txt
2016-11-15 – 1.92 MB – CC BY 4.0
ASPAC: Swedish-Czech
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,438,880
Swedish, Czech
Dataset:
aspacsvcs-sv.xml.bz2
2016-11-03 – 9.03 MB – CC BY 4.0
Dataset:
aspacsvcs-cs.xml.bz2
2016-11-03 – 2.68 MB – CC BY 4.0
Word statistics:
stats_ASPACSVCS-SV.txt
2016-11-06 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVCS-CS.txt
2016-11-15 – 2.66 MB – CC BY 4.0
ASPAC: Swedish-Dutch
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,549,106
Swedish, Dutch
Dataset:
aspacsvnl-sv.xml.bz2
2016-11-02 – 9.03 MB – CC BY 4.0
Dataset:
aspacsvnl-nl.xml.bz2
2016-11-03 – 4.02 MB – CC BY 4.0
Word statistics:
stats_ASPACSVNL-SV.txt
2016-11-15 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVNL-NL.txt
2016-11-15 – 1.46 MB – CC BY 4.0
ASPAC: Swedish-English
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,516,943
Swedish, English
Dataset:
aspacsven-sv.xml.bz2
2016-11-25 – 9.1 MB – CC BY 4.0
Dataset:
aspacsven-en.xml.bz2
2016-11-25 – 3.87 MB – CC BY 4.0
Word statistics:
stats_ASPACSVEN-SV.txt
2016-11-27 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVEN-EN.txt
2016-11-27 – 1.18 MB – CC BY 4.0
ASPAC: Swedish-French
Part of The Amsterdam Slavic Parallel Aligned Corpus
341,914
Swedish, French
Dataset:
aspacsvfr-sv.xml.bz2
2016-11-25 – 1.95 MB – CC BY 4.0
Dataset:
aspacsvfr-fr.xml.bz2
2016-11-25 – 1008.92 KB – CC BY 4.0
Word statistics:
stats_ASPACSVFR-SV.txt
2016-11-27 – 1.06 MB – CC BY 4.0
Word statistics:
stats_ASPACSVFR-FR.txt
2016-11-27 – 592.08 KB – CC BY 4.0
ASPAC: Swedish-German
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,580,660
Swedish, German
Dataset:
aspacsvde-sv.xml.bz2
2016-10-31 – 9.07 MB – CC BY 4.0
Dataset:
aspacsvde-de.xml.bz2
2016-10-31 – 4.64 MB – CC BY 4.0
Word statistics:
stats_ASPACSVDE-SV.txt
2016-11-15 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVDE-DE.txt
2016-11-15 – 2.21 MB – CC BY 4.0
ASPAC: Swedish-Greek
Part of The Amsterdam Slavic Parallel Aligned Corpus
303,518
Modern Greek (1453-), Swedish
Dataset:
aspacsvel-sv.xml.bz2
2016-11-02 – 1.94 MB – CC BY 4.0
Dataset:
aspacsvel-el.xml.bz2
2016-11-03 – 570.94 KB – CC BY 4.0
Word statistics:
stats_ASPACSVEL-SV.txt
2016-11-15 – 1.06 MB – CC BY 4.0
Word statistics:
stats_ASPACSVEL-EL.txt
2016-11-15 – 751.07 KB – CC BY 4.0
ASPAC: Swedish-Italian
Part of The Amsterdam Slavic Parallel Aligned Corpus
91,166
Swedish, Italian
Dataset:
aspacsvit-sv.xml.bz2
2016-11-25 – 519.56 KB – CC BY 4.0
Dataset:
aspacsvit-it.xml.bz2
2016-11-25 – 249.64 KB – CC BY 4.0
Word statistics:
stats_ASPACSVIT-SV.txt
2016-11-27 – 376.03 KB – CC BY 4.0
Word statistics:
stats_ASPACSVIT-IT.txt
2016-11-27 – 250.04 KB – CC BY 4.0
ASPAC: Swedish-Latin
Part of The Amsterdam Slavic Parallel Aligned Corpus
134,180
Swedish, Latin
Dataset:
aspacsvla-sv.xml.bz2
2016-11-03 – 792.29 KB – CC BY 4.0
Dataset:
aspacsvla-la.xml.bz2
2016-11-03 – 372.16 KB – CC BY 4.0
Word statistics:
stats_ASPACSVLA-SV.txt
2016-11-15 – 477.42 KB – CC BY 4.0
Word statistics:
stats_ASPACSVLA-LA.txt
2016-11-06 – 468.89 KB – CC BY 4.0
ASPAC: Swedish-Lower Sorbian
Part of The Amsterdam Slavic Parallel Aligned Corpus
36,551
Swedish, Lower Sorbian
Dataset:
aspacsvdsb-sv.xml.bz2
2016-11-03 – 195.53 KB – CC BY 4.0
Dataset:
aspacsvdsb-dsb.xml.bz2
2016-11-03 – 72.76 KB – CC BY 4.0
Word statistics:
stats_ASPACSVDSB-SV.txt
2016-11-06 – 193.69 KB – CC BY 4.0
Word statistics:
stats_ASPACSVDSB-DSB.txt
2016-11-15 – 152.29 KB – CC BY 4.0
ASPAC: Swedish-Macedonian
Part of The Amsterdam Slavic Parallel Aligned Corpus
602,313
Swedish, Macedonian
Dataset:
aspacsvmk-sv.xml.bz2
2016-11-02 – 3.76 MB – CC BY 4.0
Dataset:
aspacsvmk-mk.xml.bz2
2016-11-03 – 1.06 MB – CC BY 4.0
Word statistics:
stats_ASPACSVMK-SV.txt
2016-11-06 – 1.59 MB – CC BY 4.0
Word statistics:
stats_ASPACSVMK-MK.txt
2016-11-15 – 1.14 MB – CC BY 4.0
ASPAC: Swedish-Molise Slavic
Part of The Amsterdam Slavic Parallel Aligned Corpus
35,279
Slavomolisano, Swedish
Dataset:
aspacsvsvm-sv.xml.bz2
2016-11-03 – 194.99 KB – CC BY 4.0
Dataset:
aspacsvsvm-svm.xml.bz2
2016-11-03 – 63.89 KB – CC BY 4.0
Word statistics:
stats_ASPACSVSVM-SV.txt
2016-11-15 – 193.69 KB – CC BY 4.0
Word statistics:
stats_ASPACSVSVM-SVM.txt
2016-11-06 – 110.31 KB – CC BY 4.0
ASPAC: Swedish-Polish
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,467,390
Swedish, Polish
Dataset:
aspacsvpl-sv.xml.bz2
2016-11-02 – 9.04 MB – CC BY 4.0
Dataset:
aspacsvpl-pl.xml.bz2
2016-11-02 – 4.44 MB – CC BY 4.0
Word statistics:
stats_ASPACSVPL-SV.txt
2016-11-06 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVPL-PL.txt
2016-11-15 – 4.03 MB – CC BY 4.0
ASPAC: Swedish-Portuguese
Part of The Amsterdam Slavic Parallel Aligned Corpus
270,241
Swedish, Portuguese
Dataset:
aspacsvpt-sv.xml.bz2
2016-11-25 – 1.55 MB – CC BY 4.0
Dataset:
aspacsvpt-pt.xml.bz2
2016-11-03 – 770.36 KB – CC BY 4.0
Word statistics:
stats_ASPACSVPT-SV.txt
2016-11-27 – 855.21 KB – CC BY 4.0
Word statistics:
stats_ASPACSVPT-PT.txt
2016-11-27 – 509.74 KB – CC BY 4.0
ASPAC: Swedish-Romanian
Part of The Amsterdam Slavic Parallel Aligned Corpus
93,861
Swedish, Romanian
Dataset:
aspacsvro-sv.xml.bz2
2016-11-03 – 517.08 KB – CC BY 4.0
Dataset:
aspacsvro-ro.xml.bz2
2016-11-02 – 276.74 KB – CC BY 4.0
Word statistics:
stats_ASPACSVRO-SV.txt
2016-11-15 – 376.03 KB – CC BY 4.0
Word statistics:
stats_ASPACSVRO-RO.txt
2016-11-06 – 302.43 KB – CC BY 4.0
ASPAC: Swedish-Russian
Part of The Amsterdam Slavic Parallel Aligned Corpus
1,466,745
Swedish, Russian
Dataset:
aspacsvru-sv.xml.bz2
2016-11-28 – 9.08 MB – CC BY 4.0
Dataset:
aspacsvru-ru.xml.bz2
2016-11-28 – 4.41 MB – CC BY 4.0
Word statistics:
stats_ASPACSVRU-SV.txt
2016-12-04 – 2.95 MB – CC BY 4.0
Word statistics:
stats_ASPACSVRU-RU.txt
2016-12-04 – 3.67 MB – CC BY 4.0
ASPAC: Swedish-Serbian (cyrillic)
Part of The Amsterdam Slavic Parallel Aligned Corpus
577,094
Serbian, Swedish
Dataset:
aspacsvsbc-sv.xml.bz2
2016-11-03 – 3.47 MB – CC BY 4.0
Dataset:
aspacsvsbc-sbc.xml.bz2
2016-11-03 – 1006.26 KB – CC BY 4.0
Word statistics:
stats_ASPACSVSBC-SV.txt
2016-11-06 – 1.33 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSBC-SBC.txt
2016-11-15 – 1.25 MB – CC BY 4.0
ASPAC: Swedish-Serbian (latin)
Part of The Amsterdam Slavic Parallel Aligned Corpus
505,216
Swedish, Serbian
Dataset:
aspacsvsr-sv.xml.bz2
2016-11-03 – 3.11 MB – CC BY 4.0
Dataset:
aspacsvsr-sr.xml.bz2
2016-11-03 – 956.03 KB – CC BY 4.0
Word statistics:
stats_ASPACSVSR-SV.txt
2016-11-15 – 1.45 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSR-SR.txt
2016-11-15 – 1.2 MB – CC BY 4.0
ASPAC: Swedish-Slovak
Part of The Amsterdam Slavic Parallel Aligned Corpus
554,510
Swedish, Slovak
Dataset:
aspacsvsk-sv.xml.bz2
2016-11-02 – 3.41 MB – CC BY 4.0
Dataset:
aspacsvsk-sk.xml.bz2
2016-11-03 – 1.56 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSK-SV.txt
2016-11-15 – 1.51 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSK-SK.txt
2016-11-15 – 1.27 MB – CC BY 4.0
ASPAC: Swedish-Slovene
Part of The Amsterdam Slavic Parallel Aligned Corpus
579,527
Swedish, Slovenian
Dataset:
aspacsvsl-sv.xml.bz2
2016-11-03 – 3.44 MB – CC BY 4.0
Dataset:
aspacsvsl-sl.xml.bz2
2016-11-02 – 1.69 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSL-SV.txt
2016-11-06 – 1.51 MB – CC BY 4.0
Word statistics:
stats_ASPACSVSL-SL.txt
2016-11-06 – 1.31 MB – CC BY 4.0
ASPAC: Swedish-Spanish
Part of The Amsterdam Slavic Parallel Aligned Corpus
61,931
Swedish, Spanish
Dataset:
aspacsves-sv.xml.bz2
2016-11-03 – 325.61 KB – CC BY 4.0
Dataset:
aspacsves-es.xml.bz2
2016-11-03 – 145.87 KB – CC BY 4.0
Word statistics:
stats_ASPACSVES-SV.txt
2016-11-15 – 252.63 KB – CC BY 4.0
Word statistics:
stats_ASPACSVES-ES.txt
2016-11-15 – 176.15 KB – CC BY 4.0
ASPAC: Swedish-Turkmen
Part of The Amsterdam Slavic Parallel Aligned Corpus
31,397
Swedish, Turkmen
Dataset:
aspacsvtk-sv.xml.bz2
2016-11-02 – 196.79 KB – CC BY 4.0
Dataset:
aspacsvtk-tk.xml.bz2
2016-11-03 – 61.13 KB – CC BY 4.0
Word statistics:
stats_ASPACSVTK-SV.txt
2016-11-15 – 193.69 KB – CC BY 4.0
Word statistics:
stats_ASPACSVTK-TK.txt
2016-11-15 – 181.33 KB – CC BY 4.0
ASPAC: Swedish-Ukrainian
Part of The Amsterdam Slavic Parallel Aligned Corpus
453,836
Swedish, Ukrainian
Dataset:
aspacsvuk-sv.xml.bz2
2016-11-02 – 2.67 MB – CC BY 4.0
Dataset:
aspacsvuk-uk.xml.bz2
2016-11-03 – 869.41 KB – CC BY 4.0
Word statistics:
stats_ASPACSVUK-SV.txt
2016-11-06 – 1.15 MB – CC BY 4.0
Word statistics:
stats_ASPACSVUK-UK.txt
2016-11-15 – 1.41 MB – CC BY 4.0
ASPAC: Swedish-Upper Sorbian
Part of The Amsterdam Slavic Parallel Aligned Corpus
85,146
Swedish, Upper Sorbian
Dataset:
aspacsvhsb-sv.xml.bz2
2016-11-03 – 476.11 KB – CC BY 4.0
Dataset:
aspacsvhsb-hsb.xml.bz2
2016-11-03 – 162.79 KB – CC BY 4.0
Word statistics:
stats_ASPACSVHSB-SV.txt
2016-11-15 – 330.74 KB – CC BY 4.0
Word statistics:
stats_ASPACSVHSB-HSB.txt
2016-11-15 – 278.48 KB – CC BY 4.0
ASU
Structural development of the second language
643,949
Swedish
Word statistics:
stats_asu.csv
2022-04-06 – 1.71 MB – CC BY 4.0
August Strindberg's letters
Part of the collected works of August Strindberg
1,507,958
Swedish
Dataset:
strindbergbrev.xml.bz2
2017-04-26 – 20.39 MB – CC BY 4.0
Word statistics:
stats_STRINDBERGBREV.txt
2017-04-30 – 5.54 MB – CC BY 4.0
August Strindberg's novels
Part of the collected works of August Strindberg
4,309,037
Swedish
Dataset:
strindbergromaner.xml.bz2
2017-06-20 – 63.43 MB – CC BY 4.0
Word statistics:
stats_STRINDBERGROMANER.txt
2017-06-25 – 11.25 MB – CC BY 4.0
Bellman
Collected works of C.M. Bellman
452,030
Swedish
Dataset:
bellman.xml.bz2
2015-11-09 – 4.83 MB – CC BY 4.0
Word statistics:
stats_BELLMAN.txt
2015-11-15 – 2.46 MB – CC BY 4.0
Betänkande ang. läroböcker (1882)
A report from 1882, digitized by the Gothenburg University Library
41,521
Swedish
Dataset:
betankande.xml.bz2
2015-12-11 – 403.44 KB – CC BY 4.0
Word statistics:
stats_BETANKANDE.txt
2015-12-13 – 398.51 KB – CC BY 4.0
Biblioteksbladet
The earliest volumes of "Biblioteksbladet: Organ för Sveriges allmänna biblioteksförening" from 1916–1940, digitized by Project Runeberg
4,595,593
Swedish
Dataset:
runeberg-biblblad.xml.bz2
2015-05-19 – 52.49 MB – CC BY 4.0
Word statistics:
stats_RUNEBERG-BIBLBLAD.txt
2015-06-25 – 14.48 MB – CC BY 4.0
Collection
Bicameral Riksdag
Collection of text documents from the Swedish bicameral parliament (Riksdag)
Swedish
See 10 collected resources
Bicameral riksdag: Government official investigations
Part of the data set "Bicameral Riksdag"
59,266,835
Swedish
Dataset:
tkr-utredningar-kombet-sou.xml.bz2
2023-12-12 – 986.58 MB – CC BY 4.0
Word statistics:
stats_tkr-utredningar-kombet-sou.csv
2023-12-13 – 223.48 MB – CC BY 4.0
Bicameral riksdag: Letters of the Riksdag
Part of the data set "Bicameral Riksdag"
29,775,566
Swedish
Dataset:
tkr-rskr.xml.bz2
2023-12-11 – 476.4 MB – CC BY 4.0
Word statistics:
stats_tkr-rskr.csv
2023-12-12 – 133.13 MB – CC BY 4.0
Bicameral riksdag: Motions
Part of the data set "Bicameral Riksdag"
73,189,180
Swedish
Dataset:
tkr-motioner.xml.bz2
2023-12-11 – 1.42 GB – CC BY 4.0
Word statistics:
stats_tkr-motioner.csv
2023-12-12 – 391.82 MB – CC BY 4.0
Bicameral riksdag: Narratives and accounts
Part of the data set "Bicameral Riksdag"
61,348,401
Swedish
Dataset:
tkr-berattelser-redogorelser-frsrdg.xml.bz2
2023-12-12 – 1 GB – CC BY 4.0
Word statistics:
stats_tkr-berattelser-redogorelser-frsrdg.csv
2023-12-13 – 332.03 MB – CC BY 4.0
Bicameral riksdag: Propositions and letters
Part of the data set "Bicameral Riksdag"
319,201,218
Swedish
Dataset:
tkr-propositioner-skrivelser.xml.bz2
2023-12-12 – 5.94 GB – CC BY 4.0
Word statistics:
stats_tkr-propositioner-skrivelser.csv
2023-12-13 – 839.16 MB – CC BY 4.0
Bicameral riksdag: Protocols
Part of the data set "Bicameral Riksdag"
327,554,657
Swedish
Dataset:
tkr-protokoll.xml.bz2
2023-12-12 – 6.08 GB – CC BY 4.0
Word statistics:
stats_tkr-protokoll.csv
2023-12-13 – 765.82 MB – CC BY 4.0
Bicameral riksdag: Register
Part of the data set "Bicameral Riksdag"
23,323,395
Swedish
Dataset:
tkr-register.xml.bz2
2023-12-11 – 285.18 MB – CC BY 4.0
Word statistics:
stats_tkr-register.csv
2023-12-12 – 100.33 MB – CC BY 4.0
Bicameral riksdag: Regulations
Part of the data set "Bicameral Riksdag"
2,628,009
Swedish
Dataset:
tkr-reglementen-sfs.xml.bz2
2023-12-11 – 43.12 MB – CC BY 4.0
Word statistics:
stats_tkr-reglementen-sfs.csv
2023-12-12 – 28.6 MB – CC BY 4.0
Bicameral riksdag: Reports, memorandums and opinions
Part of the data set "Bicameral Riksdag"
195,467,124
Swedish
Dataset:
tkr-bet-mem-utl.xml.bz2
2023-12-11 – 3.59 GB – CC BY 4.0
Word statistics:
stats_tkr-bet-mem-utl.csv
2023-12-12 – 615.55 MB – CC BY 4.0
Bicameral riksdag: The constitution of the Riksdag
Part of the data set "Bicameral Riksdag"
83,964
Swedish
Dataset:
tkr-riksdagens-forfattningssamling-rfs.xml.bz2
2023-12-11 – 1.56 MB – CC BY 4.0
Word statistics:
stats_tkr-riksdagens-forfattningssamling-rfs.csv
2023-11-28 – 954.85 KB – CC BY 4.0
Collection
Blog mix
Material from a selection of Swedish blogs. Regularly updated.
Swedish
See 21 collected resources
Blog mix 1998
Material from a selection of Swedish blogs. Regularly updated.
30,939
Swedish
Dataset:
bloggmix1998.xml.bz2
2017-02-14 – 453.05 KB – CC BY 4.0
Word statistics:
stats_BLOGGMIX1998.txt
2017-02-19 – 425.44 KB – CC BY 4.0
Blog mix 1999
Material from a selection of Swedish blogs. Regularly updated.
604,019
Swedish
Dataset:
bloggmix1999.xml.bz2
2017-02-14 – 9.27 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX1999.txt
2017-02-19 – 2.75 MB – CC BY 4.0
Blog mix 2000
Material from a selection of Swedish blogs. Regularly updated.
188,779
Swedish
Dataset:
bloggmix2000.xml.bz2
2017-02-22 – 2.69 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2000.txt
2017-02-19 – 1.28 MB – CC BY 4.0
Blog mix 2001
Material from a selection of Swedish blogs. Regularly updated.
326,659
Swedish
Dataset:
bloggmix2001.xml.bz2
2017-02-14 – 4.7 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2001.txt
2017-02-19 – 2.05 MB – CC BY 4.0
Blog mix 2002
Material from a selection of Swedish blogs. Regularly updated.
242,723
Swedish
Dataset:
bloggmix2002.xml.bz2
2017-02-14 – 3.4 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2002.txt
2017-02-19 – 1.55 MB – CC BY 4.0
Blog mix 2003
Material from a selection of Swedish blogs. Regularly updated.
271,877
Swedish
Dataset:
bloggmix2003.xml.bz2
2017-02-14 – 3.76 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2003.txt
2017-02-19 – 1.83 MB – CC BY 4.0
Blog mix 2004
Material from a selection of Swedish blogs. Regularly updated.
638,967
Swedish
Dataset:
bloggmix2004.xml.bz2
2017-02-14 – 9.03 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2004.txt
2017-02-19 – 2.85 MB – CC BY 4.0
Blog mix 2005
Material from a selection of Swedish blogs. Regularly updated.
4,800,032
Swedish
Dataset:
bloggmix2005.xml.bz2
2017-02-14 – 70.01 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2005.txt
2017-02-19 – 11.01 MB – CC BY 4.0
Blog mix 2006
Material from a selection of Swedish blogs. Regularly updated.
8,106,551
Swedish
Dataset:
bloggmix2006.xml.bz2
2017-02-15 – 123.62 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2006.txt
2017-02-19 – 16.72 MB – CC BY 4.0
Blog mix 2007
Material from a selection of Swedish blogs. Regularly updated.
19,096,258
Swedish
Dataset:
bloggmix2007.xml.bz2
2017-02-15 – 288.92 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2007.txt
2017-02-19 – 27.5 MB – CC BY 4.0
Blog mix 2008
Material from a selection of Swedish blogs. Regularly updated.
43,703,790
Swedish
Dataset:
bloggmix2008.xml.bz2
2017-02-16 – 656.67 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2008.txt
2017-02-19 – 44.33 MB – CC BY 4.0
Blog mix 2009
Material from a selection of Swedish blogs. Regularly updated.
75,113,677
Swedish
Dataset:
bloggmix2009.xml.bz2
2017-02-17 – 1.1 GB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2009.txt
2017-02-19 – 60.62 MB – CC BY 4.0
Blog mix 2010
Material from a selection of Swedish blogs. Regularly updated.
97,435,693
Swedish
Dataset:
bloggmix2010.xml.bz2
2017-02-23 – 1.44 GB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2010.txt
2017-02-26 – 72.48 MB – CC BY 4.0
Blog mix 2011
Material from a selection of Swedish blogs. Regularly updated.
100,591,617
Swedish
Dataset:
bloggmix2011.xml.bz2
2017-02-24 – 1.48 GB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2011.txt
2017-02-26 – 71.79 MB – CC BY 4.0
Blog mix 2012
Material from a selection of Swedish blogs. Regularly updated.
80,041,223
Swedish
Dataset:
bloggmix2012.xml.bz2
2017-02-23 – 1.17 GB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2012.txt
2017-02-26 – 60.09 MB – CC BY 4.0
Blog mix 2013
Material from a selection of Swedish blogs. Regularly updated.
62,098,899
Swedish
Dataset:
bloggmix2013.xml.bz2
2017-02-24 – 930.12 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2013.txt
2017-02-26 – 50.29 MB – CC BY 4.0
Blog mix 2014
Material from a selection of Swedish blogs. Regularly updated.
40,133,589
Swedish
Dataset:
bloggmix2014.xml.bz2
2017-02-23 – 596.24 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2014.txt
2017-02-26 – 37.74 MB – CC BY 4.0
Blog mix unknown date
Material from a selection of Swedish blogs. Regularly updated.
35,028,559
Swedish
Dataset:
bloggmixodat.xml.bz2
2017-02-23 – 511.42 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIXODAT.txt
2017-02-26 – 36.51 MB – CC BY 4.0
Bloggmix 2015
Material from a selection of Swedish blogs. Regularly updated.
27,835,518
Swedish
Dataset:
bloggmix2015.xml.bz2
2017-05-10 – 434.91 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2015.txt
2017-05-10 – 30.57 MB – CC BY 4.0
Bloggmix 2016
Material from a selection of Swedish blogs. Regularly updated.
17,699,703
Swedish
Dataset:
bloggmix2016.xml.bz2
2017-02-22 – 262.98 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2016.txt
2017-02-26 – 23.52 MB – CC BY 4.0
Bloggmix 2017
Material from a selection of Swedish blogs. Regularly updated.
1,669,477
Swedish
Dataset:
bloggmix2017.xml.bz2
2017-02-22 – 23.48 MB – CC BY 4.0
Word statistics:
stats_BLOGGMIX2017.txt
2017-02-26 – 5.84 MB – CC BY 4.0
Bonnier novels I (1976–77)
A corpus of 69 Bonnier novels from 1976–77
6,578,675
Swedish
Dataset:
romi.xml.bz2
2017-10-04 – 135.42 MB – CC BY 4.0
Word statistics:
stats_ROMI.txt
2017-10-08 – 12.78 MB – CC BY 4.0
Bonnier novels II (1980–81)
A corpus of 60 Bonnier novels from 1980–81
4,304,271
Swedish
Dataset:
romii.xml.bz2
2017-03-17 – 62.87 MB – CC BY 4.0
Word statistics:
stats_ROMII.txt
2017-03-19 – 11.09 MB – CC BY 4.0
Caafimaad 1983
1,521
Somali
Dataset:
somali-caafimaad-1983.xml.bz2
2024-01-15 – 4.48 KB – CC BY 4.0
COCTAILL
Corpus of coursebooks used for teaching L2 Swedish. Annotated manually for text structure and pedagogical/didactical categories; automatically linguistically annotated. For more information, see https://spraakbanken.gu.se/forskning/teman/icall/icall-l2-projects/l2-data
710,251
Swedish
Dataset:
coctaill.xml.bz2
2017-10-30 – 16.57 MB – CC BY 4.0
Word statistics:
stats_COCTAILL.txt
2017-11-05 – 3.03 MB – CC BY 4.0
COCTAILL activities & examples
Corpus of coursebooks used for teaching L2 Swedish. Annotated manually for text structure and pedagogical/didactical categories; automatically linguistically annotated.
343,793
Swedish
Word statistics:
stats_COCTAILL-AE.txt
2021-07-04 – 1.71 MB – CC BY 4.0
COCTAILL lesson text
Corpus of coursebooks used for teaching L2 Swedish. Annotated manually for text structure and pedagogical/didactical categories; automatically linguistically annotated.
308,206
Swedish
Word statistics:
stats_COCTAILL-LT.txt
2021-07-04 – 1.84 MB – CC BY 4.0
Corpus of spoken isiXhosa
A corpus of transcribed and annotated recordings of spoken Xhosa.
7,039
Xhosa
Corpus word statistics
Accumulated word statistics from many of our modern Swedish corpora
Word statistics:
stats_all.txt
2022-04-10 – 5.14 GB – CC BY 4.0
Dagens Arena
News texts from dagensarena.se
10,908,463
Swedish
Dataset:
da.xml.bz2
2024-01-02 – 297.94 MB – CC BY 4.0
Word statistics:
stats_da.csv
2024-01-03 – 294.52 MB – CC BY 4.0
DaLAJ-GED-SuperLim 2.0
Dataset for Linguistic Acceptability Judgments (and more), v.2.0
Swedish
Dataset:
dalaj-ged-superlim.zip
2023-04-03 – 1.41 MB – CC BY 4.0
Dataset:
dalaj-ged-tsv.zip
2023-05-20 – 1.15 MB – CC BY 4.0
Dataset:
liuep197-11.pdf
2024-01-25 – 463.74 KB – CC BY 4.0
Dalin: Then Swänska Argus 1732-1734
Manual transcription of Then Swänska Argus by Olof von Dalin, Stockholm, 1732–1734. For OCR analysis.
213,399
Swedish
Dataset:
dalin-then-swaanska-argus-1732-1734.tar.gz
2020-06-12 – 80.21 MB – CC BY 4.0
Dalpilen 1860's
Part of the collection Kubhist2
8,984,628
Swedish
Dataset:
kubhist2-dalpilen-1860.xml.bz2
2024-01-09 – 273.1 MB – CC BY 4.0
Word statistics:
stats_kubhist2-dalpilen-1860.csv
2024-01-10 – 28 MB – CC BY 4.0
Databank of 1977 Spanish Press
Text from two Spanish newspapers from 1977. Part of SOL - Spanish Online.
2,166,383
Spanish
Dataset:
pe77.xml.bz2
2017-11-10 – 7.7 MB – CC BY 4.0
Databank of Eleven Spanish Novels 1951–1971
Corpus consisting of 11 Spanish novels. Part of SOL - Spanish Online.
1,248,184
Spanish
Dataset:
one71.xml.bz2
2017-11-10 – 3.68 MB – CC BY 4.0
Detective Department
Data from the Detective Department at the Gothenburg police, from late 1800s to early 1900s.
Swedish
Dataset:
geocoords.txt
2023-06-20 – 326.73 KB – CC BY 4.0
Dataset:
pixelcoords.txt
2023-06-20 – 182.83 KB – CC BY 4.0
Detektiva avdelningen
1,343,709
Swedish
Dataset:
detektivaavdelningen.xml.bz2
2024-03-13 – 20.95 MB – CC BY 4.0
Word statistics:
stats_detektivaavdelningen.csv
2024-03-28 – 2.47 MB – CC BY 4.0
DiabetologNytt (1996–1999)
The periodical DiabetologNytt ("Diabetology News"), 1996–1999
228,313
Swedish
Diverse tidningar
Fourteen annual volumes of eight different periodicals (1810–1933) digitized by Project Runeberg
5,358,564
Swedish
Dataset:
runeberg-diverse.xml.bz2
2014-12-08 – 65.51 MB – CC BY 4.0
Word statistics:
stats_RUNEBERG-DIVERSE.txt
2015-06-25 – 23.3 MB – CC BY 4.0
DN 1987
Dagens Nyheter 1987
5,129,248
Swedish
Dataset:
dn1987.xml.bz2
2022-12-13 – 137.38 MB – CC BY 4.0
Word statistics:
stats_dn1987.csv
2022-12-14 – 18.12 MB – CC BY 4.0
Dramawebben (demo)
Texts from Dramawebben, a digital archive of free Swedish drama.
790,456
Swedish
Dataset:
drama.xml.bz2
2017-03-21 – 9.1 MB – CC BY 4.0
Word statistics:
stats_DRAMA.txt
2017-03-26 – 2.61 MB – CC BY 4.0
DReaM
A multilingual corpus of linguistic descriptions of the world's natural languages.
75,027,790
English
Dataset:
dream.zip.bz2
2020-11-11 – 188.83 MB – CC BY 4.0
DReaM-Copyright-Protected
A multilingual corpus of linguistic descriptions of the world's natural languages.
225,617,801
English
DReaM-de-open
18,619,718
German
Word statistics:
stats_DREAM-DE-OPEN.txt
2022-02-27 – 69.46 MB – CC BY 4.0
DReaM-de-restricted
36,965,999
German
Word statistics:
stats_DREAM-DE-RESTRICTED.txt
2020-07-05 – 103.52 MB – CC BY 4.0
DReaM-en-open
27,411,739
English
Word statistics:
stats_DREAM-EN-OPEN.txt
2020-02-24 – 51.62 MB – CC BY 4.0