Språkbanken Text is a part of Språkbanken.
Language resources
On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
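Most corpus downloads listed here are bzip2-compressed XML files (`.xml.bz2`). As a minimal sketch of how such a file might be inspected after download, the Python snippet below streams the compressed file and counts elements by tag name. It assumes only that the file is well-formed XML; the function name `count_elements` is hypothetical, and no particular tag names or corpus structure are assumed.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Stream a bz2-compressed XML file and count elements by tag name."""
    counts = Counter()
    # bz2.open in binary mode decompresses on the fly, so the file
    # never has to be unpacked to disk first.
    with bz2.open(path, "rb") as stream:
        # iterparse yields each element once its closing tag has been read.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            counts[elem.tag] += 1
            # Discard the element's text, attributes and children
            # to limit memory use on large corpora.
            elem.clear()
    return counts
```

For example, `count_elements("somali-kqa.xml.bz2")` (assuming the file has been saved to the working directory) returns a `Counter` mapping each tag name to its number of occurrences, which gives a quick overview of a file's structure before any further processing.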
All (1329)
Collections (31)
Corpora (1200)
Lexicons (66)
Training and evaluation data (15)
Models (48)
Languages: Swedish, Albanian, Belarusian, Blissymbols, Bosnian, Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finland Swedish, Finnish, French, German, Icelandic, Iranian Persian, Italian, Kele (Papua New Guinea), Kurdish, Latin, Latvian, Lower Sorbian, Macedonian, Modern Greek (1453-), Multiple languages, Norwegian, Norwegian Bokmål, Old English (ca. 450-1100), Old High German (ca. 750-1050), Old Norse, Old Saxon, Polish, Portuguese, Romanian, Russian, Serbian, Slavomolisano, Slovak, Slovenian, Somali, Spanish, Turkish, Turkmen, Ukrainian, Upper Sorbian, Xhosa
Resource
Type
Language
Access
Somali: Kitaabka Quduuska Ah
Corpus
Somali
Dataset:
somali-kqa.xml.bz2
2016-09-29 – 1.67 MB – CC BY 4.0
Word statistics:
stats_SOMALI-KQA.txt
2020-02-25 – 957.24 KB – CC BY 4.0
Explore in:
Somali: Maaddooyinka Kale 1972–79
Corpus
Somali
Dataset:
somali-mk-1972-79.xml.bz2
2021-08-27 – 45.99 KB – CC BY 4.0
Explore in:
Somali: Ogaden Online
Corpus
Somali
Dataset:
somali-ogaden.xml.bz2
2016-10-13 – 216.75 KB – CC BY 4.0
Explore in:
Somali: Qoraallo 1956-1970
Corpus
Somali
Dataset:
somali-qoraallo.xml.bz2
2019-01-30 – 37.31 KB – CC BY 4.0
Explore in:
Somali: Qur’aan
Corpus
Somali
Dataset:
somali-quraan.xml.bz2
2019-01-30 – 275.2 KB – CC BY 4.0
Explore in:
Somali: Raadiyaha Denmark 2014
Corpus
Somali
Dataset:
somali-radioden2014.xml.bz2
2016-09-29 – 399.81 KB – CC BY 4.0
Word statistics:
stats_SOMALI-RADIODEN2014.txt
2020-02-25 – 467.39 KB – CC BY 4.0
Explore in:
Somali: Raadiyaha Iswiidhan 2014
Corpus
Somali
Dataset:
somali-radioswe2014.xml.bz2
2016-09-29 – 598.92 KB – CC BY 4.0
Word statistics:
stats_SOMALI-RADIOSWE2014.txt
2020-02-25 – 579.87 KB – CC BY 4.0
Explore in:
Somali: Radio Muqdisho
Corpus
Somali
Dataset:
somali-radiomuq.xml.bz2
2017-02-17 – 51.34 KB – CC BY 4.0
Explore in:
Somali: Saynis 1972–77
Corpus
Somali
Dataset:
somali-saynis-1972-77.xml.bz2
2018-06-27 – 302.14 KB – CC BY 4.0
Explore in:
Somali: Saynis 1980–89
Corpus
Somali
Dataset:
somali-saynis-1980-89.xml.bz2
2021-08-27 – 96.67 KB – CC BY 4.0
Explore in:
Somali: Saynis 1994–96
Corpus
Somali
Dataset:
somali-saynis-1994-96.xml.bz2
2018-06-27 – 155.84 KB – CC BY 4.0
Explore in:
Somali: Saynis 2001 Somaliland
Corpus
Somali
Dataset:
somali-saynis.xml.bz2
2017-09-20 – 73.75 KB – CC BY 4.0
Explore in:
Somali: Saynis 2001 Soomaaliya
Corpus
Somali
Dataset:
somali-saynis-2001.xml.bz2
2019-02-18 – 12.8 KB – CC BY 4.0
Explore in:
Somali: Saynis 2010 Somaliland
Corpus
Somali
Dataset:
somali-saynis-2010.xml.bz2
2019-10-01 – 70.23 KB – CC BY 4.0
Explore in:
Somali: Saynis 2011 Soomaaliya
Corpus
Somali
Dataset:
somali-saynis-2011-soomaaliya.xml.bz2
2019-01-30 – 111.64 KB – CC BY 4.0
Explore in:
Somali: Saynis 2016 Somaliland
Corpus
Somali
Dataset:
somali-saynis-2016.xml.bz2
2019-10-01 – 71.3 KB – CC BY 4.0
Explore in:
Somali: Saynis 2018 Soomaaliya
Corpus
Somali
Dataset:
somali-saynis-2018.xml.bz2
2019-10-01 – 57.11 KB – CC BY 4.0
Explore in:
Somali: Sheekooyin Carruureed
Corpus
Somali
Dataset:
somali-sheekooyin.xml.bz2
2021-08-27 – 85.91 KB – CC BY 4.0
Explore in:
Somali: Sheekooyin Carruureed (Turjuman)
Corpus
Somali
Dataset:
somali-sheekooyin-carruureed.xml.bz2
2021-08-30 – 43.72 KB – CC BY 4.0
Explore in:
Somali: Sheekooyin Gaagaaban
Corpus
Somali
Dataset:
somali-sheekooying.xml.bz2
2021-08-27 – 628.9 KB – CC BY 4.0
Explore in:
Somali: Somali Faces
Corpus
Somali
Dataset:
somali-faces.xml.bz2
2017-01-30 – 119.98 KB – CC BY 4.0
Explore in:
Somali: Suugaan
Corpus
Somali
Dataset:
somali-suugaan.xml.bz2
2017-11-27 – 364.94 KB – CC BY 4.0
Word statistics:
stats_SOMALI-SUUGAAN.txt
2020-02-25 – 502.45 KB – CC BY 4.0
Explore in:
Somali: Suugaan (Turjuman)
Corpus
Somali
Dataset:
somali-suugaan-turjuman.xml.bz2
2021-08-27 – 27.26 KB – CC BY 4.0
Explore in:
Somali: Suugaan 2
Corpus
Somali
Dataset:
somali-suugaan2.xml.bz2
2022-12-15 – 7.13 MB – CC BY 4.0
Explore in:
Somali: Taariikh iyo Dhaqan (Turjuman)
Corpus
Somali
Dataset:
somali-tid-turjuman.xml.bz2
2021-08-30 – 108.74 KB – CC BY 4.0
Explore in:
Somali: Warbixin Ku Saabsan Iswiidhan
Corpus
Somali
Dataset:
somali-wksi.xml.bz2
2017-01-30 – 124.78 KB – CC BY 4.0
Explore in:
Somali: Warbixin Ku Saabsan Kanada
Corpus
Somali
Dataset:
somali-wksk.xml.bz2
2017-01-30 – 48.91 KB – CC BY 4.0
Explore in:
Somali: Wardheer News
Corpus
Somali
Dataset:
somali-wardheer.xml.bz2
2017-05-31 – 1.37 MB – CC BY 4.0
Explore in:
Somali: Xeerar Somaliland
Corpus
Somali
Dataset:
somali-xeerar.xml.bz2
2017-05-31 – 1.04 MB – CC BY 4.0
Explore in:
Somali: Xisaab 1971-79
Corpus
Somali
Dataset:
somali-xisaab-1971-79.xml.bz2
2017-01-30 – 6.55 KB – CC BY 4.0
Explore in:
Somali: Xisaab 1994-97
Corpus
Somali
Dataset:
somali-xisaab-1994-97.xml.bz2
2017-01-30 – 2.64 KB – CC BY 4.0
Explore in:
Somali: Xisaab 2001 Somaliland
Corpus
Somali
Dataset:
somali-xisaab-2001-hargeysa.xml.bz2
2019-10-01 – 69.66 KB – CC BY 4.0
Explore in:
Somali: Xisaab 2001 Soomaaliya
Corpus
Somali
Dataset:
somali-xisaab-2001-nayroobi.xml.bz2
2021-08-27 – 138.5 KB – CC BY 4.0
Explore in:
Somali: Xisaab 2011 Itoobiya
Corpus
Somali
Dataset:
somali-xisaab-2011-itoobiya.xml.bz2
2017-09-20 – 83.62 KB – CC BY 4.0
Explore in:
Somali: Xisaab 2016 Somaliland
Corpus
Somali
Dataset:
somali-xisaab-2016-somaliland.xml.bz2
2021-08-27 – 117.01 KB – CC BY 4.0
Explore in:
Somali: Xisaab 2018 Soomaaliya
Corpus
Somali
Dataset:
somali-xisaab-2018-soomaaliya.xml.bz2
2019-10-01 – 55.16 KB – CC BY 4.0
Explore in:
SpIn
Corpus
Swedish
Word statistics:
stats_SPIN-SOURCE.txt
2020-02-25 – 292.82 KB – CC BY 4.0
Explore in:
SpIn v1
256 essays collected from mid-term exams in a Language Introduction course for newly arrived refugees. Some students appear more than once.
Corpus
Swedish
Explore in:
Sports anglicisms
English loan-words in the Swedish sports press
Lexicon
Swedish
Explore in:
Språkprov SO 2009
The just over 94,000 language examples are taken from Svensk ordbok, published by the Swedish Academy (Svenska Akademien) in 2009. The examples are intended to support the dictionary definitions and to provide information about the phraseology of the headwords. For access, contact Emma Sköldberg (emma.skoldberg@svenska.gu.se).
Corpus
Swedish
Explore in:
SUC 2.0
Stockholm-Umeå corpus 2.0
Corpus
Swedish
Word statistics:
stats_SUC2.txt
2017-05-21 – 6.65 MB – CC BY 4.0
SUC 3.0
Stockholm-Umeå corpus 3.0
Corpus
Swedish
Dataset:
suc3.xml.bz2
2024-06-03 – 84.44 MB – CC BY 4.0
Word statistics:
stats_suc3.csv
2024-03-28 – 7.7 MB – CC BY 4.0
Explore in:
SUC Novels (StorSUC)
Stockholm-Umeå corpus
Corpus
Swedish
Dataset:
storsuc.xml.bz2
2017-04-26 – 68.28 MB – CC BY 4.0
Word statistics:
stats_STORSUC.txt
2017-04-30 – 11.23 MB – CC BY 4.0
Explore in:
SUCX 2.0
Stockholm-Umeå corpus 2.0 scrambled
Corpus
Swedish
Dataset:
suc2.xml.bz2
2017-05-19 – 17.68 MB – CC BY-SA 4.0
Word statistics:
stats_SUC2.txt
2017-05-21 – 6.65 MB – CC BY-SA 4.0
Explore in:
SUCX 3.0
Stockholm-Umeå corpus 3.0 scrambled
Corpus
Swedish
Dataset:
suc3.xml.bz2
2024-06-03 – 84.44 MB – CC BY-SA 4.0
Word statistics:
stats_suc3.csv
2024-03-28 – 7.7 MB – CC BY 4.0
Explore in:
Collection
SuperLim 2
A standardized suite for evaluation and analysis of Swedish natural language understanding systems.
Corpus
Swedish
Dataset:
SuperLim-2-2.0.4.zip
2024-01-25 – 156.63 MB – CC BY 4.0
Dataset:
SuperLim_maintenance.odt
2024-01-25 – 16.96 KB
SuperSim (repackaged for Superlim) 2.0
A dataset for word similarity and relatedness in Swedish
Corpus
Swedish
Dataset:
supersim-superlim.zip
2023-03-30 – 70.45 KB – CC BY 4.0
sv-COVID-19
A compilation of various articles related to the COVID-19 pandemic
Corpus
Swedish
Dataset:
sv-covid-19.xml.bz2
2025-02-20 – 216.31 MB – CC BY 4.0
Word statistics:
stats_sv-covid-19.csv
2025-02-27 – 12.94 MB – CC BY 4.0
Explore in:
SVALex
SVALex is a lexicon of receptive vocabulary for Swedish as a second language
Lexicon
Swedish
Dataset:
svalex_xlsx.tar.bz2
2025-01-24 – 2.16 MB – CC BY-NC-SA 4.0
Dataset:
svalex_tsv.tar.bz2
2025-01-24 – 203.25 KB – CC BY-NC-SA 4.0
Explore in:
Svensk Tidskrift
27 annual volumes of the conservative journal Svensk Tidskrift, from 1891 to 1940
Corpus
Swedish
Dataset:
runeberg-svtidskr.xml.bz2
2014-12-08 – 93.06 MB – CC BY 4.0
Word statistics:
stats_RUNEBERG-SVTIDSKR.txt
2015-06-25 – 22.18 MB – CC BY 4.0
Explore in:
Svenska MWELex
Swe-MWELex is a sense-based word list of multi-word expressions (MWEs) that learners of Swedish as a second language can handle at different levels of proficiency (according to the CEFR scale). The word list features MWE items and their frequencies from essays (productive vocabulary, based on SweLL-pilot) and from course books (receptive vocabulary, based on COCTAILL). In addition, each MWE has been classified by type (based on its syntactic and lexical characteristics) and, for verbal MWEs, by subgroup.
Lexicon
Swedish
Dataset:
swe-mwelex.xlsx
2025-03-12 – 184.75 KB – CC BY 4.0
Dataset:
swe-mwelex.csv
2025-02-20 – 414.88 KB – CC BY-NC-SA 4.0
Explore in:
Collection
SVT news
News texts from svt.se
Corpus
Swedish
See 21 collected resources
Explore in:
SVT news 2004
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2004.xml.bz2
2022-12-06 – 12.54 MB – CC BY 4.0
Word statistics:
stats_svt-2004.csv
2022-04-26 – 11.18 MB – CC BY 4.0
Explore in:
SVT news 2005
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2005.xml.bz2
2022-12-06 – 94.29 MB – CC BY 4.0
Word statistics:
stats_svt-2005.csv
2022-04-27 – 78.88 MB – CC BY 4.0
Explore in:
SVT news 2006
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2006.xml.bz2
2022-12-06 – 120.32 MB – CC BY 4.0
Word statistics:
stats_svt-2006.csv
2022-04-27 – 93.91 MB – CC BY 4.0
Explore in:
SVT news 2007
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2007.xml.bz2
2022-12-06 – 159.96 MB – CC BY 4.0
Word statistics:
stats_svt-2007.csv
2022-04-27 – 115.85 MB – CC BY 4.0
Explore in:
SVT news 2008
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2008.xml.bz2
2022-12-06 – 221.24 MB – CC BY 4.0
Word statistics:
stats_svt-2008.csv
2022-04-27 – 146.52 MB – CC BY 4.0
Explore in:
SVT news 2009
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2009.xml.bz2
2022-12-06 – 254.45 MB – CC BY 4.0
Word statistics:
stats_svt-2009.csv
2022-04-27 – 160.78 MB – CC BY 4.0
Explore in:
SVT news 2010
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2010.xml.bz2
2022-12-06 – 284.46 MB – CC BY 4.0
Word statistics:
stats_svt-2010.csv
2022-04-27 – 174.12 MB – CC BY 4.0
Explore in:
SVT news 2011
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2011.xml.bz2
2022-12-06 – 268.69 MB – CC BY 4.0
Word statistics:
stats_svt-2011.csv
2022-04-27 – 165.74 MB – CC BY 4.0
Explore in:
SVT news 2012
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2012.xml.bz2
2022-12-06 – 273.87 MB – CC BY 4.0
Word statistics:
stats_svt-2012.csv
2022-04-27 – 162.68 MB – CC BY 4.0
Explore in:
SVT news 2013
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2013.xml.bz2
2022-12-06 – 397.91 MB – CC BY 4.0
Word statistics:
stats_svt-2013.csv
2022-04-27 – 216.85 MB – CC BY 4.0
Explore in:
SVT news 2014
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2014.xml.bz2
2022-12-07 – 454.63 MB – CC BY 4.0
Word statistics:
stats_svt-2014.csv
2022-04-27 – 239.84 MB – CC BY 4.0
Explore in:
SVT news 2015
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2015.xml.bz2
2022-12-07 – 539.73 MB – CC BY 4.0
Word statistics:
stats_svt-2015.csv
2022-04-27 – 269.89 MB – CC BY 4.0
Explore in:
SVT news 2016
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2016.xml.bz2
2022-12-07 – 613.63 MB – CC BY 4.0
Word statistics:
stats_svt-2016.csv
2022-04-27 – 293.12 MB – CC BY 4.0
Explore in:
SVT news 2017
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2017.xml.bz2
2022-12-07 – 601.37 MB – CC BY 4.0
Word statistics:
stats_svt-2017.csv
2022-04-27 – 283.26 MB – CC BY 4.0
Explore in:
SVT news 2018
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2018.xml.bz2
2022-12-07 – 533.68 MB – CC BY 4.0
Word statistics:
stats_svt-2018.csv
2022-04-27 – 263.34 MB – CC BY 4.0
Explore in:
SVT news 2019
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2019.xml.bz2
2022-12-07 – 515.99 MB – CC BY 4.0
Word statistics:
stats_svt-2019.csv
2022-04-27 – 256.88 MB – CC BY 4.0
Explore in:
SVT news 2020
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2020.xml.bz2
2022-12-07 – 453.02 MB – CC BY 4.0
Word statistics:
stats_svt-2020.csv
2022-04-27 – 228.8 MB – CC BY 4.0
Explore in:
SVT news 2021
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2021.xml.bz2
2022-12-07 – 424.19 MB – CC BY 4.0
Word statistics:
stats_svt-2021.csv
2022-04-27 – 220.19 MB – CC BY 4.0
Explore in:
SVT news 2022
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2022.xml.bz2
2023-08-30 – 395.67 MB – CC BY 4.0
Word statistics:
stats_svt-2022.csv
2023-08-30 – 208.8 MB – CC BY 4.0
Explore in:
SVT news 2023
News texts from svt.se
Corpus
Swedish
Dataset:
svt-2023.xml.bz2
2023-08-29 – 211.47 MB – CC BY 4.0
Word statistics:
stats_svt-2023.csv
2023-08-29 – 119.69 MB – CC BY 4.0
Explore in:
SVT news unknown date
News texts from svt.se
Corpus
Swedish
Dataset:
svt-nodate.xml.bz2
2023-02-08 – 862.74 KB – CC BY 4.0
Word statistics:
stats_svt-nodate.csv
2023-02-09 – 501.94 KB – CC BY 4.0
Explore in:
SW1203 v1
Corpus
Swedish
Word statistics:
stats_SW1203V1.txt
2021-07-04 – 363.68 KB – CC BY 4.0
Explore in:
SW1203-essays
Essays written by L2 learners of Swedish in university courses
Corpus
Swedish
Word statistics:
stats_SW1203.txt
2018-05-20 – 381.64 KB – CC BY 4.0
Explore in:
SW1203-uppsatser version 2
Corpus
Swedish
Word statistics:
stats_SW1203V2.txt
2020-02-26 – 375.63 KB – CC BY 4.0
Explore in:
Swe-NERC
A resource for training and evaluation of Named Entity Recognition for Swedish.
Corpus
Swedish
Dataset:
Swe-NERC-v1.0.tar.gz
2024-03-05 – 5.74 MB – CC BY 4.0
Swedberg's Swensk Ordabok
Lexicon
Swedish, Latin
Dataset:
swedberg.xml
2017-09-19 – 8.89 MB – CC BY 4.0
Explore in:
Swedberg's Swensk Ordabok (morphology, rudimentary)
Lexicon
Swedish
Dataset:
swedbergm.xml
2017-09-19 – 5.76 MB – CC BY 4.0
Explore in:
SweDiagnostics
Swedish version of (Super)GLUE Diagnostic
Corpus
Swedish
Dataset:
swediagnostics.zip
2023-04-04 – 72.89 KB – CC BY 4.0
Swedish ABSAbank
An annotated Swedish corpus for aspect-based sentiment analysis
Corpus
Swedish
Dataset:
swe-absa-bank.zip
2020-03-04 – 128.55 MB – CC BY 4.0
Dataset:
absabankimm-combined.zip
2023-02-20 – 15.87 MB – CC BY 4.0
Swedish ABSAbank-Imm 1.1
An annotated Swedish corpus for aspect-based sentiment analysis (a version of Absabank)
Corpus
Swedish
Dataset:
absabank-imm.zip
2023-03-30 – 1.03 MB – CC BY 4.0
Swedish analogy 2.0
Swedish semantic and syntactic similarity
Corpus
Swedish
Dataset:
sweanalogy.zip
2023-03-30 – 178.63 KB – CC BY 4.0
Swedish Bible 1873
Swedish translation of the Bible from 1873
Corpus
Swedish
Dataset:
bibel1873dalin.xml.bz2
2015-05-20 – 5.84 MB – CC BY 4.0
Word statistics:
stats_BIBEL1873DALIN.txt
2014-04-29 – 1.62 MB – CC BY 4.0
Explore in:
Swedish Bible 1917
Official Swedish translation of the Bible from 1917
Corpus
Swedish
Dataset:
bibel1917.xml.bz2
2015-05-19 – 7.5 MB – CC BY 4.0
Word statistics:
stats_BIBEL1917.txt
2014-10-13 – 1.82 MB – CC BY 4.0
Explore in:
Swedish Code of Statutes
Swedish Code of Statutes 1880-01-01 – 2023-12-15
Corpus
Swedish
Dataset:
sfs.xml.bz2
2024-05-13 – 325.85 MB – CC BY 4.0
Word statistics:
stats_sfs.csv
2024-05-20 – 16.16 MB – CC BY 4.0
Explore in:
Swedish Diachronic Word Embeddings
Swedish Diachronic Word Embedding Models Trained on Historical Newspaper Data
Model
Swedish
Dataset:
HENGCHEN-TAHMASEBI_-_2020_-_Kubhist2_diachronic_embeddings.zip
2024-01-25 – 15.13 GB – CC BY 4.0
Swedish EAT: question classification
A translated version of the QAQC dataset for expected-answer-type classification.
Corpus
Swedish
Dataset:
swe_qaqc_train.csv
2023-06-08 – 361.34 KB – CC BY 4.0
Dataset:
Swedish_EAT_v1.0.tsv
2023-06-08 – 2.05 KB – CC BY 4.0
Swedish fraktur 1626-1816
A selection of fraktur texts printed between 1626 and 1816 from the collections of Gothenburg University Library (UB). For OCR analysis.
Corpus
Swedish
Dataset:
svensk-fraktur-1626-1816.tar.gz
2021-11-26 – 757.73 MB – CC BY 4.0
Swedish FrameNet (SweFN)
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This part of the resource contains the frames and the manually annotated semantic content.
Lexicon
Swedish
Dataset:
swefn.xml
2021-11-09 – 7 MB – CC BY 4.0
Dataset:
swefn-full.zip
2021-12-21 – 7.53 MB – CC BY 4.0
Explore in:
Swedish framenet (SweFN)
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This part of the resource contains the corpus examples, automatically enriched with linguistic information.
Corpus
Swedish
Dataset:
swefn-ex.xml.bz2
2021-11-25 – 3.62 MB – CC BY 4.0
Word statistics:
stats_swefn-ex.csv
2021-11-26 – 1.88 MB – CC BY 4.0
Explore in:
Swedish FrameNet 2.0 (SweFN)
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This version is updated to correspond to BFN 1.7.
Lexicon
Swedish
Dataset:
swefn-2-0.json.zip
2024-10-16 – 1006.51 KB – CC BY 4.0
Dataset:
swefn-2-0.tsv.zip
2024-10-16 – 969.61 KB – CC BY 4.0
Swedish newspapers 1818-1870
A selection of Swedish newspapers printed between 1818 and 1870 from the collections of Kungliga biblioteket (KB). For OCR analysis.
Corpus
Swedish
Dataset:
svenska-tidningar-1818-1870.tar.gz
2020-05-26 – 458.22 MB – CC BY 4.0
Swedish newspapers 1871-1906
A selection of Swedish newspapers printed between 1871 and 1906 from the collections of Kungliga biblioteket (KB). For OCR analysis.
Corpus
Swedish
Dataset:
svenska-tidningar-1871-1906.tar.gz
2022-05-03 – 831.74 MB – CC BY 4.0
Swedish party programs and election manifestos
Swedish political party programs and election manifestos 1887–2024
Corpus
Swedish
Dataset:
vivill.xml.bz2
2024-06-10 – 165.57 MB – CC BY 4.0
Word statistics:
stats_vivill.csv
2024-06-10 – 6.28 MB – CC BY 4.0
Explore in:
Swedish Prose Fiction 1800–1900
All Swedish fiction published for the first time during the years 1800, 1820, 1840, 1860, 1880 and 1900
Corpus
Swedish
Dataset:
spf.xml.bz2
2017-05-19 – 231.69 MB – CC BY 4.0
Word statistics:
stats_SPF.txt
2021-05-09 – 18.23 MB – CC BY 4.0
Explore in:
Swedish treebank
A Swedish treebank built from recycled language resources
Corpus
Swedish
Swedish Twitter 2015
Material collected from a selection of Swedish-speaking Twitter users from 2015
Corpus
Swedish
Word statistics:
stats_TWITTER-2015.txt
2018-02-04 – 615.63 MB – CC BY 4.0
Explore in:
Swedish Twitter 2016
Material collected from a selection of Swedish-speaking Twitter users from 2016
Corpus
Swedish
Word statistics:
stats_TWITTER-2016.txt
2018-02-11 – 805.79 MB – CC BY 4.0
Explore in:
Swedish Twitter 2017
Material collected from a selection of Swedish-speaking Twitter users from 2017
Corpus
Swedish
Word statistics:
stats_TWITTER-2017.txt
2018-02-18 – 652.14 MB – CC BY 4.0
Explore in: