Language resources
On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
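Most corpora listed here are distributed as bz2-compressed XML files. As a minimal sketch of how such a download might be read, the snippet below streams a file and counts token strings. It assumes each token is stored as the text of a `token` element; that tag name is an assumption, not something stated on this page, so check the downloaded file and adjust `token_tag` if it differs. The function name `count_tokens` is likewise only illustrative.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def count_tokens(path, token_tag="token"):
    """Count token strings in a bz2-compressed XML corpus.

    token_tag is an assumed element name; change it to match the file.
    """
    counts = Counter()
    with bz2.open(path, "rb") as handle:
        # iterparse reads the file incrementally instead of loading it all at once.
        for _event, elem in ET.iterparse(handle, events=("end",)):
            if elem.tag == token_tag and elem.text:
                counts[elem.text.strip()] += 1
            # Drop the element's text and children to keep memory use low.
            elem.clear()
    return counts
```

For example, `count_tokens("sic2.xml.bz2").most_common(10)` would list the ten most frequent token strings in a downloaded copy of that file, provided the tag assumption holds.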
All (1388)
Collections (31)
Corpora (1229)
Lexicons (83)
Training and evaluation data (27)
Models (49)
Resource
Number of tokens
Language
Access
ScandiSent
Sentiment corpus for Swedish, Norwegian, Danish, Finnish and English, crawled from Trustpilot.
Swedish, Norwegian Bokmål, Danish, English, Finnish
Dataset:
ScandiSent.zip
2024-01-25 – 5.16 MB – CC-BY-4.0
Dataset:
ScandiSent-mt.zip
2024-01-25 – 3.62 MB – CC-BY-4.0
Segregation texts: Gothenburg city: Budgets
1,659,609
Swedish
Dataset:
segreg-gbg-budgetar.xml.bz2
2025-11-05 – 36.29 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-budgetar.csv.zip
2025-11-05 – 930.46 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Committees
367,615
Swedish
Dataset:
segreg-gbg-namnder.xml.bz2
2025-11-05 – 7.54 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-namnder.csv.zip
2025-11-05 – 371.55 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Interpellations
106,118
Swedish
Dataset:
segreg-gbg-interpellationer.xml.bz2
2025-11-05 – 1.99 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-interpellationer.csv.zip
2025-11-05 – 151.72 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Motions
176,307
Swedish
Dataset:
segreg-gbg-motioner.xml.bz2
2025-11-05 – 3.36 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-motioner.csv.zip
2025-11-05 – 216.6 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Offices/Administrations
756,756
Swedish
Dataset:
segreg-gbg-kontor.xml.bz2
2025-11-05 – 15.56 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-kontor.csv.zip
2025-11-05 – 556.41 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Opinions
223,857
Swedish
Dataset:
segreg-gbg-yttranden.xml.bz2
2025-11-05 – 4.35 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-yttranden.csv.zip
2025-11-05 – 247.56 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Petitions
216,417
Swedish
Dataset:
segreg-gbg-yrkanden.xml.bz2
2025-11-05 – 4.55 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-yrkanden.csv.zip
2025-11-05 – 290.74 KB – CC-BY-4.0
Explore in:
Segregation texts: Gothenburg city: Reports
668,287
Swedish
Dataset:
segreg-gbg-rapporter.xml.bz2
2025-11-05 – 13.75 MB – CC-BY-4.0
Word statistics:
stats_segreg-gbg-rapporter.csv.zip
2025-11-05 – 565.02 KB – CC-BY-4.0
Explore in:
Segregation texts: Media: Municipal newsletter
60,191
Swedish
Dataset:
segreg-media-vartgoteborg.xml.bz2
2025-11-06 – 1.2 MB – CC-BY-4.0
Word statistics:
stats_segreg-media-vartgoteborg.csv.zip
2025-11-06 – 120.69 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Activities in the Chamber
1,953,686
Swedish
Dataset:
segreg-rd-kammakt.xml.bz2
2025-03-07 – 39.94 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-kammakt.csv.zip
2025-03-06 – 833.54 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Committee on EU Affairs
Documents from the Committee on EU Affairs
3,703
Swedish
Dataset:
segreg-rd-eun.xml.bz2
2025-03-07 – 58.53 KB – CC-BY-4.0
Word statistics:
stats_segreg-rd-eun.csv.zip
2025-03-06 – 12.52 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Committee reports and statements
Committee reports and statements, including the Riksdag's decisions, a summary of the voting results, and Decisions in brief
26,706,890
Swedish
Dataset:
segreg-rd-bet.xml.bz2
2025-03-07 – 504.76 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-bet.csv.zip
2025-03-06 – 3.7 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Documents from Committees
Documents from the committees, including reports to the Committee on the Constitution (KU-anmälningar), minutes, annual reports, and the old document series Utredningar från riksdagen
65,746
Swedish
Dataset:
segreg-rd-utsk.xml.bz2
2025-03-07 – 1.11 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-utsk.csv.zip
2025-03-06 – 117.28 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: EU initiatives
EU initiatives are documents from the European Commission, “COM documents”.
1,962,097
Swedish
Dataset:
segreg-rd-kom.xml.bz2
2025-03-07 – 13.62 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-kom.csv.zip
2025-03-06 – 579.99 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Explanatory memorandums on EU proposals
The Government's explanatory memorandums on European Commission proposals
18,678
Swedish
Dataset:
segreg-rd-fpm.xml.bz2
2025-03-07 – 388.78 KB – CC-BY-4.0
Word statistics:
stats_segreg-rd-fpm.csv.zip
2025-03-06 – 54.98 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Government bills
Government bills and written communications from the Government
35,480,771
Swedish
Dataset:
segreg-rd-prop.xml.bz2
2025-03-07 – 641.52 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-prop.csv.zip
2025-03-06 – 6.69 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Interpellations
Interpellations from members of the Riksdag to the government
948,204
Swedish
Dataset:
segreg-rd-ip.xml.bz2
2025-03-07 – 18.87 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-ip.csv.zip
2025-03-06 – 526.53 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Ministry Publications Series
Inquiries from the government ministries
6,820,996
Swedish
Dataset:
segreg-rd-ds.xml.bz2
2025-03-07 – 114.52 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-ds.csv.zip
2025-03-06 – 2.71 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Motions
Motions from the members of the Riksdag
16,208,509
Swedish
Dataset:
segreg-rd-mot.xml.bz2
2025-03-07 – 343.2 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-mot.csv.zip
2025-03-06 – 3.17 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Order papers
Order papers for the meetings of the Chamber
5,149
Swedish
Dataset:
segreg-rd-flista.xml.bz2
2025-03-07 – 71.12 KB – CC-BY-4.0
Word statistics:
stats_segreg-rd-flista.csv.zip
2025-03-06 – 15.54 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Other documents
The document series Riksrevisionens granskningsrapporter, Utredningar från Riksdagsförvaltningen and Rapporter från riksdagen, as well as planning documents, appendices to documents, extracts from the Riksdag's databases, and the old document series Utredningar från riksdag
1,854,388
Swedish
Dataset:
segreg-rd-ovr.xml.bz2
2025-03-07 – 31.4 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-ovr.csv.zip
2025-03-06 – 1.19 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Records of proceedings in the Chamber
Records of proceedings in the Chamber
57,270,162
Swedish
Dataset:
segreg-rd-prot.xml.bz2
2025-03-07 – 1.08 GB – CC-BY-4.0
Word statistics:
stats_segreg-rd-prot.csv.zip
2025-03-06 – 6.44 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Reports
Submissions and reports from bodies appointed by the Riksdag
1,316,348
Swedish
Dataset:
segreg-rd-frsrdg.xml.bz2
2025-03-07 – 24.42 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-frsrdg.csv.zip
2025-03-06 – 825.7 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Statements of opinion
Statements of opinion from the committees
669,769
Swedish
Dataset:
segreg-rd-yttr.xml.bz2
2025-03-04 – 13.25 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-yttr.csv.zip
2025-03-06 – 435.64 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Swedish Government Official Reports (SOU series)
Proposals to the Government from various inquiries
66,695,400
Swedish
Dataset:
segreg-rd-sou.xml.bz2
2025-03-07 – 1.25 GB – CC-BY-4.0
Word statistics:
stats_segreg-rd-sou.csv.zip
2025-03-06 – 10.83 MB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Written questions
Written questions from members of the Riksdag to the Government, and the answers to them
139,993
Swedish
Dataset:
segreg-rd-skfr.xml.bz2
2025-03-07 – 2.85 MB – CC-BY-4.0
Word statistics:
stats_segreg-rd-skfr.csv.zip
2025-03-06 – 175.24 KB – CC-BY-4.0
Explore in:
Segregation texts: Riksdag's open data: Inquiries
Committee terms of reference and committee reports for inquiries appointed by the Government
4,121
Swedish
Dataset:
segreg-rd-utr.xml.bz2
2025-03-07 – 78.6 KB – CC-BY-4.0
Word statistics:
stats_segreg-rd-utr.csv.zip
2025-03-06 – 11.46 KB – CC-BY-4.0
Explore in:
SemEval2020 Task 1
Swedish Test Data for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection (extracts from Kubhist v2)
182,000,000
Swedish
Dataset:
semeval2020_ulscd_swe.zip
2024-01-25 – 956.05 MB – CC-BY-4.0
Siberian German
Siberian German is transcribed German as spoken by about 36,000 people in the Krasnoyarsk region of Siberia (Russia).
34,205
Swedish
Word statistics:
stats_SIBERIANGERMANDIALOGS.txt.zip
2025-04-22 – 24.53 KB – CC-BY-4.0
Explore in:
Siberian German women
Dialogs between four women born between 1927 and 1937 in the Soviet Volga Republic
16,208
Swedish
Word statistics:
stats_SIBERIANGERMANWOMEN.txt.zip
2025-04-22 – 11.35 KB – CC-BY-4.0
Explore in:
SIC2 - Stockholm Internet Corpus
The Stockholm Internet Corpus (SIC2) contains Swedish blog posts, annotated with part of speech, morphological features, and named entities.
13,562
Swedish
Dataset:
sic2.xml.bz2
2020-11-25 – 262.36 KB – CC-BY-4.0
Word statistics:
stats_sic2.csv.zip
2025-04-22 – 44.79 KB – CC-BY-4.0
Dataset:
sic2.zip
2020-11-11 – 83.63 KB – CC-BY-4.0
Dataset:
readme.txt
2020-11-17 – 2.18 KB – CC-BY-4.0
Explore in:
Smittskydd
The newspaper Smittskydd by Smittskyddsinstitutet (Swedish Institute for Communicable Disease Control) 2002–2010
691,716
Swedish
Dataset:
smittskydd.xml.bz2
2017-04-05 – 11.26 MB – CC-BY-4.0
Word statistics:
stats_SMITTSKYDD.txt.zip
2025-04-22 – 610.8 KB – CC-BY-4.0
Explore in:
SNP 1978–79
Swedish parliament proceedings 1978–1979
4,865,138
Swedish
Dataset:
snp7879.xml.bz2
2017-04-05 – 81.35 MB – CC-BY-4.0
Word statistics:
stats_SNP7879.txt.zip
2025-04-22 – 1.39 MB – CC-BY-4.0
Explore in:
Collection
Somali corpora
A collection of Somali corpora
Somali
See 26 collected resources
Explore in:
Somali Wikipedia
Corpus of Somali Wikipedia
869,335
Somali
Dataset:
wikipedia-so.xml.bz2
2016-10-27 – 2.34 MB – CC-BY-4.0
Word statistics:
stats_WIKIPEDIA-SO.txt.zip
2025-04-22 – 548.53 KB – CC-BY-4.0
Explore in:
Somali: Af Soomaali 1971-79
50,794
Somali
Dataset:
somali-1971-79.xml.bz2
2023-02-17 – 135.14 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2001 Somaliland
35,043
Somali
Dataset:
somali-as-2001.xml.bz2
2021-08-27 – 113.01 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2001 Soomaaliya
129,947
Somali
Dataset:
somali-2001.xml.bz2
2023-02-17 – 288.17 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2006 Itoobiya
64,351
Somali
Dataset:
somali-itoobiya.xml.bz2
2017-06-28 – 125.93 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2010 Somaliland
51,513
Somali
Dataset:
somali-hargeysa-2010.xml.bz2
2017-11-27 – 145.67 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2013 Somaliland
25,247
Somali
Dataset:
somali-as-2013.xml.bz2
2019-02-18 – 59.78 KB – CC-BY-4.0
Explore in:
Somali: Af-Soomaali 2018 Soomaaliya
15,677
Somali
Dataset:
somali-as-2018.xml.bz2
2019-10-01 – 32.81 KB – CC-BY-4.0
Explore in:
Somali: Afka Hooyo 1992-02 Kanada
706
Somali
Dataset:
somali-ah-1992-02-kanada.xml.bz2
2017-01-30 – 2.99 KB – CC-BY-4.0
Explore in:
Somali: Afka Hooyo 2010–19 Iswiidhan
21,542
Somali
Dataset:
somali-ah-2010-19.xml.bz2
2021-08-30 – 65.49 KB – CC-BY-4.0
Explore in:
Somali: BBC
82,437
Somali
Dataset:
somali-bbc.xml.bz2
2017-05-31 – 181.65 KB – CC-BY-4.0
Explore in:
Somali: Caafimaad 1972–79
13,550
Somali
Dataset:
somali-caafimaad-1972-79.xml.bz2
2021-08-30 – 38.92 KB – CC-BY-4.0
Explore in:
Somali: Caafimaad 1994
8,977
Somali
Dataset:
somali-caafimaad-1994.xml.bz2
2017-05-16 – 24.79 KB – CC-BY-4.0
Explore in:
Somali: Cilmi-Afeed
190,429
Somali
Dataset:
somali-cilmi.xml.bz2
2021-08-27 – 683.26 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 1971–1980
79,005
Somali
Dataset:
somali-cb.xml.bz2
2018-03-12 – 212.86 KB – CC-BY-4.0
Word statistics:
stats_SOMALI-CB.txt.zip
2025-04-22 – 62.33 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 1980-89
4,951
Somali
Dataset:
somali-cb-1980-89.xml.bz2
2018-03-12 – 13.05 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2001 Somaliland
30,258
Somali
Dataset:
somali-hargeysa.xml.bz2
2017-09-20 – 72.85 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2001-03 Soomaaliya
48,234
Somali
Dataset:
somali-cb-2001-03-soomaaliya.xml.bz2
2021-08-27 – 159.48 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2010 Somaliland
11,713
Somali
Dataset:
somali-cb-2010.xml.bz2
2019-02-18 – 27.54 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2011 Itoobiya
30,124
Somali
Dataset:
somali-cb-2011.xml.bz2
2019-02-18 – 64 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2016 Somaliland
54,498
Somali
Dataset:
somali-cb-2016.xml.bz2
2021-08-27 – 179.66 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Bulshada 2018 Soomaaliya
42,557
Somali
Dataset:
somali-cb-2018.xml.bz2
2019-10-01 – 77.07 KB – CC-BY-4.0
Explore in:
Somali: Cilmiga Deegaanka 2012 Itoobiya
56,874
Somali
Dataset:
somali-cd-2012-itoobiya.xml.bz2
2018-03-12 – 132.13 KB – CC-BY-4.0
Explore in:
Somali: Golaha Wakiillada Somaliland
539,206
Somali
Dataset:
somali-wakiillada.xml.bz2
2017-05-31 – 1.17 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2002
1,495,343
Somali
Dataset:
somali-haatuf-news-2002.xml.bz2
2018-06-27 – 3.34 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2003
2,359,710
Somali
Dataset:
somali-haatuf-news-2003.xml.bz2
2018-06-27 – 5.29 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2004
1,813,484
Somali
Dataset:
somali-haatuf-news-2004.xml.bz2
2018-06-27 – 4.08 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2005
2,003,060
Somali
Dataset:
somali-haatuf-news-2005.xml.bz2
2018-06-27 – 4.57 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2006
2,125,632
Somali
Dataset:
somali-haatuf-news-2006.xml.bz2
2018-06-27 – 4.69 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2007
1,758,810
Somali
Dataset:
somali-haatuf-news-2007.xml.bz2
2018-06-27 – 3.93 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2008
1,286,309
Somali
Dataset:
somali-haatuf-news-2008.xml.bz2
2018-06-27 – 2.81 MB – CC-BY-4.0
Explore in:
Somali: Haatuf News 2009
393,199
Somali
Dataset:
somali-haatuf-news-2009.xml.bz2
2018-06-27 – 817.25 KB – CC-BY-4.0
Explore in:
Somali: Kitaabka Quduuska Ah
841,187
Somali
Dataset:
somali-kqa.xml.bz2
2016-09-29 – 1.67 MB – CC-BY-4.0
Word statistics:
stats_SOMALI-KQA.txt.zip
2025-04-22 – 233.75 KB – CC-BY-4.0
Explore in:
Somali: Maaddooyinka Kale 1972–79
14,908
Somali
Dataset:
somali-mk-1972-79.xml.bz2
2021-08-27 – 45.99 KB – CC-BY-4.0
Explore in:
Somali: Ogaden Online
98,454
Somali
Dataset:
somali-ogaden.xml.bz2
2016-10-13 – 216.75 KB – CC-BY-4.0
Explore in:
Somali: Qoraallo 1956-1970
14,153
Somali
Dataset:
somali-qoraallo.xml.bz2
2019-01-30 – 37.31 KB – CC-BY-4.0
Explore in:
Somali: Qur’aan
141,555
Somali
Dataset:
somali-quraan.xml.bz2
2019-01-30 – 275.2 KB – CC-BY-4.0
Explore in:
Somali: Raadiyaha Denmark 2014
199,173
Somali
Dataset:
somali-radioden2014.xml.bz2
2016-09-29 – 399.81 KB – CC-BY-4.0
Word statistics:
stats_SOMALI-RADIODEN2014.txt.zip
2025-04-22 – 116.98 KB – CC-BY-4.0
Explore in:
Somali: Raadiyaha Iswiidhan 2014
235,911
Somali
Dataset:
somali-radioswe2014.xml.bz2
2016-09-29 – 598.92 KB – CC-BY-4.0
Word statistics:
stats_SOMALI-RADIOSWE2014.txt.zip
2025-04-22 – 149.41 KB – CC-BY-4.0
Explore in:
Somali: Radio Muqdisho
22,801
Somali
Dataset:
somali-radiomuq.xml.bz2
2017-02-17 – 51.34 KB – CC-BY-4.0
Explore in:
Somali: Saynis 1972–77
112,845
Somali
Dataset:
somali-saynis-1972-77.xml.bz2
2018-06-27 – 302.14 KB – CC-BY-4.0
Explore in:
Somali: Saynis 1980–89
33,034
Somali
Dataset:
somali-saynis-1980-89.xml.bz2
2021-08-27 – 96.67 KB – CC-BY-4.0
Explore in:
Somali: Saynis 1994–96
60,787
Somali
Dataset:
somali-saynis-1994-96.xml.bz2
2018-06-27 – 155.84 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2001 Somaliland
29,988
Somali
Dataset:
somali-saynis.xml.bz2
2017-09-20 – 73.75 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2001 Soomaaliya
4,659
Somali
Dataset:
somali-saynis-2001.xml.bz2
2019-02-18 – 12.8 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2010 Somaliland
30,471
Somali
Dataset:
somali-saynis-2010.xml.bz2
2019-10-01 – 70.23 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2011 Soomaaliya
45,689
Somali
Dataset:
somali-saynis-2011-soomaaliya.xml.bz2
2019-01-30 – 111.64 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2016 Somaliland
31,196
Somali
Dataset:
somali-saynis-2016.xml.bz2
2019-10-01 – 71.3 KB – CC-BY-4.0
Explore in:
Somali: Saynis 2018 Soomaaliya
30,786
Somali
Dataset:
somali-saynis-2018.xml.bz2
2019-10-01 – 57.11 KB – CC-BY-4.0
Explore in:
Somali: Sheekooyin Carruureed
26,003
Somali
Dataset:
somali-sheekooyin.xml.bz2
2021-08-27 – 85.91 KB – CC-BY-4.0
Explore in:
Somali: Sheekooyin Carruureed (Turjuman)
13,865
Somali
Dataset:
somali-sheekooyin-carruureed.xml.bz2
2021-08-30 – 43.72 KB – CC-BY-4.0
Explore in:
Somali: Sheekooyin Gaagaaban
180,852
Somali
Dataset:
somali-sheekooying.xml.bz2
2021-08-27 – 628.9 KB – CC-BY-4.0
Explore in:
Somali: Somali Faces
51,440
Somali
Dataset:
somali-faces.xml.bz2
2017-01-30 – 119.98 KB – CC-BY-4.0
Explore in:
Somali: Suugaan
156,288
Somali
Dataset:
somali-suugaan.xml.bz2
2017-11-27 – 364.94 KB – CC-BY-4.0
Word statistics:
stats_SOMALI-SUUGAAN.txt.zip
2025-04-22 – 118.59 KB – CC-BY-4.0
Explore in:
Somali: Suugaan (Turjuman)
8,796
Somali
Dataset:
somali-suugaan-turjuman.xml.bz2
2021-08-27 – 27.26 KB – CC-BY-4.0
Explore in:
Somali: Suugaan 2
2,827,328
Somali
Dataset:
somali-suugaan2.xml.bz2
2022-12-15 – 7.13 MB – CC-BY-4.0
Explore in:
Somali: Taariikh iyo Dhaqan (Turjuman)
35,479
Somali
Dataset:
somali-tid-turjuman.xml.bz2
2021-08-30 – 108.74 KB – CC-BY-4.0
Explore in:
Somali: Warbixin Ku Saabsan Iswiidhan
59,823
Somali
Dataset:
somali-wksi.xml.bz2
2017-01-30 – 124.78 KB – CC-BY-4.0
Explore in:
Somali: Warbixin Ku Saabsan Kanada
24,039
Somali
Dataset:
somali-wksk.xml.bz2
2017-01-30 – 48.91 KB – CC-BY-4.0
Explore in:
Somali: Wardheer News
499,037
Somali
Dataset:
somali-wardheer.xml.bz2
2017-05-31 – 1.37 MB – CC-BY-4.0
Explore in:
Somali: Xeerar Somaliland
450,142
Somali
Dataset:
somali-xeerar.xml.bz2
2017-05-31 – 1.04 MB – CC-BY-4.0
Explore in:
Somali: Xisaab 1971-79
1,875
Somali
Dataset:
somali-xisaab-1971-79.xml.bz2
2017-01-30 – 6.55 KB – CC-BY-4.0
Explore in:
Somali: Xisaab 1994-97
713
Somali
Dataset:
somali-xisaab-1994-97.xml.bz2
2017-01-30 – 2.64 KB – CC-BY-4.0
Explore in:
Somali: Xisaab 2001 Somaliland
32,676
Somali
Dataset:
somali-xisaab-2001-hargeysa.xml.bz2
2019-10-01 – 69.66 KB – CC-BY-4.0
Explore in:
Somali: Xisaab 2001 Soomaaliya
50,361
Somali
Dataset:
somali-xisaab-2001-nayroobi.xml.bz2
2021-08-27 – 138.5 KB – CC-BY-4.0
Explore in: