Language resources

On this page you can browse and search our datasets. Click on a row name to see what files are available for download. You can go directly to the search interface by clicking on the tool logo.
Resource Type Language Access
Collection
Somali corpora
A collection of Somali corpora
Corpus Somali
Somali Wikipedia
Corpus of Somali Wikipedia
Corpus Somali
Somali: Af Soomaali 1971-79
Corpus Somali
Somali: Af-Soomaali 2001 Somaliland
Corpus Somali
Somali: Af-Soomaali 2001 Soomaaliya
Corpus Somali
Somali: Af-Soomaali 2006 Itoobiya
Corpus Somali
Somali: Af-Soomaali 2010 Somaliland
Corpus Somali
Somali: Af-Soomaali 2013 Somaliland
Corpus Somali
Somali: Af-Soomaali 2018 Soomaaliya
Corpus Somali
Somali: Afka Hooyo 1992-02 Kanada
Corpus Somali
Somali: Afka Hooyo 2010–19 Iswiidhan
Corpus Somali
Somali: BBC
Corpus Somali
Somali: Caafimaad 1972–79
Corpus Somali
Somali: Caafimaad 1994
Corpus Somali
Somali: Cilmi-Afeed
Corpus Somali
Somali: Cilmiga Bulshada 1971–1980
Corpus Somali
Somali: Cilmiga Bulshada 1980-89
Corpus Somali
Somali: Cilmiga Bulshada 2001 Somaliland
Corpus Somali
Somali: Cilmiga Bulshada 2001-03 Soomaaliya
Corpus Somali
Somali: Cilmiga Bulshada 2010 Somaliland
Corpus Somali
Somali: Cilmiga Bulshada 2011 Itoobiya
Corpus Somali
Somali: Cilmiga Bulshada 2016 Somaliland
Corpus Somali
Somali: Cilmiga Bulshada 2018 Soomaaliya
Corpus Somali
Somali: Cilmiga Deegaanka 2012 Itoobiya
Corpus Somali
Somali: Golaha Wakiillada Somaliland
Corpus Somali
Somali: Haatuf News 2002
Corpus Somali
Somali: Haatuf News 2003
Corpus Somali
Somali: Haatuf News 2004
Corpus Somali
Somali: Haatuf News 2005
Corpus Somali
Somali: Haatuf News 2006
Corpus Somali
Somali: Haatuf News 2007
Corpus Somali
Somali: Haatuf News 2008
Corpus Somali
Somali: Haatuf News 2009
Corpus Somali
Somali: Kitaabka Quduuska Ah
Corpus Somali
Somali: Maaddooyinka Kale 1972–79
Corpus Somali
Somali: Ogaden Online
Corpus Somali
Somali: Qoraallo 1956-1970
Corpus Somali
Somali: Qur’aan
Corpus Somali
Somali: Raadiyaha Denmark 2014
Corpus Somali
Somali: Raadiyaha Iswiidhan 2014
Corpus Somali
Somali: Radio Muqdisho
Corpus Somali
Somali: Saynis 1972–77
Corpus Somali
Somali: Saynis 1980–89
Corpus Somali
Somali: Saynis 1994–96
Corpus Somali
Somali: Saynis 2001 Somaliland
Corpus Somali
Somali: Saynis 2001 Soomaaliya
Corpus Somali
Somali: Saynis 2010 Somaliland
Corpus Somali
Somali: Saynis 2011 Soomaaliya
Corpus Somali
Somali: Saynis 2016 Somaliland
Corpus Somali
Somali: Saynis 2018 Soomaaliya
Corpus Somali
Somali: Sheekooyin Carruureed
Corpus Somali
Somali: Sheekooyin Carruureed (Turjuman)
Corpus Somali
Somali: Sheekooyin Gaagaaban
Corpus Somali
Somali: Somali Faces
Corpus Somali
Somali: Suugaan
Corpus Somali
Somali: Suugaan (Turjuman)
Corpus Somali
Somali: Suugaan 2
Corpus Somali
Somali: Taariikh iyo Dhaqan (Turjuman)
Corpus Somali
Somali: Warbixin Ku Saabsan Iswiidhan
Corpus Somali
Somali: Warbixin Ku Saabsan Kanada
Corpus Somali
Somali: Wardheer News
Corpus Somali
Somali: Xeerar Somaliland
Corpus Somali
Somali: Xisaab 1971-79
Corpus Somali
Somali: Xisaab 1994-97
Corpus Somali
Somali: Xisaab 2001 Somaliland
Corpus Somali
Somali: Xisaab 2001 Soomaaliya
Corpus Somali
Somali: Xisaab 2011 Itoobiya
Corpus Somali
Somali: Xisaab 2016 Somaliland
Corpus Somali
Somali: Xisaab 2018 Soomaaliya
Corpus Somali
SpIn
Corpus Swedish
SpIn v1
256 essays collected from the mid-term exams of a Language Introduction course for newly arrived refugees. Some students appear more than once.
Corpus Swedish
Sports anglicisms
English loan-words in the Swedish sports press
Lexicon Swedish
Språkprov SO 2009
The just over 94,000 language examples are taken from Svensk ordbok, published by the Swedish Academy (Svenska Akademien) in 2009. The examples serve to support the dictionary definitions and to provide information about the phraseology of the headwords.
Corpus Swedish
SUC 2.0
Stockholm-Umeå corpus 2.0
Corpus Swedish
SUC 3.0
Stockholm-Umeå corpus 3.0
Corpus Swedish
SUC Novels (StorSUC)
Stockholm-Umeå corpus
Corpus Swedish
SUCX 2.0
Stockholm-Umeå corpus 2.0 scrambled
Corpus Swedish
SUCX 3.0
Stockholm-Umeå corpus 3.0 scrambled
Corpus Swedish
Collection
SuperLim 2
A standardized suite for evaluation and analysis of Swedish natural language understanding systems.
Corpus Swedish
SuperSim (repackaged for Superlim) 2.0
A dataset for word similarity and relatedness in Swedish
Corpus Swedish
sv-COVID-19
A compilation of various articles related to the COVID-19 pandemic
Corpus Swedish
SVALex
SVALex is a lexicon of receptive vocabulary for Swedish as a second language
Lexicon Swedish
Svensk Tidskrift
27 annual volumes of the conservative journal Svensk Tidskrift, from 1891 to 1940
Corpus Swedish
Svenska MWELex
Swe-MWELex is a sense-based word list of multi-word expressions (MWEs) that learners of Swedish as a second language can handle at different levels of proficiency (according to the CEFR scale). The word list features MWE items and their frequencies from essays (productive vocabulary, based on SweLL-pilot) and from course books (receptive vocabulary, based on COCTAILL). In addition, each MWE has been classified by type (based on its syntactic and lexical characteristics) and, for verbal MWEs, by subgroup.
Lexicon Swedish
Collection
SVT news
News texts from svt.se
Corpus Swedish
SVT news 2004
News texts from svt.se
Corpus Swedish
SVT news 2005
News texts from svt.se
Corpus Swedish
SVT news 2006
News texts from svt.se
Corpus Swedish
SVT news 2007
News texts from svt.se
Corpus Swedish
SVT news 2008
News texts from svt.se
Corpus Swedish
SVT news 2009
News texts from svt.se
Corpus Swedish
SVT news 2010
News texts from svt.se
Corpus Swedish
SVT news 2011
News texts from svt.se
Corpus Swedish
SVT news 2012
News texts from svt.se
Corpus Swedish
SVT news 2013
News texts from svt.se
Corpus Swedish
SVT news 2014
News texts from svt.se
Corpus Swedish
SVT news 2015
News texts from svt.se
Corpus Swedish
SVT news 2016
News texts from svt.se
Corpus Swedish
SVT news 2017
News texts from svt.se
Corpus Swedish
SVT news 2018
News texts from svt.se
Corpus Swedish