Collection SuperLim
A standardized suite for evaluation and analysis of Swedish natural language understanding systems. |
|
Corpus |
Swedish |
|
CoDeRooMor, v.01
Morphological dataset (word-building morphology) from the Swedish L2 profiles project. |
|
Lexicon |
Swedish |
|
DaLAJ v.1.0
The Dataset for Linguistic Acceptability Judgments (and more), v.1.0, is a collection of sentences from SweLL (Swedish Learner Language) essays. Each DaLAJ sentence contains exactly one error. |
|
Corpus |
Swedish |
|
Dalin: Then Swänska Argus 1732-1734
Manual transcription of Then Swänska Argus by Olof von Dalin, Stockholm, 1732–1734. For OCR analysis. |
|
Corpus |
Swedish |
|
Eukalyptus Treebank of Written Swedish
A treebank with written Swedish data, with parts-of-speech, TIGER-style syntax, multiword expressions and sense annotation |
|
Corpus |
Swedish |
|
SemEval2020 Task 1
Swedish Test Data for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection (extracts from Kubhist v2) |
|
Corpus |
Swedish |
|
SIC2 - Stockholm Internet Corpus
The Stockholm Internet Corpus (SIC2) contains Swedish blog posts, annotated with part of speech, morphological features, and named entities. |
|
Corpus |
Swedish |
|
SUC 2.0
Stockholm-Umeå corpus 2.0 |
|
Corpus |
Swedish |
|
SUC 3.0
Stockholm-Umeå corpus 3.0 |
|
Corpus |
Swedish |
|
SuperSim (repackaged for Superlim)
A test set for word similarity and relatedness in Swedish |
|
Corpus |
Swedish |
|
SweDiagnostics
Swedish version of the (Super)GLUE Diagnostic |
|
Corpus |
Swedish, English |
|
Swedish ABSAbank
An annotated Swedish corpus for aspect-based sentiment analysis |
|
Corpus |
Swedish |
|
Swedish ABSAbank-Imm 1.0
An annotated Swedish corpus for aspect-based sentiment analysis (a version of ABSAbank) |
|
Corpus |
Swedish |
|
Swedish analogy test set v1.0
Swedish semantic and syntactic similarity: test set |
|
Corpus |
Swedish |
|
Swedish FAQ (mismatched) 1.0
Frequently asked questions from Swedish authorities' websites with shuffled answers |
|
Corpus |
Swedish |
|
Swedish fraktur 1626-1816
A selection of fraktur texts printed between 1626 and 1816 from the collections of the University Library of the University of Gothenburg (UB). For OCR analysis. |
|
Corpus |
Swedish |
|
Swedish newspapers 1818-1870
A selection of Swedish newspapers printed between 1818 and 1870 from the collections of Kungliga biblioteket (KB). For OCR analysis. |
|
Corpus |
Swedish |
|
Swedish newspapers 1871-1906
A selection of Swedish newspapers printed between 1871 and 1906 from the collections of Kungliga biblioteket (KB). For OCR analysis. |
|
Corpus |
Swedish |
|
Swedish treebank
A Swedish treebank built from recycled language resources |
|
Corpus |
Swedish |
|
SweFraCas 1.0
Textual inference/entailment problem set |
|
Corpus |
Swedish |
|
SweParaphrase
A subset of the Semantic Textual Similarity reference data (STS Benchmark). |
|
Corpus |
Swedish |
|
SweSAT Swedish Scholastic Aptitude Test Synonyms
Swedish Scholastic Aptitude Test Synonyms |
|
Lexicon |
Swedish |
|
SweWiC
A Swedish Word-in-Context test set. |
|
Corpus |
Swedish |
|
SweWinogender
A Swedish test set for coreference and gender bias. |
|
Corpus |
Swedish |
|
SweWinograd
A Swedish test set for pronoun resolution. |
|
Corpus |
Swedish |
|
Syntag treebank
A Swedish treebank with syntactic analysis of 158 articles from Press-65. |
|
Corpus |
Swedish |
|
TalbankenSBX
Talbanken is a Swedish treebank. This is the Språkbanken Text version of Talbanken. |
|
Corpus |
Swedish |
|
TalbankenSTB
Talbanken is a Swedish treebank. |
|
Corpus |
Swedish |
|