SuperLim

A standardized suite for evaluation and analysis of Swedish natural language understanding systems.


| Resource | Task | Split | Citation |
|---|---|---|---|
| Aspect-Based Sentiment Analysis (Immigration) | Label the sentiment that the author of a text expressed towards immigration on a 1–5 scale | 10-fold cross-validation, consecutive | [1] |
| DaLAJ | Determine whether a sentence is correct Swedish or not | Hold-out | [2] |
| Swedish FAQ (mismatched) | Match the question with the answer within a category | Test only | |
| SweSAT synonyms | Select the correct synonym or description of a word or expression | Test only | |
| Swedish Analogy test set | Given two word pairs A:B and C:D, capture that the relation between A and B is the same as that between C and D | Test only | [3] |
| Swedish Test Set for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection | Determine whether, and to what extent, a given word has changed its meaning during a hundred-year period | Test only | [4] |
| SweFraCas | Given the question and the premises, choose the suitable answer | Test only | |
| SweWinograd | Resolve pronouns to their antecedents in items constructed to require reasoning (Winograd schemata) | Test only | |
| SweWinogender | Find the correct antecedent of a personal pronoun, avoiding gender bias | Test only | [5] |
| SweDiagnostics | Determine the logical relation between two sentences | Test only | |
| SweParaphrase | Determine how similar two sentences are | Test only | |
| SuperSim | Predict semantic word similarity and/or relatedness between words out of context | Test only | [6] |
| SweWiC | Decide whether instances of a word in two contexts represent the same word sense | Test only | |
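As an illustration of the analogy task above (this baseline is not part of SuperLim itself), the classic vector-offset method predicts D from A:B :: C:D as the word whose embedding is closest to B − A + C. The toy vectors and word list below are invented for the example; a real evaluation would use trained Swedish embeddings:

```python
import numpy as np

def solve_analogy(emb, a, b, c):
    """Return the word d such that a:b :: c:d, using the vector-offset
    method (d is the nearest neighbour of b - a + c by cosine similarity),
    excluding the three query words themselves."""
    target = emb[b] - emb[a] + emb[c]
    target = target / np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for word, vec in emb.items():
        if word in (a, b, c):
            continue
        sim = np.dot(vec, target) / np.linalg.norm(vec)
        if sim > best_sim:
            best, best_sim = word, sim
    return best

# Made-up 3-dimensional vectors for four Swedish words.
emb = {
    "kung":      np.array([1.0, 1.0, 0.0]),
    "man":       np.array([1.0, 0.0, 0.0]),
    "kvinna":    np.array([0.0, 0.0, 1.0]),
    "drottning": np.array([0.0, 1.0, 1.0]),
}

print(solve_analogy(emb, "man", "kung", "kvinna"))  # drottning
```

Accuracy on the test set is then simply the fraction of items for which the predicted D matches the gold answer.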

Frequently asked questions

How do I cite SuperLim?

  • To cite the suite as a whole, use the standard reference given below.
  • If you discuss individual resources, for instance when reporting results on different SuperLim tasks, also cite the references for those resources, even if you discuss all of them. The references are given in the table above and in the documentation sheet for each resource.

Standard reference for SuperLim:
Yvonne Adesam, Aleksandrs Berdicevskis, Felix Morger (2020): SwedishGLUE – Towards a Swedish Test Set for Evaluating Natural Language Understanding Models.
[1] Original Absabank:
Jacobo Rouces, Lars Borin, Nina Tahmasebi (2020): Creating an Annotated Corpus for Aspect-Based Sentiment Analysis in Swedish, in Proceedings of the 5th Conference on Digital Humanities in the Nordic Countries, Riga, Latvia, October 21-23, 2020.
[2] DaLAJ:
Elena Volodina, Yousuf Ali Mohammed, Julia Klezl (2021): DaLAJ – a dataset for linguistic acceptability judgments for Swedish, in Proceedings of the 10th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2021). Linköping Electronic Conference Proceedings 177:3, pp. 28-37. https://ep.liu.se/ecp/177/003/ecp2021177003.pdf
[3] Analogy:
Tosin Adewumi, Foteini Liwicki, Markus Liwicki (2020): Corpora compared: The case of the Swedish Gigaword & Wikipedia corpora, in Proceedings of the 8th SLTC, Gothenburg. arXiv preprint arXiv:2011.03281
[4] Swedish Test Set for SemEval 2020 Task 1: Unsupervised Lexical Semantic Change Detection:
Dominik Schlechtweg, Barbara McGillivray, Simon Hengchen, Haim Dubossarsky, Nina Tahmasebi (2020): SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection, in Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020), Barcelona, Spain (Online), December 12, 2020.
[5] Winogender:
Saga Hansson, Konstantinos Mavromatakis, Yvonne Adesam, Gerlof Bouma, Dana Dannélls (2021): The Swedish Winogender Dataset, in The 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021), Reykjavik.
[6] SuperSim:
Simon Hengchen, Nina Tahmasebi (2021): SuperSim: a test set for word similarity and relatedness in Swedish, in The 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021), Reykjavik. arXiv preprint arXiv:2014.05228

How do different models perform on the SuperLim tasks?

SuperLim does not currently have a leaderboard (though we do hope to create one). As a temporary solution, we will be collecting the available information here. If you have evaluated your model on some of our data, please let us know and we will add your results!

  • Faton Rekathati's (KBLab) evaluation: SweParaphrase, Swedish FAQ, SweSAT, SuperSim.

Most resources do not have any training data!

Yes, in its current version SuperLim is mostly a suite of test sets (however, splits into train, dev and test are provided for some of the larger resources). We strive to develop the suite further, which will hopefully result in training data appearing here as well.

I am using contextualized embeddings (aka dynamic embeddings, aka token embeddings). How should I apply my model to those of your tasks where there is no context (e.g. Analogy)?

There is currently no predefined answer, since we do not want to impose any unnecessary restrictions on how the models solve the tasks. We suggest that you devise a suitable method yourself; for example, you can average across contextualized embeddings in order to generate "classic/static/type" embeddings. It is important, however, that you document very clearly what you do (if you average: how exactly? If you use any additional corpora for that, which ones?), since that might affect comparability.
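A minimal sketch of the averaging approach mentioned above, assuming you have already extracted a word's token vectors from several contexts with some encoder (e.g. a BERT-style model); the numbers here are made up for illustration:

```python
import numpy as np

def type_embedding(token_vectors):
    """Average one word's contextualized (token) vectors, collected from
    many contexts, into a single static/type vector, then L2-normalize."""
    mean = np.mean(np.asarray(token_vectors, dtype=float), axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean

# Hypothetical token vectors for one word observed in three contexts.
contexts = [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0]]
vec = type_embedding(contexts)
print(vec)  # unit-length average vector
```

How the contexts are sampled (and from which corpus) is exactly the kind of detail that should be documented, since it directly affects the resulting vectors.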

I trained a system and want to submit its results. How do I do that?

Instructions will appear here later. But if you have already trained something (even if it's one task and not the whole set), drop us a line about how it went; it will really help us!

I have a dataset that I think can become part of SuperLim.

Please contact us at sb-info@svenska.gu.se. Do the same if you have any other question not covered here.

SuperLim or SwedishGLUE?

The name of the collection is SuperLim. The initial work on it was funded by a project that is called SuperLim in Swedish and SwedishGLUE in English.

