SwedishGlue: a benchmark suite for language models

Background

Swedish NLP is currently undergoing a transformative breakthrough in the development of large-scale Swedish language models. These models have the capacity to significantly improve the performance of virtually all types of language technology applications for Swedish. Since the algorithms and implementations are publicly available, some Swedish models already exist, and more will be created during 2020. The models will be actively used in academia and in the private and public sectors.
There is, however, a lack of evaluation data for Swedish, which makes it impossible to estimate the quality of the models. In addition, earlier studies of English language models show that the models are sensitive to the kind of data they are trained on: biases that exist in the training data become an inherent part of the model. Without Swedish evaluation data it is impossible to further improve the quality of the Swedish models, to make their inherent biases visible, or to remove them.

Project description

For these reasons, RISE, the National Library of Sweden (KB), Språkbanken Text and AI Innovation of Sweden are joining forces to create evaluation sets for Swedish language models. The evaluation sets will to some extent mirror their well-established English counterparts, such as (Super)GLUE.

Organizations

RISE, National Library of Sweden (KB), Språkbanken and AI Innovation of Sweden.

Publications

2020

Yvonne Adesam, Aleksandrs Berdicevskis, Felix Morger (2020): SwedishGLUE – Towards a Swedish Test Set for Evaluating Natural Language Understanding Models
