Standard reference
Elena Volodina, Ildikó Pilán, Ingegerd Enström, Lorena Llozhi, Peter Lundkvist, Gunlög Sundberg, Monica Sandell (2016): SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies, in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 23-28, 2016, Portorož, Slovenia
Data citation
Elena Volodina, Ildikó Pilán, Ingegerd Enström, Lorena Llozhi, Peter Lundkvist, Gunlög Sundberg, & Monica Sandell (2016). SweLL-pilot (updated: 2016-01-01). [Data set]. Språkbanken Text. https://doi.org/10.23695/d3bg-8g24

Annotation
Essays are manually graded on the CEFR scale by human teachers. The essays are also linguistically annotated (POS tagging, lemmatization, dependency annotation) using Sparv.
Caveats
Data collection is limited to a small geographical area and a short period of time. Although several language backgrounds are represented, the corpus is very unbalanced in this respect and is consequently not well suited for native language identification tasks. The corpus consists of three subcorpora, each coming from a different source, with different proficiency levels and different collection periods. While the three subcorpora can be used together, care should be taken to ensure that artifacts from these differences do not leak into models.
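One way to check whether a model is picking up subcorpus artifacts rather than proficiency signals is to hold out one subcorpus at a time and compare the results with an ordinary random split. The following is a minimal sketch of such a split, not part of the corpus tooling. The record layout (`id`, `subcorpus`, `cefr`) and the subcorpus labels `A`, `B`, `C` are hypothetical placeholders, so adapt them to the actual export format.

```python
from collections import defaultdict


def leave_one_subcorpus_out(essays, key="subcorpus"):
    """Yield (held_out_name, train, test), holding out one subcorpus per fold."""
    groups = defaultdict(list)
    for essay in essays:
        groups[essay[key]].append(essay)
    # Iterate in sorted order so the folds are reproducible.
    for held_out in sorted(groups):
        train = [e for name, items in groups.items()
                 if name != held_out for e in items]
        test = list(groups[held_out])
        yield held_out, train, test


# Toy records: field names and subcorpus labels are hypothetical.
essays = [
    {"id": 1, "subcorpus": "A", "cefr": "A1"},
    {"id": 2, "subcorpus": "A", "cefr": "A2"},
    {"id": 3, "subcorpus": "B", "cefr": "B1"},
    {"id": 4, "subcorpus": "B", "cefr": "B2"},
    {"id": 5, "subcorpus": "C", "cefr": "C1"},
]

for held_out, train, test in leave_one_subcorpus_out(essays):
    print(held_out, len(train), len(test))
```

If performance drops sharply on the held-out subcorpus compared with a random split, the model may be relying on source-specific differences rather than on proficiency level.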
Intended uses
Automated grading using the CEFR scale, anonymization, second language acquisition studies
References
[HOW TO CITE 1]: Elena Volodina (2024): On two SweLL learner corpora – SweLL-pilot and SweLL-gold, in Huminfra Conference, pp. 83-94. https://doi.org/10.3384/ecp205012
[README]: Elena Volodina (2021). https://spraakbanken.github.io/swell-release-v1/Readme-SweLL-pilot
[HOW TO CITE 2]: Mats Wirén, Arild Matsson, Dan Rosén, Elena Volodina (2018): SVALA: Annotation of Second-Language Learner Text Based on Mostly Automatic Alignment of Parallel Corpora, in Selected papers from the CLARIN Annual Conference 2018, Pisa, 8-10 October 2018 / edited by Inguna Skadina, Maria Eskevich
Accessible through
Access | Platform | Licence |
---|---|---|
 | | CLARIN-ID, -PRIV, -NORED, -BY (https://www.kielipankki.fi/support/clarin-eula/#res) |