
SweLLex

Standard reference

Elena Volodina, Ildikó Pilán, Lorena Llozhi, Baptiste Degryse, Thomas François (2016): SweLLex: second language learners' productive vocabulary, in Linköping Electronic Conference Proceedings. Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition at SLTC, Umeå, 16th November 2016.

Data citation

Elena, Volodina, & Ildikó, Pilán (2016). SweLLex (updated: 2016-06-09). [Data set]. Språkbanken Text. https://doi.org/10.23695/6h3v-zw25

SweLLex is a lexicon of productive vocabulary for Swedish as a second/foreign language (SVA). Like its sister resource, SVALex, it reports the normalized frequencies of words (lemmas) across six levels of the CEFR (Common European Framework of Reference for Languages). As in SVALex, it covers both single words and multi-word expressions and records their usage at each level, which is rarely available in resources of this kind.

The frequencies have been estimated on a corpus of essays written by SVA learners, the SweLL-pilot corpus, described in the article:

  • Elena Volodina, Ildikó Pilán, Ingegerd Enström, Lorena Llozhi, Peter Lundkvist, Gunlög Sundberg, Monica Sandell. 2016. SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies. Proceedings of LREC 2016, Slovenia.

More details on the SweLLex resource are provided on this webpage and in the article by Volodina et al. (2016).

Annotation

CEFR levels, lemmatization, POS-tagging, frequency

Intended uses

teaching L2 Swedish, developing CALL and ICALL systems, using as features in classification, profiling Swedish as a second language

Accessible through

Licence: CC BY 4.0

Download

File: SweLLex_v1_xlsx.tar.bz2 (xlsx)
Size: 3.21 MB. Modified: 2025-01-24. Licence: CC BY 4.0
Columns: word (lemma), word class (SUC tags), normalized frequencies per level and total (format: level_freq@a1), number of documents per level (format: nb_doc@a1), frequencies in unique learner essays by level and subcorpus (format: c1_TISUS1, a1_SpIn31)

File: SweLLex_v1_tsv.tar.bz2 (tsv)
Size: 213.59 KB. Modified: 2025-01-24. Licence: CC BY 4.0
Columns: word (lemma), word class (SUC tags), normalized frequencies per level and total (format: level_freq@a1), number of documents per level (format: nb_doc@a1), frequencies in unique learner essays by level and subcorpus (format: c1_TISUS1, a1_SpIn31)
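
The column formats listed above can be read programmatically. The following is a minimal Python sketch, not an official loader: it assumes the TSV has a header row whose names follow the documented patterns (level_freq@a1, nb_doc@a1, c1_TISUS1, a1_SpIn31). The "word" and "tag" header names, the sample rows, and the helper names classify_column, first_level and load_rows are all hypothetical and introduced only for illustration. The values in the sample are invented and are not taken from SweLLex.

```python
import csv
import io
import re

# Hypothetical header and rows, invented only to exercise the parser.
# The real file may name or order its columns differently.
SAMPLE_TSV = (
    "word\ttag\tlevel_freq@a1\tlevel_freq@a2\tnb_doc@a1\tnb_doc@a2\ta1_SpIn31\n"
    "exempel\tNN\t10.0\t20.5\t2\t3\t1\n"
    "och\tKN\t300.0\t280.0\t5\t6\t4\n"
)

# The six CEFR levels, lowest to highest.
CEFR_LEVELS = ("a1", "a2", "b1", "b2", "c1", "c2")


def classify_column(name):
    """Return (kind, level) for a column name, based on the documented formats."""
    # level_freq@a1 and nb_doc@a1 style columns
    m = re.fullmatch(r"(level_freq|nb_doc)@(\w+)", name)
    if m:
        return m.group(1), m.group(2).lower()
    # Per-essay columns such as c1_TISUS1 or a1_SpIn31: level, underscore, essay id
    m = re.fullmatch(r"([abc][12])_(.+)", name, flags=re.IGNORECASE)
    if m:
        return "essay", m.group(1).lower()
    # Anything else (lemma, word class, ...) is treated as metadata
    return "meta", None


def first_level(row):
    """Lowest CEFR level at which the lemma has a non-zero normalized frequency."""
    for level in CEFR_LEVELS:
        value = row.get(f"level_freq@{level}")
        if value not in (None, "") and float(value) > 0:
            return level
    return None


def load_rows(handle):
    """Read tab-separated rows into dictionaries keyed by the header names."""
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


rows = load_rows(io.StringIO(SAMPLE_TSV))
for row in rows:
    print(row["word"], first_level(row))
```

To use this with the downloaded data, the archive would first need to be unpacked and the extracted TSV opened with open(path, encoding="utf-8", newline=""). The file name inside the archive and its encoding are not stated on this page, so both should be checked before relying on the sketch.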

Type

  • Lexicon

Language

Swedish

Size

Entries: 6,967

Keywords

  • word lists
  • L2
  • productive vocabulary

Creators

  • Elena, Volodina
  • Ildikó, Pilán

Created

2016-06-09

Updated

2016-06-09

Contact

sb-info@svenska.gu.se