Lemmatization model: Stanza

Data citation

Språkbanken Text (2020). Lemmatization model: Stanza (updated: 2020-11-19). [Data set]. Språkbanken Text. https://doi.org/10.23695/681b-be74


Pretrained model for lemmatization.

Models

We provide a model that enables lemmatization of Swedish text following the SUC3 standard. Note that SUC3 lemmatization does not exactly match the SALDO standard that is used in our Korp resources.

SUC3 was randomly split into training, validation, and test sets (80:10:10). The model was trained for 30 epochs using the default Stanza settings. Its accuracy on the test set is 99.18%.
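
As an illustration of the 80:10:10 random split described above, here is a minimal sketch. It is not the script used to build the model. The function name `split_80_10_10`, the fixed seed, and the choice to split at sentence level are all assumptions made for the example.

```python
import random


def split_80_10_10(sentences, seed=0):
    """Shuffle the items and cut them into train/validation/test parts.

    Assumption: the unit being split is a sentence and the seed is
    arbitrary; the page does not state either detail.
    """
    items = list(sentences)
    random.Random(seed).shuffle(items)  # local RNG, reproducible shuffle

    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(items) * 8 // 10
    n_valid = len(items) // 10

    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


train, valid, test = split_80_10_10(range(100))
print(len(train), len(valid), len(test))
```

Fixing the seed keeps the three sets stable between runs, which matters when a reported test accuracy should stay comparable.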

Download

File	Size	Modified	Licence
lem_stanza.zip	3.74 MB	2020-11-19	CC-BY-4.0
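
Once the archive is unpacked, the model can in principle be loaded through Stanza's `Pipeline`, which accepts a `lemma_model_path` argument. The sketch below makes several assumptions: the path to the unpacked model file is a placeholder (the file name inside `lem_stanza.zip` is not given here), the helper names `build_lemma_config` and `lemmatize` are hypothetical, and tokenization and tagging are left to whatever Swedish Stanza models are already installed locally.

```python
def build_lemma_config(lemma_model_path, use_gpu=False):
    """Return keyword arguments for stanza.Pipeline.

    lemma_model_path is a placeholder for the file unpacked from
    lem_stanza.zip; the actual file name is an assumption.
    """
    return {
        "lang": "sv",
        "processors": "tokenize,pos,lemma",
        "lemma_model_path": lemma_model_path,
        "use_gpu": use_gpu,
    }


def lemmatize(text, lemma_model_path):
    """Run the pipeline and return (word, lemma) pairs."""
    import stanza  # third-party; imported lazily so the config helper works without it

    nlp = stanza.Pipeline(**build_lemma_config(lemma_model_path))
    doc = nlp(text)
    return [(word.text, word.lemma)
            for sentence in doc.sentences
            for word in sentence.words]


config = build_lemma_config("path/to/unpacked/lemmatizer.pt")
print(config["processors"])
```

Calling `lemmatize("Flickorna läser böcker.", "path/to/unpacked/lemmatizer.pt")` would then return the lemmas, provided the Swedish tokenizer and tagger models are available to Stanza. Since this lemmatizer follows SUC3, the lemmas may differ from SALDO-based annotations.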
