MAÞiR Trees contains five manually annotated Old Swedish texts or text fragments: 'Här sigx aff abotum allum skemptan mykla', a satirical text about abbots; the first chapter of Äldre Västgötalagen, a 13th-century provincial law from Västergötland; the first twenty-odd pages (as paginated in the edition) of Pentateuchparafrasen, a paraphrase of the five books of Moses; the first five chapters (plus some fragments) of Östgötalagen, a 13th-century provincial law from Östergötland; and part 1 of the shorter version of Tungulus (Visio Tnugdali, the Vision of Tundale). Together the texts comprise more than 33,000 tokens and close to 2,500 sentences.

The annotation is provided in PROIEL XML files. The morpho-syntactic annotation follows the Menotec guidelines, which belong to the PROIEL family of annotation guidelines and treebanks. Lemmatization is based on Söderwall's dictionary and its supplements. The resource was compiled by Gerlof Bouma and Yvonne Adesam as part of the Mathir project, financed by the Marcus and Amalia Wallenberg Foundation (grant no. 2012.0146).
Citation
Språkbanken Text (2024). MAÞiR Trees (updated: 2024-04-17). [Data set]. Språkbanken Text. https://doi.org/10.23695/705b-nq35
An Old Swedish treebank with lemmas, parts of speech, and PROIEL-style dependency syntax.
Annotation
Manually annotated with lemmas and with Menotec/PROIEL-style parts of speech and dependency structures.
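As a rough illustration of how these annotation layers might be read from a PROIEL XML file, the sketch below collects each token's form, lemma, part of speech, head and relation. It assumes the usual PROIEL XML layout (non-namespaced `sentence` and `token` elements carrying `form`, `lemma`, `part-of-speech`, `head-id` and `relation` attributes, with empty tokens lacking a `form`); those names should be checked against the files in the archive. The embedded sample is an invented toy fragment, not text from the corpus, and `read_sentences` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented toy fragment in an assumed PROIEL-like layout (NOT corpus data).
SAMPLE = """<proiel>
  <source id="toy">
    <div>
      <sentence id="1">
        <token id="1" form="A" lemma="a" part-of-speech="Ne" head-id="2" relation="sub"/>
        <token id="2" form="B" lemma="b" part-of-speech="V-" relation="pred"/>
        <token id="3" empty-token-sort="V" head-id="2" relation="xobj"/>
      </sentence>
    </div>
  </source>
</proiel>"""


def read_sentences(xml_text):
    """Return a list of sentences, each a list of token dicts.

    Assumes attribute names as in the PROIEL XML format; tokens without
    a 'form' attribute (empty tokens) are skipped.
    """
    root = ET.fromstring(xml_text)
    sentences = []
    for sent in root.iter("sentence"):
        tokens = []
        for tok in sent.iter("token"):
            if tok.get("form") is None:
                continue  # empty token: syntactic placeholder without a word form
            tokens.append({
                "id": tok.get("id"),
                "form": tok.get("form"),
                "lemma": tok.get("lemma"),
                "pos": tok.get("part-of-speech"),
                "head": tok.get("head-id"),  # None for the sentence root
                "relation": tok.get("relation"),
            })
        sentences.append(tokens)
    return sentences


sentences = read_sentences(SAMPLE)
n_tokens = sum(len(s) for s in sentences)
print(len(sentences), "sentence(s),", n_tokens, "token(s) with a form")
```

For a real file, the text of one of the extracted XML files would be passed to `read_sentences` in place of `SAMPLE`. Whether empty tokens should be counted depends on how the published token totals were computed, which the description does not state.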
References
H. Eckhoff, K. Bech, G. Bouma, K. Eide, D. Haug, O. E. Haugen, M. Johndal (2018): The PROIEL treebank family: a standard for early attestations of Indo-European languages. Language Resources and Evaluation, volume 52, issue 1, pages 29-65.
File | Size | Modified | Licence |
---|---|---|---|
mathir_trees_v0.1.tgz (other, tgz) | 5.49 MB | 2024-04-17 | CC BY-NC 4.0 (attribution) |