An Old Swedish treebank annotated with lemmas, parts of speech, and PROIEL-style dependency syntax.
MAÞiR Trees contains five manually annotated Old Swedish texts or text
fragments: 'Här sigx aff abotum allum skemptan mykla', a satirical text
about abbots; the first chapter of Äldre Västgötalagen, a 13th-century
Västergötland provincial law; the first 20-something (edition) pages of
Pentateuchparafrasen, a paraphrase of the five books of Moses; the first
5 chapters (plus some fragments) of Östgötalagen, a 13th-century
Östergötland provincial law; and part 1 of the shorter version of
Tungulus (Visio Tnugdali/Vision of Tundale). The texts consist of more than 33 000
tokens and close to 2 500 sentences. The annotation is provided in PROIEL
XML files. The morpho-syntactic annotation in MAÞiR Trees follows the
Menotec guidelines, which belong to the PROIEL family of annotation
guidelines/treebanks. Lemmatization is based on Söderwall's dictionary
and supplements. The resource was compiled by Gerlof Bouma and Yvonne
Adesam as part of the Mathir project, financed by the Marcus and Amalia
Wallenberg Foundation (no. 2012.0146).
Citation
Språkbanken Text. (2024-04-17). MAÞiR Trees [Data set]. Språkbanken Text. https://doi.org/10.23695/705b-nq35
Annotation
Manually annotated with lemmas and with Menotec/PROIEL-style parts of speech and dependency structures.
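As an illustration of how these annotation layers might be read from a PROIEL XML file, here is a minimal Python sketch. It assumes the usual PROIEL XML layout, with `sentence` elements containing `token` elements that carry `form`, `lemma`, `part-of-speech`, `head-id`, and `relation` attributes. The inline sample document and its token values are invented placeholders, not data from the treebank, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document. The element and attribute names follow
# the PROIEL XML layout as assumed here; the token values are invented
# placeholders and do not come from MAÞiR Trees.
SAMPLE = """<proiel>
  <source id="sample">
    <div>
      <sentence id="1">
        <token id="1" form="form1" lemma="lemma1" part-of-speech="Nb" head-id="2" relation="sub"/>
        <token id="2" form="form2" lemma="lemma2" part-of-speech="V-" relation="pred"/>
      </sentence>
    </div>
  </source>
</proiel>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}token', if present."""
    return tag.rsplit("}", 1)[-1]


def read_sentences(xml_text):
    """Return a list of sentences, each a list of token dictionaries."""
    root = ET.fromstring(xml_text)
    sentences = []
    for elem in root.iter():
        if local_name(elem.tag) != "sentence":
            continue
        tokens = []
        for tok in elem:
            if local_name(tok.tag) != "token":
                continue
            tokens.append({
                "id": tok.get("id"),
                "form": tok.get("form"),
                "lemma": tok.get("lemma"),
                "pos": tok.get("part-of-speech"),
                # A token without head-id is treated as the sentence root.
                "head": tok.get("head-id"),
                "relation": tok.get("relation"),
            })
        sentences.append(tokens)
    return sentences


sentences = read_sentences(SAMPLE)
for tokens in sentences:
    for t in tokens:
        print(t["id"], t["form"], t["lemma"], t["pos"],
              t["head"] or "ROOT", t["relation"])
```

To apply the same function to a file extracted from the archive, its contents can be read as text and passed to `read_sentences`. The file names inside the archive are not stated here, so the path is left to the reader.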
References
- H. Eckhoff, K. Bech, G. Bouma, K. Eide, D. Haug, O. E. Haugen, M. Jøhndal (2018): The PROIEL treebank family: a standard for early attestations of Indo-European languages. Language Resources and Evaluation 52(1), 29-65.
File | Size | Modified | Licence |
---|---|---|---|
mathir_trees_v0.1.tgz (tgz) | 5.49 MB | 2024-04-17 | CC BY-NC 4.0 |