MAÞiR

The Swedish language of the Middle Ages, Old Swedish (ca. 1225-1526), is preserved in manuscripts, letters and early print. These documents are valuable to a wide range of researchers: linguists interested in how Swedish changed during that period, legal scholars who want to explore mediaeval laws, theologians who study early translations of Bible texts, and medical historians who are interested in mediaeval folk healing.

In the MAÞiR project (Methods for the automatic Analysis of Text in digital Historical Resources), we create tools for automatic linguistic analysis of Old Swedish. The project is related to Språkbanken's historical resource efforts, Diabase, and lies in the field of computational linguistics and natural language processing. By adding grammatical information to digitized Old Swedish texts, we can facilitate studies of this cultural heritage and enable new ways to explore it.

Developing tools for Old Swedish is a demanding task, even with the best computational linguistic methods, because of the properties of the Old Swedish texts. First, the language of the time was changing, for example in word order and inflection. Second, there was no orthographic standard in the modern sense, so the same word could be spelled in many different ways. The word "maþir", meaning man or human, was for example also spelled "mæþr", "mander" or "meþer", and different spellings could even occur within the same paragraph. Third, the language varies between the texts: about 300 years passed between the earliest and the latest texts, and they come from different geographical areas and belong to different genres. Fourth, most automatic methods require either a very detailed computational description of the language or a large amount of text that has already been linguistically annotated and can serve as training material for the computer. Neither is currently available for Old Swedish. The core of the MAÞiR project is exploring ways of handling these challenges in the Old Swedish texts.
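To make the spelling problem concrete, the sketch below links the variant spellings quoted above to a headword by plain edit distance. It is an illustration only, not the method used in the project: the function names, the toy headword list (including the contrast word "bok") and the distance threshold are all assumptions chosen for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the standard two-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def link_to_headword(form, headwords, max_distance=3):
    """Return the closest headword within max_distance, or None.

    Hypothetical helper: the threshold is an arbitrary assumption.
    """
    best = min(headwords, key=lambda h: edit_distance(form, h), default=None)
    if best is None or edit_distance(form, best) > max_distance:
        return None
    return best


# Toy lexicon, assumed for illustration; "maþir" is the headword from the text.
HEADWORDS = ["maþir", "bok"]
VARIANTS = ["mæþr", "mander", "meþer"]

links = {v: link_to_headword(v, HEADWORDS) for v in VARIANTS}
for variant, headword in links.items():
    print(variant, "->", headword)
```

A uniform edit distance treats every character change as equally likely, which is a crude simplification for historical spelling. With a realistic lexicon a fixed threshold would also link many unrelated words, which is one reason the task is harder than this sketch suggests.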

Publications

2018

Yvonne Adesam, Malin Ahlberg, Gerlof Bouma (2018): FSvReader – Exploring Old Swedish Cultural Heritage Texts, in CEUR Workshop Proceedings, vol. 2084. Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference, Helsinki, Finland, March 7-9, 2018. Edited by Eetu Mäkelä, Mikko Tolonen, Jouni Tuominen
H. Eckhoff, K. Bech, Gerlof Bouma, K. Eide, D. Haug, O. E. Haugen, M. Johndal (2018): The PROIEL treebank family: a standard for early attestations of Indo-European languages, in Language Resources and Evaluation, volume 52, issue 1, pages 29-65

2017

Gerlof Bouma, Yvonne Adesam (2017): Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language

2016

Gerlof Bouma, Yvonne Adesam (2016): Part-of-speech and Morphology Tagging Old Swedish, in Proceedings of the Sixth Swedish Language Technology Conference (SLTC) Umeå University, 17-18 November, 2016
Yvonne Adesam, Gerlof Bouma (2016): Old Swedish Part-of-Speech Tagging between Variation and External Knowledge, in Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, Berlin, Germany, August 11, 2016

2014

Yvonne Adesam, Malin Ahlberg, Peter Andersson, Gerlof Bouma, Markus Forsberg, Mans Hulden (2014): Computer-aided Morphology Expansion for Old Swedish, in Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) May 26-31, 2014 Reykjavik, Iceland, pages 1102-1105

2013

Gerlof Bouma, Yvonne Adesam (2013): Experiments on sentence segmentation in Old Swedish editions, in NEALT Proceedings Series, volume 18

2012

Yvonne Adesam, Malin Ahlberg, Gerlof Bouma (2012): Processing spelling variation in historical text, in Proceedings of the Fourth Swedish Language Technology Conference (SLTC)
Yvonne Adesam, Malin Ahlberg, Gerlof Bouma (2012): bokstaffua, bokstaffwa, bokstafwa, bokstaua, bokstawa... Towards lexical link-up for a corpus of Old Swedish, in Proceedings of the LTHist workshop at Konvens
