The Dictionary/Grammar Reading Machine: Computational Tools for Accessing the World’s Linguistic Heritage (DReaM).
The diversity of the world's 6,500 languages embodies a wealth of information on human cognition and the history of populations. As languages go extinct, the linguistic heritage of humankind increasingly resides in grammars and dictionaries, which are rapidly accumulating. Accessing this heritage requires that the descriptions are available and that someone reads them. Availability is a problem because publications are often difficult to access.
In this project we aim to enhance access to the world’s linguistic heritage by making an existing collection of more than 9,000 PDF documents that are no longer protected by copyright available in a stable archive. The archive will be enriched with added metadata and with computational tools developed to search for information within the texts. Moreover, a number of dictionaries will be converted to apps for mobile devices that can be distributed to speakers of minority languages, handing back to these speakers some of their linguistic heritage. The developed resources, particularly the grammatical descriptions, will be used for experimentation and for developing methodologies for automatic extraction of linguistic features.
The project is funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 6995327 and runs from 2018-02-01 to 2020-12-31.
Project partners:
- Uppsala University, Department of Linguistics and Philology, Sweden
- Språkbanken, Department of Swedish, University of Gothenburg, Sweden
- Leiden University Centre for Linguistics (LUCL), Netherlands
- Centre National de la Recherche Scientifique, Langage, Langues et Cultures d'Afrique Noire (LLACAN), France
For more information, please visit: http://stp.lingfil.uu.se/~harald/dream.html