	title        = {Metadata Formats for Learner Corpora: Case Study and Discussion},
	booktitle    = {Proceedings of the 11th Workshop on Natural Language Processing for Computer-Assisted Language Learning (NLP4CALL 2022)},
	author       = {Lange, Herbert},
	year         = {2022},
	publisher    = {Linköping University Electronic Press},
	address      = {Linköping},

	title        = {RefCo and its Checker: Improving Language Documentation Corpora’s Reusability Through a Semi-Automatic Review Process},
	booktitle    = {Proceedings of the Thirteenth Language Resources and Evaluation Conference},
	author       = {Lange, Herbert and Aznar, Jocelyn},
	year         = {2022},
	publisher    = {European Language Resources Association},
	address      = {Marseille, France},

	title        = {QUEST: Guidelines and Specifications for the Assessment of Audiovisual, Annotated Language Data},
	abstract     = {

This guide documents the main results of the joint project “QUEST: Quality – Established: Qualitätsstandards und Kurationskriterien für audiovisuelle annotierte Sprachdaten”, which was carried out between 2019 and 2022 and funded by the German Federal Ministry of Education and Research (BMBF). The project consortium consisted of the University of Hamburg, the Leibniz-Centre General Linguistics (ZAS) in Berlin, the Archive for Spoken German (AGD)/Institute for the German Language (IDS) in Mannheim and the University of Cologne. The BBAW in Berlin was also involved through the ‘Endangered Languages Documentation Programme’.

The main aim of the project was to maximise the potential for reuse and secondary use of audiovisual, annotated language data. For this purpose, QUEST developed quality standards and curation criteria for several reuse scenarios such as ‘Language Documentation’, ‘Learner Corpora’, ‘Interpreted Corpora’, ‘Sign Language’, ‘Language Community’, ‘Ethnography’ and ‘Oral History’. Based on this, quality assurance procedures (an online questionnaire and automated quality checks) were implemented and tested on authentic data.

In summary, the guidelines document provides definitions and examples for the quality criteria elaborated in QUEST, which are intended to provide information on the reuse potential of audiovisual, annotated data, and aims to give an overview of the objects and workflows of the evaluation system. Quality standards and curation criteria are linked to data maturity levels, and suggestions are made on how to evaluate each criterion.
	},
	journal      = {Working Papers in Corpus Linguistics and Digital Technologies: Analyses and Methodology},
	author       = {Wamprechtshammer, Anna and Arestau, Elena and Aznar, Jocelyn and Hedeland, Hanna and Isard, Amy and Khait, Ilya and Lange, Herbert and Majka, Nicole and Rau, Felix},
	year         = {2022},
	volume       = {8},