CLT seminar

Dates

20 October 2022 10:30–11:30

Location

C442

Invited

Public

Speaker: Herbert Lange

Title: Semi-automatic quality assurance for audiovisual corpus data

Abstract:
Gathering high-quality language data is important for linguistic research. This is particularly challenging for audiovisual data, for example in language documentation, but also for learner and sign language data. The data has to be transcribed and annotated, and the annotations must be understandable for later reuse. In the QUEST project (QUality ESTablished), we developed processes and criteria to validate and improve audiovisual corpus data, and implemented automatic validation procedures where possible.
I will present a semi-automatic review process to improve and certify the quality of corpus data as well as a concrete implementation of relevant criteria for language documentation.