Guest talk by Špela Arhar Holdt - Leveraging Error-Annotated Corpora and the Svala Tool: The Case of Slovene


20 June 2023, 10:00–11:30


Room: C362, Humanisten

Abstract: This presentation covers error-annotated corpora and the Svala Tool, focusing on their application in Slovene language learning and error detection. I will begin by introducing three different types of Slovene corpora with error annotations (Šolar, Lektor, and KOST) and discuss the methodology behind their creation prior to the localization of the Svala Tool. Next, I will outline the process of localizing and adapting the Svala Tool for Slovene. I will present the CJVT Svala version of the tool, report on its current use, and share our largely positive user experience. I will then highlight other novelties in building error-annotated corpora that we are currently working on. These include a portal for collecting school essays, the newly developed XML TEI format for error-annotated corpora, and a specialized concordancer for corpus analysis. Finally, I will outline our plans to use error-annotated corpora to develop automated error detection, with a specific focus on supporting the improvement of written language skills in Slovene.