Språkbanken Text is a department within Språkbanken.

Corpus of spoken isiXhosa

Standard reference

Eva-Marie Bloom Ström, Onelisa Slater, Aron Zahran, Aleksandrs Berdicevskis, Anne Schumacher (2023): Preparing a corpus of spoken Xhosa, in Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), Gothenburg and online, 11–12 September 2023, pages 62–67

Citation

Språkbanken Text (2024). Corpus of spoken isiXhosa (updated: 2024-11-26). [Data set]. Språkbanken Text. https://doi.org/10.23695/xrsg-mp07
A corpus of transcribed and annotated recordings of spoken Xhosa.

The Corpus of Spoken isiXhosa

The Corpus of Spoken isiXhosa consists of transcribed and annotated recordings of spoken Xhosa [xho]. The recordings have been made in the Eastern Cape, South Africa, from 2015 onwards. The transcribed texts are annotated with morpheme-by-morpheme glosses, part-of-speech tags, and free English translations.

The recordings and annotations of the Xhosa data were made as part of three research projects led by senior lecturer Eva-Marie Bloom Ström at the University of Gothenburg. All projects, including the ongoing ‘How do words get in order? The role of speaker-hearer interaction in languages of southern Africa’, were funded by the Swedish Research Council.

The Corpus has been developed in collaboration with Språkbanken Text.

For more on annotation, data preparation, and acknowledgements, see:

  • Bloom Ström, E.-M., Slater, O., Zahran, A., Berdicevskis, A., & Schumacher, A. (2023). Preparing a corpus of spoken Xhosa. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), 62–67. https://aclanthology.org/2023.clasp-1.7

For questions about the corpus:
Eva-Marie Bloom Ström eva-marie.strom@gu.se

If you notice any errors or inconsistencies in annotations, please report them to this email address.

Main contributors:

  • Eva-Marie Bloom Ström
    Senior Lecturer, University of Gothenburg
  • Onelisa Slater
    MA, Rhodes University
  • Aron Zahran
    PhD, Inalco/Llacan (CNRS) & Ghent University

Available via

Type

  • Corpus

Language

Size

Sentences: 1,347
Tokens: 7,039

Created

2024-05-08

Updated

2024-11-26

Contact

Språkbanken Text
sb-info@svenska.gu.se