
Gothenburg Dialogue Corpus (GDC)

Citation

Språkbanken Text. (2017-03-26). Gothenburg Dialogue Corpus (GDC) [Data set]. Språkbanken Text. https://doi.org/10.23695/p2v4-6g89
GDC is a collection of 360 individual dialogues transcribed from recordings.
Gothenburg Dialogue Corpus (GDC) is a collection of 360 individual dialogues transcribed from recordings of about 25 different social activities. The corpus was initiated in the late 1970s to meet a growing interest in naturalistic spoken language data. The GDC data is very diverse across the different social activities with regard to punctuation, grammar, vocabulary and the role of language and communication in human social life. The corpus consists of both audio (50%) and audio/video (50%) recordings of naturally occurring interactions.

For access, please contact data@flov.gu.se.
File: stats_GDC.txt (Word statistics: Information, CSV)
Size: 3.95 MB
Modified: 2017-03-26
Licence: CC BY 4.0

Type

  • Corpus

Language

Swedish

Size

Sentences: 107,700
Tokens: 1,473,608

Updated

2017-03-26

Contact

Institutionen för filosofi, lingvistik och vetenskapsteori
data@flov.gu.se