
Swedish EAT: question classification

A translation of the QAQC dataset for classifying the expected answer type
I. IDENTIFYING INFORMATION
Title* Swedish EAT v1.0
Subtitle
Created by* Jonatan Cerwall (jonatancerwall@gmail.com)
Publisher(s)* Språkbanken Text
Link(s) / permanent identifier(s)*
License(s)*
Abstract* This dataset is a Swedish translation of the QAQC dataset (https://cogcomp.seas.upenn.edu/Data/QA/QC/) for expected-answer-type (EAT) classification. The labels follow the Li and Roth taxonomy, available from the same source.
Funded by*
Cite as Cerwall, J. (2021). What the BERT? Fine-tuning KB-BERT for Question Classification. Unpublished manuscript, School of Electrical Engineering and Computer Science, KTH.
Related datasets
II. USAGE
Key applications Machine learning, EAT Classification
Intended task(s)/usage(s) Evaluate models by standard classification
Recommended evaluation measures Accuracy
Dataset function(s) Testing
Recommended split(s) Test only
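Accuracy, the recommended measure, is the fraction of test questions whose predicted label equals the gold label. A minimal sketch follows; the function name and the example labels are illustrative assumptions, not part of the dataset, and the same function could be applied to either the coarse or the fine labels.

```python
def accuracy(gold, predicted):
    """Return the fraction of positions where predicted equals gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty label list")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical coarse labels for four questions (not taken from the dataset).
gold = ["LOC", "HUM", "NUM", "DESC"]
predicted = ["LOC", "HUM", "ENTY", "DESC"]
score = accuracy(gold, predicted)  # three of four labels match
```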
III. DATA
Primary data* Text
Language* Swedish
Dataset in numbers* 5451 questions in the training set and 500 in the test set.
Nature of the content* Open-ended factoid questions.
Format* Comma-separated, four columns:
text -- the open-ended factoid question
verbose label -- both the coarse-grained label and the fine-grained label formatted as COARSE:fine
coarse label -- coarse-grained label
fine label -- fine-grained label
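The four-column layout above can be read with Python's standard `csv` module. The sketch below is illustrative only: the field names, the absence of a header row, the quoting style, and the sample row are assumptions, and it further assumes that the fine label column holds only the part after the colon.

```python
import csv
import io
from typing import NamedTuple


class EatRow(NamedTuple):
    # Field names are assumptions; the source only describes the four columns.
    text: str
    verbose_label: str
    coarse_label: str
    fine_label: str


def read_rows(handle):
    """Parse four-column comma-separated rows into EatRow tuples."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        if len(record) != 4:
            raise ValueError(f"expected 4 columns, got {len(record)}: {record!r}")
        rows.append(EatRow(*record))
    return rows


def is_consistent(row):
    """Check that the verbose label equals COARSE:fine built from the other columns."""
    return row.verbose_label == f"{row.coarse_label}:{row.fine_label}"


# Hypothetical sample row, not taken from the dataset.
sample = io.StringIO('"Vilken stad är Sveriges huvudstad?",LOC:city,LOC,city\n')
rows = read_rows(sample)
```

For a downloaded file, the same function could be called on a handle opened with `encoding="utf-8"` and `newline=""`; if the file turns out to have a header row, it would need to be skipped first.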
Data source(s)* Translated from the QAQC dataset (https://cogcomp.seas.upenn.edu/Data/QA/QC/)
Data collection method(s)* --
Data selection and filtering* --
Data preprocessing* --
Data labeling* --
Annotator characteristics
IV. ETHICS AND CAVEATS
Ethical considerations Some questions reflect an outdated treatment of women (e.g. "Vilka är de sexigaste kvinnorna i världen?", in English "Who are the sexiest women in the world?").
Things to watch out for
V. ABOUT DOCUMENTATION
Data last updated* 2021-07-27
Which changes have been made, compared to the previous version* First version
Access to previous versions
This document created* 2021-07-27
This document last updated* 2023-06-08
Where to look for further details
Documentation template version*
VI. OTHER
Related projects
References

Annotation

Classification of factoid questions by the type of the answer that is expected (coarse label and fine-grained label)

Caveats

Some questions reflect an outdated treatment of women (e.g. "Vilka är de sexigaste kvinnorna i världen?", in English "Who are the sexiest women in the world?").
File Size Modified License
361.34 KB 2023-06-08 CC BY 4.0
2.05 KB 2023-06-08 CC BY 4.0

Type

  • Corpus
  • Training and evaluation data

Language

Swedish

Size

Sentences: 0
Tokens: 0

Keywords

  • gold
  • translated dataset
  • neither-corpus-nor-lexicon

Contact

Språkbanken
sb-info@svenska.gu.se