A small collection of 10 pairs of parallel texts in Swedish and English annotated with personal information categories.
This is a small corpus of 10 pairs of parallel texts in Swedish and English, annotated with personal information categories. The annotation largely follows the annotation scheme of the TAB corpus (https://aclanthology.org/2022.cl-4.19/). The 20 texts were sourced from the Parallel Global Voices corpus (https://nlp.ilsp.gr/pgv/, CC BY 4.0) and manually annotated. That corpus, in turn, collected its texts from the Global Voices websites (https://globalvoices.org/, CC BY 3.0).
