Pseudonymization of learner essays as a way to meet GDPR requirements

This blog post is based on joint research with Yousuf (Samir) Ali Mohammed, Arild Matsson, Beáta Megyesi and Sandra Derbring.

Access to language data is an obvious prerequisite for research in digital humanities in general, and for the development of NLP-based tools in particular.

However, making data accessible becomes a challenge when personal data is involved. This is especially true of language learner data, where tasks are often phrased so that they, directly or indirectly, elicit explicit personal information, e.g. “Describe your school” or …
