Dana Dannélls


Doctor of Philosophy in Natural Language Processing
Master of Arts in Computational Linguistics
Bachelor of Electronics and Computer Engineering


Dana Dannélls is a researcher in language technology at Språkbanken, University of Gothenburg. Before that, she was a project administration officer at the Centre for Language Technology (CLT) in Gothenburg. During 2013 she was a postdoctoral researcher at the Department of Computer Science and Engineering at Chalmers, exploring methods for generating multilingual natural language from Semantic Web ontologies. Prior to this, she completed her PhD at the University of Gothenburg and the Graduate School of Language Technology (GSLT) on Multilingual text generation from structured formal representations. She received her M.S. degree in Computational Linguistics from Språkbanken at the University of Gothenburg in 2006, and her B.S. degree in Electronics and Computer Engineering from Göteborgs Tekniska Institut (GTI) in 2002.

Her main research interests are Multilingual Natural Language Generation, that is, the process of automatically generating text in multiple languages from non-linguistic data; Lexical Semantics, that is, the study of the meanings (senses) of individual words and the relations between them; and automatic methods for extracting and representing different kinds of knowledge.

She was the local organizer of NoDaLiDa 2017 and EACL 2014. She also organized the 6th CLT workshop and several annual Språkbanken workshops. She has served, and continues to serve, as a program committee member for several journals, international conferences and workshops, including: Journal on Computing and Cultural Heritage (JOCCH), Semantic Web journal (SWJ), Language Resources and Evaluation Conference (LREC), the Digital Humanities Conference (DH), WEB, Semantic Web and Information Extraction (SWAIE) workshop, the International FrameNet workshop, the EMNLP-CoNLL workshop on Phonology, Morphology, Tagging, Chunking and Word Segmentation, and the COLING workshop on Semantics and PARSing and Multi-word Expressions.

Research interests

  • Multilingual Natural Language Generation
  • Lexical Semantics
  • Semantic Web
  • Knowledge Representation
  • Digital Humanities

Research description

My research interests in language technology span the areas of textual analysis, lexical semantics, multilingual natural language generation, and knowledge representation standards. I have specific expertise in developing natural language applications and resources. Recently, I have been involved in several digital humanities projects within SWE-CLARIN. Since January 2019 I have been involved in a research infrastructure project in collaboration with the National Library of Sweden (KB), where the aim is to improve KB's Optical Character Recognition (OCR) process, especially in relation to the digitisation of newspapers. We are focusing on reducing OCR errors with the help of electronic dictionaries and word lists automatically extracted from corpora. Experiments on improving OCR with specific word lists, which we carried out in the project A free cloud service for OCR, demonstrated the usefulness of this approach when applied to historical material.

Another research project I am a part of is the Swedish FrameNet (SweFN++), a lexical-semantic resource based on the theory of frame semantics that has been expanded from and constructed in line with the Berkeley FrameNet. A large part of my work within the project involves the development of domain-specific semantic frames, semantic and syntactic annotation of examples, and automatic production of texts from FrameNet data. One focus of my work is on developing NLP applications that exploit FrameNet data. One example of an application we developed is the generation of a computational multilingual FrameNet-based grammar and lexicon from FrameNet-annotated corpora.

Other NLP projects related to my research interests that I have participated in are:

The Swedish constructicon project (SweCcn), which built a large electronic database of Swedish constructions, developed as an extension of the Swedish FrameNet. My work focused on developing an automatic approach based on the resource grammar library provided by Grammatical Framework. We acquired a computational construction grammar from the Swedish constructicon in order to extend and improve the Swedish resource grammar library. Another line of my work concerned the use of statistical methods to validate constructions that are targeted towards second language (L2) learners.

The Linked Open Data (LTLOD@SB) project, in which we published four Swedish lexical-semantic resources available at Språkbanken as RDF using Lemon.

I was the leader of the work package Case study: Cultural Heritage in the MOLTO EU project, coordinated at the Department of Computer Science and Engineering at Chalmers. I worked mainly with texts from the cultural heritage domain in 15 languages. My contributions to the project were multilingual natural language generation from Semantic Web ontologies in the Grammatical Framework (GF), and multilingual knowledge extraction from Wikipedia articles. The work resulted in an online multilingual system that enables interaction with digital museum libraries through natural language text.

The EU project Semantic Mining in Biomedicine, coordinated at the University of Gothenburg. My work in the project involved experiments with semantic mining techniques to extract texts from Medline that were written for different groups of readers, and natural language generation of biomedical texts in three languages: English, Swedish and French.



  • Dana Dannélls, Lars Björk, Ove Dirdal, Torsten Johansson (2021): A Two-OCR Engine Method for Digitized Swedish Newspapers, in Selected Papers from the CLARIN Annual Conference 2020, Linköping Electronic Conference Proceedings 180.
  • Saga Hansson, Konstantinos Mavromatakis, Yvonne Adesam, Gerlof Bouma, Dana Dannélls (2021): The Swedish Winogender Dataset, in Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), May 31 - June 2, 2021, Reykjavik, Iceland (online).
  • Molly Skelbye, Dana Dannélls (2021): OCR Processing of Swedish Historical Newspapers Using Deep Hybrid CNN–LSTM Networks, in Proceedings of the International Conference on Recent Advances in Natural Language Processing, 1–3 September, 2021 / Edited by Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov, Ivelina Nikolova-Koleva.
  • Shafqat Virk, Dana Dannélls, Azam Sheikh Muhammad (2021): A Novel Machine Learning Based Approach for Post-OCR Error Detection, in Proceedings of the International Conference on Recent Advances in Natural Language Processing, 1–3 September, 2021 / Edited by Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov, Ivelina Nikolova-Koleva.
  • Shafqat Virk, Dana Dannélls, Lars Borin, Markus Forsberg (2021): A Data-Driven Semi-Automatic Framenet Development Methodology, in Proceedings of the International Conference on Recent Advances in Natural Language Processing, 1–3 September, 2021 / Edited by Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov, Ivelina Nikolova-Koleva.















