Research

Språkbanken's research unit develops state-of-the-art language technology and pursues theoretical and practical aims within different research areas. Our research focuses both on language technology itself (creating comprehensive, high-quality resources that are needed to develop tools and algorithms) and on questions from other disciplines.

Research theme

ICALL - Intelligent Computer-Assisted Language Learning

Research theme

ALZ-RJ-Cognitive Decline

Gold reserve

SwedishGlue: a benchmark suite for language models

2020–2021

The purpose of this project is to create high-quality test sets that enable all actors within Swedish NLP to evaluate and compare language models.

  • Markus Forsberg
  • Yvonne Adesam
  • Aleksandrs (Sasha) Berdicevskis
  • Dana Dannélls
  • Felix Morger
  • Gerlof Bouma
  • Magnus Sahlgren
  • Love Börjeson
  • Johanna Bergman
  • evaluation
  • language models
  • bias

Argumentation analysis and technology

2020–    

A joint project between Språkbanken Text, FLoV and CLASP, with the purpose of creating and exploring methods for argumentation technology.

  • Anna Lindahl
  • Stian Rødven Eide
  • Axel Almquist
  • Bill Noble
  • Christine Howes
  • Ellen Breitholtz
  • Vladislav Maraev
  • Martin Kaså
  • linguistics
  • computational linguistics
  • argumentation
  • text
  • dialogue
  • pragmatics
  • semantics
  • politics
  • forum
  • online discussion
  • argumentation technology
  • argument mining

Rumour mining

2020–2023
  • Jacobo Rouces
  • Lars Borin
  • Mia-Marie Hammarlin
  • Fredrik Miegel
  • digital humanities

Text Mining of medical publications about Person-Centered Care

2019–2019
  • Jacobo Rouces

    Svenskt språkdatalabb

    2019–2021

    The goal of Svenskt Språkdatalabb is to create a national knowledge hub for language technology and to produce Swedish reference datasets for NLP, which will then be made available with open access in AI Innovation of Sweden's data factory.

    • Peter Ljunglöf
    • Aleksandrs (Sasha) Berdicevskis

      Towards Computational Lexical Semantic Change Detection

      2019–2022

      In this project, we aim to find automatic, corpus-based methods for detecting semantic change and lexical replacement for Swedish and English.

      • Nina Tahmasebi
      • Simon Hengchen
      • Richard Johansson
      • Maria Koptjevskaja Tamm

        Evaluation and refinement of an enhanced OCR-process for mass digitisation

        2019–2020

        The purpose of this project is to fine-tune and evaluate a test platform for OCR-production that was developed by Kungliga biblioteket (KB) in cooperation with the Norwegian software company Zissor in 2017.

        • Dana Dannélls
        • Lars Björk
        • Torsten Johansson
        • OCR

        The rise of complex verb constructions in Germanic

        2018–    

        The project studies the rise of complex verb constructions in Germanic.

        • Evie Coussé
        • Gerlof Bouma
        • Nicoline van der Sijs
        • Dirk-Jan de Kooter
        • Trude Dijkstra

          Milage: Multilingual Automated Grammar Extraction

          2018–2022
          • Shafqat Mumtaz Virk
          • Markus Forsberg
          • Harald Hammarström

            DReaM: The Dictionary/Grammar Reading Machine

            2018–2020

            A Multilingual Annotated Corpus of Grammars for the World's Languages

            • Shafqat Mumtaz Virk
            • Markus Forsberg
            • Harald Hammarström

              L2 profiles for Swedish

              2018–2021
              • Elena Volodina
              • Therese Lindström Tiedemann
              • Yousuf Ali Mohammed
              • David Alfter
              • ICALL
              • NLP4CALL
              • linguistic complexity
              • SLA
              • second language learning
              • CEFR profiles

              SweLL - Infrastructure for L2 Swedish

              2017–2020
              • Elena Volodina
              • Yousuf Ali Mohammed
              • Arild Matsson
              • Mats Wirén
              • Beáta Megyesi
              • Julia Prentice
              • Gunlög Sundberg
              • Lena Granstedt
              • Monica Reichenberg
              • Lisa Rudebeck
              • Second language infrastructure
              • Swedish as a second language
              • essay annotation
              • correction annotation
              • pseudonymization

              Linguistic and extra-linguistic parameters for early detection of cognitive impairment

              2016–2019
              • Dimitrios Kokkinakis
              • Kristina Lundholm Fors
              • Malin Antonsson
              • Marie Eckerström
              • Charalambos Themistocleous
              • language disorders

              Digital LSI

              2015–2021

              Digitization of Grierson’s Linguistic Survey of India (LSI; 1903-1927)

              • Lars Borin
              • Shafqat Mumtaz Virk
              • Anju Saxena
              • Bernard Comrie

                Language Technology Linked Open Data at Språkbanken

                2014–2014

                This project aims to make lexical resources for language technology available in the form of linked open data.

                • Lars Borin
                • Dana Dannélls
                • Markus Forsberg
                • LOD
                • Semantic Web
                • linked data

                Koala – Korp's linguistic annotations

                2014–2017

                Improved annotations for the Korp corpus infrastructure.

                • Yvonne Adesam
                • Lars Borin
                • Gerlof Bouma
                • Markus Forsberg
                • Richard Johansson

                  Corpus-driven induction of linguistic knowledge

                  2014–2018

                  We will apply corpus-driven methods as a way to expand and correct existing hand-crafted linguistic resources, and conversely we will use hand-crafted resources as additional sources of supervision when learning meaning representations automatically.

                  • Richard Johansson
                  • Luis Nieto Piña

                    MAÞiR

                    2014–2017

                    Developing automatic annotation tools for Old Swedish texts.

                    • Gerlof Bouma
                    • Yvonne Adesam

                      A free cloud service for OCR

                      2013–2014
                      • Dana Dannélls
                      • Lars Borin
                      • Gerlof Bouma
                      • OCR
                      • historical material

                      SweCcn -- a Swedish constructicon

                      2013–2016

                      The aim of this project is to develop a Swedish constructicon, that is, a database of Swedish constructions.

                      • Lars Borin
                      • Dana Dannélls
                      • Markus Forsberg
                      • Leif-Jöran Olsson
                      • Jonatan Uppström
                      • Benjamin Lyngfelt
                      • Kristian Blensenius
                      • Linnea Bäckström
                      • Anna Ehrlemark
                      • Per Malm
                      • Joel Olofsson
                      • Julia Prentice
                      • Rudolf Rydstedt
                      • Emma Sköldberg
                      • Sofia Tingsell
                      • Lexicography
                      • integrated lexical resource
                      • constructicon

                      Towards a Knowledge-Based Culturomics

                      2013–2018

                      The main aim of this research program is to advance the state of the art in language technology resources and methods for semantic processing of Swedish text, in order to provide researchers and others with more sophisticated tools for working with the information contained in large volumes of digitized text, e.g., by being able to correlate and compare the content of texts and text passages on a large scale.

                      • Jacobo Rouces
                      • Lars Borin
                      • Nina Tahmasebi
                      • Dimitrios Kokkinakis
                      • Pierre Nugues
                      • Richard Johansson
                      • Devdatt Dubhashi
                      • culturomics

                      Functional somatic symptoms

                      2013–2014

                      Interpretation and understanding of functional symptoms in primary care

                      • Dimitrios Kokkinakis
                      • Eva Lidén
                      • Elisabeth Björk Brämberg
                      • Sylvia Määttä
                      • Staffan Svensson

                        PINCORE

                        2012–2016

                        Person-Centred Information and Communication for patients undergoing Colo-Rectal Cancer Surgery

                        • Dimitrios Kokkinakis

                          META-NORD

                          2011–2013

                          The META-NORD project aims to establish an open linguistic infrastructure in the Baltic and Nordic countries.

                          • Lars Borin
                          • Markus Forsberg

                            Academic word lists

                            2011–2014
                              • Lexicography
                              • second language learning
                              • NLP4CALL

                              A System Architecture for ICALL

                              2011–2013
                              • Lars Borin
                              • Elena Volodina
                              • Hrafn Loftsson
                              • Birna Arnbjörnsdóttir
                              • ICALL
                              • NLP4CALL
                              • Swedish as a second language
                              • Second language infrastructure
                              • second language learning

                              MedEval

                              2011–2012

                              A Swedish medical test collection

                              • Karin Friberg Heppin
                              • Anni Järvelin

                                Swedish FrameNet++ (SweFN++)

                                2011–    

                                The goal of the SweFN++ project is to build an open-content (i.e., freely available and modifiable) integrated lexical resource for Swedish, which has so far been lacking, to be used as a basic infrastructural component in Swedish language technology (LT) research and in the development of LT applications for Swedish.

                                • Lars Borin
                                • Dana Dannélls
                                • Dimitrios Kokkinakis
                                • Markus Forsberg
                                • Jonatan Uppström
                                • Leif-Jöran Olsson
                                • Malin Ahlberg
                                • Maria Toporowska Gronostaj
                                • Karin Friberg Heppin
                                • Richard Johansson
                                • lexicon
                                • lexical semantics
                                • modern
                                • integrated lexical resource
                                • framenet

                                MOLTO - Multilingual Online Translation

                                2010–2013

                                MOLTO's goal is to develop a set of tools for translating texts between multiple languages in real time with high quality. Languages are separate modules in the tool and can be varied; prototypes covering a majority of the EU's 23 official languages will be built.

                                • Dana Dannélls
                                • Generation
                                • translation
                                • multilingual
                                • cultural heritage
                                • GF

                                Digital areal linguistics

                                2010–2014

                                The goal of this project is to create a database of comparable lexical items in a number of representative languages spoken in the Himalayan region in India and to use this database for investigating the Himalayas as a linguistic area.

                                • Lars Borin
                                • Taraka Rama
                                • Anju Saxena
                                • Bernard Comrie
                                • language technology
                                • areal linguistics
                                • linguistic typology
                                • computational linguistics
                                • Lexicography

                                Kelly - KEywords for Language Learning for Young and adults alike

                                2009–2011
                                • Elena Volodina
                                • Sofie Johansson Kokkinakis
                                • ICALL
                                • NLP4CALL
                                • second language learning
                                • CEFR profiles
                                • Swedish as a second language

                                CONPLISIT

                                2009–2010

                                Consumption patterns and life-style in Swedish literature – novels 1830-1860

                                • Lars Borin
                                • Markus Forsberg
                                • Christer Ahlberger