News archive

New project aims to enhance access to the world’s linguistic heritage

The Dictionary/Grammar Reading Machine: Computational Tools for Accessing the World’s Linguistic Heritage (DReaM).

The diversity of the world's 6,500 languages embodies a wealth of information on human cognition and the history of populations. As languages go extinct, the linguistic heritage of humankind increasingly resides in grammars and dictionaries, which are rapidly accumulating. Accessing this heritage requires that the descriptions be available and that someone actually read them. Availability is a problem because publications are often difficult to access.

In this project we aim to enhance access to the world’s linguistic heritage by making an existing collection of more than 9,000 PDF documents that are no longer protected by copyright available in a stable archive, enriched with added metadata and with computational tools developed to search for information within the texts. Moreover, a number of dictionaries will be converted to apps for mobile devices that can be distributed to speakers of minority languages, handing back to these speakers some of their linguistic heritage. The developed resources, particularly the grammatical descriptions, will be used for experimentation and for developing methodologies for the automatic extraction of linguistic features.

The project is funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 6995327 and runs from 2018-02-01 to 2020-12-31.

Project partners:

  • Uppsala University, Department of Linguistics and Philology, Sweden
  • Språkbanken, Department of Swedish, University of Gothenburg, Sweden
  • Leiden University Centre for Linguistics (LUCL), Netherlands
  • Centre National de la Recherche Scientifique, Langage, Langues et Cultures d'Afrique Noire (LLACAN), France

For more information, please visit: http://stp.lingfil.uu.se/~harald/dream.html

New project: From Dust to Dawn: Multilingual Grammar Extraction from Grammars

From Dust to Dawn: Multilingual Grammar Extraction from Grammars is a new project that enables computers to read grammatical descriptions and automatically extract information.

Traditionally, researchers have studied the diversity of the world’s languages by reading and comparing grammatical descriptions manually. Nowadays, a large number of linguistic descriptions and books are easily available in digital formats. Reading them all for broad comparison and analysis is far beyond the capabilities of any individual. Text technology, i.e. computer-based processing of natural-language text, is now powerful enough that it could be used to harvest facts at different levels of detail within a given domain (in this case, information on the world's languages).

In this project we want to use a collection of 9,000 digitized grammatical descriptions covering over a thousand languages to significantly expand the scope for large-scale language comparison. For this purpose, the project will develop methodologies that enable computers to read grammatical descriptions and automatically extract information (“linguistic facts”). We will explore and develop the notion of a “language profile”: a structured digital representation of a language that brings together all available knowledge about it extracted from various sources.

The project is funded by the Marcus and Amalia Wallenberg (MAW) Foundation and runs from 2018-07-01 to 2022-06-30. The project is run by two partners:

  • Uppsala University, Department of Linguistics and Philology, Sweden
  • Språkbanken, Department of Swedish, University of Gothenburg, Sweden

For more information, please visit: http://stp.lingfil.uu.se/~harald/maw.html

New project in second-language research: L2 profiling - Development of lexical and grammatical competences in immigrant Swedish

More immigrants are coming to Sweden, and it is becoming increasingly important to teach Swedish as a second (or foreign) language (L2) in the best possible way. In recent years, new methods for studying L2 language development have emerged, including the use of learner corpora. Very few such studies exist for Swedish, but Språkbanken has a learner corpus of essays and a coursebook corpus (for L2 Swedish) that are well suited to this kind of study. The recently started project L2 profiling is led by Elena Volodina, researcher in language technology at Språkbanken, University of Gothenburg. More information about the project is available on this page: https://spraakbanken.gu.se/swe/forskning/l2profiling

The project Linguistic and extra-linguistic parameters for early detection of cognitive impairment receives funding from RJ

Dimitrios Kokkinakis, researcher in computational linguistics at the University of Gothenburg, has received just over SEK 10 million from Riksbankens Jubileumsfond (RJ) for the research project Linguistic and extra-linguistic parameters for early detection of cognitive impairment.

The project aims to investigate whether dementia can be detected at an early stage by analyzing written and spoken language and by measuring eye movements.

Read the web news item "Språkteknologiska analyser som komplement till demensdiagnostik" ("Language technology analyses as a complement to dementia diagnostics"): http://svenska.gu.se/aktuellt/nyheter/fulltext/sprakteknologiska-analyser-som-komplement-till-demensdiagnostik.cid1314609