Meaning through sensory data

Recently, we have seen a surge of methods that claim to embed meaning from textual corpora. But is that possible? Can text really reveal meaning, and if so, can current NLP methods detect it? Can our methods understand, as is sometimes claimed? Perhaps the larger question is the following: can we bring meaning to words using only the information stored in text? This question is essential for any Artificial Intelligence (AI) system that uses text as a basis. Let us take …

The Gothenburg H70 birth cohort studies and the digital assessment of neuropsychological tests

A comment often received from reviewers of manuscripts submitted to scientific conferences and journals concerns how representative the sample under scrutiny is. Are there solid arguments for accepting that the population characteristics, and particularly the features extracted from the empirical data acquired from such a population (e.g. from speech production), provide sufficient or accurate enough information to use in various algorithmic approaches (e.g. in machine learning)? State-of-the-art studies on computational methods to identify signs of cognitive deterioration in language …

Argumentation Mining

What if you could find all the arguments in a text without having to read it? Or, what if you could search a database for a controversial topic and immediately get arguments for and against it, gathered from texts all around the internet? Or, imagine that when writing an essay you automatically got an estimate of how persuasive your arguments are. Scenarios such as these could be possible with techniques developed in the field of argumentation mining. The aim of this relatively new …

The Swedish PoliGraph

Continuing last month’s theme of Swedish parliamentary data, we would like to introduce a new tool designed to use and explore this data. The Swedish PoliGraph is unfortunately not able to tell when a politician lies, at least not yet. Rather, it is a graph that connects politicians to their roles and participation in the Swedish parliament. With it, we can ask questions such as: Who was present in the parliament on a given date? Who spoke on that date? Which party …

Analyzing data from the Swedish Parliament

The Swedish Parliament (Riksdagen) continuously releases open data on its website, including the documents approved and used during parliamentary sessions as well as how each member of parliament votes in each roll call (voting session). This data can be used to gain insight into which topics members of parliament and parties discuss and vote on. In the following post, I will provide some example analyses that were performed with Python, but they could be done similarly with many other programming languages with data …

What are probing tasks in NLP?

In recent years, neural-network-based approaches (i.e. deep learning) have been the main models for state-of-the-art systems in natural language processing, whether in machine translation, natural language inference, language modeling or sentiment analysis. At the same time, researchers have asked themselves what kind of linguistic information these neural networks are able to capture. Answering this question is not a trivial undertaking: state-of-the-art models are usually multiple layers deep, with non-linear transformations learned through billions of mathematical operations. The benchmarks …

Searching for linguistic signs of cognitive deterioration

In our research group, we are exploring ways of analysing language to find early signs of possible cognitive impairment, which may develop into dementia. Dementia is a condition that affects people all around the world, and the risk of developing it increases with age. There are several different types of dementia, but the most common one is Alzheimer’s disease. This disease causes deterioration of cognitive functions, and one of the first symptoms tends to be problems with short-term memory. Language is also …

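Grym och häftig ordförändring (Cruel and cool word change)

Words can change their meanings. You do not need a doctorate in linguistics to notice that the Swedish word grym does not mean the same thing in (1) as in (2). (1) "Ja, så här grym kan fotbollen vara. Tyvärr", menade Gefles tränare Lennart ’Liston’ Söderberg. [Yes, this is how cruel football can be. Unfortunately, said Gefle's coach Lennart ’Liston’ Söderberg.] Dagens Nyheter 1987-05-18 (2) Jag hörde inte vad mina lagspelare sa, om de kallade på mig eller minsta lilla, för det var en sådan skön och grym stämning. [I did not hear what my teammates said, whether they were calling for me or anything at all, because the atmosphere was so nice and awesome.] Svenska Dagbladet 2013-09-03 There are, however, some harder questions. Are there …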

Using Språkbanken corpora in NLTK

At Språkbanken we collect resources, mainly lexica and corpora, most of them in Swedish. So far we have collected Swedish corpora totalling 13 billion words, in all kinds of genres and from all time periods. Most of our corpora are not manually annotated, and the ones that are annotated usually have only one kind of annotation (e.g., part of speech, lemmas, dependency structures or constituent structure). To be able to use the same tools to analyse any corpus, we have devised …