Sustainable language representations for a changing world, 31 May 2021

The workshop has ended, and we are very grateful for your participation. Thank you all!
The slides by Linda Mannila (about Swedish in Finland) and Elisabet Lobo (about differential privacy) are available from the schedule below.

This workshop will discuss how language representations or language models can be built to be sustainable in the face of changing language, new domains of application, new demands from the surrounding use cases and business models, and a changing world. The intention of the workshop is to raise awareness of some identified issues and to invite new related issues to be addressed, potentially in future workshops or debates.

Aim of the workshop

The workshop will be organised around three overlapping challenges when it comes to making language models sustainable, and to assessing the qualities and characteristics of language models:

Societal challenges: sensitivity to bias and prejudice; coverage of varieties, dialects, minority languages; differential privacy; personal integrity and intimacy
Technical challenges: model cards and consumer or downstream user declaration of content; sustainability over temporal change; applicability to new domains; assessing quality and coverage of a model
Legal challenges: the right to be forgotten and anonymisation of training data; intellectual property rights with respect to training data and output of a language model; liabilities and risks in using a language model as part of a service

It will be organised as three thematic sessions with invited keynotes, followed by position statements from participants, and a concluding open discussion. We invite you to submit a short position statement related to one or several of the above challenges.

The final aim of the workshop is to formulate a statement on criteria for assessing the quality and sustainability of language models. We intend to publish this statement to a broader audience after the workshop; the form of publication will be discussed during the sessions.

Program, Monday 31 May

Each session will be 1½ hours long with one invited keynote (30 minutes), followed by short 5-minute position statements from participants, and a concluding open discussion.

Don't forget to register by 21 May – note that once you have registered you can attend as many or as few sessions as you want (or have time for).
You can still submit a position statement – see below for more information.

10:00–11:30 Societal challenges

Invited talk	30 minutes	Linda Mannila, Digismart: AI and the Swedish language in Finland [slides in PDF, 18MB]
Position statements	5 mins each	Christina Tånnander and Björn Westling: Language models at the Swedish Agency for Accessible Media
Position statements	5 mins each	Marina Santini, Evelina Rennes, Daniel Holmer and Arne Jönsson: Human-in-the-Loop: Where Does Text Complexity Lie?
Concluding remarks	5 minutes	the organisers
(lunch break)

13:00–14:30 Technical challenges

Invited talk	30 minutes	Elisabet Lobo Vesga, Chalmers: An introduction to differential privacy [slides in PDF, 3MB]
Position statements	5 mins each	Jenny Kunz: Data Transparency and Interpretability of Language Representations
		Riley Capshaw, Eva Blomqvist, Marina Santini and Marjan Alirezaie: BERT is as Gentle as a Sledgehammer: Too Powerful or Too Blunt? It Depends on the Benchmark
		Nikolai Ilinykh and Simon Dobnik: Taking BERT for a walk: on the necessity of grounding, multi-modality and embodiment for impactful NLP
Concluding remarks	5 minutes	the organisers

15:00–16:30 Legal challenges

Invited talk	30–40 minutes	Stanley Greenstein and Peter Wahlgren, Stockholm University
Open discussion		all participants
Concluding remarks	5 minutes	the organisers

Position Statements

You are very welcome to submit a short position statement about any of the challenges above. This can, e.g., be a short summary of your own previous or upcoming work in this area, or your personal reasons for being interested in this, or a short description of an idea that you have been thinking about for a while. It's really up to you. Submit your position statement in EasyChair here:

https://easychair.org/conferences?conf=sustainlangrepr2021

Please keep your submission short – it should be no longer than 3000 characters, so that it fits on one A4 page. The statements should not be anonymous, and there will be no academic reviewing.

The position statements will be circulated to all workshop participants prior to the workshop, and each submitter will get the chance to elaborate on their statement during the workshop. There will be no official workshop proceedings.

Registration

This workshop is part of NoDaLiDa 2021, and registration is done there:

Registration is free of charge
Deadline for registration is 17 May 2021
You don't have to attend the main NoDaLiDa conference
Register here: https://nodalida2021.github.io/registration.html

Organisers

The workshop is organised by the Vinnova-financed project Svenskt Språkdatalabb (Swedish Language Data Lab), in cooperation with Språkbanken Text and Gavagai AB.

Jussi Karlgren, Spotify Research, Stockholm
Peter Ljunglöf, Språkbanken Text, University of Gothenburg

Please contact us if you have any questions.