Sustainable language representations for a changing world, 31 May 2021

  • The workshop has ended, and we are very grateful for your participation. Thank you all!
  • The slides by Linda Mannila (about Swedish in Finland) and Elisabet Lobo (about differential privacy) are available from the schedule below.

This workshop will discuss how language representations or language models can be built to be sustainable in the face of changing language, new domains of application, new demands from the surrounding use cases and business models, and a changing world. The intention of the workshop is to raise awareness of some identified issues and to invite new related issues to be addressed, potentially in future workshops or debates.

Aim of the workshop

The workshop will be organised around three overlapping challenges when it comes to making language models sustainable, and to assessing the qualities and characteristics of language models:

  • Societal challenges: sensitivity to bias and prejudice; coverage over varieties, dialects, minority languages; differential privacy; personal integrity and intimacy
  • Technical challenges: model cards and consumer or downstream user declaration of content; sustainability over temporal change; applicability to new domains; assessing quality and coverage of a model
  • Legal challenges: the right to be forgotten and anonymisation of training data; intellectual property rights with respect to training data and output of a language model; liabilities and risks in using a language model as part of a service

It will be organised as three thematic sessions with invited keynotes, followed by position statements from participants, and a concluding open discussion. We invite you to submit a short position statement related to one or several of the above challenges.

The final aim of the workshop is to formulate a statement on criteria for assessing the quality and sustainability of language models. We intend to publish this statement to a broader audience after the workshop in some suitable way, which will be discussed during the sessions.

Program, Monday 31 May

Each session will be 1½ hours long with one invited keynote (30 minutes), followed by short 5-minute position statements from participants, and a concluding open discussion.

  • Don't forget to register by 21 May – note that once you have registered you can attend as many or as few sessions as you want (or have time for).
  • You can still submit a position statement – see below for more information.

10:00–11:30 Societal challenges

Invited talk (30 minutes): Linda Mannila, Digismart: AI and the Swedish language in Finland [slides in PDF, 18MB]
Position statements (5 minutes each):
  • Christina Tånnander and Björn Westling: Language models at the Swedish Agency for Accessible Media
  • Marina Santini, Evelina Rennes, Daniel Holmer and Arne Jönsson: Human-in-the-Loop: Where Does Text Complexity Lie?
Concluding remarks (5 minutes): the organisers
(lunch break)

13:00–14:30 Technical challenges

Invited talk (30 minutes): Elisabet Lobo Vesga, Chalmers: An introduction to differential privacy [slides in PDF, 3MB]
Position statements (5 minutes each):
  • Jenny Kunz: Data Transparency and Interpretability of Language Representations
  • Riley Capshaw, Eva Blomqvist, Marina Santini and Marjan Alirezaie: BERT is as Gentle as a Sledgehammer: Too Powerful or Too Blunt? It Depends on the Benchmark
  • Nikolai Ilinykh and Simon Dobnik: Taking BERT for a walk: on the necessity of grounding, multi-modality and embodiment for impactful NLP
Concluding remarks (5 minutes): the organisers

15:00–16:30 Legal challenges

Invited talk (30–40 minutes): Stanley Greenstein and Peter Wahlgren, Stockholm University
Open discussion: all participants
Concluding remarks (5 minutes): the organisers

Position Statements

You are very welcome to submit a short position statement about any of the challenges above. This can, e.g., be a short summary of your own previous or upcoming work in this area, or your personal reasons for being interested in this, or a short description of an idea that you have been thinking about for a while. It's really up to you. Submit your position statement in EasyChair here:

Please keep your submission short: no longer than 3000 characters, so that it fits on one A4 page. The statements should not be anonymous, and there will be no academic reviewing.

The position statements will be circulated to all workshop participants prior to the workshop, and each submitter will get the chance to elaborate on their statement during the workshop. There will be no official workshop proceedings.

Registration

This workshop is part of NoDaLiDa 2021, and registration is done there:

Organisers

The workshop is organised by the Vinnova-financed project Svenskt Språkdatalabb (Swedish Language Data Lab), in cooperation with Språkbanken Text and Gavagai AB.

Please contact us if you have any questions.