NoDaLiDa 2017

This is the documentation page for the 21st Nordic Conference on Computational Linguistics (NoDaLiDa), held on 22-24 May 2017. Altogether we had 184 participants (workshops and main conference), who celebrated the 40th anniversary of NoDaLiDa.

Content

Program and proceedings
Workshops
Keynote speakers
People
Papers
Sponsors

Program and proceedings

Program (PDF)
Proceedings (PDF)

Workshops

The following workshops took place on Monday, 22 May, before the main conference.

NLP4CALL & LA (full day)
Universal Dependencies (full day)
Processing Historical Language (full day)
Constraint Grammar (half day morning)

Keynote speakers

Rada Mihalcea, University of Michigan

Rada Mihalcea is a Professor in the Computer Science and Engineering department at the University of Michigan. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the journals Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Research on Language and Computation, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics. She was a program co-chair for the Conference of the Association for Computational Linguistics (2011) and the Conference on Empirical Methods in Natural Language Processing (2009), and a general chair for the Conference of the North American Chapter of the Association for Computational Linguistics (2015). She is the recipient of a National Science Foundation CAREER award (2008) and a Presidential Early Career Award for Scientists and Engineers (2009). In 2013, she was made an honorary citizen of her hometown of Cluj-Napoca, Romania.

Kyunghyun Cho, New York University

Kyunghyun Cho is an assistant professor in the Department of Computer Science, Courant Institute of Mathematical Sciences, and the Center for Data Science at New York University (NYU). Previously, he was a postdoctoral researcher at the University of Montreal under the supervision of Prof. Yoshua Bengio, after obtaining a doctorate at Aalto University (Finland) in early 2014. His main research interests include neural networks, generative models, and their applications, especially to natural language understanding.

Sharon Goldwater, University of Edinburgh

Sharon Goldwater is a Reader at the University of Edinburgh's School of Informatics, where she is a member of the Institute for Language, Cognition and Computation. She received her PhD in 2007 from Brown University and spent two years as a postdoctoral researcher at Stanford University before moving to Edinburgh. Her research interests include unsupervised learning for natural language processing, computer modelling of language acquisition in children, and computational studies of language use. Dr. Goldwater co-chaired the 2014 Conference of the European Chapter of the Association for Computational Linguistics and is Chair-Elect of EACL. She has served on the editorial boards of the Transactions of the Association for Computational Linguistics, the Computational Linguistics journal, and OPEN MIND: Advances in Cognitive Science (a new open-access journal). In 2016, she received the Roger Needham Award from the British Computer Society, awarded for "distinguished research contribution in computer science by a UK-based researcher who has completed up to 10 years of post-doctoral research."

People

General chair

Jörg Tiedemann, University of Helsinki, Finland

Program chairs

Jón Guðnason, Reykjavik University, Iceland
Beáta Megyesi, Uppsala University, Sweden
Kadri Muischnek, University of Tartu, Estonia
Inguna Skadiņa, University of Latvia, Latvia
Anders Søgaard, University of Copenhagen, Denmark
Andrius Utka, Vytautas Magnus University, Lithuania
Lilja Øvrelid, University of Oslo, Norway

Local organizing committee

Nina Tahmasebi (chair), University of Gothenburg, Sweden
Yvonne Adesam (co-chair), University of Gothenburg, Sweden
Martin Kaså (co-chair), University of Gothenburg, Sweden
Sven Lindström (publication chair), University of Gothenburg, Sweden
Elena Volodina (sponsor chair), University of Gothenburg, Sweden
Lars Borin, University of Gothenburg, Sweden
Dana Dannélls, University of Gothenburg, Sweden
Ildikó Pilán, University of Gothenburg, Sweden

Papers

Best short paper

Will my auxiliary tagging task help? Estimating Auxiliary Tasks Effectivity in Multi-Task Learning
Johannes Bjerva

Best student paper

OCR and post-correction of historical Finnish texts
Senka Drobac, Pekka Kauppinen and Krister Lindén

Best paper

Joint UD Parsing of Norwegian Bokmål and Nynorsk
Erik Velldal, Lilja Øvrelid and Petter Hohle

Accepted papers

Machine translation with North Saami as a pivot language
Lene Antonsen, Ciprian Gerstenberger, Maja Kappfjell, Sandra Nystø Ráhka, Marja-Liisa Olthuis, Trond Trosterud and Francis Morton Tyers

Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing
Ali Basirat and Joakim Nivre

From Treebank to Propbank: A Semantic-Role and VerbNet Corpus for Danish
Eckhard Bick

Will my auxiliary tagging task help? Estimating Auxiliary Tasks Effectivity in Multi-Task Learning
Johannes Bjerva

Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Johannes Bjerva and Robert Östling

Iconic Locations in Swedish Sign Language: Mapping Form to Meaning with Lexical Databases
Carl Börstell and Robert Östling

Using Pseudowords for Algorithm Comparison: An Evaluation Framework for Graph-based Word Sense Induction
Flavio Massimiliano Cecchini, Chris Biemann and Martin Riedl

KILLE: a Framework for Situated Agents for Learning Language Through Interaction
Simon Dobnik and Erik de Graaf

OCR and post-correction of historical Finnish texts
Senka Drobac, Pekka Kauppinen and Krister Lindén

Machine Learning for Rhetorical Figure Detection: More Chiasmus with Less Annotation
Marie Dubremetz and Joakim Nivre

Mainstreaming August Strindberg with Text Normalization
Adam Ek and Sofia Knuutinen

Word vectors, reuse, and replicability: Towards a community repository of large-text resources
Murhaf Fares, Andrey Kutuzov, Stephan Oepen and Erik Velldal

Optimizing a PoS Tagset for Norwegian Dependency Parsing
Petter Hohle, Lilja Øvrelid and Erik Velldal

Evaluation of language identification methods using 285 languages
Tommi Jauhiainen, Krister Lindén and Heidi Jauhiainen

Tagging Named Entities in 19th Century and Modern Finnish Newspaper Material with a Finnish Semantic Tagger
Kimmo Kettunen and Laura Löfberg

Docforia: A Multilayer Document Model
Marcus Klang and Pierre Nugues

Improving Optical Character Recognition of Finnish Historical Newspapers with a Combination of Fraktur & Antiqua Models and Image Preprocessing
Mika Koistinen, Kimmo Kettunen and Tuula Pääkkönen

Data Collection from Persons with Mild Forms of Cognitive Impairment and Healthy Controls - Infrastructure for Classification and Prediction of Dementia
Dimitrios Kokkinakis, Kristina Lundholm Fors, Eva Björkner and Arto Nordlund

Replacing OOV Words For Dependency Parsing With Distributional Semantics
Prasanth Kolachina, Martin Riedl and Chris Biemann

Aligning phonemes using finite-state methods
Kimmo Koskenniemi

Creating register sub-corpora for the Finnish Internet Parsebank
Veronika Laippala, Juhani Luotolahti, Aki-Juhani Kyröläinen, Tapio Salakoski and Filip Ginter

Acoustic Model Compression with MAP adaptation
Katri Leino and Mikko Kurimo

Redefining Context Windows for Word Embedding Models: An Experimental Study
Pierre Lison and Andrey Kutuzov

Linear Ensembles of Word Embedding Models
Avo Muromägi, Kairit Sirts and Sven Laur

SWEGRAM – A Web-Based Tool for Automatic Annotation and Analysis of Swedish Texts
Jesper Näsman, Beáta Megyesi and Anne Palmér

Can We Create a Tool for General Domain Event Analysis?
Siim Orasmaa and Heiki-Jaan Kaalep

The Effect of Excluding Out of Domain Training Data from Supervised Named-Entity Recognition
Adam Persson

North-Sámi to Finnish rule-based machine translation system
Tommi Pirinen, Francis M. Tyers, Trond Trosterud, Ryan Johnson, Kevin Unhammer and Tiina Puolakainen

Quote Extraction and Attribution from Norwegian Newspapers
Andrew Salway, Paul Meurer, Knut Hofland and Øystein Reigem

Wordnet extension via word embeddings: Experiments on the Norwegian Wordnet
Heidi Sand, Erik Velldal and Lilja Øvrelid

Málrómur: A Manually Verified Corpus of Recorded Icelandic Speech
Steinþór Steingrímsson, Jón Guðnason, Sigrún Helgadóttir and Eiríkur Rögnvaldsson

Twitter Topic Modeling by Tweet Aggregation
Asbjørn Steinskog, Jonas Therkelsen and Björn Gambäck

The Effect of Translationese on Tuning for Statistical Machine Translation
Sara Stymne

A Multilingual Entity Linker Using PageRank and Semantic Graphs
Anton Södergren and Pierre Nugues

Joint UD Parsing of Norwegian Bokmål and Nynorsk
Erik Velldal, Lilja Øvrelid and Petter Hohle

Finnish resources for evaluating language model semantics
Viljami Venekoski and Jouko Vankka

Coreference Resolution for Swedish and German using Distant Supervision
Alexander Wallin and Pierre Nugues

Universal Dependencies for Swedish Sign Language
Robert Östling, Carl Börstell, Moa Gärdenfors and Mats Wirén

System demos

Services for text simplification and analysis
Johan Falkenjack, Evelina Rennes, Daniel Fahlborg, Vida Johansson and Arne Jönsson

Exploring Properties of Intralingual and Interlingual Association Measures Visually
Johannes Graën and Christof Bless

Multilingwis2 – Explore Your Parallel Corpus
Johannes Graën, Dominique Sandoz and Martin Volk

TALERUM - Learning Danish by Doing Danish
Peter Juel Henrichsen

Dep_search: Efficient Search Tool for Large Dependency Parsebanks
Juhani Luotolahti, Jenna Kanerva and Filip Ginter

A modernised version of the Glossa corpus search system
Anders Nøklestad, Kristin Hagen, Janne Bondi Johannessen, Michał Kosek and Joel Priestley

Proto-Indo-European Lexicon: The Generative Etymological Dictionary of Indo-European Languages
Jouna Pyysalo

Cross-Lingual Syntax: Relating Grammatical Framework with Universal Dependencies
Aarne Ranta, Prasanth Kolachina and Thomas Hallgren

Exploring Treebanks with INESS Search
Victoria Rosén, Helge Dyvik, Paul Meurer and Koenraad De Smedt

Tilde MODEL - Multilingual Open Data for EU Languages
Roberts Rozis and Raivis Skadiņš

A System for Identifying and Exploring Text Repetition in Large Historical Document Corpora
Aleksi Vesanto, Filip Ginter, Hannu Salmi, Asko Nivala and Tapio Salakoski