SpråkbankenText's research meetings during Spring 2025 will take place on Thursdays from 10:15 to 11:30 (occasionally to 12:00), typically in room F314 unless otherwise announced. The meetings are normally open to all members of SpråkbankenText. On some occasions we invite a broader audience; these public seminars are announced in SpråkbankenText's calendar.
Please update the list of conference deadlines.
The schedule for the Spring of 2025 is as follows:
Date | Topic | Comments |
---|---|---|
January | ||
9 | ||
15 | < |
|
16 | ||
22 | < |
|
23 | < |
|
28 | |
|
30 | |
|
31 | |
|
February | ||
6 | ||
6-7 | ||
13 | |
|
15 | < |
|
20 | Postponed: |
|
25 | 27 | |
March | ||
4 | |
|
6 | Postponed: |
|
13 | |
|
20 | |
|
24 | < |
|
27 | |
|
April | ||
3 | |
|
10 | |
|
11 | |
17 | |
23 | ||
24 | ||
May | ||
1 | ||
5 | ||
5 | |
|
8 | |
|
14 | |
|
14-15 | |
|
15 | |
|
21 | |
|
22 | ||
29 | |
June | ||
4 | < |
|
5 | This talk will be devoted to the challenges of working with data that contains personal information. I will describe a set of experiments with automatic pseudonymization that we have performed within the Mormor Karl project. These include experiments with detection and labeling of personal categories using BERT models (Szawerna et al. 2024, 2025), attempts at using LLMs to "fill in the blanks" when substituting personal information with pseudonyms (as yet unpublished), and a study on whether pseudonyms can provoke biased automated classifications (Muñoz Sánchez et al. 2024). The choice of models for our experiments is currently dictated by the sensitive nature of our data. To extend the choice from open source to proprietary models, we are currently collecting a "pseudo-corpus" with fictitious personal information that we will be able to share freely for future research (you are welcome to contribute to the pseudo-corpus collection as well). Finally, in this talk I will name several strategies to unify the research on automatic pseudonymization, and outline further challenges, needs for standardization and a proposal for a shared task. References: Szawerna et al. 2025. The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling. Szawerna et al. 2024. Detecting Personal Identifiable Information in Swedish Learner Essays. Muñoz Sánchez et al. 2024. Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring. |
|
12 | Investigating Linguistic Abilities of LLMs for Native Language Identification. Large language models (LLMs) often achieve high performance in native language identification (NLI) benchmarks by leveraging superficial contextual cues such as names, locations, and cultural stereotypes, rather than the underlying linguistic patterns indicative of native language (L1) influence. To improve robustness, previous work has instructed LLMs to disregard such clues. In this work, we demonstrate that this strategy is unreliable, and predictions can be easily altered by misleading hints. To address this problem, we introduce an agentic NLI pipeline inspired by forensic linguistics, where specialized agents accumulate and categorize diverse linguistic evidence before a final overall assessment. | Ahmet Yavuz Uluslu (University of Zürich, Switzerland) <NB! Room: J406> |
18 | The 50th anniversary of Språkbanken 🥇🍾🎂 | Wallenberg conference centre, starting in the afternoon and including dinner 🥇🍾🎂🍴🍽 |
19 | ||
26 | ||
July | ||
3 | Summer ... | Summer ... |
The schedule for the Autumn of 2024 was as follows:
Date | Topic | Comments |
---|---|---|
September | ||
10 | ||
17 | ||
24 | ||
26 | |
|
October | ||
1 | |
|
8 | |
|
9 | <extra> |
|
15 | |
|
21 | ||
22 | |
|
23 | < |
|
29 | |
|
November | ||
5 | ||
12 | |
|
18 | |
|
19 | |
|
26 | SBX papers at the SLTC main conference and/or the SLTC satellite workshops | Dry run for the 10th Swedish Language Technology Conference (SLTC), 27–29 November at Linköping University |
27 | |
|
29 | |
|
December | ||
3 | Canceled: |
|
5 | |
|
10 | |
|
16 | <extra> Mini workshop: Grammatical Error Correction | Time: 13.15-15.00, Room: J415 |
24 | Happy Holidays! | Happy Holidays! |
The schedule for the Spring of 2024 was as follows:
Date | Topic | Comments |
---|---|---|
January | ||
16 | Planning meeting (internal) | |
22 | Application seminar SFS | |
23 | — | Ethics for NLP (FLoV) |
30 | — | RJ project deadline |
February | ||
6 | Herbert presents earlier work (internal) | |
13 | — | School holidays |
20 | Discussing VR application drafts (internal) | Also: deadline VR Tvärvetensk forskningsmiljö (interdisciplinary research environment) |
27 | — | — |
March | ||
4-6 | — | Research retreat SFS |
5 | — | Deadline VR HS |
12 | Discussion of Felix & Sasha's draft (internal) | |
19 | — | Also: EACL |
25 | Felix final seminar | (Monday!) |
26 | Discussion of Emilie's draft (internal) | |
April | ||
2 | — | School holidays |
9 | Supervisors' collegium (Handledarkollegium) SBX (closed) | Also: deadline VR NT |
16 | — | |
23 | Mini-workshop "Approaches to corpus searches" | NB! location J233, https://www.gu.se/en/event/approaches-to-corpus-searches |
30 | — | St Walpurga's eve |
May | ||
7 | (free) | |
14 | CLARIN:EL | Speaker: Kanella Pouli, from Athena RC/ILSP, Greece. Abstract: CLARIN:EL is a Research infrastructure for Language Resources & Technologies (LRTs); it is the Greek part of the European CLARIN ERIC Infrastructure. It provides a multitude of assets related to Language Technology (LT) for and by Social Sciences & Humanities (and beyond), focusing mainly but not exclusively on Greek LRTs. The Central Inventory provides access to corpora, lexica, tools and language descriptions. Users can effortlessly locate desired resources using keywords and filters. Additionally, the infrastructure hosts a Workbench, i.e. a variety of services to process corpora from the Central Inventory or data submitted by registered users. For those eager to contribute their own data, a user-friendly interface enables the creation of metadata records for their resources. All the information regarding the infrastructure, the CLARIN network, and the NLP community in Greece can be found in the user guide, portal, and NLP:EL Knowledge Centre. Brief Bio: KP is a computational linguist (MSc) whose work focuses on the design and improvement of repositories, the development and adaptation of metadata schemata for language resource documentation, as well as the collection and processing of language resources. |
21 | — | No meeting because of LREC |
28 | (free) | |
June | ||
4 | — | SBX Retreat |
10 | Emilie idea seminar | (Monday!) |
11 | Stina Johansson and Lars Kullman from UB | About publication points |
18 | LREC/COLING summary | |