sbx/KB-bert-swedish_PI-detection-detailed-iob

Standard reference

Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Elena Volodina (2025): The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), March 3–4, 2025, Tallinn, Estonia, edited by Richard Johansson and Sara Stymne, pages 697–708.

Data citation

Szawerna, Maria Irena. sbx/KB-bert-swedish_PI-detection-detailed-iob [Data set]. Enriched and distributed by Språkbanken. https://doi.org/10.23695/w50j-tf54


A model based on KB/bert-base-swedish-cased, trained to detect personal information, especially in learner essays. This variant distinguishes 38 detailed categories and uses IOB tagging, so each labeled token is marked as either the beginning of a personal-information span or inside one.
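
To illustrate the beginning/inside distinction, here is a minimal sketch of how per-token IOB tags could be merged into spans. It assumes tags of the form `B-<category>`, `I-<category>` and `O`; the function name `group_iob` and the category names `firstname` and `city` are hypothetical and not necessarily the labels this model emits.

```python
from typing import List, Optional, Tuple


def group_iob(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB-tagged tokens into (category, text) spans.

    Assumes tags look like "B-<category>", "I-<category>" or "O".
    """
    spans: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the span collected so far, if there is one.
        if current_tokens:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A "B" tag, or an "I" tag that does not continue the open span,
            # starts a new span.
            flush()
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continue the open span of the same category.
            current_tokens.append(token)
        else:
            # "O": the token is not personal information; close any open span.
            flush()
            current_label, current_tokens = None, []
    flush()
    return spans


# Hypothetical example sentence and tags, for illustration only.
example_tokens = ["Jag", "heter", "Anna", "Maria", "och", "bor", "i", "Göteborg"]
example_tags = ["O", "O", "B-firstname", "I-firstname", "O", "O", "O", "B-city"]
example_spans = group_iob(example_tokens, example_tags)
```

Without the B/I distinction, two adjacent items of the same category could not be told apart from one multi-token item; with it, a second `B-` tag marks where the next item starts.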

Caveats

This model does not guarantee the detection of all personal information in the text. Never use it without human supervision (human-in-the-loop). The model performs noticeably worse on texts that are not student essays.

Intended uses

Personal Information detection

Download

File	Size	Modified	Licence
KB-bert-swedish_PI-detection-detailed-iob	149.44 KB		GPL-3.0

The model is hosted on Hugging Face and can be accessed with, for example, their Python library.
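
As a rough sketch of access through the Hugging Face `transformers` library: the code below assumes the Hub repository ID matches the resource name `sbx/KB-bert-swedish_PI-detection-detailed-iob` and that the model works with the standard `token-classification` pipeline. The helper names `load_pi_tagger` and `summarize` are hypothetical, and the label strings returned depend on the model's own configuration.

```python
# Assumption: the Hub repository ID is the same as the resource name on this page.
MODEL_ID = "sbx/KB-bert-swedish_PI-detection-detailed-iob"


def load_pi_tagger(model_id: str = MODEL_ID):
    """Return a token-classification pipeline for the model.

    transformers is imported here so that the helpers in this file can be
    used without the library installed; the model is downloaded on first call.
    """
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def summarize(predictions):
    """Reduce pipeline output to (word, label, score) tuples for human review.

    Expects the list of dicts that a token-classification pipeline returns,
    each with at least "word", "entity" and "score" keys. Entries labeled "O"
    are skipped if present.
    """
    return [
        (p["word"], p["entity"], round(float(p["score"]), 2))
        for p in predictions
        if p["entity"] != "O"
    ]


# Usage (requires network access and the transformers package):
#   tagger = load_pi_tagger()
#   for row in summarize(tagger("Jag heter Anna och bor i Göteborg.")):
#       print(row)
```

The output is intended as input to a human reviewer rather than as a final anonymization decision, in line with the caveats above.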
