sbx/KB-bert-swedish_PI-detection-basic-iob

Standard reference

Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Elena Volodina (2025): The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling, in Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), March 3–4, 2025, Tallinn, Estonia / Richard Johansson and Sara Stymne (eds.), pages 697–708

Data citation

Szawerna, Maria Irena. sbx/KB-bert-swedish_PI-detection-basic-iob [Data set]. Enriched and distributed by Språkbanken. https://doi.org/10.23695/wb6y-sv35

A model based on KB/bert-base-swedish-cased, trained to detect personal information (PI), especially in learner essays. This variant distinguishes only between PI and non-PI tokens, and uses IOB tagging to mark whether a PI token begins a span or lies inside one.
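To illustrate what the beginning/inside distinction provides, here is a minimal sketch of grouping token-level tags into PI spans. The label strings `"B"`, `"I"` and `"O"`, the helper name `iob_to_spans`, and the hand-written example tags are assumptions for illustration only; they are not output from this model, and the model's actual label names may differ.

```python
from typing import List, Tuple


def iob_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Group B/I/O tags into (start, end) token spans, with end exclusive.

    Assumed labels (hypothetical): "B" starts a PI span, "I" continues it,
    anything else is treated as outside.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            # A new span begins; close any span that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I":
            # Lenient handling: an "I" with no open span starts one.
            if start is None:
                start = i
        else:
            # Outside tag: close the open span, if any.
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Hand-written example, not model output.
tokens = ["Jag", "heter", "Anna", "Svensson", "och", "bor", "i", "Lund"]
tags = ["O", "O", "B", "I", "O", "O", "O", "B"]

spans = iob_to_spans(tags)
pi_mentions = [" ".join(tokens[s:e]) for s, e in spans]
print(pi_mentions)  # ['Anna Svensson', 'Lund']
```

Because "B" marks the start of each span, two adjacent PI mentions stay separate instead of merging into one, which a plain PI/non-PI labeling could not express.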

Caveats

This model does not guarantee the detection of all personal information in the text. Never use it without human supervision (human-in-the-loop). The model performs noticeably worse on texts that are not student essays.

Intended uses

Personal Information detection

Download

File	Size	Modified	Licence
KB-bert-swedish_PI-detection-basic-iob	148.85 KB		GPL-3.0

The model is hosted on HuggingFace and can be accessed, for example, with their Python library.
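As a rough sketch of access from Python, the following assumes that the `transformers` package (with a backend such as PyTorch) is installed and that the repository loads as a standard token-classification model; neither is confirmed on this page. The helper names `load_pi_detector` and `summarize` are hypothetical.

```python
MODEL_ID = "sbx/KB-bert-swedish_PI-detection-basic-iob"


def load_pi_detector(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the model.

    Assumption: the repository is compatible with the standard
    transformers token-classification pipeline.
    """
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def summarize(predictions):
    """Reduce pipeline output dicts to (word, label, score) tuples."""
    return [
        (p["word"], p["entity"], round(float(p["score"]), 2))
        for p in predictions
    ]


if __name__ == "__main__":
    # Downloads the model from HuggingFace on first use.
    detector = load_pi_detector()
    predictions = detector("Jag heter Anna Svensson och bor i Lund.")
    # Predictions are suggestions only; a human should review them.
    for row in summarize(predictions):
        print(row)
```

In line with the caveats above, output from such a script is best treated as candidates for human review rather than as a final anonymization decision.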
