
sbx/KB-bert-base-swedish-cased_PI-detection-detailed-iob

Standard reference

Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, and Elena Volodina. 2025. The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 697–708, Tallinn, Estonia. University of Tartu Library. https://aclanthology.org/2025.nodalida-1.70/

Data citation

Szawerna, Maria Irena. sbx/KB-bert-base-swedish-cased_PI-detection-detailed-iob [Data set]. Språkbanken Text. https://doi.org/10.23695/w50j-tf54
A model based on KB/bert-base-swedish-cased, trained to detect personal information (PI), especially in learner essays. This variant distinguishes 38 detailed PI categories and uses IOB tagging, marking whether each tagged token is at the beginning (B) or inside (I) of a span.
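
The beginning/inside distinction can be illustrated with a minimal sketch that groups IOB-tagged tokens into spans. The label names used here (`firstname`, `surname`) are hypothetical placeholders, not necessarily among the model's 38 categories, and `iob_to_spans` is an illustrative helper rather than part of the model or any library.

```python
from typing import List, Optional, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (label, text) spans.

    Tags look like "B-label", "I-label", or "O". A stray "I-" tag that does
    not continue the current span is treated as the start of a new span.
    """
    spans: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the span collected so far, if there is one.
        if current_tokens:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            flush()
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            current_tokens.append(token)
        else:  # "O": the token is outside any span
            flush()
            current_label, current_tokens = None, []
    flush()
    return spans


# Hypothetical labels, for illustration only.
tokens = ["Jag", "heter", "Anna", "Maria", "Svensson", "."]
tags = ["O", "O", "B-firstname", "I-firstname", "B-surname", "O"]
print(iob_to_spans(tokens, tags))
```

Without the B/I distinction, two adjacent spans of the same category could not be told apart from one longer span.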

Caveats

This model does not guarantee the detection of all personal information in the text. Never use it without human supervision (human-in-the-loop). The model performs noticeably worse on texts that are not student essays.

Intended uses

Personal Information detection

Download

File: KB-bert-base-swedish-cased_PI-detection-detailed-iob
Size: 110.04 KB
Modified: (not listed)
Licence: GPL-3.0

The model is hosted on Hugging Face and can be accessed, for example, with their Python library.
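
As a rough sketch of access through the Python library, the model could be loaded with the `transformers` token-classification pipeline. This assumes that the Hugging Face repository id matches the resource name shown on this page, that `transformers` and a backend such as PyTorch are installed, and that the model can be downloaded. `load_pi_detector` and `format_predictions` are hypothetical helper names introduced for illustration.

```python
# Assumption: the Hugging Face repository id equals the resource name on this page.
MODEL_ID = "sbx/KB-bert-base-swedish-cased_PI-detection-detailed-iob"


def load_pi_detector(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires `transformers`

    return pipeline("token-classification", model=model_id)


def format_predictions(text: str, detector) -> list:
    """Run the detector on `text` and return one tab-separated line per tagged token.

    `detector` is any callable returning a list of dicts with the keys
    "word", "entity", and "score", as the token-classification pipeline does.
    """
    return [
        f'{pred["word"]}\t{pred["entity"]}\t{float(pred["score"]):.3f}'
        for pred in detector(text)
    ]


def main() -> None:
    # Requires network access on the first call in order to fetch the model.
    detector = load_pi_detector()
    for line in format_predictions("Jag heter Anna och bor i Göteborg.", detector):
        print(line)
```

Calling `main()` prints each tagged token with its predicted label and score. In line with the caveats above, the output is best treated as suggestions for a human reviewer, not as a final decision.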

Type

  • Model

Language

Swedish

Size

Keywords

  • PI detection
  • BERT

Creators

  • Szawerna, Maria Irena

Created

2024-07-05

Contact

sb-info@svenska.gu.se