sbx/KB-bert-base-swedish-cased_PI-detection-general-iob

Standard reference

Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, and Elena Volodina. 2025. The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pages 697–708, Tallinn, Estonia. University of Tartu Library. https://aclanthology.org/2025.nodalida-1.70/

Data citation

Szawerna, Maria Irena. sbx/KB-bert-base-swedish-cased_PI-detection-general-iob [Data set]. Språkbanken Text. https://doi.org/10.23695/nm9x-z436
A model based on KB/bert-base-swedish-cased, trained to detect personal information, especially in learner essays. This variant distinguishes seven general categories of personal information and uses IOB tagging, that is, it marks whether a token begins an instance of personal information or is inside one.
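
The beginning/inside distinction described above corresponds to what is commonly called IOB tagging (compare the "iob" suffix in the model name). The following is a minimal sketch of how such per-token tags can be grouped into spans. The function name `iob_to_spans` and the labels `PERSON` and `PLACE` are hypothetical and chosen for illustration only; they are not the model's actual label set.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB-tagged tokens into (label, text) spans.

    "B-X" begins a span of category X, "I-X" continues it, and any other
    tag (typically "O") means the token is outside all spans.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def close_span() -> None:
        # Store the span collected so far, if any, and reset the state.
        nonlocal current_label, current_tokens
        if current_tokens:
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            if current_label == tag[2:]:
                current_tokens.append(token)
            else:
                # Lenient choice: an "I-" tag with no matching open span
                # starts a new span instead of being discarded.
                close_span()
                current_label, current_tokens = tag[2:], [token]
        else:
            close_span()

    close_span()
    return spans


# Hypothetical labels, for illustration only.
example_tokens = ["Anna", "Svensson", "bor", "i", "Göteborg"]
example_tags = ["B-PERSON", "I-PERSON", "O", "O", "B-PLACE"]
example_spans = iob_to_spans(example_tokens, example_tags)
```

Treating a stray "I-" tag as the start of a new span is a design choice that favours recall, which seems reasonable for personal information detection where a missed item is costlier than a spurious one.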

Caveats

This model does not guarantee the detection of all personal information in the text. Never use it without human supervision (human-in-the-loop). The model performs noticeably worse on texts that are not student essays.

Intended uses

Personal Information detection

Download

File: KB-bert-base-swedish-cased_PI-detection-general-iob
Size: 109.69 KB
Licence: GPL-3.0
The model is hosted on Hugging Face and can be accessed using, for example, their Python library.
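
As a rough illustration of accessing the model through the Hugging Face `transformers` Python library, the sketch below builds a token-classification pipeline. It assumes that the Hugging Face identifier matches the resource name given on this page and that the `transformers` package and a backend such as PyTorch are installed. The helper names `build_pi_detector`, `format_predictions`, and `demo`, as well as the example sentence, are hypothetical.

```python
# Assumption: the Hugging Face model identifier equals the resource name.
MODEL_ID = "sbx/KB-bert-base-swedish-cased_PI-detection-general-iob"


def build_pi_detector(model_id: str = MODEL_ID):
    """Return a token-classification pipeline for the given model.

    The import is done lazily so that the rest of this sketch can be used
    without `transformers` installed. The model is downloaded on first use.
    """
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces and adjacent
    # tokens of the same category into one predicted span.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_predictions(predictions):
    """Render aggregated pipeline predictions as tab-separated lines."""
    return [
        f"{p['word']}\t{p['entity_group']}\t{p['score']:.2f}"
        for p in predictions
    ]


def demo():
    """Run the detector on one made-up sentence and print the result."""
    detector = build_pi_detector()
    text = "Jag heter Anna och bor i Göteborg."  # made-up example sentence
    for line in format_predictions(detector(text)):
        print(line)
```

Calling `demo()` requires network access on first use. In line with the caveats on this page, the printed predictions should be treated as suggestions for a human reviewer, not as a complete list of the personal information in a text.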

Type

  • Model

Language

Swedish

Keywords

  • PI detection
  • BERT

Creators

  • Szawerna, Maria Irena

Created

2024-07-04

Contact

sb-info@svenska.gu.se