sbx-swe-pi_detection-sparv

Standard reference

Maria Irena Szawerna, David Alfter, Elena Volodina (2025): Annotating Personal Information in Swedish Texts with SPARV, in Proceedings of the First Workshop on Natural Language Processing and Language Models for Digital Humanities, pages 155-163. https://acl-bg.org/proceedings/2025/LM4DH%202025/pdf/2025.lm4dh-1.15.pdf

Analysis citation

Språkbanken Text. sbx-swe-pi_detection-sparv [Analysis]. Språkbanken Text. https://doi.org/10.23695/6wp0-ds77

A plugin for Sparv for detecting personal information in Swedish texts, especially learner essays. It performs noticeably worse on texts from other domains, although the models used for annotation will likely be updated in the future. This model does not guarantee the detection of all personal information in a text. Never use it without human supervision (human-in-the-loop).

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:sbx_pi_detection.pi  # None

To use this plugin, you also need to add the following setting to your Sparv corpus configuration file, with annotation_level set to one of basic, basic_iob, general, general_iob, detailed, or detailed_iob:

sbx_pi_detection:
  annotation_level: general

You also need to install the following plugin: sbx_pi_detection.

For general information on how to install plugins, see here.

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token pi="O">Jag</token>
<token pi="O">heter</token>
<token pi="personal_name">Maria</token>
<token pi="O">.</token>
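
Because the plugin must be used with a human in the loop, it can help to pull the flagged tokens out of the export for manual review. The following is a minimal Python sketch of that step, not part of the plugin. It assumes the export contains token elements with a pi attribute as in the example output, and that tokens labelled "O" carry no personal information. The <text> wrapper element and the function name flagged_tokens are hypothetical and only serve the illustration; a real Sparv export may nest tokens differently.

```python
import xml.etree.ElementTree as ET

# The example output from above, wrapped in a hypothetical <text> root
# element so that it parses as a single well-formed XML document.
EXAMPLE_XML = """<text>
<token pi="O">Jag</token>
<token pi="O">heter</token>
<token pi="personal_name">Maria</token>
<token pi="O">.</token>
</text>"""


def flagged_tokens(xml_string):
    """Return (position, text, label) for each token whose pi label is not "O".

    Assumption: "O" marks tokens without personal information, and a token
    lacking the pi attribute altogether is treated the same way.
    """
    root = ET.fromstring(xml_string)
    flagged = []
    # iter("token") finds token elements at any nesting depth.
    for position, token in enumerate(root.iter("token")):
        label = token.get("pi", "O")
        if label != "O":
            flagged.append((position, token.text, label))
    return flagged


if __name__ == "__main__":
    for position, text, label in flagged_tokens(EXAMPLE_XML):
        print(f"token {position}: {text!r} labelled {label}")
```

A reviewer would still need to read the full text, since tokens the model missed do not appear in this list.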

Type

  • Analysis

Task

  • PI detection

Unit

  • token

License

MIT

Contact

sb-info@svenska.gu.se