
sbx-swe-sense-sparv_saldowsd-rs

Analysis citation

Språkbanken Text (2025). sbx-swe-sense-sparv_saldowsd-rs (updated: 2025-11-26). [Analysis]. Språkbanken Text. https://doi.org/10.23695/452d-5270
Word sense disambiguation based on SALDO annotation, using a Rust rewrite of saldowsd.jar.

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:sbx_wsd_rs.sense  # Sense disambiguated SALDO identifiers

You also need to install the following plugin: sbx_wsd_rs.

For general information on how to install plugins, see here.

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token sense="|den..2:-1.000|">Det</token>
<token sense="|finna..1:0.497|finnas..1:0.472|finna..2:0.031|">finns</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..4:0.661|fil..5:0.194|fil..1:0.104|fil..2:0.040|fil..3:0.001|">fil</token>
<token sense="|i..2:-1.000|">i</token>
<token sense="|katalog..1:-1.000|">katalogen</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|den..1:-1.000|den..2:-1.000|en..2:-1.000|">den</token>
<token sense="|extern..1:-1.000|">externa</token>
<token sense="|hårddisk..1:-1.000|">hårddisken</token>
<token sense="|">.</token>
<token sense="|man..1:-1.000|">Man</token>
<token sense="|kunna..1:0.666|kunna..4:0.147|kunna..3:0.110|kunna..2:0.077|">kan</token>
<token sense="|använda..1:-1.000|">använda</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..2:0.573|fil..4:0.213|fil..1:0.130|fil..5:0.084|fil..3:0.001|">fil</token>
<token sense="|för..1:-1.000|för..5:-1.000|för..6:-1.000|för..7:-1.000|för..9:-1.000|">för</token>
<token sense="|att..1:-1.000|">att</token>
<token sense="|slipa..2:0.832|slipa..1:0.168|">slipa</token>
<token sense="|kant..1:-1.000|">kanterna</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|bräda..1:0.787|bräde..1:0.213|">brädan</token>
<token sense="|">.</token>
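The sense attribute in the example output is a pipe-delimited set of SALDO identifiers, each followed by a colon and a score, and an empty set is written as a single "|". The following is a minimal Python sketch for reading that attribute downstream. The function names parse_sense and best_sense are hypothetical and not part of Sparv or the sbx_wsd_rs plugin. Treating a score of -1.000 as "no score computed" is an assumption inferred from the example output, not something the plugin documentation on this page states.

```python
from typing import List, Optional, Tuple


def parse_sense(value: str) -> List[Tuple[str, float]]:
    """Split a sense attribute value into (saldo_id, score) pairs.

    Hypothetical helper: the format is read off the example output,
    e.g. "|finna..1:0.497|finnas..1:0.472|finna..2:0.031|".
    """
    pairs = []
    for item in value.strip("|").split("|"):
        if not item:
            continue  # an empty set is written as a single "|"
        # Split on the last colon so the identifier is kept whole.
        sense_id, _, score = item.rpartition(":")
        pairs.append((sense_id, float(score)))
    return pairs


def best_sense(value: str) -> Optional[str]:
    """Return the highest-scoring sense, or None if nothing was scored.

    Assumption: negative scores (-1.000 in the example) mean that no
    score was computed, so those senses are skipped here.
    """
    scored = [pair for pair in parse_sense(value) if pair[1] >= 0]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    first_fil = "|fil..4:0.661|fil..5:0.194|fil..1:0.104|fil..2:0.040|fil..3:0.001|"
    second_fil = "|fil..2:0.573|fil..4:0.213|fil..1:0.130|fil..5:0.084|fil..3:0.001|"
    print(best_sense(first_fil))   # top sense for "fil" in the first sentence
    print(best_sense(second_fil))  # top sense for "fil" in the second sentence
```

Applied to the two occurrences of "fil" above, the sketch picks a different top-ranked sense in each sentence, which is the contrast the example output is meant to illustrate.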

Evaluation results

Using lemma embeddings: precision 0.569, recall 0.292, F-measure 0.386

Using sense embeddings: precision 0.667, recall 0.332, F-measure 0.443

More information: https://aclanthology.org/N15-1164.pdf
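The reported F-measure values are consistent with the balanced F-measure, the harmonic mean of precision and recall. The short Python check below assumes that definition, which this page does not state explicitly. The function name f_measure is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the page reports the balanced (beta = 1) variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported for each embedding type.
    print(round(f_measure(0.569, 0.292), 3))  # lemma embeddings -> 0.386
    print(round(f_measure(0.667, 0.332), 3))  # sense embeddings -> 0.443
```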

Type

  • Analysis

Task

  • sense disambiguation

Unit

  • token

License

MIT

Trained on

SALDO from May 2014 (SCOUSE model)

Keyword

  • saldo

Created

2025-09-03

Updated

2025-11-26

Contact

sb-info@svenska.gu.se