Språkbanken Text is a department within Språkbanken.

swe-sense-wsd

Citation Information

Språkbanken Text (2022). swe-sense-wsd (updated: 2022-05-13). [Analysis]. Språkbanken Text.

Standard Reference Information

https://aclanthology.org/N15-1164.pdf

Word sense disambiguation based on SALDO annotation

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:wsd.sense  # Sense disambiguated SALDO identifiers

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token sense="|den..2:-1.000|">Det</token>
<token sense="|finna..1:0.497|finnas..1:0.472|finna..2:0.031|">finns</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..4:0.661|fil..5:0.194|fil..1:0.104|fil..2:0.040|fil..3:0.001|">fil</token>
<token sense="|i..2:-1.000|">i</token>
<token sense="|katalog..1:-1.000|">katalogen</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|den..1:-1.000|den..2:-1.000|en..2:-1.000|">den</token>
<token sense="|extern..1:-1.000|">externa</token>
<token sense="|hårddisk..1:-1.000|">hårddisken</token>
<token sense="|">.</token>
<token sense="|man..1:-1.000|">Man</token>
<token sense="|kunna..1:0.666|kunna..4:0.147|kunna..3:0.110|kunna..2:0.077|">kan</token>
<token sense="|använda..1:-1.000|">använda</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..2:0.573|fil..4:0.213|fil..1:0.130|fil..5:0.084|fil..3:0.001|">fil</token>
<token sense="|för..1:-1.000|för..5:-1.000|för..6:-1.000|för..7:-1.000|för..9:-1.000|">för</token>
<token sense="|att..1:-1.000|">att</token>
<token sense="|slipa..2:0.832|slipa..1:0.168|">slipa</token>
<token sense="|kant..1:-1.000|">kanterna</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|bräda..1:0.787|bräde..1:0.213|">brädan</token>
<token sense="|">.</token>
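
The sense attribute in the example output is a pipe-delimited list of identifier:score pairs. A minimal Python sketch of reading it, assuming each token arrives as a standalone XML element like the lines above, might look as follows. The function names parse_sense and best_sense are hypothetical helpers for illustration and are not part of Sparv. The source does not explain the score -1.000, so the sketch treats it as an ordinary number and returns the first listed sense when scores tie.

```python
import xml.etree.ElementTree as ET


def parse_sense(attr):
    """Split a sense attribute such as '|a..1:0.7|a..2:0.3|' into (id, score) pairs."""
    pairs = []
    for item in attr.strip("|").split("|"):
        if not item:
            continue  # an empty attribute ("|") carries no senses
        # The score follows the last colon; SALDO identifiers contain dots, not colons.
        sense_id, _, score = item.rpartition(":")
        pairs.append((sense_id, float(score)))
    return pairs


def best_sense(attr):
    """Return the identifier with the highest score, or None if there are no senses."""
    pairs = parse_sense(attr)
    if not pairs:
        return None
    # max() keeps the first pair on ties, so equal scores fall back to listing order.
    return max(pairs, key=lambda pair: pair[1])[0]


line = ('<token sense="|fil..4:0.661|fil..5:0.194|fil..1:0.104|'
        'fil..2:0.040|fil..3:0.001|">fil</token>')
token = ET.fromstring(line)
print(token.text, best_sense(token.get("sense")))  # prints: fil fil..4
```

Applied to the two occurrences of "fil" above, this picks fil..4 in the first sentence and fil..2 in the second, matching the highest scores listed for each token.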

Evaluation results

Using lemma embeddings:
precision: 0.569, recall: 0.292, f-measure: 0.386

Using sense embeddings:
precision: 0.667, recall: 0.332, f-measure: 0.443
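
The reported f-measures are consistent with the usual harmonic mean of precision and recall. The page does not state the formula, so this is an assumption, but the following short Python check reproduces both figures to three decimals. The function name f_measure is chosen here for illustration.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed to be the measure reported)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the evaluation results above.
results = [
    ("lemma embeddings", 0.569, 0.292),
    ("sense embeddings", 0.667, 0.332),
]
for name, precision, recall in results:
    print(f"{name}: {f_measure(precision, recall):.3f}")
# prints:
# lemma embeddings: 0.386
# sense embeddings: 0.443
```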

More information: https://aclanthology.org/N15-1164.pdf

Other references

  • https://github.com/spraakbanken/sparv-wsd/blob/master/README.pdf

  • Sparv wsd: https://github.com/spraakbanken/sparv-wsd

Type

  • Analysis

Task

sense disambiguation

Unit

token

Tool

Sparv wsd

Trained on

SALDO from May 2014 (SCOUSE model)

Keyword

  • saldo

Created

2018-05-28

Updated

2022-05-13

Contact

Språkbanken Text
sb-info@svenska.gu.se