sbx-swe-sense-sparv

Analysis citation Information

Språkbanken Text (2022). sbx-swe-sense-sparv (updated: 2022-05-13). [Analysis]. Språkbanken Text. https://doi.org/10.23695/j2vv-v579
Word sense disambiguation based on SALDO annotation

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:wsd.sense  # Sense disambiguated SALDO identifiers

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token sense="|den..2:-1.000|">Det</token>
<token sense="|finna..1:0.497|finnas..1:0.472|finna..2:0.031|">finns</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..4:0.661|fil..5:0.194|fil..1:0.104|fil..2:0.040|fil..3:0.001|">fil</token>
<token sense="|i..2:-1.000|">i</token>
<token sense="|katalog..1:-1.000|">katalogen</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|den..1:-1.000|den..2:-1.000|en..2:-1.000|">den</token>
<token sense="|extern..1:-1.000|">externa</token>
<token sense="|hårddisk..1:-1.000|">hårddisken</token>
<token sense="|">.</token>
<token sense="|man..1:-1.000|">Man</token>
<token sense="|kunna..1:0.666|kunna..4:0.147|kunna..3:0.110|kunna..2:0.077|">kan</token>
<token sense="|använda..1:-1.000|">använda</token>
<token sense="|den..1:-1.000|en..2:-1.000|">en</token>
<token sense="|fil..2:0.573|fil..4:0.213|fil..1:0.130|fil..5:0.084|fil..3:0.001|">fil</token>
<token sense="|för..1:-1.000|för..5:-1.000|för..6:-1.000|för..7:-1.000|för..9:-1.000|">för</token>
<token sense="|att..1:-1.000|">att</token>
<token sense="|slipa..2:0.832|slipa..1:0.168|">slipa</token>
<token sense="|kant..1:-1.000|">kanterna</token>
<token sense="|på..1:-1.000|">på</token>
<token sense="|bräda..1:0.787|bräde..1:0.213|">brädan</token>
<token sense="|">.</token>
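
As the example output shows, each sense attribute is a pipe-delimited set of sense:score pairs, and punctuation gets an empty set ("|"). The following Python sketch shows one way such a value might be read after export. The function names parse_sense and best_sense are hypothetical and not part of Sparv. Treating a score of -1.000 as "no score assigned" is an assumption based on the output above, where that value appears on tokens whose candidates are not ranked.

```python
from typing import List, Optional, Tuple

# Assumption: -1.0 is a placeholder meaning the sense was not scored.
UNSCORED = -1.0


def parse_sense(value: str) -> List[Tuple[str, float]]:
    """Split a value such as '|fil..4:0.661|fil..5:0.194|' into (sense, score) pairs."""
    pairs = []
    for item in value.strip("|").split("|"):
        if not item:
            # An empty set ('|') yields a single empty item; skip it.
            continue
        # SALDO identifiers contain dots but no colon, so split on the last colon.
        sense_id, _, score = item.rpartition(":")
        pairs.append((sense_id, float(score)))
    return pairs


def best_sense(value: str) -> Optional[str]:
    """Return the highest-scoring sense, or None if nothing was scored."""
    scored = [(s, p) for s, p in parse_sense(value) if p != UNSCORED]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(parse_sense("|bräda..1:0.787|bräde..1:0.213|"))
    print(best_sense("|slipa..2:0.832|slipa..1:0.168|"))
    print(best_sense("|den..1:-1.000|en..2:-1.000|"))
```

For the two occurrences of "fil" in the example, this picks fil..4 in the first sentence and fil..2 in the second, matching the highest scores listed in the output.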

Evaluation results

Using lemma embeddings:
precision: 0.569 recall: 0.292 f-measure: 0.386

Using sense embeddings:
precision: 0.667 recall: 0.332 f-measure: 0.443

More information: https://aclanthology.org/N15-1164.pdf
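
The reported f-measures are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). A short Python check, assuming that definition (the function name f_measure is only illustrative):

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall values copied from the evaluation results above.
    results = [
        ("lemma embeddings", 0.569, 0.292),
        ("sense embeddings", 0.667, 0.332),
    ]
    for label, p, r in results:
        print(f"{label}: {f_measure(p, r):.3f}")
```

Rounded to three decimals, this gives 0.386 and 0.443, the values listed above.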

Type

  • Analysis

Task

  • sense disambiguation

Unit

  • token

Dependencies

External tools

Sparv wsd
MIT License

Models

Trained on

SALDO from May 2014 (SCOUSE model)

Keyword

  • saldo

Created

2018-05-28

Updated

2022-05-13

Contact

sb-info@svenska.gu.se