sbx-swe-compound-sparv-saldowords

Analysis citation Information

Språkbanken Text (2020). sbx-swe-compound-sparv-saldowords (updated: 2020-07-09). [Analysis]. Språkbanken Text. https://doi.org/10.23695/w46y-cs57
Analysis of SALDO wordform compounds

Each token is looked up in the SALDO lexicon together with its POS tag, and the token is annotated with its compound analysis where one is found. More information (in Swedish) is available in the Språkbanken Text FAQ.

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:saldo.compwf  # Compound analysis using wordforms

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token compwf="|">Språkbanken</token>
<token compwf="|">Text</token>
<token compwf="|">är</token>
<token compwf="|">en</token>
<token compwf="|forsknings+infrastruktur|">forskningsinfrastruktur</token>
<token compwf="|">för</token>
<token compwf="|">språkliga</token>
<token compwf="|">data</token>
<token compwf="|">och</token>
<token compwf="|">en</token>
<token compwf="|språk+teknologisk|">språkteknologisk</token>
<token compwf="|forsknings+enhet|">forskningsenhet</token>
<token compwf="|">.</token>
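A minimal sketch of how the compwf attribute might be read downstream, based only on the example output above. It assumes that the value is a set delimited by "|" (so "|" alone means no analysis) and that "+" separates the parts of a compound. The function names and the <sentence> wrapper element are hypothetical and only there to make the fragment parseable; a real Sparv export will have its own surrounding structure.

```python
import xml.etree.ElementTree as ET


def parse_compwf(value):
    """Split a compwf value such as "|a+b|c+d|" into [["a", "b"], ["c", "d"]].

    Assumption: "|" delimits alternative analyses and "+" separates parts.
    """
    analyses = [a for a in value.split("|") if a]
    return [a.split("+") for a in analyses]


# Hypothetical wrapper element around tokens copied from the example output.
SAMPLE = """<sentence>
<token compwf="|">en</token>
<token compwf="|forsknings+infrastruktur|">forskningsinfrastruktur</token>
<token compwf="|språk+teknologisk|">språkteknologisk</token>
</sentence>"""


def compounds(xml_text):
    """Map each token that has a compound analysis to its list of analyses."""
    root = ET.fromstring(xml_text)
    result = {}
    for tok in root.iter("token"):
        analyses = parse_compwf(tok.get("compwf", "|"))
        if analyses:  # skip tokens whose set is empty ("|")
            result[tok.text] = analyses
    return result


if __name__ == "__main__":
    for word, analyses in compounds(SAMPLE).items():
        print(word, analyses)
```

Tokens whose value is just "|" are skipped, so only the analysed compounds remain in the result.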

Type

  • Analysis

Task

  • compound analysis

Unit

  • token

Keyword

  • saldo

Created

2018-03-28

Updated

2020-07-09

Contact

sb-info@svenska.gu.se