Språkbanken Text is a department within Språkbanken.

swe-lemmatization-sparv-saldo2

Citation Information

Språkbanken Text (2020). swe-lemmatization-sparv-saldo2 (updated: 2020-01-15). [Analysis]. Språkbanken Text.
Full-form lookup for SALDO citation forms (lemmas) plus analysis of compounds made up of SALDO entries

The SALDO morphology full-form lexicon is used to look up the possible citation forms (lemmas) and word senses for each word token in a text, preserving ambiguity. In addition, the compounding information in SALDO is used for compound analysis.
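
To illustrate what "preserving ambiguity" means for a full-form lookup, here is a minimal sketch in Python. It is not Sparv's implementation: the lexicon entries are hand-written for illustration (not extracted from SALDO), the names TOY_FULL_FORMS, lookup_baseforms and format_set are hypothetical, and compound analysis is left out entirely.

```python
# Hand-written toy entries for illustration only; NOT extracted from SALDO.
# Each word form maps to every citation form it could belong to.
TOY_FULL_FORMS = {
    "är": ["vara"],
    "var": ["var", "vara"],  # ambiguous: both candidates are kept
    "korpus": ["korpus"],
}


def lookup_baseforms(token: str) -> list[str]:
    """Return all candidate citation forms for a token (empty list if unknown)."""
    return TOY_FULL_FORMS.get(token.lower(), [])


def format_set(items: list[str]) -> str:
    """Render candidates in a pipe-delimited form; an empty list gives a lone '|'."""
    return "|" + "".join(item + "|" for item in items)


if __name__ == "__main__":
    for word in ["Var", "är", "korpus", "."]:
        print(word, format_set(lookup_baseforms(word)))
```

The lookup returns every candidate instead of picking one, so any disambiguation is left to a later step.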

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:saldo.baseform2  # Baseform including baseforms derived from compounds

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token baseform="|den|den här|">Det</token>
<token baseform="|här|den här:1|">här</token>
<token baseform="|vara|">är</token>
<token baseform="|en|">en</token>
<token baseform="|korpus|">korpus</token>
<token baseform="|">.</token>
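
In the output above, the baseform attribute appears to hold a pipe-delimited set of candidates, with a lone | standing for an empty set. The following Python sketch reads such values back into lists. The wrapping <sentence> element and the helper name parse_baseforms are assumptions made for this example; a real Sparv export may be structured differently.

```python
import xml.etree.ElementTree as ET


def parse_baseforms(value: str) -> list[str]:
    """Split a pipe-delimited value such as '|den|den här|' into its items."""
    return [item for item in value.split("|") if item]


# The token lines are copied from the example output; the <sentence>
# wrapper is an assumption added so that the snippet is well-formed XML.
SAMPLE = """<sentence>
<token baseform="|den|den här|">Det</token>
<token baseform="|här|den här:1|">här</token>
<token baseform="|vara|">är</token>
<token baseform="|en|">en</token>
<token baseform="|korpus|">korpus</token>
<token baseform="|">.</token>
</sentence>"""

root = ET.fromstring(SAMPLE)
# Pair each token's text with its list of candidate baseforms.
tokens = [(tok.text, parse_baseforms(tok.get("baseform", ""))) for tok in root.iter("token")]

if __name__ == "__main__":
    for text, baseforms in tokens:
        print(text, baseforms)
```

Tokens without any candidate, such as the final full stop, come back as an empty list.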

Type

  • Analysis

Task

lemmatization

Unit

token

Tool

Sparv

Keyword

  • saldo

Created

2018-03-28

Updated

2020-01-15

Contact

Språkbanken Text
sb-info@svenska.gu.se