
sbx-swe-compound-sparv-saldolemgram

Citation

Språkbanken Text (2020). sbx-swe-compound-sparv-saldolemgram (updated: 2020-07-09). [Analysis]. Språkbanken Text. https://doi.org/10.23695/74z6-tk60
Analysis of SALDO lemgram compounds including a probability ranking

Tokens and their part-of-speech (POS) tags are looked up in the SALDO lexicon to enrich the tokens with compound information. More information (in Swedish) is available in the Språkbanken Text FAQ.

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <token>:saldo.complemgram  # Compound analysis using lemgrams

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<token complemgram="|">Språkbanken</token>
<token complemgram="|">Text</token>
<token complemgram="|">är</token>
<token complemgram="|">en</token>
<token complemgram="|forskning..nn.1+infrastruktur..nn.1:8.476e-13|">forskningsinfrastruktur</token>
<token complemgram="|">för</token>
<token complemgram="|">språkliga</token>
<token complemgram="|">data</token>
<token complemgram="|">och</token>
<token complemgram="|">en</token>
<token complemgram="|språk..nn.1+teknologisk..av.1:6.726e-13|språka..vb.1+teknologisk..av.1:4.035e-23|">språkteknologisk</token>
<token complemgram="|forskning..nn.1+enhet..nn.1:9.033e-13|">forskningsenhet</token>
<token complemgram="|">.</token>
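
The complemgram values in the example output can be unpacked programmatically. The following is a minimal sketch, not part of Sparv, and it assumes a format inferred only from the example above: candidate analyses are delimited by `|`, the lemgrams of one compound are joined by `+`, and the number after the final `:` is the probability used for ranking. The names `CompoundAnalysis` and `parse_complemgram` are hypothetical.

```python
from typing import NamedTuple, Optional


class CompoundAnalysis(NamedTuple):
    """One candidate compound analysis (hypothetical helper type)."""
    lemgrams: list[str]            # e.g. ["forskning..nn.1", "enhet..nn.1"]
    probability: Optional[float]   # None if the entry carries no ":<number>" suffix


def parse_complemgram(value: str) -> list[CompoundAnalysis]:
    """Split a complemgram attribute value into its candidate analyses.

    Assumes the layout seen in the example output:
    "|lemgram+lemgram:probability|lemgram+lemgram:probability|".
    An empty value ("|") yields an empty list.
    """
    analyses = []
    for entry in value.split("|"):
        if not entry:
            continue  # skip the empty strings around the leading/trailing "|"
        lemgram_part, sep, prob = entry.rpartition(":")
        if not sep:
            # No probability suffix found; keep the lemgrams only.
            lemgram_part, probability = entry, None
        else:
            probability = float(prob)
        analyses.append(CompoundAnalysis(lemgram_part.split("+"), probability))
    return analyses


if __name__ == "__main__":
    value = ("|språk..nn.1+teknologisk..av.1:6.726e-13"
             "|språka..vb.1+teknologisk..av.1:4.035e-23|")
    for analysis in parse_complemgram(value):
        print(analysis.lemgrams, analysis.probability)
```

In the example output, the candidates appear in descending order of probability, so under these assumptions the first element of the returned list is the highest-ranked analysis.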

Type

  • Analysis

Task

  • compound analysis

Unit

  • token

Keyword

  • saldo

Created

2018-03-28

Updated

2020-07-09

Contact

sb-info@svenska.gu.se