swe-lexical_classes_text-sparv-swefn

Citation Information

Språkbanken Text (2017). swe-lexical_classes_text-sparv-swefn (updated: 2017-09-21). [Analysis]. Språkbanken Text.
Lexical classes from SweFN at text level

Tokens are looked up in Swedish FrameNet (SweFN, a lexical-semantic resource based on the theory of Frame Semantics) and annotated with their lexical classes. Each text is then annotated with the lexical classes that are most relevant for the tokens it contains.

The SweFN frequency model (trained on Göteborgsposten 2008, SUC 3.0 and Bonniersromaner I (1976–77)) serves as the reference for ranking the SweFN classes that occur in each text. Using the token-level lexical class information, the analysis calculates and assigns the most relevant classes to each text. These classes are filtered and ranked by their frequency and by their dominance relative to the reference material.

Dominance refers to the relative importance or prominence of a lexical class in a given text compared to a reference material. Dominance is derived by comparing the observed frequency of a lexical class in the text to its expected (relative) frequency in the reference material.
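The idea of dominance can be illustrated with a minimal Python sketch. It is not Sparv's implementation: the exact formula, the filtering threshold and the number of classes kept are not stated here, so the sketch assumes dominance is the ratio of a class's observed relative frequency in the text to its relative frequency in the reference material. The function name rank_classes, its parameters and the sample frequencies are all hypothetical.

```python
from collections import Counter


def rank_classes(token_classes, reference_rel_freq, min_count=1, top_n=3):
    """Rank the lexical classes of one text by an ASSUMED dominance ratio.

    token_classes: one list of class names per token (empty if no match).
    reference_rel_freq: class name -> relative frequency in the reference
        material (hypothetical values in the example below).
    """
    # Count how often each class occurs across all tokens in the text.
    counts = Counter(cls for classes in token_classes for cls in classes)
    total = sum(counts.values())

    ranked = []
    for cls, count in counts.items():
        expected = reference_rel_freq.get(cls)
        # Filter: skip rare classes and classes unknown to the reference.
        if count < min_count or not expected:
            continue
        observed = count / total
        # Assumed definition: observed relative frequency / expected one.
        ranked.append((cls, observed / expected))

    # Highest dominance first; keep only the top_n classes.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical token-level annotations and reference frequencies.
token_classes = [["Animals"], [], ["Typicality"], ["Animals"], ["Type"]]
reference_rel_freq = {"Animals": 0.01, "Typicality": 0.02, "Type": 0.05}

ranking = rank_classes(token_classes, reference_rel_freq)
for name, dominance in ranking:
    print(f"{name}: {dominance:.3f}")
```

Under this assumed definition, a class that is rare in the reference material but frequent in the text receives a high score, which is the intuition described above.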

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <text>:lexical_classes.swefn  # Lexical classes for text chunks from SweFN

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<text swefn="|Type:149.863|Animals:137.544|Typicality:107.808|">
  <token>Rödräv</token>
  <token>eller</token>
  <token>vanlig</token>
  <token>räv</token>
  <token>är</token>
  <token>ett</token>
  <token>hunddjur</token>
  <token>och</token>
  <token>den</token>
  <token>mest</token>
  <token>förekommande</token>
  <token>arten</token>
  <token>i</token>
  <token>rävsläktet</token>
  <token>.</token>
</text>
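
To work with the exported annotation downstream, the swefn attribute can be split into (class, score) pairs. The following Python sketch assumes the attribute always has the pipe-separated name:number form shown in the example output; the helper name parse_swefn is hypothetical, and the number after the colon is simply treated as a numeric score, since its exact definition is not given here.

```python
import xml.etree.ElementTree as ET


def parse_swefn(value):
    """Split a value like '|Type:149.863|Animals:137.544|' into pairs."""
    pairs = []
    for item in value.strip("|").split("|"):
        if not item:
            continue  # empty annotation, e.g. "|"
        # Split on the last colon so the score is always the final field.
        name, _, score = item.rpartition(":")
        pairs.append((name, float(score)))
    return pairs


# Shortened version of the example output above.
xml_text = (
    '<text swefn="|Type:149.863|Animals:137.544|Typicality:107.808|">'
    "<token>Rödräv</token>"
    "</text>"
)

root = ET.fromstring(xml_text)
classes = parse_swefn(root.get("swefn"))
for name, score in classes:
    print(name, score)
```

The pairs keep the order in which they appear in the attribute, which in the example runs from the highest score to the lowest.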

Other references

  • Dana Dannélls, Lars Borin, Karin Friberg Heppin (2021): The Swedish FrameNet++: Harmonization, integration, method development and practical language technology applications. John Benjamins: Amsterdam, Philadelphia. ISBN 978 90 272 5848 9.

Type

  • Analysis

Task

  • lexical classes

Unit

  • text

Tool

Sparv

Trained on

Reference corpora for relative frequencies: Göteborgsposten 2008, SUC 3.0, Bonniersromaner I (1976–77)

Created

2017-09-21

Updated

2017-09-21

Contact

Språkbanken Text
sb-info@svenska.gu.se