Tokens are looked up in Swedish FrameNet (SweFN, a lexical-semantic resource that follows the theory of Frame Semantics) to enrich them with information about their lexical classes. Documents are then annotated with lexical classes based on which classes are relevant for the tokens they contain.
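As a rough illustration of the token lookup and the document-level aggregation, here is a minimal sketch. It assumes the lexicon is available as a plain mapping from lemma to class names. The names TOY_LEXICON, annotate_tokens and document_class_counts are hypothetical, and the lexicon entries are made up for the example rather than taken from SweFN.

```python
from collections import Counter

# Hypothetical lemma -> lexical class mapping, invented for illustration.
# A real lexicon would be loaded from the SweFN resource instead.
TOY_LEXICON = {
    "äta": {"Ingestion"},
    "dricka": {"Ingestion"},
    "springa": {"Self_motion"},
}


def annotate_tokens(lemmas, lexicon):
    """Return, for each lemma, the sorted list of classes found in the lexicon."""
    return [sorted(lexicon.get(lemma, set())) for lemma in lemmas]


def document_class_counts(token_classes):
    """Count how often each class occurs across all tokens of a document."""
    counts = Counter()
    for classes in token_classes:
        counts.update(classes)
    return counts


lemmas = ["äta", "och", "dricka", "springa"]
token_classes = annotate_tokens(lemmas, TOY_LEXICON)
doc_counts = document_class_counts(token_classes)
```

Tokens without a lexicon entry (here "och") simply receive an empty class list and do not contribute to the document-level counts.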
The SweFN frequency model (trained on Göteborgsposten 2008, SUC 3.0 and Bonniersromaner I (1976–77)) serves as the reference for ranking the SweFN classes that occur in each document. Using the token-level lexical class information, the analysis calculates and assigns the most relevant classes for each document. These classes are filtered and ranked by their frequency and by their dominance relative to the reference material.
Dominance refers to the relative prominence of a lexical class in a given document compared to the reference material. It is derived by comparing the observed frequency of a lexical class in the document to its expected (relative) frequency in the reference material.
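The text does not give the exact formula, so the following is only a sketch under stated assumptions: dominance is taken to be the ratio between a class's observed relative frequency in the document and its relative frequency in the reference material, and the document total is taken to be the number of class occurrences. The function rank_classes, its parameters min_count and top_n, and the reference frequencies are all hypothetical and chosen for illustration.

```python
def rank_classes(doc_counts, ref_rel_freq, min_count=1, top_n=None):
    """Rank classes by an assumed dominance score (observed / expected).

    doc_counts:   class name -> number of occurrences in the document
    ref_rel_freq: class name -> relative frequency in the reference material
    """
    total = sum(doc_counts.values())
    ranked = []
    for cls, count in doc_counts.items():
        expected = ref_rel_freq.get(cls)
        # Filter out rare classes and classes missing from the reference model.
        if count < min_count or not expected:
            continue
        observed = count / total
        ranked.append((cls, observed / expected))
    # Highest dominance first; ties are broken alphabetically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up counts and reference frequencies, for illustration only.
doc_counts = {"Ingestion": 3, "Self_motion": 1}
ref_rel_freq = {"Ingestion": 0.25, "Self_motion": 0.5}
ranked = rank_classes(doc_counts, ref_rel_freq)
```

Under this assumed definition, a score above 1 means the class is more frequent in the document than the reference material would predict, and a score below 1 means it is less frequent.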