Språkbanken Text is a department within Språkbanken.

Swedish FrameNet (SweFN)

Citation Information

Språkbanken Text. (2021-12-21). Swedish FrameNet (SweFN) [Data set]. Språkbanken Text.
A lexical semantic resource based on the same principles as the English Berkeley FrameNet. This part of the resource contains the frames and the manually annotated semantic content.
The file swefn.xml contains the frames and the manually annotated semantic content. A second file is the complete resource, containing both the semantic and the syntactic content. The SweFN corpus examples are available as a separate download.
Title* Swedish FrameNet v1.0
Subtitle Swedish FrameNet for describing and documenting Swedish lexical entries, containing sentences annotated manually with semantic information and automatically with morphosyntactic information.
Created by* Dana Dannélls, Maria Toporowska Gronostaj, Karin Friberg Heppin, and others.
Publisher(s)* Språkbanken Text
Link(s) / permanent identifier(s)*
License(s)* CC BY 4.0
Abstract* Swedish FrameNet (SweFN) is a multi-layered lexical, grammatical and semantic computational resource based on the theory of frame semantics. The resource was created within the Swedish FrameNet++ project. It has been developed in line with Berkeley FrameNet 1.5. In SweFN sentences are annotated manually with semantic information and automatically with morphosyntactic annotations. The resource contains 1,195 semantic frames, 39,212 lexical units linked to Saldo and 9,020 semantically and syntactically annotated sentences.
Funded by* The Swedish Research Council (grant no. 2010–06013) and several other contributing projects
Cite as [1], [2]
Related datasets SweFN has links to several lexical resources at Språkbanken Text: Dalin, Loan Word Typology list, Parole+, Saldo, Simple+, Swesaurus.
Key applications Information retrieval, Machine translation, Natural language generation, Question answering, Semantic role labeling, Text classification, Textual entailment, Word sense disambiguation.
Intended task(s)/usage(s) Train and evaluate machine learning models, develop semantic role labeling systems.
Recommended evaluation measures Precision, Recall, F-score
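The recommended measures can be computed as in the following minimal sketch. It assumes that gold and predicted annotations are represented as sets of hashable items, here hypothetical `(sentence_id, frame_element)` pairs; this representation and the function name `precision_recall_f1` are illustrative assumptions, not part of the SweFN distribution.

```python
def precision_recall_f1(gold, predicted):
    """Return (precision, recall, F1) for two collections of labeled items."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy annotations: (sentence_id, frame_element) pairs.
gold = {(1, "Agent"), (1, "Theme"), (2, "Agent"), (2, "Place")}
predicted = {(1, "Agent"), (1, "Theme"), (2, "Theme")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Exact-match scoring over sets is only one possible convention; a span-based evaluation with partial credit would need a different matching step.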
Dataset function(s) Training, testing, development
Recommended split(s) 10-fold cross-validation.
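A 10-fold cross-validation split over the annotated sentences could be sketched as follows. The sketch works on sentence indices only and assumes a plain random split with a fixed seed; the function name `k_fold_splits` and the seed value are illustrative assumptions, and the resource itself does not prescribe fold assignments.

```python
import random


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 9,020 is the number of annotated sentences reported for SweFN v1.0.
splits = list(k_fold_splits(9020, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Splitting at sentence level is the simplest choice. If sentences from the same frame should not appear in both training and test data, the folds would have to be built over frames instead.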
Primary data* Text
Language* Swedish
Dataset in numbers* 1,195 Frames, 39,210 Lexical Units and 9,020 semantically and syntactically annotated sentences.
Nature of the content* Similarly to Berkeley FrameNet, Swedish FrameNet is built around semantic frames for describing and documenting Swedish lexical entries. It contains frame elements (FEs) and lexical units (LUs). A semantic frame represents factual information about concepts and situations in our world through frame elements. LUs are words or multiword expressions, each defined as a pairing of a word with a sense. They are analyzed with frame elements, and their syntactic relations are documented with the help of example sentences.
Swedish FrameNet contains several layers of annotation, divided into two XML files: (1) one containing information about the semantic properties of the LUs and the semantic analysis of the sentences in which they appear; this is the manual annotation. (2) one containing information about the frame, frame elements, domain, and lexical units, together with the linguistic analysis produced automatically by the Sparv pipeline, including the syntactic structure of each sentence and the morphological and other lexical descriptions (sense, sentiment score) of the lexical units.
Format* Both files (semantic and morphosyntactic annotations) are in XML. The semantic file 'swefn.xml' specifies 20 data fields: 1. The name of the frame, 2. Definition, 3. Core elements, 4. Peripheral elements, 5. SweCxn ID, 6. Semantic type, 7. Example sentences, 8. Compound patterns, 9. Compound examples, 10. Lexical units (LUs), 11. Suggestions for LUs, 12. Regular polysemy, 13. Domain, 14. Inheritance, 15. Berkeley Frame ID, 16. Berkeley LUs, 17. Created by, 18. Comment, 19. Status, 20. Modification date.
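Since this sheet lists the data fields but not the element names used in the XML, a first step when working with the file is to inspect its structure. The following is a minimal sketch that counts element tags with Python's standard `xml.etree.ElementTree`; the toy document and its tag names (`resource`, `frame`, `lu`) are hypothetical placeholders and do not reflect the actual schema of swefn.xml.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(source):
    """Count how often each element tag occurs in an XML file or file object."""
    counts = Counter()
    # iterparse streams the document, so large files need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # release the content of elements already counted
    return counts


# Hypothetical toy document; the tag names are NOT the real SweFN schema.
sample = io.BytesIO(
    b"<resource><frame><lu/><lu/></frame><frame><lu/></frame></resource>"
)
tag_counts = count_tags(sample)
for tag, n in tag_counts.most_common():
    print(tag, n)
```

To inspect the real resource, pass the path of the downloaded file instead, for example `count_tags("swefn.xml")`.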
Data source(s)* Sentences in SweFN have been extracted from the Web and from corpus examples. They have been annotated manually with semantic information and automatically with morphosyntactic information using Sparv v4.1. Lexical units have been linked to Saldo v2.3 using the Karp editing interface.
Data collection method(s)* Sentences and lexical units have been extracted manually and semi-automatically. Frames in SweFN have been developed using two approaches: extension and merging.
Data selection and filtering* As a result of the collection methods of frames, there are 59 frames in SweFN that do not have an exact match in BFN. Out of these, 20 are modified versions of BFN frames, revised mainly by splitting the original English frames into more specific ones [5].
Data preprocessing* The majority of sentences have been annotated through Karp's editing interface. Some sentences may have been shortened. Each of the data files has been preprocessed separately, one in Karp v5 and one in Sparv v4.1.
Data labeling*
Annotator characteristics Approximately 10 annotators have been involved in the semantic annotation work. Some had a background in linguistics, some in computational linguistics, and a few in lexicography. All annotators had at least an undergraduate degree.
Ethical considerations
Things to watch out for All frames are linked to Berkeley FrameNet v1.7.
Data last updated* 2021-12-21, v1.0
Which changes have been made, compared to the previous version* This is the first official version.
Access to previous versions
This document created* 2021-12-21, Dana Dannélls
This document last updated* 2021-12-21, Dana Dannélls
Where to look for further details [1], [2]
Documentation template version* v1.0
Related projects See complete list of contributing projects.