A lexical-semantic resource based on the same principles as the English Berkeley FrameNet. This version has been updated to correspond to BFN 1.7.
The file swefn-2.0.json contains the frames and the manually annotated semantic content, in a JSON format.
The file swefn-2.0.tsv contains the same thing in a tab-separated table. Long lines have been separated for legibility; replace \n\t with \t for a strict table.
The Karp version has not been updated to 2.0 as of Oct 2024.
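The conversion to a strict table described above can be sketched in a few lines of Python. This is a minimal sketch, not an official loader: the function names `to_strict_tsv` and `read_rows` are hypothetical, the UTF-8 encoding is an assumption, and the sample rows are invented purely for illustration.

```python
import csv
import io


def to_strict_tsv(text: str) -> str:
    """Join continuation lines: a newline followed by a tab becomes one tab."""
    # Normalising "\r\n" first is an assumption, in case the file has
    # Windows line endings; it is harmless otherwise.
    return text.replace("\r\n", "\n").replace("\n\t", "\t")


def read_rows(text: str) -> list:
    """Parse the strict table into a list of rows (lists of cell strings)."""
    reader = csv.reader(
        io.StringIO(to_strict_tsv(text)),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    return [row for row in reader]


# Invented toy input: the first record is wrapped over three physical lines.
sample = (
    "Frame_A\tdefinition\n\tcore elements\n\tcomment\n"
    "Frame_B\tdefinition\tcore\tcomment\n"
)
rows = read_rows(sample)
print(len(rows))

# With the real file (encoding assumed to be UTF-8):
# with open("swefn-2.0.tsv", encoding="utf-8") as f:
#     rows = read_rows(f.read())
```

Reading the whole file into memory before replacing keeps the sketch simple; the sequence spans a line boundary, so a line-by-line reader would need extra state to detect it.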
I. IDENTIFYING INFORMATION | |
Title* | Swedish FrameNet v2.0 |
Subtitle | Swedish FrameNet for describing and documenting Swedish lexical entries, containing sentences annotated manually with semantic information. |
Created by* | Dana Dannélls (dana.dannells@svenska.gu.se), Niklas Zechner, Maria Toporowska Gronostaj, Karin Friberg Heppin, and others. |
Publisher(s)* | Språkbanken Text (sb-info@svenska.gu.se) |
Link(s) / permanent identifier(s)* | https://spraakbanken.gu.se/resurser/swefn-2-0 |
License(s)* | CC BY 4.0 |
Abstract* | Swedish FrameNet (SweFN) is a multi-layered lexical, grammatical and semantic computational resource based on the theory of frame semantics. The resource was created within the Swedish FrameNet++ project. It was originally developed in line with Berkeley FrameNet 1.5 (SweFN version 1.0), and has since been updated and aligned with BFN 1.7. The resource contains 1,329 Frames, ~39,000 Lexical Units linked to Saldo and ~9,000 semantically annotated sentences. |
Funded by* | The Swedish Research Council (grant no. 2010–06013) and several other contributing projects |
Cite as | [1], [2] |
Related datasets | SweFN has links to several lexical resources at Språkbanken Text: Dalin, Loan Word Typology list, Parole+, Saldo, Simple+, Swesaurus. |
II. USAGE | |
Key applications | Information retrieval, Machine translation, Natural language generation, Question answering, Semantic role labeling, Text classification, Textual entailment, Word sense disambiguation. |
Intended task(s)/usage(s) | Train and evaluate machine learning models, develop semantic role labeling systems. |
Recommended evaluation measures | Precision, Recall, F-score |
Dataset function(s) | Training, testing, development |
Recommended split(s) | 10-fold cross-validation. |
III. DATA | |
Primary data* | Text |
Language* | Swedish |
Dataset in numbers* | 1,329 Frames, ~39,000 Lexical Units and ~9,000 semantically and syntactically annotated sentences. |
Nature of the content* | Similarly to the Berkeley FrameNet, the Swedish FrameNet is built around semantic frames for describing and documenting Swedish lexical entries. It contains frame elements (FEs) and lexical units (LUs). A semantic frame represents factual information about concepts and situations in our world through frame elements. LUs are words or multiword expressions defined as a pairing of a word with a sense. |
Format* | The file contains 17 data fields: 1. Frame, 2. Definition, 3. Core elements, 4. Peripheral elements, 5. SweCxn ID, 6. Semantic type, 7. Example sentences, 8. Compound patterns, 9. Compound examples, 10. Lexical units (LUs), 11. Suggestions for LUs, 12. Regular polysemy, 13. Domain, 14. Inheritance, 15. Created by, 16. Created date, 17. Comment |
Data source(s)* | Sentences in SweFN have been extracted from the Web and from corpus examples. They have been manually annotated with semantic information. Lexical units have been linked to Saldo v2.3 using the Karp editing interface. |
Data collection method(s)* | Sentences and lexical units have been extracted manually and semi-automatically. Frames in SweFN have been developed by taking two approaches: extension and merging. |
Data selection and filtering* | |
Data preprocessing* | The majority of sentences have been annotated through Karp's editing interface. Some sentences may have been shortened. Each of the data files has been preprocessed separately, one in Karp v5 and one in Sparv v4.1. |
Data labeling* | |
Annotator characteristics | Approximately 10 annotators have been involved in the semantic annotation work. Some had a background in linguistics, some in computational linguistics and a few in lexicography. All annotators had at least an undergraduate degree. |
IV. ETHICS AND CAVEATS | |
Ethical considerations | |
Things to watch out for | All frames are linked to Berkeley FrameNet v1.7. |
V. ABOUT DOCUMENTATION | |
Data last updated* | 2024-10-14, v2.0 |
Which changes have been made, compared to the previous version* | Replacement of frame element names, and additions of frame elements that were missing. |
Access to previous versions | https://spraakbanken.gu.se/resurser/swefn |
This document created* | 2024-10-14, Niklas Zechner |
This document last updated* | 2024-10-24, Dana Dannélls |
Where to look for further details | [1], [2] |
Documentation template version* | v2.0 |
VI. OTHER | |
Related projects | See complete list of contributing projects. |
References | [1] Dana Dannélls, Lars Borin, Markus Forsberg, Karin Friberg Heppin, Maria Toporowska Gronostaj (2021): Swedish FrameNet. In The Swedish FrameNet++: Harmonization, integration, method development and practical language technology applications, pages 37–66. John Benjamins: Amsterdam, Philadelphia. ISBN 978 90 272 5848 9.
[2] Dana Dannélls, Lars Borin, Karin Friberg Heppin (2021): The Swedish FrameNet++: Harmonization, integration, method development and practical language technology applications. John Benjamins: Amsterdam, Philadelphia. ISBN 978 90 272 5848 9.
[3] Dana Dannélls, Karin Friberg Heppin, Anna Ehrlemark (2014): Using language technology resources and tools to construct Swedish FrameNet. In Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, pages 8–17. Dublin: ACL.
[4] Karin Friberg Heppin, Miriam R.L. Petruck (2014): Encoding of Compounds in Swedish FrameNet. In Proceedings of the 10th Workshop on Multiword Expressions (MWE 2014), workshop at EACL 2014, pages 67–71. Gothenburg: ACL.
[5] Karin Friberg Heppin, Maria Toporowska Gronostaj (2014): Exploiting FrameNet for Swedish: Mismatch? Constructions and Frames 6(1), pages 52–72.
[6] Karin Friberg Heppin (2013): Search using semantic FrameNet frames as variables. In Proceedings of Sixth Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR 2013), held at CIKM 2013 in San Francisco, pages 25–28.
[7] Kaarlo Voionmaa, Karin Friberg Heppin (2013): Use of support verbs in FrameNet annotations. In Electronic lexicography in the 21st century: thinking outside the paper. Proceedings of the eLex 2013 conference, Tallinn, Estonia.
[8] Richard Johansson, Karin Friberg Heppin, Dimitrios Kokkinakis (2012): Semantic Role Labeling with the Swedish FrameNet. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey, pages 3697–3700.
[9] Dana Dannélls, Lars Borin (2012): Toward language independent methodology for generating artwork descriptions – Exploring FrameNet information. In EACL 2012 workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), pages 18–23. Avignon: ACL.
[10] Karin Friberg Heppin, Kaarlo Voionmaa (2012): Practical aspects of transferring the English Berkeley FrameNet to other languages. In Proceedings of SLTC 2012, pages 28–29. Lund: Lund University.
[11] Dimitrios Kokkinakis (2012): Initial Experiments of Medication Event Extraction Using Frame Semantics. In Scandinavian Conference on Health Informatics (SHI), Linköping Electronic Conference Proceedings, pages 41–47. Linköping: LiUEP.
[12] Richard Johansson (2012): Non-atomic Classification to Improve a Semantic Role Labeler for a Low-resource Language. In Proceedings of the First Joint Conference on Lexical and Computational Semantics (*SEM), Montréal, Canada, pages 95–99.
[13] Benjamin Lyngfelt, Lars Borin, Markus Forsberg, Julia Prentice, Rudolf Rydstedt, Emma Sköldberg, Sofia Tingsell (2012): Adding a constructicon to the Swedish resource network of Språkbanken. In Proceedings of KONVENS 2012 (LexSem 2012 workshop), pages 452–461. Vienna: ÖGAI.
[14] Lars Borin, Markus Forsberg, Richard Johansson, Kristiina Muhonen, Tanja Purtonen, Kaarlo Voionmaa (2012): Transferring Frames: Utilization of Linked Lexical Resources. In Proceedings of the Workshop on Inducing Linguistic Structure Submission (WILS), pages 8–15. Montréal: ACL.
[15] Dimitrios Kokkinakis, Maria Toporowska Gronostaj (2010): Linking SweFN++ with Medical Resources, towards a MedFrameNet for Swedish. In Proceedings of Louhi at NAACL-HLT 2010, pages 68–71. Los Angeles: ACL.
[16] Dana Dannélls (2010): Applying semantic frame theory to automate natural language templates generation from ontology statements. In Proceedings of INLG 2010, pages 179–184. Dublin: ACL.
[17] Lars Borin, Dana Dannélls, Markus Forsberg, Maria Toporowska Gronostaj, Dimitrios Kokkinakis (2009): Thinking Green: Toward Swedish FrameNet++. Presentation at the FrameNet Masterclass and Workshop in connection with TLT 2009. Milan. |
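The evaluation setup recommended in section II (10-fold cross-validation, scored with precision, recall and F-score) could be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure used by the SweFN authors: the function names are hypothetical, the F-score is taken to be the balanced F1, and predictions are assumed to be comparable to gold annotations as sets of hashable items such as (sentence id, role) pairs.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train, test


def precision_recall_f1(gold, predicted):
    """Score predicted items against gold items, both given as iterables."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F1 (an assumption; the sheet only says "F-score").
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy annotations: (sentence id, frame element label) pairs.
gold = {(1, "Agent"), (1, "Theme"), (2, "Agent")}
predicted = {(1, "Agent"), (2, "Theme")}
p, r, f1 = precision_recall_f1(gold, predicted)
print(p, r, f1)

# Splitting 25 hypothetical sentences into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

In a real experiment, each `train` list would select the sentences used to fit a model and each `test` list the held-out sentences, with the three scores averaged over the ten folds.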