Sparv-Superlim is a Sparv plugin that uses the reference models trained for the Superlim multi-task benchmark to provide ten additional NLU annotations to the Sparv Pipeline, a text analysis platform developed by Språkbanken Text. The additional annotations are listed in the table below and range from the author's stance on immigration to inference and semantic relationships between consecutive sentences:
Superlim task | Sparv-Superlim Annotation | Description | Label | Segment |
---|---|---|---|---|
absabank-imm | migration_stance | Attitude towards immigration | float between 1 and 5 | sentence |
argumentation-sentences | [topic]_stance | Stance to a given topic | pro, con or neutral | sentence |
dalaj-ged | correct_swedish | Correct Swedish | correct or incorrect | sentence |
swenli | previous_entailment | The logical relationship of two sentences | entailment, contradiction or neutral | sentence pair |
sweparaphrase | similarity | Similarity between two sentences | float between 1 and 5 | sentence pair |
Not only does this plugin help corpus linguists and other researchers analyze the content of texts using more sentence-level features, it is also a way to test the robustness of the Superlim models and benchmark on novel data. For example, in the standard reference for Sparv-Superlim, I use it to analyze how the positions of political parties change over time, based on party manifestos. The results show that the model trained on absabank-imm can identify changes in the Swedish Moderate Party's attitude towards migration (Table 2), while the predictions of a model trained on argumentation_sentences do not seem to align with the known stances of the Center Party and the Liberals on nuclear energy (Table 5).
These results illustrate the importance of integrating popular benchmarks into text analysis platforms in order to see how they actually perform "in the wild".
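
To make the trend analysis above more concrete, here is a minimal sketch of how sentence-level scores could be averaged per year once they have been exported from a corpus. It does not use the Sparv or Sparv-Superlim API: the function name `mean_score_by_year`, the `(year, score)` input format, and the toy values are all assumptions made for illustration, not data or code from the standard reference.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_year(sentences):
    """Average a sentence-level score per year.

    `sentences` is an iterable of (year, score) pairs, where `score` is
    assumed to be a float between 1 and 5, like the migration_stance label.
    This input format is hypothetical and not Sparv's export format.
    """
    scores_by_year = defaultdict(list)
    for year, score in sentences:
        scores_by_year[year].append(score)
    # Sort by year so the result reads as a timeline.
    return {year: mean(scores) for year, scores in sorted(scores_by_year.items())}


# Invented placeholder values, only to show the shape of the input.
toy_sentences = [(2010, 4.0), (2010, 3.0), (2018, 2.0), (2018, 1.0)]
print(mean_score_by_year(toy_sentences))  # {2010: 3.5, 2018: 1.5}
```

Averaging per year is only one possible aggregation; the paper referenced below describes the analysis that was actually carried out.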
You can read more about the plugin and the details of these two use cases in the standard reference, "When Sparv met Superlim... A Sparv Plugin for Natural Language Understanding Analysis of Swedish" (Morger, 2024). Alternatively, if you want to use or contribute to the plugin, go to the official repository on GitHub.