sv-COVID-19

Data citation

Språkbanken (2023). sv-COVID-19 (updated: 2023-05-29). [Data set]. Enriched and distributed by Språkbanken. https://doi.org/10.23695/k6fh-4f59

A compilation of various articles related to the COVID-19 pandemic

sv-covid-19 is a collection of Swedish news texts, scientific and popular science articles, and articles from certain blogs and social media such as Flashback and Twitter, published from the beginning of the coronavirus pandemic (early 2020) onward. The latest version of the corpus consists of approximately eight million words and 9,000 articles. The corpus contains various text types and texts with different stylistic levels. The texts have been annotated with part-of-speech tags, morphological analysis and lemmas, as well as some structural and functional information, such as author names.

References

Dimitrios Kokkinakis (2021): Insights on a Swedish Covid-19 corpus. In: CLARIN Annual Conference (Virtual Event), 27–29 September 2021. Monica Monachini, Maria Eskevich (eds.), pp. 31–34.

Accessible through

Access	Platform	Licence
https://spraakbanken.gu.se/korp/#?corpus=sv-covid-19 (scrambled)		CC-BY-4.0

Download

File	Size	Modified	Licence
sv-covid-19.xml.bz2 corpus (XML, scrambled)	216.31 MB	2025-02-20	CC-BY-4.0
stats_sv-covid-19.csv.zip word statistics (CSV)	2.45 MB	2025-04-22	CC-BY-4.0
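
The corpus file listed above is bz2-compressed XML, so it can be inspected without unpacking it to disk first. The following is a minimal sketch that uses only the Python standard library; the function name count_elements is hypothetical, and the sketch deliberately assumes nothing about the element or attribute names used in the file. It simply tallies whichever tags it encounters, which is a reasonable first step for discovering the actual markup.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Stream a bz2-compressed XML file and count how often each tag occurs.

    Hypothetical helper: no element names are assumed, every tag found
    in the file is counted as-is.
    """
    counts = Counter()
    # bz2.open decompresses on the fly, so the archive is never fully unpacked.
    with bz2.open(path, "rb") as handle:
        # "end" events fire once an element has been read completely.
        for _event, elem in ET.iterparse(handle, events=("end",)):
            counts[elem.tag] += 1
            # Discard the element's text, attributes and children to limit memory use.
            elem.clear()
    return counts
```

Calling count_elements("sv-covid-19.xml.bz2") on the downloaded file returns a Counter keyed by tag name; its most_common() method lists the tags by frequency.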
