sv-covid-19 is a collection of Swedish news texts, scientific and popular science articles, and articles from certain blogs and social media such as Flashback and Twitter, published from the beginning of the coronavirus pandemic (early 2020) onward. The latest version of the corpus consists of approximately eight million words and 9,000 articles. The corpus spans various text types and stylistic levels. The texts have been annotated with part-of-speech tags, morphological analysis and lemmas, as well as some structural and functional information, such as author names.
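As a rough illustration of how token-level annotation of this kind can be read, the sketch below parses a small, made-up XML fragment and collects the part-of-speech tag, morphological tag and lemma for each token, together with the author of the enclosing text. The element and attribute names (`text`, `sentence`, `w`, `pos`, `msd`, `lemma`, `author`), the pipe-delimited lemma values and the sample tokens are assumptions for the example and are not taken from the corpus files; check the downloaded files for the actual layout.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: element names, attribute names and tag values are
# assumptions for illustration, not copied from the sv-covid-19 files.
SAMPLE = """<corpus id="example">
  <text author="Example Author">
    <sentence>
      <w pos="NN" msd="NN.NEU.SIN.DEF.NOM" lemma="|virus|">Viruset</w>
      <w pos="VB" msd="VB.PRS.AKT" lemma="|sprida|">sprider</w>
      <w pos="PN" msd="PN.OBJ" lemma="|sig|">sig</w>
    </sentence>
  </text>
</corpus>"""


def read_tokens(xml_string):
    """Return one dict per token with its annotations and the text's author."""
    root = ET.fromstring(xml_string)
    tokens = []
    for text in root.iter("text"):
        author = text.get("author")
        for w in text.iter("w"):
            # Assumed convention: lemmas are pipe-delimited, e.g. "|virus|".
            lemmas = [part for part in (w.get("lemma") or "").split("|") if part]
            tokens.append({
                "word": w.text,
                "pos": w.get("pos"),
                "msd": w.get("msd"),
                "lemmas": lemmas,
                "author": author,
            })
    return tokens


tokens = read_tokens(SAMPLE)
pos_counts = Counter(token["pos"] for token in tokens)
for token in tokens:
    print(token["word"], token["pos"], token["lemmas"])
```

For files of the size listed on this page, a streaming parser such as `xml.etree.ElementTree.iterparse` would likely be a better fit than loading the whole document into memory.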
Citation
Språkbanken Text (2023). sv-COVID-19 (updated: 2023-05-29). [Data set]. Språkbanken Text. https://doi.org/10.23695/k6fh-4f59
A compilation of various articles related to the COVID-19 pandemic
References
Dimitrios Kokkinakis (2021): Insights on a Swedish Covid-19 corpus. In Monica Monachini, Maria Eskevich (eds.), CLARIN Annual Conference (Virtual Event), 27–29 September 2021, pp. 31-34.
File | Size | Modified | Licence |
---|---|---|---|
| | 200.6 MB | 2023-05-29 | CC BY 4.0 (attribution) |
| | 12.47 MB | 2023-05-29 | CC BY 4.0 (attribution) |