SUC Novels (StorSUC)

Information

License: Custom (own license), http://k2xx.spraakdata.gu.se/stb/om/suc-license.pdf
Number of tokens: 4653743

Search

Korp
Web service
Statistics
Old web interface (via Glossa)

Downloads

XML (citat)

Metadata

Metadata as XML
Metadata as JSON

The Stockholm-Umeå Corpus (SUC) is a collection of Swedish texts from the 1990s, consisting of one million words in total. The corpus is balanced, meaning that it contains various text types and stylistic levels. The texts are annotated with part-of-speech tags, morphological analysis, and lemmas, as well as some structural and functional information.

Version 1.0 was developed in co-operation between Gunnel Källgren at Stockholm University and Eva Ejerhed at Umeå University, and was made available in 1997 by the Department of Linguistics at Stockholm University. Version 2.0 was made available in 2006 by Sofia Gustafson-Capková and Britt Hartmann at the Department of Linguistics at Stockholm University. It contains the same texts as SUC 1.0 but with some added annotation. SUC 2.0 also contains bonus materials. TigerSUC is SUC 2.0 converted to TIGER-XML by Martin Volk. StorSUC is additional SUC material comprising 4 million words.

SUC is freely available for research, but every user must sign an individual license with the Department of Linguistics at Stockholm University. Since December 1, 2008, SUC licensing has been delegated to Språkbanken at the University of Gothenburg.

The SUC license (in PDF format) must be printed, signed, and sent to:

SUC-licens
Språkbanken
Institutionen för svenska språket
Göteborgs universitet
Box 200
405 30 Göteborg

Additional information

© University of Gothenburg 2009, Box 100, 405 30 Gothenburg, Sweden
Tel +46 31 786 0000, Contact
