Språkbanken Text is a department within Språkbanken.

Analyses

Search our analyses. You can click on a row to see the details.
Analysis Type Collections Task Unit Language
eng-dependency-stanza
Dependency parsing with Stanza's standard model for English
Analysis dependency parsing token English
eng-lemmatization-stanza
Lemmatization with Stanza's standard model for English
Analysis lemmatization token English
eng-msd-stanza-ufeats
Stanza-based morphological analysis for English, using universal features (UD)
Analysis morphosyntactic tagging token English
eng-namedentity-stanza
Named entity recognition with Stanza's standard model for English
Analysis named entity recognition English
eng-pos-stanza
Part-of-speech annotation with Penn Treebank tags, using Stanza's standard model for English
Analysis part-of-speech tagging token English
eng-pos-stanza-upos
Part-of-speech annotation with UD (Universal Dependencies) tags, using Stanza's standard model for English
Analysis part-of-speech tagging token English
eng-sentence-stanza
Sentence segmentation with Stanza's standard model for English
Analysis sentence segmentation sentence English
eng-tokenization-stanza
Tokenization with Stanza's standard model for English
Analysis tokenization token English
export-conllu
Export of corpus data in Språkbanken Text's CoNLL-U format
Utility export
export-xml-preserved
XML corpus export preserving whitespace from the source file
Utility export
export-xml-pretty
XML corpus export where every token is printed on a new line
Utility export
export-xml-scrambled
XML corpus export with scrambled contents
Utility export
mink-analyses
Collection of analyses used in Mink
Analysis, Collection Swedish
paragraph-sparv-blanklines
Segments text into paragraphs by blank lines using the RegexpTokenizer from NLTK
Analysis tokenization paragraph
paragraph-sparv-linebreaks
Segments text into paragraphs by linebreaks using the RegexpTokenizer from NLTK
Analysis paragraph segmentation paragraph
paragraph-sparv-whitespace
Segments text into paragraphs by whitespace using the RegexpTokenizer from NLTK
Analysis paragraph segmentation paragraph
sentence-punkt
Segments text into sentences by punctuation marks using the RegexpTokenizer from NLTK
Analysis sentence segmentation sentence
sentence-sparv-blanklines
Segments text into sentences by blank lines using the RegexpTokenizer from NLTK
Analysis tokenization sentence
sentence-sparv-linebreaks
Segments text into sentences by linebreaks using the RegexpTokenizer from NLTK
Analysis sentence segmentation sentence
sentence-sparv-whitespace
Segments text into sentences by whitespace using the RegexpTokenizer from NLTK
Analysis sentence segmentation sentence
standard-analyses-swe
Collection of Sparv analyses for modern Swedish
Analysis, Collection Swedish
swe-compound-sparv-saldolemgram
Analysis of SALDO lemgram compounds including a probability ranking
Analysis mink-analyses, standard-analyses-swe compound analysis token Swedish
swe-compound-sparv-saldowords
Analysis of SALDO wordform compounds
Analysis mink-analyses, standard-analyses-swe compound analysis token Swedish
swe-dependency-malt-treebank
Swedish dependency parsing with MaltParser trained on a Swedish treebank
Analysis dependency parsing token Swedish
swe-dependency-stanza-stanzasynt
Swedish dependency parsing with Stanza trained on a Swedish treebank
Analysis mink-analyses, standard-analyses-swe dependency parsing token Swedish
swe-geotagcontext-sparv
Annotate text chunks with location data, based on locations contained within the text
Analysis standard-analyses-swe geotagging text Swedish
swe-geotagmetadata-sparv
Annotate text chunks with location data, based on metadata containing location names
Analysis geotagging text Swedish
swe-lemgram-sparv-saldo
Lookup for SALDO lemgrams
Analysis mink-analyses, standard-analyses-swe lexical lookup token Swedish
swe-lemmatization-sparv-saldo
Full-form lookup for SALDO citation forms (lemmas)
Analysis lemmatization token Swedish
swe-lemmatization-sparv-saldo2
Full-form lookup for SALDO citation forms (lemmas) plus analysis of compounds made up of SALDO entries
Analysis mink-analyses, standard-analyses-swe lemmatization token Swedish
swe-lemmatization-stanza-stanzalem
Swedish citation form analysis (base forms, lemmas) by Stanza, trained on SUC3
Analysis lemmatization token Swedish
swe-lexical_classes_text-sparv-blingbring
Lexical classes from Blingbring on text-level
Analysis mink-analyses, standard-analyses-swe lexical classes text Swedish
swe-lexical_classes_text-sparv-swefn
Lexical classes from SweFN on text-level
Analysis mink-analyses, standard-analyses-swe lexical classes text Swedish
swe-lexical_classes_token-sparv-blingbring
Lexical classes from Blingbring on token-level
Analysis mink-analyses, standard-analyses-swe lexical classes token Swedish
swe-lexical_classes_token-sparv-swefn
Lexical classes from SweFN on token-level
Analysis mink-analyses, standard-analyses-swe lexical classes token Swedish
swe-msd-hunpos-suc3
Annotation of morphological features (SUC) by Hunpos for Swedish
Analysis morphosyntactic tagging token Swedish
swe-msd-hunpos-suc3-1800
Annotation of morphological features (SUC) by Hunpos for Swedish from the 1800s
Analysis morphosyntactic tagging token Swedish
swe-msd-stanza-stanzamorph-suc3
Annotation of morphological features (SUC) by Stanza for Swedish
Analysis mink-analyses, standard-analyses-swe morphosyntactic tagging token Swedish
swe-msd-stanza-stanzamorph-ufeats
Stanza-based morphological analysis for Swedish, using universal features (UD)
Analysis mink-analyses, standard-analyses-swe morphosyntactic tagging token Swedish
swe-namedentity-swener
Named entity recognition (NER) recognises named entities such as locations, persons and time expressions in text.
Analysis mink-analyses, standard-analyses-swe named entity recognition Swedish
swe-phrasestructure-sparv
Swedish phrase structure parsing based on Mamba-Dep dependency analysis
Analysis phrase structure parsing Swedish
swe-pos-hunpos-suc3
Swedish part-of-speech annotation with SUC tags by Hunpos
Analysis part-of-speech tagging token Swedish
swe-pos-hunpos-suc3-1800
Part-of-speech annotation with SUC tags by Hunpos for Swedish from the 1800s
Analysis part-of-speech tagging token Swedish
swe-pos-stanza-stanzamorph
Swedish part-of-speech annotation with SUC tags by Stanza
Analysis mink-analyses, standard-analyses-swe part-of-speech tagging token Swedish
swe-readability-sparv-lix
Annotation of Swedish texts with LIX values which indicate the difficulty of the texts
Analysis mink-analyses, standard-analyses-swe readability measures text Swedish
swe-readability-sparv-nk
Annotation of Swedish texts with nominal ratios which indicate the difficulty of the texts
Analysis mink-analyses, standard-analyses-swe readability measures text Swedish
swe-readability-sparv-ovix
Annotation of Swedish texts with OVIX values which indicate the difficulty of the texts
Analysis mink-analyses, standard-analyses-swe readability measures text Swedish
swe-sbx-ocr-correction-viklofg-sweocr
OCR correction annotations
Analysis ocr-correction
swe-sbx-word-prediction-kb-bert
Word prediction annotations for each word in a text.
Analysis word-prediction token
swe-sense-sparv-saldo
Lookup for SALDO identifiers
Analysis lexical lookup token Swedish
swe-sense-wsd
Word sense disambiguation based on SALDO annotation
Analysis mink-analyses, standard-analyses-swe sense disambiguation token Swedish
swe-sentence-punkt-storsuc
Segments text into sentences, custom-made for Swedish
Analysis mink-analyses, standard-analyses-swe sentence segmentation sentence Swedish
swe-sentiment-sparv-sensaldo
Sentiment analysis via lookup in SenSALDO
Analysis mink-analyses, standard-analyses-swe sentiment analysis token Swedish
swe-tokenization-sparv-betterword
Tokenizes text, custom-made for Swedish
Analysis mink-analyses, standard-analyses-swe tokenization token Swedish
tokenization-sparv-blanklines
Tokenizes text into tokens by blank lines using the RegexpTokenizer from NLTK
Analysis tokenization token
tokenization-sparv-linebreaks
Tokenizes text into tokens by linebreaks using the RegexpTokenizer from NLTK
Analysis tokenization token
tokenization-sparv-whitespace
Tokenizes text into tokens by whitespace using the RegexpTokenizer from NLTK
Analysis tokenization token
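
Several entries above split text by blank lines, linebreaks or whitespace using a regular-expression tokenizer. The following is a minimal sketch of that idea, not Sparv's actual implementation. It uses `re.split` from the Python standard library as a stand-in for NLTK's `RegexpTokenizer`, which treats its pattern as a separator when created with `gaps=True`. The separator patterns, the `SEPARATORS` table and the `segment` function are assumptions made for illustration.

```python
import re

# Separator patterns per segmentation mode.
# These are illustrative assumptions, not the patterns Sparv is configured with.
SEPARATORS = {
    "blanklines": r"\s*\n\s*\n\s*",  # one or more blank lines
    "linebreaks": r"\s*\n\s*",       # any run containing a line break
    "whitespace": r"\s+",            # any run of whitespace
}


def segment(text: str, mode: str) -> list[str]:
    """Split text on the separator pattern for `mode`, dropping empty segments."""
    pattern = SEPARATORS[mode]
    return [part for part in re.split(pattern, text) if part]


sample = "First line\nsecond line\n\nNew paragraph"

print(segment(sample, "blanklines"))
print(segment(sample, "linebreaks"))
print(segment(sample, "whitespace"))
```

The same separator can yield paragraphs, sentences or tokens depending on which unit the resulting segments are assigned to, which is why the catalog lists each splitting rule once per unit.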