Språkbanken Text is a part of Språkbanken.

Analyses

Analysis Collections Task Unit Language
eng-dependency-stanza
Dependency parsing with Stanza's standard model for English
dependency parsing token English
eng-lemmatization-stanza
Lemmatization with Stanza's standard model for English
lemmatization token English
eng-msd-stanza-ufeats
Stanza-based morphological analysis for English, using universal features (UD)
morphosyntactic tagging token English
eng-namedentity-stanza
Named entity recognition with Stanza's standard model for English
named entity recognition English
eng-pos-stanza
Part-of-speech annotation with Penn Treebank tags using Stanza's standard model for English
part-of-speech tagging token English
eng-pos-stanza-upos
Part-of-speech annotation with Universal Dependencies (UD) tags using Stanza's standard model for English
part-of-speech tagging token English
eng-sentence-stanza
Sentence segmentation with Stanza's standard model for English
sentence segmentation sentence English
eng-tokenization-stanza
Tokenization with Stanza's standard model for English
tokenization token English
Collection
mink-analyses
Collection of analyses used in Mink
Swedish
paragraph-sparv-blanklines
Segments text into paragraphs by blank lines using the RegexpTokenizer from NLTK
tokenization paragraph
paragraph-sparv-linebreaks
Segments text into paragraphs by linebreaks using the RegexpTokenizer from NLTK
paragraph segmentation paragraph
paragraph-sparv-whitespace
Segments text into paragraphs by whitespace using the RegexpTokenizer from NLTK
paragraph segmentation paragraph
sentence-punkt
Segments text into sentences by punctuation marks using the RegexpTokenizer from NLTK
sentence segmentation sentence
sentence-sparv-blanklines
Segments text into sentences by blank lines using the RegexpTokenizer from NLTK
tokenization sentence
sentence-sparv-linebreaks
Segments text into sentences by linebreaks using the RegexpTokenizer from NLTK
sentence segmentation sentence
sentence-sparv-whitespace
Segments text into sentences by whitespace using the RegexpTokenizer from NLTK
sentence segmentation sentence
Collection
standard-analyses-swe
Collection of Sparv analyses for modern Swedish
Swedish
swe-compound-sparv-saldolemgram
Analysis of SALDO lemgram compounds including a probability ranking
mink-analyses, standard-analyses-swe compound analysis token Swedish
swe-compound-sparv-saldowords
Analysis of SALDO wordform compounds
mink-analyses, standard-analyses-swe compound analysis token Swedish
swe-dependency-malt-treebank
Swedish dependency parsing with MaltParser, trained on a Swedish treebank
dependency parsing token Swedish
swe-dependency-stanza-stanzasynt
Swedish dependency parsing with Stanza, trained on a Swedish treebank
mink-analyses, standard-analyses-swe dependency parsing token Swedish
swe-geotagcontext-sparv
Annotate text chunks with location data, based on locations contained within the text
standard-analyses-swe geotagging text Swedish
swe-geotagmetadata-sparv
Annotate text chunks with location data, based on metadata containing location names
geotagging text Swedish
swe-lemgram-sparv-saldo
Lookup for SALDO lemgrams
mink-analyses, standard-analyses-swe lexical lookup token Swedish
swe-lemmatization-sparv-saldo
Full-form lookup for SALDO citation forms (lemmas)
lemmatization token Swedish
swe-lemmatization-sparv-saldo2
Full-form lookup for SALDO citation forms (lemmas) plus analysis of compounds made up of SALDO entries
mink-analyses, standard-analyses-swe lemmatization token Swedish
swe-lemmatization-stanza-stanzalem
Swedish citation form analysis (base forms, lemmas) by Stanza, trained on SUC3
lemmatization token Swedish
swe-lexical_classes_text-sparv-blingbring
Lexical classes from Blingbring on text-level
mink-analyses, standard-analyses-swe lexical classes text Swedish
swe-lexical_classes_text-sparv-swefn
Lexical classes from SweFN on text-level
mink-analyses, standard-analyses-swe lexical classes text Swedish
swe-lexical_classes_token-sparv-blingbring
Lexical classes from Blingbring on token-level
mink-analyses, standard-analyses-swe lexical classes token Swedish
swe-lexical_classes_token-sparv-swefn
Lexical classes from SweFN on token-level
mink-analyses, standard-analyses-swe lexical classes token Swedish
swe-msd-hunpos-suc3
Annotation of morphological features (SUC) by Hunpos for Swedish
morphosyntactic tagging token Swedish
swe-msd-hunpos-suc3-1800
Annotation of morphological features (SUC) by Hunpos for Swedish from the 1800s
morphosyntactic tagging token Swedish
swe-msd-stanza-stanzamorph-suc3
Annotation of morphological features (SUC) by Stanza for Swedish
mink-analyses, standard-analyses-swe morphosyntactic tagging token Swedish
swe-msd-stanza-stanzamorph-ufeats
Stanza-based morphological analysis for Swedish, using universal features (UD)
mink-analyses, standard-analyses-swe morphosyntactic tagging token Swedish
swe-namedentity-swener
Named entity recognition (NER) of entities such as locations, persons and time expressions in text
mink-analyses, standard-analyses-swe named entity recognition Swedish
swe-phrasestructure-sparv
Swedish phrase structure parsing based on Mamba-Dep dependency analysis
phrase structure parsing Swedish
swe-pos-hunpos-suc3
Swedish part-of-speech annotation with SUC tags by Hunpos
part-of-speech tagging token Swedish
swe-pos-hunpos-suc3-1800
Part-of-speech annotation with SUC tags by Hunpos for Swedish from the 1800s
part-of-speech tagging token Swedish
swe-pos-stanza-stanzamorph
Swedish part-of-speech annotation with SUC tags by Stanza
mink-analyses, standard-analyses-swe part-of-speech tagging token Swedish
swe-readability-sparv-lix
Annotation of Swedish texts with LIX values which indicate the difficulty of the texts
mink-analyses, standard-analyses-swe readability measures text Swedish
swe-readability-sparv-nk
Annotation of Swedish texts with nominal ratios which indicate the difficulty of the texts
mink-analyses, standard-analyses-swe readability measures text Swedish
swe-readability-sparv-ovix
Annotation of Swedish texts with OVIX values which indicate the difficulty of the texts
mink-analyses, standard-analyses-swe readability measures text Swedish
swe-sbx-ocr-correction-viklofg-sweocr
OCR correction annotations
ocr-correction
swe-sbx-word-prediction-kb-bert
Word prediction annotations for each word in a text.
word-prediction token
swe-sense-sparv-saldo
Lookup for SALDO identifiers
lexical lookup token Swedish
swe-sense-wsd
Word sense disambiguation based on SALDO annotation
mink-analyses, standard-analyses-swe sense disambiguation token Swedish
swe-sentence-punkt-storsuc
Segments text into sentences, custom-made for Swedish
mink-analyses, standard-analyses-swe sentence segmentation sentence Swedish
swe-sentiment-sparv-sensaldo
Sentiment analysis via lookup in SenSALDO
mink-analyses, standard-analyses-swe sentiment analysis token Swedish
swe-tokenization-sparv-betterword
Tokenizes text, custom-made for Swedish
mink-analyses, standard-analyses-swe tokenization token Swedish
tokenization-sparv-blanklines
Tokenizes text into tokens by blank lines using the RegexpTokenizer from NLTK
tokenization token
tokenization-sparv-linebreaks
Tokenizes text into tokens by linebreaks using the RegexpTokenizer from NLTK
tokenization token
tokenization-sparv-whitespace
Tokenizes text into tokens by whitespace using the RegexpTokenizer from NLTK
tokenization token
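
Several entries above segment text by blank lines, linebreaks or whitespace with a regular-expression tokenizer. The following is a minimal sketch of that idea, not the Sparv implementation: it uses Python's standard `re` module as a stand-in for NLTK's RegexpTokenizer, and the separator patterns, the `SEPARATORS` table and the `segment` function are assumptions made for illustration, since this listing does not give the patterns the analyses actually use.

```python
import re

# Hypothetical separator patterns, one per segmentation mode.
# The patterns actually used by the listed analyses are not stated here.
SEPARATORS = {
    "blanklines": re.compile(r"\s*\n\s*\n\s*"),  # one or more empty lines
    "linebreaks": re.compile(r"\s*\n\s*"),       # any line break
    "whitespace": re.compile(r"\s+"),            # any run of whitespace
}


def segment(text: str, mode: str) -> list[str]:
    """Split text on the separator for the given mode, dropping empty segments."""
    return [part for part in SEPARATORS[mode].split(text) if part]


sample = "First line.\nSecond line.\n\nNew paragraph."

print(segment(sample, "blanklines"))
print(segment(sample, "linebreaks"))
print(segment(sample, "whitespace"))
```

The same text yields two, three or six segments depending on the mode, which is why the listing offers each separator at paragraph, sentence and token level.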