Språkbanken Text is a part of Språkbanken.

swe-readability-sparv-ovix

Citation Information

Språkbanken Text (2018). swe-readability-sparv-ovix (updated: 2018-03-28). [Analysis]. Språkbanken Text.
Annotation of Swedish texts with OVIX values, which indicate the difficulty of the texts.

OVIX (ordvariationsindex, "word variation index") is a readability measure based on the relation between the number of unique words (types) and the total number of words (tokens) in the text chunk.

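OVIX is calculated as log(tokens) / log(2 - (log(types) / log(tokens))), where tokens is the total number of words in the text chunk and types is the number of unique words. If every word is unique, types equals tokens, the denominator becomes log(1) = 0, and the value is reported as inf (as in the first example output below).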

A high value can be interpreted as frequently introducing new words to the reader. On the other hand, a low value may indicate a monotonous text.
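
As a rough illustration of the formula, the following Python sketch computes OVIX from a list of word tokens. The function name `ovix`, the case-sensitive counting of types, and the handling of the degenerate cases (infinity when all words are unique, NaN for fewer than two words) are assumptions made for this example; it is not Sparv's actual implementation.

```python
import math


def ovix(words):
    """Return the OVIX value for a list of word tokens.

    Hypothetical helper, not Sparv's code. `words` is assumed to contain
    only word tokens, with punctuation already removed.
    """
    n_tokens = len(words)
    n_types = len(set(words))  # assumption: types are counted case-sensitively
    if n_tokens < 2:
        # log(tokens) is 0 or undefined, so the ratio cannot be computed.
        return math.nan
    denominator = math.log(2 - math.log(n_types) / math.log(n_tokens))
    if denominator == 0:
        # Every word is unique (types == tokens), so log(2 - 1) = 0.
        return math.inf
    return math.log(n_tokens) / denominator


print(ovix(["Det", "här", "är", "en", "enkel", "mening"]))  # inf
```

Under these assumptions, the first sample text in the example output (six unique words) yields inf. The value 94.13 for the second sample text appears consistent with punctuation being excluded: counting the token whose text is missing in the listing as one more unique word gives 21 tokens and 19 types, and the formula then evaluates to about 94.13. This is an observation from the sample, not a documented statement about how Sparv filters tokens.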

Example

This analysis is used with Sparv. Check out Sparv's quick start guide to get started!

To use this analysis, add the following line under export.annotations in the Sparv corpus configuration file:

- <text>:readability.ovix  # OVIX values for text chunks

For more info on how to use Sparv, check out the Sparv documentation.

Example output:

<text ovix="inf">
  <token>Det</token>
  <token>här</token>
  <token>är</token>
  <token>en</token>
  <token>enkel</token>
  <token>mening</token>
  <token>.</token>
</text>
<text ovix="94.13">
  <token>LIX</token>
  <token>(</token>
  <token>Björnsson</token>
  <token>,</token>
  <token>1968</token>
  <token>)</token>
  <token>är</token>
  <token>ett</token>
  <token>läsbarhetsvärde</token>
  <token>beräknat</token>
  <token></token>
  <token>genomsnittligt</token>
  <token>antal</token>
  <token>ord</token>
  <token>per</token>
  <token>mening</token>
  <token>och</token>
  <token>andel</token>
  <token>långa</token>
  <token>ord</token>
  <token>(</token>
  <token>över</token>
  <token>sex</token>
  <token>bokstäver</token>
  <token>långa</token>
  <token>)</token>
  <token>.</token>
</text>
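
To read the OVIX values back from output like the above, a short Python sketch using only the standard library could look as follows. It assumes the export is XML with an `ovix` attribute on each `<text>` element, as in the sample. The wrapping `<corpus>` root element, the shortened inline sample, and the function name `read_ovix` are hypothetical and exist only so that the snippet parses on its own.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical sample: shortened from the example output and wrapped in an
# assumed <corpus> root so that it is a single well-formed XML document.
SAMPLE = """
<corpus>
  <text ovix="inf">
    <token>Det</token>
    <token>här</token>
  </text>
  <text ovix="94.13">
    <token>LIX</token>
    <token>är</token>
  </text>
</corpus>
"""


def read_ovix(xml_string):
    """Return the ovix attribute of every <text> element as a float."""
    root = ET.fromstring(xml_string)
    # float() accepts the string "inf" and returns math.inf.
    return [float(text.get("ovix")) for text in root.iter("text")]


values = read_ovix(SAMPLE)
# Infinite values (texts where every word is unique) are filtered out here
# so that they do not dominate averages or sorting.
finite = [v for v in values if math.isfinite(v)]
print(values)  # [inf, 94.13]
print(finite)  # [94.13]
```

Filtering out non-finite values before aggregating is a design choice for this sketch; very short texts tend to produce inf because they rarely repeat a word.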

Type

  • Analysis

Task

  • readability measures

Unit

  • text

Tool

Sparv

Created

2018-03-28

Updated

2018-03-28

Contact

Språkbanken Text
sb-info@svenska.gu.se