* The Annotation

- Terminal level:

  id: word identifier [obligatory]
  word: word form [obligatory]
  correction: obvious word form errors [optional]
  pos: part-of-speech [obligatory]
  msd: part-of-speech specific morpho-syntactic features [obligatory]
  msd2: general morpho-syntactic features [obligatory]
  lemma: currently empty [optional]
  saldo: Saldo (v2.0) identifier [optional]
  sicid: sic identifier for blog tokens [optional]
  sense: Saldo sense identifier [optional]
  sense_quality: a confidence value for 'sense', where '4' means
      manually checked, '3' means annotated by at least two
      annotators without conflict, '2' means annotated by one
      annotator, '1' means multiple conflicting annotations with one
      majority alternative, and '0' means multiple conflicting
      annotations without a majority alternative.
  sense_ann: sense annotations, where the first four positions hold
      the annotations of different annotators and the fifth holds the
      output of an automatic heuristic, mainly for auxiliary 'ha'
      ('have') and existential 'det' ('it') [optional]
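
The 'sense_quality' scale above can be read as a function of the human
annotations for a token. The following Python sketch is only an
illustration of that reading, not the procedure actually used to build
the resource. The function name, the list representation of the
annotations (one entry per annotator, None where an annotator gave no
sense), the example sense identifiers, and the interpretation of
"majority alternative" as a single most frequent sense are all
assumptions.

```python
from collections import Counter


def sense_quality(annotations, manually_checked=False):
    """Return a confidence value on the 0-4 scale described above.

    `annotations` is a hypothetical list with one sense identifier per
    human annotator, or None where an annotator gave no sense.
    """
    if manually_checked:
        return 4
    labels = [a for a in annotations if a is not None]
    if not labels:
        return None  # no human annotation: the scale does not apply
    counts = Counter(labels)
    if len(counts) == 1:
        # No conflict: 3 if at least two annotators agree, else 2.
        return 3 if len(labels) >= 2 else 2
    # Conflict: assume "majority" means one sense is strictly most frequent.
    (_, first), (_, second) = counts.most_common(2)
    return 1 if first > second else 0
```

For example, under these assumptions sense_quality(["ha..1", "ha..1",
"ha..2", None]) yields 1, since the annotators conflict but one sense
is more frequent than the other.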


- Nonterminal level:

  nt id: nonterminal (phrase) identifier [obligatory]
  nt cat: phrase label [obligatory]
  flags: ???
  edge idref: child identifier [obligatory]
  edge label: function label [obligatory]
  secedge idref: secondary edge child identifier [obligatory]
  secedge label: secondary edge function label [obligatory]
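
The attribute names above resemble a TIGER-XML-style serialization, in
which nonterminals point to their children through 'edge idref'. The
Python sketch below assumes such a layout in order to show how the
words covered by a phrase could be collected. The element names
('t', 'nt', 'edge'), the identifiers, and the category and function
labels in the sample are hypothetical and may not match the actual
annotation files.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; real files may use a different layout and labels.
SAMPLE = """
<s id="s1">
  <graph root="s1_501">
    <terminals>
      <t id="s1_1" word="Hon" pos="PN" />
      <t id="s1_2" word="sover" pos="VB" />
    </terminals>
    <nonterminals>
      <nt id="s1_501" cat="S">
        <edge idref="s1_1" label="SB" />
        <edge idref="s1_2" label="HD" />
      </nt>
    </nonterminals>
  </graph>
</s>
"""


def phrase_yield(root, node_id):
    """Collect the word forms below `node_id`, following primary edges only.

    Words are returned in edge order, which need not be surface order.
    """
    words = {t.get("id"): t.get("word") for t in root.iter("t")}
    nonterminals = {nt.get("id"): nt for nt in root.iter("nt")}

    def walk(current):
        if current in words:  # terminal: return its word form
            return [words[current]]
        collected = []
        for edge in nonterminals[current].findall("edge"):
            collected.extend(walk(edge.get("idref")))
        return collected

    return walk(node_id)


sentence = ET.fromstring(SAMPLE.strip())
print(phrase_yield(sentence, "s1_501"))
```

Secondary edges ('secedge') are deliberately ignored here, so that each
terminal is reached through one path only.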


* Relation to the text source

The tokens in the annotation files in the Annotation directory are
guaranteed to map onto the text files in the Sources directory, with
any amount of whitespace allowed between consecutive tokens. Tokens
may themselves contain whitespace, in which case that whitespace must
occur in the source text exactly as it appears in the token.
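
The mapping described above can be checked mechanically. The following
Python sketch walks through a source text, skips whitespace between
tokens, and requires each token to match the text verbatim. The
function name and the example data are hypothetical, and the sketch
assumes that no token begins with whitespace and that the source
contains nothing but the tokens and whitespace.

```python
def align_tokens(tokens, source):
    """Return the (start, end) character offsets of each token in `source`.

    Whitespace between tokens is skipped; whitespace inside a token must
    match the source exactly. Raises ValueError if the mapping fails.
    """
    spans = []
    pos = 0
    for token in tokens:
        # Skip the optional whitespace before the next token.
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if not source.startswith(token, pos):
            raise ValueError(f"token {token!r} not found at offset {pos}")
        spans.append((pos, pos + len(token)))
        pos += len(token)
    if source[pos:].strip():
        raise ValueError(f"unmatched source text from offset {pos}")
    return spans


print(align_tokens(["Hej", ",", "New York", "!"], "Hej ,  New York!\n"))
```

With this check, the token 'New York' does not align with a source
containing 'New  York' (two spaces), since whitespace inside a token
has to occur in the source as is.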
