
The composition of PEDANT

PEDANT consists of texts in several languages. It aims to provide a wide range of text types and language pairs so that researchers can build sub-corpora suited to their own purposes. Our initial efforts have not concentrated on fiction. There are several reasons for this, most of them pragmatic. In the first place, copyright questions are more complicated for fiction, and the problems more than double when two languages are involved. Experience has shown that we can often reach an agreement with Swedish publishers, who are familiar with our work and do not distrust our intentions, but reaching the same agreements with publishers abroad is often an insurmountable problem. In the second place, as has already been mentioned, we are allowing the composition of our collection to be guided by the needs of our surroundings. At the moment that means giving high priority to the planned program for translators to be offered in the Faculty of Arts and Humanities. This program will start in the next academic year and will concentrate on the less glamorous texts a translator faces: technical manuals, political documents, financial reports and the like. We feel strongly that translators need to familiarize themselves with computational auxiliary tools. With simple extensions the data in PEDANT could readily be applied in practical situations, so we are determined to involve the system actively in the training program and are building up our initial collection of texts accordingly.

Within PEDANT we will be using full, unabridged texts, since we want to be able to provide every type of information that a text can offer. Most texts will be parallel in pairs, that is, one original text and one translation of it, but some texts are parallel in up to nine languages: one original and translations into eight other languages. For one specific type of text, namely official texts from the European Union, there is no translated text, since officially every language version is an "original". It is nevertheless still possible to align them, since they have the same information and structure in all languages.
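
The arrangement described above can be illustrated with a small sketch. It is not PEDANT's actual storage format or alignment procedure, neither of which is described here; the class name ParallelText, its fields, and the position-based alignment are all assumptions made for illustration. The sketch shows a text held in several languages, with or without a designated original, and a naive alignment that relies only on the versions sharing the same structure.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple


@dataclass
class ParallelText:
    """One text held in several languages (hypothetical layout, not PEDANT's own)."""
    versions: Dict[str, List[str]]   # language code -> paragraphs of the full text
    original: Optional[str] = None   # None when no version counts as the source

    def language_pairs(self) -> List[Tuple[str, str]]:
        """Every unordered pair of languages available for this text."""
        return list(combinations(sorted(self.versions), 2))

    def align(self, lang_a: str, lang_b: str) -> List[Tuple[str, str]]:
        """Pair paragraphs by position; assumes both versions share one structure."""
        a, b = self.versions[lang_a], self.versions[lang_b]
        if len(a) != len(b):
            raise ValueError("versions differ in paragraph count")
        return list(zip(a, b))


# Invented sample data: an EU-style text where no version is marked as original.
eu_text = ParallelText(
    versions={
        "sv": ["Artikel 1", "Artikel 2"],
        "en": ["Article 1", "Article 2"],
        "de": ["Artikel 1", "Artikel 2"],
    },
    original=None,
)

pairs = eu_text.language_pairs()
aligned = eu_text.align("sv", "en")
```

Under this layout a text available in nine languages yields 36 distinct language pairs, which is one way such texts could feed many pairwise sub-corpora. Real alignment would need more than paragraph position, since translations often split or merge sentences.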



Daniel Ridings
Sun Mar 31 09:05:43 METDST 1996