Mink

What is Mink?

Språkbanken Text is a research infrastructure for language data. We provide digital text data suitable for research, and we develop analysis tools based on language technology.

With Mink, you can submit your own text data directly into our tools.

Use Mink at spraakbanken.gu.se/mink

Who can use Mink?

Anyone with an eduGAIN account can use it to log in to Mink. This includes most people affiliated with a university or other academic institution.

Other users can create an account at eduID, which is itself connected to eduGAIN.

We are working toward a "demo" version of Mink that will have some limitations but can be used without logging in.

We also want to extend our authentication solution to allow other accounts such as Swedish BankID, Google and the classic email-password combination.

What can Mink do?

This first version of Mink targets a specific workflow:

  1. Create a corpus of uploaded text files
  2. Run automatic annotation
  3. Use the results in Korp or in Strix, or as XML/CSV files

Supported formats for text files are:

  • plain text (.txt)
  • XML
  • Microsoft Word (.docx)
  • Open Document (.odt)
  • PDF
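
If you are preparing many files for upload, it can help to check them against the list above first. The following is a minimal sketch in Python and is not part of Mink. The function name `split_by_support` is hypothetical, and the `.xml` and `.pdf` extensions are assumptions, since the list names those formats without giving an extension. The check looks only at file names, not at file contents.

```python
from pathlib import Path

# Extensions taken from the list of supported formats.
# ".xml" and ".pdf" are assumed; the list names those formats without an extension.
SUPPORTED_EXTENSIONS = {".txt", ".xml", ".docx", ".odt", ".pdf"}


def split_by_support(paths):
    """Split paths into (supported, unsupported) lists, judged by extension only."""
    supported, unsupported = [], []
    for path in map(Path, paths):
        # Compare case-insensitively so that "REPORT.DOCX" is treated like "report.docx".
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["interview.txt", "REPORT.DOCX", "notes.rtf"])
    print("Ready to upload:", [p.name for p in ok])
    print("Convert first:", [p.name for p in skipped])
```

Files that end up in the second list would need to be converted to one of the supported formats before they are added to a corpus.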

The annotation pipeline includes:

  • Part-of-speech tags (POS)
  • Base form (lemma)
  • Morphosyntactic tags (MSD)
  • Dependencies
  • Sentiment labels

Upcoming features

Some future development goals for Mink are extended annotation settings, sharing and publishing, and workflows for other types of language data, such as lexicons.