Språkbanken Text is a research infrastructure for language data. We create and maintain language data analysed with state-of-the-art language technology methods for researchers everywhere, and take pride in providing free and open access where possible.
Mink is our effort to put Språkbanken Text’s research infrastructure into the hands of researchers. You can use Mink to apply our language technology methods to texts that you have collected yourself. The resulting data can be downloaded or made available, behind login, through our research tools, such as Korp and Strix.
Your source data and the results are private (see Privacy and data policy). Forthcoming versions will include features for sharing datasets with teams and publishing them publicly.
Read more in the Mink documentation.
The YAML editor on the Custom config page now validates against the Sparv configuration schema, for direct feedback on mistakes.
Read about all changes in the changelog.
Unwanted groups of default annotations can now be disabled.
Better color contrast and other accessibility improvements. The NER option now includes Geotagging. Custom configuration can be uploaded as well as edited directly.
New icon pack. Fixed some loading spinners and broken error messages.
Your language data are analysed by our language technology platform, Sparv, according to your settings.
Upload your word-processor documents, PDFs, plain text files or XML data.
When using XML, the output of the analyses is added while keeping the input structure intact. Read more: Using Mink with custom annotations in source (blog entry in Swedish).
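To make the idea of adding analysis output to XML without disturbing the input structure more concrete, here is a minimal sketch in Python. It is not Sparv's implementation and does not reproduce its actual output format: the function name `add_toy_annotations`, the `token` element, and the `length` attribute are hypothetical stand-ins for real linguistic annotations, and the sketch assumes that text only occurs in leaf elements.

```python
import xml.etree.ElementTree as ET


def add_toy_annotations(xml_source: str) -> str:
    """Wrap each word in a hypothetical <token> element, keeping existing elements and attributes."""
    root = ET.fromstring(xml_source)
    # Materialise the element list first, because the loop adds new elements.
    for elem in list(root.iter()):
        # Simplifying assumption: text only occurs in leaf elements (no mixed content).
        if len(elem) == 0 and elem.text and elem.text.strip():
            words = elem.text.split()
            elem.text = None
            for word in words:
                # "length" is a toy annotation standing in for a real analysis result.
                token = ET.SubElement(elem, "token", {"length": str(len(word))})
                token.text = word
    return ET.tostring(root, encoding="unicode")


source = '<text title="Example"><paragraph id="p1">Hello world</paragraph></text>'
print(add_toy_annotations(source))
```

The original `text` and `paragraph` elements and their attributes survive unchanged; only new elements are added inside them. Whitespace between words is not preserved in this simplified version.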
Invite fellow researchers to view your data in our research tools, protected by login. If you want to share your data with the research community, contact us to discuss publication.
These features are scheduled for upcoming versions of Mink.
Mink or no Mink? Before you start, get acquainted with any existing language data sets related to your interests. Språkbanken's growing collection of research data can be browsed in the Data section of our website. On a larger scale, Språkbanken is part of CLARIN ERIC, whose collected data can be browsed in the Virtual Language Observatory.