
BibTeX

@inProceedings{ljunglof-etal-2024-binary-342402,
	title        = {Binary indexes for optimising corpus queries},
	abstract     = {To be able to search for patterns in annotated text corpora is crucial for many different research disciplines. However, searching for complex patterns in large corpora can take a long time – sometimes several minutes or even hours.

We investigate how inverted indexes, and in particular binary indexes, can be used for efficient searching in large annotated corpora. We show how corpus queries are translated into lookups in unary and binary inverted indexes, and give strategies for combining the results using efficient set operations. In addition, we discuss how to make use of binary indexes for more complex query types.},
	booktitle    = {Proceedings of the 20th Conference on Natural Language Processing (KONVENS 2024), September 10-13, 2024, Vienna, Austria},
	author       = {Ljunglöf, Peter and Smallbone, Nicholas and Thoresson, Mijo and Salomonsson, Victor},
	year         = {2024},
	publisher    = {Association for Computational Linguistics},
	ISBN         = {9798331304843},
}