Facebook Instagram Twitter RSS Feed PodBean Back to top on side

Lightweight Fingerprints for Fast Approximate Keyword Matching Using Bitwise Operations

In: Computing and Informatics, vol. 38, no. 2
A. Cisłak - Sz. Grabowski

Details:

Year, pages: 2019, 367 - 389
Language: eng
Keywords:
Fingerprint, keyword matching, approximate matching, bitwise
Document type: article
About article:
We aim to speed up approximate keyword matching with the use of a lightweight, fixed-size block of data for each string, called a fingerprint. These work in a similar way to hash values; however, they can be also used for matching with errors. They store information regarding symbol occurrences using individual bits, and they can be compared against each other with a constant number of bitwise operations. In this way, certain strings can be deduced to be at least within the distance k from each other (using Hamming or Levenshtein distance) without performing an explicit verification. We show experimentally that for a preprocessed collection of strings, fingerprints can provide substantial speedups for k = 1, namely over 2.5 times for the Hamming distance and over 30 times for the Levenshtein distance. Tests were conducted on synthetic and real-world English and URL data.
How to cite:
ISO 690:
Cisłak, A., Grabowski, S. 2019. Lightweight Fingerprints for Fast Approximate Keyword Matching Using Bitwise Operations. In Computing and Informatics, vol. 38, no.2, pp. 367-389. 1335-9150. DOI: https://doi.org/10.31577/cai_2019_2_367

APA:
Cisłak, A., Grabowski, S. (2019). Lightweight Fingerprints for Fast Approximate Keyword Matching Using Bitwise Operations. Computing and Informatics, 38(2), 367-389. 1335-9150. DOI: https://doi.org/10.31577/cai_2019_2_367
About edition:
Publisher: Ústav informatiky SAV
Published: 4. 6. 2019