Efficient string matching with wildcards and length constraints

Publication Type:
Journal Article
Citation:
Knowledge and Information Systems, 2006, 10 (4), pp. 399 - 419
Issue Date:
2006-11-01
This paper defines a challenging problem of pattern matching between a pattern P and a text T, with wildcards and length constraints, and designs an efficient algorithm that returns each pattern occurrence in an online manner. In this pattern matching problem, the user can specify constraints on the number of wildcards between every two consecutive letters of P, as well as constraints on the length of each matching substring in T. We design a complete algorithm, SAIL, that returns each matching substring of P in T as soon as it appears in T, in O(n + klmg) time with O(lm) space overhead, where n is the length of T, k is the number of occurrences of P's last letter in T, l is the user-specified maximum length of each matching substring, m is the length of P, and g is the maximum difference between the user-specified maximum and minimum numbers of wildcards allowed between two consecutive letters in P. © Springer-Verlag London Limited 2006.
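To make the problem statement concrete, the following Python sketch enumerates occurrences by brute force. It is not SAIL and does not achieve the bounds quoted in the abstract; it only illustrates the two kinds of constraint. The function name `find_matches`, its parameters, and the encoding of the gap constraints as a list of `(lo, hi)` pairs are assumptions made for illustration. Like the abstract's description, it scans T left to right and reports an occurrence once the position matching P's last letter has been read.

```python
from typing import Iterator, Sequence, Tuple


def find_matches(
    text: str,
    pattern: str,
    gaps: Sequence[Tuple[int, int]],
    min_len: int,
    max_len: int,
) -> Iterator[Tuple[int, ...]]:
    """Yield 0-based position tuples, one per occurrence of `pattern` in `text`.

    gaps[i] = (lo, hi) bounds the number of wildcard characters allowed
    between pattern[i] and pattern[i + 1]. min_len and max_len bound the
    length of the matched substring (first to last matched position).
    Illustrative brute force only; hypothetical interface.
    """
    m = len(pattern)
    if m == 0 or len(gaps) != m - 1:
        raise ValueError("need exactly len(pattern) - 1 gap constraints")

    def extend_left(i: int, pos: int, tail: Tuple[int, ...]) -> Iterator[Tuple[int, ...]]:
        # pattern[i] is matched at text[pos]; tail holds positions of pattern[i+1:].
        chain = (pos,) + tail
        if i == 0:
            length = chain[-1] - chain[0] + 1
            if min_len <= length <= max_len:
                yield chain
            return
        lo, hi = gaps[i - 1]
        for gap in range(lo, hi + 1):
            prev = pos - gap - 1
            # Larger gaps only move further left, so stop once out of range
            # or once the substring would exceed the maximum length.
            if prev < 0 or chain[-1] - prev + 1 > max_len:
                break
            if text[prev] == pattern[i - 1]:
                yield from extend_left(i - 1, prev, chain)

    # Scan the text once; each occurrence of the pattern's last letter
    # triggers a backward search for the preceding letters.
    for end, ch in enumerate(text):
        if ch == pattern[-1]:
            yield from extend_left(m - 1, end, ())


print(list(find_matches("acgatg", "ag", [(0, 4)], 2, 6)))
# [(0, 2), (3, 5), (0, 5)]
```

Lowering `max_len` to 5 in the call above drops the occurrence `(0, 5)`, whose matched substring has length 6, even though its gap of four wildcards is still within the allowed range.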