Detecting False Matches in String-Matching Algorithms

Shanmugavelayutham Muthukrishnan

    Research output: Contribution to journal › Article

    Abstract

    Consider a text string of length n, a pattern string of length m, and a match vector of length n which declares each location in the text to be either a mismatch (the pattern does not occur beginning at that location in the text) or a potential match (the pattern may occur beginning at that location in the text). Some of the potential matches could be false, i.e., the pattern may not occur beginning at some location in the text declared to be a potential match. We investigate the complexity of two problems in this context, namely, checking if there is any false match, and identifying all the false matches in the match vector. We present an algorithm on the CRCW PRAM that checks if there exists a false match in O(1) time using O(n) processors. This algorithm does not require preprocessing the pattern. Therefore, checking for false matches is provably simpler than string matching since string matching takes Ω(log log m) time on the CRCW PRAM. We use this simple algorithm to convert the Karp-Rabin Monte Carlo type string-matching algorithm into a Las Vegas type algorithm without asymptotic loss in complexity. We also present an efficient algorithm for identifying all the false matches and, as a consequence, show that string-matching algorithms take Ω(log log m) time even given the flexibility to output a few false matches.
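    The setting in the abstract can be illustrated with a small sequential sketch. It is not the paper's CRCW PRAM algorithm: the constant-time parallel checker is not reproduced here, and the checker below simply compares substrings. The function names (`potential_matches`, `false_matches`, `las_vegas_matches`), the base, and the fixed list of moduli are assumptions made for illustration; a real Las Vegas variant would draw its modulus at random.

```python
def potential_matches(text, pattern, base=256, modulus=1_000_003):
    """Karp-Rabin style fingerprinting (Monte Carlo).

    Returns a match vector of length len(text). True marks a potential
    match: the fingerprints agree, but the pattern may not actually occur
    there. True occurrences are never missed.
    """
    n, m = len(text), len(pattern)
    vector = [False] * n
    if m == 0 or m > n:
        return vector
    high = pow(base, m - 1, modulus)  # weight of the leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    for i in range(n - m + 1):
        if t_hash == p_hash:
            vector[i] = True
        if i + m < n:
            # Slide the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % modulus
    return vector


def false_matches(text, pattern, vector):
    """Identify all false matches by direct comparison (naive, sequential)."""
    m = len(pattern)
    return [i for i, flag in enumerate(vector)
            if flag and text[i:i + m] != pattern]


def las_vegas_matches(text, pattern,
                      moduli=(1_000_003, 998_244_353, 2_147_483_647)):
    """Retry the Monte Carlo matcher until the checker finds no false match.

    Assumption: a short fixed list of moduli stands in for random choices.
    The result is always correct; only the running time varies.
    """
    for q in moduli:
        vector = potential_matches(text, pattern, modulus=q)
        if not false_matches(text, pattern, vector):
            return vector
    # Fallback: exact comparison at every position.
    m = len(pattern)
    return [m > 0 and text[i:i + m] == pattern for i in range(len(text))]


if __name__ == "__main__":
    text, pattern = "abracadabra", "abra"
    # A deliberately weak modulus makes every window a potential match.
    weak = potential_matches(text, pattern, modulus=1)
    print(false_matches(text, pattern, weak))
    exact = las_vegas_matches(text, pattern)
    print([i for i, flag in enumerate(exact) if flag])
```

    The sketch separates the two problems named in the abstract: `false_matches` identifies every false match, and testing whether its result is empty answers the simpler question of whether any false match exists.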

    Original language: English (US)
    Pages (from-to): 512-520
    Number of pages: 9
    Journal: Algorithmica (New York)
    Volume: 18
    Issue number: 4
    DOI: 10.1007/PL00009168
    State: Published - Jan 1 1997

    Keywords

    • Checking string matching algorithms
    • Parallel algorithms
    • Randomized (Las Vegas) string matching

    ASJC Scopus subject areas

    • Computer Science (all)
    • Computer Science Applications
    • Applied Mathematics

    Cite this

    Detecting False Matches in String-Matching Algorithms. / Muthukrishnan, Shanmugavelayutham.

    In: Algorithmica (New York), Vol. 18, No. 4, 01.01.1997, p. 512-520.

    Muthukrishnan, Shanmugavelayutham. / Detecting False Matches in String-Matching Algorithms. In: Algorithmica (New York). 1997 ; Vol. 18, No. 4. pp. 512-520.
    @article{476cfe7cdd9b4a9a9192b18529908c0c,
    title = "Detecting False Matches in String-Matching Algorithms",
    keywords = "Checking string matching algorithms, Parallel algorithms, Randomized (Las Vegas) string matching",
    author = "Shanmugavelayutham Muthukrishnan",
    year = "1997",
    month = "1",
    day = "1",
    doi = "10.1007/PL00009168",
    language = "English (US)",
    volume = "18",
    pages = "512--520",
    journal = "Algorithmica",
    issn = "0178-4617",
    publisher = "Springer New York",
    number = "4",

    }

    TY - JOUR

    T1 - Detecting False Matches in String-Matching Algorithms

    AU - Muthukrishnan, Shanmugavelayutham

    PY - 1997/1/1

    Y1 - 1997/1/1

    KW - Checking string matching algorithms

    KW - Parallel algorithms

    KW - Randomized (Las Vegas) string matching

    UR - http://www.scopus.com/inward/record.url?scp=0346339526&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=0346339526&partnerID=8YFLogxK

    U2 - 10.1007/PL00009168

    DO - 10.1007/PL00009168

    M3 - Article

    AN - SCOPUS:0346339526

    VL - 18

    SP - 512

    EP - 520

    JO - Algorithmica

    JF - Algorithmica

    SN - 0178-4617

    IS - 4

    ER -