Overlap matching

Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, Ely Porat

Research output: Contribution to journal (article)

Abstract

We propose a new paradigm for string matching, namely structural matching. In structural matching, the text and pattern contents are not important. Rather, some areas in the text and pattern, such as intervals, are singled out. A "match" is a text location where a specified relation between the text and pattern areas is satisfied. In particular we define the structural matching problem of overlap (parity) matching. We seek the text locations where all overlaps of the given pattern and text intervals have even length. We show that this problem can be solved in time O(n log m), where the text length is n and the pattern length is m. As an application of overlap matching, we show how to reduce the string matching with swaps problem to the overlap matching problem. The string matching with swaps problem is the problem of string matching in the presence of local swaps. The best deterministic upper bound known for this problem was O(n m^(1/3) log m log σ) for a general alphabet Σ, where σ = min(m, |Σ|). Our reduction provides a solution to the pattern matching with swaps problem in time O(n log m log σ).
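To make the problem statement concrete, here is a minimal brute-force sketch of overlap (parity) matching in Python. It assumes that text and pattern intervals are given as half-open (start, end) index pairs and that placing the pattern at shift s translates its intervals by s. The function name and this representation are illustrative assumptions, not taken from the paper. Its running time grows with n times the product of the two interval counts, so it serves only as a reference check and is not the O(n log m) algorithm the abstract refers to.

```python
def overlap_parity_matches(text_intervals, pattern_intervals, n, m):
    """Return every shift s in [0, n - m] at which all overlaps between
    the shifted pattern intervals and the text intervals have even length.

    Intervals are half-open (start, end) pairs; this representation is an
    assumption made for illustration only.
    """
    matches = []
    for s in range(n - m + 1):
        ok = True
        for p_start, p_end in pattern_intervals:
            for t_start, t_end in text_intervals:
                # Length of the intersection of the two intervals
                # (zero or negative means they do not overlap).
                overlap = min(p_end + s, t_end) - max(p_start + s, t_start)
                if overlap > 0 and overlap % 2 == 1:
                    ok = False  # an odd-length overlap rules out shift s
                    break
            if not ok:
                break
        if ok:
            matches.append(s)
    return matches


# One text interval [2, 6) in a text of length 8,
# one pattern interval [0, 2) in a pattern of length 2.
print(overlap_parity_matches([(2, 6)], [(0, 2)], 8, 2))  # [0, 2, 3, 4, 6]
```

Shifts 1 and 5 are rejected because the pattern interval straddles an end of the text interval and overlaps it in exactly one position.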

Original language: English (US)
Pages (from-to): 57-74
Number of pages: 18
Journal: Information and Computation
Volume: 181
Issue number: 1
DOI: https://doi.org/10.1016/S0890-5401(02)00035-4
State: Published - Feb 25 2003

Fingerprint

Overlap
String Matching
Swap
Pattern matching
Matching Problem
Interval
Text
Parity
Paradigm
Upper bound

ASJC Scopus subject areas

  • Computational Theory and Mathematics

Cite this

Amir, A., Cole, R., Hariharan, R., Lewenstein, M., & Porat, E. (2003). Overlap matching. Information and Computation, 181(1), 57-74. https://doi.org/10.1016/S0890-5401(02)00035-4

@article{2c08285998d94f419f374a7a00deaa75,
title = "Overlap matching",
abstract = "We propose a new paradigm for string matching, namely structural matching. In structural matching, the text and pattern contents are not important. Rather, some areas in the text and pattern, such as intervals, are singled out. A {"}match{"} is a text location where a specified relation between the text and pattern areas is satisfied. In particular we define the structural matching problem of overlap (parity) matching. We seek the text locations where all overlaps of the given pattern and text intervals have even length. We show that this problem can be solved in time O(n log m), where the text length is n and the pattern length is m. As an application of overlap matching, we show how to reduce the string matching with swaps problem to the overlap matching problem. The string matching with swaps problem is the problem of string matching in the presence of local swaps. The best deterministic upper bound known for this problem was O(n m^(1/3) log m log σ) for a general alphabet Σ, where σ = min(m, |Σ|). Our reduction provides a solution to the pattern matching with swaps problem in time O(n log m log σ).",
author = "Amihood Amir and Richard Cole and Ramesh Hariharan and Moshe Lewenstein and Ely Porat",
year = "2003",
month = "2",
day = "25",
doi = "10.1016/S0890-5401(02)00035-4",
language = "English (US)",
volume = "181",
pages = "57--74",
journal = "Information and Computation",
issn = "0890-5401",
publisher = "Elsevier Inc.",
number = "1",

}

TY - JOUR
T1 - Overlap matching
AU - Amir, Amihood
AU - Cole, Richard
AU - Hariharan, Ramesh
AU - Lewenstein, Moshe
AU - Porat, Ely
PY - 2003/2/25
Y1 - 2003/2/25
AB - We propose a new paradigm for string matching, namely structural matching. In structural matching, the text and pattern contents are not important. Rather, some areas in the text and pattern, such as intervals, are singled out. A "match" is a text location where a specified relation between the text and pattern areas is satisfied. In particular we define the structural matching problem of overlap (parity) matching. We seek the text locations where all overlaps of the given pattern and text intervals have even length. We show that this problem can be solved in time O(n log m), where the text length is n and the pattern length is m. As an application of overlap matching, we show how to reduce the string matching with swaps problem to the overlap matching problem. The string matching with swaps problem is the problem of string matching in the presence of local swaps. The best deterministic upper bound known for this problem was O(n m^(1/3) log m log σ) for a general alphabet Σ, where σ = min(m, |Σ|). Our reduction provides a solution to the pattern matching with swaps problem in time O(n log m log σ).
UR - http://www.scopus.com/inward/record.url?scp=0037465502&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0037465502&partnerID=8YFLogxK
U2 - 10.1016/S0890-5401(02)00035-4
DO - 10.1016/S0890-5401(02)00035-4
M3 - Article
AN - SCOPUS:0037465502
VL - 181
SP - 57
EP - 74
JO - Information and Computation
JF - Information and Computation
SN - 0890-5401
IS - 1
ER -