Faster suffix tree construction with missing suffix links

Richard Cole, Ramesh Hariharan

Research output: Contribution to journal › Article

Abstract

We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new back-propagation component to McCreight's algorithm and also give a high probability hashing scheme for large degrees. We show that these two features enable construction of suffix trees for general situations with missing suffix links in O(n) time, with high probability. This gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.
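
As a rough illustration of why parameterized strings are awkward for suffix-link-based construction, the sketch below assumes the standard "prev" encoding commonly used for parameterized matching; it is not taken from the paper. Each parameter symbol is replaced by the distance to its previous occurrence, or 0 if there is none. The function names `prev_encode` and `p_match` are hypothetical. The example shows that the encoding of a suffix is not simply a suffix of the encoding, which is one way to see that the usual suffix-link reasoning does not carry over directly.

```python
def prev_encode(s, params):
    """Prev-encode s: each parameter symbol becomes the distance to its
    previous occurrence (0 if it has not occurred yet); all other symbols
    are kept unchanged. `params` is the set of parameter symbols."""
    last = {}   # parameter symbol -> index of its most recent occurrence
    out = []
    for i, ch in enumerate(s):
        if ch in params:
            out.append(i - last[ch] if ch in last else 0)
            last[ch] = i
        else:
            out.append(ch)
    return out


def p_match(a, b, params):
    """Two strings parameterized-match when their prev encodings are equal."""
    return prev_encode(a, params) == prev_encode(b, params)


params = {"x", "y"}
full = prev_encode("xyxy", params)      # encoding of the whole string
suffix = prev_encode("yxy", params)     # encoding of the suffix starting at index 1
print(full, full[1:], suffix)
print(p_match("xax", "yay", params), p_match("xax", "xay", params))
```

Dropping the first character resets to 0 every entry that pointed back at it, so each suffix has to be re-encoded relative to its own start rather than read off the encoding of the full string.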

Original language: English (US)
Pages (from-to): 26-42
Number of pages: 17
Journal: SIAM Journal on Computing
Volume: 33
Issue number: 1
DOI: 10.1137/S0097539701424465
State: Published - Nov 2003

Fingerprint

Suffix Tree
Suffix
Backpropagation
Strings
Hashing
Randomized Algorithms
Linear-time Algorithm
Vertex of a graph

Keywords

  • Dynamic perfect hashing
  • Parameterized strings
  • Suffix tree
  • Two-dimensional suffix trees

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Applied Mathematics

Cite this

Faster suffix tree construction with missing suffix links. / Cole, Richard; Hariharan, Ramesh.

In: SIAM Journal on Computing, Vol. 33, No. 1, 11.2003, p. 26-42.

Cole, Richard ; Hariharan, Ramesh. / Faster suffix tree construction with missing suffix links. In: SIAM Journal on Computing. 2003 ; Vol. 33, No. 1. pp. 26-42.
@article{fe9e9838ba9149d88a36bcde0adedc76,
title = "Faster suffix tree construction with missing suffix links",
abstract = "We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new back-propagation component to McCreight's algorithm and also give a high probability hashing scheme for large degrees. We show that these two features enable construction of suffix trees for general situations with missing suffix links in O(n) time, with high probability. This gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.",
keywords = "Dynamic perfect hashing, Parameterized strings, Suffix tree, Two-dimensional suffix trees",
author = "Richard Cole and Ramesh Hariharan",
year = "2003",
month = "11",
doi = "10.1137/S0097539701424465",
language = "English (US)",
volume = "33",
pages = "26--42",
journal = "SIAM Journal on Computing",
issn = "0097-5397",
publisher = "Society for Industrial and Applied Mathematics Publications",
number = "1",

}

TY - JOUR
T1 - Faster suffix tree construction with missing suffix links
AU - Cole, Richard
AU - Hariharan, Ramesh
PY - 2003/11
Y1 - 2003/11
N2 - We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new back-propagation component to McCreight's algorithm and also give a high probability hashing scheme for large degrees. We show that these two features enable construction of suffix trees for general situations with missing suffix links in O(n) time, with high probability. This gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.
AB - We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new back-propagation component to McCreight's algorithm and also give a high probability hashing scheme for large degrees. We show that these two features enable construction of suffix trees for general situations with missing suffix links in O(n) time, with high probability. This gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.
KW - Dynamic perfect hashing
KW - Parameterized strings
KW - Suffix tree
KW - Two-dimensional suffix trees
UR - http://www.scopus.com/inward/record.url?scp=1842459453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=1842459453&partnerID=8YFLogxK
U2 - 10.1137/S0097539701424465
DO - 10.1137/S0097539701424465
M3 - Article
AN - SCOPUS:1842459453
VL - 33
SP - 26
EP - 42
JO - SIAM Journal on Computing
JF - SIAM Journal on Computing
SN - 0097-5397
IS - 1
ER -