### Abstract

In this work, we are interested in periodic trends in long data streams in the presence of computational constraints. To this end, we present algorithms for discovering periodic trends in the combinatorial property testing model in a data stream S of length n using o(n) samples and space. In accordance with the property testing model, we first explore the notion of being close to periodic by defining three different notions of self-distance through relaxing different notions of exact periodicity. An input S is then called approximately periodic if it exhibits a small self-distance (with respect to any one of the self-distances defined). We show that even though the different definitions of exact periodicity are equivalent, the resulting definitions of self-distance and approximate periodicity are not; we also show that these self-distances are constant approximations of each other. Afterwards, we present algorithms which distinguish between the two cases where S is exactly periodic and S is far from periodic with only a constant probability of error. Our algorithms sample only O(√n log^{2} n) (or O(√n log^{4} n), depending on the self-distance) positions and use as much space. They can also find, using o(n) samples and space, the largest/smallest period, and/or all of the approximate periods of S. These algorithms can also be viewed as working on streaming inputs where each data item is seen once and in order, storing only a sublinear (O(√n log^{2} n) or O(√n log^{4} n)) size sample from which periodicities are identified.
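The step from exact periodicity to a self-distance can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names are hypothetical, and the mismatch-fraction distance is one simple way to relax exact periodicity, not necessarily any of the three self-distances defined in the article. The sketch also scans the whole input, so it does not reproduce the sublinear sampling of the article's algorithms.

```python
def is_exact_period(s, p):
    """Return True if s[i] == s[i + p] for every valid i, i.e. p is an exact period of s."""
    if p <= 0 or p >= len(s):
        raise ValueError("p must satisfy 0 < p < len(s)")
    return all(s[i] == s[i + p] for i in range(len(s) - p))


def shift_distance(s, p):
    """Illustrative self-distance (an assumption, not the article's definition):
    the fraction of positions i with s[i] != s[i + p]."""
    if p <= 0 or p >= len(s):
        raise ValueError("p must satisfy 0 < p < len(s)")
    compared = len(s) - p
    mismatches = sum(1 for i in range(compared) if s[i] != s[i + p])
    return mismatches / compared


def approximate_periods(s, eps):
    """Return every p in [1, len(s) // 2] whose shift distance is at most eps.
    This is a full linear scan per candidate period, not a sublinear tester."""
    return [p for p in range(1, len(s) // 2 + 1) if shift_distance(s, p) <= eps]


if __name__ == "__main__":
    clean = "abcabcabcabc"   # exactly periodic with period 3
    noisy = "abcabcabxabc"   # one corrupted symbol
    print(is_exact_period(clean, 3), is_exact_period(noisy, 3))
    print(shift_distance(noisy, 3))
    print(approximate_periods(noisy, 0.25))
```

With a distance of this kind, an input with a distance of exactly zero for some p is exactly periodic, while a single corrupted symbol leaves the distance small but nonzero, which is the gap a property tester is asked to tolerate.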

Original language | English (US) |
---|---|
Article number | 43 |
Journal | ACM Transactions on Algorithms |
Volume | 6 |
Issue number | 2 |
DOIs | https://doi.org/10.1145/1721837.1721859 |
State | Published - Mar 1 2010 |

### Keywords

- Combinatorial property testing
- Periodicity

### ASJC Scopus subject areas

- Mathematics (miscellaneous)

### Cite this

Ergun, F., Muthukrishnan, S., & Sahinalp, C. (2010). Periodicity testing with sublinear samples and space. *ACM Transactions on Algorithms*, *6*(2), [43]. https://doi.org/10.1145/1721837.1721859

Research output: Contribution to journal › Article

TY - JOUR

T1 - Periodicity testing with sublinear samples and space

AU - Ergun, Funda

AU - Muthukrishnan, Shanmugavelayutham

AU - Sahinalp, Cenk

PY - 2010/3/1

Y1 - 2010/3/1

KW - Combinatorial property testing

KW - Periodicity

UR - http://www.scopus.com/inward/record.url?scp=77950813642&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950813642&partnerID=8YFLogxK

U2 - 10.1145/1721837.1721859

DO - 10.1145/1721837.1721859

M3 - Article

VL - 6

JO - ACM Transactions on Algorithms

JF - ACM Transactions on Algorithms

SN - 1549-6325

IS - 2

M1 - 43

ER -