Dynamic pattern detection with temporal consistency and connectivity constraints

Skyler Speakman, Yating Zhang, Daniel Neill

Research output: Contribution to journal / Conference article

Abstract

We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first concern, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not dramatically change between adjacent time steps. Second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99% compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.
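
As a rough illustration of the temporal consistency idea described in the abstract, the following Python sketch scores a sequence of node subsets, one per time step, and adds a bonus for each node shared between adjacent steps. It is not the paper's Dynamic Subset Scan statistic or its Additive Graph Scan algorithm: the function name `dynamic_score`, the per-node additive scores, and the `overlap_bonus` parameter are all assumptions made for this example.

```python
from typing import Dict, List, Set


def dynamic_score(subsets: List[Set[str]],
                  node_scores: List[Dict[str, float]],
                  overlap_bonus: float) -> float:
    """Score a sequence of node subsets, one subset per time step.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    each selected node contributes its per-step score, and every node
    shared by two adjacent time steps earns `overlap_bonus`.
    """
    if len(subsets) != len(node_scores):
        raise ValueError("need one score table per time step")
    total = 0.0
    for t, subset in enumerate(subsets):
        # Additive evidence from the nodes selected at step t.
        total += sum(node_scores[t][node] for node in subset)
        if t > 0:
            # Reward subsets that change little from the previous step.
            total += overlap_bonus * len(subset & subsets[t - 1])
    return total


# Toy per-node scores for two time steps (made-up values).
node_scores = [
    {"a": 1.0, "b": 0.5, "c": -0.2},
    {"a": 0.8, "b": 0.6, "c": 0.1},
]
steady = [{"a", "b"}, {"a", "b"}]  # same subset at both steps
jumpy = [{"a", "b"}, {"c"}]        # subset changes completely

print(dynamic_score(steady, node_scores, overlap_bonus=0.5))
print(dynamic_score(jumpy, node_scores, overlap_bonus=0.5))
```

With these toy values, the sequence that keeps the same two nodes scores higher than the one that jumps to a different node, which is the qualitative behavior a temporal consistency reward is meant to produce.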

Original language: English (US)
Article number: 6729554
Pages (from-to): 697-706
Number of pages: 10
Journal: Proceedings - IEEE International Conference on Data Mining, ICDM
DOIs: 10.1109/ICDM.2013.66
State: Published - Dec 1 2013
Event: 13th IEEE International Conference on Data Mining, ICDM 2013 - Dallas, TX, United States
Duration: Dec 7 2013 to Dec 10 2013

Fingerprint

Statistics
Water distribution systems
Set theory
Impurities
Sensors

Keywords

  • likelihood ratio statistics
  • sensor fusion
  • spatial and subset scan statistics
  • water distribution systems

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Dynamic pattern detection with temporal consistency and connectivity constraints. / Speakman, Skyler; Zhang, Yating; Neill, Daniel.

In: Proceedings - IEEE International Conference on Data Mining, ICDM, 01.12.2013, p. 697-706.

@article{3edf8131e8524ad8838fad47eab42126,
title = "Dynamic pattern detection with temporal consistency and connectivity constraints",
abstract = "We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first concern, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not dramatically change between adjacent time steps. Second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99{\%} compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.",
keywords = "likelihood ratio statistics, sensor fusion, spatial and subset scan statistics, water distribution systems",
author = "Skyler Speakman and Yating Zhang and Daniel Neill",
year = "2013",
month = "12",
day = "1",
doi = "10.1109/ICDM.2013.66",
language = "English (US)",
pages = "697--706",
journal = "Proceedings - IEEE International Conference on Data Mining, ICDM",
issn = "1550-4786"
}

TY  - JOUR
T1  - Dynamic pattern detection with temporal consistency and connectivity constraints
AU  - Speakman, Skyler
AU  - Zhang, Yating
AU  - Neill, Daniel
PY  - 2013/12/1
Y1  - 2013/12/1
AB  - We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well as new computational techniques that scale to large, real-world networks. To address the first concern, we develop new subset scan methods that allow the detected subset of nodes to change over time, while incorporating temporal consistency constraints to reward patterns that do not dramatically change between adjacent time steps. Second, our Additive Graph Scan algorithm allows our novel scan statistic to process small graphs (500 nodes) in 4.1 seconds on average while maintaining an approximation ratio over 99% compared to an exact optimization method, and to scale to large graphs with over 12,000 nodes in 30 minutes on average. Evaluation results across multiple detection, tracking, and source-tracing tasks demonstrate substantial performance gains achieved by the Dynamic Subset Scan approach.
KW  - likelihood ratio statistics
KW  - sensor fusion
KW  - spatial and subset scan statistics
KW  - water distribution systems
UR  - http://www.scopus.com/inward/record.url?scp=84894655018&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84894655018&partnerID=8YFLogxK
U2  - 10.1109/ICDM.2013.66
DO  - 10.1109/ICDM.2013.66
M3  - Conference article
SP  - 697
EP  - 706
JO  - Proceedings - IEEE International Conference on Data Mining, ICDM
JF  - Proceedings - IEEE International Conference on Data Mining, ICDM
SN  - 1550-4786
M1  - 6729554
ER  -