Pincer-search: A new algorithm for discovering the maximum frequent set

Dao I. Lin, Zvi M. Kedem

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) frequent itemsets are found. During the execution, every frequent itemset is explicitly considered. Such algorithms perform reasonably well when all maximal frequent itemsets are short. However, performance drastically decreases when some of the maximal frequent itemsets are relatively long. We present a new algorithm which combines both the bottom-up and topdown searches. The primary search direction is still bottom-up, but a restricted search is also conducted in the top-down direction. This search is used only for maintaining and updating a new data structure we designed, the maximum frequent candidate set. It is used to prune candidates in the bottom-up search. A very important characteristic of the algorithm is that it does not require explicite examination of every frequent itemset. Therefore the algorithm performs well even when some maximal frequent itemsets are long. As its output, the algorithm produces the maximum frequent set, i.e., the set containing all maximal frequent itemsets, which therefore specifies immediately all frequent itemsets. We evaluate the performance of the Mgorithm using a well-known benchmark database. The improvements can be up to several orders of magnitude, compared to the best current algorithms.

Original languageEnglish (US)
Title of host publicationAdvances in Database Technology, EDBT 1998 - 6th International Conference on Extending Database Technology, Proceedings
Pages105-119
Number of pages15
StatePublished - Dec 1 1998
Event6th International Conference on Extending Database Technology, EDBT 1998 - Valencia, Spain
Duration: Mar 23 1998Mar 27 1998

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume1377 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other6th International Conference on Extending Database Technology, EDBT 1998
CountrySpain
CityValencia
Period3/23/983/27/98

    Fingerprint

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Lin, D. I., & Kedem, Z. M. (1998). Pincer-search: A new algorithm for discovering the maximum frequent set. In Advances in Database Technology, EDBT 1998 - 6th International Conference on Extending Database Technology, Proceedings (pp. 105-119). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1377 LNCS).