Mining maximal frequent itemsets from data streams

Publication Type:
Journal Article
Citation:
Journal of Information Science, 2007, 33 (3), pp. 251 - 262
Issue Date:
2007-01-01
Filename Description Size
Thumbnail2011000603OK.pdf229.61 kB
Adobe PDF
Full metadata record
Frequent pattern mining from data streams is an active research topic in data mining. Existing research efforts often rely on a two-phase framework to discover frequent patterns: (1) using internal data structures to store meta-patterns obtained by scanning the stream data; and (2) re-mining the meta-patterns to finalize and output frequent patterns. The defectiveness of such a two-phase framework lies in the fact that the two stages provide barriers to dynamically and immediately finding frequent patterns with online functionalities. It is expected that a single-phase algorithm can fulfil frequent pattern mining from data streams in such a way that the users can see patterns in an immediate and dynamic manner, as soon as the patterns have become frequent. In this paper, we propose INSTANT, a single-phase algorithm for discovering frequent itemsets from data streams. The theoretical foundation of INSTANT is based on a framework theory on a set of itemsets, which is also presented in the paper. The novel design of INSTANT ensures that it employs compact data structures to mine frequent patterns from data streams in a single phase. Our experimental results demonstrate the time and space efficiency of the proposed algorithm. © CILIP.
Please use this identifier to cite or link to this item: