e-NSP: efficient negative sequential pattern mining based on identified positive patterns without database rescanning

Publisher:
ACM
Publication Type:
Conference Proceeding
Citation:
Proceedings of the 20th ACM International Conference on Information and Knowledge Management, 2011, pp. 825 - 830
Issue Date:
2011-01
Full metadata record
Files in This Item:
Filename Description Size
Thumbnail2011000711OK.pdf1.81 MB
Adobe PDF
Mining Negative Sequential Patterns (NSP) is much more challenging than mining Positive Sequential Patterns (PSP) due to the high computational complexity and huge search space required in calculating Negative Sequential Candidates (NSC). Very few approaches are available for mining NSP, which mainly rely on re-scanning databases after identifying PSP. As a result, they are very ine?cient. In this paper, we propose an e?cient algorithm for mining NSP, called e-NSP, which mines for NSP by only involving the identi?ed PSP, without re-scanning databases. First, negative containment is de?ned to determine whether or not a data sequence contains a negative sequence. Second, an e?cient approach is proposed to convert the negative containment problem to a positive containment problem. The supports of NSC are then calculated based only on the corresponding PSP. Finally, a simple but e?cient approach is proposed to generate NSC. With e-NSP, mining NSP does not require additional database scans, and the existing PSP mining algorithms can be integrated into e-NSP to mine for NSP e?ciently. eNSP is compared with two currently available NSP mining algorithms on 14 synthetic and real-life datasets. Intensive experiments show that e-NSP takes as little as 3% of the runtime of the baseline approaches and is applicable for efficient mining of NSP in large datasets.
Please use this identifier to cite or link to this item: