Summarizing uncertain transaction databases by Probabilistic Tiles

Publication Type: Conference Proceeding
Citation: Proceedings of the International Joint Conference on Neural Networks, 2016, 2016-October, pp. 4375-4382
Issue Date: 2016-10-31
© 2016 IEEE. Transaction data mining is ubiquitous across domains and has been researched extensively. In recent years, with the observation that uncertainty is inherent in many real-world applications, uncertain data mining has attracted much research attention. Among the research problems, summarization is important because it produces concise and informative results that facilitate further analysis. However, few works explore how to summarize uncertain transaction data effectively. In this paper, we formulate the problem of summarizing uncertain transaction data as Minimal Probabilistic Tile Cover Mining, which aims to find a high-quality probabilistic tile set covering an uncertain database with minimal cost. We define the concepts of Probabilistic Price and Probabilistic Price Order to evaluate and compare the quality of tiles, and propose a framework to discover the minimal probabilistic tile cover. The bottleneck is checking whether one tile is better than another according to the Probabilistic Price Order, which involves computing a joint probability. We prove that this probability can be decomposed into independent terms and calculated efficiently. Several optimization techniques are devised to further improve performance. Experimental results on real-world datasets demonstrate the conciseness of the produced tiles and the effectiveness and efficiency of our approach.
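To make the idea of a joint probability decomposing into independent terms concrete, the following is a minimal sketch only. It does not reproduce the paper's Probabilistic Price or Probabilistic Price Order, which the abstract does not define. It assumes an attribute-level uncertainty model in which each item occurs in each transaction independently with a known probability. The toy database `db` and the helper names `cell_prob`, `tile_full_prob`, and `tile_expected_area` are hypothetical and chosen for illustration.

```python
from math import prod

# Hypothetical toy uncertain database (an assumption, not data from the paper):
# db[tid][item] is the probability that `item` truly occurs in transaction `tid`.
# Cells are assumed to be mutually independent.
db = {
    "t1": {"a": 0.9, "b": 0.8, "c": 0.1},
    "t2": {"a": 0.5, "b": 1.0},
    "t3": {"b": 0.4, "c": 0.7},
}


def cell_prob(db, tid, item):
    """Existence probability of one (transaction, item) cell; 0.0 if absent."""
    return db.get(tid, {}).get(item, 0.0)


def tile_full_prob(db, tids, items):
    """Probability that every cell of the tile tids x items is present.

    Under the independence assumption the joint probability factorizes
    into a product of per-cell terms.
    """
    return prod(cell_prob(db, t, i) for t in tids for i in items)


def tile_expected_area(db, tids, items):
    """Expected number of cells of the tile that are present.

    By linearity of expectation this is the sum of the per-cell probabilities.
    """
    return sum(cell_prob(db, t, i) for t in tids for i in items)


if __name__ == "__main__":
    tile = (["t1", "t2"], ["a", "b"])
    print(tile_full_prob(db, *tile))
    print(tile_expected_area(db, *tile))
```

A tile here is a pair consisting of a transaction set and an item set. The product form is what makes the joint probability cheap to evaluate in this simplified model: each cell is consulted once, so the cost is linear in the tile's area.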