Mining impact-targeted activity patterns in imbalanced data

Publication Type:
Journal Article
IEEE Transactions on Knowledge and Data Engineering, 2008, 20 (8), pp. 1053 - 1066
Filename: 2008000233OK_Cao.pdf
Size: 3.13 MB
Format: Adobe PDF
Impact-targeted activities are rare, but they may have a significant impact on society. For example, isolated terrorist activities may lead to a disastrous event that threatens national security. Similar issues arise in many other areas. It is therefore important to identify such activities before they have a significant impact. However, mining impact-targeted activity patterns is challenging because of their imbalanced structure. This paper develops techniques for discovering such activity patterns. First, the complexities of mining imbalanced impact-targeted activities are analyzed. We then discuss strategies for constructing impact-targeted activity sequences. Algorithms are developed to mine frequent positive-impact-oriented (P → T) and negative-impact-oriented (P → T̄) activity patterns, sequential impact-contrasted activity patterns (P is frequently associated with both P → T and P → T̄ in separate data sets), and sequential impact-reversed activity patterns (both P → T and PQ → T̄ are frequent). Activity impact modeling is also studied to quantify the impact of patterns on business outcomes. Social security debt-related activity data are used to test the proposed approaches. The outcomes show that the approaches are promising for information and security informatics (ISI) applications that need to identify impact-targeted activity patterns in imbalanced data. © 2008 IEEE.
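The pattern types named in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes that each record is an ordered activity sequence labeled with whether the impact T occurred, that a pattern matches a sequence when it appears as an ordered (not necessarily contiguous) subsequence, and that "frequent" means support at or above a chosen threshold. The names `is_subsequence`, `rule_stats`, `is_impact_reversed`, and the toy records are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed record shape: (activity sequence, True if the impact T occurred).
Record = Tuple[Sequence[str], bool]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if pattern occurs in sequence in order, gaps allowed."""
    it = iter(sequence)
    # Each membership test consumes the iterator up to the match,
    # so later pattern items must appear after earlier ones.
    return all(activity in it for activity in pattern)


def rule_stats(pattern: Sequence[str], records: List[Record],
               target: bool) -> Tuple[float, float]:
    """Support and confidence of pattern -> T (target=True) or T-bar (False)."""
    covered = [impact for seq, impact in records if is_subsequence(pattern, seq)]
    hits = sum(1 for impact in covered if impact == target)
    support = hits / len(records) if records else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


def is_impact_reversed(p: Sequence[str], q: Sequence[str],
                       records: List[Record], min_support: float) -> bool:
    """Assumed test: P -> T and PQ -> T-bar both meet the support threshold."""
    support_p, _ = rule_stats(p, records, True)
    support_pq, _ = rule_stats(tuple(p) + tuple(q), records, False)
    return support_p >= min_support and support_pq >= min_support


# Toy data, invented for illustration only.
records: List[Record] = [
    (("a", "b"), True),
    (("a", "b"), True),
    (("a", "b", "c"), False),
    (("a", "b", "c"), False),
    (("d",), False),
    (("d", "e"), False),
]

P, Q = ("a", "b"), ("c",)
print(rule_stats(P, records, True))               # stats for P -> T
print(rule_stats(P + Q, records, False))          # stats for PQ -> T-bar
print(is_impact_reversed(P, Q, records, 0.3))
```

In this toy data, appending Q to P flips the associated outcome from T to T̄, which is the idea behind an impact-reversed pattern. A real imbalanced data set would have far fewer impact-labeled records, so a single global support threshold like the one assumed here would likely need adjusting.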