A Partial-Repeatability Approach to Data Mining

Cai, K; Yin, Y; Zhang, S

Publisher: IEEE Computer Society
Publication Type: Journal Article
Citation: The IEEE Intelligent Informatics Bulletin, 2005, 5 (1), pp. 26-34
Issue Date: 2005-01

Closed Access

	Filename	Description	Size	Format
	2009005041OK.pdf		372.54 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Cai, K	en_US
dc.contributor.author	Yin, Y	en_US
dc.contributor.author	Zhang, S	en_US
dc.date.issued	2005-01	en_US
dc.identifier.citation	The IEEE Intelligent Informatics Bulletin, 2005, 5 (1), pp. 26 - 34	en_US
dc.identifier.issn	1727-5997	en_US
dc.identifier.uri	http://hdl.handle.net/10453/9011
dc.description.abstract	Unlike the data handled in traditional data mining, software data exhibit partial repeatability, or parepeatics: an invariant property that can neither be proved mathematically nor validated to high accuracy physically, but that still (partially) governs the behavior of the data. Parepeatics emerges as a result of an inaccurate universe. The universe comprising all possible C language programs is an example: it cannot be accurately characterized, since humans write defect-prone programs. In this paper we design a parepeatic mining framework for software data mining, in which the mined knowledge is represented as parepeatic models. A parepeatic model consists of central knowledge, a knowledge fluctuation zone, and a correctness factor. Our approach can generate the required parepeatic model from a given dataset, as a new form of knowledge representation, and apply it to software data mining. Experimental results with real C language programs show that the proposed approach is effective.	en_US
dc.publisher	IEEE Computer Society	en_US
dc.relation.ispartof	The IEEE Intelligent Informatics Bulletin	en_US
dc.rights	© 2005 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.title	A Partial-Repeatability Approach to Data Mining	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	5	en_US
utslib.for	080604 Database Management	en_US
utslib.for	0807 Library and Information Studies	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.consider-herdc	false	en_US
pubs.issue	1	en_US
pubs.volume	5	en_US

Abstract:

Unlike the data handled in traditional data mining, software data exhibit partial repeatability, or parepeatics: an invariant property that can neither be proved mathematically nor validated to high accuracy physically, but that still (partially) governs the behavior of the data. Parepeatics emerges as a result of an inaccurate universe. The universe comprising all possible C language programs is an example: it cannot be accurately characterized, since humans write defect-prone programs. In this paper we design a parepeatic mining framework for software data mining, in which the mined knowledge is represented as parepeatic models. A parepeatic model consists of central knowledge, a knowledge fluctuation zone, and a correctness factor. Our approach can generate the required parepeatic model from a given dataset, as a new form of knowledge representation, and apply it to software data mining. Experimental results with real C language programs show that the proposed approach is effective.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/9011