CD: A coupled discretization algorithm

Wang, C; Wang, M; She, Z; Cao, L

Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, 7301 LNAI (PART 2), pp. 407-418
Issue Date: 2012-05-29

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, C	en_US
dc.contributor.author	Wang, M	en_US
dc.contributor.author	She, Z	en_US
dc.contributor.author	Cao, L https://orcid.org/0000-0003-1562-9429	en_US
dc.date.issued	2012-05-29	en_US
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, 7301 LNAI (PART 2), pp. 407 - 418	en_US
dc.identifier.isbn	9783642302190	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/10453/32998
dc.description.abstract	Discretization plays an important role in data mining and machine learning. While numeric data is predominant in the real world, many supervised learning algorithms are restricted to discrete variables. Thus, a variety of research has been conducted on discretization, the process of converting continuous attribute values into a limited number of intervals. Recent work derived from entropy-based discretization methods, which has produced impressive results, introduces information attribute dependency to reduce the uncertainty level of a decision table, but gives no attention to increasing the degree of certainty from the perspective of the positive domain ratio. This paper proposes a discretization algorithm based on both the positive domain and its coupling with information entropy, which considers not only information attribute dependency but also the deterministic feature relationship. Substantial experiments on extensive UCI data sets provide evidence that the proposed coupled discretization algorithm generally outperforms seven other existing methods, as well as the positive domain based algorithm also proposed in this paper, in terms of simplicity, stability, consistency, and accuracy. © 2012 Springer-Verlag.	en_US
dc.relation	http://purl.org/au-research/grants/arc/DP0988016
dc.relation	http://purl.org/au-research/grants/arc/LP0989721R1
dc.relation	http://purl.org/au-research/grants/arc/DP1096218
dc.relation	http://purl.org/au-research/grants/arc/LP100200774
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.isbasedon	10.1007/978-3-642-30220-6_34	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	CD: A coupled discretization algorithm	en_US
dc.type	Conference Proceeding
utslib.citation.volume	PART 2	en_US
utslib.citation.volume	7301 LNAI	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
dc.location.activity	Kuala Lumpur, Malaysia
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
utslib.copyright.status	closed_access
pubs.issue	PART 2	en_US
pubs.publication-status	Published	en_US
pubs.volume	7301 LNAI	en_US

Abstract:

Discretization plays an important role in data mining and machine learning. While numeric data is predominant in the real world, many supervised learning algorithms are restricted to discrete variables. Thus, a variety of research has been conducted on discretization, the process of converting continuous attribute values into a limited number of intervals. Recent work derived from entropy-based discretization methods, which has produced impressive results, introduces information attribute dependency to reduce the uncertainty level of a decision table, but gives no attention to increasing the degree of certainty from the perspective of the positive domain ratio. This paper proposes a discretization algorithm based on both the positive domain and its coupling with information entropy, which considers not only information attribute dependency but also the deterministic feature relationship. Substantial experiments on extensive UCI data sets provide evidence that the proposed coupled discretization algorithm generally outperforms seven other existing methods, as well as the positive domain based algorithm also proposed in this paper, in terms of simplicity, stability, consistency, and accuracy. © 2012 Springer-Verlag.
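
The two quantities the abstract contrasts can be illustrated with a small sketch. This is not the CD algorithm from the paper, whose details are not given in this record. It only assumes the standard textbook definitions: the class entropy conditioned on the intervals, and the rough-set positive region ratio (the fraction of objects that fall in an interval containing a single class). The function names, cut points, and toy data are hypothetical.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from math import log2


def discretize(values, cuts):
    """Map each continuous value to the index of the interval it falls in."""
    cuts = sorted(cuts)
    return [bisect_right(cuts, v) for v in values]


def conditional_entropy(bins, labels):
    """H(class | interval): class entropy of each interval, weighted by size."""
    groups = defaultdict(Counter)
    for b, y in zip(bins, labels):
        groups[b][y] += 1
    n = len(labels)
    total = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        h = -sum((c / size) * log2(c / size) for c in counts.values())
        total += (size / n) * h
    return total


def positive_region_ratio(bins, labels):
    """Fraction of objects whose interval contains only one class."""
    classes_in_bin = defaultdict(set)
    for b, y in zip(bins, labels):
        classes_in_bin[b].add(y)
    consistent = sum(1 for b in bins if len(classes_in_bin[b]) == 1)
    return consistent / len(bins)


# Hypothetical toy attribute with class labels (illustration only).
values = [1.0, 1.5, 2.0, 3.0, 3.5, 4.0]
labels = ["a", "a", "a", "b", "b", "a"]

coarse = discretize(values, cuts=[2.5])
fine = discretize(values, cuts=[2.5, 3.75])

print(conditional_entropy(coarse, labels), positive_region_ratio(coarse, labels))
print(conditional_entropy(fine, labels), positive_region_ratio(fine, labels))
```

In this toy case the extra cut point lowers the conditional entropy and raises the positive region ratio at the same time. The abstract's point is that the two criteria carry different information, so a discretizer may benefit from considering both.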

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/32998