Estimating confidence intervals for structural differences between contrast groups with missing data

Qin, Y; Zhang, S; Zhu, X; Zhang, J; Zhang, C

Publication Type: Journal Article
Citation: Expert Systems with Applications, 2009, 36 (3 PART 2), pp. 6431 - 6438
Issue Date: 2009-01-01

Closed Access

	Filename	Description	Size	Format
	2008001127OK.pdf		579.44 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Qin, Y	en_US
dc.contributor.author	Zhang, S	en_US
dc.contributor.author	Zhu, X	en_US
dc.contributor.author	Zhang, J	en_US
dc.contributor.author	Zhang, C https://orcid.org/0000-0001-5715-7154	en_US
dc.date.issued	2009-01-01	en_US
dc.identifier.citation	Expert Systems with Applications, 2009, 36 (3 PART 2), pp. 6431 - 6438	en_US
dc.identifier.issn	0957-4174	en_US
dc.identifier.uri	http://hdl.handle.net/10453/9095
dc.description.abstract	Difference detection is of practical importance and extremely useful when evaluating a new medicine B for a specified disease against an old medicine A that has been used to treat the disease for many years. The datasets generated by applying A and B to the disease are called contrast groups, and the main differences between the groups are the mean and distribution differences, referred to as structural differences in this paper. However, contrast groups are only two samples obtained from limited applications or tests of A and B, and they may contain missing values. The differences derived from the groups are therefore inevitably uncertain. In this paper, we propose a statistically sound approach for measuring this uncertainty by identifying the confidence intervals of structural differences between contrast groups. The method is designed specifically for the many applications in which the exact data distributions are unknown a priori and the data may contain missing values. We apply our approach to UCI datasets to illustrate its power as a new data mining technique for tasks such as distinguishing spam from non-spam emails and benign from malignant breast cancer. © 2008 Elsevier Ltd. All rights reserved.	en_US
dc.relation	http://purl.org/au-research/grants/arc/DP0667060
dc.relation.ispartof	Expert Systems with Applications	en_US
dc.relation.isbasedon	10.1016/j.eswa.2008.07.068	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Estimating confidence intervals for structural differences between contrast groups with missing data	en_US
dc.type	Journal Article
utslib.citation.volume	3 PART 2	en_US
utslib.citation.volume	36	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	080604 Database Management	en_US
utslib.for	0102 Applied Mathematics	en_US
utslib.for	01 Mathematical Sciences	en_US
utslib.for	08 Information and Computing Sciences	en_US
utslib.for	09 Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/DVC (International)
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - ACRI - Australia China Relations Institute
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	3 PART 2	en_US
pubs.publication-status	Published	en_US
pubs.volume	36	en_US

Abstract:

Difference detection is of practical importance and extremely useful when evaluating a new medicine B for a specified disease against an old medicine A that has been used to treat the disease for many years. The datasets generated by applying A and B to the disease are called contrast groups, and the main differences between the groups are the mean and distribution differences, referred to as structural differences in this paper. However, contrast groups are only two samples obtained from limited applications or tests of A and B, and they may contain missing values. The differences derived from the groups are therefore inevitably uncertain. In this paper, we propose a statistically sound approach for measuring this uncertainty by identifying the confidence intervals of structural differences between contrast groups. The method is designed specifically for the many applications in which the exact data distributions are unknown a priori and the data may contain missing values. We apply our approach to UCI datasets to illustrate its power as a new data mining technique for tasks such as distinguishing spam from non-spam emails and benign from malignant breast cancer. © 2008 Elsevier Ltd. All rights reserved.
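
The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of the general idea of putting a confidence interval on a mean difference between two contrast groups without assuming a data distribution. It is not the authors' method: it swaps in a generic percentile bootstrap and handles missing values by simply discarding them (complete-case analysis). The function names, the toy data, and the parameters `level`, `n_resamples`, and `seed` are all hypothetical.

```python
import math
import random
from statistics import mean


def drop_missing(values):
    """Complete-case analysis (an assumption): discard None and NaN entries."""
    return [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]


def bootstrap_mean_diff_ci(group_a, group_b, level=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for mean(B) - mean(A).

    Generic illustration only; not the procedure proposed in the paper.
    """
    rng = random.Random(seed)
    a = drop_missing(group_a)
    b = drop_missing(group_b)
    if not a or not b:
        raise ValueError("each group needs at least one observed value")

    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its observed size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_b) - mean(resample_a))
    diffs.sort()

    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the differences.
    alpha = 1.0 - level
    lo_idx = math.floor((alpha / 2) * (n_resamples - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (n_resamples - 1))
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical toy outcomes under old medicine A and new medicine B.
group_a = [5.1, 4.8, None, 5.5, 5.0, float("nan"), 4.9, 5.2]
group_b = [6.0, 5.7, 6.3, None, 5.9, 6.1, 5.8, None]

low, high = bootstrap_mean_diff_ci(group_a, group_b)
point = mean(drop_missing(group_b)) - mean(drop_missing(group_a))
print(f"mean difference: {point:.3f}, 95% interval: [{low:.3f}, {high:.3f}]")
```

An interval that excludes zero suggests the observed mean difference is unlikely to be a sampling artifact. Discarding missing values is the simplest possible choice and can bias the result when values are not missing at random; an imputation step would replace `drop_missing` in a more careful treatment.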

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/9095