Estimating confidence intervals for structural differences between contrast groups with missing data

Elsevier Science
Publication Type: Journal Article
Expert Systems with Applications, 2009, 36 (3), pp. 6431 - 6438
Difference detection is a timely and highly useful tool for evaluating a new medicine B against a specified disease by comparing it with an old medicine A that has been used to treat the disease for many years. The datasets generated by applying A and B to the disease are called contrast groups, and the main differences between the groups are differences in mean and in distribution, referred to as structural differences in this paper. However, contrast groups are only two samples obtained from a limited number of applications or tests of A and B, and they may contain missing values. The differences derived from the groups are therefore inevitably uncertain. In this paper, we propose a statistically sound approach for measuring this uncertainty by identifying confidence intervals for the structural differences between contrast groups. The method is designed specifically for applications in which the exact data distributions are unknown a priori and the data may contain missing values. We apply our approach to UCI datasets to illustrate its power as a new data mining technique for tasks such as distinguishing spam from non-spam emails and benign from malignant breast cancer.
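To make the idea of a confidence interval for a mean difference between two contrast groups concrete, the following minimal sketch may help. It is not the method proposed in the paper, whose details are not given in this abstract. It swaps in a plain percentile bootstrap and handles missing values by simply dropping them (complete-case analysis). The function names, the sample values, and the choice of 2000 resamples are all assumptions made for illustration.

```python
import math
import random
from statistics import mean


def drop_missing(values):
    """Keep only observed values; None and NaN are treated as missing."""
    return [v for v in values if v is not None and not math.isnan(v)]


def bootstrap_mean_diff_ci(group_a, group_b, level=0.95, n_boot=2000, seed=0):
    """Percentile-bootstrap interval for mean(B) - mean(A).

    Illustrative stand-in only: missing values are dropped rather than
    imputed, and no distributional form is assumed for either group.
    """
    rng = random.Random(seed)
    a = drop_missing(group_a)
    b = drop_missing(group_b)
    if not a or not b:
        raise ValueError("each group needs at least one observed value")

    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_b) - mean(resample_a))
    diffs.sort()

    alpha = 1.0 - level
    lo_idx = math.floor((alpha / 2) * (n_boot - 1))
    hi_idx = math.ceil((1 - alpha / 2) * (n_boot - 1))
    point = mean(b) - mean(a)
    return point, diffs[lo_idx], diffs[hi_idx]


# Hypothetical outcome scores for old medicine A and new medicine B.
old_medicine = [5.1, 4.8, None, 5.5, 5.0, float("nan"), 4.9, 5.3]
new_medicine = [6.0, 6.4, 5.9, None, 6.2, 6.1, 6.5, None]

point, low, high = bootstrap_mean_diff_ci(old_medicine, new_medicine)
print(f"mean difference {point:.3f}, 95% interval [{low:.3f}, {high:.3f}]")
```

If the resulting interval excludes zero, the observed mean difference is unlikely to be an artifact of sampling alone under this simplified treatment. Dropping missing values is the crudest possible choice and can bias the interval when values are not missing at random.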