Efficient selection of discriminative genes from microarray gene expression data for cancer diagnosis

Publication Type:
Journal Article
IEEE Transactions on Circuits and Systems I: Regular Papers, 2005, 52 (9), pp. 1909 - 1918
A new mutual information (MI)-based feature-selection method is presented to address the so-called large p, small n problem encountered in microarray gene expression data. First, a grid-based feature clustering algorithm is introduced to eliminate redundant features, so that a huge gene set is greatly reduced in a very efficient way. As a result, the computational efficiency of the whole feature-selection process is substantially enhanced. Second, MI is estimated directly using quadratic MI together with Parzen window density estimators. This approach is able to deliver reliable results even when only a small pattern set is available. In addition, a new MI-based criterion is proposed to avoid highly redundant selection results in a systematic way. Finally, owing to the direct estimation of MI, appropriate feature subsets can be reasonably determined. © 2005 IEEE.
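
To make the second step more concrete, the following is a minimal sketch of one common way to estimate quadratic MI between a single gene's expression values and discrete class labels using Gaussian Parzen windows. It uses the Euclidean-distance form of quadratic MI known from the information-theoretic learning literature, which may differ from the exact estimator in the paper. The function names `gaussian_kernel` and `quadratic_mi`, the kernel width `sigma`, and the toy data are all assumptions made for illustration; the grid-based clustering step and the redundancy criterion are not shown because the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(d, var):
    """1-D Gaussian density with variance `var`, evaluated at differences d."""
    return np.exp(-d ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


def quadratic_mi(x, labels, sigma=0.5):
    """Euclidean-distance quadratic MI between feature x and class labels.

    Hypothetical helper: `sigma` is the Parzen window width (an assumed
    default, not a value taken from the paper).
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    n = x.size

    # Integrating the product of two Gaussian windows of variance sigma^2
    # yields a Gaussian of variance 2*sigma^2 in the sample differences.
    k = gaussian_kernel(x[:, None] - x[None, :], 2.0 * sigma ** 2)

    classes = np.unique(labels)
    priors = np.array([(labels == c).mean() for c in classes])

    # Interactions between samples of the same class.
    v_in = sum(k[np.ix_(labels == c, labels == c)].sum() for c in classes) / n ** 2
    # Interactions between all samples, weighted by squared class priors.
    v_all = (priors ** 2).sum() * k.sum() / n ** 2
    # Interactions between each class and all samples, weighted by the prior.
    v_btw = sum(p * k[labels == c, :].sum() for c, p in zip(classes, priors)) / n ** 2

    return v_in + v_all - 2.0 * v_btw


# Toy data (assumed): 40 samples, two classes, one informative gene and one
# pure-noise gene.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
informative = 3.0 * labels + rng.normal(0.0, 0.3, size=40)
noise = rng.normal(0.0, 1.0, size=40)

mi_informative = quadratic_mi(informative, labels)
mi_noise = quadratic_mi(noise, labels)
print(mi_informative > mi_noise)
```

Because the estimate is built entirely from pairwise kernel evaluations, it needs no histogram binning, which is one reason estimators of this kind are attractive when only a few dozen samples are available. Ranking genes by this score alone would ignore redundancy between genes, which is the issue the abstract's additional MI-based criterion is meant to address.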