A framework for high dimensional data reduction in the microarray domain

Anaissi, A; Kennedy, PJ; Goyal, M

Publication Type: Conference Proceeding
Citation: Proceedings 2010 IEEE 5th International Conference on Bio-Inspired Computing: Theories and Applications, BIC-TA 2010, 2010, pp. 903-907
Issue Date: 2010-12-31

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Anaissi, A	en_US
dc.contributor.author	Kennedy, PJ https://orcid.org/0000-0001-7837-3171	en_US
dc.contributor.author	Goyal, M https://orcid.org/0000-0003-2853-9393	en_US
dc.date.issued	2010-12-31	en_US
dc.identifier.citation	Proceedings 2010 IEEE 5th International Conference on Bio-Inspired Computing: Theories and Applications, BIC-TA 2010, 2010, pp. 903 - 907	en_US
dc.identifier.isbn	9781424464388	en_US
dc.identifier.uri	http://hdl.handle.net/10453/16279
dc.description.abstract	Microarray analysis and visualization help biologists and clinicians understand gene expression in cells and facilitate the diagnosis and treatment of patients. However, a typical microarray dataset has thousands of features and a very small number of observations. Such very high dimensional data carries a massive amount of information, which often includes noise, non-useful information and only a small number of features relevant to disease or genotype. This paper proposes a framework for very high dimensional data reduction based on three techniques: feature selection, linear dimensionality reduction and non-linear dimensionality reduction. Feature selection based on mutual information is proposed for filtering features and selecting the most relevant features with minimum redundancy. A kernel linear dimensionality reduction method is also used to extract the latent variables from a high dimensional data set. In addition, non-linear dimensionality reduction based on local linear embedding is used to reduce the dimension and visualize the data. Experimental results are presented to show the outputs of each step and the efficiency of the framework. © 2010 IEEE.	en_US
dc.relation.ispartof	Proceedings 2010 IEEE 5th International Conference on Bio-Inspired Computing: Theories and Applications, BIC-TA 2010	en_US
dc.relation.isbasedon	10.1109/BICTA.2010.5645247	en_US
dc.title	A framework for high dimensional data reduction in the microarray domain	en_US
dc.type	Conference Proceeding
utslib.for	080607 Information Engineering and Theory	en_US
dc.location.activity	Changsha, China	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Strength - CHT - Health Technologies
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US

Abstract:

Microarray analysis and visualization help biologists and clinicians understand gene expression in cells and facilitate the diagnosis and treatment of patients. However, a typical microarray dataset has thousands of features and a very small number of observations. Such very high dimensional data carries a massive amount of information, which often includes noise, non-useful information and only a small number of features relevant to disease or genotype. This paper proposes a framework for very high dimensional data reduction based on three techniques: feature selection, linear dimensionality reduction and non-linear dimensionality reduction. Feature selection based on mutual information is proposed for filtering features and selecting the most relevant features with minimum redundancy. A kernel linear dimensionality reduction method is also used to extract the latent variables from a high dimensional data set. In addition, non-linear dimensionality reduction based on local linear embedding is used to reduce the dimension and visualize the data. Experimental results are presented to show the outputs of each step and the efficiency of the framework. © 2010 IEEE.
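
The first step named in the abstract, mutual-information feature selection that favours relevant features with minimum redundancy, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the scoring rule, so the greedy "relevance minus mean redundancy" criterion used here is an assumption (one common formulation of the idea), as are the equal-width discretization and the function names `mutual_information`, `discretize` and `mrmr_select`. The gene names and values are made-up toy data.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) * p(y))),
    # written with raw counts: (c/n) * log(c*n / (count_x * count_y)).
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def discretize(values, bins=3):
    """Equal-width binning of continuous values into labels 0..bins-1.

    Assumption: expression levels must be discretized before counting;
    the abstract does not say how the paper handles this.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mrmr_select(features, labels, k):
    """Greedily pick k feature names, each maximising
    relevance(feature, labels) - mean redundancy(feature, already selected).

    `features` maps a feature name to its list of discrete values.
    """
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so ties break deterministically
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the class label depends on gene_a and gene_c jointly;
# gene_b duplicates gene_a (redundant) and gene_d carries no signal.
features = {
    "gene_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "gene_b": [0, 0, 0, 0, 1, 1, 1, 1],
    "gene_c": [0, 0, 1, 1, 0, 0, 1, 1],
    "gene_d": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]

selected = mrmr_select(features, labels, k=2)
print(selected)  # ['gene_a', 'gene_c']
```

In this toy case gene_b is as relevant to the label as gene_a, but it is skipped in the second round because it shares all of its information with the feature already chosen. Real microarray values are continuous, so a call such as `discretize(column, bins=3)` would be applied to each column first.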

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/16279