Feature prioritisation on big genomic data for analysing gene-gene interactions

Inderscience Publishers
Publication Type: Journal Article
International Journal of Bioinformatics Research and Applications, 2021, 17, (2), pp. 158-177
Filename: ijbra.2021.114420.pdf
Description: Published version
Size: 1.65 MB
Format: Adobe PDF
Complex diseases are not caused by single genes but result from intricate non-linear interactions among them. There is a critical need for approaches that account for non-linear gene-gene interactions when searching for markers that jointly cause disease. Determining the interactions among more than two single nucleotide polymorphisms (SNPs) across whole-genome data is computationally expensive and often infeasible. In this paper, we develop an approach to classify patients with Acute Lymphoblastic Leukaemia by analysing multiple SNP interactions. A novel feature prioritisation algorithm called interaction effect quantity (IEQ) selects SNPs with a high potential for interaction by analysing their distribution throughout the genomic data, enabling deeper analysis of non-linear interactions within large datasets. We show that IEQ enables analysis of interactions among up to four SNPs, achieving a classification F-measure greater than 89%. Such an analysis is typically far more computationally challenging without IEQ.
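To give a sense of why higher-order SNP interaction analysis becomes infeasible without feature prioritisation, the following minimal Python sketch counts the candidate SNP subsets for a given interaction order and computes the standard F-measure from confusion counts. It does not reproduce the IEQ algorithm, whose details are not given in this record. The function names and the SNP counts (500,000 before filtering, 1,000 after) are illustrative assumptions, not figures from the paper.

```python
from math import comb


def interaction_search_space(n_snps: int, order: int) -> int:
    """Number of distinct SNP subsets of size `order` among `n_snps` SNPs."""
    return comb(n_snps, order)


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Standard F-measure (F1): harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical SNP counts chosen only for illustration.
    for n_snps in (500_000, 1_000):
        for order in (2, 3, 4):
            subsets = interaction_search_space(n_snps, order)
            print(f"{n_snps:>7} SNPs, order {order}: {subsets:.3e} candidate subsets")
```

Under these assumed counts, the number of four-SNP subsets falls by many orders of magnitude once the candidate set is reduced, which is the general motivation for prioritising SNPs before searching for interactions.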