Inference of gene interaction networks using conserved subsequential patterns from multiple time course gene expression datasets
- Publication Type:
- Journal Article
- BMC Genomics, 2015, 16 (12)
- Issue Date:
© 2015 Liu et al. Motivation: Deciphering gene interaction networks (GINs) from time-course gene expression (TCGx) data is highly valuable to understand gene behaviors (e.g., activation, inhibition, time-lagged causality) at the system level. Existing methods usually use a global or local proximity measure to infer GINs from a single dataset. As the noise contained in a single data set is hardly self-resolved, the results are sometimes not reliable. Also, these proximity measurements cannot handle the co-existence of the various in vivo positive, negative and time-lagged gene interactions. Methods and results: We propose to infer reliable GINs from multiple TCGx datasets using a novel conserved subsequential pattern of gene expression. A subsequential pattern is a maximal subset of genes sharing positive, negative or time-lagged correlations of one expression template on their own subsets of time points. Based on these patterns, a GIN can be built from each of the datasets. It is assumed that reliable gene interactions would be detected repeatedly. We thus use conserved gene pairs from the individual GINs of the multiple TCGx datasets to construct a reliable GIN for a species. We apply our method on six TCGx datasets related to yeast cell cycle, and validate the reliable GINs using protein interaction networks, biopathways and transcription factor-gene regulations. We also compare the reliable GINs with those GINs reconstructed by a global proximity measure Pearson correlation coefficient method from single datasets. It has been demonstrated that our reliable GINs achieve much better prediction performance especially with much higher precision. The functional enrichment analysis also suggests that gene sets in a reliable GIN are more functionally significant. Our method is especially useful to decipher GINs from multiple TCGx datasets related to less studied organisms where little knowledge is available except gene expression data.
Please use this identifier to cite or link to this item: