Efficient mining of pan-correlation patterns from time course data

Liu, Q; Li, J; Wong, L; Ramamohanarao, K

Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 10086 LNAI, pp. 234-249
Issue Date: 2016-01-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Liu, Q	en_US
dc.contributor.author	Li, J https://orcid.org/0000-0003-1833-7413	en_US
dc.contributor.author	Wong, L	en_US
dc.contributor.author	Ramamohanarao, K	en_US
dc.date.issued	2016-01-01	en_US
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 10086 LNAI pp. 234 - 249	en_US
dc.identifier.isbn	9783319495859	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/10453/100195
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.isbasedon	10.1007/978-3-319-49586-6_16	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Efficient mining of pan-correlation patterns from time course data	en_US
dc.type	Conference Proceeding
utslib.citation.volume	10086 LNAI	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Strength - CHT - Health Technologies
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	10086 LNAI	en_US

Abstract:

© Springer International Publishing AG 2016. The variables of a time course data set can exhibit different types of correlation patterns, such as positive correlations, negative correlations, time-lagged correlations, and correlations containing small interrupted gaps. Usually, these correlations hold only on a subset of the time points rather than on the whole span of time points that correlation definitions traditionally require. As these types of patterns underlie different trends of data movement, mining all of them is an important step towards gaining broad insight into the dependencies among the variables. In this work, we prove that these diverse types of correlation patterns can all be represented by a generalized form of positive correlation patterns. We also prove a correspondence between positive correlation patterns and sequential patterns. We then present an efficient single-scan algorithm for mining all of these types of correlations. This “pan-correlation” mining algorithm is evaluated on synthetic time course data sets, as well as on yeast cell cycle gene expression data sets. The results indicate that: (i) the running time of our mining algorithm grows linearly with the number of variables; (ii) negative correlation patterns are abundant in real-world data sets; and (iii) correlation patterns with time lags and gaps are also abundant. Existing methods have discovered only incomplete forms of many of these patterns, and have missed some important patterns completely.
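
The abstract's claim that negative and time-lagged correlations can be expressed as positive correlations can be illustrated with a small sketch. This is not the authors' algorithm, and the record does not give their formal definitions; the helper names `pearson`, `negate`, and `lagged_pairs` are hypothetical and use plain Pearson correlation as a stand-in. The sketch only shows the intuition: negating one variable turns a negative correlation into a positive one, and shifting one variable by a lag turns a time-lagged correlation into an ordinary one on the overlapping time points.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length, non-constant series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def negate(xs):
    """Illustrative assumption: model a 'negated' variable as -x at each time point."""
    return [-x for x in xs]


def lagged_pairs(xs, ys, lag):
    """Align xs[t] with ys[t + lag] (lag >= 0), keeping only the overlap."""
    return xs[: len(xs) - lag], ys[lag:]


if __name__ == "__main__":
    a = [1, 2, 3, 4, 9]
    b = [9, 4, 3, 2, 1]   # moves opposite to a
    c = [7, 1, 2, 3, 4]   # follows a one time point later

    # A negative correlation between a and b is a positive one between a and -b.
    print(pearson(a, b), pearson(a, negate(b)))

    # A lag-1 correlation between a and c is an ordinary one after alignment.
    a_head, c_tail = lagged_pairs(a, c, 1)
    print(pearson(a_head, c_tail))
```

Restricting the series to a subset of time points before calling `pearson` would, under the same assumptions, mimic correlations that hold only on part of the time span.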

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/100195