Hidden breakpoints in genome alignments

Kehr, B; Reinert, K; Darling, AE

Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, 7534 LNBI pp. 391 - 403
Issue Date: 2012-10-01

Closed Access

	Filename	Description	Size	Format
	1207.6964v1.pdf	Published Version	1.06 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Kehr, B	en_US
dc.contributor.author	Reinert, K	en_US
dc.contributor.author	Darling, AE https://orcid.org/0000-0003-2397-7925	en_US
dc.date.issued	2012-10-01	en_US
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, 7534 LNBI pp. 391 - 403	en_US
dc.identifier.isbn	9783642331213	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/10453/122501
dc.description.abstract	During the course of evolution, an organism's genome can undergo changes that affect the large-scale structure of the genome. These changes include gene gain, loss, duplication, chromosome fusion, fission, and rearrangement. When gene gain and loss occur in addition to other types of rearrangement, breakpoints of rearrangement can exist that are only detectable by comparison of three or more genomes. An arbitrarily large number of these "hidden" breakpoints can exist among genomes that exhibit no rearrangements in pairwise comparisons. We present an extension of the multichromosomal breakpoint median problem to genomes that have undergone gene gain and loss. We then demonstrate that the median distance among three genomes can be used to calculate a lower bound on the number of hidden breakpoints present. We provide an implementation of this calculation including the median distance, along with some practical improvements on the time complexity of the underlying algorithm. We apply our approach to measure the abundance of hidden breakpoints in simulated data sets under a wide range of evolutionary scenarios. We demonstrate that in simulations the hidden breakpoint counts depend strongly on relative rates of inversion and gene gain/loss. Finally, we apply current multiple genome aligners to the simulated genomes, and show that all aligners introduce a high degree of error in hidden breakpoint counts, and that this error grows with evolutionary distance in the simulation. Our results suggest that hidden breakpoint error may be pervasive in genome alignments. © 2012 Springer-Verlag.	en_US
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.isbasedon	10.1007/978-3-642-33122-0_31	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Hidden breakpoints in genome alignments	en_US
dc.type	Conference Proceeding
utslib.citation.volume	7534 LNBI	en_US
utslib.for	0605 Microbiology	en_US
utslib.for	0604 Genetics	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Strength - ithree - Institute of Infection, Immunity and Innovation
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	7534 LNBI	en_US

Abstract:

During the course of evolution, an organism's genome can undergo changes that affect the large-scale structure of the genome. These changes include gene gain, loss, duplication, chromosome fusion, fission, and rearrangement. When gene gain and loss occur in addition to other types of rearrangement, breakpoints of rearrangement can exist that are only detectable by comparison of three or more genomes. An arbitrarily large number of these "hidden" breakpoints can exist among genomes that exhibit no rearrangements in pairwise comparisons. We present an extension of the multichromosomal breakpoint median problem to genomes that have undergone gene gain and loss. We then demonstrate that the median distance among three genomes can be used to calculate a lower bound on the number of hidden breakpoints present. We provide an implementation of this calculation including the median distance, along with some practical improvements on the time complexity of the underlying algorithm. We apply our approach to measure the abundance of hidden breakpoints in simulated data sets under a wide range of evolutionary scenarios. We demonstrate that in simulations the hidden breakpoint counts depend strongly on relative rates of inversion and gene gain/loss. Finally, we apply current multiple genome aligners to the simulated genomes, and show that all aligners introduce a high degree of error in hidden breakpoint counts, and that this error grows with evolutionary distance in the simulation. Our results suggest that hidden breakpoint error may be pervasive in genome alignments. © 2012 Springer-Verlag.
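
The idea of a breakpoint that is invisible in pairwise comparisons can be illustrated with a toy example. The sketch below is not the authors' implementation and does not reproduce their lower-bound calculation. It assumes a deliberately simplified model: each genome is a single linear chromosome of signed genes, chromosome ends are ignored, and a median is found by brute force over every signed ordering of the combined gene set. The function names and the three example genomes are hypothetical.

```python
from itertools import permutations, product

def adjacencies(genome):
    """Adjacencies of a signed linear genome (chromosome ends are ignored)."""
    def ends(g):
        # (left extremity, right extremity) of a signed gene; "t" = tail, "h" = head
        if g > 0:
            return (abs(g), "t"), (abs(g), "h")
        return (abs(g), "h"), (abs(g), "t")
    return {frozenset((ends(a)[1], ends(b)[0])) for a, b in zip(genome, genome[1:])}

def restrict(genome, genes):
    """Keep only the genes in `genes`, modelling gene gain and loss."""
    return [g for g in genome if abs(g) in genes]

def breakpoints(x, y):
    """Breakpoints between x and y after restricting both to their shared genes."""
    shared = {abs(g) for g in x} & {abs(g) for g in y}
    return len(adjacencies(restrict(x, shared)) - adjacencies(restrict(y, shared)))

def best_median_score(genomes):
    """Smallest total breakpoint count to any signed linear ordering of all genes."""
    genes = sorted({abs(g) for genome in genomes for g in genome})
    best = None
    for order in permutations(genes):
        for signs in product((1, -1), repeat=len(genes)):
            candidate = [s * g for s, g in zip(signs, order)]
            score = sum(breakpoints(candidate, genome) for genome in genomes)
            best = score if best is None else min(best, score)
    return best

# Hypothetical genomes: each pair shares a single gene, so no pair shows a breakpoint.
a, b, c = [1, 2], [2, 3], [3, 1]
pairwise = [breakpoints(a, b), breakpoints(a, c), breakpoints(b, c)]
median_score = best_median_score([a, b, c])
print(pairwise, median_score)  # [0, 0, 0] 1
```

Every pairwise count is zero, yet no single linear ordering of genes 1, 2 and 3 agrees with all three genomes, so the best median score is 1. Under these simplifying assumptions, that one breakpoint is detectable only when the three genomes are compared together. The brute-force search is exponential in the number of genes and is meant only for toy inputs.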

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/122501