Minirmd: accurate and fast duplicate removal tool for short reads via multiple minimizers.

Publisher: Oxford University Press
Publication Type: Journal Article
Citation: Bioinformatics, 2021, 37, (11), pp. 1604-1606
Issue Date: 2021
Removing duplicate and near-duplicate reads generated by high-throughput sequencing technologies can reduce the computational resources required by downstream applications. Here we develop minirmd, a de novo tool that removes duplicate reads via multiple rounds of clustering, using minimizers of different lengths. Experiments demonstrate that minirmd removes more near-duplicate reads than existing clustering approaches and is faster than existing multi-core tools. To the best of our knowledge, minirmd is the first tool to remove near-duplicates on the reverse-complementary strand.
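
The general idea of clustering reads by minimizers over several rounds can be illustrated with a small Python sketch. This is not minirmd's implementation; it is a simplified illustration under stated assumptions. All function names, the minimizer lengths in `ks`, and the `max_mismatches` threshold are hypothetical, and the sketch only compares reads of equal length by Hamming distance, in both orientations, without any offset alignment.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_minimizer(seq, k):
    """Smallest k-mer (lexicographic) over the read and its reverse complement.

    Taking the minimum over both strands gives a read and its reverse
    complement the same key, so they land in the same bucket.
    """
    rc = reverse_complement(seq)
    return min(s[i:i + k] for s in (seq, rc) for i in range(len(s) - k + 1))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(a, b, max_mismatches):
    """True if a matches b or b's reverse complement within the threshold."""
    if len(a) != len(b):
        return False  # assumption: only equal-length reads are compared
    return (hamming(a, b) <= max_mismatches
            or hamming(a, reverse_complement(b)) <= max_mismatches)


def remove_near_duplicates(reads, ks=(8, 6, 4), max_mismatches=1):
    """Run one clustering round per minimizer length and keep representatives.

    ks and max_mismatches are illustrative defaults, not minirmd's settings.
    """
    survivors = list(reads)
    for k in ks:
        buckets = defaultdict(list)  # minimizer -> reads kept in this round
        kept = []
        for read in survivors:
            if len(read) < k:
                kept.append(read)  # too short to have a k-mer; keep as-is
                continue
            bucket = buckets[canonical_minimizer(read, k)]
            if any(is_near_duplicate(read, rep, max_mismatches) for rep in bucket):
                continue  # near-duplicate of an earlier read: drop it
            bucket.append(read)
            kept.append(read)
        survivors = kept
    return survivors
```

A single round can miss a near-duplicate pair when the mismatch falls inside the minimizer, because the two reads then hash to different buckets. Repeating the pass with a different minimizer length gives such pairs another chance to meet, which is the motivation for running multiple rounds in this sketch.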