TIGER Toolset
Among scientific disciplines, genomics has
one of the fastest growing bodies of data today. This is largely due to the
recent advances in next-generation sequencing (NGS) technologies, which have
tremendously reduced DNA sequencing costs. This massive amount of sequencing
data has provided the basis to better understand the tree of life and to
identify molecular signatures of human variation and disease mechanisms. To make
such analyses possible, the key computational task is to de novo assemble the
raw short sequences (called reads) from NGS technologies into complete or
near-complete genomes. However, the enormous amount of data creates an
inevitable barrier to the assembly process in terms of memory usage and
computation time. Assembling an entire human genome usually takes days to weeks
and requires a machine with hundreds of gigabytes of memory and hundreds of
processors. In addition, the lower quality and limited read length produced by
NGS, as compared to the traditional Sanger sequencing, make it extremely
difficult to assemble reads into long scaffolds, which are essential to
facilitate the analyses of large-scale genome rearrangements.
We have developed a novel de novo assembly framework, called TIGER, which adapts
to available computing resources by iteratively decomposing the assembly problem
into sub-problems. Our method is also flexible enough to embed different
assemblers for various types of target genomes. Using sequence data from a human
chromosome, our results show that, compared to Velvet and SOAPdenovo, TIGER
achieves much better NG50s and better genome coverage, with slightly higher
errors, while using a modest amount of memory that is available in commodity
computers today.
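For readers unfamiliar with the NG50 metric mentioned above, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and NG50 is the length of the contig at which the running total first reaches half of the reference (or estimated) genome length. The function name `ng50` and its parameters are illustrative only and are not part of the TIGER toolset.

```python
from typing import Iterable, Optional


def ng50(contig_lengths: Iterable[int], genome_length: int) -> Optional[int]:
    """Return the NG50 of an assembly, or None if it is undefined.

    Illustrative helper (not from TIGER): genome_length is the reference
    or estimated genome size, not the total assembly size (that would be N50).
    """
    half = genome_length / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    # The assembly covers less than half of the genome, so NG50 is undefined.
    return None
```

Because NG50 is normalized by the genome length rather than by the total assembly length, it allows assemblies of different total sizes to be compared on the same target genome.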
This work is published in BMC Bioinformatics as "TIGER: tiled iterative genome assembler".
This research project involves three PhD students (Xiao-Long Wu, Yun Heo, and
Izzat El Hajj) and three professors (Wen-Mei Hwu (ECE), Deming Chen (ECE), and
Jian Ma (BIOE)).
The toolset is available upon request once a research license agreement has
been signed. Please send your request to xiaolong@illinois.edu.