FastQForward: Whole Genome Sequencing Analysis in Under 10 Minutes
ID U-6940
Category Computing
Subcategory Data Infrastructure (Data Mining, Visualization, & Analysis)
Researchers
Brief Summary
A pipeline that rapidly processes genome sequencing data and improves cost effectiveness by optimizing how compute resources are distributed.
Problem Statement
Current solutions that run on a single machine take 24 hours or more to process a human whole genome sequencing sample.
Technology Description
FastQForward distributes sequence alignment, data polishing, and variant calling in a highly parallelized manner, both within a single machine and across multiple machines in a compute cluster. It can process a 30x coverage human whole genome sequencing sample in just over 5 minutes (using 56 machines and 2128 CPU cores), compared with 24 hours or more for comparable software pipelines that run on a single machine. In addition, because sample availability and compute resource usage both tend to follow a Poisson distribution, matching input data to available resources can be difficult, and the mismatch incurs significant costs from both idle hardware and lost time. By spreading analysis across multiple machines when appropriate, FastQForward lets researchers optimize resource usage and maximize cost effectiveness.
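The general idea of spreading per-region work across workers can be illustrated with a minimal scatter/gather sketch in Python. This is not FastQForward's implementation: the chromosome lengths, the chunk size, and the function names (`split_regions`, `process_region`, `run_pipeline`) are hypothetical, and the per-region step is a placeholder standing in for real polishing or variant calling. It uses only local processes, whereas a cluster deployment would need a cross-machine layer such as MPI.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical chromosome lengths in base pairs, for illustration only.
CHROM_LENGTHS = {"chr1": 1_000_000, "chr2": 750_000}


def split_regions(chrom_lengths, chunk_size):
    """Scatter step: cut each chromosome into half-open (chrom, start, end) regions."""
    regions = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((chrom, start, min(start + chunk_size, length)))
    return regions


def process_region(region):
    """Placeholder for per-region work (e.g. polishing or variant calling).

    Returns the chromosome, the region start, and the number of bases covered,
    so the gather step has something to combine.
    """
    chrom, start, end = region
    return (chrom, start, end - start)


def run_pipeline(chrom_lengths, chunk_size, workers):
    """Process all regions in parallel on one machine and collect the results."""
    regions = split_regions(chrom_lengths, chunk_size)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so output stays in genome order.
        return list(pool.map(process_region, regions))


if __name__ == "__main__":
    results = run_pipeline(CHROM_LENGTHS, chunk_size=250_000, workers=4)
    print(len(results), "regions,", sum(r[2] for r in results), "bases processed")
```

Because each region is independent in this sketch, the number of workers can be chosen to match whatever hardware is free at the time, which is the property the resource-matching argument above relies on.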
Stage of Development
Fully Integrated System
Benefit
- Extremely fast sequencing analysis, with MPI on computer clusters (cloud)
- Forks on desktops and servers (Linux based)
- Designed to be easy to use
Publications
https://ucgd.genetics.utah.edu/fastqforward-just-got-faster/
Contact Info
Aaron Duffy
(801) 585-1377
aaron.duffy@utah.edu