Quality-score guided error correction for short-read sequencing data using CUDA
Recently introduced sequencing technologies can produce massive amounts of short-read data. Detecting and correcting sequencing errors in this data is an important but time-consuming pre-processing step for de-novo genome assembly. In this paper, we demonstrate how the quality-score value associated with each base-call can be integrated into a CUDA-based parallel error correction algorithm. We show that quality-score guided error correction can improve the assembly of several datasets from the NCBI SRA (Short-Read Archive), both in accuracy, measured by N50 values, and in runtime. We further propose a number of improvements to our previously published CUDA-EC algorithm that improve its runtime by a factor of up to 1.88.
Müller-Wittig, Wolfgang K.