Genome Res 1998 Mar;8(3):186-94
Department of Molecular Biotechnology, University of
Washington, Seattle, Washington 98195-7730, USA.
Elimination of the data processing bottleneck in
high-throughput sequencing will require both improved accuracy of
data processing software and reliable measures of that accuracy.
We have developed and implemented in our base-calling program
phred the ability to estimate a probability of error for each
base-call, as a function of certain parameters computed from the
trace data. These error probabilities are shown here to be valid
(that is, they correspond to actual error rates) and to have high power to
discriminate correct base-calls from incorrect ones, for read data
collected under several different chemistries and electrophoretic
conditions. They play a critical role in our assembly program
phrap and our finishing program consed.
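
The validity claim above (predicted error probabilities correspond to
actual error rates) can be illustrated with a minimal sketch. It is not
phred's implementation. It assumes the widely used quality-value
convention Q = -10 * log10(p), which this abstract does not state, and
the function names and toy data are hypothetical. Base-calls are grouped
by rounded quality value, and the observed error rate in each group is
compared with the predicted one.

```python
import math
from collections import defaultdict


def phred_quality(p_error):
    """Convert an error probability to a quality value (assumed convention)."""
    return -10.0 * math.log10(p_error)


def error_probability(q):
    """Inverse of phred_quality: quality value back to error probability."""
    return 10.0 ** (-q / 10.0)


def observed_error_rates(calls):
    """Group base-calls by rounded quality value and tally errors.

    calls: iterable of (predicted_error_probability, is_correct) pairs.
    Returns {quality: (n_calls, n_errors, observed_error_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # quality -> [n_calls, n_errors]
    for p_error, is_correct in calls:
        q = round(phred_quality(p_error))
        counts[q][0] += 1
        if not is_correct:
            counts[q][1] += 1
    return {q: (n, e, e / n) for q, (n, e) in sorted(counts.items())}


# Toy data (hypothetical): 10 calls predicted at p = 0.1 with 1 error,
# and 100 calls predicted at p = 0.01 with 1 error.
calls = (
    [(0.1, True)] * 9 + [(0.1, False)]
    + [(0.01, True)] * 99 + [(0.01, False)]
)

for q, (n, errors, rate) in observed_error_rates(calls).items():
    print(f"Q{q}: predicted {error_probability(q):.3f}, "
          f"observed {rate:.3f} ({errors}/{n})")
```

In this toy data the observed rate in each group equals the predicted
probability by construction. With real reads, the is_correct flags would
come from comparing base-calls against a trusted reference sequence.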
PMID: 9521922, UI: 98190160