National Center for Biotechnology Information, National Library
of Medicine, Bethesda, Maryland 20894-0001.
Sequence similarity between a translated nucleotide sequence
and a known biological protein can provide strong evidence for the
presence of a homologous coding region, even between distantly
related genes. The computer program BLASTX performs conceptual
translation of a nucleotide query sequence followed by a protein
database search in one programmatic step. We characterized the
sensitivity of BLASTX recognition to the presence of substitution,
insertion and deletion errors in the query sequence and to
sequence divergence. Reading frames were reliably identified in
the presence of 1% query errors, a rate that is typical for
primary sequence data. BLASTX is appropriate for use in
moderate- and large-scale sequencing projects at the earliest
opportunity,
when the data are most prone to containing errors.
PMID: 8485583, UI: 93251048
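
The conceptual translation step described in the abstract can be
illustrated with a minimal sketch. It is not the BLASTX implementation.
It assumes the standard genetic code and translation in all six reading
frames (three per strand), and it omits the protein database search
entirely. The function names are hypothetical and chosen only for this
example.

```python
from itertools import product

# Standard genetic code, listed in the conventional TCAG codon order.
# Assumption: the standard code is used; other genetic codes differ.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the first base; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3 forward, -1..-3 reverse strand) to peptides."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, peptide in sorted(six_frame_translations("ATGGCCTAA").items()):
        print(f"{frame:+d}\t{peptide}")
```

A single inserted or deleted base shifts every downstream codon into a
different frame. Translating all six frames is why a coding region can
still produce protein-level matches, split across frames, when the query
contains such errors.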