VERSION 0.960718 RELEASE NOTES:

Version 0.960718 contains the following significant modifications:

1) There have been improvements in memory usage and speed that should substantially facilitate phrap and cross_match analyses of large datasets. Cross_match can now be used for database searches. These changes required a fair amount of reorganization of the internal data structures. In the course of making them, I have had to temporarily inactivate the phrap option that allowed one set of reads to be assembled against a second set of sequences (e.g. reference sequences being scanned for polymorphisms), using two or more input files. This will be restored soon.

2) Swat has been improved in several ways. I have restored and improved its ability to compute z-scores and E values for database searches, and these now appear to be quite reliable. Also, it is now possible to use swat for profile searches (with the restriction that gap penalties still need to be position independent).

3) A graphical viewer for phrap assemblies, "phrapview", is now included. This is intended to complement the "local" view of the assembly provided by consed, by giving a "global" view that focusses on information pertaining to possible incorrectness, incompleteness, or non-uniqueness of the assembly. Phrapview displays depth of coverage, forward-reverse read pairs, significant pairwise matches involving reads in different locations, and chimeric reads. It requires a ".view" file produced by running phrap with the -view option. Phrapview is written in perl-tk and to run it you will need to have installed on your system a recent version of perl that includes the tk library (available for free from a number of web sites). Further documentation appears in the file "general.doc". The program was written rather hastily (in less than a week, including the time to learn perl and Tk) and I would appreciate any feedback on how to make it more useful.

4) The following known bugs have been fixed:

   (i) A bug that caused occasional crashes in the "Revising contigs" phase.

   (ii) Another bug in revise_contigs that occasionally caused premature truncation of the contig sequence, resulting in massive pileups of reads at the truncated end; and that also caused a lower quality read to occasionally be used in place of a higher quality one in deriving the contig sequence.

   (iii) A bug that caused an infinite loop on data readin on SGI machines (N.B. I don't have access to an SGI computer and so haven't been able to verify that the programs now run successfully -- please let me know if there are still problems).

   (iv) A bug that caused premature termination of phrap when there are 0 length reads in the dataset.

   Please continue to report any bugs to me.

5) Contig base quality information is now output in the .ace file (as well as the .contigs.qual file). You need to obtain a new version of consed from David Gordon (which he is distributing this week) in order to avoid having consed crash on these .ace files. (This version of consed does not yet actually use the contig base quality information).

6) Phrap, cross_match and swat are now all case insensitive, in the sense that all sequences are immediately converted to upper case on readin. The -use_case option is no longer available. If you were previously using that, you will need to create .qual files that contain the information instead. In any case (so to speak!) I strongly recommend that you use phred's quality values which are substantially more discriminating.
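
For users affected by item 6, the following is a minimal sketch (in Python) of one way to turn case information in a FASTA file into a .qual file. It is not part of the distribution. It assumes that upper case marked the more reliable bases and lower case the less reliable ones, and that a .qual file uses the same ">name" header lines as the FASTA file followed by space-separated integer quality values. The function names and the two quality values (30 and 10) are hypothetical placeholders chosen for illustration; they are not values used by phrap or phred.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def case_to_qual(fasta_text, upper_q=30, lower_q=10, width=25):
    """Return .qual-style text derived from the case of each base.

    ASSUMPTIONS (not taken from the release notes): upper case means a
    more reliable base, lower case a less reliable one; upper_q and
    lower_q are arbitrary placeholder quality values. Every character is
    classified by case alone, so an upper-case N also receives upper_q.
    """
    out = []
    for header, seq in parse_fasta(fasta_text.splitlines()):
        out.append(">" + header)
        quals = [str(upper_q if base.isupper() else lower_q) for base in seq]
        # Wrap the quality values so that no output line gets too long.
        for i in range(0, len(quals), width):
            out.append(" ".join(quals[i:i + width]))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(case_to_qual(">read1\nACGTacgtNN\n"), end="")
```

Two coarse values like these carry no more information than the case did, which is why phred's quality values remain the better choice when they are available.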