.TH FASTS/TFASTSv3 1 local .SH NAME fasts3, fasts3_t \- compare several short peptide sequences against a protein database using a modified fasta algorithm. tfasts3, tfasts3_t \- compare short pepides against a translated DNA database. .SH DESCRIPTION .B fasts3 and .B tfasts3 are designed to compare set of (presumably non-contiguous) peptides to a protein (fasts3) or translated DNA (tfasts3) database. fasts3/tfasts3 are designed particularly for short peptide data from mass-spec analysis of protein digests. Unlike the traditional .B fasta3 search, which uses a protein or DNA sequence, .B fasts3 and .B tfasts3 work with a query sequence of the form: .in +5 .nf >tests from mgstm1 MLLE, MILGYW, MGADP, MLCYNP .fi .in 0 This sequence indicates that four peptides are to be used. When this sequence is compared against mgstm1.aa (included with the distribution), the result is: .nf .ft C .in +5 testf MILGYW----------MLLE------------MGDAP----------- :::::: :::: ::::: GT8.7 MPMILGYWNVRGLTHPIRMLLEYTDSSYDEKRYTMGDAPDFDRSQWLNEK 10 20 30 40 50 testf -------------------------------------------------- GT8.7 FKLGLDFPNLPYLIDGSHKITQSNAILRYLARKHHLDGETEEERIRADIV 60 70 80 90 100 20 testf ------------MLCYNP :::::: GT8.7 ENQVMDTRMQLIMLCYNPDFEKQKPEFLKTIPEKMKLYSEFLGKRPWFAG 110 120 130 140 150 .in 0 .ft P .fi .SH Options .LP .B fasts3 and .B tfasts3 can accept a query sequence from the unix "stdin" data stream. This makes it much easier to use fasta3 and its relatives as part of a WWW page. To indicate that stdin is to be used, use "-" or "@" as the query sequence file name. .TP \-b # number of best scores to show (must be < -E cutoff) .TP \-d # number of best alignments to show ( must be < -E cutoff) .TP \-D turn on debugging mode. Enables checks on sequence alphabet that cause problems with tfastx3, tfasty3, tfasta3. .TP \-E # Expectation value limit for displaying scores and alignments. Expectation values for .B fasts3 and .B tfasts3 are not as accurate as those for the other .B fasta3 programs. .TP \-H turn off histogram display .TP \-i compare against only the reverse complement of the library sequence. .TP \-L report long sequence description in alignments .TP \-m 0,1,2,3,4,5,6,9,10 alignment display options .TP \-N # break long library sequences into blocks of # residues. Useful for bacterial genomes, which have only one sequence entry. -N 2000 works well for well for bacterial genomes. .TP \-O file send output to file .TP \-q/-Q quiet option; do not prompt for input .TP \-R file save all scores to statistics file .TP \-S # offset substitution matrix values by a constant # .TP \-s name specify substitution matrix. BLOSUM50 is used by default; PAM250, PAM120, and BLOSUM62 can be specified by setting -s P120, P250, or BL62. With this version, many more scoring matrices are available, including BLOSUM80 (BL80), and MDM_10, MDM_20, MDM_40 (M10, M20, M40). Alternatively, BLASTP1.4 format scoring matrix files can be specified. .TP \-T # (threaded, parallel only) number of threads or workers to use (set by default to 4 at compile time). .TP \-t # Translation table - tfasts3 can use the BLAST tranlation tables. See \fChttp://www.ncbi.nih.gov/htbin-post/Taxonomy/wprintgc?mode=c/\fP. .TP \-w # line width for similarity score, sequence alignment, output. .TP \-x "#,#" offsets query, library sequence for numbering alignments .TP \-z # Specify statistical calculation. Default is -z 1, which uses regression against the length of the library sequence. -z 0 disables statistics. -z 2 uses the ln() length correction. -z 3 uses Altschul and Gish's statistical estimates for specific protein BLOSUM scoring matrices and gap penalties. -z 4: an alternate regression method. .TP \-Z db_size Set the apparent database size used for expectation value calculations. .TP \-3 (TFASTS3 only) use only forward frame translations .SH Environment variables: .TP FASTLIBS location of library choice file (-l FASTLIBS) .TP SMATRIX default scoring matrix (-s SMATRIX) .TP SRCH_URL the format string used to define the option to re-search the database. .TP REF_URL the format string used to define the option to lookup the library sequence in entrez, or some other database. .SH AUTHOR Bill Pearson .br wrp@virginia.EDU