.TH FORMATRPSDB 1 2004-10-20 NCBI "NCBI Tools User's Manual"
.SH NAME
formatrpsdb \- Build databases for RPS Blast
.SH SYNOPSIS
.B formatrpsdb
[\|\fB\-\fP\|]
[\|\fB\-E\fP\ \fIN\fP\|]
[\|\fB\-G\fP\ \fIN\fP\|]
[\|\fB\-S\fP\ \fIX\fP\|]
[\|\fB\-U\fP\ \fIstr\fP\|]
[\|\fB\-b\fP\|]
[\|\fB\-f\fP\ \fIX\fP\|]
\fB\-i\fP\ \fIfilename\fP
[\|\fB\-l\fP\ \fIfilename\fP\|]
[\|\fB\-n\fP\ \fIstr\fP\|]
[\|\fB\-o\fP\|]
[\|\fB\-t\fP\ \fIstr\fP\|]
[\|\fB\-v\fP\ \fIN\fP\|]
.SH DESCRIPTION
\fBFormatrpsdb\fP is a utility that converts a collection of input
sequences into a database suitable for use with Reverse Position
Specific (RPS) Blast. Each input sequence, together with its
position-specific scoring matrix (PSSM), is ASN.1 encoded into a
PssmWithParameters (or `scoremat') object and resides in a separate
file. Scoremat objects can be created using \fBblastpgp\fP.
\fBFormatrpsdb\fP is given a list of these files and produces the
corresponding database.
.PP
\fBFormatrpsdb\fP is designed to perform the work of \fBformatdb\fP,
\fBmakemat\fP and \fBcopymat\fP simultaneously, without generating the
large number of intermediate files these utilities would need to
create an RPS Blast database. Further, scoremat objects are in more
general use than the binary format \fBmakemat\fP requires. It is hoped
that direct manipulation of scoremat objects will encourage conversion
of more diverse sequence collections into RPS Blast databases.
.PP
Databases generated by \fBformatrpsdb\fP are binary compatible with
databases generated by \fBformatdb\fP/\fBmakemat\fP/\fBcopymat\fP,
although the database files will in general not be byte-for-byte
identical.
.SH OPTIONS
A summary of options is included below.
.TP
\fB\-\fP
Print usage message
.TP
\fB\-E\fP\ \fIN\fP
The gap extension penalty (if not specified in the scoremat; default = 1)
.TP
\fB\-G\fP\ \fIN\fP
The gap opening penalty (if not specified in the scoremat; default = 11)
.TP
\fB\-S\fP\ \fIX\fP
For scoremats that contain only residue frequencies, the scaling
factor to apply when creating PSSMs (default = 100)
.TP
\fB\-U\fP\ \fIstr\fP
Underlying score matrix (if not specified in the scoremat; default = BLOSUM62)
.TP
\fB\-b\fP
Scoremat files are binary (rather than text) ASN.1
.TP
\fB\-f\fP\ \fIX\fP
Threshold for extending hits for RPS database (default = 11)
.TP
\fB\-i\fP\ \fIfilename\fP
Input file containing list of ASN.1 scoremat filenames
.TP
\fB\-l\fP\ \fIfilename\fP
Log file name (default = formatrpsdb.log)
.TP
\fB\-n\fP\ \fIstr\fP
Base name of output database (same as input file if not specified)
.TP
\fB\-o\fP
Create index files for database
.TP
\fB\-t\fP\ \fIstr\fP
Title for database file
.TP
\fB\-v\fP\ \fIN\fP
Database volume size in millions of letters (default = 0, meaning no limit)
.SH AUTHOR
The National Center for Biotechnology Information.
.SH SEE ALSO
.BR blast (1),
.BR copymat (1),
.BR formatdb (1),
.BR makemat (1),
/usr/share/doc/blast2/formatrpsdb.html
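.SH EXAMPLES
The following Python sketch is illustrative only and is not part of
the NCBI distribution. It shows one way the list file expected by
\fB\-i\fP might be prepared and a command line assembled from the
options documented above. Several details are assumptions: that the
list file holds one scoremat filename per line, that scoremat files
carry an .asn extension, and the helper names write_scoremat_list and
build_command, which are hypothetical. The sketch only builds the
argument list; it does not run \fBformatrpsdb\fP.
.PP
.nf
```python
from pathlib import Path


def write_scoremat_list(scoremat_dir, list_path, pattern="*.asn"):
    """Write matching scoremat paths to list_path, one per line.

    ASSUMPTION: one filename per line and the .asn extension are
    guesses; the manual only says the file lists scoremat filenames.
    """
    paths = sorted(Path(scoremat_dir).glob(pattern))
    with open(list_path, "w") as handle:
        for path in paths:
            print(path, file=handle)
    return paths


def build_command(list_path, title, base_name=None, binary=False):
    """Assemble an argument list using only options from this manual."""
    cmd = ["formatrpsdb", "-i", str(list_path), "-o", "-t", title]
    if base_name is not None:
        # -n: base name of the output database
        cmd += ["-n", base_name]
    if binary:
        # -b: scoremat files are binary ASN.1 rather than text
        cmd.append("-b")
    return cmd
```
.fi
.PP
The list returned by build_command could be passed to a process
launcher such as subprocess.run on a system where \fBformatrpsdb\fP
is installed. Keeping the command as a list avoids shell quoting
problems with titles that contain spaces.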