NAME
formatrpsdb - Build databases for RPS Blast
SYNOPSIS
formatrpsdb [-] [-E N] [-G N] [-S X] [-U str] [-b] [-f X] -i filename
[-l filename] [-n str] [-o] [-t str] [-v N]
DESCRIPTION
Formatrpsdb is a utility that converts a collection of input sequences
into a database suitable for use with Reverse Position Specific (RPS) Blast.
Each input sequence, together with its position-specific scoring matrix
(PSSM), is ASN.1 encoded into a PssmWithParameters (or `scoremat') object and
resides in a separate file. Scoremat objects can be created using blastpgp.
Formatrpsdb is given a list of these files and produces the corresponding
database.
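As an illustration of the input described above, the following Python sketch writes a list file and assembles a minimal command line. The scoremat filenames and the list file name are hypothetical, and the one-filename-per-line layout is an assumption; this page only states that the file contains a list of scoremat filenames.

```python
import tempfile
from pathlib import Path


def write_scoremat_list(scoremat_paths, list_path):
    """Write the scoremat filenames to list_path, one per line (assumed layout)."""
    list_path = Path(list_path)
    list_path.write_text("\n".join(str(p) for p in scoremat_paths) + "\n")
    return list_path


# Hypothetical scoremat files, for example ones produced earlier by blastpgp.
scoremats = ["family1.asn", "family2.asn", "family3.asn"]

# A scratch directory keeps the example from touching real data.
work_dir = Path(tempfile.mkdtemp())
list_file = write_scoremat_list(scoremats, work_dir / "families.list")

# -i is the only required option; without -n the database base name
# is taken from the input file name.
command = ["formatrpsdb", "-i", str(list_file)]
print(" ".join(command))
```

The sketch only prints the command. To run it for real, the list would be handed to `subprocess.run(command, check=True)` on a system where formatrpsdb is installed and the scoremat files exist.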
Formatrpsdb is designed to perform the work of formatdb, makemat and copymat
simultaneously, without generating the large number of intermediate files
these utilities would need to create an RPS Blast
database. Further, scoremat objects are in more general use than the binary
format makemat requires. It is hoped that direct manipulation of scoremat
objects will encourage conversion of more diverse sequence collections into
RPS Blast databases.
Databases generated by formatrpsdb are binary compatible with databases
generated by formatdb/makemat/copymat, although the database files will in
general not be byte-for-byte identical.
OPTIONS
A summary of options is included below.
-
    Print usage message
-E N
    The gap extension penalty (if not specified in the scoremat; default = 1)
-G N
    The gap opening penalty (if not specified in the scoremat; default = 11)
-S X
    For scoremats that contain only residue frequencies, the scaling factor
    to apply when creating PSSMs (default = 100)
-U str
    Underlying score matrix (if not specified in the scoremat; default =
    BLOSUM62)
-b
    Scoremat files are binary (vs. text) ASN.1
-f X
    Threshold for extending hits for RPS database (default = 11)
-i filename
    Input file containing list of ASN.1 scoremat filenames
-l filename
    Log file name (default = formatrpsdb.log)
-n str
    Base name of output database (same as input file if not specified)
-o
    Create index files for database
-t str
    Title for database file
-v N
    Database volume size in millions of letters (default = 0, which means
    no limit)
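To show how the options above combine, here is a minimal Python sketch that turns keyword arguments into a formatrpsdb argument list. The helper name `build_formatrpsdb_args`, its parameter names, and the example values are hypothetical. The flags themselves are the ones listed on this page, with -b and -o treated as bare switches, as the synopsis shows them.

```python
import shlex


def build_formatrpsdb_args(list_file, *, name=None, title=None, gap_open=None,
                           gap_extend=None, scale=None, matrix=None,
                           threshold=None, log_file=None, volume_millions=None,
                           binary=False, index=False):
    """Assemble an argument list; options left unset fall back to the tool's defaults."""
    args = ["formatrpsdb", "-i", str(list_file)]
    valued = [
        ("-n", name), ("-t", title), ("-G", gap_open), ("-E", gap_extend),
        ("-S", scale), ("-U", matrix), ("-f", threshold),
        ("-l", log_file), ("-v", volume_millions),
    ]
    for flag, value in valued:
        if value is not None:
            args += [flag, str(value)]
    if binary:
        args.append("-b")  # scoremat files are binary ASN.1
    if index:
        args.append("-o")  # create index files for the database
    return args


# Hypothetical example: a titled, indexed database built from families.list.
args = build_formatrpsdb_args("families.list", name="families",
                              title="Example families", index=True)
print(shlex.join(args))
```

Building the arguments as a list rather than a single string avoids shell quoting problems with titles that contain spaces. `shlex.join` is used only to display the command.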
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
blast(1), copymat(1), formatdb(1), makemat(1),
/usr/share/doc/blast2/formatrpsdb.html