'\" t
.\" Title: vmatchselect
.\" Author: [see the "AUTHOR(S)" section]
.\" Generator: Asciidoctor 2.0.20
.\" Date:
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
.TH "VMATCHSELECT" "1" "" "\ \&" "\ \&"
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.ss \n[.ss] 0
.nh
.ad l
.de URL
\fI\\$2\fP <\\$1>\\$3
..
.als MTO URL
.if \n[.g] \{\
.  mso www.tmac
.  am URL
.    ad l
.  .
.  am MTO
.    ad l
.  .
.  LINKSTYLE blue R < >
.\}
.SH "NAME"
vmatchselect \- sort and select matches
.SH "SYNOPSIS"
.sp
\fBvmatchselect\fP [options] matchfile
.SH "DESCRIPTION"
.sp
\fBvmatchselect\fP selects matches of interest from the output of vmatch according to user\-defined criteria.
It passes matches of a chosen length, degeneracy, or significance on to further analysis routines.
.sp
\fBvmatchselect\fP removes from the input all matches that are contained in another match.
To do this efficiently, the matches are sorted by their position in the database sequence; unless the user specifies otherwise, this is also the order in which they are output.
Moreover, the sequences of the virtual suffix tree for which the match file was produced can be clustered according to the matches.
The input for \fBvmatchselect\fP is a file produced by vmatch, called a match file.
.sp
The output of \fBvmatchselect\fP goes to standard output and is sorted in ascending order of the position of the left instance of a match.
Matches whose left instances occur at the same position are sorted in descending order of length.
Matches of the same length whose left instances occur at the same position are sorted in ascending order of the position of the right instance.
.sp
\fBvmatchselect\fP provides a subset of the options of \fBvmatch\fP.
The main difference from \fBvmatch\fP is that \fBvmatchselect\fP reads the matches from a match file, while \fBvmatch\fP computes the matches from scratch.
Therefore, options specifying the index or the query sequences to be matched, as well as options specifying how to match, are not available in \fBvmatchselect\fP.
The options of \fBvmatchselect\fP have the same meaning as in the program \fBvmatch\fP.
For a description of each option, see the corresponding documentation.
.sp
Note that \fBvmatchselect\fP also supports the option "\-dbcluster".
If \fBvmatchselect\fP is called with this option, it parses the given match file and performs single linkage clustering based on the matches in this file.
Thus \fBvmatch\fP and \fBvmatchselect\fP can be combined to perform hierarchical clustering.
In a first step, an initial set of matches is computed with loose matching criteria, using \fBvmatch\fP.
These matches are then clustered by calling \fBvmatchselect\fP.
In a second round, stricter criteria are applied to the matches by using the options "\-l", "\-leastscore", "\-evalue", or "\-identity", etc.
This facilitates stepwise refinement of clusters without much computational effort and without constructing a new index for the sequences of a cluster.
.sp
The output of \fBvmatchselect\fP has the same format as the output of \fBvmatch\fP.
.SH "OPTIONS"
.sp
\fB\-dbcluster\fP
.RS 4
Cluster the database sequences.
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.  sp -1
.  IP \(bu 2.3
.\}
first argument is the percentage of the shorter string to be included in the match,
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.  sp -1
.  IP \(bu 2.3
.\}
second argument is the percentage of the larger string to be included in the match,
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.  sp -1
.  IP \(bu 2.3
.\}
third optional argument is the filename prefix,
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.  sp -1
.  IP \(bu 2.3
.\}
fourth optional argument is (minclustersize, maxclustersize)
.RE
.RE
.sp
\fB\-nonredundant\fP
.RS 4
Generate a file with a non\-redundant set of sequences; only works together with option \fB\-dbcluster\fP.
.RE
.sp
\fB\-selfun\fP
.RS 4
Specify the shared object file containing the selection function.
.RE
.sp
\fB\-l\fP
.RS 4
Specify that a match must have the given length; optionally specify the minimum and maximum size of gaps between repeat instances.
.RE
.sp
\fB\-leastscore\fP
.RS 4
Specify the minimum score of a match.
.RE
.sp
\fB\-evalue\fP
.RS 4
Specify the maximum E\-value of a match.
.RE
.sp
\fB\-identity\fP
.RS 4
Specify the minimum identity of a match in the range [1..100%].
.RE
.sp
\fB\-sort\fP
.RS 4
Sort the matches; the additional argument is the mode:
.br
la: ascending order of length
.br
ld: descending order of length
.br
ia: ascending order of first position
.br
id: descending order of first position
.br
ja: ascending order of second position
.br
jd: descending order of second position
.br
ea: ascending order of E\-value
.br
ed: descending order of E\-value
.br
sa: ascending order of score
.br
sd: descending order of score
.br
ida: ascending order of identity
.br
idd: descending order of identity
.RE
.sp
\fB\-best\fP
.RS 4
Show the best matches (those with the smallest E\-values); the default is the best 50.
.RE
.sp
\fB\-s\fP
.RS 4
Show the alignment of matching sequences.
.RE
.sp
\fB\-showdesc\fP
.RS 4
Show the sequence description of a match.
.RE
.sp
\fB\-f\fP
.RS 4
Show the filename where a match occurs.
.RE
.sp
\fB\-absolute\fP
.RS 4
Show absolute positions.
.RE
.sp
\fB\-nodist\fP
.RS 4
Do not show the distance of a match.
.RE
.sp
\fB\-noevalue\fP
.RS 4
Do not show the E\-value of a match.
.RE
.sp
\fB\-noscore\fP
.RS 4
Do not show the score of a match.
.RE
.sp
\fB\-noidentity\fP
.RS 4
Do not show the identity of a match.
.RE
.sp
\fB\-v\fP
.RS 4
Verbose mode.
.RE
.sp
\fB\-version\fP
.RS 4
Show the version of the Vmatch package.
.RE
.sp
\fB\-help\fP
.RS 4
Show help.
.RE
.SH "SEE ALSO"
.sp
vmatch(1)