'\" t
.\" Title: SKIPREDUNDANT
.\" Author: Debian Med Packaging Team
.\" Generator: DocBook XSL Stylesheets v1.76.1
.\" Date: 05/11/2012
.\" Manual: EMBOSS Manual for Debian
.\" Source: EMBOSS 6.4.0
.\" Language: English
.\"
.TH "SKIPREDUNDANT" "1e" "05/11/2012" "EMBOSS 6.4.0" "EMBOSS Manual for Debian"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
skipredundant \- Remove redundant sequences from an input set
.SH "SYNOPSIS"
.HP \w'\fBskipredundant\fR\ 'u
\fBskipredundant\fR \fB\-feature\ \fR\fB\fItoggle\fR\fR \fB\-sequences\ \fR\fB\fIseqset\fR\fR [\fB\-datafile\ \fR\fB\fImatrixf\fR\fR] \fB\-mode\ \fR\fB\fIlist\fR\fR \fB\-threshold\ \fR\fB\fIfloat\fR\fR \fB\-minthreshold\ \fR\fB\fIfloat\fR\fR \fB\-maxthreshold\ \fR\fB\fIfloat\fR\fR \fB\-gapopen\ \fR\fB\fIfloat\fR\fR \fB\-gapextend\ \fR\fB\fIfloat\fR\fR \fB\-outseq\ \fR\fB\fIseqoutall\fR\fR \fB\-redundantoutseq\ \fR\fB\fIseqoutall\fR\fR
.HP \w'\fBskipredundant\fR\ 'u
\fBskipredundant\fR \fB\-help\fR
.SH "DESCRIPTION"
.PP
\fBskipredundant\fR
is a command line program from EMBOSS (\(lqthe European Molecular Biology Open Software Suite\(rq)\&.
It is part of the "Edit" command group(s)\&.
.SH "OPTIONS"
.SS "Input section"
.PP
\fB\-feature\fR \fItoggle\fR
.RS 4
Sequence feature information will be retained if this option is set\&.
.RE
.PP
\fB\-sequences\fR \fIseqset\fR
.RS 4
.RE
.PP
\fB\-datafile\fR \fImatrixf\fR
.RS 4
This is the scoring matrix file used when comparing sequences\&. By default it is the file \*(AqEBLOSUM62\*(Aq (for proteins) or the file \*(AqEDNAFULL\*(Aq (for nucleic acid sequences)\&. These files are found in the \*(Aqdata\*(Aq directory of the EMBOSS installation\&.
.RE
.SS "Required section"
.PP
\fB\-mode\fR \fIlist\fR
.RS 4
This option specifies whether to remove redundancy at a single threshold percentage sequence similarity or to remove redundancy outside a range of acceptable threshold percentage similarity\&. All pair\-wise sequence alignments are calculated for each set of input sequences in turn using the EMBOSS implementation of the Needleman and Wunsch global alignment algorithm\&. Redundant sequences are removed in one of two modes as follows: (i) If a pair of proteins achieves greater than a threshold percentage sequence similarity (specified by the user), the shorter sequence is discarded\&. (ii) If a pair of proteins has a percentage sequence similarity that lies outside an acceptable range (specified by the user), the shorter sequence is discarded\&. Default value: 1
.RE
.PP
\fB\-threshold\fR \fIfloat\fR
.RS 4
This option specifies the percentage sequence identity redundancy threshold\&. The percentage sequence identity redundancy threshold determines the redundancy calculation\&. If a pair of proteins achieves greater than this threshold, the shorter sequence is discarded\&. Default value: 95\&.0
.RE
.PP
\fB\-minthreshold\fR \fIfloat\fR
.RS 4
This option specifies the percentage sequence identity redundancy threshold (lower limit)\&. The percentage sequence identity redundancy threshold determines the redundancy calculation\&.
If a pair of proteins has a percentage sequence similarity that lies outside the acceptable range, the shorter sequence is discarded\&. Default value: 30\&.0
.RE
.PP
\fB\-maxthreshold\fR \fIfloat\fR
.RS 4
This option specifies the percentage sequence identity redundancy threshold (upper limit)\&. The percentage sequence identity redundancy threshold determines the redundancy calculation\&. If a pair of proteins has a percentage sequence similarity that lies outside the acceptable range, the shorter sequence is discarded\&. Default value: 90\&.0
.RE
.PP
\fB\-gapopen\fR \fIfloat\fR
.RS 4
The gap open penalty is the score taken away when a gap is created\&. The best value depends on the choice of comparison matrix\&. The default value assumes you are using the EBLOSUM62 matrix for protein sequences, and the EDNAFULL matrix for nucleotide sequences\&. Default value: 10\&.0 (for both protein and nucleotide sequences)
.RE
.PP
\fB\-gapextend\fR \fIfloat\fR
.RS 4
The gap extension penalty is added to the standard gap penalty for each base or residue in the gap\&. This is how long gaps are penalized\&. Usually you will expect a few long gaps rather than many short gaps, so the gap extension penalty should be lower than the gap open penalty\&. An exception is where one or both sequences are single reads with possible sequencing errors, in which case you would expect many single\-base gaps\&. You can get this result by setting the gap open penalty to zero (or very low) and using the gap extension penalty to control gap scoring\&. Default value: 0\&.5 (for both protein and nucleotide sequences)
.RE
.SS "Advanced section"
.SS "Output section"
.PP
\fB\-outseq\fR \fIseqoutall\fR
.RS 4
.RE
.PP
\fB\-redundantoutseq\fR \fIseqoutall\fR
.RS 4
.RE
.SH "BUGS"
.PP
Bugs can be reported to the Debian Bug Tracking System (http://bugs\&.debian\&.org/emboss), or directly to the EMBOSS developers (http://sourceforge\&.net/tracker/?group_id=93650&atid=605031)\&.
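.SH "EXAMPLES"
.PP
The two redundancy modes described under \fB\-mode\fR can be illustrated with a minimal Python sketch\&. It is not the EMBOSS implementation: the names split_redundant and toy_similarity are hypothetical, the similarity function is a crude position\-by\-position stand\-in for the Needleman and Wunsch alignment score, and both the skipping of already discarded sequences and the tie\-breaking rule for equal lengths are assumptions\&.
.sp
.nf
```python
from itertools import combinations


def toy_similarity(a, b):
    # Hypothetical stand-in scorer: percentage of matching positions over
    # the longer length. The real program scores a global alignment instead.
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / longest


def split_redundant(seqs, mode=1, threshold=95.0, min_threshold=30.0,
                    max_threshold=90.0, similarity=toy_similarity):
    # seqs maps a sequence name to its residues (insertion order is kept).
    removed = set()
    for a, b in combinations(list(seqs), 2):
        # Assumption: a sequence already discarded is not compared again.
        if a in removed or b in removed:
            continue
        score = similarity(seqs[a], seqs[b])
        if mode == 1:
            # Mode 1: redundant above a single threshold.
            redundant = score > threshold
        else:
            # Mode 2: redundant outside the acceptable range.
            redundant = score < min_threshold or score > max_threshold
        if redundant:
            # Discard the shorter sequence; on equal length drop the second.
            removed.add(a if len(seqs[a]) < len(seqs[b]) else b)
    kept = [name for name in seqs if name not in removed]
    dropped = [name for name in seqs if name in removed]
    return kept, dropped
```
.fi
.PP
The two returned lists correspond loosely to the sequences written to \fB\-outseq\fR and \fB\-redundantoutseq\fR\&. Passing a real alignment scorer as the similarity argument would bring the sketch closer to the behaviour described under \fB\-mode\fR\&.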
.SH "SEE ALSO"
.PP
skipredundant is fully documented via the
\fBtfm\fR(1)
system\&.
.SH "AUTHOR"
.PP
\fBDebian Med Packaging Team\fR <\&debian\-med\-packaging@lists\&.alioth\&.debian\&.org\&>
.RS 4
Wrote the script used to autogenerate this manual page\&.
.RE
.SH "COPYRIGHT"
.br
.PP
This manual page was autogenerated from an Ajax Command Definition of the EMBOSS package\&. It can be redistributed under the same terms as EMBOSS itself\&.
.sp