.\"     Title: GLAM2-PURGE
.\"    Author: Andrew Neuwald
.\" Generator: DocBook XSL Stylesheets v1.73.2
.\"      Date: 05/19/2008
.\"    Manual: glam2 Manual
.\"    Source: GLAM2 1056
.\"
.TH "GLAM2\-PURGE" "1" "05/19/2008" "GLAM2 1056" "glam2 Manual"
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.SH "NAME"
glam2-purge \- Removes redundant sequences from a FASTA file
.SH "SYNOPSIS"
.HP 12
\fBglam2\-purge\fR \fIfile\fR \fBscore\fR [\fBoptions\fR]
.SH "DESCRIPTION"
.PP
\fBglam2\-purge\fR
is a modified version of Andrew Neuwald\'s
\fBpurge\fR
program that removes redundant sequences from a FASTA file\&. This is recommended in order to prevent highly similar sequences distorting the search for motifs\&. Purge works with either DNA or protein sequences and creates an output file such that no two sequences have a (gapless) local alignment score greater than a threshold specified by the user\&. The output file is named
\&.\&. The alignment score is based on the BLOSUM62 matrix for proteins, and on a +5/\-1 scoring scheme for DNA\&. Purge can also be used to mask tandem repeats\&. It uses the XNU program for this purpose\&.
.SH "OPTIONS"
.PP
\fB\-n\fR
.RS 4
Sequences are DNA (default: protein)\&.
.RE
.PP
\fB\-b\fR
.RS 4
Use blast heuristic method (default for protein)\&.
.RE
.PP
\fB\-e\fR
.RS 4
Use an exhaustive method (default for DNA)\&.
.RE
.PP
\fB\-q\fR
.RS 4
Keep first sequence in the set\&.
.RE
.PP
\fB\-x\fR
.RS 4
Use xnu to mask protein tandem repeats\&.
.RE
.SH "SEE ALSO"
.PP
\fBglam2\fR(1),
\fBglam2format\fR(1),
\fBglam2mask\fR(1),
\fBglam2scan\fR(1),
\fBxnu\fR(1)
.PP
The full Hypertext documentation of GLAM2 is available online at
\fIhttp://bioinformatics\&.org\&.au/glam2/\fR
or on this computer in
\fI/usr/share/doc/glam2/\fR\&.
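.SH "EXAMPLES"
.PP
The redundancy criterion given under DESCRIPTION (a gapless local alignment score above a user\-supplied threshold) can be illustrated with a minimal Python sketch for the DNA case\&. It is not taken from the purge source: the function names are hypothetical, only the +5/\-1 scheme and the "greater than" comparison come from this page, and details such as ambiguity codes or the heuristic used by \fB\-b\fR are ignored\&.
.PP
.nf
```python
# Hypothetical sketch, not the purge source code.
def gapless_local_score(a, b, match=5, mismatch=-1):
    """Best score over all ungapped local alignments of a and b."""
    best = 0
    # Each diagonal offset d pairs a[i] with b[i + d]; no gaps are allowed.
    for d in range(-(len(a) - 1), len(b)):
        run = 0
        for i in range(max(0, -d), min(len(a), len(b) - d)):
            run += match if a[i] == b[i + d] else mismatch
            run = max(run, 0)  # a local alignment may restart anywhere
            best = max(best, run)
    return best


def is_redundant(a, b, threshold):
    """True if the pair scores above the threshold (hypothetical helper)."""
    return gapless_local_score(a, b) > threshold
```
.fi
.PP
Under these assumptions, two identical 4\-base sequences score 20, so they count as redundant at a threshold of 19 but not at a threshold of 20\&.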
.SH "REFERENCES"
.PP
Purge was written by Andy Neuwald and is described in more detail in Neuwald et al\&., "Gibbs motif sampling: detection of bacterial outer membrane protein repeats", Protein Science, 4:1618\(en1632, 1995\&. Please cite it if you use Purge\&.
.PP
If you use GLAM2, please cite: MC Frith, NFW Saunders, B Kobe, TL Bailey (2008) Discovering sequence motifs with arbitrary insertions and deletions, PLoS Computational Biology (in press)\&.
.SH "AUTHORS"
.PP
\fBAndrew Neuwald\fR
.sp -1n
.IP "" 4
Author of purge, renamed glam2\-purge in Debian\&.
.PP
\fBMartin Frith\fR
.sp -1n
.IP "" 4
Modified purge to be ANSI standard C and improved the user interface\&.
.PP
\fBTimothy Bailey\fR
.sp -1n
.IP "" 4
Modified purge to be ANSI standard C and improved the user interface\&.
.PP
\fBCharles Plessy\fR <\&plessy@debian\&.org\&>
.sp -1n
.IP "" 4
Formatted this manpage in DocBook XML for the Debian distribution\&.
.SH "COPYRIGHT"
.PP
The source code and the documentation of Purge and GLAM2 are released in the public domain\&.
.sp