'\" t
.\" Title: CLUSTALW
.\" Author: Des Higgins
.\" Generator: DocBook XSL Stylesheets v1.75.2
.\" Date: 12/28/2010
.\" Manual: Clustal Manual
.\" Source: Clustal 2.1
.\" Language: English
.\"
.TH "CLUSTALW" "1" "12/28/2010" "Clustal 2.1" "Clustal Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
clustalw \- Multiple alignment of nucleic acid and protein sequences
.SH "SYNOPSIS"
.HP \w'\fBclustalw\fR\ 'u
\fBclustalw\fR [\fB\-infile\fR] \fIfile\&.ext\fR [\fBOPTIONS\fR]
.HP \w'\fBclustalw\fR\ 'u
\fBclustalw\fR [\fB\-help\fR | \fB\-fullhelp\fR]
.SH "DESCRIPTION"
.PP
Clustal\ \&W is a general purpose multiple alignment program for DNA or proteins\&.
.PP
The program performs simultaneous alignment of many nucleotide or amino acid sequences\&. It is typically run interactively, providing a menu and online help\&. To use it in command\-line (batch) mode instead, you have to give several options, the minimum being \fB\-infile\fR\&.
.SH "OPTIONS"
.SS "DATA (sequences)"
.PP
\fB \-infile=\fR\fB\fIfile\&.ext\fR\fR\fB \fR
.RS 4
Input sequences\&.
.RE
.PP
\fB \-profile1=\fR\fB\fIfile\&.ext\fR\fR\fB and \-profile2=\fR\fB\fIfile\&.ext\fR\fR\fB \fR
.RS 4
Profiles (old alignment)\&.
.RE
.SS "VERBS (do things)"
.PP
\fB\-options\fR
.RS 4
List the command line parameters\&.
.RE
.PP
\fB\-help or \-check\fR
.RS 4
Outline the command line parameters\&.
.RE
.PP
\fB\-fullhelp\fR
.RS 4
Output full help content\&.
.RE
.PP
\fB\-align\fR
.RS 4
Do full multiple alignment\&.
.RE
.PP
\fB\-tree\fR
.RS 4
Calculate NJ tree\&.
.RE
.PP
\fB\-pim\fR
.RS 4
Output percent identity matrix (while calculating the tree)\&.
.RE
.PP
\fB \-bootstrap\fR\fB\fI=n\fR\fR\fB \fR
.RS 4
Bootstrap a NJ tree (\fIn\fR = number of bootstraps; default = 1000)\&.
.RE
.PP
\fB\-convert\fR
.RS 4
Output the input sequences in a different file format\&.
.RE
.SS "PARAMETERS (set things)"
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBGeneral settings:\fR
.RS 4
.PP
\fB\-interactive\fR
.RS 4
Read command line, then enter normal interactive menus\&.
.RE
.PP
\fB\-quicktree\fR
.RS 4
Use FAST algorithm for the alignment guide tree\&.
.RE
.PP
\fB\-type=\fR
.RS 4
\fIPROTEIN\fR or \fIDNA\fR sequences\&.
.RE
.PP
\fB\-negative\fR
.RS 4
Protein alignment with negative values in matrix\&.
.RE
.PP
\fB\-outfile=\fR
.RS 4
Sequence alignment file name\&.
.RE
.PP
\fB\-output=\fR
.RS 4
\fIGCG\fR, \fIGDE\fR, \fIPHYLIP\fR, \fIPIR\fR or \fINEXUS\fR\&.
.RE
.PP
\fB\-outputorder=\fR
.RS 4
\fIINPUT\fR or \fIALIGNED\fR\&.
.RE
.PP
\fB\-case\fR
.RS 4
\fILOWER\fR or \fIUPPER\fR (for GDE output only)\&.
.RE
.PP
\fB\-seqnos=\fR
.RS 4
\fIOFF\fR or \fION\fR (for Clustal output only)\&.
.RE
.PP
\fB\-seqnos_range=\fR
.RS 4
\fIOFF\fR or \fION\fR (NEW: for all output formats)\&.
.RE
.PP
\fB\-range=\fR\fB\fIm\fR\fR\fB,\fR\fB\fIn\fR\fR
.RS 4
Sequence range to write, from \fIm\fR to \fIm\fR+\fIn\fR\&.
.RE
.PP
\fB\-maxseqlen=\fR\fB\fIn\fR\fR
.RS 4
Maximum allowed input sequence length\&.
.RE
.PP
\fB\-quiet\fR
.RS 4
Reduce console output to minimum\&.
.RE
.PP
\fB\-stats=\fR\fB\fIfile\fR\fR
.RS 4
Log some alignment statistics to \fIfile\fR\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBFast Pairwise Alignments:\fR
.RS 4
.PP
\fB\-ktuple=\fR\fB\fIn\fR\fR
.RS 4
Word size\&.
.RE
.PP
\fB\-topdiags=\fR\fB\fIn\fR\fR
.RS 4
Number of best diagonals\&.
.RE
.PP
\fB\-window=\fR\fB\fIn\fR\fR
.RS 4
Window around best diagonals\&.
.RE
.PP
\fB\-pairgap=\fR\fB\fIn\fR\fR
.RS 4
Gap penalty\&.
.RE
.PP
\fB\-score\fR
.RS 4
\fIPERCENT\fR or \fIABSOLUTE\fR\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBSlow Pairwise Alignments:\fR
.RS 4
.PP
\fB\-pwmatrix=\fR
.RS 4
Protein weight matrix=\fIBLOSUM\fR, \fIPAM\fR, \fIGONNET\fR, \fIID\fR or \fIfilename\fR\&.
.RE
.PP
\fB\-pwdnamatrix=\fR
.RS 4
DNA weight matrix=\fIIUB\fR, \fICLUSTALW\fR or \fIfilename\fR\&.
.RE
.PP
\fB\-pwgapopen=\fR\fB\fIf\fR\fR
.RS 4
Gap opening penalty\&.
.RE
.PP
\fB\-pwgapext=\fR\fB\fIf\fR\fR
.RS 4
Gap extension penalty\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBMultiple Alignments:\fR
.RS 4
.PP
\fB\-newtree=\fR
.RS 4
File for new guide tree\&.
.RE
.PP
\fB\-usetree=\fR
.RS 4
File for old guide tree\&.
.RE
.PP
\fB\-matrix=\fR
.RS 4
Protein weight matrix=\fIBLOSUM\fR, \fIPAM\fR, \fIGONNET\fR, \fIID\fR or \fIfilename\fR\&.
.RE
.PP
\fB\-dnamatrix=\fR
.RS 4
DNA weight matrix=\fIIUB\fR, \fICLUSTALW\fR or \fIfilename\fR\&.
.RE
.PP
\fB\-gapopen=\fR\fB\fIf\fR\fR
.RS 4
Gap opening penalty\&.
.RE
.PP
\fB\-gapext=\fR\fB\fIf\fR\fR
.RS 4
Gap extension penalty\&.
.RE
.PP
\fB\-endgaps\fR
.RS 4
No end gap separation penalty\&.
.RE
.PP
\fB\-gapdist=\fR\fB\fIn\fR\fR
.RS 4
Gap separation penalty range\&.
.RE
.PP
\fB\-nogap\fR
.RS 4
Residue\-specific gaps off\&.
.RE
.PP
\fB\-nohgap\fR
.RS 4
Hydrophilic gaps off\&.
.RE
.PP
\fB\-hgapresidues=\fR
.RS 4
List of hydrophilic residues\&.
.RE
.PP
\fB\-maxdiv=\fR\fB\fIn\fR\fR
.RS 4
Percent identity for delay\&.
.RE
.PP
\fB\-type=\fR
.RS 4
\fIPROTEIN\fR or \fIDNA\fR\&.
.RE
.PP
\fB\-transweight=\fR\fB\fIf\fR\fR
.RS 4
Transitions weighting\&.
.RE
.PP
\fB\-iteration=\fR
.RS 4
\fINONE\fR or \fITREE\fR or \fIALIGNMENT\fR\&.
.RE
.PP
\fB\-numiter=\fR\fB\fIn\fR\fR
.RS 4
Maximum number of iterations to perform\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBProfile Alignments:\fR
.RS 4
.PP
\fB\-profile\fR
.RS 4
Merge two alignments by profile alignment\&.
.RE
.PP
\fB\-newtree1=\fR
.RS 4
File for new guide tree for profile1\&.
.RE
.PP
\fB\-newtree2=\fR
.RS 4
File for new guide tree for profile2\&.
.RE
.PP
\fB\-usetree1=\fR
.RS 4
File for old guide tree for profile1\&.
.RE
.PP
\fB\-usetree2=\fR
.RS 4
File for old guide tree for profile2\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBSequence to Profile Alignments:\fR
.RS 4
.PP
\fB\-sequences\fR
.RS 4
Sequentially add profile2 sequences to profile1 alignment\&.
.RE
.PP
\fB\-newtree=\fR
.RS 4
File for new guide tree\&.
.RE
.PP
\fB\-usetree=\fR
.RS 4
File for old guide tree\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBStructure Alignments:\fR
.RS 4
.PP
\fB\-nosecstr1\fR
.RS 4
Do not use secondary structure\-gap penalty mask for profile 1\&.
.RE
.PP
\fB\-nosecstr2\fR
.RS 4
Do not use secondary structure\-gap penalty mask for profile 2\&.
.RE
.PP
\fB\-secstrout=\fR\fB\fISTRUCTURE\fR\fR\fB or \fR\fB\fIMASK\fR\fR\fB or \fR\fB\fIBOTH\fR\fR\fB or \fR\fB\fINONE\fR\fR
.RS 4
Output in alignment file\&.
.RE
.PP
\fB\-helixgap=\fR\fB\fIn\fR\fR
.RS 4
Gap penalty for helix core residues\&.
.RE
.PP
\fB\-strandgap=\fR\fB\fIn\fR\fR
.RS 4
Gap penalty for strand core residues\&.
.RE
.PP
\fB\-loopgap=\fR\fB\fIn\fR\fR
.RS 4
Gap penalty for loop regions\&.
.RE
.PP
\fB\-terminalgap=\fR\fB\fIn\fR\fR
.RS 4
Gap penalty for structure termini\&.
.RE
.PP
\fB\-helixendin=\fR\fB\fIn\fR\fR
.RS 4
Number of residues inside helix to be treated as terminal\&.
.RE
.PP
\fB\-helixendout=\fR\fB\fIn\fR\fR
.RS 4
Number of residues outside helix to be treated as terminal\&.
.RE
.PP
\fB\-strandendin=\fR\fB\fIn\fR\fR
.RS 4
Number of residues inside strand to be treated as terminal\&.
.RE
.PP
\fB\-strandendout=\fR\fB\fIn\fR\fR
.RS 4
Number of residues outside strand to be treated as terminal\&.
.RE
.RE
.sp
.it 1 an-trap
.nr an-no-space-flag 1
.nr an-break-flag 1
.br
.ps +1
\fBTrees:\fR
.RS 4
.PP
\fB\-outputtree=\fR\fB\fInj\fR\fR\fB OR \fR\fB\fIphylip\fR\fR\fB OR \fR\fB\fIdist\fR\fR\fB OR \fR\fB\fInexus\fR\fR
.RS 4
.RE
.PP
\fB\-seed=\fR\fB\fIn\fR\fR
.RS 4
Seed number for bootstraps\&.
.RE
.PP
\fB\-kimura\fR
.RS 4
Use Kimura\*(Aqs correction\&.
.RE
.PP
\fB\-tossgaps\fR
.RS 4
Ignore positions with gaps\&.
.RE
.PP
\fB\-bootlabels=\fR\fB\fInode\fR\fR
.RS 4
Position of bootstrap values in tree display\&.
.RE
.PP
\fB\-clustering=\fR
.RS 4
NJ or UPGMA\&.
.RE
.RE
.SH "BUGS"
.PP
The Clustal bug tracking system can be found at
\m[blue]\fB\%http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal\fR\m[]\&.
.SH "SEE ALSO"
.PP
\fBclustalx\fR(1)\&.
.SH "REFERENCES"
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG\&. (2007)\&.
\m[blue]\fBClustal W and Clustal X version 2\&.0\&.\fR\m[]\&\s-2\u[1]\d\s+2
Bioinformatics, 23, 2947\-2948\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD\&. (2003)\&.
\m[blue]\fBMultiple sequence alignment with the Clustal series of programs\&.\fR\m[]\&\s-2\u[2]\d\s+2
Nucleic Acids Res\&., 31, 3497\-3500\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ\&. (1998)\&.
\m[blue]\fBMultiple sequence alignment with Clustal X\fR\m[]\&\s-2\u[3]\d\s+2\&.
Trends Biochem Sci\&., 23, 403\-405\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG\&. (1997)\&.
\m[blue]\fBThe CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools\&.\fR\m[]\&\s-2\u[4]\d\s+2
Nucleic Acids Res\&., 25, 4876\-4882\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Higgins DG, Thompson JD, Gibson TJ\&. (1996)\&.
\m[blue]\fBUsing CLUSTAL for multiple sequence alignments\&.\fR\m[]\&\s-2\u[5]\d\s+2
Methods Enzymol\&., 266, 383\-402\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Thompson JD, Higgins DG, Gibson TJ\&. (1994)\&.
\m[blue]\fBCLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position\-specific gap penalties and weight matrix choice\&.\fR\m[]\&\s-2\u[6]\d\s+2
Nucleic Acids Res\&., 22, 4673\-4680\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Higgins DG\&. (1994)\&.
\m[blue]\fBCLUSTAL V: multiple alignment of DNA and protein sequences\&.\fR\m[]\&\s-2\u[7]\d\s+2
Methods Mol Biol\&., 25, 307\-318\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Higgins DG, Bleasby AJ, Fuchs R\&. (1992)\&.
\m[blue]\fBCLUSTAL V: improved software for multiple sequence alignment\&.\fR\m[]\&\s-2\u[8]\d\s+2
Comput\&. Appl\&. Biosci\&., 8, 189\-191\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Higgins DG, Sharp PM\&. (1989)\&.
\m[blue]\fBFast and sensitive multiple sequence alignments on a microcomputer\&.\fR\m[]\&\s-2\u[9]\d\s+2
Comput\&. Appl\&. Biosci\&., 5, 151\-153\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Higgins DG, Sharp PM\&.
(1988)\&.
\m[blue]\fBCLUSTAL: a package for performing multiple sequence alignment on a microcomputer\&.\fR\m[]\&\s-2\u[10]\d\s+2
Gene, 73, 237\-244\&.
.RE
.SH "AUTHORS"
.PP
\fBDes Higgins\fR
.RS 4
Copyright holder for Clustal\&.
.RE
.PP
\fBJulie Thompson\fR
.RS 4
Copyright holder for Clustal\&.
.RE
.PP
\fBToby Gibson\fR
.RS 4
Copyright holder for Clustal\&.
.RE
.PP
\fBCharles Plessy\fR <\&plessy@debian\&.org\&>
.RS 4
Prepared this manpage in DocBook XML for the Debian distribution\&.
.RE
.SH "COPYRIGHT"
.br
Copyright \(co 1988\(en2010 Des Higgins, Julie Thompson & Toby Gibson (Clustal)
.br
Copyright \(co 2008\(en2010 Charles Plessy (This manpage)
.br
.PP
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version\&.
.PP
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE\&. See the GNU Lesser General Public License for more details\&.
.PP
You should have received a copy of the GNU Lesser General Public License along with this program\&. If not, see http://www\&.gnu\&.org/licenses/, or on Debian systems, /usr/share/common\-licenses/LGPL\-3\&.
.PP
This manual page and its XML source can be used, modified, and redistributed as if they were in the public domain\&.
.sp
.SH "NOTES"
.IP " 1." 4
Clustal W and Clustal X version 2.0.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/17846036
.RE
.IP " 2." 4
Multiple sequence alignment with the Clustal series of programs.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/12824352
.RE
.IP " 3." 4
Multiple sequence alignment with Clustal X
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/9810230
.RE
.IP " 4." 4
The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/9396791
.RE
.IP " 5." 4
Using CLUSTAL for multiple sequence alignments.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/8743695
.RE
.IP " 6." 4
CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/7984417
.RE
.IP " 7." 4
CLUSTAL V: multiple alignment of DNA and protein sequences.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/8004173
.RE
.IP " 8." 4
CLUSTAL V: improved software for multiple sequence alignment.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/1591615
.RE
.IP " 9." 4
Fast and sensitive multiple sequence alignments on a microcomputer.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/2720464
.RE
.IP "10." 4
CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
.RS 4
\%http://www.ncbi.nlm.nih.gov/pubmed/3243435
.RE
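.SH "EXAMPLES"
.PP
The command lines below are a minimal sketch of batch use, assembled only from the options listed under OPTIONS\&. All file names (seqs\&.fasta, seqs\&.phy, seqs\&.aln, first\&.aln, second\&.aln, merged\&.aln) and the seed value are hypothetical placeholders, and exact behaviour may differ between Clustal versions\&.
.PP
Align protein sequences and write the alignment in PHYLIP format:
.sp
.RS 4
.nf
clustalw \-infile=seqs\&.fasta \-align \-type=PROTEIN \-output=PHYLIP \-outfile=seqs\&.phy
.fi
.RE
.PP
Bootstrap a NJ tree from an existing alignment, with a fixed seed so that the run can be repeated:
.sp
.RS 4
.nf
clustalw \-infile=seqs\&.aln \-bootstrap=1000 \-seed=111 \-outputtree=phylip
.fi
.RE
.PP
Merge two existing alignments by profile alignment:
.sp
.RS 4
.nf
clustalw \-profile1=first\&.aln \-profile2=second\&.aln \-profile \-outfile=merged\&.aln
.fi
.RE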