NAME
clustalo - General purpose multiple sequence alignment program for proteins
SYNOPSIS
clustalo [-h]
DESCRIPTION
Clustal-Omega is a general purpose multiple sequence alignment (MSA) program for
proteins. It produces high quality MSAs and is capable of handling data-sets
of hundreds of thousands of sequences in reasonable time.
In default mode, users supply a file of sequences to be aligned. The
sequences are clustered to produce a guide tree, which then guides a
"progressive alignment" of the sequences. There are also facilities
for aligning existing alignments to each other, for aligning a sequence to an
alignment, and for using a hidden Markov model (HMM) to help guide an alignment
of new sequences that are homologous to the sequences used to make the HMM.
This last procedure is referred to as "external profile alignment",
or EPA.
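The modes described above can be sketched as command lines. The following Python helper is only an illustration: the function name clustalo_command and all file names are hypothetical, and the options -i, -o, --profile1, --profile2 and --hmm-in are assumed to match those listed by "clustalo -h" and the README, so check them against the installed version before relying on them.

```python
from typing import List, Optional


def clustalo_command(
    out_path: str,
    seqs: Optional[str] = None,
    profile1: Optional[str] = None,
    profile2: Optional[str] = None,
    hmm: Optional[str] = None,
) -> List[str]:
    """Build an argument list for one of the modes described above.

    Hypothetical helper: the option names are assumptions to be
    verified with `clustalo -h`.
    """
    cmd = ["clustalo"]
    if seqs is not None:
        cmd += ["-i", seqs]              # unaligned input sequences
    if profile1 is not None:
        cmd += ["--profile1", profile1]  # first existing alignment
    if profile2 is not None:
        cmd += ["--profile2", profile2]  # second existing alignment
    if hmm is not None:
        cmd += ["--hmm-in", hmm]         # HMM for external profile alignment
    if len(cmd) == 1:
        raise ValueError("at least one input file is required")
    cmd += ["-o", out_path]              # output alignment
    return cmd


# Default mode: align a file of sequences (placeholder file names).
default_cmd = clustalo_command("aligned.fa", seqs="seqs.fa")

# Align two existing alignments to each other.
profiles_cmd = clustalo_command("merged.fa", profile1="aln1.fa", profile2="aln2.fa")

# Align new sequences to an existing alignment.
seq_to_profile_cmd = clustalo_command("extended.fa", seqs="new.fa", profile1="aln1.fa")

# External profile alignment (EPA): guide the alignment with an HMM.
epa_cmd = clustalo_command("aligned.fa", seqs="seqs.fa", hmm="family.hmm")

print(" ".join(default_cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) on a system where clustalo is installed; the sketch itself only builds the arguments and does not run the program.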
Clustal-Omega uses HMMs for the alignment engine, based on the HHalign package
from Johannes Soeding [1]. Guide trees are made using an enhanced version of
mBed [2], which can cluster very large numbers of sequences in O(N*log(N))
time. Multiple alignment then proceeds by aligning larger and larger
alignments using HHalign, following the clustering given by the guide tree.
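The order in which a guide tree drives progressive alignment can be illustrated with a toy example. This is a minimal sketch, not Clustal-Omega's implementation: the nested-tuple tree, the sequence labels and the function merge_order are all hypothetical, and no actual alignment is computed. It only shows that subtrees are merged bottom-up, so the groups being joined grow as the traversal approaches the root.

```python
from typing import List, Tuple, Union

# A guide tree is either a leaf (sequence label) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def merge_order(tree: Tree) -> List[Tuple[List[str], List[str]]]:
    """Return the pairs of groups to align, in post-order (children first)."""
    steps: List[Tuple[List[str], List[str]]] = []

    def visit(node: Tree) -> List[str]:
        if isinstance(node, str):
            return [node]                 # a leaf is a single sequence
        left, right = node
        left_members = visit(left)        # finish the left subtree first
        right_members = visit(right)      # then the right subtree
        steps.append((left_members, right_members))
        return left_members + right_members

    visit(tree)
    return steps


# Hypothetical guide tree over five sequences labelled A to E.
guide_tree: Tree = (("A", "B"), ("C", ("D", "E")))
steps = merge_order(guide_tree)

for left, right in steps:
    print(left, "+", right)
```

For this tree the merges are A with B, D with E, C with the D/E group, and finally the A/B group with the C/D/E group. In the real program each merge would be a profile-profile alignment.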
In its current form, Clustal-Omega aligns protein sequences only, not
DNA/RNA sequences. Support for DNA/RNA is envisioned for a
future version.
USAGE
Tool usage is available in /usr/share/doc/clustalo/README.
DEVELOPMENT
Headers and libraries are available in the libclustalo-dev package.
CITING
Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H,
Remmert M, Söding J, Thompson JD, Higgins DG (2011). Fast, scalable
generation of high-quality protein multiple sequence alignments
using Clustal Omega. Mol Syst Biol 7.
AUTHOR
Olivier Sallou (olivier.sallou (at) irisa.fr) - Man page and packaging
Conway Institute UCD Dublin (clustalw (at) ucd.ie) - clustalo