HHALIGN(1)

User Commands


NAME

hhalign - align a query alignment/HMM to a template alignment/HMM

SYNOPSIS

hhalign -i query [-t template] [options]

DESCRIPTION

HHalign version 2.0.16 (January 2013)

Align a query alignment/HMM to a template alignment/HMM by HMM-HMM alignment. If only one alignment/HMM is given, it is compared to itself, and the best off-diagonal alignment plus all further non-overlapping alignments above the significance threshold are shown.

Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).

(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

-i <file>: input query alignment (fasta/a2m/a3m) or HMM file (.hhm)

-t <file>: input template alignment (fasta/a2m/a3m) or HMM file (.hhm)

Output options:

-o <file>: write output alignment to file

-ofas <file>: write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format

-Oa3m <file>: write query alignment in a3m format to file (default=none)

-Aa3m <file>: append query alignment in a3m format to file (default=none)

-atab <file>: write alignment as a table (with posteriors) to file (default=none)

-index <file>: use given alignment to calculate Viterbi score (default=none)

-v <int>: verbose mode: 0:no screen output 1:only warnings 2: verbose

-seq: [1,inf[ max. number of query/template sequences displayed (def=1)

-nocons: don't show consensus sequence in alignments (default=show)

-nopred: don't show predicted 2ndary structure in alignments (default=show)

-nodssp: don't show DSSP 2ndary structure in alignments (default=show)

-ssconf: show confidences for predicted 2ndary structure in alignments

-aliw <int>: number of columns per line in alignment list (def=80)

-P <float>: for self-comparison: max p-value of alignments (def=0.001)

-p <float>: minimum probability in summary and alignment list (def=0)

-E <float>: maximum E-value in summary and alignment list (def=1E+06)

-Z <int>: maximum number of lines in summary hit list (def=100)

-z <int>: minimum number of lines in summary hit list (def=1)

-B <int>: maximum number of alignments in alignment list (def=100)

-b <int>: minimum number of alignments in alignment list (def=1)

-rank <int>: specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)

Filter input alignment (options can be combined):

-id: [0,100] maximum pairwise sequence identity (%) (def=90)

-diff [0,inf[: filter most diverse set of sequences, keeping at least this many sequences in each block of >50 columns (def=100)

-cov: [0,100] minimum coverage with query (%) (def=0)

-qid: [0,100] minimum sequence identity with query (%) (def=0)

-qsc: [0,100] minimum score per column with query (def=-20.0)

Input alignment format:

-M a2m: use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

-M first: use FASTA: columns with residue in 1st sequence are match states

-M [0,100]: use FASTA: columns with fewer than X% gaps are match states
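The A2M/A3M character conventions described for -M a2m can be illustrated with a short Python sketch. The function names `classify_a2m` and `match_columns` are hypothetical helpers written for this illustration; they are not part of HH-suite and only mirror the rules stated above.

```python
def classify_a2m(seq):
    """Label each character of one A2M/A3M row by the state it encodes."""
    labels = []
    for ch in seq:
        if ch == "-":
            labels.append("delete")       # '-' = Delete
        elif ch == ".":
            labels.append("insert_gap")   # '.' = gap aligned to an insert
        elif ch.isupper():
            labels.append("match")        # upper case = Match
        elif ch.islower():
            labels.append("insert")       # lower case = Insert
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return labels


def match_columns(seq):
    """Keep only match-column symbols (upper case residues and '-')."""
    return "".join(ch for ch in seq if ch == "-" or ch.isupper())


row = "MKv-L.A"  # made-up row for illustration
print(classify_a2m(row))
print(match_columns(row))
```

Because '.' only pads inserts, dropping it (as A3M does) loses no information about the match columns.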

HMM-HMM alignment options:

-glob/-loc: global or local alignment mode (def=local)

-alt <int>: show up to this number of alternative alignments (def=1)

-realign: realign displayed hits with max. accuracy (MAC) algorithm

-norealign: do NOT realign displayed hits with MAC algorithm (def=realign)

-mact [0,1[: posterior probability threshold for MAC alignment (def=0.350) A threshold value of 0.0 yields global alignments.

-sto <int>: use global stochastic sampling algorithm to sample this many alignments

-excl <range>: exclude query positions from the alignment, e.g. '1-33,97-168'

-shift [-1,1]: score offset (def=-0.030)

-corr [0,1]: weight of term for pair correlations (def=0.10)

-ssm: 0-4 0:no ss scoring 1:ss scoring after alignment 2:ss scoring during alignment [default=2]

-ssw: [0,1] weight of ss score (def=0.11)

-def: read default options from ./.hhdefaults or <home>/.hhdefaults

Example: hhalign -i T0187.a3m -t d1hz4a_.hhm -png T0187pdb.png
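When hhalign is driven from a script, the argument list can be assembled programmatically. The following Python sketch builds a command line using only options listed above (-i, -t, -o, -loc/-glob, -mact, -E). The helper name `build_hhalign_cmd` and the output file name `T0187.hhr` are assumptions made for illustration. The alignment mode is always passed explicitly so that the result does not depend on the program's default.

```python
import shlex


def build_hhalign_cmd(query, template=None, output=None,
                      local=True, mact=None, max_evalue=None):
    """Assemble an hhalign argument list (hypothetical helper)."""
    cmd = ["hhalign", "-i", query]
    if template is not None:
        cmd += ["-t", template]           # omit for self-comparison
    if output is not None:
        cmd += ["-o", output]
    cmd.append("-loc" if local else "-glob")
    if mact is not None:
        if not 0.0 <= mact <= 1.0:
            raise ValueError("mact must lie between 0 and 1")
        cmd += ["-mact", str(mact)]
    if max_evalue is not None:
        cmd += ["-E", str(max_evalue)]
    return cmd


cmd = build_hhalign_cmd("T0187.a3m", "d1hz4a_.hhm",
                        output="T0187.hhr", mact=0.35)
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a system where hhalign is installed.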

Output options:

-o <file>: write output alignment to file

-ofas <file>: write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format

-Oa3m <file>: write query alignment in a3m format to file (default=none)

-Aa3m <file>: append query alignment in a3m format to file (default=none)

-atab <file>: write alignment as a table (with posteriors) to file (default=none)

-v <int>: verbose mode: 0:no screen output 1:only warnings 2: verbose

-seq: [1,inf[ max. number of query/template sequences displayed (def=1)

-nocons: don't show consensus sequence in alignments (default=show)

-nopred: don't show predicted 2ndary structure in alignments (default=show)

-nodssp: don't show DSSP 2ndary structure in alignments (default=show)

-ssconf: show confidences for predicted 2ndary structure in alignments

-aliw <int>: number of columns per line in alignment list (def=80)

-P <float>: for self-comparison: max p-value of alignments (def=0.001)

-p <float>: minimum probability in summary and alignment list (def=0)

-E <float>: maximum E-value in summary and alignment list (def=1E+06)

-Z <int>: maximum number of lines in summary hit list (def=100)

-z <int>: minimum number of lines in summary hit list (def=1)

-B <int>: maximum number of alignments in alignment list (def=100)

-b <int>: minimum number of alignments in alignment list (def=1)

-rank <int>: specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)

-tc <file>: write a TCoffee library file for the pairwise comparison

-tct [0,100]: min. probability of residue pairs for TCoffee (def=5%)

Options to filter input alignment (options can be combined):

-id: [0,100] maximum pairwise sequence identity (%) (def=90)

-diff [0,inf[: filter most diverse set of sequences, keeping at least this many sequences in each block of >50 columns (def=100)

-cov: [0,100] minimum coverage with query (%) (def=0)

-qid: [0,100] minimum sequence identity with query (%) (def=0)

-qsc: [0,100] minimum score per column with query (def=-20.0)

HMM-building options:

-M a2m: use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

-M first: use FASTA: columns with residue in 1st sequence are match states

-M [0,100]: use FASTA: columns with fewer than X% gaps are match states

-tags: do NOT neutralize His-, C-myc-, FLAG-tags, and trypsin recognition sequence to background distribution

Pseudocount (pc) options:

-pcm: 0-2 position dependence of pc admixture 'tau' (pc mode, default=2)

0: no pseudo counts: tau = 0

1: constant: tau = a

2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c) (Neff[i]: number of effective seqs in local MSA around column i)

3: constant diversity pseudocounts

-pca: [0,1] overall pseudocount admixture (def=1.0)

-pcb: [1,inf[ Neff threshold value for -pcm 2 (def=1.5)

-pcc: [0,3] extinction exponent c for -pcm 2 (def=1.0)

-pre_pca [0,1]: PREFILTER pseudocount admixture (def=0.8)

-pre_pcb [1,inf[: PREFILTER threshold for Neff (def=1.8)
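The pseudocount admixture formulas for -pcm modes 0 to 2 can be evaluated with a small Python sketch. It assumes that a, b and c correspond to the -pca, -pcb and -pcc values; the text defines c that way explicitly, while the mapping of a and b is inferred from the option descriptions. The function name `pc_admixture` is hypothetical, and mode 3 is left out because no formula is given for it.

```python
def pc_admixture(neff, mode=2, a=1.0, b=1.5, c=1.0):
    """Return tau for one column, following the -pcm formulas above."""
    if mode == 0:
        return 0.0                      # no pseudocounts
    if mode == 1:
        return a                        # constant admixture
    if mode == 2:
        if neff < 1.0:
            raise ValueError("Neff is expected to be at least 1")
        # diversity-dependent: tau = a / (1 + ((Neff - 1) / b)^c)
        return a / (1.0 + ((neff - 1.0) / b) ** c)
    raise ValueError("only modes 0, 1 and 2 are sketched here")


# With the defaults, tau falls from 1.0 at Neff = 1 to 0.5 at Neff = 2.5.
for neff in (1.0, 2.5, 10.0):
    print(neff, pc_admixture(neff))
```

With the default values, a single-sequence column (Neff = 1) gets full admixture, and tau shrinks as the local alignment becomes more diverse.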

Context-specific pseudo-counts:

-nocontxt: use substitution-matrix instead of context-specific pseudocounts

-contxt <file>: context file for computing context-specific pseudocounts (default=./data/context_data.lib)

-cslib: <file> column state file for fast database prefiltering (default=./data/cs219.lib)

Gap cost options:

-gapb [0,inf[: Transition pseudocount admixture (def=1.00)

-gapd [0,inf[: Transition pseudocount admixture for open gap (default=0.15)

-gape [0,1.5]: Transition pseudocount admixture for extend gap (def=1.00)

-gapf ]0,inf]: factor to increase/reduce the gap open penalty for deletes (def=0.60)

-gapg ]0,inf]: factor to increase/reduce the gap open penalty for inserts (def=0.60)

-gaph ]0,inf]: factor to increase/reduce the gap extend penalty for deletes (def=0.60)

-gapi ]0,inf]: factor to increase/reduce the gap extend penalty for inserts (def=0.60)

-egq: [0,inf[ penalty (bits) for end gaps aligned to query residues (def=0.00)

-egt: [0,inf[ penalty (bits) for end gaps aligned to template residues (def=0.00)

Alignment options:

-glob/-loc: global or local alignment mode (def=global)

-mac: use Maximum Accuracy (MAC) alignment instead of Viterbi

-mact [0,1]: posterior prob threshold for MAC alignment (def=0.350)

-sto <int>: use global stochastic sampling algorithm to sample this many alignments

-sc: <int> amino acid score (tja: template HMM at column j) (def=1)

0: = log2 Sum(tja*qia/pa) (pa: aa background frequencies)

1: = log2 Sum(tja*qia/pqa) (pqa = 1/2*(pa+ta) )

2: = log2 Sum(tja*qia/ta) (ta: av. aa freqs in template)

3: = log2 Sum(tja*qia/qa) (qa: av. aa freqs in query)

-corr [0,1]: weight of term for pair correlations (def=0.10)

-shift [-1,1]: score offset (def=-0.030)

-r: repeat identification: multiple hits not treated as independent

-ssm: 0-2 0:no ss scoring 1:ss scoring after alignment 2:ss scoring during alignment [default=2]

-ssw: [0,1] weight of ss score compared to column score (def=0.11)

-ssa: [0,1] ss confusion matrix = (1-ssa)*I + ssa*psipred-confusion-matrix (def=1.00)

-calm 0-3: empirical score calibration of 0:query 1:template 2:both (def=off)
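The four amino acid column scores listed under -sc can be written out as a short Python sketch. It assumes that qia denotes the query HMM frequency of amino acid a at column i, by analogy with the stated definition of tja. The function name `column_score` and the two-letter toy alphabet are hypothetical and serve only to keep the example small.

```python
import math


def column_score(q_col, t_col, background, mode=1, q_avg=None, t_avg=None):
    """Score one query column against one template column in bits.

    q_col, t_col: amino acid frequencies at the two columns (qia, tja)
    background:   background frequencies pa
    q_avg, t_avg: average frequencies qa and ta (needed for modes 1-3)
    """
    total = 0.0
    for aa, pa in background.items():
        numerator = t_col.get(aa, 0.0) * q_col.get(aa, 0.0)
        if numerator == 0.0:
            continue                      # term contributes nothing
        if mode == 0:
            denom = pa
        elif mode == 1:
            denom = 0.5 * (pa + t_avg[aa])  # pqa = 1/2*(pa+ta)
        elif mode == 2:
            denom = t_avg[aa]
        elif mode == 3:
            denom = q_avg[aa]
        else:
            raise ValueError("mode must be 0, 1, 2 or 3")
        total += numerator / denom
    return math.log2(total) if total > 0.0 else float("-inf")


# Toy two-letter alphabet, for illustration only.
bg = {"A": 0.5, "B": 0.5}
q = {"A": 1.0, "B": 0.0}
t = {"A": 1.0, "B": 0.0}
print(column_score(q, t, bg, mode=0))
```

Two columns that agree on a residue rarer than expected under the chosen denominator score above zero; columns with no shared residues score minus infinity in this sketch.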

Default options can be specified in './.hhdefaults' or '~/.hhdefaults'

November 2014

hhalign 2.0.16

Source file:	hhalign.1.en.gz (from hhsuite 2.0.16-5)
Source last updated:	2014-11-07T06:12:38Z