HMM_TRAIN(1) User Commands HMM_TRAIN(1)

NAME

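hmm_train - Estimate the transition probabilities of an HMM, based on multiple alignments, sequence annotations, and a category map

SYNOPSIS

hmm_train -m <msa_fname_list> -c <category_map_fname> -g <gff_fname_list> [OPTIONS] > out.hmm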

DESCRIPTION

Estimate the transition probabilities of an HMM, based on multiple alignments, sequence annotations, and a category map.
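
As a rough illustration of what estimating transition probabilities from labeled sites involves, the following Python sketch counts transitions between the category labels of adjacent sites and normalizes each row of counts. It is not hmm_train's implementation; the function name, the pseudocount parameter, and the toy label sequence are assumptions made for illustration.

```python
def estimate_transitions(site_labels, num_states, pseudocount=0.0):
    """Return a row-normalized transition matrix from per-site state labels."""
    # counts[i][j] holds the number of times state i is followed by state j.
    counts = [[pseudocount] * num_states for _ in range(num_states)]
    for prev, curr in zip(site_labels, site_labels[1:]):
        counts[prev][curr] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left in the training data gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy labeling: category 0 for sites outside any feature, category 1 inside one.
site_labels = [0, 0, 1, 1, 1, 0, 0, 0]
probs = estimate_transitions(site_labels, num_states=2)
```

Here state 0 is followed by state 0 three times and by state 1 once, so the first row is [0.75, 0.25]. A nonzero pseudocount would keep unseen transitions from receiving probability zero.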

OPTIONS

required options

-m <msa_fname_list>

List of multiple sequence alignment files. Currently, in testing mode, the list must be of length one.

-c <category_map_fname>

File defining mapping of feature types to category numbers.

-g <gff_fname_list>

Files in GFF format defining the sequence features used to label sites. The frame of reference for feature coordinates is determined feature by feature from the 'seqname' attribute. Filenames must correspond in number and order to the elements of <msa_fname_list>.
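
To make the roles of the GFF features and the category map concrete, here is a minimal Python sketch that assigns a category number to each site. It assumes the GFF coordinates are already in the desired coordinate frame (it does not perform the seqname-based mapping described above), represents the category map as a plain dictionary rather than reading a category map file, and uses category 0 for sites not covered by any feature. The function name and sample features are hypothetical.

```python
def label_sites(gff_lines, category_map, length, background=0):
    """Label each of `length` sites with the category of the feature covering it."""
    site_labels = [background] * length
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        feature_type, start, end = fields[2], int(fields[3]), int(fields[4])
        category = category_map.get(feature_type)
        if category is None:
            continue  # feature type not listed in the category map
        # GFF coordinates are 1-based and inclusive; clip them to the sequence.
        for pos in range(max(start, 1) - 1, min(end, length)):
            site_labels[pos] = category
    return site_labels


gff = [
    'seq1\tdemo\tCDS\t3\t5\t.\t+\t0\ttranscript_id "t1"',
    'seq1\tdemo\tintron\t6\t7\t.\t+\t.\ttranscript_id "t1"',
]
site_labels = label_sites(gff, {"CDS": 1, "intron": 2}, length=8)
```

The resulting list has one entry per site, with later features overwriting earlier ones where they overlap.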

alignment options

-M <msa_length_list>

(Mutually exclusive with -m) Assume alignments of the specified lengths (comma-separated list) and do not attempt to map the coordinates in the specified GFFs (assume they are already in the desired coordinate frame). This option allows an HMM to be trained directly from GFFs, without alignments. Not permitted with -I.

-i PHYLIP|FASTA|MPM|SS

(default SS) Alignment format.

-R <tag>

Before estimating transition probabilities, group features by <tag> (e.g., "transcript_id" or "exon_id") and reverse complement segments of the alignment corresponding to groups on the reverse strand. Groups must be non-overlapping (see refeature --unique).
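
The grouping and reverse-complementing step can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's code: features are given as hypothetical (group, start, end, strand) tuples with 1-based inclusive coordinates, the alignment is a list of equal-length strings, and each group's extent is taken to run from its smallest start to its largest end.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp_segment(rows, start, end):
    """Reverse-complement columns start..end (1-based, inclusive) in every row."""
    out = []
    for row in rows:
        segment = row[start - 1:end]
        # Complement the bases, then reverse; gap characters pass through unchanged.
        flipped = segment.translate(COMPLEMENT)[::-1]
        out.append(row[:start - 1] + flipped + row[end:])
    return out


def revcomp_reverse_groups(rows, features):
    """Reverse-complement the extent of every group annotated on the '-' strand."""
    extents = {}
    for group, start, end, strand in features:
        lo, hi, _ = extents.get(group, (start, end, strand))
        extents[group] = (min(lo, start), max(hi, end), strand)
    # Groups are assumed not to overlap, so the processing order does not matter.
    for lo, hi, strand in extents.values():
        if strand == "-":
            rows = revcomp_segment(rows, lo, hi)
    return rows


alignment = ["AACGTT", "AAC-TT"]
features = [("t1", 2, 3, "-"), ("t1", 4, 4, "-"), ("t2", 5, 6, "+")]
result = revcomp_reverse_groups(alignment, features)
```

If groups overlapped, a shared column could be reverse-complemented twice, which is one reason the non-overlap requirement matters.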

indel options

-I <indel_cat_list>

To have nonzero probability for the states corresponding to a specified category range, indels must be "clean" (nonoverlapping), must be assignable by parsimony to a single branch in the phylogenetic tree, and must have lengths that are exact multiples of the category range size. Avoid -G with this option. If used in training mode, requires -T.

-t <tree_fname>

Use the specified tree topology when training for indels.

-n <nseqs>

Train an indel model for <nseqs> sequences, even though the training alignment has a different number of sequences. All (non-trivial) gap patterns are assumed to be equally frequent.

other options

-q

Proceed quietly (without updates to stderr).

-h

Print this help message and exit.
May 2016 hmm_train 1.4