NAME

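hmm_train - Estimate the transition probabilities of an HMM, based on multiple alignments, sequence annotations, and a category map

SYNOPSIS

hmm_train -m <msa_fname_list> -c <category_map_fname> -g <gff_fname_list> [OPTIONS] > out.hmm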

DESCRIPTION:

Estimate the transition probabilities of an HMM, based on multiple alignments, sequence annotations, and a category map.

-m <msa_fname_list>

: List of multiple sequence alignment files. Currently, in testing mode, the list must be of length one.

-c <category_map_fname>

-g <gff_fname_list>

: Files in GFF defining sequence features to be used in labeling sites. Frame of reference of feature indices is determined feature-by-feature according to

-M <msa_length_list>

: (Mutually exclusive with -m) Assume alignments of the specified lengths (comma-separated list) and do not attempt to map the coordinates in the specified GFFs (assume they are already in the desired coordinate frame). This option allows an HMM to be trained directly from GFFs, without alignments. Not permitted with -I.
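The option constraints above (-m and -M are mutually exclusive, and -M cannot be combined with -I) can be checked before the program is launched. The following is a minimal sketch of that check, not part of PHAST: the helper name `build_hmm_train_args` and its parameters are hypothetical, and it assumes the file lists for -m and -g are comma-separated in the same way as the -M length list.

```python
def build_hmm_train_args(category_map, gff_files, msa_files=None,
                         msa_lengths=None, indel_cats=None):
    """Assemble an argv list for hmm_train (hypothetical helper, not part of PHAST)."""
    # -M is documented as mutually exclusive with -m.
    if msa_files is not None and msa_lengths is not None:
        raise ValueError("-m and -M are mutually exclusive")
    # -M is documented as not permitted with -I.
    if msa_lengths is not None and indel_cats is not None:
        raise ValueError("-M is not permitted with -I")

    args = ["hmm_train"]
    if msa_files is not None:
        # Assumption: file lists are comma-separated, like the -M length list.
        args += ["-m", ",".join(msa_files)]
    if msa_lengths is not None:
        args += ["-M", ",".join(str(n) for n in msa_lengths)]
    args += ["-c", category_map, "-g", ",".join(gff_files)]
    if indel_cats is not None:
        # Passed through verbatim; the syntax of the category list is not assumed.
        args += ["-I", indel_cats]
    return args
```

The returned list could be handed to `subprocess.run` with standard output redirected to a file such as out.hmm. All file names used with the helper are placeholders.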

-i PHYLIP|FASTA|MPM|SS

-R <tag>

: Before estimating transition probabilities, group features by <tag> (e.g., "transcript_id" or "exon_id") and reverse complement segments of the alignment corresponding to groups on the reverse strand. Groups must be non-overlapping (see refeature --unique).
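Because -R requires non-overlapping groups, it can help to screen the annotations first. The sketch below is an illustration only and does not reproduce what hmm_train or refeature do internally. It assumes that a group's extent is the span from its smallest start to its largest end, that coordinates are 1-based and inclusive as in GFF, and that all features lie on the same sequence. The function name `overlapping_groups` is hypothetical.

```python
def overlapping_groups(features):
    """Return pairs of group ids whose spans overlap.

    features: iterable of (group_id, start, end) with 1-based inclusive
    coordinates, all on one sequence (an assumption of this sketch).
    """
    # Collapse each group to a single (min start, max end) span.
    spans = {}
    for group, start, end in features:
        lo, hi = spans.get(group, (start, end))
        spans[group] = (min(lo, start), max(hi, end))

    # Sweep the spans in order of start, tracking the span that reaches furthest.
    clashes = []
    furthest_group, furthest_end = None, None
    for group, (start, end) in sorted(spans.items(), key=lambda item: item[1]):
        if furthest_end is not None and start <= furthest_end:
            clashes.append((furthest_group, group))
        if furthest_end is None or end > furthest_end:
            furthest_group, furthest_end = group, end
    return clashes
```

An empty result means no group span overlaps another under these assumptions. Each clash is reported against the span that currently reaches furthest, so the output is a quick diagnostic and not an exhaustive list of all overlapping pairs.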

-I <indel_cat_list>

: To have nonzero probability for the states corresponding to a specified category range, indels must be "clean" (nonoverlapping), must be assignable by parsimony to a single branch in the phylogenetic tree, and must have lengths that are exact multiples of the category range size. Avoid -G with this option. If used in training mode, requires -T.
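The length condition for -I can be illustrated with a small check. This is a sketch under an assumption: "category range size" is taken to mean the number of categories in the range (for example, 3 for a range covering three codon positions). The function name `indel_length_ok` is hypothetical and does not correspond to anything in PHAST.

```python
def indel_length_ok(indel_length, range_size):
    """True if an indel length is a positive exact multiple of the range size."""
    if range_size <= 0:
        raise ValueError("range_size must be positive")
    return indel_length > 0 and indel_length % range_size == 0
```

Under this reading, with a range of size 3, indels of length 3 or 6 satisfy the condition, while an indel of length 4 does not.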

-t <tree_fname>

-n <nseqs>

: Train an indel model for <nseqs> sequences, even though the training alignment has a different number of sequences. All (non-trivial) gap patterns are assumed to be equally frequent.

-q

-h

May 2016

hmm_train 1.4

Source file:	hmm_train.1.en.gz (from phast 1.5+dfsg-2)
Source last updated:	2020-10-03T07:21:30Z