'\" t
.\"     Title: gth
.\"    Author: [see the "AUTHOR(S)" section]
.\" Generator: Asciidoctor 2.0.12
.\"      Date: 
.\"    Manual: \ \&
.\"    Source: \ \&
.\"  Language: English
.\"
.TH "GTH" "1" "" "\ \&" "\ \&"
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.ss \n[.ss] 0
.nh
.ad l
.de URL
\fI\\$2\fP <\\$1>\\$3
..
.als MTO URL
.if \n[.g] \{\
.  mso www.tmac
.  am URL
.    ad l
.  .
.  am MTO
.    ad l
.  .
.  LINKSTYLE blue R < >
.\}
.SH "NAME"
gth \- predict gene structures
.SH "SYNOPSIS"
.sp
\fBgth\fP [option ...] \-genomic file [...] \-cdna file [...] \-protein file [...]
.SH "DESCRIPTION"
.sp
Computes similarity\-based gene structure predictions (spliced alignments)
using cDNA/EST and/or protein sequences and assembles the resulting spliced
alignments into consensus spliced alignments.
.SH "OPTIONS"
.sp
\fB\-genomic\fP <file>
.RS 4
specify input files containing genomic sequences (mandatory option)
.RE
.sp
\fB\-cdna\fP <file>
.RS 4
specify input files containing cDNA/EST sequences
.RE
.sp
\fB\-protein\fP <file>
.RS 4
specify input files containing protein sequences
.RE
.sp
\fB\-species\fP <species>
.RS 4
specify the species to select the most appropriate splice site model;
possible species:
"human", "mouse", "rat", "chicken", "drosophila", "nematode",
"fission_yeast", "aspergillus", "arabidopsis", "maize", "rice", "medicago"
default: undefined
.RE
.sp
\fB\-bssm\fP
.RS 4
read BSSM parameters from file in the path given by the environment
variable BSSMDIR
default: undefined
.RE
.sp
\fB\-scorematrix\fP
.RS 4
read amino acid substitution scoring matrix from file in the
path given by the environment variable GTHDATADIR
default: BLOSUM62
.RE
.sp
\fB\-translationtable\fP
.RS 4
set the codon translation table used for codon translation in
matching, DP, and output
default: 1
.RE
.sp
\fB\-f\fP
.RS 4
analyze only forward strand of genomic sequences
default: no
.RE
.sp
\fB\-r\fP
.RS 4
analyze only reverse strand of genomic sequences
default: no
.RE
.sp
\fB\-cdnaforward\fP
.RS 4
align only forward strand of cDNAs
default: no
.RE
.sp
\fB\-frompos\fP
.RS 4
analyze genomic sequence from this position
requires \-topos or \-width; positions are counted from 1
default: 0
.RE
.sp
\fB\-topos\fP
.RS 4
analyze genomic sequence to this position
requires \-frompos; positions are counted from 1
default: 0
.RE
.sp
\fB\-width\fP
.RS 4
analyze only this width of genomic sequence
requires \-frompos
default: 0
.RE
.sp
\fB\-v\fP
.RS 4
be verbose
default: no
.RE
.sp
\fB\-xmlout\fP
.RS 4
show output in XML format
default: no
.RE
.sp
\fB\-gff3out\fP
.RS 4
show output in GFF3 format
default: no
.RE
.sp
\fB\-md5ids\fP
.RS 4
show MD5 fingerprints as sequence IDs
default: no
.RE
.sp
\fB\-o\fP
.RS 4
redirect output to specified file
default: undefined
.RE
.sp
\fB\-gzip\fP
.RS 4
write gzip compressed output file
default: no
.RE
.sp
\fB\-bzip2\fP
.RS 4
write bzip2 compressed output file
default: no
.RE
.sp
\fB\-force\fP
.RS 4
force writing to output file
default: no
.RE
.sp
\fB\-skipalignmentout\fP
.RS 4
skip output of spliced alignments
default: no
.RE
.sp
\fB\-mincutoffs\fP
.RS 4
show full spliced alignments,
i.e., the cutoffs mode for leading and terminal bases is MINIMAL
default: no
.RE
.sp
\fB\-showintronmaxlen\fP
.RS 4
set the maximum length of a fully shown intron.
If set to 0, all introns are shown completely.
default: 120
.RE
.sp
\fB\-minorflen\fP
.RS 4
set the minimum length of an ORF to be shown
default: 64
.RE
.sp
\fB\-startcodon\fP
.RS 4
require that an ORF begin with a start codon
default: no
.RE
.sp
\fB\-finalstopcodon\fP
.RS 4
require that the final ORF must end with a stop codon
default: no
.RE
.sp
\fB\-showseqnums\fP
.RS 4
show sequence numbers in output
default: no
.RE
.sp
\fB\-pglgentemplate\fP
.RS 4
show genomic template in PGL lines
(switch off for backward compatibility)
default: yes
.RE
.sp
\fB\-gs2out\fP
.RS 4
output in old GeneSeqer2 format
default: no
.RE
.sp
\fB\-maskpolyatails\fP
.RS 4
mask poly(A) tails in cDNA/EST files
default: no
.RE
.sp
\fB\-proteinsmap\fP
.RS 4
specify smap file used for protein files
default: protein
.RE
.sp
\fB\-noautoindex\fP
.RS 4
do not create indices automatically, except for the .dna.* files used for
the DP. The existence of an index is not tested before it is actually used!
default: no
.RE
.sp
\fB\-createindicesonly\fP
.RS 4
stop program flow after the indices have been created
default: no
.RE
.sp
\fB\-skipindexcheck\fP
.RS 4
skip index check (in preprocessing phase)
default: no
.RE
.sp
\fB\-minmatchlen\fP
.RS 4
specify minimum match length (cDNA matching)
default: 20
.RE
.sp
\fB\-seedlength\fP
.RS 4
specify the seed length (cDNA matching)
default: 18
.RE
.sp
\fB\-exdrop\fP
.RS 4
specify the Xdrop value for edit distance extension (cDNA
matching)
default: 2
.RE
.sp
\fB\-prminmatchlen\fP
.RS 4
specify minimum match length (protein matching)
default: 24
.RE
.sp
\fB\-prseedlength\fP
.RS 4
specify seed length (protein matching)
default: 10
.RE
.sp
\fB\-prhdist\fP
.RS 4
specify Hamming distance (protein matching)
default: 4
.RE
.sp
\fB\-online\fP
.RS 4
run the similarity filter online without using the complete
index (increases runtime)
default: no
.RE
.sp
\fB\-inverse\fP
.RS 4
invert query and index in vmatch call
default: no
.RE
.sp
\fB\-exact\fP
.RS 4
use exact matches in the similarity filter
default: no
.RE
.sp
\fB\-gcmaxgapwidth\fP
.RS 4
set the maximum gap width for global chains;
this approximately defines the maximum intron length.
Set to 0 to allow unlimited length.
To avoid false\-positive exons (lonely exons) at the
sequence ends, it is very important to set this parameter
appropriately!
default: 1000000
.RE
.sp
\fB\-gcmincoverage\fP
.RS 4
set the minimum coverage of global chains with respect to the
reference sequence
default: 50
.RE
.sp
\fB\-paralogs\fP
.RS 4
compute paralogous genes (different chaining procedure)
default: no
.RE
.sp
\fB\-enrichchains\fP
.RS 4
enrich genomic sequence part of global chains with additional
matches
default: no
.RE
.sp
\fB\-introncutout\fP
.RS 4
enable the intron cutout technique
default: no
.RE
.sp
\fB\-fastdp\fP
.RS 4
use jump table to increase speed of DP calculation
default: no
.RE
.sp
\fB\-autointroncutout\fP
.RS 4
set the automatic intron cutout matrix size in megabytes and
enable the automatic intron cutout technique
default: 0
.RE
.sp
\fB\-icinitialdelta\fP
.RS 4
set the initial delta used for intron cutouts
default: 50
.RE
.sp
\fB\-iciterations\fP
.RS 4
set the number of intron cutout iterations
default: 2
.RE
.sp
\fB\-icdeltaincrease\fP
.RS 4
set the delta increase during every iteration
default: 50
.RE
.sp
\fB\-icminremintronlen\fP
.RS 4
set the minimum remaining intron length for an intron to be
cut out
default: 10
.RE
.sp
\fB\-nou12intronmodel\fP
.RS 4
disable the U12\-type intron model
default: no
.RE
.sp
\fB\-u12donorprob\fP
.RS 4
set the probability for perfect U12\-type donor sites
default: 0.99
.RE
.sp
\fB\-u12donorprob1mism\fP
.RS 4
set the probability for U12\-type donor sites with 1 mismatch
default: 0.90
.RE
.sp
\fB\-probies\fP
.RS 4
set the initial exon state probability
default: 0.50
.RE
.sp
\fB\-probdelgen\fP
.RS 4
set the genomic sequence deletion probability
default: 0.03
.RE
.sp
\fB\-identityweight\fP
.RS 4
set the weight for pairs of identical characters
default: 2.00
.RE
.sp
\fB\-mismatchweight\fP
.RS 4
set the weight for mismatching characters
default: \-2.00
.RE
.sp
\fB\-undetcharweight\fP
.RS 4
set the weight for undetermined characters
default: 0.00
.RE
.sp
\fB\-deletionweight\fP
.RS 4
set the weight for deletions
default: \-5.00
.RE
.sp
\fB\-dpminexonlen\fP
.RS 4
set the minimum exon length for the DP
default: 5
.RE
.sp
\fB\-dpminintronlen\fP
.RS 4
set the minimum intron length for the DP
default: 50
.RE
.sp
\fB\-shortexonpenal\fP
.RS 4
set the short exon penalty
default: 100.00
.RE
.sp
\fB\-shortintronpenal\fP
.RS 4
set the short intron penalty
default: 100.00
.RE
.sp
\fB\-wzerotransition\fP
.RS 4
set the zero transition weights window size
default: 80
.RE
.sp
\fB\-wdecreasedoutput\fP
.RS 4
set the decreased output weights window size
default: 80
.RE
.sp
\fB\-leadcutoffsmode\fP
.RS 4
set the cutoffs mode for leading bases
can be either RELAXED, STRICT, or MINIMAL
default: RELAXED
.RE
.sp
\fB\-termcutoffsmode\fP
.RS 4
set the cutoffs mode for terminal bases
can be either RELAXED, STRICT, or MINIMAL
default: STRICT
.RE
.sp
\fB\-cutoffsminexonlen\fP
.RS 4
set the cutoffs minimum exon length
default: 5
.RE
.sp
\fB\-scoreminexonlen\fP
.RS 4
set the score minimum exon length
default: 50
.RE
.sp
\fB\-minaveragessp\fP
.RS 4
set the minimum average splice site probability
default: 0.50
.RE
.sp
\fB\-duplicatecheck\fP
.RS 4
criterion used to check for spliced alignment duplicates,
choose from none|id|desc|seq|both
default: both
.RE
.sp
\fB\-minalignmentscore\fP
.RS 4
set the minimum alignment score for spliced alignments to be
included in the set of spliced alignments
default: 0.00
.RE
.sp
\fB\-maxalignmentscore\fP
.RS 4
set the maximum alignment score for spliced alignments to be
included in the set of spliced alignments
default: 1.00
.RE
.sp
\fB\-mincoverage\fP
.RS 4
set the minimum coverage for spliced alignments to be
included in the set of spliced alignments
default: 0.00
.RE
.sp
\fB\-maxcoverage\fP
.RS 4
set the maximum coverage for spliced alignments to be
included in the set of spliced alignments
default: 9999.99
.RE
.sp
\fB\-intermediate\fP
.RS 4
stop after calculation of spliced alignments and output
results in reusable XML format. Do not process this output
yourself; use the "normal" XML output instead!
default: no
.RE
.sp
\fB\-sortags\fP
.RS 4
sort alternative gene structures according to the weighted
mean of the average exon score and the average splice site
probability
default: no
.RE
.sp
\fB\-sortagswf\fP
.RS 4
set the weight factor for the sorting of AGSs
default: 1.00
.RE
.sp
\fB\-exondistri\fP
.RS 4
show the exon length distribution
default: no
.RE
.sp
\fB\-introndistri\fP
.RS 4
show the intron length distribution
default: no
.RE
.sp
\fB\-refseqcovdistri\fP
.RS 4
show the reference sequence coverage distribution
default: no
.RE
.sp
\fB\-first\fP
.RS 4
set the maximum number of spliced alignments per genomic DNA
input. Set to 0 for unlimited number.
default: 0
.RE
.sp
\fB\-help\fP
.RS 4
display help for basic options and exit
.RE
.sp
\fB\-help+\fP
.RS 4
display help for all options and exit
.RE
.sp
\fB\-version\fP
.RS 4
display version information and exit
.RE
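.SH "EXAMPLES"
.sp
The invocations below are minimal sketches that combine only options
documented above. The file names genome.fa, ests.fa, proteins.fa, and
out.gff3 are hypothetical placeholders, and the position values are
arbitrary illustrations.
.sp
Align cDNA/EST sequences against genomic sequences and write the
predictions in GFF3 format to a file, skipping the output of the spliced
alignments:
.sp
.if n .RS 4
.nf
.fam C
gth \-genomic genome.fa \-cdna ests.fa \-gff3out \-skipalignmentout \-o out.gff3
.fam
.fi
.if n .RE
.sp
Use protein sequences with a species\-specific splice site model and
restrict the analysis to a region of the genomic sequence (\-frompos
requires \-topos or \-width):
.sp
.if n .RS 4
.nf
.fam C
gth \-genomic genome.fa \-protein proteins.fa \-species arabidopsis \-frompos 1000 \-topos 50000
.fam
.fi
.if n .RE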