'\" t
.\"     Title: gth
.\"    Author: [see the "AUTHOR(S)" section]
.\" Generator: Asciidoctor 2.0.12
.\"      Date:
.\"    Manual: \ \&
.\"    Source: \ \&
.\"  Language: English
.\"
.TH "GTH" "1" "" "\ \&" "\ \&"
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.ss \n[.ss] 0
.nh
.ad l
.de URL
\fI\\$2\fP <\\$1>\\$3
..
.als MTO URL
.if \n[.g] \{\
.  mso www.tmac
.  am URL
.    ad l
.  .
.  am MTO
.    ad l
.  .
.  LINKSTYLE blue R < >
.\}
.SH "NAME"
gth \- predict genome structures
.SH "SYNOPSIS"
.sp
\fBgth\fP [option ...] \-genomic file [...] \-cdna file [...] \-protein file [...]
.SH "DESCRIPTION"
.sp
Computes similarity\-based gene structure predictions (spliced alignments)
using cDNA/EST and/or protein sequences and assembles the resulting spliced
alignments into consensus spliced alignments.
.SH "OPTIONS"
.sp
\fB\-genomic\fP
.RS 4
specify input files containing genomic sequences
(mandatory option)
.RE
.sp
\fB\-cdna\fP
.RS 4
specify input files containing cDNA/EST sequences
.RE
.sp
\fB\-protein\fP
.RS 4
specify input files containing protein sequences
.RE
.sp
\fB\-species\fP
.RS 4
specify species to select the most appropriate splice site model;
possible species: "human" "mouse" "rat" "chicken" "drosophila" "nematode" "fission_yeast" "aspergillus" "arabidopsis" "maize" "rice" "medicago"
default: undefined
.RE
.sp
\fB\-bssm\fP
.RS 4
read bssm parameter from file in the path given by the environment variable BSSMDIR,
default: undefined
.RE
.sp
\fB\-scorematrix\fP
.RS 4
read amino acid substitution scoring matrix from file in the path given by the environment variable GTHDATADIR
default: BLOSUM62
.RE
.sp
\fB\-translationtable\fP
.RS 4
set the codon translation table used for codon translation in matching, DP, and output
default: 1
.RE
.sp
\fB\-f\fP
.RS 4
analyze only forward strand of genomic sequences
default: no
.RE
.sp
\fB\-r\fP
.RS 4
analyze only reverse strand of genomic sequences
default: no
.RE
.sp
\fB\-cdnaforward\fP
.RS 4
align only forward strand of cDNAs
default: no
.RE
.sp
\fB\-frompos\fP
.RS 4
analyze genomic sequence from this position (counting from 1);
requires \-topos or \-width
default: 0
.RE
.sp
\fB\-topos\fP
.RS 4
analyze genomic sequence to this position (counting from 1);
requires \-frompos
default: 0
.RE
.sp
\fB\-width\fP
.RS 4
analyze only this width of genomic sequence;
requires \-frompos
default: 0
.RE
.sp
\fB\-v\fP
.RS 4
be verbose
default: no
.RE
.sp
\fB\-xmlout\fP
.RS 4
show output in XML format
default: no
.RE
.sp
\fB\-gff3out\fP
.RS 4
show output in GFF3 format
default: no
.RE
.sp
\fB\-md5ids\fP
.RS 4
show MD5 fingerprints as sequence IDs
default: no
.RE
.sp
\fB\-o\fP
.RS 4
redirect output to specified file
default: undefined
.RE
.sp
\fB\-gzip\fP
.RS 4
write gzip compressed output file
default: no
.RE
.sp
\fB\-bzip2\fP
.RS 4
write bzip2 compressed output file
default: no
.RE
.sp
\fB\-force\fP
.RS 4
force writing to output file
default: no
.RE
.sp
\fB\-skipalignmentout\fP
.RS 4
skip output of spliced alignments
default: no
.RE
.sp
\fB\-mincutoffs\fP
.RS 4
show full spliced alignments,
i.e., cutoffs mode for leading and terminal bases is MINIMAL
default: no
.RE
.sp
\fB\-showintronmaxlen\fP
.RS 4
set the maximum length of a fully shown intron.
If set to 0, all introns are shown completely.
default: 120
.RE
.sp
\fB\-minorflen\fP
.RS 4
set the minimum length of an ORF to be shown
default: 64
.RE
.sp
\fB\-startcodon\fP
.RS 4
require that an ORF must begin with a start codon
default: no
.RE
.sp
\fB\-finalstopcodon\fP
.RS 4
require that the final ORF must end with a stop codon
default: no
.RE
.sp
\fB\-showseqnums\fP
.RS 4
show sequence numbers in output
default: no
.RE
.sp
\fB\-pglgentemplate\fP
.RS 4
show genomic template in PGL lines
(switch off for backward compatibility)
default: yes
.RE
.sp
\fB\-gs2out\fP
.RS 4
output in old GeneSeqer2 format
default: no
.RE
.sp
\fB\-maskpolyatails\fP
.RS 4
mask poly(A) tails in cDNA/EST files
default: no
.RE
.sp
\fB\-proteinsmap\fP
.RS 4
specify smap file used for protein files
default: protein
.RE
.sp
\fB\-noautoindex\fP
.RS 4
do not create indices automatically, except for the .dna.* files used for the DP.
Existence is not tested before an index is actually used!
default: no
.RE
.sp
\fB\-createindicesonly\fP
.RS 4
stop program flow after the indices have been created
default: no
.RE
.sp
\fB\-skipindexcheck\fP
.RS 4
skip index check (in preprocessing phase)
default: no
.RE
.sp
\fB\-minmatchlen\fP
.RS 4
specify minimum match length (cDNA matching)
default: 20
.RE
.sp
\fB\-seedlength\fP
.RS 4
specify the seed length (cDNA matching)
default: 18
.RE
.sp
\fB\-exdrop\fP
.RS 4
specify the Xdrop value for edit distance extension (cDNA matching)
default: 2
.RE
.sp
\fB\-prminmatchlen\fP
.RS 4
specify minimum match length (protein matching)
default: 24
.RE
.sp
\fB\-prseedlength\fP
.RS 4
specify seed length (protein matching)
default: 10
.RE
.sp
\fB\-prhdist\fP
.RS 4
specify Hamming distance (protein matching)
default: 4
.RE
.sp
\fB\-online\fP
.RS 4
run the similarity filter online without using the complete index (increases runtime)
default: no
.RE
.sp
\fB\-inverse\fP
.RS 4
invert query and index in vmatch call
default: no
.RE
.sp
\fB\-exact\fP
.RS 4
use exact matches in the similarity filter
default: no
.RE
.sp
\fB\-gcmaxgapwidth\fP
.RS 4
set the maximum gap width for global chains;
defines approximately the maximum intron length.
Set to 0 to allow for unlimited length.
In order to avoid false\-positive exons (lonely exons) at the sequence ends, it is very important to set this parameter appropriately!
default: 1000000
.RE
.sp
\fB\-gcmincoverage\fP
.RS 4
set the minimum coverage of global chains with respect to the reference sequence
default: 50
.RE
.sp
\fB\-paralogs\fP
.RS 4
compute paralogous genes (different chaining procedure)
default: no
.RE
.sp
\fB\-enrichchains\fP
.RS 4
enrich genomic sequence part of global chains with additional matches
default: no
.RE
.sp
\fB\-introncutout\fP
.RS 4
enable the intron cutout technique
default: no
.RE
.sp
\fB\-fastdp\fP
.RS 4
use jump table to increase speed of DP calculation
default: no
.RE
.sp
\fB\-autointroncutout\fP
.RS 4
set the automatic intron cutout matrix size in megabytes and enable the automatic intron cutout technique
default: 0
.RE
.sp
\fB\-icinitialdelta\fP
.RS 4
set the initial delta used for intron cutouts
default: 50
.RE
.sp
\fB\-iciterations\fP
.RS 4
set the number of intron cutout iterations
default: 2
.RE
.sp
\fB\-icdeltaincrease\fP
.RS 4
set the delta increase during every iteration
default: 50
.RE
.sp
\fB\-icminremintronlen\fP
.RS 4
set the minimum remaining intron length for an intron to be cut out
default: 10
.RE
.sp
\fB\-nou12intronmodel\fP
.RS 4
disable the U12\-type intron model
default: no
.RE
.sp
\fB\-u12donorprob\fP
.RS 4
set the probability for perfect U12\-type donor sites
default: 0.99
.RE
.sp
\fB\-u12donorprob1mism\fP
.RS 4
set the probability for U12\-type donor sites with 1 mismatch
default: 0.90
.RE
.sp
\fB\-probies\fP
.RS 4
set the initial exon state probability
default: 0.50
.RE
.sp
\fB\-probdelgen\fP
.RS 4
set the genomic sequence deletion probability
default: 0.03
.RE
.sp
\fB\-identityweight\fP
.RS 4
set the weight for pairs of identical characters
default: 2.00
.RE
.sp
\fB\-mismatchweight\fP
.RS 4
set the weight for mismatching characters
default: \-2.00
.RE
.sp
\fB\-undetcharweight\fP
.RS 4
set the weight for undetermined characters
default: 0.00
.RE
.sp
\fB\-deletionweight\fP
.RS 4
set the weight for deletions
default: \-5.00
.RE
.sp
\fB\-dpminexonlen\fP
.RS 4
set the minimum exon length for the DP
default: 5
.RE
.sp
\fB\-dpminintronlen\fP
.RS 4
set the minimum intron length for the DP
default: 50
.RE
.sp
\fB\-shortexonpenal\fP
.RS 4
set the short exon penalty
default: 100.00
.RE
.sp
\fB\-shortintronpenal\fP
.RS 4
set the short intron penalty
default: 100.00
.RE
.sp
\fB\-wzerotransition\fP
.RS 4
set the zero transition weights window size
default: 80
.RE
.sp
\fB\-wdecreasedoutput\fP
.RS 4
set the decreased output weights window size
default: 80
.RE
.sp
\fB\-leadcutoffsmode\fP
.RS 4
set the cutoffs mode for leading bases;
can be either RELAXED, STRICT, or MINIMAL
default: RELAXED
.RE
.sp
\fB\-termcutoffsmode\fP
.RS 4
set the cutoffs mode for terminal bases;
can be either RELAXED, STRICT, or MINIMAL
default: STRICT
.RE
.sp
\fB\-cutoffsminexonlen\fP
.RS 4
set the cutoffs minimum exon length
default: 5
.RE
.sp
\fB\-scoreminexonlen\fP
.RS 4
set the score minimum exon length
default: 50
.RE
.sp
\fB\-minaveragessp\fP
.RS 4
set the minimum average splice site probability
default: 0.50
.RE
.sp
\fB\-duplicatecheck\fP
.RS 4
criterion used to check for spliced alignment duplicates,
choose from none|id|desc|seq|both
default: both
.RE
.sp
\fB\-minalignmentscore\fP
.RS 4
set the minimum alignment score for spliced alignments to be included in the set of spliced alignments
default: 0.00
.RE
.sp
\fB\-maxalignmentscore\fP
.RS 4
set the maximum alignment score for spliced alignments to be included in the set of spliced alignments
default: 1.00
.RE
.sp
\fB\-mincoverage\fP
.RS 4
set the minimum coverage for spliced alignments to be included in the set of spliced alignments
default: 0.00
.RE
.sp
\fB\-maxcoverage\fP
.RS 4
set the maximum coverage for spliced alignments to be included in the set of spliced alignments
default: 9999.99
.RE
.sp
\fB\-intermediate\fP
.RS 4
stop after calculation of spliced alignments and output results in reusable XML format.
Do not process this output yourself, use the ``normal\(aq\(aq XML output instead!
default: no
.RE
.sp
\fB\-sortags\fP
.RS 4
sort alternative gene structures according to the weighted mean of the average exon score and the average splice site probability
default: no
.RE
.sp
\fB\-sortagswf\fP
.RS 4
set the weight factor for the sorting of AGSs
default: 1.00
.RE
.sp
\fB\-exondistri\fP
.RS 4
show the exon length distribution
default: no
.RE
.sp
\fB\-introndistri\fP
.RS 4
show the intron length distribution
default: no
.RE
.sp
\fB\-refseqcovdistri\fP
.RS 4
show the reference sequence coverage distribution
default: no
.RE
.sp
\fB\-first\fP
.RS 4
set the maximum number of spliced alignments per genomic DNA input.
Set to 0 for unlimited number.
default: 0
.RE
.sp
\fB\-help\fP
.RS 4
display help for basic options and exit
.RE
.sp
\fB\-help+\fP
.RS 4
display help for all options and exit
.RE
.sp
\fB\-version\fP
.RS 4
display version information and exit
.RE
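.SH "EXAMPLES"
.sp
The invocations below are illustrative sketches that combine only options documented in the OPTIONS section.
The file names (genome.fa, ests.fa, proteins.fa, out.gff3) and the numeric values are hypothetical placeholders, not files or settings shipped with the program.
.sp
Align cDNA/EST and protein sequences against a genomic sequence, select the "arabidopsis" splice site model, and write GFF3 output to a file without the spliced alignments themselves:
.sp
.if n .RS 4
.nf
.fam C
gth \-genomic genome.fa \-cdna ests.fa \-protein proteins.fa \e
    \-species arabidopsis \-gff3out \-skipalignmentout \-o out.gff3
.fam
.fi
.if n .RE
.sp
Restrict the analysis to part of the genomic sequence. Because \-frompos requires \-topos or \-width, one of the two is given alongside it:
.sp
.if n .RS 4
.nf
.fam C
gth \-genomic genome.fa \-cdna ests.fa \-frompos 10001 \-width 5000
.fam
.fi
.if n .RE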