'\" t
.\" Title: gt-ltrdigest
.\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
.\" Generator: DocBook XSL Stylesheets vsnapshot
.\" Date: 07/22/2020
.\" Manual: GenomeTools Manual
.\" Source: GenomeTools 1.6.1
.\" Language: English
.\"
.TH "GT\-LTRDIGEST" "1" "07/22/2020" "GenomeTools 1\&.6\&.1" "GenomeTools Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
gt-ltrdigest \- Identifies and annotates sequence features in LTR retrotransposon candidates\&.
.SH "SYNOPSIS"
.sp
\fBgt ltrdigest\fR [option \&...] gff3_file
.SH "DESCRIPTION"
.PP
\fB\-outfileprefix\fR [\fIstring\fR]
.RS 4
prefix for output files (e\&.g\&. \fIfoo\fR will create files called \fIfoo_*\&.csv\fR and \fIfoo_*\&.fas\fR)\&. Omit this option for GFF3 output only\&.
.RE
.PP
\fB\-metadata\fR [\fIyes|no\fR]
.RS 4
output metadata (run conditions) to separate file (default: yes)
.RE
.PP
\fB\-seqnamelen\fR [\fIvalue\fR]
.RS 4
set maximal length of sequence names in FASTA headers (e\&.g\&.
for clustalw or similar tools) (default: 20)
.RE
.PP
\fB\-pptlen\fR [\fIstart\fR \fIend\fR]
.RS 4
required PPT length range (default: [8\&.\&.30])
.RE
.PP
\fB\-uboxlen\fR [\fIstart\fR \fIend\fR]
.RS 4
required U\-box length range (default: [3\&.\&.30])
.RE
.PP
\fB\-uboxdist\fR [\fIvalue\fR]
.RS 4
allowed U\-box distance range from PPT (default: 0)
.RE
.PP
\fB\-pptradius\fR [\fIvalue\fR]
.RS 4
radius around beginning of 3\*(Aq LTR to search for PPT (default: 30)
.RE
.PP
\fB\-pptrprob\fR [\fIvalue\fR]
.RS 4
purine emission probability inside PPT (default: 0\&.970000)
.RE
.PP
\fB\-pptyprob\fR [\fIvalue\fR]
.RS 4
pyrimidine emission probability inside PPT (default: 0\&.030000)
.RE
.PP
\fB\-pptgprob\fR [\fIvalue\fR]
.RS 4
background G emission probability outside PPT (default: 0\&.250000)
.RE
.PP
\fB\-pptcprob\fR [\fIvalue\fR]
.RS 4
background C emission probability outside PPT (default: 0\&.250000)
.RE
.PP
\fB\-pptaprob\fR [\fIvalue\fR]
.RS 4
background A emission probability outside PPT (default: 0\&.250000)
.RE
.PP
\fB\-ppttprob\fR [\fIvalue\fR]
.RS 4
background T emission probability outside PPT (default: 0\&.250000)
.RE
.PP
\fB\-pptuprob\fR [\fIvalue\fR]
.RS 4
U/T emission probability inside U\-box (default: 0\&.910000)
.RE
.PP
\fB\-trnas\fR [\fIfilename\fR]
.RS 4
tRNA library in multiple FASTA format for PBS detection\&. Omit this option to disable PBS search\&.
.RE
.PP
\fB\-pbsalilen\fR [\fIstart\fR \fIend\fR]
.RS 4
required PBS/tRNA alignment length range (default: [11\&.\&.30])
.RE
.PP
\fB\-pbsoffset\fR [\fIstart\fR \fIend\fR]
.RS 4
allowed PBS offset from LTR boundary range (default: [0\&.\&.5])
.RE
.PP
\fB\-pbstrnaoffset\fR [\fIstart\fR \fIend\fR]
.RS 4
allowed PBS/tRNA 3\*(Aq end alignment offset range (default: [0\&.\&.5])
.RE
.PP
\fB\-pbsmaxedist\fR [\fIvalue\fR]
.RS 4
maximal allowed PBS/tRNA alignment unit edit distance (default: 1)
.RE
.PP
\fB\-pbsradius\fR [\fIvalue\fR]
.RS 4
radius around end of 5\*(Aq LTR to search for PBS (default: 30)
.RE
.PP
\fB\-hmms\fR
.RS 4
profile HMM models for domain detection (separate by spaces, finish with \-\-) in HMMER3 format\&. Omit this option to disable pHMM search\&.
.RE
.PP
\fB\-pdomevalcutoff\fR [\fIvalue\fR]
.RS 4
global E\-value cutoff for pHMM search (default: 1E\-6)
.RE
.PP
\fB\-pdomcutoff\fR [\fI\&...\fR]
.RS 4
model\-specific score cutoff; choose from TC (trusted cutoff) | GA (gathering cutoff) | NONE (no cutoffs) (default: NONE)
.RE
.PP
\fB\-aliout\fR [\fIyes|no\fR]
.RS 4
output pHMM to amino acid sequence alignments (default: no)
.RE
.PP
\fB\-aaout\fR [\fIyes|no\fR]
.RS 4
output amino acid sequences for protein domain hits (default: no)
.RE
.PP
\fB\-allchains\fR [\fIyes|no\fR]
.RS 4
output features from all chains and unchained features, labeled with chain numbers (default: no)
.RE
.PP
\fB\-maxgaplen\fR [\fIvalue\fR]
.RS 4
maximal allowed gap size between fragments (in amino acids) when chaining pHMM hits for a protein domain (default: 50)
.RE
.PP
\fB\-force_recreate\fR [\fIyes|no\fR]
.RS 4
force recreation of hmmpressed profiles (default: no)
.RE
.PP
\fB\-pbsmatchscore\fR [\fIvalue\fR]
.RS 4
match score for PBS/tRNA alignments (default: 5)
.RE
.PP
\fB\-pbsmismatchscore\fR [\fIvalue\fR]
.RS 4
mismatch score for PBS/tRNA alignments (default: \-10)
.RE
.PP
\fB\-pbsinsertionscore\fR [\fIvalue\fR]
.RS 4
insertion score for PBS/tRNA alignments (default: \-20)
.RE
.PP
\fB\-pbsdeletionscore\fR [\fIvalue\fR]
.RS 4
deletion score for PBS/tRNA alignments (default: \-20)
.RE
.PP
\fB\-v\fR [\fIyes|no\fR]
.RS 4
be verbose (default: no)
.RE
.PP
\fB\-o\fR [\fIfilename\fR]
.RS 4
redirect output to specified file (default: undefined)
.RE
.PP
\fB\-gzip\fR [\fIyes|no\fR]
.RS 4
write gzip compressed output file (default: no)
.RE
.PP
\fB\-bzip2\fR [\fIyes|no\fR]
.RS 4
write bzip2 compressed output file (default: no)
.RE
.PP
\fB\-force\fR [\fIyes|no\fR]
.RS 4
force writing to output file (default: no)
.RE
.PP
\fB\-seqfile\fR [\fIfilename\fR]
.RS 4
set the sequence file from which to take the sequences (default: undefined)
.RE
.PP
\fB\-encseq\fR [\fIfilename\fR]
.RS 4
set the encoded sequence indexname from which to take the sequences (default: undefined)
.RE
.PP
\fB\-seqfiles\fR
.RS 4
set the sequence files from which to extract the features; use \fI\-\-\fR to terminate the list of sequence files
.RE
.PP
\fB\-matchdesc\fR [\fIyes|no\fR]
.RS 4
search the sequence descriptions from the input files for the desired sequence IDs (in GFF3), reporting the first match (default: no)
.RE
.PP
\fB\-matchdescstart\fR [\fIyes|no\fR]
.RS 4
exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3) from the beginning to the first whitespace (default: no)
.RE
.PP
\fB\-usedesc\fR [\fIyes|no\fR]
.RS 4
use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries\&.
If a description contains a sequence range (e\&.g\&., III:1000001\&.\&.2000000), the first part is used as sequence ID (\fIIII\fR) and the first range position as offset (\fI1000001\fR) (default: no)
.RE
.PP
\fB\-regionmapping\fR [\fIstring\fR]
.RS 4
set file containing sequence\-region to sequence file mapping (default: undefined)
.RE
.PP
\fB\-help\fR
.RS 4
display help for basic options and exit
.RE
.PP
\fB\-help+\fR
.RS 4
display help for all options and exit
.RE
.PP
\fB\-version\fR
.RS 4
display version information and exit
.RE
.SH "REPORTING BUGS"
.sp
Report bugs to https://github\&.com/genometools/genometools/issues\&.
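.SH "EXAMPLES"
.sp
The following is a minimal sketch of how the options described in this manual might be combined, not an authoritative recipe\&. All file names in it are hypothetical placeholders\&. It assembles a command line that enables the PBS search via \fB\-trnas\fR, states the default PPT length range explicitly, writes the GFF3 output with \fB\-o\fR, and runs the command only if \fBgt\fR and the input files are actually present\&.
.sp
.nf
```shell
# Sketch only: every file name below is a hypothetical placeholder.
gff3=candidates_sorted.gff3   # GFF3 file with LTR retrotransposon candidates
genome=genome.fas             # sequence file the GFF3 features refer to
trnas=trnas.fas               # tRNA library in multiple FASTA format
prefix=foo                    # output files will be named foo_*.csv and foo_*.fas
out=foo_annotated.gff3        # destination for the GFF3 output

# Options precede the GFF3 file, as in the SYNOPSIS.
set -- ltrdigest -trnas "$trnas" -pptlen 8 30 -outfileprefix "$prefix" -seqfile "$genome" -o "$out" "$gff3"

# Print the command that would be executed.
echo "gt $*"

# Run it only if gt and all input files are actually present.
if command -v gt >/dev/null 2>&1 && [ -f "$gff3" ] && [ -f "$genome" ] && [ -f "$trnas" ]; then
    gt "$@"
fi
```
.fi
.sp
Depending on how the sequence IDs in the GFF3 file relate to the entries in the sequence file, one of \fB\-matchdesc\fR, \fB\-matchdescstart\fR, \fB\-usedesc\fR or \fB\-regionmapping\fR may also be needed\&. Adding \fB\-hmms\fR followed by a list of HMMER3 profiles terminated with \-\- would additionally enable the pHMM search\&.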