
GT-LTRDIGEST(1) GenomeTools Manual GT-LTRDIGEST(1)

NAME

gt-ltrdigest - Identifies and annotates sequence features in LTR retrotransposon candidates.

SYNOPSIS

gt ltrdigest [option ...] gff3_file
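
As an illustration of the synopsis, the following minimal Python sketch assembles an argument vector for gt ltrdigest. It only shows how range options (two values, as in -pptlen) and list options (terminated with --, as in -hmms) could be laid out on the command line. The helper name ltrdigest_argv and all file names are hypothetical, and placing the -hmms list last before gff3_file is a cautious choice made here, not a documented requirement.

```python
import shlex


def ltrdigest_argv(gff3_file, seqfile, trnas=None, hmms=(), pptlen=None,
                   outfileprefix=None):
    """Build an argv list for `gt ltrdigest` (illustrative helper only)."""
    argv = ["gt", "ltrdigest"]
    if outfileprefix is not None:
        # Without -outfileprefix only GFF3 output is produced.
        argv += ["-outfileprefix", outfileprefix]
    if pptlen is not None:
        # Range options take two values: start and end.
        start, end = pptlen
        argv += ["-pptlen", str(start), str(end)]
    if trnas is not None:
        # Omitting -trnas disables the PBS search.
        argv += ["-trnas", trnas]
    argv += ["-seqfile", seqfile]
    if hmms:
        # List options are finished with "--"; omitting -hmms disables
        # the pHMM search.
        argv += ["-hmms", *hmms, "--"]
    # The GFF3 file is the only positional argument in the synopsis.
    argv.append(gff3_file)
    return argv


if __name__ == "__main__":
    # All file names below are placeholders.
    argv = ltrdigest_argv(
        "candidates.gff3",
        "genome.fas",
        trnas="tRNAs.fas",
        hmms=["RVT_1.hmm", "rve.hmm"],
        pptlen=(10, 30),
        outfileprefix="out",
    )
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a system where gt is installed.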

DESCRIPTION

-outfileprefix [string]
prefix for output files (e.g. foo will create files called foo_*.csv and foo_*.fas). Omit this option for GFF3 output only. (default: undefined)
-metadata [yes|no]
output metadata (run conditions) to separate file (default: yes)
-seqnamelen [value]
set maximal length of sequence names in FASTA headers (e.g. for clustalw or similar tools) (default: 20)
-pptlen [start end]
required PPT length range (default: [8..30])
-uboxlen [start end]
required U-box length range (default: [3..30])
-uboxdist [value]
allowed U-box distance range from PPT (default: 0)
-pptradius [value]
radius around beginning of 3' LTR to search for PPT (default: 30)
-pptrprob [value]
purine emission probability inside PPT (default: 0.970000)
-pptyprob [value]
pyrimidine emission probability inside PPT (default: 0.030000)
-pptgprob [value]
background G emission probability outside PPT (default: 0.250000)
-pptcprob [value]
background C emission probability outside PPT (default: 0.250000)
-pptaprob [value]
background A emission probability outside PPT (default: 0.250000)
-ppttprob [value]
background T emission probability outside PPT (default: 0.250000)
-pptuprob [value]
U/T emission probability inside U-box (default: 0.910000)
-trnas [filename]
tRNA library in multiple FASTA format for PBS detection. Omit this option to disable PBS search. (default: undefined)
-pbsalilen [start end]
required PBS/tRNA alignment length range (default: [11..30])
-pbsoffset [start end]
allowed PBS offset from LTR boundary range (default: [0..5])
-pbstrnaoffset [start end]
allowed PBS/tRNA 3' end alignment offset range (default: [0..5])
-pbsmaxedist [value]
maximal allowed PBS/tRNA alignment unit edit distance (default: 1)
-pbsradius [value]
radius around end of 5' LTR to search for PBS (default: 30)
-hmms
profile HMM models for domain detection, in HMMER3 format (separate by spaces, finish with --). Omit this option to disable pHMM search.
-pdomevalcutoff [value]
global E-value cutoff for pHMM search (default: 0.000001, i.e. 1E-6)
-pdomcutoff [...]
model-specific score cutoff; choose from TC (trusted cutoff) | GA (gathering cutoff) | NONE (no cutoffs) (default: NONE)
-aliout [yes|no]
output pHMM to amino acid sequence alignments (default: no)
-aaout [yes|no]
output amino acid sequences for protein domain hits (default: no)
-allchains [yes|no]
output features from all chains and unchained features, labeled with chain numbers (default: no)
-maxgaplen [value]
maximal allowed gap size between fragments (in amino acids) when chaining pHMM hits for a protein domain (default: 50)
-pbsmatchscore [value]
match score for PBS/tRNA alignments (default: 5)
-pbsmismatchscore [value]
mismatch score for PBS/tRNA alignments (default: -10)
-pbsinsertionscore [value]
insertion score for PBS/tRNA alignments (default: -20)
-pbsdeletionscore [value]
deletion score for PBS/tRNA alignments (default: -20)
-v [yes|no]
be verbose (default: no)
-o [filename]
redirect output to specified file (default: undefined)
-gzip [yes|no]
write gzip compressed output file (default: no)
-bzip2 [yes|no]
write bzip2 compressed output file (default: no)
-force [yes|no]
force writing to output file (default: no)
-seqfile [filename]
set the sequence file from which to take the sequences (default: undefined)
-encseq [filename]
set the encoded sequence indexname from which to take the sequences (default: undefined)
-seqfiles
set the sequence files from which to extract the features; use -- to terminate the list of sequence files
-matchdesc [yes|no]
search the sequence descriptions from the input files for the desired sequence IDs (in GFF3), reporting the first match (default: no)
-matchdescstart [yes|no]
exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3) from the beginning to the first whitespace (default: no)
-usedesc [yes|no]
use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries. If a description contains a sequence range (e.g., III:1000001..2000000), the first part is used as sequence ID (III) and the first range position as offset (1000001) (default: no)
-regionmapping [string]
set file containing sequence-region to sequence file mapping (default: undefined)
-help
display help for basic options and exit
-help+
display help for all options and exit
-version
display version information and exit
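
The range handling described for -usedesc can be pictured with a small Python sketch. It splits a description such as III:1000001..2000000 into a sequence ID and an offset, as the option text describes. The function name and the regular expression are assumptions made for illustration; the exact parsing rules used by GenomeTools may differ.

```python
import re

# Assumed shape of a range description: <seqid>:<start>..<end>
_RANGE_DESCRIPTION = re.compile(r"^(?P<seqid>[^\s:]+):(?P<start>\d+)\.\.(?P<end>\d+)")


def split_range_description(description):
    """Return (sequence ID, offset) for a range description, else None.

    Hypothetical helper: mirrors the -usedesc text, where the first part
    of the description is the sequence ID and the first range position
    is the offset.
    """
    match = _RANGE_DESCRIPTION.match(description)
    if match is None:
        return None
    return match.group("seqid"), int(match.group("start"))


if __name__ == "__main__":
    print(split_range_description("III:1000001..2000000"))
```

Descriptions without a range yield None in this sketch, which leaves the choice of fallback to the caller.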

REPORTING BUGS

Report bugs to <gt-users@genometools.org>.
09/05/2014 GenomeTools 1.5.3