'\" t
.\" Title: gt-hop
.\" Author: [FIXME: author] [see http://docbook.sf.net/el/author]
.\" Generator: DocBook XSL Stylesheets v1.78.1
.\" Date: 09/05/2014
.\" Manual: GenomeTools Manual
.\" Source: GenomeTools 1.5.3
.\" Language: English
.\"
.TH "GT\-HOP" "1" "09/05/2014" "GenomeTools 1\&.5\&.3" "GenomeTools Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
gt-hop \- Cognate sequence\-based homopolymer error correction\&.
.SH "SYNOPSIS"
.sp
\fBgt hop\fR \-<mode> \-c <encseq> \-map <sam/bam> \-reads <fastq> [options\&...]
.SH "DESCRIPTION"
.PP
\fB\-c\fR [\fIstring\fR]
.RS 4
cognate sequence (encoded using gt encseq encode) (default: undefined)
.RE
.PP
\fB\-map\fR [\fIstring\fR]
.RS 4
mapping of reads to the cognate sequence; it must be in SAM/BAM format and sorted by coordinate (can be prepared e\&.g\&. using: samtools sort) (default: undefined)
.RE
.PP
\fB\-sam\fR [\fIyes|no\fR]
.RS 4
mapping file is in SAM format; if not set, BAM is assumed (default: no)
.RE
.PP
\fB\-aggressive\fR [\fIyes|no\fR]
.RS 4
correct as much as possible (default: no)
.RE
.PP
\fB\-moderate\fR [\fIyes|no\fR]
.RS 4
mediate between sensitivity and precision (default: no)
.RE
.PP
\fB\-conservative\fR [\fIyes|no\fR]
.RS 4
correct only most likely errors (default: no)
.RE
.PP
\fB\-expert\fR [\fIyes|no\fR]
.RS 4
manually select correction criteria (default: no)
.RE
.PP
\fB\-reads\fR
.RS 4
uncorrected read file(s) in FastQ format; the corrected reads are output in the current working directory, in files named like the input files, each prepended by a prefix (see the \-outprefix option); \-reads allows one to output the reads in the same order as in the input and is mandatory if the SAM contains more than a single primary alignment for each read (e\&.g\&. output of bwasw); see also the \-o option as an alternative
.RE
.PP
\fB\-outprefix\fR [\fIstring\fR]
.RS 4
prefix for output filenames (corrected reads); when \-reads is specified, the prefix is prepended to each input filename (default: hop_)
.RE
.PP
\fB\-o\fR [\fIstring\fR]
.RS 4
output file for corrected reads (see also \-reads/\-outprefix); if \-o is used, reads are output in a single file in the order in which they are found in the SAM file (which usually differs from the original order); this only works if the reads were aligned with software that includes only one alignment for each read (e\&.g\&. bwa) (default: undefined)
.RE
.PP
\fB\-hmin\fR [\fIvalue\fR]
.RS 4
minimal homopolymer length in cognate sequence (default: 3)
.RE
.PP
\fB\-read\-hmin\fR [\fIvalue\fR]
.RS 4
minimal homopolymer length in reads (default: 2)
.RE
.PP
\fB\-qmax\fR [\fIvalue\fR]
.RS 4
maximal average quality of homopolymer in a read (default: 120)
.RE
.PP
\fB\-altmax\fR [\fIvalue\fR]
.RS 4
maximal support of alternate homopolymer length; e\&.g\&. 0\&.8 means: do not correct any read if the homopolymer length has the same value, different from the cognate, in more than 80% of the reads; if altmax is set to 1\&.0, reads are always corrected (default: 0\&.800000)
.RE
.PP
\fB\-cogmin\fR [\fIvalue\fR]
.RS 4
minimal support of cognate sequence homopolymer length; e\&.g\&. 0\&.1 means: do not correct any read if the cognate homopolymer length is not present in at least 10% of the reads; if cogmin is set to 0\&.0, reads are always corrected (default: 0\&.100000)
.RE
.PP
\fB\-mapqmin\fR [\fIvalue\fR]
.RS 4
minimal mapping quality (default: 21)
.RE
.PP
\fB\-covmin\fR [\fIvalue\fR]
.RS 4
minimal coverage; e\&.g\&. 5 means: do not correct any read if the coverage (number of reads mapped over the whole homopolymer) is less than 5; if covmin is set to 1, reads are always corrected (default: 1)
.RE
.PP
\fB\-allow\-muliple\fR [\fIyes|no\fR]
.RS 4
allow multiple corrections in a read (default: no)
.RE
.PP
\fB\-clenmax\fR [\fIvalue\fR]
.RS 4
maximal correction length; if not set, the correction length is unlimited (default: undefined)
.RE
.PP
\fB\-ann\fR [\fIstring\fR]
.RS 4
annotation of cognate sequence; it must be sorted by coordinates on the cognate sequence (this can be done e\&.g\&. using: gt gff3 \-sort); if \-ann is used, corrections are limited to homopolymers starting or ending inside the feature type indicated by the \-ft option; format: sorted GFF3 (default: undefined)
.RE
.PP
\fB\-ft\fR [\fIstring\fR]
.RS 4
feature type to use when \-ann option is specified (default: CDS)
.RE
.PP
\fB\-v\fR [\fIyes|no\fR]
.RS 4
be verbose (default: no)
.RE
.PP
\fB\-help\fR
.RS 4
display help for basic options and exit
.RE
.PP
\fB\-help+\fR
.RS 4
display help for all options and exit
.RE
.PP
\fB\-version\fR
.RS 4
display version information and exit
.RE
.sp
Correction mode:
.sp
One of the options \fI\-aggressive\fR, \fI\-moderate\fR, \fI\-conservative\fR or \fI\-expert\fR must be selected\&.
.sp
The \fI\-aggressive\fR, \fI\-moderate\fR and \fI\-conservative\fR modes are presets of the criteria that decide whether an observed discrepancy in homopolymer length between the cognate sequence and a read is corrected\&. A description of the individual criteria is displayed by the \fI\-help+\fR option\&. The presets are equivalent to the following settings:
.sp
.if n \{\
.RS 4
.\}
.nf
                  \-aggressive   \-moderate   \-conservative
\-hmin             3             3           3
\-read\-hmin        1             1           2
\-altmax           1\&.00          0\&.99        0\&.80
\-cogmin           0\&.00          0\&.00        0\&.10
\-mapqmin          0             10          21
\-covmin           1             1           1
\-clenmax          unlimited     unlimited   unlimited
\-allow\-multiple   yes           yes         no
.fi
.if n \{\
.RE
.\}
.sp
The aggressive mode tries to maximize the sensitivity, the conservative mode to minimize the false positives\&. An even more conservative set of corrections can be achieved using the \fI\-ann\fR option (see \fI\-help+\fR)\&.
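.sp
As a sketch of a typical call (the index name cognate and the file names sorted\&.bam, reads\&.fastq and sorted\&.gff3 are hypothetical placeholders, not names used by the tool), the conservative mode could be run without and with a restriction of the corrections to annotated features:
.sp
.if n \{\
.RS 4
.\}
.nf
gt hop \-conservative \-c cognate \-map sorted\&.bam \-reads reads\&.fastq
gt hop \-conservative \-c cognate \-map sorted\&.bam \-reads reads\&.fastq \-ann sorted\&.gff3 \-ft CDS
.fi
.if n \{\
.RE
.\}
.sp
Going by the descriptions of \fI\-reads\fR and \fI\-outprefix\fR, both calls should write the corrected reads to hop_reads\&.fastq in the current working directory\&.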
.sp
The \fI\-expert\fR mode allows one to manually set each parameter; the default values are the same as in the \fI\-conservative\fR mode\&.
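.sp
For illustration only (the index name and file names are hypothetical placeholders), the following call starts from the \fI\-expert\fR defaults and lowers just the minimal mapping quality to the value listed for the \fI\-moderate\fR preset; all other criteria should keep their conservative values:
.sp
.if n \{\
.RS 4
.\}
.nf
gt hop \-expert \-mapqmin 10 \-c cognate \-map sorted\&.bam \-reads reads\&.fastq
.fi
.if n \{\
.RE
.\}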
.sp
(Finally, for evaluation purposes only, the \fI\-state\-of\-truth\fR mode can be used: this mode assumes that the sequenced genome has been specified as cognate sequence and outputs an ideal list of corrections\&.)
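.SH "EXAMPLES"
.sp
A minimal sketch of a complete run, under stated assumptions: cognate\&.fas, sorted\&.bam and corrected\&.fastq are hypothetical file names, the mapping is assumed to be already sorted by coordinate (e\&.g\&. using samtools sort), and the \-indexname option of gt encseq encode is assumed to set the name that is later passed to \fI\-c\fR\&.
.sp
.if n \{\
.RS 4
.\}
.nf
gt encseq encode \-indexname cognate cognate\&.fas
gt hop \-moderate \-c cognate \-map sorted\&.bam \-o corrected\&.fastq
.fi
.if n \{\
.RE
.\}
.sp
Here \fI\-o\fR is used instead of \fI\-reads\fR, so the corrected reads are collected in a single file in the order of the mapping file; as noted for \fI\-o\fR, this is only suitable if the mapping contains one alignment per read\&.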
.SH "REPORTING BUGS"
.sp
Report bugs to \&.