.TH RAZERS3 1 "" "razers3 3.5.7 [tarball]" ""
.SH NAME
razers3 \- Faster, fully sensitive read mapping
.SH SYNOPSIS
\fBrazers3\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIREADS FILE\fP>
.br
\fBrazers3\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIPE-READS FILE1\fP> <\fIPE-READS FILE2\fP>
.SH DESCRIPTION
RazerS 3 is a versatile, fully sensitive read mapper based on k-mer counting and seeding filters. It supports single and paired-end mapping, shared-memory parallelism, and optimally parametrizes the filter based on a user-defined minimal sensitivity. See \fIhttp://www.seqan.de/projects/razers\fP for more information.
.sp
Input to RazerS 3 is a reference genome file and either one file with single-end reads or two files containing the left and right mates of paired-end reads. Use - to read single-end reads from stdin.
.sp
(c) Copyright 2009-2014 by David Weese.
.SH REQUIRED ARGUMENTS
.TP
\fBARGUMENT 0\fP \fIINPUT_FILE\fP
A reference genome file. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.TP
\fBREADS\fP List of \fIINPUT_FILE\fP's
Either one (single-end) or two (paired-end) read files. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.SH OPTIONS
.TP
\fB-h\fP, \fB--help\fP
Display the help message.
.TP
\fB--version\fP
Display version information.
.SS Main Options:
.TP
\fB-i\fP, \fB--percent-identity\fP \fIDOUBLE\fP
Percent identity threshold. In range [50..100]. Default: \fI95\fP.
.TP
\fB-rr\fP, \fB--recognition-rate\fP \fIDOUBLE\fP
Percent recognition rate. In range [80..100]. Default: \fI100\fP.
.TP
\fB-ng\fP, \fB--no-gaps\fP
Allow only mismatches, no indels. Default: allow both.
.TP
\fB-f\fP, \fB--forward\fP
Map reads only to forward strands.
.TP
\fB-r\fP, \fB--reverse\fP
Map reads only to reverse strands.
.TP
\fB-m\fP, \fB--max-hits\fP \fIINTEGER\fP
Output only <\fINUM\fP> of the best hits. In range [1..inf]. Default: \fI100\fP.
.TP
\fB--unique\fP
Output only unique best matches (-m 1 -dr 0 -pa).
.TP
\fB-tr\fP, \fB--trim-reads\fP \fIINTEGER\fP
Trim reads to given length. Default: off. In range [14..inf].
.TP
\fB-o\fP, \fB--output\fP \fIOUTPUT_FILE\fP
Mapping result filename (use - to dump to stdout in razers format). Default: <\fIREADS FILE\fP>.razers. Valid filetypes are: \fI.sam\fP, \fI.razers\fP, \fI.gff\fP, \fI.fasta\fP, \fI.fa\fP, \fI.eland\fP, \fI.bam\fP, and \fI.afg\fP.
.TP
\fB-v\fP, \fB--verbose\fP
Verbose mode.
.TP
\fB-vv\fP, \fB--vverbose\fP
Very verbose mode.
.SS Paired-end Options:
.TP
\fB-ll\fP, \fB--library-length\fP \fIINTEGER\fP
Paired-end library length. In range [1..inf]. Default: \fI220\fP.
.TP
\fB-le\fP, \fB--library-error\fP \fIINTEGER\fP
Paired-end library length tolerance. In range [0..inf]. Default: \fI50\fP.
.SS Output Format Options:
.TP
\fB-a\fP, \fB--alignment\fP
Dump the alignment for each match (only \fIrazer\fP or \fIfasta\fP format).
.TP
\fB-pa\fP, \fB--purge-ambiguous\fP
Purge reads with more than <\fImax-hits\fP> best matches.
.TP
\fB-dr\fP, \fB--distance-range\fP \fIINTEGER\fP
Only consider matches with at most NUM more errors compared to the best. Default: output all.
.TP
\fB-gn\fP, \fB--genome-naming\fP \fIINTEGER\fP
Select how genomes are named (see Naming section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-rn\fP, \fB--read-naming\fP \fIINTEGER\fP
Select how reads are named (see Naming section below). In range [0..3]. Default: \fI0\fP.
.TP
\fB--full-readid\fP
Use the whole read id (don't clip after whitespace).
.TP
\fB-so\fP, \fB--sort-order\fP \fIINTEGER\fP
Select how matches are sorted (see Sorting section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-pf\fP, \fB--position-format\fP \fIINTEGER\fP
Select begin/end position numbering (see Coordinate section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-ds\fP, \fB--dont-shrink-alignments\fP
Disable alignment shrinking in SAM. This is required for generating a gold mapping for Rabema.
.SS Filtration Options:
.TP
\fB-fl\fP, \fB--filter\fP \fISTRING\fP
Select k-mer filter. One of \fIpigeonhole\fP and \fIswift\fP. Default: \fIpigeonhole\fP.
.TP
\fB-mr\fP, \fB--mutation-rate\fP \fIDOUBLE\fP
Set the percent mutation rate (\fIpigeonhole\fP). In range [0..20]. Default: \fI5\fP.
.TP
\fB-ol\fP, \fB--overlap-length\fP \fIINTEGER\fP
Manually set the overlap length of adjacent k-mers (\fIpigeonhole\fP). In range [0..inf].
.TP
\fB-pd\fP, \fB--param-dir\fP \fISTRING\fP
Read user-computed parameter files in the directory <\fIDIR\fP> (\fIswift\fP).
.TP
\fB-t\fP, \fB--threshold\fP \fIINTEGER\fP
Manually set minimum k-mer count threshold (\fIswift\fP). In range [1..inf].
.TP
\fB-tl\fP, \fB--taboo-length\fP \fIINTEGER\fP
Set taboo length (\fIswift\fP). In range [1..inf]. Default: \fI1\fP.
.TP
\fB-s\fP, \fB--shape\fP \fISTRING\fP
Manually set k-mer shape.
.TP
\fB-oc\fP, \fB--overabundance-cut\fP \fIINTEGER\fP
Set k-mer overabundance cut ratio. In range [0..1]. Default: \fI1\fP.
.TP
\fB-rl\fP, \fB--repeat-length\fP \fIINTEGER\fP
Skip simple-repeats of length <\fINUM\fP>. In range [1..inf]. Default: \fI1000\fP.
.TP
\fB-lf\fP, \fB--load-factor\fP \fIDOUBLE\fP
Set the load factor for the open addressing k-mer index. In range [1..inf]. Default: \fI1.6\fP.
.SS Verification Options:
.TP
\fB-mN\fP, \fB--match-N\fP
N matches all other characters. Default: N matches nothing.
.TP
\fB-ed\fP, \fB--error-distr\fP \fISTRING\fP
Write error distribution to \fIFILE\fP.
.TP
\fB-mf\fP, \fB--mismatch-file\fP \fISTRING\fP
Write mismatch patterns to \fIFILE\fP.
.SS Misc Options:
.TP
\fB-cm\fP, \fB--compact-mult\fP \fIDOUBLE\fP
Multiply compaction threshold by this value after reaching and compacting. In range [0..inf]. Default: \fI2.2\fP.
.TP
\fB-ncf\fP, \fB--no-compact-frac\fP \fIDOUBLE\fP
Don't compact if in this last fraction of genome. In range [0..1]. Default: \fI0.05\fP.
.SS Parallelism Options:
.TP
\fB-tc\fP, \fB--thread-count\fP \fIINTEGER\fP
Set the number of threads to use (0 to force sequential mode). In range [0..inf]. Default: \fI1\fP.
.TP
\fB-pws\fP, \fB--parallel-window-size\fP \fIINTEGER\fP
Collect candidates in windows of this length. In range [1..inf]. Default: \fI500000\fP.
.TP
\fB-pvs\fP, \fB--parallel-verification-size\fP \fIINTEGER\fP
Verify candidates in packages of this size. In range [1..inf]. Default: \fI100\fP.
.TP
\fB-pvmpc\fP, \fB--parallel-verification-max-package-count\fP \fIINTEGER\fP
Largest number of packages to create for verification per thread-1. In range [1..inf]. Default: \fI100\fP.
.TP
\fB-amms\fP, \fB--available-matches-memory-size\fP \fIINTEGER\fP
Bytes of main memory available for storing matches. In range [-1..inf]. Default: \fI0\fP.
.TP
\fB-mhst\fP, \fB--match-histo-start-threshold\fP \fIINTEGER\fP
When to start histogram. In range [1..inf]. Default: \fI5\fP.
.SH FORMATS, NAMING, SORTING, AND COORDINATE SCHEMES
RazerS 3 supports various output formats. The output format is detected automatically from the file name suffix.
.TP
\&.razers
Razer format
.TP
\&.fa, .fasta
Enhanced Fasta format
.TP
\&.eland
Eland format
.TP
\&.gff
GFF format
.TP
\&.sam
SAM format
.TP
\&.bam
BAM format
.TP
\&.afg
Amos AFG format
.sp
By default, reads and contigs are referred to by the Fasta ids given in the input files. The \fB-gn\fP and \fB-rn\fP options change this behaviour:
.TP
0
Use Fasta id.
.TP
1
Enumerate beginning with 1.
.TP
2
Use the read sequence (only for short reads!).
.TP
3
Use the Fasta id, do NOT append /L or /R for mate pairs.
.sp
.sp
The way matches are sorted in the output file can be changed with the \fB-so\fP option for the following formats: \fBrazers\fP, \fBfasta\fP, \fBsam\fP, and \fBafg\fP. Primary and secondary sort keys are:
.TP
0
1. read number, 2. genome position
.TP
1
1. genome position, 2. read number
.sp
.sp
The coordinate space used for begin and end positions can be changed with the \fB-pf\fP option for the \fBrazer\fP and \fBfasta\fP formats:
.TP
0
Gap space. Gaps between characters are counted from 0.
.TP
1
Position space. Characters are counted from 1.
.SH EXAMPLES
.TP
\fBrazers3\fP \fB-i\fP \fB96\fP \fB-tc\fP \fB12\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads.fq\fP
Map single-end reads with 4% error rate using 12 threads.
.TP
\fBrazers3\fP \fB-i\fP \fB95\fP \fB--no-gaps\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads.fq.gz\fP
Map single-end gzipped reads with 5% error rate and no indels.
.TP
\fBrazers3\fP \fB-i\fP \fB94\fP \fB-rr\fP \fB95\fP \fB-tc\fP \fB12\fP \fB-ll\fP \fB280\fP \fB-le\fP \fB80\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads_1.fq\fP \fBreads_2.fq\fP
Map paired-end reads with up to 6% errors, 95% sensitivity, 12 threads, and only output aligned pairs with an outer distance of 200-360 bp.
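.sp
The arithmetic behind the paired-end example can be checked with a short Python sketch. It assumes, as that example implies, that the library tolerance is applied symmetrically around the library length and that the error budget is 100 minus the percent identity. The helper names \fImax_error_percent\fP, \fIlibrary_window\fP, and \fIpaired_end_command\fP are hypothetical and not part of RazerS 3; the sketch only assembles an argument list from options documented on this page and does not run the mapper.
.sp
.nf
```python
def max_error_percent(percent_identity):
    # Error budget implied by -i: 94% identity leaves room for 6% errors.
    return 100 - percent_identity


def library_window(library_length, library_error):
    # Assumption: -le is a symmetric tolerance around -ll.
    return (library_length - library_error, library_length + library_error)


def paired_end_command(genome, reads_1, reads_2, output,
                       percent_identity, library_length, library_error):
    # Build the argument list; the genome file comes before the two mate files.
    return [
        "razers3",
        "-i", str(percent_identity),
        "-ll", str(library_length),
        "-le", str(library_error),
        "-o", output,
        genome, reads_1, reads_2,
    ]


if __name__ == "__main__":
    print(max_error_percent(94))
    print(library_window(280, 80))
    print(" ".join(paired_end_command(
        "hg18.fa", "reads_1.fq", "reads_2.fq", "mapped.razers", 94, 280, 80)))
```
.fi
.sp
The returned list could be handed to Python's subprocess.run on a system where razers3 is installed.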