.TH RAZERS3 1 "" "razers3 3.5.8 [tarball]" ""
.SH NAME
razers3 \- Faster, fully sensitive read mapping
.SH SYNOPSIS
\fBrazers3\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIREADS FILE\fP>
.br
\fBrazers3\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIPE-READS FILE1\fP> <\fIPE-READS FILE2\fP>
.SH DESCRIPTION
RazerS 3 is a versatile, fully sensitive read mapper based on k-mer counting and seeding filters. It supports single and paired-end mapping, shared-memory parallelism, and optimally parametrizes the filter based on a user-defined minimal sensitivity. See \fIhttp://www.seqan.de/projects/razers\fP for more information.
.sp
Input to RazerS 3 is a reference genome file and either one file with single-end reads or two files containing the left and right mates of paired-end reads. Use - to read single-end reads from stdin.
.sp
(c) Copyright 2009-2014 by David Weese.
.SH REQUIRED ARGUMENTS
.TP
\fBARGUMENT 0\fP \fIINPUT_FILE\fP
A reference genome file. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.TP
\fBREADS\fP List of \fIINPUT_FILE\fP's
Either one (single-end) or two (paired-end) read files. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.SH OPTIONS
.TP
\fB-h\fP, \fB--help\fP
Display the help message.
.TP
\fB--version\fP
Display version information.
.SS Main Options:
.TP
\fB-i\fP, \fB--percent-identity\fP \fIDOUBLE\fP
Percent identity threshold. In range [50..100]. Default: \fI95\fP.
.TP
\fB-rr\fP, \fB--recognition-rate\fP \fIDOUBLE\fP
Percent recognition rate. In range [80..100]. Default: \fI100\fP.
.TP
\fB-ng\fP, \fB--no-gaps\fP
Allow only mismatches, no indels. Default: allow both.
.TP
\fB-f\fP, \fB--forward\fP
Map reads only to forward strands.
.TP
\fB-r\fP, \fB--reverse\fP
Map reads only to reverse strands.
.TP
\fB-m\fP, \fB--max-hits\fP \fIINTEGER\fP
Output only <\fINUM\fP> of the best hits. In range [1..inf]. Default: \fI100\fP.
.TP
\fB--unique\fP
Output only unique best matches (-m 1 -dr 0 -pa).
.TP
\fB-tr\fP, \fB--trim-reads\fP \fIINTEGER\fP
Trim reads to given length. In range [14..inf]. Default: off.
.TP
\fB-o\fP, \fB--output\fP \fIOUTPUT_FILE\fP
Mapping result filename (use - to dump to stdout in razers format). Default: <\fIREADS FILE\fP>.razers. Valid filetypes are: \fI.sam\fP, \fI.razers\fP, \fI.gff\fP, \fI.fasta\fP, \fI.fa\fP, \fI.eland\fP, \fI.bam\fP, and \fI.afg\fP.
.TP
\fB-v\fP, \fB--verbose\fP
Verbose mode.
.TP
\fB-vv\fP, \fB--vverbose\fP
Very verbose mode.
.SS Paired-end Options:
.TP
\fB-ll\fP, \fB--library-length\fP \fIINTEGER\fP
Paired-end library length. In range [1..inf]. Default: \fI220\fP.
.TP
\fB-le\fP, \fB--library-error\fP \fIINTEGER\fP
Paired-end library length tolerance. In range [0..inf]. Default: \fI50\fP.
.SS Output Format Options:
.TP
\fB-a\fP, \fB--alignment\fP
Dump the alignment for each match (only \fIrazers\fP or \fIfasta\fP format).
.TP
\fB-pa\fP, \fB--purge-ambiguous\fP
Purge reads with more than <\fImax-hits\fP> best matches.
.TP
\fB-dr\fP, \fB--distance-range\fP \fIINTEGER\fP
Only consider matches with at most NUM more errors compared to the best. Default: output all.
.TP
\fB-gn\fP, \fB--genome-naming\fP \fIINTEGER\fP
Select how genomes are named (see Naming section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-rn\fP, \fB--read-naming\fP \fIINTEGER\fP
Select how reads are named (see Naming section below). In range [0..3]. Default: \fI0\fP.
.TP
\fB--full-readid\fP
Use the whole read id (don't clip after whitespace).
.TP
\fB-so\fP, \fB--sort-order\fP \fIINTEGER\fP
Select how matches are sorted (see Sorting section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-pf\fP, \fB--position-format\fP \fIINTEGER\fP
Select begin/end position numbering (see Coordinate section below). In range [0..1]. Default: \fI0\fP.
.TP
\fB-ds\fP, \fB--dont-shrink-alignments\fP
Disable alignment shrinking in SAM. This is required for generating a gold mapping for Rabema.
.SS Filtration Options:
.TP
\fB-fl\fP, \fB--filter\fP \fISTRING\fP
Select k-mer filter. One of \fIpigeonhole\fP and \fIswift\fP. Default: \fIpigeonhole\fP.
.TP
\fB-mr\fP, \fB--mutation-rate\fP \fIDOUBLE\fP
Set the percent mutation rate (\fIpigeonhole\fP). In range [0..20]. Default: \fI5\fP.
.TP
\fB-ol\fP, \fB--overlap-length\fP \fIINTEGER\fP
Manually set the overlap length of adjacent k-mers (\fIpigeonhole\fP). In range [0..inf].
.TP
\fB-pd\fP, \fB--param-dir\fP \fISTRING\fP
Read user-computed parameter files in the directory <\fIDIR\fP> (\fIswift\fP).
.TP
\fB-t\fP, \fB--threshold\fP \fIINTEGER\fP
Manually set minimum k-mer count threshold (\fIswift\fP). In range [1..inf].
.TP
\fB-tl\fP, \fB--taboo-length\fP \fIINTEGER\fP
Set taboo length (\fIswift\fP). In range [1..inf]. Default: \fI1\fP.
.TP
\fB-s\fP, \fB--shape\fP \fISTRING\fP
Manually set k-mer shape.
.TP
\fB-oc\fP, \fB--overabundance-cut\fP \fIINTEGER\fP
Set k-mer overabundance cut ratio. In range [0..1]. Default: \fI1\fP.
.TP
\fB-rl\fP, \fB--repeat-length\fP \fIINTEGER\fP
Skip simple-repeats of length <\fINUM\fP>. In range [1..inf]. Default: \fI1000\fP.
.TP
\fB-lf\fP, \fB--load-factor\fP \fIDOUBLE\fP
Set the load factor for the open addressing k-mer index. In range [1..inf]. Default: \fI1.6\fP.
.SS Verification Options:
.TP
\fB-mN\fP, \fB--match-N\fP
N matches all other characters. Default: N matches nothing.
.TP
\fB-ed\fP, \fB--error-distr\fP \fISTRING\fP
Write error distribution to \fIFILE\fP.
.TP
\fB-mf\fP, \fB--mismatch-file\fP \fISTRING\fP
Write mismatch patterns to \fIFILE\fP.
.SS Misc Options:
.TP
\fB-cm\fP, \fB--compact-mult\fP \fIDOUBLE\fP
Multiply compaction threshold by this value after reaching and compacting. In range [0..inf]. Default: \fI2.2\fP.
.TP
\fB-ncf\fP, \fB--no-compact-frac\fP \fIDOUBLE\fP
Don't compact if in this last fraction of genome. In range [0..1]. Default: \fI0.05\fP.
.SS Parallelism Options:
.TP
\fB-tc\fP, \fB--thread-count\fP \fIINTEGER\fP
Set the number of threads to use (0 to force sequential mode). In range [0..inf]. Default: \fI1\fP.
.TP
\fB-pws\fP, \fB--parallel-window-size\fP \fIINTEGER\fP
Collect candidates in windows of this length. In range [1..inf]. Default: \fI500000\fP.
.TP
\fB-pvs\fP, \fB--parallel-verification-size\fP \fIINTEGER\fP
Verify candidates in packages of this size. In range [1..inf]. Default: \fI100\fP.
.TP
\fB-pvmpc\fP, \fB--parallel-verification-max-package-count\fP \fIINTEGER\fP
Largest number of packages to create for verification per thread-1. In range [1..inf]. Default: \fI100\fP.
.TP
\fB-amms\fP, \fB--available-matches-memory-size\fP \fIINTEGER\fP
Bytes of main memory available for storing matches. In range [-1..inf]. Default: \fI0\fP.
.TP
\fB-mhst\fP, \fB--match-histo-start-threshold\fP \fIINTEGER\fP
When to start histogram. In range [1..inf]. Default: \fI5\fP.
.SH FORMATS, NAMING, SORTING, AND COORDINATE SCHEMES
RazerS 3 supports various output formats. The output format is detected automatically from the file name suffix.
.TP
\&.razers
Razer format
.TP
\&.fa, .fasta
Enhanced Fasta format
.TP
\&.eland
Eland format
.TP
\&.gff
GFF format
.TP
\&.sam
SAM format
.TP
\&.bam
BAM format
.TP
\&.afg
Amos AFG format
.sp
By default, reads and contigs are referred to by their Fasta ids given in the input files. With the \fB-gn\fP and \fB-rn\fP options this behaviour can be changed:
.TP
0
Use Fasta id.
.TP
1
Enumerate beginning with 1.
.TP
2
Use the read sequence (only for short reads!).
.TP
3
Use the Fasta id, do NOT append /L or /R for mate pairs.
.sp
.sp
The way matches are sorted in the output file can be changed with the \fB-so\fP option for the following formats: \fBrazers\fP, \fBfasta\fP, \fBsam\fP, and \fBafg\fP. Primary and secondary sort keys are:
.TP
0
1. read number, 2. genome position
.TP
1
1. genome position, 2. read number
.sp
.sp
The coordinate space used for begin and end positions can be changed with the \fB-pf\fP option for the \fBrazers\fP and \fBfasta\fP formats:
.TP
0
Gap space. Gaps between characters are counted from 0.
.TP
1
Position space. Characters are counted from 1.
.SH EXAMPLES
.TP
\fBrazers3\fP \fB-i\fP \fB96\fP \fB-tc\fP \fB12\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads.fq\fP
Map single-end reads with 4% error rate using 12 threads.
.TP
\fBrazers3\fP \fB-i\fP \fB95\fP \fB--no-gaps\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads.fq.gz\fP
Map single-end gzipped reads with 5% error rate and no indels.
.TP
\fBrazers3\fP \fB-i\fP \fB94\fP \fB-rr\fP \fB95\fP \fB-tc\fP \fB12\fP \fB-ll\fP \fB280\fP \fB-le\fP \fB80\fP \fB-o\fP \fBmapped.razers\fP \fBhg18.fa\fP \fBreads_1.fq\fP \fBreads_2.fq\fP
Map paired-end reads with up to 6% errors, 95% sensitivity, 12 threads, and only output aligned pairs with an outer distance of 200-360 bp.
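.sp
The figures quoted in the examples above follow from simple arithmetic on the option values. The Python sketch below is illustrative only and is not part of RazerS 3: the helper names are hypothetical, and the formulas are an assumption inferred from the examples (error budget = 100 minus the percent identity; accepted outer distance = library length plus or minus the tolerance).
.sp
.nf
```python
def max_error_percent(percent_identity):
    """Error budget suggested by -i: 100 minus the identity threshold (assumed)."""
    return 100.0 - percent_identity


def outer_distance_range(library_length, library_error):
    """Outer distances suggested by -ll and -le: length +/- tolerance (assumed)."""
    return (library_length - library_error, library_length + library_error)


# Values taken from the third example: -i 94 -ll 280 -le 80
print(max_error_percent(94))          # 6.0
print(outer_distance_range(280, 80))  # (200, 360)
```
.fi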