.TH SPLAZERS 1 "" "splazers 1.3.7 [tarball]" ""
.SH NAME
splazers \- Split-map read sequences
.SH SYNOPSIS
\fBsplazers\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIREADS FILE\fP>
.br
\fBsplazers\fP [\fIOPTIONS\fP] <\fIGENOME FILE\fP> <\fIREADS FILE 1\fP> <\fIREADS FILE 2\fP>
.SH DESCRIPTION
SplazerS uses a prefix-suffix mapping strategy to split-map read sequences.
If a SAM file of mapped reads is given as input, all unmapped but anchored
reads are split-mapped onto anchoring target regions (specify option -an).
If a Fasta/q file of reads is given, reads are split-mapped onto the whole
reference sequence.
.sp
(c) Copyright 2010 by Anne-Katrin Emde.
.SH REQUIRED ARGUMENTS
.TP
\fBARGUMENT 0\fP \fIINPUT_FILE\fP
A reference genome file. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.TP
\fBREADS\fP List of \fIINPUT_FILE\fP's
Either one (single-end) or two (paired-end) read files. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.SH OPTIONS
.TP
\fB-h\fP, \fB--help\fP
Display the help message.
.TP
\fB--version\fP
Display version information.
.SS Main Options:
.TP
\fB-o\fP, \fB--output\fP \fIOUTPUT_FILE\fP
Change output filename. Default: <\fIREADS FILE\fP>.result.
.TP
\fB-f\fP, \fB--forward\fP
Only compute forward matches.
.TP
\fB-r\fP, \fB--reverse\fP
Only compute reverse complement matches.
.TP
\fB-i\fP, \fB--percent-identity\fP \fIDOUBLE\fP
Percent identity threshold. In range [50..100]. Default: \fI92\fP.
.TP
\fB-rr\fP, \fB--recognition-rate\fP \fIDOUBLE\fP
Set the percent recognition rate. In range [80..100]. Default: \fI99\fP.
.TP
\fB-pd\fP, \fB--param-dir\fP \fISTRING\fP
Read user-computed parameter files in the directory <\fIDIR\fP>.
.TP
\fB-id\fP, \fB--indels\fP
Allow indels. Default: mismatches only.
.TP
\fB-ll\fP, \fB--library-length\fP \fIINTEGER\fP
Paired-end library length. In range [1..inf]. Default: \fI220\fP.
.TP
\fB-le\fP, \fB--library-error\fP \fIINTEGER\fP
Paired-end library length tolerance. In range [0..inf]. Default: \fI50\fP.
.TP
\fB-m\fP, \fB--max-hits\fP \fIINTEGER\fP
Output only <\fINUM\fP> of the best hits. In range [1..inf]. Default: \fI100\fP.
.TP
\fB--unique\fP
Output only unique best matches (-m 1 -dr 0 -pa).
.TP
\fB-tr\fP, \fB--trim-reads\fP \fIINTEGER\fP
Trim reads to given length. Default: off. In range [14..inf].
.TP
\fB-mcl\fP, \fB--min-clipped-len\fP \fIINTEGER\fP
Minimum read length for read clipping. In range [1..inf]. Default: \fI0\fP.
.TP
\fB-qih\fP, \fB--quality-in-header\fP
Quality string in Fasta header.
.TP
\fB-ou\fP, \fB--outputUnmapped\fP \fIOUTPUT_FILE\fP
Output filename for unmapped reads.
.TP
\fB-v\fP, \fB--verbose\fP
Verbose mode.
.TP
\fB-vv\fP, \fB--vverbose\fP
Very verbose mode.
.SS Output Format Options:
.TP
\fB-a\fP, \fB--alignment\fP
Dump the alignment for each match.
.TP
\fB-pa\fP, \fB--purge-ambiguous\fP
Purge reads with more than max-hits best matches.
.TP
\fB-dr\fP, \fB--distance-range\fP \fIINTEGER\fP
Only consider matches with at most <\fINUM\fP> more errors than the best match. Default: output all.
.TP
\fB-of\fP, \fB--output-format\fP \fIINTEGER\fP
Set output format. 0 = RazerS, 1 = Enhanced Fasta, 2 = Eland, 3 = GFF, 4 = SAM. In range [0..4].
.TP
\fB-gn\fP, \fB--genome-naming\fP \fIINTEGER\fP
Select how genomes are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: \fI0\fP.
.TP
\fB-rn\fP, \fB--read-naming\fP \fIINTEGER\fP
Select how reads are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: \fI0\fP.
.TP
\fB-so\fP, \fB--sort-order\fP \fIINTEGER\fP
Select how matches are sorted. 0 = read number, 1 = genome position. In range [0..1]. Default: \fI0\fP.
.TP
\fB-pf\fP, \fB--position-format\fP \fIINTEGER\fP
Select begin/end position numbering (see Coordinate section below). 0 = gap space, 1 = position space. In range [0..1]. Default: \fI0\fP.
.SS Split Mapping Options:
.TP
\fB-sm\fP, \fB--split-mapping\fP \fIINTEGER\fP
Minimum match length for prefix/suffix mapping (to disable split mapping, set to 0). Default: \fI18\fP.
.TP
\fB-maxG\fP, \fB--max-gap\fP \fIINTEGER\fP
Maximum length of middle gap. Default: \fI10000\fP.
.TP
\fB-minG\fP, \fB--min-gap\fP \fIINTEGER\fP
Minimum length of middle gap (for edit distance mapping, about 10% of read length is recommended). Default: \fI0\fP.
.TP
\fB-ep\fP, \fB--errors-prefix\fP \fIINTEGER\fP
Maximum number of errors in prefix match. Default: \fI1\fP.
.TP
\fB-es\fP, \fB--errors-suffix\fP \fIINTEGER\fP
Maximum number of errors in suffix match. Default: \fI1\fP.
.TP
\fB-gl\fP, \fB--genome-len\fP \fIINTEGER\fP
Genome length in Mb, used to compute the expected number of random matches. In range [-inf..10000]. Default: \fI3000\fP.
.TP
\fB-an\fP, \fB--anchored\fP
Anchored split mapping. Only unmapped reads with mapped mates are considered. Requires the reads to be given in SAM format.
.TP
\fB-pc\fP, \fB--penalty-c\fP \fIINTEGER\fP
Percent of read length, used as penalty for split-gap. Default: \fI2\fP.
.SS Filtration Options:
.TP
\fB-oc\fP, \fB--overabundance-cut\fP \fIINTEGER\fP
Set k-mer overabundance cut ratio. In range [0..1].
.TP
\fB-rl\fP, \fB--repeat-length\fP \fIINTEGER\fP
Skip simple-repeats of length <\fINUM\fP>. In range [1..inf]. Default: \fI1000\fP.
.TP
\fB-tl\fP, \fB--taboo-length\fP \fIINTEGER\fP
Set taboo length. In range [1..inf]. Default: \fI1\fP.
.TP
\fB-lm\fP, \fB--low-memory\fP
Decrease memory usage at the expense of runtime.
.SS Verification Options:
.TP
\fB-mN\fP, \fB--match-N\fP
N matches all other characters. Default: N matches nothing.
.TP
\fB-ed\fP, \fB--error-distr\fP \fISTRING\fP
Write error distribution to \fIFILE\fP.
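.SH EXAMPLES
The invocations below are illustrative sketches assembled only from the options documented on this page. The file names (\fIgenome.fa\fP, \fIreads.fq\fP, \fImapped.sam\fP, \fIsplit.sam\fP) are hypothetical placeholders, and suitable parameter values depend on the data.
.sp
Split-map single-end reads onto the whole reference sequence and write the result in SAM format (\fB-of\fP 4) to a chosen output file:
.sp
.nf
splazers -of 4 -o split.sam genome.fa reads.fq
.fi
.sp
Anchored split mapping, where only unmapped reads with mapped mates are considered. The reads file must be in SAM format:
.sp
.nf
splazers -an -of 4 -o split.sam genome.fa mapped.sam
.fi
.sp
Allow indels and set a minimum middle gap of about 10% of the read length, assuming reads of roughly 100 bases:
.sp
.nf
splazers -id -minG 10 genome.fa reads.fq
.fi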