
SPLAZERS(1) SPLAZERS(1)

NAME

splazers - Split-map read sequences

SYNOPSIS

splazers [OPTIONS] <GENOME FILE> <READS FILE>
splazers [OPTIONS] <GENOME FILE> <READS FILE 1> <READS FILE 2>

DESCRIPTION

SplazerS uses a prefix-suffix mapping strategy to split-map read sequences. If a SAM file of mapped reads is given as input, all unmapped but anchored reads are split-mapped onto anchoring target regions (specify option -an). If a Fasta/q file of reads is given, reads are split-mapped onto the whole reference sequence.
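The prefix-suffix idea can be illustrated with a toy Python sketch. This is not the SplazerS implementation: it uses exact string matching only, searches the forward strand only, and the function name split_map and its parameters are hypothetical. The defaults mirror the documented minimum match length (18) and maximum middle gap (10000).

```python
def split_map(read, ref, min_match=18, max_gap=10000):
    """Toy exact-match split mapping (illustrative only).

    Tries every split point that leaves at least min_match characters
    on each side, and looks for the prefix followed by the suffix in
    ref with at most max_gap reference characters between them.
    Returns (prefix_start, split_point, suffix_start) or None.
    """
    n = len(read)
    for split in range(min_match, n - min_match + 1):
        prefix, suffix = read[:split], read[split:]
        p = ref.find(prefix)
        while p != -1:
            prefix_end = p + split
            # First suffix occurrence at or after the end of the prefix.
            s = ref.find(suffix, prefix_end)
            if s != -1 and s - prefix_end <= max_gap:
                return p, split, s
            p = ref.find(prefix, p + 1)
    return None


# Small example with a shortened minimum match length.
ref = "AAACGTACGGGGGGGGTTTCAGT"
read = "CGTAC" + "TTTCA"  # prefix and suffix are 8 bases apart in ref
print(split_map(read, ref, min_match=3))
```

A real split mapper additionally tolerates errors in the prefix and suffix matches and handles the reverse complement strand; the sketch only shows how a read is divided into two anchored parts separated by a middle gap.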

(c) Copyright 2010 by Anne-Katrin Emde.

REQUIRED ARGUMENTS

A reference genome file. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
Either one (single-end) or two (paired-end) read files. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.

OPTIONS

Display the help message.
Display version information.

Main Options:

Change output filename. Default: <READS FILE>.result.
Only compute forward matches.
Only compute reverse complement matches.
Percent identity threshold. In range [50..100]. Default: 92.
Set the percent recognition rate. In range [80..100]. Default: 99.
Read user-computed parameter files in the directory <DIR>.
Allow indels. Default: mismatches only.
Paired-end library length. In range [1..inf]. Default: 220.
Paired-end library length tolerance. In range [0..inf]. Default: 50.
Output only <NUM> of the best hits. In range [1..inf]. Default: 100.
Output only unique best matches (-m 1 -dr 0 -pa).
Trim reads to given length. Default: off. In range [14..inf].
Minimum read length for read clipping. In range [1..inf]. Default: 0.
Quality string in FASTA header.
Output filename for unmapped reads.
Verbose mode.
Very verbose mode.
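To make the numeric thresholds above concrete, the following Python sketch shows one plausible reading of them. Both interpretations are assumptions and are not confirmed by this page: that a percent identity threshold of p permits floor(L * (100 - p) / 100) errors in a read of length L, and that the library length tolerance defines a symmetric window around the library length. The function names are hypothetical.

```python
def max_errors(read_length, percent_identity=92):
    """Errors permitted in a read under the assumed identity rule."""
    if not 50 <= percent_identity <= 100:
        raise ValueError("percent identity must be in [50..100]")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (read_length * (100 - percent_identity)) // 100


def insert_size_ok(size, library_length=220, tolerance=50):
    """True if size lies within library_length +/- tolerance (assumed)."""
    return abs(size - library_length) <= tolerance


print(max_errors(100))      # 100-base read at the default 92% identity
print(insert_size_ok(170))  # lower edge of the default window
```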

Output Format Options:

Dump the alignment for each match.
Purge reads with more than max-hits best matches.
Only consider matches with at most NUM more errors compared to the best. Default: output all.
Set output format. 0 = RazerS, 1 = Enhanced Fasta, 2 = Eland, 3 = GFF, 4 = SAM. In range [0..4].
Select how genomes are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: 0.
Select how reads are named. 0 = use Fasta id, 1 = enumerate beginning with 1. In range [0..1]. Default: 0.
Select how matches are sorted. 0 = read number, 1 = genome position. In range [0..1]. Default: 0.
Select begin/end position numbering (see Coordinate section below). 0 = gap space, 1 = position space. In range [0..1]. Default: 0.
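The two begin/end numbering schemes can be related by a small conversion. The Python sketch below assumes the usual meaning of the terms, which this section does not spell out: gap space counts the gaps between characters starting at 0 (a half-open interval), while position space counts characters starting at 1 (a closed interval). The helper names are hypothetical.

```python
def gap_to_position(begin, end):
    """Convert a 0-based half-open interval to a 1-based closed one."""
    return begin + 1, end


def position_to_gap(begin, end):
    """Convert a 1-based closed interval to a 0-based half-open one."""
    return begin - 1, end


# A match covering the first four characters of a sequence:
print(gap_to_position(0, 4))
print(position_to_gap(1, 4))
```

Under this assumption the match length is end - begin in gap space and end - begin + 1 in position space.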

Split Mapping Options:

Minimum match length for prefix/suffix mapping (set to 0 to disable split mapping). Default: 18.
Maximum length of middle gap. Default: 10000.
Minimum length of middle gap (for edit distance mapping, about 10% of read length is recommended). Default: 0.
Maximum number of errors in prefix match. Default: 1.
Maximum number of errors in suffix match. Default: 1.
Genome length in Mb, used to compute the expected number of random matches. In range [-inf..10000]. Default: 3000.
Anchored split mapping: only unmapped reads with mapped mates are considered. Requires the reads to be given in SAM format.
Percent of read length, used as penalty for split-gap. Default: 2.
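The genome length option exists because short prefix or suffix matches can occur by chance in a large genome. The Python sketch below gives a back-of-the-envelope estimate under stated assumptions: a uniform random genome over four bases, a single strand, mismatches only, and roughly one candidate position per genome base. How SplazerS itself computes this expectation is not described here, and the function name is hypothetical.

```python
from math import comb


def expected_random_matches(match_length, max_mismatches=1, genome_mb=3000):
    """Expected chance matches of one k-mer in a random genome."""
    positions = genome_mb * 1e6
    # Number of sequences within the given Hamming distance of the k-mer.
    neighborhood = sum(
        comb(match_length, i) * 3 ** i for i in range(max_mismatches + 1)
    )
    return positions * neighborhood / 4 ** match_length


# Default minimum match length and error count on a 3000 Mb genome.
print(expected_random_matches(18, 1, 3000))
```

Under these assumptions, lengthening the minimum match or lowering the allowed errors reduces the estimate sharply, since each extra exactly matched base divides it by roughly four.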

Filtration Options:

Set k-mer overabundance cut ratio. In range [0..1].
Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.
Set taboo length. In range [1..inf]. Default: 1.
Decrease memory usage at the expense of runtime.

Verification Options:

N matches all other characters. Default: N matches nothing.
Write error distribution to FILE.
splazers 1.3.8