RAZERS(1)

NAME

razers - Fast Read Mapping with Sensitivity Control

SYNOPSIS

razers [OPTIONS] <GENOME FILE> <READS FILE>

razers [OPTIONS] <GENOME FILE> <MP-READS FILE1> <MP-READS FILE2>

DESCRIPTION

RazerS is a versatile, fully sensitive read mapper based on a k-mer counting filter. It supports single-end and paired-end mapping and parametrizes the filter optimally for a user-defined minimum sensitivity. See http://www.seqan.de/projects/razers for more information.

Input to RazerS is a reference genome file and either one file with single-end reads or two files containing the left and right mates of paired-end reads. Use - to read single-end reads from stdin.

REQUIRED ARGUMENTS

ARGUMENT 0 INPUT_FILE: A reference genome file. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.

READS List of INPUT_FILE's: Either one (single-end) or two (paired-end) read files. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.

OPTIONS

-h, --help: Display the help message.

--version: Display version information.

Main Options:

-f, --forward: Map reads only to forward strands.

-r, --reverse: Map reads only to reverse strands.

-i, --percent-identity DOUBLE: Percent identity threshold. In range [50..100]. Default: 92.

-rr, --recognition-rate DOUBLE: Percent recognition rate. In range [80..100]. Default: 99.

-pd, --param-dir STRING: Read user-computed parameter files in the directory <DIR>.

-id, --indels: Allow indels. Default: mismatches only.

-ll, --library-length INTEGER: Paired-end library length. In range [1..inf]. Default: 220.

-le, --library-error INTEGER: Paired-end library length tolerance. In range [0..inf]. Default: 50.

-m, --max-hits INTEGER: Output only <NUM> of the best hits. In range [1..inf]. Default: 100.

--unique: Output only unique best matches (-m 1 -dr 0 -pa).

-tr, --trim-reads INTEGER: Trim reads to given length. In range [14..inf]. Default: off.

-o, --output OUTPUT_FILE: Change output filename (use - to dump to stdout in razers format). Default: <READS FILE>.razers. Valid filetypes are: .razers, .gff, .fasta, .fa, and .eland.

-v, --verbose: Verbose mode.

-vv, --vverbose: Very verbose mode.
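As a rough illustration of the numeric options above, the following Python sketch converts a percent identity threshold into an error budget for a read of a given length and computes the accepted paired-end library window. The function names are hypothetical, and both the floor rounding and the symmetric window (library length plus or minus the tolerance) are assumptions made for this sketch, not documented RazerS behaviour.

```python
import math


def max_errors(read_length, percent_identity=92.0):
    """Error budget for one read under a percent identity threshold.

    Assumption: the budget is rounded down to a whole number of errors.
    """
    if not 50.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be in [50..100]")
    return math.floor(read_length * (100.0 - percent_identity) / 100.0)


def library_window(library_length=220, library_error=50):
    """Smallest and largest accepted library length for a read pair.

    Assumption: the tolerance applies symmetrically around the library length.
    """
    return (max(library_length - library_error, 0),
            library_length + library_error)
```

With the documented defaults (-i 92, -ll 220, -le 50), a 36-base read gets a budget of 2 errors under this reading, and the library window is 170 to 270.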

Output Format Options:

-a, --alignment: Dump the alignment for each match (only razer or fasta format).

-pa, --purge-ambiguous: Purge reads with more than <max-hits> best matches.

-dr, --distance-range INTEGER: Only consider matches with at most <NUM> more errors than the best match. Default: output all.

-gn, --genome-naming INTEGER: Select how genomes are named (see Naming section below). In range [0..1]. Default: 0.

-rn, --read-naming INTEGER: Select how reads are named (see Naming section below). In range [0..2]. Default: 0.

-so, --sort-order INTEGER: Select how matches are sorted (see Sorting section below). In range [0..1]. Default: 0.

-pf, --position-format INTEGER: Select begin/end position numbering (see Coordinate section below). In range [0..1]. Default: 0.
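The interplay of -m, -dr, and -pa (and therefore of --unique, which is shorthand for -m 1 -dr 0 -pa) can be sketched in Python as follows. This is one possible reading of the option descriptions, not the RazerS implementation: the function name select_hits and the representation of a match by its error count alone are assumptions made for illustration.

```python
def select_hits(error_counts, max_hits=100, distance_range=None,
                purge_ambiguous=False):
    """Choose which matches of a single read to report.

    error_counts    -- error count of each match found for the read
    max_hits        -- models -m: report at most this many of the best hits
    distance_range  -- models -dr: None means "output all"
    purge_ambiguous -- models -pa: drop the read if it has more than
                       max_hits equally good best matches
    """
    if not error_counts:
        return []
    ranked = sorted(error_counts)  # fewest errors first
    best = ranked[0]
    if purge_ambiguous and ranked.count(best) > max_hits:
        return []  # the read is ambiguous, report nothing
    if distance_range is not None:
        ranked = [e for e in ranked if e <= best + distance_range]
    return ranked[:max_hits]


def select_unique(error_counts):
    """Model of --unique, i.e. -m 1 -dr 0 -pa."""
    return select_hits(error_counts, max_hits=1, distance_range=0,
                       purge_ambiguous=True)
```

Under this reading, a read with a single best match keeps exactly that match, while a read with two equally good best matches is dropped entirely.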

Filtration Options:

-s, --shape STRING: Manually set k-mer shape. Default: 11111111111.

-t, --threshold INTEGER: Manually set minimum k-mer count threshold. In range [1..inf].

-oc, --overabundance-cut INTEGER: Set k-mer overabundance cut ratio. In range [0..1].

-rl, --repeat-length INTEGER: Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.

-tl, --taboo-length INTEGER: Set taboo length. In range [1..inf]. Default: 1.

-lm, --low-memory: Decrease memory usage at the expense of runtime.
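To make the shape and threshold options more concrete, here is a small Python sketch of shaped k-mer extraction and of the classical q-gram lemma bound. It assumes that a 1 in the shape string marks a position that is compared and a 0 marks a position that is ignored; the function names are hypothetical. The lemma bound applies to ungapped shapes only and is the lossless textbook value, whereas RazerS derives its own threshold from the requested recognition rate.

```python
def shaped_kmers(sequence, shape="11111111111"):
    """Extract all k-mers of `sequence` under a (possibly gapped) shape.

    Assumption: '1' is a position that is kept, '0' a position that is skipped.
    """
    if not shape or set(shape) - {"0", "1"}:
        raise ValueError("shape must be a non-empty string of 0s and 1s")
    span = len(shape)
    kept = [i for i, c in enumerate(shape) if c == "1"]
    return ["".join(sequence[start + i] for i in kept)
            for start in range(len(sequence) - span + 1)]


def qgram_lemma_threshold(read_length, q, errors):
    """Minimum number of shared ungapped q-grams guaranteed by the
    q-gram lemma: n + 1 - q * (e + 1). Values <= 0 mean no guarantee."""
    return read_length + 1 - q * (errors + 1)
```

For instance, the default shape 11111111111 yields plain 11-mers, so a 36-base read contains 26 of them, and the lemma guarantees at least 4 shared 11-mers when the read matches with 2 errors.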

Verification Options:

-mN, --match-N: N matches all other characters. Default: N matches nothing.

-ed, --error-distr STRING: Write error distribution to FILE.

-mcl, --min-clipped-len INTEGER: Set minimal read length for read clipping. In range [0..inf]. Default: 0.

-qih, --quality-in-header: Quality string in fasta header.

FORMATS, NAMING, SORTING, AND COORDINATE SCHEMES

RazerS supports various output formats. The output format is detected automatically from the file name suffix.

.razers: Razer format

.fa, .fasta: Enhanced Fasta format

.eland: Eland format

.gff: GFF format

By default, reads and contigs are referred to by the Fasta IDs given in the input files. The -gn and -rn options change this behaviour (-gn accepts 0 and 1; -rn accepts 0, 1, and 2):

0: Use Fasta id.

1: Enumerate beginning with 1.

2: Use the read sequence (only for short reads!).

The way matches are sorted in the output file can be changed with the -so option for the following formats: razer, fasta, sam, and amos. Primary and secondary sort keys are:

0: 1. read number, 2. genome position

1: 1. genome position, 2. read number
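The two sort orders amount to swapping the primary and secondary sort keys. A minimal Python sketch, assuming each match is a hypothetical (read number, contig number, begin position) tuple and that "genome position" means contig first, then begin position:

```python
def sort_matches(matches, sort_order=0):
    """Sort (read_number, contig_number, begin_position) tuples.

    sort_order 0: 1. read number, 2. genome position
    sort_order 1: 1. genome position, 2. read number
    """
    if sort_order == 0:
        key = lambda m: (m[0], m[1], m[2])
    elif sort_order == 1:
        key = lambda m: (m[1], m[2], m[0])
    else:
        raise ValueError("sort order must be 0 or 1")
    return sorted(matches, key=key)
```

Order 0 groups all matches of a read together, while order 1 lists matches in the order in which they occur along the genome.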

The coordinate space used for begin and end positions can be changed with the -pf option for the razer and fasta formats:

0: Gap space. Gaps between characters are counted from 0.

1: Position space. Characters are counted from 1.
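The difference between the two coordinate schemes can be illustrated with a short Python sketch. It assumes that a gap-space interval is 0-based with an exclusive end (it names the gaps before the first and after the last matched character) and that a position-space interval is 1-based with an inclusive end; the function names are hypothetical.

```python
def gap_to_position(begin, end):
    """Convert a gap-space interval (0-based, end exclusive) to
    position space (1-based, end inclusive)."""
    return begin + 1, end


def position_to_gap(begin, end):
    """Convert a position-space interval (1-based, end inclusive) to
    gap space (0-based, end exclusive)."""
    return begin - 1, end
```

Under these assumptions, a match covering the first four characters of a contig is reported as 0 to 4 in gap space and as 1 to 4 in position space; in both schemes the match length can be recovered, as end minus begin in gap space and end minus begin plus one in position space.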

EXAMPLES

razers example/genome.fa example/reads.fa -id -a -mN -v: Map single-end reads at the default 92% identity threshold (up to 8% errors), allow indels, and output the alignments. Ns are considered to match everything.

razers example/genome.fa example/reads.fa example/reads2.fa -id -mN: Map paired-end reads at the default 92% identity threshold (up to 8% errors), allow indels, and output concordantly mapped pairs within the default library size. Ns are considered to match everything.

razers 1.5.8

Source file:	razers.1.en.gz (from seqan-apps 2.4.0+dfsg-8~bpo8+1)
Source last updated:	2018-03-26T11:36:49Z