
GUSTAF(1) GUSTAF(1)

NAME

gustaf - Generic mUlti-SpliT Alignment Finder: tool for split-read mapping allowing multiple splits.

SYNOPSIS

gustaf [OPTIONS] <GENOME FASTA FILE> <READ FASTA FILE>

gustaf [OPTIONS] <GENOME FASTA FILE> <READ FASTA FILE> <READ FASTA FILE 2>

DESCRIPTION

GUSTAF uses SeqAn's STELLAR to find splits as local matches on different strands or chromosomes. Criteria and penalties for chaining these matches can be specified. The output file contains the breakpoints along the best chain.

The genome file is used as database input, the read file as query input.

All STELLAR options are supported. See STELLAR documentation for STELLAR parameters and options.

(c) 2011-2012 by Kathrin Trappe

REQUIRED ARGUMENTS


Genome file (database input). Valid filetypes are: .fq, .fastq, .fasta, and .fa.
Read files (query input): either one (single-end) or two (paired-end) read files. Valid filetypes are: .fq, .fastq, .fasta, and .fa.
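
As an illustration of the two call forms in the SYNOPSIS, the following Python sketch assembles a gustaf argument list and checks the file extensions listed above. The helper name build_gustaf_command and the file names are hypothetical. No options are passed, and the sketch only builds the command; it does not run it.

```python
from pathlib import Path

# Extensions listed under REQUIRED ARGUMENTS.
VALID_EXTENSIONS = {".fq", ".fastq", ".fasta", ".fa"}


def build_gustaf_command(genome, reads, reads2=None):
    """Return an argv list for single-end or paired-end input.

    Hypothetical helper: it mirrors the SYNOPSIS and passes no options.
    """
    files = [genome, reads] + ([reads2] if reads2 is not None else [])
    for name in files:
        if Path(name).suffix not in VALID_EXTENSIONS:
            raise ValueError(f"unsupported file type: {name}")
    return ["gustaf", *files]


# Single-end and paired-end call forms (file names are placeholders).
single = build_gustaf_command("genome.fa", "reads.fq")
paired = build_gustaf_command("genome.fa", "reads_1.fq", "reads_2.fq")
```

The resulting list can be handed to subprocess.run() on a system where gustaf is installed.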

OPTIONS

Display the help message.
Display version information.

Main Options:

Interchromosomal translocation penalty. Default: 5.
Inversion penalty. Default: 5.
Intrachromosomal order change penalty. Default: 0.
Allowed overlap between matches. Default: 0.5.
Allowed gap length between matches; the default value corresponds to the expected size of microindels (5 bp). Default: 5.
Allowed initial or ending gap length at the beginning and end of a read with no breakpoint (e.g. due to sequencing errors at the end). Default: 15.
Allowed initial or ending gap length at the beginning and end of a read that creates a breakend/breakpoint (e.g. for reads extending into insertions). Default: 30.
Minimal length of a (small) insertion/duplication with double overlap to be considered a tandem repeat. Default: 50.
Allowed difference in breakpoint position. Default: 5.
Disable inferring complex SVs.
Number of supporting reads. Default: 2.
Number of supporting concordant mates. Default: 2.
Library size of paired-end reads.
Library error (sd) of paired-end reads.
Disable reverse complementing second mate pair input file.
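
The three penalties above can be pictured with a small Python sketch. This is one possible reading of how a penalty might be assigned to a pair of consecutive matches in a chain, not GUSTAF's actual implementation: the Match type, the chain_penalty function, and the rule that only one penalty applies per pair are all assumptions made for illustration. Only the default values (5, 5, 0) come from the option list.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical local match of part of a read against the genome."""
    chrom: str
    strand: str      # "+" or "-"
    ref_begin: int   # begin position in the genome
    read_begin: int  # begin position in the read


def chain_penalty(a, b, trans_pen=5, inv_pen=5, order_pen=0):
    """Penalty for chaining match a to match b (a precedes b in the read).

    Assumed rule: different chromosomes -> translocation penalty;
    same chromosome, different strands -> inversion penalty;
    same strand but reversed genomic order -> order change penalty.
    """
    if a.chrom != b.chrom:
        return trans_pen
    if a.strand != b.strand:
        return inv_pen
    if b.ref_begin < a.ref_begin:
        return order_pen
    return 0
```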

Input Options:

File of (STELLAR) matches. Valid filetypes are: .gff and .GFF.

Output Options:

Name of GFF breakpoint output file. Valid filetypes are: .txt and .gff. Default: breakpoints.gff.
Name of VCF breakpoint output file. Valid filetypes are: .vcf and .txt. Default: breakpoints.vcf.
Job/Queue name Default: .
Enable graph output in dot format.

Parallelization Options:

Number of threads for parallelization of I/O. Default: 1.

STELLAR Main Options:

Maximal error rate (max 0.25). In range [0.0000001..0.25]. Default: 0.05.
Minimal length of epsilon-matches. In range [0..inf]. Default: 100.
Search only in forward strand of database.
Search only in reverse complement of database.
Alphabet type of input sequences. One of dna, dna5, rna, rna5, protein, and char.
Set verbosity mode.
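
The error rate and minimal length options together define which local matches count as epsilon-matches. The Python sketch below shows that acceptance test under the assumption that the error rate is the number of errors divided by the alignment length. The function name is_epsilon_match is hypothetical; the defaults and the allowed range come from the option list above.

```python
def is_epsilon_match(alignment_length, errors, epsilon=0.05, min_length=100):
    """Return True if a local alignment qualifies as an epsilon-match.

    Assumption: error rate = errors / alignment_length.
    """
    if not 0.0000001 <= epsilon <= 0.25:
        raise ValueError("epsilon must lie in [0.0000001..0.25]")
    if alignment_length < min_length:
        return False  # too short, regardless of error rate
    return errors <= epsilon * alignment_length
```

With the defaults, a 100 bp alignment with 4 errors qualifies, while one with 6 errors or one of only 99 bp does not.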

Filtering Options:

Length of the q-grams (max 32). In range [1..32].
Maximal period of low complexity repeats to be filtered. Default: 1.
Minimal length of low complexity repeats to be filtered. Default: 1000.
k-mer overabundance cut ratio. In range [0..1]. Default: 1.

Verification Options:

Maximal x-drop for extension. Default: 5.
Verification strategy. One of exact, bestLocal, and bandedGlobal. Default: exact.
Maximal number of verified matches before disabling verification for one query sequence (default infinity). In range [0..inf].
Maximal number of kept matches per query and database. If STELLAR finds more matches, only the longest ones are kept. Default: 50.
Number of matches triggering removal of duplicates. Choose a smaller value for saving space. Default: 500.
August 2014 gustaf 1.0.0