NAME
stellar - the SwifT Exact LocaL AligneR
SYNOPSIS
stellar [OPTIONS] <FASTA FILE 1> <FASTA FILE 2>
DESCRIPTION
STELLAR implements the SWIFT filter algorithm (Rasmussen et al., 2006) and a
verification step for the SWIFT hits that applies local alignment, gapped
X-drop extension, and extraction of the longest epsilon-match.
The input to STELLAR consists of two files, each containing one or more
sequences in FASTA format. Each sequence from file 1 is compared with each
sequence in file 2. The sequences from file 1 serve as the database, and the
sequences from file 2 as queries.
(c) 2010-2012 by Birte Kehr
REQUIRED ARGUMENTS
- FASTA_FILE_1 INPUT_FILE
- Valid filetypes are: .fasta and .fa.
- FASTA_FILE_2 INPUT_FILE
- Valid filetypes are: .fasta and .fa.
OPTIONS
- -h, --help
- Display the help message.
- --version
- Display version information.
Main Options:
- -e, --epsilon DOUBLE
- Maximal error rate. In range [0.0000001..0.25]. Default: 0.05.
- -l, --minLength INTEGER
- Minimal length of epsilon-matches. In range [0..inf]. Default:
100.
- -f, --forward
- Search only in forward strand of database.
- -r, --reverse
- Search only in reverse complement of database.
- -a, --alphabet STRING
- Alphabet type of input sequences. One of dna, dna5, rna, rna5, protein,
and char.
- -v, --verbose
- Set verbosity mode.
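To make the interplay between --epsilon and --minLength concrete, the following Python sketch computes how many errors an alignment of a given length may contain. It assumes that the error rate is the number of errors (mismatches and gap positions) divided by the alignment length; the function name max_errors is illustrative and not part of STELLAR.

```python
from fractions import Fraction
import math


def max_errors(length, epsilon=0.05):
    """Largest error count whose rate does not exceed epsilon (assumed definition)."""
    # Range taken from the --epsilon description on this page.
    if not 0.0000001 <= epsilon <= 0.25:
        raise ValueError("epsilon must be in [0.0000001..0.25]")
    # Use exact rational arithmetic to avoid floating-point rounding surprises.
    return math.floor(Fraction(str(epsilon)) * length)


print(max_errors(100))        # default settings: epsilon 0.05, length 100
print(max_errors(150, 0.05))  # a longer match tolerates more errors
```

Under this assumption, a match of the default minimal length 100 at the default error rate 0.05 may contain at most 5 errors.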
Filtering Options:
- -k, --kmer INTEGER
- Length of the q-grams. In range [1..32].
- -rp, --repeatPeriod INTEGER
- Maximal period of low complexity repeats to be filtered. Default:
1.
- -rl, --repeatLength INTEGER
- Minimal length of low complexity repeats to be filtered. Default:
1000.
- -c, --abundanceCut DOUBLE
- k-mer overabundance cut ratio. In range [0..1]. Default: 1.
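For readers unfamiliar with the term used by --kmer, a q-gram is a substring of length q. The Python sketch below lists the overlapping q-grams of a short sequence. It only illustrates the definition and does not reflect how STELLAR's filter is implemented; the function name qgrams is hypothetical.

```python
def qgrams(sequence, q):
    """Return all overlapping substrings of length q, in order of position."""
    # Range taken from the --kmer description on this page.
    if not 1 <= q <= 32:
        raise ValueError("q must be in [1..32]")
    # A sequence shorter than q yields an empty list.
    return [sequence[i:i + q] for i in range(len(sequence) - q + 1)]


print(qgrams("ACGTAC", 3))
```

A sequence of length n contains n - q + 1 overlapping q-grams, so the example prints four 3-grams.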
Verification Options:
- -x, --xDrop DOUBLE
- Maximal x-drop for extension. Default: 5.
- -vs, --verification STRING
- Verification strategy. One of exact, bestLocal, and bandedGlobal.
Default: exact.
- -dt, --disableThresh INTEGER
- Maximal number of verified matches before disabling verification for one
query sequence. In range [0..inf]. Default: infinity.
- -n, --numMatches INTEGER
- Maximal number of kept matches per query and database. If STELLAR finds
more matches, only the longest ones are kept. Default: 50.
- -s, --sortThresh INTEGER
- Number of matches triggering removal of duplicates. Choose a smaller value
for saving space. Default: 500.
Output Options:
- -o, --out OUTPUT_FILE
- Name of output file. Valid filetypes are: .txt and .gff.
Default: stellar.gff.
- -od, --outDisabled OUTPUT_FILE
- Name of output file for disabled query sequences. Valid filetypes are:
.sam[.*], .raw[.*], .frn[.*], .fq[.*],
.fna[.*], .ffn[.*], .fastq[.*], .fasta[.*],
.faa[.*], .fa[.*], and .bam, where * is any of the
following extensions: gz, bz2, and bgzf for
transparent (de)compression. Default: stellar.disabled.fasta.
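The options above can be combined into a single invocation. The following Python sketch assembles an argument list using only flags documented on this page. The helper name build_stellar_command and the file names are hypothetical; the resulting list could be passed to subprocess.run on a system where stellar is installed.

```python
def build_stellar_command(database, queries, epsilon=0.05, min_length=100,
                          out="stellar.gff", forward_only=False):
    """Assemble an argv list for stellar (hypothetical helper)."""
    # Ranges are taken from the option descriptions on this page.
    if not 0.0000001 <= epsilon <= 0.25:
        raise ValueError("epsilon must be in [0.0000001..0.25]")
    if min_length < 0:
        raise ValueError("minLength must be non-negative")
    cmd = ["stellar", "-e", str(epsilon), "-l", str(min_length), "-o", out]
    if forward_only:
        cmd.append("-f")  # search only the forward strand of the database
    # Options come first; file 1 is the database, file 2 holds the queries.
    cmd += [database, queries]
    return cmd


print(" ".join(build_stellar_command("database.fasta", "queries.fasta",
                                     epsilon=0.1)))
```

Validating the ranges before building the command is a design choice of this sketch, meant to surface mistakes early.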
REFERENCES
Kehr, B., Weese, D., Reinert, K.: STELLAR: fast and exact local alignments. BMC
Bioinformatics, 12(Suppl 9):S15, 2011.