LAMBDA(1) LAMBDA(1)

NAME

lambda - the Local Aligner for Massive Biological DatA

SYNOPSIS

lambda [OPTIONS] -q QUERY.fasta -d DATABASE.fasta [-o output.m8]

DESCRIPTION

Lambda is a local aligner optimized for many query sequences and searches in protein space. It is compatible with BLAST, but much faster than BLAST and many other comparable tools.

Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>

OPTIONS

-h, --help
Display the help message.
-hh, --full-help
Display the help message with advanced options.
--version
Display version information.
--copyright
Display long copyright information.
-v, --verbosity INTEGER
Display more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time, options and statistics]. In range [0..2]. Default: 1.

Input Options:

-q, --query INPUT_FILE
Query sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
-d, --database INPUT_FILE
Path to original database sequences (a precomputed index with .sa or .fm needs to exist!). Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
-di, --db-index-type STRING
Format of the database index. One of sa and fm. Default: fm.

Output Options:

-o, --output OUTPUT_FILE
File to hold reports on hits (.m* are blastall -m* formats; .m8 is tab-separated, .m9 is tab-separated with comments, .m0 is pairwise format). Valid filetypes are: .sam[.*], .m9[.*], .m8[.*], .m0[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression. Default: output.m8.
-oc, --output-columns STRING
Print specified column combination and/or order (.m8 and .m9 outputs only); call -oc help for more details. Default: std.
-id, --percent-identity INTEGER
Output only matches above this threshold (checked before e-value check). In range [0..100]. Default: 0.
-e, --e-value DOUBLE
Output only matches that score below this threshold. In range [0..inf]. Default: 0.1.
-nm, --num-matches INTEGER
Print at most this number of matches per query. In range [1..inf]. Default: 500.
--sam-with-refheader STRING
BAM files require all subject names to be written to the header. For SAM this is not required, so Lambda does not do it automatically, to save space (for protein databases in particular this is a lot!). If you still want them with SAM, e.g. for better BAM compatibility, use this option. One of on and off. Default: off.
--sam-bam-seq STRING
Write matching DNA subsequence into SAM/BAM file (BLASTN). For BLASTX and TBLASTX the matching protein sequence is "untranslated" and positions retransformed to the original sequence. For BLASTP and TBLASTN there is no DNA sequence, so a "*" is written to the SEQ column. The matching protein sequence can be written as an optional tag, see --sam-bam-tags. If set to uniq, then the sequence is omitted iff it is identical to the previous match's subsequence. One of always, uniq, and never. Default: uniq.
--sam-bam-tags STRING
Write the specified optional columns to the SAM/BAM file. Call --sam-bam-tags help for more details. Default: AS NM ZE ZI ZF.
--sam-bam-clip STRING
Whether to hard-clip or soft-clip the regions beyond the local match. Soft-clipping retains the full sequence in the output file, but obviously uses more space. One of hard and soft. Default: hard.

General Options:

-t, --threads INTEGER
Number of threads to run concurrently.
-qi, --query-index-type STRING
Controls double-indexing. One of radix and none. Default: none.

Alphabets and Translation:

-p, --program STRING
Blast Operation Mode. One of blastn, blastp, blastx, tblastn, and tblastx. Default: blastx.
-g, --genetic-code INTEGER
The translation table to use for nucl -> amino acid translation (not for BlastN, BlastP). See https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c for ids (default is generic). Six frames are generated. Default: 1.
-ar, --alphabet-reduction STRING
Alphabet Reduction for seeding phase (ignored for BLASTN). One of none and murphy10. Default: murphy10.

Seeding / Filtration:

-sl, --seed-length INTEGER
Length of the seeds (default = 14 for BLASTN). Default: 10.
-so, --seed-offset INTEGER
Offset for seeding (if unset = seed-length, non-overlapping; default = 5 for BLASTN). Default: 10.
-sd, --seed-delta INTEGER
Maximum seed distance. Default: 1.
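
The relationship between -sl and -so described above can be illustrated with a small sketch. It only enumerates seed start positions under the stated rule (offset defaults to the seed length, giving non-overlapping seeds). The function name seed_starts is hypothetical, and how Lambda itself treats residues at the end of a sequence that do not fill a whole seed is an assumption here, not taken from this page.

```python
def seed_starts(sequence_length, seed_length=10, seed_offset=None):
    """Return 0-based start positions of seeds along one sequence.

    Hypothetical helper for illustration only; it is not part of Lambda.
    Assumption: trailing residues that cannot hold a full seed are skipped.
    """
    if seed_offset is None:
        # Per the -so description: unset means offset = seed length,
        # so consecutive seeds do not overlap.
        seed_offset = seed_length
    if seed_length < 1 or seed_offset < 1:
        raise ValueError("seed_length and seed_offset must be positive")
    last_start = sequence_length - seed_length
    return list(range(0, last_start + 1, seed_offset))


# Non-overlapping seeds (offset equals seed length).
print(seed_starts(30, seed_length=10))
# Overlapping seeds, as in the "sensitive" profile (-so 5).
print(seed_starts(30, seed_length=10, seed_offset=5))
```

Halving the offset roughly doubles the number of seeds per sequence, which is consistent with the smaller offset being recommended for sensitivity at the expense of speed.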

Miscellaneous Heuristics:

-ps, --pre-scoring INTEGER
Evaluate score of a region NUM times the size of the seed before extension (0 -> no pre-scoring, 1 -> evaluate seed, n -> area around seed as well; default = 1 if no reduction is used). In range [1..inf]. Default: 2.
-pt, --pre-scoring-threshold DOUBLE
Minimum average score per position in pre-scoring region. Default: 2.
-pd, --filter-putative-duplicates STRING
Filter hits that will likely duplicate a match already found. One of on and off. Default: on.
-pa, --filter-putative-abundant STRING
If the maximum number of matches per query has already been found, stop searching if the remaining realm looks infeasible. One of on and off. Default: on.

Scoring:

-sc, --scoring-scheme INTEGER
Use '45' for Blosum45; '62' for Blosum62 (default); '80' for Blosum80 [ignored for BlastN]. Default: 62.
-ge, --score-gap INTEGER
Score per gap character (default = -2 for BLASTN). Default: -1.
-go, --score-gap-open INTEGER
Additional cost for opening gap (default = -5 for BLASTN). Default: -11.
-ma, --score-match INTEGER
Match score [only BLASTN]. Default: 2.
-mi, --score-mismatch INTEGER
Mismatch score [only BLASTN]. Default: -3.

Extension:

-x, --x-drop INTEGER
Stop banded extension if the score drops x below the maximum seen (-1 means no xdrop). In range [-1..inf]. Default: 30.
-b, --band INTEGER
Size of the DP-band used in extension (-3 means log2 of query length; -2 means sqrt of query length; -1 means full dp; n means band of size 2n+1). In range [-3..inf]. Default: -3.
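
The special values of -b can be read as a small mapping from the option value and the query length to a band size. The sketch below is one possible reading, not Lambda's implementation: the function names are hypothetical, and it assumes that the log2 and sqrt values are rounded down and then used as n in the 2n+1 formula. Neither assumption is stated on this page.

```python
import math


def band_half_width(band, query_length):
    """Resolve a -b value to the half-width n, or None for full DP.

    Hypothetical helper. Assumptions: log2 and sqrt are rounded down,
    and the result is used as n in "band of size 2n+1".
    """
    if query_length < 1:
        raise ValueError("query_length must be positive")
    if band == -1:
        return None  # full DP, no band
    if band == -2:
        return math.isqrt(query_length)  # floor of the square root
    if band == -3:
        return query_length.bit_length() - 1  # floor of log2
    if band >= 0:
        return band
    raise ValueError("band must be in range [-3..inf]")


def band_size(band, query_length):
    """Return the number of diagonals in the band, or None for full DP."""
    n = band_half_width(band, query_length)
    return None if n is None else 2 * n + 1


# Compare the settings for a query of length 1024.
for value in (-3, -2, -1, 5):
    print(value, band_size(value, 1024))
```

Under these assumptions the default (-3) yields a much narrower band than -2 for long queries, since log2 grows far more slowly than the square root.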

TUNING

Tuning the seeding parameters and (de)activating alphabet reduction both have a strong influence on speed and sensitivity. We recommend the following alternative profiles for protein searches:

fast (high similarity): -ar none -sl 7 -sd 0

sensitive (lower similarity): -so 5

For further information see the wiki: <https://github.com/seqan/lambda/wiki>
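
As an illustration of how the profiles above combine with the invocation from the SYNOPSIS, the following sketch assembles the argument list for each profile. The option strings are taken from this page; the names PROFILES and build_command and the file names are placeholders, and the sketch only prints the command rather than running it.

```python
import shlex

# Extra options per tuning profile, copied from the TUNING section.
# "default" adds nothing and relies on Lambda's built-in defaults.
PROFILES = {
    "default": [],
    "fast": ["-ar", "none", "-sl", "7", "-sd", "0"],
    "sensitive": ["-so", "5"],
}


def build_command(query, database, output="output.m8", profile="default"):
    """Return the lambda argument list for one search (hypothetical helper)."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    return ["lambda", "-q", query, "-d", database, "-o", output,
            *PROFILES[profile]]


for name in PROFILES:
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(build_command("QUERY.fasta", "DATABASE.fasta",
                                   profile=name)))
```

The resulting list could be passed to subprocess.run once a database index exists, as required by -d.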

LEGAL

lambda Copyright: 2013-2017 Hannes Hauswedell, released under the GNU GPL v3 (or later); 2016-2017 Knut Reinert and Freie Universität Berlin, released under the 3-clause BSDL.
SeqAn Copyright: 2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite: Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439
For full copyright and/or warranty information see --copyright.
Jan 21 2017 lambda 1.0.1