lambda2_searchn - the Local Aligner for Massive Biological DatA
lambda2 searchn [OPTIONS] -q QUERY.fasta -i INDEX.lambda [-o output.m8]
Lambda is a local aligner optimized for many query sequences and searches in protein space. It is compatible with BLAST, but much faster than BLAST and many other comparable tools.
Detailed information is available in the wiki: <https://github.com/seqan/lambda/wiki>
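The synopsis above can be assembled programmatically. The following is a minimal Python sketch, not part of Lambda itself: the helper name `build_searchn_command` and the file names are hypothetical, and only flags documented on this page (`-q`, `-i`, `-o`, `-e`, `-n`, `-t`) are used. It only builds the argument list; it does not run the aligner.

```python
from typing import List, Optional


def build_searchn_command(query: str,
                          index: str,
                          output: str = "output.m8",
                          e_value: Optional[float] = None,
                          num_matches: Optional[int] = None,
                          threads: Optional[int] = None) -> List[str]:
    """Build an argument list for `lambda2 searchn` (hypothetical helper)."""
    cmd = ["lambda2", "searchn", "-q", query, "-i", index, "-o", output]
    if e_value is not None:
        # Documented range for -e is [0..100].
        if not 0 <= e_value <= 100:
            raise ValueError("e-value must be in [0..100]")
        cmd += ["-e", str(e_value)]
    if num_matches is not None:
        # Documented range for -n is [1..10000].
        if not 1 <= num_matches <= 10000:
            raise ValueError("num-matches must be in [1..10000]")
        cmd += ["-n", str(num_matches)]
    if threads is not None:
        if threads < 1:
            raise ValueError("threads must be at least 1")
        cmd += ["-t", str(threads)]
    return cmd
```

The resulting list could then be passed to `subprocess.run(cmd, check=True)`, assuming `lambda2` is installed and on the `PATH`.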
- -h, --help
- Display the help message.
- -hh, --full-help
- Display the help message with advanced options.
- --version
- Display version information.
- --copyright
- Display long copyright information.
- -v, --verbosity INTEGER
- Display more/less diagnostic output during operation: 0 [only errors]; 1 [default]; 2 [+run-time, options and statistics]. In range [0..2]. Default: 1.
- -q, --query INPUT_FILE
- Query sequences. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
- -i, --index INPUT_DIRECTORY
- The database index (created by the 'lambda2 mkindexn' command). Valid filetype is: .lambda.
- -o, --output OUTPUT_FILE
- File to hold reports on hits (.m* are blastall -m* formats; .m8 is tab-separated, .m9 is tab-separated with comments, .m0 is pairwise format). Valid filetypes are: .sam[.*], .m9[.*], .m8[.*], .m0[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression. Default: output.m8.
- --output-columns STRING
- Print specified column combination and/or order (.m8 and .m9 outputs only); call --output-columns help for more details. Default: std.
- --percent-identity INTEGER
- Output only matches above this threshold (checked before e-value check). In range [0..100]. Default: 0.
- -e, --e-value DOUBLE
- Output only matches that score below this threshold. In range [0..100]. Default: 1e-04.
- -n, --num-matches INTEGER
- Print at most this number of matches per query. In range [1..10000]. Default: 256.
- --sam-with-refheader BOOL
- BAM files require all subject names to be written to the header. For SAM this is not required, so Lambda does not do it automatically, to save space (especially for protein databases this is a lot!). If you still want them with SAM, e.g. for better BAM compatibility, use this option. One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: off.
- --sam-bam-seq STRING
- Write matching DNA subsequence into SAM/BAM file. If set to uniq then the sequence is omitted iff it is identical to the previous match's subsequence. One of always, uniq, and never. Default: uniq.
- --sam-bam-tags STRING
- Write the specified optional columns to the SAM/BAM file. Call --sam-bam-tags help for more details. Default: AS NM ae ai qf.
- --sam-bam-clip STRING
- Whether to hard-clip or soft-clip the regions beyond the local match. Soft-clipping retains the full sequence in the output file, but obviously uses more space. One of hard and soft. Default: hard.
- -t, --threads INTEGER
- Number of threads to run concurrently. Default: autodetected.
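As an illustration of the tab-separated `.m8` output described above, the following Python sketch reads such a file and re-applies identity and e-value thresholds after the fact. It rests on assumptions: the `std` column set is taken to be the usual twelve BLAST tabular columns, the threshold bounds are treated as inclusive, and the function names `read_m8` and `filter_hits` are hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Assumption: "std" corresponds to the twelve standard BLAST tabular columns.
STD_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
               "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_COLUMNS = ["length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"]
FLOAT_COLUMNS = ["pident", "evalue", "bitscore"]


def read_m8(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per hit; skips blank lines and '#' comment lines (.m9)."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        record: Dict[str, object] = dict(zip(STD_COLUMNS, line.split("\t")))
        for key in INT_COLUMNS:
            record[key] = int(record[key])
        for key in FLOAT_COLUMNS:
            record[key] = float(record[key])
        yield record


def filter_hits(records: Iterable[Dict[str, object]],
                min_identity: float = 0.0,
                max_evalue: float = 1e-4) -> Iterator[Dict[str, object]]:
    """Keep hits at or above min_identity and at or below max_evalue."""
    for record in records:
        if record["pident"] >= min_identity and record["evalue"] <= max_evalue:
            yield record
```

With a real file this might be used as `with open("output.m8") as fh: hits = list(filter_hits(read_m8(fh), min_identity=90))`.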
Seeding / Filtration:
- --adaptive-seeding BOOL
- Grow the seed if it has too many hits (low complexity filter). One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: off.
- --seed-length INTEGER
- Length of the seeds. In range [3..50]. Default: 14.
- --seed-offset INTEGER
- Offset for seeding (if unset = seed-length/2). In range [1..50]. Default: 7.
- --seed-delta INTEGER
- Maximum seed distance. In range [0..1]. Default: 1.
- --seed-delta-increases-length BOOL
- Seed delta increases the min. seed length (for affected seeds). One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: off.
- --seed-half-exact BOOL
- Allow errors only in second half of seed. One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.
- --pre-scoring INTEGER
- Evaluate the score of a region NUM times the size of the seed before extension (0 -> no pre-scoring; 1 -> evaluate the seed; n -> evaluate the area around the seed as well; default = 1 if no reduction is used). In range [1..10]. Default: 2.
- --pre-scoring-threshold DOUBLE
- Minimum average score per position in the pre-scoring region. In range [0..20]. Default: 2.
- --filter-putative-duplicates BOOL
- Filter hits that will likely duplicate a match already found. One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.
- --filter-putative-abundant BOOL
- If the maximum number of matches per query has already been found, stop searching if the remaining realm looks unfeasible. One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.
- --merge-putative-siblings BOOL
- Merge seeds from one region, stop searching if the remaining realm looks unfeasible. One of 1, ON, TRUE, T, YES, 0, OFF, FALSE, F, and NO. Default: on.
- --score-gap INTEGER
- Score per gap character. In range [-1000..1000]. Default: -2.
- --score-gap-open INTEGER
- Additional cost for opening gap. In range [-1000..1000]. Default: -5.
- --score-match INTEGER
- Match score [only BLASTN]. In range [-1000..1000]. Default: 2.
- --score-mismatch INTEGER
- Mismatch score [only BLASTN]. In range [-1000..1000]. Default: -3.
- -x, --x-drop INTEGER
- Stop banded extension if the score drops x below the maximum seen (-1 means no x-drop). In range [-1..1000]. Default: 30.
- -b, --band INTEGER
- Size of the DP-band used in extension (-3 means log2 of query length; -2 means sqrt of query length; -1 means full DP; n means band of size 2n+1). In range [-3..1000]. Default: -3.
- -m, --extension-mode STRING
- Choice of extension algorithms. One of auto, xdrop, and fullSerial. Default: auto.
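To make the gap-scoring and band options above concrete, here is a small Python sketch. It is an interpretation, not Lambda's implementation: it assumes that `--score-gap-open` is added once per gap on top of `--score-gap` per gap character, that the special band values -3 and -2 define the parameter n of a band of size 2n+1, and that non-integer values are rounded down. The function names are hypothetical.

```python
import math
from typing import Optional


def gap_score(length: int, score_gap: int = -2, score_gap_open: int = -5) -> int:
    """Score of one gap of `length` characters under affine gap costs."""
    if length <= 0:
        return 0
    # One opening cost plus a per-character cost (defaults taken from this page).
    return score_gap_open + score_gap * length


def band_parameter(band: int, query_length: int) -> Optional[int]:
    """Translate the -b value into the band parameter n; None means full DP."""
    if query_length < 1:
        raise ValueError("query_length must be positive")
    if band == -1:
        return None                           # full DP, no band
    if band == -2:
        return math.isqrt(query_length)       # floor(sqrt(query length))
    if band == -3:
        return query_length.bit_length() - 1  # floor(log2(query length))
    if 0 <= band <= 1000:
        return band
    raise ValueError("band must be in [-3..1000]")


def band_width(band: int, query_length: int) -> Optional[int]:
    """Band size 2n+1, or None for full DP."""
    n = band_parameter(band, query_length)
    return None if n is None else 2 * n + 1
```

Under these assumptions, a gap of three characters with the default scores costs -5 + 3 * (-2) = -11, and the default `-b -3` on a 1024-base query gives n = 10, i.e. a band of size 21.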
Tuning the seeding parameters and (de)activating alphabet reduction has a strong influence on both speed and sensitivity. We recommend the following alternative profiles for protein searches:
fast (high similarity): --seed-delta-increases-length on
sensitive (lower similarity): --seed-offset 3
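The two profiles above map directly onto extra command-line arguments. A minimal Python sketch follows; the dictionary, the function name `profile_args`, and the `"default"` profile name are hypothetical conveniences, while the flags themselves are the ones listed above.

```python
from typing import Dict, List

# Extra arguments per profile, as recommended in the text above.
PROFILE_ARGS: Dict[str, List[str]] = {
    "default": [],  # hypothetical name for "no extra arguments"
    "fast": ["--seed-delta-increases-length", "on"],
    "sensitive": ["--seed-offset", "3"],
}


def profile_args(name: str) -> List[str]:
    """Return a copy of the extra arguments for the named profile."""
    try:
        return list(PROFILE_ARGS[name])
    except KeyError:
        raise ValueError(f"unknown profile: {name!r}") from None
```

For example, `["lambda2", "searchn", "-q", "QUERY.fasta", "-i", "INDEX.lambda"] + profile_args("sensitive")` yields a sensitive-profile invocation.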
For further information see the wiki: <https://github.com/seqan/lambda/wiki>
lambda2 searchn Copyright: 2013-2019 Hannes Hauswedell, released under the GNU AGPL v3 (or later); 2016-2019 Knut Reinert and Freie Universität Berlin, released under the 3-clause BSDL.
SeqAn Copyright: 2006-2015 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite: Hauswedell et al (2014); doi: 10.1093/bioinformatics/btu439
For full copyright and/or warranty information see --copyright.
lambda2 searchn 2.0.0, Dec 7 2020