NAME
proteinortho5 - orthology detection tool
SYNOPSIS
proteinortho5 [OPTIONS] FASTA1 FASTA2 [FASTA...]
DESCRIPTION
Proteinortho is a stand-alone tool that is geared towards large datasets and
makes use of distributed computing techniques when run on multi-core hardware.
It implements an extended version of the reciprocal best alignment heuristic.
Proteinortho was applied to compute orthologous proteins in the complete set
of all 717 eubacterial genomes available at NCBI at the beginning of 2009.
The authors succeeded in identifying thirty proteins present in 99% of all
bacterial proteomes.
OPTIONS
- -e=
- E-value for blast [default: 1e-05]
- -p=
- blast program {blastn|blastp|blastn+|blastp+} [default: blastp+]
- -project=
- prefix for all result file names [default: myproject]
- -synteny
- activate PoFF extension to separate similar sequences by contextual
adjacencies (requires .gff for each .fasta)
- -dups=
- PoFF: number of reiterations for adjacencies heuristic, to determine
duplicated regions (default: 0)
- -cs=
- PoFF: Size of a maximum common substring (MCS) for adjacency matches
(default: 3)
- -alpha=
- PoFF: weight of adjacencies vs. sequence similarity (default: 0.5)
- -desc
- write description files (for NCBI FASTA input only)
- -keep
- stores temporary blast results for reuse
- -force
- forces recalculation of blast results in any case
- -cpus=
- number of processors to use [default: auto]
- -selfblast
- apply selfblast, detects paralogs without orthologs
- -singles
- report singleton genes without any hit
- -identity=
- min. percent identity of best blast hits [default: 25]
- -cov=
- min. coverage of best blast alignments in % [default: 50]
- -conn=
- min. algebraic connectivity [default: 0.1]
- -sim=
- min. similarity for additional hits (0..1) [default: 0.95]
- -step=
- 1 -> generate indices; 2 -> run blast (and ff-adj, if -synteny
is set); 3 -> clustering; 0 -> all (default)
- -blastpath=
- path to your local blast (if not installed globally)
- -verbose
- keeps you informed about the progress
- -clean
- remove all unnecessary files after processing
- -graph
- generate .graph files (pairwise orthology relations)
- -debug
- gives detailed information for bug tracking
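As a rough illustration of how the options above combine, the following sketch assembles a command line and prints it instead of running it. The file names genomeA.faa, genomeB.faa and genomeC.faa and the project name demo are hypothetical placeholders, and the option values simply repeat the documented defaults or are picked for illustration.

```shell
#!/bin/sh
# Hypothetical input files; replace with your own FASTA files.
FASTAS="genomeA.faa genomeB.faa genomeC.faa"

# Options taken from the list above; the values are illustrative only.
OPTS="-project=demo -cpus=4 -e=1e-05 -identity=25 -cov=50 -singles -graph"

# Assemble the full command and print it (dry run, nothing is executed).
CMD="proteinortho5 $OPTS $FASTAS"
echo "$CMD"

# To actually run it, uncomment the next line
# (requires proteinortho5 and blast to be installed):
# $CMD
```

Printing the command first makes it easy to check the option spelling before starting a long blast run.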
More specific blast parameters can be defined by
- -blastParameters='[parameters]' (e.g. -blastParameters='-seg no')
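Because the value of -blastParameters can contain spaces, quoting matters on the shell. The sketch below, using hypothetical input file names, stores the arguments and prints each one in brackets so that it is visible that '-seg no' stays part of a single argument. Nothing is executed.

```shell
#!/bin/sh
# Hypothetical input files; the quoted value keeps '-seg no' in one argument.
set -- proteinortho5 -p=blastp+ -blastParameters='-seg no' genomeA.faa genomeB.faa

# Print every argument on its own line, wrapped in brackets.
printf '[%s]\n' "$@"

# Keep the argument count and the blast argument for inspection.
NARGS=$#
BLAST_ARG=$3

# To actually run the command, uncomment the next line:
# "$@"
```

Without the quotes, the shell would split the value and pass `no` as a separate argument.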
To distribute jobs across several machines, use
- -startat= File number to start with (default: 0)
- -stopat= File number to end with (default: -1)
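One possible way to split the work is to give each machine a contiguous range of file numbers. The sketch below only prints the commands. It assumes four hypothetical input files numbered from 0, two machines, and that -stopat is an inclusive upper bound; none of these details are stated above, so check them against your installation. Such ranges may be most useful together with -step=2, but that pairing is also an assumption and is left out here.

```shell
#!/bin/sh
# Assumptions (not from the manual): files are numbered 0..NFILES-1
# and -stopat is treated as an inclusive upper bound.
FASTAS="g0.faa g1.faa g2.faa g3.faa"   # hypothetical input files
NFILES=4
NMACHINES=2

# Ceiling division: number of files handled per machine.
CHUNK=$(( (NFILES + NMACHINES - 1) / NMACHINES ))

PLAN=""
i=0
while [ "$i" -lt "$NMACHINES" ]; do
    START=$(( i * CHUNK ))
    STOP=$(( START + CHUNK - 1 ))
    # Clamp the last range to the final file number.
    if [ "$STOP" -ge "$NFILES" ]; then
        STOP=$(( NFILES - 1 ))
    fi
    # Skip machines that would receive no files.
    if [ "$START" -lt "$NFILES" ]; then
        LINE="proteinortho5 -project=demo -startat=$START -stopat=$STOP $FASTAS"
        echo "machine $i: $LINE"
        PLAN="$PLAN$LINE;"
    fi
    i=$(( i + 1 ))
done
```

Each printed line would then be started on the corresponding machine, with all machines sharing the same project prefix and input files.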