NAME¶
Bio::Tools::Run::Phylo::Phyml - Wrapper for rapid reconstruction of phylogenies
using Phyml
SYNOPSIS¶
use Bio::Tools::Run::Phylo::Phyml;
# Make a Phyml factory
$factory = Bio::Tools::Run::Phylo::Phyml->new(-verbose => 2);
# it defaults to protein alignment
# change parameters
$factory->model('Dayhoff');
# Pass the factory an alignment and run
$inputfilename = 't/data/protpars.phy';
$tree = $factory->run($inputfilename); # $tree is a Bio::Tree::Tree object.
# or set parameters at object creation
my %args = (
-data_type => 'dna',
-model => 'HKY',
-kappa => 4,
-invar => 'e',
-category_number => 4,
-alpha => 'e',
-tree => 'BIONJ',
-opt_topology => '0',
-opt_lengths => '1',
);
$factory = Bio::Tools::Run::Phylo::Phyml->new(%args);
# if you need the output files do
$factory->save_tempfiles(1);
$factory->tempdir($workdir);
# and get a Bio::Align::AlignI (SimpleAlign) object from somewhere
$tree = $factory->run($aln);
DESCRIPTION¶
This is a wrapper for running the phyml application by Stephane Guindon and
Olivier Gascuel. You can download it from:
http://atgc.lirmm.fr/phyml/
Installing¶
After downloading, you need to rename the copy of the program that runs under
your operating system, e.g. "phyml_linux" to "phyml".
You will need to help this Phyml wrapper to find the "phyml" program.
This can be done in (at least) three ways:
1. Make sure the Phyml executable is in your path. Copy it to, or create a
symbolic link from, a directory that is in your path.
2. Define an environment variable PHYMLDIR pointing to the directory that
contains the 'phyml' application. In bash:
  export PHYMLDIR=/home/username/phyml_v2.4.4/exe
In csh/tcsh:
  setenv PHYMLDIR /home/username/phyml_v2.4.4/exe
3. Define the environment variable PHYMLDIR in every script
that will use this Phyml wrapper module, e.g.:
  BEGIN { $ENV{PHYMLDIR} = '/home/username/phyml_v2.4.4/exe' }
  use Bio::Tools::Run::Phylo::Phyml;
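To check which executable would be picked up, the lookup described above can be sketched in plain Perl. This is only an illustration of the idea and not the wrapper's own search code; the helper name find_phyml is hypothetical, and the order (PHYMLDIR first, then PATH) is an assumption.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of the wrapper): look for an executable
# named 'phyml' first in $ENV{PHYMLDIR}, then in each directory on PATH.
sub find_phyml {
    my ($name) = @_;
    $name = 'phyml' unless defined $name;

    my @dirs;
    push @dirs, $ENV{PHYMLDIR}
        if defined $ENV{PHYMLDIR} && length $ENV{PHYMLDIR};
    push @dirs, File::Spec->path;    # directories listed in PATH

    for my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $name);
        # -x _ reuses the stat result of the preceding -f test
        return $candidate if -f $candidate && -x _;
    }
    return;    # nothing found
}

my $exe = find_phyml();
print defined $exe
    ? "Found phyml at $exe\n"
    : "phyml not found; set PHYMLDIR or extend PATH\n";
```

Running this before a longer analysis gives an early, readable failure if the program has not been renamed or linked as described.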
Running¶
This wrapper has been tested with PHYML v2.4.4 and v3.0.
In its current state, the wrapper supports only input of one MSA and output of
one tree. It can easily be extended to support more advanced capabilities of
"phyml".
Two convenience methods have been added on top of the standard BioPerl
WrapperBase ones: stats() and tree_string(). You can call them after running
the phyml program to retrieve, as strings, the run statistics and the tree in
Newick format.
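A minimal sketch of how the two convenience methods might be used together with run(). The helper name run_and_collect is hypothetical; it only calls run(), stats() and tree_string() as documented on this page and expects an already configured factory.

```perl
use strict;
use warnings;

# Hypothetical helper: run an already configured factory and gather the
# tree object plus both text outputs in one hash reference.
sub run_and_collect {
    my ($factory, $input) = @_;

    my $tree = $factory->run($input);    # Bio::Tree::Tree object

    return {
        tree   => $tree,
        stats  => scalar $factory->stats,          # run statistics as a string
        newick => scalar $factory->tree_string,    # tree in Newick format
    };
}

# Usage, assuming BioPerl-Run and phyml are installed:
#   use Bio::Tools::Run::Phylo::Phyml;
#   my $factory = Bio::Tools::Run::Phylo::Phyml->new(-data_type => 'protein');
#   my $result  = run_and_collect($factory, 't/data/protpars.phy');
#   print $result->{stats}, "\n", $result->{newick}, "\n";
```

Both strings are undef before run() has been called, so the helper calls run() first.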
FEEDBACK¶
Mailing Lists¶
User feedback is an integral part of the evolution of this and other Bioperl
modules. Send your comments and suggestions preferably to the Bioperl mailing
list. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support¶
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive
experts will be able to look at the problem and quickly address it. Please
include a thorough description of the problem with code and data examples if
at all possible.
Reporting Bugs¶
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs
and their resolution. Bug reports can be submitted via the web:
http://redmine.open-bio.org/projects/bioperl/
AUTHOR - Heikki Lehvaslaiho¶
heikki at bioperl dot org
APPENDIX¶
The rest of the documentation details each of the object methods. Internal
methods are usually preceded with an underscore (_).
new¶
Title : new
Usage : $factory = Bio::Tools::Run::Phylo::Phyml->new(@params)
Function: creates a new Phyml factory
Returns : Bio::Tools::Run::Phylo::Phyml
Args : Optionally, provide any of the following (default in []):
-data_type => 'dna' or 'protein', [protein]
-dataset_count => integer, [1]
-model => 'HKY'... , [HKY|JTT]
-kappa => 'e' or float, [e]
-invar => 'e' or float, [e]
-category_number => integer, [1]
-alpha => 'e' or float (int v3),[e]
-tree => 'BIONJ' or your own, [BIONJ]
-opt_topology => boolean [y]
-opt_lengths => boolean [y]
program_name¶
Title : program_name
Usage : $factory->program_name()
Function: holds the program name
Returns : string
Args : None
program_dir¶
Title : program_dir
Usage : $factory->program_dir(@params)
Function: returns the program directory, obtained from ENV variable.
Returns : string
Args : None
version¶
Title : version
Usage : exit if $prog->version < 1.8
Function: Determine the version number of the program
Example :
Returns : float or undef
Args : none
Phyml before 3.0 did not display the version. The wrapper assumes 2.44 when it
cannot determine the version.
run¶
Title : run
Usage : $factory->run($aln_file);
$factory->run($align_object);
Function: Runs Phyml to generate a tree
Returns : Bio::Tree::Tree object
Args : file name for your input alignment in a format
recognised by AlignIO, OR Bio::Align::AlignI
compliant object (eg. Bio::SimpleAlign).
stats¶
Title : stats
Usage : $factory->stats;
Function: Returns the contents of the phyml '_phyml_stat.txt' output file
Returns : string with statistics about the run, undef before run()
Args : none
tree_string¶
Title : tree_string
Usage : $factory->tree_string;
Function: Returns the contents of the phyml '_phyml_tree.txt' output file
Returns : string with tree in Newick format, undef before run()
Args : none
Getsetters¶
These methods are used to set and get program parameters before running.
data_type¶
Title : data_type
Usage : $phyml->data_type('nt');
Function: Sets sequence alphabet to 'dna' (nt in v3) or 'aa'
If left unset, will be set automatically
Returns : set value, defaults to 'protein'
Args : None to get, 'dna' ('nt') or 'aa' to set.
data_format¶
Title : data_format
Usage : $phyml->data_format('s');
Function: Sets PHYLIP format to 'i' interleaved or
's' sequential
Returns : set value, defaults to 'i'
Args : None to get, 'i' or 's' to set.
dataset_count¶
Title : dataset_count
Usage : $phyml->dataset_count(3);
Function: Sets the number of datasets to analyse
Returns : set value, defaults to 1
Args : None to get, positive integer to set.
model¶
Title : model
Usage : $phyml->model('HKY');
Function: Choose the substitution model to use. One of
JC69 | K2P | F81 | HKY | F84 | TN93 | GTR (DNA)
JTT | MtREV | Dayhoff | WAG (amino acids)
v3.0:
HKY85 (default) | JC69 | K80 | F81 | F84 |
TN93 | GTR (DNA)
WAG (default) | JTT | MtREV | Dayhoff | DCMut |
RtREV | CpREV | VT | Blosum62 | MtMam | MtArt |
HIVw | HIVb (amino acids)
Returns : Name of the model, defaults to {HKY|JTT}
Args : None to get, string to set.
kappa¶
Title : kappa
Usage : $phyml->kappa(4);
Function: Sets transition/transversion ratio, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
invar¶
Title : invar
Usage : $phyml->invar(.3);
Function: Sets proportion of invariable sites, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
category_number¶
Title : category_number
Usage : $phyml->category_number(4);
Function: Sets number of relative substitution rate categories
Returns : set value, defaults to 1
Args : None to get, integer to set.
alpha¶
Title : alpha
Usage : $phyml->alpha(1.0);
Function: Sets gamma distribution parameter, leave unset to estimate
Returns : set value, defaults to 'e'
Args : None to get, float or integer to set.
tree¶
Title : tree
Usage : $phyml->tree('/tmp/tree.nwk');
Function: Sets starting tree, leave unset to estimate a distance tree
Returns : set value, defaults to 'BIONJ'
Args : None to get, newick tree file name to set.
v2 options¶
These methods can be used with PhyML v2* only.
opt_topology¶
Title : opt_topology
Usage : $factory->opt_topology(1);
Function: Choose to optimise the tree topology
Returns : {y|n} (default y)
Args : None to get, boolean to set.
v2.* only
opt_lengths¶
Title : opt_lengths
Usage : $factory->opt_lengths(0);
Function: Choose to optimise branch lengths and rate parameters
Returns : {y|n} (default y)
Args : None to get, boolean to set.
v2.* only
v3 options¶
These methods can be used with PhyML v3* only.
freq¶
Title : freq
Usage : $phyml->freq('e'); $phyml->freq("0.2, 0.6, 0.6, 0.2");
Function: Sets nucleotide frequencies, or asks for residue frequencies to be
estimated according to one of two models: e or d
Returns : set value
Args : None to get, string to set.
v3 only.
opt¶
Title : opt
Usage : $factory->opt(1);
Function: Optimise tree parameters: tlr|tl|tr|l|n
Returns : {value|n} (default n)
Args : None to get, string to set.
v3.* only
search¶
Title : search
Usage : $factory->search('SPR');
Function: Tree topology search operation algorithm: NNI|SPR|BEST
Returns : string (defaults to NNI)
Args : None to get, string to set.
v3.* only
rand_start¶
Title : rand_start
Usage : $factory->rand_start(1);
Function: Sets the initial SPR tree to random.
Returns : boolean (defaults to false)
Args : None to get, boolean to set.
v3.* only; only meaningful if $prog->search is 'SPR'
rand_starts¶
Title : rand_starts
Usage : $factory->rand_starts(10);
Function: Sets the number of initial random SPR trees
Returns : integer (defaults to 1)
Args : None to get, integer to set.
v3.* only; only valid if $prog->search is 'SPR'
rand_seed¶
Title : rand_seed
Usage : $factory->rand_seed(1769876);
Function: Seeds the random number generator
Returns : random integer
Args : None to get, integer to set.
v3.* only; only valid if $prog->search is 'SPR'
Uses Perl's rand() to initialize the seed if it is not explicitly set.
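Because the v2 and v3 option sets are mutually exclusive, a script that must work with either release can branch on version(). The following is a rough sketch under stated assumptions: the helper name configure_search and the chosen values (five random starts, seed 12345) are hypothetical, and only accessors documented on this page are called.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the options that match the installed PhyML
# release. $factory is expected to be a Phyml wrapper object.
sub configure_search {
    my ($factory) = @_;

    my $version = $factory->version;
    $version = 2.44 unless defined $version;    # documented fallback value

    if ($version >= 3) {
        $factory->opt('tlr');            # one of tlr|tl|tr|l|n
        $factory->search('SPR');         # one of NNI|SPR|BEST
        $factory->rand_start(1);         # start SPR from a random tree
        $factory->rand_starts(5);        # illustrative number of random trees
        $factory->rand_seed(12345);      # fixed seed, chosen arbitrarily
    }
    else {
        $factory->opt_topology(1);       # v2: separate boolean switches
        $factory->opt_lengths(1);
    }
    return $version;
}

# Usage, assuming BioPerl-Run and phyml are installed:
#   my $factory = Bio::Tools::Run::Phylo::Phyml->new(-data_type => 'dna');
#   configure_search($factory);
#   my $tree = $factory->run($aln);
```

Setting rand_seed() explicitly is intended to make repeated SPR runs comparable; without it the seed comes from rand().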
Internal methods¶
These methods are private and should not be called outside this class.
_setparams¶
Title : _setparams
Usage : Internal function, not to be called directly
Function: Creates a string of params to be used in the command string
Returns : string of params
Args : none
_write_phylip_align_file¶
Title : _write_phylip_align_file
Usage : $obj->_write_phylip_align_file($aln)
Function: Internal (not to be used directly)
Writes the alignment into the tmp directory
in PHYLIP interleaved format
Returns : filename
Args : Bio::Align::AlignI