PEGASUS-PLAN(1)                                                PEGASUS-PLAN(1)
NAME
pegasus-plan - runs Pegasus to generate the executable workflow
SYNOPSIS
pegasus-plan [-v] [-q] [-V] [-h] [-Dprop=value...] [-b prefix] [--conf propsfile] [-c cachefile[,cachefile...]] [-C style[,style...]] [--dir dir] [--force] [--force-replan] [--inherited-rc-files file[,file...]] [-j prefix] [-n] [-o site] [-s site1[,site2...]] [--staging-site s1=ss1[,s2=ss2[..]]] [--randomdir[=dirname]] [--relative-dir dir] [--relative-submit-dir dir] -d daxfile
DESCRIPTION
The pegasus-plan command takes in as input the DAX and generates an executable
workflow, usually in the form of Condor submit files, which can be submitted
to an execution site for execution. The Pegasus Workflow Planner ensures that all
the data required for the execution of the executable workflow is transferred
to the execution site by adding transfer nodes at appropriate points in the
DAG. This is done by looking up an appropriate Replica Catalog to
determine the locations of the input files for the various jobs. At present
the default replica mechanism used is RLS.
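Conceptually, each replica catalog mapping associates an LFN with a PFN and a
site (pool) attribute. A hypothetical mapping, shown here only for illustration
and in the style of the cache file format described under the --cache option
below (the exact syntax depends on the replica catalog backend in use), might
look like:

    f.input gsiftp://storage.example.org/data/f.input pool=storage_site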
The Pegasus Workflow Planner also tries to reduce the workflow, unless specified
otherwise. This is done by deleting the jobs whose output files have been
found in some location in the Replica Catalog. At present no cost metrics are
used. However, preference is given to a location corresponding to the execution
site.
The planner can also add nodes to transfer all the materialized files to an
output site. The location on the output site is determined by looking up the
site catalog file, the path to which is picked up from the
pegasus.catalog.site.file property value.
executables
The planner looks up a Transformation Catalog
to discover locations of the executables referred to in the executable
workflow. Users can specify INSTALLED or STAGEABLE executables in the catalog.
Stageable executables can be used by Pegasus to stage executables to resources
where they are not pre-installed.
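As a sketch only (the transformation name, site handle and path below are
hypothetical, and the exact syntax depends on the transformation catalog
format configured), an INSTALLED entry in a text based catalog conceptually
records where an executable lives on a given site:

    tr example::preprocess:1.0 {
        site cluster1 {
            pfn "/usr/local/bin/preprocess"
            arch "x86_64"
            os "linux"
            type "INSTALLED"
        }
    }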
resources
The layout of the sites where Pegasus can
schedule jobs of a workflow is described in the Site Catalog. The planner
looks up the site catalog to determine for a site what directories a job can
be executed in, what servers to use for staging in and out data and what
jobmanagers (if applicable) can be used for submitting jobs.
OPTIONS
Any option will be displayed with its long option synonym(s).
-Dproperty=value
The -D option allows an experienced
user to override certain properties which influence the program execution,
among them the default location of the user’s properties file and the
PEGASUS home location. One may set several CLI properties by giving this
option multiple times. The -D option(s) must be the first option on the
command line. A CLI property takes precedence over the properties file property
of the same key.
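For example (the DAX filename and path below are hypothetical), a property
described in the PEGASUS PROPERTIES section can be overridden on the command
line, with the -D option placed first:

    pegasus-plan -Dpegasus.catalog.site.file=/home/user/run/sites.xml \
                 --dax diamond.dax --sites local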
-d file, --dax file
The DAX is the XML input file that describes
an abstract workflow. This is a mandatory option.
-b prefix, --basename prefix
The basename prefix to be used while
constructing per workflow files like the dagman file (.dag file) and other
workflow specific files that are created by Condor. Usually this prefix is
taken from the name attribute specified in the root element of the DAX
file.
-c file[,file,...], --cache file[,file,...]
A comma separated list of paths to replica
cache files that override the results from the replica catalog for a
particular LFN.
Each entry in the cache file describes an LFN, the corresponding PFN and the
associated attributes. The pool attribute should be specified for each entry.
To treat the cache files as supplemental replica catalogs set the property
pegasus.catalog.replica.cache.asrc to true. This results in the mapping
in the cache files to be merged with the mappings in the replica catalog.
Thus, for a particular LFN both the entries in the cache file and replica
catalog are available for replica selection.
Each line in the cache file is of the form:

    LFN_1 PFN_1 pool=[site handle 1]
    LFN_2 PFN_2 pool=[site handle 2]
    ...
    LFN_N PFN_N pool=[site handle N]

-C style[,style,...], --cluster style[,style,...]
Comma-separated list of clustering styles to
apply to the workflow. This mode of operation results in clustering of n
compute jobs into larger jobs to reduce remote scheduling overhead. You can
specify a list of clustering techniques to recursively apply them to the
workflow. For example, this allows you to cluster some jobs in the workflow
using horizontal clustering and then use label based clustering on the
intermediate workflow to do vertical clustering.
The clustered jobs can be run at the remote site, either sequentially or by
using MPI. This can be specified by setting the property
pegasus.job.aggregator. The property can be overridden by associating
the PEGASUS profile key collapser either with the transformation in the
transformation catalog or the execution site in the site catalog. The value
specified (to the property or the profile), is the logical name of the
transformation that is to be used for clustering jobs. Note that clustering
will only happen if the corresponding transformations are catalogued in the
transformation catalog.
PEGASUS ships with a clustering executable seqexec that can be found in
the $PEGASUS_HOME/bin directory. It runs the jobs in the clustered job
sequentially on the same node at the remote site.
In addition, an MPI wrapper mpiexec is distributed as source with
PEGASUS. It can be found in the $PEGASUS_HOME/src/tools/cluster
directory. The wrapper is run on every MPI node, with the first one being the
master and the rest of the ones as workers. The number of instances of
mpiexec that are invoked is equal to the value of the Globus RSL key
nodecount. The master distributes the smaller constituent jobs to the
workers. For example, if there were 10 jobs in the clustered job and
nodecount was 5, then one node acts as the master, and the 10 jobs are
distributed amongst the 4 slaves on demand. The master hands off a job to a
slave node as and when it gets free. So initially all 4 nodes are given a
single job each, and then as and when they get done they are handed more jobs
till all 10 jobs have been executed.
By default, seqexec is used for clustering jobs unless overridden in the
properties or by the pegasus profile key collapser.
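As a sketch, to have clustered jobs wrapped by the MPI based wrapper rather
than the default seqexec, one could set the property in the properties file,
or associate the profile key with an execution site in the site catalog:

    # properties file
    pegasus.job.aggregator = mpiexec

    # or, equivalently, a PEGASUS profile on the execution site in the
    # site catalog: namespace pegasus, key collapser, value mpiexec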
The following types of clustering styles are currently supported:
•
horizontal is the style of clustering in which jobs on the same level are
aggregated into larger jobs. A level of the workflow is defined as the
greatest distance of a node, from the root of the workflow. Clustering occurs
only on jobs of the same type, i.e. they refer to the same logical
transformation in the transformation catalog.
Horizontal Clustering can operate in one of two modes (a profile sketch
follows this list of styles):
1. Job count based.
The granularity of clustering can be specified by associating either the PEGASUS
profile key clusters.size or the PEGASUS profile key
clusters.num with the transformation.
The clusters.size key indicates how many jobs need to be clustered into
the larger clustered job. The clusters.num key indicates how many clustered
jobs are to be created for a particular level at a particular execution site.
If both keys are specified for a particular transformation, then the
clusters.num key value is used to determine the clustering granularity.
2. Runtime based.
To cluster jobs according to runtimes, the user needs to set one property and
two profile keys. The property pegasus.clusterer.preference must be set to the
value runtime. In addition, the user needs to specify two Pegasus profiles:
a. clusters.maxruntime, which specifies the maximum duration for which the
clustered job should run. b. job.runtime, which specifies the duration for
which the job that the profile key is associated with runs. Ideally,
clusters.maxruntime should be set in the transformation catalog and job.runtime
should be set for each job individually.
•
label is the style of clustering in which you can label the jobs in your
workflow. The jobs with the same label are put in the same clustered job. This
allows you to aggregate jobs across levels, or in a manner that is best suited
to your application.
To label the workflow, you need to associate PEGASUS profiles with the jobs in
the DAX. The profile key to use for labeling the workflow can be set by the
property pegasus.clusterer.label.key. It defaults to label, meaning if
you have a PEGASUS profile key label with jobs, the jobs with the same value
for the pegasus profile key label will go into the same clustered job.
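As an illustration only (the job identifiers and values below are hypothetical,
and either clusters.size or label would normally be used, not both), such
PEGASUS profiles are attached to jobs in the DAX like this:

    <job id="ID0000001" namespace="example" name="preprocess" version="1.0">
      <!-- horizontal clustering: collapse 20 jobs of this transformation per level -->
      <profile namespace="pegasus" key="clusters.size">20</profile>
      <!-- label based clustering: jobs sharing this label value are clustered together -->
      <profile namespace="pegasus" key="label">cluster_1</profile>
    </job>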
--conf propfile
The path to the properties file that contains the
properties the planner needs to use while planning the workflow.
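For illustration, a minimal properties file passed via --conf might set a few
of the properties described in the PEGASUS PROPERTIES section below (all paths
and the RLS contact string here are hypothetical):

    # pegasus.properties - a sketch only, adjust for your installation
    pegasus.catalog.site                = XML3
    pegasus.catalog.site.file           = /home/user/run/sites.xml
    pegasus.catalog.transformation      = Text
    pegasus.catalog.transformation.file = /home/user/run/tc.data
    pegasus.catalog.replica             = RLS
    pegasus.catalog.replica.url         = rls://rls.example.org
    pegasus.data.configuration          = sharedfs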
--dir dir
The base directory where you want the output
of the Pegasus Workflow Planner, usually Condor submit files, to be generated.
Pegasus creates a directory structure in this base directory on the basis of
username, VO Group and the label of the workflow in the DAX.
By default the base directory is the directory from which one runs the
pegasus-plan command.
-f, --force
This bypasses the reduction phase in which the
abstract DAG is reduced, on the basis of the locations of the output files
returned by the replica catalog. This is analogous to a make style
generation of the executable workflow.
--force-replan
By default, for hierarchical workflows if a DAX
job fails, then on job retry the rescue DAG of the associated workflow is
submitted. This option causes Pegasus to replan the DAX job in case of failure
instead.
-g, --group
The VO Group to which the user
belongs.
-h, --help
Displays all the options to the
pegasus-plan command.
--inherited-rc-files file[,file,...]
A comma separated list of paths to replica
files. Locations mentioned in these have a lower priority than the locations
in the DAX file. This option is usually used internally for hierarchical
workflows, where the file locations mentioned in the parent (encompassing)
workflow DAX are passed to the sub workflows corresponding to the DAX
jobs.
-j prefix, --job-prefix prefix
The job prefix to be applied for constructing
the filenames for the job submit files.
-n, --nocleanup
This results in the generation of a separate
cleanup workflow that removes the directories created during the execution of
the executable workflow. The cleanup workflow is to be submitted after the
executable workflow has finished.
If this option is not specified, then Pegasus adds cleanup nodes to the
executable workflow itself that cleanup files on the remote sites when they
are no longer required.
-o site, --output site
The output site where all the materialized
data is transferred to.
By default the materialized data remains in the working directory on the
execution site where it was created. Only those output files are
transferred to an output site for which the transfer attribute is set to true
in the DAX.
-q, --quiet
Decreases the logging level.
-r[dirname], --randomdir[=dirname]
Pegasus Workflow Planner adds create directory
jobs to the executable workflow that create a directory in which all jobs for
that workflow execute on a particular site. The directory created is in the
working directory (specified in the site catalog with each site).
By default, Pegasus duplicates the relative directory structure on the submit
host on the remote site. The user can specify this option without arguments to
create a random timestamp based name for the execution directories that are
created by the create dir jobs. The user can specify the optional argument
to this option to specify the basename of the directory that is to be created.
The create dir jobs refer to the dirmanager executable that is shipped as
part of the PEGASUS worker package. The transformation catalog is searched for
the transformation named pegasus::dirmanager for all the remote sites
where the workflow has been scheduled. Pegasus can create a default path for
the dirmanager executable, if the PEGASUS_HOME environment variable is
associated with the sites in the site catalog as an environment profile.
--relative-dir dir
The directory relative to the base directory
where the executable workflow is to be generated and executed. This overrides
the default directory structure that Pegasus creates based on username, VO
Group and the DAX label.
--relative-submit-dir dir
The directory relative to the base directory
where the executable workflow is to be generated. This overrides the default
directory structure that Pegasus creates based on username, VO Group and the
DAX label. By specifying both --relative-dir and
--relative-submit-dir you can have a different relative execution
directory on the remote site and a different relative submit directory on the
submit host.
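As a sketch (the DAX filename, directories and site handle below are
hypothetical), the directory related options are typically combined on the
command line as follows:

    pegasus-plan --dax diamond.dax \
                 --dir /scratch/user/pegasus \
                 --relative-dir run0001 \
                 --relative-submit-dir submit/run0001 \
                 --sites cluster1 -o local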
-s site[,site,...], --sites site[,site,...]
A comma separated list of execution sites on
which the workflow is to be executed. Each of the sites should have an entry
in the site catalog, that is being used. To run on the submit host, specify
the execution site as local.
In case this option is not specified, all the sites in the site catalog are
picked up as candidates for running the workflow.
--staging-site s1=ss1[,s2=ss2[..]]
A comma separated list of key=value pairs,
where the key is the execution site and value is the staging site for that
execution site.
In case of running on a shared filesystem, the staging site is automatically
associated by the planner to be the execution site. If only a value is
specified, then that is taken to be the staging site for all the execution
sites. e.g. --staging-site local means that the planner will use the
local site as the staging site for all jobs in the workflow.
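For illustration (both site handles below are hypothetical), an execution site
can be paired with a dedicated staging site like this:

    # run jobs on site condor_pool, stage data through site staging_server
    pegasus-plan --dax diamond.dax --sites condor_pool \
                 --staging-site condor_pool=staging_server -o local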
-S, --submit
Submits the generated executable
workflow using the pegasus-run script in the $PEGASUS_HOME/bin directory.
By default, the Pegasus Workflow Planner only generates the Condor submit
files and does not submit them.
-v, --verbose
Increases the verbosity of messages about what
is going on. By default, all FATAL, ERROR, CONSOLE and WARN messages are
logged. The logging hierarchy is as follows:
1. FATAL
2. ERROR
3. CONSOLE
4. WARN
5. INFO
6. CONFIG
7. DEBUG
8. TRACE
For example, to see the INFO, CONFIG and DEBUG messages additionally, set
-vvv.
-V, --version
Displays the current version number of the
Pegasus Workflow Management System.
RETURN VALUE
If the Pegasus Workflow Planner is able to generate an executable workflow
successfully, the exitcode will be 0. All runtime errors result in an exitcode
of 1. This is usually the case when you have misconfigured your catalogs etc.
In the case of an error occurring while loading a specific module
implementation at run time, the exitcode will be 2. This is usually due to
factory methods failing while loading a module. In case of any other error
occurring during the running of the command, the exitcode will be 1. In most
cases, the error message logged should give a clear indication as to where
things went wrong.
PEGASUS PROPERTIES
This is not an exhaustive list of properties used. For the complete description
and list of properties refer to $PEGASUS_HOME/doc/advanced-properties.pdf.
pegasus.selector.site
Identifies what type of site selector you want
to use. If not specified the default value of Random is used. Other
supported modes are RoundRobin and NonJavaCallout that calls out
to an external site selector.
pegasus.catalog.replica
Specifies the type of replica catalog to be
used.
If not specified, then the value defaults to RLS.
pegasus.catalog.replica.url
Contact string to access the replica catalog.
In case of RLS it is the RLI url.
pegasus.dir.exec
A suffix to the workdir in the site catalog to
determine the current working directory. If relative, the value will be
appended to the working directory from the site.config file. If absolute it
constitutes the working directory.
pegasus.catalog.transformation
Specifies the type of transformation catalog
to be used. One can use either a file based or a database based transformation
catalog. At present the default is Text.
pegasus.catalog.transformation.file
The location of file to use as transformation
catalog.
If not specified, then the default location of $PEGASUS_HOME/var/tc.data is
used.
pegasus.catalog.site
Specifies the type of site catalog to be used.
One can use either a text based or an xml based site catalog. At present the
default is XML3.
pegasus.catalog.site.file
The location of file to use as a site catalog.
If not specified, then default value of $PEGASUS_HOME/etc/sites.xml is used in
case of the xml based site catalog and $PEGASUS_HOME/etc/sites.txt in case of
the text based site catalog.
pegasus.data.configuration
This property sets up Pegasus to run in
different environments. This can be set to:
sharedfs If this is set, Pegasus will be set up to execute jobs on the
shared filesystem on the execution site. This assumes that the head node of a
cluster and the worker nodes share a filesystem. The staging site in this case
is the same as the execution site.
nonsharedfs If this is set, Pegasus will be set up to execute jobs on an
execution site without relying on a shared filesystem between the head node
and the worker nodes.
condorio If this is set, Pegasus will be set up to run jobs in a pure
condor pool, with the nodes not sharing a filesystem. Data is staged to the
compute nodes from the submit host using Condor File IO.
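For example, to run in a pure Condor pool where data is staged to the compute
nodes with Condor File IO, one would set (sketch):

    pegasus.data.configuration = condorio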
pegasus.code.generator
The code generator to use. By default, Condor
submit files are generated for the executable workflow. Setting to
Shell results in Pegasus generating a shell script that can be executed
on the submit host.
FILES
$PEGASUS_HOME/etc/dax-3.3.xsd
is the suggested location of the latest DAX
schema to produce DAX output.
$PEGASUS_HOME/etc/sc-3.0.xsd
is the suggested location of the latest Site
Catalog schema that is used to create the XML3 version of the site
catalog
$PEGASUS_HOME/etc/tc.data.text
is the suggested location for the file
corresponding to the Transformation Catalog.
$PEGASUS_HOME/etc/sites.xml3 | $PEGASUS_HOME/etc/sites.xml
is the suggested location for the file
containing the site information.
$PEGASUS_HOME/lib/pegasus.jar
contains all compiled Java bytecode to run the
Pegasus Workflow Planner.
SEE ALSO
pegasus-sc-client(1), pegasus-tc-client(1), pegasus-rc-client(1)
AUTHORS
Karan Vahi <vahi at isi dot edu>
05/24/2012