.\" This manpage has been automatically generated by docbook2man 
.\" from a DocBook document.  This tool can be found at:
.\" <http://shell.ipoline.com/~elmert/comp/docbook2X/> 
.\" Please send any bug reports, improvements, comments, patches, 
.\" etc. to Steve Cheng <steve@ggi-project.org>.
.TH "TRICENSUS-MPI" "1" "14 December 2016" "" "The Regina Handbook"

.SH NAME
tricensus-mpi \- Distribute a triangulation census amongst several machines using MPI
.SH SYNOPSIS

\fBtricensus-mpi\fR [ \fB-D, --depth=\fIlevels\fB\fR ] [ \fB-x, --dryrun\fR ] [ \fB-2, --dim2\fR | \fB-4, --dim4\fR ] [ \fB-o, --orientable\fR | \fB-n, --nonorientable\fR ] [ \fB-f, --finite\fR | \fB-d, --ideal\fR ] [ \fB-m, --minimal\fR | \fB-M, --minprime\fR | \fB-N, --minprimep2\fR | \fB-h, --minhyp\fR ] [ \fB-s, --sigs\fR ] \fB\fIpairs-file\fB\fR \fB\fIoutput-file-prefix\fB\fR

.SH "CAUTION"
.PP
The MPI utilities in Regina are deprecated, and will be removed from
Regina in a future release.
If you wish to parallelise the generation of a census, we recommend
splitting the input pairing file into chunks and running each chunk as a
separate job under a standard queue system (such as PBS).
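.PP
As a rough sketch of the chunking step (this is not part of Regina),
the standard \fBsplit\fR utility can cut a pairing file into pieces
with a fixed number of lines each.
The helper name \fBsplit_pairs\fR, the chunk size and the chunk prefix
are illustrative assumptions; how each chunk is then submitted as a
census job depends on your queue system and is not shown here.

.nf
```shell
# split_pairs FILE LINES PREFIX
# Hypothetical helper: writes PREFIXaa, PREFIXab, ... with at most
# LINES facet pairings (one per line) in each chunk.
split_pairs() {
    split -l "$2" "$1" "$3"
}
```
.fi
.PP
For instance, \fBsplit_pairs 6.pairs 10 6.pairs.chunk.\fR would cut a
file of 97 pairings into ten chunks, the last holding the remaining
seven pairings.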
.SH "DESCRIPTION"
.PP
Allows multiple processes, possibly running on a cluster of
different machines, to
collaborate in forming a census of 2-, 3- or 4-manifold triangulations.
Coordination is done through MPI (the Message Passing Interface),
and the entire census is run as a single MPI job.
This program is well suited for high-performance clusters.
.PP
The default behaviour is to enumerate 3-manifold triangulations.
If you wish to enumerate 2-manifold or 4-manifold triangulations instead,
you must pass \fB--dim2\fR or \fB--dim4\fR
respectively.
.PP
To prepare a census for distribution amongst several processes or
machines, the census must be split into smaller pieces.
Running \fBtricensus\fR
with option \fB--genpairs\fR (which is very fast) will create
a list of facet pairings
(e.g., tetrahedron face pairings for 3-manifold triangulations,
triangle edge pairings for 2-manifold triangulations, and so on).
Each facet pairing must be analysed in order to complete the census.
.PP
The full list of facet pairings should be stored in a single file,
which is passed on the command-line as
\fIpairs-file\fR\&.
This file must contain one facet pairing per line, and each of these
facet pairings must be in canonical form (i.e., must be a
minimal representative of its isomorphism class).  The facet
pairings generated by
\fBtricensus
--genpairs\fR are guaranteed to satisfy these conditions.
.PP
The \fBtricensus-mpi\fR utility has two modes of
operation: default mode, and subsearch mode.  These are explained
separately under MODES OF OPERATION below.
.PP
In both modes, one MPI process acts as the controller and the remaining
processes all act as slaves.  The controller reads the list of facet
pairings from \fIpairs-file\fR, constructs a
series of tasks based on these, and farms these tasks
out to the slaves for processing.  Each slave processes one task
at a time, asking the controller for a new task when it is finished
with the previous one.
.PP
At the end of each task, if any triangulations were found then
the slave responsible will save these triangulations to an output file.
The output file will have a name of the form
\fIoutput-file-prefix\fR_\fIp\fR\&.rga
in default mode or
\fIoutput-file-prefix\fR_\fIp\fR-\fIs\fR\&.rga
in subsearch mode.
Here \fIoutput-file-prefix\fR is passed on the
command line, \fIp\fR is the number
of the facet pairing being processed, and \fIs\fR
is the number of the subsearch within that facet pairing
(both facet pairings and subsearches are numbered from 1 upwards).
If no triangulations were found then the slave will not write
any output file at all.
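.PP
The naming scheme above can be sketched as a small shell function,
which may help when scripting around the output files.
The function name \fBoutput_name\fR is a hypothetical helper and
not something provided by Regina.

.nf
```shell
# output_name PREFIX P [S]
# Hypothetical helper: prints PREFIX_P.rga (default mode), or
# PREFIX_P-S.rga (subsearch mode) when a subsearch number S is given.
output_name() {
    if [ -n "${3:-}" ]; then
        echo "${1}_${2}-${3}.rga"
    else
        echo "${1}_${2}.rga"
    fi
}
```
.fi
.PP
Here \fBoutput_name 6-nor 86\fR prints the default mode name
for facet pairing 86, and \fBoutput_name 6-nor 86 3\fR prints the name
for subsearch 3 within that pairing.
Remember that a file only exists if the task found at least one
triangulation.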
.PP
The controller and slave
processes all take the same \fBtricensus-mpi\fR
options (excluding MPI-specific options, which are generally supplied
by an MPI wrapper program such as \fBmpirun\fR or
\fBmpiexec\fR).
The different roles of the processes are determined solely by their
MPI process rank (the controller is always the process with rank 0).
It should therefore be possible to start all MPI processes by
running a single command, as illustrated in the examples below.
.PP
As the census progresses, the controller keeps a detailed log of each
slave's activities, including how long each slave task has taken and how
many triangulations have been found.  This log is written to the file
\fIoutput-file-prefix\&.log\fR\&.
The utility
\fBtricensus-mpi-status\fR
can parse this log and produce a shorter human-readable summary.
.sp
.RS
.B "Important:"
It is \fBhighly recommended\fR
that you use the \fB--sigs\fR option.  This will keep
output files small, and will significantly reduce the memory footprint
of \fBtricensus-mpi\fR itself.
.RE
.SH "MODES OF OPERATION"
.PP
As discussed above, there are two basic modes of operation.
These are default mode (used when \fB--depth\fR is not
passed), and subsearch mode (used when \fB--depth\fR is
passed).
.TP 0.2i
\(bu
In \fBdefault mode\fR, the controller simply
reads the list of facet pairings and gives each pairing
to a slave for processing, one after another.
.TP 0.2i
\(bu
In \fBsubsearch mode\fR, more work is pushed to
the controller and the slave tasks are shorter.  Here the
controller reads one facet pairing at a time and begins processing
that facet pairing.  A fixed depth is supplied in the argument
\fB--depth\fR; each time that depth is reached in the
search tree, the
subsearch from that point on is given as a task to the next idle slave.
Meanwhile the controller backtracks (as though the subsearch had
finished) and continues, farming the next subsearch out when
the given depth is reached again, and so on.
.PP
The modes can be visualised as follows.
For each facet pairing, consider the corresponding recursive search
as a large search tree.  In default mode, the entire tree is
processed at once as a single slave task.  In subsearch mode, each
subtree rooted at the given depth is processed as a separate slave
task (and all processing between the root and the given depth is
done by the controller).
.PP
The main difference between the two modes of operation is
the length of the slave tasks, which can have a variety of effects.
.TP 0.2i
\(bu
In default mode each slave task covers an entire facet pairing,
and so can be very long.
This means the parallelisation can become very poor towards the
end of the census, with some slaves sitting idle for
a long time as they wait for the remaining slaves to finish.
.TP 0.2i
\(bu
As we move to subsearch mode with increasing depth, the slave
tasks become shorter and the slaves' finish times will be closer
together (thus avoiding the idle slave inefficiency described above).
Moreover, with a more refined subsearch,
the progress information stored in the log will be more detailed,
giving a better idea of how long the census has to go.  On the
other hand, more work is pushed to the single-process controller
(risking a bottleneck if the depth is too great, with slaves now
sitting idle as they wait for new tasks).  In addition the MPI overhead
is greater, and the number of output files can become extremely large.
.PP
In the end, experimentation is the best way to decide whether to run
in subsearch mode and at what depth.  Be aware of the option
\fB--dryrun\fR, which can give a quick overview of the
search space (and in particular, show how many subsearches are
required for each facet pairing at any given depth).
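.PP
One way to organise this experimentation is to generate a dry-run
command for each candidate depth, review the commands, and then run them.
The sketch below only prints the command lines.
The helper name \fBdryrun_commands\fR and the \fIdry-d\fR output
prefixes are assumptions, and the \fBmpirun\fR wrapper, process count
and census options are simply copied from the EXAMPLES section;
adjust all of these for your own MPI installation and census.

.nf
```shell
# dryrun_commands PAIRS DEPTH...
# Hypothetical helper: prints one dry-run command line per depth.
# The wrapper, process count and -Nnf options are taken from the
# EXAMPLES section and are assumptions about your setup.
dryrun_commands() {
    pairs=$1
    shift
    for depth in "$@"; do
        echo "mpirun -np 10 /usr/bin/tricensus-mpi -Nnf --depth=$depth --dryrun $pairs dry-d$depth"
    done
}
```
.fi
.PP
For example, \fBdryrun_commands 6.pairs 1 2 3\fR prints three commands.
After running them, each log (\fIdry-d1.log\fR and so on) can be passed to
\fBtricensus-mpi-status\fR to compare how the facet pairings
divide into subsearches at each depth.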
.SH "OPTIONS"
.PP
The census options accepted by \fBtricensus-mpi\fR
are identical to the options for \fBtricensus\fR\&.
See the
\fBtricensus\fR reference
for details.
.PP
Some options from \fBtricensus\fR are not
available here (e.g., tetrahedra and boundary options), since these must
be supplied earlier on when generating the initial list of facet pairings.
.PP
There are new options specific to \fBtricensus-mpi\fR,
which are as follows.
.TP
\fB-D, --depth=\fIlevels\fB\fR
Indicates that subsearch mode should be used (instead of default
mode).  The argument \fIlevels\fR specifies
at what depth in the search tree processing should pass from the
controller to a new slave task.

The given depth must be strictly positive (running at depth zero
is equivalent to running in default mode).

See the MODES OF OPERATION section above for further information, as well
as hints on choosing a good value for \fIlevels\fR\&.
.TP
\fB-x, --dryrun\fR
Specifies that a fast dry run should be performed, instead of a
full census.

In a dry run, each time a slave accepts a task it
will immediately mark it as finished with no triangulations found.
The behaviour of the controller remains unchanged.

The result will be an empty census.  The benefit of a dry run is
the log file it produces, which will show precisely how facet pairings
would be divided into subsearches in a real census run.
In particular, the log file will show how
many subsearches each facet pairing produces (the utility
\fBtricensus-mpi-status\fR
can help extract this information from the log).

At small subsearch depths, a dry run should be extremely fast.
As the depth increases, however, the dry run will become
slower due to the extra work given to the controller.

This option is only useful in subsearch mode (it can be used in
default mode, but the results are uninteresting).
See the MODES OF OPERATION section above for further details.
.SH "EXAMPLES"
.PP
Suppose we wish to form a census of all 6-tetrahedron closed
non-orientable 3-manifold triangulations, optimised for
prime minimal P2-irreducible triangulations (so some
non-prime, non-minimal or non-P2-irreducible triangulations may be omitted).
.PP
We begin by using \fBtricensus\fR to generate a full
list of face pairings.

.nf
    example$ \fBtricensus --genpairs -t 6 -i > 6.pairs\fR
    Total face pairings: 97
    example$
.fi
.PP
We now use \fBtricensus-mpi\fR to run the distributed
census.  A wrapper program such as \fBmpirun\fR
or \fBmpiexec\fR can generally
be used to start the MPI processes, though this depends on your
specific MPI implementation.  The following command runs a distributed
census on 10 processors using the MPICH implementation of MPI\&.

.nf
    example$ \fBmpirun -np 10 /usr/bin/tricensus-mpi -Nnf 6.pairs 6-nor\fR
    example$
.fi
.PP
The current state of processing is kept in the controller log
\fI6-nor.log\fR\&.  You can watch this log with the help of
\fBtricensus-mpi-status\fR\&.

.nf
    example$ \fBtricensus-mpi-status 6-nor.log\fR
    Pairing 1: done, 0 found
    ...
    Pairing 85: done, 0 found
    Pairing 86: done, 7 found
    Pairing 87: running
    Pairing 88: running
    Still running, 15 found, last activity: Wed Jun 10 05:57:34 2009
    example$
.fi
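.PP
To list only the facet pairings that have produced triangulations, the
summary can be filtered with a short shell function.
This is a sketch that assumes the line format shown in the sample
output above ("Pairing N: done, K found"); the real output may differ
between versions, and the name \fBfound_pairings\fR is hypothetical.

.nf
```shell
# found_pairings
# Hypothetical helper: reads tricensus-mpi-status output on stdin and
# prints "N K" for each finished pairing N with K > 0 triangulations.
# Assumes lines of the form "Pairing N: done, K found".
found_pairings() {
    while read -r word num state count rest; do
        [ "$word" = "Pairing" ] || continue
        [ "$state" = "done," ] || continue
        [ "$count" -gt 0 ] 2>/dev/null || continue
        echo "${num%:} $count"
    done
}
```
.fi
.PP
A typical use would be
\fBtricensus-mpi-status 6-nor.log | found_pairings\fR\&.
In default mode, each pairing number \fIp\fR that is listed corresponds
to an output file \fI6-nor_p.rga\fR\&.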
.PP
Once the census is finished, the resulting triangulations will be
saved in files such as
\fI6-nor_8.rga\fR,
\fI6-nor_86.rga\fR and so on.
.SH "MACOS\\~X AND WINDOWS USERS"
.PP
This utility is not shipped with the drag-and-drop app bundle for
\fBMacOS\~X\fR or with the \fBWindows\fR installer.
.SH "SEE ALSO"
.PP
censuslookup,
regconcat,
sigcensus,
tricensus,
tricensus-mpi-status,
regina-gui\&.
.SH "AUTHOR"
.PP
This utility was written by Benjamin Burton
<bab@maths.uq.edu.au>\&.
Many people have been involved in the development
of Regina; see the users' handbook for a full list of credits.