sand_filter_master - filter sequences for alignment in parallel
sand_filter_master [options] sequences.cfa candidates.cand
sand_filter_master is the first step in the SAND assembler. It reads in a
body of sequences, and uses a linear-time algorithm to produce a list of
candidate sequences to be aligned in detail by sand_align_master(1).
This program uses the Work Queue system to distribute tasks among
processors. After starting sand_filter_master, you must start a
number of work_queue_worker(1) processes on remote machines. The
workers will then connect back to the master process and begin executing
tasks. The actual filtering is performed by sand_filter_kernel(1) on
- -p <port>
- Port number for queue master to listen on. (default: 9123)
- -s <size>
- Number of sequences in each filtering task. (default: 1000)
- -r <file>
- A meryl file of repeat mers to be filtered out.
- -R <n>
- Automatically retry failed jobs up to n times. (default: 100)
- -k <size>
- The k-mer size to use in candidate selection. (default: 22)
- -w <size>
- The minimizer window size. (default: 22)
- If set, do not unlink temporary binary output files.
- -c <file>
- Checkpoint filename; will be created if necessary.
- -d <flag>
- Enable debugging for this subsystem. (Try -d all to start.)
- -F <number>
- Work Queue fast abort multiplier. (default: 10)
- -Z <file>
- Select port at random and write it out to this file.
- -o <file>
- Send debugging output to this file.
- Show version string
- Show this help screen
On success, returns zero. On failure, returns non-zero.
If you begin with a FASTA-formatted file of reads, use
sand_compress_reads(1) to produce a compressed FASTA (cfa) file. To run
filtering sequentially, start a single work_queue_worker(1) process in
the background. Then, invoke sand_filter_master.
% sand_compress_reads mydata.fasta mydata.cfa
% work_queue_worker localhost 9123 &
% sand_filter_master mydata.cfa mydata.cand
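A variant of the example above can be sketched as a small script that lets the master pick its port with -Z instead of relying on the default. This is a minimal sketch, not a tested recipe. The file names mydata.port and mydata.ckpt and the helper functions master_cmd and worker_cmd are hypothetical, and the script assumes that the file written by -Z contains only the port number. If sand_filter_master is not installed, the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: start the master on a random port (-Z) and attach one local worker.
# Assumptions: file names are placeholders, and the -Z file is assumed to
# hold nothing but the port number.

PORT_FILE=mydata.port

# Build the master command line from the documented options -Z, -s and -c.
master_cmd() {
    # $1 = input .cfa file, $2 = output .cand file
    echo "sand_filter_master -Z $PORT_FILE -s 500 -c mydata.ckpt $1 $2"
}

# Build the worker command line for a given port.
worker_cmd() {
    echo "work_queue_worker localhost $1"
}

if command -v sand_filter_master >/dev/null 2>&1 && [ -f mydata.cfa ]; then
    rm -f "$PORT_FILE"                      # discard a stale port file
    $(master_cmd mydata.cfa mydata.cand) &  # start the master in the background
    master_pid=$!
    # Wait until the master has written its port.
    while [ ! -s "$PORT_FILE" ]; do sleep 1; done
    $(worker_cmd "$(cat "$PORT_FILE")") &   # attach one local worker
    wait "$master_pid"                      # block until filtering finishes
else
    # Tools or input not available: show what would be run.
    master_cmd mydata.cfa mydata.cand
    worker_cmd "<port>"
fi
```

Writing the port to a file avoids collisions when several masters share one machine, at the cost of a short polling loop before the worker can start.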
To speed up the process, run more work_queue_worker(1)
processes on other machines, or use condor_submit_workers(1) or
sge_submit_workers(1) to start hundreds of workers in your local
The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and
Copyright (C) 2005-2015 The University of Notre Dame. This software is
distributed under the GNU General Public License. See the file COPYING for