sand_filter_kernel - filter read sequences sequentially
sand_filter_kernel [options] <sequence file> [second sequence file]
sand_filter_kernel filters a list of genomic sequences and produces a
list of candidate pairs for more detailed alignment. It is not normally called
by the user, but is invoked by sand_filter_master(1) for each
sequential step of a distributed alignment workload.
If one sequence file is given, sand_filter_kernel looks for
similarities between all sequences in that file. If two files are given, it
looks for similarities between sequences in the first file and sequences in
the second file. The output is a list of candidate pairs, giving the names of
the two candidate sequences and a starting position for alignment.
- -s <size>
- Size of "rectangle" for filtering. You can determine the size
dynamically by passing in d rather than a number.
- -r <file>
- A meryl file of repeat mers to be ignored.
- -k <size>
- The k-mer size to use in candidate selection (default is 22).
- -w <number>
- The minimizer window size to use in candidate selection (default is
- -o <filename>
- The output file. Default is stdout.
- -d <subsystem>
- Enable debug messages for this subsystem. Try -d all to start.
- Show version string.
- Show help screen.
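To make the -k and -w options more concrete, the following Python sketch shows one common way minimizers are chosen: every window of w consecutive k-mers contributes its lexicographically smallest k-mer, and two sequences become a candidate pair when they share a minimizer. This is a generic illustration only. The function names, the window definition, and the k-mer ordering are assumptions, and the actual selection rules inside sand_filter_kernel may differ.

```python
def minimizers(seq, k, w):
    """Return the set of (position, k-mer) minimizers of seq.

    Assumption: a window is w consecutive k-mers, and the minimizer is
    the lexicographically smallest k-mer in the window (leftmost on ties).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(max(len(kmers) - w + 1, 0)):
        window = kmers[start:start + w]
        # Index of the smallest k-mer within this window.
        offset = min(range(len(window)), key=lambda i: window[i])
        picked.add((start + offset, window[offset]))
    return picked


def shared_minimizers(seq_a, seq_b, k, w):
    """Return the k-mers that are minimizers of both sequences."""
    mers_a = {mer for _, mer in minimizers(seq_a, k, w)}
    mers_b = {mer for _, mer in minimizers(seq_b, k, w)}
    return mers_a & mers_b
```

In this sketch, a larger window selects fewer minimizers per sequence, and a larger k makes a chance match between unrelated sequences less likely.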
On success, returns zero. On failure, returns non-zero.
Users do not normally invoke sand_filter_kernel directly. Instead,
options such as the k-mer size, minimizer window, and repeat file are
passed as the same arguments to sand_filter_master(1). For
example, to run a filter with a k-mer size of 20, window size of 24, and
repeat file of mydata.repeats:
% sand_filter_master -k 20 -w 24 -r mydata.repeats mydata.cfa mydata.cand
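When the filter is driven from a script, the same command line can be assembled programmatically. The Python sketch below rebuilds the invocation shown above and treats a zero exit status as success. The helper names are hypothetical, and it is an assumption that sand_filter_master follows the same zero-on-success convention documented here for sand_filter_kernel.

```python
import subprocess


def build_filter_command(kmer_size, window_size, repeat_file,
                         sequences, candidates):
    """Assemble the sand_filter_master argument list (hypothetical helper)."""
    return [
        "sand_filter_master",
        "-k", str(kmer_size),    # k-mer size
        "-w", str(window_size),  # minimizer window size
        "-r", repeat_file,       # repeat mers to ignore
        sequences,               # input sequence file
        candidates,              # output candidate file
    ]


def run_filter(command):
    """Run the command and report success (assumes zero means success)."""
    result = subprocess.run(command)
    return result.returncode == 0
```

For the example above, build_filter_command(20, 24, "mydata.repeats", "mydata.cfa", "mydata.cand") yields the same arguments as the shell command.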
The Cooperative Computing Tools are Copyright (C) 2003-2004 Douglas Thain and
Copyright (C) 2005-2015 The University of Notre Dame. This software is
distributed under the GNU General Public License. See the file COPYING for