
clm close(1) USER COMMANDS clm close(1)

NAME


clm close - Fetch connected components from graphs or subgraphs

clm close is not a program in its own right. This manual page documents the behaviour and options of the clm program when invoked in mode close. The options -h, --apropos, --version, -set, --nop are accessible in all clm modes and are described in the clm manual page.

SYNOPSIS


clm close -imx <fname> [options]

clm close -imx fname (specify matrix input) -abc fname (specify label input) -dom fname (input domain/cluster file) [-o fname (output file)] [--is-undirected (trust input graph to be undirected)] [-levels LO/STEP/HI[/prefix] (write cluster size distribution for each cutoff)] [-levels-norm num (divide each level by num to define cutoff)] [--write-count (output component count)] [--write-sizes (output component sizes (default))] [--write-size-counts (output compressed list of component sizes)] [--write-cc (output components as clustering)] [--write-block (output graph restricted to -dom argument)] [--write-blockc (output graph complement of -dom argument)] [-cc-bound num (select components with size at least num)] [--sl (output single linkage tree as list of joins (for -imx input))] [-write-sl-list fname (write list of join order with weights)] [-tf spec (apply tf-spec to input matrix)] [-h (print synopsis, exit)] [--apropos (print synopsis, exit)] [--version (print version, exit)]

DESCRIPTION


Use clm close to fetch the connected components from a graph. Different output modes are supported (see below). In matrix mode (i.e. using the -imx option) the output returned with --write-cc can be used in conjunction with mcxsubs to retrieve individual subgraphs corresponding to connected components.
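As an illustration of what "connected components" means here, the following is a minimal Python sketch using a union-find structure. It is not the clm implementation; the function name connected_components and the toy node and edge lists are assumptions made for this example only.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph.

    Hypothetical helper for illustration; not part of clm.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Walk to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two sets joined by each edge.
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(list)
    for n in nodes:
        groups[find(n)].append(n)

    # Largest component first; ties broken by content for determinism.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
components = connected_components(nodes, edges)
print(components)  # [['a', 'b', 'c'], ['d', 'e'], ['f']]
```

Node f has no edges, so it forms a component of size one.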

OPTIONS



-abc <fname> (label input)
The file name for input that is in label format.


-imx <fname> (input matrix)
The file name for input that is in mcl native matrix format.


-o fname (output file)
Specify the file where output is sent to. The default is STDOUT.


-dom fname (input domain/cluster file)
If this option is used, clm close will, as a first step, for each of the domains in file fname retrieve the associated subgraph from the input graph. These are then further decomposed into connected components, and the program will process these in the normal manner.
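The two-step behaviour described for -dom (restrict to each domain, then decompose) can be sketched in Python as follows. This is an illustration under stated assumptions, not the clm code: the function name components_within_domains and the example data are hypothetical.

```python
def components_within_domains(edges, domains):
    """For each domain, take the induced subgraph and split it into
    connected components. Hypothetical helper; not part of clm."""
    results = []
    for domain in domains:
        members = set(domain)
        # Adjacency restricted to edges with both endpoints in the domain.
        adj = {n: set() for n in members}
        for a, b in edges:
            if a in members and b in members:
                adj[a].add(b)
                adj[b].add(a)
        seen = set()
        for start in sorted(members):
            if start in seen:
                continue
            # Depth-first traversal collects one component.
            stack, comp = [start], []
            seen.add(start)
            while stack:
                n = stack.pop()
                comp.append(n)
                for m in adj[n]:
                    if m not in seen:
                        seen.add(m)
                        stack.append(m)
            results.append(sorted(comp))
    return results


edges = [(1, 2), (2, 3), (3, 4), (5, 6)]
domains = [[1, 2, 4], [3, 5, 6]]
result = components_within_domains(edges, domains)
print(result)  # [[1, 2], [4], [3], [5, 6]]
```

Node 4 ends up alone because its only neighbour, node 3, lies in the other domain.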


--write-count (output component count)


--write-sizes (output component sizes (default))


--write-size-counts (output compressed list of component sizes)


--write-cc (output components as clustering)


--write-block (output graph restricted to -dom argument)


--write-blockc (output graph complement of -dom argument)


The default behaviour is currently to output the sizes of the connected components. It is also possible to simply output the number of components with --write-count, to write a counted list of sizes with --write-size-counts, or to write the components as a clustering in mcl format with --write-cc. Two further modes exist: --write-block outputs the restriction of the input graph to a domain, and --write-blockc outputs the complement of this restriction.
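The difference between the count, the sizes, and the counted list of sizes can be shown with a small Python sketch. The component list is made up, and the ordering and layout used here are assumptions; the exact text format that clm close writes is not reproduced.

```python
from collections import Counter

# Hypothetical components, e.g. as found in some input graph.
components = [["a", "b", "c"], ["d", "e"], ["f", "g"], ["h"]]

# Corresponds in spirit to --write-count: the number of components.
count = len(components)

# Corresponds in spirit to --write-sizes: one size per component.
sizes = sorted((len(c) for c in components), reverse=True)

# Corresponds in spirit to --write-size-counts: each distinct size
# together with the number of components that have that size.
size_counts = sorted(Counter(sizes).items(), reverse=True)

print(count)        # 4
print(sizes)        # [3, 2, 2, 1]
print(size_counts)  # [(3, 1), (2, 2), (1, 1)]
```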


-levels LO/STEP/HI[/prefix] (write cluster size distribution for each cutoff)


-levels-norm num (divide each level by num to define cutoff)


Use -levels to inspect the cluster size distribution at various cut-offs by specifying a triplet of numbers (separated by forward slashes), the first of which is the starting point, the second is the step size, and the third is the end point. If a fourth argument (preceded by another slash) is given, all clusterings are written to a file based on the supplied argument as file name prefix. The cut-off can be further varied by the argument to -levels-norm.
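The cut-off sweep can be pictured with the following Python sketch. It rests on several assumptions that this page does not confirm: that an edge is kept when its weight is at least the cut-off, that HI is included when it is reachable in whole steps, and that the cut-off is the level divided by the -levels-norm value. The names parse_levels, component_sizes and sizes_per_level are hypothetical.

```python
import math
from collections import Counter


def parse_levels(spec):
    """Split a LO/STEP/HI[/prefix] string into its parts."""
    parts = spec.split("/")
    lo, step, hi = (float(p) for p in parts[:3])
    prefix = parts[3] if len(parts) > 3 else None
    return lo, step, hi, prefix


def component_sizes(nodes, edges):
    """Sizes of the connected components, largest first."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return sorted(Counter(find(n) for n in nodes).values(), reverse=True)


def sizes_per_level(nodes, weighted_edges, spec, norm=1.0):
    """Component sizes at each cut-off (assumed: cutoff = level / norm)."""
    lo, step, hi, _prefix = parse_levels(spec)
    n_steps = math.floor((hi - lo) / step + 1e-9)
    out = {}
    for i in range(n_steps + 1):
        level = lo + i * step
        cutoff = level / norm
        # Assumption: edges with weight >= cutoff survive.
        kept = [(a, b) for a, b, w in weighted_edges if w >= cutoff]
        out[level] = component_sizes(nodes, kept)
    return out


nodes = ["a", "b", "c", "d"]
weighted_edges = [("a", "b", 0.9), ("b", "c", 0.5), ("c", "d", 0.25)]
table = sizes_per_level(nodes, weighted_edges, "2/3/8", norm=10)
print(table)  # {2.0: [4], 5.0: [3, 1], 8.0: [2, 1, 1]}
```

With levels 2, 5 and 8 and a norm of 10, the cut-offs are 0.2, 0.5 and 0.8, and the graph falls apart into smaller components as the cut-off rises.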


--sl (output single linkage tree as list of joins (for -imx input))


-write-sl-list fname (write list of join order with weights)


A primary use case for --sl is to apply single link clustering to the rcl (restricted contingency linkage) graph that is output by clm vol with its write-rcl option. This rcl graph encodes a consensus clustering derived from the multiple clusterings that are given to clm vol.

The output (save with -o or UNIX redirection) can be supplied to rcl-res.pl with a list of varying resolution parameters to produce a small number of nested clusterings. The resolution parameters (second and subsequent arguments) to rcl-res.pl are set sizes. For each of the supplied resolutions res, the script will descend the tree as long as the current node has some split below it where both clusters are of size at least res. Note that the resulting clustering may still have smaller clusters and singletons (resulting from other splits).

The mcl distribution has an example script graphs/rcl-example.sh that illustrates the different steps.
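The notion of a single linkage tree expressed as a list of joins can be sketched in Python as below. This is a generic single link procedure, not the clm implementation, and it does not reproduce the output format of --sl or -write-sl-list. It assumes that edge weights are similarities, so that the heaviest edges are joined first; the function name single_linkage_joins and the example graph are hypothetical.

```python
def single_linkage_joins(nodes, weighted_edges):
    """Return the joins of a single link clustering as tuples
    (node_a, node_b, weight, size_of_merged_cluster)."""
    parent = {n: n for n in nodes}
    size = {n: 1 for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    joins = []
    # Assumption: higher weight means more similar, so join those first.
    for a, b, w in sorted(weighted_edges, key=lambda e: -e[2]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # both ends already in the same cluster
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        # Attach the smaller cluster to the larger one.
        parent[rb] = ra
        size[ra] += size[rb]
        joins.append((a, b, w, size[ra]))
    return joins


nodes = ["a", "b", "c", "d"]
weighted_edges = [("a", "b", 0.9), ("c", "d", 0.8),
                  ("b", "c", 0.4), ("a", "c", 0.3)]
joins = single_linkage_joins(nodes, weighted_edges)
for join in joins:
    print(join)
# ('a', 'b', 0.9, 2)
# ('c', 'd', 0.8, 2)
# ('b', 'c', 0.4, 4)
```

The edge between a and c is skipped because both nodes are already in one cluster by the time it is considered.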


--is-undirected (omit graph undirected check)
With this option, the transformation that ensures the input is undirected is omitted. This will be slightly faster. Using this option while the input is directed may lead to erroneous results.


-cc-bound num (select components with size at least num)


-tf spec (apply tf-spec to input matrix)
Transform the input matrix values according to the syntax described in mcxio(5).

AUTHOR


Stijn van Dongen.

SEE ALSO


mclfamily(7) for an overview of all the documentation and the utilities in the mcl family.

9 Oct 2022 clm close 22-282