create_matrix - calculate the genome abundance similarity matrix


create_matrix [options] NAMES


Calculate the similarity matrix.

First, a set of reads is simulated for every reference genome using the read simulator from core/ selected with -s. Second, the simulated reads of each species are mapped against all reference genomes using the mapper selected with -m. Third, the resulting SAM files are analyzed to calculate the similarity matrix. The similarity matrix is written to a NumPy file (-o).
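
The last step can be illustrated with a minimal sketch. It assumes that each matrix entry is the fraction of reads simulated from one genome that map to another; this page does not state the exact definition. The names, read counts, and temporary output path are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical names, as they would appear in the names file.
names = ["genome_a", "genome_b", "genome_c"]

# Hypothetical counts: mapped[i, j] is the number of reads simulated from
# genome i that the mapper aligned to reference genome j.
mapped = np.array([
    [1000, 250, 0],
    [300, 1000, 50],
    [0, 40, 1000],
], dtype=float)
reads_per_genome = 1000  # assumed number of simulated reads per genome

# Assumed definition: fraction of genome i's reads that map to genome j.
similarity = mapped / reads_per_genome

# Store and reload the matrix in NumPy's .npy format, the format named for -o.
out_path = os.path.join(tempfile.mkdtemp(), "similarity_matrix.npy")
np.save(out_path, similarity)
loaded = np.load(out_path)
```

A file written by create_matrix can be read back the same way with numpy.load; row and column order would be expected to follow the order of the names file, although that is an assumption here as well.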


Filename of the names file. This plain-text file should contain one name per line. Each name is used as an identifier throughout the algorithm.
Show this help message and exit.
Identifier of the read simulator defined in core/ [default: none]
Reference sequence file pattern for the read simulator. Placeholder for the name is "%s". [default: ./ref/%s.fasta]
Identifier of the mapper defined in core/ [default: none]
Reference index files for the read mapper. Placeholder for the name is "%s". [default: ./ref/%s.fasta]
Directory to store temporary simulated datasets and SAM files. [default: ./temp]
Output similarity matrix file. [default: ./similarity_matrix.npy]
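
The way the names file and the "%s" placeholder fit together can be sketched as follows. The helper functions read_names and expand, and the example names, are hypothetical and not part of create_matrix. Skipping blank lines is an assumption.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of a names file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def expand(pattern, names):
    """Substitute each name for the "%s" placeholder in a file pattern."""
    return [pattern % name for name in names]


# Write a small hypothetical names file with one name per line.
names_path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(names_path, "w") as handle:
    handle.write("genome_a\ngenome_b\n")

names = read_names(names_path)

# Expand the default reference pattern once per name.
ref_files = expand("./ref/%s.fasta", names)
for ref in ref_files:
    print(ref)
```

With the default pattern, each name in the names file is therefore expected to have a matching file in ./ref/.
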
February 2014 create_matrix SVNr18