NAME
create_matrix - calculate the genome abundance similarity matrix
SYNOPSIS
create_matrix [options] NAMES
DESCRIPTION
Calculate the similarity matrix in three steps.
First, a set of reads is simulated for every reference genome using the read
simulator from core/tools.py selected with -s. Second, the simulated reads of
each species are mapped against all reference genomes using the mapper
selected with -m. Third, the resulting SAM files are analyzed to calculate the
similarity matrix. The similarity matrix is stored as a NumPy file (-o).
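The third step and the output format can be illustrated with a minimal sketch. The formula create_matrix actually uses is not stated here, so the normalization below (the fraction of reads simulated from genome i that map to reference j) is an assumption, and the genome names and counts are made-up toy values. Only the use of a NumPy file for the result comes from the description above.

```python
import os
import tempfile

import numpy as np

# Hypothetical toy data: counts[i, j] is the number of reads simulated from
# genome i that the mapper aligned to reference genome j (values are made up).
names = ["genome_a", "genome_b", "genome_c"]
counts = np.array(
    [
        [100, 20, 0],
        [25, 100, 5],
        [0, 10, 100],
    ],
    dtype=float,
)
reads_per_genome = np.array([100, 100, 100], dtype=float)

# Assumed normalization (not confirmed as the tool's method): divide each row
# by the number of reads simulated for that genome.
similarity = counts / reads_per_genome[:, np.newaxis]

# Store the matrix as a NumPy file, then load it back for later analysis.
out_path = os.path.join(tempfile.mkdtemp(), "similarity_matrix.npy")
np.save(out_path, similarity)
loaded = np.load(out_path)
```

A matrix written with -o can be read back in the same way with np.load. Row and column order would presumably follow the order of the names file, but that is also an assumption.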
OPTIONS
- NAMES
  Filename of the names file. The plain-text names file should contain one name per line. Each name is used as an identifier throughout the algorithm.
- -h, --help
  Show this help message and exit.
- -s SIMULATOR, --simulator=SIMULATOR
  Identifier of the read simulator defined in core/tools.py. [default: none]
- -r REF, --reference=REF
  Reference sequence file pattern for the read simulator. The placeholder for the name is "%s". [default: ./ref/%s.fasta]
- -m MAPPER, --mapper=MAPPER
  Identifier of the mapper defined in core/tools.py. [default: none]
- -i INDEX, --index=INDEX
  Reference index file pattern for the read mapper. The placeholder for the name is "%s". [default: ./ref/%s.fasta]
- -t TEMP, --temp=TEMP
  Directory in which to store temporary simulated datasets and SAM files. [default: ./temp]
- -o OUT, --output=OUT
  Output similarity matrix file. [default: ./similarity_matrix.npy]
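The interplay between the names file and the "%s" placeholder in the -r and -i patterns can be sketched as follows. This is an illustration only: the helper functions read_names and expand are hypothetical, the example names are made up, and whether create_matrix skips blank lines or uses Python's % formatting internally is an assumption.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of a names file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def expand(pattern, name):
    """Substitute a name for the "%s" placeholder in a file pattern."""
    return pattern % name


# Write a small, made-up names file with one name per line.
workdir = tempfile.mkdtemp()
names_path = os.path.join(workdir, "names.txt")
with open(names_path, "w") as handle:
    handle.write("ecoli\nsalmonella\n")

names = read_names(names_path)

# With the default pattern, each name resolves to one reference file.
reference_files = [expand("./ref/%s.fasta", name) for name in names]
```

With these toy names, the default pattern yields ./ref/ecoli.fasta and ./ref/salmonella.fasta, so every name listed in the names file needs a matching reference file.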