.TH CREATE_MATRIX "1" "February 2014" "create_matrix SVNr18" "User Commands"
.SH NAME
create_matrix \- calculate the genome abundance similarity matrix
.SH SYNOPSIS
.B create_matrix
[\fIoptions\fR] \fINAMES\fR
.SH DESCRIPTION
Calculate the genome abundance similarity matrix for the reference genomes
listed in \fINAMES\fR.
.PP
First, a set of reads is simulated for every reference genome using a read
simulator from core/tools.py specified via \fB\-s\fR.
Second, the simulated reads of each species are mapped against all reference
genomes using the mapper specified with \fB\-m\fR.
Third, the resulting SAM files are analyzed to calculate the similarity
matrix, which is stored as a NumPy file (\fB\-o\fR).
.SH OPTIONS
.TP
\fBNAMES\fR
Filename of the names file, a plain text file containing one name per
line. Each name serves as the identifier of its genome throughout
the algorithm.
.TP
\fB\-h\fR, \fB\-\-help\fR
Show this help message and exit.
.TP
\fB\-s\fR SIMULATOR, \fB\-\-simulator\fR=\fISIMULATOR\fR
Identifier of the read simulator defined in core/tools.py
[default: none]
.TP
\fB\-r\fR REF, \fB\-\-reference\fR=\fIREF\fR
Reference sequence file pattern for the read
simulator. The placeholder "%s" is replaced by each name. [default:
\&./ref/%s.fasta]
.TP
\fB\-m\fR MAPPER, \fB\-\-mapper\fR=\fIMAPPER\fR
Identifier of the mapper defined in core/tools.py
[default: none]
.TP
\fB\-i\fR INDEX, \fB\-\-index\fR=\fIINDEX\fR
Reference index file pattern for the read mapper. The placeholder "%s"
is replaced by each name. [default: ./ref/%s.fasta]
.TP
\fB\-t\fR TEMP, \fB\-\-temp\fR=\fITEMP\fR
Directory to store temporary simulated datasets and
SAM files. [default: ./temp]
.TP
\fB\-o\fR OUT, \fB\-\-output\fR=\fIOUT\fR
Output similarity matrix file. [default:
\&./similarity_matrix.npy]
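.SH EXAMPLES
The following Python sketch is not part of create_matrix. It only
illustrates how the names file and a "%s" file pattern (as given to
\fB\-r\fR or \fB\-i\fR) combine into one path per name, so that missing
reference files can be spotted before a run. The helper names
expected_files and missing_files are hypothetical, and skipping blank
lines in the names file is an assumption of this sketch.
.PP
.nf

```python
import os


def expected_files(names_path, pattern="./ref/%s.fasta"):
    """Map each name in the names file to the path the pattern yields."""
    with open(names_path) as handle:
        # One name per line; this sketch skips blank lines (assumption).
        names = [line.strip() for line in handle if line.strip()]
    # Substitute each name for the "%s" placeholder in the pattern.
    return {name: pattern.replace("%s", name) for name in names}


def missing_files(names_path, pattern="./ref/%s.fasta"):
    """Return the expected paths that do not exist on disk, sorted."""
    paths = expected_files(names_path, pattern)
    return sorted(p for p in paths.values() if not os.path.isfile(p))
```

.fi
.PP
For instance, missing_files("names.txt") would list the reference files
that the default pattern expects but that are not yet present; an empty
list suggests every name has a matching file.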