.TH ALF 1 "" "alf 1.1.10 [tarball]" ""
.SH NAME
alf \- Alignment-free sequence comparison
.SH SYNOPSIS
\fBalf\fP [\fIOPTIONS\fP] \fB-i\fP \fIIN.FASTA\fP [\fB-o\fP \fIOUT.TXT\fP]
.SH DESCRIPTION
Compute the pairwise similarity of the sequences in \fIIN.FASTA\fP using alignment-free methods and write a tab-delimited matrix of pairwise scores to \fIOUT.TXT\fP.
.SH OPTIONS
.TP
\fB-h\fP, \fB--help\fP
Display the help message.
.TP
\fB--version\fP
Display version information.
.TP
\fB-v\fP, \fB--verbose\fP
Print details about the progress to the screen.
.SS Input / Output:
.TP
\fB-i\fP, \fB--input-file\fP \fIINPUT_FILE\fP
Name of the multi-FASTA input file. Valid filetypes are: \fI.sam[.*]\fP, \fI.raw[.*]\fP, \fI.gbk[.*]\fP, \fI.frn[.*]\fP, \fI.fq[.*]\fP, \fI.fna[.*]\fP, \fI.ffn[.*]\fP, \fI.fastq[.*]\fP, \fI.fasta[.*]\fP, \fI.faa[.*]\fP, \fI.fa[.*]\fP, \fI.embl[.*]\fP, and \fI.bam\fP, where * is any of the following extensions: \fIgz\fP, \fIbz2\fP, and \fIbgzf\fP for transparent (de)compression.
.TP
\fB-o\fP, \fB--output-file\fP \fIOUTPUT_FILE\fP
Name of the file to which the tab-delimited matrix of pairwise scores is written. Default is to write to stdout. Valid filetype is: \fI.alf[.*]\fP, where * is any of the following extensions: \fItsv\fP for transparent (de)compression.
.SS General Algorithm Parameters:
.TP
\fB-m\fP, \fB--method\fP \fISTRING\fP
Select method to use. One of \fIN2\fP, \fID2\fP, \fID2Star\fP, and \fID2z\fP. Default: \fIN2\fP.
.TP
\fB-k\fP, \fB--k-mer-size\fP \fIINTEGER\fP
Size of the k-mers. Default: \fI4\fP.
.TP
\fB-mo\fP, \fB--bg-model-order\fP \fIINTEGER\fP
Order of the background Markov model. Default: \fI1\fP.
.SS N2 Algorithm Parameters:
.TP
\fB-rc\fP, \fB--reverse-complement\fP \fISTRING\fP
Which strand to score.  Use \fIboth_strands\fP to score both strands simultaneously. One of \fIinput\fP, \fIboth_strands\fP, \fImean\fP, \fImin\fP, and \fImax\fP. Default: \fIinput\fP.
.TP
\fB-mm\fP, \fB--mismatches\fP \fIINTEGER\fP
Number of mismatches, either \fI0\fP or \fI1\fP. When \fI1\fP is used, N2 uses the k-mer neighbourhood with one mismatch. Default: \fI0\fP.
.TP
\fB-mmw\fP, \fB--mismatch-weight\fP \fIDOUBLE\fP
Real-valued weight of counts for words with mismatches. Default: \fI0.1\fP.
.TP
\fB-kwf\fP, \fB--k-mer-weights-file\fP \fIOUTPUT_FILE\fP
Print k-mer weights for every sequence to this file if given. Valid filetype is: \fI.txt\fP.
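.SH EXAMPLES
The invocations below are a minimal sketch of how the options above can be combined. The file names \fIseqs.fasta\fP, \fIscores.tsv\fP, and \fIweights.txt\fP are hypothetical placeholders, not files shipped with alf.
.PP
Score all sequence pairs with the default settings (method \fIN2\fP, k-mer size \fI4\fP) and capture the matrix written to stdout:
.PP
.nf
alf -i seqs.fasta > scores.tsv
.fi
.PP
Select the \fID2\fP method with 5-mers:
.PP
.nf
alf -m D2 -k 5 -i seqs.fasta > scores.tsv
.fi
.PP
Run \fIN2\fP on both strands, count words with one mismatch at a weight of 0.5, and also write the per-sequence k-mer weights to a file:
.PP
.nf
alf -m N2 -k 5 -rc both_strands -mm 1 -mmw 0.5 -kwf weights.txt -i seqs.fasta > scores.tsv
.fi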
.SH CONTACT AND REFERENCES
.TP
For questions or comments, contact:
Jonathan Goeke <goeke@molgen.mpg.de>
.TP
Please cite the following publication if you use ALF or the N2 method in your analysis:
Jonathan Goeke, Marcel H. Schulz, Julia Lasserre, and Martin Vingron. Estimation of Pairwise Sequence Similarity of Mammalian Enhancers with Word Neighbourhood Counts. Bioinformatics (2012).
.TP
Project Homepage:
http://www.seqan.de/projects/alf