.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.1.
.TH FASTQ-MCF "1" "July 2015" "fastq-mcf 1.1.2" "User Commands"
.SH NAME
fastq-mcf \- ea-utils: detect levels of adapter presence, compute likelihoods and locations of the adapters
.SH SYNOPSIS
.B fastq-mcf
[\fI\,options\/\fR] \fI\,<adapters.fa> <reads.fq> \/\fR[\fI\,mates1.fq \/\fR...]
.SH DESCRIPTION
Version: 1.04.676
.PP
Detects levels of adapter presence, computes likelihoods and
locations (start, end) of the adapters.   Removes the adapter
sequences from the fastq file(s).
.PP
Stats go to stderr, unless \fB\-o\fR is specified, in which case they
go to stdout.
.PP
Specify \fB\-0\fR to turn off all default settings.
.PP
If you specify multiple 'paired\-end' inputs, then a \fB\-o\fR option is
required for each.  For example: \fB\-o\fR read1.clip.fq \fB\-o\fR read2.clip.fq
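.PP
As an illustrative sketch of the paired\-end case, following the
synopsis order (adapters.fa, read1.fq and read2.fq are placeholder
file names, not files shipped with ea\-utils):
.PP
.RS
.nf
fastq\-mcf \-o read1.clip.fq \-o read2.clip.fq adapters.fa read1.fq read2.fq
.fi
.RE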
.SH OPTIONS
.TP
\fB\-h\fR
Show this help
.TP
\fB\-o\fR FIL
Output file (stats to stdout)
.TP
\fB\-s\fR N.N
Log scale for adapter minimum\-length\-match (2.2)
.TP
\fB\-t\fR N
% occurrence threshold before adapter clipping (0.25)
.TP
\fB\-m\fR N
Minimum clip length, overrides scaled auto (1)
.TP
\fB\-p\fR N
Maximum adapter difference percentage (10)
.TP
\fB\-l\fR N
Minimum remaining sequence length (19)
.TP
\fB\-L\fR N
Maximum remaining sequence length (none)
.TP
\fB\-D\fR N
Remove duplicate reads: Read_1 has N identical bases (0)
.TP
\fB\-k\fR N
sKew percentage\-less\-than causing cycle removal (2)
.TP
\fB\-x\fR N
\&'N' (Bad read) percentage causing cycle removal (20)
.TP
\fB\-q\fR N
Quality threshold causing base removal (10)
.TP
\fB\-w\fR N
Window\-size for quality trimming (1)
.TP
\fB\-H\fR
Remove >95% homopolymer reads (no)
.TP
\fB\-X\fR
Remove low complexity reads (no)
.TP
\fB\-0\fR
Set all default parameters to zero/do nothing
.TP
\fB\-U\fR|\fB\-u\fR
Force disable/enable Illumina PF filtering (auto)
.TP
\fB\-P\fR N
Phred\-scale (auto)
.TP
\fB\-R\fR
Don't remove N's from the fronts/ends of reads
.TP
\fB\-n\fR
Don't clip, just output what would be done
.TP
\fB\-C\fR N
Number of reads to use for subsampling (300k)
.TP
\fB\-S\fR
Save all discarded reads to '.skip' files
.TP
\fB\-d\fR
Output lots of random debugging stuff
.SS "Quality adjustment options:"
.TP
\fB\-\-cycle\-adjust\fR CYC,AMT
Adjust cycle CYC (negative = offset from end) by amount AMT
.TP
\fB\-\-phred\-adjust\fR SCORE,AMT
Adjust score SCORE by amount AMT
.TP
\fB\-\-phred\-adjust\-max\fR SCORE
Adjust scores > SCORE to SCORE
.SS "Filtering options*:"
.TP
\fB\-\-[mate\-]qual\-mean\fR NUM
Minimum mean quality score
.TP
\fB\-\-[mate\-]qual\-gt\fR NUM,THR
At least NUM quals > THR
.TP
\fB\-\-[mate\-]max\-ns\fR NUM
Maximum N\-calls in a read (can be a %)
.TP
\fB\-\-[mate\-]min\-len\fR NUM
Minimum remaining length (same as \fB\-l\fR)
.TP
\fB\-\-homopolymer\-pct\fR PCT
Homopolymer filter percent (95)
.TP
\fB\-\-lowcomplex\-pct\fR PCT
Complexity filter percent (95)
.PP
If the mate\- prefix is used, the filter applies to the second
non\-barcode read only.
.PP
Adapter files are 'fasta' formatted.
.PP
Specify n/a to turn off adapter clipping, and just use filters.
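.PP
For illustration only, a minimal adapter file might look like the
following; the label and the sequence are arbitrary placeholders, not
a real adapter:
.PP
.RS
.nf
>example_adapter
ACGTACGTACGTACGT
.fi
.RE
.PP
A sketch of a filter\-only run, with n/a taking the place of the
adapter file (reads.fq and reads.filtered.fq are placeholder names):
.PP
.RS
.nf
fastq\-mcf \-o reads.filtered.fq n/a reads.fq
.fi
.RE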
.PP
Increasing the scale (\fB\-s\fR) makes recognition\-lengths longer; a scale
of 100 will force full\-length recognition of adapters.
.PP
Adapter sequences with _5p in their label will match 'end's,
and sequences with _3p in their label will match 'start's,
otherwise the 'end' is auto\-determined.
.PP
Skew is when one cycle is poor, 'skewed' toward a particular base.
If any nucleotide is less than the skew percentage, then the
whole cycle is removed.  Disable for methyl\-seq, etc.
.PP
Set the skew (\fB\-k\fR) or N\-pct (\fB\-x\fR) to 0 to turn it off (should be done
for miRNA, amplicon and other low\-complexity situations!)
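.PP
As a sketch of the above for a low\-complexity library (adapters.fa,
reads.fq and reads.clip.fq are placeholder names), both cycle\-removal
checks can be switched off together:
.PP
.RS
.nf
fastq\-mcf \-k 0 \-x 0 \-o reads.clip.fq adapters.fa reads.fq
.fi
.RE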
.PP
Duplicate read filtering is appropriate for assembly tasks, but
never when read length < expected coverage.  \fB\-D\fR 50 will use
4.5GB RAM on 100 million DNA reads, so be careful.  Great for RNA assembly.
.PP
*Quality filters are evaluated after clipping/trimming.
.PP
Homopolymer filtering is a subset of low\-complexity, but will not
be separately tracked unless both are turned on.