.TH SAMFILTER "1" "July 2015" "samFilter 3ca7fe8" "User Commands"
.SH NAME
samFilter \- filter nucleotide sequence alignments in SAM files
.SH SYNOPSIS
.B samFilter
.I file.sam
.I reference.fasta
.I out.sam
.RI [ options ]
.SH OPTIONS
.TP
.I file.sam
Input SAM file.
.TP
.I reference.fasta
Reference used to generate reads.
.TP
.I out.sam
Output SAM file.
.TP
.BI \-minAlnLength \0value
(50) Report alignments only if their lengths are greater than \fIvalue\fR.
.TP
.BI \-minAlignLength \0value
Alias of \fB\-minAlnLength\fR.
.TP
.BI \-minLength \0value
Alias of \fB\-minAlnLength\fR.
.TP
.BI \-minPctSimilarity \0value
.IP
(70) Report alignments only if their percentage similarity is greater than \fIvalue\fR.
.TP
.BI \-minPctIdentity \0value
Alias of \fB\-minPctSimilarity\fR.
.TP
.BI \-minPctAccuracy \0value
(70) Report alignments only if their percentage accuracy is greater than \fIvalue\fR.
.TP
.BI \-minAccuracy \0value
Alias of \fB\-minPctAccuracy\fR.
.TP
.BI \-hitPolicy \0value
(randombest) Policy for handling multiple hits; one of all, allbest, random, randombest, leftmost.
.RS
.TP
.I all
Report all alignments.
.TP
.I allbest
Report all equally top scoring alignments.
.TP
.I random
Report a random alignment.
.TP
.I randombest
Report a random alignment from multiple equally top scoring alignments.
.TP
.I leftmost
Report the alignment that has the best alignment score and the smallest mapping coordinate in any reference.
.RE
.TP
.BI \-scoreSign \0value
(\-1) Whether higher or lower scores are better.
.RS
.TP
\-1
Lower is better.
.TP
1
Higher is better.
.RE
.TP
.BI \-scoreCutoff \0value
(INF) Report alignments only if their scores are no worse than \fIvalue\fR.
.TP
.BI \-seed \0value
(1) Seed for the random number generator. If the seed is 0, the current time is used as the seed.
.TP
.BI \-holeNumbers \0value
A string of comma-delimited hole number ranges for which to output hits, such as '1,2,10-12'. This requires hit titles to be in SMRT read title format.
.TP
.B \-smrtTitle
Use this option when filtering alignments generated by programs other than
.BR blasr (1),
e.g. bwa\-sw or gmap. Read coordinates are parsed from the SMRT read title. The title is in the format \fI\,/name/hole/coordinates\/\fP, where coordinates are in the format \ed+_\ed+ and represent the interval of the read that was aligned.
.TP
.BI \-titleTable \0value
Use this experimental option when filtering alignments generated by
.BR blasr (1)
with \fB\-titleTable\fR titleTableName, in which case reference titles in SAM are represented by their indices (e.g., 0, 1, 2, ...) in the title table.
.TP
.BI \-filterAdapterOnly \0value
Use this option to remove reads that map only to adapters specified in the GFF file.
.TP
.B \-v
Be verbose.
.SH NOTES
Because SAM optional tags have different meanings in different programs, care is required to obtain correct output. The "xs" tag in bwa\-sw reports the suboptimal score, but in PacBio SAM
.RB ( blasr (1))
it is defined as the start of the alignment in the query sequence. When \fB\-smrtTitle\fR is specified, the xs tag is ignored. When it is not specified, the coordinates given by the xs and xe tags define the interval of the read that is aligned. The CIGAR string is relative to this interval.
.SH SEE ALSO
.BR blasr (1),
.BR loadPulses (1),
.BR pls2fasta (1),
.BR samtoh5 (1),
.BR samtom4 (1),
.BR sawriter (1),
.BR sdpMatcher (1),
.BR toAfg (1)
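.SH EXAMPLES
The value syntax of \fB\-holeNumbers\fR and the SMRT read title format described under \fB\-smrtTitle\fR can be illustrated with a short Python sketch. This is not samFilter's own parser. The function names \fIparse_hole_ranges\fR and \fIparse_smrt_title\fR are hypothetical, and the sketch assumes that a range such as 10-12 is inclusive and that a leading slash in the title is optional.
.PP
.nf
```python
import re

def parse_hole_ranges(spec):
    """Expand a spec such as '1,2,10-12' into a set of hole numbers.

    Assumption: ranges are inclusive at both ends.
    """
    holes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            holes.update(range(lo, hi + 1))
        else:
            holes.add(int(part))
    return holes

# Assumed title layout: name/hole/start_end, with an optional leading slash.
TITLE_RE = re.compile(
    r"^/?(?P<name>.+)/(?P<hole>[0-9]+)/(?P<start>[0-9]+)_(?P<end>[0-9]+)$"
)

def parse_smrt_title(title):
    """Return (name, hole, start, end), or None if the title does not match."""
    m = TITLE_RE.match(title)
    if m is None:
        return None
    return (m.group("name"), int(m.group("hole")),
            int(m.group("start")), int(m.group("end")))

def title_selected(title, holes):
    """True if the title parses and its hole number is in the given set."""
    parsed = parse_smrt_title(title)
    return parsed is not None and parsed[1] in holes
```
.fi
.PP
Under these assumptions, a title that is not in SMRT format is never selected, which mirrors the requirement stated for \fB\-holeNumbers\fR.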