.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.1.
.TH SAM-STATS "1" "July 2015" "sam-stats 1.1.2" "User Commands"
.SH NAME
sam-stats \- ea-utils: produce digested statistics
.SH SYNOPSIS
.B sam-stats
[\fI\,options\/\fR] [\fI\,file1\/\fR] [\fI\,file2\/\fR...\fI\,filen\/\fR]
.SH DESCRIPTION
Version: 1.38.681
.PP
Produces lots of easily digested statistics for the files listed.
.PP
Options (default in parens):
.TP
\fB\-D\fR
Keep track of multiple alignments
.TP
\fB\-O\fR PREFIX
Output prefix enabling extended output (see below)
.TP
\fB\-R\fR FIL
Coverage/RNA output (coverage, 3' bias, etc, implies \fB\-A\fR)
.TP
\fB\-A\fR
Report all chr sigs, even if there are more than 1000
.TP
\fB\-b\fR INT
Number of reads to sample for per\-base stats (1M)
.TP
\fB\-S\fR INT
Size of ascii\-signature (30)
.TP
\fB\-x\fR FIL
File extension for handling multiple files (stats)
.TP
\fB\-M\fR
Only overwrite if newer (requires \fB\-x\fR, or multiple files)
.TP
\fB\-B\fR
Input is bam, don't bother looking at magic
.TP
\fB\-z\fR
Don't fail when zero entries in sam
.PP
OUTPUT:
.PP
If one file is specified, the output goes to standard output.
If multiple files are specified, or if the \fB\-x\fR option is supplied,
the output file is \fIfilename\fR.\fIext\fR.
The default extension is 'stats'.
.PP
Complete Stats:
.TP
\fISTATS\fR
: mean, max, stdev, median, Q1 (25th percentile), Q3
.TP
reads
: # of entries in the sam file, might not be # reads
.TP
phred
: phred scale used
.TP
bsize
: # reads used for qual stats
.TP
mapped reads
: number of aligned reads (unique probe id sequences)
.TP
mapped bases
: total of the lengths of the aligned reads
.TP
forward
: number of forward\-aligned reads
.TP
reverse
: number of reverse\-aligned reads
.TP
snp rate
: mismatched bases / total bases (snv rate)
.TP
ins rate
: insert bases / total bases
.TP
del rate
: deleted bases / total bases
.TP
pct mismatch
: percent of reads that have mismatches
.TP
pct align
: percent of reads that aligned
.TP
len
: read length stats, ignored if fixed\-length
.TP
mapq
: stats for mapping qualities
.TP
insert
: stats for insert sizes
.TP
%\fICHR\fR
: percentage of mapped bases per chr, followed by a signature
.SS "Subsampled stats (1M reads max):"
.TP
base qual
: stats for base qualities
.TP
%A,%T,%C,%G
: base percentages
.SS "Meaning of the per-chromosome signature:"
.IP
An ASCII histogram of mapped reads by chromosome position.
It is only output if the original SAM/BAM has a header.
The values are the log2 of the # of mapped reads at each position + ascii '0'.
.SS "Extended output mode produces a set of files:"
.TP
\&.stats
: primary output
.TP
\&.fastx
: fastx\-toolkit compatible output
.TP
\&.rcov
: per\-reference counts & coverage
.TP
\&.xdist
: mismatch distribution
.TP
\&.ldist
: length distribution (if applicable)
.TP
\&.mqdist
: mapping quality distribution
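.SH EXAMPLES
.PP
The per\-chromosome signature described above stores, for each position bin,
the log2 of the number of mapped reads plus ascii '0'.
The Python sketch below shows one way such a signature could be turned back
into approximate read counts.
It is not part of sam\-stats: the helper name decode_signature and the sample
signature are hypothetical, and how sam\-stats rounds the log2 value or
encodes empty bins is not stated here, so the results are only rough magnitudes.
.PP
.nf
```python
# Sketch only: decode_signature is a hypothetical helper, not part of sam-stats.
# Assumes each signature character is log2(mapped reads) + ord("0"), as the
# description above states. Rounding and empty-bin handling are unknown,
# so treat the results as rough magnitudes.

def decode_signature(sig):
    """Return an approximate mapped-read count for each signature character."""
    return [2 ** (ord(ch) - ord("0")) for ch in sig]


if __name__ == "__main__":
    # Made-up signature for illustration, not real sam-stats output.
    sample = "0358"
    for ch, n in zip(sample, decode_signature(sample)):
        print(ch, "~", n, "reads")
```
.fi
.PP
Characters past '9' in the ASCII table simply continue the scale, so ':'
would correspond to roughly 2 to the power 10 reads under this assumption.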