.TH VCF_FILTER "1" "October 2015" "0.6.7" "User Commands"
.SH NAME
vcf_filter \- Filter a VCF file
.SH SYNOPSIS
.B vcf_filter
[\-h] [\-\-no\-short\-circuit] [\-\-no\-filtered] [\-\-output OUTPUT] [\-\-local\-script LOCAL_SCRIPT] input filter [filter_args] [filter [filter_args]] ...
.SH DESCRIPTION
This script is part of PyVCF.
.SH OPTIONS
.SS "positional arguments:"
.TP
input
File to process (use \- for STDIN) (default: None)
.SS "optional arguments:"
.TP
\fB\-h\fR, \fB\-\-help\fR
Show this help message and exit. (default: False)
.TP
\fB\-\-no\-short\-circuit\fR
Do not stop filter processing on a site if any filter is triggered (default: False)
.TP
\fB\-\-output\fR OUTPUT
Filename to output [STDOUT] (default: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='ANSI_X3.4\-1968'>)
.TP
\fB\-\-no\-filtered\fR
Output only sites passing the filters (default: False)
.TP
\fB\-\-local\-script\fR LOCAL_SCRIPT
Python file in current working directory with the filter classes (default: None)
.SS "mgq:"
.IP
Filters sites with only low quality variants.
.IP
It is possible to have a high site quality with many low quality calls.
This filter demands at least one call be above a threshold quality.
.TP
\fB\-\-genotype\-quality\fR GENOTYPE_QUALITY
Filter sites with no genotypes above this quality (default: 50)
.SS "snp-only:"
.IP
Choose only SNP variants
.SS "dps:"
.IP
Threshold read depth per sample
.TP
\fB\-\-depth\-per\-sample\fR DEPTH_PER_SAMPLE
Minimum required coverage in each sample (default: 5)
.SS "avg-dps:"
.IP
Threshold average read depth per sample (read_depth / sample_count)
.TP
\fB\-\-avg\-depth\-per\-sample\fR AVG_DEPTH_PER_SAMPLE
Minimum required average coverage per sample (default: 3)
.SS "eb:"
.IP
Filter sites that look like correlated sequencing errors.
.IP
Some sequencing technologies, notably pyrosequencing, produce mutation hotspots where there is a constant level of noise, producing some reference and some heterozygote calls.
.IP
This filter computes a Bayes factor for each site by comparing the binomial likelihood of the observed allelic depths under two models:
.IP
* A model with constant error equal to the MAF.
.br
* A model where each sample has the ploidy reported by the caller.
.IP
The test value is the log of the Bayes factor.
Higher values are more likely to be errors.
.IP
Note: this filter requires rpy2.
.TP
\fB\-\-eblr\fR EBLR
Filter sites above this error log odds ratio (default: \-10)
.SS "sq:"
.IP
Filter low quality sites
.TP
\fB\-\-site\-quality\fR SITE_QUALITY
Filter sites below this quality (default: 30)
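.SH EXAMPLES
The model comparison performed by the eb filter can be illustrated with a minimal Python sketch.
It is not PyVCF's implementation (which requires rpy2).
The function names, the use of the pooled alternate-allele fraction as the constant error rate, and the clamping value eps are all assumptions made for illustration.
.PP
.nf
```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)


def log_bayes_factor(alt_depths, total_depths, alt_fractions, eps=0.01):
    """Log Bayes factor of a constant-error model against a genotype model.

    alt_fractions holds the alternate-allele fraction implied by each
    sample's called genotype (0.0, 0.5 or 1.0 for a diploid call).
    eps is a hypothetical clamp that keeps probabilities away from 0 and 1.
    """
    def clamp(p):
        return min(max(p, eps), 1.0 - eps)

    # Model 1 (assumption): all samples share one error rate, taken here
    # as the pooled alternate-allele fraction across samples.
    pooled = clamp(sum(alt_depths) / float(sum(total_depths)))
    ll_error = sum(
        log_binom_pmf(k, n, pooled) for k, n in zip(alt_depths, total_depths)
    )

    # Model 2: each sample follows the fraction implied by its genotype.
    ll_genotype = sum(
        log_binom_pmf(k, n, clamp(f))
        for k, n, f in zip(alt_depths, total_depths, alt_fractions)
    )
    return ll_error - ll_genotype


# Constant low-level noise in every sample, called as a mix of ref and het.
noisy = log_bayes_factor([3, 3, 3, 3], [20, 20, 20, 20], [0.0, 0.5, 0.0, 0.5])
# Allelic depths that closely match the called genotypes.
clean = log_bayes_factor([0, 10, 20, 0], [20, 20, 20, 20], [0.0, 0.5, 1.0, 0.0])
print(noisy > -10, clean > -10)
```
.fi
.PP
Under this sketch, the site with uniform noise scores higher than the site whose depths match its genotypes.
With the default \-\-eblr threshold of \-10, only the noisy site would be flagged as a likely correlated error.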