vcf(5) Bioinformatics formats vcf(5)


vcf - Variant Call Format


The Variant Call Format (VCF) is a TAB-delimited format with each data line consisting of the following fields:

1    CHROM    CHROMosome name
2    POS      the left-most POSition of the variant
3    ID       unique variant IDentifier
4    REF      the REFerence allele
5    ALT      the ALTernate allele(s) (comma-separated)
6    QUAL     variant/reference QUALity
7    FILTER   FILTERs applied
8    INFO     INFOrmation related to the variant (semicolon-separated)
9    FORMAT   FORMAT of the genotype fields (optional; colon-separated)
10+  SAMPLE   SAMPLE genotypes and per-sample information (optional)
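
As an illustration of the field layout above, the following minimal Python sketch splits one data line on TABs and applies the separators named in the table. The function name parse_vcf_line and the example record are hypothetical, made up for this illustration; they are not part of samtools, bcftools or htslib.

```python
def parse_vcf_line(line):
    """Split one VCF data line into its named fields (illustrative only)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        raise ValueError("expected at least 8 TAB-separated fields")
    return {
        "CHROM": cols[0],
        "POS": int(cols[1]),           # left-most position of the variant
        "ID": cols[2],
        "REF": cols[3],
        "ALT": cols[4].split(","),     # comma-separated alternate alleles
        "QUAL": cols[5],               # kept as text; no conversion assumed
        "FILTER": cols[6],
        "INFO": cols[7].split(";"),    # semicolon-separated entries
        # FORMAT and SAMPLE columns are optional
        "FORMAT": cols[8].split(":") if len(cols) > 8 else [],
        "SAMPLES": cols[9:],
    }


# Made-up record, used only to exercise the function.
example = "chr1\t12345\t.\tA\tC,T\t50\tPASS\tDP=10;MQ=40\tGT:PL\t0/1:30,0,40"
rec = parse_vcf_line(example)
print(rec["CHROM"], rec["POS"], rec["ALT"], rec["FORMAT"])
```

Real files also carry header lines, which this sketch does not handle; for production use a dedicated VCF library is the safer choice.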

The following table gives the INFO tags used by samtools and bcftools.

AF1    Max-likelihood estimate of the site allele frequency (AF) of the first ALT allele (double)
DP     Raw read depth (without quality filtering) (int)
DP4    Number of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse bases (int[4])
FQ     Consensus quality. Positive: sample genotypes different; negative: otherwise (int)
MQ     Root-Mean-Square mapping quality of covering reads (int)
PC2    Phred probability of AF in group1 samples being larger (or smaller) than in group2 (int[2])
PCHI2  Posterior weighted chi^2 P-value between group1 and group2 samples (double)
PV4    P-values for strand bias, baseQ bias, mapQ bias and tail distance bias (double[4])
QCHI2  Phred-scaled PCHI2 (int)
RP     Number of permutations yielding a smaller PCHI2 (int)
CLR    Phred log ratio of genotype likelihoods with and without the trio/pair constraint (int)
UGT    Most probable genotype configuration without the trio constraint (string)
CGT    Most probable genotype configuration with the trio constraint (string)
VDB    Tests variant positions within reads. Intended for filtering RNA-seq artifacts around splice sites (float)
RPB    Mann-Whitney rank-sum test for tail distance bias (float)
HWE    Hardy-Weinberg equilibrium test (Wigginton et al.) (float)
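
To show how such tags might be read from the INFO column, here is a minimal Python sketch. It assumes the usual KEY=VALUE form of INFO entries with comma-separated values for array-typed tags, and it assumes the four strand-specific base counts are stored under the tag name DP4. The helper names parse_info and dp4_counts and the example string are hypothetical.

```python
def parse_info(info):
    """Turn a semicolon-separated INFO string into a dict (illustrative only)."""
    tags = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # An entry without "=" is treated as a flag that is simply present.
        tags[key] = value if sep else True
    return tags


def dp4_counts(tags):
    """Unpack the int[4] base counts: ref-forward, ref-reverse, alt-forward, alt-reverse."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in tags["DP4"].split(","))
    return {"ref_fwd": ref_fwd, "ref_rev": ref_rev,
            "alt_fwd": alt_fwd, "alt_rev": alt_rev}


# Made-up INFO string, used only to exercise the helpers.
tags = parse_info("DP=20;DP4=8,6,3,3;MQ=40")
counts = dp4_counts(tags)

# Fraction of high-quality bases that support the alternate allele.
alt_fraction = (counts["alt_fwd"] + counts["alt_rev"]) / sum(counts.values())
print(tags["DP"], counts, alt_fraction)
```

Values are left as strings except where a tag's type is stated in the table; a caller would convert each tag according to the type given in parentheses.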

The full VCF/BCF file format specification
Wigginton JE et al., PMID:15789306
August 2013 htslib