.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
.TH DINDEL "1" "March 2016" "" "User Commands"
.SH NAME
dindel \- finds insertions and deletions from short nucleotide sequences
.SH DESCRIPTION
.SS "[Required]:"
.TP
\fB\-\-ref\fR arg
fasta reference sequence (should be indexed with .fai file)
.TP
\fB\-\-outputFile\fR arg
file\-prefix for output results
.SS "[Required] Program option:"
.TP
\fB\-\-analysis\fR arg (=indels)
Analysis type:
.IP
getCIGARindels: extract indels from CIGARs of mapped reads, and infer library insert size distributions
.IP
indels: infer indels
.IP
realignCandidates: realign/reposition candidates in candidate file
.SS "[Required] BAM input. Choose one of the following:"
.TP
\fB\-\-bamFile\fR arg
read alignment file (should be indexed)
.TP
\fB\-\-bamFiles\fR arg
file containing filepaths for BAMs to be jointly analysed (not possible for \fB\-\-analysis\fR=\fIindels\fR)
.PP
[Required for analysis == getCIGARindels]:
Region to be considered for extraction of candidate indels.
.TP
\fB\-\-region\fR arg
region to be analysed in format start\-end, e.g. 1000\-2000
.TP
\fB\-\-tid\fR arg
target sequence (e.g. 'X')
.SS "[Required for analysis == indels]:"
.TP
\fB\-\-varFile\fR arg
file with candidate variants to be tested
.TP
\fB\-\-varFileIsOneBased\fR
coordinates in varFile are one\-based
.SS "Output options:"
.TP
\fB\-\-outputRealignedBAM\fR
output BAM file with realigned reads
.TP
\fB\-\-processRealignedBAM\fR arg
ABSOLUTE path to script to process realigned BAM file
.TP
\fB\-\-quiet\fR
quiet output
.SS "parameters for analysis==indels option:"
.TP
\fB\-\-doDiploid\fR
analyze data assuming a diploid sequence
.TP
\fB\-\-doPooled\fR
estimate haplotype frequencies using Bayesian EM algorithm. May be applied to single individual and pools.
.SS "General algorithm parameters:"
.TP
\fB\-\-faster\fR
use faster but less accurate ungapped read\-haplotype alignment model
.TP
\fB\-\-filterHaplotypes\fR
prefilter haplotypes based on coverage
.TP
\fB\-\-flankRefSeq\fR arg (=2)
number of bases of reference sequence of indel region
.TP
\fB\-\-flankMaxMismatch\fR arg (=2)
max number of mismatches in indel region
.TP
\fB\-\-priorSNP\fR arg (=0.001)
prior probability of a SNP site
.TP
\fB\-\-priorIndel\fR arg (=0.0001)
prior probability of a detected indel not being a sequencing error
.TP
\fB\-\-width\fR arg (=60)
number of bases to left and right of indel
.TP
\fB\-\-maxHap\fR arg (=8)
maximum number of haplotypes in likelihood computation
.TP
\fB\-\-maxRead\fR arg (=10000)
maximum number of reads in likelihood computation
.TP
\fB\-\-mapQualThreshold\fR arg (=0.99)
lower limit for read mapping quality
.TP
\fB\-\-capMapQualThreshold\fR arg (=100)
upper limit for read mapping quality in observationmodel_old (phred units)
.TP
\fB\-\-capMapQualFast\fR arg (=45)
cap mapping quality in alignment using fast ungapped method
.IP
(WARNING: setting it too high (>50) might result in significant overcalling!)
.TP
\fB\-\-skipMaxHap\fR arg (=200)
skip computation if number of haplotypes exceeds this number
.TP
\fB\-\-minReadOverlap\fR arg (=20)
minimum overlap between read and haplotype
.TP
\fB\-\-maxReadLength\fR arg (=500)
maximum length of reads
.TP
\fB\-\-minCount\fR arg (=1)
minimum number of WS observations of indel
.TP
\fB\-\-maxHapReadProd\fR arg (=10000000)
skip if product of number of reads and haplotypes exceeds this value
.TP
\fB\-\-changeINStoN\fR
change sequence of inserted sequence to \&'N', so that no penalty is incurred if a read mismatches the inserted sequence
.SS "parameters for \-\-doPooled option:"
.TP
\fB\-\-bayesa0\fR arg (=0.001)
Dirichlet a0 parameter of the haplotype frequency prior
.TP
\fB\-\-bayesType\fR arg (=singlevariant)
Bayesian EM program type (all or singlevariant or priorpersite)
.SS "General algorithm filtering options:"
.TP
\fB\-\-checkAllCIGARs\fR arg (=1)
include all indels at the position of the call site
.TP
\fB\-\-filterReadAux\fR arg
match string for exclusion of reads based on auxiliary information
.SS "Observation model parameters:"
.TP
\fB\-\-pError\fR arg (=0.0005)
probability of a read indel
.TP
\fB\-\-pMut\fR arg (=1e\-05)
probability of a mutation in the read
.TP
\fB\-\-maxLengthIndel\fR arg (=5)
maximum length of a \fIsequencing error\fR indel in read [not for \fB\-\-faster\fR option]
.SS "Library options:"
.TP
\fB\-\-libFile\fR arg
file with library insert histograms (as generated by \fB\-\-analysis\fR getCIGARindels)
.SS "Misc results analysis options:"
.TP
\fB\-\-compareReadHap\fR
compare likelihood differences in reads against haplotypes
.TP
\fB\-\-compareReadHapThreshold\fR arg (=0.5)
difference threshold for viewing
.TP
\fB\-\-showEmpirical\fR
show empirical distribution over nucleotides
.TP
\fB\-\-showCandHap\fR
show candidate haplotypes for fast method
.TP
\fB\-\-showHapAlignments\fR
show for each haplotype which reads map to it
.TP
\fB\-\-showReads\fR
show reads
.TP
\fB\-\-inferenceMethod\fR
arg (=empirical)
inference method
.TP
\fB\-\-opl\fR
output likelihoods for every read and haplotype
.SH "SEE ALSO"
The full documentation for
.B dindel
is referenced at https://sites.google.com/site/keesalbers/soft/dindel
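.SH EXAMPLES
The invocations below are an illustrative sketch assembled only from the options documented on this page; they have not been checked against a particular
.B dindel
build. All file names (ref.fa, sample.bam, sample.out, candidates.txt, libraries.txt, sample.indels) are hypothetical placeholders, and candidates.txt is assumed to already be in the format expected by \fB\-\-varFile\fR.
.PP
Extract candidate indels from positions 1000\-2000 of target sequence X:
.PP
.nf
# hypothetical input and output names
dindel \-\-analysis getCIGARindels \-\-ref ref.fa \-\-bamFile sample.bam \e
       \-\-tid X \-\-region 1000\-2000 \-\-outputFile sample.out
.fi
.PP
Test the candidate variants under the diploid model, supplying library insert histograms of the kind produced by the getCIGARindels analysis:
.PP
.nf
# hypothetical input and output names
dindel \-\-analysis indels \-\-doDiploid \-\-ref ref.fa \-\-bamFile sample.bam \e
       \-\-varFile candidates.txt \-\-libFile libraries.txt \e
       \-\-outputFile sample.indels
.fi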