METABAT1(1) User Commands METABAT1(1)

NAME

metabat1 - MetaBAT: Metagenome Binning based on Abundance and Tetranucleotide frequency (version 1)

DESCRIPTION

MetaBAT: Metagenome Binning based on Abundance and Tetranucleotide frequency (version 1) by Don Kang (ddkang@lbl.gov), Jeff Froula, Rob Egan, and Zhong Wang (zhongwang@lbl.gov)

OPTIONS

Produce help message
Contigs in (gzipped) fasta file format [Mandatory]
Base file name for each bin. The default output is fasta format. Use -l option to output only contig names [Mandatory]
A file having mean and variance of base coverage depth (tab delimited; the first column should be contig names, and the first row will be considered as the header and be skipped) [Optional]
Use when a coverage file without variance (from third-party tools) is supplied instead of the abdFile produced by jgi_summarize_bam_contig_depths
A file having paired reads mapping information. Use it to increase sensitivity. (tab delimited; should have 3 columns: contig index (the file should be ordered by this column), its mate contig index, and supporting mean read coverage. The first row will be considered as the header and be skipped) [Optional]
Probability cutoff for bin seeding. It mainly controls the number of potential bins and their specificity. The higher the value, the more (and more specific) bins are produced. (Percentage; Should be between 0 and 100)
Probability cutoff for secondary neighbors. It supports p1 and should be close to p1. (Percentage; Should be between 0 and 100)
Minimum probability for binning consideration. It controls sensitivity. Usually it should be >= 75. (Percentage; Should be between 0 and 100)
Minimum proportion of already binned neighbors for a contig's membership inference. It controls specificity. Usually it would be <= 50. (Percentage; Should be between 0 and 100)
For greater sensitivity, especially in a simple community. It is the shortcut for --p1 90 --p2 85 --pB 20 --minProb 75 --minBinned 20 --minCorr 90
For better sensitivity [default]. It is the shortcut for --p1 90 --p2 90 --pB 20 --minProb 80 --minBinned 40 --minCorr 92
For better specificity. Different from --sensitive when using correlation binning or ensemble binning. It is the shortcut for --p1 90 --p2 90 --pB 30 --minProb 80 --minBinned 40 --minCorr 96
For greater specificity. No correlation binning for short contig recruiting. It is the shortcut for --p1 90 --p2 90 --pB 40 --minProb 80 --minBinned 40
For the best specificity. It is the shortcut for --p1 95 --p2 90 --pB 50 --minProb 80 --minBinned 20
Minimum Pearson correlation coefficient for binning missed contigs to increase sensitivity (helpful when there are many samples). Should be very high (>=90) to reduce contamination. (Percentage; Should be between 0 and 100; 0 disables)
Minimum number of samples required for correlation-based recruiting
Minimum mean coverage of a contig to consider for abundance distance calculation in each library
Minimum total mean coverage of a contig (sum of all libraries) to consider for abundance distance calculation
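
The five preset shortcuts above differ only in a handful of thresholds, so they can be compared side by side as a lookup table. The following is a minimal Python sketch, not part of MetaBAT: the preset labels and the `expand_preset` helper are hypothetical names chosen for illustration, while the option names and values are copied from the shortcut descriptions above.

```python
# Threshold values copied from the preset descriptions above.
# The dictionary keys are hypothetical labels, not MetaBAT flag names.
PRESETS = {
    "greater_sensitivity": {"p1": 90, "p2": 85, "pB": 20, "minProb": 75,
                            "minBinned": 20, "minCorr": 90},
    "better_sensitivity": {"p1": 90, "p2": 90, "pB": 20, "minProb": 80,
                           "minBinned": 40, "minCorr": 92},
    "better_specificity": {"p1": 90, "p2": 90, "pB": 30, "minProb": 80,
                           "minBinned": 40, "minCorr": 96},
    # The two strictest presets list no --minCorr value in their descriptions.
    "greater_specificity": {"p1": 90, "p2": 90, "pB": 40, "minProb": 80,
                            "minBinned": 40},
    "best_specificity": {"p1": 95, "p2": 90, "pB": 50, "minProb": 80,
                         "minBinned": 20},
}


def expand_preset(label):
    """Return the explicit option list that a preset label stands for."""
    args = []
    for name, value in PRESETS[label].items():
        args += ["--" + name, str(value)]
    return args


print(" ".join(expand_preset("best_specificity")))
```

Laying the presets out this way makes it easy to see that they differ in p1, p2, pB, minProb, minBinned, and minCorr, with pB rising steadily toward the stricter presets.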

-s [ --minClsSize ] arg (=200000) Minimum size of a bin to be considered as the output
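
As a rough illustration of this size cutoff, the sketch below drops bins whose total length falls under the threshold. It assumes that the "size of a bin" means the summed length of its contigs in bases, which this page does not state explicitly; `filter_bins` and the sample data are hypothetical.

```python
MIN_CLS_SIZE = 200000  # default shown for -s / --minClsSize


def filter_bins(bins, min_cls_size=MIN_CLS_SIZE):
    """Keep bins whose summed contig length reaches min_cls_size.

    bins maps a bin name to a list of contig lengths
    (assumed here to be measured in bases).
    """
    return {name: lengths for name, lengths in bins.items()
            if sum(lengths) >= min_cls_size}


# Illustrative data only: bin.1 totals 210000, bin.2 totals 140000.
example = {"bin.1": [150000, 60000], "bin.2": [90000, 50000]}
print(sorted(filter_bins(example)))
```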

Minimum size of a contig to be considered for binning (should be >=1500; ideally >=2500). If # of samples >= minSamples, small contigs (>=1000) will be given a chance to be recruited to existing bins by default.
Minimum size of a contig to be considered for recruiting by Pearson correlation coefficients (activated only if # of samples >= minSamples; disabled when minContigByCorr > minContig)
Number of threads to use (0: use all cores)
Percentage cutoff for merging fuzzy contigs
Binning with fuzziness which assigns multiple memberships of a contig to bins (activated only with --pairFile at the moment)
Output only sequence labels as a list in a column without sequences
If set, then every sample that falls below the minCV will be used in an aggregate sample
Ignore any contigs where variance / mean exceeds this ratio (0 disables)
File to save (or load if exists) TNF matrix for each contig in input
File to save (or load if exists) distance graph at lowest probability cutoff
Save cluster memberships as a matrix format
Generate [outFile].unbinned.fa file for unbinned contigs
No bin output. Usually combined with --saveCls to check only contig memberships
Number of bootstrap iterations for ensemble binning (recommended to be >=20)
Proportion of shared membership in bootstrapping. Major control for sensitivity/specificity. The higher the value, the more specific. (Percentage; Should be between 0 and 100)
Random seed for reproducibility in ensemble binning, though results might still differ slightly. (0: use random seed)
Keep the intermediate files for later usage
Debug output
Verbose output
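
The interaction between the minimum contig size, the correlation-recruiting size, and the sample count described above can be hard to follow, so here is a minimal Python sketch of one reading of those rules. The function name, its return labels, and the example values are hypothetical; the actual MetaBAT logic may differ in detail.

```python
def contig_role(length, n_samples, min_contig, min_contig_by_corr, min_samples):
    """Classify a contig as 'bin', 'recruit', or 'skip' by its length."""
    if length >= min_contig:
        return "bin"  # long enough for regular binning
    # Correlation-based recruiting needs enough samples and is disabled
    # when min_contig_by_corr is larger than min_contig.
    recruiting_on = (n_samples >= min_samples
                     and min_contig_by_corr <= min_contig)
    if recruiting_on and length >= min_contig_by_corr:
        return "recruit"  # may be recruited to an existing bin
    return "skip"


# Example values are illustrative only.
print(contig_role(3000, 12, 2500, 1000, 10))
print(contig_role(1200, 12, 2500, 1000, 10))
print(contig_role(1200, 3, 2500, 1000, 10))
```

Under this reading, a short contig is never used to seed bins; it can only join an existing bin, and only when enough samples are available for the correlation to be meaningful.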

AUTHOR


This manpage was written by Andreas Tille for the Debian distribution and
can be used for any other usage of the program.

May 2020 metabat1 2.15