
VCF2GENOMICSDB_INIT(1) User Commands VCF2GENOMICSDB_INIT(1)

NAME

vcf2genomicsdb_init - workspace initializer for GenomicsDB

SYNOPSIS

vcf2genomicsdb_init [options]

OPTIONS

--help, -h
Print a usage message summarizing options available and exit
--workspace=<GenomicsDB workspace URI>, -w <GenomicsDB workspace URI>
If the workspace does not exist, it is created first. Exits if the workspace exists and the command is invoked without the --overwrite-workspace option
--overwrite-workspace, -o
Allow for workspace json artifacts to be overwritten
--sample-list=<sample list>, -s <sample list file>
Specify sample URIs for import, one line per sample path
--samples-dir=<folder to samples>, -S <folder to samples>
Specify the folder URI containing samples. Only compressed vcf.gz/bcf.gz samples are considered
--interval-list=<genomic interval list>, -i <genomic interval list file>
Optional. Create array partitions from the intervals in the interval list, one line per interval. The default is to partition by chromosome/contig. Overrides --number-of-array-partitions and --size-of-array-partitions
--number-of-array-partitions=<number>, -n <number>
Optional, suggested number of array partitions. Usually, the partitioning is per contig. If this is 0, however, only a single array is created for the entire genomic space, and this overrides all other partition-specific command arguments
--size-of-array-partitions=<size>, -z <size>
Optional, suggested size of array partitions. Overrides --number-of-array-partitions
--merge-small-contigs, -m
Optional, default is false. If specified, any contig smaller than ~1M will be merged into scaffolds
--include-fields=<fields>, -f <fields>
Optional. Include only the fields (comma-separated) listed in this argument while generating the vidmap. The default is to include all fields found in the VCF headers
--template-loader-json=<template file>, -t <template file>
Optional, specify a template loader json file to use as the basis for the generated loader json files
--append-samples, -a
Optional. If specified, callsets will be appended with the new samples and lb_row_idx set to the new starting row. Note that the workspace and the vidmap/callset/loader jsons should already exist, and that the --interval-list, --number-of-array-partitions, --size-of-array-partitions and --merge-small-contigs options are ignored with --append-samples
--verbose, -v
Allow verbose messages to be logged
--version
Print version and exit
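
EXAMPLES

The following is a minimal sketch of how the options above might be combined. It is not taken from the GenomicsDB distribution. All paths and sample file names are hypothetical placeholders, and the script only assembles and prints the command so that it can be reviewed before being run. It writes a sample list with one sample path per line, as --sample-list expects, and requests a single array for the entire genomic space with --number-of-array-partitions=0.

```shell
#!/bin/sh
# Sketch only: every path below is a hypothetical placeholder.
workdir=$(mktemp -d)
sample_list="$workdir/samples.list"

# --sample-list expects one sample path (URI) per line.
printf '%s\n' \
    "/data/vcfs/sample1.vcf.gz" \
    "/data/vcfs/sample2.vcf.gz" > "$sample_list"

# Assemble the invocation from the documented options.
# A value of 0 for --number-of-array-partitions asks for a single
# array covering the entire genomic space.
set -- vcf2genomicsdb_init \
    --workspace="$workdir/ws" \
    --sample-list="$sample_list" \
    --number-of-array-partitions=0 \
    --verbose

# Print the assembled command for review; to execute it, run: "$@"
printf '%s\n' "$*"
```

To add further samples to a workspace initialized this way, the same command could presumably be reissued with a new sample list and --append-samples. In that case the partitioning options are ignored, as noted under --append-samples.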
July 2022 vcf2genomicsdb_init