
GT_MPI_GATHER(1) User Commands GT_MPI_GATHER(1)

NAME

gt_mpi_gather - MPI gatherer for GenomicsDB

SYNOPSIS

gt_mpi_gather [options]

OPTIONS

--help, -h
Print a usage message summarizing options available and exit
--json-config=<query json file>, -j <query json file>
The JSON file can specify workspace, array, query_column_ranges, query_row_ranges, vid_mapping_file, callset_mapping_file, query_attributes, query_filter, reference_genome, etc. as fields, e.g.
{
  "workspace" : "/tmp/ws",
  "array" : "t0_1_2",
  "query_column_ranges" : [ [ [0, 100 ], 500 ] ],
  "query_row_ranges" : [ [ [0, 2 ] ] ],
  "vid_mapping_file" : "/tests/inputs/vid.json",
  "callset_mapping_file": "/tests/inputs/callset_mapping.json",
  "query_attributes" : [ "REF", "ALT", "BaseQRankSum", "MQ", "MQ0", "ClippingRankSum", "MQRankSum", "ReadPosRankSum", "DP", "GT", "GQ", "SB", "AD", "PL", "DP_FORMAT", "MIN_DP" ]
}
--loader-json-config=<loader json file>, -l <loader json file>
Optional, if vid_mapping_file and callset_mapping_file fields are specified in the query json file
--workspace=<workspace dir>, -w <GenomicsDB workspace dir>
Optional, if workspace is specified in any of the json config files
--array=<array dir>, -A <GenomicsDB array dir>
Optional, if array is specified in any of the json config files
--print-calls
Optional, prints VariantCalls in a JSON format
--print-csv
Optional, outputs CSV with the fields and the order of CSV lines determined by the query attributes
--produce-Broad-GVCF
Optional, produces combined gVCF from the GenomicsDB data constrained by the query configuration
--output-format=<output_format>, -O <output_format>
Output format can be one of the following strings: "z[0-9]" (compressed VCF), "b[0-9]" (compressed BCF) or "bu" (uncompressed BCF). Default is uncompressed VCF if not specified.
--produce-histogram
Optional
--produce-interesting-positions
Optional
--version
Print version and exit
If none of the print/produce arguments are specified, the tool prints all the Variants constrained by the query configuration in a JSON format
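A query JSON file like the one shown under --json-config can also be generated programmatically. The following is a minimal Python sketch, not part of GenomicsDB: the helper names build_query_config and write_query_config are hypothetical, only a subset of the fields listed above is covered, and no validation of the values is attempted.

```python
import json
import tempfile
from pathlib import Path


def build_query_config(workspace, array, column_ranges, row_ranges,
                       attributes, vid_mapping_file=None,
                       callset_mapping_file=None):
    """Return a dict that uses the field names listed under --json-config.

    Hypothetical helper for illustration; it does not check the values.
    """
    config = {
        "workspace": workspace,
        "array": array,
        "query_column_ranges": column_ranges,
        "query_row_ranges": row_ranges,
        "query_attributes": list(attributes),
    }
    # The mapping files are left out when they are not given here,
    # e.g. when they are supplied through other configuration.
    if vid_mapping_file is not None:
        config["vid_mapping_file"] = vid_mapping_file
    if callset_mapping_file is not None:
        config["callset_mapping_file"] = callset_mapping_file
    return config


def write_query_config(config, path):
    """Serialize the config dict as JSON and return the written path."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return path


# Values taken from the example in this page; shortened attribute list.
example_config = build_query_config(
    workspace="/tmp/ws",
    array="t0_1_2",
    column_ranges=[[[0, 100], 500]],
    row_ranges=[[[0, 2]]],
    attributes=["REF", "ALT", "GT"],
    vid_mapping_file="/tests/inputs/vid.json",
    callset_mapping_file="/tests/inputs/callset_mapping.json",
)

# Write to a temporary directory and read the file back as a sanity check.
with tempfile.TemporaryDirectory() as tmp_dir:
    query_path = write_query_config(example_config, Path(tmp_dir) / "query.json")
    reloaded = json.loads(query_path.read_text())
```

The resulting file would then be passed to the tool with -j <query json file>.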
Parallel Querying
MPI can be used for parallel querying, e.g.
mpirun -n <num_processes> -hostfile <hostfile> ./bin/gt_mpi_gather -j <query.json> -l <loader.json> [<other_args>]
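When the parallel invocation is driven from a script, it can help to assemble the argument list in one place. The sketch below only builds and prints the command line; it does not run mpirun. The function name build_mpirun_command and the file names hosts.txt, query.json and loader.json are placeholders assumed for illustration, and the flags used are only those that appear in this page.

```python
import shlex


def build_mpirun_command(num_processes, hostfile, query_json,
                         loader_json=None, other_args=(),
                         binary="./bin/gt_mpi_gather"):
    """Assemble the mpirun argument list in the shape shown in this page.

    Hypothetical helper; it only builds the list and executes nothing.
    """
    command = [
        "mpirun", "-n", str(num_processes),
        "-hostfile", hostfile,
        binary, "-j", query_json,
    ]
    # -l is optional when the query JSON already names the mapping files.
    if loader_json is not None:
        command += ["-l", loader_json]
    command += list(other_args)
    return command


command = build_mpirun_command(
    num_processes=4,
    hostfile="hosts.txt",
    query_json="query.json",
    loader_json="loader.json",
    other_args=["--print-calls"],
)

# shlex.join quotes arguments where needed so the line can be pasted into a shell.
print(shlex.join(command))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to subprocess.run.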
July 2022 gt_mpi_gather