.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.47.16.
.TH DEFINECLONES.PY "1" "October 2020" "DefineClones.py 1.0.1" "User Commands"
.SH NAME
DefineClones.py \- Repertoire clonal assignment toolkit (Python 3)
.SH DESCRIPTION
usage: DefineClones.py [\-\-version] [\-h] \fB\-d\fR DB_FILES [DB_FILES ...]
.TP
[\-o OUT_FILES [OUT_FILES ...]] [\-\-outdir OUT_DIR]
[\-\-outname OUT_NAME] [\-\-log LOG_FILE] [\-\-failed]
[\-\-format {airr,changeo}] [\-\-nproc NPROC] [\-\-sf SEQ_FIELD]
[\-\-vf V_FIELD] [\-\-jf J_FIELD]
[\-\-gf GROUP_FIELDS [GROUP_FIELDS ...]]
[\-\-mode {allele,gene}] [\-\-act {first,set}]
[\-\-model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}]
[\-\-dist DISTANCE] [\-\-norm {len,mut,none}] [\-\-sym {avg,min}]
[\-\-link {single,average,complete}] [\-\-maxmiss MAX_MISSING]
.PP
Assign Ig sequences into clones.
.SS "help:"
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.SS "standard arguments:"
.TP
\fB\-d\fR DB_FILES [DB_FILES ...]
A list of tab delimited database files. (default: None)
.TP
\fB\-o\fR OUT_FILES [OUT_FILES ...]
Explicit output file name(s). Note, this argument cannot be used with the
\fB\-\-failed\fR, \fB\-\-outdir\fR, or \fB\-\-outname\fR arguments.
If unspecified, then the output filename will be based on the
input filename(s). (default: None)
.TP
\fB\-\-outdir\fR OUT_DIR
Changes the output directory to the location specified.
The input file directory is used if this is not specified.
(default: None)
.TP
\fB\-\-outname\fR OUT_NAME
Changes the prefix of the successfully processed output file
to the string specified. May not be specified with multiple
input files. (default: None)
.TP
\fB\-\-log\fR LOG_FILE
Specify to write verbose logging to a file. May not be specified
with multiple input files. (default: None)
.TP
\fB\-\-failed\fR
If specified, create files containing records that fail processing.
(default: False)
.TP
\fB\-\-format\fR {airr,changeo}
Specify input and output format. (default: airr)
.TP
\fB\-\-nproc\fR NPROC
The number of simultaneous computational processes to execute
(CPU cores to utilize). (default: 8)
.SS "cloning arguments:"
.TP
\fB\-\-sf\fR SEQ_FIELD
Field to be used to calculate distance between records.
Defaults to junction (airr) or JUNCTION (changeo). (default: None)
.TP
\fB\-\-vf\fR V_FIELD
Field containing the germline V segment call.
Defaults to v_call (airr) or V_CALL (changeo). (default: None)
.TP
\fB\-\-jf\fR J_FIELD
Field containing the germline J segment call.
Defaults to j_call (airr) or J_CALL (changeo). (default: None)
.TP
\fB\-\-gf\fR GROUP_FIELDS [GROUP_FIELDS ...]
Additional fields to use for grouping clones aside from V, J and
junction length. (default: None)
.TP
\fB\-\-mode\fR {allele,gene}
Specifies whether to use the V(D)J allele or gene for initial
grouping. (default: gene)
.TP
\fB\-\-act\fR {first,set}
Specifies how to handle multiple V(D)J assignments for initial
grouping. The "first" action will use only the first gene listed.
The "set" action will use all gene assignments and construct a
larger gene grouping composed of any sequences sharing an assignment
or linked to another sequence by a common assignment
(similar to single\-linkage). (default: set)
.TP
\fB\-\-model\fR {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}
Specifies which substitution model to use for calculating distance
between sequences. The "ham" model is nucleotide Hamming distance
and "aa" is amino acid Hamming distance. The "hh_s1f" and "hh_s5f"
models are human specific single nucleotide and 5\-mer content
models, respectively, from Yaari et al, 2013. The "mk_rs1nf" and
"mk_rs5nf" models are mouse specific single nucleotide and 5\-mer
content models, respectively, from Cui et al, 2016.
The "m1n_compat" and "hs1f_compat" models are deprecated models
provided for backwards compatibility with the "m1n" and "hs1f"
models in Change\-O v0.3.3 and SHazaM v0.1.4. Both 5\-mer models
should be considered experimental. (default: ham)
.TP
\fB\-\-dist\fR DISTANCE
The distance threshold for clonal grouping. (default: 0.0)
.TP
\fB\-\-norm\fR {len,mut,none}
Specifies how to normalize distances. One of none (do not
normalize), len (normalize by length), or mut (normalize by
number of mutations between sequences). (default: len)
.TP
\fB\-\-sym\fR {avg,min}
Specifies how to combine asymmetric distances. One of avg
(average of A\->B and B\->A) or min (minimum of A\->B and B\->A).
(default: avg)
.TP
\fB\-\-link\fR {single,average,complete}
Type of linkage to use for hierarchical clustering.
(default: single)
.TP
\fB\-\-maxmiss\fR MAX_MISSING
The maximum number of non\-ACGT characters (gaps or Ns) to permit
in the junction sequence before excluding the record from clonal
assignment. Note, under single linkage, non\-informative positions
can create artifactual links between unrelated sequences.
Use with caution. (default: 0)
.SS "output files:"
.TP
clone\-pass
database with assigned clonal group numbers.
.TP
clone\-fail
database with records failing clonal grouping.
.SS "required fields:"
.IP
sequence_id, v_call, j_call, junction
.SS "output fields:"
.IP
clone_id
.SH AUTHOR
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.
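.SH EXAMPLES
.PP
The interaction of \fB\-\-model\fR ham, \fB\-\-norm\fR len,
\fB\-\-dist\fR and \fB\-\-link\fR single can be illustrated with a
minimal Python sketch. It is not the DefineClones.py implementation.
The function names normalized_hamming and single_linkage are
hypothetical, and treating a pair as linked when its distance is at or
below the threshold is an assumption. The sketch also assumes that the
junctions have already been grouped by V gene, J gene and junction
length, so all sequences have the same nonzero length.
.PP
.nf
```python
from itertools import combinations


def normalized_hamming(a, b):
    """Hamming distance between equal length strings, divided by length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def single_linkage(seqs, threshold):
    """Group sequences by linking every pair at or below the threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        # Assumption: a pair is linked when distance <= threshold.
        if normalized_hamming(seqs[i], seqs[j]) <= threshold:
            parent[find(j)] = find(i)

    # Number the groups 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1)
            for i in range(len(seqs))]
```
.fi
.PP
For the junctions TGTGCAAGG, TGTGCAAGA, TGTGCTAGA and AAACCCGGG,
single_linkage with a threshold of 0.15 returns [1, 1, 1, 2]. The first
and third junctions differ at 2 of 9 positions (about 0.22), which is
above the threshold, but each is within 1 of 9 (about 0.11) of the
second, so single linkage chains all three together. With a threshold of
0.0 the same call returns [1, 2, 3, 4], because only identical junctions
are grouped.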