.\" Man page generated from reStructuredText.
.
.TH "PYCHOPPER" "1" "Oct 26, 2020" "2.5.0" "package documentation"
.SH NAME
pychopper \- package documentation
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.SH COMMAND LINE TOOLS
.SS Command line tools
.SS cdna_classifier
.sp
Tool to identify, orient and rescue full\-length cDNA reads.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
usage: cdna_classifier [\-h] [\-b primers] [\-g phmm_file] [\-c config_file]
                       [\-k kit] [\-q cutoff] [\-Q min_qual] [\-z min_len]
                       [\-r report_pdf] [\-u unclass_output]
                       [\-l len_fail_output] [\-w rescue_output]
                       [\-S stats_output] [\-K qc_fail_output]
                       [\-Y autotune_nr] [\-L autotune_samples]
                       [\-A scores_output] [\-m method] [\-x rescue] [\-p]
                       [\-t threads] [\-B batch_size] [\-D read stats]
                       input_fastx output_fastx
.ft P
.fi
.UNINDENT
.UNINDENT
.SS Positional Arguments
.INDENT 0.0
.TP
.B input_fastx
Input file.
.TP
.B output_fastx
Output file.
.UNINDENT
.SS Named Arguments
.INDENT 0.0
.TP
.B \-b
Primers fasta.
.TP
.B \-g
File with custom profile HMMs (None).
.TP
.B \-c
File to specify primer configurations for each direction (None).
.TP
.B \-k
Use primer sequences from this kit (PCS109).
.sp
Default: "PCS109"
.TP
.B \-q
Cutoff parameter (autotuned).
.TP
.B \-Q
Minimum mean base quality (7.0).
.sp
Default: 7.0
.TP
.B \-z
Minimum segment length (50).
.sp
Default: 50
.TP
.B \-r
Report PDF (cdna_classifier_report.pdf).
.sp
Default: "cdna_classifier_report.pdf"
.TP
.B \-u
Write unclassified reads to this file.
.TP
.B \-l
Write fragments failing the length filter to this file.
.TP
.B \-w
Write rescued reads to this file.
.TP
.B \-S
Write statistics to this file.
.sp
Default: "cdna_classifier_report.tsv"
.TP
.B \-K
Write reads failing mean quality filter to this file.
.TP
.B \-Y
Approximate number of reads used for tuning the cutoff parameter (10000).
.sp
Default: 10000
.TP
.B \-L
Number of samples taken when tuning cutoff parameter (30).
.sp
Default: 30
.TP
.B \-A
Write alignment scores to this BED file.
.TP
.B \-m
Detection method: phmm or edlib (phmm).
.sp
Default: "phmm"
.TP
.B \-x
Protocol\-specific read rescue: DCS109 (None).
.TP
.B \-p
Keep primers, but trim the rest.
.sp
Default: False
.TP
.B \-t
Number of threads to use (8).
.sp
Default: 8
.TP
.B \-B
Maximum number of reads processed in each batch (1000000).
.sp
Default: 1000000
.TP
.B \-D
Tab separated file with per\-read stats (None).
.UNINDENT
.SH FULL API REFERENCE
.SS pychopper
.SS pychopper package
.SS Subpackages
.SS pychopper.phmm_data package
.SS Module contents
.SS pychopper.primer_data package
.SS Module contents
.SS pychopper.tests package
.SS Submodules
.SS pychopper.tests.test_detector module
.INDENT 0.0
.TP
.B class pychopper.tests.test_detector.TestDetector(methodName=\(aqrunTest\(aq)
Bases: \fBunittest.case.TestCase\fP
.sp
Create an instance of the class that will use the named test method when executed.
Raises a ValueError if the instance does not have a method with the specified name.
.INDENT 7.0
.TP
.B testPairAlign()
.UNINDENT
.INDENT 7.0
.TP
.B testScoreCutoff()
.UNINDENT
.UNINDENT
.SS pychopper.tests.test_regression_simple module
.INDENT 0.0
.TP
.B class pychopper.tests.test_regression_simple.TestIntegration(methodName=\(aqrunTest\(aq)
Bases: \fBunittest.case.TestCase\fP
.sp
Create an instance of the class that will use the named test method when executed.
Raises a ValueError if the instance does not have a method with the specified name.
.INDENT 7.0
.TP
.B testIntegration()
Integration test.
.UNINDENT
.UNINDENT
.SS Module contents
.SS Submodules
.SS pychopper.alignment_hits module
.INDENT 0.0
.TP
.B pychopper.alignment_hits.process_hits(hits, max_score)
Process alignment hits by removing overlaps
.UNINDENT
.SS pychopper.chopper module
.INDENT 0.0
.TP
.B pychopper.chopper.analyse_hits(hits, config)
Segment reads based on alignment hits using dynamic programming.
The algorithm is based on the rule that each primer alignment hit can be used only once.
Hence if a segment is included, the next one has to be excluded.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.chopper.chopper_edlib(reads, primers, config, max_ed, cutoff, pool, min_batch)
Segment using the edlib/parasail backend
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.chopper.chopper_phmm(reads, phmm_file, config, cutoff, threads, pool, min_batch)
Segment using the profile HMM backend
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.chopper.segments_to_reads(read, segments, keep_primers)
Convert segments to output reads with annotation
.UNINDENT
.SS pychopper.common_structures module
.INDENT 0.0
.TP
.B class pychopper.common_structures.Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
Bases: \fBtuple\fP
.sp
Create new instance of Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
.INDENT 7.0
.TP
.B Query
Alias for field number 3
.UNINDENT
.INDENT 7.0
.TP
.B QueryEnd
Alias for field number 5
.UNINDENT
.INDENT 7.0
.TP
.B QueryStart
Alias for field number 4
.UNINDENT
.INDENT 7.0
.TP
.B Ref
Alias for field number 0
.UNINDENT
.INDENT 7.0
.TP
.B RefEnd
Alias for field number 2
.UNINDENT
.INDENT 7.0
.TP
.B RefStart
Alias for field number 1
.UNINDENT
.INDENT 7.0
.TP
.B Score
Alias for field number 6
.UNINDENT
.UNINDENT
.INDENT 0.0
.TP
.B class pychopper.common_structures.Segment(Left, Start, End, Right, Strand, Len)
Bases: \fBtuple\fP
.sp
Create new instance of
Segment(Left, Start, End, Right, Strand, Len)
.INDENT 7.0
.TP
.B End
Alias for field number 2
.UNINDENT
.INDENT 7.0
.TP
.B Left
Alias for field number 0
.UNINDENT
.INDENT 7.0
.TP
.B Len
Alias for field number 5
.UNINDENT
.INDENT 7.0
.TP
.B Right
Alias for field number 3
.UNINDENT
.INDENT 7.0
.TP
.B Start
Alias for field number 1
.UNINDENT
.INDENT 7.0
.TP
.B Strand
Alias for field number 4
.UNINDENT
.UNINDENT
.INDENT 0.0
.TP
.B class pychopper.common_structures.Seq(Id, Name, Seq, Qual)
Bases: \fBtuple\fP
.sp
Create new instance of Seq(Id, Name, Seq, Qual)
.INDENT 7.0
.TP
.B Id
Alias for field number 0
.UNINDENT
.INDENT 7.0
.TP
.B Name
Alias for field number 1
.UNINDENT
.INDENT 7.0
.TP
.B Qual
Alias for field number 3
.UNINDENT
.INDENT 7.0
.TP
.B Seq
Alias for field number 2
.UNINDENT
.UNINDENT
.SS pychopper.edlib_backend module
.INDENT 0.0
.TP
.B pychopper.edlib_backend.find_locations(reads, all_primers, max_ed, pool, min_batch)
Find alignment hits of all primers in all reads using the edlib/parasail backend
.UNINDENT
.SS pychopper.hmmer_backend module
.INDENT 0.0
.TP
.B pychopper.hmmer_backend.find_locations(reads, phmm_file, E, pool, min_batch)
Find alignment hits of all primers in all reads using the pHMM/nhmmscan backend
.UNINDENT
.SS pychopper.parasail_backend module
.INDENT 0.0
.TP
.B pychopper.parasail_backend.first_cigar(cigar)
Extract details of the first operation in a cigar string.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.parasail_backend.pair_align(reference, query, query_name, subs_mat, params)
Perform pairwise local alignment using parasail\-python
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.parasail_backend.process_alignment(aln, query, query_name, aln_params)
Process an alignment, extracting score, start and end.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.parasail_backend.refine_locations(read, all_primers, locations, aln_params={\(aqgap_extend\(aq: 1, \(aqgap_open\(aq: 1, \(aqmatch\(aq: 1, \(aqmismatch\(aq: \-2}, subs_mat=)
Refine alignment edges based on local alignment
.UNINDENT
.SS pychopper.report module
.INDENT 0.0
.TP
.B class pychopper.report.Report(pdf)
Bases: \fBobject\fP
.sp
Class for plotting utilities on top of matplotlib.
Plots are saved in the specified file through the PDF backend.
.INDENT 7.0
.TP
.B Parameters
.INDENT 7.0
.IP \(bu 2
\fBself\fP \-\- object.
.IP \(bu 2
\fBpdf\fP \-\- Output pdf.
.UNINDENT
.TP
.B Returns
The report object.
.TP
.B Return type
Report
.UNINDENT
.INDENT 7.0
.TP
.B close()
Close PDF backend. Do not forget to call this at the end of your script or your output will be damaged!
.INDENT 7.0
.TP
.B Parameters
\fBself\fP \-\- object
.TP
.B Returns
None
.TP
.B Return type
object
.UNINDENT
.UNINDENT
.INDENT 7.0
.TP
.B plot_arrays(data_map, title=\(aq\(aq, xlab=\(aq\(aq, ylab=\(aq\(aq, marker=\(aq.\(aq, legend_loc=\(aqbest\(aq, legend=True, vlines=None, vlcolor=\(aqgreen\(aq, vlwitdh=0.5)
Plot multiple pairs of data arrays.
.INDENT 7.0
.TP
.B Parameters
.INDENT 7.0
.IP \(bu 2
\fBself\fP \-\- object.
.IP \(bu 2
\fBdata_map\fP \-\- A dictionary with labels as keys and tuples of data arrays (x,y) as values.
.IP \(bu 2
\fBtitle\fP \-\- Figure title.
.IP \(bu 2
\fBxlab\fP \-\- X axis label.
.IP \(bu 2
\fBylab\fP \-\- Y axis label.
.IP \(bu 2
\fBmarker\fP \-\- Marker passed to the plot function.
.IP \(bu 2
\fBlegend_loc\fP \-\- Location of legend.
.IP \(bu 2
\fBlegend\fP \-\- Plot legend if True.
.IP \(bu 2
\fBvlines\fP \-\- Dictionary with labels and positions of vertical lines to draw.
.IP \(bu 2
\fBvlcolor\fP \-\- Color of vertical lines drawn.
.IP \(bu 2
\fBvlwidth\fP \-\- Width of vertical lines drawn.
.UNINDENT
.TP
.B Returns
None
.TP
.B Return type
object
.UNINDENT
.UNINDENT
.INDENT 7.0
.TP
.B plot_bars_simple(data_map, title=\(aq\(aq, xlab=\(aq\(aq, ylab=\(aq\(aq, alpha=0.6, xticks_rotation=0, auto_limit=False)
Plot simple bar chart from input dictionary.
.INDENT 7.0
.TP
.B Parameters
.INDENT 7.0
.IP \(bu 2
\fBself\fP \-\- object.
.IP \(bu 2
\fBdata_map\fP \-\- A dictionary with labels as keys and data as values.
.IP \(bu 2
\fBtitle\fP \-\- Figure title.
.IP \(bu 2
\fBxlab\fP \-\- X axis label.
.IP \(bu 2
\fBylab\fP \-\- Y axis label.
.IP \(bu 2
\fBalpha\fP \-\- Alpha value.
.IP \(bu 2
\fBxticks_rotation\fP \-\- Rotation value for x tick labels.
.IP \(bu 2
\fBauto_limit\fP \-\- Set y axis limits automatically.
.UNINDENT
.TP
.B Returns
None
.TP
.B Return type
object
.UNINDENT
.UNINDENT
.INDENT 7.0
.TP
.B plot_histograms(data_map, title=\(aq\(aq, xlab=\(aq\(aq, ylab=\(aq\(aq, bins=50, alpha=0.7, legend_loc=\(aqbest\(aq, legend=True, vlines=None)
Plot histograms of multiple data arrays.
.INDENT 7.0
.TP
.B Parameters
.INDENT 7.0
.IP \(bu 2
\fBself\fP \-\- object.
.IP \(bu 2
\fBdata_map\fP \-\- A dictionary with labels as keys and data arrays as values.
.IP \(bu 2
\fBtitle\fP \-\- Figure title.
.IP \(bu 2
\fBxlab\fP \-\- X axis label.
.IP \(bu 2
\fBylab\fP \-\- Y axis label.
.IP \(bu 2
\fBbins\fP \-\- Number of bins.
.IP \(bu 2
\fBalpha\fP \-\- Transparency value for histograms.
.IP \(bu 2
\fBlegend_loc\fP \-\- Location of legend.
.IP \(bu 2
\fBlegend\fP \-\- Plot legend if True.
.IP \(bu 2
\fBvlines\fP \-\- Dictionary with labels and positions of vertical lines to draw.
.UNINDENT
.TP
.B Returns
None
.TP
.B Return type
object
.UNINDENT
.UNINDENT
.INDENT 7.0
.TP
.B save_close()
Utility method to save and close figure.
.UNINDENT
.UNINDENT
.SS pychopper.seq_utils module
.INDENT 0.0
.TP
.B pychopper.seq_utils.base_complement(k)
Return complement of base.
.sp
Performs the substitutions: A<=>T, C<=>G, X=>X for both upper and lower case.
The return value is identical to the argument for all other values.
.INDENT 7.0
.TP
.B Parameters
\fBk\fP \-\- A base.
.TP
.B Returns
Complement of base.
.TP
.B Return type
str
.UNINDENT
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.errs_tab(n)
Generate list of error rates for qualities less than or equal to n.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.get_primers(primers)
Load primers from fasta file
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.get_runid(desc)
Parse out runid from sequence description.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.mean_qual(quals, qround=False, tab=[1.0, 0.7943282347242815, 0.6309573444801932, 0.5011872336272722, 0.3981071705534972, 0.31622776601683794, 0.251188643150958, 0.19952623149688797, 0.15848931924611134, 0.12589254117941673, 0.1, 0.07943282347242814, 0.06309573444801933, 0.05011872336272722, 0.039810717055349734, 0.03162277660168379, 0.025118864315095794, 0.0199526231496888, 0.015848931924611134, 0.012589254117941675, 0.01, 0.007943282347242814, 0.00630957344480193, 0.005011872336272725, 0.003981071705534973, 0.0031622776601683794, 0.0025118864315095794, 0.001995262314968879, 0.001584893192461114, 0.0012589254117941675, 0.001, 0.0007943282347242813, 0.000630957344480193, 0.0005011872336272725, 0.00039810717055349735, 0.00031622776601683794, 0.00025118864315095795, 0.00019952623149688788, 0.00015848931924611142, 0.00012589254117941674, 0.0001, 7.943282347242822e\-05, 6.309573444801929e\-05, 5.011872336272725e\-05, 3.9810717055349695e\-05, 3.1622776601683795e\-05, 2.5118864315095822e\-05, 1.9952623149688786e\-05, 1.584893192461114e\-05, 1.2589254117941661e\-05, 1e\-05, 7.943282347242822e\-06, 6.30957344480193e\-06, 5.011872336272725e\-06, 3.981071705534969e\-06, 3.162277660168379e\-06, 2.5118864315095823e\-06, 1.9952623149688787e\-06, 1.584893192461114e\-06, 1.2589254117941661e\-06, 1e\-06, 7.943282347242822e\-07, 6.30957344480193e\-07, 5.011872336272725e\-07, 3.981071705534969e\-07, 3.162277660168379e\-07, 2.5118864315095823e\-07, 1.9952623149688787e\-07, 1.584893192461114e\-07, 1.2589254117941662e\-07, 1e\-07, 7.943282347242822e\-08, 6.30957344480193e\-08, 5.011872336272725e\-08, 3.981071705534969e\-08, 3.162277660168379e\-08, 2.511886431509582e\-08, 1.9952623149688786e\-08, 1.5848931924611143e\-08, 1.2589254117941661e\-08, 1e\-08, 7.943282347242822e\-09, 6.309573444801943e\-09, 5.011872336272715e\-09, 3.981071705534969e\-09, 3.1622776601683795e\-09, 2.511886431509582e\-09, 1.9952623149688828e\-09, 1.584893192461111e\-09, 1.2589254117941663e\-09, 1e\-09, 7.943282347242822e\-10, 6.309573444801942e\-10, 5.011872336272714e\-10, 3.9810717055349694e\-10, 3.1622776601683795e\-10, 2.511886431509582e\-10, 1.9952623149688828e\-10, 1.584893192461111e\-10, 1.2589254117941662e\-10, 1e\-10, 7.943282347242822e\-11, 6.309573444801942e\-11, 5.011872336272715e\-11, 3.9810717055349695e\-11, 3.1622776601683794e\-11, 2.5118864315095823e\-11, 1.9952623149688828e\-11, 1.5848931924611107e\-11, 1.2589254117941662e\-11, 1e\-11, 7.943282347242821e\-12, 6.309573444801943e\-12, 5.011872336272715e\-12, 3.9810717055349695e\-12, 3.1622776601683794e\-12, 2.5118864315095823e\-12, 1.9952623149688827e\-12, 1.584893192461111e\-12, 1.258925411794166e\-12, 1e\-12, 7.943282347242822e\-13, 6.309573444801942e\-13, 5.011872336272715e\-13, 3.981071705534969e\-13, 3.162277660168379e\-13, 2.511886431509582e\-13, 1.9952623149688827e\-13, 1.584893192461111e\-13])
Calculate average basecall quality of a read.
.sp
Receives the ASCII quality scores of a read and returns the average quality for that read.
First convert the Phred scores to error probabilities, then calculate the average error probability, then convert the average back to the Phred scale.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.random(size=None)
Return random floats in the half\-open interval [0.0, 1.0). Alias for \fIrandom_sample\fP to ease forward\-porting to the new random API.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.readfq(fp, sample=None, min_qual=None, rfq_sup={})
Taken from \fI\%https://github.com/lh3/readfq/blob/master/readfq.py\fP
.sp
Parses large files much faster than Biopython.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.record_size(read, in_format=\(aqfastq\(aq)
Calculate record size.
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.revcomp_seq(seq)
Reverse complement sequence record
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.reverse_complement(seq)
Return reverse complement of a string (base) sequence.
.INDENT 7.0
.TP
.B Parameters
\fBseq\fP \-\- Input sequence.
.TP
.B Returns
Reverse complement of input sequence.
.TP
.B Return type
str
.UNINDENT
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.seq_utils.writefq(r, fh)
Write read to fastq file
.UNINDENT
.SS pychopper.utils module
.INDENT 0.0
.TP
.B pychopper.utils.batch(iterable, size)
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.utils.check_command(cmd)
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.utils.check_min_hmmer_version(major, minor)
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.utils.count_fastq_records(fname, size=128000000)
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.utils.hit2bed(hit, read)
.UNINDENT
.INDENT 0.0
.TP
.B pychopper.utils.parse_config_string(s)
.UNINDENT
.SS Module contents
.SH AUTHOR
ONT Applications Group
.SH COPYRIGHT
2020, Oxford Nanopore Technologies Ltd.
.\" Generated by docutils manpage writer.
.