PYCHOPPER(1) | package documentation | PYCHOPPER(1) |
NAME¶
pychopper - package documentation
COMMAND LINE TOOLS¶
Command line tools¶
cdna_classifier¶
Tool to identify, orient and rescue full-length cDNA reads.
usage: cdna_classifier [-h] [-b primers] [-g phmm_file] [-c config_file]
[-k kit] [-q cutoff] [-Q min_qual] [-z min_len]
[-r report_pdf] [-u unclass_output]
[-l len_fail_output] [-w rescue_output]
[-S stats_output] [-K qc_fail_output] [-Y autotune_nr]
[-L autotune_samples] [-A scores_output] [-m method]
[-x rescue] [-p] [-t threads] [-B batch_size]
[-D read stats]
input_fastx output_fastx
Positional Arguments¶
Named Arguments¶
-b
- Primers fasta.
-g
- File with custom profile HMMs (None).
-c
- File to specify primer configurations for each direction (None).
-k
- Use primer sequences from this kit (PCS109).
Default: "PCS109"
-q
- Cutoff parameter (autotuned).
-Q
- Minimum mean base quality (7.0).
Default: 7.0
-z
- Minimum segment length (50).
Default: 50
-r
- Report PDF (cdna_classifier_report.pdf).
Default: "cdna_classifier_report.pdf"
-u
- Write unclassified reads to this file.
-l
- Write fragments failing the length filter to this file.
-w
- Write rescued reads to this file.
-S
- Write statistics to this file.
Default: "cdna_classifier_report.tsv"
-K
- Write reads failing the mean quality filter to this file.
-Y
- Approximate number of reads used for tuning the cutoff parameter (10000).
Default: 10000
-L
- Number of samples taken when tuning the cutoff parameter (30).
Default: 30
-A
- Write alignment scores to this BED file.
-m
- Detection method: phmm or edlib (phmm).
Default: "phmm"
-x
- Protocol-specific read rescue: DCS109 (None).
-p
- Keep primers, but trim the rest.
Default: False
-t
- Number of threads to use (8).
Default: 8
-B
- Maximum number of reads processed in each batch (1000000).
Default: 1000000
-D
- Tab separated file with per-read stats (None).
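The options above can be combined into a command line from a script. The following is a minimal sketch, not part of pychopper: the helper name `build_cdna_classifier_cmd` and all file names are hypothetical, and only flags listed in the usage synopsis are used.

```python
import shlex


def build_cdna_classifier_cmd(input_fastx, output_fastx, kit="PCS109",
                              threads=8, report_pdf=None, unclassified=None,
                              rescued=None, stats=None):
    """Assemble an argv list for cdna_classifier (hypothetical helper)."""
    cmd = ["cdna_classifier", "-k", kit, "-t", str(threads)]
    # Optional outputs: a flag is added only when a path was supplied.
    optional = [("-r", report_pdf), ("-u", unclassified),
                ("-w", rescued), ("-S", stats)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, value]
    # Positional arguments come last: input first, then output.
    cmd += [input_fastx, output_fastx]
    return cmd


# File names below are placeholders for illustration only.
cmd = build_cdna_classifier_cmd("reads.fastq", "full_length.fastq",
                                unclassified="unclassified.fastq",
                                rescued="rescued.fastq", threads=4)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where cdna_classifier is installed.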
FULL API REFERENCE¶
pychopper¶
pychopper package¶
Subpackages¶
pychopper.phmm_data package¶
Module contents¶
pychopper.primer_data package¶
Module contents¶
pychopper.tests package¶
Submodules¶
pychopper.tests.test_detector module¶
- class pychopper.tests.test_detector.TestDetector(methodName='runTest')
- Bases: unittest.case.TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
pychopper.tests.test_regression_simple module¶
- class pychopper.tests.test_regression_simple.TestIntegration(methodName='runTest')
- Bases: unittest.case.TestCase
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- testIntegration()
- Integration test.
Module contents¶
Submodules¶
pychopper.alignment_hits module¶
- pychopper.alignment_hits.process_hits(hits, max_score)
- Process alignment hits by removing overlaps
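One common way to remove overlapping hits is a greedy pass that keeps the best-scoring hit in each overlapping group. The sketch below illustrates that idea only and is not the package's implementation: it assumes that RefStart and RefEnd are half-open coordinates on the read, that a higher Score is better, and it ignores the max_score argument. The function name `remove_overlaps` and the primer names are illustrative.

```python
from collections import namedtuple

# Same field layout as pychopper.common_structures.Hit.
Hit = namedtuple("Hit", "Ref RefStart RefEnd Query QueryStart QueryEnd Score")


def remove_overlaps(hits):
    """Greedy sketch: keep the best-scoring hits whose intervals do not overlap."""
    kept = []
    # Visit hits from best to worst score (assumption: higher is better).
    for hit in sorted(hits, key=lambda h: h.Score, reverse=True):
        # Two half-open intervals overlap when each starts before the other ends.
        clash = any(hit.RefStart < k.RefEnd and k.RefStart < hit.RefEnd
                    for k in kept)
        if not clash:
            kept.append(hit)
    # Return the survivors in read order.
    return sorted(kept, key=lambda h: h.RefStart)


hits = [
    Hit("read1", 0, 22, "SSP", 0, 22, 30.0),
    Hit("read1", 10, 35, "-VNP", 0, 25, 12.0),   # overlaps the first hit
    Hit("read1", 900, 925, "-VNP", 0, 25, 28.0),
]
kept = remove_overlaps(hits)
```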
pychopper.chopper module¶
- pychopper.chopper.analyse_hits(hits, config)
- Segment reads based on alignment hits using dynamic programming. The algorithm is based on the rule that each primer alignment hit can be used only once. Hence if a segment is included, the next one has to be excluded.
- pychopper.chopper.chopper_edlib(reads, primers, config, max_ed, cutoff, pool, min_batch)
- Segment using the edlib/parasail backend
- pychopper.chopper.chopper_phmm(reads, phmm_file, config, cutoff, threads, pool, min_batch)
- Segment using the profile HMM backend
- pychopper.chopper.segments_to_reads(read, segments, keep_primers)
- Convert segments to output reads with annotation
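The rule described for analyse_hits (each hit used once, so two neighbouring segments cannot both be kept) has the shape of a classic dynamic program over a chain. The sketch below shows that general technique under stated assumptions and is not the package's code: candidate segment i is assumed to lie between hit i and hit i+1, scores are assumed non-negative, and the name `select_segments` is hypothetical.

```python
def select_segments(scores):
    """Pick non-adjacent candidate segments with maximal total score.

    scores[i] is the score of the candidate between hit i and hit i+1.
    Adjacent candidates share a hit, so they cannot both be selected.
    """
    n = len(scores)
    # best[i] = best total achievable using only the first i candidates.
    best = [0] * (n + 1)
    take = [False] * (n + 1)
    for i in range(1, n + 1):
        skip_total = best[i - 1]
        take_total = scores[i - 1] + (best[i - 2] if i >= 2 else 0)
        if take_total > skip_total:
            best[i], take[i] = take_total, True
        else:
            best[i] = skip_total
    # Trace back to recover which candidates were chosen.
    chosen, i = [], n
    while i > 0:
        if take[i]:
            chosen.append(i - 1)
            i -= 2  # the neighbouring candidate is excluded
        else:
            i -= 1
    return best[n], chosen[::-1]


# Illustrative scores, e.g. segment lengths between consecutive hits.
total, chosen = select_segments([5, 900, 7, 650, 3])
```

Here the two long candidates (indices 1 and 3) are selected and the short ones between them are dropped, because taking a candidate rules out both of its neighbours.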
pychopper.common_structures module¶
- class pychopper.common_structures.Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
- Bases: tuple
Create new instance of Hit(Ref, RefStart, RefEnd, Query, QueryStart, QueryEnd, Score)
- Query
- Alias for field number 3
- QueryEnd
- Alias for field number 5
- QueryStart
- Alias for field number 4
- Ref
- Alias for field number 0
- RefEnd
- Alias for field number 2
- RefStart
- Alias for field number 1
- Score
- Alias for field number 6
- class pychopper.common_structures.Segment(Left, Start, End, Right, Strand, Len)
- Bases: tuple
Create new instance of Segment(Left, Start, End, Right, Strand, Len)
- End
- Alias for field number 2
- Left
- Alias for field number 0
- Len
- Alias for field number 5
- Right
- Alias for field number 3
- Start
- Alias for field number 1
- Strand
- Alias for field number 4
- class pychopper.common_structures.Seq(Id, Name, Seq, Qual)
- Bases: tuple
Create new instance of Seq(Id, Name, Seq, Qual)
- Id
- Alias for field number 0
- Name
- Alias for field number 1
- Qual
- Alias for field number 3
- Seq
- Alias for field number 2
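Because all three classes derive from tuple and expose their fields as positional aliases, they behave like `collections.namedtuple` types. The sketch below recreates equivalent definitions to show field access; the field values and their interpretation (for example, Start and End as slice bounds) are assumptions made for illustration.

```python
from collections import namedtuple

# Equivalent definitions, matching the field order documented above.
Hit = namedtuple("Hit", "Ref RefStart RefEnd Query QueryStart QueryEnd Score")
Segment = namedtuple("Segment", "Left Start End Right Strand Len")
Seq = namedtuple("Seq", "Id Name Seq Qual")

# Illustrative values only.
read = Seq(Id="read1", Name="read1 runid=abc", Seq="ACGTACGT", Qual="IIIIIIII")
seg = Segment(Left=0, Start=2, End=6, Right=8, Strand="+", Len=4)

# Fields are reachable by name or by the positional index listed above.
same_field = read.Seq == read[2]

# Tuples are immutable, so _replace returns a modified copy.
trimmed = read._replace(Seq=read.Seq[seg.Start:seg.End],
                        Qual=read.Qual[seg.Start:seg.End])
```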
pychopper.edlib_backend module¶
- pychopper.edlib_backend.find_locations(reads, all_primers, max_ed, pool, min_batch)
- Find alignment hits of all primers in all reads using the edlib/parasail backend
pychopper.hmmer_backend module¶
- pychopper.hmmer_backend.find_locations(reads, phmm_file, E, pool, min_batch)
- Find alignment hits of all primers in all reads using the pHMM/nhmmscan backend
pychopper.parasail_backend module¶
- pychopper.parasail_backend.first_cigar(cigar)
- Extract details of the first operation in a cigar string.
- pychopper.parasail_backend.pair_align(reference, query, query_name, subs_mat, params)
- Perform pairwise local alignment using parasail-python
- pychopper.parasail_backend.process_alignment(aln, query, query_name, aln_params)
- Process an alignment, extracting score, start and end.
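To illustrate what extracting the first operation of a CIGAR string involves, here is a minimal sketch using a regular expression. It is not the package's first_cigar function: the name `first_cigar_op` and the `(length, operation)` return format are assumptions.

```python
import re

# A CIGAR operation is a run length followed by a one-letter operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def first_cigar_op(cigar):
    """Return (length, operation) for the first operation in a CIGAR string."""
    match = _CIGAR_OP.match(cigar)
    if match is None:
        raise ValueError("not a valid CIGAR string: %r" % (cigar,))
    return int(match.group(1)), match.group(2)


first = first_cigar_op("5S20=1X3=")
```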
pychopper.report module¶
- class pychopper.report.Report(pdf)
- Bases: object
Class for plotting utilities on the top of matplotlib. Plots are saved in the specified file through the PDF backend.
- close()
- Close PDF backend. Do not forget to call this at the end of your script or your output will be damaged!
- Parameters
- self -- object
- Returns
- None
- Return type
- object
- plot_arrays(data_map, title='', xlab='', ylab='', marker='.', legend_loc='best', legend=True, vlines=None, vlcolor='green', vlwitdh=0.5)
- Plot multiple pairs of data arrays.
- Parameters
- self -- object.
- data_map -- A dictionary with labels as keys and tuples of data arrays (x, y) as values.
- title -- Figure title.
- xlab -- X axis label.
- ylab -- Y axis label.
- marker -- Marker passed to the plot function.
- legend_loc -- Location of legend.
- legend -- Plot legend if True.
- vlines -- Dictionary with labels and positions of vertical lines to draw.
- vlcolor -- Color of vertical lines drawn.
- vlwidth -- Width of vertical lines drawn.
- Returns
- None
- Return type
- object
- plot_bars_simple(data_map, title='', xlab='', ylab='', alpha=0.6, xticks_rotation=0, auto_limit=False)
- Plot simple bar chart from input dictionary.
- Parameters
- self -- object.
- data_map -- A dictionary with labels as keys and data as values.
- title -- Figure title.
- xlab -- X axis label.
- ylab -- Y axis label.
- alpha -- Alpha value.
- xticks_rotation -- Rotation value for x tick labels.
- auto_limit -- Set y axis limits automatically.
- Returns
- None
- Return type
- object
- plot_histograms(data_map, title='', xlab='', ylab='', bins=50, alpha=0.7, legend_loc='best', legend=True, vlines=None)
- Plot histograms of multiple data arrays.
- Parameters
- self -- object.
- data_map -- A dictionary with labels as keys and data arrays as values.
- title -- Figure title.
- xlab -- X axis label.
- ylab -- Y axis label.
- bins -- Number of bins.
- alpha -- Transparency value for histograms.
- legend_loc -- Location of legend.
- legend -- Plot legend if True.
- vlines -- Dictionary with labels and positions of vertical lines to draw.
- Returns
- None
- Return type
- object
- save_close()
- Utility method to save and close figure.
pychopper.seq_utils module¶
- pychopper.seq_utils.base_complement(k)
- Return complement of base.
Performs the substitutions: A<=>T, C<=>G, X=>X for both upper and lower case. The return value is identical to the argument for all other values.
- Parameters
- k -- A base.
- Returns
- Complement of base.
- Return type
- str
- pychopper.seq_utils.errs_tab(n)
- Generate list of error rates for qualities less than or equal to n.
- pychopper.seq_utils.get_primers(primers)
- Load primers from fasta file
- pychopper.seq_utils.get_runid(desc)
- Parse out runid from sequence description.
- pychopper.seq_utils.random(size=None)
- Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.
- pychopper.seq_utils.readfq(fp, sample=None, min_qual=None, rfq_sup={})
- Function taken from https://github.com/lh3/readfq/blob/master/readfq.py. Parses large files much faster than Biopython.
- pychopper.seq_utils.record_size(read, in_format='fastq')
- Calculate record size.
- pychopper.seq_utils.revcomp_seq(seq)
- Reverse complement sequence record
- pychopper.seq_utils.reverse_complement(seq)
- Return reverse complement of a string (base) sequence.
- Parameters
- seq -- Input sequence.
- Returns
- Reverse complement of input sequence.
- Return type
- str
- pychopper.seq_utils.writefq(r, fh)
- Write read to fastq file
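The documented behaviour of base_complement and reverse_complement, and the error-rate table described for errs_tab, can be sketched in a few lines. This is an illustrative reimplementation, not the package source: the translation-table approach is one possible choice, and the errs_tab formula assumes the standard Phred convention (error probability 10^(-Q/10)).

```python
# Substitutions from the docstring: A<=>T, C<=>G, X=>X, in both cases.
_COMPLEMENT = str.maketrans("ATCGXatcgx", "TAGCXtagcx")


def base_complement(k):
    """Complement a single base; unknown characters are returned unchanged."""
    return k.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def errs_tab(n):
    """Error probabilities for Phred qualities 0..n (assumed Phred scale)."""
    return [10 ** (-q / 10) for q in range(n + 1)]


rc = reverse_complement("AACGTn")  # 'n' has no substitution and is kept as-is
table = errs_tab(20)
```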
pychopper.utils module¶
Module contents¶
AUTHOR¶
ONT Applications Group
COPYRIGHT¶
2020, Oxford Nanopore Technologies Ltd.
October 26, 2020 | 2.5.0 |