.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Bio::SearchIO 3pm"
.TH Bio::SearchIO 3pm "2020-10-28" "perl v5.30.3" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Bio::SearchIO \- Driver for parsing Sequence Database Searches (BLAST, FASTA, ...)
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 12
\&   use Bio::SearchIO;
\&   # format can be \*(Aqfasta\*(Aq, \*(Aqblast\*(Aq, \*(Aqexonerate\*(Aq, ...
\&   my $searchio = Bio::SearchIO\->new( \-format => \*(Aqblastxml\*(Aq,
\&                                     \-file   => \*(Aqblastout.xml\*(Aq );
\&   while ( my $result = $searchio\->next_result() ) {
\&       while( my $hit = $result\->next_hit ) {
\&           # process the Bio::Search::Hit::HitI object
\&           while( my $hsp = $hit\->next_hsp ) {
\&               # process the Bio::Search::HSP::HSPI object
\&           }
\&       }
\&   }
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This is a driver for instantiating a parser for report files from
sequence database searches. This object serves as a wrapper for the
format parsers in Bio::SearchIO::* \- you should never need to use
those format parsers directly. (For people used to the SeqIO system,
we are deliberately using the same pattern.)
.PP
Once you get a SearchIO object, calling \fBnext_result()\fR gives you back
a Bio::Search::Result::ResultI compliant object, which represents one
report (BLAST, FASTA, HMMER, or whatever format was parsed).
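.PP
A minimal sketch of what walking one report can look like, assuming a
BLAST text report in a file named report.bls (a hypothetical path). The
accessors used here (query_name, name, significance, evalue and
percent_identity) are documented in Bio::Search::Result::ResultI,
Bio::Search::Hit::HitI and Bio::Search::HSP::HSPI rather than on this
page, and some of them may be undefined for some formats, so check
those pages for the details:
.PP
.Vb 10
\&   use strict;
\&   use warnings;
\&   use Bio::SearchIO;
\&
\&   # report.bls is a hypothetical file name used for illustration
\&   my $in = Bio::SearchIO\->new( \-format => \*(Aqblast\*(Aq,
\&                               \-file   => \*(Aqreport.bls\*(Aq );
\&   while ( my $result = $in\->next_result ) {
\&       print "Query: ", $result\->query_name, "\en";
\&       while ( my $hit = $result\->next_hit ) {
\&           print "  Hit: ", $hit\->name, " (", $hit\->significance, ")\en";
\&           while ( my $hsp = $hit\->next_hsp ) {
\&               printf "    HSP: e=%s identity=%.1f%%\en",
\&                   $hsp\->evalue, $hsp\->percent_identity;
\&           }
\&       }
\&   }
.Ve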
.PP
A list of module names and formats is below:
.PP
.Vb 12
\&   blast      BLAST (WUBLAST, NCBIBLAST, bl2seq)
\&   fasta      FASTA \-m9 and \-m0
\&   blasttable BLAST \-m9 or \-m8 output (both NCBI and WUBLAST tabular)
\&   megablast  MEGABLAST
\&   psl        UCSC PSL format
\&   waba       WABA output
\&   axt        AXT format
\&   sim4       Sim4
\&   hmmer      HMMER2 hmmpfam and hmmsearch or HMMER3 hmmscan and hmmsearch
\&   exonerate  Exonerate CIGAR and VULGAR format
\&   blastxml   NCBI BLAST XML
\&   wise       Genewise \-genesf format
.Ve
.PP
Also see the SearchIO \s-1HOWTO:\s0
http://bioperl.org/howtos/SearchIO_HOWTO.html
.SH "FEEDBACK"
.IX Header "FEEDBACK"
.SS "Mailing Lists"
.IX Subsection "Mailing Lists"
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to
the Bioperl mailing list. Your participation is much appreciated.
.PP
.Vb 2
\&  bioperl\-l@bioperl.org                  \- General discussion
\&  http://bioperl.org/wiki/Mailing_lists  \- About the mailing lists
.Ve
.SS "Support"
.IX Subsection "Support"
Please direct usage questions or support issues to the mailing list:
.PP
\&\fIbioperl\-l@bioperl.org\fR
.PP
rather than to the module maintainer directly. Many experienced and
responsive experts will be able to look at the problem and quickly
address it. Please include a thorough description of the problem
with code and data examples if at all possible.
.SS "Reporting Bugs"
.IX Subsection "Reporting Bugs"
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via the
web:
.PP
.Vb 1
\&  https://github.com/bioperl/bioperl\-live/issues
.Ve
.SH "AUTHOR \- Jason Stajich & Steve Chervitz"
.IX Header "AUTHOR - Jason Stajich & Steve Chervitz"
Email jason\-at\-bioperl.org
Email sac\-at\-bioperl.org
.SH "APPENDIX"
.IX Header "APPENDIX"
The rest of the documentation details each of the object methods.
Internal methods are usually preceded by an underscore (_).
.SS "new"
.IX Subsection "new"
.Vb 10
\& Title   : new
\& Usage   : my $obj = Bio::SearchIO\->new();
\& Function: Builds a new Bio::SearchIO object
\& Returns : Bio::SearchIO initialized with the correct format
\& Args    : \-file           => $filename
\&           \-format         => format
\&           \-fh             => filehandle to attach to
\&           \-result_factory => object implementing Bio::Factory::ObjectFactoryI
\&           \-hit_factory    => object implementing Bio::Factory::ObjectFactoryI
\&           \-hsp_factory    => object implementing Bio::Factory::ObjectFactoryI
\&           \-writer         => object implementing Bio::SearchIO::SearchWriterI
\&           \-output_format  => output format, which will dynamically load writer
\&           \-inclusion_threshold => e\-value threshold for inclusion in the
\&                                   PSI\-BLAST score matrix model
\&           \-signif         => float or scientific notation number to be used
\&                              as a P\- or Expect value cutoff
\&           \-check_all_hits => boolean. Check all hits for significance against
\&                              significance criteria.  Default = false.
\&                              If false, stops processing hits after the first
\&                              non\-significant hit or the first hit that fails
\&                              the hit_filter call. This speeds parsing,
\&                              taking advantage of the fact that the hits are
\&                              processed in the order they appear in the report.
\&           \-min_query_len  => integer to be used as a minimum for query sequence
\&                              length. Reports with query sequences below this
\&                              length will not be processed.
\&                              default = no minimum length.
\&           \-best           => boolean. Only process the best hit of each report;
\&                              default = false.
.Ve
.PP
See Bio::Factory::ObjectFactoryI, Bio::SearchIO::SearchWriterI
.PP
Any factory objects in the arguments are passed along to the
SearchResultEventBuilder object which holds these factories and sets
default ones if none are supplied as arguments.
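.PP
As a hedged illustration of the arguments listed above, the sketch below
asks for only the best hit of each report and sets a significance
cutoff. The file name report.bls is hypothetical, and how far \-signif
and \-best are honored depends on the format parser, so treat this as an
assumption to verify against the parser you use:
.PP
.Vb 10
\&   use strict;
\&   use warnings;
\&   use Bio::SearchIO;
\&
\&   my $in = Bio::SearchIO\->new(
\&       \-format => \*(Aqblast\*(Aq,
\&       \-file   => \*(Aqreport.bls\*(Aq,   # hypothetical input file
\&       \-signif => 1e\-5,           # P or Expect value cutoff
\&       \-best   => 1,              # only the best hit of each report
\&   );
\&   while ( my $result = $in\->next_result ) {
\&       while ( my $hit = $result\->next_hit ) {
\&           # process the hits that passed the filters
\&       }
\&   }
.Ve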
.SS "newFh"
.IX Subsection "newFh"
.Vb 10
\& Title   : newFh
\& Usage   : $fh = Bio::SearchIO\->newFh(\-file=>$filename,
\&                                     \-format=>\*(AqFormat\*(Aq)
\& Function: does a new() followed by an fh()
\& Example : $fh = Bio::SearchIO\->newFh(\-file=>$filename,
\&                                     \-format=>\*(AqFormat\*(Aq);
\&           $result = <$fh>;   # read a ResultI object
\&           print $fh $result; # write a ResultI object
\& Returns : filehandle tied to the Bio::SearchIO::Fh class
\& Args    :
.Ve
.SS "fh"
.IX Subsection "fh"
.Vb 8
\& Title   : fh
\& Usage   : $obj\->fh
\& Function:
\& Example : $fh = $obj\->fh;      # make a tied filehandle
\&           $result = <$fh>;    # read a ResultI object
\&           print $fh $result;  # write a ResultI object
\& Returns : filehandle tied to the Bio::SearchIO::Fh class
\& Args    :
.Ve
.SS "format"
.IX Subsection "format"
.Vb 5
\& Title   : format
\& Usage   : $format = $obj\->format()
\& Function: Get the search format
\& Returns : search format
\& Args    : none
.Ve
.SS "attach_EventHandler"
.IX Subsection "attach_EventHandler"
.Vb 5
\& Title   : attach_EventHandler
\& Usage   : $parser\->attach_EventHandler($handler)
\& Function: Adds an event handler to listen for events
\& Returns : none
\& Args    : Bio::SearchIO::EventHandlerI
.Ve
.PP
See Bio::SearchIO::EventHandlerI
.SS "_eventHandler"
.IX Subsection "_eventHandler"
.Vb 5
\& Title   : _eventHandler
\& Usage   : private
\& Function: Get the EventHandler
\& Returns : Bio::SearchIO::EventHandlerI
\& Args    : none
.Ve
.PP
See Bio::SearchIO::EventHandlerI
.SS "next_result"
.IX Subsection "next_result"
.Vb 3
\& Title   : next_result
\& Usage   : $result = $stream\->next_result
\& Function: Reads the next ResultI object from the stream and returns it.
\&
\&           Certain driver modules may encounter entries in the stream that
\&           are either misformatted or that use syntax not yet understood
\&           by the driver. If such an incident is recoverable, e.g., by
\&           dismissing a feature of a feature table or some other non\-mandatory
\&           part of an entry, the driver will issue a warning.
\&           In the case of a non\-recoverable situation an exception will be
\&           thrown. Do not assume that you can resume parsing the same stream
\&           after catching the exception. Note that you can always turn
\&           recoverable errors into exceptions by calling $stream\->verbose(2)
\&           (see Bio::Root::RootI POD page).
\& Returns : A Bio::Search::Result::ResultI object
\& Args    : n/a
.Ve
.PP
See Bio::Root::RootI
.SS "write_result"
.IX Subsection "write_result"
.Vb 9
\& Title   : write_result
\& Usage   : $stream\->write_result($result_result, @other_args)
\& Function: Writes data from the $result_result object into the stream.
\&         : Delegates to the to_string() method of the associated
\&         : WriterI object.
\& Returns : 1 for success and 0 for error
\& Args    : Bio::Search::Result::ResultI object,
\&         : plus any other arguments for the Writer
\& Throws  : Bio::Root::Exception if a Writer has not been set.
.Ve
.PP
See Bio::Root::Exception
.SS "write_report"
.IX Subsection "write_report"
.Vb 10
\& Title   : write_report
\& Usage   : $stream\->write_report(SearchIO stream, @other_args)
\& Function: Writes data directly from the SearchIO stream object into the
\&         : writer.  This is mainly useful if you have multiple ResultI objects
\&         : in a SearchIO stream and you don\*(Aqt want to repeat the header/footer
\&         : between each call.
\& Returns : 1 for success and 0 for error
\& Args    : Bio::SearchIO stream object,
\&         : plus any other arguments for the Writer
\& Throws  : Bio::Root::Exception if a Writer has not been set.
.Ve
.PP
See Bio::Root::Exception
.SS "writer"
.IX Subsection "writer"
.Vb 7
\& Title   : writer
\& Usage   : $writer = $stream\->writer;
\& Function: Sets/Gets a SearchWriterI object to be used for this searchIO.
\& Returns : 1 for success and 0 for error
\& Args    : Bio::SearchIO::SearchWriterI object (when setting)
\& Throws  : Bio::Root::Exception if a non\-Bio::SearchIO::SearchWriterI object
\&           is passed in.
.Ve
.SS "result_count"
.IX Subsection "result_count"
.Vb 8
\& Title   : result_count
\& Usage   : $num = $stream\->result_count;
\& Function: Gets the number of Blast results that have been successfully parsed
\&           at the point of the method call.  This is not the total # of results
\&           in the file.
\& Returns : integer
\& Args    : none
\& Throws  : none
.Ve
.SS "inclusion_threshold"
.IX Subsection "inclusion_threshold"
.Vb 9
\& Title   : inclusion_threshold
\& Usage   : my $incl_thresh = $isreb\->inclusion_threshold;
\&         : $isreb\->inclusion_threshold(1e\-5);
\& Function: Get/Set the e\-value threshold for inclusion in the PSI\-BLAST
\&           score matrix model (blastpgp) that was used for generating the reports
\&           being parsed.
\& Returns : number (real)
\&           Default value: $Bio::SearchIO::IteratedSearchResultEventBuilder::DEFAULT_INCLUSION_THRESHOLD
\& Args    : number (real)  (e.g., 0.0001 or 1e\-4 )
.Ve
.SS "max_significance"
.IX Subsection "max_significance"
.Vb 9
\& Usage     : $obj\->max_significance();
\& Purpose   : Set/Get the P or Expect value used as significance screening cutoff.
\&             This is the value of the \-signif parameter supplied to new().
\&             Hits with P or E\-value above this are skipped.
\& Returns   : Scientific notation number with this format: 1.0e\-05.
\& Argument  : Scientific notation number or float (when setting)
\& Comments  : Screening of significant hits uses the data provided on the
\&           : description line. For NCBI BLAST1 and WU\-BLAST, this data
\&           : is a P\-value. For NCBI BLAST2 it is an Expect value.
.Ve
.SS "signif"
.IX Subsection "signif"
Synonym for \fBmax_significance()\fR
.SS "min_score"
.IX Subsection "min_score"
.Vb 8
\& Usage     : $obj\->min_score();
\& Purpose   : Set/Get the Blast score used as screening cutoff.
\&             This is the value of the \-score parameter supplied to new().
\&             Hits with scores below this are skipped.
\& Returns   : Integer or scientific notation number.
\& Argument  : Integer or scientific notation number (when setting)
\& Comments  : Screening of significant hits uses the data provided on the
\&           : description line.
.Ve
.SS "min_query_length"
.IX Subsection "min_query_length"
.Vb 6
\& Usage     : $obj\->min_query_length();
\& Purpose   : Gets the query sequence length used as screening criteria.
\&             This is the value of the \-min_query_len parameter supplied to new().
\&             Hits with sequence length below this are skipped.
\& Returns   : Integer
\& Argument  : n/a
.Ve
.SS "best_hit_only"
.IX Subsection "best_hit_only"
.Vb 6
\& Title     : best_hit_only
\& Usage     : print "only getting best hit.\en" if $obj\->best_hit_only;
\& Purpose   : Set/Get the indicator for whether or not to process only
\&           : the best BlastHit.
\& Returns   : Boolean (1 | 0)
\& Argument  : Boolean (1 | 0) (when setting)
.Ve
.SS "check_all_hits"
.IX Subsection "check_all_hits"
.Vb 8
\& Title     : check_all_hits
\& Usage     : print "checking all hits.\en" if $obj\->check_all_hits;
\& Purpose   : Set/Get the indicator for whether or not to process all hits.
\&           : If false, the parser will stop processing hits after the
\&           : first non\-significant hit or the first hit that fails
\&           : any hit filter.
\& Returns   : Boolean (1 | 0)
\& Argument  : Boolean (1 | 0) (when setting)
.Ve
.SS "_load_format_module"
.IX Subsection "_load_format_module"
.Vb 6
\& Title   : _load_format_module
\& Usage   : *INTERNAL SearchIO stuff*
\& Function: Loads up (like use) a module at run time on demand
\& Example :
\& Returns :
\& Args    :
.Ve
.SS "_get_seq_identifiers"
.IX Subsection "_get_seq_identifiers"
.Vb 6
\& Title   : _get_seq_identifiers
\& Usage   : my ($gi, $acc, $ver) = &_get_seq_identifiers($id)
\& Function: Private function to get the gi, accession, version data
\&           for an ID (if it is in NCBI format)
\& Returns : 3\-tuple of gi, accession, version
\& Args    : ID string to process (NCBI format)
.Ve
.SS "_guess_format"
.IX Subsection "_guess_format"
.Vb 6
\& Title   : _guess_format
\& Usage   : $obj\->_guess_format($filename)
\& Function:
\& Example :
\& Returns : guessed format of filename (lower case)
\& Args    :
.Ve