NAME¶
Bio::ASN1::EntrezGene::Indexer - Indexes NCBI Sequence files.
VERSION¶
version 1.70
SYNOPSIS¶
use Bio::ASN1::EntrezGene::Indexer;
# creating & using the index is just a few lines
my $inx = Bio::ASN1::EntrezGene::Indexer->new(
-filename => 'entrezgene.idx',
-write_flag => 'WRITE'); # needed for make_index call, but if opening
# existing index file, don't set write flag!
$inx->make_index('Homo_sapiens', 'Mus_musculus', 'Rattus_norvegicus');
my $seq = $inx->fetch(10); # Bio::Seq obj for Entrez Gene #10
# alternatively, if one prefers just a data structure instead of objects
$seq = $inx->fetch_hash(10); # a hash produced by Bio::ASN1::EntrezGene
# that contains all data in the Entrez Gene record
# note: files such as 'Homo_sapiens' can be obtained from the NCBI
# Entrez Gene ftp download, DATA/ASN/Mammalia directory
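The comment above about not setting the write flag when opening an existing index can be illustrated with a small sketch. It uses only the constructor argument and fetch call shown in the SYNOPSIS; the helper name fetch_gene_from_index is hypothetical and not part of the module, and the index file is assumed to have been built already with make_index.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): open an existing index
# read-only and return the Bio::Seq object for one gene id. The module is
# loaded lazily so this file still compiles where Bioperl is not installed.
sub fetch_gene_from_index {
    my ($index_file, $gene_id) = @_;
    require Bio::ASN1::EntrezGene::Indexer;

    # No -write_flag here: the index already exists and is only read.
    my $inx = Bio::ASN1::EntrezGene::Indexer->new(-filename => $index_file);
    return $inx->fetch($gene_id);
}

# 'entrezgene.idx' is the index file name used in the SYNOPSIS; nothing
# happens if it has not been created yet.
if (-e 'entrezgene.idx') {
    my $seq = fetch_gene_from_index('entrezgene.idx', 10);
    print 'Fetched a ', ref($seq), " object for Entrez Gene 10\n";
}
```

Keeping the write flag confined to the one script that builds the index means lookup scripts like this one only ever read the index file.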
DESCRIPTION¶
Bio::ASN1::EntrezGene::Indexer is a Perl indexer for NCBI Entrez Gene genome
databases. It processes ASN.1-formatted Entrez Gene records and stores the
file position of each record in a way compliant with the Bioperl standard (in
fact, it is a subclass of Bioperl's index objects).
Note that this module does not parse records, because it needs to run fast and
grabs only the gene ids. To parse records, use Bio::ASN1::EntrezGene or,
better yet, Bio::SeqIO with format 'entrezgene'.
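The idea of storing a file position per record without parsing it can be sketched in plain Perl. This is only an illustration under stated assumptions, not the module's actual code: the two records are made-up and heavily abbreviated, the sub names build_offset_index and read_record are hypothetical, and the sketch assumes each record starts with a line beginning "Entrezgene ::= {" and that its first "geneid N" line carries the gene id.

```perl
use strict;
use warnings;

# Toy data: two made-up, heavily abbreviated records in the style of
# Entrez Gene ASN.1 text (real records are far larger).
my $data = <<'END';
Entrezgene ::= {
  track-info {
    geneid 9,
    status live
  }
}
Entrezgene ::= {
  track-info {
    geneid 10,
    status live
  }
}
END

# Scan once, remembering where each record starts and picking up only
# the first "geneid N" line that follows the record header.
sub build_offset_index {
    my ($fh) = @_;
    my (%offset_for, $record_start);
    while (1) {
        my $pos  = tell $fh;
        my $line = <$fh>;
        last unless defined $line;
        if ($line =~ /^Entrezgene ::= \{/) {
            $record_start = $pos;
        }
        elsif (defined $record_start && $line =~ /\bgeneid (\d+)/) {
            $offset_for{$1} = $record_start;
            undef $record_start;    # ignore later geneid lines in this record
        }
    }
    return \%offset_for;
}

# Jump straight to a record and read until the next record begins.
sub read_record {
    my ($fh, $offset) = @_;
    seek $fh, $offset, 0 or die "seek failed: $!";
    my $text = <$fh>;
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^Entrezgene ::= \{/;
        $text .= $line;
    }
    return $text;
}

# An in-memory file handle stands in for a real Entrez Gene file.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $index  = build_offset_index($fh);
my $record = read_record($fh, $index->{10});
print $record;
```

Because only the record header and the gene id line are matched, the scan never has to understand the rest of the record, which is what keeps this kind of indexing fast.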
It takes this module (version 1.07) 21 seconds to index the human genome Entrez
Gene file (Apr. 5/2005 download) on one 2.4 GHz Intel Xeon processor.
METHODS¶
fetch¶
Parameters: $geneid - id for the Entrez Gene record to be retrieved
Example: my $seq = $indexer->fetch(10); # get Entrez Gene #10
Function: fetch the data for the given Entrez Gene id.
Returns: A Bio::Seq object produced by Bio::SeqIO::entrezgene
Notes: One needs to have Bio::SeqIO::entrezgene installed before
calling this function!
fetch_hash¶
Parameters: $geneid - id for the Entrez Gene record to be retrieved
Example: my $hash = $indexer->fetch_hash(10); # get Entrez Gene #10
Function: fetch a hash produced by Bio::ASN1::EntrezGene for the given Entrez
Gene id.
Returns: A data structure containing all data items from the Entrez
Gene record.
Notes: Alternative to fetch()
INTERNAL METHODS¶
_version¶
_type_stamp¶
_index_file¶
_file_handle¶
Title : _file_handle
Usage : $fh = $index->_file_handle( INT )
Function: Returns an open filehandle for the file
index INT. On opening a new filehandle it
caches it in the @{$index->_filehandle} array.
If the requested filehandle is already open,
it simply returns it from the array.
Example : $first_file_indexed = $index->_file_handle( 0 );
Returns : ref to a filehandle
Args : INT
Notes : This function is copied from Bio::Index::Abstract. Once that module
changes its file handle code, as done here, to fit perl 5.005_03, this
sub will be removed from this module.
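The caching behaviour described for _file_handle can be sketched as follows. This is a minimal illustration, not the code from this module or from Bio::Index::Abstract: the names @files, @handle_cache and cached_file_handle are hypothetical, a temporary file stands in for an indexed data file, and the lexical file handles used here need a newer Perl than the 5.005_03 mentioned in the notes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for an indexed data file.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "placeholder record\n";
close $tmp_fh or die "cannot close '$tmp_name': $!";

my @files = ($tmp_name);    # hypothetical list of indexed files, by position
my @handle_cache;           # open handles, stored at the same positions
my $opens = 0;              # counts how often a file is really opened

# Return an open read handle for file number $i, opening it only on first use.
sub cached_file_handle {
    my ($i) = @_;
    unless ($handle_cache[$i]) {
        my $path = $files[$i];
        defined $path or die "no file with index $i";
        open my $fh, '<', $path or die "cannot open '$path': $!";
        $handle_cache[$i] = $fh;
        $opens++;
    }
    return $handle_cache[$i];
}

my $first  = cached_file_handle(0);
my $second = cached_file_handle(0);    # served from the cache
print "opened $opens time(s)\n";
```

Reusing one handle per file avoids reopening the same large data file on every fetch; the caller is expected to seek to the stored position before reading.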
PREREQUISITE¶
Bio::ASN1::EntrezGene, Bioperl version that contains Stefan Kirov's
entrezgene.pm and all dependencies therein.
INSTALLATION¶
Same as Bio::ASN1::EntrezGene
SEE ALSO¶
For details on the various parsers I generated for Entrez Gene, and example
scripts that use/benchmark the modules, please see
<http://sourceforge.net/projects/egparser/>. Those other parsers etc.
are included in the V1.05 download.
CITATION¶
Liu, Mingyi, and Andrei Grigoriev. "Fast parsers for Entrez Gene."
Bioinformatics 21, no. 14 (2005): 3189-3190.
OPERATING SYSTEMS SUPPORTED¶
Any OS that Perl & Bioperl run on.
FEEDBACK¶
Mailing lists¶
User feedback is an integral part of the evolution of this and other Bioperl
modules. Send your comments and suggestions preferably to the Bioperl mailing
list. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support¶
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive
experts will be able to look at the problem and quickly address it. Please
include a thorough description of the problem with code and data examples if
at all possible.
Reporting bugs¶
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs
and their resolution. Bug reports can be submitted via the web:
https://redmine.open-bio.org/projects/bioperl/
AUTHOR¶
Dr. Mingyi Liu <mingyiliu@gmail.com>
COPYRIGHT¶
This software is copyright (c) 2005 by Mingyi Liu, 2005 by GPC Biotech AG, and
2005 by Altana Research Institute.
This software is available under the same terms as the Perl 5 programming
language system itself.