NAME¶
Bio::SeqIO::seqxml - SeqXML sequence input/output stream
SYNOPSIS¶
# Do not use this module directly. Use it via the Bio::SeqIO class.
use Bio::SeqIO;
# read a SeqXML file
my $seqio = Bio::SeqIO->new(-format => 'seqxml',
                            -file   => 'my_seqs.xml');
while (my $seq_object = $seqio->next_seq) {
    print join("\t",
               $seq_object->display_id,
               $seq_object->description,
               $seq_object->seq,
              ), "\n";
}
# write a SeqXML file
#
# Note that you can (optionally) specify the source
# (usually a database) and source version.
my $seqwriter = Bio::SeqIO->new(-format        => 'seqxml',
                                -file          => ">outfile.xml",
                                -source        => 'Ensembl',
                                -sourceVersion => '56');
$seqwriter->write_seq($seq_object);
# once you've written all of your seqs, you may want to do
# an explicit close to get the closing </seqXML> tag
$seqwriter->close;
DESCRIPTION¶
This object can transform Bio::Seq objects to and from SeqXML format. For more
information on the SeqXML standard, visit <http://www.seqxml.org>.
In short, SeqXML is a lightweight sequence format that takes advantage of the
validation capabilities of XML while not overburdening you with a strict and
complicated schema.
This module is based in part (particularly the XML-parsing part) on
Bio::TreeIO::phyloxml by Mira Han.
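The round trip described above can be sketched as follows. This is a minimal
illustration rather than a reference implementation: the sequence ID,
description, and residues are made-up example data, and the temporary file is
only there to keep the sketch self-contained. It assumes BioPerl and the XML
modules this format depends on (XML::Writer and XML::LibXML) are installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::Seq;
use Bio::SeqIO;

# Hypothetical example data; the ID, description, and sequence are made up.
my $seq = Bio::Seq->new(
    -display_id => 'example_1',
    -desc       => 'illustrative DNA sequence',
    -seq        => 'ATGGCGTAA',
    -alphabet   => 'dna',
);

# A temporary file keeps the sketch self-contained.
my ($fh, $filename) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close $fh;

# Bio::Seq -> SeqXML
my $writer = Bio::SeqIO->new(-format => 'seqxml', -file => ">$filename");
$writer->write_seq($seq);
$writer->close;    # explicit close so the closing </seqXML> tag is written

# SeqXML -> Bio::Seq
my $reader = Bio::SeqIO->new(-format => 'seqxml', -file => $filename);
my @read_back;
while (my $seq_object = $reader->next_seq) {
    push @read_back, $seq_object;
    print join("\t", $seq_object->display_id, $seq_object->seq), "\n";
}
```

The explicit close before reading matters here because the file is read back
within the same program.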
FEEDBACK¶
Mailing Lists¶
User feedback is an integral part of the evolution of this and other Bioperl
modules. Send your comments and suggestions preferably to one of the Bioperl
mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support¶
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive
experts will be able to look at the problem and quickly address it. Please
include a thorough description of the problem with code and data examples if
at all possible.
Reporting Bugs¶
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs
and their resolution. Bug reports can be submitted via the web:
https://github.com/bioperl/bioperl-live/issues
AUTHORS - Dave Messina¶
Email: dmessina@cpan.org
CONTRIBUTORS¶
APPENDIX¶
The rest of the documentation details each of the object methods. Internal
methods are usually preceded with an underscore (_).
_initialize¶
Title : _initialize
Usage : $self->_initialize(@args)
Function: constructor (for internal use only).
Besides the usual SeqIO arguments (-file, -fh, etc.),
Bio::SeqIO::seqxml accepts three arguments which are used
when writing out a seqxml file. They are all optional.
Returns : none
Args : -source => source string (usually a database name)
-sourceVersion => source version. The version number of the source
-seqXMLversion => the version of seqXML that will be used
Throws : Exception if XML::LibXML::Reader or XML::Writer
is not initialized
next_seq¶
Title : next_seq
Usage : $seq = $stream->next_seq()
Function: returns the next sequence in the stream
Returns : L<Bio::Seq> object, or nothing if no more available
Args : none
write_seq¶
Title : write_seq
Usage : $stream->write_seq(@seq)
Function: writes the given sequence object(s) into the stream
Returns : 1 for success and 0 for error
Args : Array of 1 or more L<Bio::PrimarySeqI> objects
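Because write_seq accepts a list, several sequences can be written in one
call. The sketch below is illustrative only: the IDs, descriptions, and
residues are invented, and the temporary file exists just so the written XML
can be printed for inspection. It assumes BioPerl and its XML dependencies are
installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::Seq;
use Bio::SeqIO;

# Two hypothetical sequences with different alphabets (made-up data).
my @seqs = (
    Bio::Seq->new(
        -display_id => 'dna_example',
        -desc       => 'illustrative DNA sequence',
        -seq        => 'ATGGCGTAA',
        -alphabet   => 'dna',
    ),
    Bio::Seq->new(
        -display_id => 'protein_example',
        -desc       => 'illustrative protein sequence',
        -seq        => 'MKTAYIAKQR',
        -alphabet   => 'protein',
    ),
);

my ($fh, $filename) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close $fh;

my $writer = Bio::SeqIO->new(-format => 'seqxml', -file => ">$filename");
$writer->write_seq(@seqs);    # one call, two <entry> elements
$writer->close;               # writes the closing </seqXML> tag

# Slurp the file back in to look at what was written.
open my $in, '<', $filename or die "Cannot open $filename: $!";
my $xml = do { local $/; <$in> };
close $in;
print $xml;
```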
_initialize_seqxml_node_methods¶
Title : _initialize_seqxml_node_methods
Usage : $self->_initialize_seqxml_node_methods
Function: sets up code ref mapping of each seqXML node type
to a method for processing that node type
Returns : none
Args : none
schemaLocation¶
Title : schemaLocation
Usage : $self->schemaLocation
Function: gets/sets the schema location in the <seqXML> header
Returns : the schema location string
Args : To set the schemaLocation, call with a schemaLocation as the argument.
source¶
Title : source
Usage : $self->source
Function: gets/sets the data source in the <seqXML> header
Returns : the data source string
Args : To set the source, call with a source string as the argument.
sourceVersion¶
Title : sourceVersion
Usage : $self->sourceVersion
Function: gets/sets the data source version in the <seqXML> header
Returns : the data source version string
Args : To set the source version, call with a source version string
as the argument.
seqXMLversion¶
Title : seqXMLversion
Usage : $self->seqXMLversion
Function: gets/sets the seqXML version in the <seqXML> header
Returns : the seqXML version string.
Args : To set the seqXML version, call with a seqXML version string
as the argument.
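A short sketch of how these header accessors might be used on a stream opened
for reading. The source name and version are copied from the SYNOPSIS; the
sequence data and temporary file are hypothetical. The accessors are called
only after the first next_seq, on the assumption that the <seqXML> header has
certainly been parsed by then.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Bio::Seq;
use Bio::SeqIO;

my ($fh, $filename) = tempfile(SUFFIX => '.xml', UNLINK => 1);
close $fh;

# Write a file whose header carries a source and a source version.
my $writer = Bio::SeqIO->new(
    -format        => 'seqxml',
    -file          => ">$filename",
    -source        => 'Ensembl',
    -sourceVersion => '56',
);
$writer->write_seq(
    Bio::Seq->new(
        -display_id => 'example_1',            # made-up ID
        -desc       => 'illustrative sequence',
        -seq        => 'ATGGCGTAA',
        -alphabet   => 'dna',
    )
);
$writer->close;

# Read it back and inspect the header metadata.
my $reader = Bio::SeqIO->new(-format => 'seqxml', -file => $filename);
my $first  = $reader->next_seq;    # header is parsed by this point

my $source         = $reader->source;
my $source_version = $reader->sourceVersion;
my $seqxml_version = $reader->seqXMLversion;

print "source:         $source\n";
print "source version: $source_version\n";
print "seqXML version: $seqxml_version\n";
```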
Methods for parsing the XML document¶
processXMLNode¶
Title : processXMLNode
Usage : $seqio->processXMLNode
Function: reads the XML node and processes according to the node type
Returns : none
Args : none
Throws : Exception on unexpected XML node type, warnings on unexpected
XML element names.
processAttribute¶
Title : processAttribute
Usage : $seqio->processAttribute(\%hash_for_attribute);
Function: reads the attributes of the current element into a hash
Returns : none
Args : hash reference where the attributes will be stored.
parseHeader¶
Title : parseHeader
Usage : $self->parseHeader();
Function: reads the opening <seqXML> block and grabs the metadata from it,
namely the source, sourceVersion, and seqXMLversion.
Returns : none
Args : none
Throws : Exception if it hits an <entry> tag, because that means it's
missed the <seqXML> tag and read too far into the file.
element_seqXML¶
Title : element_seqXML
Usage : $self->element_seqXML
Function: processes the opening <seqXML> node
Returns : none
Args : none
element_entry¶
Title : element_entry
Usage : $self->element_entry
Function: processes a sequence <entry> node
Returns : none
Args : none
Throws : Exception if sequence ID is not present in <entry> element
element_species¶
Title : element_species
Usage : $self->element_species
Function: processes a <species> node, creating a Bio::Species object
Returns : none
Args : none
Throws : Exception if <species> tag exists but is empty,
or if the attributes 'name' or 'ncbiTaxID' are undefined
element_description¶
Title : element_description
Usage : $self->element_description
Function: processes a sequence <description> node;
a no-op -- description text is read by
processXMLNode
Returns : none
Args : none
element_RNAseq¶
Title : element_RNAseq
Usage : $self->element_RNAseq
Function: processes a sequence <RNAseq> node
Returns : none
Args : none
element_DNAseq¶
Title : element_DNAseq
Usage : $self->element_DNAseq
Function: processes a sequence <DNAseq> node
Returns : none
Args : none
element_AAseq¶
Title : element_AAseq
Usage : $self->element_AAseq
Function: processes a sequence <AAseq> node
Returns : none
Args : none
element_DBRef¶
Title : element_DBRef
Usage : $self->element_DBRef
Function: processes a sequence <DBRef> node,
creating a Bio::Annotation::DBLink object
Returns : none
Args : none
element_property¶
Title : element_property
Usage : $self->element_property
Function: processes a sequence <property> node, creating a
Bio::Annotation::SimpleValue object
Returns : none
Args : none
end_element_RNAseq¶
Title : end_element_RNAseq
Usage : $self->end_element_RNAseq
Function: processes the closing </RNAseq> node
Returns : none
Args : none
end_element_DNAseq¶
Title : end_element_DNAseq
Usage : $self->end_element_DNAseq
Function: processes the closing </DNAseq> node
Returns : none
Args : none
end_element_AAseq¶
Title : end_element_AAseq
Usage : $self->end_element_AAseq
Function: processes the closing </AAseq> node
Returns : none
Args : none
end_element_entry¶
Title : end_element_entry
Usage : $self->end_element_entry
Function: processes the closing </entry> node, creating the Seq object
Returns : a Bio::Seq object
Args : none
Throws : Exception if sequence, sequence ID, or alphabet are missing
end_element_default¶
Title : end_element_default
Usage : $self->end_element_default
Function: processes all other closing tags;
a no-op.
Returns : none
Args : none
DESTROY¶
Title : DESTROY
Usage : called automatically by Perl just before object
goes out of scope
Function: performs a write flush
Returns : none
Args : none
close¶
Title : close
Usage : $seqio_obj->close()
Function: writes closing </seqXML> tag.
close() will be called automatically by Perl when your
program exits, but if you want to use the seqXML file
you've written before then, you'll need to do an explicit
close first to get the final </seqXML> tag.
Returns : none
Args : none