.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\" ========================================================================
.\"
.IX Title "XML::XPath::XMLParser 3pm"
.TH XML::XPath::XMLParser 3pm "2018-10-20" "perl v5.26.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::XPath::XMLParser \- The default XML parsing class that produces a node tree
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 7
\&    my $parser = XML::XPath::XMLParser\->new(
\&        filename => $self\->get_filename,
\&        xml      => $self\->get_xml,
\&        ioref    => $self\->get_ioref,
\&        parser   => $self\->get_parser,
\&    );
\&    my $root_node = $parser\->parse;
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module generates a node tree for use as the context node for XPath
processing. It aims to be a quick parser, nothing fancy, and yet it has to
store more information than most parsers. To achieve this I've used array
refs everywhere \- no hashes. I don't have any performance figures for the
speedup achieved, and I make no apologies to anyone not used to working with
arrays instead of hashes. I think they make good sense here, where we know
the attributes of each type of node.
.SH "Node Structure"
.IX Header "Node Structure"
All nodes have the same first 2 entries in the array: node_parent
and node_pos. The type of the node is determined using the \fIref()\fR function.
The node_parent entry always holds the parent of the current node, except
for the root node, where it is undef. The node_pos entry is the position of
this node in the array that contains it (think:
\&\f(CW$node\fR == \f(CW$node\fR\->[node_parent]\->[node_children]\->[$node\->[node_pos]] )
.PP
Nodes are structured as follows:
.SS "Root Node"
.IX Subsection "Root Node"
The root node is just an element node with no parent.
.PP
.Vb 6
\&    [
\&      undef,   # node_parent \- check for undef to identify root node
\&      undef,   # node_pos
\&      undef,   # node_prefix
\&      [ ... ], # node_children (see below)
\&    ]
.Ve
.SS "Element Node"
.IX Subsection "Element Node"
.Vb 9
\&    [
\&      $parent,    # node_parent
\&      <position>, # node_pos
\&      \*(Aqxxx\*(Aq,      # node_prefix \- namespace prefix on this element
\&      [ ... ],    # node_children
\&      \*(Aqyyy\*(Aq,      # node_name \- element tag name
\&      [ ... ],    # node_attribs \- attributes on this element
\&      [ ... ],    # node_namespaces \- namespaces currently in scope
\&    ]
.Ve
.SS "Attribute Node"
.IX Subsection "Attribute Node"
.Vb 7
\&    [
\&      $parent,          # node_parent \- the element node
\&      <position>,       # node_pos
\&      \*(Aqxxx\*(Aq,            # node_prefix \- namespace prefix on this attribute
\&      \*(Aqhref\*(Aq,           # node_key \- attribute name
\&      \*(Aqftp://ftp.com/\*(Aq, # node_value \- value in the node
\&    ]
.Ve
.SS "Namespace Nodes"
.IX Subsection "Namespace Nodes"
Each element has an associated set of namespace nodes that are currently
in scope. Each namespace node stores a prefix and the expanded name (retrieved
from the xmlns:prefix=\*(L"...\*(R" attribute).
.PP
.Vb 6
\&    [
\&      $parent,
\&      <position>,
\&      \*(Aqa\*(Aq,                       # node_prefix \- the namespace as it was written as a prefix
\&      \*(Aqhttp://my.namespace.com\*(Aq, # node_expanded \- the expanded name.
\&    ]
.Ve
.SS "Text Nodes"
.IX Subsection "Text Nodes"
.Vb 5
\&    [
\&      $parent,
\&      <position>,
\&      \*(AqThis is some text\*(Aq # node_text \- the text in the node
\&    ]
.Ve
.SS "Comment Nodes"
.IX Subsection "Comment Nodes"
.Vb 5
\&    [
\&      $parent,
\&      <position>,
\&      \*(AqThis is a comment\*(Aq # node_comment
\&    ]
.Ve
.SS "Processing Instruction Nodes"
.IX Subsection "Processing Instruction Nodes"
.Vb 6
\&    [
\&      $parent,
\&      <position>,
\&      \*(Aqtarget\*(Aq, # node_target
\&      \*(Aqdata\*(Aq,   # node_data
\&    ]
.Ve
.SH "Usage"
.IX Header "Usage"
If you feel the need to use this module outside of XML::XPath (for example,
to use it directly so that you can cache parsed trees), you can use the
following \s-1API:\s0
.SS "new"
.IX Subsection "new"
The new method takes either no parameters, or any of the following parameters:
.PP
.Vb 4
\&    filename
\&    xml
\&    parser
\&    ioref
.Ve
.PP
This uses the familiar hash syntax, so an example might be:
.PP
.Vb 1
\&    use XML::XPath::XMLParser;
\&
\&    my $parser = XML::XPath::XMLParser\->new(filename => \*(Aqexample.xml\*(Aq);
.Ve
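.PP
As a further sketch of the constructor, the other parameters listed above can be passed in the same way. The sample XML string and the variable names below are made up for illustration, and ErrorContext is simply one ordinary XML::Parser option chosen as an example; treat this as one possible usage rather than a recommended setup.

```perl
use strict;
use warnings;
use XML::Parser;
use XML::XPath::XMLParser;

# Hypothetical sample document, used only for illustration.
my $xml = '<doc><item id="1">one</item></doc>';

# No parameters: the object is created without any source set yet.
my $empty_parser = XML::XPath::XMLParser->new;

# The xml parameter: the document is held in a string.
my $string_parser = XML::XPath::XMLParser->new(xml => $xml);

# The parser parameter: supply your own XML::Parser instance
# together with the XML string.
my $custom_parser = XML::XPath::XMLParser->new(
    xml    => $xml,
    parser => XML::Parser->new(ErrorContext => 2),
);
```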
.PP
The parameters represent a filename, a string containing \s-1XML,\s0 an XML::Parser
instance and an open filehandle ref respectively. You can also get or set any
of these properties using the get_ and set_ functions that have the same
name as the property: e.g. get_filename, set_ioref, etc.
.SS "parse"
.IX Subsection "parse"
The parse method generally takes no parameters; however, you are free to
pass either an open filehandle reference or an \s-1XML\s0 string if you so
require. The return value is a tree that XML::XPath can use. The
parse method will die if there is an error in your \s-1XML,\s0 so be sure
to use Perl's exception handling mechanism (eval{};) if you want to avoid
this.
.SS "parsefile"
.IX Subsection "parsefile"
The parsefile method is identical to \fIparse()\fR except that it expects a single
parameter: a string naming a file to open and parse. Again, it
returns a tree and dies if there are \s-1XML\s0 errors.
.SH "NOTICES"
.IX Header "NOTICES"
This file is distributed as part of the XML::XPath module, and is copyright
2000 Fastnet Software Ltd. Please see the documentation for the module as
a whole for licensing information.
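.SH "EXAMPLES"
.IX Header "EXAMPLES"
The advice under parse about wrapping the call in eval can be sketched as follows. The helper name parse_or_undef and both sample documents are hypothetical and exist only for this illustration; this is one possible way to trap the error, assuming the parser dies on malformed \s-1XML\s0 as described above.

```perl
use strict;
use warnings;
use XML::XPath::XMLParser;

# Hypothetical helper: returns the parsed tree, or undef if parsing died.
sub parse_or_undef {
    my ($xml) = @_;
    my $parser = XML::XPath::XMLParser->new(xml => $xml);

    # parse dies on malformed XML, so trap the exception with eval.
    my $tree = eval { $parser->parse };
    if (!defined $tree) {
        warn "XML parse failed: $@";
        return;
    }
    return $tree;
}

# A well-formed document and one with mismatched tags.
my $good = parse_or_undef('<doc><item>one</item></doc>');
my $bad  = parse_or_undef('<doc><item>one</doc>');
```

Returning undef keeps the caller's check to a single defined test; code that needs the error text could return $@ alongside the tree instead.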