.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\" ========================================================================
.\"
.IX Title "XML::Grove 3pm"
.TH XML::Grove 3pm "2018-07-12" "perl v5.26.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::Grove \- Perl\-style XML objects
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& use XML::Grove;
\&
\& # Basic parsing and grove building
\& use XML::Grove::Builder;
\& use XML::Parser::PerlSAX;
\& $grove_builder = XML::Grove::Builder\->new;
\& $parser = XML::Parser::PerlSAX\->new ( Handler => $grove_builder );
\& $document = $parser\->parse ( Source => { SystemId => \*(Aqfilename\*(Aq } );
\&
\& # Creating new objects
\& $document = XML::Grove::Document\->new ( Contents => [ ] );
\& $element = XML::Grove::Element\->new ( Name => \*(Aqtag\*(Aq,
\&                                       Attributes => { },
\&                                       Contents => [ ] );
\&
\& # Accessing XML objects
\& $tag_name = $element\->{Name};
\& $contents = $element\->{Contents};
\& $parent = $element\->{Parent};
\& $characters\->{Data} = \*(AqXML is fun!\*(Aq;
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
XML::Grove is a tree-based object model for accessing the information
set of parsed or stored \s-1XML, HTML,\s0 or \s-1SGML\s0 instances.
XML::Grove objects are Perl hashes and arrays, so you access the
properties of the objects using normal Perl syntax:
.PP
.Vb 1
\& $text = $characters\->{Data};
.Ve
.SS "How To Create a Grove"
.IX Subsection "How To Create a Grove"
There are several ways for groves to come into being: they can be read
from a file or string using a parser and a grove builder, they can be
created by your Perl code using the `\f(CW\*(C`new()\*(C'\fR' methods of
XML::Grove::Objects, or databases or other sources can act as groves.
.PP
The most common way to build groves is using a parser and a grove
builder. The parser is the package that reads the characters of an
\&\s-1XML\s0 file, recognizes the \s-1XML\s0 syntax, and produces ``events''
reporting when elements (tags), text (characters), processing
instructions, and other sequences occur.
A grove builder receives (``consumes'' or ``handles'') these events and
builds XML::Grove objects. The last thing the parser does is return the
XML::Grove::Document object that the grove builder created, with all of
its elements and character data.
.PP
The most common parser and grove builder are XML::Parser::PerlSAX (in
libxml-perl) and XML::Grove::Builder. To build a grove, create the
grove builder first:
.PP
.Vb 1
\& $grove_builder = XML::Grove::Builder\->new;
.Ve
.PP
Then create the parser, passing it the grove builder as its handler:
.PP
.Vb 1
\& $parser = XML::Parser::PerlSAX\->new ( Handler => $grove_builder );
.Ve
.PP
This associates the grove builder with the parser, so that every time
you parse a document with this parser it returns an
XML::Grove::Document object. To parse a file, pass the
`\f(CW\*(C`parse()\*(C'\fR' method a `\f(CW\*(C`Source\*(C'\fR' parameter
containing a `\f(CW\*(C`SystemId\*(C'\fR' parameter (the \s-1URL\s0 or path
of the file you want to parse):
.PP
.Vb 1
\& $document = $parser\->parse ( Source => { SystemId => \*(Aqkjv.xml\*(Aq } );
.Ve
.PP
To parse a string held in a Perl variable, use a
`\f(CW\*(C`Source\*(C'\fR' parameter containing a
`\f(CW\*(C`String\*(C'\fR' parameter:
.PP
.Vb 1
\& $document = $parser\->parse ( Source => { String => $xml_text } );
.Ve
.PP
The following are all parsers that work with XML::Grove::Builder:
.PP
.Vb 3
\& XML::Parser::PerlSAX (in libxml\-perl, uses XML::Parser)
\& XML::ESISParser      (in libxml\-perl, uses James Clark\*(Aqs \`nsgmls\*(Aq)
\& XML::SAX2Perl        (in libxml\-perl, translates SAX 1.0 to PerlSAX)
.Ve
.PP
Most parsers supply more properties than the standard information set
below, and XML::Grove makes available all the properties given by the
parser. Refer to the parser documentation to find out what additional
properties it may provide.
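.PP
The steps above can be combined into one short script. This is only a
minimal sketch: the sample \s-1XML\s0 string and the variable names are
illustrative assumptions, and it presumes that XML::Parser::PerlSAX and
the XML::Parser module it uses are installed. Because a document's
`Contents' may hold more than the root element, the sketch picks out the
first element object rather than assuming it is at index 0:
.PP
.Vb 5
\& use strict;
\& use warnings;
\& use XML::Grove;
\& use XML::Grove::Builder;
\& use XML::Parser::PerlSAX;
\&
\& # Illustrative input; any well\-formed XML string would do
\& my $xml_text = \*(Aq<greeting lang="en">XML is fun!</greeting>\*(Aq;
\&
\& # Create the grove builder, then hand it to the parser as its handler
\& my $grove_builder = XML::Grove::Builder\->new;
\& my $parser = XML::Parser::PerlSAX\->new ( Handler => $grove_builder );
\&
\& # parse() returns the XML::Grove::Document built by the handler
\& my $document = $parser\->parse ( Source => { String => $xml_text } );
\&
\& # Find the root element among the document\*(Aqs contents
\& my ($root) = grep { ref($_) && $_\->isa(\*(AqXML::Grove::Element\*(Aq) }
\&                   @{ $document\->{Contents} };
\&
\& print "root element: $root\->{Name}\en";
\& print "lang attribute: $root\->{Attributes}{lang}\en";
.Ve
.PP
The same script should work with a file by replacing the
`\f(CW\*(C`String\*(C'\fR' parameter with a `\f(CW\*(C`SystemId\*(C'\fR'
parameter.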
.PP
Although none are available yet (as of August 1999), PerlSAX filters
can be used to process the output of a parser before it is passed to
XML::Grove::Builder. XML::Grove::PerlSAX can be used to provide input
to PerlSAX filters or other PerlSAX handlers.
.SS "Using Groves"
.IX Subsection "Using Groves"
The properties provided by parsers are available directly using Perl's
normal syntax for accessing hashes and arrays. For example, to get
the name of an element:
.PP
.Vb 1
\& $element_name = $element\->{Name};
.Ve
.PP
By convention, all properties provided by parsers are in mixed case.
`\f(CW\*(C`Parent\*(C'\fR' properties are available using the
`\f(CW\*(C`Data::Grove::Parent\*(C'\fR' module.
.PP
The following is the minimal set of objects and their properties that
you are likely to get from all parsers:
.SS "XML::Grove::Document"
.IX Subsection "XML::Grove::Document"
The Document object is the parent of the root element of the parsed
\s-1XML\s0 document.
.IP "Contents" 12
.IX Item "Contents"
An array containing the root element.
.PP
A document's `Contents' may also contain processing instructions,
comments, and whitespace.
.PP
Some parsers provide information about the document type, the \s-1XML\s0
declaration, or notations and entities. Check the parser
documentation for property names.
.SS "XML::Grove::Element"
.IX Subsection "XML::Grove::Element"
The Element object represents elements from the \s-1XML\s0 source.
.IP "Parent" 12
.IX Item "Parent"
The parent object of this element.
.IP "Name" 12
.IX Item "Name"
A string, the element type name of this element.
.IP "Attributes" 12
.IX Item "Attributes"
A hash of strings or arrays.
.IP "Contents" 12
.IX Item "Contents"
An array of elements, characters, processing instructions, etc.
.PP
In a purely minimal grove, the attributes of an element will be plain
text (Perl scalars). Some parsers provide access to notations and
entities in attributes, in which case the attribute may contain an
array.
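.PP
As a small illustration of the element properties listed above, the
following sketch builds an element by hand and reads it back with
ordinary hash and array syntax. The element name, attribute, and text
are made-up values, and the attribute is assumed to be a plain string
as in a purely minimal grove:
.PP
.Vb 3
\& use strict;
\& use warnings;
\& use XML::Grove;
\&
\& # Illustrative element with one attribute and one text child
\& my $element = XML::Grove::Element\->new (
\&     Name       => \*(Aqverse\*(Aq,
\&     Attributes => { id => \*(Aqv1\*(Aq },
\&     Contents   => [ XML::Grove::Characters\->new ( Data => \*(AqXML is fun!\*(Aq ) ],
\& );
\&
\& # Name is a plain string
\& print "name: $element\->{Name}\en";
\&
\& # Attributes is a hash reference keyed by attribute name
\& for my $attr (sort keys %{ $element\->{Attributes} }) {
\&     print "attribute $attr = $element\->{Attributes}{$attr}\en";
\& }
\&
\& # Contents is an array reference of child objects
\& for my $child (@{ $element\->{Contents} }) {
\&     print "child class: ", ref($child), "\en";
\& }
.Ve
.PP
With a parser that reports notations or entities in attributes, the
attribute value may be an array reference instead, so code that handles
arbitrary groves would need to check for that case.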
.SS "XML::Grove::Characters"
.IX Subsection "XML::Grove::Characters"
The Characters object represents text from the \s-1XML\s0 source.
.IP "Parent" 12
.IX Item "Parent"
The parent object of this characters object.
.IP "Data" 12
.IX Item "Data"
A string, the characters.
.SS "XML::Grove::PI"
.IX Subsection "XML::Grove::PI"
The \s-1PI\s0 object represents processing instructions from the \s-1XML\s0
source.
.IP "Parent" 12
.IX Item "Parent"
The parent object of this \s-1PI\s0 object.
.IP "Target" 12
.IX Item "Target"
A string, the processing instruction target.
.IP "Data" 12
.IX Item "Data"
A string, the processing instruction data, or undef if none was supplied.
.PP
In addition to the minimal set of objects above, XML::Grove knows
about the following objects, which parsers may provide. Refer to the
parser documentation for descriptions of the properties of these
objects.
.PP
.Vb 11
\& XML::Grove::
\&   ::Entity::External   External entity reference
\&   ::Entity::SubDoc     External SubDoc reference (SGML)
\&   ::Entity::SGML       External SGML reference (SGML)
\&   ::Entity             Entity reference
\&   ::Notation           Notation declaration
\&   ::Comment
\&   ::SubDoc             A parsed subdocument (SGML)
\&   ::CData              A CDATA marked section
\&   ::ElementDecl        An element declaration from the DTD
\&   ::AttListDecl        An element\*(Aqs attribute declaration, from the DTD
.Ve
.SH "METHODS"
.IX Header "METHODS"
XML::Grove by itself provides only one method, \fInew()\fR, for creating
new XML::Grove objects. There are Data::Grove and XML::Grove extension
modules that give additional methods for working with XML::Grove
objects, and new extensions can be created as needed.
.ie n .IP "$obj = XML::Grove::OBJECT\->new( [\s-1PROPERTIES\s0] )" 4
.el .IP "\f(CW$obj\fR = XML::Grove::OBJECT\->new( [\s-1PROPERTIES\s0] )" 4
.IX Item "$obj = XML::Grove::OBJECT->new( [PROPERTIES] )"
`\f(CW\*(C`new\*(C'\fR' creates a new XML::Grove object with the type
\&\fI\s-1OBJECT\s0\fR, and with the initial \fI\s-1PROPERTIES\s0\fR.
\&\fI\s-1PROPERTIES\s0\fR may be given as a list of key-value pairs, a
hash, or an XML::Grove object to copy. \fI\s-1OBJECT\s0\fR may be any of
the objects listed above.
.PP
This is a list of available extensions and the methods they provide
(as of Feb 1999). Refer to their module documentation for more
information on how to use them.
.PP
.Vb 3
\& XML::Grove::AsString
\&   as_string              return portions of groves as a string
\&   attr_as_string         return an element\*(Aqs attribute as a string
\&
\& XML::Grove::AsCanonXML
\&   as_canon_xml           return XML text in canonical XML format
\&
\& XML::Grove::PerlSAX
\&   parse                  emulate a PerlSAX parser using the grove objects
\&
\& Data::Grove::Parent
\&   root                   return the root element of a grove
\&   rootpath               return an array of all objects between the root
\&                          element and this object, inclusive
\&
\&   Data::Grove::Parent also adds \`Parent\*(Aq and \`Raw\*(Aq properties
\&   to grove objects.
\&
\& Data::Grove::Visitor
\&   accept                 call back a subroutine using an object type name
\&   accept_name            call back using an element or tag name
\&   children_accept        for each child in Contents, call back a sub
\&   children_accept_name   same, but using tag names
\&   attr_accept            call back for the objects in attributes
\&
\& XML::Grove::IDs
\&   get_ids                return a list of all ID attributes in grove
\&
\& XML::Grove::Path
\&   at_path                $el\->at_path(\*(Aq/html/body/ul/li[4]\*(Aq)
\&
\& XML::Grove::Sub
\&   filter                 run a sub against all the objects in the grove
.Ve
.SH "WRITING EXTENSIONS"
.IX Header "WRITING EXTENSIONS"
The class `\f(CW\*(C`XML::Grove\*(C'\fR' is the superclass of all classes in the
XML::Grove module. `\f(CW\*(C`XML::Grove\*(C'\fR' is a subclass of
`\f(CW\*(C`Data::Grove\*(C'\fR'.
.PP
If you create an extension and you want to add a method to \fIall\fR
XML::Grove objects, then create that method in the XML::Grove
package. Many extensions only need to add methods to
XML::Grove::Document and/or XML::Grove::Element.
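.PP
As a sketch of what such an extension might look like, the following
adds a method to the XML::Grove::Element package only. The method name
`\f(CW\*(C`text_content\*(C'\fR' is hypothetical and is not part of
XML::Grove or any of the extensions listed above; it simply
concatenates the `Data' of all character objects below an element:
.PP
.Vb 4
\& use strict;
\& use warnings;
\& use XML::Grove;
\& use Scalar::Util qw(blessed);
\&
\& # Hypothetical extension method, installed in XML::Grove::Element
\& sub XML::Grove::Element::text_content {
\&     my $self = shift;
\&     my $text = \*(Aq\*(Aq;
\&     for my $child (@{ $self\->{Contents} || [] }) {
\&         next unless blessed $child;    # skip anything that is not an object
\&         if ($child\->isa(\*(AqXML::Grove::Characters\*(Aq)) {
\&             $text .= $child\->{Data};
\&         }
\&         elsif ($child\->isa(\*(AqXML::Grove::Element\*(Aq)) {
\&             $text .= $child\->text_content;    # recurse into child elements
\&         }
\&     }
\&     return $text;
\& }
\&
\& # Usage with a small hand\-built grove (illustrative values)
\& my $para = XML::Grove::Element\->new (
\&     Name     => \*(Aqp\*(Aq,
\&     Contents => [
\&         XML::Grove::Characters\->new ( Data => \*(AqXML is \*(Aq ),
\&         XML::Grove::Element\->new (
\&             Name     => \*(Aqem\*(Aq,
\&             Contents => [ XML::Grove::Characters\->new ( Data => \*(Aqfun!\*(Aq ) ],
\&         ),
\&     ],
\& );
\& print $para\->text_content, "\en";
.Ve
.PP
Placing the method in XML::Grove::Element rather than in XML::Grove
keeps it off objects, such as processing instructions, where it would
have no sensible meaning.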
.PP
When you create an extension, you should also provide a way to invoke
your module using objects from your own package. For example,
XML::Grove::AsString's `\f(CW\*(C`as_string()\*(C'\fR' method can also be
called using an XML::Grove::AsString object:
.PP
.Vb 2
\& $writer = XML::Grove::AsString\->new;
\& $string = $writer\->as_string ( $xml_object );
.Ve
.SH "AUTHOR"
.IX Header "AUTHOR"
Ken MacLeod, ken@bitsko.slc.ut.us
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\fIperl\fR\|(1), \fIXML::Grove\fR\|(3)
.PP
Extensible Markup Language (\s-1XML\s0)