.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML::Compile::Translate::Reader 3pm"
.TH XML::Compile::Translate::Reader 3pm "2022-11-27" "perl v5.36.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::Compile::Translate::Reader \- translate XML to HASH
.SH "INHERITANCE"
.IX Header "INHERITANCE"
.Vb 2
\& XML::Compile::Translate::Reader
\&   is a XML::Compile::Translate
.Ve
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 2
\& my $schema = XML::Compile::Schema\->new(...);
\& my $code   = $schema\->compile(READER => ...);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
The translator understands schemas, but does not encode that into
actions.  This module implements those actions to translate from \s-1XML\s0
into a (nested) Perl \s-1HASH\s0 structure.
.PP
Extends \*(L"\s-1DESCRIPTION\*(R"\s0 in XML::Compile::Translate.
.SH "METHODS"
.IX Header "METHODS"
Extends \*(L"\s-1METHODS\*(R"\s0 in XML::Compile::Translate.
.SH "DETAILS"
.IX Header "DETAILS"
Extends \*(L"\s-1DETAILS\*(R"\s0 in XML::Compile::Translate.
.SS "Translator options"
.IX Subsection "Translator options"
Extends \*(L"Translator options\*(R" in XML::Compile::Translate.
.SS "Processing Wildcards"
.IX Subsection "Processing Wildcards"
If you want to collect information from the \s-1XML\s0 structure, which is
permitted by \f(CW\*(C`any\*(C'\fR and \f(CW\*(C`anyAttribute\*(C'\fR specifications in the schema,
you have to implement that yourself.  The problem is that \f(CW\*(C`XML::Compile\*(C'\fR
has less knowledge than you about the possible data.
.PP
\fIoption any_attribute\fR
.IX Subsection "option any_attribute"
.PP
By default, the \f(CW\*(C`anyAttribute\*(C'\fR specification is ignored.  When \f(CW\*(C`TAKE_ALL\*(C'\fR
is given, all attributes which fulfill the name-space requirement are
added to the returned data-structure.  The absolute element name is
used as key, with the related unparsed \s-1XML\s0 element as value.
.PP
In the current implementation, if an explicit attribute is also
covered by the name-spaces permitted by the anyAttribute definition,
then it will also appear in that list (and hence the handler will
be called as well).
.PP
Use XML::Compile::Schema::compile(any_attribute) to write your
own handler, to influence the behavior.  The handler will be called for
each attribute, and you must return a list of pairs of derived information.
When the returned list is empty, the attribute data is lost.  The value may
be a complex structure.
.PP
\fIoption any_element\fR
.IX Subsection "option any_element"
.PP
By default, the \f(CW\*(C`any\*(C'\fR definition in a schema will ignore all elements
from the container which are not used.  Also in this case \f(CW\*(C`TAKE_ALL\*(C'\fR
is required to produce \f(CW\*(C`any\*(C'\fR results.  \f(CW\*(C`SKIP_ALL\*(C'\fR will ignore all
results, although they are still processed for validation needs.
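.PP
As a minimal sketch, both wildcard options can be enabled when the reader
is compiled.  The schema file name, the name-space, and the element name
below are placeholders chosen for illustration only:
.PP
.Vb 12
\& use XML::Compile::Schema;
\& use XML::Compile::Util  qw/pack_type/;
\&
\& # hypothetical schema file and element
\& my $schema = XML::Compile::Schema\->new(\*(Aqmyschema.xsd\*(Aq);
\& my $type   = pack_type \*(Aqhttp://example.com/ns\*(Aq, \*(Aqroot\*(Aq;
\&
\& my $reader = $schema\->compile
\&   ( READER        => $type
\&   , any_element   => \*(AqTAKE_ALL\*(Aq   # keep elements matched by any
\&   , any_attribute => \*(AqTAKE_ALL\*(Aq   # keep attributes matched by anyAttribute
\&   );
.Ve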
.PP
\fIoption any_type \s-1CODE\s0\fR
.IX Subsection "option any_type CODE"
.PP
By default, the elements which have type \*(L"xsd:anyType\*(R" will return
an XML::LibXML::Element when there are sub-elements.  Otherwise,
it will return the textual content.
.PP
If you pass your own \s-1CODE\s0 reference, you can change this behavior.  It
will get called with the path, the node, and the default handler.  Be
aware the \f(CW$node\fR may actually be a string already.
.PP
.Vb 6
\&   $schema\->compile(READER => ..., any_type => \e&handle_any_type);
\&   sub handle_any_type($$$)
\&   { my ($path, $node, $handler) = @_;
\&     ref $node or return $node;
\&     $node;
\&   }
.Ve
.SS "Mixed elements"
.IX Subsection "Mixed elements"
[available since 0.86]
ComplexType and ComplexContent components can be declared with the
\&\f(CW\*(C`mixed="true"\*(C'\fR attribute.  This implies that text is not limited
to the content of containers, but may also be used in between elements.
Usually, you will only find ignorable white-space between elements.
.PP
In this example, the \f(CW\*(C`a\*(C'\fR container is marked to be mixed:
  <a id="5"> before <b>2</b> after </a>
.PP
Mixed content is usually wanted in one of two ways: either the element
is needed as text, or the element should be parsed and the text ignored.
The reader has various options to avoid the need of processing raw
XML::LibXML nodes.
.PP
[1.00]
When the return is a \s-1HASH,\s0 that \s-1HASH\s0 will also contain the
\&\f(CW\*(C`_MIXED_ELEMENT_MODE\*(C'\fR key, to help people understand what
happens.  This is not possible for all modes, only for some.
.PP
With XML::Compile::Schema::compile(mixed_elements) set to
.IP "\s-1ATTRIBUTES\s0  (the default)" 4
.IX Item "ATTRIBUTES (the default)"
a \s-1HASH\s0 is returned and the attributes are processed.  The node is found
as XML::LibXML::Element with the key '_'.  The example above will
produce
  \f(CW$r\fR = { id => 5, _ => \f(CW$xmlnode\fR };
.IP "\s-1TEXTUAL\s0" 4
.IX Item "TEXTUAL"
Like the previous, but now the textual representation of the content is
returned with key '_'.  The example above will produce
  \f(CW$r\fR = { id => 5, _ => ' before 2 after '};
.IP "\s-1STRUCTURAL\s0" 4
.IX Item "STRUCTURAL"
will remove all mixed-in text, and treat the element as normal element.
The example will be transformed into
  \f(CW$r\fR = { id => 5, b => 2 };
.IP "\s-1XML_NODE\s0" 4
.IX Item "XML_NODE"
return the XML::LibXML::Node itself.  The example:
  \f(CW$r\fR = \f(CW$xmlnode\fR;
.IP "\s-1XML_STRING\s0" 4
.IX Item "XML_STRING"
return the mixed node as \s-1XML\s0 string, just as in the source.  Be warned
that it is rather expensive: the string was parsed and then stringified
again, which is costly for large nodes.  Result:
  \f(CW$r\fR = '<a id="5"> before <b>2</b> after </a>';
.IP "\s-1CODE\s0 reference" 4
.IX Item "CODE reference"
the reference is called with the XML::LibXML::Node as first argument.
When a value is returned (even undef), then the right tag with the value
will be included in the translator's result.  When an empty list is
returned by the code reference, then nothing is returned (which may
result in an error if the element is required according to the schema).
.PP
When some of your mixed elements need different behavior from other
elements, you have to use the normal hooks for those specific
cases.
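.PP
A sketch of selecting a mode at compile time.  The schema file and the
type name are placeholders, and the callback body is only one possible
choice, not a recommended implementation:
.PP
.Vb 14
\& use XML::Compile::Schema;
\&
\& # hypothetical schema file and type
\& my $schema = XML::Compile::Schema\->new(\*(Aqmyschema.xsd\*(Aq);
\& my $type   = \*(Aq{http://example.com/ns}a\*(Aq;
\&
\& # return the textual content under key \*(Aq_\*(Aq
\& my $textual = $schema\->compile
\&   ( READER => $type, mixed_elements => \*(AqTEXTUAL\*(Aq );
\&
\& # or decide per node with a CODE reference
\& my $custom  = $schema\->compile
\&   ( READER         => $type
\&   , mixed_elements => sub { my $node = shift; $node\->textContent }
\&   );
.Ve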
.SS "Schema hooks"
.IX Subsection "Schema hooks"
\fIhooks executed before the \s-1XML\s0 is being processed\fR
.IX Subsection "hooks executed before the XML is being processed"
.PP
The \f(CW\*(C`before\*(C'\fR hook receives an XML::LibXML::Node object and
the path string.  It must return a new (or the same) \s-1XML\s0 node which
will be used from then on.  It is usually best to modify a clone of
the node, not the original as provided by the user.  When \f(CW\*(C`undef\*(C'\fR
is returned, the whole node will disappear.
.PP
This hook offers a predefined \f(CW\*(C`PRINT_PATH\*(C'\fR.
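.PP
The following sketch modifies a clone of the node in a \f(CW\*(C`before\*(C'\fR hook,
so the document passed in by the user stays untouched.  The schema file,
the type name, and the \f(CW\*(C`debug\*(C'\fR attribute are assumptions made for
this example:
.PP
.Vb 16
\& use XML::Compile::Schema;
\&
\& # hypothetical schema file
\& my $schema = XML::Compile::Schema\->new(\*(Aqmyschema.xsd\*(Aq);
\&
\& $schema\->addHook
\&   ( action => \*(AqREADER\*(Aq
\&   , type   => \*(Aqtns:xyz\*(Aq                  # hypothetical type
\&   , before => sub
\&       { my ($node, $path) = @_;
\&         my $clone = $node\->cloneNode(1);  # deep copy
\&         $clone\->removeAttribute(\*(Aqdebug\*(Aq) # hypothetical attribute
\&             if $clone\->isa(\*(AqXML::LibXML::Element\*(Aq);
\&         $clone;                           # used from here on
\&       }
\&   );
.Ve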
.PP
\fIhooks executed as replacement\fR
.IX Subsection "hooks executed as replacement"
.PP
Your \f(CW\*(C`replace\*(C'\fR hook should return a list of key-value pairs. To produce
it, it will get the XML::LibXML::Element, the translator settings as
\&\s-1HASH,\s0 the path, and the localname.
.PP
This hook has a predefined \f(CW\*(C`SKIP\*(C'\fR, which will not process the
found element, but simply return the string \*(L"\s-1SKIPPED\*(R"\s0 as value.
This way, a whole tree of unneeded translations can be avoided.
.PP
[1.51] The predefined hook \f(CW\*(C`XML_NODE\*(C'\fR will not attempt to parse the
selected element, but returns the XML::LibXML::Element node instead.
This may break on some schema-contained validations.
.PP
Sometimes, the Schema spec is such a mess that XML::Compile cannot
automatically translate it.  I have seen cases where confusion
over name-spaces is created: a choice between three elements with
the same name but different types.  In such a case you may use
XML::LibXML::Simple to translate a part of your tree.  Simply
.PP
.Vb 10
\& use XML::LibXML::Simple  qw/XMLin/;
\& $schema\->addHook
\&   ( action  => \*(AqREADER\*(Aq
\&   , type    => \*(Aqtns:xyz\*(Aq     # or pack_type($tns,\*(Aqxyz\*(Aq)
\&  #  path    => qr!/company$! # by element name
\&   , replace =>
\&       sub { my ($xml, $args, $path, $type, $r) = @_;
\&             ($type => XMLin($xml, ...));
\&           }
\&   );
.Ve
.PP
\fIhooks for post-processing, after the data is collected\fR
.IX Subsection "hooks for post-processing, after the data is collected"
.PP
Your code reference gets called with three parameters: the \s-1XML\s0 node,
the data collected and the path.  Be careful that the collected data
might be a \s-1SCALAR\s0 (for simpleType).  Return a \s-1HASH\s0 or a \s-1SCALAR.\s0  \f(CW\*(C`undef\*(C'\fR
may work, unless it is the value of a required element you throw away.
.PP
This hook also offers a predefined \f(CW\*(C`PRINT_PATH\*(C'\fR.  Besides, it
has \f(CW\*(C`INCLUDE_PATH\*(C'\fR, \f(CW\*(C`XML_NODE\*(C'\fR, \f(CW\*(C`NODE_TYPE\*(C'\fR, \f(CW\*(C`ELEMENT_ORDER\*(C'\fR,
and \f(CW\*(C`ATTRIBUTE_ORDER\*(C'\fR, which will result in additional fields in
the \s-1HASH,\s0 respectively containing the \s-1NODE\s0 which was processed (an
XML::LibXML::Element), the type_of_node, the element names, and the
attribute names.  The keys start with an underscore \f(CW\*(C`_\*(C'\fR.
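.PP
A sketch of a custom \f(CW\*(C`after\*(C'\fR hook which takes the three parameters
described above.  The schema file, the type name, and the \f(CW\*(C`_seen_at\*(C'\fR
key are made up for this example:
.PP
.Vb 17
\& use XML::Compile::Schema;
\&
\& # hypothetical schema file
\& my $schema = XML::Compile::Schema\->new(\*(Aqmyschema.xsd\*(Aq);
\&
\& $schema\->addHook
\&   ( action => \*(AqREADER\*(Aq
\&   , type   => \*(Aqtns:xyz\*(Aq             # hypothetical type
\&   , after  => sub
\&       { my ($node, $data, $path) = @_;
\&         ref $data eq \*(AqHASH\*(Aq         # simpleType data is a SCALAR
\&             or return $data;
\&         $data\->{_seen_at} = $path;   # hypothetical extra key
\&         $data;
\&       }
\&   );
.Ve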
.SS "Typemaps"
.IX Subsection "Typemaps"
In a typemap, a relation between an \s-1XML\s0 element type and a Perl class (or
object) is made.  Each translator back-end will implement this a little
differently.  This section is about how the reader handles typemaps.
.PP
\fITypemap to Class\fR
.IX Subsection "Typemap to Class"
.PP
Usually, an \s-1XML\s0 type will be mapped on a Perl class.  The Perl class
implements the \f(CW\*(C`fromXML\*(C'\fR method as constructor.
.PP
.Vb 1
\& $schema\->addTypemaps($sometype => \*(AqMy::Perl::Class\*(Aq);
\&
\& package My::Perl::Class;
\& ...
\& sub fromXML
\& {   my ($class, $data, $xmltype) = @_;
\&     my $self = $class\->new($data);
\&     ...
\&     $self;
\& }
.Ve
.PP
Your method returns the data which will be included in the result tree
of the reader.  You may return an object, the unmodified \f(CW$data\fR, or
\&\f(CW\*(C`undef\*(C'\fR.  When \f(CW\*(C`undef\*(C'\fR is returned, this may fail the schema parser
when the data element is required.
.PP
In the simplest implementation, the class stores its data exactly as
the \s-1XML\s0 structure:
.PP
.Vb 5
\& package My::Perl::Class;
\& sub fromXML
\& {   my ($class, $data, $xmltype) = @_;
\&     bless $data, $class;
\& }
\&
\& # The same, even shorter:
\& sub fromXML { bless $_[1], $_[0] }
.Ve
.PP
\fITypemap to Object\fR
.IX Subsection "Typemap to Object"
.PP
Another option is to implement an object factory: one object which creates
other objects.  In this case, the \f(CW$xmltype\fR parameter can be of use,
to have one object spawning many different other objects.
.PP
.Vb 2
\& my $object = My::Perl::Class\->new(...);
\& $schema\->typemap($sometype => $object);
\&
\& package My::Perl::Class;
\& sub fromXML
\& {   my ($object, $data, $xmltype) = @_;
\&     return Some::Other::Class\->new($data);
\& }
.Ve
.PP
This object factory may be a very simple solution when you map \s-1XML\s0 onto
objects which are not under your control, where there is no way to
add the \f(CW\*(C`fromXML\*(C'\fR method.
.PP
\fITypemap to \s-1CODE\s0\fR
.IX Subsection "Typemap to CODE"
.PP
The light version of an object factory works with \s-1CODE\s0 references.
.PP
.Vb 7
\& $schema\->typemap($t1 => \e&myhandler);
\& sub myhandler
\& {   my ($backend, $data, $type) = @_;
\&     return My::Perl::Class\->new($data)
\&         if $backend eq \*(AqREADER\*(Aq;
\&     $data;
\& }
\&
\& # shorter
\& $schema\->typemap($t1 => sub {My::Perl::Class\->new($_[1])} );
.Ve
.PP
\fITypemap implementation\fR
.IX Subsection "Typemap implementation"
.PP
Internally, the typemap is simply translated into an \*(L"after\*(R" hook for the
specific type.  After the data was processed via the usual mechanism,
the hook will call method \f(CW\*(C`fromXML\*(C'\fR on the class or object you specified
with the data which was read.  You may still use \*(L"before\*(R" and \*(L"replace\*(R"
hooks, if you need them.
.PP
Syntactic sugar:
.PP
.Vb 2
\&  $schema\->typemap($t1 => \*(AqMy::Package\*(Aq);
\&  $schema\->typemap($t2 => $object);
.Ve
.PP
is comparable to
.PP
.Vb 2
\&  $schema\->typemap($t1 => sub {My::Package\->fromXML(@_)});
\&  $schema\->typemap($t2 => sub {$object\->fromXML(@_)} );
.Ve
.PP
with some extra checks.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
This module is part of XML-Compile distribution version 1.63,
built on July 02, 2019. Website: \fIhttp://perl.overmeer.net/xml\-compile/\fR
.SH "LICENSE"
.IX Header "LICENSE"
Copyrights 2006\-2019 by [Mark Overmeer <markov@cpan.org>]. For other contributors see ChangeLog.
.PP
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
See \fIhttp://dev.perl.org/licenses/\fR