NAME
XML::GRDDL - transform XML and XHTML to RDF
SYNOPSIS
High-level interface:
use XML::GRDDL;
my $grddl = XML::GRDDL->new;
my $model = $grddl->data($xmldoc, $baseuri);
# $model is an RDF::Trine::Model
Low-level interface:
my $grddl = XML::GRDDL->new;
my @transformations = $grddl->discover($xmldoc, $baseuri);
foreach my $t (@transformations)
{
    # $t is an XML::GRDDL::Transformation
    my ($output, $mediatype) = $t->transform($xmldoc);
    # $output is a string of type $mediatype.
}
DESCRIPTION
GRDDL is a W3C Recommendation for extracting RDF data from arbitrary XML and
XHTML via a transformation, typically written in XSLT. See
<http://www.w3.org/TR/grddl/> for more details.
This module implements GRDDL in Perl. It offers two interfaces: a low-level
interface, which returns the list of transformations associated with the
document being processed so that you can run them selectively; and a
high-level interface, which returns a single RDF model representing the union
of the RDF graphs generated by applying all available transformations.
Constructor
- "XML::GRDDL->new"
- The constructor accepts no parameters and returns an XML::GRDDL
object.
Methods
- "$grddl->discover($xml, $base, %options)"
- Processes the document to discover the transformations associated with it.
$xml is the raw XML source of the document, or an XML::LibXML::Document
object. ($xml cannot be "tag soup" HTML, though you should be
able to use HTML::HTML5::Parser to parse tag soup into an
XML::LibXML::Document.) $base is the base URI for resolving relative
references.
Returns a list of XML::GRDDL::Transformation objects.
Options include:
- force_rel - boolean; interpret XHTML rel="transformation"
even in the absence of the GRDDL profile.
- strings - boolean; return a list of plain strings instead of
blessed objects.
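As a minimal sketch of the options above, the following parses a "tag soup"
HTML string with HTML::HTML5::Parser and asks "discover" for plain strings.
The HTML content, the base URI and the transformation URL are hypothetical
placeholders, not part of this module. Discovery may make HTTP requests, so
the result depends on network access.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use XML::GRDDL;

# Hypothetical tag-soup input: unclosed <p>, unquoted attribute value,
# and no GRDDL profile on <head>.
my $html = <<'HTML';
<html>
<head>
<title>Example</title>
<link rel=transformation href="http://example.com/to-rdf.xsl">
</head>
<body><p>Hello
</body>
</html>
HTML

# Hypothetical base URI used to resolve relative references.
my $base = 'http://example.com/page.html';

# Parse the tag soup into an XML::LibXML::Document first.
my $dom = HTML::HTML5::Parser->new->parse_string($html);

my $grddl = XML::GRDDL->new;

# force_rel: honour rel="transformation" without the GRDDL profile.
# strings:   return plain strings rather than blessed objects.
my @found = $grddl->discover($dom, $base, force_rel => 1, strings => 1);

print "$_\n" for @found;
```

Without strings, each element of the returned list would instead be an
XML::GRDDL::Transformation object.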
- "$grddl->data($xml, $base, %options)"
- Processes the document, discovers the transformations associated with it,
applies the transformations and merges the results into a single RDF
model. $xml and $base are as per "discover".
Returns an RDF::Trine::Model containing the data. Statement contexts (a.k.a.
named graphs / quads) are used to keep the data produced by each
transformation distinct.
Options include:
- force_rel - boolean; interpret XHTML rel="transformation"
even in the absence of the GRDDL profile.
- metadata - boolean; include provenance information in the default
graph (a.k.a. nil context).
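The sketch below shows one way to inspect the model returned by "data": it
counts the statements in each named graph, then those in the default graph
where the provenance information is placed. The document, its base URI and the
transformation URL are hypothetical, and fetching the transformation requires
network access, so the counts you see will vary.

```perl
use strict;
use warnings;
use RDF::Trine;
use XML::GRDDL;

# Hypothetical XML document naming a transformation on its root element.
my $xml = <<'XML';
<doc xmlns:grddl="http://www.w3.org/2003/g/data-view#"
     grddl:transformation="http://example.com/to-rdf.xsl">
  <title>Example</title>
</doc>
XML

# Hypothetical base URI used to resolve relative references.
my $base = 'http://example.com/doc.xml';

my $grddl = XML::GRDDL->new;
my $model = $grddl->data($xml, $base, metadata => 1);

# One named graph (statement context) per transformation result.
my $contexts = $model->get_contexts;
while (my $graph = $contexts->next)
{
    my $count = $model->count_statements(undef, undef, undef, $graph);
    printf "%s: %d statement(s)\n", $graph->as_string, $count;
}

# Provenance requested via metadata lives in the default graph (nil context).
my $nil = RDF::Trine::Node::Nil->new;
printf "default graph: %d statement(s)\n",
    $model->count_statements(undef, undef, undef, $nil);
```

Passing a context as the fourth argument restricts the count to that graph;
the same argument can be given to get_statements to iterate over the
statements themselves.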
- "$grddl->ua( [$ua] )"
- Get/set the user agent used for HTTP requests. $ua, if supplied, must be
an LWP::UserAgent.
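For illustration, a custom user agent could be supplied like this. The agent
string and timeout are arbitrary example values.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::GRDDL;

# Example values only: pick an agent string and timeout to suit your application.
# An agent string ending in a space has the libwww-perl version appended by LWP.
my $ua = LWP::UserAgent->new(
    agent   => 'MyApp/1.0 ',
    timeout => 10,
);

my $grddl = XML::GRDDL->new;
$grddl->ua($ua);    # set the user agent

# Called without arguments, ua acts as the getter.
printf "Requests will time out after %d seconds\n", $grddl->ua->timeout;
```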
Constants
These constants may be exported upon request.
- "GRDDL_NS"
- "XHTML_NS"
FEATURES
XML::GRDDL supports transformations written in XSLT 1.0, and in RDF-EASE.
XML::GRDDL is a good HTTP citizen: Referer headers are included in requests, and
appropriate Accept headers supplied. To be an even better citizen, I recommend
changing the User-Agent header to advertise the name of the application:
$grddl->ua->agent('MyApp/1.0 ');
Provenance information for GRDDL transformations is returned using the GRDDL
vocabulary at <http://www.w3.org/2003/g/data-view#>.
Certain XHTML profiles and XML namespaces known not to contain any
transformations, or to contain only useless transformations, are skipped. See
XML::GRDDL::Namespace and XML::GRDDL::Profile for details. In particular,
profiles for RDFa and many Microformats are skipped, as RDF::RDFa::Parser and
HTML::Microformats will typically yield far superior results.
BUGS
Please report any bugs to <http://rt.cpan.org/>.
Known limitations:
- Recursive GRDDL doesn't work yet.
That is, the profile documents and namespace documents linked to from your
primary document cannot themselves rely on GRDDL.
SEE ALSO
XML::GRDDL::Transformation, XML::GRDDL::Namespace, XML::GRDDL::Profile,
XML::GRDDL::Transformation::RDF_EASE::Functional, XML::Saxon::XSLT2.
HTML::HTML5::Parser, RDF::RDFa::Parser, HTML::Microformats.
JSON::GRDDL.
<http://www.w3.org/TR/grddl/>.
<http://www.perlrdf.org/>.
This module is derived from Swignition <http://buzzword.org.uk/swignition/>.
AUTHOR
Toby Inkster <tobyink@cpan.org>.
COPYRIGHT AND LICENCE
Copyright 2008-2012 Toby Inkster
This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
DISCLAIMER OF WARRANTIES
THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.