NAME
XML::GRDDL - transform XML and XHTML to RDF
SYNOPSIS
High-level interface:
    use XML::GRDDL;

    my $grddl = XML::GRDDL->new;
    my $model = $grddl->data($xmldoc, $baseuri);
    # $model is an RDF::Trine::Model
Low-level interface:
    my $grddl = XML::GRDDL->new;
    my @transformations = $grddl->discover($xmldoc, $baseuri);
    foreach my $t (@transformations)
    {
        # $t is an XML::GRDDL::Transformation
        my ($output, $mediatype) = $t->transform($xmldoc);
        # $output is a string of type $mediatype.
    }
DESCRIPTION
GRDDL is a W3C Recommendation for extracting RDF data from arbitrary XML and
XHTML via a transformation, typically written in XSLT. See
<http://www.w3.org/TR/grddl/> for more details.
This module implements GRDDL in Perl. It offers both a low-level interface,
which returns the list of transformations associated with the document being
processed so that you can run them selectively; and a high-level interface,
which returns a single RDF model representing the union of the RDF graphs
generated by applying all available transformations.
Constructor
- "XML::GRDDL->new"
- The constructor accepts no parameters and returns an
XML::GRDDL object.
Methods
- "$grddl->discover($xml, $base, %options)"
- Processes the document to discover the transformations
associated with it. $xml is the raw XML source of the document, or an
XML::LibXML::Document object. ($xml cannot be "tag soup" HTML,
though you should be able to use HTML::HTML5::Parser to parse tag soup
into an XML::LibXML::Document.) $base is the base URI for resolving
relative references.
Returns a list of XML::GRDDL::Transformation objects.
Options include:
- force_rel - boolean; interpret XHTML rel="transformation" even in the
absence of the GRDDL profile.
- strings - boolean; return a list of plain strings instead of blessed
objects.
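As a rough sketch of how these options might be combined: the document, base URI and stylesheet name below are made up for illustration, and the exact form of the returned strings (presumably transformation URIs) is an assumption rather than something stated above.

```perl
use strict;
use warnings;
use XML::GRDDL;

# Hypothetical base URI and document, for illustration only.
my $base = 'http://example.com/doc.xhtml';
my $xml  = <<'XHTML';
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
    <link rel="transformation" href="extract.xsl" />
  </head>
  <body><p>Hello</p></body>
</html>
XHTML

my $grddl = XML::GRDDL->new;

# force_rel: honour rel="transformation" although the head element
#            declares no GRDDL profile.
# strings:   ask for plain strings rather than
#            XML::GRDDL::Transformation objects.
my @found = $grddl->discover($xml, $base, force_rel => 1, strings => 1);

print "$_\n" for @found;
```

Without force_rel, the link in this document would be expected to go unnoticed, because the head element carries no profile attribute.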
- "$grddl->data($xml, $base, %options)"
- Processes the document, discovers the transformations
associated with it, applies the transformations and merges the results
into a single RDF model. $xml and $base are as per "discover".
Returns an RDF::Trine::Model containing the data. Statement contexts (a.k.a.
named graphs / quads) are used to distinguish between data from the result
of each transformation.
Options include:
- force_rel - boolean; interpret XHTML rel="transformation" even in the
absence of the GRDDL profile.
- metadata - boolean; include provenance information in the default graph
(a.k.a. nil context).
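Because each transformation's output sits in its own statement context, the returned model can be broken down per graph. The helper below is a minimal sketch of that idea; statements_per_graph is a hypothetical name, not part of XML::GRDDL, and it relies only on RDF::Trine::Model methods.

```perl
use strict;
use warnings;
use RDF::Trine;

# Hypothetical helper: count the statements held in each named graph
# of an RDF::Trine::Model, such as the one returned by data().
sub statements_per_graph
{
    my ($model) = @_;
    my %count;

    # In scalar context get_contexts returns an iterator over graph nodes.
    my $graphs = $model->get_contexts;
    while (my $graph = $graphs->next)
    {
        $count{ $graph->as_string } =
            $model->count_statements(undef, undef, undef, $graph);
    }
    return %count;
}
```

Given a model obtained from $grddl->data($xml, $base, metadata => 1), calling statements_per_graph($model) should yield one entry per named graph; the provenance statements live in the default graph and are therefore not expected to appear in the result.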
- "$grddl->ua( [$ua] )"
- Get/set the user agent used for HTTP requests. $ua, if
supplied, must be an LWP::UserAgent.
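A minimal sketch of supplying a custom user agent follows. The application name and the 10-second timeout are illustrative values only.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::GRDDL;

# 'MyApp/1.0' and the timeout are made-up example values.
# A trailing space in the agent string makes LWP append its own
# "libwww-perl/x.y" identifier.
my $ua = LWP::UserAgent->new(
    agent   => 'MyApp/1.0 ',
    timeout => 10,
);

my $grddl = XML::GRDDL->new;
$grddl->ua($ua);              # set the user agent
my $current = $grddl->ua;     # get it back

printf "User-Agent: %s\n", $current->agent;
```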
FEATURES
XML::GRDDL supports transformations written in XSLT 1.0, and in RDF-EASE.
XML::GRDDL is a good HTTP citizen: Referer headers are included in requests, and
appropriate Accept headers supplied. To be an even better citizen, I recommend
changing the User-Agent header to advertise the name of the application:
    $grddl->ua->default_header(user_agent => 'MyApp/1.0 ');
Provenance information for GRDDL transformations is returned using the GRDDL
vocabulary at <http://www.w3.org/2003/g/data-view#>.
Certain XHTML profiles and XML namespaces known not to contain any
transformations, or to contain only useless transformations, are skipped. See
XML::GRDDL::Namespace and XML::GRDDL::Profile for details. In particular,
profiles for RDFa and many Microformats are skipped, as RDF::RDFa::Parser and
HTML::Microformats will typically yield far superior results.
BUGS
Please report any bugs to <http://rt.cpan.org/>.
Known limitations:
- Recursive GRDDL doesn't work yet. That is, the profile documents and
namespace documents linked to from your primary document cannot themselves
rely on GRDDL.
SEE ALSO
XML::GRDDL::Transformation, XML::GRDDL::Namespace, XML::GRDDL::Profile,
XML::GRDDL::Transformation::RDF_EASE::Functional, XML::Saxon::XSLT2.
HTML::HTML5::Parser, RDF::RDFa::Parser, HTML::Microformats.
JSON::GRDDL.
<http://www.w3.org/TR/grddl/>.
<http://www.perlrdf.org/>.
This module is derived from Swignition <http://buzzword.org.uk/swignition/>.
AUTHOR
Toby Inkster <tobyink@cpan.org>.
COPYRIGHT
Copyright 2008-2011 Toby Inkster
This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.