.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings. \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote. \*(C+ will
.\" give a nicer C++. Capital omega is used to do unbreakable dashes and
.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD. Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear. Run. Save yourself. No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Catmandu::Importer::OAI 3pm"
.TH Catmandu::Importer::OAI 3pm "2023-10-26" "perl v5.36.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification. Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Catmandu::Importer::OAI \- Package that imports OAI\-PMH feeds
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&    # From the command line
\&
\&    # Harvest records
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-metadataPrefix didl \-\-handler raw
\&
\&    # Harvest repository description
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-identify 1
\&
\&    # Harvest identifiers
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-listIdentifiers 1
\&
\&    # Harvest sets
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-listSets 1
\&
\&    # Harvest metadata formats
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-listMetadataFormats 1
\&
\&    # Harvest one record
\&    $ catmandu convert OAI \-\-url http://myrepo.org/oai \-\-getRecord 1 \-\-identifier oai:myrepo:1234
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Catmandu::Importer::OAI is a Catmandu importer to harvest metadata records
from an OAI-PMH endpoint.
.SH "CONFIGURATION"
.IX Header "CONFIGURATION"
.IP "url" 4
.IX Item "url"
OAI-PMH Base \s-1URL.\s0
.IP "metadataPrefix" 4
.IX Item "metadataPrefix"
Metadata prefix to specify the metadata format. Set to \f(CW\*(C`oai_dc\*(C'\fR
by default.
.ie n .IP "handler( sub {} | $object | '\s-1NAME\s0' | '+NAME' )" 4
.el .IP "handler( sub {} | \f(CW$object\fR | '\s-1NAME\s0' | '+NAME' )" 4
.IX Item "handler( sub {} | $object | 'NAME' | '+NAME' )"
Handler to transform each record from \s-1XML DOM\s0 (XML::LibXML::Element) into
a Perl hash.
.Sp
Handlers can be provided as a function reference, as an instance of a Perl
package that implements 'parse', or as a package \s-1NAME.\s0 A package name
prepended by \f(CW\*(C`+\*(C'\fR is taken as a full package name; any other name is
prefixed with \f(CW\*(C`Catmandu::Importer::OAI::Parser\*(C'\fR. For example,
\&\f(CW\*(C`foobar\*(C'\fR will create a \f(CW\*(C`Catmandu::Importer::OAI::Parser::foobar\*(C'\fR
instance.
.Sp
By default the handler Catmandu::Importer::OAI::Parser::oai_dc is used for
metadataPrefix \f(CW\*(C`oai_dc\*(C'\fR, Catmandu::Importer::OAI::Parser::marcxml for
\&\f(CW\*(C`marcxml\*(C'\fR, Catmandu::Importer::OAI::Parser::mods for
\&\f(CW\*(C`mods\*(C'\fR, and Catmandu::Importer::OAI::Parser::struct for other formats.
In addition there is Catmandu::Importer::OAI::Parser::raw to return the \s-1XML\s0
as it is.
.IP "identifier" 4
.IX Item "identifier"
An optional identifier. Only results for this particular identifier are
returned.
.IP "set" 4
.IX Item "set"
An optional set for selective harvesting.
.IP "from" 4
.IX Item "from"
An optional datetime value (YYYY-MM-DD or YYYY\-MM\-DDThh:mm:ssZ) as lower bound
for datestamp-based selective harvesting.
.IP "until" 4
.IX Item "until"
An optional datetime value (YYYY-MM-DD or YYYY\-MM\-DDThh:mm:ssZ) as upper bound
for datestamp-based selective harvesting.
.IP "identify" 4
.IX Item "identify"
Harvest the repository description instead of all records.
.IP "getRecord" 4
.IX Item "getRecord"
Harvest one record instead of all records.
.IP "listIdentifiers" 4
.IX Item "listIdentifiers"
Harvest identifiers instead of full records.
.IP "listRecords" 4
.IX Item "listRecords"
Harvest full records. Default operation.
.IP "listSets" 4
.IX Item "listSets"
Harvest sets instead of records.
.IP "listMetadataFormats" 4
.IX Item "listMetadataFormats"
Harvest the metadata formats of records.
.IP "resumptionToken" 4
.IX Item "resumptionToken"
An optional resumptionToken to start harvesting from.
.IP "dry" 4
.IX Item "dry"
Don't do any \s-1HTTP\s0 requests but return the URLs that data would be queried
from.
.IP "strict" 4
.IX Item "strict"
Optionally validate all parameters against the \s-1OAI 2\s0 specification before
sending them to an \s-1OAI\s0 server. Default: undef.
.IP "xslt" 4
.IX Item "xslt"
Preprocess \s-1XML\s0 records with \s-1XSLT\s0 script(s) given as a comma-separated list
or an array reference. Requires Catmandu::XML.
.IP "max_retries" 4
.IX Item "max_retries"
When an OAI-PMH request fails, the importer will retry this number of times.
Set to '0' by default.
.Sp
Internally an exponential backoff algorithm is used for this. After every
failed request the importer chooses a random number between 0 (included) and
2^collision (excluded), where collision is the number of the retry, and waits
that number of seconds. The actual amount of time before the importer stops
can therefore differ:
.Sp
.Vb 6
\&    first retry:
\&        wait [ 0..2^1 [ seconds
\&    second retry:
\&        wait [ 0..2^2 [ seconds
\&    third retry:
\&        wait [ 0..2^3 [ seconds
\&
\&    ..
.Ve
.IP "sleep" 4
.IX Item "sleep"
Sleep a number of seconds between OAI-PMH calls to the endpoint (default 0).
.IP "realm" 4
.IX Item "realm"
An optional realm value. This value is used when the importer harvests from a
repository which is secured with basic authentication through Integrated
Windows Authentication (\s-1NTLM\s0 or Kerberos).
.IP "username" 4
.IX Item "username"
An optional username value. This value is used when the importer harvests
from a repository which is secured with basic authentication.
.IP "password" 4
.IX Item "password"
An optional password value. This value is used when the importer harvests
from a repository which is secured with basic authentication.
.SH "METHOD"
.IX Header "METHOD"
Every Catmandu::Importer is a Catmandu::Iterable; all its methods are
inherited. The Catmandu::Importer::OAI methods are not idempotent: OAI-PMH
feeds can only be read once.
.PP
In addition to methods inherited from Catmandu::Iterable, this module
provides the following public methods:
.ie n .SS "handle_record( $dom )"
.el .SS "handle_record( \f(CW$dom\fP )"
.IX Subsection "handle_record( $dom )"
Process an \s-1XML DOM\s0 with xslt and handler as configured and return the
result.
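.SH "EXAMPLES"
.IX Header "EXAMPLES"
The wait windows described under \f(CW\*(C`max_retries\*(C'\fR can be illustrated with a
minimal shell sketch. This is not the importer's own code: the function names
\&\f(CW\*(C`backoff_upper_bound\*(C'\fR and \f(CW\*(C`print_backoff_schedule\*(C'\fR are hypothetical,
and the sketch only assumes the 2^n windows listed in the table above.
.PP
.Vb 22
```shell
#!/bin/sh
# Sketch only: mirrors the documented 2^n wait windows, not the module's code.
# Both function names are hypothetical and exist only for this illustration.

# Print the exclusive upper bound, in seconds, of the wait window for
# retry number $1 (1 for the first retry, 2 for the second, and so on).
backoff_upper_bound() {
    echo $((1 << $1))
}

# Print the wait window of every retry for the max_retries value given as $1.
print_backoff_schedule() {
    n=1
    while [ "$n" -le "$1" ]; do
        echo "retry $n: wait [0, $(backoff_upper_bound "$n")) seconds"
        n=$((n + 1))
    done
}

print_backoff_schedule 3
```
.Ve
.PP
With a max_retries value of 3, the windows are 2, 4 and 8 seconds wide, so the
waiting time between attempts adds up to less than 14 seconds, not counting
the time spent on the requests themselves.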
.SH "ENVIRONMENT"
.IX Header "ENVIRONMENT"
If you are connected to the internet via a proxy server you need to set the
address of this proxy in your environment:
.PP
.Vb 1
\&    export http_proxy="http://localhost:8080"
.Ve
.PP
If you are connecting to an \s-1HTTPS\s0 server and don't want to verify the
validity of the peer's certificates, you can set \s-1PERL_LWP_SSL_VERIFY_HOSTNAME\s0
to false in your environment. This may be required to connect to broken \s-1SSL\s0
servers:
.PP
.Vb 1
\&    export PERL_LWP_SSL_VERIFY_HOSTNAME=0
.Ve
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Catmandu,
Catmandu::Importer
.SH "AUTHOR"
.IX Header "AUTHOR"
Nicolas Steenlant, \f(CW\*(C`\*(C'\fR
.SH "CONTRIBUTOR"
.IX Header "CONTRIBUTOR"
Patrick Hochstenbach, \f(CW\*(C`\*(C'\fR
.PP
Jakob Voss, \f(CW\*(C`\*(C'\fR
.PP
Nicolas Franck, \f(CW\*(C`\*(C'\fR
.SH "LICENSE AND COPYRIGHT"
.IX Header "LICENSE AND COPYRIGHT"
Copyright 2016 Ghent University Library
.PP
This program is free software; you can redistribute it and/or modify it
under the terms of either: the \s-1GNU\s0 General Public License as published
by the Free Software Foundation; or the Artistic License.
.PP
See http://dev.perl.org/licenses/ for more information.