.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.42)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "XML::SAX::Base 3pm"
.TH XML::SAX::Base 3pm "2022-10-15" "perl v5.34.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
XML::SAX::Base \- Base class for SAX Drivers and Filters
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 3
\&  package MyFilter;
\&  use XML::SAX::Base;
\&  @ISA = (\*(AqXML::SAX::Base\*(Aq);
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This module has a very simple task \- to be a base class for PerlSAX
drivers and filters. Its default behaviour is to pass the input directly
to the output unchanged. It can be useful to use this module as a base class
so you don't have to, for example, implement the \fBcharacters()\fR callback.
.PP
The main advantages that it provides are easy and correct dispatching of
events (i.e. it checks for you that the handler has implemented
that method, or has defined an \s-1AUTOLOAD\s0), and the guarantee that filters
will pass along events that they aren't implementing to handlers downstream
that might nevertheless be interested in them.
.SH "WRITING SAX DRIVERS AND FILTERS"
.IX Header "WRITING SAX DRIVERS AND FILTERS"
The Perl \s-1SAX API\s0 Reference is at <http://perl\-xml.sourceforge.net/perl\-sax/>.
.PP
Writing \s-1SAX\s0 Filters is tremendously easy: all you need to do is
inherit from this module, and define the events you want to handle. A
more detailed explanation can be found at
http://www.xml.com/pub/a/2001/10/10/sax\-filters.html.
.PP
Writing Drivers is equally simple. The one thing you need to pay
attention to is \fB\s-1NOT\s0\fR to call events on the handler yourself (this
applies to Filters as well). For instance:
.PP
.Vb 2
\&  package MyFilter;
\&  use base qw(XML::SAX::Base);
\&
\&  sub start_element {
\&    my $self = shift;
\&    my $data = shift;
\&    # do something
\&    $self\->{Handler}\->start_element($data); # BAD
\&  }
.Ve
.PP
The above example works well as precisely that: an example. But it has
several faults:
1) It doesn't test whether the handler defines
start_element. Perhaps the handler doesn't want to see that event, in which
case you shouldn't send it (otherwise the call will die).
2) It doesn't check ContentHandler first and then Handler (i.e. it doesn't
check whether the user has requested events on a specific handler, falling
back to the default one only if not).
3) If it did check all that, not only would the code be
cumbersome (see this module's source to get an idea) but it would also
probably have to check for a DocumentHandler (in case this were \s-1SAX1\s0)
and for AUTOLOADs potentially defined in all these packages. As you can
tell, that would be fairly painful. Instead of going through that,
simply use code similar to the following:
.PP
.Vb 2
\&  package MyFilter;
\&  use base qw(XML::SAX::Base);
\&
\&  sub start_element {
\&    my $self = shift;
\&    my $data = shift;
\&    # do something to filter
\&    $self\->SUPER::start_element($data); # GOOD (and easy) !
\&  }
.Ve
.PP
This way, once you've done your job you hand the ball back to
XML::SAX::Base and it takes care of all those problems for you!
.PP
Note that the above example applies not only to filters; drivers
benefit from exactly the same feature.
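.PP
As an illustration, here is a minimal sketch of a complete filter chain
built this way. The package names MyUpcaseFilter and MyCollector are
hypothetical and exist only for this example, and the events are fired on
the filter by hand where a real driver would normally produce them.
.PP
.Vb 2
\&  use strict;
\&  use warnings;
\&
\&  # Hypothetical filter: upper\-cases character data, passes the rest through.
\&  package MyUpcaseFilter;
\&  use base qw(XML::SAX::Base);
\&
\&  sub characters {
\&      my ($self, $data) = @_;
\&      # Copy the event hash so the caller\*(Aqs data is left untouched.
\&      my $copy = { %$data, Data => uc $data\->{Data} };
\&      return $self\->SUPER::characters($copy);
\&  }
\&
\&  # Hypothetical handler: implements characters() and nothing else.
\&  package MyCollector;
\&
\&  sub new { return bless { text => \*(Aq\*(Aq }, shift }
\&
\&  sub characters {
\&      my ($self, $data) = @_;
\&      $self\->{text} .= $data\->{Data};
\&      return;
\&  }
\&
\&  package main;
\&
\&  my $collector = MyCollector\->new;
\&  my $filter    = MyUpcaseFilter\->new(Handler => $collector);
\&
\&  # Fire a few events by hand; a real driver would do this while parsing.
\&  $filter\->start_document({});
\&  $filter\->start_element({ Name => \*(Aqdoc\*(Aq, Attributes => {} });
\&  $filter\->characters({ Data => \*(Aqhello\*(Aq });
\&  $filter\->end_element({ Name => \*(Aqdoc\*(Aq });
\&  $filter\->end_document({});
\&
\&  print $collector\->{text}, "\en";
.Ve
.PP
Because MyCollector defines only \fBcharacters()\fR, the other events should be
dropped quietly by XML::SAX::Base instead of causing a failure, and the
script should print the upper\-cased text.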
.SH "METHODS"
.IX Header "METHODS"
A number of methods are defined within this class for the purpose of
inheritance. Some probably don't need to be overridden (e.g. parse_file)
but some clearly should be (e.g. parse). Options for these methods are
described in the PerlSAX2 specification available from
http://cvs.sourceforge.net/cgi\-bin/viewcvs.cgi/~checkout~/perl\-xml/libxml\-perl/doc/sax\-2.0.html?rev=HEAD&content\-type=text/html.
.IP "\(bu" 4
parse
.Sp
The parse method is the main entry point to parsing documents. Internally
the parse method will detect what type of \*(L"thing\*(R" you are parsing, and
call the appropriate method in your implementation class. Here is the
mapping table of what is in the Source options (see the Perl \s-1SAX 2.0\s0
specification for the meaning of these values):
.Sp
.Vb 6
\&  Source Contains           parse() calls
\&  ===============           =============
\&  CharacterStream (*)       _parse_characterstream($stream, $options)
\&  ByteStream                _parse_bytestream($stream, $options)
\&  String                    _parse_string($string, $options)
\&  SystemId                  _parse_systemid($string, $options)
.Ve
.Sp
Note, however, that these methods may not be sensible if your driver class
is not for parsing \s-1XML.\s0 An example might be a \s-1DBI\s0 driver that generates
\&\s-1XML/SAX\s0 from a database table. If that is the case, you likely want to
write your own \fBparse()\fR method.
.Sp
Also note that the Source may contain both a PublicId entry, and an
Encoding entry. To get at these, examine \f(CW$options\fR\->{Source} as passed
to your method.
.Sp
(*) A CharacterStream is a filehandle that does not need any encoding
translation done on it. This is implemented as a regular filehandle
and only works under Perl 5.7.2 or higher using PerlIO. To get a single
character, or number of characters from it, use the perl core \fBread()\fR
function. To get a single byte from it (or number of bytes), you can 
use \fBsysread()\fR. The encoding of the stream should be in the Encoding
entry for the Source.
.IP "\(bu" 4
parse_file, parse_uri, parse_string
.Sp
These are all convenience variations on \fBparse()\fR, and in fact simply
set up the options before calling it. You probably don't need to
override these.
.IP "\(bu" 4
get_options
.Sp
This is a convenience method to get options in \s-1SAX2\s0 style, or more
generically either as hashes or as hashrefs (it returns a hashref).
You will probably want to use this method in your own implementations
of \fBparse()\fR and of \fBnew()\fR.
.IP "\(bu" 4
get_feature, set_feature
.Sp
These simply get and set features, and throw the
appropriate exceptions defined in the specification if need be.
.Sp
If your subclass defines features not defined in this one,
then you should override these methods in such a way that they check for
your features first, and then call the base class's methods
for features not defined by your class. An example would be:
.Sp
.Vb 10
\&  sub get_feature {
\&      my $self = shift;
\&      my $feat = shift;
\&      if (exists $MY_FEATURES{$feat}) {
\&          # handle the feature in various ways
\&      }
\&      else {
\&          return $self\->SUPER::get_feature($feat);
\&      }
\&  }
.Ve
.Sp
Currently this part is unimplemented.
.IP "\(bu" 4
set_handler
.Sp
This method takes a handler type (Handler, ContentHandler, etc.) and a
handler object as arguments, and changes the current handler for that
handler type, while taking care of resetting the internal state that 
needs to be reset. This allows one to change a handler during parse
without running into problems (changing it on the parser object 
directly will most likely cause trouble).
.IP "\(bu" 4
set_document_handler, set_content_handler, set_dtd_handler, set_lexical_handler, set_decl_handler, set_error_handler, set_entity_resolver
.Sp
These are just simple wrappers around the \fBset_handler()\fR method, and take a
handler object as their argument. Internally they simply call
set_handler with the correct arguments.
.IP "\(bu" 4
get_handler
.Sp
The inverse of set_handler, this method takes an optional string containing a handler type (DTDHandler,
ContentHandler, etc.; 'Handler' is used if no type is passed). It returns a reference to the object that
implements that handler type, or undef if that handler type is not set for the current driver/filter.
.IP "\(bu" 4
get_document_handler, get_content_handler, get_dtd_handler, get_lexical_handler, get_decl_handler, 
get_error_handler, get_entity_resolver
.Sp
These are just simple wrappers around the \fBget_handler()\fR method, and take no arguments. Internally 
they simply call get_handler with the correct handler type name.
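.PP
To show how the methods above fit together, here is a minimal sketch of a
driver that does not parse \s-1XML\s0 at all. MyListDriver and MyCounter are
hypothetical names, and the comma separated input format is an assumption
made purely for the example. The driver implements only \fB_parse_string()\fR,
uses only the first argument passed to it, and relies on the inherited
\fBparse_string()\fR and \fBparse()\fR to reach it.
.PP
.Vb 2
\&  use strict;
\&  use warnings;
\&
\&  # Hypothetical driver: emits one <item> element per comma separated field.
\&  package MyListDriver;
\&  use base qw(XML::SAX::Base);
\&
\&  sub _parse_string {
\&      my ($self, $string) = @_;
\&      $self\->SUPER::start_document({});
\&      $self\->SUPER::start_element({ Name => \*(Aqlist\*(Aq, Attributes => {} });
\&      for my $field (split /,/, $string) {
\&          $self\->SUPER::start_element({ Name => \*(Aqitem\*(Aq, Attributes => {} });
\&          $self\->SUPER::characters({ Data => $field });
\&          $self\->SUPER::end_element({ Name => \*(Aqitem\*(Aq });
\&      }
\&      $self\->SUPER::end_element({ Name => \*(Aqlist\*(Aq });
\&      return $self\->SUPER::end_document({});
\&  }
\&
\&  # Hypothetical handler: counts <item> elements and ignores everything else.
\&  package MyCounter;
\&
\&  sub new { return bless { count => 0 }, shift }
\&
\&  sub start_element {
\&      my ($self, $element) = @_;
\&      $self\->{count}++ if $element\->{Name} eq \*(Aqitem\*(Aq;
\&      return;
\&  }
\&
\&  package main;
\&
\&  my $counter = MyCounter\->new;
\&  my $driver  = MyListDriver\->new(Handler => $counter);
\&
\&  # With no argument, get_handler() looks up the default \*(AqHandler\*(Aq slot.
\&  print "handler is set\en" if $driver\->get_handler == $counter;
\&
\&  $driver\->parse_string(\*(Aqa,b,c\*(Aq);
\&  print "items: $counter\->{count}\en";
.Ve
.PP
For the input shown, the counter should end up at 3. Sending every event
through \s-1SUPER\s0 keeps the driver independent of which handler slots the
user has filled in.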
.PP
It would be rather useless to describe all the methods that this
module implements here. They are all the methods supported in \s-1SAX1\s0 and
\&\s-1SAX2.\s0 In case your memory is a little short, here is a list. The
apparent duplicates are there so that both versions of \s-1SAX\s0 can be
supported.
.IP "\(bu" 4
start_document
.IP "\(bu" 4
end_document
.IP "\(bu" 4
start_element
.IP "\(bu" 4
start_document
.IP "\(bu" 4
end_document
.IP "\(bu" 4
start_element
.IP "\(bu" 4
end_element
.IP "\(bu" 4
characters
.IP "\(bu" 4
processing_instruction
.IP "\(bu" 4
ignorable_whitespace
.IP "\(bu" 4
set_document_locator
.IP "\(bu" 4
start_prefix_mapping
.IP "\(bu" 4
end_prefix_mapping
.IP "\(bu" 4
skipped_entity
.IP "\(bu" 4
start_cdata
.IP "\(bu" 4
end_cdata
.IP "\(bu" 4
comment
.IP "\(bu" 4
entity_reference
.IP "\(bu" 4
notation_decl
.IP "\(bu" 4
unparsed_entity_decl
.IP "\(bu" 4
element_decl
.IP "\(bu" 4
attlist_decl
.IP "\(bu" 4
doctype_decl
.IP "\(bu" 4
xml_decl
.IP "\(bu" 4
entity_decl
.IP "\(bu" 4
attribute_decl
.IP "\(bu" 4
internal_entity_decl
.IP "\(bu" 4
external_entity_decl
.IP "\(bu" 4
resolve_entity
.IP "\(bu" 4
start_dtd
.IP "\(bu" 4
end_dtd
.IP "\(bu" 4
start_entity
.IP "\(bu" 4
end_entity
.IP "\(bu" 4
warning
.IP "\(bu" 4
error
.IP "\(bu" 4
fatal_error
.SH "TODO"
.IX Header "TODO"
.Vb 3
\&  \- more tests
\&  \- conform to the "SAX Filters" and "Java and DOM compatibility"
\&    sections of the SAX2 document.
.Ve
.SH "AUTHOR"
.IX Header "AUTHOR"
Kip Hampton (khampton@totalcinema.com) did most of the work, after porting
it from XML::Filter::Base.
.PP
Robin Berjon (robin@knowscape.com) pitched in with patches to make it 
usable as a base for drivers as well as filters, along with other patches.
.PP
Matt Sergeant (matt@sergeant.org) wrote the original XML::Filter::Base,
and patched a few things here and there, and imported it into
the \s-1XML::SAX\s0 distribution.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\s-1XML::SAX\s0