TclXML(3tcl) | TclXML Package Commands | TclXML(3tcl) |
See the file "LICENSE" for information on
usage and redistribution of this file, and for a DISCLAIMER OF ALL WARRANTIES.
NAME¶
TclXML - XML parser support for Tcl
SYNOPSIS¶
package require xml
package require parserclass
::xml::parserclass option ?arg arg ...?
::xml::parser ?name? ?-option value ...?
parser option arg
DESCRIPTION¶
TclXML provides event-based parsing of XML documents. The application may register callback scripts for certain document features, and when the parser encounters those features while parsing the document the callback is evaluated.
The parser may also perform other functions, such as normalisation, validation and/or entity expansion. Generally, these functions are under the control of configuration options. Whether these functions can be performed at all depends on the parser implementation.
The TclXML package provides a generic interface for use by a Tcl application, along with a low-level interface for use by a parser implementation. Each implementation provides a class of XML parser, and these register themselves using the ::xml::parserclass create command. One of the registered parser classes will be the default parser class.
Loading the package with the generic package require xml command allows the package to automatically determine the default parser class. To select a particular parser class as the default, load that class's package directly, e.g. package require xml::libxml2. In all cases, all available parser classes are registered with the TclXML package; the only difference is which one becomes the default.
COMMANDS¶
::xml::parserclass¶
The ::xml::parserclass command is used to manage XML parser classes.
Command Options¶
The following command options may be used:
- create name ?-createcommand script? ?-createentityparsercommand script? ?-parsecommand script? ?-configurecommand script? ?-getcommand script? ?-deletecommand script?
- destroy name
- info names default
::xml::parser¶
The ::xml::parser command creates an XML parser object. The return value of the command is the name of the newly created parser.
The parser scans an XML document's syntactical structure, evaluating callback scripts for each feature found. At the very least the parser will normalise the document and check the document for well-formedness. If the document is not well-formed then the -errorcommand option will be evaluated. Some parser classes may perform additional functions, such as validation. Additional features provided by the various parser classes are described in the section Parser Classes.
Parsing is performed synchronously. The command blocks until the entire document has been parsed. Parsing may be terminated by an application callback, see the section Callback Return Codes. Incremental parsing is also supported by using the -final configuration option.
Configuration Options¶
The ::xml::parser command accepts the following configuration options:
- -attlistdeclcommand script
  - name: Element type name
  - attrname: Attribute name being declared
  - type: Attribute type
  - default: Attribute default, such as #IMPLIED
  - value: Default attribute value. Empty string if none given.
- -baseuri URI -baseurl URI
- -characterdatacommand script
  - data: Character data in the document
- -commentcommand script
  - data: Comment data
- -defaultcommand script
  - data: Document data
- -defaultexpandinternalentities boolean
- -doctypecommand script
  - name: The name of the document element
  - public: Public identifier for the external DTD subset
  - system: System identifier for the external DTD subset. Usually a URI.
  - dtd: The internal DTD subset
  See also -startdoctypedeclcommand and -enddoctypedeclcommand.
- -elementdeclcommand script
  - name: The element type name
  - model: Content model specification
- -elementendcommand script
  - name: The element type name that has ended
  - args: Additional information about this element, given as configuration options. Possible options are:
    - -empty boolean: The empty element syntax was used for this element
    - -namespace uri: The element is in the XML namespace associated with the given URI
- -elementstartcommand script
  - name: The element type name that has started
  - attlist: A Tcl list containing the attributes for this element. The list is formatted as pairs of attribute names and their values.
  - args: Additional information about this element, given as configuration options. Possible options are:
    - -empty boolean: The empty element syntax was used for this element
    - -namespace uri: The element is in the XML namespace associated with the given URI
    - -namespacedecls list: The start tag included one or more XML Namespace declarations. list is a Tcl list giving the namespaces declared. The list is formatted as pairs of values: the first value is the namespace URI and the second value is the prefix used for the namespace in this document. A default XML namespace declaration will have an empty string for the prefix.
- -encoding value
- -endcdatasectioncommand script
- -enddoctypedeclcommand script
- -entitydeclcommand script
  - name: The name of the entity being declared
  - args: Additional information about the entity declaration. An internal entity shall have a single argument, the replacement text. An external parsed entity shall have two additional arguments, the public and system identifiers of the external resource. An external unparsed entity shall have three additional arguments, the public and system identifiers followed by the notation name.
- -entityreferencecommand script
  - name: The name of the entity being referenced
- -errorcommand script
  - errorcode: A single word description of the error, intended for use by an application
  - errormsg: A human-readable description of the error
- -externalentitycommand script
  - name: The Tcl command name of the current parser
  - baseuri: An absolute URI for the current entity which is to be used to resolve relative URIs
  - uri: The system identifier of the external entity, usually a URI
  - id: The public identifier of the external entity. If no public identifier was given in the entity declaration then id will be an empty string.
  The return result of the callback script determines the action of the parser. Note that these codes are interpreted differently from those of other callbacks.
  - TCL_OK
        package require xml
        # Callback arguments: name baseuri uri id, as listed above
        proc External {name baseuri uri id} {
            switch -glob -- $uri {
                tcl:* {
                    # Treat the remainder of a tcl: URI as a script
                    regexp {^tcl:(.*)$} $uri discard script
                    return [uplevel #0 $script]
                }
                default {
                    # Hand any other URI back to the parser
                    return -code continue {}
                }
            }
        }
        set parser [xml::parser -externalentitycommand External]
        $parser parse {<!DOCTYPE example [
            <!ENTITY example SYSTEM "tcl:set%20example%20HelloWorld">
        ]>
        <example>
            &example;
        </example>
        }
        puts $example
    This script will print "HelloWorld" to stdout.
  - TCL_CONTINUE
  - TCL_BREAK
  - TCL_ERROR
- -final boolean
- -ignorewhitespace boolean
- -notationdeclcommand script
  - name: The name of the notation
  - uri: An external identifier for the notation, usually a URI.
- -notstandalonecommand script
- -paramentityparsing boolean
- -parameterentitydeclcommand script
  - name: The name of the parameter entity
  - args: For an internal parameter entity there is only one additional argument, the replacement text. For external parameter entities there are two additional arguments, the system and public identifiers respectively.
- -parser name
- -processinginstructioncommand script
  - target: The name of the processing instruction target
  - data: Remaining data from the processing instruction
- -reportempty boolean
- -startcdatasectioncommand script
- -startdoctypedeclcommand script
- -unknownencodingcommand script
- -unparsedentitydeclcommand script
  - system: The system identifier of the external entity, usually a URI
  - public: The public identifier of the external entity
  - notation: The name of the notation for the external entity
- -validate boolean
- -warningcommand script
  - warningcode: A single word description of the warning, intended for use by an application
  - warningmsg: A human-readable description of the warning
- -xmldeclcommand script
  - version: The version number of the XML specification to which this document purports to conform
  - encoding: The character encoding of the document
  - standalone: A boolean declaring whether the document is standalone
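The argument shapes listed for -elementstartcommand can be illustrated with a small sketch. The callback name ElementStart and the global list seen are hypothetical, and the callback is invoked by hand here rather than by a parser, on the assumption that a configured parser would make calls of the same shape.

```tcl
# Hypothetical callback for -elementstartcommand: records one summary
# entry per element in the global list "seen".
proc ElementStart {name attlist args} {
    global seen
    # attlist is a flat list of attribute name/value pairs.
    set attrs {}
    foreach {attName attValue} $attlist {
        lappend attrs "$attName=$attValue"
    }
    # args carries the optional -empty, -namespace and -namespacedecls
    # settings as option/value pairs; supply defaults for absent ones.
    array set opts {-empty 0 -namespace {} -namespacedecls {}}
    array set opts $args
    lappend seen [list $name [join $attrs ,] $opts(-namespace) $opts(-empty)]
}

# Simulated invocations (assumption: a real parser would make calls of
# this shape once configured with "-elementstartcommand ElementStart").
set seen {}
ElementStart doc {id d1 lang en}
ElementStart br {} -empty 1 -namespace http://www.w3.org/1999/xhtml
puts $seen
```

Reading args into an array with defaults keeps the callback tolerant of parser classes that pass only some of the options.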
Parser Command¶
The ::xml::parser command creates a new Tcl command with the same name as the parser. This command may be used to invoke various operations on the parser object. It has the following general form:
    name option arg
option and the arg determine the exact behaviour of the command. The following commands are possible for parser objects:
- cget -option
- configure -option value
- entityparser option value
- free name
- get name args
- parse xml args
- reset
CALLBACK RETURN CODES¶
Every callback script evaluated by a parser may return a return code other than TCL_OK. Return codes are interpreted as follows:
- break Suppresses invocation of all further callback scripts. The parse method returns the TCL_OK return code.
- continue Suppresses invocation of further callback scripts until the current element has finished.
- error Suppresses invocation of all further callback scripts. The parse method also returns the TCL_ERROR return code.
- default Any other return code suppresses invocation of all further callback scripts. The parse method returns the same return code.
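A callback produces these codes with the standard Tcl return -code option. The following sketch uses hypothetical callback names and plain catch in place of a parser, only to show which numeric code each form yields to the caller.

```tcl
# Hypothetical callbacks, one per documented return code.
proc stopQuietly {args} { return -code break }
proc skipElement {args} { return -code continue }
proc failParse {args} { return -code error "unexpected element" }

# catch reports the numeric code that a caller such as the parser sees:
# 0 = TCL_OK, 1 = TCL_ERROR, 3 = TCL_BREAK, 4 = TCL_CONTINUE.
set codes {}
foreach cb {stopQuietly skipElement failParse} {
    lappend codes [catch {$cb example {}} result]
}
puts $codes
```

How a parser reacts to each code is as described in the list above; the sketch only covers the callback side.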
ERROR MESSAGES¶
If an error or warning condition is detected then an error message is returned. These messages are structured as a Tcl list, as described below:
    {domain level code node line message int1 int2 string1 string2 string3}
- domain
- level
- code
- node
- line
- message
- int1
- int2
- string1
- string2
- string3
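Because the message is an ordinary Tcl list with a fixed field order, it can be unpacked by pairing field names with values. This is a minimal sketch: the message below is invented for illustration, and the array name err is hypothetical.

```tcl
# Field names in the order given above.
set fields {domain level code node line message int1 int2 string1 string2 string3}

# Hypothetical message; every value here is invented for illustration.
set msg [list parser error 76 {} 12 "Opening and ending tag mismatch" 0 0 a b {}]

# Pair each field name with its value.
foreach field $fields value $msg {
    set err($field) $value
}
puts "line $err(line): $err(message) ($err(level) in $err(domain))"
```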
APPLICATION EXAMPLES¶
This script outputs the character data of an XML document read from stdin.
    package require xml
    # Echo each run of character data as it is reported
    proc cdata {data args} {
        puts -nonewline $data
    }
    set parser [::xml::parser -characterdatacommand cdata]
    $parser parse [read stdin]
This script counts the number of elements in an XML document read from stdin.
    package require xml
    # Increment the named global counter for every start tag
    proc EStart {varName name attlist args} {
        upvar #0 $varName var
        incr var
    }
    set count 0
    set parser [::xml::parser -elementstartcommand [list EStart count]]
    $parser parse [read stdin]
    puts "The XML document contains $count elements"
SAFE XML¶
TclXML/Tcl and TclXML/libxml2 may be used in a Safe Tcl interpreter. When a document is parsed in a Safe Tcl interpreter, any attempt by the XML document to load an external entity is handled by the -externalentitycommand callback. This callback is evaluated in the context of the safe interpreter and therefore is subject to the security policy in force for that interpreter. The default entity loader will not be invoked, even if the callback script returns a TCL_CONTINUE code. See the description of the -externalentitycommand for further details.
PARSER CLASSES¶
This section will discuss how a parser class is implemented.
Tcl Parser Class¶
The pure-Tcl parser class requires no compilation - it is a collection of Tcl scripts. This parser implementation is non-validating, i.e. it can only check well-formedness in a document. However, by enabling the -validate option it will read the document's DTD and resolve external entities. This parser class is referred to as TclXML/tcl.
This parser implementation aims to implement XML v1.0 and supports XML Namespaces.
Generally the parser produces XML Infoset information items. That is, it gives the application a slightly higher-level view than the raw XML syntax. For example, it does not report CDATA Sections.
TclXML/tcl is not able to handle character encodings other than UTF-8.
libxml2 Parser Class¶
The libxml2 parser class provides a Tcl interface to the libxml2 XML parser library. This parser class is referred to as TclXML/libxml2.
When the package is loaded the variable ::xml::libxml2::libxml2version is set to the version number of the libxml2 library being used.
On MS Windows, it is necessary to load the generic XML package first, and then the TclXML/libxml2 package. For example:
    package require xml
    package require xml::libxml2
get Method¶
TclXML/libxml2 provides the following arguments to the get method:
- document
Additional Options¶
- -keep normal | implicit
- -retainpath xpath
- -retainpathns prefix ns ...
Limitations¶
The libxml2 parser class has the following limitations:
* -reportempty has no effect. libxml2 does not report empty element syntax.
* Incremental (push) parsing, i.e. -final 0, is not supported.
* TclXML/libxml2 does not provide DTD validation, WXS schema validation or Relax NG validation, although the libxml2 library does provide those functions. These functions are provided by the TclDOM/libxml2 package, but only in an "a posteriori" fashion (i.e. only after the document has been parsed).
* libxml2 supports XML Namespaces. The use of XML Namespaces can be queried, but the declaration of an XML Namespace is not reported.
KEYWORDS¶
3.2 | TclXML |