NAME
dom - Create an in-memory DOM tree from XML
SYNOPSIS
package require tdom
dom method ?arg arg ...?
DESCRIPTION
This command creates complete DOM trees in memory. In the usual
case, a string containing an XML document is parsed and converted into a DOM
tree.
method indicates a specific subcommand.
The valid methods are:
- dom parse
?options? ?data?
- Parses the XML data and builds the DOM tree in memory,
providing a Tcl object command for the resulting DOM document object.
Example:
dom parse $xml doc
$doc documentElement root
parses the XML in the variable xml, creates the DOM tree in memory, makes a
reference to the document object, visible in Tcl as a document object command,
and assigns this new object name to the variable doc. When doc gets freed, the
DOM tree and the associated Tcl command objects (document and all node objects)
are freed automatically.
set document [dom parse $xml]
set root [$document documentElement]
parses the XML in the variable xml, creates the DOM tree in memory, makes a
reference to the document object, visible in Tcl as a document object command,
and returns this new object name, which is then stored in document. To
free the underlying DOM tree and the associated Tcl object commands (document
+ nodes + fragment nodes), the document object command has to be explicitly
deleted by:
$document delete
or
rename $document ""
The valid options are:
- -simple
- If -simple is specified, a simple but fast parser is
used (it does not fully conform to the XML recommendation). That should
roughly double parsing and DOM generation speed. The encoding of the data
is not transformed inside the parser. The simple parser does not respect
any encoding information in the XML declaration. It skips over the internal
DTD subset and ignores any information in it. Therefore it doesn't include
defaulted attribute values in the tree, even if the corresponding attribute
declaration is in the internal subset. It also doesn't expand internal or
external entity references other than the predefined entities and
character references.
- -html
- If -html is specified, a fast HTML parser is used,
which tries to parse even badly formed HTML into a DOM tree.
- -keepEmpties
- If -keepEmpties is specified, text nodes that
contain only whitespace will be part of the resulting DOM tree. By
default (-keepEmpties not given), those empty text nodes are
removed at parsing time.
- -channel <channel-ID>
- If -channel <channel-ID> is specified, the
input to be parsed is read from the specified channel. The encoding
setting of the channel (via fconfigure -encoding) is respected, i.e. the
data read from the channel is converted to UTF-8 according to the
encoding settings before it is parsed.
- -baseurl <baseURI>
- If -baseurl <baseURI> is specified, the
baseURI is used as the base URI of the document. External entities
referenced in the document are resolved relative to this base URI. This
base URI is also stored within the DOM tree.
- -feedbackAfter <#bytes>
- If -feedbackAfter <#bytes> is specified, the
Tcl command ::dom::domParseFeedback is evaluated after parsing every
#bytes. If you use this option, you have to create a Tcl proc named
::dom::domParseFeedback, otherwise you will get an error. Note
that the calls of ::dom::domParseFeedback do not happen exactly every
#bytes, but always at the first element start after every #bytes.
- -externalentitycommand
<script>
- If -externalentitycommand <script> is
specified, the given Tcl script is called to resolve any external
entities of the document. The command actually evaluated consists of this
script followed by three arguments: the base URI, the system identifier of
the entity and the public identifier of the entity. The base URI and the
public identifier may be the empty list. The script has to return a Tcl
list consisting of three elements. The first element of this list signals
how the external entity is returned to the processor. At the moment, the
two allowed types are "string" and "channel". The
second element of the list has to be the (absolute) base URI of the
external entity to be parsed. The third element of the list is the data:
either the already read content of the external entity as a string, in the
case of type "string", or the name of a Tcl channel, in the case
of type "channel". Note that if the script returns a Tcl
channel, it will not be closed by the processor. It must be closed
separately if it is no longer required.
- -useForeignDTD <boolean>
- If <boolean> is true and the document does not have
an external subset, the parser will call the -externalentitycommand script
with empty values for the systemId and publicID arguments. Note
that if the document also doesn't have an internal subset, the
-startdoctypedeclcommand and -enddoctypedeclcommand scripts, if set, are
not called. The -useForeignDTD respects
- -paramentityparsing
<always|never|notstandalone>
- The -paramentityparsing option controls whether the
parser tries to resolve the external entities (including the external DTD
subset) of the document while building the DOM tree.
-paramentityparsing requires an argument, which must be either
"always", "never", or "notstandalone". The
value "always" means that the parser tries to resolve
(recursively) all external entities of the XML source. This is the
default if -paramentityparsing is omitted. The value
"never" means that only the given XML source is parsed and no
external entity (including the external subset) will be resolved and
parsed. The value "notstandalone" means that all external
entities will be resolved and parsed, with the exception of documents
that explicitly state standalone="yes" in their XML
declaration.
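The following is a minimal sketch of two of the parse options described above, not a definitive recipe. The sample XML, the proc name resolveFromMemory and the "memory:note" system identifier are made up for illustration; the resolver simply fabricates the entity content in memory instead of reading a file. The selectNodes call used to read the result is a domNode method that is not described on this page.

```tcl
package require tdom

# -keepEmpties: compare the number of child nodes with and without it.
set xml {<root> <item>one</item> <item>two</item> </root>}

set doc [dom parse $xml]
set defaultCount [llength [[$doc documentElement] childNodes]]
$doc delete

set doc [dom parse -keepEmpties $xml]
set keptCount [llength [[$doc documentElement] childNodes]]
$doc delete

puts "children by default: $defaultCount, with -keepEmpties: $keptCount"

# -externalentitycommand: the script receives base URI, system identifier
# and public identifier, and returns a three-element list:
# type ("string" or "channel"), base URI of the entity, data.
proc resolveFromMemory {base systemId publicId} {
    # Hypothetical resolver: builds the entity content on the fly.
    set data "<note>resolved from $systemId</note>"
    return [list string $systemId $data]
}

set xmlWithEntity {<!DOCTYPE root [<!ENTITY ext SYSTEM "memory:note">]><root>&ext;</root>}
set doc [dom parse -externalentitycommand resolveFromMemory $xmlWithEntity]
set resolvedText [[$doc documentElement] selectNodes {string(note)}]
puts "entity content: $resolvedText"
$doc delete
```

A resolver that returns type "channel" would have to close the channel itself after parsing, as noted in the option description.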
- dom createDocument
docElemName ?objVar?
- Creates a new DOM document object with one element node
with node name docElemName. The objVar controls the memory
handling as explained above.
- dom createDocumentNS
uri docElemName ?objVar?
- Creates a new DOM document object with one element node
with node name docElemName. Uri gives the namespace of the
document element to create. The objVar controls the memory handling
as explained above.
- dom createDocumentNode
?objVar?
- Creates a new, 'empty' DOM document object without any
element node. objVar controls the memory handling as explained
above.
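As a rough illustration of the three creation methods, the sketch below builds one document of each kind. The element names, the namespace URI and the proc name rootNameOf are placeholders chosen for this example; createElement, appendChild, nodeName and namespaceURI are domDoc and domNode methods documented elsewhere.

```tcl
package require tdom

# Document with a single root element; the document command is returned.
set doc [dom createDocument catalog]
set rootName [[$doc documentElement] nodeName]

# Root element in a namespace. URI and prefix are placeholders.
set nsDoc [dom createDocumentNS "http://example.com/ns" ex:catalog]
set nsUri [[$nsDoc documentElement] namespaceURI]

# 'Empty' document: the root element is appended afterwards.
set emptyDoc [dom createDocumentNode]
$emptyDoc appendChild [$emptyDoc createElement inventory]
set emptyRootName [[$emptyDoc documentElement] nodeName]

puts "$rootName | $nsUri | $emptyRootName"

# Documents created without objVar have to be deleted explicitly.
foreach d [list $doc $nsDoc $emptyDoc] {
    $d delete
}

# With objVar, the tree is freed when the variable goes out of scope.
proc rootNameOf {name} {
    dom createDocument $name scopedDoc
    return [[$scopedDoc documentElement] nodeName]
}
set scopedName [rootNameOf report]
puts $scopedName
```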
- dom setResultEncoding
?encodingName?
- If encodingName is not given the current global
result encoding is returned. Otherwise the global result encoding is set
to encodingName. All character data, attribute values, etc. will
then be converted from UTF-8, which is delivered from the Expat XML
parser, to the given 8 bit encoding at XML/DOM parse time. Valid values
for encodingName are: utf-8, ascii, cp1250, cp1251, cp1252, cp1253,
cp1254, cp1255, cp1256, cp437, cp850, en, iso8859-1, iso8859-2, iso8859-3,
iso8859-4, iso8859-5, iso8859-6, iso8859-7, iso8859-8, iso8859-9,
koi8-r.
- dom createNodeCmd
?-returnNodeCmd?
(element|comment|text|cdata|pi)Node
commandName
- This method creates Tcl commands, which in turn create tDOM
nodes. Tcl commands created by this method are only available inside a
script given to the domNode method appendFromScript. If a command
created with createNodeCmd is invoked in any other context, it will
return an error. The created command commandName replaces any existing
command or procedure with that name. If the commandName includes
any namespace qualifiers, it is created in the specified namespace.
If such a command is invoked inside a script given as argument to the domNode
method appendFromScript, it creates a new node and appends this node at
the end of the child list of the invoking element node. If the option
-returnNodeCmd was given, the command returns the created node as a Tcl
command. If this option was omitted, the command returns nothing. Each command
always creates the same type of node. The type of node created by the
command is determined by the first argument to createNodeCmd. The
syntax of the created command depends on the type of the node it creates.
If the first argument of the method is
elementNode, the created command
will create an element node. The tag name of the created node is
commandName without namespace qualifiers. The syntax of the created
command is:
elementNodeCmd ?attributeName attributeValue ...? ?script?
elementNodeCmd ?-attributeName attributeValue ...? ?script?
elementNodeCmd name_value_list script
The command syntax allows three different ways to specify the attributes of the
resulting element. These can be specified with attributeName
attributeValue argument pairs, in an "option style" way with
-attributeName attributeValue argument pairs (the '-' character is only
syntactic sugar and will be stripped off) or as a Tcl list with elements
interpreted as attribute name and the corresponding attribute value. The
attribute name elements in the list may have a leading '-' character, which
will be stripped off.
Every elementNodeCmd accepts an optional Tcl script as last argument.
This script is evaluated as a recursive appendFromScript script with the
node created by the elementNodeCmd as parent of all nodes created by
the script.
If the first argument of the method is
textNode, the command will create
a text node. The syntax of the created command is:
textNodeCmd ?-disableOutputEscaping? data
If the optional flag
-disableOutputEscaping is given, the escaping of the
ampersand character (&) and the left angle bracket (<) inside the data
is disabled. You should use this flag carefully.
If the first argument of the method is commentNode or cdataNode,
the command will create a comment node or CDATA section node. The syntax of
the created command is:
nodeCmd data
If the first argument of the method is piNode, the command will create a
processing instruction node. The syntax of the created command is:
piNodeCmd target data
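A small sketch of how createNodeCmd and appendFromScript fit together is shown below. The command names item, t and note are arbitrary choices for this example (they become global Tcl commands and would replace existing commands of the same name), and selectNodes and asXML are methods documented elsewhere.

```tcl
package require tdom

# Node-creating commands; 'item' also becomes the element's tag name.
dom createNodeCmd elementNode item
dom createNodeCmd textNode t
dom createNodeCmd commentNode note

set doc [dom createDocument list]
set root [$doc documentElement]

# The created commands only work inside an appendFromScript script.
$root appendFromScript {
    note "generated entries"
    # Attribute given as a plain name/value pair, followed by a script.
    item id 1 {t "first"}
    # Same attribute in option style; the leading '-' is stripped.
    item -id 2 {t "second"}
}

set itemCount [llength [$root selectNodes item]]
set secondText [$root selectNodes {string(item[@id='2'])}]
puts [$doc asXML]

$doc delete
```

Each nested script is evaluated with the element just created as parent, so the text nodes end up inside their respective item elements.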
- dom setStoreLineColumn
?boolean?
- If switched on, the DOM nodes will contain line and column
position information for the original XML document after parsing. By
default, line and column position information is not stored.
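As a hedged sketch of this switch, the example below turns position tracking on, parses a three-line document, and reads the position of an inner element. The sample XML is invented, and getLine and getColumn are domNode methods that are not described on this page.

```tcl
package require tdom

# Store positions for documents parsed from now on.
dom setStoreLineColumn 1

set doc [dom parse "<a>\n  <b/>\n</a>"]
# Whitespace-only text nodes are dropped by default, so <b/> is the first child.
set b [[$doc documentElement] firstChild]
set line [$b getLine]
set column [$b getColumn]
puts "element b starts at line $line, column $column"

$doc delete

# Back to the default (no position information).
dom setStoreLineColumn 0
```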
- dom setNameCheck
?boolean?
- If NameCheck is true, every method that expects an XML
Name, a fully qualified name or a processing instruction target will check
whether the given string is valid according to its production rule. For
commands created with the createNodeCmd method to be used in the
context of appendFromScript, the state of the flag at creation time
decides. If NameCheck is true at creation time, the command will check its
arguments, otherwise not. The setNameCheck method sets this flag. It
returns the current NameCheck flag state. The default state for NameCheck
is true.
- dom setTextCheck
?boolean?
- If TextCheck is true, every command that expects XML
Chars, a comment, a CDATA section value or a processing instruction value
will check whether the given string is valid according to its production rule.
For commands created with the createNodeCmd method to be used in
the context of appendFromScript, the state of the flag at creation
time decides. If TextCheck is true at creation time, the command will
check its arguments, otherwise not. The setTextCheck method sets this
flag. It returns the current TextCheck flag state. The default state for
TextCheck is true.
- dom setObjectCommands
?(automatic|token|command)?
- Controls whether documents and nodes are created as Tcl
commands or as tokens to be used with the domNode and domDoc commands. If
the mode is 'automatic', then methods invoked on Tcl commands will create Tcl
commands and methods invoked on doc or node tokens will create tokens. If the
mode is 'command', then Tcl commands will always be created. If the mode is
'token', then tokens will always be created. The method returns the current
mode. This method is an experimental interface.
- dom isName
name
- Returns 1, if name is a valid XML Name according to
production 5 of the XML 1.0 recommendation. This means, that name
is a valid XML element or attribute name. Otherwise it returns 0.
- dom isPIName
name
- Returns 1, if name is a valid XML processing
instruction target according to production 17 of the XML 1.0
recommendation. Otherwise it returns 0.
- dom isNCName
name
- Returns 1, if name is a valid NCName according to
production 4 of the Namespaces in XML recommendation. Otherwise it
returns 0.
- dom isQName
name
- Returns 1, if name is a valid QName according to
production 6 of the Namespaces in XML recommendation. Otherwise it
returns 0.
- dom isCharData
string
- Returns 1, if every character in string is a valid
XML Char according to production 2 of the XML 1.0 recommendation.
Otherwise it returns 0.
- dom isComment
string
- Returns 1, if string is a valid comment according to
production 15 of the XML 1.0 recommendation. Otherwise it returns 0.
- dom isCDATA
string
- Returns 1, if string is valid according to
production 20 of the XML 1.0 recommendation. Otherwise it returns 0.
- dom isPIValue
string
- Returns 1, if string is valid according to
production 16 of the XML 1.0 recommendation. Otherwise it returns 0.
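The validity checks above can be tried out with a short loop such as the following sketch. The sample strings are invented for this example; each call returns 1 or 0 as described for the respective method.

```tcl
package require tdom

# Pairs of method name and sample string to check.
set samples {
    isName     valid-name
    isName     1invalid
    isNCName   ns:elem
    isQName    ns:elem
    isComment  {two -- hyphens}
    isCDATA    {contains ]]> marker}
    isPIValue  {contains ?> marker}
    isCharData {plain text}
}

foreach {method value} $samples {
    puts [format "%-10s %-22s -> %d" $method $value [dom $method $value]]
}
```

Such checks are useful for validating user-supplied names or text before they are passed to node-creating methods, independent of the NameCheck and TextCheck flags.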
KEYWORDS
XML, DOM, document, node, parsing