'\" '\" Generated from schema.xml '\" '\" BEGIN man.macros .if t .wh -1.3i ^B .nr ^l \n(.l .ad b .de AP .ie !"\\$4"" .TP \\$4 .el \{\ . ie !"\\$2"" .TP \\n()Cu . el .TP 15 .\} .ta \\n()Au \\n()Bu .ie !"\\$3"" \{\ \&\\$1 \\fI\\$2\\fP (\\$3) .\".b .\} .el \{\ .br .ie !"\\$2"" \{\ \&\\$1 \\fI\\$2\\fP .\} .el \{\ \&\\fI\\$1\\fP .\} .\} .. .de AS .nr )A 10n .if !"\\$1"" .nr )A \\w'\\$1'u+3n .nr )B \\n()Au+15n .\" .if !"\\$2"" .nr )B \\w'\\$2'u+\\n()Au+3n .nr )C \\n()Bu+\\w'(in/out)'u+2n .. .AS Tcl_Interp Tcl_CreateInterp in/out .de BS .br .mk ^y .nr ^b 1u .if n .nf .if n .ti 0 .if n \l'\\n(.lu\(ul' .if n .fi .. .de BE .nf .ti 0 .mk ^t .ie n \l'\\n(^lu\(ul' .el \{\ .\" Draw four-sided box normally, but don't draw top of .\" box if the box started on an earlier page. .ie !\\n(^b-1 \{\ \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .el \}\ \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .\} .fi .br .nr ^b 0 .. .de VS .if !"\\$2"" .br .mk ^Y .ie n 'mc \s12\(br\s0 .el .nr ^v 1u .. .de VE .ie n 'mc .el \{\ .ev 2 .nf .ti 0 .mk ^t \h'|\\n(^lu+3n'\L'|\\n(^Yu-1v\(bv'\v'\\n(^tu+1v-\\n(^Yu'\h'-|\\n(^lu+3n' .sp -1 .fi .ev .\} .nr ^v 0 .. .de ^B .ev 2 'ti 0 'nf .mk ^t .if \\n(^b \{\ .\" Draw three-sided box if this is the box's first page, .\" draw two sides but no top otherwise. .ie !\\n(^b-1 \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .el \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .\} .if \\n(^v \{\ .nr ^x \\n(^tu+1v-\\n(^Yu \kx\h'-\\nxu'\h'|\\n(^lu+3n'\ky\L'-\\n(^xu'\v'\\n(^xu'\h'|0u'\c .\} .bp 'fi .ev .if \\n(^b \{\ .mk ^y .nr ^b 2 .\} .if \\n(^v \{\ .mk ^Y .\} .. .de DS .RS .nf .sp .. .de DE .fi .RE .sp .. .de SO .SH "STANDARD OPTIONS" .LP .nf .ta 5.5c 11c .ft B .. .de SE .fi .ft R .LP See the \\fBoptions\\fR manual entry for details on the standard options. .. 
.de OP .LP .nf .ta 4c Command-Line Name: \\fB\\$1\\fR Database Name: \\fB\\$2\\fR Database Class: \\fB\\$3\\fR .fi .IP .. .de CS .RS .nf .ta .25i .5i .75i 1i .if t .ft C .. .de CE .fi .if t .ft R .RE .. .de UL \\$1\l'|0\(ul'\\$2 .. '\" END man.macros .TH schema 3tcl "" Tcl "" .BS .SH NAME tdom::schema \- Creates a schema validation command .SH SYNOPSIS .nf package require tdom \&\fBtdom::schema\fP \fI?create?\fR \fIcmdName\fR .fi .BE .SH "DESCRIPTION " .PP Every call of this command creates a new validation command. A validation command has methods to define a schema and is able to validate XML data or to post-validate a tDOM DOM tree (and, to some degree, other kinds of hierarchical data) against this schema. .PP A validation command may also be used as the argument to the \&\fI-validateCmd\fR option of the \fIdom parse\fR and the \&\fIexpat\fR commands to enable validation in addition to what they do otherwise. .PP The methods of created commands are: .TP \&\fB\fBprefixns\fP \fI?prefixUriList?\fB \&\fRThis method controls the mapping of prefixes (or abbreviations) to namespace URIs. Wherever a namespace argument is expected in the schema command methods, the prefix may be used instead of the namespace URI. If the list maps the same prefix to different namespace URIs, the first one wins. If there is no such prefix, the namespace argument is used literally as namespace URI. If the method is called without argument, it returns the current prefixUriList. If the method is called with the empty string, any namespace URI arguments are used literally. This is the default. .TP \&\fB\fBdefelement\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis method defines the element \fIname\fR (optionally in the namespace \fInamespace\fR) in the schema. The \&\fIdefinition script\fR is evaluated and defines the content model of the element. 
If the \fInamespace\fR argument is given, any \fIelement\fR or \fIref\fR references in the definition script not wrapped inside a \fInamespace\fR command are resolved in that namespace. If there is already an element definition for the name/namespace combination, the command raises an error. .TP \&\fB\fBdefelementtype\fP \fItypename\fB \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis method defines the element type \fItypename\fR (optionally in the namespace \fInamespace\fR) in the schema. If the element type is used in a definition script with the schema command elementtype, the validation engine expects an element named \fIname\fR (in the namespace \fInamespace\fR, if given) with the content model defined by the \fIdefinition script\fR. Defining element types is only sensible if you really have elements with the same name and namespace but different content models. The \fIdefinition script\fR is evaluated and defines the content model of the element. If the \&\fInamespace\fR argument is given, any \fIelement\fR or \&\fIref\fR references in the definition script not wrapped inside a \fInamespace\fR command are resolved in that namespace. If there is already an elementtype definition for the name/namespace combination, the command raises an error. The document element of any XML to validate cannot be a \&\fIdefelementtype\fR defined element. .TP \&\fB\fBdefpattern\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis method defines a (possibly complex) content particle with the \fIname\fR (optionally in the namespace \&\fInamespace\fR) in the schema, to be used in other definition scripts with the definition command \fIref\fR. The \&\fIdefinition script\fR is evaluated and defines the content model of the content particle. If the \fInamespace\fR argument is given, any \fIelement\fR or \fIref\fR references in the definition script not wrapped inside a \fInamespace\fR command are resolved in that namespace. 
If there is already a pattern definition for the name/namespace combination, the command raises an error. .TP \&\fB\fBdeftexttype\fP \fIname\fB \fI<constraint script>\fB \&\fRThis method defines a bundle of text constraints that can be referred to by \fIname\fR while defining constraints on text element or attribute values. If there is already a text type definition with this name, the command raises an error. A text type must be defined before it can be used in schema definition scripts. .TP \&\fB\fBstart\fP \fIdocumentElement\fB \fI?namespace?\fB \&\fRThis method defines the name and namespace of the root element of a tree to validate. If this method is used, the root element must match for validity. If \fIstart\fR is not used, any element defined by \fIdefelement\fR may be the root of a valid document. The \fIstart\fR method may be used several times with varying arguments during the lifetime of a validation command. If the command is called with just the empty string (and no namespace argument), the validation constraint for the root element is removed and any defined element will be valid as root of a tree to validate. .TP \&\fB\fBdefine\fP \fI<definition script>\fB \&\fRThis method allows defining several elements or patterns, or a whole schema, with one call. All schema command methods so far (\fIprefixns\fR, \fIdefelement\fR, \&\fIdefelementtype\fR, \fIdefpattern\fR, \fIdeftexttype\fR and \&\fIstart\fR) are allowed at top level in the \fIdefinition script\fR. The \fIdefine\fR method itself isn't allowed recursively. .TP \&\fB\fBevent\fP \fI(start|end|text)\fB \fI?event specific data?\fB \&\fRThis method allows the validation of hierarchical data against the content constraints of the validation command. .RS .IP "\fBstart \fIname ?attributes? ?namespace? \fP\fR" Checks if the current validation state allows the element \fIname\fR in the \fInamespace\fR to start here. It raises an error if not. 
.IP "\fBend\fR" Checks if the current innermost open element may end here in the current state without violation of validation constraints. It raises an error if not. .IP "\fBtext \fItext\fP\fR" Checks if the current validation state allows the given text content. It raises an error if not. .RE .TP \&\fB\fBvalidate\fP \fI<XML string>\fB \fI?objVar?\fB \&\fRReturns true if the \fI<XML string>\fR is valid, or false otherwise. If validation has failed and the optional \&\fIobjVar\fR argument is given, the variable with that name is set to a validation error message. If the XML string is valid and the optional \fIobjVar\fR argument is given, the variable with that name is set to the empty string. .TP \&\fB\fBvalidatefile\fP \fIfilename\fB \fI?objVar?\fB \&\fRReturns true if the content of \fIfilename\fR is valid, or false otherwise. The given file is fed as a binary stream to expat, therefore only US-ASCII, ISO-8859-1, UTF-8 or UTF-16 encoded data will work with this method. If validation has failed and the optional \fIobjVar\fR argument is given, the variable with that name is set to a validation error message. If the file content is valid and the optional \fIobjVar\fR argument is given, the variable with that name is set to the empty string. .TP \&\fB\fBvalidatechannel\fP \fIchannel\fB \fI?objVar?\fB \&\fRReturns true if the content read from the Tcl channel \&\fIchannel\fR is valid, or false otherwise. Since data read out of a Tcl channel is UTF-8 encoded, any misleading encoding declaration at the beginning of the data will lead to errors. If the validation fails and the optional \fIobjVar\fR argument is given, the variable with that name is set to a validation error message. If the channel content is valid and the optional \fIobjVar\fR argument is given, the variable with that name is set to the empty string. .TP \&\fB\fBdomvalidate\fP \fIdomNode\fB \fI?objVar?\fB \&\fRReturns true if the first argument is a valid tree, or false otherwise. 
If validation has failed and the optional \&\fIobjVar\fR argument is given, the variable with that name is set to a validation error message. If the dom tree is valid and the optional \fIobjVar\fR argument is given, the variable with that name is set to the empty string. .TP \&\fB\fBreportcmd\fP \fI?cmd?\fB \&\fRThis method expects the name of a Tcl command to be called in case of a validation error. The command will be called with two arguments appended: the schema command which raises the validation error, and a validation error code. .RS .PP The possible error codes are: .TP MISSING_ELEMENT .TP MISSING_TEXT .TP UNEXPECTED_ELEMENT .TP UNEXPECTED_ROOT_ELEMENT .TP UNEXPECTED_TEXT .TP UNKNOWN_ROOT_ELEMENT .TP UNKNOWN_ATTRIBUTE .TP MISSING_ATTRIBUTE .TP INVALID_ATTRIBUTE_VALUE .TP DOM_KEYCONSTRAINT .TP DOM_XPATH_BOOLEAN .TP INVALID_KEYREF .TP INVALID_VALUE .TP UNKOWN_GLOBAL_ID .TP UNKOWN_ID .PP For more detailed information see section Recovering. .RE .TP \&\fB\fBdelete\fP \&\fRThis method deletes the validation command. .TP \&\fB\fBinfo\fP \fI?args?\fB \&\fRThis method bundles methods to query the state of and details about the schema command. .RS .IP "\fBvalidationstate\fR" This method returns the current validation state of the validation command. The possible return values and their meanings are: .RS .TP READY The validation command is ready to start validation. .TP VALIDATING The validation command is in the process of validating input. .TP FINISHED The validation has finished, no further events are expected. .RE .IP "\fBvstate\fR" This method is a shorter alias for validationstate; see there. .IP "\fBline\fR" If the schema command is currently validating, this method returns the line part of the parsing position information, and the empty string in all other cases. If the schema command is currently post-validating a DOM tree, there may be no position information stored at some or all nodes. The empty string is returned in these cases. 
.IP "\fBcolumn\fR" If the schema command is currently validating, this method returns the column part of the parsing position information, and the empty string in all other cases. If the schema command is currently post-validating a DOM tree, there may be no position information stored at some or all nodes. The empty string is returned in these cases. .IP "\fBdomNode\fR" If the schema command isn't currently post-validating a DOM tree, this method returns the empty string. Otherwise, if the schema command waits for the reportcmd script to finish while recovering from a validation error, it returns the node the validation engine is currently looking at if that node is an ELEMENT_NODE or, if not, its parent node. It is recommended that you do not use this method, or at least that you leave the DOM tree alone and use it read-only. .IP "\fBnrForwardDefinitions\fR" Returns how many elements, element types and ref patterns are referenced that aren't defined so far (summed together). .IP "\fBdefinedElements\fR" Returns in no particular order the defined elements in the grammar as a list. If an element is namespaced, its list entry will itself be a list with two elements, with the name as first and the namespace as second element. .IP "\fBdefinedElementtypes\fR" Returns in no particular order the defined element types in the grammar as a list. If an element type is namespaced, its list entry will itself be a list with two elements, with the name as first and the namespace as second element. .IP "\fBdefinedPatterns\fR" Returns in no particular order the defined named patterns in the grammar as a list. If a named pattern is namespaced, its list entry will itself be a list with two elements, with the name as first and the namespace as second element. .IP "\fBexpected\fR" Returns in no particular order all possible next events (since the last successful event match, if there was one) as a list. 
If an element is namespaced, its list entry will itself be a list with two elements, with the name as first and the namespace as second element. If text is a possible next event, the list entry will be a two-element list, with #text as first element and the empty string as second. If an any element constraint is possible, the list entry will be a two-element list, with <any> as first element and the empty string as second. If an any element in a certain namespace constraint is possible, the list entry will be a two-element list, with <any> as first element and the namespace as second. If element end is a possible event, the list entry will be a two-element list with <elementend> as first element and the empty string as second element. .IP "\fBdefinition name ?namespace?\fR" Returns the code that defines the given element. The command raises an error if there is no definition of that element. .IP "\fBtypedefinition name ?namespace?\fR" Returns the code that defines the given element type. The command raises an error if there is no definition of that element type. .IP "\fBpatterndefinition name ?namespace?\fR" Returns the code that defines the given pattern. The command raises an error if there is no definition of a pattern with that name and, if given, namespace. .IP "\fBvaction ?name|namespace|text?\fR" .RS .PP This method returns useful information only if the schema command waits for the reportcmd script to finish while recovering from a validation error. Otherwise it returns NONE. .PP If the command is called without the optional argument, the possible return values and their meanings are: .TP NONE The schema command is currently not recovering from a validation error. .TP MATCH_ELEMENT_START Element start event, which includes looking for missing or unknown attributes. .TP MATCH_ELEMENT_END Element end event. .TP MATCH_TEXT Validating text between tags. 
.TP MATCH_ATTRIBUTE_TEXT Attribute text value constraint check. .TP MATCH_GLOBAL Checking global IDs. .TP MATCH_DOM_KEYCONSTRAINT Checking domunique constraint. .TP MATCH_DOM_XPATH_BOOLEAN Checking domxpathboolean constraint. .PP If called with one of the possible optional arguments, the command returns detailed information depending on the current action. .TP name Returns the name of the element that has to match in case of MATCH_ELEMENT_START. Returns the name of the closed element in case of MATCH_ELEMENT_END. Returns the name of the attribute in case of MATCH_ATTRIBUTE_TEXT. Returns the name of the parent element in case of MATCH_TEXT. .TP namespace Returns the namespace of the element that has to match in case of MATCH_ELEMENT_START. Returns the namespace of the closed element in case of MATCH_ELEMENT_END. Returns the namespace of the attribute in case of MATCH_ATTRIBUTE_TEXT. Returns the namespace of the parent element in case of MATCH_TEXT. .TP text Returns the text to match in case of MATCH_TEXT. Returns the value of the attribute in case of MATCH_ATTRIBUTE_TEXT. .RE .IP "\fBstack top|inside|associated\fR" In Tcl scripts evaluated by validation, this method provides information about the current validation stack. Called outside this context, the method returns the empty string. .RS .IP "\fBtop\fR" Returns the element whose content is currently checked (the open element tag at this moment). .IP "\fBinside\fR" Returns all currently open elements as a list. .IP "\fBassociated\fR" Returns the data associated with the current topmost stack content particle or the empty string if there isn't any. .RE .RE .TP \&\fB\fBreset\fP \&\fRThis method resets the validation command into state READY (while preserving the defined grammar). .SH "Schema definition scripts" .PP Schema definition scripts are ordinary Tcl scripts evaluated in the namespace tdom::schema. 
The schema definition commands listed below in this Tcl namespace allow the definition of a wide variety of document structures. Every schema definition command establishes a validation constraint on the content which has to match or must be optional to qualify the content as valid. It is a validation error if there is additional (not matched) content. White-space-only text (in the XML sense of white space) between any different tags is ignored, with the exception of text-only elements (for which even white-space-only text will be considered as significant content). .PP The schema definition commands are: .TP \&\fB\fBelement\fP \fIname\fB \fI?quant?\fB \fI?<definition script>?\fB \&\fRIf the optional argument \fIdefinition script\fR is not given, this command refers to the element defined with \&\fIdefelement\fR with the name \fIname\fR in the current context namespace. If the \fIdefinition script\fR argument is given, the validation constraint expects an element with the name \fIname\fR in the current namespace with content "locally" defined by the \fIdefinition script\fR. Forward references to so far not defined elements or patterns or other local definitions of the same name inside the \fIdefinition script\fR are allowed. If a forward referenced element is not defined until validation, only an empty element with name \&\fIname\fR and namespace \fInamespace\fR and no attributes matches. .TP \&\fB\fBelementtype\fP \fIname\fB \fI?quant?\fB \&\fRThis command refers to the element defined with \&\fIdefelementtype\fR with the type name \fIname\fR in the current context namespace. Forward references to so far not defined element types or recursive references are allowed. If a forward referenced element type is not defined until validation, any empty element without attributes will be accepted. .TP \&\fB\fBref\fP \fIname\fB \fI?quant?\fB \&\fRThis command refers to the content particle defined with \&\fIdefpattern\fR with the name \fIname\fR in the current context namespace. 
Forward references to a so far not defined pattern and recursive references are allowed. If a forward referenced pattern is not defined until validation, no content whatsoever is expected ("empty match"). .TP \&\fB\fBgroup\fP \fI?quant?\fB \fI<definition script>\fB \&\fRThis command groups a sequence of content particles defined by the \fIdefinition script\fR, which have to match in this sequence order. .TP \&\fB\fBchoice\fP \fI?quant?\fB \fI<definition script>\fB \&\fRThis schema constraint matches if one of the top-level content particles defined by the \fIdefinition script\fR matches. If one of these top-level content particles is optional, this constraint matches the "empty match". .TP \&\fB\fBinterleave\fP \fI?quant?\fB \fI<definition script>\fB \&\fRThis schema constraint matches after every one of the required top-level content particles defined by the \fIdefinition script\fR has matched (and, optionally, some or all of the others) in any arbitrary order. .TP \&\fB\fBmixed\fP \fI?quant?\fB \fI<definition script>\fB \&\fRThis schema constraint matches for any text (including the empty one) and every top-level content particle defined by the \&\fIdefinition script\fR with default quantifier *. .TP \&\fB\fBtext\fP \fI?<constraint script>|\*(lqtype\*(lq typename?\fB \&\fRWithout the optional constraint script this validation constraint matches every string (including the empty one). With \fIconstraint script\fR or with a given text type argument a text matching this script or the text type is expected. .TP \&\fB\fBany\fP \fI?namespace?\fB \fI?quant?\fB \&\fRThe any command matches every element (in the namespace \&\fInamespace\fR, if that is given) (with whatever attributes) or subtree, no matter if known within the schema or not. Please note that if no \fInamespace\fR argument is given, the quantifiers * and + will consume all elements until the enclosing element ends. If you really have a namespace that looks like a valid tDOM schema quantifier, you will always have to spell out both arguments. 
.TP \&\fB\fBattribute\fP \fIname\fB \fI?quant?\fB \fI(?<constraint script>|\*(lqtype\*(lq typename?)\fB \&\fRThe attribute command defines an attribute (in no namespace) of the enclosing element. The first definition of \&\fIname\fR inside an element definition wins; later definitions of the same name are silently ignored. After the \&\fIname\fR argument there may be one of the quantifiers ? or !. If there is, it will be used. Otherwise the attribute will be required (must be present in the XML source). If there is one more argument, it is evaluated as a constraint script, defining the value constraints of the attribute. Otherwise, if there are two more arguments and the first of them is the bare word "type", the following argument is used as a text type name. This command is only allowed at top level in the definition script of a defelement/element script. .TP \&\fB\fBnsattribute\fP \fIname\fB \fInamespace\fB \fI?quant?\fB \fI(?<constraint script>|\*(lqtype\*(lq typename?)\fB \&\fRThis command does the same as the command \&\fIattribute\fR, for the attribute \fIname\fR in the namespace \fInamespace\fR. .TP \&\fB\fBnamespace\fP \fIURI\fB \fI<definition script>\fB \&\fREvaluates the \fIdefinition script\fR with context namespace \fIURI\fR. Every element, element type or ref command name will be looked up in the namespace \fIURI\fR, and locally defined elements will be in that namespace. An empty string as \fIURI\fR means no namespace. .TP \&\fB\fBtcl\fP \fItclcmd\fB \fI?arg arg ...?\fB \&\fREvaluates the Tcl script \fItclcmd arg arg ... \fR. This validation command is only allowed in strict sequential context (not in choice, mixed and interleave). If the return code is something other than TCL_OK, this is an error (which is not caught and reported by reportcmd). .TP \&\fB\fBself\fP \&\fRReturns the schema command. .TP \&\fB\fBassociate\fP \fIdata\fB \&\fRThis command is only allowed top-level inside definition scripts of the element, elementtype, pattern or interleave content particles. 
Associates the \fIdata\fR given as argument with the currently defined content particle; the data may be requested in scripts evaluated while validating the content of that particle with the schema command method call \fIinfo stack associated\fR. .TP \&\fB\fBdomunique\fP \fIselector\fB \fIfieldlist\fB \fI?name?\fB \fI?\*(lqIGNORE_EMPTY_FIELD_SET\*(lq|(\*(lqEMPTY_FIELD_SET_VALUE\*(lq emptyFieldSetValue)?\fB \&\fRIf not postvalidating a DOM tree with \fIdomvalidate\fR, this constraint always matches. If postvalidating, this constraint resembles the xsd key/keyref mechanism. The \&\fIselector\fR argument may be any valid XPath expression (without the xsd limits). Several \fIdomunique\fR commands within one element definition are allowed. They are checked in definition order. The \fIname\fR argument is available in the recovering script via \fIinfo vaction name\fR. If the \&\fIfieldlist\fR does not select something for a node of the result set of the \fIselector\fR, the key value will be the empty string by default. If the arguments \&\fIEMPTY_FIELD_SET_VALUE <value>\fR are given, an empty node set will have the key value \fIvalue\fR. If instead the flag \fIIGNORE_EMPTY_FIELD_SET\fR is given, an empty node set result will not have any key value. .TP \&\fB\fBdomxpathboolean\fP \fIXPath_expr\fB \fI?name?\fB \&\fR .RS .PP If not postvalidating a DOM tree with \&\fIdomvalidate\fR, this constraint always matches. If postvalidating, the \fIXPath_expr\fR argument is evaluated (with the node matching the schema parent of the \&\fIdomxpathboolean\fR command as context node). The constraint matches if the result of this XPath expression, converted to boolean by XPath rules, is true. Several \&\fIdomxpathboolean\fR commands within one element definition are allowed. They are checked in definition order. .PP This enables checks depending on more than one element. Consider .CS tdom::schema s s define { defelement doc { element a ! text element b ! text element c ! 
text domxpathboolean "a * b * c >= 20000" volume domxpathboolean "a > b and b > c" sequence } } .CE .RE .TP \&\fB\fBprefixns\fP \fI?prefixUriList?\fB \&\fRThis defines a prefix to namespace URI mapping exactly as a \fIschemacmd prefixns\fR call would. It is meant as a top-level command of a \fIschemacmd define\fR script. This command is not allowed nested in another definition script command and raises an error if called there. .TP \&\fB\fBdefelement\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis defines an element exactly as a \fIschemacmd defelement\fR call would. It is meant as a top-level command of a \&\fIschemacmd define\fR script. This command is not allowed nested in another definition script command and raises an error if called there. .TP \&\fB\fBdefelementtype\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis defines an elementtype exactly as a \fIschemacmd defelementtype\fR call would. It is meant as a top-level command of a \fIschemacmd define\fR script. This command is not allowed nested in another definition script command and raises an error if called there. .TP \&\fB\fBdefpattern\fP \fIname\fB \fI?namespace?\fB \fI<definition script>\fB \&\fRThis defines a named pattern exactly as a \fIschemacmd defpattern\fR call would. It is meant as a top-level command of a \&\fIschemacmd define\fR script. This command is not allowed nested in another definition script command and raises an error if called there. .TP \&\fB\fBdeftexttype\fP \fIname\fB \fI<constraint script>\fB \&\fRThis defines a named bundle of text constraints exactly as a \fIschemacmd deftexttype\fR call would. It is meant as a top-level command of a \fIschemacmd define\fR script. This command is not allowed nested in another definition script command and raises an error if called there. .TP \&\fB\fBstart\fP \fIname\fB \fI?namespace?\fB \&\fRThis command works exactly as a \fIschemacmd start\fR call would. It is meant as a top-level command of a \fIschemacmd define\fR script. 
This command is not allowed nested in another definition script command and raises an error if called there. .SH "Quantity specifier" .PP Several schema definition commands expect a quantifier as one of their arguments, which determines how often the content particle specified by the command is expected. The valid values for a \fIquant\fR argument are: .IP "\fB!\fR" The content particle has to occur exactly once in valid documents. .IP "\fB?\fR" The content particle may not occur more than once in valid documents; the particle is optional. .IP "\fB*\fR" The content particle may occur zero or more times in a row in valid documents. .IP "\fB+\fR" The content particle may occur one or more times in a row in valid documents. .IP "\fBn\fR" The content particle must occur n times in a row in valid documents. The quantifier must be an integer greater than zero. .IP "\fB{n m}\fR" The content particle must occur at least n and at most m times in a row in valid documents. The quantifier must be a Tcl list with two elements. Both elements must be integers, with n >= 0 and n < m. .PP If an optional quantifier is not given, it defaults to * in case of the \fImixed\fR command and to ! for all other commands. .SH "Text constraint scripts" .PP Text (parsed character data, as XML calls it) sometimes has to be of a certain kind or comply with certain rules to be valid. The text constraint script arguments to the text, attribute, nsattribute and deftexttype commands are evaluated in the Tcl namespace \&\fItdom::schema::text\fR and allow the ensuing text constraint commands to check text for certain properties. The commands are defined in the Tcl namespace \&\fItdom::schema::text\fR. They raise an error if they are called outside of a text constraint script. .PP A few of the ensuing text type commands are exposed as general Tcl commands. They are defined in the namespace tdom::type and are called as documented below with the text to check appended to the argument list. 
They return a logical value. Please note that the commands may not accept leading or trailing white space. Whether a command is available in the tdom::type namespace is noted in its documentation. .SS "The tcl text constraint command" .PP The \fItcl\fR text constraint command dispatches the check to an arbitrary Tcl command, thus enabling any programmable decision rules. .TP \&\fB\fBtcl\fP \fItclcmd\fB \fI?arg arg ...?\fB \&\fREvaluates the Tcl script \fItclcmd arg arg ... \fR with the text to validate appended to the argument list. The return value of the Tcl command is interpreted as a boolean. .SS "Basic XML types" .TP \&\fB\fBname\fP .UR "https://www.w3.org/TR/xml/#NT-Name" .UE \&\fRThis text constraint matches if the text value matches the XML name production \&. This means that the text value must start with a letter, underscore (_), or colon (:), and may contain only letters, digits, underscores (_), colons (:), hyphens (-), and periods (.). .TP \&\fB\fBncname\fP .UR "https://www.w3.org/TR/xml-names/#NT-NCName" .UE \&\fRThis text constraint matches if the text value matches the XML ncname production \&. This means that the text value must start with a letter or underscore (_), and may contain only letters, digits, underscores (_), hyphens (-), and periods (.). (The only difference to the name constraint is that colons are not permitted.) .TP \&\fB\fBqname\fP .UR "https://www.w3.org/TR/xml-names/#NT-QName" .UE \&\fRThis text constraint matches if the text value matches the XML qname production \&. This means that the text value is either an ncname or two ncnames joined by a colon (:). 
.TP \&\fB\fBnmtoken\fP .UR "https://www.w3.org/TR/xml/#NT-Nmtoken" .UE \&\fRThis text constraint matches if the text value matches the XML nmtoken production. .TP \&\fB\fBnmtokens\fP .UR "https://www.w3.org/TR/xml/#NT-Nmtokens" .UE \&\fRThis text constraint matches if the text value matches the XML nmtokens production. .SS "Basic type tests" .PP .TP \&\fB\fBinteger\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches if the text value can be parsed as an integer. If the optional argument to the command is \fItcl\fR, everything that returns TCL_OK if fed into Tcl_GetInt() matches. If the optional argument to the command is \fIxsd\fR, the constraint matches if the value is a valid xsd:integer. Without argument, \fIxsd\fR is the default. .TP \&\fB\fBnegativeInteger\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches the same text values as the \&\fIinteger\fR text constraint (see there), with the additional constraint that the value must be < zero. .TP \&\fB\fBnonNegativeInteger\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches the same text values as the \&\fIinteger\fR text constraint (see there), with the additional constraint that the value must be >= zero. .TP \&\fB\fBnonPositiveInteger\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches the same text values as the \&\fIinteger\fR text constraint (see there), with the additional constraint that the value must be <= zero. .TP \&\fB\fBpositiveInteger\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches the same text values as the \&\fIinteger\fR text constraint (see there), with the additional constraint that the value must be > zero. .TP \&\fB\fBnumber\fP \fI?(xsd|tcl)?\fB \&\fRThis text constraint matches if the text value can be parsed as a number. If the optional argument to the command is \&\fItcl\fR, everything that returns TCL_OK if fed into Tcl_GetDouble() matches. If the optional argument to the command is \fIxsd\fR, the constraint matches if the value is a valid xsd:decimal. 
Without an argument, \fIxsd\fR is the default.
.TP
\&\fB\fBboolean\fP \fI?(xsd|tcl)?\fB
\&\fRThis text constraint matches if the text value can be parsed as
a boolean. If the optional argument to the command is
\&\fItcl\fR, every value for which Tcl_GetBoolean() returns TCL_OK
matches. If the optional argument to the command is \fIxsd\fR, the
constraint matches if the value is a valid xsd:boolean. Without an
argument, \fIxsd\fR is the default.
.TP
\&\fB\fBdate\fP
\&\fRThis text constraint matches if the text value is an xsd:date,
which is basically an ISO 8601 date of the form YYYY-MM-DD with an
optional time zone part. The time zone part is either the letter Z or
a plus (+) or minus (-) sign followed by hh:mm; the maximum allowed
positive or negative time zone offset is 14:00. The date rules of the
Gregorian calendar apply to all dates. A preceding minus sign for BCE
dates is allowed. There is no year 0. The year may have more than 4
digits, but only if needed (no extra leading zeros). This is available
as the common Tcl command tdom::type::date.
.TP
\&\fB\fBtime\fP
\&\fRThis text constraint matches if the text value is an xsd:time,
which is basically an ISO 8601 time of the form hh:mm:ss with an
optional time zone part. The time zone part follows the rules of the
\fIdate\fR command; see there. All three parts of the time value
(hours, minutes, seconds) must be spelled out with 2 digits.
Additional fractional seconds (with a point ('.') as separator) are
allowed, but not just a dangling point. The time value 24:00:00
(without fractional part) is allowed. This is available as the common
Tcl command tdom::type::time.
.TP
\&\fB\fBdateTime\fP
\&\fRThis text constraint matches if the text value is an
xsd:dateTime, which is basically an ISO 8601 date time of the form
YYYY-MM-DDThh:mm:ss with an optional time zone part. The date and time
zone parts follow the rules of the \fIdate\fR and \fItime\fR commands;
see there. The time part (including the signaling 'T' character) is
mandatory.
This is available as the common Tcl command tdom::type::dateTime.
.TP
\&\fB\fBduration\fP
\&\fRThis text constraint matches if the text value is an
xsd:duration, which is basically an ISO 8601 duration of the form
PnYnMnDTnHnMnS. All parts are optional except the leading P and, if
one of H, M or S is given, the T. If the designator letter that
follows n is S, n may be a decimal (with at least one digit before and
after the dot); otherwise it must be a (positive) integer. This is
available as the common Tcl command tdom::type::duration.
.TP
\&\fB\fBbase64\fP
\&\fRThis text constraint matches if the text is valid according to
RFC 4648.
.TP
\&\fB\fBhexBinary\fP
\&\fRThis text constraint matches if the text is a sequence of binary
octets in hexadecimal encoding, where each binary octet is a
two-character hexadecimal number. Lowercase and uppercase letters A
through F are permitted.
.TP
\&\fB\fBunsignedByte\fP
\&\fRThis text constraint matches if the text value is an
xsd:unsignedByte. This is an integer between 0 and 255, inclusive,
optionally preceded by a + sign and leading zeros.
.TP
\&\fB\fBunsignedShort\fP
\&\fRThis text constraint matches if the text value is an
xsd:unsignedShort. This is an integer between 0 and 65535, inclusive,
optionally preceded by a + sign and leading zeros.
.TP
\&\fB\fBunsignedInt\fP
\&\fRThis text constraint matches if the text value is an
xsd:unsignedInt. This is an integer between 0 and 4294967295,
inclusive, optionally preceded by a + sign and leading zeros.
.TP
\&\fB\fBunsignedLong\fP
\&\fRThis text constraint matches if the text value is an
xsd:unsignedLong. This is an integer between 0 and
18446744073709551615, inclusive, optionally preceded by a + sign and
leading zeros.
.SS "Logical constructs"
.TP
\&\fB\fBoneOf\fP \fI<constraint script>\fB
\&\fRThis text constraint matches if one of the text constraints
defined in the argument \fIconstraint script\fR matches the text.
It probes the text constraints in the order of definition and stops
after the first match.
.TP
\&\fB\fBallOf\fP \fI<constraint script>\fB
\&\fRThis text constraint matches if all of the text constraints
defined in the argument \fIconstraint script\fR match the text. It
probes the text constraints in the order of definition and stops after
the first match failure. Since the schema definition command
\fItext\fR also expects all of its text constraints to match the text,
\fIallOf\fR is useful mostly in connection with the \fIoneOf\fR text
constraint command.
.TP
\&\fB\fBnot\fP \fI<constraint script>\fB
\&\fRThis text constraint matches if none of the text constraints
defined in the argument \fIconstraint script\fR matches the text. It
stops after the first matching constraint in the \fIconstraint
script\fR and reports a validation error. The text constraints in the
\&\fIconstraint script\fR are probed in the order of definition.
.SS "Constraints on processed text value"
.TP
\&\fB\fBwhitespace\fP \fI(preserve|replace|collapse)\fB \fI<constraint script>\fB
\&\fRThis text constraint command normalizes white space (#x20 (space, ' '), #x9 (tab, \et), #xA (linefeed, \en), and #xD (carriage return, \er))
in the text value and checks the resulting text against the text
constraints of the constraint script argument. The normalization
method \&\fIpreserve\fR keeps everything as it is; this is another way
to say \fIallOf\fR. The \fIreplace\fR normalization method replaces
every single white-space character (as above) with a space. The
\fIcollapse\fR normalization method removes all leading and trailing
white space and replaces every other sequence of contiguous white
space with a single space.
.TP
\&\fB\fBsplit\fP \fI?type ?args??\fB \fI<constraint script>\fB
\&\fR
.RS
.PP
This text constraint command splits the text to test into a list of
values and tests every element of that list against the text
constraints in the evaluated \fIconstraint script\fR.
.PP
The available types are:
.TP
whitespace
The text to split is stripped of all white space at start and end and
split into a list at every run of successive white space.
.TP
tcl tclcmd ?arg ...?
The text to split is handed to the \fItclcmd\fR, which is evaluated at
global level with every given arg and, as last argument, the text to
split appended. This call must return a valid Tcl list, whose elements
are tested.
.PP
If no split type argument is given, the default is
\&\fIwhitespace\fR.
.RE
.TP
\&\fB\fBstrip\fP \fI<constraint script>\fB
\&\fRThis text constraint command tests all text constraints in the
evaluated \fIconstraint script\fR against the text to test stripped of
all white space at start and end.
.SS "Various other string properties"
.TP
\&\fB\fBfixed\fP \fIvalue\fB
\&\fRThis text constraint matches only if the text value is string
equal to the given value.
.TP
\&\fB\fBenumeration\fP \fIlist\fB
\&\fRThis text constraint matches if the text value is equal to one
element (respecting case and any white space) of the argument
\fIlist\fR, which has to be a valid Tcl list.
.TP
\&\fB\fBmatch\fP \fI?-nocase?\fB \fI<glob_style_match_pattern>\fB
\&\fRThis text constraint matches if the text value matches the glob
style pattern given as argument. It follows the rules of the Tcl
[string match] command, see
.UR "https://www.tcl.tk/man/tcl8.6/TclCmd/string.htm#M35"
.UE .
.TP
\&\fB\fBregexp\fP \fIexpression\fB
\&\fRThis text constraint matches if the text value matches the
regular expression given as argument.
.UR "https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm"
The Tcl re_syntax manual page
.UE
describes the regular expression syntax.
.TP
\&\fB\fBlength\fP \fIlength\fB
\&\fRThis text constraint matches if the length of the text value (in
characters, not bytes) is \fIlength\fR. The length argument must be a
positive integer or zero.
.TP
\&\fB\fBmaxLength\fP \fIlength\fB
\&\fRThis text constraint matches if the length of the text value (in
characters, not bytes) is at most \fIlength\fR.
The length argument must be an integer greater than zero.
.TP
\&\fB\fBminLength\fP \fIlength\fB
\&\fRThis text constraint matches if the length of the text value (in
characters, not bytes) is at least \fIlength\fR. The length argument
must be an integer greater than zero.
.TP
\&\fB\fBid\fP \fI?keySpace?\fB
\&\fRThis text constraint command marks the text as a document wide ID
(to be referenced by an idref). Every ID value within a document must
be unique. It isn't an error if the ID isn't actually referenced
within the document. The optional argument \fIkeySpace\fR does all
this for a named key space. The key space "" (the empty string) is a
different key space than the one used by the \fIid\fR command without
a keySpace argument.
.TP
\&\fB\fBidref\fP \fI?keySpace?\fB
\&\fRThis text constraint command expects the text to be a reference
to an ID within the document. The referenced ID may appear later in
the document than the reference. Several references within the
document to one ID are possible.
.SH "Local key constraints"
.PP
Document wide uniqueness and foreign key constraints are available
with the text constraint commands id and idref. Keyspaces allow for
sub-tree local uniqueness and foreign key constraints.
.TP
\&\fB\fBkeyspace\fP \fI<names>\fB \fI<constraint script>\fB
\&\fRAny number of keyspaces are possible. A keyspace is either active
or not. A \fIkeyspace\fR command with the same name, called inside the
\fIconstraint script\fR, does nothing.
.PP
These text constraint commands work with keyspaces:
.TP
\&\fB\fBkey\fP \fI<name>\fB
\&\fRIf the keyspace with the name \fI<name>\fR is not active, the
constraint always matches. If the keyspace is active, the constraint
reports an error if there is already a key with the value. Otherwise
it stores the value as key in this keyspace and matches.
.TP
\&\fB\fBkeyref\fP \fI<name>\fB
\&\fRIf the keyspace with the name \fI<name>\fR is not active, the
constraint always matches. If the keyspace is active, an error is
reported if, at the end of the keyspace \fI<name>\fR, there is still
no key equal to the value. Otherwise, it matches.
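.PP
The keyspace commands above may be easier to follow with a small
illustration. The following is only a minimal sketch and is not taken
from the tDOM distribution: the element names \fIdoc\fR, \fIitem\fR and
\fIref\fR, the keyspace name \fIitems\fR and the sample XML strings are
made up for this example. It assumes that \fIkeyspace\fR is called
inside the definition script of an element and that the schema command
is used directly to validate a string with its \fIvalidate\fR method.
.CS
package require tdom

tdom::schema s
s define {
    defelement doc {
        # Assumption: the keyspace "items" is active while the
        # content of doc defined inside this script is validated.
        keyspace items {
            # Every item text is stored as a key and must be unique.
            element item * {text {key items}}
            # Every ref text must be equal to some key in "items".
            element ref * {text {keyref items}}
        }
    }
}

# Every ref refers to an existing item: expected to validate.
puts [s validate {<doc><item>a</item><item>b</item><ref>a</ref></doc>}]

# The key "a" is used twice: expected to be reported as invalid.
puts [s validate {<doc><item>a</item><item>a</item></doc>}]

s delete
.CE
.PP
Outside of the \fIkeyspace\fR script the keyspace is not active, so the
same \fIkey\fR and \fIkeyref\fR constraints would simply match there.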
.SH Recovering
.PP
By default the validation engine stops at the first detected
validation violation and reports that finding. If the schema command
itself is used to validate input, it does so by returning false (and
by setting, if given, the result variable to an error message). If the
schema command is used by a SAX parser or the DOM parser, it does so
by throwing an error.
.PP
If a \fIreportcmd\fR is set, this command is called at global level,
with the schema command and an error type appended as arguments,
whenever a validation violation is detected. Then the validation
recovers from the error and continues. For some validation errors the
recovery strategy can be selected through the script result of the
reportcmd.
.PP
With a \fIreportcmd\fR (which does not throw an error if called) the
validation engine will never report validation failure to its caller.
The validation engine recovers, continues, and reports the next error
(if one occurs) and so on until the end of the input. The schema
command will return true and the SAX parser and DOM builder will
process normally until the end of the input, as if there had not been
a validation error.
.PP
Please note that this happens only for validation errors. It is not
possible to recover from well-formedness errors. If the input is not
well-formed, the schema command returns false and sets (if given) the
result variable to an error message about the well-formedness error.
.PP
If the \fIreportcmd\fR throws an error while called by the validation
engine, then validation stops and the schema command throws an error
with the error message of the script.
.PP
While validating, basically three events can happen: an element start
tag has to match, a piece of text has to match or an element end tag
has to match. The method \fIinfo vaction\fR, called in the recovering
script or in any script code called from there, returns which event
has triggered the error report (MATCH_ELEMENT_START, MATCH_TEXT,
MATCH_ELEMENT_END, respectively).
While the command walks through the schema to check whether the event
matches, other, data-driven events (for example, checking whether any
keyref within a keyspace exists) may happen.
.PP
Several of the validation error codes, appended as second argument to
the \fIreportcmd\fR calls, may happen at more than one kind of
validation event. The \fIinfo vaction\fR method and its subcommands
provide information about the current validation event, if called from
the report command.
.PP
If a structural validation error happens, the default recovering
strategy is to ignore any following (or missing) content within the
current subtree and to continue with the element end event of the
subtree.
.PP
Returning "ignore" from the recovering script in case of the error
type MISSING_ELEMENT recovers by ignoring the failed constraint and
continues to match the event further against the schema.
.PP
Returning "vanish" from the recovering script in case of the error
types MISSING_ELEMENT and UNEXPECTED_ELEMENT recovers by ignoring the
event.
.SH Examples
.PP
The XML Schema Part 0: Primer Second Edition
.UR "https://www.w3.org/TR/xmlschema-0/"
.UE
starts with this example schema:
.CS
Purchase order schema for Example.com.
Copyright 2000 Example.com. All rights reserved.
.CE
.PP
A likely one-to-one translation of that into a tDOM schema definition
script would be:
.CS
tdom::schema schema
schema define {

    # Purchase order schema for Example.com.
    # Copyright 2000 Example.com. All rights reserved.

    defelement purchaseOrder {ref PurchaseOrderType}

    foreach elm {comment name street city state product} {
        defelement $elm text
    }

    defpattern PurchaseOrderType {
        element shipTo ! {ref USAddress}
        element billTo ! {ref USAddress}
        element comment ?
        element items
        attribute orderDate date
    }

    defpattern USAddress {
        element name
        element street
        element city
        element state
        element zip ! {text number}
        attribute country ! {fixed "US"}
    }

    defelement items {
        element item * {
            element product
            element quantity ! {text integer}
            element USPrice ! {text number}
            element comment
            element shipDate ? {text date}
            attribute partNum ! {regexp {^\ed{3}-[A-Z]{2}$}}
        }
    }
}
.CE
.PP
The RELAX NG Tutorial
.UR "http://relaxng.org/tutorial-20011203.html"
.UE
starts with this example:
.CS
Consider a simple XML representation of an email address book:

John Smith js@example.com
Fred Bloggs fb@example.net

The DTD would be as follows:

]>

A RELAX NG pattern for this could be written as follows:
.CE
.PP
This schema definition script will do the same:
.CS
tdom::schema schema
schema define {
    defelement addressBook {
        element card *
    }
    defelement card {
        element name
        element email
    }
    foreach e {name email} {
        defelement $e text
    }
}
.CE
.SH KEYWORDS
Validation, Postvalidation, DOM, SAX