'\" '\" Generated from pullparser.xml '\" '\" BEGIN man.macros .if t .wh -1.3i ^B .nr ^l \n(.l .ad b .de AP .ie !"\\$4"" .TP \\$4 .el \{\ . ie !"\\$2"" .TP \\n()Cu . el .TP 15 .\} .ta \\n()Au \\n()Bu .ie !"\\$3"" \{\ \&\\$1 \\fI\\$2\\fP (\\$3) .\".b .\} .el \{\ .br .ie !"\\$2"" \{\ \&\\$1 \\fI\\$2\\fP .\} .el \{\ \&\\fI\\$1\\fP .\} .\} .. .de AS .nr )A 10n .if !"\\$1"" .nr )A \\w'\\$1'u+3n .nr )B \\n()Au+15n .\" .if !"\\$2"" .nr )B \\w'\\$2'u+\\n()Au+3n .nr )C \\n()Bu+\\w'(in/out)'u+2n .. .AS Tcl_Interp Tcl_CreateInterp in/out .de BS .br .mk ^y .nr ^b 1u .if n .nf .if n .ti 0 .if n \l'\\n(.lu\(ul' .if n .fi .. .de BE .nf .ti 0 .mk ^t .ie n \l'\\n(^lu\(ul' .el \{\ .\" Draw four-sided box normally, but don't draw top of .\" box if the box started on an earlier page. .ie !\\n(^b-1 \{\ \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .el \}\ \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .\} .fi .br .nr ^b 0 .. .de VS .if !"\\$2"" .br .mk ^Y .ie n 'mc \s12\(br\s0 .el .nr ^v 1u .. .de VE .ie n 'mc .el \{\ .ev 2 .nf .ti 0 .mk ^t \h'|\\n(^lu+3n'\L'|\\n(^Yu-1v\(bv'\v'\\n(^tu+1v-\\n(^Yu'\h'-|\\n(^lu+3n' .sp -1 .fi .ev .\} .nr ^v 0 .. .de ^B .ev 2 'ti 0 'nf .mk ^t .if \\n(^b \{\ .\" Draw three-sided box if this is the box's first page, .\" draw two sides but no top otherwise. .ie !\\n(^b-1 \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .el \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .\} .if \\n(^v \{\ .nr ^x \\n(^tu+1v-\\n(^Yu \kx\h'-\\nxu'\h'|\\n(^lu+3n'\ky\L'-\\n(^xu'\v'\\n(^xu'\h'|0u'\c .\} .bp 'fi .ev .if \\n(^b \{\ .mk ^y .nr ^b 2 .\} .if \\n(^v \{\ .mk ^Y .\} .. .de DS .RS .nf .sp .. .de DE .fi .RE .sp .. .de SO .SH "STANDARD OPTIONS" .LP .nf .ta 5.5c 11c .ft B .. .de SE .fi .ft R .LP See the \\fBoptions\\fR manual entry for details on the standard options. .. .de OP .LP .nf .ta 4c Command-Line Name: \\fB\\$1\\fR Database Name: \\fB\\$2\\fR Database Class: \\fB\\$3\\fR .fi .IP .. .de CS .RS .nf .ta .25i .5i .75i 1i .if t .ft C .. .de CE .fi .if t .ft R .RE .. .de UL \\$1\l'|0\(ul'\\$2 .. '\" END man.macros .TH pullparser 3tcl "" Tcl "" .BS .SH NAME tdom::pullparser \- Create an XML pull parser command .SH SYNOPSIS .nf package require tdom \fBtdom::pullparser\fP \fIcmdName\fR \fR?\fP -ignorewhitecdata \fR?\fP .fi .BE .SH "DESCRIPTION " .PP This command creates XML pull parser commands with a simple API, along the lines of a simple StAX parser. After creation, you've to set an input source, to do anything useful with the pull parser. For this see the methods \fIinput\fR, \fIinputchannel\fR and \fIinputfile\fR. .PP The parser has always a \fIstate\fR. You start parsing the XML data until some next state, do what has to be done and skip again to the next state. XML well-formedness errors along the way will be reported as TCL_ERROR with additional info in the error message. .PP The pull parsers don't follow external entities and are XML 1.0 only, they know nothing about XML Namespaces. You get the tags and attribute names as in the source. You aren't noticed about comments, processing instructions and external entities; they are silently ignored for you. CDATA Sections are handled as if their content would have been provided without using a CDATA Section. .PP On the brighter side is that character entity and attribute default declarations in the internal subset are respected (because of using expat as underlying parser). It is probably somewhat faster than a comperable implementation with the SAX interface. It's a nice programming model. It's a slim interface. .PP If the option \fI-ignorewhitecdata\fR is given, the created XML pull parser command will ignore any white space only (' ', \et, \&\en and \er) text content between START_TAG and START_TAG / END_TAG. The parser won't stop at such input and will create TEXT state events only for not white space only text. .PP Not all methods are valid in every state. The parser will raise TCL_ERROR if a method is called in a state the method isn't valid for. Valid methods of the created commands are: .TP \&\fB\fBstate\fP \&\fRThis method is valid in all parser states. The possible return values and their meanings are: .RS .IP "\(bu" \&\fIREADY\fR - The parser is created or reset, but no input is set. .IP "\(bu" \&\fISTART_DOCUMENT\fR - Input is set, parser is ready to start parsing. .IP "\(bu" \&\fISTART_TAG\fR - Parser has stopped parsing at a start tag. .IP "\(bu" \&\fIEND_TAG\fR - Parser has stopped parsing at an end tag .IP "\(bu" \&\fITEXT\fR - Parser has stopped parsing to report text between tags. .IP "\(bu" \&\fIEND_DOKUMENT\fR - Parser has finished parsing without error. .IP "\(bu" \&\fIPARSE_ERROR\fR - Parser stopped parsing at XML error in input. .RE .TP \&\fB\fBinput\fP \fIdata\fB \&\fRThis method is only valid in state \fIREADY\fR. It prepares the parser to use \fIdata\fR as XML input to parse and switches the parser into state START_DOCUMENT. .TP \&\fB\fBinputchannel\fP \fIchannel\fB \&\fRThis method is only valid in state \fIREADY\fR. It prepares the parser to read the XML input to parse out of \&\fIchannel\fR and switches the parser into state START_DOCUMENT. .TP \&\fB\fBinputfile\fP \fIfilename\fB \&\fRThis method is only valid in state \fIREADY\fR. It open \fIfilename\fR and prepares the parser to read the XML input to parse out of that file. The method returns TCL_ERROR, if the file could not be open in read mode. Otherwise it switches the parser into state START_DOCUMENT. .TP \&\fB\fBnext\fP \&\fRThis method is valid in state \fISTART_DOCUMENT\fR, \&\fISTART_TAG\fR, \fIEND_TAG\fR and \fITEXT\fR. It continues parsing of the XML input until the next event, which it will return. .TP \&\fB\fBtag\fP \&\fRThis method is only valid in states \fISTART_TAG\fR and \&\fIEND_TAG\fR. It returns the tag name of the current start or end tag. .TP \&\fB\fBattributes\fP \&\fRThis method is only valid in state \fISTART_TAG\fR. It returns all attributes of the element in a name value list. .TP \&\fB\fBtext\fP \&\fRThis method is only valid in state \fITEXT\fR. It returns the character data of the event. There will be always at most one TEXT event between START_TAG and the next START_TAG or END_TAG event. .TP \&\fB\fBskip\fP \&\fRThis method is only valid in state \fISTART_TAG\fR. It skips to the corresponding end tag and ignores all events (but not XML parsing errors) on the way and returns the new state END_TAG. .TP \&\fB\fBfind-element\fP \fItagname\fB \&\fRThis method is only valid in states \&\fISTART_DOCUMENT\fR, \fISTART_TAG\fR and \fIEND_TAG\fR. It skips forward until the next element start tag with tag name \&\fItagname\fR and returns the new start START_TAG. If there isn't such an element the parser stops at the end of the input and returns END_DOCUMENT. .TP \&\fB\fBreset\fP \&\fRThis method is valid in all parser states. It resets the parser into READY state and returns that. .TP \&\fB\fBdelete\fP \&\fRThis method is valid in all parser states. It deletes the parser command. .PP Miscellaneous methods: .TP \&\fB\fBline\fP \&\fRThis method is valid in all parser states except READY and TEXT. It returns the line number of the parsing position. .TP \&\fB\fBline\fP \&\fRThis method is valid in all parser states except READY and TEXT. It returns the offset, from the beginning of the current line, of the parsing position. .SH KEYWORDS XML, pull, parsing