NAME¶
grammar::peg::interp - Interpreter for parsing expression grammars
SYNOPSIS¶
package require
Tcl 8.4
package require
grammar::mengine ?0.1?
package require
grammar::peg::interp ?0.1.1?
::grammar::peg::interp::setup peg
::grammar::peg::interp::parse nextcmd errorvar
astvar
DESCRIPTION¶
This package provides commands for the controlled matching of a character stream
via a parsing expression grammar and the creation of an abstract syntax tree
for the stream and partials.
It is built on top of the virtual machine provided by the package
grammar::me::tcl and directly interprets the parsing expression grammar
given to it. In other words, the grammar is
not pre-compiled but used
as is.
The grammar to be interpreted is taken from a container object following the
interface specified by the package
grammar::peg::container. Only the
relevant parts are copied into the state of this package.
It should be noted that the package provides exactly one instance of the
interpreter, and interpreting a second grammar requires the user to either
abort or complete a running interpretation, or to put them into different Tcl
interpreters.
Also of note is that the implementation assumes a pull-type handling of the
input. In other words, the interpreter pulls characters from the input stream
as it needs them. For usage in a push environment, i.e. where the environment
pushes new characters as they come we have to put the engine into its own
thread.
THE INTERPRETER API¶
The package exports the following API
- ::grammar::peg::interp::setup peg
- This command (re)initializes the interpreter. It returns the empty string.
This command has to be invoked first, before any matching run.
Its argument peg is the handle of an object containing the parsing
expression grammar to interpret. This grammar has to be valid, or an error
will be thrown.
- ::grammar::peg::interp::parse nextcmd errorvar
astvar
- This command interprets the loaded grammar and tries to match it against
the stream of characters represented by the command prefix nextcmd.
The command prefix nextcmd represents the input stream of characters
and is invoked by the interpreter whenever the a new character from the
stream is required. The callback has to return either the empty list, or a
list of 4 elements containing the token, its lexeme attribute, and its
location as line number and column index, in this order. The empty list is
the signal that the end of the input stream has been reached. The lexeme
attribute is stored in the terminal cache, but otherwise not used by the
machine.
The result of the command is a boolean value indicating whether the matching
process was successful ( true), or not ( false). In the case
of a match failure error information will be stored into the variable
referenced by errorvar. The variable referenced by astvar
will always contain the generated abstract syntax tree, however in the
case of an error it will be only partial and possibly malformed.
The abstract syntax tree is represented by a nested list, as described in
section AST VALUES of document grammar::me_ast.
BUGS, IDEAS, FEEDBACK¶
This document, and the package it describes, will undoubtedly contain bugs and
other problems. Please report such in the category
grammar_peg of the
Tcllib Trackers [
http://core.tcl.tk/tcllib/reportlist]. Please also
report any ideas for enhancements you may have for either package and/or
documentation.
KEYWORDS¶
LL(k), TDPL, context-free languages, expression, grammar, matching, parsing,
parsing expression, parsing expression grammar, push down automaton, recursive
descent, state, top-down parsing languages, transducer, virtual machine
CATEGORY¶
Grammars and finite automata
COPYRIGHT¶
Copyright (c) 2005-2011 Andreas Kupries <andreas_kupries@users.sourceforge.net>