NAME¶
page_util_norm_peg - page AST normalization, PEG
SYNOPSIS¶
package require
page::util::norm_peg ?0.1?
package require
snit
::page::util::norm::peg tree
DESCRIPTION¶
This package provides a single utility command which takes an AST for a parsing
expression grammar and normalizes it in various ways. The result is called a
Normalized PE Grammar Tree.
Note that this package can only be used from within a plugin managed by
the package
page::pluginmgr.
API¶
- ::page::util::norm::peg tree
- This command assumes the tree object contains for a parsing
expression grammar. It normalizes this tree in place. The result is called
a Normalized PE Grammar Tree.
The following operations are performd
- [1]
- The data for all terminals is stored in their grandparental nodes. The
terminal nodes and their parents are removed. Type information is
dropped.
- [2]
- All nodes which have exactly one child are irrelevant and are removed,
with the exception of the root node. The immediate child of the root is
irrelevant as well, and removed as well.
- [3]
- The name of the grammar is moved from the tree node it is stored in to an
attribute of the root node, and the tree node removed.
The node keeping the start expression separate is removed as irrelevant and
the root node of the start expression tagged with a marker attribute, and
its handle saved in an attribute of the root node for quick access.
- [4]
- Nonterminal hint information is moved from nodes into attributes, and the
now irrelevant nodes are deleted.
Note: This transformation is dependent on the removal of all nodes
with exactly one child, as it removes the all 'Attribute' nodes already.
Otherwise this transformation would have to put the information into the
grandparental node.
The default mode given to the nonterminals is value.
Like with the global metadata definition specific information is moved out
out of nodes into attributes, the now irrelevant nodes are deleted, and
the root nodes of all definitions are tagged with marker attributes. This
provides us with a mapping from nonterminal names to their defining nodes
as well, which is saved in an attribute of the root node for quick
reference.
At last the range in the input covered by a definition is computed. The left
extent comes from the terminal for the nonterminal symbol it defines. The
right extent comes from the rightmost child under the definition. While
this not an expression tree yet the location data is sound already.
- [5]
- The remaining nodes under all definitions are transformed into proper
expression trees. First character ranges, followed by unary operations,
characters, and nonterminals. At last the tree is flattened by the removal
of superfluous inner nodes.
The order matters, to shed as much nodes as possible early, and to avoid
unnecessary work later.
BUGS, IDEAS, FEEDBACK¶
This document, and the package it describes, will undoubtedly contain bugs and
other problems. Please report such in the category
page of the
Tcllib Trackers [
http://core.tcl.tk/tcllib/reportlist]. Please also
report any ideas for enhancements you may have for either package and/or
documentation.
KEYWORDS¶
PEG, graph walking, normalization, page, parser generator, text processing, tree
walking
CATEGORY¶
Page Parser Generator
COPYRIGHT¶
Copyright (c) 2007 Andreas Kupries <andreas_kupries@users.sourceforge.net>