.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.28)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
.    if \nF \{
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Marpa::R2::Semantics::Order 3pm"
.TH Marpa::R2::Semantics::Order 3pm "2015-03-06" "perl v5.20.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Marpa::R2::Semantics::Order \- How the SLIF ranks ambiguous parses
.SH "Description"
.IX Header "Description"
Marpa allows ambiguous parses.
While an unambiguous parse can produce at most one parse tree
and one parse result,
an ambiguous parse will produce a parse series.
A parse series is a sequence of parse trees,
each of which will have its own parse result.
.PP
This document describes ways of controlling
the order in which
the \s-1SLIF\s0 recognizer's \f(CW\*(C`value()\*(C'\fR method
evaluates the parse
trees of an ambiguous parse.
It also describes ways to exclude selected parse trees
from the parse series.
.SS "Duplicate parses are eliminated"
.IX Subsection "Duplicate parses are eliminated"
When evaluating the parse trees in a parse series,
Marpa never evaluates the same parse tree twice.
What this means probably matches the programmer's
intuition of what it should mean.
Marpa considers two parse trees to be the same if they are
\&\fBsemantic equivalents\fR.
.PP
Two parse trees are semantic equivalents if
and only if
a recursive, top-down evaluation of each
applies
the same rules
in the same order
at the same G1 locations.
If the semantics are deterministic,
and if two parse trees are
semantic equivalents according to this definition,
the two parse trees will always produce the same parse result.
.PP
The two parse trees are called semantic equivalents,
because from the point
of view of a deterministic semantics they are indistinguishable.
When the Marpa documentation refers to duplicate
parses,
unless otherwise stated,
it means that the two
are semantic equivalents.
.SS "Default parse order"
.IX Subsection "Default parse order"
By calling
the recognizer's
\&\f(CW\*(C`value()\*(C'\fR
method
repeatedly,
Marpa can produce all the parse results
in the current parse series.
The default is for the parse results to be returned
in an \fBarbitrary parse order\fR.
This corresponds to the "\f(CW\*(C`none\*(C'\fR" value of
the recognizer's \f(CW\*(C`ranking_method\*(C'\fR
named argument.
.PP
Traversal of the parse trees in
arbitrary parse order
will be always be well-behaved
in the sense
that no two parse trees will be semantic duplicates,
and no unique (semantic non-duplicate)
parse tree will be omitted in it.
No other property of arbitrary parse order is guaranteed.
For example, the order may
change each time
the parse series is traversed.
.SS "Ranking methods"
.IX Subsection "Ranking methods"
\&\s-1SLIF\s0 recognizer objects have a \f(CW\*(C`ranking_method\*(C'\fR named
argument,
whose value can be the name of a ranking method,
or "\f(CW\*(C`none\*(C'\fR", indicating that the default ranking method is to
be used.
.ie n .SS "The ""rule"" ranking method"
.el .SS "The \f(CWrule\fP ranking method"
.IX Subsection "The rule ranking method"
The rule method ranks alternative parses according to their rule alternatives.
Every rule alternative has a \fBnumeric rank\fR.
A rule's rank can be specified using the
the \f(CW\*(C`rank\*(C'\fR adverb
argument for that \s-1RHS\s0 alternative.
Rule ranks must be integers.
They may be negative.
If no numeric rank is specified, the numeric rank is 0.
.ie n .SS "The ""high_rule_only"" ranking method"
.el .SS "The \f(CWhigh_rule_only\fP ranking method"
.IX Subsection "The high_rule_only ranking method"
The \f(CW\*(C`high_rule_only\*(C'\fR ranking method is similar to the
\&\f(CW\*(C`rule\*(C'\fR ranking method, except that, at every choice point,
it discards all of the choices which
have a rank lower than that of the highest ranked choice.
.PP
The \f(CW\*(C`high_rule_only\*(C'\fR ranking method 
can reduce the ambiguity of a parse,
but it does not necessarily do so.
This is because, at each choice point among the parse trees,
it is possible that several of the choices,
or all of them, will have the same rank
as the highest ranked choice.
.SS "Rule ranking"
.IX Subsection "Rule ranking"
A parse series is kept in a structure called a \fBparse bocage\fR.
The parse bocage is a tree-like structure, whose root node
is the common root of all the parse trees of the parse series.
In an unambiguous parse,
there will be only one parse tree,
and the parse bocage will be equivalent
to that parse tree.
In an ambiguous parse,
there will be \fBchoice points\fR in the parse bocage.
At the choice points, there will be two or more
\&\fBalternatives\fR \*(-- choices which
result in different parse trees.
.PP
When ranking, the logic traverses the parse bocage,
looking for choice points.
From the point of view of the individual parse trees,
this traversal will be top-down
and left-to-right.
At the choice points,
the choices
are ranked as follows:
.IP "\(bu" 4
\&\fBDifferent numeric ranks\fR:
.Sp
If the two choices have different numeric ranks,
they must also have different rule alternatives.
The choice whose rule alternative has the higher numeric rank
will rank high.
.IP "\(bu" 4
\&\fBSame rule alternative\fR:
.Sp
If the two choices have the same rule alternative,
they rank as described
under \*(L"Null variant ranking\*(R".
.IP "\(bu" 4
\&\fBSame numeric rank, different rule alternatives\fR:
.Sp
Two different rule alternatives can have the same numeric rank.
If the two choices are for
rule alternatives that are different,
but that have the same numeric rank,
the relative order of the two choices is
arbitrary.
.PP
Note that, in the above, the logic is the same regardless of the \s-1DSL\s0 rule
to which the rule alternatives belong.
Different rule alternatives can,
in the case of a prioritized rule,
belong to the same \s-1DSL\s0 rule.
But two rule alternatives may also be different because they are
from two different \s-1DSL\s0 rules.
.SS "Null variant ranking"
.IX Subsection "Null variant ranking"
Some rules have a \s-1RHS\s0 which contains
\&\fBproper nullables\fR:
symbols
which may be nulled, but which are not nulling
symbols.
(Nulling symbols are symbols which are \fBalways\fR nulled.)
.PP
When a rule alternative contains proper nullables,
each instance
of that rule creates a \fBnulling variant\fR.
A \fBnulling variant\fR is
a specific pattern of
null and non-null symbols in a rule instance's \s-1RHS.\s0
In many cases, this creates an ambiguity \*(-- different
nulling variants can match the same substring in the input.
In ambiguous parsings of this kind,
some applications may want to rank nulling variants that start
with non-null symbols higher.
Other applications may want to do the opposite \*(--
to rank nulling variants that start
with null symbols higher.
.PP
The
\&\f(CW\*(C`null\-ranking\*(C'\fR adverb
for \s-1RHS\s0 alternatives
specifies which nulling variants are ranked high or low.
If the \f(CW\*(C`null\-ranking\*(C'\fR is "\f(CW\*(C`low\*(C'\fR",
then the closer a nulling variant
places its \fBvisible\fR (non-null) symbols to the start of the rule instance,
the higher it ranks.
A null ranking of \f(CW\*(C`low\*(C'\fR is the default.
If the \f(CW\*(C`null\-ranking\*(C'\fR is "\f(CW\*(C`high\*(C'\fR",
then the closer a nulling variant
places its \fBnull\fR symbols to the start of the rule instance,
the higher it ranks.
In ranking nulling variants with more than one proper nullable,
major-to-minor is left-to-right.
.SS "A general approach to sorting parses"
.IX Subsection "A general approach to sorting parses"
The most general way to sort Marpa parses is for the application
to take control.
The application can set up the Marpa semantic actions
so that the parse result of every parse tree is a
\&\f(CW\*(C`<rank, true_value>\*(C'\fR duple.
The duples can then be sorted by \f(CW\*(C`rank\*(C'\fR.
Once the results are sorted,
the \f(CW\*(C`rank\*(C'\fR element of the duple can be discarded.
(Those familiar with the Schwartzian transform
may note a resemblance.
In Perl,
duples can be implemented as references to arrays of 2 elements.)
.PP
The user needs to be careful.
In theory, ambiguity can cause an exponential explosion in the number of results.
In practice, ambiguity tends to get out of hand very easily.
Producing and sorting all the parses can take a very
long time.
.SH "Formal definitions"
.IX Header "Formal definitions"
This section is a restatement of earlier material in more formal language.
It is recorded here for those who find it helpful.
Most readers will want to ignore this section.
.PP
Call the set of parse trees, \f(CW\*(C`T\*(C'\fR.
Semantic equivalence is an equivalence relation
on \f(CW\*(C`T\*(C'\fR.
Call this relation \f(CW\*(C`~\*(C'\fR.
Call \f(CW\*(C`E\*(C'\fR, the quotient set of \f(CW\*(C`T\*(C'\fR by \f(CW\*(C`~\*(C'\fR.
In this document, the term
\&\fBarbitrary parse order\fR
is used to mean an
arbitrary choice among the relations
which are strict total orders of \f(CW\*(C`E\*(C'\fR.
.SH "Copyright and License"
.IX Header "Copyright and License"
.Vb 5
\&  Copyright 2014 Jeffrey Kegler
\&  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
\&  redistribute it and/or modify it under the terms of the GNU Lesser
\&  General Public License as published by the Free Software Foundation,
\&  either version 3 of the License, or (at your option) any later version.
\&
\&  Marpa::R2 is distributed in the hope that it will be useful,
\&  but WITHOUT ANY WARRANTY; without even the implied warranty of
\&  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
\&  Lesser General Public License for more details.
\&
\&  You should have received a copy of the GNU Lesser
\&  General Public License along with Marpa::R2.  If not, see
\&  http://www.gnu.org/licenses/.
.Ve