.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.28)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
.    if \nF \{
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "Marpa::R2::Advanced::Bibliography 3pm"
.TH Marpa::R2::Advanced::Bibliography 3pm "2015-03-06" "perl v5.20.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Marpa::R2::Advanced::Bibliography \- A Marpa bibliography
.SH "History of the Marpa algorithm"
.IX Header "History of the Marpa algorithm"
.IP "1970" 4
.IX Item "1970"
Jay Earley invents the algorithm that now bears his
name.
.IP "1991" 4
.IX Item "1991"
Joop Leo describes a way to modify Earley's algorithm so that it
runs in O(n) time for all LR-regular
grammars.
LR-regular is a vast class of grammars, including all the
\&\s-1LR\s0(k) grammars, all grammars parseable with recursive descent,
and regular expressions.
LR-regular can safely be thought of as including all grammars
in practical use today, and then some.
.IP "2002" 4
.IX Item "2002"
Aycock and Horspool describe a way to do \s-1\fILR\s0\fR\|(0)
precomputation
for Earley's algorithm.
Their method makes Earley's faster in most
practical situations, but not all.
In particular, right-recursion remains quadratic in
the Aycock and Horspool algorithm.
Worst case is no better than Earley's.
Leo is unaware of Aycock and Horspool's work
and Aycock and Horspool seem unaware of Leo.
.IP "2010" 4
.IX Item "2010"
Marpa combines the Leo and Aycock-Horspool
algorithms,
in the process making significant
changes to both of them.
The result preserves the
best features of both.
Marpa also tackles the many remaining
implementation issues.
.SH "Bibliography"
.IX Header "Bibliography"
.SS "Aho and Ullman 1972"
.IX Subsection "Aho and Ullman 1972"
\&\fIThe Theory of Parsing, Translation and Compiling, Volume I: Parsing\fR
by Alfred Aho and Jeffrey Ullman
(Prentice-Hall: Englewood Cliffs, New Jersey, 1972).
I think this was the
standard source for Earley's algorithm for decades.
It certainly was \fBmy\fR standard source.
The account of Earley's algorithm is on pages 320\-330.
.SS "Aycock and Horspool 2002"
.IX Subsection "Aycock and Horspool 2002"
Marpa is based on ideas from
John Aycock and R.
Nigel Horspool's \*(L"Practical Earley Parsing\*(R", \fIThe Computer Journal\fR,
Vol. 45, No. 6, 2002, pp. 620\-630.
The idea of doing \s-1\fILR\s0\fR\|(0)
precomputation for
Earley's general parsing algorithm,
and Marpa's approach to handling nullable symbols and rules,
both came from this article.
.PP
The Aycock and Horspool paper
summarizes Earley's very nicely and is
available on the web: <http://www.cs.uvic.ca/~nigelh/Publications/PracticalEarleyParsing.pdf>.
Unlike \*(L"Earley 1970\*(R",
Aycock and Horspool 2002 is \fBnot\fR easy reading.
I have been following
this particular topic on and off for years
and nonetheless found this paper very heavy going.
.SS "Dominus 2005"
.IX Subsection "Dominus 2005"
Although my approach to parsing is not influenced
by Mark Jason Dominus's \fIHigher Order Perl\fR,
Mark's treatment of parsing is an excellent introduction to parsing,
especially in a Perl context.
His focus on just about every other technique \fBexcept\fR
general \s-1BNF\s0 parsing is pretty much standard, and
will help a beginner understand how unconventional
Marpa's approach is.
.PP
Mark's book opened my eyes to many new ideas.
Both Mark's Perl and his English are examples of good writing,
and the book is dense with insights.
Mark's discussion on memoization in Chapter 3 is the
best I've seen.
I wish I'd bought his book earlier in my coding.
.PP
Mark's book is available on-line.
You can download chapter-by-chapter or the whole thing at once,
and you can take your pick of his original sources or \s-1PDF,\s0
at <http://hop.perl.plover.com/book/>.
A \s-1PDF\s0 of the parsing chapter is at <http://hop.perl.plover.com/book/pdf/08Parsing.pdf>.
.SS "Earley 1970"
.IX Subsection "Earley 1970"
Of
Jay Earley's papers on his general parsing algorithm,
the most readily available
is \*(L"An efficient context-free parsing algorithm\*(R",
\&\fICommunications of the Association for Computing Machinery\fR,
13:2:94\-102, 1970.
.PP
Ordinarily, I'd not bother pointing out 35\-year old nits
in a brilliant and historically important article.
But more than a few people treat this article as not just the first word in Earley
parsing, but the last as well.
Many implementations of Earley's algorithm come, directly and
unaltered, from his paper.
These implementers and their users need to be aware of two issues.
.PP
First, the recognition engine itself, as described, has a serious bug.
There's an easy fix, but one that greatly slows down an algorithm
whose main problem, in its original form, was speed.
This issue is well laid out by
Aycock and Horspool
in their article.
.PP
Second,
according to Tomita there is a mistake in the parse
tree representation.
See page 153 of \*(L"Grune and Jacobs 1990\*(R",
page 210 of \*(L"Grune and Jacobs 2008\*(R",
and the bibliography entry for Earley 1970 in \*(L"Grune and Jacobs 2008\*(R".
In the printed edition of the 2008 bibliography, the entry is on page 578,
and on the web
(<ftp://ftp.cs.vu.nl/pub/dick/PTAPG_2nd_Edition/CompleteList.pdf>),
it's on pp. 583\-584.
My methods for producing parse results
from Earley sets do not come from Earley 1970,
so I am taking Tomita's word on this one.
.SS "Grune and Jacobs 1990"
.IX Subsection "Grune and Jacobs 1990"
\&\fIParsing Techniques: A Practical Guide\fR,
by Dick Grune and
Ceriel Jacobs,
(Ellis Horwood Limited: Chichester, West Sussex, England,
1990).
This book is available on the Web: <http://www.cs.vu.nl/~dick/PTAPG.html>
.SS "Grune and Jacobs 2008"
.IX Subsection "Grune and Jacobs 2008"
\&\fIParsing Techniques: A Practical Guide\fR,
by Dick Grune and
Ceriel Jacobs,
2nd Edition.
(Springer: New York \s-1NY, 2008\s0).
This is the most authoritative and comprehensive introduction
to parsing I know of.
In theory it requires no mathematics, only a programming background,
but even so it is moderately difficult reading.
.PP
This is \*(L"Grune and Jacobs 1990\*(R" updated.
The bibliography for this book is available in enlarged form
on the web: <ftp://ftp.cs.vu.nl/pub/dick/PTAPG_2nd_Edition/CompleteList.pdf>.
.SS "Kegler 2013"
.IX Subsection "Kegler 2013"
My writeup of the theory behind Marpa,
with proofs of correctness and of my complexity claims
is available online:
<https://docs.google.com/file/d/0B9_mR_M2zOc4Ni1zSW5IYzk3TGc/edit?usp=sharing>.
.SS "Leo 1991"
.IX Subsection "Leo 1991"
Marpa's handling of right-recursion uses the ideas in
Joop M.I.M. Leo's
\&\*(L"A General Context-Free Parsing Algorithm Running in Linear
Time on Every \s-1LR\s0(k) Grammar Without Using Lookahead\*(R",
\&\fITheoretical Computer Science\fR,
Vol. 82, No. 1, 1991, pp 165\-176.
This is a difficult paper.
Unfortunately, there is no copy of it on-line.
.SS "Wikipedia"
.IX Subsection "Wikipedia"
Wikipedia's article on Backus-Naur form is
<http://en.wikipedia.org/wiki/Backus\-Naur_form>.
It's a great place to start if you don't know the
basics of grammars and parsing.
As Wikipedia points out,
\&\s-1BNF\s0 might better be called Panini-Backus Form.
The grammarian Panini
gave a precise description of Sanskrit
more than 23 centuries earlier in India
using a similar notation.
.SH "Copyright and License"
.IX Header "Copyright and License"
.Vb 5
\&  Copyright 2014 Jeffrey Kegler
\&  This file is part of Marpa::R2.  Marpa::R2 is free software: you can
\&  redistribute it and/or modify it under the terms of the GNU Lesser
\&  General Public License as published by the Free Software Foundation,
\&  either version 3 of the License, or (at your option) any later version.
\&
\&  Marpa::R2 is distributed in the hope that it will be useful,
\&  but WITHOUT ANY WARRANTY; without even the implied warranty of
\&  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
\&  Lesser General Public License for more details.
\&
\&  You should have received a copy of the GNU Lesser
\&  General Public License along with Marpa::R2.  If not, see
\&  http://www.gnu.org/licenses/.
.Ve
.SH "POD ERRORS"
.IX Header "POD ERRORS"
Hey! \fBThe above document had some coding errors, which are explained below:\fR
.IP "Around line 29:" 4
.IX Item "Around line 29:"
Expected text after =item, not a number
.IP "Around line 40:" 4
.IX Item "Around line 40:"
Expected text after =item, not a number
.IP "Around line 53:" 4
.IX Item "Around line 53:"
Expected text after =item, not a number