pretzel(1) | General Commands Manual | pretzel(1) |
NAME¶
pretzel - the universal prettyprinter generator
SYNOPSIS¶
pretzel [-qtgdh] [-o outfile] fileprefix
pretzel [-qtgdh] [-o outfile] file1 file2
DESCRIPTION¶
Pretzel is a program that generates a prettyprinter module from a formal description of how a given language should be prettyprinted. A prettyprinter is a function or program that rearranges source code to enhance its readability. Prettyprinters generated by pretzel output LaTeX source code that can be used within your own documents. Note that pretzel produces modules, not programs!
You have to provide two input files to pretzel that specify how the source code should be prettyprinted. These two files are called the formatted token file (suffix .ft) and the formatted grammar file (suffix .fg). From this input, pretzel generates two things: a valid flex(1) input file that forms the prettyprinting scanner and a valid bison(1) input file that can be used to build the prettyprinting parser (which is the actual prettyprinter).
The shell script pretzel-it facilitates using pretzel (see pretzel-it(1)).
This man page is only meant as a quick reference to pretzel usage. Consult the main documentation of pretzel if you are new to all this.
Invoking pretzel¶
Invoking pretzel can take two forms: either specify only the common prefix of the two input files, or specify both files separately on the command line. If you specify both files, the formatted token file comes first.
Examples¶
Say your input files are called foo.ft and foo.fg. Then you can say
pretzel foo
to invoke pretzel properly. If your files are called foo.ft and
bar.fg then you would have to say
pretzel foo.ft bar.fg
to do the job.
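The two invocation forms amount to a small name-resolution step. The following C++ sketch only illustrates the rule described above; it is not pretzel's actual argument handling, and the names InputFiles and resolve_inputs are hypothetical.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type: the pair of input files one pretzel run reads.
struct InputFiles {
    std::string token_file;    // formatted token file (.ft)
    std::string grammar_file;  // formatted grammar file (.fg)
};

// Map the non-option arguments to the two input files, following the two
// documented forms: one common prefix, or both files spelled out.
InputFiles resolve_inputs(const std::vector<std::string>& args) {
    if (args.size() == 1) {
        // Prefix form: "foo" stands for foo.ft and foo.fg.
        return {args[0] + ".ft", args[0] + ".fg"};
    }
    if (args.size() == 2) {
        // Explicit form: the formatted token file comes first.
        return {args[0], args[1]};
    }
    throw std::invalid_argument("expected a file prefix or two file names");
}
```

With this reading, the argument list {"foo"} resolves to foo.ft and foo.fg, while {"foo.ft", "bar.fg"} is taken as given.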
OPTIONS¶
Pretzel recognizes the following options:
- -q
- Run quietly.
- -t
- Process formatted token file only.
- -g
- Process formatted grammar file only (options -t and -g are mutually exclusive).
- -d
- Print debug information to the screen.
- -h
- Print full usage message.
- -o name
- Use name as prefix of the generated output files.
THE INPUT FILES¶
This section summarizes the format of the input files and the format command primitives that pretzel supports.
The formatted token file¶
The formatted token file contains a list of token definitions with their corresponding "prettyprinted" form. The prettyprinted form of a token will be called an attribute or a translation. The general outline of the formatted token file is
declarations
%%
token definitions
Normally, the declarations part is empty. You can put a general
description of the file here (as a C comment); redefinitions of the default
interface go here as well.
The token definitions section of the formatted token file contains a
series of token definitions of the form:
pattern token attribute
The pattern must be a valid regular expression (in terms of
flex(1)) and must be unindented. The token specifies the
symbolic name of the token for the pattern and begins at the first
non-whitespace character after the pattern. The token name must be a legal
name for an identifier in Pascal notation and must be all in upper
case. (Underlines are allowed but not at the beginning of a word.)
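The naming rule above can be pictured as a small validity check. The C++ sketch below is one possible reading, not pretzel's own code: it assumes that digits are allowed after the first character (as in Pascal identifiers) and that "not at the beginning" means the name must start with a letter. The function name is_token_name is hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical check for the token naming rule: upper-case letters only,
// with digits and underscores permitted after the first character.
bool is_token_name(const std::string& name) {
    if (name.empty()) return false;
    // The first character must be an upper-case letter, which rules out
    // a leading underscore or digit.
    if (!std::isupper(static_cast<unsigned char>(name[0]))) return false;
    for (char c : name) {
        unsigned char u = static_cast<unsigned char>(c);
        if (!(std::isupper(u) || std::isdigit(u) || c == '_')) return false;
    }
    return true;
}
```

Under these assumptions PROC_INTRO and DIGIT are accepted, while _ID and Id are rejected.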
The attribute for this token, that is, its prettyprinted form, consists
of all text between the two curly braces { and }. Attributes
can be either simple strings (surrounded by double quotes), format commands
(see below), your own C++ code (enclosed in square brackets [ and
], see below) or a combination of these joined together by an optional
+ sign. Attribute definitions can cover several lines and the starting
{ needn't stand on the same line as the token definition; however,
subsequent lines must be indented with at least one blank or one tab.
If you define strings as part of an attribute definition, you have to specify
them in a C kind of fashion, i.e. you can insert newlines and tabs with
\n and \t. But if you want to insert a backslash into a string,
you mustn't forget to put two backslashes \\ into the input file. This
is especially noteworthy if you are using TeX as typesetter.
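To see why the doubled backslash matters for TeX output, here is a minimal C++ sketch of C-style escape decoding. It is an illustration only: the function decode_attribute_string is hypothetical, and it handles just the escapes mentioned above (\n, \t, \\) plus \", which may be fewer or more than pretzel itself accepts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical decoder for the text between the double quotes of an
// attribute string, as it appears in the input file.
std::string decode_attribute_string(const std::string& raw) {
    std::string out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\\' && i + 1 < raw.size()) {
            char next = raw[++i];  // character after the backslash
            switch (next) {
                case 'n': out += '\n'; break;
                case 't': out += '\t'; break;
                default:  out += next; break;  // covers \\ and \"
            }
        } else {
            out += raw[i];
        }
    }
    return out;
}
```

Read this way, the input-file text {\\it followed by a blank decodes to the five characters {\it and a blank, which is what TeX needs to see.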
If the definition of the attribute is omitted, pretzel creates a default
attribute for this pattern. The default attribute consists of the string
containing the text matched by the corresponding pattern.
You may also refer to the matched text explicitly by using the sequence
**. Thus
"foo" BAR
"foo" BAR { ** }
"foo" BAR { "foo" }
all have the same meaning.
You can use a | sign as a token name; this signals that the current
regular expression has the same token name (and also the same attribute) as
the token specified in the following line (empty lines are ignored). An
attribute definition after a | is illegal. However, you may specify
regular expressions with neither a token name nor an attribute to give a
default rule or to eat up whitespace.
The declarations and the token definitions must be separated by a line
containing only the two characters %%.
Examples¶
The following examples are all legal token definitions:
[0-9] DIGIT
"{" OPEN { "\\{" indent force }
[a-z][a-z0-9]* ID { "{\\it " ** "}" }
"function" |
"procedure" PROC_INTRO { big_force + ** }
[\t\ \n] |
.
The formatted grammar file¶
In the formatted grammar file the user encodes the general prettyprinting grammar for the programming language. This is done by specifying a context free grammar of the language and by adding information about the creation of new attributes in every rule. Its general outline looks like this:
token declarations
%%
grammar rules
The token declarations section may be empty and the separator between the
two parts of the file %% must appear unindented on a single line by
itself.
The grammar rules section contains the collection of rules of the context
free grammar that can be accompanied by an attribute definition. A rule is
specified by stating the resulting token, a colon and then the series of
tokens which will be reduced by this rule. The rule is ended by a semicolon. A
block definition in Pascal for example might look like this:
block : BEGIN stmt_list END ;
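Tokens that are delivered by the scanner must be declared in the token
declarations section with a line of the form
%token tokenname
It is very important not to forget this.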
Examples¶
For example, here again is the possible definition of a block in Pascal, now with an example attribute definition:
block : BEGIN stmt_list END { $1 $2 force $3 } ;
stmt : block SEMI ;
stmt : block SEMI { $1 $2 } ;
These are legal rules too:
stmt_list : { force }
| stmt_list stmt SEMI { $1 $2 $3 force };
Comments and Code¶
There is a very simple way of putting comments into the formatted token and formatted grammar files: precede the comment with a double slash //, as in C++. All characters between this sign and the end of the line are ignored by pretzel.
In both files you can put additional C/C++ code before and after the definitions/grammar sections. If you want to insert code at the end of your file, you have to put a second %% on a line by itself and put the code after it. C/C++ code before the definitions/rules section has to be enclosed in a %{, %} pair. Inserting extra code is useful if you want to access it from within the attribute definitions.
Code within attribute definitions¶
From version 2.0 onwards pretzel allows you to insert C++ code into attribute definitions. This is how pretzel expects you to write code inside your pretzel input files: code fragments are enclosed in square brackets [ and ]. Any square brackets that appear within the C++ code must be escaped with a backslash.
There can be blocks of code before and after the attribute definition, which are called starting code and ending code. Only one starting and one ending code block are allowed. Both are optional, but if you want to specify either of them, you need an attribute definition. Starting code is executed before the attribute of the new token is built; ending code is executed after building the attribute and before returning to the calling function (in the scanner).
Code parts within attribute definitions must return a pointer to an Attribute class object (see file attr/attr.nw in the pretzel distribution for details). Within the formatted token file, the matched text is visible to you in the form of a char* yytext variable. The symbolic names of the tokens are available under the same names that pretzel gives them.
Starting code, code within attribute definitions and ending code are all optional. But at any place where they are allowed, only one bracketed code fragment may be placed.
Here's an example from the formatted grammar file:
id : ID { [lookup($1) ? create("{\\bf ")
:
create("{\\it ")] $1 "}" };
This example shows how to format an identifier depending on whether it is in a
lookup table or not. Identifiers could be installed in the table for example
like this:
typedef : TYPEDEF_LIKE INT_LIKE ID
[ install($3); ]
{ $1 $2 "{\\bf " $3 "}" };
More examples can be found in the Pretzelbook. Common routines to escape
identifiers, to build and manage lookup tables, to convert to and from
Attribute* or to output debug information can be found in the files
belonging to the C prettyprinter in the directory languages/cee of the
pretzel distribution.
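The helpers used in the two rules above could be built around a simple set of names. The following C++ sketch is an assumption-laden illustration, not the code from languages/cee: it works on std::string instead of Attribute*, and install, lookup and format_id are written here from scratch as hypothetical stand-ins.

```cpp
#include <set>
#include <string>

// Hypothetical lookup table holding the identifiers declared as type names.
static std::set<std::string> type_names;

// Remember an identifier introduced by a typedef-like declaration.
void install(const std::string& id) { type_names.insert(id); }

// Report whether an identifier was installed earlier.
bool lookup(const std::string& id) { return type_names.count(id) != 0; }

// Choose the TeX font prefix the way the id rule does: bold for known
// type names, italics for everything else.
std::string format_id(const std::string& id) {
    return std::string(lookup(id) ? "{\\bf " : "{\\it ") + id + "}";
}
```

An identifier is therefore set in italics until a typedef-like rule installs it, and in bold afterwards. A real version would also have to escape characters such as underscores that are special to TeX.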
The set of format commands¶
Here's a list of the format commands supported by pretzel and their meaning:
- null
- empty command.
- indent
- indents the next line a little more.
- outdent
- takes back the last indentation (de-indent).
- force
- forces a line break.
- break_space
- denotes a possible space for a line break.
- opt1...opt9
- denotes an optional line break with the continuation line indented a little with respect to the normal starting position.
- backup
- denotes a small backspace.
- big_force
- forces a line break and inserts a little extra space.
- no_indent
- causes the current line to be output flushleft.
- cancel
- obliterates any break_space, opt, force or big_force command that immediately precedes or follows it and also cancels any backup command that follows it.
For a complete reference on how to write pretzel input, look into the Pretzelbook, which is included in the pretzel distribution.
Format command preprocessing¶
The format commands are preprocessed according to the following two rules:
- 1. A sequence of consecutive break_space, force, and/or big_force commands is replaced by a single command (the maximum of the given ones).
- 2. The cancel command cancels any break_space, opt, force or big_force command that immediately precedes or follows it and also cancels any backup command that follows it.
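The two rules can be sketched as two passes over a command sequence. The C++ code below is only an illustration and makes several assumptions that the text does not settle: the ranking break_space < force < big_force for "the maximum", rule 1 being applied before rule 2, a cancel removing a whole adjacent run of cancellable commands, and the cancel itself disappearing afterwards. The names Cmd and preprocess are hypothetical; Text stands for any ordinary output.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical command codes; the last three are ordered by assumed strength.
enum class Cmd { Text, Opt, Backup, Cancel, BreakSpace, Force, BigForce };

// break_space, force and big_force take part in rule 1.
static bool is_break(Cmd c) { return c >= Cmd::BreakSpace; }

// Commands that a neighbouring cancel removes on either side.
static bool cancellable(Cmd c) { return c == Cmd::Opt || is_break(c); }

std::vector<Cmd> preprocess(const std::vector<Cmd>& in) {
    // Rule 1: collapse each run of break commands into its maximum.
    std::vector<Cmd> merged;
    for (Cmd c : in) {
        if (is_break(c) && !merged.empty() && is_break(merged.back()))
            merged.back() = std::max(merged.back(), c);
        else
            merged.push_back(c);
    }
    // Rule 2: every cancel removes the cancellable commands around it
    // and any backup that follows it.
    std::vector<Cmd> out;
    bool after_cancel = false;
    for (Cmd c : merged) {
        if (c == Cmd::Cancel) {
            while (!out.empty() && cancellable(out.back())) out.pop_back();
            after_cancel = true;
        } else if (after_cancel && (cancellable(c) || c == Cmd::Backup)) {
            // Dropped: this command directly follows a cancel.
        } else {
            out.push_back(c);
            after_cancel = false;
        }
    }
    return out;
}
```

For instance, the sequence force, break_space, big_force collapses to a single big_force, and a force followed by cancel and backup vanishes entirely, while a backup that precedes a cancel is kept.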
THE OUTPUT FILES¶
If pretzel runs without error, you will obtain the definition of a C++ prettyprinter class in the form of two files. The first file is a valid bison(1) file from which the actual prettyprinting parser class can be obtained. The second file (generated from the formatted token file) can be processed with the flex(1) scanner generator to form the prettyprinting scanner class used by the parser.
The bison file¶
The generated bison file contains the definitions for a prettyprinting parser class that is a subclass of the following abstract base class (contained in the file Pparse.h within the pretzel include directory):
#include<iostream>
#include"attr.h"
#include"output.h"
class Pparse {
public:
  Pparse() {};
  ~Pparse() {};
  virtual int prettyprint(istream*, ostream*) = 0;
  virtual int prettyprint(istream*, Output*) = 0;
};
The prettyprinter generated by pretzel will be a subclass of the following form:
#include "Pparse.h" // include abstract base class
class PPARSE_NAME : public Pparse {
public:
  PPARSE_NAME(); ~PPARSE_NAME();
  int prettyprint(istream*, ostream*);
  int prettyprint(istream*, Output*);
  void debug_on(); void debug_off();
};
The name of the class may be changed by redefining the preprocessor macro
PPARSE_NAME within the formatted grammar file. The actual
prettyprinting function is prettyprint, which reads text from an input
stream (i.e. a C++ istream object) and outputs the results to an output
stream (i.e. a C++ ostream object, see ios(3C++)). The second
overloaded version of prettyprint takes an Output object (see
the file output/output.nw and the Pretzelbook in the pretzel
distribution for details) and uses this to output the prettyprinted code. The
debug functions can be used to turn debugging output to cerr on
and off.
The flex file¶
The prettyprinting parser class relies on the service of a prettyprinting scanner that can be produced using the second pretzel file. It contains a complete definition of a scanner subclass of this abstract base class (see file Pscan.h in the pretzel include directory):
#include<iostream>
#include"attr.h"
class Pscan {
public:
  Pscan(istream*) {}; ~Pscan() {};
  virtual int scan(Attribute**) = 0;
};
The scanner must be initialized with a C++ istream pointer from which it
takes its input. A call to the actual scan function returns an integer
(the token code of the token just scanned or 0 on end-of-file) plus a
call-by-reference attribute containing the contents of the token (see file
attr/attr.nw from the pretzel distribution).
The produced prettyprinting scanner class is a subclass and looks like this:
#include "Pscan.h" // include abstract base class
class PSCAN_NAME : public Pscan {
public:
  PSCAN_NAME(istream*);
  ~PSCAN_NAME();
  int scan(Attribute**);
};
The name of the scanner can be changed within the formatted token file by
redefining the PSCAN_NAME macro within the declarations section. The
scanner class expects to find token definitions common to the scanner and the
parser in a file called ptokdefs.h and will try to include this file.
You either have to provide this file yourself or use the -d option of
Bison to create one that fits a formatted grammar (see bison(1)). You
may change the name of the file that the scanner expects by redefining the
PTOKDEFS_NAME macro in the declarations section of the formatted token
file. Common header files for the abstract base classes and the default
subclasses reside in the pretzel include directory.
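The scan protocol described above (a token code per call, 0 on end-of-file, the attribute handed back through a pointer argument) can be exercised with a toy scanner. The C++ sketch below does not use the pretzel headers: Attribute and Pscan are simplified stand-ins written for this example, and WordScanner, scan_all and scan_string are hypothetical names. It also assumes that the caller owns the returned attribute, which the real library may handle differently.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Stand-in for pretzel's Attribute class (the real one lives in attr.h).
struct Attribute {
    std::string text;
};

// Abstract scanner interface, shaped like the Pscan class quoted above.
class Pscan {
public:
    explicit Pscan(std::istream*) {}
    virtual ~Pscan() {}
    virtual int scan(Attribute**) = 0;
};

// Toy scanner: every whitespace-separated word is one token with code 1.
class WordScanner : public Pscan {
public:
    explicit WordScanner(std::istream* in) : Pscan(in), in_(in) {}
    int scan(Attribute** attr) override {
        std::string word;
        if (!(*in_ >> word)) return 0;  // 0 signals end-of-file
        *attr = new Attribute{word};    // token contents, returned by reference
        return 1;
    }
private:
    std::istream* in_;
};

// Drive a scanner until it reports end-of-file and collect the token texts.
std::string scan_all(Pscan& scanner) {
    std::string result;
    Attribute* attr = nullptr;
    while (scanner.scan(&attr) != 0) {
        result += "[" + attr->text + "]";
        delete attr;  // assumption of this sketch: the caller owns the attribute
    }
    return result;
}

// Convenience wrapper: scan the words of a string.
std::string scan_string(const std::string& source) {
    std::istringstream in(source);
    WordScanner scanner(&in);
    return scan_all(scanner);
}
```

A generated scanner would return the token codes from ptokdefs.h instead of the constant 1, but the calling loop in scan_all has the same shape.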
FILES¶
- /usr/lib/pretzel/libpretzel.a
- pretzel runtime library.
- /usr/include/pretzel
- directory for runtime library include files (pretzel include directory).
- /usr/local/lib/pretzel/include/Pscan.h
- /usr/include/pretzel/Pparse.h
- headers for abstract base classes.
- /usr/include/pretzel/Ppscan.h
- /usr/include/pretzel/Ppparse.h
- default headers for generated subclasses.
- /usr/lib/texmf/tex/latex/pretzel/pretzel-latex.sty
- LaTeX style to typeset pretzel output.
SEE ALSO¶
pretzel-it(1), flex(1), bison(1)
The PretzelBook, second edition - ultimate source of information, included in the pretzel distribution.
The Pretzel homepage on the WWW at http://www.iti.informatik.tu-darmstadt.de/~gaertner/pretzel
AUTHOR¶
Felix Gaertner, email: fcg@acm.org
June 11, 1998 |