.TH "Genlex" 3o source: 2016-12-21 OCamldoc "OCaml library"
.SH NAME
Genlex \- A generic lexical analyzer.
.SH Module
Module   Genlex
.SH Documentation
.sp
Module
.BI "Genlex"
 :
.B sig end
.sp
A generic lexical analyzer\&.
.sp
This module implements a simple \&'standard\&' lexical analyzer, presented as a function from character streams to token streams\&. It implements roughly the lexical conventions of OCaml, but is parameterized by the set of keywords of your language\&.
.sp
Example: a lexer suitable for a desk calculator is obtained by
.B let lexer = make_lexer ["+";"\-";"*";"/";"let";"="; "("; ")"]
.sp
The associated parser would be a function from
.B token stream
to, for instance,
.B int
, and would have rules such as:
.sp
.B
.B let rec parse_expr = parser
.B | [< n1 = parse_atom; n2 = parse_remainder n1 >] \-> n2
.B and parse_atom = parser
.B | [< \&'Int n >] \-> n
.B | [< \&'Kwd "("; n = parse_expr; \&'Kwd ")" >] \-> n
.B and parse_remainder n1 = parser
.B | [< \&'Kwd "+"; n2 = parse_expr >] \-> n1+n2
.B | [< >] \-> n1
.B
.sp
Note that the
.B parser
keyword and the associated notation for streams are available only through camlp4 extensions\&. Sources that use them must therefore be preprocessed, e\&.g\&. by using the
.B "\-pp"
command\-line switch of the compilers\&.
.sp
.sp
.sp
.I type token
=
 | Kwd
.B of
.B string
 | Ident
.B of
.B string
 | Int
.B of
.B int
 | Float
.B of
.B float
 | String
.B of
.B string
 | Char
.B of
.B char
.sp
The type of tokens\&. The lexical classes are:
.B Int
and
.B Float
for integer and floating\-point numbers;
.B String
for string literals, enclosed in double quotes;
.B Char
for character literals, enclosed in single quotes;
.B Ident
for identifiers (either sequences of letters, digits, underscores and quotes, or sequences of \&'operator characters\&' such as
.B +
,
.B *
, etc\&.); and
.B Kwd
for keywords (either identifiers or single \&'special characters\&' such as
.B (
,
.B }
, etc\&.)\&.
.sp
.I val make_lexer
:
.B string list \-> char Stream\&.t \-> token Stream\&.t
.sp
Construct the lexer function\&. The first argument is the list of keywords\&. An identifier
.B s
is returned as
.B Kwd s
if
.B s
belongs to this list, and as
.B Ident s
otherwise\&. A special character
.B s
is returned as
.B Kwd s
if
.B s
belongs to this list, and causes a lexical error (exception
.B Stream\&.Error
with the offending lexeme as its parameter) otherwise\&. Blanks and newlines are skipped\&. Comments delimited by
.B (*
and
.B *)
are skipped as well, and can be nested\&. A
.B Stream\&.Failure
exception is raised if the end of the stream is reached unexpectedly\&.
.sp
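The token stream can also be consumed without the camlp4 stream syntax\&. The following is a minimal sketch, not part of this module, of the desk\-calculator rules shown earlier, written with
.B Stream\&.peek
and
.B Stream\&.junk
directly\&. It assumes an OCaml release whose standard library still provides
.B Genlex
and
.B Stream
(the 4\&.x series)\&. The names
.B parse_expr
,
.B parse_atom
and
.B eval_string
are hypothetical helpers introduced for illustration, and the keyword list is reduced to what the sketch needs\&.
.sp
.nf
(* Sketch only: assumes OCaml 4.x, where Genlex and Stream are in the
   standard library. All function names below are illustrative. *)
open Genlex

(* Only the keywords this sketch handles. *)
let lexer = make_lexer ["+"; "("; ")"]

(* expr ::= atom [ "+" expr ] *)
let rec parse_expr s =
  let n1 = parse_atom s in
  match Stream.peek s with
  | Some (Kwd "+") \-> Stream.junk s; n1 + parse_expr s
  | _ \-> n1

(* atom ::= Int | "(" expr ")" *)
and parse_atom s =
  match Stream.peek s with
  | Some (Int n) \-> Stream.junk s; n
  | Some (Kwd "(") \->
    Stream.junk s;
    let n = parse_expr s in
    (match Stream.peek s with
     | Some (Kwd ")") \-> Stream.junk s; n
     | _ \-> failwith "expected )")
  | _ \-> failwith "expected an integer or ("

(* Lex a string and evaluate the resulting token stream. *)
let eval_string text = parse_expr (lexer (Stream.of_string text))
.fi
.sp
Under these assumptions, applying
.B eval_string
to the text
.B 1 + (2 + 3)
evaluates to
.B 6
\&. A special character that is not in the keyword list would still raise
.B Stream\&.Error
from the lexer, as described above\&.
.sp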