'\" '\" Generated from file 'me_cpucore\&.man' by tcllib/doctools with format 'nroff' '\" Copyright (c) 2005-2006 Andreas Kupries '\" .TH "grammar::me::cpu::core" 3tcl 0\&.2 tcllib "Grammar operations and usage" .\" The -*- nroff -*- definitions below are for supplemental macros used .\" in Tcl/Tk manual entries. .\" .\" .AP type name in/out ?indent? .\" Start paragraph describing an argument to a library procedure. .\" type is type of argument (int, etc.), in/out is either "in", "out", .\" or "in/out" to describe whether procedure reads or modifies arg, .\" and indent is equivalent to second arg of .IP (shouldn't ever be .\" needed; use .AS below instead) .\" .\" .AS ?type? ?name? .\" Give maximum sizes of arguments for setting tab stops. Type and .\" name are examples of largest possible arguments that will be passed .\" to .AP later. If args are omitted, default tab stops are used. .\" .\" .BS .\" Start box enclosure. From here until next .BE, everything will be .\" enclosed in one large box. .\" .\" .BE .\" End of box enclosure. .\" .\" .CS .\" Begin code excerpt. .\" .\" .CE .\" End code excerpt. .\" .\" .VS ?version? ?br? .\" Begin vertical sidebar, for use in marking newly-changed parts .\" of man pages. The first argument is ignored and used for recording .\" the version when the .VS was added, so that the sidebars can be .\" found and removed when they reach a certain age. If another argument .\" is present, then a line break is forced before starting the sidebar. .\" .\" .VE .\" End of vertical sidebar. .\" .\" .DS .\" Begin an indented unfilled display. .\" .\" .DE .\" End of indented unfilled display. .\" .\" .SO ?manpage? .\" Start of list of standard options for a Tk widget. The manpage .\" argument defines where to look up the standard options; if .\" omitted, defaults to "options". The options follow on successive .\" lines, in three columns separated by tabs. .\" .\" .SE .\" End of list of standard options for a Tk widget. .\" .\" .OP cmdName dbName dbClass .\" Start of description of a specific option. cmdName gives the .\" option's name as specified in the class command, dbName gives .\" the option's name in the option database, and dbClass gives .\" the option's class in the option database. .\" .\" .UL arg1 arg2 .\" Print arg1 underlined, then print arg2 normally. .\" .\" .QW arg1 ?arg2? .\" Print arg1 in quotes, then arg2 normally (for trailing punctuation). .\" .\" .PQ arg1 ?arg2? .\" Print an open parenthesis, arg1 in quotes, then arg2 normally .\" (for trailing punctuation) and then a closing parenthesis. .\" .\" # Set up traps and other miscellaneous stuff for Tcl/Tk man pages. .if t .wh -1.3i ^B .nr ^l \n(.l .ad b .\" # Start an argument description .de AP .ie !"\\$4"" .TP \\$4 .el \{\ . ie !"\\$2"" .TP \\n()Cu . el .TP 15 .\} .ta \\n()Au \\n()Bu .ie !"\\$3"" \{\ \&\\$1 \\fI\\$2\\fP (\\$3) .\".b .\} .el \{\ .br .ie !"\\$2"" \{\ \&\\$1 \\fI\\$2\\fP .\} .el \{\ \&\\fI\\$1\\fP .\} .\} .. .\" # define tabbing values for .AP .de AS .nr )A 10n .if !"\\$1"" .nr )A \\w'\\$1'u+3n .nr )B \\n()Au+15n .\" .if !"\\$2"" .nr )B \\w'\\$2'u+\\n()Au+3n .nr )C \\n()Bu+\\w'(in/out)'u+2n .. .AS Tcl_Interp Tcl_CreateInterp in/out .\" # BS - start boxed text .\" # ^y = starting y location .\" # ^b = 1 .de BS .br .mk ^y .nr ^b 1u .if n .nf .if n .ti 0 .if n \l'\\n(.lu\(ul' .if n .fi .. .\" # BE - end boxed text (draw box now) .de BE .nf .ti 0 .mk ^t .ie n \l'\\n(^lu\(ul' .el \{\ .\" Draw four-sided box normally, but don't draw top of .\" box if the box started on an earlier page. .ie !\\n(^b-1 \{\ \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .el \}\ \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\l'|0u-1.5n\(ul' .\} .\} .fi .br .nr ^b 0 .. .\" # VS - start vertical sidebar .\" # ^Y = starting y location .\" # ^v = 1 (for troff; for nroff this doesn't matter) .de VS .if !"\\$2"" .br .mk ^Y .ie n 'mc \s12\(br\s0 .el .nr ^v 1u .. .\" # VE - end of vertical sidebar .de VE .ie n 'mc .el \{\ .ev 2 .nf .ti 0 .mk ^t \h'|\\n(^lu+3n'\L'|\\n(^Yu-1v\(bv'\v'\\n(^tu+1v-\\n(^Yu'\h'-|\\n(^lu+3n' .sp -1 .fi .ev .\} .nr ^v 0 .. .\" # Special macro to handle page bottom: finish off current .\" # box/sidebar if in box/sidebar mode, then invoked standard .\" # page bottom macro. .de ^B .ev 2 'ti 0 'nf .mk ^t .if \\n(^b \{\ .\" Draw three-sided box if this is the box's first page, .\" draw two sides but no top otherwise. .ie !\\n(^b-1 \h'-1.5n'\L'|\\n(^yu-1v'\l'\\n(^lu+3n\(ul'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .el \h'-1.5n'\L'|\\n(^yu-1v'\h'\\n(^lu+3n'\L'\\n(^tu+1v-\\n(^yu'\h'|0u'\c .\} .if \\n(^v \{\ .nr ^x \\n(^tu+1v-\\n(^Yu \kx\h'-\\nxu'\h'|\\n(^lu+3n'\ky\L'-\\n(^xu'\v'\\n(^xu'\h'|0u'\c .\} .bp 'fi .ev .if \\n(^b \{\ .mk ^y .nr ^b 2 .\} .if \\n(^v \{\ .mk ^Y .\} .. .\" # DS - begin display .de DS .RS .nf .sp .. .\" # DE - end display .de DE .fi .RE .sp .. .\" # SO - start of list of standard options .de SO 'ie '\\$1'' .ds So \\fBoptions\\fR 'el .ds So \\fB\\$1\\fR .SH "STANDARD OPTIONS" .LP .nf .ta 5.5c 11c .ft B .. .\" # SE - end of list of standard options .de SE .fi .ft R .LP See the \\*(So manual entry for details on the standard options. .. .\" # OP - start of full description for a single option .de OP .LP .nf .ta 4c Command-Line Name: \\fB\\$1\\fR Database Name: \\fB\\$2\\fR Database Class: \\fB\\$3\\fR .fi .IP .. .\" # CS - begin code excerpt .de CS .RS .nf .ta .25i .5i .75i 1i .. .\" # CE - end code excerpt .de CE .fi .RE .. .\" # UL - underline word .de UL \\$1\l'|0\(ul'\\$2 .. .\" # QW - apply quotation marks to word .de QW .ie '\\*(lq'"' ``\\$1''\\$2 .\"" fix emacs highlighting .el \\*(lq\\$1\\*(rq\\$2 .. .\" # PQ - apply parens and quotation marks to word .de PQ .ie '\\*(lq'"' (``\\$1''\\$2)\\$3 .\"" fix emacs highlighting .el (\\*(lq\\$1\\*(rq\\$2)\\$3 .. .\" # QR - quoted range .de QR .ie '\\*(lq'"' ``\\$1''\\-``\\$2''\\$3 .\"" fix emacs highlighting .el \\*(lq\\$1\\*(rq\\-\\*(lq\\$2\\*(rq\\$3 .. .\" # MT - "empty" string .de MT .QW "" .. .BS .SH NAME grammar::me::cpu::core \- ME virtual machine state manipulation .SH SYNOPSIS package require \fBTcl 8\&.4\fR .sp package require \fBgrammar::me::cpu::core ?0\&.2?\fR .sp \fB::grammar::me::cpu::core\fR \fBdisasm\fR \fIasm\fR .sp \fB::grammar::me::cpu::core\fR \fBasm\fR \fIasm\fR .sp \fB::grammar::me::cpu::core\fR \fBnew\fR \fIasm\fR .sp \fB::grammar::me::cpu::core\fR \fBlc\fR \fIstate\fR \fIlocation\fR .sp \fB::grammar::me::cpu::core\fR \fBtok\fR \fIstate\fR ?\fIfrom\fR ?\fIto\fR?? .sp \fB::grammar::me::cpu::core\fR \fBpc\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBiseof\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBat\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBcc\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBsv\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBok\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBerror\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBlstk\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBastk\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBmstk\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBestk\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBrstk\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBnc\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBast\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBhalted\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBcode\fR \fIstate\fR .sp \fB::grammar::me::cpu::core\fR \fBeof\fR \fIstatevar\fR .sp \fB::grammar::me::cpu::core\fR \fBput\fR \fIstatevar\fR \fItok\fR \fIlex\fR \fIline\fR \fIcol\fR .sp \fB::grammar::me::cpu::core\fR \fBrun\fR \fIstatevar\fR ?\fIn\fR? .sp .BE .SH DESCRIPTION .PP This package provides an implementation of the ME virtual machine\&. Please go and read the document \fBgrammar::me_intro\fR first if you do not know what a ME virtual machine is\&. .PP This implementation represents each ME virtual machine as a Tcl value and provides commands to manipulate and query such values to show the effects of executing instructions, adding tokens, retrieving state, etc\&. .PP The values fully follow the paradigm of Tcl that every value is a string and while also allowing C implementations for a proper Tcl_ObjType to keep all the important data in native data structures\&. Because of the latter it is recommended to access the state values \fIonly\fR through the commands of this package to ensure that internal representation is not shimmered away\&. .PP The actual structure used by all state values is described in section \fBCPU STATE\fR\&. .SH API The package directly provides only a single command, and all the functionality is made available through its methods\&. .TP \fB::grammar::me::cpu::core\fR \fBdisasm\fR \fIasm\fR This method returns a list containing a disassembly of the match instructions in \fIasm\fR\&. The format of \fIasm\fR is specified in the section \fBMATCH PROGRAM REPRESENTATION\fR\&. .sp Each element of the result contains instruction label, instruction name, and the instruction arguments, in this order\&. The label can be the empty string\&. Jump destinations are shown as labels, strings and tokens unencoded\&. Token names are prefixed with their numeric id, if, and only if a tokmap is defined\&. The two components are separated by a colon\&. .TP \fB::grammar::me::cpu::core\fR \fBasm\fR \fIasm\fR This method returns code in the format as specified in section \fBMATCH PROGRAM REPRESENTATION\fR generated from ME assembly code \fIasm\fR, which is in the format as returned by the method \fBdisasm\fR\&. .TP \fB::grammar::me::cpu::core\fR \fBnew\fR \fIasm\fR This method creates state value for a ME virtual machine in its initial state and returns it as its result\&. .sp The argument \fImatchcode\fR contains a Tcl representation of the match instructions the machine has to execute while parsing the input stream\&. Its format is specified in the section \fBMATCH PROGRAM REPRESENTATION\fR\&. .sp The \fItokmap\fR argument taken by the implementation provided by the package \fBgrammar::me::tcl\fR is here hidden inside of the match instructions and therefore not needed\&. .TP \fB::grammar::me::cpu::core\fR \fBlc\fR \fIstate\fR \fIlocation\fR This method takes the state value of a ME virtual machine and uses it to convert a location in the input stream (as offset) into a line number and column index\&. The result of the method is a 2-element list containing the two pieces in the order mentioned in the previous sentence\&. .sp \fINote\fR that the method cannot convert locations which the machine has not yet read from the input stream\&. In other words, if the machine has read 7 characters so far it is possible to convert the offsets \fB0\fR to \fB6\fR, but nothing beyond that\&. This also shows that it is not possible to convert offsets which refer to locations before the beginning of the stream\&. .sp This utility allows higher levels to convert the location offsets found in the error status and the AST into more human readable data\&. .TP \fB::grammar::me::cpu::core\fR \fBtok\fR \fIstate\fR ?\fIfrom\fR ?\fIto\fR?? This method takes the state value of a ME virtual machine and returns a Tcl list containing the part of the input stream between the locations \fIfrom\fR and \fIto\fR (both inclusive)\&. If \fIto\fR is not specified it will default to the value of \fIfrom\fR\&. If \fIfrom\fR is not specified either the whole input stream is returned\&. .sp This method places the same restrictions on its location arguments as the method \fBlc\fR\&. .TP \fB::grammar::me::cpu::core\fR \fBpc\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current value of the stored program counter\&. .TP \fB::grammar::me::cpu::core\fR \fBiseof\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current value of the stored eof flag\&. .TP \fB::grammar::me::cpu::core\fR \fBat\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current location in the input stream\&. .TP \fB::grammar::me::cpu::core\fR \fBcc\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current token\&. .TP \fB::grammar::me::cpu::core\fR \fBsv\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current semantic value stored in it\&. This is an abstract syntax tree as specified in the document \fBgrammar::me_ast\fR, section \fBAST VALUES\fR\&. .TP \fB::grammar::me::cpu::core\fR \fBok\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the match status stored in it\&. .TP \fB::grammar::me::cpu::core\fR \fBerror\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current error status stored in it\&. .TP \fB::grammar::me::cpu::core\fR \fBlstk\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the location stack\&. .TP \fB::grammar::me::cpu::core\fR \fBastk\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the AST stack\&. .TP \fB::grammar::me::cpu::core\fR \fBmstk\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the AST marker stack\&. .TP \fB::grammar::me::cpu::core\fR \fBestk\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the error stack\&. .TP \fB::grammar::me::cpu::core\fR \fBrstk\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the subroutine return stack\&. .TP \fB::grammar::me::cpu::core\fR \fBnc\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the nonterminal match cache as a dictionary\&. .TP \fB::grammar::me::cpu::core\fR \fBast\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the abstract syntax tree currently at the top of the AST stack stored in it\&. This is an abstract syntax tree as specified in the document \fBgrammar::me_ast\fR, section \fBAST VALUES\fR\&. .TP \fB::grammar::me::cpu::core\fR \fBhalted\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the current halt status stored in it, i\&.e\&. if the machine has stopped or not\&. .TP \fB::grammar::me::cpu::core\fR \fBcode\fR \fIstate\fR This method takes the state value of a ME virtual machine and returns the code stored in it, i\&.e\&. the instructions executed by the machine\&. .TP \fB::grammar::me::cpu::core\fR \fBeof\fR \fIstatevar\fR This method takes the state value of a ME virtual machine as stored in the variable named by \fIstatevar\fR and modifies it so that the eof flag inside is set\&. This signals to the machine that whatever token are in the input queue are the last to be processed\&. There will be no more\&. .TP \fB::grammar::me::cpu::core\fR \fBput\fR \fIstatevar\fR \fItok\fR \fIlex\fR \fIline\fR \fIcol\fR This method takes the state value of a ME virtual machine as stored in the variable named by \fIstatevar\fR and modifies it so that the token \fItok\fR is added to the end of the input queue, with associated lexeme data \fIlex\fR and \fIline\fR/\fIcol\fRumn information\&. .sp The operation will fail with an error if the eof flag of the machine has been set through the method \fBeof\fR\&. .TP \fB::grammar::me::cpu::core\fR \fBrun\fR \fIstatevar\fR ?\fIn\fR? This method takes the state value of a ME virtual machine as stored in the variable named by \fIstatevar\fR, executes a number of instructions and stores the state resulting from their modifications back into the variable\&. .sp The execution loop will run until either .RS .IP \(bu \fIn\fR instructions have been executed, or .IP \(bu a halt instruction was executed, or .IP \(bu the input queue is empty and the code is asking for more tokens to process\&. .RE .sp If no limit \fIn\fR was set only the last two conditions are checked for\&. .PP .SS "MATCH PROGRAM REPRESENTATION" A match program is represented by nested Tcl list\&. The first element, \fIasm\fR, is a list of integer numbers, the instructions to execute, and their arguments\&. The second element, \fIpool\fR, is a list of strings, referenced by the instructions, for error messages, token names, etc\&. The third element, \fItokmap\fR, provides ordering information for the tokens, mapping their names to their numerical rank\&. This element can be empty, forcing lexicographic comparison when matching ranges\&. .PP All ME instructions are encoded as integer numbers, with the mapping given below\&. A number of the instructions, those which handle error messages, have been given an additional argument to supply that message explicitly instead of having it constructed from token names, etc\&. This allows the machine state to store only the message ids instead of the full strings\&. .PP Jump destination arguments are absolute indices into the \fIasm\fR element, refering to the instruction to jump to\&. Any string arguments are absolute indices into the \fIpool\fR element\&. Tokens, characters, messages, and token (actually character) classes to match are coded as references into the \fIpool\fR as well\&. .PP .IP [1] "\fBict_advance\fR \fImessage\fR" .IP [2] "\fBict_match_token\fR \fItok\fR \fImessage\fR" .IP [3] "\fBict_match_tokrange\fR \fItokbegin\fR \fItokend\fR \fImessage\fR" .IP [4] "\fBict_match_tokclass\fR \fIcode\fR \fImessage\fR" .IP [5] "\fBinc_restore\fR \fIbranchlabel\fR \fInt\fR" .IP [6] "\fBinc_save\fR \fInt\fR" .IP [7] "\fBicf_ntcall\fR \fIbranchlabel\fR" .IP [8] "\fBicf_ntreturn\fR" .IP [9] "\fBiok_ok\fR" .IP [10] "\fBiok_fail\fR" .IP [11] "\fBiok_negate\fR" .IP [12] "\fBicf_jalways\fR \fIbranchlabel\fR" .IP [13] "\fBicf_jok\fR \fIbranchlabel\fR" .IP [14] "\fBicf_jfail\fR \fIbranchlabel\fR" .IP [15] "\fBicf_halt\fR" .IP [16] "\fBicl_push\fR" .IP [17] "\fBicl_rewind\fR" .IP [18] "\fBicl_pop\fR" .IP [19] "\fBier_push\fR" .IP [20] "\fBier_clear\fR" .IP [21] "\fBier_nonterminal\fR \fImessage\fR" .IP [22] "\fBier_merge\fR" .IP [23] "\fBisv_clear\fR" .IP [24] "\fBisv_terminal\fR" .IP [25] "\fBisv_nonterminal_leaf\fR \fInt\fR" .IP [26] "\fBisv_nonterminal_range\fR \fInt\fR" .IP [27] "\fBisv_nonterminal_reduce\fR \fInt\fR" .IP [28] "\fBias_push\fR" .IP [29] "\fBias_mark\fR" .IP [30] "\fBias_mrewind\fR" .IP [31] "\fBias_mpop\fR" .PP .SH "CPU STATE" A state value is a list containing the following elements, in the order listed below: .IP [1] \fIcode\fR: Match instructions, see \fBMATCH PROGRAM REPRESENTATION\fR\&. .IP [2] \fIpc\fR: Program counter, \fIint\fR\&. .IP [3] \fIhalt\fR: Halt flag, \fIboolean\fR\&. .IP [4] \fIeof\fR: Eof flag, \fIboolean\fR .IP [5] \fItc\fR: Terminal cache, and input queue\&. Structure see below\&. .IP [6] \fIcl\fR: Current location, \fIint\fR\&. .IP [7] \fIct\fR: Current token, \fIstring\fR\&. .IP [8] \fIok\fR: Match status, \fIboolean\fR\&. .IP [9] \fIsv\fR: Semantic value, \fIlist\fR\&. .IP [10] \fIer\fR: Error status, \fIlist\fR\&. .IP [11] \fIls\fR: Location stack, \fIlist\fR\&. .IP [12] \fIas\fR: AST stack, \fIlist\fR\&. .IP [13] \fIms\fR: AST marker stack, \fIlist\fR\&. .IP [14] \fIes\fR: Error stack, \fIlist\fR\&. .IP [15] \fIrs\fR: Return stack, \fIlist\fR\&. .IP [16] \fInc\fR: Nonterminal cache, \fIdictionary\fR\&. .PP .PP \fItc\fR, the input queue of tokens waiting for processing and the terminal cache containing the tokens already processing are one unified data structure simply holding all tokens and their information, with the current location separating that which has been processed from that which is waiting\&. Each element of the queue/cache is a list containing the token, its lexeme information, line number, and column index, in this order\&. .PP All stacks have their top element aat the end, i\&.e\&. pushing an item is equivalent to appending to the list representing the stack, and popping it removes the last element\&. .PP \fIer\fR, the error status is either empty or a list of two elements, a location in the input, and a list of messages, encoded as references into the \fIpool\fR element of the \fIcode\fR\&. .PP \fInc\fR, the nonterminal cache is keyed by nonterminal name and location, each value a four-element list containing current location, match status, semantic value, and error status, in this order\&. .SH "BUGS, IDEAS, FEEDBACK" This document, and the package it describes, will undoubtedly contain bugs and other problems\&. Please report such in the category \fIgrammar_me\fR of the \fITcllib Trackers\fR [http://core\&.tcl\&.tk/tcllib/reportlist]\&. Please also report any ideas for enhancements you may have for either package and/or documentation\&. .PP When proposing code changes, please provide \fIunified diffs\fR, i\&.e the output of \fBdiff -u\fR\&. .PP Note further that \fIattachments\fR are strongly preferred over inlined patches\&. Attachments can be made by going to the \fBEdit\fR form of the ticket immediately after its creation, and then using the left-most button in the secondary navigation bar\&. .SH KEYWORDS grammar, parsing, virtual machine .SH CATEGORY Grammars and finite automata .SH COPYRIGHT .nf Copyright (c) 2005-2006 Andreas Kupries .fi