.\" Copyright (C) 2001 Information-technology Promotion Agency (IPA) .\" Copyright (C) 2001-2011 .\" National Institute of Advanced Industrial Science and Technology (AIST) .\" This file is part of the m17n library documentation. .\" Permission is granted to copy, distribute and/or modify this document .\" under the terms of the GNU Free Documentation License, Version 1.2 or .\" any later version published by the Free Software Foundation; with no .\" Invariant Section, no Front-Cover Texts, .\" and no Back-Cover Texts. A copy of the license is included in the .\" appendix entitled "GNU Free Documentation License". .TH "mdbIM" 5 "12 Jan 2011" "Version 1.6.2" "The m17n Library" \" -*- nroff -*- .ad l .nh .SH NAME mdbIM \- Input Method .SH "DESCRIPTION" .PP The m17n library provides a driver for input methods that are dynamically loadable from the m17n database (see \fBm17nInputMethod\fP ). .PP This section describes the data format that defines those input methods. .SH "SYNTAX and SEMANTICS" .PP The following data format defines an input method. The driver loads a definition from a file, a stream, etc. The definition is converted into the form of plist in the driver. .PP .PP .nf INPUT\-METHOD ::= IM\-DECLARATION ? IM\-DESCRIPTION ? TITLE ? VARIABLE\-LIST ? COMMAND\-LIST ? MODULE\-LIST ? MACRO\-LIST ? MAP\-LIST ? STATE\-LIST ? IM\-DECLARATION ::= '(' 'input\-method' LANGUAGE NAME EXTRA\-ID ? VERSION ? 
')' LANGUAGE ::= SYMBOL NAME ::= SYMBOL EXTRA\-ID ::= SYMBOL VERSION ::= '(' 'version' VERSION\-NUMBER ')' IM\-DESCRIPTION ::= '(' 'description' DESCRIPTION ')' DESCRIPTION ::= MTEXT\-OR\-GETTEXT | 'nil' MTEXT\-OR\-GETTEXT ::= [ MTEXT | '(' '_' MTEXT ')'] TITLE ::= '(' 'title' TITLE\-TEXT ')' TITLE\-TEXT ::= MTEXT VARIABLE\-LIST ::= '(' 'variable' VARIABLE\-DECLARATION * ')' VARIABLE\-DECLARATION ::= '(' VAR\-NAME [ DESCRIPTION VALUE VALUE\-CANDIDATE * ]')' VAR\-NAME ::= SYMBOL VALUE ::= MTEXT | SYMBOL | INTEGER VALUE\-CANDIDATE ::= VALUE | '(' RANGE\-FROM RANGE\-TO ')' RANGE\-FROM ::= INTEGER RANGE\-TO ::= INTEGER COMMAND\-LIST ::= '(' 'command' COMMAND\-DECLARATION * ')' COMMAND\-DECLARATION ::= '(' CMD\-NAME [ DESCRIPTION KEYSEQ * ] ')' CMD\-NAME ::= SYMBOL .fi .PP .PP \fCIM\-DECLARATION\fP specifies the language and name of this input method. .PP When \fCLANGUAGE\fP is \fCt\fP, the use of the input method is not limited to one language. .PP When \fCNAME\fP is \fCnil\fP, the input method is not standalone, but is expected to be used in other input methods. In such cases, \fCEXTRA\-ID\fP is required to identify the input method. .PP \fCVERSION\fP specifies the required minimum version number of the m17n library. The format is 'XX.YY.ZZ' where XX is a major version number, YY is a minor version number, and ZZ is a patch level. .PP \fCDESCRIPTION\fP, if not nil, specifies the description text of an input method, a variable or a command. If \fCMTEXT\-OR\-GETTEXT\fP takes the second form, the text is translated according to the current locale by 'gettext' (if the translation is provided). .PP \fCTITLE\-TEXT\fP is a text displayed on the screen when this input method is active. .PP There is one special input method file 'global.mim' that declares common variables and commands. The input method driver always loads this file and other input methods can inherit the variables and the commands. .PP \fCVARIABLE\-DECLARATION\fP declares a variable used in this input method. 
If a variable must be initialized to a default value, or is to be customized by a user, it must be declared here. The declaration can be used in two ways. One is to introduce a new variable. In that case, \fCVALUE\fP must not be omitted. The other is to inherit a variable declared in 'global.mim', and to give it a different default value and/or to make it customizable specifically for the current input method. In the latter case, \fCVALUE\fP can be omitted. .PP \fCCOMMAND\-DECLARATION\fP declares a command used in this input method. If a command must be bound to a default key sequence, or is to be customized by a user, it must be declared here. Like \fCVARIABLE\-DECLARATION\fP, the declaration can be used in two ways. One is to introduce a new command. In that case, \fCKEYSEQ\fP must not be omitted. The other is to inherit a command declared in 'global.mim', and to give it a different key binding and/or to make it customizable specifically for the current input method. In the latter case, \fCKEYSEQ\fP can be omitted. .PP .PP .nf MODULE\-LIST ::= '(' 'module' MODULE * ')' MODULE ::= '(' MODULE\-NAME FUNCTION * ')' MODULE\-NAME ::= SYMBOL FUNCTION ::= SYMBOL .fi .PP .PP Each \fCMODULE\fP declares the name of an external module (i.e. dynamic library) and the names of functions exported by the module. If a \fCFUNCTION\fP is named 'init', it is called with only the default arguments (see the section about \fCCALL\fP) when an input context is created for the input method. If a \fCFUNCTION\fP is named 'fini', it is called with only the default arguments when an input context is destroyed. .PP .PP .nf MACRO\-LIST ::= MACRO\-INCLUSION ? '(' 'macro' MACRO * ')' MACRO\-INCLUSION ? MACRO ::= '(' MACRO\-NAME MACRO\-ACTION * ')' MACRO\-NAME ::= SYMBOL MACRO\-ACTION ::= ACTION TAGS ::= '(' LANGUAGE NAME EXTRA\-ID ? ')' MACRO\-INCLUSION ::= '(' 'include' TAGS 'macro' MACRO\-NAME ? 
')' .fi .PP .PP \fCMACRO\-INCLUSION\fP includes macros from another input method specified by \fCTAGS\fP. When \fCMACRO\-NAME\fP is not given, all macros from the input method are included. .PP .PP .nf MAP\-LIST ::= MAP\-INCLUSION ? '(' 'map' MAP * ')' MAP\-INCLUSION ? MAP ::= '(' MAP\-NAME RULE * ')' MAP\-NAME ::= SYMBOL RULE ::= '(' KEYSEQ MAP\-ACTION * ')' KEYSEQ ::= MTEXT | '(' [ SYMBOL | INTEGER ] * ')' MAP\-INCLUSION ::= '(' 'include' TAGS 'map' MAP\-NAME ? ')' .fi .PP .PP When an input method is never standalone and is always included in another input method, \fCMAP\-LIST\fP can be omitted. .PP \fCSYMBOL\fP in the definition of \fCMAP\-NAME\fP must be neither \fCt\fP nor \fCnil\fP. .PP \fCMTEXT\fP in the definition of \fCKEYSEQ\fP consists of characters that can be generated by a keyboard. Therefore \fCMTEXT\fP usually contains only ASCII characters. However, if the input method is intended to be used, for instance, with a West European keyboard, \fCMTEXT\fP may contain Latin\-1 characters. .PP \fCSYMBOL\fP in the definition of \fCKEYSEQ\fP must be a value returned by the minput_event_to_key() function. Under the X window system, you can quickly check the value using the \fCxev\fP command. For example, the return key, the backspace key, and the 0 key on the keypad are represented as \fC(Return)\fP, \fC(BackSpace)\fP, and \fC(KP_0)\fP respectively. If the shift, control, meta, alt, super, and hyper modifiers are used, they are represented by the S\-, C\-, M\-, A\-, s\-, and H\- prefixes respectively, in this order. Thus, 'return with shift with meta with hyper' is \fC(S\-M\-H\-Return)\fP. Note that 'a with shift' .. 'z with shift' are represented simply as A .. Z. Thus 'a with shift with meta with hyper' is \fC(M\-H\-A)\fP. .PP \fCINTEGER\fP in the definition of \fCKEYSEQ\fP must be a valid character code. .PP \fCMAP\-INCLUSION\fP includes maps from another input method specified by \fCTAGS\fP. 
When \fCMAP\-NAME\fP is not given, all maps from the input method are included. .PP .PP .nf MAP\-ACTION ::= ACTION ACTION ::= INSERT | DELETE | SELECT | MOVE | MARK | SHOW | HIDE | PUSHBACK | POP | UNDO | COMMIT | UNHANDLE | SHIFT | CALL | SET | IF | COND | '(' MACRO\-NAME ')' PREDEFINED\-SYMBOL ::= '@0' | '@1' | '@2' | '@3' | '@4' | '@5' | '@6' | '@7' | '@8' | '@9' | '@<' | '@=' | '@>' | '@\-' | '@+' | '@[' | '@]' | '@@' | '@\-0' | '@\-N' | '@+N' .fi .PP .PP .PP .nf STATE\-LIST ::= STATE\-INCLUSION ? '(' 'state' STATE * ')' STATE\-INCLUSION ? STATE ::= '(' STATE\-NAME [ STATE\-TITLE\-TEXT ] BRANCH * ')' STATE\-NAME ::= SYMBOL STATE\-TITLE\-TEXT ::= MTEXT BRANCH ::= '(' MAP\-NAME BRANCH\-ACTION * ')' | '(' 'nil' BRANCH\-ACTION * ')' | '(' 't' BRANCH\-ACTION * ')' STATE\-INCLUSION ::= '(' 'include' TAGS 'state' STATE\-NAME ? ')' .fi .PP .PP When an input method is never standalone and is always included in another input method, \fCSTATE\-LIST\fP can be omitted. .PP \fCSTATE\-INCLUSION\fP includes states from another input method specified by \fCTAGS\fP. When \fCSTATE\-NAME\fP is not given, all states from the input method are included. .PP The optional \fCSTATE\-TITLE\-TEXT\fP specifies a title text displayed on the screen when the input method is in this state. If \fCSTATE\-TITLE\-TEXT\fP is omitted, \fCTITLE\-TEXT\fP is used. .PP In the first form of \fCBRANCH\fP, \fCMAP\-NAME\fP must be an item that appears in \fCMAP\fP. In this case, if a key sequence matching one of the \fCKEYSEQ\fPs of \fCMAP\-NAME\fP is typed, the \fCBRANCH\-ACTION\fPs are executed. .PP In the second form of \fCBRANCH\fP, the \fCBRANCH\-ACTION\fPs are executed if a key sequence that does not match any \fCBRANCH\fP of the current state is typed. .PP If there is no \fCBRANCH\fP beginning with \fCnil\fP and the typed key sequence does not match any \fCBRANCH\fP of the current state, the input method shifts to the initial state. 
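.PP
The first two forms of \fCBRANCH\fP can be sketched as follows. This is a minimal illustration only: the map names \fCstarter\fP and \fCdigit\fP, the state names, the keys, and the variable \fCcount\fP are hypothetical and are not taken from the m17n database.
.PP
.nf
(map
 (starter ("x" "x"))
 (digit ("0" ?0) ("1" ?1)))
(state
 (init
  ;; First form: a key sequence from the map starter shifts to collect.
  (starter (set count 0) (shift collect)))
 (collect "COLLECT"
  ;; First form: each key sequence from the map digit increments count.
  (digit (add count 1))
  ;; Second form: any other key sequence shifts back to the initial state.
  (nil (shift init))))
.fi
.PP
Here "COLLECT" is the \fCSTATE\-TITLE\-TEXT\fP of the second state. Without the \fCnil\fP branch, an unmatched key sequence would still bring the input method back to \fCinit\fP, as described above; the explicit branch only makes room for additional actions.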
.PP In the third form of \fCBRANCH\fP, the \fCBRANCH\-ACTION\fPs are executed when the input method shifts to the current state. If the current state is the initial state, the \fCBRANCH\-ACTION\fPs are also executed when an input context of the input method is created. .PP .PP .nf BRANCH\-ACTION ::= ACTION .fi .PP .PP An input method has the following two lists of symbols. .PP .PD 0 .IP "\(bu" 2 marker list .PP A marker is a symbol indicating a character position in the preediting text. The \fCMARK\fP action assigns a position to a marker. The position of a marker is referred to by the \fCMOVE\fP and the \fCDELETE\fP actions. .PP .IP "\(bu" 2 variable list .PP A variable is a symbol associated with an integer, a symbol, or an M\-text value. The integer value of a variable can be set by the \fCSET\fP action. It can be referred to by the \fCSET\fP, the \fCINSERT\fP, the \fCSELECT\fP, the \fCUNDO\fP, the \fCIF\fP, and the \fCCOND\fP actions. The M\-text value of a variable can be referred to by the \fCINSERT\fP action. The symbol value of a variable cannot be referred to directly; it is used by the library implicitly (e.g. candidates\-charset). All variables are implicitly initialized to the integer value zero. .PP .PP .PP Each \fCPREDEFINED\-SYMBOL\fP has a special meaning when used as a marker. .PP .PD 0 .IP "\(bu" 2 \fC@0\fP, \fC@1\fP, \fC@2\fP, \fC@3\fP, \fC@4\fP, \fC@5\fP, \fC@6\fP, \fC@7\fP, \fC@8\fP, \fC@9\fP .PP The 0th, 1st, 2nd, ... 9th position respectively. .PP .IP "\(bu" 2 \fC@<\fP, \fC@=\fP, \fC@>\fP .PP The first, the current, and the last position. .PP .IP "\(bu" 2 \fC@\-\fP, \fC@+\fP .PP The previous and the next position. .PP .IP "\(bu" 2 \fC@[\fP, \fC@]\fP .PP The previous and the next position where a candidate list changes. .PP .PP Some \fCPREDEFINED\-SYMBOL\fPs have a special meaning when used as a candidate index in the \fCSELECT\fP action. 
.PP .PD 0 .IP "\(bu" 2 \fC@<\fP, \fC@=\fP, \fC@>\fP .PP The first, the current, and the last candidate of the current candidate group. .PP .IP "\(bu" 2 \fC@\-\fP .PP The previous candidate. If the current candidate is the first one in the current candidate group, then it means the last candidate in the previous candidate group. .PP .IP "\(bu" 2 \fC@+\fP .PP The next candidate. If the current candidate is the last one in the current candidate group, then it means the first candidate in the next candidate group. .PP .IP "\(bu" 2 \fC@[\fP, \fC@]\fP .PP The candidate in the previous and the next candidate group having the same candidate index as the current one. .PP .PP The following symbol also has a special meaning. .PP .PD 0 .IP "\(bu" 2 \fC@@\fP .PP The number of keys handled at that moment. .PP .PP .PP The following symbols support surrounding text handling. .PP .PD 0 .IP "\(bu" 2 \fC@\-0\fP .PP \-1 if surrounding text is supported, \-2 if not. .PP .IP "\(bu" 2 \fC@\-N\fP .PP Here, \fCN\fP is a positive integer. The value is the Nth previous character in the preedit buffer. If there are only M (M' | '<=' | '>=' .fi .PP .PP This action treats \fCSYMBOL1\fP and \fCSYMBOL2\fP as variables and sets the value of \fCSYMBOL1\fP as below. .PP If \fCCMD\fP is 'set', it sets the value of \fCSYMBOL1\fP to the value of \fCEXPRESSION\fP. .PP If \fCCMD\fP is 'add', it increments the value of \fCSYMBOL1\fP by the value of \fCEXPRESSION\fP. .PP If \fCCMD\fP is 'sub', it decrements the value of \fCSYMBOL1\fP by the value of \fCEXPRESSION\fP. .PP If \fCCMD\fP is 'mul', it multiplies the value of \fCSYMBOL1\fP by the value of \fCEXPRESSION\fP. .PP If \fCCMD\fP is 'div', it divides the value of \fCSYMBOL1\fP by the value of \fCEXPRESSION\fP. .PP .PP .nf IF ::= '(' CONDITION ACTION\-LIST1 ACTION\-LIST2 ? 
')' CONDITION ::= [ '=' | '<' | '>' | '<=' | '>=' ] EXPRESSION1 EXPRESSION2 ACTION\-LIST1 ::= '(' ACTION * ')' ACTION\-LIST2 ::= '(' ACTION * ')' .fi .PP .PP This action performs the actions in \fCACTION\-LIST1\fP if \fCCONDITION\fP is true, and performs those in \fCACTION\-LIST2\fP (if any) otherwise. .PP .PP .nf COND ::= '(' 'cond' [ '(' EXPRESSION ACTION * ')' ] * ')' .fi .PP .PP This action performs the \fCACTION\fPs of the first clause whose \fCEXPRESSION\fP has a nonzero value. .SH "EXAMPLE 1" .PP This is a very simple example for inputting Latin characters with diacritical marks (acute and cedilla). For instance, when you type: .PP .nf Comme'die\-Franc,aise, chic,, .fi .PP you will get this: .PP The definition of the input method is very simple as below, and it is quite straightforward to extend it to cover all Latin characters. .SH "EXAMPLE 2" .PP This example is for inputting Unicode characters by typing C\-u (Control\-u) followed by four hexadecimal digits. For instance, when you type ('^u' means Control\-u): .PP .nf ^u2190^u2191^u2192^u2193 .fi .PP you will get this (Unicode arrow symbols): .PP The definition utilizes the \fCSET\fP and \fCIF\fP actions as below:
.PP
.nf
(title "UNICODE")
(map
 (starter
  ((C\-U) "U+"))
 (hex
  ("0" ?0) ("1" ?1) ... ("9" ?9) ("a" ?A) ("b" ?B) ... ("f" ?F)))
(state
 (init
  (starter (set code 0) (set count 0) (shift unicode)))
 (unicode
  (hex (set this @\-)
       (< this ?A
          ((sub this 48))
          ((sub this 55)))
       (mul code 16) (add code this)
       (add count 1)
       (= count 4
          ((delete @<) (insert code) (shift init))))))
.fi
.PP
.SH "EXAMPLE 3"
.PP
This example is for inputting Chinese characters by typing a PinYin key sequence.
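.PP
The following is a minimal sketch of how such an input method could be structured; it is not the definition distributed with the m17n database. The two syllables, their candidate characters, and the names \fCsyllable\fP, \fCchoose\fP, and \fCpick\fP are made up for illustration. The sketch assumes the candidate\-list form of the \fCINSERT\fP action, in which each character of the M\-text is one candidate.
.PP
.nf
(title "PY")
(map
 (syllable
  ;; Typing a syllable inserts a list of candidates (hypothetical data).
  ("ni" ("你尼"))
  ("hao" ("好号")))
 (choose
  ;; Move to the previous or the next candidate.
  ((Left) (select @\-))
  ((Right) (select @+))))
(state
 (init
  (syllable (shift pick)))
 (pick
  (choose)))
.fi
.PP
In the \fCpick\fP state, a key sequence that matches no \fCBRANCH\fP brings the input method back to \fCinit\fP, where the next syllable can be typed.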
.SH "SEE ALSO" .PP \fBInput Methods provided by the m17n database\fP, \fBmdbGeneral(5)\fP .SH COPYRIGHT Copyright (C) 2001 Information\-technology Promotion Agency (IPA) .br Copyright (C) 2001\-2011 National Institute of Advanced Industrial Science and Technology (AIST) .br Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License .