.\" Portability macros (not validated).
.\" FIXME: For some reason in man2html the registers are always null-strings.
.\" Also, in man2html the code doesn't display the HTML code even
.\" if the conditionals are changed to always be true.
.
.\" Check whether we are using grohtml.
.nr mH 0
.if \n(.g \
. if '\*(.T'html' \
. nr mH 1
.
.\" Start URL.
.de UR
. ds m1 \\$1\"
. nh
. if \\n(mH \{\
. \" Start diversion in a new environment.
. do ev URL-div
. do di URL-div
. \}
..
.
.
.\" End URL.
.de UE
. ie \\n(mH \{\
. br
. di
. ev
.
. \" Has there been one or more input lines for the link text?
. ie \\n(dn \{\
. do HTML-NS ""
. \" Yes, strip off final newline of diversion and emit it.
. do chop URL-div
. do URL-div
\c
. do HTML-NS
. \}
. el \
. do HTML-NS "\\*(m1"
\&\\$*\"
. \}
. el \
\\*(la\\*(m1\\*(ra\\$*\"
.
. hy \\n(HY
..
.\" define .EX/.EE (for multiline user-command examples; normal Courier font)
.de EX
.Vb
.nf
.ft CW
..
.de EE
.Ve
.ft P
.fi
..
.\" =========================================================================
.\" Hey, EMACS: -*- nroff -*-
.\" First parameter, NAME, should be all caps
.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
.\" other parameters are allowed: see man(7), man(1)
.TH LINK-PARSER 1 "2021-03-30" "Version 5.9.0"
.\" Please adjust this date whenever revising the manpage.
.\"
.\" Some roff macros, for reference:
.\" .nh disable hyphenation
.\" .hy enable hyphenation
.\" .ad l left justify
.\" .ad b justify to both left and right margins
.\" .nf disable filling
.\" .fi enable filling
.\" .br insert line break
.\" .sp insert n+1 empty lines
.\" for manpage-specific macros, see man(7)
.SH NAME
link\-parser \- parse natural language sentences using Link Grammar
.SH SYNOPSIS
.B link\-parser
.RB \-\-help
.br
.B link\-parser
.RB \-\-version
.br
.nf
.B link\-parser [\fIlanguage\fP|\fIdict\_location\fP] \
\fR[\-\-quiet]\fP [\fI\-special\_command\fP...]
.fi
.SH DESCRIPTION
.PP
.\" TeX users may be more comfortable with the \fB\fP and
.\" \fI\fP escape sequences to invoke bold face and italics,
.\" respectively.
\fBlink\-parser\fP is the command-line wrapper to the \%link\-grammar
natural language parsing library. This library will parse sentences
written in English, Russian and other languages,
generating linkage trees showing relationships
between the subject, the verb, and various adjectives, adverbs,
etc. in the sentence.
.PP
.SH EXAMPLE
.EX
.B link\-parser
.EE
.PP
Starts the parser interactive shell. Enter any sentence to parse:
.PP
.EX
linkparser> \fBReading a man page is informative.\fP
Found 18 linkages (18 had no P.P. violations)
Linkage 1, cost vector = (UNUSED=0 DIS= 0.00 LEN=16)
+------------------------Xp------------------------+
+--------------->WV-------------->+ |
| +----------Ss*g---------+ |
| +--------Os-------+ | |
| | +---Ds**x---+ | |
+--->Wd---+ +-PHc+---A--+ +---Pa---+ |
| | | | | | | |
LEFT\-WALL reading.g a man.ij page.n is.v informative.a .
.EE
.SH BACKGROUND
The \fBlink\-parser\fP command-line tool is useful for
general exploration and use. For parsing large quantities of text,
it is presumed that a custom application built on the
\%link\-grammar library will be written. Several such
applications are described on the Link Grammar web page (see SEE ALSO
below); these include
the AbiWord grammar checker and the RelEx semantic relation extractor.
.PP
The theory of Link Grammar is explained in many academic papers.
In the first of these, Daniel Sleator and Davy Temperley,
"Parsing English with a Link Grammar" (1991),
the authors defined a new formal grammatical system called a
"link grammar". A sequence of words is in the language of a link
grammar if there is a way to draw "links" between words in such a way
that the local requirements of each word are satisfied, the links do
not cross, and the words form a consistent connected graph. The authors
encoded English grammar into such a system, and wrote \%\fBlink\-parser\fP
to parse English using this grammar.
.PP
The engine that performs the parsing is separate from the dictionaries
describing a language. Currently, the most fully developed, complete
dictionaries are for the English and Russian languages, although
experimental, incomplete dictionaries exist for German and eight
other languages.
.SH OVERVIEW
.PP
\fBlink\-parser\fP, when invoked manually, starts an interactive shell,
taking control of the terminal. Any line beginning with an exclamation
mark is assumed to be a "special command"; these are described below.
The command \%\fB!help\fP will provide more info; the command
\%\fB!variables\fP will print all of the special commands. These are also
called "variables", as almost all commands have a value associated with
them: the commands typically enable or disable some function, or
alter some multi-valued setting.
.PP
All other input is treated as a single, English-language sentence;
it is parsed, and the result of the parse is printed. The variables
control what is printed: by default, an ASCII-graphics linkage is
printed, although PostScript output is also possible. The printing of
the constituent tree can also be enabled. Other output controls include
the printing of disjuncts and complete link data.
.PP
In order to analyze sentences, \%\fBlink\-parser\fP depends on a
\%link\-grammar dictionary. This contains lists of words and associated
metadata about their grammatical properties. An English language
dictionary is provided by default. If other language dictionaries
are installed in the default search locations, these may be explicitly
specified by means of a 2-letter ISO language code; for example:
.PP
.RS 4
.EX
.B link\-parser de
.EE
.RE
.PP
will start the parser shell with the German dictionary.
.PP
Alternatively, the dictionary location can be specified explicitly with
either an absolute or a relative file path; for example:
.PP
.RS 4
.EX
.B link\-parser /usr/share/link\-grammar/en
.EE
.RE
.PP
will run \%\fBlink\-parser\fP using the English dictionary located in the
typical install directory.
.PP
\fBlink\-parser\fP can also be used in a non-interactive mode ("batch mode")
via the \%\fB\-batch\fP option (a special command, see below).
For example:
.PP
.RS 4
.EX
.B cat thesis.txt | link\-parser \-batch
.EE
.RE
.PP
will read lines from the file \%\fBthesis.txt\fP,
processing each one as a complete sentence. For sentences that don't have a
full parse, it will print
.br
.B +++++ error N
.br
(\fBN\fP is a number) to the standard output.
.PP
Alternatively, an input file may be specified with the \%\fB!file\fP \fIfilename\fP
special command, described below.
.PP
Note that using "batch mode" disables the usual
ASCII-graphics linkage printing. The input sentences also don't appear by
default on stdout. These features may be re-enabled via special
commands; special commands may be interspersed with the input.
.PP
Instead of specifying \fB\-batch\fP on the command line, \fB!batch\fP can
be specified in the input file itself.
.PP
For more details, use \fB!help batch\fP in \%\fBlink\-parser\fP's
interactive shell.
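.PP
As a sketch of such an input file (the file name \fBsentences.txt\fP
and its contents are only an illustration, not part of the distribution),
special commands and sentences can be mixed:
.PP
.RS 4
.EX
!batch
!echo=1
This is a test.
!width=100
Reading a man page is informative.
.EE
.RE
.PP
Assuming the English dictionary is installed, the file could then be
processed with:
.PP
.RS 4
.EX
.B link\-parser en < sentences.txt
.EE
.RE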
.SH OPTIONS
.TP
.B \-\-help
Print usage and exit.
.TP
.B \-\-version
Print program version and configuration details, and exit.
.TP
.B \-\-quiet
Suppress the version messages on startup.
.SS Special "!" commands
The special "!" commands can be specified as command-line options
or within the interactive shell itself by prefixing them with
"!" at the start of a line. The full option name does not need to be used; only enough
letters to make the option unique must be specified.
.PP
When specified as a command-line option, a special command is preceded
by "-" instead of "!".
.PP
Boolean variables may be toggled simply by giving the \%\fB!\fP\fIvarname\fP,
for example, \%\fB!batch\fP. Setting other variables requires using an
equals sign: \%\fI!varname=value\fP, for example, \%\fB!width=100\fP.
.PP
The \%\fB!help\fP command prints general help. When issued from
the interactive shell, it can take an argument, usually a special command.
The \%\fB!variables\fP
command prints all of the current variable settings. The
\%\fB!file\fP command reads input from its argument file. The \%\fB!file\fP
command is \fInot\fP a variable; it cannot be set. It can be used
repeatedly.
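.PP
As a brief sketch of the above, a shell session might set a few variables,
toggle a Boolean one, and then list the current settings (the values are
arbitrary examples, and the program's output is omitted here):
.PP
.RS 4
.EX
linkparser> \fB!width=100\fP
linkparser> \fB!constituents=1\fP
linkparser> \fB!echo\fP
linkparser> \fB!variables\fP
.EE
.RE
.PP
The same settings can be requested at startup by using "-" in place of "!".
Provided that no other command begins with the same letters, a shortened
name such as \fB\-wid=100\fP should be accepted in place of \fB\-width=100\fP:
.PP
.RS 4
.EX
.B link\-parser en \-width=100 \-constituents=1 \-echo
.EE
.RE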
.PP
The \%\fB!exit\fP command instructs \%\fBlink\-parser\fP to exit.
.PP
The exclamation mark "!" is also a special command by itself, used to inspect
the dictionary entry for any given word (optionally terminated by a subscript).
Thus two exclamation marks are needed before such a word when doing so from the
interactive shell. The wildcard character "*" can be specified as the last
character of the word in order to find multiple matches.
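.PP
For illustration, the following shell inputs look up the word "page"
(chosen arbitrarily; the output depends on the dictionary in use):
first its entry, then its entry with the subscript "n", and finally
all entries beginning with "pag":
.PP
.RS 4
.EX
linkparser> \fB!!page\fP
linkparser> \fB!!page.n\fP
linkparser> \fB!!pag*\fP
.EE
.RE
.PP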
Default values of the special commands below are shown in parentheses. Most of
them are the defaults of the \%link\-grammar library.
.br
Boolean default values are shown as \fBon\fP (1) or \fBoff\fP (0).
.TP
.BR !bad \ (off)
Enable display of bad linkages.
.TP
.BR !batch \ (off)
Enable batch mode.
.TP
.BR !constituents \ (0)
Generate constituent output. Its value may be:
.RS
.IP 0
Disabled
.IP 1
Treebank-style constituent tree
.IP 2
Flat, bracketed tree [A like [B this B] A]
.IP 3
Flat, treebank-style tree (A like (B this))
.RE
.TP
.BR !cost-max \ (2.7)
Largest cost to be considered.
.TP
.BR !dialect \ (no\ value)
Use the specified (comma-separated) names.
.br
They modify the disjunct cost of dictionary words
whose expressions contain symbolic cost specifications.
.TP
.BR !disjuncts \ (off)
Enable display of the disjuncts used.
.TP
.BR !echo \ (off)
Echo input sentence.
.TP
.BR !graphics \ (on)
Enable graphical display of linkage.
For each linkage, the sentence is printed along with a graphical
representation of its linkage above it.
.PP
.RS
The following notations are used for words in the sentence:
.IP [word]
A word with no linkage.
.IP word[?].x
An unknown word whose POS category x has been found by the parser.
.IP word[!]
An unknown word whose \%link\-grammar dictionary entry has been assigned
by a RegEx.
(Use !morphology=1 to see that dictionary entry.)
.IP word[~]
An unknown word in this position was replaced, using a spell guess,
by this word, which is found in the \%link\-grammar
dictionary.
.IP word[&]
This word is a part of an unknown word which has been found to consist
of two or more words that are in the \%link\-grammar dictionary.
.IP word.POS
This word is found in the dictionary as word.POS.
.IP word.#CORRECTION
This word is probably a typo; it was linked as the alternative word CORRECTION.
.PP
For dictionaries that support morphology (enable with !morphology=1):
.IP word=
A prefix morpheme
.IP =word
A suffix morpheme
.IP word.=
A stem
.RE
.TP
.BR !islands-ok \ (on)
Use null-linked islands.
.TP
.BR !limit \ (1000)
Limit the maximum linkages processed.
.TP
.BR !links \ (off)
Enable display of complete link data.
.TP
.BR !null \ (on)
Allow null links.
.TP
.BR !morphology \ (off)
Display word morphology.
When a word matches a RegEx, show the matching dictionary entry.
.TP
.BR !panic \ (on)
Use "panic mode" if a parse cannot be quickly found.
.br
The command \%\fB!panic_variables\fP prints the special variables that are used only in "panic mode".
.TP
.BR !postscript \ (off)
Generate PostScript output.
.TP
.BR !short \ (16)
Maximum length of short links.
.TP
.BR !spell \ (7)
If zero, no spelling or run-on corrections of unknown words are performed.
.br
Otherwise, use up to this many spell-guesses per unknown word. In that
case, the number of run-on corrections (word splits) of unknown
words is not limited.
.TP
.BR !timeout \ (30)
Abort parsing after this many seconds.
.TP
.BR !use-sat \ (off)
Use Boolean SAT-based parser.
.TP
.BR !verbosity \ (1)
Level of detail in output. Some useful values:
.RS
.IP 0
No prompt, minimal library messages
.IP 1
Normal verbosity
.IP 2
Show times of the parsing steps
.IP 3
Display some more information messages
.IP 4
Display data file search and locale setup
.IP 5-9
Tokenizer and parser debugging
.IP 10-19
Dictionary debugging
.IP 101
Print all the dictionary connectors, along with their length limit
.RE
.TP
.BR !walls \ (off)
Display wall words.
.TP
.BR !width \ (16381)(*)
The width of the display.
.br
* When writing to a terminal, this value is set from its width.
.br
.TP
.BR !wordgraph \ (0)
Display the wordgraph (word-split graph).
.RS
.IP 0
Disabled
.IP 1
Default display
.IP 2
Display parent tokens as subgraphs
.IP 3
Use esoteric display flags as set by !test=wg:FLAGS
.RE
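.PP
As an illustrative combination of the variables above (the values are
arbitrary examples, not recommendations), the following invocation
suppresses the prompt, limits the linkages processed to 100, aborts
parsing after 10 seconds, disables spell guessing, and enables
treebank-style constituent trees:
.PP
.RS 4
.EX
.B link\-parser en \-verbosity=0 \-limit=100 \-timeout=10 \-spell=0 \-constituents=1
.EE
.RE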
.SH FILES
The following files are per-language, where \fILL\fP is the 2-letter
ISO language code.
.TP
.IR LL /4.0.dict
The Link Grammar dictionary.
.TP
.IR LL /4.0.affix
Values of entities used in tokenization.
.TP
.IR LL /4.0.regex
Regular expressions (see
.BR regex (7))
that are used to match tokens not found in the dictionary.
.TP
.IR LL /4.0.dialect
Dialect component definitions.
.TP
.IR LL /4.0.knowledge
Post-processing definitions.
.TP
.IR LL /4.0.constituent\-knowledge
Definitions for producing a constituent tree.
.TP
.RI command\-help\- LL .txt\ or\ command\-help\- LL\-CC .txt
Help text for the \%\fB!help\fP \fItopic\fP special "!" command.
If several such files are provided, the desired one can be selected
by e.g. the LANGUAGE environment variable if it is set to \fILL\fP or
\fILL-CC\fP (default is \fBen\fP). Currently only \fBcommand-help-en.txt\fP
is provided.
.sp 2
.TP
The directory search order for these files is:
.RI \[bu]\ "./"
.br
.RI \[bu]\ "data/"
.br
.RI \[bu]\ "../"
.br
.RI \[bu]\ "../data/"
.br
\[bu]\ A custom data directory, as set by the API call \%\fBdictionary_set_data_dir()\fP.
.br
\[bu]\ Installation-dependent system data directory (*)
.sp 2
* This location is displayed as DICTIONARY_DIR when the \%\fB\-\-version\fP
argument is provided to \%\fBlink\-parser\fP on the command line.
On Windows it may be relative to the location of the \%link\-grammar library DLL;
in that case the actual location is displayed as "System data directory" when
\%\fBlink\-parser\fP is invoked with \%\fB\-verbosity=4\fP.
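.PP
For example, either of the following can be used to check which system
data directory is in use (the output varies by installation, so none is
shown here; the second form assumes the English dictionary is installed):
.PP
.RS 4
.EX
.B link\-parser \-\-version
.B link\-parser en \-verbosity=4
.EE
.RE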
.SH SEE ALSO
.nh
Information on the \%link\-grammar shared-library API and the link types
used in the parse is available at the
.UR http://www.abisource.com/projects/link-grammar/
AbiWord website
.UE .
.PP
Peer-reviewed papers explaining Link Grammar can be found at the
.UR http://www.link.cs.cmu.edu/link/papers
original CMU site
.UE .
.PP
The source code of \%\fBlink\-parser\fP and the \%link\-grammar library is
located at
.UR https://github.com/opencog/link-grammar
GitHub
.UE .
.PP
The mailing list for Link Grammar discussion is the
.UR http://groups.google.com/group/link-grammar?hl=en
link-grammar Google group
.UE .
.SH AUTHOR
.nh
\fBlink\-parser\fP and the \%link\-grammar library were written by Daniel
Sleator, Davy Temperley,
and John Lafferty.
.PP
This manual page was written by Ken Bloom
for the Debian project, and updated by Linas Vepstas.