NAME¶
kytea — a word segmentation/pronunciation estimation tool
SYNOPSIS¶
kytea [options]
DESCRIPTION¶
This manual page briefly documents the kytea command.
This manual page was written for the Debian distribution because the
original program does not have a manual page. Instead, it has documentation in
the GNU Info format; see below.
kytea is a morphological analysis system based on pointwise predictors. It
segments sentences into words, tags the words, and predicts their
pronunciations. "KyTea" is pronounced the same as "cutie".
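As an illustration of how the command might be driven from a script, the following Python sketch builds a command line from a few of the options documented in this page (-model, -notags, -out) and passes text to kytea. It assumes that kytea reads its input from standard input and writes its analysis to standard output, and that the text is UTF-8; the helper names build_kytea_command and run_kytea are hypothetical and are not part of KyTea.

```python
import subprocess


def build_kytea_command(model=None, notags=False, out_format=None):
    """Build an argv list for kytea from a few documented options.

    Hypothetical helper: only -model, -notags and -out are covered here.
    """
    cmd = ["kytea"]
    if model is not None:
        cmd += ["-model", model]      # model file to use for analysis
    if notags:
        cmd.append("-notags")         # word segmentation only, no tagging
    if out_format is not None:
        cmd += ["-out", out_format]   # full/part/conf/eda/tags
    return cmd


def run_kytea(text, **options):
    """Run kytea on `text` and return its output as a string.

    Assumption: kytea reads from stdin and writes to stdout, in UTF-8.
    """
    result = subprocess.run(
        build_kytea_command(**options),
        input=text,
        capture_output=True,
        encoding="utf-8",
        check=True,                   # raise if kytea exits with an error
    )
    return result.stdout
```

A call such as run_kytea(text, model="/path/to/model", notags=True) would then correspond to running kytea -model /path/to/model -notags with the text on standard input; the model path shown is a placeholder.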
OPTIONS¶
A summary of options is included below.
Analysis Options:¶
- -model
- The model file to use when analyzing text
- -nows
- Don't do word segmentation (raw input cannot be accepted)
- -notags
- Do only word segmentation, no tagging
- -notag
- Skip the nth tag (n starts at 1)
- -nounk
- Don't estimate the pronunciation of unknown words
- -wsconst
- Specifies character types to not be segmented (e.g. D for digits)
- -unkbeam
- The width of the beam to use in beam search for unknown words (default 50,
0 for full search)
- -debug
- The debugging level (0=silent, 1=simple, 2=detailed)
- -in
- The formatting of the input (raw/tok/full/part/conf, default raw)
- -out
- The formatting of the output (full/part/conf/eda/tags, default full)
- -tagmax
- The maximum number of tags to print for one word (default 3, 0 implies no
limit)
- -deftag
- A tag for words that cannot be given any tag (for example, unknown words
that contain a character not in the subword dictionary)
- -unktag
- A tag to append to indicate words not in the dictionary
- -wordbound
- The separator for words in full annotation (" ")
- -tagbound
- The separator for tags in full/partial annotation ("/")
- -elembound
- The separator for candidates in full/partial annotation ("&")
- -unkbound
- Indicates unannotated boundaries in partial annotation (" ")
- -skipbound
- Indicates skipped boundaries in partial annotation ("?")
- -nobound
- Indicates non-existence of boundaries in partial annotation ("-")
- -hasbound
- Indicates existence of boundaries in partial annotation ("|")
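To make the role of the separator options concrete, here is a minimal Python sketch that splits one line of full-annotation output using the default values listed above: " " between words, "/" between tags, and "&" between candidates. The function name parse_full and the sample line are hypothetical and for illustration only; the sketch also assumes that separators never appear escaped inside a word or tag, which real output may not guarantee.

```python
def parse_full(line, wordbound=" ", tagbound="/", elembound="&"):
    """Split one line of full annotation into (surface, tags) pairs.

    Each word is assumed to look like surface/tag1/tag2, where every tag
    field may hold several candidates joined by `elembound`.
    Escaped separators are NOT handled in this sketch.
    """
    words = []
    for token in line.split(wordbound):
        if not token:
            continue                      # skip empty pieces from repeated separators
        surface, *tag_fields = token.split(tagbound)
        # Each tag field becomes a list of candidates.
        tags = [field.split(elembound) for field in tag_fields]
        words.append((surface, tags))
    return words


# Hypothetical sample line, not actual kytea output.
sample = "今日/名詞/きょう は/助詞/は"
for surface, tags in parse_full(sample):
    print(surface, tags)
```

If -wordbound, -tagbound or -elembound are changed on the command line, the same values would need to be passed to the corresponding keyword arguments.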
AUTHOR¶
This manual page was written by Koichi Akabe <vbkaisetsu@gmail.com> for the
Debian system (and may be used by others). Permission is granted to
copy, distribute and/or modify this document under the terms of the GNU
General Public License, Version 2 or any later version published by the Free
Software Foundation.
On Debian systems, the complete text of the GNU General Public License can be
found in /usr/share/common-licenses/GPL.