KYTEA(1) General Commands Manual KYTEA(1)


kytea — a word segmentation/pronunciation estimation tool


kytea [options]


This manual page briefly documents the kytea command.

This manual page was written for the Debian distribution because the original program does not have a manual page. Instead, it has documentation in the GNU Info format; see below.

kytea is a morphological analysis system based on pointwise predictors. It segments sentences into words, tags each word, and predicts pronunciations. KyTea is pronounced the same as "cutie".


A summary of options is included below.

Analysis Options:

The model file to use when analyzing text
Don't do word segmentation (raw input cannot be accepted)
Do only word segmentation, no tagging
Skip the nth tag (n starts at 1)
Don't estimate the pronunciation of unknown words
Specifies character types to not be segmented (e.g. D for digits)
The width of the beam to use in beam search for unknown words (default 50, 0 for full search)
The debugging level (0=silent, 1=simple, 2=detailed)

Format Options:

The formatting of the input (raw/tok/full/part/conf, default raw)
The formatting of the output (full/part/conf/eda/tags, default full)
The maximum number of tags to print for one word (default 3, 0 implies no limit)
A tag for words that cannot be given any tag (for example, unknown words that contain a character not in the subword dictionary)
A tag to append to indicate words not in the dictionary

Format Options (for advanced users):

The separator for words in full annotation (" ")
The separator for tags in full/partial annotation ("/")
The separator for candidates in full/partial annotation ("&")
Indicates unannotated boundaries in partial annotation (" ")
Indicates skipped boundaries in partial annotation ("?")
Indicates non-existence of boundaries in partial annotation ("-")
Indicates existence of boundaries in partial annotation ("|")
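
As an illustration of how the default separators listed above fit together, the following is a minimal Python sketch for reading kytea output after it has been produced. It is not part of kytea. The function names parse_full and parse_partial and the sample lines are hypothetical. The partial-annotation reader additionally assumes untagged input in which single characters and single boundary markers strictly alternate.

```python
# Default separators, as listed in the advanced format options.
WORD_SEP = " "   # between words in full annotation
TAG_SEP = "/"    # between the surface form and each tag
CAND_SEP = "&"   # between candidates for one tag

# Boundary markers used in partial annotation.
BOUNDARY_MEANING = {
    " ": "unannotated",
    "?": "skipped",
    "-": "no boundary",
    "|": "boundary",
}


def parse_full(line, word_sep=WORD_SEP, tag_sep=TAG_SEP, cand_sep=CAND_SEP):
    """Split one full-annotation line into (surface, tags) pairs.

    tags holds one entry per tag position; each entry is the list of
    candidates printed for that position. Text that itself contains a
    separator character is not handled by this sketch.
    """
    words = []
    for token in line.rstrip("\n").split(word_sep):
        if not token:
            continue
        surface, *tags = token.split(tag_sep)
        words.append((surface, [tag.split(cand_sep) for tag in tags]))
    return words


def parse_partial(line):
    """Read an untagged partial-annotation line (assumed layout).

    Assumes characters sit at even positions and boundary markers at
    odd positions. Returns the characters and the decoded markers.
    """
    text = line.rstrip("\n")
    chars = list(text[0::2])
    marks = [BOUNDARY_MEANING[m] for m in text[1::2]]
    return chars, marks


if __name__ == "__main__":
    # Hypothetical sample lines, not real kytea output.
    print(parse_full("これ/代名詞/これ は/助詞&副助詞/は"))
    print(parse_partial("こ-れ|は 文"))
```

If non-default separators are passed to kytea, the same values would need to be passed to parse_full through its keyword arguments.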


This manual page was written by Koichi Akabe for the Debian system (and may be used by others). Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or any later version published by the Free Software Foundation.

On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL.