.TH "KYTEA" "1"
.SH "NAME"
kytea \- a word segmentation/pronunciation estimation tool
.SH "SYNOPSIS"
.PP
\fBkytea\fR [\fBoptions\fP]
.SH "DESCRIPTION"
.PP
This manual page briefly documents the \fBkytea\fR command.
.PP
This manual page was written for the \fBDebian\fP distribution because the
original program does not have a manual page.
Instead, it has documentation in the GNU \fBInfo\fP format.
.PP
\fBkytea\fR is a morphological analysis system based on pointwise predictors.
It segments sentences into words, tags the words, and predicts their
pronunciations.
KyTea is pronounced the same as "cutie".
.SH "OPTIONS"
.PP
A summary of options is included below.
.SS "Analysis Options:"
.IP "\fB-model\fP" 11
The model file to use when analyzing text
.IP "\fB-nows\fP" 11
Don't do word segmentation (raw input cannot be accepted)
.IP "\fB-notags\fP" 11
Do only word segmentation, no tagging
.IP "\fB-notag\fP" 11
Skip the nth tag (n starts at 1)
.IP "\fB-nounk\fP" 11
Don't estimate the pronunciation of unknown words
.IP "\fB-wsconst\fP" 11
Character types that should not be segmented (e.g. D for digits)
.IP "\fB-unkbeam\fP" 11
The width of the beam to use in beam search for unknown words
(default 50, 0 for full search)
.IP "\fB-debug\fP" 11
The debugging level (0=silent, 1=simple, 2=detailed)
.SS "Format Options:"
.IP "\fB-in\fP" 11
The formatting of the input (raw/tok/full/part/conf, default raw)
.IP "\fB-out\fP" 11
The formatting of the output (full/part/conf/eda/tags, default full)
.IP "\fB-tagmax\fP" 11
The maximum number of tags to print for one word
(default 3, 0 implies no limit)
.IP "\fB-deftag\fP" 11
A tag for words that cannot be given any tag (for example, unknown words
that contain a character not in the subword dictionary)
.IP "\fB-unktag\fP" 11
A tag to append to indicate words not in the dictionary
.SS "Format Options (for advanced users):"
.IP "\fB-wordbound\fP" 11
The separator for words in full annotation (" ")
.IP "\fB-tagbound\fP" 11
The separator for tags in full/partial annotation ("/")
.IP "\fB-elembound\fP" 11
The separator for candidates in full/partial annotation ("&")
.IP "\fB-unkbound\fP" 11
Indicates unannotated boundaries in partial annotation (" ")
.IP "\fB-skipbound\fP" 11
Indicates skipped boundaries in partial annotation ("?")
.IP "\fB-nobound\fP" 11
Indicates non-existence of boundaries in partial annotation ("\-")
.IP "\fB-hasbound\fP" 11
Indicates existence of boundaries in partial annotation ("|")
.SH "AUTHOR"
.PP
This manual page was written by Koichi Akabe <vbkaisetsu@gmail.com> for the
\fBDebian\fP system (and may be used by others).
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU General Public License, Version 2 or any later version
published by the Free Software Foundation.
.PP
On Debian systems, the complete text of the GNU General Public License can be
found in /usr/share/common-licenses/GPL.
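.SH "EXAMPLES"
.PP
The separators described under the format options can be illustrated with a
short Python sketch that splits one line of full-annotation text back into
words and tags.
It assumes the separators listed in this page (" " between words, "/" between
tags, "&" between candidates).
The function name parse_full and the sample line are hypothetical and are not
part of KyTea, and separator characters that occur inside a word are not
handled.
.PP
.nf
```python
def parse_full(line, wordbound=" ", tagbound="/", elembound="&"):
    """Split one line of full-annotation text into (word, tags) pairs.

    Each word is followed by zero or more tag fields, and each tag
    field holds one or more candidates.
    """
    result = []
    for token in line.strip().split(wordbound):
        if not token:
            continue  # skip empty pieces produced by repeated separators
        word, *fields = token.split(tagbound)
        tags = [field.split(elembound) for field in fields]
        result.append((word, tags))
    return result


if __name__ == "__main__":
    # Hypothetical sample in the word/tag/tag layout; not real KyTea output.
    sample = "w1/T1/p1 w2/T2&T3/p2"
    for word, tags in parse_full(sample):
        print(word, tags)
```
.fi
.PP
If \fB-wordbound\fP, \fB-tagbound\fP or \fB-elembound\fP were changed when
running \fBkytea\fR, the same characters would be passed as the corresponding
keyword arguments of this sketch.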