NAME
apertium-tagger - part-of-speech tagger and tagger trainer, part of apertium
This tool is part of the apertium open-source machine translation architecture:
http://www.apertium.org.
SYNOPSIS
apertium-tagger --train|-t {n} DIC CRP TSX PROB [--debug|-d]
apertium-tagger --supervised|-s {n} DIC CRP TSX PROB HTAG UNTAG [--debug|-d]
apertium-tagger --retrain|-r {n} CRP PROB [--debug|-d]
apertium-tagger --tagger|-g [--first|-f] PROB [--debug|-d] [INPUT [OUTPUT]]
DESCRIPTION
apertium-tagger either trains the apertium part-of-speech tagger or tags text
with it, depending on the options it is called with.
This command reads from the standard input only when the --tagger (-g) option
is used.
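As a minimal sketch of the two ways of feeding text to the tagger, the script below calls it once through standard input and once with explicit INPUT and OUTPUT arguments, as allowed by the SYNOPSIS. The file names en.prob, analysed.txt and tagged.txt are hypothetical, and the guard function can_tag is a helper introduced only for this example; the two calls are assumed, not confirmed, to produce the same output.

```shell
PROB=en.prob          # hypothetical tagger data file from an earlier training run
INPUT=analysed.txt    # hypothetical input text for the tagger

# Succeed only if the tagger is installed and both files are readable.
can_tag() {
    command -v apertium-tagger >/dev/null 2>&1 && [ -r "$1" ] && [ -r "$2" ]
}

if can_tag "$PROB" "$INPUT"; then
    # No INPUT/OUTPUT arguments: read standard input, write standard output.
    apertium-tagger --tagger "$PROB" < "$INPUT" > tagged.txt
    # The same job with explicit INPUT and OUTPUT file arguments.
    apertium-tagger --tagger "$PROB" "$INPUT" tagged.txt
else
    echo "apertium-tagger, $PROB or $INPUT not available; nothing to do" >&2
fi
```

The guard keeps the script harmless on a machine without apertium installed: it prints a message to standard error instead of failing on a missing file.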
OPTIONS
- -t {n}, --train {n}
- Initializes the parameters using Kupiec's method (unsupervised), then
performs n iterations of the Baum-Welch training algorithm (unsupervised).
- -s {n}, --supervised {n}
- Initializes the parameters from a hand-tagged text (supervised) using the
maximum likelihood estimate method, then performs n iterations of the
Baum-Welch training algorithm (unsupervised).
- -r {n}, --retrain {n}
- Retrains the model with n additional Baum-Welch
iterations (unsupervised).
- -g, --tagger
- Tags the input text using the Viterbi algorithm.
- -p, --show-superficial
- Prints the superficial form of each word alongside its lexical form in the
output stream.
- -f, --first
- Used in conjunction with -g (--tagger). Makes the tagger output all lexical
forms of each word, with the chosen one in first place (after the lemma).
- -d, --debug
- Prints error messages (if any) and debug messages while operating.
- -m, --mark
- Marks disambiguated words.
- -h, --help
- Displays a help message.
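The training options above can be chained. The following is a sketch of one possible unsupervised workflow, assuming hypothetical file names (en.dic, en.crp, en.tsx, en.prob) and arbitrary iteration counts. The run helper is not part of apertium; it is introduced here so that the script only prints the commands unless DRY_RUN is set to 0.

```shell
# Hypothetical file names; substitute the files of your own language pair.
DIC=en.dic
CRP=en.crp
TSX=en.tsx
PROB=en.prob

# Print the command when DRY_RUN is 1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Kupiec initialization followed by 8 Baum-Welch iterations.
run apertium-tagger --train 8 "$DIC" "$CRP" "$TSX" "$PROB"

# 4 additional Baum-Welch iterations on the model stored in PROB.
run apertium-tagger --retrain 4 "$CRP" "$PROB"
```

Running the script with DRY_RUN=0 in the environment executes the two commands instead of printing them.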
FILES
These are the kinds of files used with each option:
DIC     Fully expanded dictionary file
CRP     Training text corpus file
TSX     Tagger specification file, in XML format
PROB    Tagger data file, built during training and used while tagging
HTAG    Hand-tagged text corpus
UNTAG   Untagged text corpus: the morphological analysis of the HTAG corpus,
        used jointly with HTAG by the -s option
INPUT   Input file, stdin by default
OUTPUT  Output file, stdout by default
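To illustrate how these files map onto the arguments of --supervised, here is a sketch that checks the input files before calling the tagger in SYNOPSIS order. All file names and the iteration count are hypothetical, and first_missing is a helper written for this example, not an apertium command. PROB is not checked because it is built by the training run.

```shell
# Hypothetical file names, one per argument of --supervised.
DIC=en.dic
CRP=en.crp
TSX=en.tsx
PROB=en.prob
HTAG=en.tagged
UNTAG=en.untagged
N=8   # hypothetical number of Baum-Welch iterations

# Print the first argument that is not an existing file and return 1;
# return 0 without output when every argument exists.
first_missing() {
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "$f"
            return 1
        fi
    done
    return 0
}

if missing=$(first_missing "$DIC" "$CRP" "$TSX" "$HTAG" "$UNTAG"); then
    apertium-tagger --supervised "$N" "$DIC" "$CRP" "$TSX" "$PROB" "$HTAG" "$UNTAG"
else
    echo "missing input file: $missing" >&2
fi
```

Checking the inputs first gives a clearer message than a failed training run when one of the five input files is absent.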
SEE ALSO
lt-proc(1), lt-comp(1), lt-expand(1),
apertium-translator(1), apertium(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
Copyright (c) 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This
is free software. You may redistribute copies of it under the terms of the GNU
General Public License <http://www.gnu.org/licenses/gpl.html>.