.Dd August 30, 2006 .Dt APERTIUM-TAGGER 1 .Os Apertium .Sh NAME .Nm apertium-tagger .Nd part-of-speech tagger and trainer for Apertium .Sh SYNOPSIS .Nm apertium-tagger .Op Ar options .Fl g Ar serialized_tagger Op Ar input Op Ar output .Nm apertium-tagger .Op Ar options .Fl r Ar iterations corpus serialized_tagger .Nm apertium-tagger .Op Ar options .Fl s Ar iterations dictionary corpus tagger_spec serialized_tagger .Ar tagged_corpus untagged_corpus .Nm apertium-tagger .Op Ar options .Fl s Cm 0 .Ar dictionary tagger_spec serialized_tagger tagged_corpus untagged_corpus .Nm apertium-tagger .Op Ar options .Fl s Cm 0 Fl u Ar model serialized_tagger tagged_corpus .Nm apertium-tagger .Op Ar options .Fl t Ar iterations dictionary corpus tagger_spec serialized_tagger .Sh DESCRIPTION .Nm apertium-tagger is the application responsible for the apertium part-of-speech tagger training or tagging, depending on the calling options. This command only reads from the standard input if the option .Fl Fl tagger or .Fl g is used. .Sh OPTIONS .Bl -tag -width Ds .It Fl t Ar n , Fl Fl train Ar n Initializes parameters through Kupiec's method (unsupervised), then performs .Ar n iterations of the Baum-Welch training algorithm (unsupervised). .It Fl s Ar n , Fl Fl supervised Ar n Initializes parameters against a hand-tagged text (supervised) through the maximum likelihood estimate method, then performs .Ar n iterations of the Baum-Welch training algorithm (unsupervised). The CRP argument can be omitted only when .Ar n = 0. .It Fl r Ar n , Fl Fl retrain n Retrains the model with .Ar n additional Baum-Welch iterations (unsupervised). .It Fl g , Fl Fl tagger Tags input text by means of Viterbi algorithm. .It Fl p , Fl Fl show-superficial Prints the superficial form of the word along side the lexical form in the output stream. .It Fl f , Fl Fl first Used in conjunction with .Fl g .Pq Fl Fl tagger makes the tagger give all lexical forms of each word, with the chosen one in the first place (after the lemma) .It Fl d , Fl Fl debug Print error (if any) or debug messages while operating. .It Fl m , Fl Fl mark Mark disambiguated words. .It Fl h , Fl Fl help Display a help message. .It Fl z , Fl Fl null-flush Used in conjunction with .Fl g to flush the output after getting each null character. .It Fl u , Fl Fl unigram=MODEL use unigram algorithm MODEL from .It Fl w , Fl Fl sliding-window use the Light Sliding Window algorithm .It Fl x , Fl Fl perceptron use the averaged perceptron algorithm .It Fl e, Fl Fl skip-on-error Used with Fl xs to ignore certain types of errors with the training corpus .El .Sh FILES These are the kinds of files used with each option: .Bl -tag -width Ds .It Ar dictionary Full expanded dictionary file .It Ar corpus Training text corpus file .It Ar tagger_spec Tagger specification file, in XML format .It Ar serialized_tagger Tagger data file, built in the training and used while tagging .It Ar tagged_corpus Hand-tagged text corpus .It Ar untagged_corpus Untagged text corpus, morphological analysis of hand-tagged corpus to use both jointly with .Fl s option .It Ar input Input file, stdin by default .It Ar output Output file, stdout by default .El .Sh SEE ALSO .Xr apertium 1 , .Xr lt-comp 1 , .Xr lt-expand 1 , .Xr lt-proc 1 .Sh COPYRIGHT Copyright \(co 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of .Lk https://www.gnu.org/licenses/gpl.html the GNU General Public License . .Sh BUGS Many... lurking in the dark and waiting for you!