NAME
fst-train - learning transducer weights
SYNOPSIS
fst-train [ options ] file [ input-file ]
OPTIONS
-t file
    Use multiple transducers in the same way as fst-infl2.
-b
    Use supervised training with disambiguated data.
-d
    Disambiguate the analyses symbolically as described in the man page of
    fst-infl2.
-q
    Quiet mode.
DESCRIPTION
fst-train learns statistical weights for the transitions of a transducer
from training data. Training is either unsupervised (default) or supervised
(option -b).
In supervised mode, the input contains fully disambiguated data giving both the
surface form and the analysis form. The format restrictions are identical to
those that apply to lexicon entries, i.e. all operators other than the colon
operator (:) are interpreted literally.
In unsupervised mode, the input data consists of surface strings. The format is
identical to the input format of fst-infl and fst-infl2.
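The two training modes differ on the command line only in the -b flag. The following Python sketch assembles an argument list from the options documented above. The helper name build_fst_train_command and the file names morph.a, corpus.txt and gold.txt are hypothetical, and passing one -t flag per additional transducer is an assumption, not something this page confirms.

```python
from typing import List, Optional, Sequence


def build_fst_train_command(
    transducer: str,
    input_file: Optional[str] = None,
    supervised: bool = False,
    disambiguate: bool = False,
    quiet: bool = False,
    extra_transducers: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list for fst-train (hypothetical helper)."""
    argv = ["fst-train"]
    # -t file: additional transducers. One -t per file is an assumption.
    for path in extra_transducers:
        argv += ["-t", path]
    if supervised:
        argv.append("-b")  # supervised training on disambiguated data
    if disambiguate:
        argv.append("-d")  # symbolic disambiguation of the analyses
    if quiet:
        argv.append("-q")  # quiet mode
    argv.append(transducer)  # the transducer file is mandatory
    if input_file is not None:
        argv.append(input_file)  # the input file is optional per the synopsis
    return argv


# Unsupervised training (default): input is a list of surface strings.
unsupervised = build_fst_train_command("morph.a", "corpus.txt")
# Supervised training: input is disambiguated surface/analysis data.
supervised = build_fst_train_command("morph.a", "gold.txt", supervised=True, quiet=True)

print(" ".join(unsupervised))
print(" ".join(supervised))
```

The resulting list can be handed to subprocess.run on a system where fst-train is installed.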
The transducer weights are stored in files whose names are obtained by appending
.prob to the names of the transducer files.
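The naming rule above can be sketched in Python as follows. The helper name weight_file_for and the file name morph.a are hypothetical; the only behaviour taken from this page is that .prob is appended to the full transducer file name, so an existing extension is kept rather than replaced.

```python
def weight_file_for(transducer_path: str) -> str:
    """Return the weight file name for a transducer file (hypothetical helper)."""
    # The suffix is appended to the whole name, so the original extension stays.
    return transducer_path + ".prob"


print(weight_file_for("morph.a"))  # morph.a.prob
```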
BUGS
No bugs are known so far.
SEE ALSO
fst-infl2, fst-compiler
AUTHOR
Helmut Schmid, Institute for Computational Linguistics, University of Stuttgart,
Email: schmid@ims.uni-stuttgart.de. This software is available under the GNU
Public License.