.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.40.10.
.TH ESTIMATE-NGRAM "1" "January 2013" "MITLM" "User Commands"
.SH NAME
estimate-ngram \- estimates n\-gram language model
.SH SYNOPSIS
.B estimate-ngram
[\fIOptions\fR]
.SH DESCRIPTION
Estimates an n\-gram language model by accumulating n\-gram count statistics,
smoothing observed counts, and building a backoff n\-gram model.  Parameters can
optionally be tuned to optimize development set performance.
.PP
A filename argument can be an ASCII file, a compressed file (ending in .Z or .gz),
or '\-' to indicate stdin/stdout.
.SH OPTIONS
.TP
\fB\-h\fR, \fB\-help\fR
Print this message.
.TP
\fB\-verbose\fR
Set verbosity level.
.IP
Default: 1
.TP
\fB\-o\fR, \fB\-order\fR
Set the n\-gram order of the estimated LM.
.IP
Default: 3
.TP
\fB\-v\fR, \fB\-vocab\fR
Fix the vocab to only words from the specified file.
.TP
\fB\-u\fR, \fB\-unk\fR
Replace all out\-of\-vocab words with <unk>.
.IP
Default: false
.TP
\fB\-t\fR, \fB\-text\fR
Add counts from text files.
.TP
\fB\-c\fR, \fB\-counts\fR
Add counts from counts files.
.TP
\fB\-s\fR, \fB\-smoothing\fR
Specify smoothing algorithms.
.IP
Default: ModKN
.TP
\fB\-wf\fR, \fB\-weight\-features\fR
Specify n\-gram weighting features.
.TP
\fB\-p\fR, \fB\-params\fR
Set initial model params.
.TP
\fB\-oa\fR, \fB\-opt\-alg\fR
Specify optimization algorithm.
.IP
Default: Powell
.TP
\fB\-op\fR, \fB\-opt\-perp\fR
Tune params to minimize dev set perplexity.
.TP
\fB\-ow\fR, \fB\-opt\-wer\fR
Tune params to minimize lattice word error rate.
.TP
\fB\-om\fR, \fB\-opt\-margin\fR
Tune params to minimize lattice margin.
.TP
\fB\-wb\fR, \fB\-write\-binary\fR
Write LM/counts files in binary format.
.IP
Default: false
.TP
\fB\-wp\fR, \fB\-write\-params\fR
Write tuned model params to file.
.TP
\fB\-wv\fR, \fB\-write\-vocab\fR
Write LM vocab to file.
.TP
\fB\-wc\fR, \fB\-write\-counts\fR
Write n\-gram counts to file.
.TP
\fB\-wec\fR, \fB\-write\-eff\-counts\fR
Write effective n\-gram counts to file.
.TP
\fB\-wlc\fR, \fB\-write\-left\-counts\fR
Write left\-branching n\-gram counts to file.
.TP
\fB\-wrc\fR, \fB\-write\-right\-counts\fR
Write right\-branching n\-gram counts to file.
.TP
\fB\-wl\fR, \fB\-write\-lm\fR
Write ARPA backoff LM to file.
.TP
\fB\-ep\fR, \fB\-eval\-perp\fR
Compute test set perplexity.
.TP
\fB\-ew\fR, \fB\-eval\-wer\fR
Compute test set lattice word error rate.
.TP
\fB\-em\fR, \fB\-eval\-margin\fR
Compute test set lattice margin.
.PP
.SH SEE ALSO
\fBevaluate-ngram\fR(1), \fBinterpolate-ngram\fR(1)
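.SH EXAMPLES
A minimal sketch of a typical invocation, using only options listed under
OPTIONS.  It assumes a plain\-text training corpus; the file names train.txt
and model.lm are hypothetical.  The command estimates a trigram model with the
default smoothing and writes it as an ARPA backoff LM:
.PP
.nf
# train.txt and model.lm are hypothetical file names
estimate\-ngram \-order 3 \-text train.txt \-write\-lm model.lm
.fi
.PP
Adding \fB\-opt\-perp\fR with a held\-out development file should additionally
tune the model parameters to minimize perplexity on that file.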