'\" t
.\"     Title: classifier_tester
.\"    Author: [see the "AUTHOR" section]
.\" Generator: DocBook XSL Stylesheets v1.79.1 <http://docbook.sf.net/>
.\"      Date: 01/21/2019
.\"    Manual: \ \&
.\"    Source: \ \&
.\"  Language: English
.\"
.TH "CLASSIFIER_TESTER" "1" "01/21/2019" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
classifier_tester \- character classifier tester for the legacy Tesseract engine\&.
.SH "SYNOPSIS"
.sp
\fBclassifier_tester\fR \-U \fIunicharset_file\fR \-F \fIfont_properties_file\fR \-X \fIxheights_file\fR \-classifier \fIx\fR \-lang \fIlang\fR [\-output_trainer \fItrainer\fR] *\&.tr
.SH "DESCRIPTION"
.sp
classifier_tester(1) runs Tesseract in a special mode\&. It takes a list of \&.tr files and tests a character classifier on them\&. The data must be in the format used for training, but it does not have to be the same data the classifier was trained on\&.
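.sp
A typical invocation might look like the following sketch\&. The file names unicharset, font_properties and xheights are placeholders for files prepared during training, not names required by the tool, and *\&.tr stands for the \&.tr files to test:
.sp
.if n \{\
.RS 4
.\}
.nf
classifier_tester \-U unicharset \-F font_properties \-X xheights \-classifier full \-lang eng *\&.tr
.fi
.if n \{\
.RE
.\}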
.SH "IN/OUT ARGUMENTS"
.sp
A list of \&.tr files\&.
.SH "OPTIONS"
.PP
\-lang \fIlang\fR
.RS 4
(Input) Three\-character language code; default value
\fIeng\fR\&.
.RE
.PP
\-classifier \fIx\fR
.RS 4
(Input) The classifier to test; one of "pruner" or "full"\&.
.RE
.PP
\-U \fIunicharset_file\fR
.RS 4
(Input) The unicharset for the language\&.
.RE
.PP
\-F \fIfont_properties_file\fR
.RS 4
(Input) Font properties file\&. Each line has the following form, where every field other than the font name is 0 or 1:
.sp
.if n \{\
.RS 4
.\}
.nf
\fIfont_name\fR \fIitalic\fR \fIbold\fR \fIfixed_pitch\fR \fIserif\fR \fIfraktur\fR
.fi
.if n \{\
.RE
.\}
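.sp
For illustration, a file describing two fonts could contain the lines below\&. The font names are hypothetical; the flags mark the first font as italic and serif and the second as bold only:
.sp
.if n \{\
.RS 4
.\}
.nf
examplefont_italic 1 0 0 1 0
examplefont_bold 0 1 0 0 0
.fi
.if n \{\
.RE
.\}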
.RE
.PP
\-X \fIxheights_file\fR
.RS 4
(Input) x\-heights file\&. In each line, xheight is the x\-height in pixels of a character drawn at 32pt at 300 dpi\&. At that size the x\-height plus ascenders plus descenders comes to about 133 pixels (32 / 72 x 300), and xheight is the part of that total taken up by the x\-height alone\&. Each line has the following form:
.sp
.if n \{\
.RS 4
.\}
.nf
\fIfont_name\fR \fIxheight\fR
.fi
.if n \{\
.RE
.\}
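.sp
As a sketch, entries for two hypothetical fonts might look like this\&. Both the font names and the pixel values are illustrative only and were not measured from real fonts:
.sp
.if n \{\
.RS 4
.\}
.nf
examplefont_italic 60
examplefont_bold 62
.fi
.if n \{\
.RE
.\}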
.RE
.PP
\-output_trainer \fItrainer\fR
.RS 4
(Output, Optional) Filename for output trainer\&.
.RE
.SH "SEE ALSO"
.sp
tesseract(1)
.SH "COPYING"
.sp
Copyright (C) 2012 Google, Inc\&. Licensed under the Apache License, Version 2\&.0\&.
.SH "AUTHOR"
.sp
The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985\-1995) and Google (2006\-present)\&.