'\" t
.\"     Title: bogoutil
.\"    Author: [see the "AUTHOR" section]
.\" Generator: DocBook XSL Stylesheets v1.78.0
.\"      Date: 06/29/2013
.\"    Manual: Bogofilter Reference Manual
.\"    Source: Bogofilter
.\"  Language: English
.\"
.TH "BOGOUTIL" "1" "06/29/2013" "Bogofilter" "Bogofilter Reference Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
bogoutil \- Dumps, loads, and maintains bogofilter database files
.SH "SYNOPSIS"
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-h | \-V}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR [options] {\-d\ \fIfile\fR | \-H\ \fIfile\fR | \-l\ \fIfile\fR | \-m\ \fIfile\fR | \-w\ \fIfile\fR | \-p\ \fIfile\fR}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-r\ \fIfile\fR | \-R\ \fIfile\fR}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-\-db\-print\-leafpage\-count\ \fIfile\fR | \-\-db\-print\-pagesize\ \fIfile\fR | \-\-db\-verify\ \fIfile\fR | \-\-db\-checkpoint\ \fIdirectory\fR\ [flag...]
| \-\-db\-list\-logfiles\ \fIdirectory\fR | \-\-db\-prune\ \fIdirectory\fR | \-\-db\-recover\ \fIdirectory\fR | \-\-db\-recover\-harder\ \fIdirectory\fR | \-\-db\-remove\-environment\ \fIdirectory\fR}
.PP
where
\fBoptions\fR
is
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR [\-v] [\-n] [\-C] [\-D] [\-a\ \fIage\fR] [\-c\ \fIcount\fR] [\-s\ \fImin,max\fR] [\-y\ \fIdate\fR] [\-I\ \fIfile\fR] [\-O\ \fIfile\fR] [\-x\ \fIflags\fR] [\-\-config\-file\ \fIfile\fR]
.SH "DESCRIPTION"
.PP
Bogoutil is part of the bogofilter Bayesian spam filter package\&.
.PP
It is used to dump and load bogofilter\*(Aqs Berkeley DB databases to and from text files, to perform database maintenance functions, and to display the values for specific words\&.
.SH "OPTIONS"
.PP
The \fB\-d \fR\fB\fIfile\fR\fR option tells bogoutil to print the contents of the database file to \fBstdout\fR\&.
.PP
The \fB\-H \fR\fB\fIfile\fR\fR option tells bogoutil to print a histogram of the database file to \fBstdout\fR\&. The output is similar to bogofilter \-vv\&. Finally, hapaxes (tokens which were only seen once) and pure tokens (tokens which were encountered only in ham or only in spam) are counted\&.
.PP
The \fB\-l \fR\fB\fIfile\fR\fR option tells bogoutil to load the data from \fBstdin\fR into the database file\&. If the database file exists, \fBstdin\fR data is merged into the database file, with counts added up\&.
.PP
The \fB\-m \fR\fB\fIfile\fR\fR option tells bogoutil to perform maintenance functions on the specified database, i\&.e\&. discard tokens that are older than desired, have counts that are too small, or sizes (lengths) that are too long or too short\&.
.PP
The \fB\-w \fR\fB\fIfile\fR\fR option tells bogoutil to display token information from the database file\&. The option takes an argument, which is either the name of the wordlist (usually wordlist\&.db) or the name of the directory containing it\&. Tokens can be listed on the command line or piped to bogoutil\&.
When there are extra arguments on the command line, bogoutil will use them as the tokens to look up\&. If there are no extra arguments, bogoutil will read tokens from \fBstdin\fR\&.
.PP
The \fB\-p \fR\fB\fIfile\fR\fR option tells bogoutil to display the database information for one or more tokens\&. The display includes a probability column with the token\*(Aqs spam score (computed using bogofilter\*(Aqs default values)\&. Option \fB\-p\fR takes the same arguments as option \fB\-w\fR\&.
.PP
The \fB\-r \fR\fB\fIfile\fR\fR option tells bogoutil to recalculate the ROBX value and print it as a six\-digit fraction\&.
.PP
The \fB\-R \fR\fB\fIfile\fR\fR option does the same as \fB\-r\fR, but saves the result in the training database without printing it\&.
.PP
The \fB\-I \fR\fB\fIfile\fR\fR option tells bogoutil to read its input from \fIfile\fR rather than stdin\&.
.PP
The \fB\-O \fR\fB\fIfile\fR\fR option tells bogoutil to write its output to \fIfile\fR rather than stdout\&.
.PP
The \fB\-v\fR option produces verbose output on \fBstderr\fR\&. This option is primarily useful for debugging\&.
.PP
The \fB\-C\fR option prevents bogoutil from reading configuration files, so that it runs with the defaults\&.
.PP
The \fB\-\-config\-file \fR\fB\fIfile\fR\fR option tells bogoutil to read \fIfile\fR instead of the standard configuration file\&.
.PP
The \fB\-D\fR option redirects debug output to stdout (it usually goes to stderr)\&.
.PP
The \fB\-x \fR\fB\fIflags\fR\fR option sets debugging flags\&.
.PP
Option \fB\-n\fR stands for "replace non\-ascii characters"\&. It replaces characters that have the high bit (0x80) set with question marks\&. This can be useful if a word list has lots of unreadable tokens, for example from Asian spam\&. The "bad" characters will be converted to question marks and matching tokens will be combined when used with \fB\-m\fR or \fB\-l\fR, but not with \fB\-d\fR\&.
.PP
Option \fB\-a age\fR indicates an acceptable token age, with older ones being discarded\&.
The age can be a date (in form YYYYMMDD) or a day count, i\&.e\&. discard tokens older than \fBage\fR days\&.
.PP
Option \fB\-c value\fR indicates that tokens with counts less than or equal to \fBvalue\fR are to be discarded\&.
.PP
Option \fB\-s min,max\fR is used to discard tokens based on their size, i\&.e\&. length\&. All tokens shorter than \fBmin\fR or longer than \fBmax\fR will be discarded\&.
.PP
Option \fB\-y date\fR specifies the date to give to tokens that don\*(Aqt have dates\&. The format is YYYYMMDD\&.
.PP
The \fB\-h\fR option prints the help message and exits\&.
.PP
The \fB\-V\fR option prints the version number and exits\&.
.SH "ENVIRONMENT MAINTENANCE"
.PP
The \fB\-\-db\-checkpoint \fR\fB\fIdir\fR\fR option causes bogoutil to flush the buffer caches and checkpoint the database environment\&.
.PP
The \fB\-\-db\-list\-logfiles \fR\fB\fIdir\fR\fR option causes bogoutil to list the log files in the environment\&. Zero or more keywords can be added or combined (separated by whitespace) to modify the behavior of this mode\&. The default behavior is to list only inactive log files with relative paths\&. You can add \fBall\fR to list all log files (inactive and active)\&. You can add \fBabsolute\fR to switch the listing to absolute paths\&.
.PP
The \fB\-\-db\-prune \fR\fB\fIdir\fR\fR option causes bogoutil to checkpoint the database environment and remove inactive log files\&.
.PP
The \fB\-\-db\-recover \fR\fB\fIdir\fR\fR option runs a regular database recovery in the specified database directory\&. If that fails, it will retry with a (usually slower) catastrophic database recovery\&. If that fails, too, your database cannot be repaired and must be rebuilt from scratch\&. This is only supported when compiled with Berkeley DB support with transactions enabled\&. Trying recovery with QDBM or SQLite3 support will result in an error\&.
.PP
The \fB\-\-db\-recover\-harder \fR\fB\fIdir\fR\fR option runs a catastrophic database recovery in the specified database directory\&. If that fails, your database cannot be repaired and must be rebuilt from scratch\&. This is only supported when compiled with Berkeley DB support with transactions enabled\&. Trying recovery with QDBM or SQLite3 support will result in an error\&.
.PP
The \fB\-\-db\-remove\-environment \fR\fB\fIdirectory\fR\fR option has no short option equivalent\&. It runs recovery in the given directory and then removes the database environment\&. Use this \fIbefore\fR upgrading to a new Berkeley DB version if the new version to be installed requires a log file format update\&.
.PP
The \fB\-\-db\-print\-leafpage\-count \fR\fB\fIfile\fR\fR option prints the number of leaf pages in the database file \fIfile\fR as a decimal number, or UNKNOWN if the database does not support querying this figure\&.
.PP
The \fB\-\-db\-print\-pagesize \fR\fB\fIfile\fR\fR option prints the size of a database page in \fIfile\fR as a decimal number, or UNKNOWN for databases with variable page size or databases that do not allow a query of the database page size\&.
.PP
The \fB\-\-db\-verify \fR\fB\fIfile\fR\fR option requests that bogoutil verify the database file\&. It prints only errors, unless in verbose mode\&.
.SH "DATA FORMAT"
.PP
Bogoutil reads and writes text files where each nonblank line consists of a word, any amount of horizontal whitespace, a numeric word count, more whitespace, and (optionally) a date in form YYYYMMDD\&. Blank lines are skipped\&.
.SH "RETURN VALUES"
.PP
0 for successful operation\&. 1 for most errors\&. 3 for I/O or other errors\&. Error 3 usually means that something is seriously wrong with the database files\&.
.SH "AUTHOR"
.PP
Gyepi Sam\&.
.PP
Matthias Andree\&.
.PP
David Relson\&.
.PP
For updates, see \m[blue]\fBthe bogofilter project page\fR\m[]\&\s-2\u[1]\d\s+2\&.
.SH "SEE ALSO"
.PP
bogofilter(1), bogolexer(1), bogotune(1), bogoupgrade(1)
.SH "NOTES"
.IP " 1." 4
the bogofilter project page
.RS 4
\%http://bogofilter.sourceforge.net/
.RE
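.SH "EXAMPLES"
.PP
The text format described under DATA FORMAT can be read by a short script\&. The following Python sketch is illustrative only and is not part of bogofilter: the names \fBEntry\fR and \fBparse_dump\fR are hypothetical, and the parser follows that description literally (a word, a count, and an optional YYYYMMDD date)\&. Output from \fBbogoutil \-d\fR on a given installation may carry additional columns, so check a real dump before relying on this\&.
.PP
.nf
```python
from datetime import date, datetime
from typing import Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    """One line of a dump (hypothetical helper type)."""
    word: str
    count: int
    day: Optional[date]  # None when the line carries no date


def parse_dump(lines: Iterable[str]) -> List[Entry]:
    """Parse lines laid out as described under DATA FORMAT."""
    entries = []
    for line in lines:
        fields = line.split()  # split on any run of whitespace
        if not fields:
            continue  # blank lines are skipped
        if len(fields) not in (2, 3):
            raise ValueError(f"unexpected field count: {line!r}")
        word, count = fields[0], int(fields[1])
        day = None
        if len(fields) == 3:
            # The optional third field is a date in form YYYYMMDD.
            day = datetime.strptime(fields[2], "%Y%m%d").date()
        entries.append(Entry(word, count, day))
    return entries


# Made-up sample data, not taken from a real wordlist.
sample = ["hello 12 20130629", "", "world   3"]
for entry in parse_dump(sample):
    print(entry.word, entry.count, entry.day)
```
.fi
.PP
Because \fBparse_dump\fR accepts any iterable of lines, an open text file holding a dump can be passed to it directly\&.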