'\" t .\" Title: bogofilter .\" Author: [see the "AUTHOR" section] .\" Generator: DocBook XSL Stylesheets vsnapshot .\" Date: 05/19/2019 .\" Manual: Bogofilter Reference Manual .\" Source: Bogofilter .\" Language: English .\" .TH "BOGOFILTER" "1" "05/19/2019" "Bogofilter" "Bogofilter Reference Manual" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .\" http://bugs.debian.org/507673 .\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" ----------------------------------------------------------------- .\" * set default formatting .\" ----------------------------------------------------------------- .\" disable hyphenation .nh .\" disable justification (adjust text to left margin only) .ad l .\" ----------------------------------------------------------------- .\" * MAIN CONTENT STARTS HERE * .\" ----------------------------------------------------------------- .SH "NAME" bogofilter \- fast Bayesian spam filter .SH "SYNOPSIS" .HP \w'\fBbogofilter\fR\ 'u \fBbogofilter\fR [help\ options | classification\ options | registration\ options | parameter\ options | info\ options] [general\ options] [config\ file\ options] .PP where .PP \fBhelp options\fR are: .HP \w'\ 'u [\-h] [\-\-help] [\-V] [\-Q] .PP \fBclassification options\fR are: .HP \w'\ 'u [\-p] [\-e] [\-t] [\-T] [\-u] [\-H] [\-M] [\-b] [\-B\ \fIobject\ \&.\&.\&.\fR] [\-R] [general\ options] [parameter\ options] [config\ file\ options] .PP \fBregistration options\fR are: .HP \w'\ 'u [\-s | \-n] [\-S | \-N] [general\ options] .PP \fBgeneral options\fR are: .HP \w'\ 'u [\-c\ \fIfilename\fR] [\-C] [\-d\ \fIdir\fR] [\-k\ \fIcachesize\fR] [\-l] [\-L\ \fItag\fR] [\-I\ \fIfilename\fR] [\-O\ \fIfilename\fR] .PP \fBparameter options\fR are: .HP \w'\ 'u [\-E\ \fIvalue\fR\fI[,value]\fR] [\-m\ \fIvalue\fR\fI[,value]\fR\fI[,value]\fR] [\-o\ \fIvalue\fR\fI[,value]\fR] .PP \fBinfo options\fR are: .HP \w'\ 'u [\-v] [\-y\ \fIdate\fR] [\-D] [\-x\ \fIflags\fR] .PP \fBconfig file options\fR are: .HP \w'\ 'u [\-\-\fIoption=value\fR] .PP Note: Use \fBbogofilter \-\-help\fR to display the complete list of options\&. .SH "DESCRIPTION" .PP Bogofilter is a Bayesian spam filter\&. In its normal mode of operation, it takes an email message or other text on standard input, does a statistical check against lists of "good" and "bad" words, and returns a status code indicating whether or not the message is spam\&. Bogofilter is designed with a fast algorithm, uses the Berkeley DB for fast startup and lookups, coded directly in C, and tuned for speed, so it can be used for production by sites that process a lot of mail\&. .SH "THEORY OF OPERATION" .PP Bogofilter treats its input as a bag of tokens\&. Each token is checked against a wordlist, which maintains counts of the numbers of times it has occurred in non\-spam and spam mails\&. These numbers are used to compute an estimate of the probability that a message in which the token occurs is spam\&. Those are combined to indicate whether the message is spam or ham\&. .PP While this method sounds crude compared to the more usual pattern\-matching approach, it turns out to be extremely effective\&. Paul Graham\*(Aqs paper \m[blue]\fBA Plan For Spam\fR\m[]\&\s-2\u[1]\d\s+2 is recommended reading\&. .PP This program substantially improves on Paul\*(Aqs proposal by doing smarter lexical analysis\&. Bogofilter does proper MIME decoding and a reasonable HTML parsing\&. Special kinds of tokens like hostnames and IP addresses are retained as recognition features rather than broken up\&. Various kinds of MTA cruft such as dates and message\-IDs are ignored so as not to bloat the wordlist\&. Tokens found in various header fields are marked appropriately\&. .PP Another improvement is that this program offers Gary Robinson\*(Aqs suggested modifications to the calculations (see the parameters robx and robs below)\&. These modifications are described in Robinson\*(Aqs paper \m[blue]\fBSpam Detection\fR\m[]\&\s-2\u[2]\d\s+2\&. .PP Since then, Robinson (see his Linux Journal article \m[blue]\fBA Statistical Approach to the Spam Problem\fR\m[]\&\s-2\u[3]\d\s+2) and others have realized that the calculation can be further optimized using Fisher\*(Aqs method\&. \m[blue]\fBAnother improvement\fR\m[]\&\s-2\u[4]\d\s+2 compensates for token redundancy by applying separate effective size factors (ESF) to spam and nonspam probability calculations\&. .PP In short, this is how it works: The estimates for the spam probabilities of the individual tokens are combined using the "inverse chi\-square function"\&. Its value indicates how badly the null hypothesis that the message is just a random collection of independent words with probabilities given by our previous estimates fails\&. This function is very sensitive to small probabilities (hammish words), but not to high probabilities (spammish words); so the value only indicates strong hammish signs in a message\&. Now using inverse probabilities for the tokens, the same computation is done again, giving an indicator that a message looks strongly spammish\&. Finally, those two indicators are subtracted (and scaled into a 0\-1\-interval)\&. This combined indicator (bogosity) is close to 0 if the signs for a hammish message are stronger than for a spammish message and close to 1 if the situation is the other way round\&. If signs for both are equally strong, the value will be near 0\&.5\&. Since those message don\*(Aqt give a clear indication there is a tristate mode in bogofilter to mark those messages as unsure, while the clear messages are marked as spam or ham, respectively\&. In two\-state mode, every message is marked as either spam or ham\&. .PP Various parameters influence these calculations, the most important are: .PP robx: the score given to a token which has not seen before\&. robx is the probability that the token is spammish\&. .PP robs: a weight on robx which moves the probability of a little seen token towards robx\&. .PP min\-dev: a minimum distance from \&.5 for tokens to use in the calculation\&. Only tokens farther away from 0\&.5 than this value are used\&. .PP spam\-cutoff: messages with scores greater than or equal to will be marked as spam\&. .PP ham\-cutoff: If zero or spam\-cutoff, all messages with values strictly below spam\-cutoff are marked as ham, all others as spam (two\-state)\&. Else values less than or equal to ham\-cutoff are marked as ham, messages with values strictly between ham\-cutoff and spam\-cutoff are marked as unsure; the rest as spam (tristate) .PP sp\-esf: the effective size factor (ESF) for spam\&. .PP ns\-esf: the ESF for nonspam\&. These ESF values default to 1\&.0, which is the same as not using ESF in the calculation\&. Values suitable to a user\*(Aqs email population can be determined with the aid of the bogotune program\&. .SH "OPTIONS" .PP HELP OPTIONS .PP The \fB\-h\fR option prints the help message and exits\&. .PP The \fB\-V\fR option prints the version number and exits\&. .PP The \fB\-Q\fR (query) option prints bogofilter\*(Aqs configuration, i\&.e\&. registration parameters, parsing options, bogofilter directory, etc\&. .PP CLASSIFICATION OPTIONS .PP The \fB\-p\fR (passthrough) option outputs the message with an X\-Bogosity line at the end of the message header\&. This requires keeping the entire message in memory when it\*(Aqs read from stdin (or from a pipe or socket)\&. If the message is read from a file that can be rewound, bogofilter will read it a second time\&. .PP The \fB\-e\fR (embed) option tells bogofilter to exit with code 0 if the message can be classified, i\&.e\&. if there is not an error\&. Normally bogofilter uses different codes for spam, ham, and unsure classifications, but this simplifies using bogofilter with procmail or maildrop\&. .PP The \fB\-t\fR (terse) option tells bogofilter to print an abbreviated spamicity message containing 1 letter and the score\&. Spam is indicated with "Y", ham by "N", and unsure by "U"\&. Note: the formatting can be customized using the config file\&. .PP The \fB\-T\fR provides an invariant terse mode for scripts to use\&. bogofilter will print an abbreviated spamicity message containing 1 letter and the score\&. Spam is indicated with "S", ham by "H", and unsure by "U"\&. .PP The \fB\-TT\fR provides an invariant terse mode for scripts to use\&. Bogofilter prints only the score and displays it to 16 significant digits\&. .PP The \fB\-u\fR option tells bogofilter to register the message\*(Aqs text after classifying it as spam or non\-spam\&. A spam message will be registered on the spamlist and a non\-spam message on the goodlist\&. If the classification is "unsure", the message will not be registered\&. Effectively this option runs bogofilter with the \fB\-s\fR or \fB\-n\fR flag, as appropriate\&. Caution is urged in the use of this capability, as any classification errors bogofilter may make will be preserved and will accumulate until manually corrected with the \fB\-Sn\fR and \fB\-Ns\fR option combinations\&. Note this option causes the database to be opened for write access, which can entail massive slowdowns through lock contention and synchronous I/O operations\&. .PP The \fB\-H\fR option tells bogofilter to not tag tokens from the header\&. This option is for testing, you should not use it in normal operation\&. .PP The \fB\-M\fR option tells bogofilter to process its input as a mbox formatted file\&. If the \fB\-v\fR or \fB\-t\fR option is also given, a spamicity line will be printed for each message\&. .PP The \fB\-b\fR (streaming bulk mode) option tells bogofilter to classify multiple objects whose names are read from stdin\&. If the \fB\-v\fR or \fB\-t\fR option is also given, bogofilter will print a line giving file name and classification information for each file\&. This is an alternative to \fB\-B\fR which lists objects on the command line\&. .PP An object in this context shall be a maildir (autodetected), or if it\*(Aqs not a maildir, a single mail unless \fB\-M\fR is given \- in that case it\*(Aqs processed as mbox\&. (The Content\-Length: header is not taken into account currently\&.) .PP When reading mbox format, bogofilter relies on the empty line after a mail\&. If needed, \fBformail \-es\fR will ensure this is the case\&. .PP The \fB\-B \fR\fB\fIobject \&.\&.\&.\fR\fR (bulk mode) option tells bogofilter to classify multiple objects named on the command line\&. The objects may be filenames (for single messages), mailboxes (files with multiple messages), or directories (of maildir and MH format)\&. If the \fB\-v\fR or \fB\-t\fR option is also given, bogofilter will print a line giving file name and classification information for each file\&. This is an alternative to \fB\-b\fR which lists objects on stdin\&. .PP The \fB\-R\fR option tells bogofilter to output an R data frame in text form on the standard output\&. See the section on integration with R, below, for further detail\&. .PP REGISTRATION OPTIONS .PP The \fB\-s\fR option tells bogofilter to register the text presented as spam\&. The database is created if absent\&. .PP The \fB\-n\fR option tells bogofilter to register the text presented as non\-spam\&. .PP Bogofilter doesn\*(Aqt detect if a message registered twice\&. If you do this by accident, the token counts will off by 1 from what you really want and the corresponding spam scores will be slightly off\&. Given a large number of tokens and messages in the wordlist, this doesn\*(Aqt matter\&. The problem \fIcan\fR be corrected by using the \fB\-S\fR option or the \fB\-N\fR option\&. .PP The \fB\-S\fR option tells bogofilter to undo a prior registration of the same message as spam\&. If a message was incorrectly entered as spam by \fB\-s\fR or \fB\-u\fR and you want to remove it and enter it as non\-spam, use \fB\-Sn\fR\&. If \fB\-S\fR is used for a message that wasn\*(Aqt registered as spam, the counts will still be decremented\&. .PP The \fB\-N\fR option tells bogofilter to undo a prior registration of the same message as non\-spam\&. If a message was incorrectly entered as non\-spam by \fB\-n\fR or \fB\-u\fR and you want to remove it and enter it as spam, then use \fB\-Ns\fR\&. If \fB\-N\fR is used for a message that wasn\*(Aqt registered as non\-spam, the counts will still be decremented\&. .PP GENERAL OPTIONS .PP The \fB\-c \fR\fB\fIfilename\fR\fR option tells bogofilter to read the config file named\&. .PP The \fB\-C\fR option prevents bogofilter from reading configuration files\&. .PP The \fB\-d \fR\fB\fIdir\fR\fR option allows you to set the directory for the database\&. See the ENVIRONMENT section for other directory setting options\&. .PP The \fB\-k \fR\fB\fIcachesize\fR\fR option sets the cache size for the BerkeleyDB subsystem, in units of 1 MiB (1,048,576 bytes)\&. Properly sizing the cache improves bogofilter\*(Aqs performance\&. The recommended size is one third of the size of the database file\&. You can run the bogotune script (in the tuning directory) to determine the recommended size\&. .PP The \fB\-l\fR option writes an informational line to the system log each time bogofilter is run\&. The information logged depends on how bogofilter is run\&. .PP The \fB\-L \fR\fB\fItag\fR\fR option configures a tag which can be included in the information being logged by the \fB\-l\fR option, but it requires a custom format that includes the %l string for now\&. This option implies \fB\-l\fR\&. .PP The \fB\-I \fR\fB\fIfilename\fR\fR option tells bogofilter to read its input from the specified file, rather than from \fBstdin\fR\&. .PP The \fB\-O \fR\fB\fIfilename\fR\fR option tells bogofilter where to write its output in passthrough mode\&. Note that this only works when \-p is explicitly given\&. .PP PARAMETER OPTIONS .PP The \fB\-E \fR\fB\fIvalue\fR\fI[,value]\fR\fR option allows setting the sp\-esf value and the ns\-esf value\&. With two values, both sp\-esf and ns\-esf are set\&. If only one value is given, parameters are set as described in the note below\&. .PP The \fB\-m \fR\fB\fIvalue\fR\fI[,value]\fR\fI[,value]\fR\fR option allows setting the min\-dev value and, optionally, the robs and robx values\&. With three values, min\-dev, robs, and robx are all set\&. If fewer values are given, parameters are set as described in the note below\&. .PP The \fB\-o \fR\fB\fIvalue\fR\fI[,value]\fR\fR option allows setting the spam\-cutoff ham\-cutoff values\&. With two values, both spam\-cutoff and ham\-cutoff are set\&. If only one value is given, parameters are set as described in the note below\&. .PP Note: All of these options allow fewer values to be provided\&. Values can be skipped by using just the comma delimiter, in which case the corresponding parameter(s) won\*(Aqt be changed\&. If only the first value is provided, then only the first parameter is set\&. Trailing values can be skipped, in which case the corresponding parameters won\*(Aqt be changed\&. Within the parameter list, spaces are not allowed after commas\&. .PP INFO OPTIONS .PP The \fB\-v\fR option produces a report to standard output on bogofilter\*(Aqs analysis of the input\&. Each additional \fBv\fR will increase the verbosity of the output, up to a maximum of 4\&. With \fB\-vv\fR, the report lists the tokens with highest deviation from a mean of 0\&.5 association with spam\&. .PP Option \fB\-y date\fR can be used to override the current date when timestamping tokens\&. A value of zero (0) turns off timestamping\&. .PP The \fB\-D\fR option redirects debug output to stdout\&. .PP The \fB\-x \fR\fB\fIflags\fR\fR option allows setting of debug flags for printing debug information\&. See header file debug\&.h for the list of usable flags\&. .PP CONFIG FILE OPTIONS .PP Using GNU longopt \fB\-\-\fR syntax, a config file\*(Aqs \fB\fIname=value\fR\fR statement becomes a command line\*(Aqs \fB\-\-\fR\fB\fIoption=value\fR\fR\&. Use command \fBbogofilter \-\-help\fR for a list of options and see bogofilter\&.cf\&.example for more info on them\&. For example to change the X\-Bogosity header to "X\-Spam\-Header", use: .PP \fB\fI\-\-spam\-header\-name=X\-Spam\-Header\fR\fR .SH "ENVIRONMENT" .PP Bogofilter uses a database directory, which can be set in the config file\&. If not set there, bogofilter will use the value of \fBBOGOFILTER_DIR\fR\&. Both can be overridden by the \fB\-d \fR\fB\fIdir\fR\fR option\&. If none of that is available, bogofilter will use directory $HOME/\&.bogofilter\&. .SH "CONFIGURATION" .PP The bogofilter command line allows setting of many options that determine how bogofilter operates\&. File /etc/bogofilter\&.cf can be used to set additional parameters that affect its operation\&. File /etc/bogofilter\&.cf\&.example has samples of all of the parameters\&. Status and logging messages can be customized for each site\&. .SH "RETURN VALUES" .PP 0 for spam; 1 for non\-spam; 2 for unsure ; 3 for I/O or other errors\&. .PP If both \fB\-p\fR and \fB\-e\fR are used, the return values are: 0 for spam or non\-spam; 3 for I/O or other errors\&. .PP Error 3 usually means that the wordlist file bogofilter wants to read at startup is missing or the hard disk has filled up in \fB\-p\fR mode\&. .SH "INTEGRATION WITH OTHER TOOLS" .PP Use with procmail .PP The following recipe (a) spam\-bins anything that bogofilter rates as spam, (b) registers the words in messages rated as spam as such, and (c) registers the words in messages rated as non\-spam as such\&. With this in place, it will normally only be necessary for the user to intervene (with \fB\-Ns\fR or \fB\-Sn\fR) when bogofilter miscategorizes something\&. .sp .if n \{\ .RS 4 .\} .nf # filter mail through bogofilter, tagging it as Ham, Spam, or Unsure, # and updating the wordlist :0fw | bogofilter \-u \-e \-p # if bogofilter failed, return the mail to the queue; # the MTA will retry to deliver it later # 75 is the value for EX_TEMPFAIL in /usr/include/sysexits\&.h :0e { EXITCODE=75 HOST } # file the mail to spam\-bogofilter if it\*(Aqs spam\&. :0: * ^X\-Bogosity: Spam, tests=bogofilter spam\-bogofilter # file the mail to unsure\-bogofilter # if it\*(Aqs neither ham nor spam\&. :0: * ^X\-Bogosity: Unsure, tests=bogofilter unsure\-bogofilter # With this recipe, you can train bogofilter starting with an empty # wordlist\&. Be sure to check your unsure\-folder regularly, take the # messages out of it, classify them as ham (or spam), and use them to # train bogofilter\&. .fi .if n \{\ .RE .\} .PP The following procmail rule will take mail on stdin and save it to file spam if bogofilter thinks it\*(Aqs spam: .sp .if n \{\ .RS 4 .\} .nf :0HB: * ? bogofilter spam .fi .if n \{\ .RE .\} .sp and this similar rule will also register the tokens in the mail according to the bogofilter classification: .sp .if n \{\ .RS 4 .\} .nf :0HB: * ? bogofilter \-u spam .fi .if n \{\ .RE .\} .PP If bogofilter fails (returning 3) the message will be treated as non\-spam\&. .PP This one is for maildrop, it automatically defers the mail and retries later when the xfilter command fails, use this in your ~/\&.mailfilter: .sp .if n \{\ .RS 4 .\} .nf xfilter "bogofilter \-u \-e \-p" if (/^X\-Bogosity: Spam, tests=bogofilter/) { to "spam\-bogofilter" } .fi .if n \{\ .RE .\} .PP The following \&.muttrc lines will create mutt macros for dispatching mail to bogofilter\&. .sp .if n \{\ .RS 4 .\} .nf macro index d "unset wait_key\en\e bogofilter \-n\en\e set wait_key\en\e " "delete message as non\-spam" macro index \eed "unset wait_key\en\e bogofilter \-s\en\e set wait_key\en\e " "delete message as spam" .fi .if n \{\ .RE .\} .PP Integration with Mail Transport Agent (MTA) .sp .RS 4 .ie n \{\ \h'-04' 1.\h'+01'\c .\} .el \{\ .sp -1 .IP " 1." 4.2 .\} bogofilter can also be integrated into an MTA to filter all incoming mail\&. While the specific implementation is MTA dependent, the general steps are as follows: .RE .sp .RS 4 .ie n \{\ \h'-04' 2.\h'+01'\c .\} .el \{\ .sp -1 .IP " 2." 4.2 .\} Install bogofilter on the mail server .RE .sp .RS 4 .ie n \{\ \h'-04' 3.\h'+01'\c .\} .el \{\ .sp -1 .IP " 3." 4.2 .\} Prime the bogofilter databases with a spam and non\-spam corpus\&. Since bogofilter will be serving a larger community, it is important to prime it with a representative set of messages\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 4.\h'+01'\c .\} .el \{\ .sp -1 .IP " 4." 4.2 .\} Set up the MTA to invoke bogofilter on each message\&. While this is an MTA specific step, you\*(Aqll probably need to use the \fB\-p\fR, \fB\-u\fR, and \fB\-e\fR options\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 5.\h'+01'\c .\} .el \{\ .sp -1 .IP " 5." 4.2 .\} Set up a mechanism for users to register spam/non\-spam messages, as well as to correct mis\-classifications\&. The most generic solution is to set up alias email addresses to which users bounce messages\&. .RE .sp .RS 4 .ie n \{\ \h'-04' 6.\h'+01'\c .\} .el \{\ .sp -1 .IP " 6." 4.2 .\} See the doc and contrib directories for more information\&. .RE .PP Use of R to verify bogofilter\*(Aqs calculations .PP The \-R option tells bogofilter to generate an R data frame\&. The data frame contains one row per token analyzed\&. Each such row contains the token, the sum of its database "good" and "spam" counts, the "good" count divided by the number of non\-spam messages used to create the training database, the "spam" count divided by the spam message count, Robinson\*(Aqs f(w) for the token, the natural logs of (1 \- f(w)) and f(w), and an indicator character (+ if the token\*(Aqs f(w) value exceeded the minimum deviation from 0\&.5, \- if it didn\*(Aqt)\&. There is one additional row at the end of the table that contains a label in the token field, followed by the number of words actually used (the ones with + indicators), Robinson\*(Aqs P, Q, S, s and x values and the minimum deviation\&. .PP The R data frame can be saved to a file and later read into an R session (see \m[blue]\fBthe R project website\fR\m[]\&\s-2\u[5]\d\s+2 for information about the mathematics package R)\&. Provided with the bogofilter distribution is a simple R script (file bogo\&.R) that can be used to verify bogofilter\*(Aqs calculations\&. Instructions for its use are included in the script in the form of comments\&. .SH "LOG MESSAGES" .PP Bogofilter writes messages to the system log when the \fB\-l\fR option is used\&. What is written depends on which other flags are used\&. .PP A classification run will generate (we are not showing the date and host part here): .sp .if n \{\ .RS 4 .\} .nf bogofilter[1412]: X\-Bogosity: Ham, spamicity=0\&.000227 bogofilter[1415]: X\-Bogosity: Spam, spamicity=0\&.998918 .fi .if n \{\ .RE .\} .PP Using \fB\-u\fR to classify a message and update a wordlist will produce (one a single line): .sp .if n \{\ .RS 4 .\} .nf bogofilter[1426]: X\-Bogosity: Spam, spamicity=0\&.998918, register \-s, 329 words, 1 messages .fi .if n \{\ .RE .\} .PP Registering words (\fB\-l\fR and \fB\-s\fR, \fB\-n\fR, \fB\-S\fR, or \fB\-N\fR) will produce: .sp .if n \{\ .RS 4 .\} .nf bogofilter[1440]: register\-n, 255 words, 1 messages .fi .if n \{\ .RE .\} .PP A registration run (using \fB\-s\fR, \fB\-n\fR, \fB\-N\fR, or \fB\-S\fR) will generate messages like: .sp .if n \{\ .RS 4 .\} .nf bogofilter[17330]: register\-n, 574 words, 3 messages bogofilter[6244]: register\-s, 1273 words, 4 messages .fi .if n \{\ .RE .\} .SH "FILES" .PP /etc/bogofilter\&.cf .RS 4 System configuration file\&. .RE .PP ~/\&.bogofilter\&.cf .RS 4 User configuration file\&. .RE .PP ~/\&.bogofilter/wordlist\&.db .RS 4 Combined list of good and spam tokens\&. .RE .SH "AUTHOR" .sp .if n \{\ .RS 4 .\} .nf Eric S\&. Raymond \&. David Relson \&. Matthias Andree \&. Greg Louis \&. .fi .if n \{\ .RE .\} .PP For updates, see the \m[blue]\fBbogofilter project page\fR\m[]\&\s-2\u[6]\d\s+2\&. .SH "SEE ALSO" .PP bogolexer(1), bogotune(1), bogoupgrade(1), bogoutil(1) .SH "NOTES" .IP " 1." 4 A Plan For Spam .RS 4 \%http://www.paulgraham.com/spam.html .RE .IP " 2." 4 Spam Detection .RS 4 \%http://radio-weblogs.com/0101454/stories/2002/09/16/spamDetection.html .RE .IP " 3." 4 A Statistical Approach to the Spam Problem .RS 4 \%http://www.linuxjournal.com/article/6467 .RE .IP " 4." 4 Another improvement .RS 4 \%http://www.garyrobinson.net/2004/04/improved%5fchi.html .RE .IP " 5." 4 the R project website .RS 4 \%http://cran.r-project.org/ .RE .IP " 6." 4 bogofilter project page .RS 4 \%http://bogofilter.sourceforge.net/ .RE