.\" $Id: dspam_train.1,v 1.10 2011/06/28 00:13:48 sbajic Exp $ .\" -*- nroff -*- .\" .\" dspam_train3.9 .\" .\" Authors: Jonathan A. Zdziarski .\" Stevan Bajic .\" .\" Copyright (C) 2002-2011 DSPAM Project .\" All rights reserved .\" .TH dspam_train 1 "Apr 17, 2010" "DSPAM" "DSPAM" .SH NAME dspam_train \- train a corpus of mail .SH SYNOPSIS .na .B dspam_train [\c .BI username\fR\c ] [\c .BI \--client\fR\c ] [\c .BI \-i\ \fR\c index|\c .BI spam_corpus\fR\c \ \c .BI nonspam_corpus\fR\c ] .ad .SH DESCRIPTION .LP .B dspam_train is used to train and test a corpus of mail (in maildir or MBOX format). This tool will present each message to DSPAM for a classification and then retrain only if the message was incorrect. This provides close to real\-world training and should be used to build pretrained databases. Upon execution, the tool will automatically determine the ratio of spam:nonspam and train based on that ratio to ensure both corpora are trained consecutively. This tool can also be used as a test jig to measure the efficiency and accuracy of a particular corpus against DSPAM in a given configuration. .SH OPTIONS .LP .ne 3 .TP .ne 3 .TP .BI \--client\c If specified, DSPAM is used in client\-server mode. .ne 3 .TP .BI username\c Specifies the user to train, if omitted the current user name is used. .ne 3 .TP .BI \-i\fR\ index\c Use a index file instead of the usual spam_corpus and nonspam_corpus. .B index : Path to the index file having the following format per line: .br [class] [path to message] .ne 3 .TP .BI spam_corpus\c Specifies either the pathname to the directory containing the corpus of spam, with each in a separate file (e.g. maildir format) or a path to the mailbox in the traditional Unix MBOX format. .ne 3 .TP .BI nonspam_corpus\c Specifies either the pathname to the directory containing the corpus of nonspam with each message in a separate file or a path to the mailbox in the traditional Unix MBOX format. 
.SH EXIT VALUE
.LP
.ne 3
.PD 0
.TP
.B 0
Operation was successful.
.ne 3
.TP
.B other
Operation resulted in an error.
.PD
.SH COPYRIGHT
Copyright \(co 2002\-2011 DSPAM Project
.br
All rights reserved.
.br
For more information, see http://dspam.sourceforge.net.
.SH SEE ALSO
.BR dspam (1),
.BR dspam_admin (1),
.BR dspam_clean (1),
.BR dspam_crc (1),
.BR dspam_dump (1),
.BR dspam_logrotate (1),
.BR dspam_merge (1),
.BR dspam_stats (1)