
filtermail(1) filtermail - extensive mail filter filtermail(1)

NAME

filtermail - Filter incoming e-mail to accepted, spam or ignored

SYNOPSIS

filtermail [OPTIONS] base
[OPTIONS] - cf. section OPTIONS
base: the absolute path to the user’s home directory

Paths specified with options or in the configuration file may start with ~/, in which case ~ is replaced by base.

DESCRIPTION

Filtermail filters incoming e-mail as either accepted, spam, or ignored e-mail. It uses rule files, which are inspected in sequence until the incoming e-mail matches a rule. Once that happens the rule’s associated action (accept, spam, or ignore) is executed. If the e-mail is not matched by any rule then the e-mail is accepted.

Alternatively, when using option --inspect filtermail can be used to find the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address (see sections OPTIONS and FILTERMAIL INSPECT below).

Accepted e-mail is normally appended to the mail file used by the incoming mail server when receiving mail for the current user. E.g., if the user’s username is frank then incoming mail is appended to the file /var/mail/frank. Users may also define directories to contain saved e-mails (e.g., ~/Mail), and filtermail can be configured to append e-mail considered spam to, e.g., ~/Mail/spam. Likewise, e-mail matching the ’ignore’ criteria could be appended to ~/Mail/ignore.

Instead of appending the complete e-mail to its destination file, only the received e-mail’s From: and Subject: headers can be appended. This is achieved by prefixing :HDRS: to the name of the destination file. Alternatively, such e-mail can be discarded completely by not specifying a destination file. Merely logging the From: and Subject: headers may come in handy if the received e-mail is also kept elsewhere (e.g., in another account, which forwards the received e-mail to the computer running filtermail) and the ignoring rules might occasionally produce false positives (see also section OPTIONS below).

Filtermail uses three types of files:

The configuration file contains values of options which are generally used (see sections CONFIGURATION and OPTIONS);
Mail filtering rules are hierarchically ordered in the rules file: incoming mail is sequentially matched against the patterns defined in files specified in the rules file until a match is found. Once a match has been found the rule’s action (accept, ignore or spam) is executed, ending the filtering process (see section RULES);
Each file specified in the rules file defines matching patterns, which are tested sequentially. Testing those patterns ends once the incoming mail matches a pattern. The result of the matching is forwarded to the rules file which may result in executing the rule’s action (see section PATTERNS).

If filtermail detects a syntax error in the rules file or in a rule specification file the incoming mail is accepted. To avoid this situation the --syntax option (see section OPTIONS) should be used when modifying, adding or removing rule files to verify that the specified rules were correctly formulated.

To use filtermail the incoming mail server must recognize it as a valid mail handling program (see section EXAMPLES).

CONFIGURATION

Options (see section OPTIONS) not flagged with `NO-CONFIG’ can also be specified in a configuration file. By default the configuration file ~/.filtermail/config is used.

Command line options always take precedence over specifications in the configuration file.

The configuration file must exist, but may be empty. It must exist because its directory defines the directory where the files defining the filtering rules are located. Empty lines and the content of lines starting at the #-character are ignored.
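
The comment and empty-line handling described above can be sketched in Python. This is only an illustration, not filtermail's own parser: the function name parse_config and the choice to collect values in lists (so that repeated dont-expire: lines are kept) are assumptions.

```python
from typing import Dict, List

def parse_config(text: str) -> Dict[str, List[str]]:
    """Collect 'key: value' lines, ignoring empty lines and #-comments."""
    options: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        # everything from the #-character to the end of the line is ignored
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        # split at the first colon only, so values may contain colons
        key, _, value = line.partition(":")
        options.setdefault(key.strip(), []).append(value.strip())
    return options

sample = """\
rules: rules
log: USER:NOTICE
ignore: :HDRS:~/Mail/ignore   # headers only

#dont-expire: ./ignore/from
dont-expire: ./spam/cidr
dont-expire: ./match/noto
"""
config = parse_config(sample)
```

Splitting at the first colon keeps values such as USER:NOTICE or :HDRS:~/Mail/ignore intact.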

Option --expire is used to remove patterns whose date stamps (cf. section PATTERNS) indicate dates before the date specified by --expire. The configuration file may contain dont-expire: lines. Each dont-expire: line specifies the name of a pattern file whose entries don’t expire. Files specified in dont-expire: lines need not exist, but to avoid inspecting pattern files for expired dates the name(s) of those files must be identical to the names of the files used in the rules file (cf. section RULES below).
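
As an illustration of the date comparison behind --expire, the following Python sketch separates pattern lines into kept and expired ones. The names parse_date and split_expired are hypothetical, and the sketch assumes that every line starts with the Nr and Date fields described in section PATTERNS.

```python
from datetime import date, datetime
from typing import List, Tuple

def parse_date(text: str) -> date:
    """Convert a yy-mm-dd date stamp into a date object."""
    return datetime.strptime(text, "%y-%m-%d").date()

def split_expired(lines: List[str], expire: str) -> Tuple[List[str], List[str]]:
    """Return (kept, expired); a pattern expires if its date is before expire."""
    limit = parse_date(expire)
    kept, expired = [], []
    for line in lines:
        stamp = parse_date(line.split()[1])  # second field is the date
        (expired if stamp < limit else kept).append(line)
    return kept, expired

patterns = [
    "  1 23-01-15 s 'noreply@'",
    " 12 23-03-31 n 'news.*@reply'",
]
kept, expired = split_expired(patterns, "23-01-31")
```

Note that Python's %y conversion maps two-digit years 69..99 to the 1900s, which may differ from how filtermail interprets them.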

Other files in the configuration file may use the [~]/path format or must be plain filenames (i.e., not starting with /-characters). Plain files are relative to the directory containing the configuration file (cf. section EXAMPLES).

RULES

All mail filtering rules are defined in the rules file. Mail filtering starts at the first rule until either the incoming e-mail matches a rule, or until all rules have been processed and the e-mail does not match any rule. In the latter case the e-mail is considered accepted.
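
The first-match behaviour can be modelled in a few lines of Python. This is a minimal sketch, not filtermail's implementation: the names Rule and classify are hypothetical, and conditions are represented by arbitrary callables instead of pattern files.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Headers = Dict[str, List[str]]  # header name -> contents of those headers

@dataclass
class Rule:
    # all conditions must hold for the rule to match (the 'and' of a rule)
    conditions: List[Callable[[Headers], bool]]
    action: str  # "accept", "ignore" or "spam"

def classify(rules: List[Rule], headers: Headers) -> str:
    """Return the action of the first matching rule; the default is accept."""
    for rule in rules:
        if all(condition(headers) for condition in rule.conditions):
            return rule.action
    return "accept"

rules = [
    Rule([lambda h: any("noreply@" in v for v in h.get("From:", []))], "ignore"),
    Rule([lambda h: any(v.isupper() for v in h.get("Subject:", []))], "spam"),
]
```

Because processing stops at the first matching rule, the order of the rules in the rules file determines the outcome when several rules could match.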

Empty lines and lines whose first non-blank character is a #-character are ignored. The rules themselves cannot contain #-characters.

Rules are written according to the following syntax. Elements between square brackets are optional; the content of bracketed sections followed by a * character may be repeated (not using the square brackets). Lowercase words are keywords and cannot be used otherwise. Capitalized words are described below. Each rule is specified on its own line; line continuation (using, e.g., \ at the end of a line) is not supported. Here’s the rule’s syntax:


if Header File [and Header File]* Action

Header is the name of a mail header (e.g., From:, Received:). Header specifications must be identical to the first words of header lines of the received e-mail. So to match the From header in the e-mail’s first line specify From (i.e., no colon). Some headers have variants. E.g., Received: and Received-SPF:. To select all headers sharing their initial characters append a +-character to the initial part (e.g., to select all headers starting with Received use Received+, and use From+ to select all From: headers including the e-mail’s first line).
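
The difference between an exact Header specification and one ending in + can be illustrated as follows. The function select_headers is a hypothetical helper, written under the assumption that header lines are available as plain strings.

```python
from typing import List

def select_headers(spec: str, header_lines: List[str]) -> List[str]:
    """Return the header lines selected by a rule's Header specification."""
    if spec.endswith("+"):
        # Received+ selects every line starting with 'Received'
        prefix = spec[:-1]
        return [line for line in header_lines if line.startswith(prefix)]
    # otherwise the first word of the line must be identical to spec
    return [line for line in header_lines if line.split(None, 1)[:1] == [spec]]

lines = [
    "From sender@example.org Wed May 10 12:00:00 2023",
    "Received: from a.example.org",
    "Received-SPF: pass",
    "From: Sender <sender@example.org>",
]
```

With these lines, From selects only the e-mail's first line, From: only the From: header, and From+ selects both.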

File is the name of the file containing patterns to inspect. Filenames must start with ./ and define the locations of files below the configuration file’s directory. E.g., ./spam/subject specifies the file subject in a subdirectory spam containing patterns considered by the rule.

Action specifies the action to execute when e-mail matches a rule. Action can be

accept:
the e-mail is accepted (i.e., appended to the accept location specified in the configuration file)
ignore:
the e-mail is ignored (i.e., appended to the ignore location specified in the configuration file)
spam:
the e-mail is considered spam (i.e., appended to the spam location specified in the configuration file)

PATTERNS

Files specified in rules files define patterns which may be found in the headers defined by the rules. The header lines which are selected by the Header specifications in the rules file are matched against those patterns after removing the header labels from those lines. So Subject: hello world is passed to the patterns as the (trimmed) line hello world. Once a header’s content matches a pattern inspection ends with a successful match (the rule itself may specify not, in which case a successful match results in a failing match of the rule).

Pattern files may start with file-specific comment (i.e., empty lines and lines whose first non-blank character is #) up to a comment line equal to #=. The patterns themselves may also be preceded by comment lines. Once a pattern is matched it is moved one position upward in the pattern file (including its associated comment).

Pattern specifications use the following syntax (elements between square brackets are optional (when used, the square brackets are not specified), capitalized words are described below, each pattern is defined on a single line, line continuation is not supported):


Nr Date Expression [and Expression]*

This pattern indicates a match when all Expressions match.

This syntax uses the following elements:

Nr: specifies the number of times e-mail was successfully matched by the pattern. At each successful match the count is incremented (up to a maximum of 999), and the pattern rises one position in the pattern file. The field width of the number field must be at least 3 character positions wide. When adding a pattern use 1;
Date: shows the date of the most recent match. Its format is yy-mm-dd, e.g. 23-03-31. When adding a pattern the current date could be used.
Expression: defines at least one matching mode and associated expression.
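
The bookkeeping performed after a successful match (incrementing Nr, updating Date, and moving the pattern up one position) might look like this in Python. The names register_match and format_pattern are hypothetical; for brevity the sketch does not move any comment lines associated with a pattern.

```python
from typing import List

def register_match(patterns: List[list], index: int, today: str) -> None:
    """Update count and date of patterns[index] and move it up one position."""
    count, _old_date, expressions = patterns[index]
    patterns[index] = [min(count + 1, 999), today, expressions]  # capped at 999
    if index > 0:
        patterns[index - 1], patterns[index] = patterns[index], patterns[index - 1]

def format_pattern(count: int, date: str, expressions: str) -> str:
    """Write a pattern line using a count field of at least 3 positions."""
    return f"{count:3d} {date} {expressions}"

patterns = [
    [5, "23-01-01", "s 'noreply@'"],
    [999, "23-02-02", "i 'yes'"],
]
register_match(patterns, 1, "23-05-10")
```

Frequently matching patterns thus gradually move towards the top of their pattern file, where they are tested first.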

The Expressions themselves use the following syntax:

MatchMode [not] Spec
The selected headers are matched against Spec using the specified MatchMode. The not keyword is optional. When specified (omit the square brackets) the result of the match is negated. When multiple Expression specifications are joined by and keywords, then the final pattern results in a match if all Expression specifications indicate a match. Once an Expression does not indicate a match, then subsequent Expressions are not evaluated and headers do not match the pattern.

When using not the e-mail may not match the Spec specification. E.g., when e-mail should contain a To: header or a Cc: header the following rule can be used (cf. section EXAMPLES):


if To: ./match/noto and Cc: ./match/noto spam
with match/noto:

1 23-05-10 p not ’.’

Note that Spec must be surrounded by single quotes. To use a single quote inside a Spec escape it (as \’). In general: the character following a backslash is used as-is, removing the backslash from the Spec (e.g., to construct a Spec containing ’\n specify \’\\n).
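
To make the pattern syntax and the backslash rule concrete, here is a small Python parser for a single pattern line. It is a sketch under assumptions: the names Expression, Pattern, unescape and parse_pattern are hypothetical, Spec is assumed to be delimited by plain ASCII single quotes, and the strictness of filtermail's own parser may differ.

```python
import re
from typing import List, NamedTuple

class Expression(NamedTuple):
    mode: str     # one of c, i, n, p, s, $
    negate: bool  # True when the not keyword was used
    spec: str     # Spec with its backslash escapes resolved

class Pattern(NamedTuple):
    count: int
    date: str
    expressions: List[Expression]

HEAD = re.compile(r"\s*(\d+)\s+(\d\d-\d\d-\d\d)")
# MatchMode, optional not, then a quoted Spec in which \x stands for x
EXPR = re.compile(r"\s*([cinps$])\s+(not\s+)?'((?:\\.|[^'\\])*)'")
AND = re.compile(r"\s+and\s")

def unescape(spec: str) -> str:
    """Drop each backslash, keeping the character that follows it."""
    return re.sub(r"\\(.)", r"\1", spec)

def parse_pattern(line: str) -> Pattern:
    head = HEAD.match(line)
    if head is None:
        raise ValueError("expected: Nr Date Expression [and Expression]*")
    pos = head.end()
    expressions = []
    while True:
        expr = EXPR.match(line, pos)
        if expr is None:
            raise ValueError(f"invalid Expression at column {pos}")
        mode, negate, spec = expr.groups()
        expressions.append(Expression(mode, negate is not None, unescape(spec)))
        pos = expr.end()
        joiner = AND.match(line, pos)
        if joiner is None:
            break
        pos = joiner.end()
    if line[pos:].strip():
        raise ValueError(f"unexpected text at column {pos}")
    return Pattern(int(head.group(1)), head.group(2), expressions)
```

For example, parse_pattern("  1 23-05-10 p not '.'") yields a pattern with count 1 and a single negated p expression whose Spec is a dot.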

There are six types of MatchModes. MatchModes using regular expressions use extended regular expression patterns: prefix multipliers and bounding-characters by backslashes when they should be interpreted as ordinary characters (i.e., *, +, ?, ^, $, |, (, ), [, ], {, } should be escaped when used as literal characters).

c: (cidr) matches \[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\] IP address patterns found in selected headers against the Spec pattern, e.g., ’1.22.333.0/24’;
i: (ignore case) matches the selected headers case-insensitively against the Spec pattern, e.g., ’yes’;
n: (no-case pattern) matches the selected headers case-insensitively against the Spec regular expression pattern, e.g., ’news.*\.*@reply’;
p: (pattern) matches the selected headers (case-sensitively) against the Spec regular expression pattern;
s: (string text) matches the selected headers as specified (case-sensitively) against the Spec pattern, e.g., ’noreply@’. This matchmode can also be used to match IP version 6 patterns if the cidr-range uses multiples of 16-bit values, e.g., ’[2a02:7a60:’;
$: (script or program): Spec calls a script or program. The selected headers are passed to the script or program, which must return 0 to indicate a match and any other value to indicate no match. The Spec may specify arguments for the script or program; two additional arguments are then appended: the first is the name of a file containing the selected e-mail headers, the second is the name of the file containing the e-mail’s content. The PATH environment variable is not used to locate the script or program, so its path (possibly starting with ~/) must be specified. When Spec starts with the word SHELL then the script or program is called as /bin/sh -c ’script/program [arguments]’. Examples:

# call program in the user’s bin dir. with 4 arguments
’~/bin/program arg1, arg2’
# call script in the user’s bin dir. with 4 arguments
# via /bin/sh
’SHELL ~/bin/script arg1, arg2’
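
The c (cidr) match mode can be approximated with Python's standard ipaddress module. The sketch below is not filtermail's implementation; the name cidr_match is hypothetical, and bracketed IP version 4 addresses are located using the default notation mentioned for the c match mode.

```python
import ipaddress
import re

# bracketed IP4 addresses, e.g. [104.223.188.228]
IP4 = re.compile(r"\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]")

def cidr_match(header: str, spec: str) -> bool:
    """True if an IP4 address found in header lies inside the cidr range."""
    network = ipaddress.ip_network(spec, strict=False)
    for text in IP4.findall(header):
        try:
            if ipaddress.ip_address(text) in network:
                return True
        except ValueError:  # not a valid address, e.g. an octet above 255
            continue
    return False

header = "from renxincj.com (unknown [104.223.188.228])"
```

Here cidr_match(header, "104.223.128.0/17") holds, because the first 17 bits of 104.223.188.228 equal those of 104.223.128.0.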

OPTIONS

Short options, when defined, are provided between parentheses immediately following their long option equivalents. Several parameters specify locations of files written or used by filtermail. If a location specification starts with ~/ then the tilde-character is replaced by the base directory specified as filtermail’s argument. Otherwise, if the location does not start with a slash (/) character then the location is prefixed by the path of the directory containing the configuration file.
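
The location rules can be summarized in a short Python sketch. The helper names split_destination and resolve are hypothetical; base stands for filtermail's argument and config_dir for the directory containing the configuration file.

```python
import posixpath
from typing import Tuple

def split_destination(value: str) -> Tuple[bool, str]:
    """Return (headers_only, path) for an [:HDRS:]path specification."""
    prefix = ":HDRS:"
    if value.startswith(prefix):
        return True, value[len(prefix):]
    return False, value

def resolve(location: str, base: str, config_dir: str) -> str:
    """Apply the ~/, absolute and relative location rules."""
    if location.startswith("~/"):
        return posixpath.join(base, location[2:])  # ~ is replaced by base
    if location.startswith("/"):
        return location                            # absolute: used as-is
    return posixpath.join(config_dir, location)    # relative to the config dir

base = "/home/frank"
config_dir = "/home/frank/etc/filtermail"
```

For example, resolve("log/log", base, config_dir) produces /home/frank/etc/filtermail/log/log, whereas resolve("~/Mail/spam", base, config_dir) produces /home/frank/Mail/spam.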

Some options can also be specified in the configuration file (cf. section CONFIGURATION). Options that cannot be specified in the configuration file are marked as NO-CONFIG.

--accept [:HDRS:]path
the path receiving accept-marked mail. Mail not matching any rules is also accepted. Prepend :HDRS: to path to log only the mail’s From: and Subject: headers to path. To completely ignore accept-marked mail omit this option (although ignoring accept-marked mail is probably undesirable). Examples (USER, used in the examples, is commonly replaced by the user’s username):

--accept /var/mail/USER
--accept :HDRS:/var/mail/USER

--cls
By default filtermail clears the terminal’s screen (calling tput clear). If the screen should not be cleared option --cls no can be specified. A default value can also be specified in the configuration file, using

cls: no
in which case option --cls yes can be specified to overrule the configuration file’s specification. This option is only used when the --inspect option is specified.
NO-CONFIG --config=path (-c)
the path to the configuration file (default: ~/etc/mailfilter/config). See also section CONFIGURATION above.
NO-CONFIG --expire=date (-e)
no mail is read, but patterns having date stamps older than date are removed from their files and are stored in files having extension .exp (cf. section PATTERNS above). Use format yy-mm-dd when specifying date. This option and --interactive (see below) cannot both be specified. Example:

--expire 23-01-31

NO-CONFIG --help (-h)
a summary of filtermail’s usage is written to cout and filtermail ends, returning 0 to the shell.
--ignore [:HDRS:]path
the path receiving ignore-marked mail. Prepend :HDRS: to path to log the mail’s From: and Subject: headers to path. To completely ignore ignore-marked mail omit this option. Examples:

--ignore ~/Mail/ignore
--ignore :HDRS:~/Mail/ignore

NO-CONFIG --inspect (-I)
when this option is specified filtermail expects a received e-mail file at its standard input, and shows the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address. E.g.,

from renxincj.com (unknown [104.223.188.228])
IP = `104.223.188.228’, Country: US, CIDR = 104.223.128.0/17
See section FILTERMAIL INSPECT for additional info.
NO-CONFIG --interactive (-i)
the matching result of matching patterns is interactively simulated (no mail is read). This option and --expire cannot both be specified.
--IP4-pattern=regex (-p)
by default IP4 patterns are recognized using the regular expression pattern \[((\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}))\], matching addresses like [123.11.22.33] in specified headers. This format is commonly encountered in Received: headers. When different IP4-patterns should be recognized then this option can be used to specify another regular expression for matching IP4-patterns in headers.
The required regex is an extended regular expression in which the characters *, +, ?, ^, $, |, (, ), [, ], {, and } must be escaped when used as literal characters. Perl-like escape sequences \b (word-boundary), \d (digit character), \s (white-space character), and \w (alpha-numeric character) are supported. Their capital variants represent their complementary sets.
The regexes specified using this option must have at least five elements, addressable using indices 1..5:
* element 1 defines the complete IP4 pattern,
* element 2 defines the most significant (4th) octet (available in bits 24 thru 31 in the converted binary address),
* element 3 defines the 3rd IP4 octet (available in bits 16 thru 23 in the converted binary address),
* element 4 defines the 2nd IP4 octet (available in bits 8 thru 15 in the converted binary address),
* element 5 defines the least significant (1st) IP4 octet (available in bits 0 thru 7 in the converted binary address).
--log=spec (-l)
specify the log-facility. By default no logging is used.
The spec argument can be the location of a file receiving the log messages.
If spec uses the format facility:level (facility and level being the names defined in the syslog(3) man-page omitting their LOG_ prefixes) then the log messages are written by syslog. Examples:

--log log/log # uses log/log in the configuration
# file’s directory
--log USER:NOTICE # uses syslog

--preamble
merely verify that the basic options are correctly specified. No mail is read, but all options and configuration file specifications are processed. When the verification successfully ends the message

Preamble successfully completed
is written to cout and filtermail returns 0.
--received=content (-R)
before incoming mail reaches your computer another mail server (e.g., your organization’s mail server) may have processed it, and it may be OK to inspect only the Received: headers following that mail server’s Received: header. The content of the Received: header may (partially) be specified using this option, which can also be provided in the configuration file. E.g., if the Received: header originating from your organization’s mail server contains ourorganization.org then the option -R ourorganization.org causes all Received: headers to be ignored until the Received: header containing ourorganization.org has been seen. If the incoming mail doesn’t contain such a Received: header then all Received: headers are inspected. This option is only used when the --inspect option is specified.
--rules=path (-r)
the location of the rules specification file (cf. section RULES above).
--spam [:HDRS:]path
the path receiving spam-marked mail. Prepend :HDRS: to path to log the mail’s From: and Subject: headers to path. To completely ignore spam-marked mail omit this option. Examples:

--spam ~/Mail/spam
--spam :HDRS:~/Mail/spam

NO-CONFIG --syntax
perform a syntax check of the specified rules. Encountered syntax errors are logged in the log-file. The begin and end of the syntax check is also logged in the log file. No mail is read from the standard input stream.
NO-CONFIG --version (-v)
writes filtermail’s version to cout and ends, returning 0.
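
The element numbering described for --IP4-pattern can be checked with the default pattern. The following Python sketch uses Python's re module instead of filtermail's own regular expression facilities, and the function name to_binary is hypothetical.

```python
import re

# the default pattern: element 1 is the complete address, 2..5 its octets
DEFAULT_IP4 = re.compile(r"\[((\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}))\]")

def to_binary(match: "re.Match[str]") -> int:
    """Combine elements 2..5 into a 32-bit value, element 2 in bits 24..31."""
    value = 0
    for index in range(2, 6):
        value = (value << 8) | int(match.group(index))
    return value

match = DEFAULT_IP4.search("from renxincj.com (unknown [104.223.188.228])")
address = match.group(1)   # element 1: the complete IP4 pattern
binary = to_binary(match)  # element 5 ends up in bits 0..7
```

A replacement pattern passed to --IP4-pattern has to provide the same five elements in the same order, even if the surrounding notation (here: square brackets) differs.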

FILTERMAIL INSPECT

When the --inspect (-I) option is specified filtermail expects a received e-mail file at its standard input, and shows the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address. E.g.,


from renxincj.com (unknown [104.223.188.228])
IP = `104.223.188.228’, Country: US, CIDR = 104.223.128.0/17

Mail handling programs (e.g., mutt(1)) allow their users to pipe an e-mail file to a program, so the received e-mail can be inspected from inside the mail handling program. E.g., with mutt typing | shows the prompt


Pipe to command:

and assuming that the filtermail program is available in the user’s PATH environment variable enter `filtermail -I’ to pass the received e-mail to filtermail:


Pipe to command: filtermail -I

Depending on the content of the Received: headers filtermail’s output shows the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address. E.g.,


from renxincj.com (unknown [104.223.188.228])
IP = `104.223.188.228’, Country: US, CIDR = 104.223.128.0/17

IP version 6 addresses are also inspected, producing output like


from mail.resoascijournal.info (s857e6ba3.fastvps-server.com \
[2a03:f480:2:8::3f])
IP = `2a03:f480:2:8::3f’, Country: EE, CIDR = 2a03:f480:2::/48

If the received e-mail is considered conspicuous (e.g., spam or mail to ignore) then the cidr range could be added to a file like suspect.cidr. Once more e-mails from the suspected cidr-range are received, the range could be added to, e.g., ~/etc/filtermail/spam/cidr or to ~/etc/filtermail/ignore/cidr, using a pattern line like


1 23-05-10 s ’2a03:f480:2:’

When the option --cls is specified as yes (either as command-line option or in the configuration file) then the terminal screen will be cleared before showing --inspect’s output. When the option --received is specified Received: headers appearing before the Received: header containing the content specified at the --received option are ignored.
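
The effect of --received on the inspected Received: headers might be sketched as follows. The function name headers_to_inspect is hypothetical. Whether the Received: header containing the specified content is itself inspected is not stated explicitly; this sketch assumes that only the headers following it are inspected.

```python
from typing import List

def headers_to_inspect(received: List[str], content: str) -> List[str]:
    """Return the Received: headers remaining after applying --received."""
    for index, header in enumerate(received):
        if content in header:
            # assumption: the matching header itself is skipped as well
            return received[index + 1:]
    # no header contains content: all Received: headers are inspected
    return list(received)

received = [
    "from localhost (localhost [127.0.0.1])",
    "from mx.ourorganization.org (mx.ourorganization.org [192.0.2.10])",
    "from renxincj.com (unknown [104.223.188.228])",
]
remaining = headers_to_inspect(received, "ourorganization.org")
```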

EXAMPLES

configure filtermail as a valid mail handling program:

Commonly incoming mail servers define a directory where valid mail handling programs (or links to those programs) are listed. E.g., sendmail(8) uses the `sendmail restricted shell’ (/etc/mail/smrsh) directory. If filtermail is installed in a standard user-accessible directory (e.g., /usr/bin) then the smrsh directory should contain the link


filtermail -> /usr/bin/filtermail

Once the filtermail program is recognized by the incoming mail server users may filter incoming e-mail through filtermail using, e.g., a ~/.forward file. Such .forward files ignore empty lines and end-of-line comment (starting at #). Assuming a standard filtermail-configuration (cf. section CONFIGURATION) and assuming that user frank’s home-directory is /home/frank, then /home/frank/.forward should contain the following line:


"|/usr/bin/filtermail /home/frank"
Note the double quotes: they are required because filtermail is called with an argument.
a configuration file:

The following configuration file specifies that the rules and log files are located in the configuration file’s directory and defines paths for all three mail categories:


rules: rules
log: log/log
accept: /var/spool/mail/USER
spam: ~/Mail/spam
ignore: ~/Mail/ignore

# as illustration of a ’dont-expire:’ specification:
#dont-expire: ./ignore/from

rule specifications:

Filtering rules are defined in the file specified by the --rules option or in the rules: line of the configuration file. Note that the pattern files must start with ./.


if From: ./ignore/from ignore
if Subject: ./spam/nolowercase spam
# inspect all Received... headers:
if Received+ ./spam/cidr spam
# a To: or Cc: header is required:
if To: ./match/noto and Cc: ./match/noto ignore
The final rule uses the pattern in ./match/noto (shown below), which specifies the `any character’ (’.’) regular expression: if the To: header is empty (which is also the case if there is no To: header) then the not ’.’ condition matches. The same holds true for the Cc: condition. So if both conditions match there is neither a To: nor a Cc: header, in which case the rule’s action (here: ignore) is executed. Also note that, according to De Morgan’s rule, a not X and not Y rule is identical to a not (X or Y) rule: the rule matches exactly when the requirement `a To: or a Cc: header’ is violated.
pattern specifications:


# the ./spam/nolowercase pattern:
1 23-05-10 p not ’[a-z]’
# e.g., the ./match/noto as used in the above example
# requiring either a To: or a Cc: header:
1 23-05-10 p not ’.’

FILES

By default the configuration file is expected in the subdirectory etc/filtermail of the directory specified as filtermail’s argument.

SEE ALSO

mutt(1), pattern(3bobcat), regcomp(3), sendmail(8), syslog(3), tput(1), whois(1)

BUGS

None reported.

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl).

2023 filtermail_1.05.00