We run a mail server with a little over 2000 virtual users and some quite active mailing lists managed by Mailman. The server exists to serve the members of an association: each member gets an address@ourdomain, and mail arriving at that address is forwarded to the member's real mail address.
The share of spam in the mail reaching our server is shocking. We receive about 40000-50000 messages per week, and fewer than 10000 of them are not spam. About 15000-25000 are blocked by ordb, dsbl, rfc-ignorant.org and our own access lists. The rest get through and are checked by SpamAssassin, which tags less than one third of them as not spam.
Some people wanted all their mail, even spam, forwarded to them and merely tagged by SpamAssassin. Others wanted anything that even smells like spam killed before it could fill their real mailboxes. Still others wanted only certain spam sent to /dev/null. So we needed a spam killer that could be configured per recipient address.
This document assumes that you have some knowledge of how Postfix, SpamAssassin and Procmail work. It helps if you can read Perl code too.
The master.cf is configured like this (non-default settings shown only). The hostname is "qsp" as you can see.
qsp:smtp inet n - n - - smtpd
    -o content_filter=filter:dummy
    -o cleanup_service_name=pre-cleanup
localhost:smtp inet n - n - - smtpd
pre-cleanup unix n - n - 0 cleanup
    -o alias_maps=
    -o virtual_maps=
cleanup unix n - n - 0 cleanup
filter unix - n n - - pipe
    flags=Rq user=spamd argv=/usr/bin/procmail -Y -m /etc/procmailrcs/master.rc ${sender} ${recipient}
So every message arriving from the internet is passed to the content filter "filter". Its cleanup service is the special "pre-cleanup", whose alias and virtual maps are emptied, so the message still carries the virtual address when it reaches the filter. That lets us match on the virtual address in the filter rather than on the real address.
The service "filter" sends the mail to procmail with arguments sender and recipient(s).
Only mail coming from the internet is sent to the filter; outgoing mail is not. That is primarily because of Mailman, which sends all its mail through a socket on localhost:25.
You should read the FILTER_README file from the Postfix distribution package to understand this configuration completely.
As seen above, the mail is passed to procmail with the parameters -Y -m master.rc ${sender} ${recipient}. Procmail therefore gets the envelope sender as its first argument and the recipients as arguments 2..n. The mail is then checked by SpamAssassin using spamc, which connects to the spamd daemon running in the background. Consult the SpamAssassin manual for more information.
You should read the procmailrc(5), spamd(1) and spamc(1) manpages to understand this completely.
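The argument layout described above (envelope sender first, recipients after it) can be illustrated with a minimal Python sketch. It is not part of our setup; the function name split_envelope and the sample addresses are hypothetical and only show how the list is taken apart.

```python
def split_envelope(args):
    """Split the filter arguments into (sender, recipients).

    Assumes the layout described in the text: the envelope sender comes
    first, followed by one or more recipient addresses.
    """
    if len(args) < 2:
        raise ValueError("expected a sender and at least one recipient")
    return args[0], list(args[1:])


# Hypothetical example values, not taken from a real delivery.
sender, recipients = split_envelope(
    ["someone@example.org", "foo@ourdomain", "bar@ourdomain"]
)
print("From:", sender)
print("To:", " ".join(recipients))
```

In master.rc the same split is done with FROM="<$1>" followed by SHIFT=1.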
# /etc/procmailrcs/master.rc
SHELL=/bin/sh
DROPPRIVS=YES
# We must have space for 1000 addresses.
LINEBUF=32768
# Set sendmailflags just to be sure.
SENDMAILFLAGS="-oi"
SPAMC="/usr/local/bin/spamc"

# Set the variable FROM to the envelope sender and remove it from $@.
FROM="<$1>"
SHIFT=1

# Filter through spamd, which is started to use a unix domain socket,
# not a TCP port.
:0f
|$SPAMC -f -U /var/run/spamd.sock

# If we get at least five points, switch to another script, spamkill.rc.
:0
* ^X-Spam-Level: \*\*\*\*\*
{
SWITCHRC="/etc/procmailrcs/spamkill.rc"
}

# It wasn't spam, so forward to all recipients.
:0
! -f $FROM "$@"
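The five-star test in the recipe above works because SpamAssassin writes one asterisk per whole point into the X-Spam-Level header. A minimal Python sketch of the same check follows; the names spam_level and is_spam are hypothetical, and the threshold of 5 is simply the value used in our recipe.

```python
import re


def spam_level(header_value):
    """Return the integer spam level: the number of leading asterisks."""
    match = re.match(r"\*+", header_value.strip())
    return len(match.group(0)) if match else 0


def is_spam(header_value, threshold=5):
    """True when the level reaches the threshold, like the procmail pattern."""
    return spam_level(header_value) >= threshold


print(spam_level("*******"), is_spam("*******"))
print(spam_level("***"), is_spam("***"))
```

Like the procmail pattern, this matches five or more asterisks, not exactly five.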
If the mail is spam (in SpamAssassin's opinion), we pass the recipient list to our Perl script filter_recipients. It returns the recipients who either accept spam of that level or do not exist in the database. Those who gave us permission to kill spam at that level are removed from the recipient list before the mail is forwarded.
# /etc/procmailrcs/spamkill.rc
# Set the logfile, and the umask so that others can read the log.
LOGFILE="/var/log/spamkill/spamkill"
UMASK=022
# We don't want everything logged, so logabstract and verbose are off.
LOGABSTRACT="no"
VERBOSE="no"
FORMAIL="/usr/bin/formail"
FILTERPL="/usr/sbin/filter_recipients"
# Get the date for logging.
DATE=`date +%Y%m%d-%H%M%S`
# Extract the X-Spam-Level header.
XSPAMLEVEL=`$FORMAIL -zxX-Spam-Level`

# Send the spam level and the recipient list to our script and get the
# new recipient list.
:0 Wi
RECIPIENTS=|$FILTERPL "$XSPAMLEVEL" "$@"

# Save the original recipient list for logging. Note that $@ is expanded
# only in an argument list to a program.
:0 Wi
OLDRECIPIENTS=|echo "$@"

# Clear $@, because ! sends to $@.
SHIFT=1000

# Do the logging. Notice the newlines!
LOG="$DATE $$ Level: $XSPAMLEVEL From: $FROM
$DATE $$ Orig-To: $OLDRECIPIENTS
"

# If RECIPIENTS is not empty (use any pattern that matches a valid
# recipient), log the list and forward the mail.
:0
* RECIPIENTS ?? @ourdomain
{
# Notice the newline and the closing double quote on the next line again.
LOG="$DATE $$ Sent-To: $RECIPIENTS
"
:0
! -f $FROM $RECIPIENTS
}

# No recipients left: log it and send the mail where all spam belongs.
LOG="$DATE $$ Sent-To: /dev/null
"
:0
/dev/null
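The last branch of spamkill.rc comes down to one decision: if the filtered list still contains something that looks like a recipient, forward to it; otherwise discard the message. A rough Python sketch of that decision is below. The function name plan_delivery and its return values are hypothetical; the real work is done by the procmail recipes.

```python
def plan_delivery(remaining, pattern="@ourdomain"):
    """Decide what to do with a spam message after recipient filtering.

    remaining is the whitespace-separated recipient list printed by the
    filter script. Returns ("forward", [addresses]) or ("discard", []).
    """
    # Same idea as the recipe: any pattern that matches a valid recipient.
    if pattern in remaining:
        return "forward", remaining.split()
    return "discard", []


print(plan_delivery("bar@ourdomain baz@ourdomain \n"))
print(plan_delivery("\n"))
```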
We have a map file that lists addresses and, for each one, the minimum spam level at which mail may be sent to /dev/null. The map is built with the command postmap hash:/etc/postfix/spamkill, and the source map file contains something like this:
foo@ourdomain 5
bar@ourdomain 7
baz@ourdomain 8
So user foo allows us to kill mail when its level is at least 5, bar when it is at least 7, and so on. Recipient addresses that aren't in the map get all spam. Notice that we work with integer spam levels here. In our case the map file, like the virtual user map, is created and maintained by a web-based application, but that's another story :)
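To make the rule concrete, here is a small Python sketch that reads the source map format shown above and applies the per-address threshold. It is only an illustration under the assumption of one "address level" pair per line; the names parse_spamkill_map and should_deliver are hypothetical, and the real lookup is done against the postmap database.

```python
def parse_spamkill_map(text):
    """Parse 'address level' lines into a dict of integer thresholds."""
    thresholds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        address, level = line.split(None, 1)
        thresholds[address.lower()] = int(level)
    return thresholds


def should_deliver(thresholds, address, level):
    """Deliver when the address is unknown or its threshold is above level."""
    threshold = thresholds.get(address.lower())
    return threshold is None or threshold > level


source = "foo@ourdomain 5\nbar@ourdomain 7\nbaz@ourdomain 8\n"
thresholds = parse_spamkill_map(source)
for addr in ("foo@ourdomain", "bar@ourdomain", "unknown@ourdomain"):
    print(addr, should_deliver(thresholds, addr, 5))
```

With a level-5 message, foo is dropped from the list while bar and the unknown address still get the mail.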
The recipient list is given to the perl script "filter_recipients" and it removes those recipients who allow us to kill spam automatically.
The script:
#!/usr/bin/perl
# Real men always use strict :-)
use strict;
use DB_File;
# Set the map db here.
my $mapdb = "/etc/postfix/spamkill.db";
# Get the spam level from the first argument.
my $spamlevel = length(shift @ARGV);
# Open the db file for reading, or just return the list if opening fails.
tie(my %db, 'DB_File', $mapdb, O_RDONLY, 0664, $DB_HASH)
    or print "@ARGV\n" and exit(1);
# Go through the recipient list and print only those who allow spam to be
# delivered at this level or who are not in the db at all.
foreach my $addr (@ARGV) {
    $addr = lc($addr);
    (my $test = $addr) =~ s/\+[^@]+//;
    $test .= chr(0);
    if (!defined $db{$test} or $db{$test} > $spamlevel) {
        print "$addr ";
    }
}
# Untie the db, print a newline and exit.
untie(%db);
print "\n";
exit(0);
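The least obvious part of the script is how it builds the lookup key: the address is lowercased, a "+extension" is stripped, and a NUL byte is appended because the script expects the keys in the postmap-built db to end with one. A Python sketch of just that step follows; the function name lookup_key is hypothetical and the sample addresses are made up.

```python
import re


def lookup_key(address):
    """Build the db key the Perl script uses for one recipient address."""
    lowered = address.lower()
    # Remove the first "+extension" before the @, like s/\+[^@]+// in Perl.
    base = re.sub(r"\+[^@]+", "", lowered, count=1)
    # The script appends chr(0) to the key before the lookup.
    return base.encode("utf-8") + b"\x00"


print(lookup_key("Foo+lists@ourdomain"))
print(lookup_key("bar@ourdomain"))
```

Note that the script prints the lowercased address with its extension intact; only the lookup key is stripped.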
No problems, at least with this amount of mail. We get more than a thousand spams per day. Our server is an AMD Athlon™ XP 2400+ running at 2 GHz with 512 MB of memory. On average the CPU is about 99% idle, even though a web server runs on the same machine.
I tested the Perl script with a map file of 1000 addresses with random levels, passing 1000 addresses on the argument list. On that machine the execution took between 0.02 and 0.03 seconds, so it really is fast enough for us. It should be possible to rewrite the program in C to make it some milliseconds faster.
If you have to scan, say, millions of messages every day, this may not be the right solution. But I really don't know; I haven't tested it with bigger volumes. I hope this information is still at least partly useful in that kind of situation.