How I Handle SPAM

It's a multi-pronged approach using F(L)OSS on Debian: Exim4, SPAMassassin, Dovecot, sieve, managesieve, and SpamCop.

Site-wide (Exim4 and SPAMassassin)

The first is similar to what virtually everyone running an email server
does: I use spamassassin. I run spamd as a system service and configure my
SMTP server (exim4) to query it at SMTP time. Doing some high-level
filtering at SMTP time is good because you can reject the message without
generating a separate bounce message, a.k.a. a DSN. Bounces are bad because
they often go to the wrong person (since all you have is the untrusted
envelope sender address), causing SPAM "back-scatter". Some DNSBLs will
actually list servers participating in SPAM "back-scatter" as open relays,
which, in some sense, they are.

My spamassassin configuration looks like this:

lock_method flock
required_score 1.0
use_bayes 0
bayes_auto_learn 0
body            AE_MEDS35       /\bwww(?:\s\W?\s?|\W\s)\w{3,6}\d{2,6}(?:\s\W?\s?|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
describe        AE_MEDS35       obfuscated domain seen in spam
score           AE_MEDS35       1.00
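
To see what the AE_MEDS35 body rule is meant to catch, here is a minimal
sketch that runs the same pattern through Python's re module. The sample
strings and the looks_obfuscated helper are made up for illustration, and
Python's regex engine is only an approximation of how spamassassin (Perl)
applies body rules.

```python
import re

# Same pattern as the AE_MEDS35 body rule; re.IGNORECASE stands in for /i.
AE_MEDS35 = re.compile(
    r"\bwww(?:\s\W?\s?|\W\s)\w{3,6}\d{2,6}(?:\s\W?\s?|\W\s)"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)


def looks_obfuscated(text: str) -> bool:
    """Return True if the text contains a spaced-out 'www name123 com' domain."""
    return AE_MEDS35.search(text) is not None


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from real mail.
    for sample in ("visit www . meds35 . com today", "see www.meds35.com"):
        print(sample, "->", looks_obfuscated(sample))
```

The rule only fires when whitespace separates the parts of the domain, so an
ordinary URL written without spaces is left alone.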

I'm not using bayes because I don't have a good way to teach it false
negatives or false positives. It might actually filter out more (I still get
about 6 SPAMs a day) if I would turn that on, but I'd need to up the
threshold; I get quite a few non-SPAM messages scoring 1 or above.

My exim4 configuration (for the DATA acl) looks like this:

drop
        spam = Debian-exim:true/defer_ok
        message = This message is ${spam_score_int}% SPAM.
        add_header = X-Spam-Score: $spam_score ($spam_bar)
        condition = ${if >= {$spam_score_int}{1} {1}{0}}
        set acl_m_spam_delay = ${if = {$spam_score_int}{10} {1}{0}}
        add_header = X-Spam-Report: $spam_report
        condition = ${if >= {$spam_score_int}{100} {1}{0}}

The delay slows the server that is sending SPAM. The additional headers are
used by a different part of my infrastructure. This actually ignores the
threshold set in my spamassassin configuration, doing different things for
different levels of SPAM. Note that $spam_score_int is the spamassassin score
multiplied by 10, so the final condition (100 or more) rejects messages that
score 10 or more, which is very conservative. I'd like as few false positives
as possible at this step because it would affect all my users.
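
The relationship between the spamassassin score and the integer that exim
compares can be sketched as follows. This is only an illustration of the
arithmetic, not of exim itself: the function names are hypothetical, and
rounding (rather than truncating) the scaled score is an assumption.

```python
def spam_score_int(score: float) -> int:
    """Approximate exim's $spam_score_int: the score multiplied by ten.

    Rounding is an assumption made for this sketch.
    """
    return int(round(score * 10))


def rejected_at_smtp(score: float, threshold_int: int = 100) -> bool:
    """Mirror the last ACL condition: reject when $spam_score_int >= 100."""
    return spam_score_int(score) >= threshold_int


if __name__ == "__main__":
    # Hypothetical scores chosen to straddle the rejection threshold.
    for score in (1.0, 9.9, 10.0):
        print(score, spam_score_int(score), rejected_at_smtp(score))
```

A message scoring 9.9 is still accepted (and tagged), while one scoring 10.0
is refused during the SMTP conversation.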

Service-oriented (SpamCop)

Next, I signed up for the SpamCop reporting service. They will
process SPAM and send messages on your behalf to websites and/or ISPs that may
be able to stop the SPAM. Once you sign up, you'll be given an email address
to which you can forward SPAM (as attachments, so the full headers are
preserved) for initial reporting. You still have to visit the website to
confirm that each message is SPAM, so it's not something you can really do in
bulk, but once you get SPAM down to a reasonable level this can help. Based on
the reports, they also
maintain a DNSBL that spamassassin will automatically use, so each SPAM I
report here will increase the number my server (and all other users of
spamassassin) will reject out-of-hand.
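
Forwarding "as an attachment" means wrapping the original message, headers
and all, in a message/rfc822 part. A minimal sketch using Python's standard
email package follows; the addresses are placeholders (your real submission
address comes from SpamCop when you sign up), the wrap_for_report name is
hypothetical, and actually sending the report is left out.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def wrap_for_report(raw_spam: str, sender: str, report_addr: str) -> EmailMessage:
    """Build a report that carries the original SPAM as a message/rfc822 part."""
    # Parse the raw message so its headers travel intact inside the attachment.
    spam = message_from_string(raw_spam, policy=policy.default)

    report = EmailMessage()
    report["From"] = sender
    report["To"] = report_addr  # placeholder, not a real SpamCop address
    report["Subject"] = "SPAM report"
    report.set_content("Forwarded SPAM attached.")
    # Passing a message object attaches it as message/rfc822.
    report.add_attachment(spam)
    return report


if __name__ == "__main__":
    raw = "From: spammer@example.com\nSubject: Buy now\n\nCheap meds\n"
    msg = wrap_for_report(raw, "me@example.org", "submit@example.net")
    print(msg.get_content_type())
```

Most mail clients offer the same thing as "Forward as attachment"; the point
is that an inline forward would lose the original Received headers.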

Client-specific (Dovecot, sieve, and managesieve)

Finally, I've configured my IMAP server (dovecot) to also handle sieve (a mail
filtering language) and managesieve (a way to update the server-side filter
from your client). This lets me centralize my filters so I don't have to
maintain a separate one for my PDA/Smartphone (when I get one), my laptop, and
my desktop. The server handles all that.

The main sieve rule I use for handling SPAM is:

if anyof ( header :contains "X-Spam-Level" "*****",
           header :contains "X-Spam-Score" "+++++",
           header :contains "X-Spam-Status" "BAYES_99" )
{
        discard;
        stop;
}

If X-Spam-Level or X-Spam-Score shows five or more marks (that is, a score of 5 or more), the message is simply deleted server side.

If X-Spam-Status contains BAYES_99, which
means spamassassin's Bayesian filter is 99+% sure it is spam, the message is
likewise deleted.

The second sieve rule I use looks like this:

if allof ( anyof ( header :contains "X-Spam-Level" "*",
                   header :contains "X-Spam-Score" "+" ),
           not header :contains "X-Spam-Status" "BAYES_00" )
{
        if header :contains "X-Spam-Report" "RCVD_IN_BL_SPAMCOP_NET" {
                discard;
                stop;
        }

        # fileinto needs: require ["fileinto"]; at the top of the script
        fileinto "INBOX.Possible SPAM";
        stop;
}

Anything that is even suspicious is caught by this rule. Messages with a
spamassassin score of 1 or above are filed into a special folder, UNLESS
spamassassin's Bayesian filter is 99+% sure it is not-SPAM (BAYES_00). SPAMs
that appear in that folder (or my INBOX [rare!]) are the ones I report to
SpamCop. Since it does little good to report SPAM they already know about, I
simply trash messages that were received from someone on SpamCop's blacklist.
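
Taken together, the two sieve rules amount to a three-way decision. The
sketch below restates that logic in Python so it can be tried against sample
headers. It is not how dovecot evaluates sieve: the classify function, the
returned action strings, and the one-value-per-header dictionary are
simplifications assumed for illustration.

```python
def classify(headers: dict) -> str:
    """Return 'discard', 'fileinto' or 'keep' for a dict of header values."""

    def contains(name: str, needle: str) -> bool:
        # Rough stand-in for sieve's case-insensitive `header :contains`.
        return needle.lower() in headers.get(name, "").lower()

    # First rule: clearly SPAM, delete on the server.
    if (contains("X-Spam-Level", "*****")
            or contains("X-Spam-Score", "+++++")
            or contains("X-Spam-Status", "BAYES_99")):
        return "discard"

    # Second rule: merely suspicious, unless Bayes says it is clean.
    suspicious = contains("X-Spam-Level", "*") or contains("X-Spam-Score", "+")
    if suspicious and not contains("X-Spam-Status", "BAYES_00"):
        # Already known to SpamCop, so there is nothing left to report.
        if contains("X-Spam-Report", "RCVD_IN_BL_SPAMCOP_NET"):
            return "discard"
        return "fileinto"  # goes to "INBOX.Possible SPAM"

    return "keep"


if __name__ == "__main__":
    # Hypothetical header values for illustration.
    print(classify({"X-Spam-Score": "6.2 (++++++)"}))
    print(classify({"X-Spam-Score": "1.4 (+)"}))
    print(classify({"X-Spam-Score": "0.3 ()"}))
```

Only the "fileinto" outcome ever needs my attention; everything else is
either delivered normally or gone before my client sees it.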

I'd have virtually no way to know if I was trashing false-positives, but I
haven't had anyone complain that I'm missing their mails and my settings are
rather conservative so I'm fairly happy to be down to only about 6 SPAMs a
day.