SpamAssassin and Black/Block Lists -- A Test of Effectiveness

Effectiveness of Blacklists in SpamAssassin -- A Simple Test

Want to make sure that blacklists (both DNSBL and URIBL) are enabled in SpamAssassin? They should be enabled by default, but you can check the .pre files located in the /etc/mail/spamassassin/ directory. Look in init.pre and the other version-specific pre files such as v320.pre. You should see:

loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

That will check any URIs against blacklists and

loadplugin Mail::SpamAssassin::Plugin::DNSEval

This will check both whitelist DNS servers and blacklist DNS servers.
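If you would rather not eyeball each .pre file, the check can be scripted. The following is only a minimal sketch: the two plugin names come from the lines above, while the function names and the sample text are made up for illustration. It treats any line beginning with # as commented out, so a disabled loadplugin line counts as missing.

```python
import re
from pathlib import Path

# Plugins the article says to look for (names taken from the text above).
WANTED = {
    "Mail::SpamAssassin::Plugin::URIDNSBL",
    "Mail::SpamAssassin::Plugin::DNSEval",
}

# Matches an active loadplugin line; a leading '#' prevents a match.
LOADPLUGIN = re.compile(r"^\s*loadplugin\s+(\S+)")


def loaded_plugins(pre_text):
    """Return the plugin names found on uncommented loadplugin lines."""
    found = set()
    for line in pre_text.splitlines():
        match = LOADPLUGIN.match(line)
        if match:
            found.add(match.group(1))
    return found


def missing_plugins(pre_text):
    """Return the wanted blacklist plugins that are not loaded."""
    return WANTED - loaded_plugins(pre_text)


def check_directory(path="/etc/mail/spamassassin"):
    """Hypothetical helper: scan every .pre file in the given directory."""
    text = "\n".join(p.read_text() for p in sorted(Path(path).glob("*.pre")))
    return missing_plugins(text)


# Illustrative sample, not a real init.pre.
sample = """\
# URIDNSBL - look up URLs against DNS blocklists
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
#loadplugin Mail::SpamAssassin::Plugin::DNSEval
"""
print(missing_plugins(sample))
```

With the sample text, only DNSEval is reported as missing because its loadplugin line is commented out. An empty result from check_directory() would suggest both plugins are being loaded.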

Looking to simply blacklist a From: E-mail address? Simple: edit your ~/.spamassassin/user_prefs file (for user-specific blacklisting) or your /etc/mail/spamassassin/local.cf (for server-wide blacklisting) and blacklist the address much as you would whitelist someone:

blacklist_from    spammer@spamdomain.com

Looking to increase the weight of blacklist hits in SpamAssassin? See our recommendations on this on our rules page. Looking to disable blacklists in SpamAssassin? Why?! Read about a simple test below.

The debate about real time block/black listing -- Domain Name Service Black List (DNSBL) or Real Time Block List (RTBL) -- is endless. Some people believe they are too aggressive and hurt innocent people. Others see them as a less-than-perfect but necessary tool to combat E-mail spam. ISPs often implement them in secret for fear that their users will revolt and complain. Or they stop using them after one user complains about a mistakenly blocked E-mail (ignoring the fact that thousands of other users are quite happy to be free from millions of spam E-mails during the same time period).

So, leaving that debate for a minute [perhaps for another page here at pettingers.org], let's see how well they work for tagging E-mail as opposed to blocking the receipt of the E-mail. One would think this would make both sides of the debate happy. The proponents of blacklists can utilize the power of the real-time blacklist to identify spam. The opponents of the idea can still receive all their E-mail, and any E-mail mistakenly tagged as spam is still delivered (albeit marked as spam).

The Test Plan

I love it when a plan comes together. During the month of April, 2004, I was still using a dial-up ISP even though I had DSL service. Long story. Anyway, the ISP was using SpamAssassin (SA) Version 2.62 on their server which was running qmail. I was also running SA 2.62 on my mail server with Sendmail. I noticed a huge amount of spam slipping through my dial-up ISP account and I knew something was amiss because my familiarity with SpamAssassin told me it could do better.

I soon noticed that none of the ISP's SA headers included hits on blacklists which I thought was unusual. After contacting the ISP, they confirmed that they had disabled the blacklists in SpamAssassin. Swell. I decided to run the messages through my server running a default SpamAssassin configuration -- blacklists are enabled by default.

For one month, I pulled all the E-mail from my dial-up ISP using Fetchmail on my mail server. This was possible because my ISP (and mail account) was accessible via the Internet as well as through the dial-up service. This allowed both the ISP and me to process the same E-mail, so I could compare the effectiveness of the two SA implementations. Ideally I would gain some visibility on any benefit of utilizing blacklists within SA. Anything the ISP tagged as spam was left tagged. That is, if they flagged it as spam without blacklists enabled, then surely my configuration would also flag it as spam with blacklists enabled.

This account is a target-rich environment. About 90% of the mail received is spam. Although it has nothing to do with my test, I can tell you that there were ZERO false positives with either implementation of SA.

The test configuration

Below is the schematic of my configuration. Not rocket science.

Results

You can see a somewhat more complete tally of results in the CSV file but I've summarized it below.

1 April 2004 through 30 April 2004
----------------------------------
----------------------------------
Total E-mail Received:                          271   Messages
Total Spam Received:                            248   Messages
Total E-mail trapped WITHOUT blacklists in SA:  123   Messages
Effectiveness WITHOUT blacklists in SA:          49.6 %       
Total E-mail trapped WITH blacklists in SA:     244   Messages
Effectiveness WITH blacklists in SA:             98.4 %       
 
----------------------------------
Blacklist Detail 
----------------------------------
DSBL Hits:                                       53 (21.7%)
RFCI Hits:                                       36 (14.8%)
NJABL Hits:                                      25 (10.3%)
SpamCop Hits:                                    61 (25.0%)
SORBS Hits:                                      73 (29.9%)
SBL Hits:                                        22  (9.0%)

(Note: many messages hit multiple blacklists so total will be more than 100%)
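A per-list tally like the one above can be pulled out of the X-Spam-Status headers that SpamAssassin adds to each message. The sketch below is an illustration, not the method used for this test. It assumes the header carries a comma-separated tests= field and that blacklist rules can be recognized by the RCVD_IN_ and URIBL_ name prefixes, which is a heuristic and will miss rules named differently. The sample headers are invented.

```python
import re
from collections import Counter

# Assumption: DNSBL rules start with RCVD_IN_ and URI blacklist rules
# start with URIBL_. Rules with other names are not counted.
BLACKLIST_PREFIXES = ("RCVD_IN_", "URIBL_")

# Captures everything after "tests=" up to the next "key=" field or the end.
TESTS_FIELD = re.compile(r"tests=(.*?)(?:\s+[a-z]+=|$)")


def blacklist_hits(status_header):
    """Return the blacklist rule names listed in one X-Spam-Status value."""
    flat = " ".join(status_header.split())  # unfold continuation lines
    match = TESTS_FIELD.search(flat)
    if not match:
        return []
    names = [name.strip() for name in match.group(1).split(",")]
    return [name for name in names if name.startswith(BLACKLIST_PREFIXES)]


def tally(status_headers):
    """Count how many messages hit each blacklist rule."""
    counts = Counter()
    for header in status_headers:
        counts.update(blacklist_hits(header))
    return counts


# Invented example headers for illustration only.
headers = [
    "Yes, score=14.2 required=5.0 tests=BAYES_99,\n\tRCVD_IN_XBL,"
    "RCVD_IN_BL_SPAMCOP_NET,URIBL_SBL autolearn=no version=3.0.2",
    "Yes, score=7.9 required=5.0 tests=HTML_MESSAGE,RCVD_IN_XBL",
    "No, score=1.1 required=5.0 tests=HTML_MESSAGE autolearn=no",
]
for rule, count in tally(headers).most_common():
    print(f"{rule}: {count}")
```

Because one message can hit several lists, the counts can add up to more than the number of messages, which matches the note under the table.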

You Decide

The effectiveness of SpamAssassin is cut in half by turning off black/block lists.
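The arithmetic behind that claim is easy to reproduce from the totals in the results table. This is just a worked check of the published numbers; the variable and function names are my own.

```python
def effectiveness(trapped, total_spam):
    """Percentage of spam messages that were tagged, to one decimal place."""
    return round(100.0 * trapped / total_spam, 1)


# Totals from the results table (1 April 2004 through 30 April 2004).
TOTAL_SPAM = 248
TRAPPED_WITHOUT_BL = 123
TRAPPED_WITH_BL = 244

without_bl = effectiveness(TRAPPED_WITHOUT_BL, TOTAL_SPAM)
with_bl = effectiveness(TRAPPED_WITH_BL, TOTAL_SPAM)

# Share of the blacklist-enabled catch that survives with blacklists off.
ratio = TRAPPED_WITHOUT_BL / TRAPPED_WITH_BL

print(without_bl, with_bl, round(ratio, 2))  # prints: 49.6 98.4 0.5
```

The ratio of roughly 0.5 is where "cut in half" comes from: without blacklists, SA trapped about half of what it trapped with them.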

The debate about using real-time blacklists within mail transfer agents will rage on. Yet for the administrator who is configuring SpamAssassin, which only tags the E-mail and does not block it, it would seem foolish to disable such a powerful tool in the fight against E-mail spam.

Some things to keep in mind:

1. This is junk science. At least I admit it, unlike about 80% of the crap I see on the evening news.... There is no control group. I have a small sample size. Take this for what it's worth. I think it is extremely demonstrative but you can choose to ignore it if you'd like.

2. SpamAssassin has become even more deadly with the addition of URL blacklisting in the version 3.0 series. This not only checks the sender, but also checks the content of the E-mail for references to spamvertized web sites. A fantastic addition. I would love to see some data on the effectiveness of this but haven't done a test yet. Just as a guess, I would say it has improved hit rates by at least 40%.

3. Even if you still don't agree with the use of blacklists, you can leave them enabled in SA and just decrease the associated score. This will allow you to see blacklist "hits" even though they may not push the E-mail over the spam threshold. Watch your own traffic and I believe you will be convinced of the effectiveness.

4. Since the time of this test, the blacklists have improved (e.g. XBL) and SpamAssassin's use of the lists has also been improved.

5. A quick look at some of the SA summaries for these messages makes me believe that you cannot force an SA without blacklists enabled to compete with one that has them enabled. That is, you can't say, "I'll just bump up the scores associated with test xxx and test yyy so that they catch more spam." You would be hard-pressed to do it. Your spam may vary.

6. The true power of real-time blacklists is at the mail server (MTA), not SpamAssassin. If you can block the SMTP negotiation you save a huge amount of processing time, bandwidth and resources. See our sendmail configuration page for some pointers on how to use blacklists at the MTA.
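Whether the lookup happens at the MTA or inside SpamAssassin, a DNSBL check boils down to an ordinary DNS query: the octets of the sending IP address are reversed, the blacklist's zone is appended, and the resulting name is looked up. If an address record comes back, the IP is listed. Here is a minimal sketch of that convention for IPv4. The zone dnsbl.example.org is a placeholder and not a real list, and the function names are my own.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to ask a DNSBL zone about an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the zone answers for this IP (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name: the address is not on this list (or DNS failed).
        return False


# 192.0.2.25 is a documentation address; the zone is a placeholder.
print(dnsbl_query_name("192.0.2.25", "dnsbl.example.org"))
# prints: 25.2.0.192.dnsbl.example.org
```

An MTA that does this during the SMTP conversation can refuse the message before the body is ever transferred, which is where the savings in bandwidth and processing come from. Note that is_listed() cannot tell "not listed" apart from a DNS failure, so treat it as an illustration only.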