Some Tips for Using SpamAssassin
These tips are based on my own use of SpamAssassin (currently at version 3.0.3) over several years. They assume you have already installed SpamAssassin, are familiar with how it works, and know where to find the SpamAssassin documentation (in particular the SpamAssassin Wiki, which is a great jumping-off point for SpamAssassin FAQs, documentation, and other tips). I've tried to avoid covering anything available in those other locations.
General Observations
- Set your required_score parameter high until you are confident that you know what kind of score distributions your emails (both ham and spam) come in at. The last thing you want is for any ham to be marked as spam and overlooked. I still use a threshold of 8, despite the fact that incoming ham very rarely scores more than 5. If you have large volumes of email, a mailer such as mutt can be useful, since it lets you sort by spam score. Add something like:
      spam "X-Spam-Status: (Yes|No), (score|hits)=(-?[[:digit:]]+\.[[:digit:]]+)" "%3"
  to your ~/.muttrc. This may help you spot trends.
- Use Razor, Pyzor and DCC if you can. They make a large contribution to spam scores in general. See NetworkTests for more information.
- Purists will insist that spam and viruses are not the same thing, but they often arrive together, and you rarely want either. I have had very good results using the ClamAV plugin to keep out virus-laden spam, although I would personally recommend a much lower value than the suggested score of 10, since no virus check can be perfect. ClamAV also checks for phishing patterns and similar attacks, which tend to be time-critical, so keeping its database up to date is important; you should probably use freshclam.
- Do use Bayes: it really does make a difference. And don't worry about the odd email being mis-learned (i.e. ham as spam or vice versa). An example is blacklisting the occasional 'legitimate' organisation that sends you what you consider to be spam. This won't adversely affect your Bayes setup as long as the volume of these is sufficiently outweighed by 'real' spam.
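To get a feel for the score distributions mentioned in the first tip before settling on a required_score, something like the following Python sketch may help. It is only an illustration: the function names parse_spam_status and summarize are my own, and it assumes the X-Spam-Status header starts with "Yes" or "No" followed by "score=" or "hits=", as in the mutt pattern above. Note that "ham" and "spam" here mean SpamAssassin's verdict at scan time, not ground truth.

```python
import re
from statistics import mean

# Accepts both the "score=" and the older "hits=" form of the header value.
STATUS_RE = re.compile(r"^(Yes|No),\s+(?:score|hits)=(-?\d+(?:\.\d+)?)")


def parse_spam_status(value):
    """Return (is_spam, score) for an X-Spam-Status value, or None if unparseable."""
    m = STATUS_RE.match(value.strip())
    if m is None:
        return None
    return m.group(1) == "Yes", float(m.group(2))


def summarize(values):
    """Group scores by the Yes/No verdict and report count, min, mean and max."""
    groups = {"ham": [], "spam": []}
    for value in values:
        parsed = parse_spam_status(value)
        if parsed is None:
            continue  # header missing or in an unexpected format
        is_spam, score = parsed
        groups["spam" if is_spam else "ham"].append(score)
    return {
        name: {"count": len(s), "min": min(s), "mean": mean(s), "max": max(s)}
        for name, s in groups.items()
        if s  # skip empty groups so min()/max() never see an empty list
    }
```

The header values could be collected with Python's standard mailbox module, for example by iterating over mailbox.mbox(path) and calling msg.get("X-Spam-Status", "") on each message. If the "ham" maximum sits well below your threshold over a long period, that is some evidence the threshold can come down.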
Gotchas
Certain plugins in /etc/spamassassin/init.pre are often commented out by default. Read this file, read the manpages for the plugins, install any prerequisite Perl modules, and then uncomment them. In my experience, it is particularly worthwhile making sure that the URIDNSBL plugin is enabled, as this can contribute highly to the score of a spam. Using spamassassin -D may help here to check the plugins are enabled.

If you enable the RelayCountry plugin, you may want to add:
    add_header all Relay _RELAYCOUNTRY_
to your configuration.
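As a quick way to see which plugins in init.pre are still commented out, here is a small Python sketch. It assumes plugin lines have the form "loadplugin Mail::SpamAssassin::Plugin::Name", optionally prefixed with "#" when disabled. The function name plugin_status is my own. It only reads the file text, so whether a plugin really loads (for example, whether its Perl prerequisites are present) still has to be confirmed with spamassassin -D.

```python
import re

# An optional run of '#' marks a commented-out line. Requiring a '::' in the
# module name avoids matching ordinary comments that mention "loadplugin".
LOADPLUGIN_RE = re.compile(r"^\s*(#+)?\s*loadplugin\s+(\w+(?:::\w+)+)")


def plugin_status(text):
    """Return (enabled, disabled) lists of plugin module names found in text."""
    enabled, disabled = [], []
    for line in text.splitlines():
        m = LOADPLUGIN_RE.match(line)
        if m:
            (disabled if m.group(1) else enabled).append(m.group(2))
    return enabled, disabled
```

Passing it the contents of /etc/spamassassin/init.pre (read with open(...).read()) gives a list of candidates to review against the plugin manpages.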
By default, the BAYES scores in SpamAssassin are generated using the genetic algorithm, and thus do not rise monotonically from BAYES_00 to BAYES_99. If, like me, you don't like this, configure the BAYES scores manually. I use:
    score BAYES_00 0 0 -12 -12
    score BAYES_05 0 0 -6 -6
    score BAYES_20 0 0 -4 -4
    score BAYES_40 0 0 -2 -2
    score BAYES_50 0 0 0.01 0.01
    score BAYES_60 0 0 1 1
    score BAYES_80 0 0 2 2
    score BAYES_95 0 0 3 3
    score BAYES_99 0 0 6 6
This scoring is a lot more aggressive than the default, as I have had good experiences with Bayes.
The same principle is also true of the Razor scoring. I use:
    score RAZOR2_CHECK 0 0.3 0 0.3
    score RAZOR2_CF_RANGE_51_100 0 1.5 0 1.5
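If you hand-edit the BAYES scores as described earlier, a small check that they really are monotonic can catch typos. The Python sketch below is one way to do that. The helper names parse_scores and non_monotonic are my own, and it assumes every rule line carries four values and that every rule name after the BAYES_ prefix ends in a number. As I understand the score lines, the four columns are the score sets for Bayes off/network off, Bayes off/network on, Bayes on/network off, and Bayes on/network on, so the check looks at the last column by default.

```python
# The hand-set Bayes scores from this page.
AUTHOR_SCORES = """
score BAYES_00 0 0 -12 -12
score BAYES_05 0 0 -6 -6
score BAYES_20 0 0 -4 -4
score BAYES_40 0 0 -2 -2
score BAYES_50 0 0 0.01 0.01
score BAYES_60 0 0 1 1
score BAYES_80 0 0 2 2
score BAYES_95 0 0 3 3
score BAYES_99 0 0 6 6
"""


def parse_scores(text):
    """Parse 'score NAME v0 v1 v2 v3' lines into {NAME: [v0, v1, v2, v3]}."""
    scores = {}
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop trailing comments
        if len(parts) == 6 and parts[0] == "score":
            scores[parts[1]] = [float(p) for p in parts[2:]]
    return scores


def non_monotonic(scores, prefix="BAYES_", score_set=3):
    """Return adjacent rule pairs whose score drops as the bucket number rises."""
    rules = sorted(
        (name for name in scores if name.startswith(prefix)),
        key=lambda name: int(name[len(prefix):]),  # assumes a numeric suffix
    )
    return [
        (lo, hi)
        for lo, hi in zip(rules, rules[1:])
        if scores[hi][score_set] < scores[lo][score_set]
    ]
```

Calling non_monotonic(parse_scores(AUTHOR_SCORES)) returns an empty list for the scores above; any pair it does return points at two neighbouring rules where the higher Bayes bucket scores lower than the one before it.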