We have an Astaro Security Linux firewall (actually two of them, running in High Availability mode) at work. ASL includes SpamAssassin, which is effective at filtering some of the spam email out of our corporate network. Unfortunately, there doesn’t seem to be a way to ‘train’ the Bayesian filter. Most SpamAssassin implementations that I’ve seen allow you to set up a spam trap email address where you forward all of your spam messages, and SpamAssassin uses these to train the Bayesian filter to recognize spam.
Well, I was playing around at the command prompt, and I found where Astaro keeps the bayes database. It can be found at:
I also found that the sa-learn command was installed and working. So, I took a couple of mbox-style files, one with my spam, and one with regular emails, and ran the following commands:
sa-learn --dbpath /var/lib/nobody/.spamassassin/ --spam --mbox --showdots spam
sa-learn --dbpath /var/lib/nobody/.spamassassin/ --ham --mbox --showdots inbox
where ‘spam’ was my spam file, and ‘inbox’ was my non-spam file.
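Before feeding the files to sa-learn, it can be worth checking roughly how many messages each one holds. Here is a minimal sketch, assuming the usual mbox convention that every message starts with a line beginning with "From ". The function name count_mbox_messages is my own invention for illustration, not part of SpamAssassin or ASL.

```shell
# Hypothetical helper: estimate the number of messages in an mbox file
# by counting the "From " separator lines that start each message.
count_mbox_messages() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when nothing matches, so "|| true" keeps the printed 0 without
    # signalling a failure to the caller.
    grep -c '^From ' "$1" || true
}

# Example usage (file names as in the commands above):
#   count_mbox_messages spam
#   count_mbox_messages inbox
```

The count is only approximate: if a mail client wrote body lines starting with "From " without escaping them, they will be counted as well.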
So far, after training with about 5000 spam messages and 10000 regular messages, this seems to be working. Looking at the ASL log files, I see more of the spam messages getting a higher Bayes score.
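Besides watching the logs, one way to check that the training actually landed in the database is to ask sa-learn for its counters with --dump magic. The sketch below pulls out just the learned spam and ham totals. It assumes the usual dump layout, where the count sits in the third column and the line ends in "non-token data: nspam" or "non-token data: nham". The bayes_counts helper is hypothetical, and I have not checked whether the sa-learn version shipped with ASL prints exactly this format.

```shell
# Hypothetical helper: read `sa-learn --dump magic` output on stdin and
# print the learned spam and ham message counts.
# Assumption: the count is the third whitespace-separated column and the
# line ends with "non-token data: nspam" or "non-token data: nham".
bayes_counts() {
    awk '/non-token data: nspam$/ { print "nspam=" $3 }
         /non-token data: nham$/  { print "nham=" $3 }'
}

# Example usage, with the database path from the training commands:
#   sa-learn --dbpath /var/lib/nobody/.spamassassin/ --dump magic | bayes_counts
```

If the numbers roughly match the size of the two mbox files, the messages were learned into the database that --dbpath points at.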