[TriLUG] Managing SpamAssassin's Bayesian classifier with maildir

Mike Johnson mike at enoch.org
Wed Jun 1 10:20:58 EDT 2005


Thought someone might benefit from this, so I figured I'd toss it out to 
the list(s).

I'm a big fan of maildir, but I just found a little shortcut that makes 
me even happier to be using it.  I have a server running Courier IMAP 
and I use Thunderbird as my mailreader.  I have a folder set up on the 
server called Spam.  Anything SpamAssassin tags as spam goes into that 
folder rather than being deleted (I've had a false positive or two).  
I also use Thunderbird's junk mail detection.  When something is 
tagged as spam, either by me or by tbird, it gets moved to the Spam 
folder.  The key to all this is that all of these spam messages are 
unread.

So, when I'm ready for SpamAssassin to learn from tbird, I mark a 
message as read in tbird.  Simple, right?  That's all there is to it 
from the client perspective.  Behind the scenes, I have a cronjob that 
executes this:

    pushd $HOME/Maildir/.Spam > /dev/null \
      && nice sa-learn --no-sync --spam cur \
      && nice sa-learn --sync \
      && find cur -type f -exec mv {} cur.old \;
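
The one-liner above relies on pushd, which is a bash builtin; where cron 
runs jobs with a plain /bin/sh it may not be available.  A minimal sketch 
of the same steps as a POSIX sh function follows.  The function name 
train_spam, the SA_LEARN override, and the mkdir -p of cur.old are 
assumptions added for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch only: the cron job's steps as a POSIX sh function.
# train_spam and SA_LEARN are hypothetical names, not from the original post.

train_spam() {
    spam_dir=${1:-"$HOME/Maildir/.Spam"}
    sa_learn=${SA_LEARN:-sa-learn}   # override to dry-run without SpamAssassin

    # Run in a subshell so the caller's working directory is left alone.
    (
        cd "$spam_dir" || exit 1
        # Assumption: create the archive directory if it is missing.
        mkdir -p cur.old
        # Learn everything in cur as spam, deferring the database sync...
        nice "$sa_learn" --no-sync --spam cur || exit 1
        # ...then sync once at the end.
        nice "$sa_learn" --sync || exit 1
        # Move the learned messages out of the way.
        find cur -type f -exec mv {} cur.old/ \;
    )
}
```

Saved to a file with a final line calling train_spam, this could be what 
the crontab entry points at instead of the inline command.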

For those who haven't used Maildir before, when a message is unread it's 
in the 'new' directory.  When it's read, it gets moved to the 'cur' 
directory.
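
To see what the job would pick up before it runs, something like the 
following sketch counts the files in each directory.  The Maildir format 
also records the read state as an S flag in the filename suffix after 
":2,", so the third number counts files in cur that carry that flag.  The 
function name maildir_counts and the default path are assumptions for 
illustration.

```shell
#!/bin/sh
# Sketch only: count messages in a Maildir folder's new and cur directories.
# maildir_counts is a hypothetical helper, not from the original post.

maildir_counts() {
    dir=${1:-"$HOME/Maildir/.Spam"}
    # tr strips the padding some wc implementations add.
    new=$(find "$dir/new" -type f | wc -l | tr -d ' ')
    cur=$(find "$dir/cur" -type f | wc -l | tr -d ' ')
    # Files whose flag suffix (after ":2,") contains S are marked as seen.
    seen=$(find "$dir/cur" -type f -name '*:2,*S*' | wc -l | tr -d ' ')
    echo "new=$new cur=$cur seen=$seen"
}
```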

The movement of the messages from the 'cur' directory to the 'cur.old' 
directory is a preference.  You could just delete the files.  You could 
leave them there.  I move them in case I later find a false positive or 
I want some spam lying around for future training.
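
The three options (move, delete, leave in place) could be sketched as one 
small function.  The name dispose_learned, its mode argument, and the 
mkdir -p are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: what to do with messages in cur after sa-learn has run.
# dispose_learned and its modes are hypothetical, not from the original post.

dispose_learned() {
    mode=${1:-move}
    dir=${2:-"$HOME/Maildir/.Spam"}
    case "$mode" in
        move)
            # Keep the learned messages around in cur.old.
            mkdir -p "$dir/cur.old"
            find "$dir/cur" -type f -exec mv {} "$dir/cur.old/" \;
            ;;
        delete)
            # Throw the learned messages away.
            find "$dir/cur" -type f -exec rm -f {} \;
            ;;
        keep)
            # Leave messages in cur; sa-learn sees them again on the next run.
            :
            ;;
        *)
            echo "usage: dispose_learned move|delete|keep [maildir]" >&2
            return 2
            ;;
    esac
}
```

If a false positive does turn up in cur.old later, running sa-learn --ham 
on that file is one way to correct the classifier.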

Add /dev/null redirects as desired and have fun with easy Bayesian 
training.  Oh, and if this eats your email or your firstborn, it's not 
my fault.

Mike


