[TriLUG] Managing SpamAssassin's Bayesian classifier with maildir
Mike Johnson
mike at enoch.org
Wed Jun 1 10:20:58 EDT 2005
Thought someone might benefit from this, so I figured I'd toss it out to
the list(s).
I'm a big fan of maildir, but I just found a little shortcut that makes
me even happier to be using it. I have a server running Courier IMAP
and I use Thunderbird as my mailreader. I have a folder set up on the
server called Spam. Anything SpamAssassin tags as spam goes into that
directory, rather than being deleted (I've had a false positive or two).
I also use Thunderbird's junk mail detection. When something is
tagged as spam, either by myself or tbird, it gets moved to the Spam
folder. The key to all this is that these spam messages all stay unread.
So, when I'm ready for SpamAssassin to learn from tbird, I mark a
message as read in tbird. Simple, right? That's all there is to it
from the client perspective. Behind the scenes, I have a cronjob that
executes this:
pushd $HOME/Maildir/.Spam > /dev/null && nice sa-learn --no-sync --spam cur && nice sa-learn --sync && find cur -type f -exec mv {} cur.old \;
For those who haven't used Maildir before, when a message is unread it's
in the 'new' directory. When it's read, it gets moved to the 'cur' directory.
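To see how that split looks on disk, here is a rough sketch that counts the messages in each directory of the Spam folder. The function names count_msgs and spam_status are made up for illustration, and the default path is just the one used in the command above; adjust it to your own layout.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Courier or SpamAssassin.

# count_msgs DIR: print how many regular files live under DIR.
count_msgs() {
    find "$1" -type f | wc -l | tr -d ' '
}

# spam_status [SPAM_DIR]: show how many messages are still unread (new)
# and how many have been marked read and are waiting for training (cur).
# Assumes SPAM_DIR is a Maildir folder containing 'new' and 'cur'.
spam_status() {
    spam_dir=${1:-"$HOME/Maildir/.Spam"}
    echo "unread (new): $(count_msgs "$spam_dir/new")"
    echo "read, queued for training (cur): $(count_msgs "$spam_dir/cur")"
}
```

Running spam_status with no argument looks at $HOME/Maildir/.Spam; pass a different folder path as the first argument if yours lives elsewhere.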
The movement of the messages from the 'cur' directory to the 'cur.old'
directory is a preference. You could just delete the files. You could
leave them there. I move them in case I later find a false positive or I
want some spam lying around for future training.
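If you'd rather keep this in a script than in a one-line cron entry, something like the following sketch should do the same job. The function name learn_spam is hypothetical, and two things differ from my one-liner: it uses cd inside a subshell instead of pushd (so it works in a plain /bin/sh), and it creates cur.old if it's missing, which the one-liner simply assumes. The sa-learn options are the same ones as above.

```shell
#!/bin/sh
# learn_spam [SPAM_DIR]: hypothetical wrapper around the cron one-liner.
# The body runs in a subshell, so the caller's working directory is untouched.
learn_spam() (
    spam_dir=${1:-"$HOME/Maildir/.Spam"}

    cd "$spam_dir" || exit 1

    # Assumption: create the archive directory if it does not exist yet.
    mkdir -p cur.old || exit 1

    # Learn everything in 'cur' as spam, deferring the database sync...
    nice sa-learn --no-sync --spam cur || exit 1

    # ...then sync the Bayes database once at the end.
    nice sa-learn --sync || exit 1

    # Archive the learned messages so they are not fed in again next run.
    find cur -type f -exec mv {} cur.old/ \;
)
```

Call learn_spam at the bottom of the script and point cron at it on whatever schedule suits you. Since each step bails out on failure, messages stay in cur if sa-learn didn't succeed.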
Add /dev/null redirects as desired and have fun with easy Bayesian
training. Oh, and if this eats your email or your firstborn, it's not
my fault.
Mike