[TriLUG] Re: spam solutions - spamassassin
Jon Carnes
jonc at nc.rr.com
Sat May 22 23:17:48 EDT 2004
On Sat, 2004-05-22 at 18:23, Gregory Woodbury wrote:
> On Sat, May 22, 2004 at 04:42:37PM -0400, Myrhillion wrote:
> > Hi Rick,
> >
> > I've been struggling a bit trying to set up a spam solution.
> > I have tried ASSP, but find it wanting: too much training.
> >
> > Can you give me your opinions on spamassassin? I hear a lot of people
> > are using it, but I was curious what kind of commitment is required to
> > get it set up to actually filter spam, as opposed to spam + legit email.
> >
> > I thought an opinion from someone running it might be helpful.
> > Thanks for your time.
> >
> > Doug Taggart
>
> I've been using spamassassin for several years now. I have it trained
> fairly well to detect spam with 0 false positives and a low rate of
> false negatives. In other words, it never classes legit email as spam
> and only rarely lets spam thru to my mail box.
>
> The amount of training is steep on the front end. At first you have to
> feed a large selection of spam into the trainer and then you just have to
> tweak the thing occasionally as new trends in spam arise.
>
> I have a collection of selected spam messages that I use to seed newly
> installed user filters (or if I re-install the whole system). This is
> not as big a deal as it might sound. I use "selected" messages that
> don't have many of the "cache poisoning" random word collections that
> have become one of the recent attempts by spammers to bypass SA and other
> Bayesian detectors.
>
> If so inclined you can install the optional extended interface to
> Vipul's Razor collective database of spam but I find that a bit much for
> my lazy tastes.
>
> If you are on a Linux platform (easy guess huh?) an anti-virus filter
> like ClamAV is worth installing. I slipped ClamAV into my system in
> about 15 minutes just a week or so ago [Fedora Core 1/sendmail] and it
> does wonders against the virus-laden emails before they can slip past the SA
> filters.
>
> Installing SA can take two forms: system-wide filtering with all mail
> being passed into SA by sendmail before local delivery; or a per-user
> installation where you have each user's procmail script process mail thru SA
> as part of local delivery. I opted for the local procmail solution as
> there are a few usernames that get no spam and thus don't require the
> overhead. (Besides, I was too lazy to figure out the SA "milter" at the
> time of initial install and don't particularly relish being the "censor"
> for all the users - I wasn't sure how tuning/training worked when the
> system-wide method is used. I installed at home before I installed at
> any of my work locations.)
>
> For per-user installations, the following is used at the top of user
> procmailrc files to pass stuff thru the spamassassin daemon:
>
>
> ---------------------------(cut here)
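> # Filter every message through spamc. "w" makes procmail check the exit
> # code, so a failed filter leaves the message unchanged.
> :0fw
> | /usr/bin/spamc -f
>
> # File anything SpamAssassin flagged into its own mailbox.
> :0:
> * ^X-Spam-Flag: YES
> mbox.caughtspam
>
> # File TriLUG list mail; the trailing ":" locks the mbox during delivery.
> :0:
> * ^TO .*@trilug.org
> mbox.trilug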
> ---------------------------(cut here)
>
> As you can see, the spam filtering is done before filtering mailing
> lists into separate folders, though you could select to filter stuff
> into folders before filtering thru spamc.
>
> I'm supporting about 10 users and ~20 accounts on 6 machines here at
> home. At one job we were supporting >5000 accounts and matching network
> scale with 1 SA daemon on the incoming mail server. Perhaps others will
> chime in with how it scales for them.
>
>
> --
> G.Wolfe Woodbury `- -'
> U
> The Line Eater is a boojum!

I'm running SA (via MailScanner) at a client with over 2000 nodes. It
currently filters out about 90% of the spam. I'm using the Spamhaus
lists to filter the mail as well.
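
For anyone who hasn't looked at how a DNS blocklist check works: the
connecting IP address has its octets reversed and the list's zone appended,
and that name is looked up in DNS. Below is a minimal Python sketch of just
the name-building step. The function name and the default zone
(sbl.spamhaus.org) are assumptions for illustration; the post does not say
which Spamhaus list is in use, and MailScanner does this internally.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "sbl.spamhaus.org") -> str:
    """Build the DNS name to query for an IPv4 address in a DNSBL zone.

    The default zone is an assumption; substitute whichever list you use.
    """
    # Validate the address first; malformed input raises ValueError.
    addr = ipaddress.IPv4Address(ip)
    # DNSBLs are queried with the octets in reverse order.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.1"))
```

If the resulting name resolves to an address in 127.0.0.0/8 the IP is
listed; if the lookup returns no such domain, it is not.
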
To feed it Ham I create imaginary users and subscribe them to various
internal and external lists used by the organization. Any whitelisted
mail or mail scored lower than 2 that comes in for these imaginary users
is submitted as Ham.
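
The ham rule above (whitelisted, or scored lower than 2) can be sketched
roughly as follows. This is an illustration only, not the actual setup: it
assumes the score can be read from an X-Spam-Status header written as
"hits=" or "score=", and the function names are made up. MailScanner may
record the score in a different header.

```python
import re
from email import message_from_string

# SpamAssassin has written the score as "hits=" or "score=" depending on version.
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(raw_message: str):
    """Return the score from X-Spam-Status, or None if it is missing."""
    msg = message_from_string(raw_message)
    match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
    return float(match.group(1)) if match else None


def is_ham_candidate(raw_message: str, whitelisted: bool = False,
                     max_score: float = 2.0) -> bool:
    """True if the message is whitelisted or scored below max_score."""
    if whitelisted:
        return True
    score = spam_score(raw_message)
    # A message with no recorded score is not trusted as ham.
    return score is not None and score < max_score
```

Messages that pass this check for the imaginary users would then be handed
to the learner, for example with sa-learn --ham.
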
To feed it Spam I create other bogus users (using very common names) and
hide the email addresses on websites and publicized guest lists.

This works fairly well at keeping the filtering up-to-date. Still, the
random crap does make it through occasionally. I've been thinking about
running an additional SpamAssassin test which keeps track of a hash for
every email that passes through from an external source. If a certain
threshold of emails comes in with that hash during an hour (or if a mail
matches a hash coming in for one of my bogus Spam users), then any
emails with that hash are marked as suspected Spam.
I don't do that now due to the volume of mail flowing through this site.
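
A rough Python sketch of that hash-tracking idea, to show its shape. Nothing
here is running anywhere: the class name, the SHA-1 digest of the
whitespace-normalized body, and the threshold of 50 are all assumptions. An
exact digest only matches identical bodies, so it would miss mail with
random filler words.

```python
import hashlib
import time
from collections import defaultdict, deque


class HashRateTracker:
    """Flag mail whose body hash arrives too often within a time window."""

    def __init__(self, threshold: int = 50, window_seconds: int = 3600):
        self.threshold = threshold
        self.window = window_seconds
        self.seen = defaultdict(deque)  # digest -> arrival timestamps
        self.trap_digests = set()       # digests seen by the bogus spam users

    @staticmethod
    def digest(body: str) -> str:
        # Lowercase and collapse whitespace so trivial reflowing still matches.
        normalized = " ".join(body.lower().split())
        return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

    def record_trap_mail(self, body: str) -> None:
        """Remember the hash of a mail that arrived for a bogus spam user."""
        self.trap_digests.add(self.digest(body))

    def is_suspect(self, body: str, now: float = None) -> bool:
        """Record one arrival and report whether this hash is suspected spam."""
        now = time.time() if now is None else now
        d = self.digest(body)
        times = self.seen[d]
        times.append(now)
        # Drop arrivals that have fallen out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        return d in self.trap_digests or len(times) >= self.threshold
```

The per-hash queues are what make this expensive at volume: every distinct
message body keeps state for a full hour.
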
SpamAssassin rocks. Used with a virus scanner, the Spamhaus RBLs, and
MailScanner, it really does a great job.

Good Luck - Jon Carnes