[TriLUG] Spamassassin question - Bayesian filtering

Chris MacLeod stick at miscellaneous.net
Thu Mar 13 23:10:42 EST 2003


I read the same thing (I'm also trying to grok the Bayesian filters and
make them useful).  But to make it meaningful, don't you also have to
tell it about an equal or greater amount of ham mail?  I have a ton of
spam (this email address has been in use for going on 5 years), but my
ham is mostly mailing lists or already sorted (I try to be organized).
I know I can pull it all together, but I thought I read somewhere that
you want most of both your spam and ham to be recent, since the Bayesian
filters take the date of the message into account to figure its
relevance to spam...

Stick

On Thu, Mar 13, 2003 at 05:37:55PM -0500, Jeremy Portzer wrote:
> On Thu, 2003-03-13 at 14:36, Jeremy Portzer wrote:
>
> Here's something I just found that eases my concerns somewhat, from the
> man page for configuration options:
> 
> <quote>
> auto_learn_threshold_nonspam n.nn (default -2.0)
>     The score threshold below which a mail has to score, to be fed into
>     SpamAssassin's learning systems automatically as a non-spam message.
> 
> auto_learn_threshold_spam n.nn (default 15.0)
>     The score threshold above which a mail has to score, to be fed into
>     SpamAssassin's learning systems automatically as a spam message.
> </quote>
> 
> These values are pretty conservative.  The spam emails that I was
> worried about normally have a score of 2 or 3, certainly not a negative
> score, so it looks like they're not being put into the Bayesian database
> anyway.  So I don't need to worry about "forgetting" them from the "ham"
> side of the DB.
> 
> This also shows that not all spam being caught (per default, score of 5
> is tagged as spam) is going into the Bayesian system.   So what I could
> do is find everything that is spam, but with a score lower than 15, and
> feed that to sa-learn manually.  Hmm.
> 
> --Jeremy
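
For what it's worth, the "spam, but with a score lower than 15" idea can
be sketched roughly like this.  This is only an illustration built on the
thresholds quoted above, not how SpamAssassin itself decides: the real
auto-learn logic may apply further conditions.  The function names are
made up for the example, and it assumes the score shows up in the
X-Spam-Status header as "hits=N.N" or "score=N.N".

```python
import re

# Thresholds taken from the man page excerpt quoted above; change them
# to match your own configuration.
AUTO_LEARN_NONSPAM = -2.0   # auto_learn_threshold_nonspam
AUTO_LEARN_SPAM = 15.0      # auto_learn_threshold_spam
REQUIRED_SCORE = 5.0        # default score at which mail is tagged as spam

# Assumption: the score appears in X-Spam-Status as "hits=N.N" or "score=N.N".
_SCORE_RE = re.compile(r"\b(?:hits|score)=(-?\d+(?:\.\d+)?)")


def parse_score(status_header):
    """Return the score from an X-Spam-Status value, or None if absent."""
    match = _SCORE_RE.search(status_header)
    return float(match.group(1)) if match else None


def auto_learn_class(score):
    """Return "ham" or "spam" if the score crosses an auto-learn threshold,
    or None if the message would not be auto-learned at all."""
    if score < AUTO_LEARN_NONSPAM:
        return "ham"
    if score > AUTO_LEARN_SPAM:
        return "spam"
    return None


def needs_manual_spam_training(score):
    """True for mail that was tagged as spam but scored too low to be
    auto-learned, i.e. the candidates for feeding to sa-learn by hand."""
    return score >= REQUIRED_SCORE and auto_learn_class(score) is None
```

Messages for which needs_manual_spam_training() is true could then be
collected and handed to "sa-learn --spam".  A score of 2 or 3 comes out
of auto_learn_class() as None, which matches the observation above that
such mail is not learned as ham either.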