[TriLUG] stopping Cyrillic spam.
Daniel Sterling
dan at lost-habit.com
Sun Jan 28 01:05:06 EST 2007
Cristóbal Palmer wrote:
> We're already using content checks... and other techniques.
Excellent! I hate to be repetitive, but please keep using statistical
analysis! I run spamassassin with the bayes *off*. Spam that
spamassassin misses is filtered by Thunderbird's built-in statistical
analysis. I have a silly setup like this mostly because it works and I
am too lazy to change it.
Anyway, my Thunderbird's filters are catching the Cyrillic spam. I
noticed that the following fun keyword is in mine:
charset="windows-1251"
windows-1251 is a Cyrillic encoding (Windows code page 1251). If you
never expect legitimate Cyrillic mail, you can trash messages with
that string.
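For anyone who wants to try that check outside a mail client, here is a rough Python sketch. It is not what spamassassin or Thunderbird do internally; the function name declares_windows_1251 is made up for illustration, and it only looks at the charset declared in each MIME part's Content-Type header.

```python
import email


def declares_windows_1251(raw_message: str) -> bool:
    """Return True if any MIME part declares charset windows-1251.

    Hypothetical helper for illustration; it trusts the declared
    charset and does not inspect the body bytes at all.
    """
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        # get_content_charset() returns the charset parameter in
        # lower case, or None when no charset is declared.
        charset = part.get_content_charset()
        if charset == "windows-1251":
            return True
    return False
```

Since it relies on the header alone, a message that declares some other charset slips straight past it, which is one more reason to keep the statistical filter running behind it.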
Also, you may or may not have good luck with the following bit of
regex: [\x{400}-\x{52f}] -- let me know! That range covers the
Unicode Cyrillic and Cyrillic Supplement blocks (U+0400 to U+052F).
(I suppose it mostly depends on whether the string being matched is
handled with byte or character semantics.)
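To show what I mean by byte versus character semantics, here is a small Python sketch. Python's re module does not accept the Perl-style \x{400} escape, so I have written the same range as \u0400-\u052f; the helper name has_cyrillic is just for illustration. Decoding the raw bytes as latin-1 stands in for "byte semantics", since it maps each byte to one character below U+0100.

```python
import re

# Same range as [\x{400}-\x{52f}], written with Python's \u escapes.
CYRILLIC_CHARS = re.compile(r"[\u0400-\u052f]")


def has_cyrillic(text: str) -> bool:
    """Return True if text contains a code point in U+0400..U+052F."""
    return CYRILLIC_CHARS.search(text) is not None


sample = "Привет"
raw = sample.encode("windows-1251")

# Character semantics: the bytes were decoded with the right charset,
# so the pattern sees real Cyrillic code points.
print(has_cyrillic(raw.decode("windows-1251")))  # True

# Byte semantics (simulated): each byte becomes one character below
# U+0100, so nothing falls inside the Cyrillic range.
print(has_cyrillic(raw.decode("latin-1")))  # False
```

So the regex only helps if the filter decodes the body before matching; against undecoded bytes it will quietly match nothing.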
-- Dan