[TriLUG] stopping Cyrillic spam.

Cristóbal Palmer cristobalpalmer at gmail.com
Mon Jan 29 18:31:57 EST 2007


[\x{400}-\x{52f}] is awesome! It catches approximately 92% of the test
messages I gave it. Now to see if I can actually get it working
properly with SpamAssassin (SA)...
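
For anyone following along, here is a minimal sketch of the byte-versus-
character caveat Dan raises in the message quoted below. It is in Python
rather than SpamAssassin's Perl, and it is not the rule I will end up
with. The function names are made up for illustration, and the byte
pattern assumes the message body is UTF-8.

```python
import re

# Character semantics: the same range as [\x{400}-\x{52f}], i.e. the
# Unicode Cyrillic (U+0400-U+04FF) and Cyrillic Supplement
# (U+0500-U+052F) blocks, matched against decoded text.
CYRILLIC_CHARS = re.compile("[\u0400-\u052f]")

# Byte semantics: the same code points as they appear in UTF-8.
# U+0400..U+04FF encode as lead bytes 0xD0..0xD3 plus one continuation
# byte; U+0500..U+052F encode as 0xD4 followed by 0x80..0xAF.
CYRILLIC_UTF8_BYTES = re.compile(rb"[\xd0-\xd3][\x80-\xbf]|\xd4[\x80-\xaf]")


def has_cyrillic_text(text: str) -> bool:
    """Hypothetical helper: True if decoded text contains a Cyrillic character."""
    return CYRILLIC_CHARS.search(text) is not None


def has_cyrillic_utf8(raw: bytes) -> bool:
    """Hypothetical helper: True if raw bytes contain a UTF-8 encoded Cyrillic character."""
    return CYRILLIC_UTF8_BYTES.search(raw) is not None


if __name__ == "__main__":
    sample = "Привет"

    # Decoded text: the character-range pattern matches directly.
    print(has_cyrillic_text(sample))

    # UTF-8 bytes: only the byte-level pattern applies.
    print(has_cyrillic_utf8(sample.encode("utf-8")))

    # windows-1251 bytes: the UTF-8 byte pattern does not see Cyrillic
    # until the body is decoded with the declared charset.
    raw_1251 = sample.encode("cp1251")
    print(has_cyrillic_utf8(raw_1251))
    print(has_cyrillic_text(raw_1251.decode("cp1251")))
```

The takeaway is that a character-range pattern only helps if the filter
hands the regex decoded text. Against raw windows-1251 bytes, a check on
the charset string is probably the more reliable signal.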

Thanks!

I'll ping back with the rule I come up with.

-CMP

On 1/28/07, Daniel Sterling <dan at lost-habit.com> wrote:
> Cristóbal Palmer wrote:
> > We're already using content checks... and other techniques.
> Excellent! I hate to be repetitive, but please keep using statistical
> analysis! I run SpamAssassin with Bayes *off*. Spam that
> SpamAssassin misses is filtered by Thunderbird's built-in statistical
> analysis. I have a silly setup like this mostly because it works and I
> am too lazy to change it.
>
> Anyway, my Thunderbird's filters are catching the Cyrillic spam. I
> noticed that the following fun keyword is in mine:
>
> charset="windows-1251"
>
> windows-1251 is a Cyrillic encoding (Windows code page 1251). You can
> definitely trash messages with that string.
>
> Also, you may or may not have good luck with the following bit of
> regex: [\x{400}-\x{52f}] (the Unicode Cyrillic and Cyrillic Supplement
> blocks) -- let me know! (I suppose it mostly depends on whether the
> string being matched against uses byte or character semantics.)
>
> -- Dan


-- 
Cristóbal M. Palmer
UNC-CH SILS Student -- ils.unc.edu/~cmpalmer
TriLUG Vice Chair
"There are many roads to enlightenment, and thus many roads back to
the One True Debian" --crimsun

