[TriLUG] Base 64 (perl) regular expressions - ɹǝʇɹoԀ uɐl∀™

matt at noway2.thruhere.net
Wed Mar 7 17:33:00 EST 2012


I agree that blindly rejecting UTF-8 is problematic.  That is where the
regex part comes into play.  It determines which script is being used.
For example, the CJK Unified Ideographs block runs from U+4E00 to U+9FFF,
which in UTF-8 is the byte sequence e4 b8 80 through e9 bf bf.
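
As a rough illustration of that range (a Python sketch of my own, not the
rule from the link; the names CJK_UTF8 and utf8_hex are made up for this
example), the two endpoints can be encoded to confirm the byte values,
and the range can be written as a byte-level pattern:

```python
import re

def utf8_hex(ch):
    """Return the UTF-8 encoding of one character as spaced hex bytes."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))

# Endpoints of the CJK Unified Ideographs block, U+4E00 and U+9FFF.
first = utf8_hex("\u4e00")   # e4 b8 80
last = utf8_hex("\u9fff")    # e9 bf bf

# Byte-level pattern covering U+4E00..U+9FFF in UTF-8:
#   e4 [b8-bf] [80-bf]        -> U+4E00..U+4FFF
#   [e5-e9] [80-bf] [80-bf]   -> U+5000..U+9FFF
CJK_UTF8 = re.compile(
    rb"\xe4[\xb8-\xbf][\x80-\xbf]|[\xe5-\xe9][\x80-\xbf][\x80-\xbf]"
)

def has_cjk_bytes(data):
    """True if the raw UTF-8 bytes contain a character in U+4E00..U+9FFF."""
    return CJK_UTF8.search(data) is not None
```

Note that the pattern needs two alternatives because the lead byte e4 only
covers part of the block; a single range over all three bytes would also
match characters below U+4E00.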

The trick is to do the base64 decode, then look for the hex strings
corresponding to these ranges in the subject header.  SpamAssassin does
this.  For anyone who is having trouble with this spam, you can take the
code in the link in my previous post, create a file from it with a .cf
extension and put it in /etc/spamassassin.  This will cause SpamAssassin
to identify the ideographic character set and add points to the message's
score.  I've tested this pretty thoroughly against a few spam messages
and it is quite effective.
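
To make the decode-then-check idea concrete, here is a minimal Python
sketch.  It is not the SpamAssassin rule from the link, and the function
name subject_has_cjk is just something I made up for the example.  It
uses the standard library's email.header.decode_header to undo the
base64 (or quoted-printable) encoding, then tests the decoded text for
code points in U+4E00..U+9FFF instead of matching raw hex bytes, so it
also works when the declared charset is not UTF-8:

```python
import re
from email.header import decode_header

# Code points of the CJK Unified Ideographs block.
CJK = re.compile("[\u4e00-\u9fff]")

def subject_has_cjk(raw_subject):
    """Decode an RFC 2047 Subject value and look for CJK ideographs."""
    for part, charset in decode_header(raw_subject):
        if isinstance(part, bytes):
            try:
                text = part.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset label: fall back to a lossless 8-bit decode.
                text = part.decode("latin-1")
        else:
            # Headers with no encoded words come back as str.
            text = part
        if CJK.search(text):
            return True
    return False
```

For example, subject_has_cjk("=?UTF-8?B?5Lit5paH?=") should report a
match, while a plain ASCII subject should not.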

Ideally, I would like to get the detection done at the MTA level.

I am just having a bit of trouble understanding part of the expression.

>
> You'll get a lot of false positives if you're going to blindly discard
> emails that encode their subjects using UTF-8.
>
> For a while, I had a maildrop filter that would UTF-decode the subjects
> and do checks against that.  But recently, I have switched to sieve,
> which has a less rich set of scripting tools for mail filtering.
>
> All that said, you're looking at postfix filtering, which happens at the
> MTA stage as the mail enters the server, way before the MDA stage where
> it is being delivered to my user and checked against my sorting rules.
>
> Good luck, and report back what you find out.
>
> --
> # ɹǝʇɹoԀ uɐl∀



