June 30, 2009
The Trifecta, or, tweaking your way to glory
We have our own home-grown sendmail antispam filters here, which use a fairly broad brush to score incoming mail, but which have been remarkably effective for us for over six years.
One of the data points we check is of course whether the sending host has a generic PTR, via the enemieslist DNSBL. But we also find it useful to check the TCP fingerprint of the sending host, to see if the box on the other end is running some form of Windows - particularly certain highly vulnerable releases and patchlevels, like Windows XP Service Pack 1. We also check to see whether the message in question is in multipart/alternative format, or "HTML email", because in our experience it's rare to see spam that is in plain text format.
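To make the third check concrete, here is a minimal sketch of how a filter might decide that a message counts as "HTML email". It is not our sendmail filter code; it uses Python's standard `email` package, and the function names and sample messages are made up for illustration.

```python
from email import message_from_string


def content_types(raw_message: str) -> set:
    """Return every MIME content type found in the message, root included."""
    msg = message_from_string(raw_message)
    return {part.get_content_type() for part in msg.walk()}


def is_html_mail(raw_message: str) -> bool:
    """Treat multipart/alternative or any text/html part as 'HTML email'."""
    types = content_types(raw_message)
    return "multipart/alternative" in types or "text/html" in types


# Hypothetical sample messages, for illustration only.
HTML_SAMPLE = (
    "From: sender@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--b1\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--b1--\n"
)
PLAIN_SAMPLE = "From: sender@example.com\nContent-Type: text/plain\n\nhello\n"

if __name__ == "__main__":
    print(is_html_mail(HTML_SAMPLE))
    print(is_html_mail(PLAIN_SAMPLE))
```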
Each of these conditions (HTML, Windows, generic PTR) earns a fairly low spam score on its own, because of course it's perfectly normal for mail to be in HTML format, and there are many Windows boxes running MSExchange and other legitimate Windows-based mail server software. And of course, there are many small businesses with generic addressing on their static netspace. The problem is when we see all three together.
As a default, all of our local accounts here have a spam score threshold of 4, which is sufficient to keep out the vast majority of the inbound spam - especially if the local scoring has been tweaked to give high scores to generic HELOs and low to generic PTRs - and which lets almost all normal mail traffic through. For historical reasons, the scoring is all done in integers, so we don't have the fine-tuning capabilities available in SpamAssassin, for example, where an HTML message might get a 1.7 just for containing HTML and no text part. Here, by default, HTML email scores a point, any Windows system scores a point, and any other issue is usually enough to dump the message into the quarantine. A static generic PTR gets 2 points. So the Trifecta adds up to 4 points, which is enough to reject the message for most accounts.
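The arithmetic is simple enough to sketch in a few lines. The point values and the threshold of 4 are the defaults described above; the dictionary keys and function names are hypothetical, and the real filters check many more conditions than these three.

```python
# Default per-account threshold, as described above.
THRESHOLD = 4

# Integer points per condition. The key names are made up for this sketch.
POINTS = {
    "html": 1,                # HTML email
    "windows": 1,             # sending host fingerprints as Windows
    "static_generic_ptr": 2,  # static generic PTR
}


def score(conditions):
    """Sum the integer points for every condition the message triggered."""
    return sum(POINTS[name] for name in conditions)


def rejected(conditions, threshold=THRESHOLD):
    """A message is rejected once its score reaches the threshold."""
    return score(conditions) >= threshold


if __name__ == "__main__":
    trifecta = ["html", "windows", "static_generic_ptr"]
    print(score(trifecta), rejected(trifecta))
    print(score(["html", "windows"]), rejected(["html", "windows"]))
```

Any two of the three conditions stay under the threshold; it takes all three, or one more unrelated problem, to tip a message over.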
Pretty much the only time we ever have to whitelist anyone here is when the sender has hit the Trifecta outlined above: HTML-only email, sent from a Windows box, with a generic (almost always static) PTR. What's sad about this isn't that we have to make up for their IT consultants' failure to bother to request a custom PTR, or that some people run MTA software that spits out HTML-only email. No, that's pretty much par for the course in any industry without a need for a full-time IT person or team. Lawyers, galleries, non-profits, small businesses of many kinds are subject to the pressure to conform - and to pay lots of money for Exchange (when they could use free, high-performance Unix-based mail server software), and for the skills needed to install it (poorly), maintain it (poorly) and patch and upgrade it (rarely). OK, enough Unix bigotry. For now.
Some will complain that we shouldn't be blocking (or even scoring discriminately) on known "statics". The problem is that there are a lot more statically assigned IPs out there that have unfiltered access to the rest of the Internet, and are vulnerable to infection by the botnets, than there are legitimate mail servers with generic PTRs.
For example, yesterday we blocked 349 messages sent from static generics out of 8810 total rejected messages, or 4% of our total rejections, with one false positive (the message that spurred on this post). Of those, 117 were from .com or .net hosts, with the rest coming from ccTLDs we rarely have legitimate traffic from, so we can't just accept from static generics with .com or .net TLDs.
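For anyone who wants to check those figures, the arithmetic works out as follows. The numbers are the ones quoted above; the variable names are just for this sketch.

```python
total_rejected = 8810   # all messages rejected yesterday
static_generic = 349    # of those, sent from static generics
com_or_net = 117        # static generics under .com or .net

# Share of all rejections that came from static generics.
share = static_generic / total_rejected

# Everything not under .com or .net came from ccTLDs.
from_cctlds = static_generic - com_or_net

if __name__ == "__main__":
    print(f"{share:.2%} of rejections were static generics")
    print(f"{from_cctlds} of them came from ccTLDs")
```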
To effectively work around the infected statics problem while avoiding the occasional Trifecta-as-FP problem will take some more analysis, or some more widespread clue among Windows IT consultants. And we're not going to reduce our overall filter effectiveness by 4% daily just because of a once-a-quarter FP due to a lack of care on the part of someone else. So we need to tweak, and tune, our policies on this end without compromising our perimeter defenses, or adding to my quarantine watch workload.
Our system usually generates what, to our biased minds, are perfectly useful and informative error messages, especially in response to particular problems. The problem with the Trifecta is that we're blocking based on a score, not a specific set of problems, so the error looks like this:
554 5.7.1 HISCORE Contact firstname.lastname@example.org if this is in error, but your message was rejected as spam; it simply failed too many tests. (threshold: 4; score: 4)
There's a token (for our stats), immediately followed by a contact email address that is more or less unfiltered, a rationale, and a score/threshold. The problem is that many Exchange servers either truncate the error message, rendering it less useful, or explain that the remote system did not provide a reason - often including the complete error message beneath! - which most people don't bother to read. So we get phone calls to the effect that our system is blocking their mail. Which it is, and in many cases these are actual false positives. So we whitelist their IP address, and they can send again.
(Incidentally, of the 349 messages we rejected, six had a 4/4 threshold/score; one of those was the false positive. Two had a 4/5, two had a 4/6, three had a 4/7. So, one way to deal with this is to raise our default threshold to 5, thereby letting in 7 more spams a day in order to prevent a quarterly FP. This on a system where userbase-wide we see about 3 or 4 spams/day make it through the filters, and maybe a couple of 419 scams and phishing scams. So, a difficult choice - how tolerant do we become, and how low do we sink in order to accommodate these arguably at-fault systems?)
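For a sender's admin staring at a bounce, the rejection line quoted above has a regular enough shape to be pulled apart mechanically. The following is a rough sketch, assuming the line arrives intact and in exactly that layout; the regular expression and the function name are mine, not part of our filters. A truncated line simply fails to parse, which is more or less the Exchange problem in miniature.

```python
import re

# Layout assumed from the rejection line quoted above:
# SMTP code, enhanced status code, stats token, free text, (threshold; score).
REJECTION = re.compile(
    r"^(?P<code>\d{3}) (?P<enhanced>\d\.\d+\.\d+) (?P<token>[A-Z]+) "
    r".*\(threshold: (?P<threshold>\d+); score: (?P<score>\d+)\)$"
)


def parse_rejection(line):
    """Return the fields of a rejection line, or None if it does not match."""
    match = REJECTION.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["threshold"] = int(fields["threshold"])
    fields["score"] = int(fields["score"])
    return fields


EXAMPLE = (
    "554 5.7.1 HISCORE Contact firstname.lastname@example.org if this is in "
    "error, but your message was rejected as spam; it simply failed too many "
    "tests. (threshold: 4; score: 4)"
)

if __name__ == "__main__":
    print(parse_rejection(EXAMPLE))
    # A truncated copy loses the score, so there is nothing to parse.
    print(parse_rejection(EXAMPLE[:40]))
```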
What's even more annoying is that once we've whitelisted the sending IP address of one of these poor victims, they'll go home and try to send from Outlook Web Access, which many IT consultants set up on yet another IP address, also with a generic static PTR. So we go through the whole rigamarole again, only this time with their OWA IP address.
The real problem here is two-fold: the failure of IT consultants to have even the most basic understanding of the nature of deliverability and its relationship to the generic PTR question, and the continuing acceptance of such a low standard of compliance with email community norms. (And yes, there's a third factor, namely, my reluctance to raise the default spam score threshold just to accommodate these edge cases.)
So let me close with a plea to any IT consultant tasked with setting up a Windows-based mail system: please, for the love of all that is good and holy, ask your customers' ISPs for custom reverse DNS for any system legitimately sending mail. We'll tolerate your HTML-only email, and your choice of Windows, if you'll do your part and signal to us with a custom PTR that this is a system that is intended to send mail, rather than an infected end-user system or NAT or insecure LAN.
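If you want a quick sanity check on your own mail host, here is a deliberately crude sketch of what "generic" tends to mean in practice: the PTR name embeds the IP address. This is only a heuristic I made up for illustration, not the pattern set the enemieslist DNSBL uses, and it will miss hex-encoded or otherwise obfuscated names. The hostnames and address below are documentation placeholders.

```python
import re


def _contains_run(sequence, run):
    """True if `run` appears as a contiguous slice of `sequence`."""
    width = len(run)
    return any(
        sequence[i:i + width] == run
        for i in range(len(sequence) - width + 1)
    )


def looks_generic(ptr: str, ip: str) -> bool:
    """Crude check: does the PTR hostname embed the IPv4 address's octets?"""
    octets = ip.split(".")
    # Every run of digits in the hostname, with zero-padding removed.
    numbers = [str(int(n)) for n in re.findall(r"\d+", ptr)]
    # Accept the octets in forward or reversed order.
    return _contains_run(numbers, octets) or _contains_run(numbers, octets[::-1])


if __name__ == "__main__":
    ip = "192.0.2.10"
    for name in (
        "static-192-0-2-10.example.net",
        "10.2.0.192.dsl.example.net",
        "mail.example.com",
    ):
        print(name, looks_generic(name, ip))
```

A name like `mail.example.com` passes this check, and it is also the kind of name that tells a receiving system the box is meant to send mail.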
Posted by schampeo at June 30, 2009 3:08 PM