April 26, 2005

Brian Livingston: "Can We Restore Reliability to Email?"

Livingston's latest article (see also the previous articles) discusses whether email will ever be trusted as a reliable medium. He cites several studies and quotes Suresh calling at least one of the studies into question.

The variation, such as it is, in e-mail reliability studies like these is almost certainly due to the fact that these consulting firms are measuring different mail streams. Each one has a client base that's used as a "test bed" to determine which e-mails get delivered to test accounts at which ISPs. Given that reality, it's remarkable that the latest three studies are as similar as they are.

I don't know whether the studies are reliable, but I think that before "we" can restore reliability to email, the "permission-based direct marketing" industry needs to restore, or more accurately, achieve, trust. We reject or quarantine messages suspected of being spam because most such mail is spam. We get mail to addresses that don't exist, much less to addresses that requested it. We see marketing lists without confirmed opt-in loops, and messages lacking even the most basic RFC compliance, sent from hosts with questionable names that behave in ways the protocols don't allow.

And as a side note, it's been amazing to me that simply implementing stricter checks on basic things, like whether a message has a Message-ID header (which it should), ends up blocking so much abusive mail and so little legitimate mail.
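A minimal sketch of that kind of check, in Python, might look like the following. It is illustrative only and not the filter we actually run; the function name has_message_id is made up for this example, and it only tests that the header is present and non-empty, not that its contents are well-formed.

```python
from email import message_from_string


def has_message_id(raw_message: str) -> bool:
    """Return True if the message carries a non-empty Message-ID header.

    Hypothetical helper for illustration; a real filter would run inside
    the MTA and would likely log or score rather than just return a bool.
    """
    msg = message_from_string(raw_message)
    # Header lookup in the email package is case-insensitive.
    value = msg.get("Message-ID")
    return value is not None and value.strip() != ""
```

A mail server could call something like this at the end of the DATA phase and reject or quarantine when it returns False.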

Unfortunately, even though most spam has these flaws, and they can therefore be associated with spam and viruses, legit mail often looks just as poorly constructed. Microsoft, for example, broke the Message-ID support in Outlook deliberately, as a security measure designed to prevent "leakage" of internal LAN hostnames, even though most of them were stupid and useless NetBIOS names or RFC 1918 IP addresses anyway.

We blocked a few hundred thousand viruses in the months between MyDoom and Sober, just because we started rejecting mail from hosts whose HELO looked like a NetBIOS hostname. Sober and some of the more recent viruses wised up and started tacking ".com" or ".org" onto the NetBIOS name. That would also be easy to block if we could reject solely on whether the HELO string resolves, but there are so many broken hosts out there that we can't do that, either, without losing some mail.
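To make the two HELO tests concrete, here is a rough sketch in Python. It assumes that "looks like a NetBIOS hostname" means a single label with no dots, at most 15 characters long; that definition, and the names looks_like_netbios and helo_resolves, are my assumptions for illustration rather than the exact rules we use. Note that a single-label test like this also matches names such as "localhost".

```python
import re
import socket

# Assumption: a NetBIOS-looking HELO is one dotless label of 1 to 15
# characters (15 being the usable length of a NetBIOS name).
_NETBIOS_LIKE = re.compile(r"[A-Za-z0-9_-]{1,15}")


def looks_like_netbios(helo: str) -> bool:
    """Return True if the HELO argument is a bare, dotless short name."""
    return _NETBIOS_LIKE.fullmatch(helo.strip()) is not None


def helo_resolves(helo: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the HELO argument resolves to at least a lookup success.

    The resolver parameter is a hypothetical hook so the lookup can be
    swapped out; by default it uses the system resolver.
    """
    try:
        resolver(helo.strip(), None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

The first test is cheap and safe to reject on. The second catches the "NAME.com" trick, but as noted above, plenty of legitimate hosts fail it too, so it is better suited to scoring than to outright rejection.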

So the game continues; Livingston is correct in pointing out that legit mail gets lost. But the bottom line is that there are many different shades of "legit", and we'd be much better off if everyone started taking RFC compliance just a bit more seriously. Some would argue that the spammers would just start sending their spam as better-formed email from better-configured hosts, and they're probably right. But for the time being, strictness is keeping a lot of spam out of our servers, and if it means that some yoyo with a misconfigured mail server can't send us a legit message without a Message-ID from a copy of Outlook, I guess that's up to them to fix, not me.

