Internet security & antispam
A perennial debate has just arisen again, twice in the same week, and no, it's not "vi vs. emacs"; it is the question of whether static IPs with generic names are sufficiently high-risk for us to block or score mail sent from them. Of course, everyone's mail flows are different, as are their tolerances for spam and other abuse. So you'll need to frame your own views of the matter in terms of your local policies, how your tools work, your aims in looking at PTR and HELO, etc. But let me lay out the history, outline my basic argument, and back it up with some data from a recent CBL zone.
Historically speaking, the concept of "dynamics" in the context of blocking spam arose out of the use of throwaway dialup accounts to send spam; early ankle-biter spammers, derided as "chickenboners" (in reference to their imagined mobile home littered with chicken bones) were assumed to be low-rent halfwit losers who'd fallen for some get-rich-quick-by-email scheme or similar. As it is a basic tenet that mail servers themselves should have relatively static addresses (because if they act as MX for their domain, you don't want to give up your IP to some other dynamic host that might then start receiving mail already queued up to be sent to your old, dynamically assigned, IP), the rough distinction between "dynamics" and "statics" was born.
Administrators started blocking based on whether an IP was believed to be dynamic, using various mechanisms. Early attempts included simple (and overly simplistic) regular expressions, which treated any host with substrings common to dynamics as suspect: "dial", "dyn", "ppp", etc. Blacklists such as SORBS DUL and Dynablock started up with the sole intent of listing such IPs. And Enemieslist started collecting fully-qualified regular expressions for "generics", classifying them as dynamic or static as best we could; our sendmail antispam package treated generic and static as less risky in most contexts, but still scored them as somewhat risky.
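To see why those early substring checks were overly simplistic, here is a minimal sketch in Python. It is not the Enemieslist pattern set or anyone's production filter; the `looks_dynamic` helper and the sample hostnames are hypothetical, and only the three tokens come from the text above.

```python
import re

# Assumption: only the three substrings named in the post; real pattern
# sets are far larger and anchored to specific provider naming schemes.
DYNAMIC_HINTS = re.compile(r"dial|dyn|ppp", re.IGNORECASE)


def looks_dynamic(ptr_name: str) -> bool:
    """Return True if the PTR name contains a substring common to dynamics."""
    return DYNAMIC_HINTS.search(ptr_name) is not None


# Hypothetical PTR names, for illustration only.
examples = [
    "ppp-12-34.dialup.example.net",     # plausible dialup pool: matches
    "mail.example.com",                 # custom mail server name: no match
    "dynamics-consulting.example.com",  # false positive: contains "dyn"
]

for name in examples:
    print(f"{name}: {looks_dynamic(name)}")
```

The third name is the problem case: a bare substring match can't tell a dynamic pool from a company that happens to have "dyn" in its name, which is roughly why fully-qualified patterns were worth the effort.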
(As an aside, I fondly recall the days when you could get spammed via an obvious dialup account and respond by sending them the "Ping of Death" - ping with a custom payload such that when the remote host's TCP/IP stack sent it back it sent the Hayes modem command for "hang up now", which the modem promptly obeyed. Heh. Good times.)
Anyway, with the rapid rise of so-called "broadband", or high speed residential Internet connections (cable, DSL, fiber optics, WiMax, and more) the distinction has become blurred - for example, I had the same "dynamic" IP assigned to the cable modem at my house for three years, during which time we renumbered our connection to the office three times, using three different providers. (Fortunately, we had hosted our servers in a colo throughout, so it wasn't an issue from the perspective of dealing with email). What was once "dynamic" in practice as well as in definition has become more static in reality. So, the definition broadened, or morphed, to denote IPs that were assigned, however dynamically or statically, to "end users", whose PCs were presumed to get infected more than hosts on commercial networks with dedicated IT staff to watch over them. And this may well be the case. We certainly see more spam traffic from hosts we've classified as dynamic than we do from those we've classified as static.
Nonetheless, while it may be true, and effective, to argue that you don't want to accept mail from "dynamics" because the risk of that traffic being spam is so high in this million-host-botnet day and age, it doesn't follow that just because a host has a statically assigned address and a generic name, you needn't filter mail from it as well. (I'm leaving aside for now the question of whether it makes sense to run a small home mail server from a dynamically assigned IP, using dyndns or other mechanism. I have my opinions, but they're not particularly germane to the central point I'm trying to make about statics.) Why should we treat any host with generic PTR as suspect, regardless of assignment type?
The simplest way to think of this is in terms of Venn diagrams. You can divide the Internet into three basic classes of IP: dynamically assigned "leaf nodes", or end users, statically assigned leaf/end user nodes, and statically assigned infrastructure nodes (over which the intermediate traffic flows). You can also include a slice for unassigned or unused IP space, and further subdivide them into still more categories, but those are the basics as far as we're concerned. Hosts with the first type of address are what most people think of when they think of botnets - your grandmother's home PC, connected directly to the Internet via high-speed DSL, infected with perhaps several bots, and spewing spam just under the rate-limiting threshold. But when people think of the second kind of hosts, for some reason they don't think exactly the same thing.
Statistically speaking, this is odd because even if you take out the "infrastructure" static IPs mentioned above, most of the remaining "leaf node" statics will not be mail servers. At our tiny office, we have a /27 at our disposal - 32 static addresses, five burned as network infrastructure (network, broadcast, cable modem, and two NAT/VPN hosts); of the rest, only two send any mail at all, and one of those is a spamtrap server and by definition only forwards known spam to feed remote trap sinks. That's one, possibly two if you count the trap server, of the remaining 27 statics approved to send mail to remote servers. Many businesses are in a similar position, or use hosted offsite mail solutions for their MXen. Even universities with large allocations (typically /16s) and public static LAN IPs fall under similar ratios - the larger the network, the more likely they will have a dedicated subnet or subnets for mail infrastructure, often with custom PTRs. So the remainder of hosts with static assignments will likely have generic names and not be marked as legitimate mail sources. (Note that this excludes mail sent within an organization, which in modern times is usually sent via authenticated connections such as SMTP AUTH.)
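The office arithmetic above can be checked with a few lines of Python. This is just a sketch: the `192.0.2.0/27` prefix is a documentation range standing in for our real block (an assumption), and the counts are the ones given in the paragraph.

```python
import ipaddress

# Assumption: 192.0.2.0/27 is a documentation prefix, not the real block.
block = ipaddress.ip_network("192.0.2.0/27")

total = block.num_addresses         # addresses in a /27
infrastructure = 5                  # network, broadcast, cable modem, two NAT/VPN hosts
remaining = total - infrastructure  # statics left for ordinary hosts
mail_senders = 2                    # includes the spamtrap forwarder

share = mail_senders / remaining

print(f"total addresses: {total}")
print(f"remaining statics: {remaining}")
print(f"share that sends mail: {share:.1%}")
```

Even counting the trap server, well under a tenth of the usable statics in that block have any business sending mail to remote servers.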
A couple of special cases to consider are Web hosting providers, which are typically statically assigned (though with the rise of "cloud" computing even this is becoming less common), and NAT and/or PAT firewalls, which although static may have multiple, even dynamic, hosts sending traffic through them. Enemieslist classifies these as well, because of the heightened risk associated with them; the former are commonly compromised and used in phishing attacks, and the latter are often not secured properly against unauthorized outbound port 25 traffic.
But back to statics - I said I'd share some data. I resolved a whole CBL zone back in May, and once I'd stripped out the IPs without any PTR at all, had a set of 4169150 unique hosts. Of those hosts, Enemieslist has patterns that match all but 19082 of them (99.54%). The full set breaks down by class as follows (bear in mind that it's probable that many of those simply classed as "generic" may be either dynamic or static or even a mix of both; we're trying to reduce the number of "generic" patterns, but it's a long slog and may be hopeless in most cases due to lack of information).
| count | class |
|---|---|
| 3548898 | dynamic |
| 328227 | static |
| 174089 | generic |
| 49910 | badrdns |
| 34672 | mixed |
| 19082 | no enemieslist classification |
| 8538 | natproxy |
| 4119 | unassigned |
| 1181 | webhost |
| 234 | outmx |
| 184 | resnet |
| 9 | spammer |
| 7 | cloud |
Okay, looking at the data, obviously there are many more dynamics in that list than anything else; roughly eleven times the number of statics, and roughly twenty times the number of generics. But notice that 328227 of the hosts are static - roughly 8% of the total. Throw in NATs, webhosts, resnets, and mixed, and we're looking at 9%, and if you assume that generics are static (because not obviously dynamic) we're talking about more than 1 in 8.
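If you'd like to check those ratios yourself, here is a quick sketch that recomputes them from the table. The counts are copied from the table above; the dictionary keys and the `pct` helper are just names I've picked for illustration ("unclassified" stands in for the "no enemieslist classification" row).

```python
# Counts copied from the table; "unclassified" is the row with no
# Enemieslist classification.
counts = {
    "dynamic": 3548898,
    "static": 328227,
    "generic": 174089,
    "badrdns": 49910,
    "mixed": 34672,
    "unclassified": 19082,
    "natproxy": 8538,
    "unassigned": 4119,
    "webhost": 1181,
    "outmx": 234,
    "resnet": 184,
    "spammer": 9,
    "cloud": 7,
}

total = sum(counts.values())


def pct(n: int) -> float:
    """Percentage of all hosts in the resolved zone."""
    return 100 * n / total


static = counts["static"]
# Statics plus NATs, webhosts, resnets, and mixed.
static_like = static + sum(counts[k] for k in ("natproxy", "webhost", "resnet", "mixed"))
# Same again, assuming every generic is really a static.
with_generic = static_like + counts["generic"]

print(f"total hosts: {total}")
print(f"dynamic vs static: {counts['dynamic'] / static:.1f}x")
print(f"dynamic vs generic: {counts['dynamic'] / counts['generic']:.1f}x")
print(f"static: {pct(static):.1f}%")
print(f"static-like: {pct(static_like):.1f}%")
print(f"static-like plus generic: {pct(with_generic):.1f}%")
```

The last figure lands a little over 13%, which is where "more than 1 in 8" comes from.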
Still think it's not worth your while to block, or at least score, static IPs with generic names, as probable sources of spam? Admittedly, the risk of false positives is slightly higher than with dynamics - there are of course going to be more actual legitimate mail servers in amongst those statics, and the word about custom PTRs for mail servers hasn't quite gotten out to many admins in small businesses and the like. But still, an impressive number of statics are infected and spewing on any given day, and if you look at the numbers above you'll see that even the hosts Enemieslist doesn't match (which are, for the most part, one-off mail servers with custom PTRs that we haven't bothered to make patterns for) amount to less than six percent of the number of statics.
So, the next time you start to argue that blocking mail from static generically named hosts isn't worth the risk, ask yourself whether it's actually worth the risk of false positives to let in traffic from the 94% of the infected, known static hosts above.
Posted by schampeo at July 13, 2009 2:55 PM