Internet security & antispam
In addition to the humorous, and sometimes blatantly idiotic, practices discussed in our last post, we now turn our attention to the core set of end user node naming practices (for name-based filtering of abusive traffic, anyway). Perhaps first and foremost of these is what I like to call the Most Significant Token principle. Similar in concept to the idea of a most significant bit in programming, this is an approach to naming that expects the token to the immediate left of the domain to be where we find the most significance (for our purposes).
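To make the principle concrete, here is a minimal sketch of pulling out the token to the immediate left of the domain. The function name is my own, and it assumes you already know where the provider's domain starts (for example, from a list you maintain); it makes no attempt to work that out for you.

```python
from typing import Optional


def most_significant_token(hostname: str, domain: str) -> Optional[str]:
    """Return the label immediately left of `domain`, or None if it does not apply."""
    hostname = hostname.lower().rstrip(".")
    suffix = "." + domain.lower().rstrip(".")
    if not hostname.endswith(suffix):
        return None
    left = hostname[: -len(suffix)]
    # The last label of what remains is the one adjacent to the domain.
    return left.rsplit(".", 1)[-1] if left else None
```

Given cpe-212-18-43-98.static.amis.net and amis.net, this returns "static"; given dynamic-ip-adsl-201.222.91.236.cotas.com.bo and cotas.com.bo, it returns "236", which tells you nothing useful.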
Before I get into the obvious "Right Way" to do it, I'd like to take a minute to highlight some of the "Very Wrong Ways" to do it, using real-life examples.
dynamic-ip-adsl-201.222.91.236.cotas.com.bo [201.222.91.236]
This host name makes two common "mistakes" - the dynamic token is at the far left, and the dotted IP is written host-octet-right (rather than net-octet-right, which would let a single right-anchored substring cover a wider set of hosts).
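A rough sketch of why octet order matters for right-anchored matching follows. The two naming helpers are hypothetical schemes of my own for illustration, not what cotas.com.bo actually runs; only the IP and domain come from the example above.

```python
def net_octet_right(ip: str, domain: str) -> str:
    """Name a host with octets reversed, so the network octets sit next to the domain."""
    return ".".join(reversed(ip.split("."))) + "." + domain


def host_octet_right(ip: str, domain: str) -> str:
    """Name a host with octets in written order, so the host octet sits next to the domain."""
    return ip + "." + domain


def suffix_for_slash24(ip: str, domain: str) -> str:
    """Right-anchored substring that covers the whole /24 under net-octet-right naming."""
    a, b, c, _ = ip.split(".")
    return f".{c}.{b}.{a}.{domain}"


if __name__ == "__main__":
    domain = "cotas.com.bo"
    suffix = suffix_for_slash24("201.222.91.236", domain)
    # Every host in 201.222.91.0/24 shares this one suffix when net octets are on the right.
    print(all(net_octet_right(f"201.222.91.{h}", domain).endswith(suffix) for h in range(256)))
```

With host-octet-right names, the only thing two neighbours in the same /24 share at the right-hand end is the domain itself, so there is no single substring that picks out the network.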
216-237-236-18-dynamic.northstate.net [216.237.236.18]
This host name comes closer, but because the dynamic token isn't dot-separated, sendmail and other MTAs cannot do a substring match on it using access.db alone; regular expressions are required.
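Here is a simplified model of the difference, assuming (as I understand it) that an access.db-style lookup tries the full host name and then each parent domain in turn. The function names are mine and this is not sendmail's actual code; the second host name is a hypothetical dot-separated variant for comparison.

```python
import re


def lookup_candidates(hostname: str) -> list:
    """Keys a label-boundary lookup would try: the full name, then each parent domain."""
    labels = hostname.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def table_match(hostname: str, table: set) -> bool:
    """True if any label-boundary key for `hostname` is present in `table`."""
    return any(key in table for key in lookup_candidates(hostname))


# A single table entry, and the regex needed when the token is hyphen-joined instead.
TABLE = {"dynamic.northstate.net"}
HYPHENATED = re.compile(r"-dynamic\.northstate\.net$")

if __name__ == "__main__":
    real = "216-237-236-18-dynamic.northstate.net"
    dotted = "216-237-236-18.dynamic.northstate.net"  # hypothetical variant
    print(table_match(real, TABLE), bool(HYPHENATED.search(real)))
    print(table_match(dotted, TABLE))
```

The real name never produces "dynamic.northstate.net" as a lookup key, because the hyphen keeps the token inside the leftmost label; only the regex catches it.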
cpe-212-18-43-98.static.amis.net [212.18.43.98]
Not necessarily bad on its own, since the static token is the most significant token. This one is here to illustrate that simplistic pattern matching on things like /^cpe/ is a great way to make dangerous assumptions about what constitutes a "dynamic" hostname. (It is still obviously generic, but not dynamic in the way that concerns us.)
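A small sketch of the trap, comparing the naive prefix test with a look at the token next to the domain. The token sets and function names are assumptions of mine for illustration, not a complete classifier.

```python
import re

NAIVE_DYNAMIC = re.compile(r"^cpe")

# Illustrative token sets only; a real list would be longer and provider-specific.
STATIC_TOKENS = {"static"}
DYNAMIC_TOKENS = {"dyn", "dynamic"}


def naive_guess(hostname: str) -> str:
    """Treat anything starting with 'cpe' as dynamic (the dangerous assumption)."""
    return "dynamic" if NAIVE_DYNAMIC.search(hostname) else "unknown"


def token_guess(hostname: str, domain: str) -> str:
    """Classify by the label immediately left of `domain`."""
    suffix = "." + domain
    if not hostname.endswith(suffix):
        return "unknown"
    token = hostname[: -len(suffix)].rsplit(".", 1)[-1]
    if token in STATIC_TOKENS:
        return "static"
    if token in DYNAMIC_TOKENS:
        return "dynamic"
    return "unknown"


if __name__ == "__main__":
    host = "cpe-212-18-43-98.static.amis.net"
    print(naive_guess(host), token_guess(host, "amis.net"))
```

The prefix test calls this host dynamic; the token next to the domain says otherwise.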
adsl-69-226-72-166.dsl.scrm01.pacbell.net [69.226.72.166]
Here we have a static ADSL line (it's SWIP'd to an insurance company). Even though much ADSL in the US is dynamic/residential, and even though pacbell.net uses 'ded' for some of its dedicated/static lines, the most significant token here is the geographic locale, of which there are literally dozens. Many of the former Baby Bells' naming conventions are similar.
adsl-068-213-145-063.sip.rdu.bellsouth.net [68.213.145.63]
Here's another example from a former RBOC. The minor mistake made here is that the locale token is considered more important than the 'sip', signifying "static". So in order to match these static hosts without regular expressions, you have to collect a few dozen substrings, one per locale.
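To show the bookkeeping this forces on you, here is a sketch that builds one right-anchored substring per locale and compares it with a single regular expression. I'm assuming other locales follow the same sip.&lt;locale&gt; layout as the rdu example; "loc2" below is a made-up placeholder, not a real bellsouth.net locale code.

```python
import re


def static_suffixes(locales) -> list:
    """One right-anchored substring per locale, since 'sip' is not next to the domain."""
    return sorted(f".sip.{loc}.bellsouth.net" for loc in set(locales))


def matches_any_suffix(hostname: str, suffixes) -> bool:
    """Plain suffix matching, no regular expressions."""
    return any(hostname.endswith(s) for s in suffixes)


# The regex alternative: any single locale label between 'sip' and the domain.
SIP_ANY_LOCALE = re.compile(r"\.sip\.[a-z0-9-]+\.bellsouth\.net$")

if __name__ == "__main__":
    known = static_suffixes(["rdu"])
    seen = "adsl-068-213-145-063.sip.rdu.bellsouth.net"
    unseen = "adsl-001-002-003-004.sip.loc2.bellsouth.net"  # placeholder locale
    print(matches_any_suffix(seen, known), matches_any_suffix(unseen, known))
    print(bool(SIP_ANY_LOCALE.search(unseen)))
```

The suffix list only catches locales you have already collected, which is exactly the chore; had 'sip' been the token next to the domain, one substring would have done.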
One of the sad things is when an ISP acquires, and then loses, "clue". One example of this is below (as of late 2003):
cable-66-103-40-69.clarenville.dyn.personainc.net [66.103.40.69]
The admin in charge of naming was eager to be in compliance with the Most Significant Token principle, so all of their dynamics could be filtered using the single substring "dyn.personainc.net". In later years, however, new allocations looked like this:
h219.204.244.66.cable.gldn.personainc.net [66.244.204.219]
No indication of dynamic/static, though they do make the "cable" aspect clear. And as their rwhois server is non-functional (for me, anyway) there's no way to tell whether this is a corporate customer hosting a mail server, or a bot on a residential leaf node. Subsequent naming uses "$HEXADECIMAL.cpe.persona.ca", again with no indication of staticity.
Next week, we'll discuss some good examples. Unless I come across some even more egregious bad examples and can't help myself, that is.
Posted by schampeo at June 11, 2009 5:11 PM