June 9, 2009
A few thoughts on reverse DNS / PTR naming
Over the years, I've collected nearly forty thousand PTR naming conventions, creating regular expressions from them for the sake of compactness and power. I've classified them in terms of whatever best guess I can make as to their assignment type and the technology that connects those hosts to the Internet. This started as an effective antispam tactic and response to the rise of botnets, and has grown into more than that the longer I keep at it. It's been a rewarding and instructive trip, and there's always a new problem to solve, whether it's figuring out the meaning behind some new and cryptic naming, or trying to read ISP and telco Web sites in translation, or guessing how people say "mail" in a hundred languages.
However, I will always have a special circle set aside in my own personal Hell for the following list of domains and regions (in no particular order).
- assigns static IPs in 'lnk.telstra.net', any name goes
- ditto, only without a subdomain
- incapable of deciding on a single or small set of conventions, nearing a hundred at the time of this writing
- like tenet.odessa.ua, telstra
- assign statics with "dynamic-ip" tokens, dynamics with "static-ip" tokens
- cheerfully mixes geographical, technical, IP-related tokens for statics and dynamics, all under rdsnet.ro
- like tenet.odessa.ua, no subdomains at all
- ditto, for the most part
- much of Poland
- a mix of top-level, geographical, municipal and various other naming (and their whois? fugeddaboudit)
- nearly all of Brazil
- it seems obvious that one or two technicians set up the DNS infrastructure for the entire country: most of it is n-n-n-n.domain.com.br
- assigned a single PTR ('tm.net.my') to their entire end user network
- Viet Nam
- has a strange predilection for naming their hosts "localhost" or "adsl-xxx"
- apparently, subdomains are for losers
I'm sure I'll add more to this list; these are just a few that came to mind.
Now, granted, I'm coming at these names and practices with a particular purpose in mind - namely, trying to classify hosts by their names and assess the risk of accepting certain types of traffic from them - so it may seem unreasonable to some that I find certain practices distasteful. It's only recently that there have even been Internet Drafts suggesting best current practices for naming pools and customer blocks, so I shouldn't be so surprised to find such a gulf between what I consider "useful" naming and what I consider evidence of incompetence or, at worst, insanity. (I mean, come on, who names a public IP "localhost"...?)
I can appreciate the special madness that systems and network administrators have for the act of naming - whether it is the sort that names servers after characters from comic strips like Bloom County; or the sort that incorporates interface names, rack locations, and IP addresses in hexadecimal into their conventions; or even the sort that uses Roman numerals (as do three Finnish ISPs and one German) or Latin numbers (one German webhost). I learned about the act while watching my first real netadmin name new Sun hardware after seas (which got confusing when Sun launched Java[tm], because java the server didn't have anything to do with it). I named my Web hosting servers after hot peppers for a long time (and currently have a "tabasco" and a "jalapeno", having retired "habanero" and "serrano"). I took on the habit of naming my laptops after my then-current favorite musicians, so I'm writing this on a several-year-old "tupelo" and have had "fugazi", "waits", "radiohead" (which I repurposed for a wireless router) and so forth. I've had several boxes named after bourbons. I understand.
You want to be able to give some personality to cold and impersonal hardware, hardware you will be working around and on and under all the time. You want to be able to distinguish the server that always has a disk crash on holidays from the one whose ethernet card freaks out and throws the box into a kernel debugger.
Sometimes, you just want to express disgust or admiration or disrespect. I once had a home-built Pentium 133 that ran Windows NT that I called "stepchild" as a joke (we had to reboot it at least once a week, and all it was doing was running proxy software for our internal LAN). My brother the Luddite, when roped into hacking perl for us one year while he was between overseas teaching jobs, named his laptop "powerloom" as a constant reminder that he may as well be manacled to it like a child during the early Industrial Revolution.
But with spam levels hovering at around 9 in every 10 messages, and million-host botnets capable of pretty much anything their "owners" (or renters, of course) decide they want to do with them, desperate administrators everywhere are using every trick they can to allow for more accurate, and more rapid, discrimination as close to the edge of their networks as possible. And if you look at the numbers, you'll see that the boxes most likely to get infected by a bot are the very boxes whose actual owners wouldn't know a hostname from a horseshoe.
So while the urge to be cute, or clever, remains, when coming up with your PTR records, a little seriousness is called for here. Don't be like the Ukrainian DSL provider who chose two random words for each hostname in their customers' dynamic pools. Don't be like the Australian, Dutch, Romanian, Russian, Polish, or Bulgarian ISPs and telcos who think that every customer should have their own custom name as an RR in their top-level domain. Don't be like the Indian ISP who seems to create variations on their naming conventions like the special Indian genius created gods for every village. Don't be like the imagined Brazilian network admin who decided that 90% of all the hosts in 200/8 should be a generic name based on the IP, with no assignment type or technology tokens at all.
So what should you do? In the opinion of someone who has spent the better part of the last six years tracking down and classifying networks' PTR naming conventions, this is what you should do. Feedback, argument, speculation, clue, etc. all welcome - this is an evolving document, and I don't expect to get everything right at first.
At the very least, unallocated/unrouted IPs should be named as such - so it's easy to tell when they've been hijacked. And, for the love of all that is good and holy, name them something else when you finally do allocate them.
Dynamically assigned IPs should say that they are dynamic - and that should be the very first token to the left of the domain. Those who use lists of substrings to block spam from dynamics don't want to have to collect one for every town in every state your mega-telco or ISP holding company happens to be capable of serving, if you decide that 'dynamic.raleigh.nc' makes more sense than 'raleigh.nc.dynamic' (think about it). And if possible, please distinguish between low-bandwidth (dialup, ISDN, frame relay, wifi) and high-bandwidth (DSL, fiber - whether catv or ftth/p, metro ethernet, wimax, etc.). Why? Because a bot on a multiple megabit fiber link can spew a lot more spam, or DDoS packets, or ssh scans, or dictionary attacks on ftp servers, than can one on a dialup - and the owner is less likely to notice, so they're likely to remain infected longer.
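To make the "think about it" concrete, here is a minimal sketch of the kind of suffix matching a blocklist might do. The function name, the example.net domain and the hostnames are all made up for illustration; real filters differ in the details.

```python
from typing import Sequence


def ends_with_suffix(ptr: str, suffixes: Sequence[str]) -> bool:
    """Return True if the PTR name equals, or ends with, a dot-delimited suffix."""
    name = ptr.rstrip(".").lower()
    return any(name == s or name.endswith("." + s) for s in suffixes)


# Hypothetical blocklist: a single entry for the whole provider.
BLOCKED_SUFFIXES = ["dynamic.example.net"]

# Token immediately to the left of the domain: one entry covers every town.
token_next_to_domain = "host-12.raleigh.nc.dynamic.example.net"

# Token to the left of the geography: the same entry misses it, and the list
# would need 'dynamic.raleigh.nc.example.net' plus one entry per other town.
token_before_geography = "host-12.dynamic.raleigh.nc.example.net"

print(ends_with_suffix(token_next_to_domain, BLOCKED_SUFFIXES))
print(ends_with_suffix(token_before_geography, BLOCKED_SUFFIXES))
```

With the token next to the domain, the list grows with the number of providers, not with the number of towns each provider serves.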
Statically assigned IPs should say they are static, or be custom and associated with the customer's domain, not the provider's. Particularly for mail servers, NAT/PAT boxes, and the like.
Honestly, the problem is so bad that many are simply using "generic" as a basis for rejecting mail from unknown hosts. And it works. If you're running a mail server on a host with a generic PTR, good luck getting your mail delivered at all two years from now. If your mail server's PTR is in your ISP's domain, you really need to look into getting a custom PTR yesterday.
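For the curious, one crude way to spot a "generic" name is to check whether every octet of the IP shows up as a number in the hostname. This is only a sketch under that assumption, not how any particular filter works; the function name and the example addresses (from the documentation range 192.0.2.0/24) are mine. Real conventions also use hex, partial octets and worse, which this will miss.

```python
import re


def looks_generic(ptr: str, ip: str) -> bool:
    """Guess whether a PTR name is just the IPv4 address dressed up as a name.

    Heuristic: every octet of the address appears as a numeric token in the
    hostname (in any order, with leading zeros ignored).
    """
    octets = ip.split(".")
    # Split the name on anything that is not a letter or a digit.
    tokens = re.split(r"[^0-9a-z]+", ptr.lower())
    # Keep the purely numeric tokens, normalising '025' to '25' and '000' to '0'.
    numbers = [t.lstrip("0") or "0" for t in tokens if t.isdigit()]
    return all(octet in numbers for octet in octets)


print(looks_generic("192-0-2-25.example.net", "192.0.2.25"))
print(looks_generic("adsl-025.002.000.192.example.net", "192.0.2.25"))
print(looks_generic("mail.example.org", "192.0.2.25"))
```

The first two names are flagged and the third is not. Note that this says nothing about assignment type or technology, which is exactly the complaint about names built only from the IP.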
Web hosting and colo providers should already be forcing low-end customers' mail through their own carefully monitored smarthosts, to reduce the amount of spam and other abuse coming from oft-compromised hosting control panel platforms such as cPanel. I used to be disgusted by the folks who named their webhosting PTRs things like "hosted.by.example.net" or "2gbamonth.for.just.7.95.example.net", but now I love them. They're just the most perfect indicators of super cheap mass virtual Web hosting, from whom I almost never want any mail unless sent through a smarthost.
University residential networks, or 'resnets', should contain the token resnet (or dorm, popular in Europe and Asia). Yes, I know, it's nice to be able to name a node after the building it is in, but without a lot of work and research it is difficult to tell whether a given building is the school of economics or a co-ed dorm. Don't count on local knowledge being global. And tokens like 'housing' or 'student' are ambiguous: they might refer to "the housing department" or "mail servers for students", not to student residential networks.
Cable TV and VOIP and triple-play providers - if you provide both residential service (which we assume by default for Internet access over the same fiber as cable television) and commercial (which is becoming more popular with the rise of VOIP) - please, say so in the names. Road Runner has "res.rr.com" and "biz.rr.com" (though they named a mail server mail.biz.rr.com, much to my initial consternation). Charter and Comcast do not (though the latter has 'comcastbusiness.net' for business customers). It matters.
Telcos providing DSL: please indicate whether your ADSL is dynamic (as it usually is in the US but not in the Netherlands, for example) and whether it is residential or commercial. Do not rely on us to know that "maxpro" is commercial and "fastweb" is residential; we do not care to know the branding exercises of every ISP/telco in the world. And if we're counting on ADSL to be residential, tell us it's ADSL, not just generic DSL or "broadband".
Corporate network admins? Distinguish between dedicated NATs and PATs and those that have mail servers directly behind them; if you haven't secured your NAT against unauthorized outbound port 25 traffic, it's not up to us to determine whether refusing all mail coming from your NAT will also drop legitimate mail on the floor of the server room. The odds are very, very good that mail from a NAT is bot-originated these days. Don't make us think too hard here.
If your employees and locations are lucky enough to have their own LAN-side public IPs, please monitor your gateways for sudden upticks in outbound mail from end user LAN nodes if you can't simply block that traffic altogether, or re-route it through your mail servers.
It's appalling how many ISPs and telcos and cable TV providers don't actually ever say what their "super fast Internet speeds" are going to traverse, technology-wise. I'd say that on half of the Web sites I visit, in vain attempts to classify a new naming convention or add some detail to older patterns, it is simply impossible to know whether they sell DSL, fiber, ethernet, and so on. Oddly, wireless providers are quite clear about the fact that they're selling wireless, whether broadband or not. And the proud new owners of Eastern European networks will usually brag to the visitor about how fast their all-fiber net has grown, and how fast it will continue to grow.
Finally, name your mail servers "mail" or "smtp" or "mx" or something that indicates that they're legitimate sources of mail. (Granted, some i18n considerations must be taken into account here, but the words for mail aren't that different in the Romance languages, or in the Slavic; we can stand to learn "correo" and "poczta", just don't make us learn ten thousand languages' forms of "mail".)
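As a rough illustration of what the receiving side might check, here is a sketch that looks only at the leftmost label of the PTR name. The word list is just the handful mentioned above, and the function name and hostnames are hypothetical; a real list would be longer and the matching less strict.

```python
import re

# Leftmost labels that suggest a mail server, optionally followed by a number
# (mx2, smtp01). The words are only the ones named in this post.
MAIL_LABEL = re.compile(r"^(?:mail|smtp|mx|correo|poczta)[0-9]*$")


def names_a_mail_server(ptr: str) -> bool:
    """Return True if the leftmost label of the PTR name looks like a mail host."""
    first_label = ptr.rstrip(".").lower().split(".")[0]
    return bool(MAIL_LABEL.match(first_label))


print(names_a_mail_server("mx2.example.org"))
print(names_a_mail_server("poczta.example.org"))
print(names_a_mail_server("adsl-192-0-2-25.example.net"))
```

The first two names pass and the third does not. A name like "mail.biz.rr.com" passes too, which is fine here: this check only asks whether the host claims to be a mail server, not what kind of network it sits on.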
Feedback is welcome: firstname.lastname@example.org. If there's any real interest in a public debate I'll open up comments here.
Posted by schampeo at June 9, 2009 1:59 PM