
December 7, 2010

Enemieslist 20101207-02 vs. CBL list.txt 20101201

So, every now and then, as a coverage check, I like to run a current CBL list.txt file through a fast rDNS resolver and see how well Enemieslist's coverage compares to what's in the CBL. This time through, we resolved list.txt from mid-day December 1st. The file contained 7518991 IPs, of which 4411459 had PTRs (that we could resolve, anyway) and 4410694 had a PTR that wasn't just a dotted quad. Sorting uniquely left 4406537 entries, and after a final cleanup to remove any line that didn't have both an IP in dotted quad and something resembling a PTR, there were 4404978 unique IPs to test. (Some of the damage was simply the result of flaws in how we rapidly resolved the PTRs, using a parallel Perl resolver with a few bugs; consider it the price we paid to resolve seven and a half million PTRs in a few hours. So there's a slop factor of around 0.0015, or about 0.15%.)
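For the curious, the cleanup steps above can be sketched roughly like this. This is a minimal Python illustration, not the Perl tooling we actually ran: the "IP, whitespace, PTR" line format, the `clean_pairs` name, and the hostname regex are all assumptions made for the example.

```python
import ipaddress
import re

# Assumed shape for "something resembling a PTR": two or more dot-separated
# labels made of letters, digits, hyphens, or underscores.
PTR_RE = re.compile(r"^(?:[a-z0-9_-]{1,63}\.)+[a-z0-9-]{1,63}$")


def is_dotted_quad(text):
    """Return True if text parses as an IPv4 address in dotted-quad form."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def clean_pairs(lines):
    """Yield unique (ip, ptr) pairs from lines assumed to look like 'IP PTR'."""
    seen = set()
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # resolver damage: missing or extra fields
        ip, ptr = fields
        ptr = ptr.rstrip(".").lower()
        if not is_dotted_quad(ip):
            continue  # no usable IP on this line
        if is_dotted_quad(ptr) or not PTR_RE.match(ptr):
            continue  # PTR is just a dotted quad, or doesn't look like a name
        if (ip, ptr) in seen:
            continue  # the "sorted uniquely" step
        seen.add((ip, ptr))
        yield ip, ptr


sample = [
    "192.0.2.1 host-1.example.net",
    "192.0.2.1 host-1.example.net",  # duplicate
    "192.0.2.2 192.0.2.2",           # PTR is only a dotted quad
    "garbage",                       # damaged line
    "192.0.2.3 mail.example.org.",
]
cleaned = list(clean_pairs(sample))

# Slop factor, using the counts from this post: resolved PTRs vs. what
# survived the final cleanup.
slop = (4411459 - 4404978) / 4411459  # about 0.0015
```

The sample addresses come from the documentation range and are only there to exercise each filter once.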

Of the 4404978 unique IP/PTR pairs we tested, here's the breakdown in terms of how EL classified them.

count     class
3380237   dynamic
446402    static
223027    mixed
215289    badrdns
76023     generic
39657     natproxy
15459     no enemieslist classification
4409      unassigned
3835      webhost
247       outmx
206       resnet
183       cloud
4         spammer

EL did not have a pattern for 15459 of the hosts, or 0.35%, giving us a match rate of approximately 99.65%. Not too shabby, and comparable to the 99.54% we got back in mid-May 2009. A quick eyeballing of the hosts we didn't match this time around suggests around 5% are snowshoe, 40% are generic hosts we'll be able to make patterns for, another 20% or so are hosts we won't be able to, and the rest are likely one-off mail servers that are a low priority to classify.
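The match-rate arithmetic is easy to re-check from the breakdown. Here is a small Python sketch using the counts from this post; the `counts` dictionary and variable names are just illustrative.

```python
# Per-class counts copied from the breakdown in this post.
counts = {
    "dynamic": 3380237,
    "static": 446402,
    "mixed": 223027,
    "badrdns": 215289,
    "generic": 76023,
    "natproxy": 39657,
    "no enemieslist classification": 15459,
    "unassigned": 4409,
    "webhost": 3835,
    "outmx": 247,
    "resnet": 206,
    "cloud": 183,
    "spammer": 4,
}

total = sum(counts.values())  # should equal the 4404978 pairs tested
unmatched = counts["no enemieslist classification"]

miss_pct = 100 * unmatched / total
match_pct = 100 - miss_pct

print(f"total tested: {total}")
print(f"unmatched:    {unmatched} ({miss_pct:.2f}%)")
print(f"match rate:   {match_pct:.2f}%")

# Share of each class, largest first.
for name, count in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{count:>8}  {100 * count / total:6.2f}%  {name}")
```

The per-class counts sum to exactly the 4404978 pairs tested, so nothing fell between the categories.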

Posted by schampeo at December 7, 2010 4:08 PM
