23729 patterns, 10324 right anchor strings, 29797 test IPs.
More QA. Expect more of this.
PLEASE NOTE: the patterns_xwalk file and rightanchors file now contain
a new token, 'mixed' for patterns_xwalk and 'MIXED' for rightanchors,
which will increasingly be used to distinguish a "this is a naming
shared by both known dynamic and known static hosts" class from the
"we're not sure what this is but it is generic" class. I will be going
back through the history of the dataset, looking for those cases where
a pattern had a non-generic class that was subsequently reduced to
generic, and changing these to 'mixed'. The new token is restricted to
a very small number of strings and anchors at present, but expect more.
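
For anyone who keeps their own copies of older releases, the relabeling
pass described above might look roughly like the sketch below. It is
only an illustration: the history structure (a dict mapping each pattern
to its class tokens, oldest first), the function name relabel_mixed, the
example patterns, and the 'dynamic' and 'static' tokens are all
assumptions made for the example, not the actual file format or tooling.

```python
def relabel_mixed(history):
    """Return the current class for each pattern, relabeling as 'mixed'.

    `history` is a hypothetical structure: a dict mapping a pattern string
    to the list of class tokens it has carried over time, oldest first.
    """
    current_classes = {}
    for pattern, classes in history.items():
        if not classes:
            # No recorded class for this pattern; nothing to relabel.
            continue
        current = classes[-1]
        # Did the pattern ever carry a more specific (non-generic) class?
        was_specific = any(c not in ("generic", "mixed") for c in classes[:-1])
        if current == "generic" and was_specific:
            # Reduced from a specific class to generic: treat as 'mixed'.
            current = "mixed"
        current_classes[pattern] = current
    return current_classes


# Made-up example patterns, for illustration only.
example_history = {
    "pattern-a": ["dynamic", "generic"],  # was specific, later reduced
    "pattern-b": ["generic"],             # always generic
    "pattern-c": ["static"],              # still specific
}
print(relabel_mixed(example_history))
```

A pattern that has only ever been generic stays generic here; only a
pattern that once had a specific class and was later reduced to generic
is changed.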
Download them here:
Posted by schampeo at October 12, 2007 12:47 PM