Calling out bad crawlers: the Kintiskton nuisance

Posted Feb 17th, 2009 by David Calhoun in Uncategorized

I have never been involved in creating a web crawler, but as a website owner I’m well aware of the behavior of good crawlers versus bad crawlers.  For instance, a good crawler must not only follow the rules set by robots.txt, but it must also not impose an undue load on the server being indexed.

Famously, Cuil’s crawler put exactly that kind of undue load on servers for at least several months before they claimed to have fixed it.  In any case, I had to ban their IP range because they were just hitting my site too hard (compared to all of the other major crawlers out there).

Today I’m looking at the traffic stats for my WWI flight sim site and I see that yesterday new visitors were up over 200%.  Strange thing was there was no major referring site, only direct hits!  What on earth!  So I check the logs and find that most of the IPs are in the range 65.208.151.112-65.208.151.119.  A tracert to that range gets as far as kintiskton-gw.customer.alter.net [63.114.61.170] before it dies.
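For anyone who wants to do the same kind of log check, here is a minimal Python sketch that counts requests per client IP. It assumes the access log is in Apache's common or combined format, where the client address is the first whitespace-separated field on each line. The function name top_client_ips and the sample lines are made up for illustration; they are not taken from my actual logs.

```python
from collections import Counter


def top_client_ips(log_lines, limit=10):
    """Count requests per client address, most frequent first.

    Assumes the client IP is the first whitespace-separated field,
    as in Apache's common and combined log formats.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)


# Made-up sample lines, for illustration only.
sample = [
    '65.208.151.113 - - [16/Feb/2009:10:00:01 -0500] "GET / HTTP/1.1" 200 5120',
    '65.208.151.113 - - [16/Feb/2009:10:00:01 -0500] "GET /a.jpg HTTP/1.1" 200 900',
    '65.208.151.117 - - [16/Feb/2009:10:00:02 -0500] "GET /b.jpg HTTP/1.1" 200 700',
    '192.0.2.10 - - [16/Feb/2009:10:05:00 -0500] "GET / HTTP/1.1" 200 5120',
]

for ip, hits in top_client_ips(sample):
    print(ip, hits)
```

To run it against a real log, pass the open file object in place of the sample list; the function only needs something it can iterate over line by line.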

Apparently this IP block is owned by Kintiskton LLC, whatever that is.  When I do a Google search, I can’t find the actual company, only complaints about its crawler abusing people’s websites going back to December 2008 (several months).

The IP block is hosted by Verizon Business, so I shot over an email to abuse@verizon.net.  After several months of Kintiskton’s excessive crawling, hopefully Verizon will step up and look into it.  Apparently they haven’t yet…

In the meantime, it’s good old Apache to the rescue.

I’ll be adding this to my .htaccess file:

Deny from 65.208.151.112
Deny from 65.208.151.113
Deny from 65.208.151.114
Deny from 65.208.151.115
Deny from 65.208.151.116
Deny from 65.208.151.117
Deny from 65.208.151.118
Deny from 65.208.151.119
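The eight addresses above are consecutive, so they can also be treated as one block. As a rough sanity check, here is a small Python sketch using the standard library's ipaddress module to collapse them into the smallest set of CIDR networks and print matching Deny lines. The variable names are just for illustration, and this is not a required step for the .htaccess rules to work.

```python
import ipaddress

# The eight addresses from the Deny list: 65.208.151.112 through .119.
blocked = [ipaddress.ip_address(f"65.208.151.{last}") for last in range(112, 120)]

# collapse_addresses merges adjacent addresses into the fewest networks.
collapsed = list(ipaddress.collapse_addresses(blocked))

for net in collapsed:
    print(f"Deny from {net}")
```

Because the range starts on a multiple of eight and holds exactly eight addresses, it collapses to a single network, 65.208.151.112/29.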

  • LGR on 27 Feb 2009 at 12:11 pm

    I have the exact same problem. I just ban those IPs by default now on every site I work on. I guess that is all we can do to stop them. Thanks Verizon Business for being a nuisance.

  • mrdeus on 28 Feb 2009 at 2:12 pm

    I blocked this bot today. It’s annoying because it loads images and everything. And does it all at once, like you said.

    It’s shorter to use the CIDR notation for the .htaccess file:

    Deny from 65.208.151.112/29

  • Kintiskton « Mark Turner Dot Net on 23 Apr 2009 at 9:45 am

    [...] that made choose not to share my web content with Kintiskton. I first added the subnet to my .htaccess rules but decided to add it to my firewall rules. That took care of [...]

  • Julian Perry on 15 May 2009 at 12:10 am

    I am getting very similar activity from cuil.com.
    Direct hits from 216.129.119.16.
    Google Adsense shows the hits as page impressions for one day then they’re gone the next.

  • canuck on 17 Jun 2009 at 7:25 am

    Hi. They came on to my site and I received over 600 hits from them in approx half an hour. I was able to exempt my statcounter.com from logging their IP address, but that doesn’t mean they can no longer access my site.

    I use blogger. Is there a place in my Template where I can insert info that would deny them even loading my page? I am computer not-too-smart and so if you can help in kid language, that would be ideal :)

    Cheers & thanks.
