Linux Weekly News is the main news site for the developers who work on the Linux kernel and its associated software. LWN editor Jonathan Corbet wrote up how the site deals with the scourg…
I, for one, am enjoying this new game of cat and mouse.
On my main domain I employ a tarpit that triggers on a 404 but returns a 200, so the bot doesn’t realize it hit a missing page. It then prints a bunch of unique links that go nowhere (each of those would also 404, so it loops back into the tarpit) and starts very slowly printing a 13-megabyte string of base64. If your bot can deal with all of this, go ahead man. You can have it.
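The two ingredients of that tarpit, unique dead-end links and a slow base64 drip, can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the commenter's actual setup; the function names and the small sizes used here are my own, and a real deployment would wire these into whatever handler serves the 404-as-200 page, with a much larger payload and a real per-chunk delay:

```python
import base64
import os
import random
import string
import time


def tarpit_links(n=20):
    """Generate n unique links that resolve to nothing real.

    Since the tarpit answers *every* unknown path with a 200 and
    another copy of this page, a crawler that follows these links
    just loops back into the tarpit forever.
    """
    return [
        "/" + "".join(random.choices(string.ascii_lowercase, k=12))
        for _ in range(n)
    ]


def slow_base64(total_bytes=64, chunk=16, delay=0.0):
    """Yield a base64 payload in small chunks, sleeping between them.

    The comment's version drips out a 13 MB string; that would be
    total_bytes=13 * 2**20 with a delay of a second or more per
    chunk, tying up the scraper's connection the whole time.
    """
    payload = base64.b64encode(os.urandom(total_bytes)).decode()
    for i in range(0, len(payload), chunk):
        time.sleep(delay)
        yield payload[i : i + chunk]
```

The slow drip matters as much as the link maze: a crawler with a naive read loop and no overall timeout will sit on the socket for the entire transfer.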
I have a honeypot on one of my lesser domains which simply takes incoming IPs and scans them for the usual HTTP ports. I’m gonna be careful what I say in public, but roughly 80% of the traffic is scanners that identify themselves, 19% is unknown and likely scrapers, and the last 1% is also unknown, likely still scrapers, that for some reason have open admin interfaces with default logins. Do what you want with this information.
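The probing half of that honeypot, checking a visitor's source IP for listening HTTP ports, is a straightforward TCP connect scan. A hedged sketch using only the standard library; the port list and function name are my own choices, and a real honeypot would feed this the source address of each incoming hit and log the result rather than act on it:

```python
import socket

# Common HTTP(S) and admin-interface ports; an assumption, not the
# commenter's actual list.
COMMON_HTTP_PORTS = [80, 443, 8080, 8000, 8443, 3000]


def open_http_ports(ip, ports=COMMON_HTTP_PORTS, timeout=1.0):
    """Return the subset of ports where a TCP connection succeeds.

    connect_ex() returns 0 on success instead of raising, which keeps
    the loop simple; a short timeout stops filtered ports from
    stalling the scan.
    """
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((ip, port)) == 0:
                found.append(port)
    return found
```

An open port with an HTTP banner is how the honeypot can tell a self-identifying scanner from an anonymous scraper box, and how it stumbles onto the ones exposing admin panels.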