Blocking an IP Range using .htaccess
I’ve decided to block all traffic from all McColo Corporation users.
Not all are guilty, I’m sure, but I just got hit by one of their customers called “Digital Infinity”, reportedly a Moscow-based company. I was crawled repeatedly by 12 IP addresses within the 208.66.195.1-208.66.195.20 range. Now, less than 100 MB of transfer isn’t much to lose over a couple of days, but it’s enough to catch my eye. Looking up one of those IP addresses shows McColo Corporation has leased 208.66.195.1-208.66.195.15 to “Digital Infinity”. However, several of the IP addresses that scanned me are within McColo Corporation’s generic pool. I’ve also seen posts about McColo Corporation’s 208.66.192.* range being a major source of WordPress comment spam.
So, guys, you’re outta here.
I’m blocking them via .htaccess. You might want to do the same, at least for 208.66.195.1-208.66.195.20, if you’re feeling more charitable than I am this morning.
Since they have four blocks of addresses, I add four lines to my .htaccess file. As a whole, that section now looks something like this:
[html]
order allow,deny
deny from 208.66.192
deny from 208.66.193
deny from 208.66.194
deny from 208.66.195
allow from all
[/html]
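A side note: those partial-IP lines work because Apache’s Allow/Deny directives match address prefixes. If your Apache supports CIDR notation in Allow/Deny (mod_access has since 1.3), the four /24s from 208.66.192.* through 208.66.195.* collapse into a single /22, so the same block can be written as:

[html]
order allow,deny
deny from 208.66.192.0/22
allow from all
[/html]

Either form should behave identically here; the four-line version is just easier to read and prune one range at a time.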
What does a HEAD request do?
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
9.4 HEAD
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.
In short, a HEAD request can be used as a low-bandwidth check to see whether a page is even there. I don’t know how often, or even whether, spammers use this to probe for live sites, but if I’m blocking a range I may as well block all requests from it.
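To see the GET/HEAD difference from the RFC in action, here is a small sketch using Python’s standard library (the server, handler, and variable names are mine for illustration, not anything from the post): it serves a five-byte page, then requests it with both methods. Both responses carry the same Content-Length header, but only GET transfers the body.

```python
# Minimal demo: HEAD returns the same headers as GET, but no message body.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"

class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)       # GET: headers plus entity-body

    def do_HEAD(self):
        self._send_headers()         # HEAD: identical headers, no body

    def log_message(self, *args):    # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)

conn.request("HEAD", "/")
head = conn.getresponse()
head_body = head.read()              # empty: no body was sent

conn.request("GET", "/")
get = conn.getresponse()
get_body = get.read()

# Same metainformation, different transfer cost.
print(head.getheader("Content-Length"), len(head_body), len(get_body))

server.shutdown()
```

So a spammer (or a link checker) gets the Content-Length, ETag, and friends for the price of a few header bytes, which is exactly why blocking the range, rather than just GET traffic, makes sense.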