Why you should ban Amazon's Cloud IPs
Thursday, 27th December 2012, 14:50
Life used to be simple: you made a website, then submitted it to Yahoo and maybe a load of other search engines via a tool or website. These days, in some respects, things are simpler still: you submit it to Google if it hasn't found it already, and to Bing if you are feeling generous. Then you sit back and get scraped by sites like Pinterest, which take your images and serve them as their own.
But Pinterest isn't the only problem. There are loads of similar start-ups out there doing the same; many you won't have heard of, and many you'll never hear of. Amazon's cloud services appear to be their hosting platform of choice.
Added to this, there will be a number of unscrupulous people (if you don't already consider the above as such) who will abuse these same cloud services to scrape your content and generally free-load off your output whilst eating up your bandwidth.
The solution? Just block Amazon's IP ranges from your websites. Don't do it at the firewall, as that would also stop you accessing services on AWS; just block them from reading your websites at the web server level. Then you only have to worry about Russian and Chinese scrapers. :)
Blocking Amazon Cloud IPs in nginx
Firstly, you need to create a new file to put the block list in; let's call it blocklist.conf. We then need to make sure it is included in the http section of our nginx.conf file. It shouldn't matter where in that section, so if you already use a wild-card include that picks up *.conf in a sub-directory, you can just put blocklist.conf there and skip the next bit.
# all your usual generic stuff here
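A minimal sketch of the include described above, assuming blocklist.conf sits in the same directory as nginx.conf (the file location and the conf.d directory name are assumptions, not something the setup requires):

```nginx
http {
    # all your usual generic stuff here

    # Pull in the block list. A relative path like this one is
    # resolved against the nginx configuration directory.
    include blocklist.conf;

    # If a wild-card include like the one below is already present,
    # dropping blocklist.conf into that directory has the same effect
    # and the explicit include above is not needed:
    # include conf.d/*.conf;
}
```

Running nginx -t before reloading is a cheap way to catch a typo in either file.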
And now for what we'll be putting in our blocklist.conf file, which uses the IP ranges from the official sticky thread on the AWS forums:
# US East (Northern Virginia)
# US West (Oregon)
# US West (Northern California)
# EU (Ireland)
# Asia Pacific (Singapore)
# Asia Pacific (Sydney)
# Asia Pacific (Tokyo)
# South America (Sao Paulo)
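To illustrate the layout, and assuming the block list is built from nginx's deny directive, each region heading would be followed by one deny line per address range. The ranges in this sketch are documentation-only placeholders (192.0.2.0/24, 198.51.100.0/24 and 203.0.113.0/24 from RFC 5737), not Amazon's real ranges, which have to be copied from the forum thread:

```nginx
# blocklist.conf: layout sketch only.
# The CIDR ranges below are RFC 5737 documentation placeholders,
# NOT Amazon ranges. Replace them with the published ranges.

# US East (Northern Virginia)
deny 192.0.2.0/24;

# EU (Ireland)
deny 198.51.100.0/24;
deny 203.0.113.0/24;
```

Requests arriving from a denied range get a 403 Forbidden response from nginx, while outbound connections from your own server to AWS are left alone, which is the point of doing this in the web server rather than the firewall.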
Now, this is accurate at the time of posting, but don't assume the list of IP address ranges is static; you'll need to monitor the forum thread I linked above!
After making any major change to your web server like this, ALWAYS carefully monitor your web stats, Google Webmaster Tools, Bing Webmaster Tools and anything else that shows the continued health of your website. If your traffic or search positions look to be suffering, back the change out immediately and then work out whether it was the cause.
Hope you had a nice Christmas, and have a happy new year. :)