Blog website crawlers and bots in Apache2

I found a couple more bots crawling my website, and judging from online resources it seems I caught a few of the bad guys: crawlers which ignore the robots.txt standard and simply scrape a website for content.

I decided to do something about it and added a filter in Apache2.

The way my webserver is set up, I have a template for every website (they all have different configs) and deploy them using Ansible. Parts of the website configuration which are the same, or at least similar, are handled by includes.
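As a rough illustration of what such a filter in a shared include could look like, here is a minimal sketch using mod_rewrite. It is not necessarily the configuration used on this server: the bot names `BadBotOne` and `BadBotTwo` are placeholders, and it assumes mod_rewrite is enabled and that the snippet is included inside each virtual host.

```apache
# Hypothetical shared include, pulled into each website's vhost config.
# Assumption: mod_rewrite is loaded; the bot names below are placeholders.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match the User-Agent header case-insensitively against the unwanted bots
    RewriteCond %{HTTP_USER_AGENT} (BadBotOne|BadBotTwo) [NC]

    # Answer every matching request with 403 Forbidden
    RewriteRule ^ - [F]
</IfModule>
```

Keeping the rule in one include means the list of user agents only has to be maintained in a single place, while each website template stays free to differ in everything else.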

 
