[TriLUG] MSN bot is pounding my website...

Jeff Groves jgroves at krenim.org
Thu Dec 9 15:23:31 EST 2004


Create a file called robots.txt in your web server's document root (so 
that it is served as /robots.txt) and populate it with:

User-agent: msnbot
Disallow: /

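A minimal sketch of doing that from the shell, in case it helps. The 
function name write_robots and the idea of passing the document root as an 
argument are just for illustration; the real path depends on how your 
server is configured.

```shell
# write_robots DOCROOT
# Writes a robots.txt that asks msnbot to stay away from the whole site.
# DOCROOT is an assumption: pass whatever your server's document root is.
write_robots() {
    docroot=$1
    printf 'User-agent: msnbot\nDisallow: /\n' > "$docroot/robots.txt"
}
```

You would call it as, for example, "write_robots /var/www/html" (that path 
is a guess, not something from Greg's setup). Well-behaved crawlers 
re-read robots.txt periodically, so the hits may take a while to taper off.
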

Jeff G.

gregbrown at mindspring.com wrote:

>The following is the number of hits from MSN bot, from all MSN bot IP addresses, to my webserver (through ALL historical logs I still have around):
>
>   1227 65.54.188.69
>     58 65.54.188.70
>     42 65.54.188.64
>     18 65.54.188.68
>      4 65.54.188.67
>
>
>If I look at all traffic to my website, MSN bot is still on top:
>
>   1227 65.54.188.69
>    127 192.58.204.226
>     59 65.54.188.70
>     42 65.54.188.64
>     29 64.244.30.79
>     24 66.196.91.227
>     19 65.87.170.103
>     19 129.33.49.251
>     18 65.54.188.68
>     17 66.26.93.162
>
>
>I know it's from MSN because it leaves the following in my log:
>"msnbot/0.3 (+http://search.msn.com/msnbot.htm)"
>
>I assume over at MSN they are trying to scrape the Internet to build up their own web search engine.  I am curious if others are seeing this same activity.
>
>The commands I used for these queries were (as root in /var/log/httpd):
>
>For MSN bot only:
>grep msnbot access_log | awk '{ print $1 }' | sort | uniq -c | sort -nr | head
>
>For all hits:
>awk '{ print $1 }' access_log | sort | uniq -c | sort -nr | head
>
>Greg
>
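
The two pipelines quoted above differ only in the optional grep step, so 
they could be folded into one helper. This is only a sketch: the name 
top_clients is made up here, and it assumes the client address is the 
first whitespace-separated field of each log line, as in Greg's awk 
command.

```shell
# top_clients LOGFILE [PATTERN]
# Prints the ten client addresses with the most hits in LOGFILE.
# If PATTERN is given, only lines matching it (e.g. msnbot) are counted.
top_clients() {
    log=$1
    pattern=${2:-}
    if [ -n "$pattern" ]; then
        grep -- "$pattern" "$log"
    else
        cat -- "$log"
    fi | awk '{ print $1 }' | sort | uniq -c | sort -nr | head
}
```

For example, "top_clients access_log msnbot" would count only msnbot hits, 
and "top_clients access_log" would count everything. Running the first 
form again a few days after adding robots.txt is one way to see whether 
the bot is honoring it.
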



