Wikimedia Traffic Analysis Report - Crawler requests

Monthly requests or daily averages, for period: 1 Jan 2012 - 31 Jan 2012 (last 12 months)

 This analysis is based on a 1:1000 sampled server log (squids)


The following overview of crawler (aka bot) page requests is based on the user agent information that accompanies most server requests. Unfortunately this user agent information follows rather loosely defined guidelines.
Also, please bear in mind that the most popular crawler names may be somewhat overrepresented. This is the result of so-called user agent spoofing (where a requester supplies false credentials, e.g. to bypass web server filters).
GoogleBot seems to be a favorite for spoofing. Therefore requests from an IP address registered by Google (see below) are color coded differently from other requests that identify as GoogleBot.
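
The distinction between genuine and possibly spoofed GoogleBot requests could be sketched as follows. This is only an illustration, not the report's actual implementation: the function name `label_googlebot` is hypothetical, the address range is a documentation placeholder (TEST-NET-1) standing in for the Google-registered ranges the report refers to, and the mapping onto the labels "google" and "google?" is an assumption based on the group names in the table.

```python
import ipaddress

# Placeholder range (TEST-NET-1, reserved for documentation). It stands in
# for the Google-registered ranges, which are not listed in this section.
GOOGLE_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def label_googlebot(user_agent, client_ip, ranges=GOOGLE_RANGES):
    """Label a request that claims to be GoogleBot.

    Returns "google" if the client IP lies in one of the given ranges,
    "google?" if the user agent claims GoogleBot but the IP does not match,
    and None if the user agent does not mention GoogleBot at all.
    """
    if "googlebot" not in user_agent.lower():
        return None
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in ranges):
        return "google"
    return "google?"
```

With the placeholder range, a GoogleBot user agent arriving from 192.0.2.10 is labeled "google", while the same user agent from any other address is labeled "google?".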

For this report page requests are considered to be issued by a crawler in two cases:
1 The user agent string contains a web address (only crawlers should have that, but there are some false positives, where a browser sends a user agent string with a web address, e.g. because of an ill-behaved plug-in; the main offenders have been eliminated)
2 The user agent string contains the term bot, spider or crawl[er]
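
The two rules above can be approximated with a short check. This is a minimal sketch under stated assumptions, not the script that produced this report: the name `is_crawler` is hypothetical, and "web address" is assumed here to mean a substring starting with `http://`, `https://`, or `www.`. The removal of known false positives mentioned in rule 1 is not modeled.

```python
import re

# Rule 1 (assumed interpretation): the user agent contains a web address.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)
# Rule 2: the user agent contains the term bot, spider, or crawl[er].
TERM_PATTERN = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if either of the two crawler rules matches."""
    return bool(URL_PATTERN.search(user_agent) or TERM_PATTERN.search(user_agent))
```

Matching on bare substrings keeps the check simple, but it also flags any user agent that merely happens to contain "bot", which is one way false positives can arise.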

In total 68,004,290 page requests (mime type text/html only!) per day are considered crawler requests, out of 478,690,710 external requests, which is 14.2%
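
As a quick check of the percentage, the share can be recomputed from the two daily totals quoted above. The variable names are illustrative only.

```python
# Daily totals as stated in the report text.
crawler_requests_per_day = 68_004_290
external_requests_per_day = 478_690_710

# Fraction of external requests that are classified as crawler requests.
share = crawler_requests_per_day / external_requests_per_day
print(f"{share:.1%}")  # prints 14.2%
```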

Page requests for crawlers that specify a url in the agent string
Columns: Count (x 1000) / Secondary domain (~site) name / URL / Mime type / User agent
google
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortografia4)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/feedfetcher.html-FeedFetcher-Google; (url)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: s~redconceptual)
 www.google.com/feedfetcher.htmlapplication/xmlFeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortopedianew)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien4)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/jsonMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: rarplayer)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien3)
 code.google.com/p/crawler4j/text/..crawler4j (url)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmltext/..FeedFetcher-Google; (url)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: s~senchaiosrc)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~liyi1999)
 www.google.com/coop/cse/creftext/..FeedFetcher-Google-CoOp; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki4)
 www.google.com/feedfetcher.htmlapplication/xmlMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki2)
 code.google.com/appengineapplication/xmlAppEngine-Google; (url; appid: wikipedia-raw)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: web-phpproxy)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: boxapp)
 code.google.com/appenginetext/..WikiBot/0.1 AppEngine-Google; (url; appid: newikipedia)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~link123451)
 code.google.com/p/crawler4j/image/..crawler4j (url)
 code.google.com/appengineapplication/jsonMozilla 3.5 AppEngine-Google; (url; appid: prfleme)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~tpbitalia)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: prfleme)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~expinia-wiki)
 code.google.com/appengineapplication/jsonMozilla 4.0 AppEngine-Google; (url; appid: prfleme)
 code.google.com/appengineapplication/jsonMWBOT GAE Edition AppEngine-Google; (url; appid: philip-bot)
 www.google.com/bot.htmlimage/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 docs.google.comtext/..Mozilla/5.0 (compatible; GoogleDocs; url)
 code.google.com/appenginetext/..www.productontology.org/1.0 (Contact: mail address ) AppEngine-Google; (url; appid: gr4bing)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; documents; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: 100thpriest)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: good-proxy)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.909.30391; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: d24-img)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: s~liyi1999)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxy-neil)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: retimeme2)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~education-center)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kbworld24)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.911.3589; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: cmd-proxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: openeyeproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: davidgotmoney50)
 code.google.com/p/rondaapplication/jsonRonda - url
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: thakurproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: korvas-sux)
 desktop.google.com/-Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: nmimsforti)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: freetobrowse)
 www.google.com/bot.htmlNONE/wikipedia- Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: abdulfat)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxyproxy2884)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~aurora-prox)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wagagate)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: hydraroxy)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: tdmplong)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webponline7)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~zagrobelnyprox)
 code.google.com/appenginetext/..oohEmbed.com AppEngine-Google; (url; appid: vipoembed)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: workrelatedworkstuff)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~kushgenius)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: tortelliniman)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: threewiki)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: gizmo-jumpjet)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webusadlp6)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: myproxywx)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: drrkproxxxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxynaungnaung)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kalchth)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: weps001)
 www.google.com/bot.htmlimage/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: hao1-prxoy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: finchproxy)
facebook
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
 developers.facebook.comimage/..facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.1 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.1 (url)
 developers.facebook.com-facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
bing
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b5
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b3
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/vnd.php.serializedMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b4
 www.bing.com/bingbot.htmimage/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..User-Agent :Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) (via Web-Blaster/2.21 (http://www.assoziations-blaster.de/web-blast.html))
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: proxydisk8)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: wxcity1)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: proxydisk9)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: surf603)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) (via babelfish.yahoo.com)
 www.bing.com/bingbot.htmapplication/xmlMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) (via Web-Blaster/2.21 (http://www.a-blast.org/web-blast.html))
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible;bingbot/2.0;url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: tcpudp10)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: yourrevenues)
google?
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0(compatible;GoogleBot/2.1;url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/xmlMozilla/5.0 (compatible; GoogleBot/2.1; url)
baidu
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider-image(url)
 www.baidu.com/search/spider.html-Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider(url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0(compatible;Baiduspider/2.0;url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible;Baiduspider/2.0;url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmlapplication/xmlMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmlimage/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
yahoo
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 listing.yahoo.co.jp/support/faq/int/other/other_001.htmltext/..Y!J-BRJ/YATS crawler (url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRW/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 developer.yahoo.com/yql/providertext/..Mozilla/5.0 (compatible; Yahoo Pipes 2.0; url) Gecko/20090729 Firefox/3.5.2
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurpapplication/vnd.php.serializedMozilla/5.0 (compatible Yahoo! Slurp/3.0 url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRI/0.0.1 crawler ( url )
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRT/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurpapplication/jsonMozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.comtext/..Mozilla/5.0 (YahooYSMcm/3.0.0; url)
naver
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/image/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url) ASProxy/5.5b3
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url) ASProxy/5.5b5
 help.naver.com/robots/application/xmlYeti/1.0 (NHN Corp.; url)
yandex
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexDirect/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexAntivirus/2.0; url)
 yandex.com/bots-Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImageResizer/2.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexAntivirus/2.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexNewslinks; url)
 yandex.com/botsapplication/vnd.php.serializedMozilla/5.0 (compatible; YandexBot/3.0; url)
msn
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)._
 search.msn.com/msnbot.htmtext/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmimage/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-Products/1.0 (url)
 search.msn.com/msnbot.htmtext/..msnbot-UDiscovery/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot/0.01 (url)
ahrefs
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/2.0; url)
 ahrefs.com/robot/-Mozilla/5.0 (compatible; AhrefsBot/2.0; url)
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/1.0; url)
majestic12
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.1; url)
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.0; url)
php
 pear.php.net/application/vnd.php.serializedPEAR HTTP_Request class ( url )
 pear.php.net/application/xmlPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/0.5.2 (url) PHP/5.2.17
 pear.php.net/text/..PEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2application/vnd.php.serializedHTTP_Request2/2.0.0 (url) PHP/5.3.8
 pear.php.net/image/..PEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/2.0.0 (url) PHP/5.3.2-1ubuntu4.10
archive
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; archive.org_bot url)
 archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120118.092903 url)
 www.archive.org/details/archive.org_bot-Mozilla/5.0 (compatible; archive.org_bot url)
80legs
 www.80legs.com/webcrawler.htmltext/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
 www.80legs.com/webcrawler.htmlimage/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
yacy
 yacy.net/bot.htmltext/..yacybot (sciencenet-any; amd64 Linux 2.6.32-33-generic; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-37-generic; java 1.6.0_20; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_29; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-14-generic; java 1.6.0_23; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.41.4-1.fc15.x86_64; java 1.6.0_22; W-SU/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; x86 Windows 2003 5.2; java 1.6.0_29; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.39-bpo.2-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.24-28-server; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.7.0_02; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.18-028stab091.2; java 1.6.0_20; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-openvz-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-15-generic; java 1.6.0_26; Europe/sv) url
 yacy.net/bot.html-yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_29; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.0-1.2-desktop; java 1.6.0_22; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (webportal/global; amd64 Linux 2.6.32-37-server; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-220.2.1.el6.i686; java 1.6.0_22; US/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; x86_64 Mac OS X 10.6.8; java 1.6.0_29; Asia/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.38-13-generic-pae; java 1.6.0_22; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-12-generic; java 1.6.0_23; Indian/en) url
 yacy.net/bot.htmltext/..yacybot (webportal/global; amd64 Linux 2.6.32-33-server; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.39.4-4.2-desktop; java 1.6.0_12-ea; Europe/fr) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-14-generic; java 1.6.0_23; Europe/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.6-gentoo; java 1.6.0_22; US/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.37.6-0.5-desktop; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; x86 Windows XP 5.1; java 1.6.0_29; Europe/es) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_29; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-34-server; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-10-generic; java 1.7.0_147-icedtea; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.6-gentoo; java 1.7.0_147-icedtea; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-1-amd64; java 1.6.0_24; Europe/it) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-15-generic; java 1.6.0_23; Europe/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.0.0-13-generic-pae; java 1.6.0_23; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-37-generic-pae; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.38.2-grsec-xxxx-grs-ipv6-64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0; java 1.7.0_147-icedtea; Europe/en) url
 yacy.net/bot.html-yacybot (freeworld/global; x86 Windows XP 5.1; java 1.6.0_29; Europe/es) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.0-1-amd64; java 1.7.0_147-icedtea; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.41.10-3.fc15.x86_64; java 1.6.0_22; W-SU/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.18-274.12.1.el5; java 1.7.0; GMT/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-5-686; java 1.6.0_18; Europe/fr) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-14-generic; java 1.6.0_23; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.16.46-0.12-smp; java 1.6.0_15; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.1; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.24-26-generic; java 1.6.0_18; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.31-gentoo-r6; java 1.6.0_17; Etc/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-5-686; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.1-gentoo-r2; java 1.7.0_147-icedtea; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.1.0-1-686-pae; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.39-bpo.2-486; java 1.6.0_26; US/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; x86 Windows XP 5.1; java 1.6.0_29; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-5-686; java 1.6.0_18; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_29; Europe/es) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-028stab092.1; java 1.6.0_26; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-38-generic; java 1.6.0_20; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.0-1-amd64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.35.13-92.fc14.x86_64; java 1.6.0_20; Europe/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-4-pve; java 1.6.0_18; Europe/de) url
wwwgogetpapers
 wwwgogetpapers.com/application/jsonUser-Agent: GoGetPapersBot (url)
 wwwgogetpapers.com/text/..User-Agent: GoGetPapersBot (url)
wordpress
 josefboberg.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 driwancybermuseum.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 02varvara.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 syiahali.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 driwancybermuseum.wordpress.comtext/..WordPress/3.4-alpha-19672; url
 eof737.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 curtisnarimatsu.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 lobbyistsofficesofgrw.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 greatriversofhope.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 jrolofer.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 strona2.wordpress.comtext/..WordPress/3.4-alpha-19719; url
 syiahali.wordpress.comtext/..WordPress/3.4-alpha-19643; url
sblog
 fulltext.sblog.cz/screenshot/image/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/text/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/-SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/application/javascriptMozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
www.
 www.text/..GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
 www.text/..GoogleBot-Image/1.0 ( urlGoogleBot.com/bot.html)
 www.-GoogleBot/2.1 (urlGoogleBot.com/bot.html)
 www.image/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
youdao
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/image/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 toolbar.youdao.com/image/..Youdao Toolbar (url)
 www.youdao.com/help/webmaster/spider/-Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
jike
 shoulu.jike.com/spider.htmltext/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.html-Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.html-Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.htmlapplication/xmlMozilla/5.0 (compatible; JikeSpider; url)
sogou
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07application/vnd.php.serializedSogou web spider/4.0(url)
wikipedia
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.18.0 url
 en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentationtext/..WikiCleaner (url)
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.19.0 url
 en.wikipedia.orgtext/..url
 fr.wikipedia.org/wiki/Utilisateur:Salebotapplication/jsonSalebot, see url (uses Perl MediaWiki::API)
traslated
 mymemory.traslated.net/doc/text/..Mozilla/5.0 (MyMemory Bot url)
toolserver
 wiki.toolserver.org/view/GeoHacktext/..Geohack (url)
 toolserver.org/~bayo/text/..LudoThecaire/1.0 (url)
 toolserver.org/~dispenser/text/..DispensersTools (url)
 toolserver.org/~guandalug/application/vnd.php.serializedGuandalugs PHPWikiBot/1.1 (url;de:User:Guandalug)
 toolserver.org/~para/cgi-bin/kmlexporttext/..url libwww-perl/6.02
FeedBurner
 www.FeedBurner.comtext/..FeedBurner/1.0 (url)
exabot
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0; url)
wikimedia
 tools.wikimedia.de/~daniel/text/..WikiSense (url)
bin-co
 www.bin-co.com/php/scripts/load/text/..BinGet/1.00.A (url)
 www.bin-co.com/php/scripts/load/application/vnd.php.serializedBinGet/1.00.A (url)
sf
 magpierss.sf.nettext/..MagpieRSS/0.7x (url)
 liferea.sf.net/text/..Liferea/0.x.x (Linux; en_US.UTF-8; url)
 liferea.sf.net/text/..Liferea/1.x.x (Linux; es_ES.UTF-8; url)
 magpierss.sf.netapplication/xmlMagpieRSS/0.72 (url; No cache)
mediawiki
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url)
discoveryengine
 discoveryengine.com/discobot.htmltext/..Mozilla/5.0 (compatible; discobot/2.0; url)
 discoveryengine.com/discobot.htmltext/..Mozilla/5.0 (compatible; discobot/1.1; url)
zum
 help.zum.com/inquirytext/..ZumBot/1.0 (ZUM Search; url)
 help.zum.com/inquiryimage/..ZumBot/1.0 (ZUM Search; url)
echonest
 the.echonest.com/reader/application/xmlnestReader/0.3 (discovery; url; reader at echonest.com)
 the.echonest.com/reader/text/..nestReader/0.3 (discovery; url; reader at echonest.com)
wikidict
 www.wikidict.detext/..url
enwp
 enwp.org/User:SDPatrolBottext/..SDPatrolBot (url)
 enwp.org/User:H3llkn0wz/WikiSharpAPItext/..WikiSharpAPI/0.3 url (C# .NET)
 enwp.org/User:KingpinBottext/..KingpinBot (url)
sistrix
 crawler.sistrix.net/text/..Mozilla/5.0 (compatible; SISTRIX Crawler; url)
soso
 help.soso.com/webspider.htmtext/..Sosospider(url)
 help.soso.com/webspider.htm-Sosospider(url)
flipboard
 flipboard.com/browserproxyimage/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
 flipboard.com/browserproxyapplication/jsonMozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.1; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
archive-it
 archive-it.org/files/site-owners.htmlimage/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.htmltext/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
goo
 help.goo.ne.jp/contact/text/..goo wikipedia (url)
avantbrowser
 www.avantbrowser.comtext/..Advanced Browser (url)
 www.avantbrowser.comtext/..Avant Browser (url)
gnip
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
 www.gnip.com/-UnwindFetchor/1.0 (url)
federatedmedia
 federatedmedia.nettext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
freebase
 www.freebase.comtext/..metaweb/Nutch-1.0-dev (url; help_at_metaweb.com)
feedshow
 www.feedshow.comtext/..Feedshow/x.0 (url; 1 subscriber)
 www.feedshow.comtext/..FeedshowOnline (url)
newsgator
 www.newsgator.comtext/..NewsGatorOnline/2.0 (url; 1 subscribers)
 www.newsgator.com/text/..FeedDemon/2.7 (url; Microsoft Windows XP)
jetbrains
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 1.0.x (url)
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 2.0 Release Candidate 1 (url)
kosmix
 www.kosmix.com/html/kosmos.htmlapplication/xmlMozilla/5.0(compatible;Kosmos/1.0;url)
entireweb
 www.entireweb.com/about/search_tech/speedy_spider/text/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Speedy Spider (url)
cmu
 boston.lti.cs.cmu.edu/crawler_12/text/..Mozilla/5.0 (compatible; lemurwebcrawler mail address ; url)
 boston.lti.cs.cmu.edu/crawler_12/-Mozilla/5.0 (compatible; lemurwebcrawler mail address ; url)
apercite
 www.apercite.fr/robot/index.htmlimage/..Mozilla/5.0 (compatible; Apercite; url)
github
 github.com/pauldix/typhoeus/tree/mastertext/..Typhoeus - url
 github.com/NeilCrosby/wikislurpapplication/vnd.php.serializedWikiSlurp (url)
tumblr
 benderthewebrobot.tumblr.comtext/..Mozilla/5.0 (compatible; Bender; url)
 benderthewebrobot.tumblr.comapplication/vnd.php.serializedMozilla/5.0 (compatible; Bender; url)
sentymetr
 sentymetr.pl/bot.htmlapplication/jsonMozilla/5.0 (compatible; SentymetrBot 1.0; url)
 sentymetr.pl/bot.htmltext/..Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
4chat
 www.4chat.tvtext/..url
semager
 www.semager.de/blog/semager-bots/text/..Mozilla/5.0 (compatible; Semager/1.4c; url)
 www.semager.de/blog/semager-bots/text/..Mozilla/5.0 (compatible; Semager/1.4; url)
whatrhymeswith
 www.whatrhymeswith.com/site/rhyme-bottext/..RhymeBot/0.1 (url)
emining
 emining.jp/text/..emBot-GalaBuzz/Nutch-1.0 (url; mail address )
 emining.jp/-emBot-GalaBuzz/Nutch-1.0 (url; mail address )
daum
 ws.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/2.0
 ws.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/3.0
wikimpress
 wikimpress.org/text/..Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
z-add
 w3.z-add.co.uk/linkcheck/text/..Z-Add Link Checker (url)
tinyurl
 tinyurl.com/64t5ntext/..Rome Client (url) Ver: 0.9
 tinyurl.com/64t5napplication/xmlRome Client (url) Ver: UNKNOWN
tweetmeme
 tweetmeme.com/text/..Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
 tweetmeme.com/-Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
ranchero
 ranchero.com/netnewswire/text/..NetNewsWire/2.x (Mac OS X; url)
blogbridge
 www.blogbridge.com/text/..BlogBridge 2.13 (url)
rssreader
 www.rssreader.comtext/..RssReader/1.0.xx.x (url) Microsoft Windows NT 5.1.2600.0
seebot
 seebot.orgtext/..Lynx/2.8 (;url)
whstour
 tokyo.whstour.comtext/..WordPress/3.2.1; url
 osaka.whstour.comtext/..WordPress/3.2.1; url
 nagoya.whstour.comtext/..WordPress/3.2.1; url
orcabrowser
 www.orcabrowser.comtext/..Orca Browser (url)
kula
 kula.jp/endotext/..endo/1.0 (Mac OS X; ppc i386; url)
graemef
 graemef.comtext/..NewsGator FetchLinks extension/0.2.0 (url)
it-influentials
 search.it-influentials.com/bot.htmtext/..Mozilla/5.0 (compatible;FindITAnswersbot/1.0;url)
wandex
 wandex.nettext/..Mozilla/5.0 (compatible; World Wide Web Wanderer (Wandex Bot)/1.3; url)
SearchNearMe
 SearchNearMe.com/contact.phpapplication/vnd.php.serializedSearchNearMe (url)
 SearchNearMe.com/contact.phptext/..SearchNearMe (url)
ponderer
 ponderer.org/download/annotate_google.user.jstext/..annotate_google; url
zootycoon
 www.zootycoon.comtext/..Zoo Tycoon 2 Client -- url
nemui
 mozshot.nemui.org/text/..Mozilla/5.0 (Gecko/20070310 Mozshot/0.0.20070628; url)
winpodder
 winpodder.comtext/..WinPodder (url)
bsurprised
 bsurprised.com/text/..BSurprised WikiBox 0.1.3 (url)
snarfware
 www.snarfware.com/text/..Snarfer/0.x.x (url)
zipcommander
 www.zipcommander.com/text/..1st ZipCommander (Net) - url
rssbandit
 www.rssbandit.orgtext/..RssBandit/1.5.0.10 (WinNT 5.1.2600.0; url) (WinNT 5.1.2600.0; )
plagger
 plagger.org/text/..Plagger/0.x.xx (url)
timewe
 timewe.nettext/..CDR/1.7.1 Simulator/0.7(url) Profile/MIDP-1.0 Configuration/CLDC-1.0
feeds4all
 www.feeds4all.com/feedzcollectortext/..FeedZcollector v1.x (Platinum) url
abonti
 www.abonti.comtext/..Mozilla/5.0 (compatible; Abonti/0.91 - url)
hatena
 a.hatena.ne.jp/helptext/..Hatena Antenna/0.5 (url)
enotes
 www.enotes.comtext/..eNotesBot 2.0 (url)
 www.enotes.comimage/..eNotesBot 2.0 (url)
artez
 www.artez.nltext/..artezTest/0.1 (url)
commoncrawl
 www.commoncrawl.org/bot.htmltext/..CCBot/1.0 (url)
warebay
 www.warebay.com/bot.htmltext/..Mozilla/5.0 (compatible; WBSearchBot/1.1; url)
yioop
 www.yioop.com/bot.phptext/..Mozilla/5.0 (compatible; YioopBot url)
wikiglass
 wikiglass.comtext/..url : mail address
bibalex
 archive.bibalex.org/bot/image/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
 archive.bibalex.org/bot/text/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
speaktoit
 www.speaktoit.comapplication/jsonSpeaktoit url
alexa
 www.alexa.com/site/help/webmasterstext/..ia_archiver (url; mail address )
netnewswireapp
 netnewswireapp.com/mac/-NetNewsWire/3.3 (Mac OS X; url; gzip-happy)
spinn3r
 spinn3r.com/robottext/..Mozilla/5.0 (X11; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); url) Gecko/2010040121 Firefox/3.0.19
textdigger
 textdigger.comtext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
simplepie
 simplepie.orgapplication/xmlSimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
 simplepie.orgtext/..SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
rockpeaks
 www.rockpeaks.com/contacttext/..RockPeaks/0.1 (url)
wattsupwiththat
 wattsupwiththat.comtext/..WordPress/3.4-alpha-19719; url
globalspec
 www.globalspec.com/Ocellitext/..Ocelli/1.4 (url)
seokicks
 www.seokicks.de/robot.htmltext/..Mozilla/5.0 (compatible; SEOkicks-Robot url)
blogscope
 www.blogscope.net/text/..Mozilla/5.0 (compatible; BlogScope/1.0; url; U of Toronto)
drupal
 drupal.org/text/..User-Agent: Drupal (url)
 drupal.org/text/..Drupal (url)
paper
 support.paper.li/entries/20023257-what-is-paper-litext/..Mozilla/5.0 (compatible; PaperLiBot/2.1; url)
ngapa
 www.ngapa.comtext/..NgapaBot/Nutch-1.3 (NgapaBot is crawler for Indonesia Search Engine; url; mail address )
Anonymouse
 Anonymouse.org/image/..url (Unix)
 Anonymouse.org/text/..url (Unix)
suggy
 blog.suggy.com/was-ist-suggy/suggy-webcrawler/text/..Mozilla/5.0 (compatible; suggybot v0.01a, url)
 blog.suggy.com/was-ist-suggy/suggy-webcrawler/-Mozilla/5.0 (compatible; suggybot v0.01a, url)
searchtechnologies
 www.searchtechnologies.comtext/..Mozilla/5.0 (compatible; heritrix/1.14.3 url)
weblio
 www.weblio.jp/text/..Mozilla/5.0 (compatible; WeblioBot; url)
cdac
 www.cdac.intext/..abhishek/Nutch-0.9 (cdacp; url; mail address )
netvibes
 www.netvibes.comtext/..Netvibes (url)
webarchiv
 www.webarchiv.cztext/..Mozilla/5.0 (compatible; heritrix/1.14.3 url)
netseer
 www.netseer.com/crawler.htmltext/..Mozilla/5.0 (compatible; NetSeer crawler/2.0; url; mail address )
froute
 labs.froute.jp/pc2m/help.htmltext/..Froute Mobile Gateway/1.0 (url)
linkedin
 www.linkedin.comimage/..LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 url)
 www.linkedin.comtext/..LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 url)
 www.linkedin.com-LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 url)
oreneta
 oreneta.com/libro-verde/text/..HerramientaAfiliado/0.1 (url; mail address )
memidex
 www.memidex.com/_bottext/..Mozilla/5.0 (compatible; Memibot/1.0; url )
chilkatsoft
 www.chilkatsoft.com/ChilkatHttpUA.asptext/..Chilkat/1.0.0 (url)
js-kit
 js-kit.com/text/..JS-Kit URL Resolver, url
ibis
 ibis.ne.jp/browser/about.htmlimage/..Mozilla/4.0 (compatible; ibisBrowser; url)
 ibis.ne.jp/browser/about.htmltext/..Mozilla/4.0 (compatible; ibisBrowser; url)
test
 www.test.testtext/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
potaru
 potaru.com/robo.htmltext/..Mozilla/5.0 (compatible; Robo/1.0b; url)/Nutch-1.2
topsy
 labs.topsy.com/butterfly/text/..Mozilla/5.0 (compatible; Butterfly/1.0; url) Gecko/2009032608 Firefox/3.0.8
search
 www.search.ch/rim.htmltext/..UltraSpider3000/1.0 (url)
bnf
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmltext/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmlimage/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
superfeedr
 superfeedr.comapplication/xmlSuperfeedr: Superparser bot/1.1 url - Please read this http://blog.superfeedr.com/publishers.html or get in touch if we are polling too hard
summify
 summify.comtext/..Summify (Summify/1.0.1; url)
kr:6600
 www.checkprivacy.or.kr:6600/RS/PRIVACY_FAQ.jsptext/..url
creativecommons
 wiki.creativecommons.org/Metadata_Scrapertext/..CC Metadata Scaper url
picsearch
 www.picsearch.com/bot.htmltext/..psbot/0.1 (url)
mytvmoments
 www.mytvmoments.comtext/..My TV Moments (url)
tourdeskde
 osaka.tourdeskde.comtext/..WordPress/3.2.1; url
 tokyo.tourdeskde.comtext/..WordPress/3.2.1; url
 nagoya.tourdeskde.comtext/..WordPress/3.2.1; url
97498.74 total
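
The grouping of rows by secondary domain (~site) name in the table above could be approximated along the following lines. This is only a sketch: the report's actual grouping rule is not documented here, and the function name `site_name` and the set `SECOND_LEVEL_LABELS` are assumptions introduced for illustration.

```python
# Hypothetical helper, not the report's actual grouping rule.
# Assumed, incomplete set of generic second-level labels (as in .co.uk, .ne.jp).
SECOND_LEVEL_LABELS = {"co", "ne", "or", "ac", "com"}


def site_name(url: str) -> str:
    """Return a group label for a URL such as 'help.goo.ne.jp/contact/'."""
    host = url.split("/", 1)[0]    # drop the path
    host = host.split(":", 1)[0]   # drop an optional port
    labels = host.split(".")
    if len(labels) < 2:
        return host
    # Skip the top-level domain, then skip a generic second-level label if present.
    index = len(labels) - 2
    if index > 0 and labels[index].lower() in SECOND_LEVEL_LABELS:
        index -= 1
    return labels[index]


print(site_name("magpierss.sf.net"))
print(site_name("help.goo.ne.jp/contact/"))
print(site_name("w3.z-add.co.uk/linkcheck/"))
```

Because this sketch strips the port before splitting the host name, it would label `www.checkprivacy.or.kr:6600/RS/PRIVACY_FAQ.jsp` as `checkprivacy`, whereas the table lists that row under `kr:6600`.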

Page requests for probable crawlers, recognized by keyword
Count
x 1000
Agent string
  Mime type (count ≥ 3)
PythonWikipediaBot/1.0
 application/json
 application/xml
 text/..
 image/..
GoogleBot-Image/1.0
 image/..
 text/..
 -
MediaWikiCrawler-Google/2.0 ( mail address )
 text/..
 -
php wikibot classes
 application/vnd.php.serialized
 text/..
 -
LinkParser/2.0
 text/..
 -
Empedia Bot
 text/..
 -
wikiwix-bot-3.0
 text/..
 -
Mozilla/5.0 (Windows; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 ( mail address )
 text/..
 -
 application/pdf
GoogleBot-Image/1.0
 text/..
 image/..
 application/vnd.php.serialized
 -
 application/json
mail address
 application/vnd.php.serialized
 text/..
 application/xml
ClueBot/1.1
 application/vnd.php.serialized
Peachy MediaWiki Bot API Version 1.0
 application/vnd.php.serialized
 text/..
Answersbot
 text/..
ClueBot/2.0
 application/vnd.php.serialized
 text/..
bot Trivia Game - contact: mail address
 application/vnd.php.serialized
spider
 text/..
 application/json
 image/..
YBot/0.1
 application/vnd.php.serialized
 text/..
Pywikipediabot/2.0
 application/json
DotNetWikiBot/2.97 (Unix 2.6.32.36; )
 text/..
 application/xml
HTMLParser/2.0
 text/..
User:SatyrBot, v.2.0 - please contact User:SatyrTN if there are any problems
 application/vnd.php.serialized
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
 -
 image/..
 application/vnd.php.serialized
Mozilla 5.0 (Apibot 0.32)
 application/vnd.php.serialized
DigitalsmithsBot
 text/..
Metabot 0.1
 text/..
wikbot/1.31 CFNetwork/548.0.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
Onespot Crawler
 application/json
 text/..
 -
Wikipath Bot (email: mail address )
 application/json
python-wikitools/1.2 (User:BernsteinBot)
 application/json
 application/x-www-form-urlencoded
MediaWiki::Bot/3.2.6
 application/json
DotNetWikiBot/2.81 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
 audio/midi
AnomieBOT 1.0 (TagDater)
 application/json
 text/..
 application/x-www-form-urlencoded
Code Search Crawler/Nutch-1.2 (Code Search Crawler; www.iai.uni-bonn.de)
 text/..
 image/..
 application/pdf
CorenSearchBot/1.7 en libwww-perl/6.03
 text/..
AnomieBOT 1.0 (ReplaceExternalLinks2)
 application/json
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.7.4.74) Opera Mini/3.1
 image/..
 -
 text/..
DotNetWikiBot/2.96 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r0)
 application/json
 application/x-www-form-urlencoded
 text/..
Test Webbot
 text/..
FAST Enterprise Crawler 6 used by ESP ( mail address )
 text/..
LinksCrawler 0.1beta
 text/..
 -
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (Exabot-Thumbnails)
 image/..
 text/..
 application/json
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
 text/..
 -
Mozilla/5.0 (X11; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot
 text/..
 -
COIBot/1.00
 text/..
UCMore Crawler App
 text/..
 -
SineBot/1.5.18(User:SineBot)
 application/vnd.php.serialized
 text/..
 -
DotNetWikiBot/2.97 (Unix 2.6.32.37; )
 text/..
My Bot
 image/..
 text/..
 application/ogg
MLBot (www.metadatalabs.com/mlbot)
 text/..
 application/vnd.php.serialized
 -
 image/..
Tawbot (public svn release; plwiki)
 text/..
Mozilla/5.0 (compatible; Nigma.ru/3.0; mail address )
 text/..
 -
 application/xml
plantspedia data crawler
 text/..
Kavande Crawler 1.0/Nutch-1.4 (Iranian National Web Crawler)
 text/..
 image/..
 -
CorenSearchBot/1.7 en libwww-perl/5.834
 text/..
binbot
 text/..
ZuZzo.Net Spider
 text/..
Spinuf Spider
 text/..
DotNetWikiBot/2.97 (Unix 5.10.0.0; )
 application/xml
 text/..
MyCuteBot/0.1
 text/..
 application/vnd.php.serialized
 application/json
CaBot Script (running on nightshade.toolserver.org)
 application/vnd.php.serialized
 text/..
OrangeCrawler/Nutch-1.0 ( mail address )
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r0)
 application/x-www-form-urlencoded
COIBot/2.0
 text/..
~Bot ([[:fr:w:User:TildeBot]] by [[:fr:w:User:Alphos]] mail address )
 text/..
Twitterbot/0.1
 text/..
 image/..
 -
nutch/Nutch-1.2 (MyTest; 140.115.82.105; mail address )
 text/..
 application/opensearchdescription+xml
HRoestBot, de-wikipedia using pywikipedia framework
 application/json
 application/xml
 text/..
gsa-crawler (Enterprise; T2-LYGGXQJZENSAT; mail address )
 text/..
AnomieBOT 1.0 (PERTableUpdater)
 application/json
 text/..
gsa-crawler (Enterprise; T2-C4B6ZJTX3WWKK; mail address )
 text/..
BritannicaProjBot mail address
 text/..
Webwiki Search Engine Bot - www.webwiki.de
 text/..
AnomieBOT 1.0 (FlagIconRemover)
 application/json
AniBot/0.9 php/curl
 application/vnd.php.serialized
JavaCrawler/1.1
 text/..
DNSTallyKwBot/0.2
 text/..
Peachy MediaWiki Bot API Version 0.1beta
 application/vnd.php.serialized
AnomieBOT 1.0 (TemplateSubster)
 application/json
SiocWikiBot/1.0
 application/vnd.php.serialized
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.7.2.71) Opera Mini/3.1
 image/..
 text/..
FAST Enterprise Crawler 6 used by LexisNexis ( mail address )
 text/..
 -
bitlybot
 text/..
 image/..
 -
Twitterbot/1.0
 text/..
 image/..
feedcrawler4/0.1 libwww-perl/6.03
 text/..
Wikibot
 text/..
 image/..
 -
GNAA-bot
 text/..
SchoolReviewNetworkWikiBot
 application/json
TrueKnowledgeBot bot mail address >
 application/vnd.php.serialized
 application/xml
GoogleBot/2.1
 text/..
 image/..
 -
DotNetWikiBot/2.7 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 image/..
 application/xml
DotNetWikiBot/2.97 (Microsoft Windows NT 6.1.7600.0; )
 text/..
 application/xml
AnomieBOT 1.0 (BAGBot)
 application/json
 text/..
SurakWare MediaWiki Bot/1.0
 text/..
t_crawler/0.4
 text/..
DotNetWikiBot/2.96 (Unix 5.10.0.0; )
 text/..
 application/xml
FAST Enterprise Crawler/5.3.4 ( mail address )
 text/..
 -
OrlodrimBot/1.0
 text/..
SiteSeekerCrawler/1.0
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
TVersity Media Robot
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 -
XLinkBot/1.00
 text/..
CheMoBot/1.00
 text/..
super cool bot
 application/vnd.php.serialized
Soundkiosk Relation-Crawler (Version 1.0; soundkiosk.de)
 application/xml
 text/..
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
 image/..
 text/..
 application/json
feedcrawler4/0.1 libwww-perl/5.837
 text/..
Pastec bot
 text/..
 -
 image/..
python-wikitools/1.2 (User:Mr.Z-bot)
 application/json
wikbot/1.31 CFNetwork/485.13.9 Darwin/11.0.0
 image/..
 application/json
 text/..
HTMLParser/1.6
 text/..
wikbotlite/1.50 CFNetwork/548.0.4 Darwin/11.0.0
 image/..
 application/json
 text/..
LauschenBot/1.0 ( mail address )
 text/..
SoNet BOT
 application/json
Slevnicka.cz CURL bot
 text/..
Geni ircpybot 1.0
 text/..
 application/json
 application/xml
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.7.2.71) Opera Mini/3.1
 -
TheKeens bot
 text/..
DotNetWikiBot/2.9 (Microsoft Windows NT 6.0.6000.0; )
 text/..
MyWikipediaBot/1.0
 application/vnd.php.serialized
MediaWiki::Bot/3.4.0
 application/json
DotNetWikiBot/2.9 (Unix 5.10.0.0; )
 text/..
Mozilla 5.0 (Apibot 0.30b5)
 application/vnd.php.serialized
AnomieBOT 1.0 (AFDMergeFromCleaner)
 application/json
ariabot
 text/..
wikbot/1.50 CFNetwork/548.0.4 Darwin/11.0.0
 image/..
 application/json
 text/..
Citation_bot; mail address
 text/..
AnomieBOT 1.0 (RandomPagePicker)
 application/json
 text/..
HBC Archive Indexerbot 0.9a
 text/..
WikiBot/0.1
 text/..
HTMLParser/1.4
 text/..
Mozilla/5.0 (compatible; FriendFeedBot/0.1; Http://friendfeed.com/about/bot; 371 subscribers; feed-id=3852576738117026533)
 application/xml
 -
Mozilla/5.0 (compatible; Tbot/1.0;)
 text/..
Mozilla/5.0 (Bgbot 0.5)
 text/..
python-wikitools/1.2 (User:LaraBot)
 application/json
Jabse.com Crawler v.2.0 www.jabse.com/crawler.php
 text/..
Goalkeeperbot(User:Beetstra)/1.0
 text/..
gsa-crawler (Enterprise; GID-01422; mail address )
 text/..
Baiduspider
 text/..
Freebase Deathbot
 text/..
AnomieBOT 1.0 (DeletionSortingCleaner)
 application/json
Mozilla/5.0 MaboMwFramework/1.1 (w:de:MerlIwBot)
 text/..
mail address (Mozilla compatible)
 text/..
A .NET Web Crawler
 text/..
wikbot/1.31 CFNetwork/485.12.7 Darwin/10.4.0
 image/..
 application/json
K-D Bot
 text/..
 image/..
Mozilla/4.0 (compatible; MT search portal spider/3.0; mail address )"
 application/xml
 text/..
Crawly the Spider/Nutch-1.4
 text/..
 application/ogg
wikbot/1.31 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
 application/json
 -
 text/..
GoogleBot
 text/..
 image/..
AdMedia bot
 text/..
SoftNet Nutch Spider/Nutch-1.4
 text/..
 image/..
MediaWiki::Bot 3.1.5
 application/json
WikiBookBot/0.1
 text/..
Mozilla/5.0 (X11; Linux x86_64; de-DE; rv:1.9.0.19) Gecko/2010120923 ThumbShotsBot (KFSW 3.0.6-3)
 image/..
 text/..
 application/json
EarwigBot/0.1.dev (Python/2.7.1; https://github.com/earwig/earwigbot; mail address )
 application/json
18585.04 total
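
The keyword rule stated at the top of this report (the user agent string contains the term bot, spider or crawl[er]) can be sketched as a simple pattern match. This is a minimal illustration, not the script that produced the table: the name `looks_like_crawler` is hypothetical, case-insensitive matching is an assumption, and the report's actual matching may differ.

```python
import re

# Keywords taken from the rule stated in the report; matching case-insensitively
# is an assumption made for this sketch.
CRAWLER_KEYWORDS = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def looks_like_crawler(user_agent: str) -> bool:
    """Return True if the user agent string contains one of the crawler keywords."""
    return CRAWLER_KEYWORDS.search(user_agent) is not None


for agent in (
    "Baiduspider",
    "FAST Enterprise Crawler 6 used by ESP",
    "Mozilla/5.0 (Windows NT 6.1; rv:9.0) Gecko/20100101 Firefox/9.0",
):
    print(agent, "->", looks_like_crawler(agent))
```

A substring match like this is deliberately broad, so any user agent that happens to contain one of the keywords would be counted as a crawler.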

IP ranges: known IP ranges for Google are 64.233.[160.0-191.255], 66.249.[64.0-95.255], 66.102.[0.0-15.255], 72.14.[192.0-255.255],
74.125.[0.0-255.255], 209.85.[128.0-255.255], 216.239.[32.0-63.255] and a few other minor subranges.
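
A check against the ranges listed above could look roughly like the following sketch. It covers only the seven ranges named in this report, not the "minor other subranges", and the names `GOOGLE_RANGES` and `is_listed_google_ip` are hypothetical.

```python
import ipaddress

# First and last address of each range listed in the report.
# Octets are written without leading zeros (e.g. 209.85), which recent
# versions of Python's ipaddress module require.
GOOGLE_RANGES = [
    ("64.233.160.0", "64.233.191.255"),
    ("66.249.64.0", "66.249.95.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("72.14.192.0", "72.14.255.255"),
    ("74.125.0.0", "74.125.255.255"),
    ("209.85.128.0", "209.85.255.255"),
    ("216.239.32.0", "216.239.63.255"),
]

_PARSED_RANGES = [
    (ipaddress.IPv4Address(first), ipaddress.IPv4Address(last))
    for first, last in GOOGLE_RANGES
]


def is_listed_google_ip(ip: str) -> bool:
    """Return True if the IPv4 address falls inside one of the listed ranges."""
    address = ipaddress.IPv4Address(ip)
    return any(first <= address <= last for first, last in _PARSED_RANGES)


print(is_listed_google_ip("66.249.66.1"))
print(is_listed_google_ip("8.8.8.8"))
```

A request whose user agent claims to be GoogleBot but whose address fails this check would fall into the "others" group described at the top of the report.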

Errata: The WMF traffic logging service suffered from server capacity problems in Aug/Sep/Oct 2011.
Absolute traffic counts for October 2011 are approximately 7% too low.
Data loss only occurred during peak hours. It may therefore have had a somewhat different impact on traffic from different parts of the world,
and may also have skewed relative figures such as the share of traffic per browser or operating system.

From mid-September until late November, squid log records for mobile traffic were in an invalid format.
Data could be repaired for logs from mid-October onwards. Older logs were no longer available.

In an unrelated server outage, precisely half of the traffic to WMF mobile sites was not counted from Oct 16 - Nov 29 (one of two load-balanced servers did not report traffic).
WMF has since improved server monitoring, so that similar outages should be detected and fixed much faster from now on.
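
As a rough illustration of how the undercounts described in these errata might be compensated for, the sketch below scales an observed count by the fraction of requests assumed missing. It reads "7% too low" as meaning that 93% of requests were logged, which is an assumption; the function name `corrected_count` and the input figures are hypothetical and not taken from the report.

```python
def corrected_count(observed: float, missing_fraction: float) -> float:
    """Scale an observed count up, assuming a known fraction of requests went unlogged."""
    if not 0.0 <= missing_fraction < 1.0:
        raise ValueError("missing_fraction must be in [0, 1)")
    return observed / (1.0 - missing_fraction)


# Illustrative inputs only, not figures from the report.
# October 2011 counts are described as approximately 7% too low.
october_estimate = corrected_count(930_000, 0.07)
# Mobile counts from Oct 16 - Nov 29 came from one of two load-balanced servers.
mobile_estimate = corrected_count(500_000, 0.5)

print(round(october_estimate), round(mobile_estimate))
```

Since the data loss occurred only during peak hours, a uniform correction like this can be no more than a first approximation, in particular for per-country or per-browser shares.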

Generated on Mon, Aug 6, 2012 14:59
Author: Erik Zachte (Web site)
Mail: ezachte@### (no spam: ### = wikimedia.org)
All data and images on this page are in the public domain.

Note: this page may load more slowly on Microsoft Internet Explorer than on other major browsers.