Wikimedia Traffic Analysis Report - Crawler requests

Monthly requests or daily averages, for period: 1 Mar 2012 - 31 Mar 2012 (last 12 months)

 This analysis is based on a 1:1000 sampled server log (squids)
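
As a rough illustration of what 1:1000 sampling implies, the sketch below scales a sampled count back up to an estimated total. The function names are hypothetical, and the error estimate assumes sampled requests behave like independent (Poisson-like) events, which the report itself does not state.

```python
import math

# Each line in the sampled log stands for roughly this many real requests.
SAMPLING_FACTOR = 1000


def estimate_requests(sampled_count: int, sampling_factor: int = SAMPLING_FACTOR) -> int:
    """Scale a count from the sampled log up to an estimated real total."""
    return sampled_count * sampling_factor


def relative_error(sampled_count: int) -> float:
    """Approximate relative standard error, assuming Poisson-like sampling noise."""
    if sampled_count <= 0:
        return float("inf")
    return 1.0 / math.sqrt(sampled_count)
```

Under that assumption, a crawler seen only a handful of times in the sample has a large relative uncertainty, so small counts should be read with care.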


The following overview of crawler (aka bot) page requests is based on the user agent information that accompanies most server requests. Unfortunately this user agent information follows rather loosely defined guidelines.
Also please bear in mind that the most popular crawler names may be somewhat overrepresented. This is the result of so-called user agent spoofing (where a requester supplies false credentials, e.g. to bypass web server filters).
GoogleBot seems to be a favorite for spoofing. Therefore requests that claim to be GoogleBot are color coded according to whether or not they come from an IP address registered by Google (see below).
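
The distinction between genuine and possibly spoofed GoogleBot requests could be sketched as follows. This is only an illustration: the network ranges are documentation-only placeholders, not Google's real allocations, and the labels `google` and `google?` are borrowed from the group names in the table below on the assumption that they correspond to this split.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation networks).
# A real check would need the address ranges actually registered by Google.
EXAMPLE_GOOGLE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def label_googlebot(user_agent, ip, networks=EXAMPLE_GOOGLE_NETWORKS):
    """Return 'google', 'google?' or None for a single request.

    None      : the user agent does not claim to be GoogleBot
    'google'  : claims GoogleBot and the IP is inside a listed network
    'google?' : claims GoogleBot but the IP is outside all listed networks
    """
    if "googlebot" not in user_agent.lower():
        return None
    address = ipaddress.ip_address(ip)
    if any(address in network for network in networks):
        return "google"
    return "google?"
```

For example, with the placeholder ranges above, a request with a GoogleBot user agent from 198.51.100.7 would be labeled `google?`.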

For this report page requests are considered to be issued by a crawler in two cases:
1 The user agent string contains a web address. Only crawlers should have that, but there are some false positives where a browser sends a user agent string with a web address (caused by an ill-behaved plug-in; the main offenders have been eliminated).
2 The user agent string contains the term bot, spider or crawl[er].
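
The two rules above can be sketched in a few lines. This is a minimal illustration, not the script used for the report: the regular expression for a "web address" is an assumption (anything starting with http://, https:// or www. followed by a dotted host name), and the function name is hypothetical.

```python
import re

# Assumption: a "web address" is http(s):// or www. followed by a dotted host name.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[\w-]+(?:\.[\w-]+)+", re.IGNORECASE)

# Rule 2: the terms bot, spider or crawl[er]; "crawl" also matches "crawler".
KEYWORD_PATTERN = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def crawler_rule(user_agent):
    """Return which rule flags the user agent: 'url', 'keyword', or None."""
    if URL_PATTERN.search(user_agent):
        return "url"
    if KEYWORD_PATTERN.search(user_agent):
        return "keyword"
    return None


def is_crawler(user_agent):
    """True when either rule matches."""
    return crawler_rule(user_agent) is not None
```

A plain substring test like this also flags any user agent that merely contains "bot" inside another word, which is one way false positives can arise.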

In total 69,531,000 page requests per day (mime type text/html only!) are considered crawler requests, out of 468,518,000 external requests per day, which is 14.8%.
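
The percentage can be checked with a short calculation; the variable and function names below are only illustrative.

```python
# Daily averages quoted in the report.
CRAWLER_REQUESTS_PER_DAY = 69_531_000
EXTERNAL_REQUESTS_PER_DAY = 468_518_000


def crawler_share(crawler_requests, external_requests):
    """Fraction of external requests that are attributed to crawlers."""
    return crawler_requests / external_requests


share = crawler_share(CRAWLER_REQUESTS_PER_DAY, EXTERNAL_REQUESTS_PER_DAY)
print(f"{share:.1%}")  # prints 14.8%
```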

Page requests for crawlers that specify a URL in the agent string
Columns: Count (x 1000) / Secondary domain (~site) name / URL / Mime type / User agent
google
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/feedfetcher.html-FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/xmlFeedFetcher-Google; (url)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: s~redconceptual)
 www.google.com/feedfetcher.htmlapplication/jsonMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortografia4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortopedianew)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: rarplayer)
 code.google.com/p/crawler4j/text/..crawler4j (url)
 www.google.com/feedfetcher.htmltext/..FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/xmlMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: s~senchaiosrc)
 www.google.com/coop/cse/creftext/..FeedFetcher-Google-CoOp; (url)
 code.google.com/appengineapplication/jsonMozilla 4.0 AppEngine-Google; (url; appid: prfleme)
 code.google.com/appengineapplication/xmlAppEngine-Google; (url; appid: wikipedia-raw)
 code.google.com/appenginetext/..WikiBot/0.1 AppEngine-Google; (url; appid: newikipedia)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: myproxywx)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: abdulfat)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: prfleme)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.909.30391; url)
 code.google.com/appenginetext/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 AppEngine-Google; (url; appid: s~fonetika3)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; documents; url)
 code.google.com/appengineapplication/jsonMWBOT GAE Edition AppEngine-Google; (url; appid: philip-bot)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: moveable-weather)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: azamasmadi)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..www.productontology.org/1.0 (Contact: mail address ) AppEngine-Google; (url; appid: gr4bing)
 www.google.com/bot.htmlimage/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: pakgalaxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kbworld24)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webusadlp8)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: boxapp)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webusadlp6)
 code.google.com/p/rondaapplication/jsonRonda - url
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kires-roxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~drizzlprox)
 www.google.com/feedfetcher.html-Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: 100thpriest)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxyusing121)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki2)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: pox)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ridemyhell)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: jptaravellahighschool)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: simple-tools6)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google;(url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: atxproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxy-devakishor)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webproxy8-9)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: simple-tools2)
 desktop.google.com/-Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~tpbitalia)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: nation4india)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: sjbrundage123456789)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~secure-facebook-ranney)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: freeoursouls)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kikopea-openproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: cmd-proxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxynaungnaung)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: raja584sekhar)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kaveriselvaraj)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: zabastan)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: 114proxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webproxy8-5)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: argim-free)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~misterhac)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ikaryse)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d-spark)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: nashimlive-nashimnx)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webponline9)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google;(url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ivankrisproxyserver)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: web-phpproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: web-proxy-hh)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: vebproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wwwwebp2)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~deutiki)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: dkoxyserv)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.911.3589; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: quigonjinn03)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~proxyseekkety)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: threewiki)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wagagate)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: prexyproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: hideproxyz)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; apps-presentations; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: tusawebproxy4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: webponline5)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ilovethemothernature)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: imabouncytigger)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: varlopie)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: openeyeproxy)
facebook
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
 developers.facebook.comimage/..facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.1 (url)
 developers.facebook.com-facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.1 (url)
 developers.facebook.comtext/..facebookplatform/1.0 (url)
bing
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/vnd.php.serializedMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b3
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b5
 www.bing.com/bingbot.htmimage/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..User-Agent :Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: wxcity1)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: proxydisk8)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: surf603)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: proxydisk9)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) (via babelfish.yahoo.com)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: 546fga)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) AppEngine-Google; (http://code.google.com/appengine; appid: yourrevenues)
google?
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/xmlMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url) [UsableNet Lift Mobile]
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
naver
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/image/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url) ASProxy/5.5b5
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url) ASProxy/5.5b3
baidu
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.html-Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider-image(url)
 www.baidu.com/search/spider.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider(url)
 www.baidu.com/search/spider.htmlimage/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
yahoo
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 listing.yahoo.co.jp/support/faq/int/other/other_001.htmltext/..Y!J-BRJ/YATS crawler (url)
 developer.yahoo.com/yql/providertext/..Mozilla/5.0 (compatible; Yahoo Pipes 2.0; url) Gecko/20090729 Firefox/3.5.2
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurpapplication/vnd.php.serializedMozilla/5.0 (compatible Yahoo! Slurp/3.0 url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRI/0.0.1 crawler ( url )
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRT/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurpapplication/jsonMozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
yandex
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexDirect/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/bots-Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexAntivirus/2.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImageResizer/2.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexAntivirus/2.0; url)
 yandex.com/botsapplication/vnd.php.serializedMozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexNewslinks; url)
msn
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)._
 search.msn.com/msnbot.htmtext/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htmimage/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-Products/1.0 (url)
 search.msn.com/msnbot.htmtext/..msnbot-UDiscovery/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot/0.01 (url)
 search.msn.com/msnbot.htm-msnbot/2.0b (url)
 search.msn.com/msnbot.htm-msnbot-media/1.1 (url)
wwwgogetpapers
 wwwgogetpapers.com/application/jsonUser-Agent: GoGetPapersBot (url)
 wwwgogetpapers.com/text/..User-Agent: GoGetPapersBot (url)
sblog
 fulltext.sblog.cz/screenshot/image/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/text/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/-SeznamBot/3.0 (url)
majestic12
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.2; url)
archive
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; archive.org_bot url)
 archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120118.092903 url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_bot-Mozilla/5.0 (compatible; archive.org_bot url)
 archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120118.092903 url)
php
 pear.php.net/application/vnd.php.serializedPEAR HTTP_Request class ( url )
 pear.php.net/application/xmlPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/0.5.2 (url) PHP/5.2.17
 pear.php.net/text/..PEAR HTTP_Request class ( url )
 pear.php.net/image/..PEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2application/xmlHTTP_Request2/2.0.0 (url) PHP/5.3.8
 pear.php.net/package/http_request2text/..HTTP_Request2/2.0.0 (url) PHP/5.3.2-1ubuntu4.10
echonest
 the.echonest.com/reader/application/xmlnestReader/0.3 (discovery; url; reader at echonest.com)
 the.echonest.com/reader/text/..nestReader/0.3 (discovery; url; reader at echonest.com)
soso
 help.soso.com/webspider.htmtext/..Sosospider(url)
 help.soso.com/webspider.htm-Sosospider(url)
 help.soso.com/webspider.htmapplication/xmlSosospider(url)
www.
 www.text/..GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot-Image/1.0 ( urlGoogleBot.com/bot.html)
 www.-GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
youdao
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/-Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/image/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 toolbar.youdao.com/image/..Youdao Toolbar (url)
 www.youdao.com/help/webmaster/spider/application/vnd.php.serializedMozilla/5.0 (compatible; YoudaoBot/1.0; url; )
sogou
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07application/vnd.php.serializedSogou web spider/4.0(url)
yacy
 yacy.net/bot.htmltext/..yacybot (sciencenet-any; amd64 Linux 2.6.32-33-generic; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (sciencenet-any; amd64 Linux 2.6.38-13-generic; java 1.6.0_22; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.1-gentoo-r2; java 1.6.0_30; Asia/ja) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.0-1.2-desktop; java 1.6.0_22; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-38-generic; java 1.7.0; America/en) url
 yacy.net/bot.htmltext/..yacybot (sciencenet/any; amd64 Linux 3.0.0-16-generic; java 1.6.0_23; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.24-28-server; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-16-generic; java 1.6.0_23; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_26; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (sciencenet/any; amd64 Linux 2.6.26-2-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (webportal/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-xen-amd64; java 1.6.0_18; Europe/fr) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.38-13-generic; java 1.6.0_22; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.42.9-1.fc15.x86_64; java 1.6.0_22; W-SU/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.7.0_01; Asia/ja) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32.36-228-scalaxy; java 1.6.0_18; Etc/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-2-amd64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.2.9-1-pae; java 1.7.0_03-icedtea; Asia/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.0.0-16-generic-pae; java 1.6.0_23; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.0.0-ck1-solusos; java 1.6.0_18; GMT01:00/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.2.0-2-686-pae; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows Server 2008 R2 6.1; java 1.6.0_29; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.0.0-ck1-solusos; java 1.6.0_26; GMT01:00/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-39-generic-pae; java 1.6.0_20; Europe/en) url
 yacy.net/bot.html-yacybot (webportal/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.2.8-1-ARCH; java 1.7.0_03-icedtea; Asia/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_31; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.41.10-3.fc15.x86_64; java 1.6.0_22; W-SU/ru) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.32-220.4.2.el6.i686; java 1.6.0_22; US/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.7-1.fc16.x86_64; java 1.7.0_b147-icedtea; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-17-generic; java 1.7.0_03-icedtea; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows Server 2008 6.0; java 1.7.0_03; Europe/en) url
jike
 shoulu.jike.com/spider.htmltext/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.html-Mozilla/5.0 (compatible; JikeSpider; url)
exabot
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0; url)
 www.exabot.com/go/robot-Mozilla/5.0 (compatible; Exabot/3.0; url)
ahrefs
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/2.0; url)
 ahrefs.com/robot/-Mozilla/5.0 (compatible; AhrefsBot/2.0; url)
wikipedia
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.19.0 url
 en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentationtext/..WikiCleaner (url)
 en.wikipedia.org/wiki/Web_crawlertext/..GoogleBot/Nutch-1.0 (Prototype; url; mail address )
 ko.wikipedia.orgtext/..url
 fr.wikipedia.org/wiki/Utilisateur:Salebotapplication/jsonSalebot, see url (uses Perl MediaWiki::API)
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.19 url
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.18.0 url
crikey
 blogs.crikey.com.au/game-ontext/..WordPress/3.2.1; url
cibra
 cibra.de/text/..CiBra Data Collector (url)
toolserver
 wiki.toolserver.org/view/GeoHacktext/..Geohack (url)
 toolserver.org/~dispenser/text/..DispensersTools (url)
 toolserver.org/~dispenser/application/jsonDispensersTools (url)
 toolserver.org/~para/cgi-bin/kmlexporttext/..url libwww-perl/6.02
discoveryengine
 discoveryengine.com/discobot.htmltext/..Mozilla/5.0 (compatible; discobot/2.0; url)
 discoveryengine.com/discobot.htmlimage/..Mozilla/5.0 (compatible; discobot/2.0; url)
edu:8080
 vancouver.cs.washington.edu:8080/text/..Mozilla/5.0/heritrix/3.1.0 (compatible;; url)
entireweb
 www.entireweb.com/about/search_tech/speedy_spider/text/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Speedy Spider (url)
mediawiki
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url)
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url) (client id: nttr.co.jp; experimental)
80legs
 www.80legs.com/webcrawler.htmltext/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
 www.80legs.com/webcrawler.htmlimage/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
enotes
 www.enotes.comtext/..eNotesBot 2.0 (url)
 www.enotes.comimage/..eNotesBot 2.0 (url)
gnip
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
 www.gnip.com/image/..UnwindFetchor/1.0 (url)
wikidict
 www.wikidict.detext/..url
zum
 help.zum.com/inquirytext/..ZumBot/1.0 (ZUM Search; url)
 help.zum.com/inquiryimage/..ZumBot/1.0 (ZUM Search; url)
bin-co
 www.bin-co.com/php/scripts/load/text/..BinGet/1.00.A (url)
 www.bin-co.com/php/scripts/load/application/vnd.php.serializedBinGet/1.00.A (url)
flipboard
 flipboard.com/browserproxyimage/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxyapplication/jsonMozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.1; url)
commoncrawl
 www.commoncrawl.org/bot.htmltext/..CCBot/1.0 (url)
lipperhey
 www.lipperhey.com/text/..Mozilla/5.0 (compatible; Lipperhey Site Explorer; url)
sf
 liferea.sf.net/text/..Liferea/0.x.x (Linux; en_US.UTF-8; url)
 magpierss.sf.nettext/..MagpieRSS/0.7x (url)
 liferea.sf.net/text/..Liferea/1.x.x (Linux; es_ES.UTF-8; url)
 magpierss.sf.netapplication/xmlMagpieRSS/0.72 (url; No cache)
traslated
 mymemory.traslated.net/doc/text/..Mozilla/5.0 (MyMemory Bot url)
FeedBurner
 www.FeedBurner.comtext/..FeedBurner/1.0 (url)
wesee
 www.wesee.com/en/support/bot/image/..WeSEE:Search/0.1 (Alpha, url)
 www.wesee.com/en/support/bot/text/..WeSEE:Search/0.1 (Alpha, url)
enwp
 enwp.org/User:SDPatrolBottext/..SDPatrolBot (url)
 enwp.org/User:KingpinBottext/..KingpinBot (url)
 enwp.org/User:H3llkn0wz/WikiSharpAPItext/..WikiSharpAPI/0.3 url (C# .NET)
wordpress
 danielradiorock.wordpress.comtext/..WordPress/3.4-alpha-20205; url
 josefboberg.wordpress.comtext/..WordPress/3.4-alpha-20264; url
 pennylibertygbow.wordpress.comtext/..WordPress/3.4-alpha-19994; url
 josefboberg.wordpress.comtext/..WordPress/3.4-alpha-20150; url
 klausgauger.wordpress.comtext/..WordPress/3.4-alpha-20150; url
daum
 ws.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/2.0
 tab.search.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/3.0
federatedmedia
 federatedmedia.nettext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
 federatedmedia.netimage/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
kosmix
 www.kosmix.com/html/kosmos.htmlapplication/xmlMozilla/5.0(compatible;Kosmos/1.0;url)
goo
 help.goo.ne.jp/contact/text/..goo wikipedia (url)
 help.goo.ne.jp/door/crawler.htmltext/..ichiro/3.0 (url)
github
 github.com/pauldix/typhoeus/tree/mastertext/..Typhoeus - url
 github.com/NeilCrosby/wikislurp | application/vnd.php.serialized | WikiSlurp (url)
freebase
 www.freebase.com | text/.. | metaweb/Nutch-1.0-dev (url; help_at_metaweb.com)
avantbrowser
 www.avantbrowser.com | text/.. | Avant Browser (url)
 www.avantbrowser.com | text/.. | Advanced Browser (url)
feedshow
 www.feedshow.com | text/.. | FeedshowOnline (url)
 www.feedshow.com | text/.. | Feedshow/x.0 (url; 1 subscriber)
jetbrains
 www.jetbrains.com/omea_reader/ | text/.. | JetBrains Omea Reader 1.0.x (url)
 www.jetbrains.com/omea_reader/ | text/.. | JetBrains Omea Reader 2.0 Release Candidate 1 (url)
newsgator
 www.newsgator.com/ | text/.. | FeedDemon/2.7 (url; Microsoft Windows XP)
 www.newsgator.com | text/.. | NewsGatorOnline/2.0 (url; 1 subscribers)
ephorus
 www.ephorus.com/ | text/.. | Mozilla/5.0 (compatible; Ephorusbot/1.0.0; url)
netseer
 www.netseer.com/crawler.html | text/.. | Mozilla/5.0 (compatible; NetSeer crawler/2.0; url; mail address )
veveo
 corporate.veveo.net/webmasters.html | text/.. | Mozilla/5.0 (compatible; Veveobot; url)
whatrhymeswith
 www.whatrhymeswith.com/site/rhyme-bot | text/.. | RhymeBot/0.1 (url)
emining
 emining.jp/ | text/.. | emBot-GalaBuzz/Nutch-1.0 (url; mail address )
apache
 lucene.apache.org/nutch/bot.html | text/.. | NutchCVS/0.7.2 (Nutch; url; mail address )
wikimpress
 wikimpress.org/ | text/.. | Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
archive-it
 archive-it.org/files/site-owners.html | image/.. | Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.html | text/.. | Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.html | - | Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
yioop
 www.yioop.com/bot.php | text/.. | Mozilla/5.0 (compatible; YioopBot; url)
 www.yioop.com/bot.php | image/.. | Mozilla/5.0 (compatible; YioopBot; url)
accelobot
 www.accelobot.com | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.3 url)
bibalex
 archive.bibalex.org/bot/ | image/.. | Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
 archive.bibalex.org/bot/ | text/.. | Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
thearchangelmichael
 thearchangelmichael.net | text/.. | WordPress/3.0; url
 thearchangelmichael.net | text/.. | WordPress/3.3.1; url
 info.thearchangelmichael.net | text/.. | WordPress/3.3.1; url
cmu
 boston.lti.cs.cmu.edu/crawler_12/ | text/.. | Mozilla/5.0 (compatible; lemurwebcrawler mail address ; url)
apercite
 www.apercite.fr/robot/index.html | image/.. | Mozilla/5.0 (compatible; Apercite; url)
wikiglass
 wikiglass.com | text/.. | url : mail address
hatena
 a.hatena.ne.jp/help | text/.. | Hatena Antenna/0.5 (url)
scoutjet
 www.scoutjet.com/ | text/.. | Mozilla/5.0 (compatible; ScoutJet; url)
bsurprised
 bsurprised.com/ | text/.. | BSurprised WikiBox 0.1.3 (url)
SearchNearMe
 SearchNearMe.com/contact.php | application/vnd.php.serialized | SearchNearMe (url)
 SearchNearMe.com/contact.php | text/.. | SearchNearMe (url)
rssbandit
 www.rssbandit.org | text/.. | RssBandit/1.5.0.10 (WinNT 5.1.2600.0; url) (WinNT 5.1.2600.0; )
easybib
 content.easybib.com/autocite/ | text/.. | EasyBib AutoCite (url)
 content.easybib.com/autocite/ | application/json | EasyBib AutoCite (url)
tweetmeme
 tweetmeme.com/ | text/.. | Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
zipcommander
 www.zipcommander.com/ | text/.. | 1st ZipCommander (Net) - url
timewe
 timewe.net | text/.. | CDR/1.7.1 Simulator/0.7(url) Profile/MIDP-1.0 Configuration/CLDC-1.0
speaktoit
 www.speaktoit.com | application/json | Speaktoit url
ponderer
 ponderer.org/download/annotate_google.user.js | text/.. | annotate_google; url
graemef
 graemef.com | text/.. | NewsGator FetchLinks extension/0.2.0 (url)
tinyurl
 tinyurl.com/64t5n | text/.. | Rome Client (url) Ver: 0.9
zootycoon
 www.zootycoon.com | text/.. | Zoo Tycoon 2 Client -- url
orcabrowser
 www.orcabrowser.com | text/.. | Orca Browser (url)
nemui
 mozshot.nemui.org/ | text/.. | Mozilla/5.0 (Gecko/20070310 Mozshot/0.0.20070628; url)
seebot
 seebot.org | text/.. | Lynx/2.8 (;url)
warebay
 www.warebay.com/bot.html | text/.. | Mozilla/5.0 (compatible; WBSearchBot/1.1; url)
ranchero
 ranchero.com/netnewswire/ | text/.. | NetNewsWire/2.x (Mac OS X; url)
kula
 kula.jp/endo | text/.. | endo/1.0 (Mac OS X; ppc i386; url)
sentymetr
 sentymetr.pl/bot.html | application/json | Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
 sentymetr.pl/bot.html | text/.. | Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
feeds4all
 www.feeds4all.com/feedzcollector | text/.. | FeedZcollector v1.x (Platinum) url
blogbridge
 www.blogbridge.com/ | text/.. | BlogBridge 2.13 (url)
rssreader
 www.rssreader.com | text/.. | RssReader/1.0.xx.x (url) Microsoft Windows NT 5.1.2600.0
winpodder
 winpodder.com | text/.. | WinPodder (url)
kalooga
 kalooga.com/crawler | image/.. | Mozilla/5.0 (compatible; KaloogaBot; url)
 kalooga.com/crawler | text/.. | Mozilla/5.0 (compatible; KaloogaBot; url)
it-influentials
 search.it-influentials.com/bot.htm | text/.. | Mozilla/5.0 (compatible;FindITAnswersbot/1.0;url)
abonti
 www.abonti.com | text/.. | Mozilla/5.0 (compatible; Abonti/0.91 - url)
snarfware
 www.snarfware.com/ | text/.. | Snarfer/0.x.x (url)
plagger
 plagger.org/ | text/.. | Plagger/0.x.xx (url)
superfeedr
 superfeedr.com | application/xml | Superfeedr: Superparser bot/1.1 url - Please read this http://blog.superfeedr.com/publishers.html or get in touch if we are polling too hard
blogscope
 www.blogscope.net/ | text/.. | Mozilla/5.0 (compatible; BlogScope/1.0; url; U of Toronto)
textdigger
 textdigger.com | text/.. | Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
artez
 www.artez.nl | text/.. | artezTest/0.1 (url)
picsearch
 www.picsearch.com/bot.html | text/.. | psbot/0.1 (url)
kr:6600
 www.checkprivacy.or.kr:6600/RS/PRIVACY_FAQ.jsp | text/.. | url
ac
 www.clips.ua.ac.be/pages/pattern | application/json | Pattern/1.0 url
 www.clips.ua.ac.be/pages/pattern | text/.. | Pattern/1.0 url
 www.clips.ua.ac.be/pages/pattern | application/json | Pattern/2.3 url
 www.clips.ua.ac.be/pages/pattern | text/.. | Pattern/2.3 url
linkbutler
 www.linkbutler.de/spider | text/.. | lb-spider/Mozilla/5.0 Gecko/20100101 Firefox/10.0.2 (lb-spider; url; mail address )
rockpeaks
 www.rockpeaks.com/contact | text/.. | RockPeaks/0.1 (url)
netnewswireapp
 netnewswireapp.com/mac/ | - | NetNewsWire/3.3 (Mac OS X; url; gzip-happy)
tumblr
 benderthewebrobot.tumblr.com | text/.. | Mozilla/5.0 (compatible; Bender; url)
xbmc
 www.xbmc.org | image/.. | XBMC/11.0-RC2 Git:20120229-f38655f (iOS; 11.0.0 AppleTV2,1, Version 5.0.1 (Build 9A406a); url)
alexa
 www.alexa.com/site/help/webmasters | text/.. | ia_archiver (url; mail address )
drupal
 drupal.org/ | text/.. | User-Agent: Drupal (url)
 drupal.org/ | text/.. | Drupal (url)
spinn3r
 spinn3r.com/robot | text/.. | Mozilla/5.0 (X11; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); url) Gecko/2010040121 Firefox/3.0.19
pagepeeker
 pagepeeker.com/robots | image/.. | PagePeeker.com (info: url)
 pagepeeker.com/robots | text/.. | PagePeeker.com (info: url)
Anonymouse
 Anonymouse.org/ | image/.. | url (Unix)
 Anonymouse.org/ | text/.. | url (Unix)
ecrawlerdata
 www.ecrawlerdata.com | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
bnf
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.html | text/.. | Mozilla/5.0 (compatible; bnf.fr_bot; url)
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.html | image/.. | Mozilla/5.0 (compatible; bnf.fr_bot; url)
topsy
 labs.topsy.com/butterfly/ | text/.. | Mozilla/5.0 (compatible; Butterfly/1.0; url) Gecko/2009032608 Firefox/3.0.8
netvibes
 www.netvibes.com | text/.. | Netvibes (url)
metager2
 metager2.de/technology.php | text/.. | Mozilla/5.0 (compatible; metager2-verification-bot; url)
suggy
 blog.suggy.com/was-ist-suggy/suggy-webcrawler/ | text/.. | Mozilla/5.0 (compatible; suggybot v0.01a, url)
turnitin
 www.turnitin.com/robot/crawlerinfo.html | text/.. | TurnitinBot/2.1 (url)
simplepie
 simplepie.org | application/xml | SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
 simplepie.org | text/.. | SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
plagiarismcheck
 plagiarismcheck.org | application/json | WikiCrawl 1.0b (url contact-mail: mail address )
iis
 www.iis.net/iisbot.html | text/.. | iisbot/1.0 (url)
duckduckgo
 duckduckgo.com/duckduckpreview.html | text/.. | DuckDuckPreview/1.0; (url)
 duckduckgo.com/duckduckpreview.html | - | DuckDuckPreview/1.0; (url)
 duckduckgo.com/duckduckbot.html | text/.. | DuckDuckBot/1.1; (url)
froute
 labs.froute.jp/pc2m/help.html | text/.. | Froute Mobile Gateway/1.0 (url)
paper
 support.paper.li/entries/20023257-what-is-paper-li | text/.. | Mozilla/5.0 (compatible; PaperLiBot/2.1; url)
weblio
 www.weblio.jp/ | text/.. | Mozilla/5.0 (compatible; WeblioBot; url)
seegul
 www.seegul.com | text/.. | Seegul ImageBot - url
fotopedia
 www.fotopedia.com | application/json | Picor (url)
latestsearchtrends
 latestsearchtrends.com | text/.. | WordPress/3.3.1; url
creativecommons
 wiki.creativecommons.org/Metadata_Scraper | text/.. | CC Metadata Scaper url
pinterest
 pinterest.com/ | image/.. | Pinterest/0.1 url
searchtechnologies
 www.searchtechnologies.com | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.3 url)
embed
 support.embed.ly/ | text/.. | Mozilla/5.0 (compatible; Embedly/0.2; url)
 support.embed.ly/ | image/.. | Mozilla/5.0 (compatible; Embedly/0.2; url)
ibis
 ibis.ne.jp/browser/about.html | image/.. | Mozilla/4.0 (compatible; ibisBrowser; url)
instapaper
 www.instapaper.com/ | text/.. | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.50 KHTML Version/5.1 Instapaper/4.0 (url)
sonyericsson
 www.sonyericsson.com/UAprof/R800xR301.xml | image/.. | Mozilla/5.0 (Linux; Android/2.3.3; en-us; SonyEricssonR800xurl Build/3.0.1.E.1.44) AppleWebKit/533.1 KHTML Version/4.0 Mobile Safari/533.1
globalspec
 www.globalspec.com/Ocelli | text/.. | Ocelli/1.4 (url)
rcdtokyo
 www.rcdtokyo.com/pc2m/ | text/.. | Mozilla/5.0 (compatible; PEAR HTTP_Request class; url)
edu
 ws.nju.edu.cn/falcons/ | text/.. | Mozilla/5.0 (compatible; Falconsbot; url)
lb
 www.lb.de | text/.. | lb-spider/Nutch-1.4 (lb-spider; url; mail address )
semager
 www.semager.de/blog/semager-bots/ | text/.. | Mozilla/5.0 (compatible; Semager/1.4c; url)
wattsupwiththat
 wattsupwiththat.com | text/.. | WordPress/3.4-alpha-20205; url
 wattsupwiththat.com | text/.. | WordPress/3.4-alpha-20115; url
103,659 total

Page requests for probable crawlers, recognized by keyword
Count
x 1000
Agent string
  Mime type (count ≥ 3)
PythonWikipediaBot/1.0
 application/json
 application/xml
 text/..
 -
 image/..
php wikibot classes
 application/vnd.php.serialized
 text/..
 -
MediaWikiCrawler-Google/2.0 ( mail address )
 text/..
 -
GoogleBot-Image/1.0
 image/..
 text/..
 -
LinkParser/2.0
 text/..
MoovidaBot/0.1
 text/..
Mozilla/5.0 (Windows; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 ( mail address )
 text/..
 -
 application/pdf
GoogleBot-Image/1.0
 text/..
 image/..
 -
 application/vnd.php.serialized
 application/json
wikiwix-bot-3.0
 text/..
 -
mail address
 application/vnd.php.serialized
 text/..
DotNetWikiBot/1.0
 text/..
SemrushBot/0.91
 text/..
 image/..
 -
 application/ogg
 video/ogg
ClueBot/1.1
 application/vnd.php.serialized
 -
Peachy MediaWiki Bot API Version 1.0
 application/vnd.php.serialized
 -
Answersbot
 text/..
spider
 text/..
 application/json
 application/vnd.php.serialized
 image/..
 application/xml
Pywikipediabot/2.0
 application/json
K-Crawler
 text/..
 application/ogg
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
 -
 image/..
 application/xml
 application/vnd.php.serialized
ClueBot/2.0
 application/vnd.php.serialized
Mozilla 5.0 (Apibot 0.32)
 application/vnd.php.serialized
bob's crappy crawler; contact: mail address
 text/..
 -
DigitalsmithsBot
 text/..
Mozilla/5.0 (compatible; YandexBot/3.0)
 image/..
 text/..
 video/ogg
MediaWiki::Bot/3.2.6
 application/json
 -
AnomieBOT 1.0 (TagDater)
 application/json
 application/x-www-form-urlencoded
BritannicaProjBot mail address
 text/..
DotNetWikiBot/2.98 (Unix 3.0.0.12; )
 text/..
 application/xml
python-wikitools/1.2 (User:BernsteinBot)
 application/json
 application/x-www-form-urlencoded
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r0)
 application/json
 application/x-www-form-urlencoded
Kavande Crawler 1.0/Nutch-1.4 ( Iranian National Web Crawler ; mail address )
 text/..
 application/pdf
 -
 image/..
VeeloBot 1.0
 text/..
 -
 image/..
 application/ogg
wikbot/1.50 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
YBot/0.1
 application/vnd.php.serialized
DotNetWikiBot/2.81 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
MireoBot
 text/..
 -
 application/xml
FAST Enterprise Crawler 6 used by ESP ( mail address )
 text/..
Tawbot (public svn release; plwiki)
 text/..
AnomieBOT 1.0 (ReplaceExternalLinks2)
 application/json
daytrippy.com Crawler
 application/json
 text/..
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (Exabot-Thumbnails)
 image/..
 text/..
 -
wikbot/1.50 CFNetwork/548.0.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
MLBot (www.metadatalabs.com/mlbot)
 text/..
 application/vnd.php.serialized
 -
 image/..
Test Webbot
 text/..
SineBot/1.5.18(User:SineBot)
 application/vnd.php.serialized
 text/..
plantspedia data crawler
 text/..
DotNetWikiBot/2.97 (Unix 2.6.32.38; )
 text/..
SchoolReviewNetworkWikiBot
 application/json
DotNetWikiBot/2.97 (Unix 5.10.0.0; )
 application/xml
 text/..
 -
JavaCrawler/1.1
 text/..
TrueKnowledgeBot bot mail address >
 application/xml
 application/vnd.php.serialized
SiteSeekerCrawler/1.0
 text/..
Mozilla/5.0 (X11; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot
 text/..
 -
AniBot/0.9 php/curl
 application/vnd.php.serialized
UCMore Crawler App
 text/..
 -
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
 text/..
 -
Mozilla/5.0 MaboMwFramework/1.1 (w:de:MerlIwBot)
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
COIBot/1.00
 text/..
AnomieBOT 1.0 (FlagIconRemover)
 application/json
DotNetWikiBot/2.96 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
CorenSearchBot/1.5 en libwww-perl/6.02
 text/..
Mozilla/5.0 (compatible; Nigma.ru/3.0; mail address )
 text/..
 -
 application/xml
HTMLParser/2.0
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.8.2.115) Opera Mini/3.1
 image/..
 -
 text/..
FAST Enterprise Crawler 6 used by LexisNexis ( mail address )
 text/..
 -
HRoestBot, de-wikipedia using pywikipedia framework
 text/..
 application/json
 application/xml
My Nutch Spider/Nutch-1.4
 text/..
 application/pdf
 image/..
GoogleBot 2.1
 text/..
DotNetWikiBot/2.96 (Unix 5.10.0.0; )
 text/..
 application/xml
wikiparser/1 CFNetwork/454.12.4 Darwin/10.8.0 (x86_64) (MacPro5,1)
 image/..
 text/..
Webwiki Search Engine Bot - www.webwiki.de
 text/..
SiocWikiBot/1.0
 application/vnd.php.serialized
 text/..
AnomieBOT 1.0 (OrphanReferenceFixer)
 application/json
AnomieBOT 1.0 (TemplateSubster)
 application/json
DotNetWikiBot/2.97 (Unix 2.6.32.36; )
 text/..
Wikibot
 text/..
 -
 image/..
TheKeens bot
 text/..
 -
GSLFbot
 text/..
 application/xml
Twitterbot/1.0
 text/..
 image/..
bitlybot
 text/..
 -
 image/..
AnomieBOT 1.0 (BAGBot)
 application/json
 text/..
SurakWare MediaWiki Bot/1.0
 text/..
~Bot ([[:fr:w:User:TildeBot]] by [[:fr:w:User:Alphos]] mail address )
 text/..
XLinkBot/1.00
 text/..
GNAA-bot
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.7.7.93) Opera Mini/3.1
 image/..
 -
 text/..
TVersity Media Robot
 text/..
OrlodrimBot/1.0
 text/..
Magus Bot 1.0
 text/..
 image/..
FAST Enterprise Crawler/5.3.4 ( mail address )
 text/..
 -
super cool bot
 application/vnd.php.serialized
 application/json
wikbotlite/1.50 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
DotNetWikiBot/2.92 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
AnomieBOT 1.0 (PERTableUpdater)
 application/json
 text/..
python-wikitools/1.2 (User:Mr.Z-bot)
 application/json
DotNetWikiBot/2.98 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
 image/..
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
 image/..
 text/..
wikbot/1.60 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
Nutch-Spider/Nutch-1.4
 text/..
Wikibot 1.54 (Macintosh; Mac OS X 10.7.3; en_TW)
 image/..
 text/..
LinksCrawler 0.1beta
 text/..
 -
Freebase Deathbot
 text/..
Goalkeeperbot(User:Beetstra)/1.0
 text/..
HosiryuhosiBot IRC-RecentChanges Util
 -
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
Baiduspider
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 6.1.7600.0; )
 text/..
Local Site Parser 1.0
 text/..
Mozilla/5.0 (compatible; LucidWorks/; ; crawler at example dot com)
 text/..
 -
FAST Search Web Crawler 14.0.0325.0000
 text/..
 -
wikbotlite/1.50 CFNetwork/548.0.4 Darwin/11.0.0
 image/..
 application/json
 -
 text/..
wikbot/1.50 CFNetwork/485.13.9 Darwin/11.0.0
 image/..
 application/json
 -
 text/..
Empedia Bot
 text/..
DotNetWikiBot, edited by D. Rodionov/2.91 (Microsoft Windows NT 6.0.6002 Service Pack 2; )
 text/..
 application/xml
DotNetWikiBot/2.9 (Unix 5.10.0.0; )
 text/..
CheMoBot/1.00
 text/..
HTMLParser/1.6
 text/..
Mozilla 5.0 (Apibot 0.30b5)
 application/vnd.php.serialized
OrangeCrawler/Nutch-1.0 ( mail address )
 text/..
HBC Archive Indexerbot 0.9a
 text/..
My Bot
 image/..
 text/..
COIBot/2.0
 text/..
WikiBot/0.1
 text/..
 application/xml
DNSTallyKwBot/0.2
 text/..
AnomieBOT 1.0 (RandomPagePicker)
 application/json
MaxPointCrawler/Nutch-1.1 (maxpoint.crawler at maxpointinteractive dot com)
 text/..
 -
 application/vnd.php.serialized
Geni ircpybot 1.0
 application/json
 text/..
DotNetWikiBot/2.9 (Microsoft Windows NT 6.0.6000.0; )
 text/..
KWSS Crawler Ver. 0.1
 text/..
Zing-BottaBot/1.0
 text/..
python-wikitools/1.2 (User:LaraBot)
 application/json
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.8.4.121) Opera Mini/3.1
 -
 image/..
 text/..
Mozilla/5.0 (Bgbot 0.5)
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.8.3.119) Opera Mini/3.1
 image/..
 -
 text/..
MyCuteBot/0.1
 text/..
 application/json
 application/vnd.php.serialized
wikbot/1.50 CFNetwork/485.12.7 Darwin/10.4.0
 image/..
 application/json
 -
 text/..
InfoBot Krausman mail address
 text/..
Hexabot V1.3 - curl - api.php
 text/..
OPENSEEMOX BOT 1.0 -^/ www.openseemox.com
 text/..
GermCrawler
 application/json
 text/..
UiO webquality crawler
 text/..
AnomieBOT 1.0 (AFDMergeFromCleaner)
 application/json
DotNetWikiBot/2.94 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
AnomieBOT 1.0 (DeletionSortingCleaner)
 application/json
altorobot
 text/..
123peoplebot/1.0
 text/..
Handelabra WikiBot
 application/vnd.php.serialized
 text/..
MediaWiki::Bot 3.1.5
 application/json
Jbot
 text/..
 image/..
HAZY.SPIDER/Nutch-1.4
 text/..
Mozilla/5.0 (compatible; UnisterBot; mail address )
 text/..
GoogleBot-Image/1.0 [UsableNet Lift Mobile]
 text/..
 -
My Nutch Spider/Nutch-1.4 (JPO)
 text/..
SINA_ROBOT; Mozilla/5.0 (Windows; Windows NT 5.1; MSIE8.0; zh-CN; rv:1.9.1.8) Gecko/20100202 Firef8
 text/..
MediaWiki::Bot/1.00
 text/..
 application/json
IssueCrawler
 text/..
16,189 total
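
The two detection rules used for this report (a web address in the user agent string, or one of the terms bot, spider or crawl[er]) can be illustrated with a short Python sketch. This is not the script that produced the tables: the function name `classify_agent`, both regular expressions, and the choice to test the web address rule first are assumptions made for illustration only.

```python
import re

# Assumed patterns: the report states the rules in prose only.
URL_RE = re.compile(r"https?://|www\.", re.IGNORECASE)
KEYWORD_RE = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def classify_agent(user_agent):
    """Return 'url', 'keyword' or None for a user agent string.

    'url'     : the string contains a web address (rule 1)
    'keyword' : the string contains bot, spider or crawl (rule 2)
    None      : neither rule matches, so the request is not counted as a crawler
    """
    if URL_RE.search(user_agent):
        return "url"
    if KEYWORD_RE.search(user_agent):
        return "keyword"
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; YioopBot; http://www.yioop.com/bot.php)",
        "ClueBot/2.0",
        "Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0",
    ]
    for agent in samples:
        print(classify_agent(agent), "<-", agent)
```

A plain substring match like this also flags any browser whose agent string happens to contain one of the keywords, which is why the table above is labelled "probable" crawlers.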

IP ranges: known IP ranges for Google are 64.233.[160.0-191.255], 66.249.[64.0-95.255], 66.102.[0.0-15.255], 72.14.[192.0-255.255],
74.125.[0.0-255.255], 209.85.[128.0-255.255], 216.239.[32.0-63.255] and a few other minor subranges
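
Since the report separates GoogleBot requests that come from a Google-registered address from those that do not, a range check along these lines may help readers reproduce that split. It is a minimal Python sketch using only the ranges listed above; the names `GOOGLE_RANGES` and `is_google_ip` are hypothetical, and the unlisted minor subranges are not covered.

```python
from ipaddress import IPv4Address

# (first, last) address pairs transcribed from the ranges listed in this report.
# Octets are written without leading zeros, which Python's ipaddress module
# would otherwise reject. The unlisted "minor subranges" are omitted.
GOOGLE_RANGES = [
    ("64.233.160.0", "64.233.191.255"),
    ("66.249.64.0", "66.249.95.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("72.14.192.0", "72.14.255.255"),
    ("74.125.0.0", "74.125.255.255"),
    ("209.85.128.0", "209.85.255.255"),
    ("216.239.32.0", "216.239.63.255"),
]

# Parse once so that each lookup is a handful of integer comparisons.
_PARSED = [(IPv4Address(first), IPv4Address(last)) for first, last in GOOGLE_RANGES]


def is_google_ip(ip):
    """Return True if the IPv4 address falls inside one of the listed ranges."""
    addr = IPv4Address(ip)
    return any(first <= addr <= last for first, last in _PARSED)


if __name__ == "__main__":
    for ip in ("66.249.64.1", "66.249.96.0"):
        print(ip, is_google_ip(ip))
```

A request whose user agent claims to be GoogleBot but whose address fails this check would be a candidate for user agent spoofing, subject to the caveat that the range list is incomplete.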

Errata: WMF traffic logging service suffered from server capacity problems in Aug/Sep/Oct 2011.
Absolute traffic counts for October 2011 are approximately 7% too low.
Data loss only occurred during peak hours. It may therefore have had a somewhat different impact on traffic from different parts of the world,
and may also have skewed relative figures such as share of traffic per browser or operating system.

From mid-September until late November, squid log records for mobile traffic were in an invalid format.
Data could be repaired for logs from mid-October onwards. Older logs were no longer available.

In an unrelated server outage, precisely half of the traffic to WMF mobile sites was not counted from Oct 16 - Nov 29 (one of two load-balanced servers did not report traffic).
WMF has since improved server monitoring, so that similar outages should be detected and fixed much faster from now on.

Generated on Thu, Jul 26, 2012 21:43
Author: Erik Zachte (Web site)
Mail: ezachte@### (no spam: ### = wikimedia.org)
All data and images on this page are in the public domain.

Note: page may load more slowly in Microsoft Internet Explorer than in other major browsers