Wikimedia Traffic Analysis Report - Crawler requests

Monthly requests or daily averages, for period: 19 Oct 2011 - 31 Oct 2011 (last 12 months)
000 ⇒ k

 This analysis is based on a 1:1000 sampled server log (squids)


The following overview of crawler (aka bot) page requests is based on the user agent information that accompanies most server requests. Unfortunately, this user agent information follows rather loosely defined guidelines.
Also please bear in mind that the most popular crawler names may be somewhat overrepresented. This is the result of so-called user agent spoofing (where a requester supplies false credentials, e.g. to bypass web server filters).
GoogleBot seems to be a favorite for spoofing. Therefore requests from an IP address registered by Google (see below) are color coded separately from other requests that identify themselves as GoogleBot.
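
The distinction between GoogleBot requests from Google-registered addresses and other GoogleBot requests can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the network list uses documentation-only placeholder ranges (not Google's real address space), and the function names and labels are hypothetical.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT Google's real
# address space. A real check would load the ranges registered by Google.
ASSUMED_GOOGLE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def claims_googlebot(user_agent: str) -> bool:
    """True if the agent string identifies itself as GoogleBot (case-insensitive)."""
    return "googlebot" in user_agent.lower()


def label_googlebot(user_agent: str, client_ip: str) -> str:
    """Label a request as 'google-ip', 'other-ip' or 'not-googlebot'.

    The labels are illustrative names for the two color codes
    plus a catch-all for agents that do not claim to be GoogleBot.
    """
    if not claims_googlebot(user_agent):
        return "not-googlebot"
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in ASSUMED_GOOGLE_NETWORKS):
        return "google-ip"
    return "other-ip"


if __name__ == "__main__":
    agent = "Mozilla/5.0 (compatible; GoogleBot/2.1; url)"
    print(label_googlebot(agent, "192.0.2.10"))
    print(label_googlebot(agent, "203.0.113.5"))
```

A request labeled `other-ip` is not necessarily spoofed; the label only says that the address is outside the assumed list.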

For this report page requests are considered to be issued by a crawler in two cases:
1 The user agent string contains a web address. Only crawlers should have that, but there are some false positives, where a browser sends a user agent string with a web address (an ill-behaved plug-in; the main offenders have been eliminated).
2 The user agent string contains the term bot, spider or crawl[er].
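
The two rules can be sketched as a small classifier. This is an illustration under stated assumptions: the report does not say how a web address is detected, so the `WEB_ADDRESS` pattern (an `http://`, `https://` or `www.` prefix) is a guess, and the function name `crawler_rule` is hypothetical.

```python
import re
from typing import Optional

# Rule 1 (assumed detection): the agent string contains something that
# looks like a web address.
WEB_ADDRESS = re.compile(r"https?://|www\.", re.IGNORECASE)

# Rule 2: the agent string contains bot, spider or crawl[er].
# Matching "crawl" also covers "crawler".
KEYWORD = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def crawler_rule(user_agent: str) -> Optional[str]:
    """Return which rule marks the agent as a crawler, or None if neither does."""
    if WEB_ADDRESS.search(user_agent):
        return "web address"
    if KEYWORD.search(user_agent):
        return "keyword"
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
        "ClueBot/1.1",
        "Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20100101 Firefox/7.0",
    ]
    for agent in samples:
        print(crawler_rule(agent), "<-", agent)
```

Rule 1 is checked first, so an agent that matches both rules is reported under "web address". A plain substring match on "bot" is deliberately loose and will also flag unrelated words that happen to contain it.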

In total 67,843,000 page requests (mime type text/html only!) per day are considered crawler requests, out of 438,098,000 external requests, which is 15.5%.
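
The percentage can be checked with a few lines of arithmetic. The sketch below also shows how a count from the 1:1000 sampled log would be scaled up, assuming each sampled log line stands for 1000 requests. The function names are hypothetical.

```python
SAMPLING_FACTOR = 1000  # the log is sampled 1:1000


def scale_up(sampled_count: int) -> int:
    """Estimate the real request count from a count in the sampled log."""
    return sampled_count * SAMPLING_FACTOR


def share_percent(part: int, whole: int) -> float:
    """Share of `part` in `whole` as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


crawler_requests = 67_843_000    # daily crawler page requests (from the report)
external_requests = 438_098_000  # daily external requests (from the report)

if __name__ == "__main__":
    print(share_percent(crawler_requests, external_requests))  # 15.5
    print(scale_up(67_843))  # 67843000
```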

Page requests for crawlers that specify a url in the agent string
Count (x 1000)
Secondary domain (~site) name
 URL, Mime type, User agent
google
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortografia4)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/feedfetcher.html-FeedFetcher-Google; (url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: s~redconceptual)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmlapplication/xmlFeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortopedianew)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: rarplayer)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~harunakaze)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/jsonMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~expinia-wiki)
 www.google.com/feedfetcher.htmltext/..FeedFetcher-Google; (url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/p/crawler4j/text/..crawler4j (url)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: tinysrc)
 code.google.com/appenginetext/..WikiBot/0.1 AppEngine-Google; (url; appid: newikipedia)
 www.google.com/feedfetcher.htmlapplication/xmlMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appengineapplication/xmlAppEngine-Google; (url; appid: wikipedia-raw)
 www.google.com/coop/cse/creftext/..FeedFetcher-Google-CoOp; (url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/p/crawler4j/image/..crawler4j (url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: boxapp)
 www.google.com/bot.htmlimage/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kbworld24)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: drrkproxxxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikigameapp)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: finchproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: my-reg)
 code.google.com/appenginetext/..www.productontology.org/1.0 (Contact: mail address ) AppEngine-Google; (url; appid: gr4bing)
 code.google.com/appengineapplication/jsonMWBOT GAE Edition AppEngine-Google; (url; appid: philip-bot)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: demowaiy)
 code.google.com/appengineapplication/jsonMozilla 3.5 AppEngine-Google; (url; appid: prfleme)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; documents; url)
 code.google.com/appengineimage/..Mozilla/5.0 (Windows; Windows NT 6.1; zh-CN; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16 ( .NET4.0E) QQDownload/1.7 AppEngine-Google; (url; appid: donut-1)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d24-img)
 desktop.google.com/-Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: retimeme)
 www.google.com/bot.htmlimage/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: recrossed)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: s~harunakaze)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: dustbunnytycoonmonitor)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~luffyfactory)
 code.google.com/appenginetext/.. mail address AppEngine-Google; (url; appid: itravelapp)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: mygpxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: meme-darwin)
facebook
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
 developers.facebook.comimage/..facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.1 (url)
yahoo
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRW/1.0 crawler (url)
 listing.yahoo.co.jp/support/faq/int/other/other_001.htmltext/..Y!J-BRJ/YATS crawler (url)
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 developer.yahoo.com/yql/providertext/..Mozilla/5.0 (compatible; Yahoo Pipes 2.0; url) Gecko/20090729 Firefox/3.5.2
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! DE Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.com/help/us/ysearch/slurpapplication/vnd.php.serializedMozilla/5.0 (compatible Yahoo! Slurp/3.0 url)
 help.yahoo.com/help/us/ysearch/slurpapplication/oggMozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRT/1.0 crawler (url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurpapplication/jsonMozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.comtext/..Mozilla/5.0 (YahooYSMcm/3.0.0; url)
 misc.yahoo.com.cn/help.htmltext/..Mozilla/5.0 (compatible; Yahoo! Slurp China; url)
google?
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0(compatible;GoogleBot/2.1;url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
bing
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/vnd.php.serializedMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmimage/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b3
baidu
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider-image(url)
 www.baidu.com/search/spider.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.html-Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider(url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0(compatible;Baiduspider/2.0;url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url) (via babelfish.yahoo.com)
 www.baidu.com/search/spider.htmlimage/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
msn
 search.msn.com/msnbot.htmtext/..msnbot-Products/1.0 (url)
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)._
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmimage/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-UDiscovery/2.0b (url)
naver
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/image/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/customer_webtxt_02.jsptext/..Mozilla/4.0 (compatible; NaverBot/1.0; url)
yandex
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/bots-Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexDirect/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexAntivirus/2.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImageResizer/2.0; url)
 yandex.com/botsapplication/vnd.php.serializedMozilla/5.0 (compatible; YandexBot/3.0; url)
www.
 www.text/..GoogleBot-Image/1.0 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
sblog
 fulltext.sblog.cz/screenshot/image/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/text/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0-test (url)
 fulltext.sblog.cz/-SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/application/javascriptMozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
sentymetr
 sentymetr.pl/bot.htmlapplication/jsonMozilla/5.0 (compatible; SentymetrBot 1.0; url)
 sentymetr.pl/bot.htmltext/..Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
traslated
 mymemory.traslated.net/doc/text/..Mozilla/5.0 (MyMemory Bot url)
youdao
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/-Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 toolbar.youdao.com/image/..Youdao Toolbar (url)
 www.youdao.com/help/webmaster/spider/image/..Mozilla/5.0 (compatible;YodaoBot-Image/1.0;url;)
 www.youdao.com/help/webmaster/spider/application/vnd.php.serializedMozilla/5.0 (compatible; YoudaoBot/1.0; url; )
archive
 crawls.archive.org/collections/eot/2012/about.htmltext/..Mozilla/5.0 (compatible; archive.org_bot/1.5.0 url)
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; archive.org_bot url)
 crawls.archive.org/collections/eot/2012/about.htmlimage/..Mozilla/5.0 (compatible; archive.org_bot/1.5.0 url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; archive.org_bot url)
php
 pear.php.net/application/vnd.php.serializedPEAR HTTP_Request class ( url )
 pear.php.net/application/xmlPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/0.5.2 (url) PHP/5.2.17
 pear.php.net/text/..PEAR HTTP_Request class ( url )
 pear.php.net/image/..PEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/2.0.0RC1 (url) PHP/5.3.2-1ubuntu4.9
majestic12
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.0; url)
sogou
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07application/vnd.php.serializedSogou web spider/4.0(url)
wikipedia
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.18.0 url
 en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentationtext/..WikiCleaner (url)
 en.wikipedia.orgtext/..url
 fr.wikipedia.org/wiki/Utilisateur:Salebotapplication/jsonSalebot, see url (uses Perl MediaWiki::API)
yacy
 yacy.net/bot.htmltext/..yacybot (sciencenet-any; amd64 Linux 2.6.32-33-generic; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (sciencenet-any; amd64 Linux 2.6.35-30-generic; java 1.6.0_20; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.38-11-server; java 1.6.0_22; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-custom; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.0.0-12-generic; java 1.6.0_23; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.38-11-generic; java 1.6.0_26; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.39-bpo.2-686-pae; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.31-gentoo-r6; java 1.6.0_17; Etc/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-custom; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (webportal/global; amd64 Linux 2.6.38-sysrescue-std220; java 1.6.0_29; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 2.6.38-8-generic; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/de) url
wikimedia
 tools.wikimedia.de/~daniel/text/..WikiSense (url)
justsystems
 www.justsystems.com/jp/tech/crawler/text/..JUST-CRAWLER(url)
wwwgogetpapers
 wwwgogetpapers.com/application/jsonUser-Agent: GoGetPapersBot (url)
 wwwgogetpapers.com/text/..User-Agent: GoGetPapersBot (url)
wordpress
 02varvara.wordpress.comtext/..WordPress/MU; url
 sanakajayamini.wordpress.comtext/..WordPress/MU; url
 arthur2rcasc.wordpress.comtext/..WordPress/MU; url
 stradivariusconcerti.wordpress.comtext/..WordPress/MU; url
 munkq.wordpress.comtext/..WordPress/MU; url
 jamiaurduhind.wordpress.comtext/..WordPress/MU; url
 philafric.wordpress.comtext/..WordPress/MU; url
 greatriversofhope.wordpress.comtext/..WordPress/MU; url
 eof737.wordpress.comtext/..WordPress/MU; url
 machikawaco.wordpress.comtext/..WordPress/MU; url
 worldwright.wordpress.comtext/..WordPress/MU; url
 curtisnarimatsu.wordpress.comtext/..WordPress/MU; url
toolserver
 wiki.toolserver.org/view/GeoHacktext/..Geohack (url)
 toolserver.org/~bayo/text/..LudoThecaire/1.0 (url)
 toolserver.org/~dispenser/text/..WebWikipedia Python (url)
 toolserver.org/~para/cgi-bin/kmlexporttext/..url libwww-perl/6.02
mediawiki
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url)
entireweb
 www.entireweb.com/about/search_tech/speedy_spider/text/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US) Speedy Spider (url)
sf
 liferea.sf.net/text/..Liferea/1.x.x (Linux; es_ES.UTF-8; url)
 liferea.sf.net/text/..Liferea/0.x.x (Linux; en_US.UTF-8; url)
 magpierss.sf.nettext/..MagpieRSS/0.7x (url)
soso
 help.soso.com/webspider.htmtext/..Sosospider(url)
 help.soso.com/webspider.htm-Sosospider(url)
ac
 www.cse.iitb.ac.in/~vishaal_h4text/..DrRajendra/Nutch-0.9 (IIT Kharagpur; url; mail address )
 ce.yazduni.ac.irtext/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 www.clips.ua.ac.be/pages/patternapplication/jsonPattern/1.0 url
 www.clips.ua.ac.be/pages/patterntext/..Pattern/1.0 url
enwp
 enwp.org/User:SDPatrolBottext/..SDPatrolBot (url)
 enwp.org/User:KingpinBottext/..KingpinBot (url)
 enwp.org/User:H3llkn0wz/WikiSharpAPItext/..WikiSharpAPI/0.3 url (C# .NET)
semager
 www.semager.de/blog/semager-bots/text/..Mozilla/5.0 (compatible; Semager/1.4; url)
echonest
 the.echonest.com/reader/application/xmlnestReader/0.3 (discovery; url; reader at echonest.com)
 the.echonest.com/reader/text/..nestReader/0.3 (discovery; url; reader at echonest.com)
zum
 help.zum.com/inquirytext/..ZumBot/1.0 (ZUM Search; url)
 help.zum.com/inquiryimage/..ZumBot/1.0 (ZUM Search; url)
jetbrains
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 2.0 Release Candidate 1 (url)
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 1.0.x (url)
covario
 www.covario.com/idstext/..CovarioIDS/1.1 (url; mail address )
feedshow
 www.feedshow.comtext/..FeedshowOnline (url)
 www.feedshow.comtext/..Feedshow/x.0 (url; 1 subscriber)
avantbrowser
 www.avantbrowser.comtext/..Avant Browser (url)
 www.avantbrowser.comtext/..Advanced Browser (url)
newsgator
 www.newsgator.com/text/..FeedDemon/2.7 (url; Microsoft Windows XP)
 www.newsgator.comtext/..NewsGatorOnline/2.0 (url; 1 subscribers)
tumblr
 benderthewebrobot.tumblr.comtext/..Mozilla/5.0 (compatible; Bender; url)
 benderthewebrobot.tumblr.comapplication/vnd.php.serializedMozilla/5.0 (compatible; Bender; url)
80legs
 www.80legs.com/webcrawler.htmltext/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
 www.80legs.com/webcrawler.htmlimage/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
federatedmedia
 federatedmedia.nettext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
ahrefs
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/1.0; url)
github
 github.com/pauldix/typhoeus/tree/mastertext/..Typhoeus - url
 github.com/NeilCrosby/wikislurpapplication/vnd.php.serializedWikiSlurp (url)
tweetmeme
 tweetmeme.com/text/..Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
bsurprised
 bsurprised.com/text/..BSurprised WikiBox 0.1.3 (url)
kosmix
 www.kosmix.com/html/kosmos.htmlapplication/xmlMozilla/5.0(compatible;Kosmos/1.0;url)
goo
 help.goo.ne.jp/contact/text/..goo wikipedia (url)
 help.goo.ne.jp/help/article/1142/-DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/mobile goo; url)
 help.goo.ne.jp/help/article/1142/text/..DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/mobile goo; url)
FeedBurner
 www.FeedBurner.comtext/..FeedBurner/1.0 (url)
daum
 ws.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/2.0
apercite
 www.apercite.fr/robot/index.htmlimage/..Mozilla/5.0 (compatible; Apercite; url)
emining
 emining.jp/text/..emBot-GalaBuzz/Nutch-1.0 (url; mail address )
bin-co
 www.bin-co.com/php/scripts/load/text/..BinGet/1.00.A (url)
 www.bin-co.com/php/scripts/load/application/vnd.php.serializedBinGet/1.00.A (url)
freebase
 www.freebase.comtext/..metaweb/Nutch-1.0-dev (url; help_at_metaweb.com)
weblio
 www.weblio.jp/text/..Mozilla/5.0 (compatible; WeblioBot; url)
scoutjet
 www.scoutjet.com/text/..Mozilla/5.0 (compatible; ScoutJet; url)
spottercharts
 www.spottercharts.com/text/..SpotterCharts (url)
orcabrowser
 www.orcabrowser.comtext/..Orca Browser (url)
seebot
 seebot.orgtext/..Lynx/2.8 (;url)
bibalex
 archive.bibalex.org/bot/image/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
 archive.bibalex.org/bot/text/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
graemef
 graemef.comtext/..NewsGator FetchLinks extension/0.2.0 (url)
speaktoit
 www.speaktoit.comapplication/jsonSpeaktoit url
plagger
 plagger.org/text/..Plagger/0.x.xx (url)
kula
 kula.jp/endotext/..endo/1.0 (Mac OS X; ppc i386; url)
apache
 lucene.apache.org/nutch/bot.htmltext/..NutchCVS/0.7.2 (Nutch; url; mail address )
tinyurl
 tinyurl.com/64t5ntext/..Rome Client (url) Ver: 0.9
blogbridge
 www.blogbridge.com/text/..BlogBridge 2.13 (url)
rssreader
 www.rssreader.comtext/..RssReader/1.0.xx.x (url) Microsoft Windows NT 5.1.2600.0
ranchero
 ranchero.com/netnewswire/text/..NetNewsWire/2.x (Mac OS X; url)
it-influentials
 search.it-influentials.com/bot.htmtext/..Mozilla/5.0 (compatible;FindITAnswersbot/1.0;url)
whatrhymeswith
 www.whatrhymeswith.com/site/rhyme-bottext/..RhymeBot/0.1 (url)
timewe
 timewe.nettext/..CDR/1.7.1 Simulator/0.7(url) Profile/MIDP-1.0 Configuration/CLDC-1.0
winpodder
 winpodder.comtext/..WinPodder (url)
rssbandit
 www.rssbandit.orgtext/..RssBandit/1.5.0.10 (WinNT 5.1.2600.0; url) (WinNT 5.1.2600.0; )
ponderer
 ponderer.org/download/annotate_google.user.jstext/..annotate_google; url
feeds4all
 www.feeds4all.com/feedzcollectortext/..FeedZcollector v1.x (Platinum) url
zipcommander
 www.zipcommander.com/text/..1st ZipCommander (Net) - url
zootycoon
 www.zootycoon.comtext/..Zoo Tycoon 2 Client -- url
nemui
 mozshot.nemui.org/text/..Mozilla/5.0 (Gecko/20070310 Mozshot/0.0.20070628; url)
snarfware
 www.snarfware.com/text/..Snarfer/0.x.x (url)
exabot
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0; url)
hatena
 a.hatena.ne.jp/helptext/..Hatena Antenna/0.5 (url)
flipboard
 flipboard.com/browserproxyimage/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
Anonymouse
 Anonymouse.org/text/..url (Unix)
 Anonymouse.org/image/..url (Unix)
4chat
 www.4chat.tvtext/..url
SearchNearMe
 SearchNearMe.com/contact.phpapplication/vnd.php.serializedSearchNearMe (url)
 SearchNearMe.com/contact.phptext/..SearchNearMe (url)
gnip
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
test
 www.test.testtext/..Mozilla/5.0 (compatible; heritrix/1.6.0-OFFIS url)
z-add
 w3.z-add.co.uk/linkcheck/text/..Z-Add Link Checker (url)
puritysearch
 www.puritysearch.net/text/..Mozilla/5.0 (compatible; Purebot/1.1; url)
suggy
 blog.suggy.com/was-ist-suggy/suggy-webcrawler/text/..Mozilla/5.0 (compatible; suggybot v0.01a, url)
alexa
 www.alexa.com/site/help/webmasterstext/..ia_archiver (url; mail address )
searchtechnologies
 www.searchtechnologies.comtext/..Mozilla/5.0 (compatible; heritrix/1.14.3 url)
bnf
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmltext/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmlimage/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
netnewswireapp
 netnewswireapp.com/mac/-NetNewsWire/3.3 (Mac OS X; url; gzip-happy)
pannous
 pannous.infotext/..Mozilla/5.0 (Voice Actions url)
 pannous.nettext/..Mozilla/5.0 (Voice Actions url)
moviecus
 www.moviecus.com/botcontactinfo.phpapplication/yamlmoviecus bot (url)
spinn3r
 spinn3r.com/robottext/..Mozilla/5.0 (X11; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); url) Gecko/2010040121 Firefox/3.0.19
whstour
 tokyo.whstour.comtext/..WordPress/3.2.1; url
 osaka.whstour.comtext/..WordPress/3.2.1; url
 nagoya.whstour.comtext/..WordPress/3.2.1; url
textdigger
 textdigger.comtext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
simplepie
 simplepie.orgtext/..SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
 simplepie.orgapplication/xmlSimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
drupal
 drupal.org/text/..User-Agent: Drupal (url)
 drupal.org/text/..Drupal (url)
metamagazine
 metamagazine.comtext/..WordPress/3.2.1; url
arquivo
 arquivo.pt/faq-crawlingtext/..Arquivo-web-crawler (compatible; heritrix/1.14.3 url)
turnitin
 www.turnitin.com/robot/crawlerinfo.htmltext/..TurnitinBot/2.1 (url)
wikiglass
 wikiglass.comtext/..url : mail address
ibis
 ibis.ne.jp/browser/about.htmlimage/..Mozilla/4.0 (compatible; ibisBrowser; url)
 ibis.ne.jp/browser/about.htmltext/..Mozilla/4.0 (compatible; ibisBrowser; url)
mytvmoments
 www.mytvmoments.comtext/..My TV Moments (url)
rockpeaks
 www.rockpeaks.com/contacttext/..RockPeaks/0.1 (url)
froute
 labs.froute.jp/pc2m/help.htmltext/..Froute Mobile Gateway/1.0 (url)
blogscope
 www.blogscope.net/text/..Mozilla/5.0 (compatible; BlogScope/1.0; url; U of Toronto)
rcdtokyo
 www.rcdtokyo.com/pc2m/text/..Mozilla/5.0 (compatible; PEAR HTTP_Request class; url)
zapbot
 www.zapbot.nettext/..Mozilla/5.0 (compatible; ZapBot/0.2n; url)
 www.zapbot.comtext/..Mozilla/5.0 (compatible; ZapBot/0.2c; url)
 www.zapbot.orgtext/..Mozilla/5.0 (compatible; ZapBot/0.2o; url)
fairshare
 fairshare.cctext/..Mozilla crawl/5.0 (compatible; fairshare.cc url)
 fairshare.ccapplication/vnd.php.serializedMozilla/5.0 url (X11; FreeBSD i386; en-US; rv:1.2a) Gecko/20021021
 fairshare.cctext/..Mozilla/5.0 url (X11; FreeBSD i386; en-US; rv:1.2a) Gecko/20021021
search
 www.search.ch/rim.htmltext/..UltraSpider3000/1.0 (url)
embed
 support.embed.ly/text/..Mozilla/5.0 (compatible; Embedly/0.2; url)
 support.embed.ly/image/..Mozilla/5.0 (compatible; Embedly/0.2; url)
trendiction
 www.trendiction.de/bottext/..Mozilla/5.0 (Windows; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.4.5; trendiction search; url; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11
sourceforge
 linkchecker.sourceforge.net/text/..LinkChecker/7.2 (url)
 linkchecker.sourceforge.net/text/..LinkChecker/5.1 (url)
paper
 support.paper.li/entries/20023257-what-is-paper-litext/..Mozilla/5.0 (compatible; PaperLiBot/2.1; url)
creativecommons
 wiki.creativecommons.org/Metadata_Scrapertext/..CC Metadata Scaper url
mondowindow
 www.mondowindow.comtext/..MondoWindow (url)
veveo
 corporate.veveo.net/webmasters.htmltext/..Mozilla/5.0 (compatible; Veveobot; url)
topsy
 labs.topsy.com/butterfly/text/..Mozilla/5.0 (compatible; Butterfly/1.0; url) Gecko/2009032608 Firefox/3.0.8
js-kit
 js-kit.com/text/..JS-Kit URL Resolver, url
picsearch
 www.picsearch.com/bot.htmltext/..psbot/0.1 (url)
89,612 total

Page requests for probable crawlers, recognized by keyword
Count (x 1000)
Agent string
  Mime type (count ≥ 3)
PythonWikipediaBot/1.0
 application/json
 application/xml
 text/..
 -
 image/..
GoogleBot-Image/1.0
 text/..
 image/..
 -
MediaWikiCrawler-Google/2.0 ( mail address )
 text/..
 -
ClueBot/1.1
 application/vnd.php.serialized
 -
php wikibot classes
 application/vnd.php.serialized
 text/..
GoogleBot-Image/1.0
 text/..
 image/..
 application/vnd.php.serialized
 -
 application/json
LinkParser/2.0
 text/..
Mozilla/5.0 (Windows; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 ( mail address )
 text/..
Kavande Crawler 1.0/Nutch-1.4-dev (Iranian National Web Crawler)
 text/..
 image/..
wikiwix-bot-3.0
 text/..
jikespider "Mozilla/5.0
 text/..
 image/..
 application/xml
 -
 application/ogg
 application/x-external-editor
Onespot Crawler
 application/json
 text/..
 -
Peachy MediaWiki Bot API Version 1.0
 application/vnd.php.serialized
 -
 text/..
ClueBot/2.0
 application/vnd.php.serialized
 text/..
spider
 text/..
 image/..
Answersbot
 text/..
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
 application/vnd.php.serialized
Pywikipediabot/2.0
 application/json
mail address
 application/vnd.php.serialized
 text/..
Mozilla 5.0 (Apibot 0.32)
 application/vnd.php.serialized
DigitalsmithsBot
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.4.0.0) Opera Mini/3.1
 image/..
 text/..
wikiBot Ver0.1
 application/json
Test Webbot
 text/..
HTMLParser/1.6
 text/..
Mozilla/4.0 (compatible; EmberSpider 0.8; Scout (a); bgft)
 text/..
AnomieBOT 1.0 (TagDater)
 application/json
 text/..
python-wikitools/1.2 (User:BernsteinBot)
 application/json
MediaWiki::Bot/3.2.6
 application/json
AarghBot Linux
 text/..
 -
DotNetWikiBot/2.81 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
YBot/0.1
 application/vnd.php.serialized
FAST Search Web Crawler 14.0.0291.0000
 text/..
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (Exabot-Thumbnails)
 image/..
 text/..
 application/json
Mozilla/5.0 (compatible; Nigma.ru/3.0; mail address )
 text/..
 application/opensearchdescription+xml
DotNetWikiBot/2.97 (Unix 5.10.0.0; )
 application/xml
 text/..
setoozbot/0.30 ( compatible; SETOOZBOT/0.30 ; setooz.com ; bot AT setooz DOT com )
 text/..
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
 text/..
 -
UCMore Crawler App
 text/..
 -
Mozilla/5.0 (X11; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot
 text/..
HTMLParser/2.0
 text/..
FAST Enterprise Crawler 6 used by ESP ( mail address )
 text/..
MLBot (www.metadatalabs.com/mlbot)
 text/..
 application/vnd.php.serialized
 image/..
Webwiki Search Engine Bot - www.webwiki.de
 text/..
AnomieBOT 1.0 (ReplaceExternalLinks2)
 application/json
SineBot/1.5.17(User:SineBot)
 application/vnd.php.serialized
 text/..
GoogleBot
 text/..
 image/..
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
BotMapDev/1.3.666 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
Felix's spiderman/1.0
 text/..
 -
AniBot/0.9 php/curl
 application/vnd.php.serialized
 -
 text/..
Wiktionary spider. mail address
 text/..
jikespider ("Mozilla/5.0)
 text/..
 application/ogg
wikbot/1.23 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
 application/json
 text/..
MyCuteBot/0.1
 text/..
 application/json
HRoestBot, de-wikipedia using pywikipedia framework
 application/json
 application/xml
 text/..
BotMap/1.3.666 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
AdMedia bot
 text/..
COIBot/2.0
 text/..
BotMapDev/1.3.663 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
BotMapDev/1.3.663 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
MoovidaBot/0.1
 text/..
AnomieBOT 1.0 (FlagIconRemover)
 application/json
AnomieBOT 1.0 (OrphanReferenceFixer)
 application/json
Twitterbot/0.1
 text/..
 image/..
BotMapDev/1.3.672 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
BotMapBeta/1.3.631 CFNetwork/485.13.9 Darwin/11.0.0
 -
strucr.com crawler 0.1.42 (refer to in robots.txt as strucr, see https://strucr.com/bot)
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/1.4.0.0) Opera Mini/3.1
 -
TVersity Media Robot
 text/..
BotMap/1.3.671 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
Mozilla/5.0 MaboMwFramework/1.1 (w:de:MerlIwBot)
 text/..
Pastec bot
 text/..
 -
SiocWikiBot/1.0
 application/vnd.php.serialized
 text/..
COIBot/1.00
 text/..
Twitterbot/1.0
 text/..
 image/..
 application/ogg
SpinSpider
 text/..
AnomieBOT 1.0 (TemplateSubster)
 application/json
Online Spider/Nutch-1.3
 text/..
CaBot Script (running on nightshade.toolserver.org)
 application/vnd.php.serialized
 text/..
DotNetWikiBot/2.96 (Unix 5.10.0.0; )
 text/..
 application/xml
BotMap/1.3.663 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
Mozilla/5.0 (compatible; Bond,James Bond/0.07; robot)
 application/json
 image/..
DotNetWikiBot/2.92 (Microsoft Windows NT 6.1.7600.0; )
 text/..
Slevnicka.cz CURL bot
 text/..
~Bot ([[:fr:w:User:TildeBot]] by [[:fr:w:User:Alphos]] mail address )
 text/..
OrlodrimBot/1.0
 text/..
AnomieBOT 1.0 (BAGBot)
 application/json
 text/..
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
 image/..
 text/..
AnomieBOT 1.0 (ReplaceExternalLinks4)
 application/json
Tawbot (public svn release; plwiki)
 text/..
BotMapDev/1.3.673 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
BotMapDev/1.3.671 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
XLinkBot/1.00
 text/..
TheKeens bot
 text/..
Happy OpenBuildings Robot
 application/json
DotNetWikiBot/2.97 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
Soundkiosk Relation-Crawler (Version 1.0; soundkiosk.de)
 application/xml
SurakWare MediaWiki Bot/1.0
 text/..
FAST Enterprise Crawler 6 used by viaapia (viaapia)
 text/..
 -
BotMap/1.3.666 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
Wikibot
 text/..
 image/..
MOTOBOT
 text/..
 application/ogg
CheMoBot/1.00
 text/..
Baiduspider
 text/..
infraEnterprise v8 Web Crawler
 -
MediaWiki::Bot/3.4.0
 application/json
Handelabra WikiBot
 application/vnd.php.serialized
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r0)
 application/json
DotNetWikiBot/2.97 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
Mozilla/5.0 (compatible; Birubot/1.0) Gecko/2009032608 Firefox/3.0.8
 image/..
 text/..
Freebase Deathbot
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r0)
 application/x-www-form-urlencoded
wikbot/1.23 CFNetwork/485.13.9 Darwin/11.0.0
 image/..
 application/json
mail address (Mozilla compatible)
 text/..
unblockbot/1.00
 text/..
IssueCrawler
 text/..
My Bot
 text/..
QuickFinder Crawler
 text/..
PadosAttilaCrawler/Nutch-1.0 (Ozi,PolandWiz,AustriaWiz,WiennaWiz crawlers, Attila Pados, mail address ; www.ozi.hu, www.polandwiz.com,www.wiennawiz.com,www.austriawiz.com; attila dot mail address )
 text/..
DotNetWikiBot/2.9 (Microsoft Windows NT 6.0.6000.0; )
 text/..
BotMapDev/1.3.662 CFNetwork/548.0.3 Darwin/11.0.0
 image/..
SearQuBot/SearQuBot v1.0
 text/..
BotMapDev/1.3.662 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
wikbotlite/1.20 CFNetwork/548.0.3 Darwin/11.0.0
 application/json
 image/..
 -
TrueKnowledgeBot bot mail address >
 application/xml
 application/vnd.php.serialized
AnomieBOT 1.0 (RandomPagePicker)
 application/json
Geni ircpybot 1.0
 application/json
 text/..
AnomieBOT 1.0 (AFDMergeFromCleaner)
 application/json
Mozilla/5.0 (compatible; FriendFeedBot/0.1; Http://friendfeed.com/about/bot; 371 subscribers; feed-id=3852576738117026533)
 -
 application/xml
SkimWordsBot/1.0
 text/..
HBC Archive Indexerbot 0.9a
 text/..
python-wikitools/1.2 (User:LaraBot)
 application/json
My Bot
 image/..
 text/..
DotNetWikiBot/2.9 (Unix 5.10.0.0; )
 text/..
Mozilla/5.0 (Bgbot 0.5)
 text/..
AnomieBOT 1.0 (DeletionSortingCleaner)
 application/json
bitlybot
 text/..
 image/..
123peoplebot/1.0
 text/..
DotNetWikiBot/2.96 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
wikbotlite/1.20 CFNetwork/485.13.9 Darwin/11.0.0
 image/..
 application/json
 text/..
MediaWiki::Bot 3.1.5
 application/json
Mozilla 5.0 (Apibot 0.30b5)
 application/vnd.php.serialized
YourFilmsBot/0.1
 application/json
Citation_bot; mail address
 text/..
BotMapDev/1.3.674 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
Opera/9.70 (Linux armv7l ; turbotabbee/TSV2.0/1.02Q; fr) Presto/2.2.1
 image/..
 application/json
 text/..
BotMapDev/1.3.661 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
BotMap/1.3.671 CFNetwork/548.0.3 Darwin/10.8.0
 image/..
WikiBot/0.1
 text/..
AnomieBOT 1.0 (ReplaceExternalLinks3)
 application/json
NFCCheckBot/1.0
 text/..
gsa-crawler (Enterprise; GID-01422; jplastiras.com)
 text/..
Cooby Bot 201-109
 text/..
19,047 total

IP ranges: known IP ranges for Google are 64.233.[160.0-191.255], 66.249.[64.0-95.255], 66.102.[0.0-15.255], 72.14.[192.0-255.255],
74.125.[0.0-255.255], 209.85.[128.0-255.255], 216.239.[32.0-63.255] and a few other minor subranges
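
As an illustration of how the ranges listed above could be used to separate GoogleBot requests coming from a Google ip address from possibly spoofed ones, here is a minimal Python sketch. It is not the script that produced this report. The CIDR blocks are a direct translation of the listed ranges (the "few minor other subranges" are not included), and the names GOOGLE_NETWORKS, is_listed_google_ip and classify_googlebot are hypothetical.

```python
import ipaddress

# CIDR translation of the ranges listed in this report (assumption: the
# unlisted minor subranges are ignored; 209.085 is read as 209.85).
GOOGLE_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "64.233.160.0/19",   # 64.233.[160.0-191.255]
        "66.249.64.0/19",    # 66.249.[64.0-95.255]
        "66.102.0.0/20",     # 66.102.[0.0-15.255]
        "72.14.192.0/18",    # 72.14.[192.0-255.255]
        "74.125.0.0/16",     # 74.125.[0.0-255.255]
        "209.85.128.0/17",   # 209.85.[128.0-255.255]
        "216.239.32.0/19",   # 216.239.[32.0-63.255]
    )
]


def is_listed_google_ip(address: str) -> bool:
    """Return True if the address falls inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in GOOGLE_NETWORKS)


def classify_googlebot(user_agent: str, address: str) -> str:
    """Label a request the way the color coding in this report distinguishes it."""
    if "googlebot" not in user_agent.lower():
        return "not GoogleBot"
    if is_listed_google_ip(address):
        return "GoogleBot (Google ip)"
    return "GoogleBot (other ip)"
```

A request whose user agent claims to be GoogleBot but whose address lies outside these ranges is only a candidate for spoofing, since the list of ranges is incomplete.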

Errata: WMF traffic logging service suffered from server capacity problems in Aug/Sep/Oct 2011.
Absolute traffic counts for October 2011 are approximately 7% too low.
Data loss only occurred during peak hours. It therefore may have had a somewhat different impact on traffic from different parts of the world,
and may also have skewed relative figures such as share of traffic per browser or operating system.

From mid September till late November squid log records for mobile traffic were in an invalid format.
Data could be repaired for logs from mid October onwards. Older logs were no longer available.

In an unrelated server outage precisely half of the traffic to WMF mobile sites was not counted from Oct 16 to Nov 29 (one of two load-balanced servers did not report traffic).
WMF has since improved server monitoring, so that similar outages should be detected and fixed much faster from now on.
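
For readers who want a rough estimate of actual traffic from the reported counts, the two undercounts described in these errata can be compensated with simple arithmetic. The following Python sketch is only an approximation under stated assumptions: "7% too low" is read as "the reported count is 93% of the actual count", the loss is treated as uniform over time (which the errata say it was not), and the function name correct_undercount is hypothetical.

```python
def correct_undercount(reported: float, missing_fraction: float) -> float:
    """Estimate the actual count when a known fraction of traffic went uncounted.

    Assumption: reported = actual * (1 - missing_fraction).
    """
    if not 0.0 <= missing_fraction < 1.0:
        raise ValueError("missing_fraction must be in the range [0, 1)")
    return reported / (1.0 - missing_fraction)


# Crawler page requests per day as stated in this report.
crawler_requests_per_day = 67_843_000

# General logging loss for October 2011: approximately 7% too low.
estimated_crawler_requests = correct_undercount(crawler_requests_per_day, 0.07)

# Mobile outage: one of two load-balanced servers did not report,
# so a reported mobile count is doubled (missing fraction 0.5).
# The value 1000 is a placeholder, not a figure from this report.
estimated_mobile_requests = correct_undercount(1000, 0.5)
```

Because data loss only occurred during peak hours, a single correction factor like this cannot repair relative figures such as share of traffic per browser or region.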

Generated on Mon, Apr 30, 2012 16:35
Author: Erik Zachte (Web site)
Mail: ezachte@### (no spam: ### = wikimedia.org)
All data and images on this page are in the public domain.

Note: this page may load more slowly in Microsoft Internet Explorer than in other major browsers