Wikimedia Traffic Analysis Report - Crawler requests

Monthly requests or daily averages, for period: 1 Oct 2012 - 31 Oct 2012 (last 12 months)

 This analysis is based on a 1:1000 sampled server log (squids)


The following overview of crawler (aka bot) page requests is based on the user agent information that accompanies most server requests. Unfortunately, this user agent information follows rather loosely defined guidelines.
Also, please bear in mind that the most popular crawler names may be somewhat overrepresented. This is the result of so-called user agent spoofing (where a requester supplies false credentials, e.g. to bypass web server filters).
GoogleBot seems to be a favorite for spoofing. Therefore GoogleBot requests from an IP address registered by Google (see below) are color coded differently from other GoogleBot requests.
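The distinction between GoogleBot requests from Google-registered addresses and other GoogleBot requests can be sketched as below. This is an illustration only, since the report does not say how it identifies Google's addresses. `GOOGLE_NETWORKS` is a hypothetical placeholder (the range shown is reserved for documentation, not a real Google range), the function name is an assumption, and the labels `google` and `google?` mirror the group names in the table on the assumption that those groups reflect this split.

```python
import ipaddress

# Hypothetical placeholder list. 192.0.2.0/24 is a reserved documentation
# range (RFC 5737), NOT an address range actually registered by Google.
GOOGLE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def classify_googlebot(user_agent: str, client_ip: str) -> str:
    """Label a request 'google', 'google?' or 'other' (labels are assumptions)."""
    # Requests that do not claim to be GoogleBot are out of scope here.
    if "googlebot" not in user_agent.lower():
        return "other"
    address = ipaddress.ip_address(client_ip)
    # A GoogleBot claim from a listed network is treated as genuine.
    if any(address in network for network in GOOGLE_NETWORKS):
        return "google"
    # A GoogleBot claim from any other address may be spoofed.
    return "google?"
```

A real check would need an up-to-date list of Google's address ranges in place of the placeholder.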

For this report page requests are considered to be issued by a crawler in two cases:
1 The user agent string contains a web address. Only crawlers should have that, but there are some false positives, where a browser sends a user agent string with a web address (an ill-behaved plug-in; the main offenders have been eliminated).
2 The user agent string contains the term bot, spider or crawl[er].
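The two rules above can be sketched as a small classifier. This is a minimal sketch, not the report's actual implementation: the name `is_crawler` and both regular expressions are assumptions, and the elimination of known false positives mentioned in rule 1 is not reproduced.

```python
import re

# Rule 1: the user agent string contains a web address.
# The pattern is an assumption; the report does not say how addresses are detected.
WEB_ADDRESS = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)

# Rule 2: the user agent string contains "bot", "spider" or "crawl[er]".
CRAWLER_TERM = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Return True if either rule flags the user agent as a crawler."""
    return bool(WEB_ADDRESS.search(user_agent) or CRAWLER_TERM.search(user_agent))


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
        "Sogou web spider/4.0",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 Chrome/19.0.1084.52 Safari/536.5",
    ]
    for sample in samples:
        print(is_crawler(sample), sample)
```

Because rule 2 is a plain substring match, any user agent that happens to contain "bot" is flagged, which is one way false positives can arise.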

In total, 78,991,900 page requests per day (mime type text/html only) are considered crawler requests, out of 518,048,900 external requests, which is 15.2%.
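The percentage can be checked directly from the two daily figures quoted above. The second half of the sketch shows how a count taken from a 1:1000 sampled log scales to an estimated total; the helper name `estimate_total` is hypothetical, and whether the report's figures were derived exactly this way is an assumption.

```python
# Daily figures quoted in the report.
crawler_requests_per_day = 78_991_900
external_requests_per_day = 518_048_900

# Share of external requests that are considered crawler requests.
share = crawler_requests_per_day / external_requests_per_day
print(f"{share:.1%}")  # prints 15.2%

# The server log keeps 1 in 1000 requests, so each sampled request
# stands for roughly 1000 real ones.
SAMPLING_RATE = 1000


def estimate_total(sampled_count: int) -> int:
    """Scale a count from the sampled log up to an estimated real total."""
    return sampled_count * SAMPLING_RATE
```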

Page requests for crawlers that specify a URL in the agent string
Count (x 1000) | Secondary domain (~site) name | URL | Mime type | User agent
google
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~hr-pulsesubscriber)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/feedfetcher.html-FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/xmlFeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortografia4)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: rarplayer)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortopedianew)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~cloudcrawling)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmltext/..FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/jsonMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..WikiBot/0.1 AppEngine-Google; (url; appid: newikipedia)
 code.google.com/p/crawler4j/text/..crawler4j (url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.909.30391; url)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: s~redconceptual)
 www.google.com/feedfetcher.htmlapplication/xmlMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 AppEngine-Google; (url; appid: s~fonetika3)
 code.google.com/appengineimage/..Offline Mobile Wiki (Tel:44 141 334 5472, mail address ) AppEngine-Google; (url; appid: s~wiki2go-hrd)
 code.google.com/appengineapplication/xmlAppEngine-Google; (url; appid: wikipedia-raw)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~syytacit)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien4)
 desktop.google.com/-Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; documents; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki2)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki3)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~francetiki)
 code.google.com/appenginetext/..www.productontology.org/1.0 (Contact: mail address ) AppEngine-Google; (url; appid: gr4bing)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~wikigraph2)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google;(url)
 code.google.com/appenginetext/..Offline Mobile Wiki (Tel:44 141 334 5472, mail address ) AppEngine-Google; (url; appid: s~wiki2go-hrd)
 code.google.com/appenginetext/..Python-urllib/2.5 AppEngine-Google; (url; appid: s~isnt-it)
 www.google.com/coop/cse/creftext/..FeedFetcher-Google-CoOp; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~kasumiremix)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; apps-presentations; url)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.909.30391; url)
 www.google.com/bot.htmlNONE/wikipedia- Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~app3123ak)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 KHTML Chrome/19.0.1084.52 Safari/536.5 AppEngine-Google; (url; appid: seiyukyouen)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d24-img)
 www.google.com/feedfetcher.html-Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/p/rondaapplication/jsonRonda - url
 code.google.com/appengineapplication/jsonMWBOT GAE Edition AppEngine-Google; (url; appid: philip-bot)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: threewiki)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: pakgalaxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: worldwide-propaganda)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki1)
 www.google.com/bot.htmlapplication/pdfMozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.911.3589; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kurizogeorge)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: kurizogeorge)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~misterhac)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: boxapp)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: toom16-10)
 code.google.com/p/rondatext/..Ronda - url
 code.google.com/appenginetext/..Wiki.java 0.26 AppEngine-Google; (url; appid: wikipediatools)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kires-roxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~drizzlprox)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: dustbunnytycoonmonitor)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: abdulfat)
 www.google.com/bot.htmlapplication/pdfGoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebproxy0)
facebook
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.1 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
 developers.facebook.comimage/..facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.1 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phpapplication/jsonfacebookexternalhit/1.1 (url)
 developers.facebook.com-facebookplatform/1.0 (url)
 developers.facebook.comtext/..facebookplatform/1.0 (url)
bing
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmimage/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/jsonMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b3
google?
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/jsonMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-GoogleBot/2.1 (url)
yahoo
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurpapplication/jsonMozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRW/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRI/0.0.1 crawler ( url )
 developer.yahoo.com/yql/providertext/..Mozilla/5.0 (compatible; Yahoo Pipes 2.0; url) Gecko/20090729 Firefox/3.5.2
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRT/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Nano; url)
 help.yahoo.com/help/us/ysearch/slurpapplication/xmlMozilla/5.0 (compatible; Yahoo! Slurp;url)
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..Y!J-BRU/VSIDX dead link checker (url)
baidu
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.html-Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmlapplication/jsonMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider-image(url)
 www.baidu.com/search/spider.htmlapplication/xmlMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider(url)
yandex
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/bots-Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexDirect/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImageResizer/2.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botsapplication/jsonMozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexNews/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
naver
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/application/jsonYeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/image/..Yeti/1.0 (NHN Corp.; url)
 corp.naver.jp/text/..Mozilla/5.0 (compatible; NaverJapan/1.0; url)
msn
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmimage/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-UDiscovery/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-Products/1.0 (url)
 search.msn.com/msnbot.htmtext/..msnbot/0.01 (url)
 search.msn.com/msnbot.htm-msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmimage/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htmimage/..msnbot/2.0b (url)
 search.msn.com/msnbot.htm-msnbot/2.0b (url)
genieo
 www.genieo.com/webfilter.htmltext/..Mozilla/5.0 (compatible; Genieo/1.0 url)
 www.genieo.com/webfilter.htmlapplication/xmlMozilla/5.0 (compatible; Genieo/1.0 url)
 www.genieo.com/webfilter.htmlimage/..Mozilla/5.0 (compatible; Genieo/1.0 url)
cibra
 cibra.de/text/..CiBra Data Collector (url)
sblog
 fulltext.sblog.cz/screenshot/image/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/text/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/-SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/-Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
80legs
 www.80legs.com/webcrawler.htmltext/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
 www.80legs.com/webcrawler.html-Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
finecomb
 finecomb.com/-api/1.1 (url; mail address )
 finecomb.com/application/jsonapi/1.1 (url; mail address )
php
 pear.php.net/application/vnd.php.serializedPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/0.5.2 (url) PHP/5.2.17
 pear.php.net/text/..PEAR HTTP_Request class ( url )
 pear.php.net/image/..PEAR HTTP_Request class ( url )
 pear.php.net/application/xmlPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2application/xmlHTTP_Request2/2.0.0 (url) PHP/5.3.8
 pear.php.net/package/http_request2text/..HTTP_Request2/2.1.1 (url) PHP/5.3.2-1ubuntu4.17
 pear.php.net/package/http_request2application/jsonHTTP_Request2/2.1.1 (url) PHP/5.3.16
 pear.php.net/package/http_request2image/..HTTP_Request2/2.1.1 (url) PHP/5.3.2-1ubuntu4.15
youdao
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/-Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 toolbar.youdao.com/image/..Youdao Toolbar (url)
www.
 www.text/..GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot-Image/1.0 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
 www.image/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
echonest
 the.echonest.com/reader/application/xmlnestReader/0.3 (discovery; url; reader at echonest.com)
 the.echonest.com/reader/text/..nestReader/0.3 (discovery; url; reader at echonest.com)
ahrefs
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/4.0; url)
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/3.1; url)
 ahrefs.com/robot/-Mozilla/5.0 (compatible; AhrefsBot/4.0; url)
 ahrefs.com/robot/application/jsonMozilla/5.0 (compatible; AhrefsBot/3.1; url)
 ahrefs.com/robot/application/jsonMozilla/5.0 (compatible; AhrefsBot/4.0; url)
wordpress
 josefboberg.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 pepedilorenzo.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 josefboberg.wordpress.comtext/..WordPress/3.5-alpha-21535; url
 greatriversofhope.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 klima47.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 libertedeparole.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 tannaznadafan.wordpress.comtext/..WordPress/3.5-alpha-21535; url
 imagenssagradas.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 tsjok45.wordpress.comtext/..WordPress/3.5-alpha-21535; url
 02varvara.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 philafric.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 greatriversofhope.wordpress.comtext/..WordPress/3.5-alpha-21535; url
 tsjok45.wordpress.comtext/..WordPress/3.5-alpha-21989; url
soso
 help.soso.com/webspider.htmtext/..Mozilla/5.0(compatible; Sosospider/2.0; url)
 help.soso.com/webspider.htm-Mozilla/5.0(compatible; Sosospider/2.0; url)
jike
 shoulu.jike.com/spider.htmltext/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.htmlimage/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.html-Mozilla/5.0 (compatible; JikeSpider; url)
majestic12
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.3; url)
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.2; url)
exabot
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0; url)
 www.exabot.com/go/robot-Mozilla/5.0 (compatible; Exabot/3.0; url)
wikipedia
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.19.0 url
 en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentationtext/..WPCleaner (url)
 fr.wikipedia.org/wiki/Utilisateur:Salebotapplication/jsonSalebot, see url (uses Perl MediaWiki::API)
toolserver
 wiki.toolserver.org/view/GeoHacktext/..Geohack (url)
 toolserver.org/~dispenser/image/..CacheThumbs/1.2 (url)
 toolserver.org/~dispenser/text/..DispensersTools (url)
 toolserver.org/~dispenser/application/jsonDispensersTools (url)
 toolserver.org/~dispenser/text/..CacheThumbs/1.2 (url)
 toolserver.org/~platonides/catdown/image/..catdown Images_from_Wiki_Loves_Monuments_2012_in_Spain (url)
 toolserver.org/~para/cgi-bin/kmlexporttext/..url libwww-perl/6.02
 toolserver.org/~platonides/catdown/image/..catdown Google_Art_Project (url)
sogou
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07application/jsonSogou web spider/4.0(url)
coccoc
 help.coccoc.vn/text/..coccoc/1.0 (url)
 help.coccoc.vn/-coccoc/1.0 (url)
blekko
 blekko.com/about/blekkobottext/..Mozilla/5.0 (compatible; Blekkobot; ScoutJet; url)
 blekko.com/about/blekkobot-Mozilla/5.0 (compatible; Blekkobot; ScoutJet; url)
zum
 help.zum.com/inquirytext/..ZumBot/1.0 (ZUM Search; url)
 help.zum.com/inquiryimage/..ZumBot/1.0 (ZUM Search; url)
bin-co
 www.bin-co.com/php/scripts/load/text/..BinGet/1.00.A (url)
 www.bin-co.com/php/scripts/load/application/vnd.php.serializedBinGet/1.00.A (url)
traslated
 mymemory.traslated.net/doc/text/..Mozilla/5.0 (MyMemory Bot url)
 mymemory.traslated.net/doc/-Mozilla/5.0 (MyMemory Bot url)
cognarius
 cognarius.comapplication/jsonAppsArlak/1.0 (url)
 cognarius.comtext/..AppsArlak/1.0 (url)
okian
 www.okian.ro/text/..MyBot/1.0 (url)
archive
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120116.200628 url)
 archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; heritrix/3.1.2-SNAPSHOT-20121017.193342 url)
 archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; heritrix/3.1.2-SNAPSHOT-20121018.064638 url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_bot-Mozilla/5.0 (compatible; archive.org_bot url)
wikidict
 www.wikidict.detext/..url
yacy
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-custom; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (webportal-global; amd64 Linux 3.2.0-0.bpo.3-amd64; java 1.6.0_18; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.38-16-server; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows Server 2008 R2 6.1; java 1.7.0_04; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.7.0_07; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.38.8-zu-20120305; java 1.6.0_18; Europe/ru) url
 yacy.net/bot.htmltext/..yacybot (webportal/global; amd64 Linux 2.6.23.17-dbserv; java 1.6.0_04; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-32-generic; java 1.7.0_07; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.7.0_04; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-31-generic; java 1.6.0_24; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.7.0_05; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.24-28-server; java 1.6.0_18; Europe/en) url
flipboard
 flipboard.com/browserproxyimage/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxyapplication/jsonMozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.1; url)
 flipboard.com/browserproxyimage/..null (FlipboardProxy/1.1; url)
 flipboard.com/browserproxy-Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
wwwgogetpapers
 wwwgogetpapers.com/application/jsonUser-Agent: GoGetPapersBot (url)
discoveryengine
 discoveryengine.com/discoverybot.htmltext/..Mozilla/5.0 (compatible; discoverybot/2.0; url)
 discoveryengine.com/discoverybot.html-Mozilla/5.0 (compatible; discoverybot/2.0; url)
goo
 help.goo.ne.jp/contact/text/..goo wikipedia (url)
 help.goo.ne.jp/door/crawler.htmltext/..ichiro/3.0 (url)
 goo.gl/7y4SXtext/..GoogleProducer; (url)
 search.goo.ne.jp/option/use/sub4/sub4-1/text/..ichiro/3.0 (url)
 search.goo.ne.jp/option/use/sub4/sub4-1/-DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/mobile goo; url)
 goo.gl/7y4SXimage/..GoogleProducer; (url)
daum
 tab.search.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/3.0
zipcode
 zipcode.ustext/..Mozilla/5.0 (compatible; YourCoolBot/1.0; url)
FeedBurner
 www.FeedBurner.comtext/..FeedBurner/1.0 (url)
enwp
 enwp.org/User:SDPatrolBottext/..SDPatrolBot (url)
 enwp.org/User:KingpinBottext/..KingpinBot (url)
 enwp.org/User:H3llkn0wz/WikiSharpAPItext/..WikiSharpAPI/0.3 url (C# .NET)
tweetmeme
 tweetmeme.com/text/..Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
 tweetmeme.com/text/..Mozilla/5.0 (compatible; TweetmemeBot/3.0; url)
 tweetmeme.com/-Mozilla/5.0 (compatible; TweetmemeBot/2.11; url)
 tweetmeme.com/-Mozilla/5.0 (compatible; TweetmemeBot/3.0; url)
SearchNearMe
 SearchNearMe.com/contact.phpapplication/vnd.php.serializedSearchNearMe (url)
 SearchNearMe.com/contact.phptext/..SearchNearMe (url)
whatrhymeswith
 www.whatrhymeswith.com/site/rhyme-bottext/..RhymeBot/0.1 (url)
gnip
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
kosmix
 www.kosmix.com/html/kosmos.htmlapplication/xmlMozilla/5.0(compatible;Kosmos/1.0;url)
wikiglass
 wikiglass.comtext/..url : mail address
toshiba
 www.toshiba.co.jp/rdc/about/crawl_info.htmtext/..TosCrawler/Nutch-1.4 (url; ' mail address dot co dot jp')
 www.toshiba.co.jp/rdc/about/crawl_info.htmtext/..TosCrawler/Nutch-1.5.1 (url; ' mail address dot co dot jp')
github
 github.com/pauldix/typhoeus/tree/mastertext/..Typhoeus - url
 github.com/stuartpb/metapoint-welpapplication/jsonMetapoint-WikipediaExternalLinkParser/0.1 (url; mail address )
 github.com/edsu/wikitweetsapplication/jsonwikitweets <url
 github.com/edsu/linkypediaapplication/jsonlinkpyediabot v0.1: url
 wiki.github.com/bixo/bixo/bixocrawlertext/..Mozilla/5.0 (compatible; pub-crawler; url; mail address )
apercite
 www.apercite.fr/robot/index.htmlimage/..Mozilla/5.0 (compatible; Apercite; url)
mediawiki
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url)
proximic
 www.proximic.com/info/spider.phptext/..Mozilla/5.0 (compatible; proximic; url)
dasdonkey
 www.dasdonkey.comtext/..Mozilla/5.0 (compatible; DonkeyBot/0.1; url)
sistrix
 crawler.sistrix.net/text/..Mozilla/5.0 (compatible; SISTRIX Crawler; url)
federatedmedia
 federatedmedia.nettext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
speaktoit
 www.speaktoit.comapplication/jsonSpeaktoit url
xbmc
 www.xbmc.orgimage/..XBMC/11.0 Git:20120702-f3cd288 (iOS; 11.0.0 AppleTV2,1, Version 5.1.1 (Build 9B830); url)
 www.xbmc.orgimage/..XBMC/11.0 Git:20120321-14feb09 (Windows NT 6.1;WOW64;Win64;x64; url)
 www.xbmc.orgimage/..XBMC/11.0 Git:20120321-14feb09 (Windows NT 6.1; url)
emining
 emining.jp/text/..emBot-GalaBuzz/Nutch-1.0 (url; mail address )
 emining.jp/-emBot-GalaBuzz/Nutch-1.0 (url; mail address )
wikimpress
 wikimpress.org/text/..Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
 wikimpress.org/-Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
plos
 alm.plos.orgapplication/jsonPLoS Article Level Metrics - url
drupal
 drupal.org/text/..Drupal (url)
 drupal.org/image/..Drupal (url)
 drupal.org/text/..User-Agent: Drupal (url)
tineye
 tineye.com/crawler.htmlapplication/jsonTinEye/1.1 (url)
 tineye.com/crawler.htmlimage/..TinEye/1.1 (url)
 tineye.com/crawler.htmltext/..TinEye/1.1 (url)
paper
 support.paper.li/entries/20023257-what-is-paper-litext/..Mozilla/5.0 (compatible; PaperLiBot/2.1; url)
ephorus
 www.ephorus.com/text/..Mozilla/5.0 (compatible; Ephorusbot/1.4.5.6; url)
sf
 magpierss.sf.nettext/..MagpieRSS/0.7x (url)
 liferea.sf.net/text/..Liferea/1.x.x (Linux; es_ES.UTF-8; url)
 liferea.sf.net/text/..Liferea/0.x.x (Linux; en_US.UTF-8; url)
moviecus
 www.moviecus.com/botcontactinfo.phpapplication/yamlmoviecus bot (url)
semager
 www.semager.de/blog/semager-bots/text/..Mozilla/5.0 (compatible; Semager/1.4c; url)
textdigger
 textdigger.comtext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
 textdigger.comimage/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
abonti
 www.abonti.comtext/..Mozilla/5.0 (compatible; Abonti/0.91 - url)
archive-it
 archive-it.org/files/site-owners.htmlimage/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.htmltext/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.html-Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
bibalex
 archive.bibalex.org/bot/image/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
 archive.bibalex.org/bot/text/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
picsearch
 www.picsearch.com/bot.htmltext/..psbot/0.1 (url)
 www.picsearch.com/bot.htmlimage/..psbot/0.1 (url)
fucinamediale
 labs.fucinamediale.comtext/..Mozilla/5.0 (compatible; ExperimentalWikiBot/1.0; url)
freebase
 www.freebase.comtext/..metaweb/Nutch-1.0-dev (url; help_at_metaweb.com)
hatena
 a.hatena.ne.jp/helptext/..Hatena Antenna/0.5 (url)
kalooga
 kalooga.com/crawlerimage/..Mozilla/5.0 (compatible; KaloogaBot; url)
 kalooga.com/crawlertext/..Mozilla/5.0 (compatible; KaloogaBot; url)
easybib
 content.easybib.com/autocite/text/..EasyBib AutoCite (url)
 content.easybib.com/autocite/application/jsonEasyBib AutoCite (url)
yoursite
 yoursite.com/botinfotext/..Mozilla/5.0 (compatible; YourCoolBot/1.0; url)
plagiarismcheck
 plagiarismcheck.orgapplication/jsonWikiCrawl 1.0b (url contact-mail: mail address )
netseer
 www.netseer.com/crawler.htmltext/..Mozilla/5.0 (compatible; NetSeer crawler/2.0; url; mail address )
bsurprised
 bsurprised.com/text/..BSurprised WikiBox 0.1.3 (url)
avantbrowser
 www.avantbrowser.comtext/..Avant Browser (url)
 www.avantbrowser.comtext/..Advanced Browser (url)
cdac
 www.cdac.intext/..abhishek/Nutch-0.9 (cdacp; url; mail address )
newsgator
 www.newsgator.comtext/..NewsGatorOnline/2.0 (url; 1 subscribers)
 www.newsgator.com/text/..FeedDemon/2.7 (url; Microsoft Windows XP)
feedshow
 www.feedshow.com | text/.. | FeedshowOnline (url)
 www.feedshow.com | text/.. | Feedshow/x.0 (url; 1 subscriber)
coralcdn
 coralcdn.org/ | text/.. | CoralWebPrx/0.1.20 (See url)
topsy
 labs.topsy.com/butterfly/ | text/.. | Mozilla/5.0 (compatible; Butterfly/1.0; url) Gecko/2009032608 Firefox/3.0.8
sentymetr
 sentymetr.pl/bot.html | application/json | Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
 sentymetr.pl/bot.html | text/.. | Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
rockpeaks
 www.rockpeaks.com/contact | text/.. | RockPeaks/0.1 (url)
jetbrains
 www.jetbrains.com/omea_reader/ | text/.. | JetBrains Omea Reader 2.0 Release Candidate 1 (url)
 www.jetbrains.com/omea_reader/ | text/.. | JetBrains Omea Reader 1.0.x (url)
parsijoo
 www.parsijoo.ir | text/.. | Mozilla/5.0 (compatible; mail address url)
 www.parsijoo.ir | image/.. | Mozilla/5.0 (compatible; mail address url)
alexa
 www.alexa.com/site/help/webmasters | text/.. | ia_archiver (url; mail address )
warebay
 www.warebay.com/bot.html | text/.. | Mozilla/5.0 (compatible; WBSearchBot/1.1; url)
matuschek
 www.matuschek.net/jobo.html | text/.. | JoBo/1.4 (url)
netnewswireapp
 netnewswireapp.com/mac/ | - | NetNewsWire/3.3.2 (Mac OS X; url; gzip-happy)
pagepeeker
 pagepeeker.com/robots/ | image/.. | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 KHTML Chrome/19.0.1042.0 Safari/535.21 PagePeeker/2.1; url
 pagepeeker.com/robots/ | text/.. | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 KHTML Chrome/19.0.1042.0 Safari/535.21 PagePeeker/2.1; url
pingdom
 www.pingdom.com | text/.. | Pingdom.com_bot_version_1.4_(url)
 www.pingdom.com/ | text/.. | Pingdom.com_bot_version_1.4_(url)
spinn3r
 spinn3r.com/robot | text/.. | Mozilla/5.0 (X11; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); url) Gecko/2010040121 Firefox/3.0.19
worldaswillandfarce
 worldaswillandfarce.com | text/.. | WordPress/3.5-alpha-21989; url
 worldaswillandfarce.com | text/.. | WordPress/3.5-alpha-21535; url
weblio
 www.weblio.jp/ | text/.. | Mozilla/5.0 (compatible; WeblioBot; url)
svglib
 svglib.org | image/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 svglib.org | - | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 svglib.org | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
fotopedia
 www.fotopedia.com | application/json | Picor (url)
simplepie
 simplepie.org | text/.. | SimplePie/1.2.1 (Feed Parser; url; Allow like Gecko) Build/20111015034325
 simplepie.org | application/xml | SimplePie/1.2.1 (Feed Parser; url; Allow like Gecko) Build/20111015034325
 simplepie.org | application/xml | SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
 simplepie.org | text/.. | SimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
metamagazine
 metamagazine.com | text/.. | WordPress/3.4.2; url
openindex
 www.openindex.io/en/webmasters/spider.html | text/.. | Mozilla/5.0 (compatible; OpenindexSpider; url)
semrush
 www.semrush.com/bot.html | text/.. | Mozilla/5.0 (compatible; SemrushBot/0.95; url)
embed
 support.embed.ly/ | image/.. | Mozilla/5.0 (compatible; Embedly/0.2; snap; url)
 support.embed.ly/ | text/.. | Mozilla/5.0 (compatible; Embedly/0.2; url)
zeebox
 www.zeebox.com | text/.. | Zeebox (url)
 www.zeebox.com | application/json | Zeebox (url)
netarkivet
 netarkivet.dk/webcrawler/ | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
ac
 www.ninjal.ac.jp/corpus_center/ulc/crawl-en | text/.. | Mozilla/5.0 (compatible; heritrix/3.1.1 url)
instapaper
 www.instapaper.com/ | text/.. | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.50 KHTML Version/5.1 Instapaper/4.0 (url)
tinyurl
 tinyurl.com/64t5n | text/.. | Rome Client (url) Ver: 0.9
wiktionary
 en.wiktionary.org/wiki/User:Rukhabot | application/json | Rukhabot/0.1 (url)
gsitecrawler
 gsitecrawler.com/ | text/.. | GSiteCrawler/v1.23 rev. 286 (url)
bnf
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.html | image/.. | Mozilla/5.0 (compatible; bnf.fr_bot; url)
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.html | text/.. | Mozilla/5.0 (compatible; bnf.fr_bot; url)
grapeshot
 www.grapeshot.co.uk/crawler.php | text/.. | Mozilla/5.0 (compatible; GrapeshotCrawler/2.0; url)
duckduckgo
 duckduckgo.com/duckduckbot.html | text/.. | DuckDuckBot/1.1; (url)
 duckduckgo.com/duckduckpreview.html | - | DuckDuckPreview/1.0; (url)
 duckduckgo.com/duckduckpreview.html | text/.. | DuckDuckPreview/1.0; (url)
friendofrenia
 friendofrenia.com/ | text/.. | User-Agent: FriendoFrenia (url)
 friendofrenia.com/ | application/json | User-Agent: FriendoFrenia (url)
veveo
 corporate.veveo.net/webmasters.html | text/.. | Mozilla/5.0 (compatible; Veveobot; url)
netvibes
 www.netvibes.com | text/.. | Netvibes (url)
trendiction
 www.trendiction.de/bot | text/.. | Mozilla/5.0 (Windows; Windows NT 6.0; en-GB; rv:1.0; trendictionbot0.5.0; trendiction search; url; please let us know of any problems; web at trendiction.com) Gecko/20071127 Firefox/3.0.0.11
nb
 www.nb.no/vevfangst | image/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 www.nb.no/vevfangst | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.4 url)
js-kit
 js-kit.com/ | text/.. | JS-Kit URL Resolver, url
southmumbairealestate
 southmumbairealestate.me | text/.. | WordPress/3.5-alpha-21989; url
 southmumbairealestate.me | text/.. | WordPress/3.5-alpha-21535; url
turnitin
 www.turnitin.com/robot/crawlerinfo.html | text/.. | TurnitinBot/2.1 (url)
blogbridge
 www.blogbridge.com/ | text/.. | BlogBridge 2.13 (url)
snarfware
 www.snarfware.com/ | text/.. | Snarfer/0.x.x (url)
backgroundswitcher
 www.backgroundswitcher.com/ | text/.. | John's Background Switcher 4.6 (url)
 www.backgroundswitcher.com/ | text/.. | John's Background Switcher 4.4 (url)
 www.backgroundswitcher.com/ | image/.. | John's Background Switcher 4.4 (url)
plagger
 plagger.org/ | text/.. | Plagger/0.x.xx (url)
feeds4all
 www.feeds4all.com/feedzcollector | text/.. | FeedZcollector v1.x (Platinum) url
orcabrowser
 www.orcabrowser.com | text/.. | Orca Browser (url)
ponderer
 ponderer.org/download/annotate_google.user.js | text/.. | annotate_google; url
nemui
 mozshot.nemui.org/ | text/.. | Mozilla/5.0 (Gecko/20070310 Mozshot/0.0.20070628; url)
winpodder
 winpodder.com | text/.. | WinPodder (url)
graemef
 graemef.com | text/.. | NewsGator FetchLinks extension/0.2.0 (url)
dbpedia
 dbpedia.org | text/.. | DBpedia Sync - url - mail address
zipcommander
 www.zipcommander.com/ | text/.. | 1st ZipCommander (Net) - url
microsystools
 www.microsystools.com/products/sitemap-generator/ | text/.. | A1 Sitemap Generator/3.5.1 (url) miggibot
example
 example.com/MyCoolToolPage/ | application/vnd.php.serialized | User-Agent: MyCoolTool (url)
 example.com/MyCoolToolPage/ | application/json | MyCoolTool (url)
muso
 www.muso.com | text/.. | Mozilla/5.0 (compatible; musobot/1.0; mail address ; url)
kula
 kula.jp/endo | text/.. | endo/1.0 (Mac OS X; ppc i386; url)
Anonymouse
 Anonymouse.org/ | image/.. | url (Unix)
 Anonymouse.org/ | text/.. | url (Unix)
rssreader
 www.rssreader.com | text/.. | RssReader/1.0.xx.x (url) Microsoft Windows NT 5.1.2600.0
it-influentials
 search.it-influentials.com/bot.htm | text/.. | Mozilla/5.0 (compatible;FindITAnswersbot/1.0;url)
search
 www.search.ch/rim.html | text/.. | UltraSpider3000/1.0 (url)
ranchero
 ranchero.com/netnewswire/ | text/.. | NetNewsWire/2.x (Mac OS X; url)
searchtechnologies
 www.searchtechnologies.com | text/.. | Mozilla/5.0 (compatible; heritrix/1.14.3 url)
seebot
 seebot.org | text/.. | Lynx/2.8 (;url)
localhost
 localhost/bot.php | text/.. | Mozilla/5.0 (compatible; g-ara.com-bot; url)
 localhost/wordpress | text/.. | WordPress/3.4.2; url
mobileproxy
 mobileproxy.mobi | text/.. | Mozilla/5.0 (compatible; MobileSurf; url)
sonyericsson
 www.sonyericsson.com/UAprof/R800xR301.xml | image/.. | Mozilla/5.0 (Linux; Android/2.3.3; en-us; SonyEricssonR800xurl Build/3.0.1.E.1.44) AppleWebKit/533.1 KHTML Version/4.0 Mobile Safari/533.1
zootycoon
 www.zootycoon.com | text/.. | Zoo Tycoon 2 Client -- url
rcdtokyo
 www.rcdtokyo.com/pc2m/ | text/.. | Mozilla/5.0 (compatible; PEAR HTTP_Request class; url)
rssbandit
 www.rssbandit.org | text/.. | RssBandit/1.5.0.10 (WinNT 5.1.2600.0; url) (WinNT 5.1.2600.0; )
grid-son
 grid-son.com | application/json | url
timewe
 timewe.net | text/.. | CDR/1.7.1 Simulator/0.7(url) Profile/MIDP-1.0 Configuration/CLDC-1.0
wotbox
 www.wotbox.com/bot/ | text/.. | Wotbox/2.01 (url)
120217.76 total
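
The rows above are agents whose string carries a web address. A minimal sketch of how such a check might look is given here. The report does not publish its matching rule, so the pattern (a scheme prefix or a leading `www.`) and the function name `find_web_address` are assumptions. The report also shows the address as the placeholder `url`, so the sample strings in the usage note put an address back in for illustration.

```python
import re

# Assumed pattern: an address starts with "http://", "https://" or "www."
# and runs until whitespace, a semicolon or a parenthesis.
WEB_ADDRESS = re.compile(r"(?:https?://|www\.)[^\s;()]+", re.IGNORECASE)


def find_web_address(user_agent):
    """Return the first web address found in the user agent string, or None."""
    match = WEB_ADDRESS.search(user_agent)
    return match.group(0) if match else None
```

For example, `find_web_address("Webwiki Search Engine Bot - www.webwiki.de")` returns `"www.webwiki.de"`, while an ordinary browser string without an address returns `None`. A bare host name without `www.` or a scheme would be missed by this pattern.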

Page requests for probable crawlers, recognized by keyword
Count
x 1000
Agent string
  Mime type (count ≥ 3)
PythonWikipediaBot/1.0
 application/json
 application/xml
 text/..
 -
 application/x-www-form-urlencoded
 image/..
 application/pdf
spider
 text/..
 application/vnd.php.serialized
 -
 image/..
 application/json
MediaWikiCrawler-Google/2.0 ( mail address )
 text/..
 -
GoogleBot-Image/1.0
 text/..
 image/..
 -
LinkParser/2.0
 text/..
Mozilla/5.0 (compatible; Web CEO Online robot)
 text/..
 image/..
 -
 application/rsd+xml
 application/xml
 application/opensearchdescription+xml
 application/ogg
 application/x-external-editor
php wikibot classes
 application/vnd.php.serialized
 -
 text/..
 application/x-www-form-urlencoded
GoogleBot-Image/1.0
 text/..
 image/..
 -
 application/json
 application/vnd.php.serialized
AniBot/0.9 php/curl
 application/vnd.php.serialized
 -
 image/..
 text/..
Peachy MediaWiki Bot API Version 1.0
 application/vnd.php.serialized
Pywikipediabot/2.0
 application/json
 text/..
Mozilla/5.0 (Windows; Windows NT 5.1; zh-CN; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11; 360Spider
 text/..
 -
 application/json
 image/..
 application/xml
 application/ogg
 audio/midi
ClueBot/1.1
 application/vnd.php.serialized
Answersbot
 text/..
ClueBot/2.0
 application/vnd.php.serialized
Mozilla/5.0 MaboMwFramework/1.2 (w:de:MerlIwBot)
 text/..
tigerbot
 application/json
 text/..
gsa-crawler (Enterprise; T3-RZJKZN773WWKK; mail address )
 text/..
Wikipath Bot (email: mail address )
 application/json
Mozilla 5.0 (Apibot 0.32)
 application/vnd.php.serialized
 text/..
DotNetWikiBot/2.101 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 application/x-www-form-urlencoded
HTMLParser/2.0
 text/..
 -
Mozilla/5.0 (Windows; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 ( mail address )
 text/..
 -
 application/json
www.integromedb.org/Crawler
 text/..
 -
 image/..
 application/json
 application/xml
 application/ogg
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (Exabot-Thumbnails)
 image/..
 text/..
 application/json
 -
wikiwix-bot-3.0
 text/..
 -
DigitalsmithsBot
 text/..
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
 application/json
 application/vnd.php.serialized
 -
 image/..
plantspedia data crawler
 text/..
DotNetWikiBot/2.101 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
 -
MediaWiki::Bot/3.2.6
 application/json
mail address
 application/vnd.php.serialized
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5
 application/x-www-form-urlencoded
 application/json
DotNetWikiBot/2.81 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
 application/ogg
DotNetWikiBot/2.100 (Unix 2.6.32.38; )
 text/..
AnomieBOT 1.0 (TagDater; see [[User:AnomieBOT]])
 application/json
Tawbot (public svn release; plwiki)
 text/..
Wikibot/2.0.1 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
 -
python-wikitools/1.2 (User:BernsteinBot)
 application/json
gsa-crawler (Enterprise; T3-KELLT6PT9WSQZ; mail address )
 text/..
 -
FAST Search Web Crawler 14.0.0325.0000
 text/..
 -
 application/xml
NexiSpider/Nutch-1.5.1
 text/..
 -
Webwiki Search Engine Bot - www.webwiki.de
 text/..
bot cachetickler /home/mwalker/frontend_tester/p14_1
 text/..
 -
Mozilla/5.0 (compatible; Mail.RU_Bot/2.0)
 text/..
 image/..
 -
 application/vnd.php.serialized
CorenSearchBot/1.7 en libwww-perl/6.04
 text/..
WikiPlaysBot
 text/..
Mozilla/5.0 (compatible; Mail.RU/3.14) CrawlMl
 text/..
 -
TrueKnowledgeBot bot mail address >
 application/vnd.php.serialized
 application/xml
 text/..
SineBot/1.5.19(User:SineBot)
 application/vnd.php.serialized
 text/..
MediaWiki::Bot/3.005002
 application/json
DotNetWikiBot/2.100 (Microsoft Windows NT 6.2.8400.0; )
 text/..
 application/xml
AnomieBOT 1.0 (OrphanReferenceFixer; see [[User:AnomieBOT]])
 application/json
OrlodrimBot/1.0
 text/..
 -
 application/x-www-form-urlencoded
Opera/8.01 (J2ME/MIDP; MXit WebBot/6.2.1/1.8.5.168;) Opera Mini/3.1
 image/..
 text/..
 -
EasouSpider
 text/..
 application/json
 -
 image/..
 application/ogg
DotNetWikiBot/2.100 (Unix 5.10.0.0; )
 text/..
 application/xml
Wikibot/2.0 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
 -
GermCrawler
 application/json
 text/..
DotNetWikiBot/2.97 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
JavaCrawler/1.1
 text/..
 image/..
quaba-spider
 text/..
Test Webbot
 text/..
SchoolReviewNetworkWikiBot
 application/json
YBot/0.1
 application/vnd.php.serialized
AdMedia bot
 text/..
 -
 image/..
Bot
 text/..
AnomieBOT 1.0 (FlagIconRemover; see [[User:AnomieBOT]])
 application/json
HosiryuhosiBot IRC-RecentChanges Checker
 text/..
 application/x-www-form-urlencoded
AnomieBOT 1.0 (TemplateSubster; see [[User:AnomieBOT]])
 application/json
Spider
 application/json
 text/..
AnomieBOT 1.0 (PERTableUpdater; see [[User:AnomieBOT]])
 application/json
 text/..
Mozilla/5.0 (compatible; Nigma.ru/3.0; mail address )
 text/..
RobBot ( mail address )
 application/vnd.php.serialized
wikbotlite/1.60 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
MyCuteBot/0.1
 text/..
 application/json
 application/vnd.php.serialized
Twitterbot/1.0
 text/..
 image/..
 -
 application/pdf
DotNetWikiBot/2.100 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/x-www-form-urlencoded
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r1)
 application/json
Noheto crawler
 text/..
SiocWikiBot/1.0
 application/vnd.php.serialized
 text/..
DotNetWikiBot/2.96 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
mySpider/Nutch-1.5.1
 text/..
 -
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
 text/..
SearchBot
 text/..
bot: fundraising-test
 image/..
 text/..
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
 image/..
 text/..
 application/json
COIBot/1.00
 text/..
UCMore Crawler App
 text/..
 -
wikbot/1.60 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
 -
Mozilla/5.0 (X11; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot
 text/..
SurakWare MediaWiki Bot/1.0
 text/..
~Bot ([[:fr:w:User:TildeBot]] by [[:fr:w:User:Alphos]] mail address )
 text/..
HRoestBot, de-wikipedia using pywikipedia framework
 text/..
 application/json
bot cachetickler /home/mwalker/frontend_tester/p14_1_single
 text/..
 -
AnomieBOT 1.0 (BAGBot; see [[User:AnomieBOT]])
 application/json
 text/..
bitlybot
 text/..
 -
 image/..
DotNetWikiBot/2.101 (Unix 3.2.0.31; )
 text/..
GoogleBot
 text/..
 image/..
 -
XLinkBot/1.00
 text/..
COIBot/2.0
 text/..
DotNetWikiBot/2.92 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
Zing-BottaBot/2.0
 text/..
TVersity Media Robot
 text/..
BOT: Rechtschreibung & Grammatik
 application/json
Wikibot/2.0.1 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
Opera/8.01 (J2ME/MIDP; MXit WebBot/5.9.8/1.8.5.168;) Opera Mini/3.1
 image/..
 text/..
 -
wikbot/1.60 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
Phantom.js bot
 image/..
 text/..
Goalkeeperbot(User:Beetstra)/1.0
 text/..
Handelabra WikiBot
 application/vnd.php.serialized
 text/..
CorenSearchBot/1.7 en libwww-perl/6.02
 text/..
python-wikitools/1.2 (User:Mr.Z-bot)
 application/json
DotNetWikiBot/2.100 (Unix 3.0.0.12; )
 text/..
 application/xml
MaxPointCrawler/Nutch-1.1 (maxpoint.crawler at maxpointinteractive dot com)
 text/..
 -
TwynCatBot/0.1 (Contact: www.twyn.com)
 application/json
FTRF: Friendly robot/1.3
 text/..
LauschenBot/1.0 ( mail address )
 text/..
EarwigBot/0.2.dev.git4ff7612a (Python/2.7.3; https://github.com/earwig/earwigbot; mail address )
 application/json
 text/..
 application/x-www-form-urlencoded
DotNetWikiBot, edited by D. Rodionov/2.91 (Microsoft Windows NT 6.0.6002 Service Pack 2; )
 text/..
 application/xml
Metabot 0.1
 text/..
wikbotlite/1.60 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
iteco dummy crawler
 text/..
Mozilla/5.0 (Bgbot 0.5)
 text/..
BibBot/0.9 (urshofer.ch)
 text/..
tellit_rest_bot, contact mail address
 text/..
 application/x-wiki
Peachy MediaWiki Bot API Version 0.1beta
 application/vnd.php.serialized
Erel Bot
 text/..
Mozilla/5.0 (compatible; LucidWorks/; ; crawler at example dot com)
 text/..
 -
dtSearchSpider
 text/..
WPBot 1.0
 text/..
Mozilla/5.0 (compatible; UnisterBot; mail address )
 text/..
 image/..
AnomieBOT 1.0 (RandomPagePicker; see [[User:AnomieBOT]])
 application/json
DotNetWikiBot/2.101 (Unix 3.0.0.12; )
 text/..
 application/xml
Mozilla/5.0 QunarBot/1.0
 text/..
 image/..
Nutch Science Crawler/Nutch-1.5
 text/..
 application/ogg
 -
WikiBot/0.1
 text/..
 image/..
ClueBot/2.0 (ClueBot NG Report Interface)
 text/..
Wikibot/2.0 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
Crawler/Nutch-1.4
 text/..
 -
 application/pdf
 application/ogg
My Bot
 text/..
 image/..
Platonides bot for WLM
 application/json
Perl's Analytic Bot/1.0
 application/json
My Nutch Spider - test/Nutch-1.5.1
 text/..
 -
Mozilla/5.0 (compatible; Tbot/1.0;)
 text/..
infraEnterprise v8 Web Crawler
 -
 text/..
FetcherBot/0.1
 text/..
 image/..
scrapybot/1.0
 text/..
wAPI/1.1 (Bot: NoomBot Operator: Noommos Contact: mail address )
 application/vnd.php.serialized
Surag Spider/Nutch-1.4
 text/..
 image/..
Mozilla/5.0 (compatible; FriendFeedBot/0.1; Http://friendfeed.com/about/bot; 388 subscribers; feed-id=3852576738117026533)
 application/xml
 -
Empedia Bot
 text/..
Geni ircpybot 1.0
 text/..
 application/json
 application/xml
AnomieBOT 1.0 (AFDMergeFromCleaner; see [[User:AnomieBOT]])
 application/json
python-wikitools/1.2 (User:LaraBot)
 application/json
SWAT Crawler. AGH University project. In case of problem contact: mail address Thanks.
 text/..
 -
 application/xml
MediaWiki::Bot 3.1.5
 application/json
HTMLParser/1.6
 text/..
 -
My Nutch Spider/Nutch-1.4
 text/..
 application/json
26299.41 total
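
The keyword rule behind the table above (agent string contains bot, spider or crawl[er]) can be sketched as follows. Case-insensitive substring matching is an assumption, suggested by entries such as "AnomieBOT" and "EasouSpider". The name `looks_like_crawler` is hypothetical and not taken from the report's scripts.

```python
import re

# "crawl" also covers "crawler"; matching ignores case (assumed).
CRAWLER_KEYWORDS = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def looks_like_crawler(user_agent):
    """Return True when the agent string contains one of the crawler keywords."""
    return CRAWLER_KEYWORDS.search(user_agent) is not None
```

With this sketch, `looks_like_crawler("FAST Search Web Crawler 14.0.0325.0000")` is `True`. Plain substring matching will also flag any agent that merely contains one of the terms inside a longer word.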

IP ranges: known IP ranges for Google are 64.233.[160.0-191.255], 66.249.[64.0-95.255], 66.102.[0.0-15.255], 72.14.[192.0-255.255],
74.125.[0.0-255.255], 209.85.[128.0-255.255], 216.239.[32.0-63.255] and a few minor other subranges
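
As an illustration, a membership test against the ranges listed above might look like the sketch below. It uses Python's standard `ipaddress` module. The sixth range is taken as 209.85.128.0-209.85.255.255, and the "few minor other subranges" are not covered because the report does not list them. The name `is_known_google_ip` is hypothetical.

```python
import ipaddress

# First and last address of each range named in the report.
GOOGLE_RANGES = [
    ("64.233.160.0", "64.233.191.255"),
    ("66.249.64.0", "66.249.95.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("72.14.192.0", "72.14.255.255"),
    ("74.125.0.0", "74.125.255.255"),
    ("209.85.128.0", "209.85.255.255"),
    ("216.239.32.0", "216.239.63.255"),
]

# Convert each first/last pair into one or more CIDR networks.
GOOGLE_NETWORKS = [
    network
    for first, last in GOOGLE_RANGES
    for network in ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
    )
]


def is_known_google_ip(ip):
    """Return True when the IPv4 address falls inside one of the listed ranges."""
    address = ipaddress.IPv4Address(ip)
    return any(address in network for network in GOOGLE_NETWORKS)
```

A request whose agent string says GoogleBot but whose address fails this test would fall into the "others" group of the color coding described at the top of the report.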

Errata: The WMF traffic logging service suffered from server capacity problems in Aug/Sep/Oct 2011.
Absolute traffic counts for October 2011 are approximately 7% too low.
Data loss occurred only during peak hours, so it may have affected traffic from different parts of the world to different degrees,
and may also have skewed relative figures such as share of traffic per browser or operating system.

From mid-September until late November, squid log records for mobile traffic were in an invalid format.
Data could be repaired for logs from mid-October onwards. Older logs were no longer available.

In an unrelated server outage, precisely half of the traffic to WMF mobile sites was not counted from Oct 16 to Nov 29 (one of two load-balanced servers did not report traffic).
WMF has since improved server monitoring, so that similar outages should be detected and fixed much faster from now on.
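
Readers who want to compensate for the mobile outage in their own figures could double the counts inside the affected window, since one of two load-balanced servers went unreported. The sketch below assumes that both end dates are inclusive and that the load was split evenly. The function name and the (month, day) comparison are illustrative and are not part of the report's tooling.

```python
# Outage window from the errata, as (month, day); both ends assumed inclusive.
OUTAGE_FIRST = (10, 16)
OUTAGE_LAST = (11, 29)


def corrected_mobile_requests(observed, month, day):
    """Double the observed mobile count when the date lies in the outage window."""
    if OUTAGE_FIRST <= (month, day) <= OUTAGE_LAST:
        return observed * 2
    return observed
```

For instance, an observed count of 500 on Nov 1 would be corrected to 1000, while the same count on Oct 15 is left unchanged.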

Generated on Mon, Nov 5, 2012 16:25
Author: Erik Zachte (Web site)
Mail: ezachte@### (no spam: ### = wikimedia.org)
All data and images on this page are in the public domain.

Note: this page may load more slowly in Microsoft Internet Explorer than in other major browsers