Wikimedia Traffic Analysis Report - Crawler requests

Monthly requests or daily averages, for period: 1 Nov 2012 - 30 Nov 2012 (last 12 months)

 This analysis is based on a 1:1000 sampled server log (squids)

The following overview of crawler (aka bot) page requests is based on the user agent information that accompanies most server requests. Unfortunately this user agent information follows rather loosely defined guidelines.
Also, please bear in mind that the most popular crawler names may be somewhat overrepresented. This is the result of so-called user agent spoofing (where a requester supplies false credentials, e.g. to bypass web server filters).
GoogleBot seems to be a favorite for spoofing. Therefore requests that claim to be GoogleBot are split into two groups below: those from an IP address registered by Google are listed under "google", all others under "google?".
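
As a rough illustration of the split described above, the following sketch assigns a request to the "google" or "google?" group by checking the client address against a list of networks. The networks shown are placeholder documentation ranges, not Google's real address blocks, and the function name is hypothetical; the report does not say how its own check is implemented.

```python
import ipaddress
from typing import Optional

# Placeholder networks for illustration only (RFC 5737 documentation ranges).
# A real check would need the address blocks actually registered by Google.
ASSUMED_GOOGLE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def googlebot_group(user_agent: str, client_ip: str) -> Optional[str]:
    """Return 'google', 'google?' or None for a single request."""
    # Only requests that claim to be GoogleBot are of interest here.
    if "googlebot" not in user_agent.lower():
        return None
    addr = ipaddress.ip_address(client_ip)
    # The claim is accepted only if the address lies in a listed network.
    if any(addr in net for net in ASSUMED_GOOGLE_NETWORKS):
        return "google"
    return "google?"
```

A request with a GoogleBot user agent from an address outside the listed networks would land in the "google?" group, which is where spoofed requests are expected to show up.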

For this report page requests are considered to be issued by a crawler in two cases:
1 The user agent string contains a web address. Only crawlers should have that, but there are some false positives, where a browser sends a user agent string with a web address (caused by an ill-behaved plug-in; the main offenders have been eliminated).
2 The user agent string contains the term bot, spider or crawl[er].
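
The two rules above can be sketched as a small classifier. This is a minimal illustration, not the report's actual script: the regular expression used to recognise a web address is an assumption, and the elimination of known false positives mentioned under rule 1 is not modelled.

```python
import re

# Rule 1: a web address inside the user agent string (assumed pattern).
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)
# Rule 2: the terms bot, spider or crawl[er], matched case-insensitively.
TERM_PATTERN = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Return True if either rule marks the user agent as a crawler."""
    return bool(URL_PATTERN.search(user_agent) or TERM_PATTERN.search(user_agent))
```

For example, `is_crawler("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)")` is true under both rules, while an ordinary browser string without a web address or one of the three terms is not classified as a crawler.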

In total, 81,021,970 page requests per day (mime type text/html only) are considered crawler requests, out of 517,521,530 external requests, which is 15.7%.
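
The share quoted above follows directly from the two daily totals. The short sketch below repeats the arithmetic; the function name is hypothetical.

```python
def crawler_share(crawler_requests: int, external_requests: int) -> float:
    """Return crawler requests as a fraction of all external requests."""
    return crawler_requests / external_requests


# Daily totals as stated in the report.
share = crawler_share(81_021_970, 517_521_530)
print(f"{share:.1%}")  # prints 15.7%
```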

Page requests for crawlers that specify a URL in the agent string
Columns: Count (x 1000), Secondary domain (~site) name, URL, Mime type, User agent
google
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/bot.htmltext/..SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 www.google.com/feedfetcher.htmlimage/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 www.google.com/feedfetcher.html-FeedFetcher-Google; (url)
 www.google.com/feedfetcher.htmlapplication/xmlFeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortografia4)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 desktop.google.com/-Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~cloudcrawling)
 code.google.com/p/crawler4j/text/..crawler4j (url)
 www.google.com/feedfetcher.htmltext/..FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: rarplayer)
 www.google.com/feedfetcher.htmlapplication/jsonMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 desktop.google.com/text/..Mozilla/5.0 (compatible; Google Desktop/5.9.1005.12335; url)
 code.google.com/appenginetext/..WikiBot/0.1 AppEngine-Google; (url; appid: newikipedia)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: ortopedianew)
 code.google.com/appengineimage/..Offline Mobile Wiki (Tel:44 141 334 5472, mail address ) AppEngine-Google; (url; appid: s~wiki2go-hrd)
 www.google.com/feedfetcher.htmlapplication/xmlMozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appengineapplication/jsonAppEngine-Google; (url; appid: s~redconceptual)
 code.google.com/appenginetext/..Mozilla/5.0 (Windows; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 AppEngine-Google; (url; appid: s~fonetika3)
 code.google.com/appengineapplication/xmlAppEngine-Google; (url; appid: wikipedia-raw)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wikien4)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebdl)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; documents; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki2)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki4)
 code.google.com/appenginetext/..Offline Mobile Wiki (Tel:44 141 334 5472, mail address ) AppEngine-Google; (url; appid: s~wiki2go-hrd)
 docs.google.comimage/..Mozilla/5.0 (compatible; GoogleDocs; apps-presentations; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki3)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~wikigraph2)
 www.google.com/bot.htmlNONE/wikipedia- Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/feedfetcher.htmltext/..Mozilla/5.0 (compatible) FeedFetcher-Google;(url)
 code.google.com/appenginetext/..Python-urllib/2.5 AppEngine-Google; (url; appid: s~isnt-it)
 www.google.com/coop/cse/creftext/..FeedFetcher-Google-CoOp; (url)
 www.google.com/bot.html-GoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~francetiki)
 code.google.com/appenginetext/..www.productontology.org/1.0 (Contact: mail address ) AppEngine-Google; (url; appid: gr4bing)
 desktop.google.com/application/xmlMozilla/5.0 (compatible; Google Desktop/5.9.911.3589; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: usawebproxy0)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: boxapp)
 www.google.com/bot.htmlapplication/oggMozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appenginetext/..Wiki.java 0.27 AppEngine-Google; (url; appid: wikipediatools)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: d24-img)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: pakgalaxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~kasumiremix)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~theunblock)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: kires-roxy)
 www.google.com/feedfetcher.html-Mozilla/5.0 (compatible) FeedFetcher-Google; (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~hr-pulsesubscriber)
 code.google.com/appengineapplication/jsonMWBOT GAE Edition AppEngine-Google; (url; appid: philip-bot)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~drizzlprox)
 desktop.google.com/image/..Mozilla/5.0 (compatible; Google Desktop/5.9.911.3589; url)
 www.google.com/bot.htmlapplication/pdfMozilla/5.0 (compatible; GoogleBot/2.1; url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: threewiki)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: worldwide-propaganda)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: my-api)
 www.google.com/bot.htmlapplication/pdfGoogleBot/2.1 (url)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~keytanwiki1)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: proxy-devakishor)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: toom16-10)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~japantiki)
 code.google.com/appengineimage/..AppEngine-Google; (url; appid: usawebproxy0)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: your-zone)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: dustbunnytycoonmonitor)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: s~proxyseekkety)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: wmhsonline)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: mehproxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: 114proxy)
 code.google.com/appenginetext/..AppEngine-Google; (url; appid: abdulfat)
facebook
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phpimage/..facebookexternalhit/1.1 (url)
 www.facebook.com/externalhit_uatext.phptext/..facebookexternalhit/1.1 (url)
 developers.facebook.comimage/..facebookplatform/1.0 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.1 (url)
 www.facebook.com/externalhit_uatext.php-facebookexternalhit/1.0 (url)
 www.facebook.com/externalhit_uatext.phpapplication/jsonfacebookexternalhit/1.1 (url)
 developers.facebook.comtext/..facebookplatform/1.0 (url)
 developers.facebook.com-facebookplatform/1.0 (url)
bing
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htm-Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmimage/..Mozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/jsonMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmapplication/vnd.php.serializedMozilla/5.0 (compatible; bingbot/2.0; url)
 www.bing.com/bingbot.htmtext/..Mozilla/5.0 (compatible; bingbot/2.0; url) ASProxy/5.5b3
google?
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmltext/..GoogleBot/2.1 (url)
 www.google.com/bot.html-Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlimage/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/vnd.php.serializedMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.htmlapplication/jsonMozilla/5.0 (compatible; GoogleBot/2.1; url)
 www.google.com/bot.html-DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (iPhone; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 KHTML Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; GoogleBot-Mobile/2.1; url)
 www.google.com/bot.html-GoogleBot/2.1 (url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/3.1.2-SNAPSHOT-20121112.142015 url)
 www.google.com/bot.htmlapplication/pdfGoogleBot/2.1 (url)
 www.google.com/bot.htmltext/..Mozilla/5.0 (compatible; GoogleBot/2.1; url)
yahoo
 help.yahoo.com/help/us/ysearch/slurpimage/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.com/help/us/ysearch/slurptext/..Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 help.yahoo.com/help/us/ysearch/slurpapplication/jsonMozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRW/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmlimage/..'Mozilla/5.0 (compatible; Y!J SearchMonkey/1.0 (Y!J-AGENT; url))'
 developer.yahoo.com/yql/providertext/..Mozilla/5.0 (compatible; Yahoo Pipes 2.0; url) Gecko/20090729 Firefox/3.5.2
 help.yahoo.com/help/us/ysearch/slurp-Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; url)
 help.yahoo.co.jp/help/jp/search/indexing/indexing-15.htmltext/..Y!J-BRT/1.0 crawler (url)
 help.yahoo.com/help/us/ysearch/slurpapplication/xmlMozilla/5.0 (compatible; Yahoo! Slurp;url)
yandex
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/bots-Mozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botsapplication/jsonMozilla/5.0 (compatible; YandexBot/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImageResizer/2.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexImages/3.0; url)
 yandex.com/botstext/..Mozilla/5.0 (compatible; YandexNews/3.0; url)
 yandex.com/botsimage/..Mozilla/5.0 (compatible; YandexBot/3.0; url)
baidu
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.html-Mozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmltext/..Mozilla/5.0 (Linux;u;Android/2.3.7;zh-cn;) AppleWebKit/533.1 (KHTML,like Gecko) Version/4.0 Mobile Safari/533.1 (compatible; url)
 www.baidu.com/search/spider.htmlapplication/jsonMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider-image(url)
 www.baidu.com/search/spider.htmimage/..Baiduspider-image(url)
 www.baidu.com/search/spider.htmlapplication/xmlMozilla/5.0 (compatible; Baiduspider/2.0; url)
 www.baidu.com/search/spider.htmtext/..Baiduspider(url)
naver
 help.naver.com/robots/text/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/-Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/image/..Yeti/1.0 (NHN Corp.; url)
 help.naver.com/robots/text/..Yeti/1.1 (NHN Corp.; url)
 help.naver.com/robots/application/jsonYeti/1.0 (NHN Corp.; url)
msn
 search.msn.com/msnbot.htmtext/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmimage/..msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmtext/..msnbot-Products/1.0 (url)
 search.msn.com/msnbot.htmtext/..msnbot-UDiscovery/2.0b (url)
 search.msn.com/msnbot.htmtext/..msnbot/0.01 (url)
 search.msn.com/msnbot.htmtext/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htm-msnbot-media/1.1 (url)
 search.msn.com/msnbot.htmimage/..msnbot/2.0b (url)
 search.msn.com/msnbot.htmimage/..msnbot-NewsBlogs/2.0b (url)
 search.msn.com/msnbot.htm-msnbot/2.0b (url)
cibra
 cibra.de/text/..CiBra Data Collector (url)
ahrefs
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/4.0; url)
 ahrefs.com/robot/text/..Mozilla/5.0 (compatible; AhrefsBot/3.1; url)
 ahrefs.com/robot/-Mozilla/5.0 (compatible; AhrefsBot/4.0; url)
 ahrefs.com/robot/application/jsonMozilla/5.0 (compatible; AhrefsBot/4.0; url)
 ahrefs.com/robot/application/oggMozilla/5.0 (compatible; AhrefsBot/4.0; url)
80legs
 www.80legs.com/webcrawler.htmltext/..Mozilla/5.0 (compatible; 008/0.83; url) Gecko/2008032620
genieo
 www.genieo.com/webfilter.htmltext/..Mozilla/5.0 (compatible; Genieo/1.0 url)
 www.genieo.com/webfilter.htmlapplication/xmlMozilla/5.0 (compatible; Genieo/1.0 url)
 www.genieo.com/webfilter.htmlimage/..Mozilla/5.0 (compatible; Genieo/1.0 url)
sblog
 fulltext.sblog.cz/screenshot/image/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/text/..SeznamBot/3.0 (url)
 fulltext.sblog.cz/screenshot/text/..Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0; url)
 fulltext.sblog.cz/-SeznamBot/3.0 (url)
finecomb
 finecomb.com/-api/1.1 (url; mail address )
 finecomb.com/application/jsonapi/1.1 (url; mail address )
php
 pear.php.net/application/vnd.php.serializedPEAR HTTP_Request class ( url )
 pear.php.net/text/..PEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2text/..HTTP_Request2/0.5.2 (url) PHP/5.2.17
 pear.php.net/image/..PEAR HTTP_Request class ( url )
 pear.php.net/application/xmlPEAR HTTP_Request class ( url )
 pear.php.net/package/http_request2application/xmlHTTP_Request2/2.0.0 (url) PHP/5.3.8
 pear.php.net/package/http_request2text/..HTTP_Request2/2.1.1 (url) PHP/5.3.2-1ubuntu4.17
 pear.php.net/package/http_request2image/..HTTP_Request2/2.1.1 (url) PHP/5.3.2-1ubuntu4.15
youdao
 www.youdao.com/help/webmaster/spider/text/..Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 www.youdao.com/help/webmaster/spider/-Mozilla/5.0 (compatible; YoudaoBot/1.0; url; )
 toolbar.youdao.com/image/..Youdao Toolbar (url)
wwwgogetpapers
 wwwgogetpapers.com/application/jsonUser-Agent: GoGetPapersBot (url)
echonest
 the.echonest.com/reader/application/xmlnestReader/0.3 (discovery; url; reader at echonest.com)
 the.echonest.com/reader/text/..nestReader/0.3 (discovery; url; reader at echonest.com)
www.
 www.text/..GoogleBot/2.1 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot-Image/1.0 ( urlGoogleBot.com/bot.html)
 www.text/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
 www.image/..GoogleBot/2.1 (urlGoogleBot.com/bot.html)
wordpress
 josefboberg.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 greatriversofhope.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 tsjok45.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 klausgauger.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 christiannoob.wordpress.comtext/..WordPress/3.5-RC2-22949; url
 warsclerotic.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 nelsonmcbs.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 02varvara.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 lesliebrodie.wordpress.comtext/..WordPress/3.5-alpha-21989; url
 captaindemocracy.wordpress.comtext/..WordPress/3.5-alpha-21989; url
zum
 help.zum.com/inquirytext/..ZumBot/1.0 (ZUM Search; url)
 help.zum.com/inquiryimage/..ZumBot/1.0 (ZUM Search; url)
soso
 help.soso.com/webspider.htmtext/..Mozilla/5.0(compatible; Sosospider/2.0; url)
 help.soso.com/webspider.htm-Mozilla/5.0(compatible; Sosospider/2.0; url)
 help.soso.com/webspider.htmapplication/jsonMozilla/5.0(compatible; Sosospider/2.0; url)
yioop
 www.yioop.com/bot.phptext/..Mozilla/5.0 (compatible; YioopBot; url)
 www.yioop.com/bot.phpimage/..Mozilla/5.0 (compatible; YioopBot; url)
coccoc
 help.coccoc.vn/text/..coccoc/1.0 (url)
 help.coccoc.vn/-coccoc/1.0 (url)
wikipedia
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.19.0 url
 en.wikipedia.org/wiki/User:NicoV/Wikipedia_Cleaner/Documentationtext/..WPCleaner (url)
 en.wikipedia.org/wiki/Wikipedia:Huggletext/..Huggle/2.1.18.0 url
 fr.wikipedia.org/wiki/Utilisateur:Salebotapplication/jsonSalebot, see url (uses Perl MediaWiki::API)
exabot
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0; url)
 www.exabot.com/go/robot-Mozilla/5.0 (compatible; Exabot/3.0; url)
 www.exabot.com/go/robottext/..Mozilla/5.0 (compatible; Exabot/3.0 (BiggerBetter); url)
majestic12
 www.majestic12.co.uk/bot.php?text/..Mozilla/5.0 (compatible; MJ12bot/v1.4.3; url)
sogou
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07-Sogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07image/..Sogou Pic Spider/3.0(url)
 www.sogou.com/docs/help/webmasters.htm#07application/jsonSogou web spider/4.0(url)
 www.sogou.com/docs/help/webmasters.htm#07text/..Sogou Pic Spider/3.0(url)
yacy
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-32-generic; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (webportal-global; amd64 Linux 3.2.0-0.bpo.3-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 3.2.0-32-generic; java 1.6.0_24; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.1.10-1.16-default; java 1.6.0_24; Europe/de) url
 yacy.net/bot.html-yacybot (freeworld-global; amd64 Linux 3.2.0-32-generic; java 1.6.0_24; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld-global; amd64 Linux 2.6.32-45-server; java 1.6.0_26; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-23-lowlatency; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.6.6-1-ARCH; java 1.7.0_03; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.6.4-1-ARCH; java 1.7.0_09; Europe/fr) url
 yacy.net/bot.html-yacybot (freeworld-global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_26; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows Server 2008 R2 6.1; java 1.7.0_07; Europe/es) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-279.11.1.el6.x86_64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.5.0-19-generic; java 1.7.0_09; Europe/en) url
 yacy.net/bot.html-yacybot (freeworld/global; amd64 Linux 3.6.4-1-ARCH; java 1.7.0_09; Europe/fr) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.5.3-1-desktop; java 1.7.0_07; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; i386 Linux 3.6.6-gnu; java 1.7.0_09; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; x86_64 Mac OS X 10.8.2; java 1.6.0_37; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-23-generic; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows NT (unknown) 6.2; java 1.7.0_04; America/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_26; Europe/de) url
 yacy.net/bot.html-yacybot (freeworld/global; amd64 Linux 3.3.8-gentoo; java 1.6.0_33; UTC/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-3-amd64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 2.6.32-44-server; java 1.6.0_26; Europe/de) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.3.8-gentoo; java 1.6.0_33; UTC/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Windows Server 2008 R2 6.1; java 1.7.0_04; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.4.11-2.16-desktop; java 1.7.0_09; Europe/nl) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.6.6-1-ARCH; java 1.7.0_03; Europe/fr) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.13-grsec-xxxx-grs-ipv6-64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.html-yacybot (freeworld/global; amd64 Linux 2.6.32-279.11.1.el6.x86_64; java 1.6.0_24; Europe/en) url
 yacy.net/bot.htmltext/..yacybot (freeworld/global; amd64 Linux 3.2.0-32-generic; java 1.7.0_09; Europe/de) url
discoveryengine
 discoveryengine.com/discoverybot.htmltext/..Mozilla/5.0 (compatible; discoverybot/2.0; url)
 discoveryengine.com/discoverybot.html-Mozilla/5.0 (compatible; discoverybot/2.0; url)
archive
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_bottext/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120116.200628 url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; archive.org_bot url)
 www.archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; heritrix/3.1.1-SNAPSHOT-20120116.200628 url)
 www.archive.org/details/archive.org_bot-Mozilla/5.0 (compatible; archive.org_bot url)
 archive.org/details/archive.org_botimage/..Mozilla/5.0 (compatible; heritrix/3.1.2-SNAPSHOT-20121013.132750 url)
blekko
 blekko.com/about/blekkobottext/..Mozilla/5.0 (compatible; Blekkobot; ScoutJet; url)
 blekko.com/about/blekkobot-Mozilla/5.0 (compatible; Blekkobot; ScoutJet; url)
toolserver
 wiki.toolserver.org/view/GeoHacktext/..Geohack (url)
 toolserver.org/~dispenser/text/..DispensersTools (url)
 toolserver.org/~dispenser/text/..CacheThumbs/1.2 (url)
 toolserver.org/~dispenser/image/..CacheThumbs/1.2 (url)
 toolserver.org/~dispenser/application/jsonDispensersTools (url)
 toolserver.org/~para/cgi-bin/kmlexporttext/..url libwww-perl/6.02
 toolserver.org/~platonides/catdown/image/..catdown Google_Art_Project (url)
wikidict
 www.wikidict.detext/..url
jike
 shoulu.jike.com/spider.htmltext/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.htmlimage/..Mozilla/5.0 (compatible; JikeSpider; url)
 shoulu.jike.com/spider.html-Mozilla/5.0 (compatible; JikeSpider; url)
bin-co
 www.bin-co.com/php/scripts/load/text/..BinGet/1.00.A (url)
 www.bin-co.com/php/scripts/load/application/vnd.php.serializedBinGet/1.00.A (url)
traslated
 mymemory.traslated.net/doc/text/..Mozilla/5.0 (MyMemory Bot url)
 mymemory.traslated.net/doc/-Mozilla/5.0 (MyMemory Bot url)
flipboard
 flipboard.com/browserproxyimage/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
 flipboard.com/browserproxytext/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxyapplication/jsonMozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.1; url)
 flipboard.com/browserproxyimage/..null (FlipboardProxy/1.1; url)
 flipboard.com/browserproxy-Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/0.0.5; url)
 flipboard.com/browserproxy-Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/1.1; url)
FeedBurner
 www.FeedBurner.comtext/..FeedBurner/1.0 (url)
SearchNearMe
 SearchNearMe.com/contact.phpapplication/vnd.php.serializedSearchNearMe (url)
 SearchNearMe.com/contact.phptext/..SearchNearMe (url)
archive-it
 archive-it.org/files/site-owners.htmlimage/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.html-Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
 archive-it.org/files/site-owners.htmltext/..Mozilla/5.0 (compatible; archive.org_bot; Archive-It; url)
okian
 www.okian.ro/text/..MyBot/1.0 (url)
daum
 tab.search.daum.net/aboutWebSearch.htmltext/..Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; url) Daumoa/3.0
enwp
 enwp.org/User:SDPatrolBottext/..SDPatrolBot (url)
 enwp.org/User:KingpinBottext/..KingpinBot (url)
 enwp.org/User:H3llkn0wz/WikiSharpAPItext/..WikiSharpAPI/0.3 url (C# .NET)
goo
 help.goo.ne.jp/contact/text/..goo wikipedia (url)
 goo.gl/7y4SXtext/..GoogleProducer; (url)
 help.goo.ne.jp/door/crawler.htmltext/..ichiro/3.0 (url)
 search.goo.ne.jp/option/use/sub4/sub4-1/-DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/mobile goo; url)
gnip
 www.gnip.com/text/..UnwindFetchor/1.0 (url)
 www.gnip.com/-UnwindFetchor/1.0 (url)
toshiba
 www.toshiba.co.jp/rdc/about/crawl_info.htmtext/..TosCrawler/Nutch-1.4 (url; ' mail address dot co dot jp')
 www.toshiba.co.jp/rdc/about/crawl_info.htmtext/..TosCrawler/Nutch-1.5.1 (url; ' mail address dot co dot jp')
kosmix
 www.kosmix.com/html/kosmos.htmlapplication/xmlMozilla/5.0(compatible;Kosmos/1.0;url)
plos
 alm.plos.orgapplication/jsonPLoS Article Level Metrics - url
yoursite
 yoursite.com/botinfotext/..Mozilla/5.0 (compatible; YourCoolBot/1.0; url)
xbmc
 www.xbmc.orgimage/..XBMC/11.0 Git:20120702-f3cd288 (iOS; 11.0.0 AppleTV2,1, Version 5.1.1 (Build 9B830); url)
 www.xbmc.orgimage/..XBMC/11.0 Git:20120321-14feb09 (Windows NT 6.1;WOW64;Win64;x64; url)
 www.xbmc.orgimage/..XBMC/11.0 Git:20120321-14feb09 (Windows NT 6.1; url)
 www.xbmc.orgtext/..XBMC/11.0 Git:20120702-f3cd288 (iOS; 11.0.0 AppleTV2,1, Version 5.1.1 (Build 9B830); url)
topsy
 labs.topsy.com/butterfly/text/..Mozilla/5.0 (compatible; Butterfly/1.0; url) Gecko/2009032608 Firefox/3.0.8
wikiglass
 wikiglass.comtext/..url : mail address
bibalex
 archive.bibalex.org/bot/text/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
 archive.bibalex.org/bot/image/..Mozilla/5.0 (compatible; archive.bibalex.org_bot; url)
mediawiki
 www.mediawiki.org/text/..MediaWiki OAI Harvester 0.2 (url)
github
 github.com/pauldix/typhoeus/tree/mastertext/..Typhoeus - url
 github.com/edsu/wikitweetsapplication/jsonwikitweets <url
 github.com/edsu/linkypediaapplication/jsonlinkpyediabot v0.1: url
 github.com/pauldix/feedzirra/tree/masterapplication/xmlfeedzirra url
 wiki.github.com/bixo/bixo/bixocrawlertext/..Mozilla/5.0 (compatible; pub-crawler; url; mail address )
apercite
 www.apercite.fr/robot/index.htmlimage/..Mozilla/5.0 (compatible; Apercite; url)
federatedmedia
 federatedmedia.nettext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
speaktoit
 www.speaktoit.comapplication/jsonSpeaktoit url
kalooga
 kalooga.com/crawlerimage/..Mozilla/5.0 (compatible; KaloogaBot; url)
 kalooga.com/crawlertext/..Mozilla/5.0 (compatible; KaloogaBot; url)
paper
 support.paper.li/entries/20023257-what-is-paper-litext/..Mozilla/5.0 (compatible; PaperLiBot/2.1; url)
embed
 support.embed.ly/image/..Mozilla/5.0 (compatible; Embedly/0.2; snap; url)
 support.embed.ly/text/..Mozilla/5.0 (compatible; Embedly/0.2; url)
wikimpress
 wikimpress.org/text/..Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
 wikimpress.org/-Mozilla/5.0 (compatible; Linux i686 (x86_64); de-DE; url>Wikimpress) Wikimpress/1.0
tineye
 tineye.com/crawler.htmlapplication/jsonTinEye/1.1 (url)
 tineye.com/crawler.htmlimage/..TinEye/1.1 (url)
 tineye.com/crawler.htmltext/..TinEye/1.1 (url)
emining
 emining.jp/text/..emBot-GalaBuzz/Nutch-1.0 (url; mail address )
 emining.jp/-emBot-GalaBuzz/Nutch-1.0 (url; mail address )
tiscali
 www.tiscali.it/text/..Mozilla/5.0 (compatible; IstellaBot/1.10.2 url)
plagiarismcheck
 plagiarismcheck.orgapplication/jsonWikiCrawl 1.0b (url contact-mail: mail address )
proximic
 www.proximic.com/info/spider.phptext/..Mozilla/5.0 (compatible; proximic; url)
textdigger
 textdigger.comtext/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
 textdigger.comimage/..Mozilla/5.0 (url) Gecko/20061208 Firefox/2.0.0.1
moviecus
 www.moviecus.com/botcontactinfo.phpapplication/yamlmoviecus bot (url)
drupal
 drupal.org/image/..Drupal (url)
 drupal.org/text/..Drupal (url)
 drupal.org/text/..User-Agent: Drupal (url)
muso
 www.muso.comtext/..Mozilla/5.0 (compatible; musobot/1.0; mail address ; url)
cognarius
 cognarius.comapplication/jsonAppsArlak/1.0 (url)
 cognarius.comtext/..AppsArlak/1.0 (url)
zipcode
 zipcode.ustext/..Mozilla/5.0 (compatible; YourCoolBot/1.0; url)
easybib
 content.easybib.com/autocite/text/..EasyBib AutoCite (url)
 content.easybib.com/autocite/application/jsonEasyBib AutoCite (url)
openindex
 www.openindex.io/en/webmasters/spider.htmltext/..Mozilla/5.0 (compatible; OpenindexSpider; url)
 www.openindex.io/en/webmasters/spider.html-Mozilla/5.0 (compatible; OpenindexSpider; url)
sf
 liferea.sf.net/text/..Liferea/1.x.x (Linux; es_ES.UTF-8; url)
 magpierss.sf.nettext/..MagpieRSS/0.7x (url)
 liferea.sf.net/text/..Liferea/0.x.x (Linux; en_US.UTF-8; url)
abonti
 www.abonti.comtext/..Mozilla/5.0 (compatible; Abonti/0.91 - url)
veveo
 corporate.veveo.net/webmasters.htmltext/..Mozilla/5.0 (compatible; Veveobot; url)
bbn
 www.bbn.com/text/..wolverine4j (url)
hatena
 a.hatena.ne.jp/helptext/..Hatena Antenna/0.5 (url)
picsearch
 www.picsearch.com/bot.htmltext/..psbot/0.1 (url)
 www.picsearch.com/bot.htmlimage/..psbot/0.1 (url)
worldaswillandfarce
 worldaswillandfarce.comtext/..WordPress/3.5-alpha-21989; url
zeebox
 www.zeebox.comtext/..Zeebox (url)
 www.zeebox.comapplication/jsonZeebox (url)
tweetmeme
 tweetmeme.com/text/..Mozilla/5.0 (compatible; TweetmemeBot/3.0; url)
bsurprised
 bsurprised.com/text/..BSurprised WikiBox 0.1.3 (url)
vermagerd
 www.vermagerd.be/wptext/..WordPress/3.4.2; url
rockpeaks
 www.rockpeaks.com/contacttext/..RockPeaks/0.1 (url)
alexa
 www.alexa.com/site/help/webmasterstext/..ia_archiver (url; mail address )
sistrix
 crawler.sistrix.net/text/..Mozilla/5.0 (compatible; SISTRIX Crawler; url)
avantbrowser
 www.avantbrowser.comtext/..Advanced Browser (url)
 www.avantbrowser.comtext/..Avant Browser (url)
netarkivet
 netarkivet.dk/webcrawler/text/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 netarkivet.dk/webcrawler/image/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
backgroundswitcher
 www.backgroundswitcher.com/text/..John's Background Switcher 4.6 (url)
 www.backgroundswitcher.com/image/..John's Background Switcher 4.4 (url)
 www.backgroundswitcher.com/text/..John's Background Switcher 4.4 (url)
netseer
 www.netseer.com/crawler.htmltext/..Mozilla/5.0 (compatible; NetSeer crawler/2.0; url; mail address )
 www.netseer.com/crawler.htmltext/..Mozilla/5.0 (compatible; Netseer crawler/2.0; url; mail address )
sentymetr
 sentymetr.pl/bot.htmlapplication/jsonMozilla/5.0 (compatible; SentymetrBot 1.0; url)
 sentymetr.pl/bot.htmltext/..Mozilla/5.0 (compatible; SentymetrBot 1.0; url)
newsgator
 www.newsgator.com/text/..FeedDemon/2.7 (url; Microsoft Windows XP)
 www.newsgator.comtext/..NewsGatorOnline/2.0 (url; 1 subscribers)
weblio
 www.weblio.jp/text/..Mozilla/5.0 (compatible; WeblioBot; url)
 www.weblio.jp/info/crawler.jspimage/..Mozilla/5.0 (compatible; Webliobot/0.1; url)
jetbrains
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 1.0.x (url)
 www.jetbrains.com/omea_reader/text/..JetBrains Omea Reader 2.0 Release Candidate 1 (url)
feedshow
 www.feedshow.comtext/..FeedshowOnline (url)
 www.feedshow.comtext/..Feedshow/x.0 (url; 1 subscriber)
fucinamediale
 labs.fucinamediale.comtext/..Mozilla/5.0 (compatible; ExperimentalWikiBot/1.0; url)
adsensecare
 adsensecare.comapplication/xmlWordPress/3.4.1; url
wetraveltheworld
 wetraveltheworld.euapplication/jsonWeTraveltheWorld.EU BOT 0.1 (url)
superfeedr
 superfeedr.comapplication/xmlSuperfeedr bot/2.0 url - Please get in touch if we are polling too hard.
 superfeedr.comtext/..Superfeedr bot/2.0 url - Please get in touch if we are polling too hard.
 superfeedr.com-Superfeedr bot/2.0 url - Please get in touch if we are polling too hard.
spinn3r
 spinn3r.com/robottext/..Mozilla/5.0 (X11; Linux x86_64; en-US; rv:1.9.0.19; aggregator:Spinn3r (Spinn3r 3.1); url) Gecko/2010040121 Firefox/3.0.19
warebay
 www.warebay.com/bot.htmltext/..Mozilla/5.0 (compatible; WBSearchBot/1.1; url)
simplepie
 simplepie.orgapplication/xmlSimplePie/1.2.1 (Feed Parser; url; Allow like Gecko) Build/20111015034325
 simplepie.orgtext/..SimplePie/1.2.1 (Feed Parser; url; Allow like Gecko) Build/20111015034325
 simplepie.orgapplication/xmlSimplePie/1.2 (Feed Parser; url; Allow like Gecko) Build/20090627192103
netvibes
 www.netvibes.comtext/..Netvibes (url)
example
 example.com/MyCoolToolPage/application/vnd.php.serializedUser-Agent: MyCoolTool (url)
 example.com/MyCoolTool/application/jsonMyCoolTool/1.1 (url; mail address )
semager
 www.semager.de/blog/semager-bots/text/..Mozilla/5.0 (compatible; Semager/1.4c; url)
customernet
 quaba.customernet.detext/..quaba-spider (url)
dataparksearch
 dataparksearch.org/bottext/..DataparkSearch/4.54-26052011 (url)
rcdtokyo
 www.rcdtokyo.com/pc2m/text/..Mozilla/5.0 (compatible; PEAR HTTP_Request class; url)
microsystools
 www.microsystools.com/products/website-download/text/..A1 Website Download/2.3.0 (url) miggibot
 www.microsystools.com/products/sitemap-generator/text/..A1 Sitemap Generator/4.1.0 (url) miggibot
zaal
 zaal.ir/bot.htmltext/..Mozilla/5.0 (compatible;ZaalBot/1.0.2; url) Gecko/20100101 Firefox/5.0
pingdom
 www.pingdom.com/text/..Pingdom.com_bot_version_1.4_(url)
 www.pingdom.comtext/..Pingdom.com_bot_version_1.4_(url)
fotopedia
 www.fotopedia.comapplication/jsonPicor (url)
linguee
 www.linguee.com/bottext/..Linguee Bot (url; mail address )
instapaper
 www.instapaper.com/text/..Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.50 KHTML Version/5.1 Instapaper/4.0 (url)
duckduckgo
 duckduckgo.com/duckduckbot.htmltext/..DuckDuckBot/1.1; (url)
 duckduckgo.com/duckduckpreview.htmltext/..DuckDuckPreview/1.0; (url)
 duckduckgo.com/duckduckpreview.html-DuckDuckPreview/1.0; (url)
turnitin
 www.turnitin.com/robot/crawlerinfo.htmltext/..TurnitinBot/2.1 (url)
parsijoo
 www.parsijoo.irtext/..Mozilla/5.0 (compatible; mail address url)
 www.parsijoo.irimage/..Mozilla/5.0 (compatible; mail address url)
stad
 stad.comtext/..Mozilla/5.0 (compatible; stadbot/1.0; url)
pagepeeker
 pagepeeker.com/robots/image/..Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 KHTML Chrome/19.0.1042.0 Safari/535.21 PagePeeker/2.1; url
 pagepeeker.com/robots/text/..Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 KHTML Chrome/19.0.1042.0 Safari/535.21 PagePeeker/2.1; url
 pagepeeker.com/robotsimage/..PagePeeker.com (info: url)
js-kit
 js-kit.com/text/..JS-Kit URL Resolver, url
bnf
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmlimage/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
 www.bnf.fr/fr/outils/a.dl_web_capture_robot.htmltext/..Mozilla/5.0 (compatible; bnf.fr_bot; url)
publicknowledgeproject
 alm.publicknowledgeproject.orgapplication/jsonArticle Level Metrics - url
sciencecard
 demo.sciencecard.orgapplication/jsonArticle Level Metrics - url
dbpedia
 dbpedia.orgtext/..DBpedia Sync - url - mail address
spotinfluence
 spotinfluence.comtext/..spotinfluence/Nutch-1.4 (Spot Influence crawler; url; mail address )
sourceforge
 linkchecker.sourceforge.net/text/..Mozilla/5.0 (compatible; LinkChecker/8.2; url)
netnewswireapp
 netnewswireapp.com/mac/-NetNewsWire/3.3.2 (Mac OS X; url; gzip-happy)
nb
 www.nb.no/vevfangstimage/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
 www.nb.no/vevfangsttext/..Mozilla/5.0 (compatible; heritrix/1.14.4 url)
friendofrenia
 friendofrenia.com/application/jsonUser-Agent: FriendoFrenia (url)
 friendofrenia.com/text/..User-Agent: FriendoFrenia (url)
sonyericsson
 www.sonyericsson.com/UAprof/R800xR301.xmlimage/..Mozilla/5.0 (Linux; Android/2.3.3; en-us; SonyEricssonR800xurl Build/3.0.1.E.1.44) AppleWebKit/533.1 KHTML Version/4.0 Mobile Safari/533.1
Anonymouse
 Anonymouse.org/image/..url (Unix)
 Anonymouse.org/text/..url (Unix)
thumbsniper
 thumbsniper.comimage/..ThumbSniper (url)
 thumbsniper.comtext/..ThumbSniper (url)
orcabrowser
 www.orcabrowser.comtext/..Orca Browser (url)
globalspec
 www.globalspec.com/Ocellitext/..Ocelli/1.4 (url)
rssreader
 www.rssreader.comtext/..RssReader/1.0.xx.x (url) Microsoft Windows NT 5.1.2600.0
tinyurl
 tinyurl.com/64t5ntext/..Rome Client (url) Ver: 0.9
feeds4all
 www.feeds4all.com/feedzcollectortext/..FeedZcollector v1.x (Platinum) url
118805.5 total

Page requests for probable crawlers, recognized by keyword
Count (x 1000)
Agent string
  Mime type (count ≥ 3)
PythonWikipediaBot/1.0
 application/json
 application/xml
 text/..
 -
 application/x-www-form-urlencoded
 image/..
spider
 text/..
 application/vnd.php.serialized
 image/..
 application/ogg
 -
 application/json
php wikibot classes
 application/vnd.php.serialized
 text/..
 -
AniBot/0.9 php/curl
 application/vnd.php.serialized
 -
 image/..
 text/..
MediaWikiCrawler-Google/2.0 ( mail address )
 text/..
 -
LinkParser/2.0
 text/..
 -
GoogleBot-Image/1.0
 image/..
 text/..
 -
gsa-crawler (Enterprise; T3-P9JWVCTT9WWGY; mail address )
 text/..
 -
Peachy MediaWiki Bot API Version 1.0
 application/vnd.php.serialized
 -
 text/..
wikiwix-bot-3.0
 text/..
 -
Mozilla/5.0 (Windows; Windows NT 5.1; fr; rv:1.8.1) VoilaBot BETA 1.2 ( mail address )
 text/..
 -
 application/json
 application/ogg
Mozilla/5.0 MaboMwFramework/1.2 (w:de:MerlIwBot)
 text/..
GoogleBot-Image/1.0
 text/..
 image/..
 -
 application/json
 application/rsd+xml
Pywikipediabot/2.0
 application/json
 application/x-www-form-urlencoded
 text/..
gsa-crawler (Enterprise; T3-P9JWVCTT9WWGY; mail address , mail address )
 text/..
 -
Answersbot
 text/..
ClueBot/1.1
 application/vnd.php.serialized
tigerbot
 application/json
 text/..
ClueBot/2.0
 application/vnd.php.serialized
www.integromedb.org/Crawler
 text/..
 image/..
 -
 application/xml
Wikipath Bot (email: mail address )
 application/json
Mozilla 5.0 (Apibot 0.32)
 application/vnd.php.serialized
 text/..
HTMLParser/2.0
 text/..
 -
WikiPlaysBot
 text/..
DotNetWikiBot/2.101 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
TrueKnowledgeBot bot mail address >
 application/xml
 application/vnd.php.serialized
 -
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (Exabot-Thumbnails)
 image/..
 text/..
 application/json
 -
 application/javascript
DigitalsmithsBot
 text/..
Mozilla/5.0 (compatible; Ezooms/1.0; mail address )
 text/..
 application/json
 -
 image/..
MediaWiki::Bot/3.2.6
 application/json
plantspedia data crawler
 text/..
AnomieBOT 1.0 (TagDater; see [[User:AnomieBOT]])
 application/json
Wikibot/2.0.1 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
 -
Mozilla/5.0 (compatible; Mail.RU_Bot/2.0)
 text/..
 image/..
 application/json
 -
Tawbot (public svn release; plwiki)
 text/..
mail address
 application/vnd.php.serialized
 text/..
 application/json
mail address mail address – MediaWiki Tcl Bot Framework 0.5
 application/json
 application/x-www-form-urlencoded
 text/..
YBot/0.1
 application/vnd.php.serialized
DotNetWikiBot/2.100 (Unix 2.6.32.38; )
 text/..
Mozilla/5.0 (Windows; Windows NT 5.1; zh-CN; rv:1.8.0.11) Gecko/20070312 Firefox/1.5.0.11; 360Spider
 text/..
 -
 application/json
 application/xml
DotNetWikiBot/2.81 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
 application/ogg
SearchBot
 text/..
FAST Search Web Crawler 14.0.0325.0000
 text/..
 -
 application/xml
GermCrawler
 application/json
 text/..
CorenSearchBot/1.7 en libwww-perl/6.04
 text/..
DotNetWikiBot/2.101 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
SchoolReviewNetworkWikiBot
 application/json
DotNetWikiBot/2.100 (Unix 5.10.0.0; )
 text/..
 application/xml
AnomieBOT 1.0 (OrphanReferenceFixer; see [[User:AnomieBOT]])
 application/json
 text/..
OrlodrimBot/1.0
 text/..
 -
 application/x-www-form-urlencoded
DotNetWikiBot/2.101 (Unix 3.2.0.32; )
 text/..
DotNetWikiBot/2.100 (Microsoft Windows NT 6.2.8400.0; )
 text/..
 application/xml
SineBot/1.5.19(User:SineBot)
 application/vnd.php.serialized
 text/..
Webwiki Search Engine Bot - www.webwiki.de
 text/..
MediaWiki::Bot/3.005002
 application/json
DotNetWikiBot/2.100 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/x-www-form-urlencoded
 application/xml
PanjivaBot
 text/..
Web Crawler
 text/..
 -
Opera/8.01 (J2ME/MIDP; MXit WebBot/6.2.1/1.8.5.168;) Opera Mini/3.1
 image/..
 text/..
 -
AnomieBOT 1.0 (FlagIconRemover; see [[User:AnomieBOT]])
 application/json
AnomieBOT 1.0 (TemplateSubster; see [[User:AnomieBOT]])
 application/json
postis synonimos bot (@synonimos.postis.org)
 application/json
EarwigBot/0.2.dev.git4ff7612a (Python/2.7.3; https://github.com/earwig/earwigbot; mail address )
 application/json
 -
MyCuteBot/0.1
 text/..
 application/json
My Nutch Spider/Nutch-1.5
 text/..
 application/ogg
DotNetWikiBot/2.99 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
 application/xml
 image/..
 application/ogg
DotNetWikiBot/2.97 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
HosiryuhosiBot IRC-RecentChanges Checker
 text/..
 application/x-www-form-urlencoded
JavaCrawler/1.1
 text/..
mySpider/Nutch-1.5.1
 text/..
 -
bizzlebot
 application/json
Mozilla/5.0 (compatible; Mail.RU/3.14) CrawlMl
 text/..
Wikibot/2.0.1 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
 -
Twitterbot/1.0
 text/..
 image/..
 -
SurakWare MediaWiki Bot/1.0
 text/..
 application/xml
DotNetWikiBot/2.92 (Microsoft Windows NT 5.1.2600 Service Pack 3; )
 text/..
 application/xml
SiocWikiBot/1.0
 application/vnd.php.serialized
 text/..
dtSearchSpider
 text/..
HRoestBot, de-wikipedia using pywikipedia framework
 text/..
 application/json
Test Webbot
 text/..
 application/json
COIBot/2.0
 text/..
~Bot ([[:fr:w:User:TildeBot]] by [[:fr:w:User:Alphos]] mail address )
 text/..
COIBot/1.00
 text/..
AnomieBOT 1.0 (BAGBot; see [[User:AnomieBOT]])
 application/json
 text/..
www.monit24.pl-m24Bot/4.0-
 image/..
 -
 text/..
Mozilla/5.0 (SnapPreviewBot) Gecko/20061206 Firefox/1.5.0.9
 image/..
 text/..
 -
wikbotlite/2.0 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
 -
AnomieBOT 1.0 (PERTableUpdater; see [[User:AnomieBOT]])
 application/json
 text/..
GoogleBot
 text/..
 image/..
 -
TVersity Media Robot
 text/..
DotNetWikiBot/2.96 (Microsoft Windows NT 6.1.7601 Service Pack 1; )
 text/..
Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9
 text/..
 -
Mozilla/5.0 (compatible; EqraTechBot/1.0; mail address )
 text/..
Mozilla/5.0 (X11; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot
 text/..
Mozilla/5.0 (compatible; UnisterBot; mail address )
 text/..
 image/..
 -
 application/json
theWxitBot/0.1
 application/json
Noheto crawler
 text/..
 -
XLinkBot/1.00
 text/..
UCMore Crawler App
 text/..
HTMLParser/1.6
 text/..
 -
Mozilla/5.0 (X11; Linux x86_64) Ubuntu/12.04 Codebot/1.0
 text/..
 image/..
ReaperBot/1.0.1 (incompatible-notwebbrowser:robot:exclusion-noncompliant) bot>
 text/..
Mozilla/5.0 (compatible; Web CEO Online robot)
 text/..
 image/..
 -
 application/rsd+xml
 application/xml
Handelabra WikiBot
 application/vnd.php.serialized
 text/..
Zing-BottaBot/2.0
 text/..
AdMedia bot
 text/..
mail address mail address – MediaWiki Tcl Bot Framework 0.5 (r24)
 application/json
Metabot 0.1
 text/..
wikbotlite/1.60 CFNetwork/609 Darwin/13.0.0
 image/..
 application/json
 text/..
Mozilla/5.0 (compatible; LucidWorks/; ; crawler at example dot com)
 text/..
 -
bitlybot
 text/..
 image/..
 -
MediaWiki::Bot/5.005004
 application/json
Opera/8.01 (J2ME/MIDP; MXit WebBot/5.9.8/1.8.5.168;) Opera Mini/3.1
 image/..
 text/..
 -
DotNetWikiBot/2.100 (Unix 3.0.0.12; )
 text/..
 application/xml
python-wikitools/1.2 (User:BernsteinBot)
 application/json
Bot
 text/..
python-wikitools/1.2 (User:Mr.Z-bot)
 application/json
MaxPointCrawler/Nutch-1.1 (maxpoint.crawler at maxpointinteractive dot com)
 text/..
Goalkeeperbot(User:Beetstra)/1.0
 text/..
DotNetWikiBot, edited by D. Rodionov/2.91 (Microsoft Windows NT 6.0.6002 Service Pack 2; )
 text/..
 application/xml
WikiBot/0.1
 text/..
 image/..
FAST Enterprise Crawler 6 used by ... ( mail address )
 text/..
 -
FetcherBot/0.1
 text/..
 image/..
WPBot 1.0
 text/..
 image/..
LauschenBot/1.0 ( mail address )
 text/..
WordChampBot
 text/..
 application/xml
MetallmanulBot for Wiktionary (run by Metallmanul)
 application/json
FAST Enterprise Crawler 6 used by FAST ( mail address )
 text/..
Mozilla/5.0 (compatible; FriendFeedBot/0.1; Http://friendfeed.com/about/bot; 400 subscribers; feed-id=3852576738117026533)
 application/xml
 -
BibBot/0.9 (urshofer.ch)
 text/..
gsa-crawler (Enterprise; T1-CTDUKJAVTGSJS; mail address )
 text/..
 -
Phantom.js bot
 image/..
 text/..
Mozilla/5.0 (compatible; Tbot/1.0;)
 text/..
 -
AnomieBOT 1.0 (RandomPagePicker; see [[User:AnomieBOT]])
 application/json
Empedia Bot
 text/..
 -
Mozilla/5.0 (Bgbot 0.5)
 text/..
Mozilla 5.0 (Apibot 0.30b5)
 application/vnd.php.serialized
python-wikitools/1.2 (User:LaraBot)
 application/json
SizeObservers.com bot
 text/..
Erel Bot
 text/..
AnomieBOT 1.0 (DeletionSortingCleaner; see [[User:AnomieBOT]])
 application/json
Geni ircpybot 1.0
 application/json
 text/..
 application/xml
wikbot/1.60 CFNetwork/548.1.4 Darwin/11.0.0
 image/..
 application/json
 text/..
Jbot
 text/..
Peachy MediaWiki Bot API Version 0.1beta
 application/vnd.php.serialized
Bobot
 text/..
Baiduspider
 text/..
Mooloo.de Bot
 application/json
msnbot 1.1
 text/..
 application/ogg
26814.59 total
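
The keyword rule behind the table above (the user agent string contains the term bot, spider or crawl[er]) can be sketched as follows. This is a minimal illustration, not the report's actual script: the function name `looks_like_crawler` is hypothetical, and case-insensitive substring matching is an assumption.

```python
import re

# Keywords named in the report: "bot", "spider", "crawl[er]".
# Assumption: matching is a case-insensitive substring search.
CRAWLER_KEYWORDS = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def looks_like_crawler(user_agent: str) -> bool:
    """Return True if the user agent string contains a crawler keyword."""
    return bool(CRAWLER_KEYWORDS.search(user_agent))


if __name__ == "__main__":
    samples = [
        "PythonWikipediaBot/1.0",
        "Baiduspider",
        "FAST Search Web Crawler 14.0.0325.0000",
        "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0",
    ]
    for agent in samples:
        print(looks_like_crawler(agent), agent)
```

A substring match this broad also flags strings such as "TVersity Media Robot" (because "Robot" contains "bot"), which is consistent with the table's label "probable crawlers".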

IP ranges: known IP ranges for Google are 64.233.[160.0-191.255], 66.249.[64.0-95.255], 66.102.[0.0-15.255], 72.14.[192.0-255.255],
74.125.[0.0-255.255], 209.85.[128.0-255.255], 216.239.[32.0-63.255] and a few minor other subranges
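
Checking whether a GoogleBot request comes from one of the ranges listed above could look like the sketch below. It is only an illustration: the names `GOOGLE_RANGES` and `is_google_ip` are hypothetical, and the "few minor other subranges" are not covered because the report does not list them.

```python
import ipaddress

# Start/end addresses transcribed from the ranges listed above,
# with octets written without leading zeros.
# The unlisted "minor other subranges" are not included.
GOOGLE_RANGES = [
    ("64.233.160.0", "64.233.191.255"),
    ("66.249.64.0", "66.249.95.255"),
    ("66.102.0.0", "66.102.15.255"),
    ("72.14.192.0", "72.14.255.255"),
    ("74.125.0.0", "74.125.255.255"),
    ("209.85.128.0", "209.85.255.255"),
    ("216.239.32.0", "216.239.63.255"),
]


def is_google_ip(address: str) -> bool:
    """Return True if the IPv4 address falls inside one of the listed ranges."""
    ip = ipaddress.IPv4Address(address)
    return any(
        ipaddress.IPv4Address(low) <= ip <= ipaddress.IPv4Address(high)
        for low, high in GOOGLE_RANGES
    )


if __name__ == "__main__":
    for candidate in ("66.249.66.1", "8.8.8.8"):
        print(candidate, is_google_ip(candidate))
```

A request whose user agent claims to be GoogleBot but whose address fails this check would be a candidate for the spoofed category described in the report's introduction.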

Errata: The WMF traffic logging service suffered from server capacity problems in Aug/Sep/Oct 2011.
Absolute traffic counts for October 2011 are approximately 7% too low.
Data loss only occurred during peak hours. It may therefore have had a somewhat different impact on traffic from different parts of the world,
and may also have skewed relative figures like share of traffic per browser or operating system.

From mid-September until late November, squid log records for mobile traffic were in an invalid format.
Data could be repaired for logs from mid-October onwards. Older logs were no longer available.

In an unrelated server outage, precisely half of the traffic to WMF mobile sites was not counted from Oct 16 - Nov 29 (one of two load-balanced servers did not report traffic).
WMF has since improved server monitoring, so that similar outages should be detected and fixed much faster from now on.

Generated on Sat, Mar 9, 2013 4:42
Author: Erik Zachte (Web site)
Mail: ezachte@### (no spam: ### = wikimedia.org)
All data and images on this page are in the public domain.

Note: this page may load more slowly in Microsoft Internet Explorer than in other major browsers