Wikistats Overview

This page is a concise overview of the Wikimedia statistics reports, almost all built by Erik Zachte and often referred to as Wikistats.
The software has its own git repository; see analytics-wikistats on GitHub.

Note that the term Wikistats is used differently in different contexts. The website stats.wikimedia.org also hosts much more software, developed by other Wikimedia staff and volunteers.

The reports fall into four (almost) independent clusters.

Cluster
  I: Dump reports
  II: Page view reports
  III: Squid reports
  IV: Mail stats

Focus
  I: Wiki content and activity
  II: Traffic
  III: Traffic
  IV: Mailing list activity
Input
  I: Wikimedia dumps. Only dumps that contain all historic revisions: mostly stub dumps (meta data only), sometimes full archive dumps (meta data + raw text).
  II: Hourly page view counts per wiki (also aggregated into daily/monthly dumps).
  III: Squid logs, 1:1000 sampled.
  IV: Master list; mail archive per list.
Output
  • Per project (8, plus misc.) a sitemap, e.g. Wikipedia (Wp)
  • For each wiki a report with 10+ tables, e.g. English Wp
  • Per project 18 comparison reports, e.g. Edits per month on Wp
  • Current status all wikis, e.g. Wp
  • Bot created/edited articles per wiki per month, e.g. Wp
  • Overview recent months, e.g. Wp
  • Largest/most edited articles, e.g.

    (all in all ± 800 wiki-specific pages, plus 144 comparisons, each in 28 languages, giving ± 25,000 static HTML pages)

Output languages
  I: 28 (see note 1)
  II: English
  III: English
  IV: English
Updated
  I: Monthly
  II: Daily
  III: Monthly
  IV: Daily
Runs on
  I: WMF server
  II: WMF server
  III: WMF server
  IV: private server
Code base
  I: Dumps
  II: Dumps (it reuses reporting code built for dumps)
  III: Squids
  IV: Mail-lists
Non-blank perl lines
  I: 57,539
  II: 4,898 (see note 2)
  III: 14,483
  IV: 1,227
Dev languages
  I: Reports: perl; Charts: R; Animation: html5+javascript
  II: Reports: perl
  III: Reports: perl; Animation: html5+javascript; Geo2ip: MaxMind
  IV: Reports: perl
Feed to Wikimedia Report Card (based on Limn, via Analytics)
  I: Yes
  II: Yes
  III: No
  IV: No
Animations
  I: Bubble charts, code
  II: n.a.
  III: Interactive map, code
  IV: n.a.
Monitoring On udp2log level, udp packet loss stats: No
Examples of charts
Author
  I: Erik Zachte 2003-now
  II: Erik Zachte 2008-now
  III: Erik Zachte 2009-now, Andre Engels 2012
  IV: Erik Zachte 2005?-now
Notes
  1 Some reports are available only in English, and most translations are incomplete.
  2 For data collecting and archiving only. The reporting code is part of the dumps scripts.
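
The page count quoted in the Output row for the dump reports can be roughly checked with a little arithmetic. The sketch below uses only the figures given on this page; it assumes that the 144 comparisons are simply 18 comparison reports for each of the 8 projects, and that every page exists in all 28 languages (note 1 suggests this is not strictly true). The names used are illustrative only and are not taken from the Wikistats code.

```python
# Figures quoted in the Output row for cluster I (dump reports).
PROJECTS = 8                  # "Per project (8, plus misc.)"
COMPARISONS_PER_PROJECT = 18  # "Per project 18 comparison reports"
WIKI_SPECIFIC_PAGES = 800     # "± 800 wiki specific pages" (approximate)
LANGUAGES = 28                # "each in 28 languages"


def comparison_pages(projects: int, per_project: int) -> int:
    """Number of comparison reports across all projects (assumed 8 x 18)."""
    return projects * per_project


def total_static_pages(wiki_pages: int, comparisons: int, languages: int) -> int:
    """Upper-bound page count, assuming every page exists in every language."""
    return (wiki_pages + comparisons) * languages


comparisons = comparison_pages(PROJECTS, COMPARISONS_PER_PROJECT)
total = total_static_pages(WIKI_SPECIFIC_PAGES, comparisons, LANGUAGES)

print(comparisons)  # 144
print(total)        # 26432
```

The result, 26,432, is in the same range as the "± 25,000" quoted above; the difference is well within what the approximate input of ± 800 pages and the incomplete translations would explain.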

Last updated: June 25, 2013