In January 2010 I published reports on the global distribution of Wikipedia page views. The reports have different aggregation levels: by country or by language, yearly average or quarterly trends.
A week ago I published a similar set of reports for page edits. Both sets, page views and page edits, now also show totals per global region. Note that in this context edit means page update, not opening a page in edit mode.
Counts exclude bot and crawler requests. See my earlier blog post for additional comments on the filtering process and potential anomalies.
Some insights that can be gleaned from the new reports:
- In general the breakdown of activity per global region is quite similar for views and edits. Most striking exceptions: Europeans contribute 51% of all edits, 35% of all views. North Americans contribute 23% of all edits, 38% of all views.
- The English Wikipedia receives 51% of all page views, but only 41% of all page edits.
- The ‘Global North’ (as defined on Wikipedia) contributes 81% of all page views and edits, with just 19% of the world population, and 46% of the internet population.
- Not unexpected: monthly requests per internet user from China are more than 10 times lower than the Asian average (views 1/10, edits 1/14).
- 94% of page views from India are for the English Wikipedia. For edits this is considerably less: only 78%.
A major difference between the page view and page edit reports is their accuracy. Both reports are based on a 1:1000 sampled squid log. With 13 billion page views per month for all Wikipedias combined, 1/1000th, or 13 million log records, is all one would hope for in terms of accuracy. For every 2000 page views Wikimedia receives one page edit (*). This leaves roughly 6.5 thousand edit requests per month in the sampled squid log. That is enough for a breakdown per country, less than ideal for a reliable breakdown per language per country, let alone percentage shifts per quarter in the breakdown per language per country. So treat with caution.
Clearly sampling is an issue here. We intend to capture all edit requests in a separate file (this depends on a server update), and thus increase the number of edit records by a factor of 1000, which should improve accuracy considerably.
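To make the sampling arithmetic above concrete, here is a minimal sketch in Python. The traffic figures come from this post; the error estimate is my own rough assumption (simple Poisson-like sampling noise, relative error of about 1/sqrt(n)), not something the reports compute, and the 1% cell share is a hypothetical value for illustration only.

```python
import math

MONTHLY_PAGE_VIEWS = 13_000_000_000  # from the post: 13 billion views per month
VIEWS_PER_EDIT = 2000                # from the post: one edit per 2000 views
SAMPLE_ONE_IN = 1000                 # from the post: 1:1000 sampled squid log


def sampled_records(monthly_requests, one_in=SAMPLE_ONE_IN):
    """Expected number of log records left after 1:one_in sampling."""
    return monthly_requests / one_in


def relative_error(expected_count):
    """Rough relative standard error of a sampled count.

    Assumption (not from the post): sampling noise is Poisson-like,
    so the relative error is about 1 / sqrt(n).
    """
    return 1 / math.sqrt(expected_count)


sampled_views = sampled_records(MONTHLY_PAGE_VIEWS)                   # 13 million
sampled_edits = sampled_records(MONTHLY_PAGE_VIEWS / VIEWS_PER_EDIT)  # 6.5 thousand

# Hypothetical cell: one language in one country holding 1% of all edits.
HYPOTHETICAL_CELL_SHARE = 0.01
cell_edits = sampled_edits * HYPOTHETICAL_CELL_SHARE

print(f"sampled views per month: {sampled_views:,.0f}")
print(f"sampled edits per month: {sampled_edits:,.0f}")
print(f"relative error, all edits: {relative_error(sampled_edits):.1%}")
print(f"relative error, 1% cell:   {relative_error(cell_edits):.1%}")
```

Under these assumptions the monthly total of sampled edits is still fairly stable, but a small language-per-country cell is left with only a few dozen records, which is why quarterly shifts at that level should be read with caution.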
* This ratio has climbed steadily from approx. 1800 to 1 in 4th quarter 2009 to 2000 to 1 in July 2010.
Allocation by continent and global region was done manually. Please help me find input errors.
Update Sep 9: moved Russia to Europe (see comments). Percentages above have been updated.