Variations on the English Wikipedia Main Page

The WMF servers receive a lot of unserviceable page requests. To illustrate this with what is most likely an extreme example, here is a list of page requests received in July 2011 that target any article whose title starts with ‘Main_Page’. Clearly most faulty requests come from buggy software, not directly from users.

Posted in Wikimedia View(er)s

Saving lifetimes

One day Jobs came into the cubicle of Larry Kenyon, an engineer who was working on the Macintosh operating system, and complained that it was taking too long to boot up. Kenyon started to explain, but Jobs cut him off. “If it could save a person’s life, would you find a way to shave ten seconds off the boot time?” he asked. Kenyon allowed that he probably could. Jobs went to a whiteboard and showed that if there were five million people using the Mac, and it took ten seconds extra to turn it on every day, that added up to three hundred million or so hours per year that people would save, which was the equivalent of at least one hundred lifetimes saved per year [1]. “Larry was suitably impressed, and a few weeks later he came back and booted up twenty-eight seconds faster” (‘Steve Jobs’, by Walter Isaacson, page 123).

Personally I still blame Microsoft for not introducing thousands separators into the dir output until MS-DOS 6. With hundreds of millions of users in the early 1990s, every quarter of a second wasted, several times a day, on reading an 8-9 digit file size added up to a waste of lifetimes comparable to the one above.

How does this translate to Wikimedia? With over 15 billion page views each month [2], each 1/10 second shaved off page loading time saves humanity 1,500,000,000 seconds each month, which is very close to the waking time of a 70-year-old person (70*365*16*3600 seconds, at 16 waking hours a day). So the awesome dedication of the small Wikimedia operations team (staff AND volunteers) has not only saved Wikimedia tons of hardware. It also saves tens to hundreds of lifetimes a year!
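The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch: the figures come from the text, the variable and function names are made up for illustration, and the 16 waking hours per day is the assumption implied by the formula above.

```python
# Figures taken from the text; names are illustrative only.
PAGE_VIEWS_PER_MONTH = 15_000_000_000   # "over 15 billion page views each month"
SECONDS_SAVED_PER_VIEW = 0.1            # 1/10 second shaved off each page load

# Waking lifetime of a 70-year-old, assuming 16 waking hours per day.
WAKING_LIFETIME_SECONDS = 70 * 365 * 16 * 3600


def lifetimes_saved_per_month(views, seconds_per_view):
    """Convert time saved across all page views into waking lifetimes."""
    return views * seconds_per_view / WAKING_LIFETIME_SECONDS


seconds_saved = PAGE_VIEWS_PER_MONTH * SECONDS_SAVED_PER_VIEW
per_month = lifetimes_saved_per_month(PAGE_VIEWS_PER_MONTH, SECONDS_SAVED_PER_VIEW)

print(f"{seconds_saved:,.0f} seconds saved per month")
print(f"{WAKING_LIFETIME_SECONDS:,} waking seconds in one lifetime")
print(f"about {per_month:.2f} lifetimes per month, {12 * per_month:.1f} per year")
```

Each tenth of a second saved is therefore worth roughly one lifetime per month, so a full second saved per page load would already put the yearly total above one hundred.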

Of course all the ingenuity in the world goes only so far in accommodating Wikimedia’s ever-increasing traffic. That’s why Wikimedia’s annual fundraiser, which is about to launch, is so vital for keeping access to all our content fast, all over the world.

1 This is actually an overstatement: if every user saves 10 seconds per day, this is roughly an hour per year. A 70-year-old has been awake for about 400,000 hours. Five million people saving an hour per year equates to about 12 lifetimes.

2 The total number of file requests (images, scripts, etc.) is even an order of magnitude larger (see image).
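The correction in footnote 1 can be reproduced the same way. Again this is a sketch under the footnote's own assumptions (five million users, 10 seconds saved per day, 16 waking hours per day over 70 years); the names are illustrative.

```python
USERS = 5_000_000
SECONDS_SAVED_PER_DAY = 10

# 10 seconds a day for a year, expressed in hours: roughly one hour.
hours_per_user_per_year = SECONDS_SAVED_PER_DAY * 365 / 3600

# Waking hours of a 70-year-old at 16 waking hours per day: roughly 400,000.
WAKING_LIFETIME_HOURS = 70 * 365 * 16

total_hours_saved = USERS * hours_per_user_per_year
lifetimes_per_year = total_hours_saved / WAKING_LIFETIME_HOURS

print(f"{hours_per_user_per_year:.2f} hours saved per user per year")
print(f"{WAKING_LIFETIME_HOURS:,} waking hours in one lifetime")
print(f"about {lifetimes_per_year:.0f} lifetimes saved per year")
```

The total comes to about five million hours per year, not three hundred million, which is why the result is close to a dozen lifetimes rather than a hundred.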

Posted in uncategorized

Summary Reports for all Wikimedia Wikis

These weeks I am performing long overdue maintenance on Wikistats. This includes fixing bugs (e.g. the Wikibooks and Wikiversity reports were broken for many months). It also includes automation: removing manual steps from the production process. I am also making good on a long-standing promise to publish summaries for all Wikimedia wikis.

These summaries were originally introduced for the monthly India Report Card, with content and layout suggestions from Wikimedia researcher Mani Pande. Hopefully they serve a wide audience, and help to quickly assess the fundamentals of any Wikimedia wiki without getting buried under too many details from unwieldy tables.

Where to find these summaries

There are sets of summaries, also known as report cards, per project: Wikipedia, Wiktionary, Wikibooks, Wikinews, Wikiquote, Wikiversity, Wikisource and Other Projects.

For Wikipedia there are also sets of summaries per region: Africa, Americas, Asia, Europe, India, Oceania, and also for Artificial Languages.

Finally for every wiki there is also a new ‘Summary’ link in the project sitemaps: Wikipedia, Wiktionary, Wikibooks, Wikinews, Wikiquote, Wikiversity, Wikisource and Other Projects.

I am open to suggestions on what else to include in these summaries. Yet their very purpose is to offer a quick, at-a-glance overview, so this puts some constraints on which information to add.

Update 27 Sep: I added extra charts and metrics for Commons.

Posted in Nice Charts, Wikimedia Edit(or)s, Wikimedia View(er)s, Wikistats Reports

Wikipedia Mobile Traffic II

Three months ago I blogged about mobile traffic to Wikipedia. I explained how we track two different metrics: on the one hand traffic to our mobile site, on the other hand traffic from mobile devices (as detected from the so-called user agent string).

While preparing my presentation for Wikimania Haifa, which shows a visualization of global page views (more on that soon), it dawned on me that the chart I presented in that earlier blog post actually shows metrics that cannot be compared. They are not wrong, but comparing them is comparing apples and oranges.

Above is the updated plot. Both existing lines are unchanged. I added a new line.

The issue is this: the blue line shows the ratio of page views to our mobile site, based on page views only, i.e. html requests. At present our mobile site serves 6% of our page requests. (BTW read more on recent plans to redirect even more traffic to our mobile site.)

The red line shows the ratio of requests that originate from a mobile device (to any of our sites), based on all traffic: not only html requests but also image and script files. There is a caveat here: many handheld clients (app/browser) do not retrieve a full Wikipedia page, but only the html file and just a few of the image and script files. This skews the ratio, and not a little bit!

The new purple line shows the ratio of page views from handheld devices, disregarding all non-html file types. The difference is striking. It turns out that at least 15% of our page views come from mobile devices. I say at least because we do not factor in API calls yet; my colleague Nimish Gautam thinks this might drive the ratio further upward (to be continued).
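To make the difference between the three lines concrete, here is a small sketch with a synthetic request log. Everything in it is an assumption for illustration: the Request record, the helper names, and the made-up numbers (a desktop page load fetching one html file plus nine other files, a mobile page load fetching one html file plus one other file). It is not how Wikistats processes the real logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical log record, reduced to the three facts used here."""
    to_mobile_site: bool
    from_mobile_device: bool
    is_html: bool


def share(requests, predicate):
    """Fraction of requests for which predicate holds (0.0 for an empty log)."""
    requests = list(requests)
    if not requests:
        return 0.0
    return sum(1 for r in requests if predicate(r)) / len(requests)


def page_load(to_mobile_site, from_mobile_device, extra_files):
    """One page view: one html request plus a number of image/script requests."""
    html = Request(to_mobile_site, from_mobile_device, True)
    extras = [Request(to_mobile_site, from_mobile_device, False)] * extra_files
    return [html] + extras


# Synthetic log of 100 page views (numbers are invented, not measured).
log = []
for _ in range(85):          # desktop browsers on the main site
    log += page_load(False, False, extra_files=9)
for _ in range(9):           # mobile devices on the main site
    log += page_load(False, True, extra_files=1)
for _ in range(6):           # mobile devices on the mobile site
    log += page_load(True, True, extra_files=1)

html_only = [r for r in log if r.is_html]

blue = share(html_only, lambda r: r.to_mobile_site)        # page views to mobile site
red = share(log, lambda r: r.from_mobile_device)           # all requests from mobile devices
purple = share(html_only, lambda r: r.from_mobile_device)  # page views from mobile devices

print(f"blue {blue:.1%}  red {red:.1%}  purple {purple:.1%}")
```

In this toy log the same 15 mobile page views account for 15% of page views but only a few percent of all requests, simply because each of them pulls in fewer files. That is the skew described above.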

It is not possible to generate the new metric for traffic older than 3 months. WMF only keeps request logs for a short period due to privacy considerations. Although it is somewhat confusing without this explanation, I will keep the red line for a while, to allow for long-term trend assessments.

Posted in uncategorized

Wikipedia edits visualized

Today I present a new animated visualization of Wikipedia edits.

>> Animation <<   >> Screenshots <<

It shows all edit events for all Wikipedias on one random day. Currently this is 10 May 2011. On that particular day all Wikipedias combined were edited 369,384 times.

The visualization grabs all those edits (time, location and language code) and shows how they were spread over the globe. You can see the distribution over space and time in an animation. You can also see static maps: bubble maps and heat maps, either per major language or for all languages combined. For bubble maps: large bubbles correspond to many edits, done from the location at the center of the bubble. For heat maps: bright colors correspond to many edits from that particular spot. Bubbles help you quickly focus on areas of high activity; heat maps have better resolution.
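The counting behind such maps can be sketched as follows. The edit tuples are invented sample data, the function names are hypothetical, and scaling the bubble radius with the square root of the count (so that bubble area grows with the number of edits) is an assumption of this sketch, not necessarily what the visualization does.

```python
from collections import Counter
from math import sqrt

# Invented sample edits: (latitude, longitude, language code).
edits = [
    (52.5, 5.0, "nl"),
    (52.5, 5.0, "en"),
    (52.5, 5.0, "nl"),
    (40.5, -74.0, "en"),
]


def edits_per_location(edits, language=None):
    """Count edits per location, for one language or for all languages combined."""
    return Counter(
        (lat, lon)
        for lat, lon, lang in edits
        if language is None or lang == language
    )


def bubble_radius(count, scale=2.0):
    """Radius such that bubble area is proportional to the edit count (assumed)."""
    return scale * sqrt(count)


for (lat, lon), count in edits_per_location(edits).most_common():
    print(f"({lat}, {lon}): {count} edits, radius {bubble_radius(count):.1f}")
```

The same per-location counts could drive a heat map by mapping the count to a color brightness instead of a radius.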

You can zoom and pan (mouse), click on the map for latitude and longitude (only when zoomed out), change the animation speed (5x-30x), toggle between a color and a black-and-white map (the latter with country borders), show/hide the position of the sun, city names and approximate local times, and change the event marker type and size (types are circle, fixed or animated language code). You can cycle through major languages (easiest with the space bar), fast forward the clock by 1 hour, or pause the animation. Type H for Help on all options and their associated keys.


Privacy

Two measures were taken to guard the privacy of authors. Timestamps have a deliberate error of up to 10 minutes. For very active editors, or for wikis with little activity, this alone would not suffice. More importantly, therefore, all coordinates have been rounded to half a degree of longitude and latitude (a square of roughly 55 km or 34 miles on a side).
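A minimal sketch of these two measures might look like this. The function names are hypothetical, and the text does not say how the timestamp error is distributed; a uniform shift of up to 10 minutes in either direction is assumed here.

```python
import random
from datetime import datetime, timedelta

GRID_DEGREES = 0.5            # coordinates are rounded to half a degree
MAX_JITTER_SECONDS = 10 * 60  # timestamps are off by up to 10 minutes


def round_to_grid(value, step=GRID_DEGREES):
    """Round a latitude or longitude to the nearest multiple of step."""
    return round(value / step) * step


def blur_edit(timestamp, lat, lon, rng=random):
    """Return a privacy-blurred copy of one edit event.

    Assumption: the deliberate timestamp error is uniform in
    [-10 min, +10 min]; the source only says "up to 10 minutes".
    """
    jitter = rng.uniform(-MAX_JITTER_SECONDS, MAX_JITTER_SECONDS)
    return (
        timestamp + timedelta(seconds=jitter),
        round_to_grid(lat),
        round_to_grid(lon),
    )


# Example with made-up coordinates and a seeded generator for repeatability.
when, lat, lon = blur_edit(
    datetime(2011, 5, 10, 12, 0, 0), 52.37, 4.89, rng=random.Random(0)
)
print(when, lat, lon)
```

Rounding moves every edit onto one of the grid points, so all edits within the same half-degree cell become indistinguishable by location.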

Tech note

The animation and maps were implemented in HTML5, with the canvas object. I used this same approach two years ago, with another animation about Wikipedia. Since then browser support for the HTML5 canvas has improved considerably.

Posted in uncategorized