August 29, 2014

Gerard Meijssen

#Wikidata - Adolf Butenandt, Nobel laureate, professor and student

For many professors, Wikidata records which university employs or has employed them. This data has been added one category at a time. Often this has been repeated for categories about the same university from different Wikipedias.

At the same time information has been added for the universities where people studied. However, there is an increasing number of professors for whom it is not known where they studied.

Professor Butenandt is a case in point; he studied at the University of Marburg and the University of Göttingen. This is recorded on one Wikipedia and not on others. Given that categories are linked as well, it is fairly easy to signal missed opportunities.

Thanks to this query by Magnus, we know about 23,351 professors without an alma mater. For Mr Butenandt the information has been or will be added and, obviously, there is much more work left to do.
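The logic of such a query can be sketched in a few lines of Python. This is only an illustration and not Magnus's actual query: the sample `items` list and the `professors_without_alma_mater` helper are hypothetical, and the sketch assumes that professors are marked with an occupation claim and that the identifiers P106 (occupation), P69 (educated at) and Q121594 (professor) apply.

```python
# Illustrative sketch only: the sample data below is invented, not real Wikidata content.
PROFESSOR = "Q121594"   # assumed item ID for "professor"
OCCUPATION = "P106"     # assumed property ID for "occupation"
EDUCATED_AT = "P69"     # assumed property ID for "educated at" (alma mater)


def professors_without_alma_mater(items):
    """Return labels of items that are professors but have no 'educated at' claim."""
    return [
        item["label"]
        for item in items
        if PROFESSOR in item["claims"].get(OCCUPATION, [])
        and not item["claims"].get(EDUCATED_AT)
    ]


# Hypothetical items: one complete, one missing an alma mater, one not a professor.
items = [
    {"label": "Professor A", "claims": {"P106": ["Q121594"], "P69": ["University X"]}},
    {"label": "Professor B", "claims": {"P106": ["Q121594"]}},
    {"label": "Not a professor", "claims": {"P106": ["Q_other"]}},
]

print(professors_without_alma_mater(items))
```

Only the second sample item is reported, because it is a professor with no "educated at" claim.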

by Gerard Meijssen (noreply@blogger.com) at August 29, 2014 06:28 AM

August 28, 2014

Wikimedia Foundation

Venerable cultural institution partners with Wikimedia Serbia

Matica srpska building in Novi Sad

The Matica Srpska (MS) and Wikimedia Serbia (WMRS) are joining forces for an exciting new endeavor to digitize all of the contents of at least two Serbian dictionaries over the next year, including the Serbian ornithological dictionary and the dictionary of the dialects of Vojvodina. What is even more exciting for the free culture movement is this collaboration with Serbia’s oldest cultural and scientific institution, and how it came to be.

Founded in 1826, the Matica – which has become a Slavic symbol for an institution that promotes knowledge – was the nexus point for fostering the Serbian national identity and enlightenment during the days of the Ottoman and later Habsburg rule. Today, it still serves as an important center of Serbian culture, housing departments for Natural Sciences, for Performance Arts and Music, Lexicography and more. Additionally, the Matica Srpska acts as an art gallery for eighteenth and nineteenth century paintings, a library containing over 3.5 million books and a publishing house for ten periodicals and, of course, an array of Serbian dictionaries and encyclopedias.

Milos Rancic, the first president of Wikimedia Serbia, believes that this is a historical feat for Serbian culture and Wikimedia.

Logo of Wikimedia Serbia

“The significance of this cooperation for Wikimedia is that we are at the beginning of a close relationship with a national, cultural institution, whose foci include dictionaries and encyclopedias. They share our goals and want to cooperate with us.”

But how was Milos able to pave the way for a potentially ground-breaking agreement between WMRS and MS? The answer: micro-grants.

Back in June, WMRS received an interesting proposal for its micro-grants program. The project was about creating a photograph gallery of a single person over time. The project was later deemed unsuitable for the grant; but Milos, still intrigued by the concept of the project, decided to fund it personally.

By chance, this amateur photographer just so happened to be a top Serbian lector, an editor of the Orthography of the Serbian language and a lexicographer at the Matica Srpska. The two men proceeded to talk on a number of topics, including photography, the financial state of the MS and its desire to have more initiatives.

“I had bold ideas, of course, but I was quite skeptical about the possibility of cooperation between WMRS and Matica Srpska,” Milos admitted.

Image of Milos (left) taken at the Third regional conference of Wikimedia Serbia in Belgrade

“However, he convinced me that the president of MS is likely willing to cooperate and that we should talk about that.”

A meeting was scheduled, and a few weeks later, a delegation made up of Mile Kis, Executive Director of WMRS; Ivana Madzarevic, WMRS program manager; and Milos entered into initial talks with the Matica Srpska.

The meeting lasted two hours. Then, both parties dispersed.

Weeks went by without confirmation from MS.

It was not until July 16 that word arrived. “We got a formal letter from MS, which summarized our meeting and emphasized their commitment to accessibility of knowledge to as many people as possible.”

Milos notes that small, deliberate steps are necessary in order to achieve lasting results. “This is just the beginning, of course. We share important traits with these institutions like MS. It’s about long term goals. We want to start cooperation and develop it. They want to share their content on the Internet. With our (technological, licensing, etc.) help, they will become the institution which share their content by default, no matter if we are involved or not.”

Over the course of the next couple of years, Milos hopes to begin discussing uploading the main Serbian dictionaries too.

Milos says that one cannot overestimate the efficacy of having a grants program, no matter the size. “When you are going outside and are telling people that you are willing to support their projects, it could lead into some interesting outcomes. It is important to understand the possibilities that could be opened and catch them.”

Michael Guss, Communications volunteer at the Wikimedia Foundation

by carlosmonterrey at August 28, 2014 11:08 PM

Priyanka Nag

Maker Party gets grander in Pune this time

While going through my Twitter timeline this evening, I noticed Michelle Thorne's tweet stating that India leads with the most Maker Party action this season.
Well, who doubts that! In India, we have Maker Parties being organized almost every second day. My Facebook wall and Twitter timeline are overloaded with posts, photos and updates from all the Maker Parties happening around me.

Maker Party, Pune

Well, if you are still not aware of this one, we are having the granddaddy of these Maker Parties in Pune on the 6th of September 2014. The executive director of the Mozilla Foundation, Mark Surman, is going to be personally present for this event. Just like all Maker Parties, this event is an attempt to map and empower a community of educators and creative people who share a passion to innovate, evolve and change the learning landscape.

A few quick updates about this event:
  •  Event date - 6th and 7th September
  •  Event venue - SICSR, Model Colony, Pune
  • Rough agenda for the event is going to be:
    • 6th September 2014 (Day 1) 
      • 10am - 11am : Mozilla introduction
      • 11am - 12 : About Hive initiative
      •  12 - 1pm: Rohit Lalwani - Entrepreneurship talk
      •  1-2pm : Lunch break
      •  2pm - 3pm: Webmaker begins with Appmaker
      •  3pm - 4pm: Webmaker continues with Thimble
      •  4pm - 4.45pm: Webmaker continues with Popcorn
      •  4.45pm - 5.30pm : Webmaker continues with x-ray goggles
      • 5.30pm - 6pm: Prize distribution (against best makes of the day etc). Science fair also ends
      • 6pm - 7pm : Birds of a feather
      • 7pm : Dinner (venue - TBD)
Science fair will be from 12 noon to 6pm.
    •  7th September 2014 (Day 2) 
      • 1st Half: Community Meetup and Discussions on the future roadmap for Hive India,
        Long term partnership prospect meeting with partners.
      •  2nd Half: Community training sessions on Hive and Train the trainer events.
For this event, we are having a variety of different training sessions, workshops and science displays - starting from 3D printing to wood-works, Origami to quad-copter flying and even film making.

If you have still not registered for this event, here's your chance:

Registration form: https://docs.google.com/forms/d/12lD2Rloz7QlhpNPrnZkfoEP0m0UjYZDJtyaRul5YahM/viewform?embedded=true

by priyanka nag (noreply@blogger.com) at August 28, 2014 06:39 PM

Andy Mabbett (User:Pigsonthewing)

Whatever happened to Henry Wheeler of Bath: tailor, naval signalman, and Desert Island Discs castaway?

Lately, I’ve been writing lots of Wikipedia biographies of people who have been “castaways” on the BBC Radio programme, Desert Island Discs.

A desert island. Probably not in the North Sea
Photo by Ronald Saunders, on Flickr, CC-BY

Of all the varied people (priests, writers, musicians and others) that I’ve written about, one above all has intrigued me, because I’ve found out less about him than any other, even though I have a full transcript of the programme.

That person is Henry Wheeler.

As I wrote on Wikipedia:

Henry Wheeler (born 1924 or 1925) was a naval signalman during World War II. The eldest in a family of six, he was from Vernham Grove in Bath, England, where his civilian role was as a tailor’s assistant.

He joined the Royal Navy in 1943, undertook his naval training at HMS Impregnable, went to France on the day after D-Day, and was later stationed in Rotterdam. While in Rotterdam, he had a romantic relationship with a Dutch woman, named Dine.

Shortly after the war’s end, he appeared as a “castaway” on the BBC Radio programme Desert Island Discs, on 24 November 1945, at the age of 20. He was chosen to appear as he was serving on an unspecified “small island off the European coast” — the nearest thing available to a real castaway.

And that is pretty much all anybody seems to know about him. Was he real, or a propaganda fiction, or perhaps using a pseudonym? Did he return home to Bath to resume tailoring? Or did he return to marry Dine, in Rotterdam? Does anyone at Vernham Grove remember his family? Are his descendants still alive in Bath, Rotterdam or elsewhere? Indeed, is he?

by Andy Mabbett at August 28, 2014 03:02 PM

August 27, 2014

Wikimedia Foundation

Reimagining Mentorship with the Wikipedia Cooperative

I JethroBT, Project Manager of the Co-op, at Wikimania 2014.

An editor’s initial experience when contributing to Wikipedia can be daunting: there is a ton to read and it’s easy to make mistakes right off the bat and feel pushed away when edits are reverted. My name is Jethro and a small team of editors and I are addressing these issues by building a mentorship space called the Wikipedia Cooperative, or simply the Co-op. In the Co-op, learners (i.e. editors seeking mentorship) will have the chance to describe how they want to contribute to Wikipedia and subsequently be matched with mentors who can teach them editing skills tailored to their goals.

We are working under an Individual Engagement Grant and hope to complete a pilot and analysis of our mentorship space by early next year. If successful, we hope to fully open the space and provide tools to allow similar projects to be built in other Wikipedia projects. We recently passed the second month of our grant and I wanted to share our progress with you thus far.


We recently brought Dustin York to our team as our graphic designer. York’s background designing the WMF’s Travel and Participation Support grantmaking pages and other experience such as with UNICEF will be invaluable to us. He has begun exchanging ideas in hopes that the design work will be in full swing by September. We intend to make the space friendly and inviting for both learners and mentors alike and are confident that we can create a promising look and feel.

Product/Interaction Designer Dustin York’s illustration work for the WMF’s Travel & Participation Support grants pages on Meta.

In program development, we’ve organized an editing curriculum that we hope to make available to learners as part of the mentorship. We’ve categorized these skills into three different levels of difficulty as well as by skill type (see example). We’ve also finalized a conceptual design for how learners will be matched with mentors.

Example skills planned to be made available at the Co-op.

In our research, we’ve finished designing interview protocols and questions for editors who have participated in help spaces on Wikipedia, such as the Teahouse and The Wikipedia Adventure. We have started reaching out to such editors for interviews – their feedback will help guide our upcoming design decisions.

We have narrowed down key questions we want answered which we will use to help us understand the impact of our project:

  • How well does the Co-op work?
  • What predicts how well the Co-op works for particular learners?
  • What features work best in various existing programs?
  • Why do learners seek out and continue mentorship?

We also completed background research in addition to a preliminary mentor survey to assess how and why editors participate in mentoring. We have published our key findings on our hub on the English Wikipedia.

Lastly, our team was well-represented at Wikimania 2014 in London. We met often, sought out prospective programming candidates and connected with a number of editors and Foundation staff to discuss feedback and ideas for our project.

We plan to begin our pilot in early December and are seeking out editors who are interested in mentoring a small number of learners during this pilot period. If you are interested, please let us know on our project talk page or contact me directly. We believe that mentorship is a positive and personalized way to promote good editing habits for editors in addition to engaging productively with the editing community. It is our hope that our efforts, along with those of the mentors, will create a more approachable atmosphere for users who want to contribute to Wikipedia.

This article was co-authored by Soni and I JethroBT.

by carlosmonterrey at August 27, 2014 09:41 PM

Mark Rauterkus

Wikimedia UK

Does Wikimania save lives?

This post was written by Fabian Tompsett, Wikimedian and co-ordinator of the Wikimania support team, and originally published here.

Yes it was quite a surprise to find myself with other Wikimedians back in September 2008

I am just coming to the end of a four-month stint working for Wikimedia UK helping to deliver Wikimania 2014 at London’s Barbican Centre. It was all quite exciting and, as The Signpost put it, was “not too bad, actually”. In the whirl of events, seeing dozens of hackers bringing hacking home to Hackney, hunched over their laptops, while other devotees were busy tweeting, it became all too easy to miss some key aspects of the event, and so to fail to recognise that Wikimania contributed to saving lives.

Wikipedia is not just a website, it is also a somewhat heterogeneous international community which thrives on face-to-face encounters in meatspace. For myself my involvement gained an extra dimension when I started attending the regular London Meetups six years ago. It was meeting other human beings rather than tapping away while staring at a computer screen which made it interesting.

So, this August the London Meetup page modestly subsumes Wikimania within its calendar of monthly events, with an expansion to a three-day event with between 2,000 and 4,000 attendees (so much for “British understatement”). But in essence it is the face-to-face interactions outside the formal sessions which make Wikimania such a powerful event. I don’t want to be dismissive about the formal sessions and all the hard work which went into them; it is just that I want to focus on the other aspects and use this to show why I believe Wikimania saves lives.

2014 West Africa Ebola virus outbreak situation map

A couple of weeks after Wikimania a discussion opened up on the Wikimedia Ghana list which spoke of an initiative by Carl Fredrik Sjöland of Wikipedia:WikiProject Medicine, which has teamed up with Translators Without Borders to set up a Translation taskforce. As they explained a couple of years ago, “We believe that all people deserve high quality healthcare content in their own language.” Faced with the current Ebola outbreak in West Africa, the focus of these activities has shifted to finding people to translate information about Ebola into the relevant indigenous languages. There is something similar happening through the Humanitarian OpenStreetMap Team, who have also been very active developing mapping resources for the medics on the ground.

I had hoped to make it to the OpenStreetMap 10th Birthday Party (the London celebrations were held nearby, to coincide with Wikimania) but I got caught up in other things and only arrived after most of the people had left. But that was precisely what Wikimania was like: you find out more and more about it in the aftermath.

Graph indicating the comparative amount of Wikipedia content available for readers in different circumstances. Nearly all indigenous languages in Africa are comparable to Gujarati.

Another aspect I found out afterwards was Denny’s comments on A new metric for Wikimedia where he discusses the availability of Wikipedia in different languages. Considering the recent Ebola outbreak above, this is not just a “nice idea”, but something which requires support now. Often it is not so much getting hold of finances, but finding a way in which those people with the relevant language skills can be linked up with and given the resources to make things happen.

An important aspect of this is that the speakers of these languages are not just passive recipients of knowledge generated in the geographical north. They can also contribute their own knowledge. This also touches on the notion of cognitive justice as developed by Shiv Visvanathan in The search for cognitive justice:

Cognitive justice is not a lazy kind of insistence that every knowledge survives as is, where is. It is an idea which is actually more playful in the sense the Dutch historian Johann Huizinga suggested when he said play transcends the opposition of the serious and the non-serious. Play seeks encounters, the possibilities of dialogue, of thought experiments, a conversation of cosmologies and epistemologies. A historical model that comes to mind is the dialogue of medical systems, where doctors once swapped not just their theologies but their cures. As A. L. Basham put it, the dialogue of medicines, each based on a different cosmology, was never communal or fundamentalist. It recognized incommensurability but allowed for translation.

This is a viewpoint which has been taken up in what is called Open ICT for Development, where “openness” is understood to include the participation of communities in the governance of their own lives.

So what I found out in the aftermath of Wikimania is the question: Does Wikimania save lives? Can it help people get together and come up with practical methods by which people get in touch and existing initiatives are taken to a higher level? Will it have an effect in this example and save lives? So in this sense Wikimania is not over. Its legacy depends on what action people take in its aftermath.

So I am writing this blog because I want you to see if there is something you can do to help either the Humanitarian OpenStreetMap Team or the Translation taskforce find more support for their projects in fighting Ebola.

by Richard Nevell at August 27, 2014 11:48 AM

Jeroen De Dauw

SoCraTes 2014


Last week I attended SoCraTes 2014, the 4th International Software Craftsmanship and Testing Conference in Germany.

Since this was the first time I went there, I did not really know what to expect, and was slightly apprehensive. It turns out there was no need for that: the conference was the most fun and interesting I’ve been to recently, and definitely the most motivating as well.

What made it so great? Probably the nicest aspect of the conference was the people attending. Basically everyone there was passionate about what they were doing, interested in learning more, open minded, and respectful of others. This, combined with the schedule being composed purely of sessions people proposed at the start of the day, made the whole atmosphere very different from that of your typical commercial conference. Apart from attending sessions on various topics, I also did some pair programming and played my first set of beach volleyball games at a software conference.

I’m definitely going back next year!

by Jeroen at August 27, 2014 09:42 AM

Gerard Meijssen

#Wikipedia - Professor Hermann Buhl "Leichtathlet"

Mr Buhl died in Tirol wandering through the Alps.  He used to be an athlete of repute and became a professor at the Julius Maximilians-Universiteit.

It is obvious that Mr Buhl was a professor because of his presence in a category. It is not obvious in the same way where he studied and what he taught. The text assumes a lot of knowledge about the DDR before such things become obvious.

Every Wikipedia has its notability criteria, and the German Wikipedia is no different. Mr Buhl is certainly notable as an athlete, but his career did not end there. He is probably notable as well for the "latter" part of his career. Some would argue that he started to contribute in a meaningful way when he taught at university.

by Gerard Meijssen (noreply@blogger.com) at August 27, 2014 09:28 AM

Tony Thomas

Using puppet realm switch to select between beta/prod ( Wikimedia clusters )

Since the BounceHandler extension is currently installed only in the beta clusters (the official Wikimedia testing servers, deployment.wikimedia.beta.wmflabs.org), writing a custom router in the Exim configs of operations/puppet (the configuration repo managed by Puppet) to collect all the bounce emails and HTTP POST them to the extension API seemed risky. This was […]

by Tony Thomas at August 27, 2014 07:12 AM

Exim: Creating and using Macros

The topic looks easy, but I found implementing them a great learning experience. Macros help to reuse a lot of code and make the Exim configuration look tidy. In the earlier post, I scribbled how to define an Exim regex to capture all VERPed emails as : Tidying this up a […]

by Tony Thomas at August 27, 2014 06:47 AM

August 26, 2014

Wikimedia Tech Blog

Content Translation: 100 published articles, and more to come!

On July 17, 2014, the Wikimedia Language Engineering team announced the deployment of the ContentTranslation extension in Wikimedia Labs. This first deployment was targeted primarily for translation from Spanish to Catalan. Since then, users have expressed generally positive feedback about the tool. Most of the initial discussion took place in the Village pump (Taverna) of the Catalan Wikipedia. Later, we had the opportunity to showcase the tool to a wider audience at Wikimania in London.

Initial response

In the first 2 weeks, 29 articles were created using the Content Translation tool and published in the Catalan Wikipedia. Article topics were diverse, ranging from places in Malta, to companies in Italy, a river, a monastery, a political manifesto, and a prisoner of war. As the Content Translation tool is also being used for testing by the developers and other volunteers, the full list of articles that make it to a Wikipedia is regularly updated. The Language Engineering team also started addressing some of the bugs that were encountered, such as issues with paragraph alignment and stability of the machine translation controller.

The number of articles published using Content Translation has now passed 100, and its usage has not been limited to the Catalan Wikipedia. Users have been creating articles in other languages like Gujarati and Malayalam, although machine translation has not been extended beyond Spanish-Catalan yet. All the pages that were published as articles had further edits for wikification, grammar correction, and in some cases meaningful enhancement. A deeper look at the edits revealed that the additional changes were first made by the same user who made the initial translation, and later by other editors or bots.

Wikimania in London

Amir Aharoni of the Wikimedia Language Engineering team introduces the Content Translation tool to the student delegation from Kazakhstan at Wikimania 2014, in London.

The Content Translation tool was showcased widely at Wikimania 2014, the annual conference of the Wikimedia communities. In the main conference, Santhosh Thottingal and Amir Aharoni gave a presentation on machine-aided translation delivery through Content Translation. During the pre-conference hackathon, Pau Giner conducted a testing session with student volunteers from Kazakhstan, who were enthusiastic about using the tool in their local Wiki Club. Requests for fully supporting other language pairs were brought up by many users and groups like the Wikipedia Medical Translation project. Discussions were held with the Wikidata team to identify areas of collaboration on data reuse for consistent referencing across translated versions. These include categories, links etc.

The Language Engineering team members worked closely with Wikimedians to better understand requirements for languages like Arabic, Persian, Portuguese, Tajik, Swedish, German and others, that can be instrumental in extending support for these languages.

Further development

The development of ContentTranslation continues. Prior to Wikimania, the Language Engineering team met to evaluate the response and effectiveness of the first release of the tool, and prepared the goals for the next release. The second release is slated for the last week of September 2014. Among the features planned are support for more languages (machine translation, dictionaries), a smarter entry point to the translation UI, and basic editor formatting. It is expected that translation support from Catalan to Spanish will be activated by the end of August 2014. Read the detailed release plan and goals to know more.

Over the next couple of months, the Language Engineering team intends to work closely with our communities to better understand how the Content Translation tool has helped the editors so far and how it can serve the global community better with the translation aids and resources currently integrated with the tool. We welcome feedback at the project talk page. Get in touch with the Language Engineering team for more information and feedback.

Amir Aharoni and Runa Bhattacharjee, Language Engineering, Wikimedia Foundation

by Guillaume Paumier at August 26, 2014 12:34 PM

Wikimedia engineering report, July 2014

Major news in July include:

Note: We’re also providing a shorter and translatable version of this report.

Engineering metrics in July:

  • 164 unique committers contributed patchsets of code to MediaWiki.
  • The total number of unresolved commits went from around 1575 to about 1642.
  • About 31 shell requests were processed.


Work with us

Are you looking to work for Wikimedia? We have a lot of hiring coming up, and we really love talking to active community members about these roles.


  • Arthur Richards is now Team Practices Manager (announcement).
  • Kristen Lans joined the Team Practices Group as Scrum Master (announcement).
  • Joel Sahleen joined the Language Engineering team as Software Engineer (announcement).

Technical Operations

Dallas data center

Throughout July, the cabling work of all racked servers and other equipment was nearly completed. We’re still awaiting the installation of the first connectivity to the rest of our US network in early August before we can begin installation of servers and services.

San Francisco data center

Due to a necessary upgrade to the power and cooling infrastructure in our San Francisco data center (which we call ulsfo), our racks were migrated to a new floor within the same building on July 9. The move went very smoothly, without user impact, and the site was brought back online serving all user traffic in less than 24 hours.

PFS enabled

Through the help of volunteer work and research, our staff enabled Perfect Forward Secrecy on our SSL infrastructure, significantly increasing the security of encrypted user traffic.

Labs metrics in July:

  • Number of projects: 173
  • Number of instances: 464
  • Amount of RAM in use (in MBs): 1,933,824
  • Amount of allocated storage (in GBs): 20,925
  • Number of virtual CPUs in use: 949
  • Number of users: 3,500
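
As a rough aid to reading these figures, the short Python sketch below derives per-instance averages from the numbers listed above. The `per_instance` helper is purely illustrative, and the results are averages only; actual instance sizes vary.

```python
# Figures copied from the July Labs metrics listed above.
INSTANCES = 464
RAM_MB = 1_933_824
STORAGE_GB = 20_925
VCPUS = 949


def per_instance(total, count=INSTANCES):
    """Average amount of a resource per instance (illustrative helper)."""
    return total / count


print(f"RAM per instance:     {per_instance(RAM_MB):.0f} MB")
print(f"Storage per instance: {per_instance(STORAGE_GB):.1f} GB")
print(f"vCPUs per instance:   {per_instance(VCPUS):.2f}")
```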

Wikimedia Labs

We’ve made several minor updates to Wikitech: we added OAuth support, fixed a few user interface issues, and purged the obsolete ‘local-*’ terminology for service groups.
OPW Intern Dinu Sandaru has set up forms for structured project documentation. This should help match new volunteers with existing projects, and will make communication with project administrators more straightforward.
Sean Pringle is in the process of updating the Tool Labs replica databases to MariaDB version 10.0. This may reduce replag, and should improve performance and reliability.
We’re setting up new storage hardware for the project dumps. This will resolve our ongoing problems with full drives and out-of-date dumps.

Features Engineering

Editor retention: Editing tools


In July, the team working on VisualEditor converged the design for mobile and desktop, made it possible to see and edit HTML comments, improved access to re-using citations, and fixed over 120 bugs and tickets.

The new design, with controls focussed at the top of each window in consistent positions, was made possible due to the significant progress made in cross-platform support in the UI library, which now provides responsively-sized windows that can work on desktop, tablet and phone with the same code. HTML comments are occasionally used on a few articles to alert editors to contentious or problematic issues without disrupting articles as they are read, so making them prominently visible avoids editors accidentally stepping over expected limits. Re-using citations is now provided with its simple dialog available in the toolbar so that it is easier for users to find.

Other improvements include an array of performance fixes aimed especially at mobile users, fixes for a number of minor instances where VisualEditor would corrupt the page, better monitoring of corruptions if they occur, and better support for right-to-left languages, displaying icons with the right orientation based on context.

The mobile version of VisualEditor, currently available for beta testers, moved towards stable release, fixing a number of bugs and editing issues and improving loading performance. Our work to support languages made some significant gains, nearing the completion of a major task to support IME users, and the work to support Internet Explorer uncovered some more issues as well as fixes. The deployed version of the code was updated five times in the regular release cycle (1.24-wmf12, 1.24-wmf13, 1.24-wmf14, 1.24-wmf15 and 1.24-wmf16).

In wider news, the team expanded its scope to cover all MediaWiki editing tools as well, as the new Editing Team (covered below).


In July, the newly re-named and re-scoped Editing Team was formed from the VisualEditor Team. We are responsible for extending and improving the editing tools used at Wikimedia – primarily VisualEditor and maintenance for WikiEditor. We exist to support new and existing editors alike; our current work is mostly on desktop, and we are working with Mobile to take responsibility for all editing across desktop, tablet and phone platforms, spanning approximately 50 different areas of MediaWiki and extensions related to editing. We will continue to report progress on VisualEditor separately.

The biggest Editing change this month was in the Cite extension (for footnotes) – this now automatically shows a references list at the end of the page if you forget to put in a <references /> tag, instead of displaying an ugly error message. The Math extension (for formulæ) was improved with more rigorous error handling and LaTeX formula checking, as part of the long-term volunteer-led work to introduce MathML-based display and editing. The TemplateData GUI editor was deployed to a further six wikis – the English, French, Italian, Russian, Finnish and Dutch Wikipedias.
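
The fallback behaviour described for Cite can be illustrated with a small Python sketch. This is not the extension's code: the real extension is written in PHP and works while the page is rendered, whereas this hypothetical `ensure_references_list` helper operates on raw wikitext, ignores templates that may supply the list, and only shows the idea of appending a references list when footnotes exist but no list was requested.

```python
import re

# Matches <references />, <references/> or an opening <references ...> tag.
REFERENCES_TAG = re.compile(r"<references\b[^>]*>", re.IGNORECASE)
# Matches an opening <ref> tag, with or without attributes (but not <references>).
REF_TAG = re.compile(r"<ref\b[^>]*>", re.IGNORECASE)


def ensure_references_list(wikitext):
    """Append a references list if footnotes are used but no list is present."""
    has_footnotes = REF_TAG.search(wikitext) is not None
    has_list = REFERENCES_TAG.search(wikitext) is not None
    if has_footnotes and not has_list:
        return wikitext.rstrip("\n") + "\n<references />\n"
    return wikitext


page = "The sky is blue.<ref>Some source</ref>"
print(ensure_references_list(page))
```

A page that already contains a references tag, or that has no footnotes at all, is returned unchanged.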

A lot of work was done on libraries and infrastructure for the Editing Team and others. The OOjs UI library was extensively modified to bring in a new window management system for comprehensive combined desktop, tablet and phone support, as well as other updates to improve Internet Explorer compatibility and accessibility of controls. In the next few months the team will continue working on OOUI to support other teams’ needs and implement a consistent look-and-feel in collaboration with the Design team. The OOjs library was updated to fix a minor bug, with a new version (v1.0.11) released and pushed downstream into MediaWiki, VisualEditor and OOjs UI. The ResourceLoader framework was extended to allow skins to set the “skinStyles” property themselves, rather than rely on faux dependencies, as part of wider efforts led jointly by a volunteer and a team member to improve MediaWiki’s skin support.


In July, the Parsoid team continued with ongoing bug fixes and bi-weekly deployments.

With an eye towards supporting Parsoid-driven page views, the Parsoid team strategized on addressing Cite extension rendering differences that arise from site-messages based customizations and is considering a pure CSS-based solution for addressing the common use cases. We also finished work developing the test setup for doing mass visual diff tests between PHP parser rendering and Parsoid rendering. It was tested locally and we started preparations for deploying that on our test servers. This will go live end-July or early-August.

The GSoC 2014 LintTrap project continued to make good progress. We had productive conversations with Project WikiCheck about integrating LintTrap with WikiCheck in a couple different ways. We hope to develop this further over the coming months.

Overall, this was also a month of reduced activity with Gabriel now officially full time in the Services team and Scott focused on the PDF service deployment that went live a couple days ago. The full team is also spending a week at an off-site meeting, working and spending time together in person prior to Wikimania in London.


Services and REST API

The brand new Services group (currently Matt Walker and Gabriel Wicke) started July with two main projects:

  1. PDF render service deployment
  2. Design and prototyping work on the storage service and REST API

The PDF render service is now deployed in production, and can be selected as a render backend in Special:Book. The renderer does not work perfectly on all pages yet, but the hope is that this will soon be fixed in collaboration with the other primary author of this service, C. Scott Ananian.

Prototyping work on the storage service and REST API is progressing well. The storage service now has early support for bucket creation and multiple bucket types. We decided to configure the storage service as a backend for the REST API server. This means that all requests will be sent to the REST API, which will then route them to the appropriate storage service without network overhead. This design lets us keep the storage service buckets very general by adding entry point specific logic in front-end handlers. The interface is still well-defined in terms of HTTP requests, so it remains straightforward to run the storage service as a separate process. We refined the bucket design to allow us to add features very similar to Amazon DynamoDB in a future iteration. There is also an early design for light-weight HTTP transaction support.
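To make the routing design above more concrete, here is a rough sketch of a REST front end that forwards requests to an in-process storage service. All class names, the `page` route and the title normalization are hypothetical; the actual prototype is not described at this level of detail in the report.

```python
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, Optional[str]]  # (HTTP-like status, body)


class Bucket:
    """A deliberately generic key-value bucket (hypothetical)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def handle(self, method: str, key: str, body: Optional[str] = None) -> Response:
        if method == "PUT":
            self._items[key] = body or ""
            return 201, None
        if method == "GET":
            if key in self._items:
                return 200, self._items[key]
            return 404, None
        return 405, None


class StorageService:
    """Owns the buckets; its interface mirrors HTTP verbs so it could run separately."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Bucket] = {}

    def create_bucket(self, name: str) -> None:
        self._buckets.setdefault(name, Bucket())

    def request(self, method: str, bucket: str, key: str,
                body: Optional[str] = None) -> Response:
        target = self._buckets.get(bucket)
        if target is None:
            return 404, None
        return target.handle(method, key, body)


class RestApi:
    """Front end: every request enters here and is routed in-process."""

    def __init__(self, storage: StorageService) -> None:
        self._storage = storage
        self._handlers: Dict[str, Callable[[str, str, Optional[str]], Response]] = {}

    def route(self, prefix: str, bucket: str) -> None:
        def handler(method: str, key: str, body: Optional[str]) -> Response:
            # Entry-point-specific logic lives in the front-end handler,
            # so the bucket itself can stay generic.
            normalized = key.replace(" ", "_")
            return self._storage.request(method, bucket, normalized, body)

        self._handlers[prefix] = handler

    def request(self, method: str, path: str, body: Optional[str] = None) -> Response:
        prefix, _, key = path.strip("/").partition("/")
        handler = self._handlers.get(prefix)
        if handler is None:
            return 404, None
        return handler(method, key, body)
```

Because `StorageService.request` already takes a method, a bucket and a key, replacing the direct call with a real HTTP client would be a local change in the handler.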

Matt Walker is sadly leaving the Foundation by the end of this month to follow his passion of building flying cars. This means that we currently have three positions open in the service group, which we hope to start filling soon.

Core Features


In July, the Flow team built the ability for users to subscribe to individual Flow discussions, instead of following an entire page of conversations. Subscribing to an individual thread is automatic for users who create or reply to the thread, and users can choose to subscribe (or unsubscribe) by clicking a star icon in the conversation’s header box. Users who are subscribed to a thread receive notifications about any replies or activity in that thread. To support the new subscription/notification system, the team created a new namespace, Topic, which is the new “permalink” URL for discussion threads; when a user clicks on a notification, the target link will be the Topic page, with the new messages highlighted with a color. The team is currently building a new read/unread state for Flow notifications, to help users keep track of the active discussion topics that they’re subscribed to.
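The subscription rules described above can be sketched as a small in-memory model. The class and method names are invented for illustration and do not reflect Flow's actual (PHP) implementation; the choice not to notify the author of a post about their own post is also an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TopicSubscriptions:
    """Hypothetical model of per-topic subscriptions and notifications."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)
        self.notifications: Dict[str, List[str]] = defaultdict(list)

    def toggle_star(self, user: str, topic: str) -> bool:
        """Flip the subscription state; return True if now subscribed."""
        subscribers = self._subscribers[topic]
        if user in subscribers:
            subscribers.remove(user)
            return False
        subscribers.add(user)
        return True

    def post(self, author: str, topic: str) -> None:
        """Record a new topic or reply: notify subscribers, then subscribe the author."""
        for user in sorted(self._subscribers[topic]):
            if user != author:
                self.notifications[user].append(topic)
        # Creating or replying to a thread subscribes the author automatically.
        self._subscribers[topic].add(author)
```

A notification here only stores the topic name, standing in for the Topic page link that a real notification would point to.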



In July, the Growth team completed its second round of A/B testing of signup invitations for anonymous editors on English Wikipedia, including data analysis. The team also built the first API and interface prototypes for task recommendations. This new system, first aimed at brand new editors, makes suggestions based on a user’s previous edits.


Wikimedia Apps

Following on from the successful launch to Android, the Mobile Apps team released the new native Wikipedia app to iOS on July 31. The app is the iOS counterpart to the Android app, with many of the same features such as editing, saving pages for offline reading, and browsing history. The iOS app also contains an onboarding screen that is shown the first time the app is launched, asking users to sign up, a feature which was also launched on Android this month (see below).

On Android this month we released to production accessibility and styling features which were requested by our users, such as a night mode for reading in the dark and a font size selector. We also released an onboarding screen that asks users to sign up.

Our plan for next month is to get user feedback from Wikimania, wrap up our styling fixes, and begin work on an onboarding screen the first time that someone taps edit.

Mobile web projects

This month, the team continued to focus on wrapping up the collaboration with the Editing team to bring VisualEditor to tablet users on the mobile site. We also began working to design and prototype our first new Wikidata contribution stream, which we will build and test with users on the beta site in the coming month.

Wikipedia Zero

During the last month, the team worked on software architecture features that allow for expansion of the Wikipedia Zero footprint on partner networks and that get users to content faster with support for lowered cache fragmentation on Varnish caches. Whereas the previous system supported only a one-size-fits-all configuration for heterogeneous partner networks, inhibiting some zero-rated access, the new system supports multiple configurations for disparate IP addresses and connection profiles per operator. Additionally, lightweight script and GIF-ified Wikipedia Zero banner support has been added and is being tested; in time this should drastically reduce Varnish cache fragmentation, so that pages are served faster and Varnish server load drops. A faster landing page was introduced for “zerodot” (zero.wikipedia.org, the legacy text-only experience) for operators with multiple popular languages in their geography. Work on compression proxy traffic analysis for header enrichment conformance with the official Wikipedia Zero configurations was also performed after more diagnostic logging code was added to the system. Finally, watchlist thumbnails, although low bandwidth, were removed from the zerodot user experience, as was the higher bandwidth MediaViewer feature for zerodot; mdot will have these features, though.
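As an illustration of what "multiple configurations per operator" could look like, the sketch below picks a configuration by client IP address. The `ZeroConfig` fields, the operator name and the longest-prefix rule for overlapping ranges are all assumptions; the addresses come from ranges reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


@dataclass(frozen=True)
class ZeroConfig:
    """One hypothetical configuration for part of an operator's network."""
    operator: str
    network: Network
    languages: Tuple[str, ...]
    show_banner: bool = True


def find_config(client_ip: str, configs: List[ZeroConfig]) -> Optional[ZeroConfig]:
    """Return the most specific configuration covering client_ip, if any."""
    address = ipaddress.ip_address(client_ip)
    matches = [config for config in configs if address in config.network]
    if not matches:
        return None
    # Assumption: when ranges overlap, the longest prefix wins.
    return max(matches, key=lambda config: config.network.prefixlen)


CONFIGS = [
    ZeroConfig("ExampleTel", ipaddress.ip_network("192.0.2.0/24"), ("en", "fr")),
    ZeroConfig("ExampleTel", ipaddress.ip_network("192.0.2.128/25"), ("en",),
               show_banner=False),
]
```

With these sample entries, a client at 192.0.2.10 gets the general configuration, a client at 192.0.2.200 gets the more specific one, and an address outside both ranges gets no zero-rated configuration.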

In side project work, the team spent time on API continuation queries, Android IP editing notices, Amazon Kindle and other non-Google Play distribution, and Google Play reviews (now that the Android launch dust has settled, mobile apps product management will be triaging the reviews). In partnerships work, the team met with Mozilla to talk about future plans for the Firefox OS HTML5 app (e.g., repurposing the existing mobile website, but without any feature reduction) and how Wikimedia search might be further integrated into Firefox OS, and also spoke with Canonical about how Wikipedia might be better integrated into the forthcoming Ubuntu Phone OS.

Routine pre- and post-launch configuration changes were made to support operator zero-rating, with routine technical assistance provided to operators and the partner management team to help add zero-rating and address anomalies. The team also continued its search for a third Partners engineering teammate.

Wikipedia Zero (partnerships)

We served an estimated 68 million free page views in July through Wikipedia Zero. We continue to bring new partners into the program, though none launched in July. Adele Vrana met with prospective partners and local Wikimedians in Brazil. We published our operating principles to increase transparency.

Language Engineering

Language tools

CLDR extension was updated to use CLDR 25; this work was mostly done by Ryan Kaldari. The team made various internationalization fixes in core, MobileFrontend, Wikipedia Android app, Flow, VisualEditor and other features. In the Translate extension, Niklas Laxström fixed ElasticSearchTTMServer to provide translation memory suggestions longer than one word; and improved translation memory suggestions for translation units containing variables (bug 67921).

Language Engineering Communications and Outreach

We announced the initial availability of the Content translation tool with limited feature support. We are focusing on supporting Spanish to Catalan translations for this initial release. You can read a report on the feedback received since deployment.

Content translation

An initial version was released on Beta Labs; it supports machine translation between Spanish and Catalan. The machine translation API leverages open source machine translation with Apertium. The tool supports experimental template adaptation between languages. Numerous bug fixes were made based on testing and user feedback. We worked on matching the Apertium version to the cluster, and planning for the next round of development has started.

Platform Engineering

MediaWiki Core


The Beta cluster is running HHVM. The latest MediaWiki-Vagrant and Labs-vagrant use HHVM by default.

Admin tools development

Most admin tools resources are currently diverted towards SUL finalisation, which will greatly help in reducing the admin tools backlog. July saw the deployment of the global rename tool (bug 14862), and core fixes including the creation of the “viewsuppressed” userright (bug 20476).


Our deployment of CirrusSearch to larger wikis as the primary search back-end turned out to be too ambitious. After encountering performance issues, we rolled back this change. We are now addressing the root of the problem by getting more servers (nearly doubling the cluster size) and putting together more optimizations to the portion of Cirrus that fell over (the working set). If everything goes as planned, the working set will be reduced by about 80%, trading some indexing performance for search performance. These optimizations will slightly change result relevance; please let us know if you notice any issues.

Auth systems

Most work was spent on SUL Finalization tasks. Phpunit and browser tests were added for CentralAuth, global rename was deployed, and lots of small fixes were made to CentralAuth to clean up user accounts in preparation for finalization.

SUL finalisation

In July, the SUL finalisation team began work on completing the necessary feature work to support the SUL finalisation.

To help users with local-only accounts that are going to be forcibly renamed due to the SUL finalisation, the team is working on a form that lets those users request a rename. These requests will be forwarded onto the stewards to handle. The SUL team is currently in consultation with the stewards about how they would like this tool to work. When this consultation is wrapped up, the team will begin design and implementation.

To help users get globally renamed without having to request renames on potentially hundreds of wikis, the team implemented and deployed GlobalRenameUser, a tool which renames users globally. As the tool is designed to work post-finalisation, it only performs renames where the current name is global, and the requested name is totally untaken (no global account and no local accounts exist with that name).
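The precondition described above (the current name is global, and the requested name is not taken anywhere) can be written down as a short check. This is a sketch with made-up data structures, not the CentralAuth code.

```python
from typing import Dict, Set


def can_rename_globally(current: str, requested: str,
                        global_accounts: Set[str],
                        local_accounts: Dict[str, Set[str]]) -> bool:
    """Return True only if the rename is safe in a post-finalisation world.

    global_accounts: names of global accounts (hypothetical structure).
    local_accounts: maps a wiki name to the account names that exist locally.
    """
    if current not in global_accounts:
        return False  # only global accounts can be renamed globally
    if requested in global_accounts:
        return False  # the target name is already a global account
    # The target name must not exist as a local account on any wiki.
    return all(requested not in names for names in local_accounts.values())
```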

To help users who get renamed by the finalisation and, despite our best efforts to reach out to them, did not get the chance to request a rename before the finalisation, the team is working on a feature to let users log in with their old credentials. The feature will display an interstitial when they log in, informing them that they logged in with old credentials and that they need to use new ones. We are also considering a persistent banner for those users, so that they definitely know they need to use their new credentials. An early beta version of this feature is complete, and now needs design and product refinements to be completed.

To help users who get renamed by the finalisation and, as a result, have several accounts that were previously local-only turned into separate global accounts, the team is working on a tool to merge global accounts. We chose to merge accounts as it was the easiest way to satisfy the use case without causing further local-global account clashes that would cause us to have to perform a second finalisation. The tool is in its preliminary stages.

The team also globalised some accounts that were not globalised but had no clashes. These accounts were either created in this local-only form due to bugs, or are accounts from before CentralAuth was deployed where the user never globalised. As these accounts had no clashes, there were no repercussions to globalising these accounts, so we did this immediately.

At present, no date has been chosen for the finalisation. The team plans to have the necessary engineering work done by the end of the quarter (end of September 2014), and have a date chosen by then.

Next month the team plans to continue work on these features.

Security auditing and response

MediaWiki 1.23.2 was released, fixing 3 security bugs. Security reviews were made for BounceHandler and Petition extensions, and the password API was merged.

Release Engineering

Release Engineering

This month, the Release and QA Team became the Release Engineering Team, mostly reflecting the transition of this team from being made up of members of other distinct teams to that of a coherent, (mostly) self-contained team. This will, hopefully, allow better coordination of “Release” and “QA” things (broadly speaking).

A lot of progress was made on making Phabricator suitable as a task/bug tracking system for Wikimedia projects. You can see the work to be sorted and completed at this workboard.

The Beta Cluster now runs with HHVM, bringing us much closer to full HHVM deployment. In addition, the Language Team deployed the new Content translation system on the Beta Cluster with the help of the Release Engineering team.

The second round of public RFP for third-party MediaWiki release management was conducted and concluded.

We now no longer use the third-party Cloudbees service for any of our Jenkins jobs and run all jobs locally. This will enable us to better diagnose issues with our build process, especially as it pertains to our browser tests (which still mostly run on SauceLabs).

Quality Assurance

This month, the QA team finished two significant achievements: after porting all the remaining browser tests from the browsertests repository to the repositories of the extensions being tested in June, as well as porting a significant set of tests to MediaWiki core itself, we completely retired the Jenkins instance running on a third-party host in favor of running test builds from the Wikimedia Jenkins instance, and we deleted the /qa/browsertests code repository. These moves are the result of more than two years of work. In addition, we have added more functions to the API wrapper used by browser tests, improved support for testing in Vagrant virtual machines, added new Jenkins builds for extensions, and improved the function of the beta labs test environments by preventing database locks and stopping users from being logged out by accident.

Browser testing

The browser tests are now all integrated with builds on the Wikimedia Jenkins host. We added browser tests for MediaWiki core that will validate the correctness of a MediaWiki installation regardless of language, or of what extensions may or may not exist on the wiki, so that the tests may be packaged with the distribution of MediaWiki itself and used on arbitrary wikis. We saw a lot of browser test activity for Flow development, and we are preparing to support even more extensions and features in the very near future.



Media Viewer’s new ‘minimal design’.

In July, the multimedia team reviewed more feedback about Media Viewer, from three separate Requests for Comments on the English and German Wikipedias, as well as on Wikimedia Commons. Based on this community feedback, the team worked to make the tool more useful for readers, while addressing editor concerns. We are now considering a new ‘minimal design’, which would include: a much more visible link to the File: page; an even easier way to disable the tool; a caption or description right below the image; removing additional metadata below the image, directing users to the File: page instead.

As described in our improvements plan, these new features are being prototyped and will be carefully tested with target users in August, so we can validate their effectiveness before developing and deploying them in September. You can see some of our thinking in this presentation.

This month, we continued to work on the Structured Data project with the Wikidata team and many community members, to implement machine-readable data on Wikimedia Commons. We prepared to host a range of online and in-person discussions to plan this project with our communities, and aim to develop our first experiments in October, based on their recommendations. We also continued a major code refactoring for the UploadWizard, and fixed a number of bugs in some of our other multimedia tools.

Last but not least, we prepared seven different multimedia roundtables and presentations for Wikimania 2014, which we will report on in more depth in August. For now, you can keep up with our work by joining the multimedia mailing list.

Engineering Community Team

Bug management

At the Pywikibot bugdays, 189 reports received updates. Technically, Jan enabled invalidating the CSS cache and strict transport security, Matanya updated Bugzilla’s cipher_suite and cleaned up a template, and Daniel deleted an unused config file. Tyler and Andre added requested components to Bugzilla. Planning of an exposed “easy bug of the week” continued, summarized on a wikipage.

Phabricator migration

Phabricator’s “Legalpad” application (a tool to manage trusted users) was set up on a separate server. This instance provides WMF Single-User Login authentication.

Mukunda implemented restricting access to tasks in a certain project which can be tested on fab.wmflabs.org. As a followup, he investigated enforcing security policy also on files and attachments and replacing the IRC bots by Phab’s chatbot. Chase worked on initial migration code to import data from Bugzilla reports into Phabricator tasks (and ran into missing API code in Phabricator), investigated configuring Exim for mail, set up a data backup system for Phabricator, and upgraded the dedicated Phabricator server to Ubuntu Trusty. Quim started documenting Phabricator.

Andre helped making decisions on defining field values and how to handle certain Bugzilla fields in the import script and sent a summary email to wikitech-l about the Phabricator migration status.

Mentorship programs

All Google Summer of Code and FOSS Outreach Program for Women projects continued their development toward a successful end. For details, check the reports:

Technical communications

Chart showing historical Flesch reading ease data for Tech News, a measure of the newsletter’s readability. Higher scores indicate material that is easier to read. A score of 60–70 corresponds to content easily understood by 13- to 15-year-old students.

Guillaume Paumier collaborated with authors of the Education newsletter to set it up for multilingual delivery, using a script similar to the one used for Tech News. He also wrote a detailed how-to to accompany the script for people who want to send a multilingual message across wikis. In preparation for the Wikimania session about Tech News, he updated the readability and subscribers metrics. He also continued to provide ongoing communications support for the engineering staff, and to prepare and distribute Tech News every week.
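For readers unfamiliar with the readability metric in the chart caption above, the standard Flesch reading ease formula is 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The sketch below takes the three counts as inputs; how Tech News actually counts words, sentences and syllables is not stated here, so that part is left out.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Compute the Flesch reading ease score from raw counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

For example, a text with 150 words, 10 sentences and 225 syllables scores about 64.7, which falls in the 60–70 band mentioned in the caption.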

Volunteer coordination and outreach

We focused on the preparation of the Wikimania Hackathon, encouraging all registered participants to propose topics and sign up to interesting sessions. We also organized a Q&A session with potential organizers of the Wikimedia Hackathon 2015. We organized two Tech Talks: “Hadoop and Beyond. An overview of Analytics infrastructure” and “HHVM in production: what that means for Wikimedia developers”. More activities hosted in July can be found at Project:Calendar/2014/07.

Architecture and Requests for comment process

Developers finished the security architecture guidelines, and discussed several requests for comment in online architecture meetings:


In July, Quim Gil sorted the tasks necessary for the first hub prototype into a Phabricator board, and Sumana Harihareswara determined which three APIs she would document first.



Wikimetrics can now generate vital sign metrics for every project daily. Rolling Monthly Active Editor metric has been implemented; the reports are in JSON format, in a logical path hosted on a file server and downloadable. The team also worked on backfilling data for the daily reports on Newly Registered and Rolling Active Editor, and numerous optimizations to backfill the data quickly.
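A rolling active editor count can be sketched as follows. The 30-day window and the threshold of five edits are assumptions chosen for illustration, as are the function names and the report layout; Wikimetrics' actual definitions and JSON format may differ.

```python
import json
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, List, Tuple

Edit = Tuple[str, date]  # (user name, day of the edit)


def rolling_active_editors(edits: Iterable[Edit], as_of: date,
                           window_days: int = 30, min_edits: int = 5) -> int:
    """Count users with at least min_edits edits in the window ending on as_of."""
    start = as_of - timedelta(days=window_days)
    counts = Counter(user for user, day in edits if start < day <= as_of)
    return sum(1 for total in counts.values() if total >= min_edits)


def daily_report(edits: List[Edit], days: List[date]) -> str:
    """Serialize one value per day as JSON, keyed by ISO date."""
    return json.dumps(
        {day.isoformat(): rolling_active_editors(edits, day) for day in days},
        sort_keys=True,
    )
```

Recomputing the window from scratch for every day is simple but slow, which hints at why backfilling historical daily values needs optimization.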

Data Processing

New nodes were added to the cluster this month and all machines were upgraded to run CDH5. The team decided not to preserve any data on the cluster during the upgrade and started fresh. The team hosted a Tech Talk on our Hadoop installation (see video and slides). Duplicate monitoring has also been implemented in Hadoop to monitor the incoming Varnish logs.

Editor Engagement Vital Signs

The culmination of our efforts this month can be visualized in a prototype built for Wikimania. This was made possible thanks to many back-end enhancements (optimizations) to Wikimetrics, along with research and selection of the optimal technologies to implement the stack to display a dashboard.


EventLogging monitoring is now in graphite, and we can see which schemas cause spikes in traffic (example).

Research and Data

This month, we completed the documentation for the Active Editor Model, a set of metrics for observing sub-population trends and setting product team goals. We also engaged in further work on the new pageviews definition. An interim solution for Limited-duration Unique Client Identifiers (LUCIDs) was also developed and passed to the Analytics Engineering team for review.

We analyzed trends in mobile readership and contributions, with a particular focus on the tablet switchover and the release of the native Android app. We found that in the first half of 2014, mobile surpassed desktop in the rate at which new registered users become first-time editors and first-time active editors in many major projects, including the English Wikipedia. An update on mobile trends will be presented at the upcoming Monthly Metrics meeting on July 31.

Development of a standardised toolkit for geolocation, user agent parsing and accessing pageviews data was completed.

We supported the multimedia team in developing a research study to objectively measure the preferences of Wikipedia editors and readers.

We hosted the July research showcase with a presentation by Aaron Halfaker of 4 Python libraries for data analysis, and a guest talk by Center for Civic Media’s Nathan Matias on the use of open data to increase the diversity of collaboratively created content.

We prepared 8 presentations that we will be giving or co-presenting next week at Wikimania in London. We also organized the next WikiResearch hackathon that will be jointly hosted in London (UK) (during the pre-conference Wikimania Hackathon) and in Philadelphia (USA) on August 6-7, 2014.

We filled the fundraising research analyst position: the new member of the Research & Data team will join us in September and we’ll post an announcement on the lists shortly before his start date.

Lastly, we gave presentations on current research at the Wikimedia Foundation at the Institute for Scientific Interchange (Turin) and at the DesignDensity lab (Milan).


Screenshot of the first Project Gutenberg ZIM file

The Kiwix project is funded and executed by Wikimedia CH.

We have pre-release binaries of the next 0.9 (final) release. Except for OSX, everything seems to work fine so far. The support of RaspberryPi was finally merged to the kiwix-plug master branch; this offers new perspectives because the price to create a Kiwix-Plug has dropped to around USD 100. We also started an engineering collaboration with ebook reader manufacturer Bookeen (in the scope of the Malebooks project) to be able to offer an offline version of Wikipedia on e-ink devices.
We participated in the Google Serve Day at Google Zurich. The goal was to meet Google engineers during one day and have them work on open source projects. The result was a dozen fixed bugs and implemented features, mostly on Kiwix for Android, but also in Kiwix for desktop and MediaWiki.
Four developers had a one-week hackathon in Lyon, France to develop an offline version of the Gutenberg library. We’re currently polishing the code and plan a release soon; our partners and sponsors plan the first deployments in Africa in Autumn.
Last but not least, a proof-of-concept of a Kiwix iOS app was made, so we might release a first app before the end of the year.


The Wikidata project is funded and executed by Wikimedia Deutschland.

The biggest improvement around Wikidata in July is the release of the entity suggester. It makes it a lot easier to see what kind of information is missing on an item. Helen and Anjali, Wikidata’s Outreach Program for Women interns, continued improving user documentation and outreach around Wikidata as well as worked on a new design for the main page. Guided Tours were published, helping newcomers find their way around the site. The developers further worked on supporting badges (like “featured article”), redirects between items, the monolingual text datatype (to be able to express things like the motto of a country) as well as the first implementation steps for the new user interface design. Additionally the first JSON dumps were published.


The engineering management team continues to update the Deployments page weekly, providing up-to-date information on the upcoming deployments to Wikimedia sites, as well as the annual goals, listing ongoing and future Wikimedia engineering efforts.

This article was written collaboratively by Wikimedia engineers and managers. See revision history and associated status pages. A wiki version is also available.

by Guillaume Paumier at August 26, 2014 10:09 AM

August 25, 2014

Harry Burt

Wikimania review

Wikimania 2014 was held earlier in the Barbican Centre in London. This particular article of mine was originally published in the Signpost, where it had about 1500 page views.

Prologue: hackathon

The pre-Wikimania Hackathon proved popular, with developers flooding the fourth floor for its introductory session.

As has become traditional, Wikimania proper was preceded by a two-and-a-half day hackathon, with entry at slight additional cost. While there had been concerns from hackathon organisers about what percentage of those registered would actually attend, it was clear from the word “go” that it would be alright on the night: the introductory session on Wednesday morning was packed, and numbers remained high throughout Thursday and into Friday. For attendees it was an opportunity to get in some ‘hacking’—any coding of an interesting nature, including work on tools, gadgets, MediaWiki and its extensions—meet other developers, and enjoy the comfortable (if slightly unusual) surroundings of the Barbican’s tropical conservatory and garden room. On a warm summer’s day, it felt like a greenhouse—not least because, in a very real sense, it was.

Nevertheless, the social atmosphere was Wikimania at its best: light, enthusiastic and welcoming to those more unfamiliar with the movement and its goals, here including an impressive assortment of journalists. Staff proved approachable, mixing freely with volunteers—indeed, the sessions served as a reminder that Wikimedians are peculiarly lucky in that regard. Such positivity even crept into sessions as potentially fraught as that led by the Foundation’s Fabrice Florin, a presentation and chat about the development direction of the controversial Media Viewer extension. Although there were minor quibbles, like the sprawling Barbican making it difficult to move from registration (floor: -1) to venue (floor: 4), or the deployment of sandwiches at lunch (“originally supposed to be lasagne”, Ed noted critically) and nothing at dinner, it was an uncomplicated unconference executed well. Even the WiFi held up, as it did throughout the conference—more or less.

Opening session and keynotes

Conference Organiser Ed Saperia opened Wikimania proper with a brief discussion of its main themes and their inspirations.

The opening session of Wikimania, held alongside a welcome drinks reception on the Thursday evening, could roughly be divided into two halves. The first consisted of four speakers (Ed Saperia, Wikimedia UK Chief Executive Jon Davies, Jimmy Wales and Lila Tretikov) enlisted to give short welcome speeches. Apart from an off-the-cuff remark from Wales that he wished the press would talk “less about the monkey” and more about the substantive issues raised in his pre-Wikimania press conference, the burden of getting the packed auditorium to tear themselves away from their phones/tablets/buzzword bingo cards fell to Salil Shetty, Secretary General of Amnesty International and sole keynote speaker of the Thursday evening session. Though many of Shetty’s remarks fell on sympathetic ears, it was his allusions to certain problems of scaling—the forced creation of staff headquarters in developing nations; the difficulties of running a global institution alongside local chapters—which stood out and it was a shame that Shetty did not share more of his considerable experience during the keynote itself.

Salil Shetty provided the opening keynote of Wikimania 2014, discussing the development and growth of Amnesty International, which he heads.

Shetty was arguably the most prominent of the non-Wikimedia names on the list of featured speakers—surprising, perhaps, for a conference that had won the bidding process promising speakers including Clay Shirky, Cory Doctorow, Lawrence Lessig and even Stephen Fry (see related Signpost coverage). Nevertheless, the speakers eventually organised proved sufficient to regularly fill and continuously entertain the cavernous Barbican Hall. The final lineup thus included Danny O’Brien (along with Wales, one of the two survivors of the original London bid), Jack Andraka, and, able to draw on the UK’s well developed civil society infrastructure, representatives of the thinktank Demos, Code Club and Young Rewired State among others: an admirable and effective lineup, if not quite the “VIP speakers (academics, politicians, media, entertainment)” originally described by Jimmy Wales in July 2012. In a Wikimania first, all of the featured speakers’ presentations were reliably streamed live and recordings rapidly made available online, a real boon considering Wikimedia’s global appeal and the months-long delays from previous Wikimanias.

Other tracks

Wikimania 2014’s eight tracks offered access to speakers on a wide variety of subjects; here, author and associate professor of journalism Andrew Lih discusses the difficulties of getting more video onto Wikimedia wikis.

In total, Wikimania 2014 claimed some 200 sessions over 8 simultaneous tracks, replete with the inevitable scheduling and organisational headaches. The organisers will be pleased with the variety they achieved: notable themes including open access, open data, technology, GLAM and diversity were all well-represented, while smaller topics (the legal aspects of Wikipedia, for example) seemed neatly stitched into accessible 90-minute blocks. The Barbican’s cavernous layout and the comfort of its designed-for-purpose auditoria thus conspired to make these blocks, rather than individual sessions, the primary unit of time management, to the benefit of some of the more niche interest talks on the programme. Each talk seemed ably staffed by the conference’s apparently vast team of volunteers, both technically and in terms of sticking to their timetables. The blocks were then in turn punctuated by coffee breaks, lunch, and on some days (but confusingly not all) dinner. Although hackathon attendees quickly got used to the “packed lunch” format, it was the dinners that particularly stood out, including bitesize burgers, skewers and sea-bream tacos (to name a few), served in reasonable quantity but alas without the purity of queuing to which many native Britons (the author included) are accustomed.

Aided by the high overall attendance (an estimated 2000, making London the largest Wikimania to date) all the sessions seemed to receive good levels of participation; there were not enough chairs, for example, to incorporate everyone attending an event on copyright, not usually a floor filler. Saperia added that hundreds of those tickets had been sold in the final days before the start of Wikimania proper, a reminder that it was not just hardcore Wikimedians in attendance. For those unable to attend a talk that they would have liked to (and with eight tracks, that included many attendees), slides and numerous recordings are now available. The quality of the talks varied, but around a high mean; early evidence suggests numerous standout sessions (the author would recommend Brandon Harris’ unique performance style, though his two talks were of very different kinds). Unsurprisingly many attendees also turned to Twitter to add their comments to those of a hyperactive Wikimania social media team, with an estimated 21,000 tweets using the #wikimania or #wikimania2014 hashtags over the course of the three-day conference.

Closing speeches

The Wikimania 2014 group photograph, taken immediately before the closing speeches

After a brief video in support of the students of Sinenjongo High School in their WMF-supported campaign to get Wikipedia Zero more widely adopted in the Global South, Jimmy Wales once more took to the stage to give his “state of the wiki” remarks. Most pertinent of these was his comment that too often what is intended as a minimum bar serves to define the normal and thus to hive off as supererogatory many of the virtues for which Wikimedia ought to strive: not just mere civility, Wales suggested, but “kindness, generosity, forgiveness, compassion”, a “morally ambitious” programme he said, but an achievable one. He also noted YouGov research that indicated the British public trusts Wikipedia more than both the tabloid and quality press.

Wales’ annual Wikimedian of the Year award went this year to Ihor Kostenko, a prominent Ukrainian Wikipedian and journalist tragically killed in the civil unrest that engulfed the capital Kiev earlier this year (see Signpost special report: “Diary of a protester—Wikimedian perishes in Ukrainian unrest”). It was a poignant and appropriate choice, although in a hat-tip to potential future controversy over the awarding of the honour, Wales promised to ensure a more “democratic” process was in place ahead of Wikimania 2015. After presenting some of the hosting chapter (Wikimedia UK)’s annual awards on their behalf, attention turned more fully to next year’s event, with a brief introductory video shown by the Mexico City team. Of its slogans, “our venue: Vasconcelos library” and “gay friendly” received the most enthusiastic support among the thousand-strong audience.

The Wikimania closing party contained its fair share of free drinks, loud music—and decidedly questionable dancing.

The speeches (including brief remarks by WMF Chair Jan-Bart de Vreede) were followed by the Wikimania closing party, an event backed by reasonable but not excessive amounts of free alcohol, and a selection of musical accompaniments in a variety of styles. Indeed, such entertainment was provided on each evening of the conference, interspersed with comedy performances on a technology theme. The latter especially was a brave choice, and the organisers will be forgiven if the jokes fell a little flat, or the dancefloor was a little empty. Patrons were also able to take advantage of the hackathon rooms—left open well into the night—or escape outside where attractive fountains punctuated the cold brutalist structure of the Barbican estate. The more adventurous tried the City of London’s wallet-busting public houses, if only for novelty value.

Epilogue: looking back

For some, the impact of Wikimania will be direct: a bustling community village featured an array of chapters eager to sign up new members, as well as a variety of non-WMF projects looking for exposure. For most, however, the effect is more subtle, subsisting in a set of renewed relationships, vague recollections and hearsay. It is difficult to see how Wikimania 2014 could have failed to impress the casual onlooker, with its sheer scale an obvious statement of intent. Of course, such a statement must also be paid for, and the debate over the financing of Wikimania, which necessarily took a backseat role for the duration of the conference, may yet cloud what should be enjoyable memories of an enjoyable Wikimania.

The same is true of the announcement, on the final day of the conference, that the WMF would be using technical measures to override local administrators on the German Wikipedia: as one European chapter member remarked, “at least it will give us something to talk about [at the closing party]”. Such worries aside, it was an impressive conference that promised the moon but had to settle for the stars.

Alternatively, in true British understatement, it was “not too bad, actually”.

by Harry at August 25, 2014 07:54 PM

Wikimedia UK

“The institutions that are loved survive”: Pat Hadley and the York Museums Trust

This post was written by Joe Sutherland

Pat Hadley was part way through a PhD in archaeology at the University of York in the summer of last year when he decided to leave to explore new areas in which to apply his skillset. A natural scientist with a digital background, he is interested in the “ways in which the public engages with the past”. For him, Wikipedia is an ideal platform to investigate this.

Since late 2013, he has worked as Wikimedian in Residence at York Museums Trust, helping them to share their collections through Wikipedia and its sister projects. An archaeologist of ten years, and a contributor to Wikipedia since 2011, Pat had been keen to find a museum in York interested in opening up access to their content.

In September 2013, Wikimedia UK supported the York Museums Trust, and two other institutions, in their search for a Wikimedian in Residence. The YMT is a charitable body which manages three museums, a contemporary art space, and a public gardens in the city.

“[The YMT] is a brilliant test case for the GLAM-Wiki project, because it’s almost the most typical set of museums you could possibly imagine, all in one space,” Pat explains.

Despite his background in academia, he was surprised to land the role. “I heard that the new scheme was going along, but I had no idea it would be me,” he says. Through a series of “happy accidents” he found himself looking for a project at the same time that applications were open and ended up with the job.

In his time at the YMT, Pat has run many major projects. They have ranged from training sessions for the institutions’ volunteers and staff, to donations of content held by the museum to Wikimedia Commons, to a public editathon.

One of his first projects revolved around Tempest Anderson, a doctor, amateur photographer and volcanologist from 19th-century York, whose images have been retained on glass lantern slides. “The museum was planning to do a high-resolution digitisation of those anyway,” Pat explains, “and they’re public domain, so they were one of the key early collections for the project to target.”

As one of the first projects to take place during his tenure, he did face challenges during the work. “Unfortunately we only managed to get 56 images released by the end of the residency, but we got five of those used on the English, German and French Wikipedias. So we’re already beginning to make ripples across Wikipedia. Hopefully in the next few months the museum will be releasing the rest of the images.”

The work on Anderson was built upon in March 2014, in an event focused on the luminaries of historical York. “It was a nightmare to think of a theme that could bring all the collections together,” Pat says.

Pat Hadley outside the Yorkshire Museum

Pat Hadley at the Yorkshire Museum
Photo: User:Rock drum, CC-BY-SA 4.0

As such, the day allowed the improvement of a wide variety of topics on Wikipedia, ranging from natural history to fine art to archaeology. Several YMT curators presented their areas of expertise to a determined collection of sixteen participants, most of whom had never edited before.

The topics covered in the editathon included York-based artists such as Mary Ellen Best. “She was a Victorian artist who painted domestic interiors mostly in watercolour,” Pat says. “She wasn’t painting the kinds of things that were popular among Victorian artists.

“She wasn’t getting much recognition at the time, but there were a significant number of her paintings in the collections here. As a result we were able to release some of those and have some volunteers and experienced Wikipedians work together to get her a very reasonable biography, and even got a ‘Did You Know’ on the front page of Wikipedia. That was fantastic.”

Andrew Woods, curator of numismatics at YMT, had an active role during the day. He focused on the Middleham Hoard, a collection of Civil War-era coinage that was discovered in the eponymous market town in North Yorkshire. “Since we acquired it, it had lain dormant. Despite the fact it is this astonishing, very important hoard, we hadn’t done anything with it,” Andrew explains.

“It doesn’t really fit with our gallery spaces,” he adds, “so what we were really keen to do is to put it on display digitally. We had the coins imaged by a volunteer and we put those images onto Wikimedia. From there they’ve really taken off–a whole page has been written about the hoard, and they’ve been used in a number of different ways thereafter. So it’s taken a hoard that nobody really knew anything about and made it visible to so many more people.”

Overall, the partnership has led to “YMT becoming more open”, says Pat, and he argues that Wikimedia should be a key part of the missions of GLAM institutions moving forward. “They need to be connected. Somebody once said it is the institutions that are loved by everyone that survive.”

“If there are central funding cuts,” he adds, “the museums that share their collections and generate love by giving their knowledge and gardening it out… they are the ones that are going to survive through crises. They’re going to get more people supporting them in all sorts of ways.”

by Stevie Benton at August 25, 2014 03:04 PM

Gerard Meijssen

#Wikimedia - "Share in the sum of all available knowledge"

When we are to focus on the available knowledge we have to share, statistics are key. They cut the crap and focus on numbers. Given that information can be made out of data, it is relevant to know how much additional information is available that is easily understood by people who can read English. Two reports are relevant; one shows the number of links in English and the other shows the number of labels in English [1].

At this time there are 757,967 items with an English label and without an article. This is 4.7% of the total number of items Wikidata holds. At the same time 58% of the items do not have a label in English.
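The two percentages can be turned into rough absolute numbers. The following is only a back-of-the-envelope sketch in Python; the variable names are made up for illustration, and because the percentages are rounded the results are approximate.

```python
# Figures quoted in the post.
labelled_without_article = 757_967   # items with an English label but no article
share_of_all_items = 0.047           # that count as a share of all Wikidata items
share_without_english_label = 0.58   # share of items lacking an English label

# Implied totals; approximate because the published percentages are rounded.
total_items = labelled_without_article / share_of_all_items
items_without_english_label = total_items * share_without_english_label
items_with_english_label = total_items - items_without_english_label

print(f"total items: about {total_items / 1e6:.1f} million")
print(f"without an English label: about {items_without_english_label / 1e6:.1f} million")
print(f"with an English label: about {items_with_english_label / 1e6:.1f} million")
```

On these figures Wikidata holds roughly 16 million items, of which well over half have no English label.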

Not having a label does not mean that we cannot provide meaningful information. The name of a Dutch or Spanish person is for instance perfectly understood; it is typically written exactly the same in English. Reasonator understands this and always presents a label anyway.

It is fairly easy to start sharing this "missing" information. It is already done in many Wikipedias. The suggestion to share more information has been put to all Wikipedias, and several "communities" do not think it is a good idea. In effect they prefer an inferior product providing a subset of the information that should be available to all our readers.

[1] it shows the numbers for other languages as well and, the statistics are near real time. It takes a minute for them to be presented to you.

by Gerard Meijssen (noreply@blogger.com) at August 25, 2014 09:54 AM

Tech News

Tech News issue #35, 2014 (August 25, 2014)


August 25, 2014 12:00 AM

August 24, 2014

Jamie Thingelstad

MediaWiki LocalSettings for Farmers

I’ve been running a MediaWiki farm at thingelstad.com for a couple of years now hosting about a dozen wikis ranging from small to very large. Running a MediaWiki farm is a bit complicated and you can approach it a number of different ways. I recently pushed the settings that I use to run my farm into GitHub so others can see how I do it. The next step will be to also move the scripts that I use up, but those will be kept in another repository.

Hopefully this proves useful to others. It’s useful for me to finally have these very complicated settings (really code!) under version control.

by Jamie Thingelstad at August 24, 2014 01:08 PM

Gerard Meijssen

#Reasonator - A new metric for #Wikimedia

Denny wrote a really good article in the Signpost. It includes a "TL;DR" that I am happy to quote.
TL;DR: We should focus on measuring how much knowledge we allow every human to share in, instead of number of articles or active editors. A project to measure Wikimedia's success has been started. We can already start using this metric to evaluate new proposals with a common measure.
The point Denny makes is great; we aim to enable every human being to share in the sum of all knowledge and we should measure the extent to which we are achieving this goal. When you read the article carefully it does not say Wikipedia, it says Wikimetrics. The point Denny makes is very much that we need to focus on what it takes to bring information to people.

Presenting data that is available to us as information is what Reasonator does. It relies on what is known in Wikidata about articles that exist in any Wikipedia. To make this understood to a person, the number of available statements and the number of available labels for an item are key.

When Wikimetrics is to appreciate the potential of Wikidata and the approach Reasonator takes, it should include three bits of information;
  • the number of statements per item
  • the number of labels per language
  • how items are covered with labels in a language
With such an approach the graph will be substantially different. Not one language covers 50% of all the topics known to Wikidata and consequently the graph will show that there is much more work for us to do. It will also indicate that the amount of information that is available for a public that can read English is much larger and the amount available to people who can only read Gujarati is much less.
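The three numbers listed above are simple to compute once the data is at hand. Here is a minimal sketch in Python, assuming a made-up in-memory dataset; the item ids, the field names and the function names are all hypothetical and do not reflect how Wikimetrics or Reasonator store their data.

```python
from collections import Counter

# Hypothetical miniature dataset: item id -> statement count and labels by language code.
items = {
    "A": {"statements": 5, "labels": {"en": "label", "nl": "label"}},
    "B": {"statements": 2, "labels": {"en": "label"}},
    "C": {"statements": 0, "labels": {"gu": "label"}},
    "D": {"statements": 1, "labels": {"en": "label"}},
}

def statements_per_item(items):
    """Average number of statements per item."""
    return sum(item["statements"] for item in items.values()) / len(items)

def labels_per_language(items):
    """Number of labels available in each language."""
    counts = Counter()
    for item in items.values():
        counts.update(item["labels"].keys())
    return counts

def label_coverage(items, language):
    """Share of all items that have a label in the given language."""
    labelled = sum(1 for item in items.values() if language in item["labels"])
    return labelled / len(items)

print(statements_per_item(items))
print(labels_per_language(items))
print(label_coverage(items, "en"), label_coverage(items, "gu"))
```

Plotting `label_coverage` per language would give the kind of graph described here, with every language falling short of full coverage.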

by Gerard Meijssen (noreply@blogger.com) at August 24, 2014 10:02 AM

August 23, 2014

Gerard Meijssen

#Wikidata - Ameyo Adadevoh a physician from #Nigeria

When a Mr Sawyer arrived in Lagos and showed symptoms of ebola, Mrs Adadevoh took control of the situation and thanks to her efforts ebola was largely contained. In the end it did not save her; as a physician in the frontline of the fight against ebola she became a victim herself.

Mrs Adadevoh is another hero of our times. When you google about ebola and Nigeria, there are two things that are of interest; sadly there are the opinion pieces that see a conspiracy in the coming of Mr Sawyer to Nigeria, but more positive is the information about the efforts to contain ebola in Nigeria and what it is that you can do to avoid becoming infected; personal hygiene is key.

There is a call to ensure that hospital staff are immunised. It is quite obvious that no country can really afford to lose key people like Mrs Adadevoh. It is equally obvious that all doctors and nurses who have to deal with ebola patients need to be protected. Without them containing and treating ebola is impossible.

by Gerard Meijssen (noreply@blogger.com) at August 23, 2014 11:04 PM

#Wikidata - Sheik Umar Khan a physician from Sierra Leone

Do not be mistaken. Mr Khan is a hero of our times. Mr Khan died of ebola. He was in charge of the fight to contain this awful disease.

There is a category of people who died from ebola; with currently three entries it is mercifully small. Then again, that man who died at Lagos airport is not in there. Probably more people who became notable because of ebola are missing as well.

It is important to recognise ebola for the threat it represents. One of the things you cannot do is run away from it. The only thing that is achieved is spreading the disease even further.

by Gerard Meijssen (noreply@blogger.com) at August 23, 2014 11:01 PM

Tony Thomas

Exim regex: Capture all VERPed emails with a given header pattern

Consider your VERP generator produces a Return-Path header of the form: bounces-testwiki-2a-nanrfx-Tn14EQZWaotS2XNn@mytesthost.com and you want a router to capture all bounce emails having this or a similar address as the To header. This router can serve multiple purposes, like feeding to a bounce processor, silently killing all bounces (not intended) or POSTing the email to an […]
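As a rough illustration of the kind of pattern such a router would have to test against, here is a minimal sketch in Python rather than in Exim's own configuration syntax. The layout of the address and the group names are guesses based on the single sample above, not something the excerpt confirms.

```python
import re

# Assumed layout: bounces-<wiki>-<token>-<token>-<hash>@<domain>.
# The group names are hypothetical labels, not taken from the original post.
VERP_PATTERN = re.compile(
    r"^bounces-(?P<wiki>\w+)-(?P<part1>\w+)-(?P<part2>\w+)"
    r"-(?P<hash>[\w+/=]+)@(?P<domain>[\w.-]+)$"
)

def parse_verp(address):
    """Return the captured fields as a dict, or None if the address does not match."""
    match = VERP_PATTERN.match(address)
    return match.groupdict() if match else None

sample = "bounces-testwiki-2a-nanrfx-Tn14EQZWaotS2XNn@mytesthost.com"
fields = parse_verp(sample)
print(fields)
```

Anchoring the pattern at both ends keeps ordinary addresses that merely contain the word "bounces" from being treated as bounces.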

by Tony Thomas at August 23, 2014 07:04 AM

Gerard Meijssen

#Wikidata - the #beta label lister

At the hackathon of #Wikimania2014, work was done on a new version of the label lister. It is a gadget that allows you to edit labels and aliases in other languages. It proved to be an indispensable tool to me. Today I learned that the new label lister is now available.

The most wonderful thing is that it became much more compact, you do not need to click as much anymore and it "just works". In the screenshot you see Mrs Bundschuh, she is a former member of the Landtag of Bavaria, and as you can see it is trivially easy to add a label in your language.

I hope that functionality like the label lister will make it into a core feature of Wikidata.

by Gerard Meijssen (noreply@blogger.com) at August 23, 2014 06:40 AM

August 22, 2014

Wikimedia Foundation

Grants, Programs and Learning: This year at Wikimania London

Grants, Programs & Learning booth in the Community Village.

Those present at this year’s Wikimania may have witnessed a different presence on behalf of the Grantmaking team. The department, formed by the Grants, Learning & Evaluation and Education teams, was present at the global conference that brings together Wikimedia project programs, movement leaders and volunteers to learn from and connect with one another at the five-day event. Whether at our booth in the Community Village or in the many presentations and workshops, the conversations we shared with community members from all over the world were very enriching.

We heard from more and more people interested in gathering data and working toward understanding, at a deeper level, what works and why. In this way, we are all working together towards building sustainable growth for the movement’s projects and programs; work that not only will involve new editors, but partnerships with other institutions that can help create free knowledge.

The need for sustainable growth

Learning Day notes on Logic Model.

Before the conference, we hosted a small Learning Day for leaders in our grants program to share experiences and insights from applying evaluation to various projects, projects that might help the movement grow. Jake Orlowitz shared his game The Wikipedia Adventure, an experimental project aimed at onboarding new editors. Sandra Rientjes, the executive director of Wikimedia Nederland (WMNL), presented her chapter’s long-term approach to programs. Wikimedia UK’s Daria Cybulska shared the Wikimedians in Residence Review to show how they have used evaluation to redesign and improve an existing program. To explore diversity, Amanda Menking talked about her experience in her research project on women and Wikipedia.

These four presentations demonstrated the wide range of experiments being conducted by the grants community. Continuing to measure and discuss evaluation can help us all discover projects that have impact and to understand if and how they can be replicated in different contexts.

The day came to a close with a lively Idea Lab Mixer and Learning Day Poster Session happy hour. This was an opportunity for grantees to showcase their work and insight gained in the past year and ignite conversations around creating new ideas to make Wikimedia even more awesome.

The need to learn from each other

Jake Orlowitz during his lightning talk.

For the first time, all the representatives from our grantmaking committees got together for training and impact discussions. The pre-conference sessions also hosted a special day to welcome new FDC members and discuss Participatory Grantmaking. Guest speaker Matthew Hart shared with the group his research on how this practice takes place and what benefits it has for donors, communities and movements. What does it mean to give a Wikimedia grant and work together on a project? In light of the recent Impact Reviews developed by the Learning & Evaluation team, which focused on Annual Plan Grants and Projects and Event Grants, three main priorities were highlighted with regard to working towards the movement’s goals: expanding reach, generating more participation and improving quality.

The Wikimedia movement is known for its capacity to innovate and learn from peers. We are now at a point when we need to standardize learning processes and generate resources that guarantee this knowledge exchange. As we continue working on program resources, we will also start working more closely with grantees on their project evaluation plans, hopefully reducing the time invested in this task and increasing impact.

The need for better tools and resources!

Wikimania was a great place to share new tools, resources and strategies around shared programs. Some highlights include:

  • Category Induced: Allows you to know how many categories were created from a collection category.
  • Easy FDC report: a tool that lets you gather the number of uploaders, files uploaded and highlighted files from a specific category and make it format-ready for FDC reports.
  • Unused files: Allows users to see which files in any given category have not yet been used.
  • Wikimetrics new features: this tool now lets you know which users from your cohort are newly registered and also includes the new metric ‘Rolling active editor.’ Find out more in this presentation!
  • Quarry: Allows you to run SQL queries against Wikipedia and other databases from your browser. Stay tuned for more documentation on this tool on the Evaluation portal on Meta!

For tool-driven program leaders, the new tools directory will come in handy to find these and other resources to measure online impact!

As we continue to work on the challenges that surfaced during conversations at Wikimania, we hope to continue the dialogue online with program leaders and grantees all over the world. We are working to connect talented people and good ideas across the movement, so we call out to movement leaders: stay connected, reach out and ask!

María Cruz, Community Coordinator of Program Evaluation & Design

by carlosmonterrey at August 22, 2014 10:08 PM

Wikimedia UK

Upcoming Training for Trainers session in Edinburgh

Attendees of the February 2014 Training the Trainers event

Wikimedia UK is committed to supporting our volunteers. To encourage them to teach others how to edit Wikipedia and other Wikimedia projects, we are running a weekend training workshop. This will take place on the weekend of 1-2 November in Edinburgh, and we would particularly encourage anyone from Scotland and the north of England to attend.

The workshop will be delivered by a professional training company and aims to improve delegates’ abilities to deliver any training workshop. It’s especially relevant to anybody who already runs Wikimedia-related training, or is very interested in doing so in the near future.

The workshop is a chance to:

  • Get accredited and receive detailed feedback about your presenting and training skills
  • Get general trainer skills which you can then apply when e.g. delivering specific Wikipedia workshops
  • Share your skills with others
  • Help design a training programme that serves Wikimedia UK in the long term.

The course will run from 9:30am-6:30pm on Saturday and 9am-5pm on Sunday. A light breakfast and lunch will be provided. We should also be able to cover travel and accommodation if you let us know in advance.

If you are interested in attending, please indicate your commitment by registering on this page but please note that places are limited.

If you are not able to attend this time but would like to take part in the future, please let us know by email to volunteering@wikimedia.org.uk – we will be offering more sessions in the future.

Please do not hesitate to contact us with any questions. We can also put you in touch with past participants who will be able to share their experiences with you.

by Katie Chan at August 22, 2014 05:14 PM

August 21, 2014

Wiki Education Foundation

For Senior Citizens Day, read about social issues concerning elderly care

Today we celebrate Senior Citizens Day, which honors the contributions senior citizens make all over the U.S. and was created to raise awareness of social issues concerning the elderly. To honor this awareness, read the Wikipedia article about elderly care, which student editor Ellyhutch expanded during the spring 2012 term in Dr. Diana Strassmann and Dr. Anne Chao’s Poverty, Gender, and Development course.

During the assignment, Ellyhutch added information about gender discrepancies in treatment of the elderly, legal issues regarding incapacity, and examples of elderly care in developing countries. Ellyhutch’s improvements have raised awareness for more than 300,000 readers about these important social issues!

Jami Mathewson
Educational Partnerships Manager

by Jami Mathewson at August 21, 2014 05:05 PM

Niklas Laxström

Midsummer cleanup: YAML and file formats, HHVM, translation memory

Wikimania 2014 is now over and that is a good excuse to write updates about the MediaWiki Translate extension and translatewiki.net.
I’ll start with an update related to our YAML format support, which has always been a bit shaky. Translate supports different libraries (we call them drivers) to parse and generate YAML files. Over time the Translate extension has supported four different drivers:

  • spyc uses spyc, a pure PHP library bundled with the Translate extension,
  • syck uses libsyck which is a C library (hard to find any details) which we call by shelling out to Perl,
  • syck-pecl uses libsyck via a PHP extension,
  • phpyaml uses the libyaml C library via a PHP extension.

The latest change is that I dropped syck-pecl because it does not seem to compile with PHP 5.5 anymore; and I added phpyaml. We tried to use spyc a bit but the output it produced for localisation files was not compatible with Ruby projects: after complaints, I had to find an alternative solution.

Joel Sahleen let me know of phpyaml, which I somehow did not find before: thanks to him we now use the same libyaml library that Ruby projects use, so we should be fully compatible. It is also the fastest driver of the four. Anyone generating YAML files with Translate is highly recommended to use the phpyaml driver. I have not checked how phpyaml works with HHVM but I was told that HHVM ships with a built-in yaml extension.

Speaking of HHVM, the long standing bug which causes HHVM to stop processing requests is still unsolved, but I was able to contribute some information upstream. In further testing we also discovered that emails sent via the MediaWiki JobQueue were not delivered, so there is some issue in command line mode. I have not yet had time to investigate this, so HHVM is currently disabled for web requests and command line.

I have a couple of refactoring projects for Translate going on. The first is about simplifying the StringMangler interface. This has no user visible changes, but the end goal is to make the code more testable and reduce coupling. For example the file format handler classes only need to know their own keys, not how those are converted to MediaWiki titles. The other refactoring I have just started is to split the current MessageCollection. Currently it manages a set of messages, handles message data loading and filters the collection. This might also bring performance improvements: we can be more intelligent and only load data we need.

Théo Mancheron competes in the men's decathlon pole vault final

Aiming high: creating a translation memory that works for Wikipedia; even though a long way from here (photo Marie-Lan Nguyen, CC BY 3.0)

Finally, at Wikimania I had a chance to talk about the future of our translation memory with Nik Everett and David Chan. In the short term, Nik is working on implementing in ElasticSearch an algorithm to sort all search results by edit distance. This should bring translation memory performance on par with the old Solr implementation. After that is done, we can finally retire Solr at Wikimedia Foundation, which is much wanted especially as there are signs that Solr is having problems.

Together with David, I laid out some plans on how to go beyond simply comparing entire paragraphs by edit distance. One of his suggestions is to try doing edit distance over words instead of characters. When dealing with the 300 or so languages of Wikimedia, what is a word is less obvious than what is a character (even that is quite complicated), but I am planning to do some research in this area keeping the needs of the content translation extension in mind.
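To make the difference concrete, here is a minimal sketch in Python of edit distance computed over characters and over words, plus a helper that ranks candidate texts by distance. This is plain Levenshtein distance for illustration only, not the algorithm being built for ElasticSearch; the function names are hypothetical, and splitting on whitespace is exactly the simplification that breaks down for many languages.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def word_distance(a, b):
    """Edit distance over words; whitespace splitting is a simplifying assumption."""
    return edit_distance(a.split(), b.split())

def rank_by_distance(query, candidates, distance=edit_distance):
    """Sort candidate source texts so that the closest one comes first."""
    return sorted(candidates, key=lambda text: distance(query, text))

memory = ["delete this page now", "save page now"]
print(rank_by_distance("save page", memory, word_distance))
print(edit_distance("the quick fox", "the slow fox"))
print(word_distance("the quick fox", "the slow fox"))
```

Replacing one word counts as a single edit at word level but as several at character level, which is why the two measures can rank the same suggestions differently.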

by Niklas Laxström at August 21, 2014 04:24 PM

Wikimedia UK

Free information, the internet and medicine

The image shows a small leaflet outlining the work of WikiProject: Medicine

This post was written by Vinesh Patel, a junior doctor and an alumnus of Imperial College, London

A new adventure for Wikimedia UK began this summer with a project in collaboration with Imperial College School of Medicine.

In a recent BBC article, Wikimedia UK highlighted the need for everyone looking for medical information to remember Wikipedia is simply an online encyclopedia, and nothing more.

A ganglion is a type of benign fluid collection that can form from fluid around tendons on your hand and some people used to claim it could be cured with a well judged thump with a Bible. However, evidence doesn’t support this practice. An encyclopedia with a similarly hard book covering would be judged by most laypeople today to be about as useful in solving such medical problems, and they would probably just see their doctor about a lump on their hand. Yet there seems to be a great tangle when the same information is put in an online encyclopaedia.

It is this tangle that is being explored by three groups of medical students, as they seek to edit selected Wikipedia articles within the field of medicine. Ten of them, from different year groups, are collaborating with senior academics to edit articles in academic fields they find interesting.

The format is they select a B or C class article from Wikiproject medicine and look to develop it over several months. They collaborate over several months to edit an article offline and then transcribe their work on to a WP page, having given notice they are going to conduct the edit on Wikipedia. One individual puts their work online after they . They receive help and guidance from senior academics. After putting their edits on WP they work with editors around the world to improve the article through normal routes of discussion on the talk page. The project is running from

The primary aim is to allow the students to develop their academic skills, but it is also hoped that the question of how free information on the internet is used in medicine will be given some practical answers. In future the program may be expanded to allow students to collaborate with students in developing countries. In fact, many students said the most inspiring aspect of the project is the potential to spread free medical information to their less privileged colleagues around the world, harnessing the possibilities of the internet.

by Stevie Benton at August 21, 2014 03:05 PM

August 20, 2014

Wikimedia Foundation

Remembering Jorge Royan

This is a syndicated post originally published by Wikimedia Argentina. The original Spanish version can be found here.

Wikimedia Argentina is saddened by the passing of our great friend and collaborator, the Argentinian architect and photographer Jorge Royan. Jorge was a winner of the National color photo Ranking by AFA and gold medal recipient from the International Federation of Photography (FIAP). Jorge also held various exhibits in local and international events. He was nominated by Agfa International as “professional of the month.”

As well as being a judge for Wiki Loves Monuments Argentina, Jorge donated hundreds of beautiful photos to Wikimedia Commons so that, in his own words, they won’t stay lost in his computer when he’s no longer around and serve a greater purpose other than just as a curiosity to his grandchildren. We hope that his wishes have been granted. Below you will find a small selection of Jorge’s work.

Thank you so much Jorge!

“A camera is like a bird that should be frozen in flight. To decide from where the bird looks into space (and with what eyes) is our job. Sometimes at ground level, others hang from a chandelier from four meters up. To obtain the wings is our responsibility.”

-Jorge Royan

Wedding photography, The Rudolfinum, Prague, Czech Republic


A skater in the Vondelpark. Amsterdam, The Netherlands


Multi-neck guitar, Paris, France


Jama Masjid the main mosque in Delhi, India


Via delle Oche, Italy


Violin repair shop, Salzburg, Austria


Ford Motor Company vintage Ford, Havana, Cuba


Maori rowing ceremonial choreography, New Zealand

We are currently looking to incorporate a photo of Jorge into this blog. If you have access to a freely licensed photo of Jorge, please contact us. Thank You.

by carlosmonterrey at August 20, 2014 11:26 PM

August 19, 2014

Wikimedia UK

Building the Open Access Button

This guest blog post was written by David Carroll, Open Access Button Project Lead

Earlier this month, as I sat at the Wikimania Open Data Hack in the Barbican, silently whirring in the back of my mind was an impending anniversary. It had been one year since the first line of code of what would later become the Open Access Button was written. The surroundings themselves were not dissimilar: a year earlier we were a little way across the city of London at The BMJ’s hack weekend, and it was there we found an incredible team of developers to make the Open Access Button Beta a reality.

The motivation for building the Open Access Button came just a few months earlier, when in March 2013, Joe McArthur and I learnt that people are systematically denied access to research every day. When we learnt this, we wanted to do something about it. Over the following months, we worked every second of our spare time with an amazing team of volunteer developers and in November 2013 we launched the Open Access Button.

The Button is a browser bookmarklet that allows users to report when they hit a paywall and are denied access to research. Being denied access to research is often an invisible problem and through the Button we aim to make the problem visible, collect the individual experiences, and showcase the global magnitude of the problem. 

So far we’ve tracked and mapped over 8,700 paywalls since the launch. These paywalls represent 8,700 times that scholars were denied access to research in their field: students couldn’t access additional sources for their thesis, or doctors couldn’t read the latest medical research. The stories collected so far are just the tip of a large iceberg of those being denied access to research.

The Button that currently exists at openaccessbutton.org is only an example of what we want to do in the future. Since November, we’ve been working hard planning for the future of the Button and building a global, diverse student team, supplemented by a professional steering committee, to help us make the future Button a success.

Recently, we announced a partnership with Cottage Labs to further develop the Open Access Button. They will work on development of the Open Access Button, in addition to providing hardware and sysadmin support. In addition to re-building the Open Access Button, we’re also working with Wikipedians on the Signalling OA-ness project. The idea is that when someone is making a citation, they can use the “Signalling OA” tool to signal the “openness” of that citation on Wikipedia. The main purpose is to spare readers the disappointment of clicking through to the resource only to find out that they cannot access it. It will also be useful for Wikipedia editors to see if citations are licensed in a way that allows for the images, media or even text to be reused in Wikipedia articles. We’ve been working on this recently and our contribution to this project should be ready by Wikimania next week.

The image shows a map which highlights the places where research has been noted as sitting behind a paywall

A screenshot of the Open Access Button map

The collaboration with Cottage Labs is provided in kind to build core functions of the future Open Access Button and to drive us forward. This in-kind support covers only the core functions, however; to achieve everything we want to achieve and more, we still need your support.

The Open Access Button Beta was built with the support of the Open Access Community and a small team of incredible developers who worked with us on a volunteer basis because of their dedication to Openness. As we develop the future Open Access Button, we want that community spirit to continue. If you’re a developer committed to Openness in your work and want to lend a hand, find us at GitHub. If you’re a publisher, a library, an organisation or an individual committed to opening up knowledge for all, get in touch. If you can offer financial support, in-kind support or just some helpful words of advice, we would love to hear from you. With your support, we can meet our goals, launch the best button we can and continue to make the problems of paywalls impossible to ignore.

As we work towards the next launch, we are going to continue collecting data and user stories in order to advocate for open access. If you’re hitting paywalls, don’t be silent, download the Button at openaccessbutton.org and report each time you’re denied access to research.

To stay up to date on the progress of the Button, you can follow us on Twitter, like us on Facebook, read our blog or email us at openaccessbutton@medsin.org.


by Stevie Benton at August 19, 2014 02:54 PM

August 18, 2014

Wikimedia Foundation

Wikipedia in the classroom: Empowering students in the digital age

Anne in front of the Library at Diablo Valley College.

During her last year of high school, Anne Kingsley took a variety of classes at Sierra College, her local community college in Rocklin, CA. The experience greatly influenced her decision to pursue a career in teaching. “I loved the atmosphere of the community college and remember spending a lot of time printing out articles and copying books in the library,” Anne recalled. “I remember study groups with recent high school grads, returning students, veterans, single moms.”

The eclectic nature of the community college served her well in her first teaching position in 2002 at a New York organization called Friends of Island Academy (FOIA), where she helped youth in the criminal justice system gain literacy and other basic skills. At that time, the Internet was starting to become a valuable educational resource that would soon make photocopying books in the library a nostalgic pastime. Her time at FOIA was the beginning for discovering innovative ways to solve big educational problems. “Because I had to run a classroom that had very little materials and almost no budget, you had to be creative about content and curriculum design,” explained Kingsley. “This was a powerful experience to build a foundation for classroom experience as it taught me how to think outside of conventional teaching practices.”

Diablo Valley Community college.

Anne went on to teach at Northeastern University, Menlo College and Santa Clara University. While at Northeastern she pursued her doctorate and was part of a training program where the faculty encouraged curriculums that incorporated new media into the classroom. “This was the beginning of blogs and Facebook, so I remember experimenting with these kinds of shared information sources,” said Anne. Meanwhile Wikipedia, then only a few years old, was becoming an increasingly comprehensive encyclopedia. Though at its outset the use of Wikipedia was widely discouraged by teaching professionals, it has since slowly garnered support and trust from a number of institutions. Today Anne teaches at Diablo Valley College in Pleasant Hill, California, and finds herself once again experimenting with different teaching methods, including the use of Wikipedia.

Tired of assigning the standard research paper and disillusioned by its merits in the 21st century, Anne started to realize that technology has greatly altered the way we access information. Anne elaborates, “I kept thinking that technology has changed the place for research, so why do we keep handing in these static articles as though information doesn’t shift and change all the time. I also knew that old research papers that I had assigned my students were literally piled up in my closet, shoved into boxes, and forgotten about.”

Wikipedia in the classroom.

Simultaneously, Anne kept hearing about underrepresented histories on Wikipedia – from women’s literature to African American history. Though underrepresentation of marginalized subjects is still a concern on Wikipedia, much is being done to address it thanks to people like Anne. “Given that I was teaching at a community college, I figured, let’s see what my students could do with Wikipedia. We all use Wikipedia, so why not see if we could become producers of information rather than just consumers.”

As a Harlem Renaissance enthusiast, Anne taught a course titled “Critical thinking: Composition and Literature Reading the Harlem Renaissance.” It was during this course that she experimented with her idea of producing information in a public forum as a method of learning. Part of the course was to edit articles pertaining to the Harlem Renaissance that were not covered fully on Wikipedia. Using online publications like The Crisis Magazine — an important early 20th century publication for African American culture — the students set out on a journey to research, edit and contribute to the world’s largest encyclopedia.

Humanities Building Classroom at DVC.

Anne and her students soon became aware of the initial learning gap that many new editors face with regards to the Wikipedia syntax. Though somewhat intimidating at first, Anne agrees that editing Wikipedia was a great way to teach students how to become literate in new media language. Her students weren’t the only ones learning something new. Anne explains, “It certainly opened their (and my) eyes to what takes place behind the nicely edited entries.” Another obstacle was trying to figure out how and where to contribute. Anne recalls a student who was hoping to contribute a “religion” entry to the Harlem Renaissance page. The challenge was to figure out where it belonged and how they would go about incorporating it into an existing page in a cohesive manner. Despite a period of adjustment, Anne makes it clear that the benefits she and the students garnered greatly outnumber any difficulties they might initially have had.

From an academic perspective, the assignment captured many of the elements of research that the course aimed to teach – understanding of source material, citation, scholarly research and careful language craft. The fact that Wikipedia is a public forum motivated the students in a manner that perhaps a normal research paper wouldn’t, that is to say, it no longer was just the professor who read the work but also other editors from around the world. The project also proved to be a great collaboration process between the students and the professor. The project lent itself to broader collaboration, especially when it came to the selection process and some of the smaller nuances of contributing to Wikipedia. The project also seemed to greatly improve composition, says Anne, “They (the students) would literally groom their language sentence by sentence – as opposed to earlier experiences writing seven-page research papers where the language fell apart.” Perhaps most satisfying for the students was the sense of accomplishment in seeing their hard work in a public space. Among the new articles created were pages for Arthur P. Davis, a section for religion in the Harlem Renaissance article and a page for Georgia Douglas Johnson – formerly a stub.

Anne expresses great interest in assigning this project again to her students. “I don’t always get to select the classes I teach, but if I had the opportunity to teach the Harlem Renaissance again, I would repeat this curriculum.” When asked what she would do differently, if anything, she replied, “More time. I only gave my students four weeks to create their entries. I did not realize how many of them would choose to create full-length articles or more complex entries.” Anne is part of a growing number of teaching professionals who choose to think outside the box and embrace new mediums in an effort to not only contribute to the greater good, but also prepare their students for a 21st century academic landscape. She has a clear message for colleagues who might not be as open to Wikipedia in the classroom: “Think big…students have this amazing capacity to want to experiment with you and others, especially when it makes their work visible and meaningful.”

Carlos Monterrey, Communications Associate at the Wikimedia Foundation

by carlosmonterrey at August 18, 2014 09:42 PM

Wiki Education Foundation

Welcome, Lorraine Hariton

A very warm welcome to Lorraine Hariton, the newest board member of the Wiki Education Foundation. I’m thrilled about Lorraine joining the Wiki Education Foundation board and bringing her expertise with technology and board service to our organization. I’m pleased to see the Wiki Education Foundation board shaping up so well with strong members like Lorraine who can help us achieve our goals to be the link between Wikipedia and academia in the United States and Canada.

I look forward to working with Lorraine in the coming months and years.

Frank Schulenburg
Executive Director

by Frank Schulenburg at August 18, 2014 05:12 PM

Magnus Manske

The Men Who Stare at Media

Shortly after the 2014 London Wikimania, the happy world of Wikimedia experienced a localized earthquake when a dispute between some editors of the German Wikipedia and the Wikimedia Foundation escalated into exchanges of electronic artillery. Here, I try to untangle the threads of the resulting Gordian knot, interwoven with my own view on the issue.


As best as I can tell, the following sequence of events is roughly correct:

  1. The WMF (Wikimedia Foundation) decides to update and, at least by intention, improve the viewing of files (mostly images), mainly when clicked on in Wikipedia. The tool for this, dubbed MediaViewer, would do what most people expect when they click on a thumbnail on a website in 2014, and be activated by default. This is aimed at the casual reader, comprising the vast majority of people using Wikipedia. For writers (that is, “old hands” with log-ins), there is an off switch.
  2. A small group of editors on English Wikipedia suggest that the MediaViewer, at least in its current state, is not suitable for default activation. This is ignored by the WMF due to lack of total votes.
  3. A “Meinungsbild” (literally “opinion picture”; basically, a non-binding poll) is initiated on German Wikipedia.
  4. The WMF posts on the Meinungsbild page that it (the WMF) reserves the right to overrule a negative result.
  5. About 300 editors vote on German Wikipedia, with ~2/3 against the default activation of the MediaViewer.
  6. The WMF, as announced, overrules the Meinungsbild and activates the MediaViewer by default.
  7. An admin on German Wikipedia implements a JavaScript hack that deactivates the MediaViewer.
  8. The WMF implements a “super-protect” right that locks out even admins from editing a page, reverts the hack to re-enable the MediaViewer, and protects the “hacked” page from further editing.
  9. Mailing list shitstorm ensues.

An amalgamate of issues

In the flurry of mails, talk page edits, tweets, blog posts, and press not-quite-breaking-news items, a lot of issues were thrown into the increasingly steaming-hot soup of contention-laden bones. Sabotage of the German Wikipedia by its admins, to prevent everyone from reading it, was openly suggested as a possible solution to the problem, Erik Möller of WMF was called a Nazi, and WMF management was said to be raking in the donations for themselves while only delivering shoddy software. I’ll try to list the separate issues that are being bundled under the “MediaViewer controversy” label:

  • Technical issues. This includes claims that MediaViewer is useless, not suitable for readers, too buggy for prime time, violates copyright by hiding some licenses, etc.
  • WMF response. Claims that the Foundation is not responding properly to technical issues (e.g. bug reports), community wishes, etc.
  • WMF aim. Claims that the Foundation is focusing exclusively on readers and new editors, leaving the “old hands” to fend for themselves.
  • Authority. Should the WMF or the community of the individual language edition have the final word about software updates?
  • Representation: Does a relatively small pool of vocal long-time editors speak for all the editors, and/or all the readers?
  • Rules of engagement: Is it OK for admins to use technological means to enforce a point of view? Is it OK for the WMF to do so?
  • Ownership: Does the WMF own Wikipedia, or do the editors who wrote it?

A house needs a foundation

While the English word “foundation” is known to many Germans, I feel it is often interpreted as “Verein”, the title of the German Wikimedia chapter. The literal translation (“Fundament”), and thus its direct meaning, are often overlooked. The WMF is not “the project”; it is a means to an end, a facilitator, a provider of services for “the community” (by whatever definition) to get stuff done. At the same time, “the community” could not function without a foundation; some argue that the community needs a different foundation, because the next one will be much better, for sure. Thankfully, these heroic separatists are a rather minute minority.

The foundation provides stability and reliability; it takes care of a lot of necessary plumbing and keeps it out of everyone’s living room. At the same time, when the foundation changes (this is stretching the literal interpretation of the word a bit, unless you live in The Matrix), everything built on the foundation has to change with it. So what does this specific foundation provide?

  • The servers and the connectivity (network, bandwidth) to run the Wikis.
  • The core software (MediaWiki) and site-specific extensions. Yes, since it’s open source, everyone can make a fork, so WMF “ownership” is limited; however, WMF employs people to develop MediaWiki, with the specific aim of supporting WMF’s projects. Third-party use is widespread, but not a primary aim.
  • The setup (aka installation) of MediaWiki and its components for the individual projects.
  • The people and know-how to make the above run smoothly.
  • Non-technical aspects, such as strategic planning, public relations and press management, legal aspects etc. which would be hard/impossible for “the community” to provide reliably.
  • The money to pay for all of the above. Again, yes, the money comes from donations; but WMF collects, prioritizes, and distributes it; they plan and execute the fundraising that gets the money in.

The WMF does specifically not provide:

  • The content of Wikipedia, Commons, and other projects.
  • The editorial policies for these projects, beyond certain basic principles (“Wikipedia is an encyclopedia, NPOV, no original research”, etc.) which are common to all language editions of a project.


I think that last point deserves attention in the light of the battle of MediaViewer. The WMF is not just your hosting provider. It does stand for, and is tasked to uphold, some basic principles of the project, across communities and languages. For example, the “neutral point of view” is a basic principle on all Wikipedias. What if a “community” (again, by whatever definition) were to decide to officially abandon it, and have opinionated articles instead? Say, the Urdu edition, a language mostly spoken in Pakistan (which I chose as a random example here!). I think that most editors, from most “communities”, would want the WMF to intervene at that point, and rightly so. You want opinionated texts, get a blog (like this one); the web is large enough. In such a case, the WMF should go against the wishes of that “community” and, if necessary, enforce NPOV, even if it means de-adminning or blocking people on that project. And while I hope that such a situation will never develop, it would be a case where the WMF would, and should, enforce editorial policy (because otherwise, it wouldn’t be Wikipedia anymore). Which is a far more serious issue than some image viewer tool.

The point I am trying to make here is that there are situations where it is part of the mission and mandate of WMF to overrule “the community”. The question at hand is, does MediaViewer comprise such a situation? It is certainly a borderline case. On one hand, seen from the (German) “community” POV, it is a non-essential function that mostly gets in the way of the established editors that are most likely to show up on the Meinungsbild, and admittedly has some software issues with a generous sprinkling of bug reports. On the other hand, from the WMF’s point of view, the dropping number of editors is a major problem, and it is their duty to solve it as best as they can. Some reasons, e.g. “newbie-biting”, are up to the communities and essentially out of the WMF’s control. Other reasons for the lack of “fresh blood” in the wiki family include the somewhat antiquated technology exposed to the user, and that is something well within its remit. The Visual Editor was developed to get more (non-technical) people to edit Wikipedia. The Upload Wizard and the MediaViewer were developed to get more people interested in (and adding to) the richness of free images and sounds available on the sites.

The Visual Editor (which seems to work a lot better than it used to) represents a major change in the way Wikipedia can be used by editors, and its initial limitations were well known. Here, the WMF did yield to the wishes of individual “communities”, and not even an option for the Visual Editor is shown on German Wikipedia for “anonymous” users.

The MediaViewer is, in this context, a little different. Most people (that is, anonymous readers of Wikipedia, all of which are potential future editors) these days expect that, when you click on a thumbnail image on a website, you see a large version of it. Maybe even with next/prev arrows to cycle through available images on the page. (I make no judgement about whether this is the right thing; it just is this way.) Instead, Wikipedia thus far treated the reader to a slightly larger thumbnail, surrounded by mostly incomprehensible text. And when I say “incomprehensible”, I mean people mailing me if they could use my image from Commons; they skip right past the {{Information}} template and the license boxes to look for the uploader, which happens to be my Flickr/Wikipedia transfer bot.

So the WMF decided that, in this specific case, the feature should be rolled out as default, on all projects instead of piecemeal like the Visual Editor (and do not kid yourself, it will come to every Wikipedia sooner or later). I do not know what prompted this decision; consistency for multilingual readers, simplicity of maintenance, pressure on the programmers to get the code into shape under the ensuing bug report avalanche, or simply the notion of this being a minor change that can be turned off even by anonymous users. I also do not know if this was the right technical decision to make, in light of quite a few examples where MediaViewer does not work as correctly as it should. I am, however, quite certain that it was the WMF’s right to make that decision. It falls within two of their areas of responsibility, which are (a) MediaWiki software and its components, and (b) improving reader and editor numbers by improving their experience of the site. Again, no judgement whether or not it was the right decision; just that it was the WMF’s decision to make, if they chose to do so.


I do, however, understand the “community’s” point of view as well; while I haven’t exactly been active on German Wikipedia for a while, I have been around through all of its history. The German community is very dedicated to quality; where the English reader may be exposed to an army of Pokemons, the article namespace in German Wikipedia is pruned rather rigorously (including an article about Yours Truly). There are no “mispeeling” redirects (apparently, if you can’t spell correctly, you have no business reading an encyclopedia!), and few articles have infoboxes (Wikipedia is an encyclopedia, not a trading card game!). There are “tagging categories”, e.g. for “man” and “woman”, with no subcategories; biographies generally have Persondata and authority control templates. In short, the German community is very much in favor of rigorously controlling many aspects of the pages, in order to provide the (in the community’s view) best experience for the user. This is an essential point: the German community cares very much about the reader experience! This is not to say that other languages don’t care; but, in direct comparison, English Wikipedia is an amorphous free-for-all playground (exaggerating a bit here, but only a bit). If you don’t believe me, ask Jimbo; he speaks some German, enough to experience the effect.

So some of the German editors saw (and continue to see) the default activation of the MediaViewer as an impediment to not only themselves, but especially to the reader. And while Germans are known for their “professional outrage”, and some just dislike everything new (“it worked for me so far, why change anything?”), I believe the majority of editors voting against the MediaViewer are either actually concerned about the reader experience, or were convinced (not to say “dragged into”) by those concerned to vote “no”.

The reactions by the WMF, understandable as they are from their perspective, namely

  1. announcing to ignore the “vote” (not a real, democratic vote, which is why it’s called “Meinungsbild” and not “Wahl”)
  2. proceeding to ignore the vote
  3. using “force” to enforce their decision

were interpreted by many editors as a lack of respect. We the people editors wrote the encyclopedia, after all; how dare they (the WMF) change our carefully crafted user experience, and ignore our declared will? It is from that background that comparisons to corporate overlords etc. stem, barely kept in check by Mike Godwin himself. And while such exaggerations are a common experience to everyone on the web, they do not exactly help in getting the discussion back to where it should be. Which is “where do we go from here”?

The road to hell

One thing is clear to me, and I suspect even to the most hardened edit warrior in the wikiverse: Both “sides”, community and WMF, actually want the same thing, which is to give the reader the best experience possible when browsing the pages of any Wikimedia project. The goal is not in question; the road to get there is. And whose authority it is to decide that.

On the technical side, one issue is the testing-and-fixing cycle. Traditionally, the WMF has made new functionality available for testing by the community quite early. By the same tradition, that option is ignored by most members of that community, who then complain about being steamrollered into it when it suddenly appears on the live site. On the other hand, the WMF has rolled out both the Visual Editor and the MediaViewer in a state that would be called “early beta” in most software companies. “Release early, release often” is a time-honored motto in open source software development; but in this specific case, using early releases in production isn’t optional for the users. From discussions I had at Wikimania, I have the distinct impression that people expect a higher standard of quality for software rolled out by the WMF on the live sites, especially if it becomes default. How this should work without volunteers to test early remains a mystery; maybe a little more maturity on the initial release, followed by more widespread use of “beta” features, is part of the answer here.

On the votes-vs-foundation side, I am of the opinion that clearer lines need to be drawn. The WMF does have a responsibility for user experience, which includes software changes, some of which will have to be applied across the wikiverse to be effective; the upcoming “forced account unification” for (finally!) Single User Login comes to mind. And, in a twist on the famous Spiderman quote, with great responsibility needs to come great power to fulfill it. Responsibility without power is the worst state one can have in a job, which even the most uncompromising “community fighter” will agree to. So if and when the WMF makes such a decision within their remit, the energy of the community would be best spent in feeding back the flaws in order to get the best possible result, instead of half-assed attempts at sabotage (I much prefer full-assed attempts myself).

There is, of course, another side of that coin. In my opinion, the WMF should leave the decision for default activation of a new feature to a representative vote of a community, unless the activation is necessary for (a) technical, (b) consistency, or (c) interdependency reasons. A security fix would fall under (a); the Single User Login will fall under (c); MediaViewer falls under (b), though somewhat weakly IMHO. Now, the key word in the beginning of this paragraph is “representative”. I am not quite sure how this would work in practice. I am, however, quite sure it is not 300 editors (or Spartans) voting on some page. It could include votes by a randomized subset of readers. It could also include “calls to vote” as part of beta features, e.g. if you had the feature enabled in the last week. These could be repeated over time, as the “product” would change, sometimes significantly so, as happened with the Visual Editor; a “no” three months ago would be quite invalid today.

Finally, I believe we need at least part of the above written out, and agreed upon, by both the WMF and “the communities”. It is my hope that enough people will share my opinion that both “parties” still have a common goal. Because the house that is Wikipedia cannot stand without a foundation, and a foundation without a house on top is but a dirty pond.

by Magnus at August 18, 2014 03:38 PM

Gerard Meijssen

#Twitter - #WikiParliaments.. but what about #Wikidata and #Austria?

Twitter advertised several things that I might like. WikiParliaments could be one of them. Today I learned that Othmar Tödling died. He was a member of the "Nationalrat" of Austria. As such he might be very much of interest to WikiParliaments.

Politicians are human too; they die. When they do, it is often noted in a category what function they held. Today I started adding statements for those humans who hold or held the function of parliamentarian in Austria.

My hope is that people who care about parliaments will make these items even prettier and embellish them with even more statements and qualifiers.

by Gerard Meijssen (noreply@blogger.com) at August 18, 2014 09:00 AM

Santhosh Thottingal

Talk at Wikimania 2014

I presented my team's Content Translation project at Wikimania 2014 in London. Here is the video of the presentation.

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="360" src="http://www.youtube.com/embed/b6qvv3eJ_Ag?start=1947" width="640"></iframe>

by Santhosh Thottingal at August 18, 2014 03:09 AM

Tech News

Tech News issue #34, 2014 (August 18, 2014)

2014, week 34 (Monday 18 August 2014)

August 18, 2014 12:00 AM

August 17, 2014

Gerard Meijssen

#MediaWiki - #MediaViewer rehashed

Some things are plain stupid; sometimes I am, and sometimes someone else is. I filed a bug about my experience of the MediaViewer. For me it is a show stopper; it prevents me from using it easily.

The problem is that Chrome shows a really awful URL for an image with funny characters in its title. When I look at it using the MediaViewer it is bad but it looks fine when I look at it from the Commons page.
  • File:%C3%89cole_normale_sup%C3%A9rieure_de_Paris,_26_January_2013.jpg
  • File:École normale supérieure de Paris, 26 January 2013.jpg
According to the Bugzilla triage I must be stupid because it works; it complies with specifications and, indeed, technically it works. It just stopped working for me.
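To make the difference concrete: the two titles above are the same file name, once percent-encoded as UTF-8 and once decoded for display. Here is a minimal Python sketch of that decoding. It assumes nothing about how the MediaViewer or Chrome handle this internally, and the helper name `readable_title` is made up for illustration.

```python
from urllib.parse import unquote

# The title as it appears in the awkward URL (copied from the list above).
encoded = "File:%C3%89cole_normale_sup%C3%A9rieure_de_Paris,_26_January_2013.jpg"


def readable_title(raw: str) -> str:
    """Decode percent-escapes (UTF-8) and show underscores as spaces."""
    return unquote(raw).replace("_", " ")


print(readable_title(encoded))
```

The decoded string is the second title in the list, which is the form a reader would expect to see.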

Several reactions are possible. My choice was to shrug, mutter "it is the user experience stupid" and I got on with my life. Others find it a precursor to the invasion of an evil overlord who does not understand the world and prepare for war.

By filing a bug, by posting this blog I have rid myself of my frustrations. I know several developers; I met many of them at Wikimania and I know they are really dedicated and mean well. I also know that such things pass. I am sure someone will see the light or Google will fix Chrome (if that is where the bug lives). In the end I do not look at images that often as a result.

by Gerard Meijssen (noreply@blogger.com) at August 17, 2014 08:56 PM

#Wikidata - giving a #category an application

Many #Wikimedia categories have interlanguage links. Obviously these linked categories do not all have the same content. Someone has to add the articles; sometimes it gets done and sometimes it doesn't. Often articles just do not exist.

When the facts that are implicit in what a category is about make it to all the items in all the categories, typically you have a superset in Wikidata. It does not stop there; items in Wikidata may be included that are not in any of those linked categories.

This is all theoretical unless ... unless you can query Wikidata and use the results. Much data has been added to Wikidata based on the content of categories, and queries have been used to identify missing items; this is done using AutoList2. This is one application; it is used by some of the "advanced" users of Wikidata.

What is even more interesting is showing which Wikidata items should be in a category. This is done using Reasonator. At this time, statements that define a query are included for over 690 categories. This query is already complex enough that the Wikidata functionality will not be able to express the results.

These queries could be of use to "advanced" Wikipedians because they are a basis for identifying articles that have not been categorised or articles that still need to be written in their Wikipedia. For everyone else it is just interesting; this information exists and it is readily available. It is one way of learning that Wikidata knows, for instance, about 121,922 politicians.
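The superset idea can be sketched with plain sets. This is only an illustration: the wiki names and item labels below are invented placeholders, and it is not how AutoList2 or Reasonator are implemented.

```python
# Invented example data: the items found in one interlanguage-linked
# category on three hypothetical wikis.
category_members = {
    "wiki_one": {"item_a", "item_b", "item_c"},
    "wiki_two": {"item_b", "item_d"},
    "wiki_three": {"item_a", "item_e"},
}


def superset(members):
    """Union of all linked categories: what Wikidata could know in total."""
    return set().union(*members.values())


def missing_per_wiki(members):
    """For each wiki, the items in the superset its own category lacks."""
    everything = superset(members)
    return {wiki: everything - items for wiki, items in members.items()}


for wiki, missing in sorted(missing_per_wiki(category_members).items()):
    print(wiki, sorted(missing))
```

Each per-wiki difference is a candidate list of articles that are either not yet categorised or not yet written.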

by Gerard Meijssen (noreply@blogger.com) at August 17, 2014 02:06 PM

#Wikidata - sources or confidence

At this time Wikidata has more than 36,396,372 statements; these statements are associated with some 15,335,451 items. The majority of these items have fewer than five statements and, even worse, for many items it is not known what they are about.

When you consider the quality of this data, there are two schools of thought. There are those who insist on sources with every statement, and there are those who have confidence in the validity of the data because they know where it came from.

Either way, when you want to assert that a specific approach is superior, it becomes a numbers game, and understanding the relative merits is what it is all about. When something is sourced, you can be confident that it is highly probable at the time of the sourcing. There is, however, no certainty that the data remains stable. Confidence can be maintained by regularly comparing the data with what the source has to say.

When the data is regularly compared, it does not matter that much if Wikidata has source information itself. The source is typically one of the Wikipedias, and they are said to have sources; this may provide us with enough reason for confidence. The comparison of data increases this confidence, particularly when multiple sources prove to be in agreement.

Practically, the basic building blocks to start comparing exist. It has been done before by Amir and he produced long lists of differences. Three things are needed to establish new best practices:
  • a well-defined place needs to be established where such reports may be found
  • communities need to understand that it raises confidence in their project
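The kind of comparison described above can be sketched in a few lines. This is a rough illustration with invented values, not the approach Amir used; the function name `compare` and the sample data are assumptions.

```python
def compare(wikidata, source):
    """Return {item: (wikidata_value, source_value)} for every disagreement.

    An item that is missing on one side shows up with None for that side.
    """
    differences = {}
    for item in wikidata.keys() | source.keys():
        ours = wikidata.get(item)
        theirs = source.get(item)
        if ours != theirs:
            differences[item] = (ours, theirs)
    return differences


# Invented sample: a year of birth per person, as held by each side.
wikidata_values = {"person_a": 1903, "person_b": 1921, "person_c": 1950}
source_values = {"person_a": 1903, "person_b": 1912, "person_d": 1948}

for item, (ours, theirs) in sorted(compare(wikidata_values, source_values).items()):
    print(item, ours, theirs)
```

Running such a comparison regularly, and publishing the resulting list in a known place, is what turns agreement between sources into confidence.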

by Gerard Meijssen (noreply@blogger.com) at August 17, 2014 01:54 PM

August 16, 2014

Tony Thomas

Writing PHP unit-tests : How to add a fake user table entry

Flexibility of the MediaWiki PHP unit-tests wrapper extends to the fact that fake database entries can be made and tested upon without causing any harm to the actual one. As I scribbled in the earlier post, the class-comments play a vital role, and don't forget to give them like this: This would create a fake […]

by Tony Thomas at August 16, 2014 03:35 PM

Writing PHP Unit tests to verify extension API POST

PHP unit tests are crucial before deployment to make sure that the degree of damage your extension can cause is minimal. How to start was always my worry, and here we go. Considering I have an API called ‘myapi’ that POSTs a string $myvar: The job = submitted makes sure that the job is completed, […]

by Tony Thomas at August 16, 2014 02:12 PM

Gerard Meijssen

#Wikidata - application for its long tail

When Lauren Bacall died this week, it was all over the news. When Marjorie Stapp died on June 2, 2014, it was noted in the English Wikipedia only yesterday. Today it is known to Wikidata, and several bits of information were added to the item about Mrs Stapp as well.

Among those statements is her identifier in the IMDB. The IMDB does not know yet about the demise of Mrs Stapp and it is not unlikely that there are more actors and actresses we know about that have died. Providing external sources like the IMDB with an RSS feed of the changes that are made in Wikidata is not hard.
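To give an idea of how little is involved, here is a minimal sketch that turns a list of changes into an RSS 2.0 document using only the Python standard library. It is an assumption-laden illustration: the function name `build_feed`, the feed title and the shape of the change records are all made up, and no such feed is claimed to exist.

```python
import xml.etree.ElementTree as ET


def build_feed(changes):
    """Build an RSS 2.0 document (as a string) from a list of change records."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, link and description for the channel.
    ET.SubElement(channel, "title").text = "Wikidata changes (illustrative)"
    ET.SubElement(channel, "link").text = "https://www.wikidata.org/"
    ET.SubElement(channel, "description").text = "Recently changed statements"
    for change in changes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = change["title"]
        ET.SubElement(item, "description").text = change["summary"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical change record, based on the example in this post.
sample_changes = [
    {"title": "Marjorie Stapp", "summary": "date of death added: 2 June 2014"},
]

print(build_feed(sample_changes))
```

A partner would only need to poll such a feed and match the entries against the identifiers it already holds.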

When we share our information in this way, we gain friends. With these new friends we may do friendly things like noting differences between the data that we hold. Equally important, we add a reason why people might maintain the data that is in Wikidata. As our data gains in application, we will grow and diversify our community.

by Gerard Meijssen (noreply@blogger.com) at August 16, 2014 06:50 AM

August 15, 2014

Wiki Education Foundation

Student editor translates French article into English

Have you ever looked up a city or region on Wikipedia to learn more about it—perhaps before traveling there? If you’re on English Wikipedia and searching for a non-English-speaking place, you may find minimal information beyond its geography.

Dr. Julie McDonough Dolmaya’s spring 2014 translation course at York University sought to minimize the discrepancies between French Wikipedia and English Wikipedia, as student editors developed their translation skills by selecting an article to translate and expand.

Student editor Azink2 found the article about Aubagne and more than doubled its content. Now, the 2,000+ readers per month on English Wikipedia can also learn about the history, demographics, and politics of the region!

Jami Mathewson
Program Manager

by Jami Mathewson at August 15, 2014 05:04 PM

Wikimedia UK

The GLAM-Wiki Revolution

This post was written by Joe Sutherland and User:Rock drum

During Wikimania 2014 last week, we were lucky enough to be able to screen our documentary about the GLAM-Wiki programme in the UK. The film brings together interviews with some of the Wikimedians in Residence from institutions across the country – and with Wikimedia UK staff. We want it to function as an outreach tool – as a way of teaching people about the GLAM programme, but also as a celebration of the work of so many volunteers and paid Wikimedians in Residence.

Over the coming weeks we will be sharing additional content from this project: written interviews and shorter videos, which will also be published through Wikimedia UK’s channels. We will also be releasing some of the source footage on Wikimedia Commons under a Creative Commons license.

We are pleased to be able to share this video online, both on YouTube and Wikimedia Commons. We hope you enjoy it.

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="360" src="https://www.youtube-nocookie.com/embed/UlNT16gqHyo?list=PL66MRMNlLyR6BuplUUTWvyl4_klBOZzNT" width="640"></iframe>

by Richard Nevell at August 15, 2014 01:54 PM

August 14, 2014

Wiki Education Foundation

Wiki Ed presents at Wikimania

We’re back from an amazing trip to London! Last week, four Wiki Education Foundation staff and three board members were in the thick of the global Wikimedia community at the annual Wikimania conference. This year Wikimania had a strong focus on education, and it drew large crowds of people from all over the world who were interested in incorporating Wikipedia into educational settings. In our sessions, the Wiki Education Foundation contingent shared our knowledge with the community so that others could learn from what we’ve done.

LiAnna Davis, Frank Schulenburg, and Jami Mathewson at the “Ask the Wiki Education Foundation” session.

Educational Partnerships Manager Jami Mathewson and Director of Programs LiAnna Davis were active leaders in the Education Pre-Conference prior to Wikimania. Jami presented during a Wikipedia Ambassador training for people interested in supporting class-based programs globally, as well as a session for helping new program leaders find the best structure for their programs. LiAnna led a half-day workshop on how to use Wikipedia as a teaching tool. All three sessions were very well-attended, with more than 50 people representing at least 18 countries around the world in attendance.

More than 100 people attended an education session in which Wiki Education Foundation staff presented in all three speaking slots. First up was “Ask the Wiki Education Foundation”, where LiAnna, Jami, and Frank presented information about our organization and then opened the floor for questions from the audience. Next up, LiAnna joined colleagues from the Wikimedia Foundation, Israel, and the United Kingdom to present information about the Wikipedia Education Collaborative, a group of education program leaders worldwide who coordinate the sharing of lessons learned. Finally, LiAnna presented “The 7 Biggest Mistakes the Wikipedia Education Program’s Made — and What We’ve Learned From Them”. All three had a great response from the audience, with interesting questions.

Diana Strassmann gives a keynote. (Photo by SLOWKING under CC BY-NC via Wikimedia Commons)

Wiki Ed’s Board Chair, Diana Strassmann, had a featured speaker slot on Saturday. Diana’s talk, which you can watch online, focused on how the Wikipedia Education Program can help Wikipedia. Diana highlighted different theories of knowledge that come from academia, and how these might help Wikipedia overcome some of its current challenges, including the gender gap. Diana also joined Wikipedia Founder Jimmy Wales and other guests for a BBC World radio interview on a show called “In the Balance” focusing on “The Future of Education” to talk about her experiences using Wikipedia as a teaching tool.

Finally, Jami joined program leaders from the Arab World, Israel, and Mexico in a session exploring how the Wiki Education Foundation uses numerical data to evaluate our programmatic efforts.

In addition to the scheduled activities, Wiki Ed staff and board members participated in a number of social events, meeting with Wikipedia editors and program leaders globally to share our experiences teaching students how to edit Wikipedia as part of their coursework and learning from others’ experiences. Many thanks to the Wikimania organizing team for putting education front-and-center at this year’s conference!

LiAnna Davis
Director of Programs

by LiAnna Davis at August 14, 2014 03:12 PM

Gerard Meijssen

#Wikimedia - the quality of access to the sum of all human knowledge

Again, a big flare-up of "we the community" demand this and that. Again, what Wikipedia and the Wikimedia Foundation are about is conveniently forgotten. At Wikimania there was a really interesting presentation by Raph Koster, author of "A Theory of Fun for Game Design". Well recommended once it is available for viewing.

An abstraction of the current huha is in there and this community is described as the monsters who rule it all (my words, his pictures). These people who impose their world on others have forgotten what the game is about. It is about providing access to the sum of all knowledge. From that perspective their issues with the multimedia viewer are hardly significant compared with the increased ease for people who just access the parts of human knowledge we do give access to.

My pet example of "the community" not caring about providing access to our available knowledge is in the decision that easy and obvious access to fonts adds clutter to the user interface and is therefore not acceptable... About seven percent of a population is dyslexic and it is extremely hard to find and enable the OpenDyslexic font. It took a MediaWiki developer over two minutes and he enabled it in a way I did not know existed... He knew it existed, he knew the name of the font. This demonstrates how relevant seven percent of our reader population is to our community.

Should we primarily care about access or is it a playground for monsters?

by Gerard Meijssen (noreply@blogger.com) at August 14, 2014 08:33 AM

#Wikidata - It ain't got a thing

A rose, a rose, is a rose by any other name as beautiful. Eh, actually people are quite smart and know a rose when they see one. Machines need to be told what a rose is.

Wikidata has this requirement of being usable by machines. So we need to know what thing a thing is and for all humans it needs to be stated that all of them are considered human.

Several high powered people at Wikimania expressed the opinion that for Wikidata to get in full swing, we have to identify every thing.

I have identified a few hundred "list articles". Items that start with "List of " or "Member of " for instance. I have identified a lot of "group of people" who were supposed to be born in the XXth century.

At Wikidata a thing is bad. We cannot safely select it, we cannot auto describe it. We should get rid of every thing.

by Gerard Meijssen (noreply@blogger.com) at August 14, 2014 06:47 AM

August 13, 2014

This month in GLAM

This Month in GLAM: July 2014

by Admin at August 13, 2014 09:21 PM

Sumana Harihareswara

Case Study of a Good Internship

I'm currently a mentor for Frances Hocutt's internship in which she evaluates, documents, and improves client libraries for the MediaWiki web API. She'll be finishing up this month.

I wanted to share some things we've done right. This is the most successful I've ever been at putting my intern management philosophy into practice.

  • A team of mentors. I gathered a co-mentor and two technical advisors: engineers who have different strengths and who all promised to respond to questions within two business days. Frances is reading and writing code in four different languages, and is able to get guidance in all of them. The other guys also have very different perspectives. Tollef has worked in several open source contexts but approaches MediaWiki's API with a learner's mind. Brad has hacked on the API itself and maintains a popular Wikipedia bot that uses it. And Merlijn is a maintainer of an existing client library that lots of Wikimedians use. I bring deep knowledge of our technical community, our social norms, and project management. And I'm in charge of the daily "are you blocked?" communication so we avoid deadlocks.

  • Frequent communication. Any time Frances needs substantial guidance, she can ping one of her mentors in IRC, or send us a group email. She also updates a progress report page and tells our community what she's up to via a public mailing list. We have settled into a routine where she checks in with me every weekday at a set time. We videochat three times a week via appear.in (its audio lags so we use our cell phones for audio), and use a public IRC channel the other two weekdays. We also frequently talk informally via IRC or email. She and I have each other's phone numbers in case anything is really urgent.

  • Strong relationship. I met Frances before we ever thought about doing OPW together. I was able to structure the project partly to suit her strengths. We've worked together in person a few times since her project started, which gave us the chance to tell each other stories and give each other context. I've encouraged her to submit talks to relevant conferences, and given her feedback as she prepared them. Frances knows she can come to us with problems and we'll support her and figure out how to solve them. And our daily checkins aren't just about the work -- we also talk about books or silliness or food or travel or feminism or self-care tips. There's a healthy boundary there, of course, since I need to be her boss. But our rapport makes it easier for me to praise or criticize her in the way she can absorb best.

  • Frances is great. I encouraged her as an applicant; from her past work and from our conversations, I inferred that she was resourceful, diligent, well-spoken, analytical, determined, helpful, and the kind of leader who values both consensus and execution. I know that many such people are currently languishing, underemployed, underappreciated. A structured apprenticeship program can work really well to help reflective learners shine.

    I got to know Frances because we went to the same sci-fi convention and she gave me a tour of the makerspace she cofounded. Remember that just next to the open source community, in adjacent spaces like fandom, activism, and education, are thousands of amazing, skilled and underemployed people who are one apprenticeship away from being your next Most Valuable Player.

  • Scope small & cuttable. Frances didn't plan to make one big monolithic thing; we planned for her to make a bunch of individual things, only one of which (the "gold standard" by which we judge API client libraries) needed to happen before the others. This came in very handy. We hadn't budgeted time for Frances to attend three conferences during the summer, and of course some programming bits took longer than we'd expected. When we needed to adjust the schedule, we decided it was okay for her to evaluate eight libraries in four languages, rather than eleven in five languages. The feature she's writing may spill a few days over past the formal end of her internship and we're staying aware of that.

  • Metacognition. As Madison wrote, "If men were angels, no government would be necessary." But we're flawed, and so we have to keep up the discipline of metacognition, of figuring out what we are bad at and how to get better. I asked Frances to self-assess her learning styles and have used that information to give her resources and tasks that will suit her. Early in the internship I messed up and suggested a very broad, ill-defined miniproject as a way to learn more about the MediaWiki API; since then I've learned better what to suggest as an initial discovery approach. Halfway into the internship we realized we weren't meeting enough, so we started the daily videochat-or-IRC appointment. I have let Frances know that I can be a bad correspondent so it's fine to nag me, to remind me that she's blocked on something, to ask other mentors for help. And so on. We've learned along the way, about each other and about ourselves. My mom says, "teaching is learning twice," and she's right.

Setting up an internship on a strong foundation makes it a smoother, less stressful, and more joyous experience for everyone. I've heard lots of mentors' stories of bad internships, but I don't think we talk enough about what makes a good internship. Here's what we are doing that works. You?

(P.S. Oh and by the way you can totally hire Frances starting in September!)

August 13, 2014 07:48 PM

Wiki Education Foundation

Wiki Education Foundation Monthly Report: July 2014

1. Highlights

  • After meetings with representatives of the American Sociological Association (ASA), the Association for Psychological Sciences (APS), and the National Communication Association (NCA) this spring to establish their goals for their Wikipedia initiatives, we embarked on a strategy of strengthening our educational partnership support, with a reorganization of the programs department. Jami Mathewson has moved into a new role as the Educational Partnerships Manager, focusing on ensuring the Classroom Program scales sustainably. We kicked off this effort with work on publications, subject-specific materials, workshops, and data. Read more about this area in the “Educational Partnerships” section, below.
  • The Wiki Education Foundation had our first Quarterly Review on Friday, July 18. Head of Communications and External Relations LiAnna Davis presented to the rest of the Wiki Ed staff the work she’s been doing over the last three months and her plans for the next three months. Creating this dedicated time for reflection and evaluation enables us to be more effective in our work. At the conclusion of the Quarterly Review, we also announced LiAnna’s promotion to Director of Programs, effective August 1.
  • Our search for initial office space concluded with the selection of an office in the Presidio area of San Francisco as our new home. We found it after an extensive search that took a variety of different aspects into account, including easy access to public transportation, existing infrastructure, and, most of all, a good cultural fit. Our move into the new space will take place in mid-August.

2. Programs

At the end of the month, we announced our new Programs Department. We decided to reorganize our staff to reflect that Wiki Education Foundation is a programs-focused organization. Within the Programs Department, we will have two program managers focused on our work with educational institutions in the US and Canada: the Educational Partnerships Manager and the Classroom Program Manager. The Educational Partnerships Manager will lead our sustainability initiatives: our partnerships with academic associations and universities. The Classroom Program Manager will continue supporting instructors, student editors, and Ambassadors. As part of this change, LiAnna Davis has agreed to take on the role of Director of Programs. Jami Mathewson has been promoted to Educational Partnerships Manager, and we are hiring for the Classroom Program Manager position as well as for a Communications Associate who will take over some of the communications tasks LiAnna has been doing for our organization.

2.1. Educational Partnerships

Meeting with the Association for Psychological Sciences (APS) in San Francisco.

While the Classroom Program activity is on summer break, we are building a strategy to make the program more sustainable and far-reaching. This means we need not only to expand Wikipedia assignments to more instructors and students, but we also need to build capacity to support those assignments. We are working to build capacity and reach with two primary focuses: training campus faculty and partnering with academic associations.

Our focus this July has been to develop plans for making our partnerships with academic associations more productive. Though we have supported instructors and courses from the American Sociological Association (ASA), the Association for Psychological Sciences (APS), and the National Communication Association (NCA) in the past, we now have more resources to strengthen those relationships. We plan to support these and other organizations in the following ways:

  • Publications: The Educational Partnerships Manager will work with the Communications Associate to write articles and blog posts for academic associations’ member magazines. This month, Jami and LiAnna wrote an article for ASA’s publication, Footnotes, which will appear in the September/October issue and highlights the past successes from student editors in sociology courses as well as information about how to participate for new instructors.
  • Subject-specific materials: We will develop training and help materials for student editors that include specific guidelines relevant to that topic area on Wikipedia. In July, LiAnna drafted the first of these discipline-specific guides, for psychology student editors. LiAnna collected feedback from Wikipedia editors in the psychology area and instructors who teach psychology to complete two drafts of the handout. The handout will be designed and published in August, with printed copies available for psychology instructors to distribute to students in September.
  • Presentations and workshops: We will attend academic organizations’ annual meetings to host teaching workshops. In August, Jami and LiAnna will lead a workshop for instructors on how to incorporate Wikipedia assignments into sociology classes at ASA’s Annual Meeting in San Francisco.
  • Data and metrics: At the end of each term, the Classroom Program Manager will provide relevant metrics to the academic organization about the contributions from their student editors. For example, we may provide ASA the number of sociology student editors who enrolled, number who made at least one edit, number of sociology articles edited, number of new sociology articles started, and number of page views those articles have received.

We expect this list to grow, and we also expect the academic organizations to provide support for our courses. We see a great benefit in the experts within these organizations identifying content gaps within their field of scholarship, which student editors can then fill during their assignments.

We are also interested in developing more partnerships, especially within disciplines that lack significant coverage on English Wikipedia. In July, Jami had preliminary conversations with representatives from the National Women’s Studies Association and the American Folklore Society, both of which are interested in starting a Wikipedia initiative for their members.

2.2. Classroom Program

During the summer break, much of the Classroom Program work focuses on documentation of past terms. In July, we published a blog post wrapping up the spring 2014 term. We also started the hiring process for the Classroom Program Manager, who will be responsible for this program beginning in fall 2014.

2.3. Digital Infrastructure

The main focus in July for Digital Infrastructure was the wikiedu.org Request for Proposals. Sage contacted development companies – some in Seattle, and some based elsewhere in the United States or internationally – about the project, and had one-on-one discussions with each of the companies that plans to submit a proposal. Sage started analyzing the first complete proposals that started coming in near the end of the month. The deadline for all proposals is August 1, and we expect to settle on a development company and begin work on “wikiedu.org 1.0” by mid- to late-August.

The first major component of the new wikiedu.org will be an “assignment design wizard” that lets a professor choose from among the common types of recommended Wikipedia assignments, customize the timeline and details to fit their course, and post the course plan to Wikipedia. Over the last few years, we’ve gained – and documented – a lot of knowledge about what works and what doesn’t for Wikipedia writing assignments. But it’s not easy for professors who are just getting started with Wikipedia to benefit from that knowledge, and even if they want to follow one of our model assignment designs closely, it is tedious to customize it. For a professor, turning a wall of boilerplate wikitext into an assignment plan that fits your course is not a fun way to spend your day. But a solid assignment plan that incorporates best practices and subject-specific Wikipedia resources is the biggest factor in running a successful Wikipedia assignment.

Wiki Ed also launched its first major MediaWiki development project this month, signing a contract with WikiWorks to create an improved “activity feed” for Wikipedia course pages. The activity feed is a revamp of the current “course activity” page associated with each course. The new design, based on need-finding interviews with users of the current software, was created by Facebook Open Academy student JJ Liu as part of her Wikimedia Foundation mentorship project earlier this year. This design should make it much easier to follow the work of a class edit by edit. We consider this project an experiment, to gauge the viability of working with third-party contractors to improve Wikipedia’s software.

Finally, as an initial exploration of potential new metrics tools, Sage set up an online “course stats” tool to supplement the statistics available via Wikimetrics. In particular, the tool provides page views for the articles assigned to student editors in a course, and lists of all the articles edited by students in one or more courses.

3. Financials

  • Expenses for the month are $83,993 versus a plan of $75,039, primarily due to catching up in the areas of Governance and Fundraising.
  • Year-to-date expenses are $287,106 versus a plan of $340,076, primarily due to lower expenses in the area of General & administrative, caused mainly by the delayed move into new office space.
  • Cash position is $92,557 as of July 31, 2014.

July expenses




YTD expenses

4. Board

On July 31, board member Mike Christie announced his resignation. He has been part of the Wiki Education Foundation’s leadership since we were a working group of volunteers trying to determine the best way to support the Wikipedia Education Program in the United States and Canada back in spring 2012.

Mike’s involvement has been instrumental in bringing the Wiki Education Foundation into being. His thoughtful approach to decision-making and careful documentation of our discussions as secretary have helped us maintain transparency with our stakeholders. Mike always speaks from what he believes, and his drive to make sure our organization properly incorporated the Wikipedia editing community’s views was key to the success we’ve had so far.

We will miss Mike’s brilliant perspective and engaging personality on the Wiki Education Foundation board and we look forward to his continued involvement as a supporter of our organization.

5. Office of the ED

Current priorities:

        • Planning the year ahead: annual plan & budget
        • Moving into new office space
        • Monitoring of organizational performance
  • In July, we created a schedule for our quarterly reviews in 2014–15. In order to increase accountability and allow for course corrections, each core team of our organization will present their results and plans for the time ahead once a quarter. Quarterly reviews will allow both staff and other stakeholders of the Wiki Education Foundation to stay up-to-date on which activities we pursue and how we’re doing against our goals. The first quarterly review took place on July 18, with LiAnna Davis giving an overview of her communications work. LiAnna’s slides and the notes from this review meeting can be found on Meta. Our next quarterly review will cover Sage’s work in the area of Digital Infrastructure; it is scheduled for August 22.
  • Executive Assistant Jessica Craft left the Wiki Education Foundation staff in July. We thank Jessica for her contributions.
  • Jami, LiAnna, and Frank in front of our new office in the Presidio.


    Also in July, our search for an initial office space concluded successfully. Our first office will be in the Presidio of San Francisco on the northern tip of the San Francisco Peninsula. The Presidio, originally a Spanish fort built in 1776, is now home to Alexa Internet, the San Francisco Film Center, Lucasfilm, and more than 30 non-profits in the areas of arts, education, and conservation (e.g. the Thoreau Center for Sustainability, The Bay School of San Francisco, The Gordon and Betty Moore Foundation, and the Tides Foundation). In 1996, the Presidio came under the management of the Presidio Trust, a US Government Corporation which manages most of the park in partnership with the National Park Service. We found our new home after an extensive search that took a variety of aspects into account, including easy access to public transportation, existing infrastructure, and, most importantly, a good cultural fit. Our move into the new space will take place in mid-August.


by LiAnna Davis at August 13, 2014 04:43 PM

August 12, 2014

Wikimedia Foundation

Chinese Wikipedia Online Magazine: A Community Gateway

Front Page of The Wikipedian

Chinese Wikipedian Wilson Ye created an online magazine called The Wikipedian in December 2012, in partnership with Addis Wang and Eric Song. This project began as part of the Chinese Social Media Program as another way to connect the local community with the wider international community. Today it has over 500 subscribers on Chinese Wikipedia and is a new tool for spreading the idea of the Wikimedia movement to Chinese readers.

When Addis and Wilson first thought about creating an online magazine, they faced some challenges. Since 2005, many Chinese Wikipedians had tried different ways to publish online magazines, but none had succeeded. These past failures brought up tough questions for the team regarding content and target audience. Instead of worrying, Wilson decided to make an experimental issue, which consisted of community news, abstracts of four Wikipedia articles and a featured picture. With a beautiful design by Eric, the first issue of the magazine received a lot of encouragement and advice.

After around six months of iterating on the magazine (which was originally published in simplified Chinese), The Wikipedian team published their first traditional Chinese version to promote news of the international community and interesting content contributed by community members to the Taiwan, Hong Kong and Macau communities. In a 2013 Wikimania special edition, the team invited the editors from Hong Kong and Taiwan to talk about their experiences at Wikimania. This was the first time The Wikipedian broke the geographical barrier to connect Chinese-speaking communities across countries.

Social media plays an important role in promoting the magazine. An account on weibo.com, funded by an Individual Engagement Grant from the Wikimedia Foundation, has become an incubator of new projects. With over 10,000 active followers, this social media account brings in a lot of attention to The Wikipedian and offers an easy way for its readers to provide feedback. One year later, now that The Wikipedian is itself an influencer, the magazine has in turn started to bring new followers to the social media account. This expands the diversity of followers and helps generate more influence in the Chinese-speaking world.

A Chinese proverb says: do not look at the sky from the bottom of a well; going outside is the only way to understand what the world you live in looks like. The Wikipedian would like to be one of the ways for Chinese Wikipedians to see and touch the broader international community.

Addis Wang, Coordinator of Wikimedia User Group China

by carlosmonterrey at August 12, 2014 07:20 PM

August 11, 2014

Wikimedia UK

Wikimania 2014 draws to a close

Wikimania 2014 happened, and it was brilliant. More than 200 sessions, and more than 4,000 attendees according to the figures from The Barbican. We had guests from more than 70 countries. And now we are tired. We’ll share more reflections on the conference, including facts and figures, once the dust has settled and everything is back to normal. Here’s a group photo, taken on Sunday afternoon. Thanks to everyone who took part, everyone who contributed, and most importantly, thank you to all of the volunteers who made it happen!

by Stevie Benton at August 11, 2014 03:39 PM

Tech News

Tech News issue #33, 2014 (August 11, 2014)


August 11, 2014 12:00 AM

August 09, 2014

Gerard Meijssen

#Wikidata - Dear Lila, it is all about the application

Three months into the job, Lila did an analysis of where we are with our projects. The way she presented it was very traditional: Wikipedia, and English Wikipedia at that. The challenges were not that traditional; much of the public will be mobile and they will not be where they are today.

Another requirement is that all the new people need to be able to contribute. Removing the existing road blocks is absolutely necessary.

When people are to contribute, they have to have a reason to contribute. They will need to benefit from the effort. This year Commons will be wikidatafied and it will become possible to search in multiple languages. The Amnesty International community may add the people on their watch list to Wikidata. In this way what we do in Wikidata gets more of an application.

When we start thinking in terms of how people will be able to use the data we have in store for them, we will find more contributors. Their data will become better connected. The value of our data will increase and we will realise the aspiration of more people in more countries being involved in what we do. We will not only share in the sum of our available information; they will put it to use for us.

by Gerard Meijssen (noreply@blogger.com) at August 09, 2014 08:50 PM

Wiki Education Foundation

Happy Book Lovers’ Day!

Today is book lovers’ day! Have you started or expanded a Wikipedia article on a book you love? Here are some student editors’ articles about novels and collections:

From Dr. Lynn Hamilton’s spring 2013 composition course at the University of Pikeville:

  • Darkfever by Karen Marie Moning, started by Courtneyupike
  • Gap Creek by Robert R. Morgan, a stub that UpikeKaren expanded

From Dr. Carol Stabile’s fall 2013 feminist science fiction course at the University of Oregon:

From Dr. Kim Hall’s spring 2014 course about Ntozake Shange at Barnard College:

Jami Mathewson
Program Manager

by Jami Mathewson at August 09, 2014 05:21 PM

August 08, 2014

Wikimedia Foundation

Wikimedia Research Newsletter, July 2014

Vol: 4 • Issue: 7 • July 2014

Shifting values in the paid content debate; cross-language vandalism detection; translations from 53 Wiktionaries

With contributions by: Piotr Konieczny, Maximilian Klein, Heather Ford, and Han-Teng Liao

Understanding shifting values underlying the paid content debate on the English Wikipedia

See related Signpost content: “Extensive network of clandestine paid advocacy exposed”, “With paid advocacy in its sights, the Wikimedia Foundation amends their terms of use”
Reviewed by Heather Ford

Kim Osman has performed a fascinating study[1] on the three 2013 failed proposals to ban paid advocacy editing in the English language Wikipedia. Using a Constructivist Grounded Theory approach, Osman analyzed 573 posts from the three main votes on paid editing conducted in the community in November, 2013. She found that editors who opposed the ban felt that existing policies of neutrality and notability in WP already covered issues raised by paid advocacy editing, and that a fair and accurate encyclopedia article could be achieved by addressing the quality of the edits, not the people contributing the content. She also found that a significant challenge to any future policy is that the community ‘is still not clear about what constitutes paid editing’.

Osman uses these results to argue that there has been a transition in the values of the English language Wikipedia editorial community from seeing commercial involvement as direct opposition to Wikipedia’s core values (something repeated at the institutional level by the Wikimedia Foundation and Jimmy Wales who see a bright line between paid and unpaid editing) to an acceptance of paid professions and a resignation to their presence.

Osman argues that the romantic view of Wikipedia as a system somehow apart from the commercial market that characterized earlier depictions (such as those by Yochai Benkler) has been diluted in recent years and that sustainability in the current environment is linked to a platform’s ability to integrate content across multiple places and spaces on the web. Osman also argues that these shifts reflect wider changes in assumptions about commerciality in digital media and that the boundaries between commercial and non-profit in the context of peer production are sometimes fuzzy, overlapping and not clearly defined.

Osman’s close analysis of 573 posts is a valuable contribution to the ongoing policy debate about the role of paid editing in Wikipedia and will hopefully be used to inform future debates.

“Pivot-based multilingual dictionary building using Wiktionary”

Reviewed by Maximilian Klein (talk)

Straight edges represent translation pairs extracted directly from the Wiktionaries. The pair guild–breasla was found via triangulating.

Building multilingual dictionaries to and from every language is combinatorially a lot of work. If one uses triangulation (if A means B, and B means C, then A means C; see figure), then much of the work can be done by machine. A large closed-source effort did this in 2009[supp 1], but a new paper by Ács[2] counters that “while our methods are inferior in data size, the dictionaries are available on our website”[supp 2]. The approach used the translation tables from 53 Wiktionaries to produce 19 million inferred translations, compared with the 4 million already occurring in Wiktionary. The researchers steered clear of several classical problems such as polysemy (one word having multiple meanings) by using a machine learning classifier. The features used in the classifier were based on the graph-theoretic attributes of each possible word pair. For instance, if two or more languages can serve as an intermediate “pivot” language for a translation, that turned out to be a good indicator of a valid match. To test the precision of these translations, the researchers spot-checked them manually and found a precision of 47.9% for newly found word pairs, versus 88.4% for random translations coming out of Wiktionary. As for recall, which tested the coverage of a collection of 3,500 common words, 83.7% of words were accounted for by automatic triangulation in the top 40 languages. That means that right now, if we were to try to make a 40-language pocket phrasebook to travel around most of the world using just Wiktionary, about 85% of the time there would be a translation, and it would be correct between 50% and 85% of the time.
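The triangulation idea can be illustrated with a minimal Python sketch. It is not the paper's code and does not reproduce its classifier; it only shows how candidate pairs and their pivot counts (one of the graph-based signals mentioned above) could be derived from a list of direct translation pairs. The function names and the toy word list are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def triangulate(pairs):
    """Infer candidate translations via pivot words.

    `pairs` is an iterable of ((lang, word), (lang, word)) direct
    translation pairs. Returns a dict mapping each inferred pair
    (a, c), with a < c, to the set of pivot words connecting them.
    """
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    inferred = defaultdict(set)
    for pivot, linked in neighbours.items():
        # Sorting gives every pair one canonical key across pivots.
        for a, c in combinations(sorted(linked), 2):
            if a[0] == c[0]:
                continue  # same language: not a translation pair
            if c in neighbours[a]:
                continue  # already a direct pair, nothing to infer
            inferred[(a, c)].add(pivot)
    return dict(inferred)


def well_supported(inferred, min_pivots=2):
    """Keep only the pairs reachable through at least `min_pivots` pivots."""
    return {pair for pair, pivots in inferred.items() if len(pivots) >= min_pivots}


# Toy data (hypothetical, for illustration only).
direct = [
    (("en", "guild"), ("es", "gremio")),
    (("es", "gremio"), ("ro", "breasla")),
    (("en", "guild"), ("de", "Zunft")),
    (("de", "Zunft"), ("ro", "breasla")),
    (("en", "guild"), ("fr", "guilde")),
]

inferred = triangulate(direct)
strong = well_supported(inferred)
for pair in sorted(strong):
    print(pair, "via", sorted(inferred[pair]))
```

In this toy graph, guild–breasla is reachable through two pivots and so passes the `min_pivots=2` filter, while pairs that rest on a single pivot are left out. A real system would treat the pivot count as one feature among several rather than as a hard cutoff.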

This performance would likely need to increase before any results could be operationalized and contributed back into Wiktionary. However, given the fact that the code used to parse and compare 43 different Wiktionaries was also released on GitHub[supp 3], that goal is a possibility. It’s yet another testament to the open ecosystem to see a Wikimedia project along with Open Researcher efforts make a resource to rival a closed standard. While Ács’ research isn’t the holy grail of translation between arbitrary languages, it cleverly mixes established theory and open data, and then contributes it back to the community.

“Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia”

Reviewed by Han-Teng Liao (talk)

A new study[3] by Tran and Christen is the latest example of academic research on vandalism detection, which has been developed over the years[supp 4] in the context of the PAN workshop[supp 5], where researchers develop both corpus data and tools to uncover plagiarism, authorship, and the misuse of social media/software. This work should be of interest to both researchers and Wikipedians because of (a) the need to detect vandalism and (b) the interesting question of whether such vandalism-fighting data and tools are transferable or portable from one language version to another. The vandalism-fighting corpus and tools have both practical and theoretical implications for understanding cross-lingual transfer in knowledge and bots.

In 2010 and 2011, Wikipedia vandalism detection competitions were included in PAN as workshops. It started with Martin Potthast’s work on building the free-of-charge PAN Wikipedia vandalism corpus for research, PAN-WVC-10, which compiled 32,452 edits based on 28,468 Wikipedia articles, among which 2,391 vandalism instances were identified by human coders recruited from Amazon’s Mechanical Turk[supp 6]. In 2011, a larger crowdsourced corpus of 30,000+ Wikipedia edits was released in three languages: English, German, and Spanish[supp 7], with 65 features to capture vandalism.

Based on even larger datasets of over 500 million revisions across five languages (en:English, de:German, es:Spanish, fr:French, and ru:Russian), Tran & Christen’s latest work adds to the efforts by applying several supervised machine learning algorithms from the Scikit-learn toolkit[supp 8], including Decision Tree (DT), Random Forest (RF), Gradient Tree Boosting (GTB), Stochastic Gradient Descent (SGD) and Nearest Neighbour (NN).

What Tran & Christen confirm from their findings is that “distinguishing the vandalism identified by bots and users show statistically significant differences in recognizing vandalism identified by users across languages, but there are no differences in recognizing the vandalism identified by bots” (p.13). This demonstrates that human beings can recognize a much wider spectrum of vandalism than bots, though bots are shown to be trainable to capture more and more non-obvious cases of vandalism.

Tran & Christen try to further make the case for the benefits of cross-language learning of vandalism. They argue that the detection models are generalizable, based on the positive results of transferring the machine-learned capacity from English to other, smaller Wikipedia languages. While they are optimistic, they acknowledge that such generalization has at best been proven among some of the languages they studied (all of which are Roman-alphabet-based except for Russian), and they note the poor performance of the Russian language model. Thus, Tran & Christen rightly point out the need for research on non-English and especially non-European language versions. They also recognize that many word-based features are no longer useful for some languages such as Mandarin Chinese, because of tokenization and other language-specific issues.

Tran & Christen call for future research projects to include languages such as Arabic and Mandarin Chinese to complete the United Nations working set of languages. It will be interesting to see how such research projects can be executed and how the greater Wikipedia research and editor community can help and/or use such research efforts.

Readers’ interests differ from editors’ preferences

Reviewed by Piotrus.

A conference paper titled “Reader Preferences and Behavior on Wikipedia”[4] deals with the under-studied population of Wikipedia readers. The paper provides a useful literature review of the few studies about the reading preferences of that group. The researchers used publicly available page view data, and more interestingly, were able to obtain browsing data (such as time spent by a reader on a given page). Since such data is unfortunately not collected by Wikipedia, the researchers obtained this data through volunteers using a Yahoo! toolbar. The authors used Wikipedia:Assessment classes to gauge article quality.

The paper offers valuable findings, including important insights for the Wikipedia community, namely that “the most read articles do not necessarily correspond to those frequently edited, suggesting some degree of non-alignment between user reading preferences and author editing preference”. This is not a finding that should come as much of a surprise, considering for example the high percentage of quality military history articles produced by the WikiProject Military History, one of the most active if not the most active wikiproject in existence – and how little importance this topic has for the general population. Statistics on topic popularity and the quality of corresponding articles can be seen in Table 1, page 3 of the article. Figure 1 on page 4 is also of interest, presenting a matrix of articles grouped by popularity and length. For example, the authors identify the area of “technology” as the 4th most popular, but the quality of its articles lags behind many other fields, placing it around the 9th place. It would be a worthwhile exercise for the Wikipedia community to identify popular articles that are in need of more attention (through revitalizing tools like Wikipedia:Popular pages, perhaps using code that makes WikiProject popular pages listing work?) and direct more attention towards what our readers want to read about (rather than what we want to write about). Finally, the authors also identify different reading patterns, and suggest how those can be used to analyze article popularity in more detail.

Overall, this article seems like a very valuable piece of research for the Wikipedia community and the WMF, and it underscores why we should reconsider collecting more data on our readers’ behavior. In order to serve our readers as best we can, more information on their browsing habits on Wikipedia could help to produce more valuable research like this project.

Wikipedia from the perspective of PR and marketing

Reviewed by Piotrus.

An article[5] in “Business Horizons”, written in very friendly prose (not a common finding among academic works), looks at Wikipedia (as well as some other forms of collaborative, Web 2.0 media) from the business perspective of public relations/marketing studies. Of particular interest to the Wikipedia community is the authors’ goal of presenting “the three bases of getting your entry into Wikipedia, as well as a set of guidelines that help manage the potential Wikipedia crisis that might happen one day.” The authors correctly recognize that Wikipedia has policies that must be adhered to by any contributors, though a weakness of the paper is that while it discusses Wikipedia concepts such as neutrality, notability, verifiability, and conflict of interest, it does not link to them. The paper provides a set of practical advice on how to get one’s business entry on Wikipedia, or how to improve it. While the paper does not suggest anything outright unethical, it is frank to the point of raising some eyebrows. While nobody can disagree with advice such as “as a rule of thumb, try to remain as objective and neutral as possible” and “when in doubt, check with others on the talk page to determine whether proposed changes are appropriate”, given the lack of consensus among Wikipedia’s community on how to deal with for-profit and PR editors, other advice such as “maximize mentions in other Wikipedia entries” (i.e. gaming WP:RED), “be associated with serious contributors…leverage the reputation of an employee who is already a highly active contributor… [befriend Wikipedians in real life]”, “When correcting negative information is not possible, try counterbalancing it by adding more positive elements about your firm, as long as the facts are interesting and verifiable”, and “…you might edit the negative section by replacing numerals (99) with words (ninety-nine), since this is also less likely to be read. Add pictures to draw focus away from the negative content” might be seen as more controversial, falling into the gaming-the-system gray area. The “Third, get help from friends and family” section in particular seems to fall foul of meatpuppetry.

In the end, this is an article worth reading in detail by all interested in the PR/COI topics, though for better or worse, the fact that it is closed access will likely reduce its impact significantly. On an ending note, one of the article’s two co-authors has a page on Wikipedia at Andreas Kaplan. It was restored by a newbie editor in 2012, two years after its deletion, and has been maintained by throw-away SPAs; this reviewer cannot help but notice that it still seems to fail Wikipedia:Notability (academics).

“No praise without effort: experimental evidence on how rewards affect Wikipedia’s contributor community”

Reviewed by Piotrus.

In 2012, the authors of this paper[6] gave out over a hundred barnstars to the top 1% most active Wikipedians, and concluded that such awards improve editors’ productivity. This time they repeated the experiment while broadening their sample to the top 10% most active editors. After excluding administrators and recently inactive editors, they handed out 300 barnstars “with a generic positive text that expressed community appreciation for their contributions”, divided between the 91st–95th, 96th–99th, and 100th percentiles of the most active editors (this corresponds to an average of 22, 62 and 282 edits per month, respectively) and then tracked the activity of those editors, as well as of the corresponding control sample which did not receive any award. The experiment was designed to test the hypothesis that less active contributors will be responsive to rewards, similar to the most highly active contributors from the prior research.

The authors found, however, that rewarding less productive editors did not stimulate higher subsequent productivity. They note that while the top 1% group responded to an award with an increase in productivity (measured at a rather high 60% increase), less productive subjects did not change their behavior significantly. The researchers also noted that while some of the top 1% editors received an additional award from other Wikipedians, not a single subject from the less active group was a recipient of another award.

The researchers conclude that “this supports the notion that peer production’s incentive structure is broadly meritocratic; we did not observe contributors receiving praise or recognition without having first demonstrated significant and substantial effort.” While this will come as little surprise to the Wikipedia community, their other observation – that outside the top 1% of editors, awards such as barnstars have little meaningful impact – is more interesting.

Further, the authors found that while rewarding the most active editors tends to increase their retention ratio, it may counter-intuitively decrease the retention ratio of the less active editors. The authors propose the following explanation: “Premature recognition of their work may convey a different meaning to these contributors; instead of signaling recognition and status in the eyes of the community, these individuals may perceive being rewarded as a signal that their contributions are sufficient, for the time being, or come to expect being rewarded for their contributions.” They suggest that this could be better understood through future research. For the community in general, it raises an interesting question: how should we recognize less active editors, to make sure that thanking them will not be taken as “you did enough, now you can leave”?


Wikipedia assignments improve students’ research skills

It is refreshing to see a continuing and growing stream of academic works endorsing various aspects of the teaching-with-Wikipedia paradigm. A study[7] of eleven students “enrolled in a semester-long academic literacy course in a preparatory program for study at an Australian university… showed an educationally statistical improvement in the students’ research skills, while qualitative comments revealed that despite some technical difficulties in using the Wikipedia site, many students valued the opportunity to write for a ‘real’ audience and not just for a lecturer.”

A split in the growing field of Chinese-language Wikipedia research

A blog post[8] by Han-Teng Liao (廖漢騰) presents an interesting exploratory overview of Chinese-language research on Wikipedia. The findings suggest that Chinese-language scholars and academic publication outlets are increasingly doing research in the field of Wikipedia studies; however, there’s “a divide between mainland Chinese academic sources/search results on one hand, and Hong Kong/Taiwanese ones on the other.” The reason for this seems to be primarily technical, as scholars from different regions seem to publish in different outlets, which in turn are not indexed in the academic search engines preferred by those from other regions.

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.

  • “Uneven Openness: Barriers to MENA [Middle East/North Africa] Representation on Wikipedia”[9] (blog post)
  • “Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy”[10]
  • “The Reasons of People Continue Editing Wikipedia Content – Task Value Confirmation Perspective”[11]
  • “Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge”[12]
  • “Identifying Duplicate and Contradictory Information in Wikipedia”[13]
  • “The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia”[14]
  • “What do we Think an Encyclopaedia is?”[15] From the abstract: “Based on survey and interview research carried out with publishers, librarians and higher education students, [this article] demonstrates that certain physical features and qualities are associated with the encyclopaedia and continue to be valued by them. Having identified these qualities, the article then explores whether they apply to three incidences of electronic encyclopaedias, Britannica Online, The Stanford Encyclopedia of Philosophy and Wikipedia.”
  • “Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research”[16]. From the abstract: “using a dataset collected from the Scopus research database, which is processed with a combination of bibliometric techniques and qualitative analysis [this article finds] that there has been a significant increase in the use of Wikipedia as a reference within all areas of science and scholarship. Wikipedia is used to a larger extent within areas like Computer Science, Mathematics, Social Sciences and Arts and Humanities, than in Natural Sciences, Medicine and Psychology.”
  • “How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)”[17]


  1. Osman, Kim (2014-06-17). “The Free Encyclopaedia that Anyone can Edit: The Shifting Values of Wikipedia Editors”. Culture Unbound: Journal of Current Cultural Research 6: 593–607. doi:10.3384/cu.2000.1525.146593. ISSN 2000-1525.
  2. Ács, Judit (May 26–31, 2014). Pivot-based multilingual dictionary building using Wiktionary.
  3. Tran, Khoi-Nguyen; P. Christen (2014). “Cross Language Learning from Bots and Users to detect Vandalism on Wikipedia”. IEEE Transactions on Knowledge and Data Engineering Early Access Online. doi:10.1109/TKDE.2014.2339844. ISSN 1041-4347.
  4. Janette Lehmann, Claudia Müller-Birn, David Laniado, Mounia Lalmas, Andreas Kaltenbrunner: Reader Preferences and Behavior on Wikipedia. HT’14, September 1–4, 2014, Santiago, Chile. http://www.dcs.gla.ac.uk/~mounia/Papers/wiki.pdf
  5. Kaplan, Andreas; Michael Haenlein. “Collaborative projects (social media application): About Wikipedia, the free encyclopedia”. Business Horizons. doi:10.1016/j.bushor.2014.05.004. ISSN 0007-6813. Closed access
  6. Restivo, Michael; Arnout van de Rijt. “No praise without effort: experimental evidence on how rewards affect Wikipedia’s contributor community”. Information, Communication & Society: 1–12. doi:10.1080/1369118X.2014.888459. ISSN 1369-118X.
  7. Miller, Julia (2014-06-13). “Building academic literacy and research skills by contributing to Wikipedia: A case study at an Australian university”. Journal of Academic Language and Learning 8 (2): A72–A86. ISSN 1835-5196.
  8. Liao, Han-Teng (2014-06-20). Chinese-language literature about Wikipedia: a meta-analysis of academic search engine result pages.
  9. Graham, Mark; Bernie Hogan (2014-04-29). “Uneven Openness: Barriers to MENA Representation on Wikipedia”. Rochester, NY: Social Science Research Network. http://papers.ssrn.com/abstract=2430912.
  10. Generous, Nicholas; Geoffrey Fairchild, Alina Deshpande, Sara Y. Del Valle, Reid Priedhorsky (2014-05-14). “Detecting epidemics using Wikipedia article views: A demonstration of feasibility with language as location proxy”. arXiv:1405.3612 [physics].
  11. Lai, Cheng-Yu; Heng-Li Yang. “The Reasons of People Continue Editing Wikipedia Content – Task Value Confirmation Perspective”. Behaviour & Information Technology (ja): 1–47. doi:10.1080/0144929X.2014.929744. ISSN 0144-929X.
  12. Salor, E.: Circling the Infinite Loop, One Edit at a Time: Seriality in Wikipedia and the Encyclopedic Urge. In Allen, R. and van den Berg, T. (eds.) Serialization in Popular Culture. London: Routledge p.170 ff.
  13. Weissman, Sarah; Samet Ayhan, Joshua Bradley, Jimmy Lin (2014-06-04). “Identifying Duplicate and Contradictory Information in Wikipedia”. arXiv:1406.1143 [cs].
  14. Mihai Grigore, Bernadetta Tarigan, Juliana Sutanto and Chris Dellarocas: “The impact of elite vs. non-elite contributor groups in online social production communities: The case of Wikipedia”. SCECR 2014
  15. Schopflin, Katharine (2014-06-17). “What do we Think an Encyclopaedia is?”. Culture Unbound: Journal of Current Cultural Research 6: 483–503. doi:10.3384/cu.2000.1525.146483. ISSN 2000-1525.
  16. Lindgren, Simon (2014-06-17). “Crowdsourcing Knowledge Interdiscursive Flows from Wikipedia into Scholarly Research”. Culture Unbound: Journal of Current Cultural Research 6: 609–627. doi:10.3384/cu.2000.1525.146609. ISSN 2000-1525.
  17. Spree, Ulrike (2014-06-17). “How Readers Shape the Content of an Encyclopedia: A Case Study Comparing the German Meyers Konversationslexikon (1885-1890) with Wikipedia (2002-2013)”. Culture Unbound: Journal of Current Cultural Research 6: 569–591. doi:10.3384/cu.2000.1525.146569. ISSN 2000-1525.

Supplementary references and notes:
  1. Mausam and Soderland, Stephen and Etzioni, Oren and Weld, Daniel S. and Skinner, Michael and Bilmes, Jeff (2009). “Compiling a Massive, Multilingual Dictionary via Probabilistic Inference“. 
  2. Hungarian Front Page.
  3. wiki2dict github.
  4. For example, in 2013 only two languages were studied [1], in contrast to the five languages reported in this 2014 journal article.
  5. http://pan.webis.de/
  6. See [2]
  7. See [3]
  8. Scikit-learn is an open source project in Python for machine-learning

Wikimedia Research Newsletter
Vol: 4 • Issue: 7 • July 2014
This newsletter is brought to you by the Wikimedia Research Committee and The Signpost

by wikimediablog at August 08, 2014 09:51 PM

Two shades of Wikipedia in Punjabi

Punjabi Wikipedian Satdeep Gill (left) discussing a general Wikipedia editing aspect with Shyamal Lakshminarayan and Shubha

In June of 2014, the Wikimedia blog reported the end of a month-long Umepedia Challenge which aimed to create Wikipedia articles on the Swedish city of Umeå in as many languages as possible. If somebody were to take a wild guess, they could make the assumption that the contributing winner would hail from Europe since the contest pertains to a European city. But surprisingly, the winner is Satdeep Gill, a contributor for Punjabi Wikipedia. He proudly claimed in his Facebook post: “I won the Umepedia Challenge by creating all the articles in Punjabi and a few of them in Hindi and Urdu.” This is the zeal and enthusiasm of Punjabi Wikipedia admin Satdeep. His efforts to advance and maintain the Punjabi Wikipedia are equally shared by co-admin Vigyani as seen in the latter’s inquiries and application of the editing norms of other Indian language Wikipedias on Punjabi Wikipedia. One of Vigyani’s recent initiatives is a query regarding translating and transliterating foreign words on the Hindi Wikipedia Village Pump.

Intrigued by the keenness of the two sysops and the increase in the number of contributors on Punjabi Wikipedia, I decided to get more information from Punjabi Wikipedians by way of a 20-point questionnaire. I got responses from five leading Punjabi Wikipedians. A common factor in the responses is that they were all introduced to Punjabi Wikipedia out of curiosity, when they noticed the interwiki link provided by Wikidata on the left-hand side of the screen of many English articles. A motivating factor for these editors was reflected in the words of Babandeep Singh: “Seeing how the wiki was lagging with respect to the quantity and quality of articles, I decided to contribute as much as I could.” The survey also revealed that Patiala has the highest number of active editors, with at least three known contributors hailing from the city. The main factors in attracting and retaining new editors here have been the satisfactory language interface and the editing tools. Although the current Punjabi Wikipedia community is relatively small, according to Parveer Singh Grewal, the atmosphere here is good and there is very little room for conflict.

Punjabi Wikipedian Charan Gill (right) along with Niraj Suryavanshi

According to Charan Gill, while Punjabi Wikipedia has a number of stubs that don’t go beyond a one-line description, many are in the process of being reworked into full-length articles. The respondents generally felt that a neutral point of view is being observed. With regard to the future growth of Punjabi Wikipedia, Vigyani points out: “I recently created many articles using AWB and data lists in form of CSV files on topics of geography and politics. Articles related to politics were already being done on Hindi Wikipedia. I borrowed their data. I then created my own data sheets for geography articles, which were also provided by Hindi Wikipedia. This kind of collaboration can be done across all the other language projects, especially among Indian languages. A huge number of stub/start class articles can be created by recording the data in excel sheets and using bot or AWB. A large part of data is numeric and rest text. By easily translating those text portions, data lists for each local language can be created, resulting in a huge number of articles on important topics.” Satdeep Gill, for his part, plans to promote Wikipedia in the government schools of Punjab. According to him, even a single editor from one school will make a huge difference to the Punjabi Wikipedia. It was also acknowledged that Punjabi Wiktionary and Wikibooks are short of contributors. These projects can reach a commendable level of contributions only after more users from the Punjabi Wikimedia community are enlisted for them.
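The data-list approach Vigyani describes can be sketched roughly as follows. This is only an illustration, not the AWB workflow itself: the column names (`name`, `district`, `state`, `population`), the sentence template and the sample row are all hypothetical placeholders, and a real run would use a template translated into the target language.

```python
import csv
import io

# Hypothetical sentence template; the field names are assumptions made for
# this sketch and would be replaced by a translated template per language.
STUB_TEMPLATE = (
    "{name} is a village in {district} district, {state}. "
    "Its population is {population}."
)


def stubs_from_csv(csv_text, template=STUB_TEMPLATE):
    """Return a {title: stub text} dict with one entry per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: template.format(**row) for row in reader}


# Placeholder data list, standing in for a sheet exported as CSV.
SAMPLE = """name,district,state,population
Example Village,Example District,Punjab,1200
"""

stubs = stubs_from_csv(SAMPLE)
for title, text in stubs.items():
    print(title, "->", text)
```

Since the numeric columns stay the same across languages, only the template text would need translating to reuse one data list on another wiki. The generated text would still have to be uploaded with a bot or AWB and reviewed by editors.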

It is widely known that Punjabi is written in Gurmukhi script in India while a Persian-style Shahmukhi script is used in Pakistan. Most Punjabi people of these countries speak the same language but are aware of only the one script predominantly used in their own country. Even in the midst of this divide, there are a few Wikipedians who contribute to Wikipedia in both scripts. One such contributor is Abbas Dhothar from Pakistan. Although he is active on the Shahmukhi script version of Punjabi Wikipedia, called “Western Punjabi Wikipedia,” he contributes to a cultural integration of Punjabi Wikipedians by creating and expanding articles on notable personalities such as Maharaja Ranjit Singh. He has listed links to the website of the Indian Punjabi weekly newspaper Ajit and the global Punjabi unity website Sanjha Punjab on his Western Punjabi Wikipedia userpage. Abbas has also created 20 articles on the Gurmukhi version of Punjabi Wikipedia, besides editing several articles written by Indian users. In some ways, he seems to echo the statement of Shahmukhi-knowing Satdeep Gill: “I was even thinking one day we could unite both the Wikipedias into one.”

During my humble efforts to reach out to Western Punjabi Wikipedians, I was lucky to get a response from Khalid Mahmood, the lone admin of the Western Punjabi Wikipedia. As a professor of English, Khalid realized the immense difficulties students face in learning English, and he favours the dissemination of knowledge in native languages such as Punjabi. According to him, while there are only 7-8 active and dedicated contributors on Western Punjabi Wikipedia, the quality of its native-language content has resulted in 23,000-27,000 clicks every day, making it the most referred-to website in the language. For the last six years, the commencement and advancement of Western Punjabi Wikipedia has remained a passion for Khalid. He considers the invitations and travel scholarships to Wikimania 2012 in Washington, D.C. and Wikimania 2014 in London as rewards for his dedicated efforts in starting Western Punjabi Wikipedia, Western Punjabi Wiktionary and Western Punjabi Wikiquote, which is likely to soon come out of incubation as a full-fledged Wikimedia project. Khalid wants to see Western Punjabi Wikipedia become a reliable source of information, a cultural centre for Punjabi people and a matter of pride for them. He hopes for friendly collaboration with the Indian Punjabi Wikipedians. While both Punjabi and Western Punjabi Wikipedias are witnessing growth and expansion, I consider it a welcome sign that Punjabi Wikipedians across both India and Pakistan believe in the need for cooperation and collaboration and are ready to work in a cordial and mutually beneficial manner in the Wikipedia sphere.

I would like to thank all Punjabi Wikipedians from both India and Pakistan for the valuable input used in this survey.

Syed Muzammiluddin, Wikipedian.

by carlosmonterrey at August 08, 2014 09:19 PM

Harry Burt

Wikimania: Hackathon and Opening Ceremony

What have I been up to at the Hackathon?

  • I’ve fixed svgtranslate, a tool which I’ve never liked and desperately tried to make obsolete. But in the meantime I’ve had to acknowledge that it should at least function correctly. Annoyingly I was held up for a couple of hours debugging a problem where object caching was working too well. Very frustrating. On the plus side, I did get OAuth working — pretty easily in fact. A useful experiment itself.
  • I’ve started going back over its (should be) successor TranslateSvg, adding in a couple more tests and fixing some bugs.
  • I tidied up VoiceIntro, a tool for recording audio snippets.
  • Following a request passed on to me by Charles Matthews I’ve set up SafeSandbox, a space that should become a Shibboleth-identified, protected wiki space suitable for use in schools. Although I was able to install Shibboleth, its configuration still confuses me and I reached out for more help.
  • I’ve conducted brief interviews with Dan Garry about SUL finalisation and James Forrester about the VisualEditor (which I just need to tidy up before I can publish).
  • I attended a discussion about the MediaViewer, a point of some controversy since the Foundation are currently refusing to disable it. The new draft, due to go live later this year, looks much better: focussed on readers, it includes minimal details but features a bright blue button that links to Wikimedia Commons (slides). Considering the controversy, I was very impressed by the team, with lots of beta testing, monitoring disable rates, recording clicks, even coding of free text entry comments in the survey panel.
  • I’ve chatted to dozens of people, some old friends, some new.

I also attended the Opening Ceremony last night. Unfortunately it was not as interesting as I thought it might be, consisting mostly of lists of “thank you”s repeated ad nauseam. Wikimania coordinator Ed Saperia will have been glad just to have got through; WMUK CEO Jon Davies was at least brief and the only really novel contribution – a keynote from Amnesty International CEO Salil Shetty – was weirdly out of place. Thankfully, more seasoned Wikimanians pointed me to a suitable ‘buzzword bingo’ grid, to help keep my attention on the selection of speakers, suggesting this was not the first slightly tedious Wikimania opening ceremony. Nevertheless, I think, were I to run a Wikimania, I would probably scrap it altogether: there was ultimately very little that needed to be said at all; everyone just wanted to get on with it.


by Harry at August 08, 2014 06:58 PM

Wiki Education Foundation

Happy International Cat Day!

Today is International Cat Day, though our feline sources commented that every day is international cat day. Did you know your favorite internet stars can get H5N1 avian influenza? Learn more about causes, symptoms, and preventative measures in the article about this disease in cats. Thanks to student editor Acorn Drains Cans, who started the article for an assignment in Dr. Sherry Seston’s virology course at Alverno College in the spring 2013 term, for helping readers keep their furry friends safe!

Jami Mathewson
Program Manager

by Jami Mathewson at August 08, 2014 05:03 PM

Gerard Meijssen

Dear #Wikipedia, they are not what we call a "human"

At #Wikidata it hurts when you are cheating. For us a human is singular; he or she has a date of birth, maybe a date of death and that is what we expect to find in "20th-century births" and all its subcategories.

We could argue that a horse, a cat or a dog has a date of birth as well, but really, Scott Alexander and Larry Karaszewski, for instance, are not one human, nor are they singular. Together they do not have a date of birth, they have two.

Because of the problems that articles like the one about Scott and Larry generate, we put them on a "black list". We make them a "group of people". In this way we will not consider them for all kinds of subsequent statements. We will not make them an alumnus or give them an occupation. That is reserved for humans.
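As a rough sketch of that rule, assuming each item carries a simple kind label: the property names and the `allowed_properties` helper below are hypothetical and only illustrate the idea; this is not how Wikidata or its tools actually implement it.

```python
# Statements that only make sense for a single human (illustrative list).
PERSON_ONLY_PROPERTIES = {"date of birth", "alma mater", "occupation"}


def allowed_properties(item_kind, candidate_properties):
    """Drop person-only statements for anything marked as a group of people."""
    if item_kind == "group of people":
        return [p for p in candidate_properties if p not in PERSON_ONLY_PROPERTIES]
    return list(candidate_properties)


candidates = ["occupation", "alma mater", "country"]
print(allowed_properties("human", candidates))
print(allowed_properties("group of people", candidates))
```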

by Gerard Meijssen (noreply@blogger.com) at August 08, 2014 07:44 AM