November 25, 2015

Wiki Education Foundation

Announcing McMaster University’s Visiting Scholar

Danielle Robichaud (User:Dnllnd), McMaster University’s Visiting Scholar.

I’m pleased to announce Danielle Robichaud (User:Dnllnd) as McMaster University’s Wikipedia Visiting Scholar.

She began editing Wikipedia as an extension of digital outreach efforts in her work as an archivist. Eventually, she became an active editor in her time off, too.

“During the archival description process, archivists write administrative histories of organizations and biographical sketches of record creators,” she told us. “The work draws on the same type of writing required to generate good Wikipedia articles. So, from a skills standpoint, the transition was pretty smooth.”

Danielle will be working with Dale Askey, McMaster University’s Associate University Librarian for Library and Learning Technologies. Asked why he’s participating in the Visiting Scholars program, he explained that it’s in the best interests of universities, libraries, and other educators to have Wikipedia articles supported by the best sources, which aren’t always the most easily accessed. By working with Danielle, McMaster can help while also surfacing some of its rich digital collections. “Wikipedia as an information source is already a key part of the digital fabric. It seems that it’s time for universities to get past their collective phobia about it and embrace it as an opportunity.”

While looking forward to having remote access to McMaster’s research resources, as an archivist Danielle says she’s also excited to “explore how material in archives and special collections can be integrated into, and used to generate, Wikipedia content.” She will be focusing on people and events that aren’t represented well on Wikipedia.

A specific example she wants to improve is the article for Lady Constance Malleson. Highlighting a common symptom of the gender gap on Wikipedia, Danielle explains, “she was an author and actress involved in the pacifist movement during the First World War, yet (until recently) the main focus of her Wikipedia page was her affair with Bertrand Russell, and the open marriage she had with her husband.”

The Wikipedia Visiting Scholars program connects experienced Wikipedia editors with research libraries. Together, they find Wikipedia articles to improve using the library’s digital resources. For information about sponsoring or becoming a Wikipedia Visiting Scholar, see our Visiting Scholars page.

Dnllnd photo: “Dnllnd” by Dnllnd, own work. Licensed under CC BY-SA 4.0 via Commons. Library photo: “Mills Memorial Library and plaza” by Tom Flemming. Licensed under CC BY-SA 2.0 via Wikimedia Commons.

by Ryan McGrady at November 25, 2015 05:00 PM

November 24, 2015


Barack Obama GeneaWiki, 1 year later

GeneaWiki is a tool created by Magnus Manske to visualize the family of a person using data pulled from Wikidata.

I used the GeneaWiki tool as an example use of Wikidata in a presentation a year ago (2014) and below you can see the screenshot I took from it. It shows 10 people in Barack Obama’s family tree / web.

GenaWiki Q76 2014


When creating a new presentation this year (2015) I went back to GeneaWiki to take another screenshot and this is what I found!

GenaWiki Q76 2015

Around 30 people now! :)

Yay, more data!


by addshore at November 24, 2015 10:05 PM

Wikimedia Foundation

Victory in Germany: court rules in favor of the Wikimedia Foundation


The Munich skyline, seen here in 2001. Image by Stefan Kühn, freely licensed under CC BY-SA 3.0.

We are happy to announce that the court of Munich has ruled in favor of the Wikimedia Foundation in Dr. Evelyn Schels v. Wikimedia Foundation, Inc. In May 2015, Dr. Schels filed a lawsuit in Munich requesting the removal of her date of birth from the German Wikipedia page about her. Since Dr. Schels is a famous person and her birthdate was taken from a publicly available source, the court held that the Wikimedia Foundation did not infringe on her rights.

Dr. Evelyn Schels is a well-known screenwriter and director in Germany. In her lawsuit, Dr. Schels argued that her general right of personality and her data protection rights were infringed because she did not consent to the publication of her date of birth in the Wikipedia article about her. She also argued that the presence of her date of birth in her Wikipedia article could lead to disadvantages in her job.

The Wikimedia Foundation responded by explaining the Foundation’s role as a hosting provider, why the publication was protected free speech, and why Dr. Evelyn Schels is a famous person in whom the public has a legitimate interest, including in background biographical details about her.

On November 13, 2015, the court of Munich ruled in favor of the Wikimedia Foundation. The judgment (translated into English) confirmed that the publication of Dr. Schels’ date of birth on Wikipedia did not infringe upon her general right of personality or data protection rights. The court recognized that a birth year made accessible by the Claimant herself through publicly available sources such as a book “would not remain limited to a small circle of people … but be accessible to a circle of users unlimited in theory.”

According to the court, Dr. Schels “has to be considered a person in which the general public is interested”. Indeed, because she is a renowned producer of documentaries, it is of interest to the public to know which movies she produced at what age. Thus, there is a public interest in her birthdate. Furthermore, the court emphasized that Dr. Schels’ birthdate is not the most private data that has been released. For instance, some data is available regarding Schels’ education and production activities from which one could reasonably determine her age.

The opinion further acknowledged the neutrality and objectivity of Wikipedia, noting: “The Wikipedia entry represents general information for the formation of public opinion, not a publication for the purpose of advertising or marketing … It does not appear to the court that the publication in dispute significantly affects the Claimant in any way.”

Finally, the court declared the transmittal of data to be admissible and declared: “[Since] The birth year is from a publicly accessible source, which [Dr Schels] herself has published, there is no reason to assume that [Dr. Schels] has an interest warranting protection in preventing the data transmittal of her birth year. In fact, the balancing of the general right of personality . . . with the communication interest … favors the latter [i.e. publication].”

Publicly available sources are critical for the existence of Wikipedia. Those who publish information and make it available to the general public should expect a large number of people to be able to find it online, including on Wikipedia, some day. The Wikimedia Foundation will keep supporting you—the global community—in constructing the best and most comprehensive encyclopedia possible, and will fight against lawsuits that seek to prevent the dissemination of free knowledge.

Michelle Paulson, Legal Director
Jacob Rogers, Legal Counsel

* We would like to extend our sincere thanks to the attorneys at Schlüschen Müller in Germany, particularly Dr. Holger Müller, for their exemplary legal representation and dedication to the Wikimedia movement. Special thanks should also go to Gaetan Goldberg, WMF legal fellow, for his assistance on this blog post.

by Michelle Paulson and Jacob Rogers at November 24, 2015 08:24 PM

Mark A. Hershberger

MediaWiki as a community resource

As is only to be expected, Brion asked:

What sort of outcomes are you looking for in such a meeting? Are you looking to meet with engineers about technical issues, or managers to ask about formally committing WMF resources?

I copy-pasted Chris Koerner’s response:

  • People use MediaWiki!
  • How can we bring them into the fold?
  • What is the WMF stance on MediaWiki? Is it part of the mission or a by-product of it?
  • Roadmap roadmap roadmap

But I couldn’t let it stop there, so I went into rant mode.

Since it seems that some people involved in the shared hosting/non-technical office hour weren’t aware of us — “they don’t report bugs” was said over and over and just isn’t true… @cicalese, for example, has been struggling with submitting code — we do contribute and we have a huge investment in the future of MW.

There are a number of large users — NASA, NATO, Pfizer, oil companies, medical providers and medical researchers, various government agencies as well as the numerous “less serious” game-based wikis. The list goes on.

None of these uses are controlled by the Foundation, but they do feed the mission statement of the WMF by providing a tool that people use to “empower and engage people around the world to collect and develop educational content … and to disseminate it effectively and globally.”

Even if the content isn’t released in the public domain (e.g. it is kept “in house”), it trains people to use the MediaWiki software and allows them to share their knowledge where it is appreciated, even when that knowledge isn’t notable enough for a project with Wikipedia’s aspirations.

The problem, as I see it, is one of direction and vision. Should WMF developers continue to only be concerned with those who have knowledge to share that the Wikipedia communities allow, or should their efforts enable people to share less noteworthy knowledge that, while it doesn’t meet the bar set for Wikipedia, is still part of the sum of all human knowledge that it is WMF’s vision to ensure everyone has access to?

It’s true, some organisations will set up wikis that are not publicly accessible. Even the WMF has some non-public wikis. The Wiki, though, is an amazing tool for publishing knowledge and people have seen the potential (through Wikipedia) of this idea of providing a knowledge sharing tool where “anyone can edit.”

Without engaging those people who use MediaWiki outside of the WMF, the WMF is missing out on a huge amount of feedback on the software and interesting uses for it that the Foundation hasn’t thought of.

There’s a virtuous cycle that the Foundation is missing out on.

by hexmode at November 24, 2015 08:04 PM

November 23, 2015

Wikimedia Foundation

Wikimedia’s Funds Dissemination Committee—how to fairly distribute money around the world

The FDC and supporting WMF staff. Photo by MGuss (WMF), freely licensed under CC0 1.0.

This week, the Funds Dissemination Committee (FDC) recommended the distribution of almost $3.8 million to 11 independent affiliate organisations around the world.

This first round of Annual Plan Grant (APG) deliberations for 2015–16 was conducted at a marathon four-day meeting, but comes at the conclusion of months of work by the committee volunteers, the Wikimedia Foundation (WMF) supporting staff, and of course by the applicants themselves. This committee is composed of nine elected and appointed volunteers from countries around the world; each has been editing Wikimedia projects for over a decade. Now in its fourth year, the FDC meets twice annually to decide on funding levels for several Wikimedia affiliates. These deliberations are the first since the movement-wide 2015 Wikimedia elections.

While many Wikimedians are interested to see the amount of money that is being recommended, few are aware of the processes that the FDC uses to come to these decisions. Even fewer are aware of the level of conscientious thought that goes into these deliberations, both before and during the meeting.


The number, variety, and length of input documents needed for these deliberations is very high. Also, nearly all of them are publicly available to ensure transparency. Context is key in making decisions about funding activities all over the world: what is considered value for money, achievable, or even legal, can vary greatly depending on the applicant’s local situation. This extensive documentation, then, is required not only to ensure financial transparency in our movement, but also to ensure that local context and narrative are understood.

Once the applications are published, as per the APG calendar, the Wikimedia community at large is invited to comment. As the time to meet in person approaches, the regular phone meetings with the WMF support staff increase in frequency. This allows the committee members to identify any confusion or concerns they see across several applications, so that the staff can seek clarification from all applicants. Feedback from the community, staff, committee and applicants also helps to refine the application process itself each year.

Meeting in person, which the FDC does twice annually, is a rarity in the Wikimedia community’s various volunteer committees. However, given the significance of the decisions being made, there is a great need to ensure that all committee members are able to participate on an equal footing and that consensus is obtained in all aspects of the review.


The “bubbles”. Example of initial allocations for a fictional returning applicant requesting an increased grant. The spread ranges from full funding to lower than the current amount. Image by Sati Houston, freely licensed under CC BY-SA 4.0.

Example of a second round of allocations, following discussion. Note how the minimum increased and therefore the “spread” decreased. Image by Sati Houston, freely licensed under CC BY-SA 4.0.

The first major tool that the FDC uses to come to agreement, after reading all the documentation, is to assemble initial suggested allocation amounts. By independently sending these numbers to the WMF support staff before the in-person deliberations, each committee member can make a “first attempt” at allocating amounts without being influenced by the others. When combined, these initial allocations are visualised on a scale from zero to the full request, colloquially called “the bubbles”.

Being able to see how far apart the committee members are in their initial allocations, “the spread”, gives an excellent starting point for discussions. Some applicants receive a very narrow spread of initial allocations, meaning that all committee members generally agree on the result they want to see but not necessarily why, while other applicants receive a wide spread (either with an equal distribution or with individual outliers). A facilitated discussion of each applicant then follows, ensuring each has adequate time dedicated to it to highlight important positive areas or risks, and especially to identify important program areas. After all, if the committee is recommending to the WMF Board that it gives large amounts of money to a legally independent organisation, the Board needs to be able to justify it.
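As a rough illustration of “the spread”, the sketch below treats it as the gap between the highest and lowest allocation suggested in a round. That definition, the function names, and every figure are assumptions made up for this example; the tooling the committee actually uses is not described here.

```python
from typing import Sequence


def spread(allocations: Sequence[float]) -> float:
    """Gap between the highest and lowest suggested allocation in one round."""
    if not allocations:
        raise ValueError("at least one allocation is required")
    return max(allocations) - min(allocations)


def spread_is_decreasing(rounds: Sequence[Sequence[float]]) -> bool:
    """True when every round has a narrower spread than the round before it."""
    spreads = [spread(r) for r in rounds]
    return all(later < earlier for earlier, later in zip(spreads, spreads[1:]))


# Hypothetical figures for a fictional applicant requesting 100,000.
round_one = [60_000, 75_000, 80_000, 90_000, 100_000]
round_two = [75_000, 80_000, 80_000, 85_000, 90_000]

print(spread(round_one))
print(spread(round_two))
print(spread_is_decreasing([round_one, round_two]))
```

In this made-up case the minimum rises between rounds, so the spread narrows, which is the pattern the second “bubbles” image describes.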

The second major tool used is the “Gradients of Agreement”. Once the committee has extensively discussed an application, and has submitted at least two rounds of allocations to see if the “spread” is decreasing, the Chair of the committee proposes a specific amount to discuss. This could be the highest amount suggested in the latest round of allocation, the average, or the lowest number. This process can be repeated many times using different increments. By gauging the degree of strong and weak support or opposition to specific amounts, it is possible to agree on an amount that has the highest aggregate support from the whole committee. If necessary, serious dissenting views can also be expressed in the recommendation document.

At least one representative of the WMF Board of Trustees is present at all times to confirm that the process is conducted fairly. Also, given that all FDC committee members are also Wikimedia community members, there can be real or perceived conflicts of interest. To address this, the process for “recusal” is extremely thorough. In this round of deliberations, four members of the committee did not participate in deliberations about three of the applicants. They did not have access to any non-public documents about that application, were not present while the application was discussed and funds allocated, and did not participate in drafting the recommendation.

The next steps

Now that the recommendations have been published, applicants may choose to appeal to the independent FDC ombudsman before the Board decides on whether to accept or modify the FDC’s recommendation. After the Board’s decision, successful applicants can begin their projects! Six months from now these affiliates will begin again with a letter of intent to apply for the next round of APG funding. Meanwhile, the FDC process is already underway for the second group of APG applicants for 2015–16.

Liam Wyatt, Funds Dissemination Committee member

More information about these grants can be found in our mailing list announcement to the Wikimedia community.

by Liam Wyatt at November 23, 2015 11:48 PM

Magnus Manske

The beatings will continue until morale improves

So over the weekend, Wikimedia Labs ran into a bit of trouble. Database replication broke, and was lagging about two days behind the live databases. But, thanks to tireless efforts by JCrespo, replication has now picked up again, and replication lag should be back to normal soon (even though there might be a few bits missing).

Now, this in itself is not something I would blog about; things break, things get fixed, life goes on. But then, I saw a comment by JCrespo with a preliminary analysis of what happened, and how to avoid it happening again:

“…it is due to the constraints we have for labs in terms of hardware and human resources. In order to prevent this in the future, I would like to discuss enforcing stronger constraints per user/tool.”

So, there are insufficient resources invested into (Tools) Labs. The solution, obviously, is to curtail the use of resources. This train of thought should be familiar to everyone whose country went through a phase of austerity in recent years, even though it now seems to be commonly agreed, outside the cloudy realm of politicians, that austerity is the wrong way to go. If you have a good thing going, and you require some more resources to keep it that way, you give it more resources. You do not cut away scarce resources even more! This is how you go the way of Greece.

This is how you go the way of the toolserver.

by Magnus at November 23, 2015 11:18 PM


MediaWiki CRAP – The worst of it

I don’t mean MediaWiki is crap! The Change Risk Anti-Patterns (CRAP) Index is calculated based on the cyclomatic complexity and code coverage of a unit of code. Complex code and untested code will have a higher CRAP index compared with simple, well-tested code. Over the last 2 years I have been tracking the CRAP index of some of MediaWiki’s more complex classes as reported by the automatic coverage reports, and this is a simple summary of what has been happening.
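For reference, the commonly cited definition of the index (from the original crap4j tool) is CRAP(m) = comp(m)^2 × (1 − cov(m)/100)^3 + comp(m), where comp is the cyclomatic complexity and cov the coverage percentage. The sketch below assumes the coverage reports follow that definition; the function name and the sample values are made up for illustration.

```python
def crap_index(complexity: int, coverage_percent: float) -> float:
    """CRAP score of one unit of code, using the crap4j definition.

    complexity: cyclomatic complexity (1 or more)
    coverage_percent: test coverage of the unit, from 0 to 100
    """
    if complexity < 1:
        raise ValueError("cyclomatic complexity must be at least 1")
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage must be between 0 and 100")
    uncovered = 1.0 - coverage_percent / 100.0
    return complexity ** 2 * uncovered ** 3 + complexity


# Hypothetical units: fully tested code scores its bare complexity,
# while untested code is penalised by the square of its complexity.
print(crap_index(10, 100))
print(crap_index(10, 50))
print(crap_index(10, 0))
print(crap_index(500, 0))
```

Because complexity is squared, a single large untested method can dominate a class’s score, which is how scores in the hundreds of thousands can arise.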

Just over 2 years ago I went through all of the MediaWiki unit tests and added @covers tags to improve the coverage reports for the source. This brought the line coverage to roughly 4% toward the end of 2013. Since then the coverage has steadily been growing and is now at an amazing 9%. Note that I am only counting coverage of the includes directory here; when maintenance scripts and language definitions are included, the 9% is actually 7%.

You can see the sharp increase in coverage at the very start of the graph below.

Over the past 2 years there has also been a push forward with librarization which has resulted in the removal of many things from the core repository and creation of many libraries now required using composer. Such libraries include:

  • mediawiki/at-ease – A safe alternative to PHP’s “@” error control operator
  • wikimedia/assert – Alternative to PHP’s assert()
  • wikimedia/base-convert – Improved base_convert for PHP
  • wikimedia/ip-set – PHP library to match IPs against CIDR specs
  • wikimedia/relpath – Compute a relative path between two paths
  • wikimedia/utfnormal – Unicode normalization functions
  • etc.

All of the above has helped to generally reduce the CRAP across the code base, even in some of the locations with the largest CRAP scores.

The graph shows the CRAP index for the top 10 CRAP classes in MediaWiki core at any one time. The data is taken from 12 snapshots of the CRAP index across the 2-year period. At the very left of the graph you can see a sharp decrease in the CRAP index as unit test coverage was taken into account from this point (as in the coverage graph). Some classes fall out of the top 10 and are replaced by more CRAP classes through the 2-year period.

Well, coverage is generally trending up, CRAP is generally trending down. That’s good, right? The overall CRAP index of the top 10 CRAP classes has actually decreased from 2.5 million to 2.2 million! Which of course means for the top 10 classes the CRAP average has decreased from 250,000 to 220,000!

Still a long way to go but it will be interesting to see what this looks like in another year.

by addshore at November 23, 2015 10:35 PM

Wiki Education Foundation

Monthly Report for October


“WikiConference USA 2015 Group Photo 32” by Geraldshields11 – Own work. Licensed under CC BY-SA 4.0 via Wikimedia Commons - https://commons.wikimedia.org/wiki/File:WikiConference_USA_2015_Group_Photo_32.JPG#/media/File:WikiConference_USA_2015_Group_Photo_32.JPG
  • The Wiki Education Foundation was a proud co-sponsor, alongside the National Archives and Records Administration and the National Archives Foundation, of WikiConference USA 2015. We consider this event a resounding success in achieving its stated goal: connecting Wikipedians, educators, cultural institution staff, and others to share the work that connects and inspires them. We are grateful to the Wikimedia DC and Wikimedia NYC volunteers whose work made the event possible.
  • While in Washington, D.C., Wiki Ed hosted its first external fundraising event. Wiki Ed board members and staff made meaningful connections and discussed potential future collaborations and funding opportunities. We were immediately invited to host another event, a luncheon in New York, for early November.
  • We launched a new question-and-answer tool online, ask.wikiedu.org. The tool allows program participants to pose questions about Wikipedia; Wiki Ed staff or volunteers can respond to these questions. This tool has already been used by instructors, and should have an impact on our ability to scale our staff support by providing a database of the most commonly asked questions and answers.
  • The Classroom Program supported more than 3,000 students in October. This is the largest number of students we have supported to date, and our ability to support quality work from a student group this large is a testament to our great digital tools and staff.
  • The first Featured Article from a Wikipedia Visiting Scholar was approved in October. Gary Greenbaum, at George Mason University, reached Featured Article status for his work on the Boroughitis (http://en.wikipedia.org/wiki/Boroughitis) article. Featured Article designates one of the highest levels of article quality on Wikipedia.


Educational Partnerships & Outreach

Wiki Ed at USF 

In early October, Outreach Manager Samantha Erickson met University of San Francisco faculty to discuss the Visiting Scholars program. As a result of that meeting, USF plans to sponsor two Visiting Scholars in 2016.

Less than a year after formalizing a partnership with the National Women’s Studies Association, Wiki Ed is supporting 25 women’s studies courses during the Fall 2015 term (16% of all supported courses this term). Targeted outreach and support have proven to be a huge success, and we look forward to expanding our partnerships for similar outcomes. At WikiConference USA, Educational Partnerships Manager Jami Mathewson presented her work with the National Women’s Studies Association as a model example of the benefits of partnerships with academic associations.

Classroom Program

Status of the Classroom Program for Fall 2015 in numbers, as of October 31:

  • 156 Wiki Ed-supported courses had Course Pages (72, or 46%, were led by returning instructors)
  • 3,250 student editors were enrolled
  • 2,315 (or 71%) students successfully completed the online training
  • Students edited 2,330 articles and created 160 new entries

The Classroom Program team has been busy working with more than 3,000 enrolled students. This marks our largest number of classes and students to date.

We’re supporting a considerably larger number of students as compared to spring 2015 (2,290 students). Our course Dashboard and help site, ask.wikiedu.org, have improved the quality of staff support to instructors and enabled us to provide guidance to more students. With most of our students well into the editing part of their assignments, we’re eager to see the impact these tools make in the quality of student contributions.

Ximena Gallardo C., Ann Matsuuchi, and former students at the National Archives.

The Wiki Ed team heard firsthand from several instructors at WikiConference USA 2015. Matthew Vetter (Ohio University), Zach McDowell (University of Massachusetts, Amherst), and Ximena Gallardo C. (CUNY LaGuardia), all longtime participants in the Classroom Program, presented their models for Wikipedia assignments. We also had a chance to speak with Jason Smith and Ann Matsuuchi (CUNY LaGuardia) and meet a few of their students. In another presentation, Wiki Ed board member Chanitra Bishop presented on the variety of Wikipedia assignments she has encountered in her work as a librarian.

Classroom Program Manager Helaine Blumenthal and Samantha presented to medical students in Amin Azzam’s course at the University of California, San Francisco. What we heard in these encounters is that, despite the extra effort associated with a Wikipedia assignment, students and instructors find it a meaningful and rewarding experience.

Helaine Blumenthal presents Wiki Ed to a course at UCSF.

We’re already planning our instructor onboarding for spring 2016. We’re looking to the upcoming Year of Science and exploring ways for students to contribute to this exciting initiative.

Student work highlights:

  • Students in Duke University’s Evolution of Animal Behavior course have been doing exceptional work.
    • Animals engage in distraction displays to divert the attention of a predator; the killdeer, a small North American bird, is famous for the “broken wing” displays it uses to lure predators away from its nest. Student editors expanded the short (362-word) article by adding sections on the evolutionary origins and adaptive functions of distraction displays. While people usually associate this behavior with birds, students added sections which discussed the behavior in fish and mammals. They added sections looking at the underlying costs the behavior entails and the sorts of factors which influence the decision of the animal to engage in this display. The student editors expanded the article from 362 to 1,339 words (and have continued to expand it since the end of October).
    • Appropriately for Halloween, students expanded the communal roosting article from a 50-word stub with no references into a well-referenced article with more than 2,200 words. They added sections discussing three separate hypotheses which seek to account for the evolution of communal roosting and discussed examples in birds, butterflies, tiger beetles, and bats. They also added three new images to the article.
    • Students created a new article on pursuit predation. While many predators ambush their prey, pursuit predators run down fleeing prey. Video sequences of pursuit predation by cheetahs, lions, and wolves are a staple of nature television programming. Students added sections discussing the strategy of pursuit predation and sections discussing the evolutionary and ecological basis. They also added several examples of pursuit predation, including Matabele ants and dragonflies.
    • Osteophagy is the practice of consuming bones. While the consumption of bones by carnivores is unremarkable, it seems strange when herbivores do it, usually as a way to compensate for deficiencies of calcium or phosphate in their diet. Student editors expanded a three-sentence stub into an 862-word article. They added sections discussing the phenomenon in a variety of species including turtles, cattle, and giraffes. They also added a discussion of the practice in humans.
  • Students in the University of Washington’s Interpersonal Media created an article on Hang Ryul Park, a famous South Korean artist and poet who has exhibited his work internationally. Another student improved Russell Street, Hong Kong, an important street, by making better use of available images and adding information about a transportation strike in the 1950s.
  • Students from Diana Strassmann’s Poverty, Justice, and Human Capabilities course at Rice University made several good contributions to articles dealing with issues of global health, education, and gender.
    • Students created a new article on Gender Inequality in Sri Lanka. In addition to its now 40 references, the article makes good use of media from Wikimedia Commons.
    • As part of their research related to poorly covered global health issues, students essentially created the Health in Guatemala article from a redirect, adding images from Commons and information about health care issues and infrastructure.
  • Students at Carleton University’s Topics in Cinema and Gender added content about female directors and film makers:
    • Shirley Barrett: Cinemazing expanded an article on an Australian film director which had languished as a stub for four years. Starting from a single line of biography and a list of films, they included information on her work and important themes therein.
    • Miwa Nishikawa: Working on an article which was a bit larger to start with, Deseolopez fleshed out her biography with information on her films as well as her writing career.
    • Nancy Grant: Cue95 created a new article about this English-Québécois film producer who produced more than 27 short and feature-length films, using existing images on Wikimedia Commons to illustrate their work.

Community Engagement

The current group of Visiting Scholars started just last month, but they’re already showing us the valuable contributions long-time Wikipedians can make. Gary Greenbaum, the Visiting Scholar at George Mason University, produced three Good Articles (Lexington-Concord Sesquicentennial half dollar, Huguenot-Walloon half dollar, and Cleveland Centennial half dollar) and one Featured Article (Boroughitis). These designations represent the highest levels of article quality. Meanwhile, Wiki Ed’s most recent Scholar, McMaster University’s Danielle Robichaud, is making significant improvements to the article about the British writer and actor Lady Constance Malleson.

Andrew Lih and Wiki Ed staff

Community Engagement Manager Ryan McGrady represented the Community Engagement program at WikiConference USA 2015, leading a session about alternative forms of expert engagement on Wikipedia. He joined Wikipedia Content Expert Ian Ramjohn and Andrew Lih for a session about Wiki Ed’s Year of Science initiative.

Throughout October, Ryan has also been working with Samantha and Jami to expand the Visiting Scholars program, conducting outreach and entering conversations with contacts at institutions interested in sponsorships. That team has been developing strategies to bring Scholars on board to work on science-related topics for the Year of Science campaign.

Program Support


Communications Manager Eryk Salvaggio spent the first half of October preparing for, and following up on, WikiConference USA 2015. The event was live-streamed on YouTube, and up-to-date information, quotes, and resources were shared through Wiki Ed’s social media channels. Eryk also recruited several presenters on education-related topics to contribute to wikiedu.org.

Dr. Diana Strassmann, board chair of the Wiki Education Foundation and Carolyn and Fred McManis Distinguished Professor in the Practice at Rice University, was interviewed for a segment on Wikipedia’s role in education for the American Public Media program American RadioWorks, which explores “the people, ideas, and innovations that are changing education in the 21st century.”

Blog posts:

External Media:

WikiConference USA archive streams:

Thanks to the National Archives, full-day streams of WikiConference USA 2015 presentations at the McGowan Theater are available to view: Friday, Saturday, Sunday. These videos will eventually be edited by WikiConference USA volunteers. In the meantime, a list of education-related presentations is available here.

Digital Infrastructure

A screenshot of Wiki Ed’s ask.wikiedu.org tool for finding help and posting unanswered questions.

The biggest digital infrastructure news from October is the official launch of ask.wikiedu.org, our new question-and-answer site. Product Manager Sage Ross set up our custom instance of the open-source Askbot platform, and the Programs and Program Support teams seeded it with more than 150 of the questions that come up most often from instructors and students. A search bar is now integrated into the Dashboard, and we’ve already seen instructors start to use it on their own to read existing answers and ask new questions.

Our main work with our WINTR development partners this month focused on the upcoming in-Dashboard training system. We have a prototype of the new system up and running, including draft content for updated instructor and student training, and we expect to begin testing it with first-time instructors in November, ahead of a planned roll-out for all Spring 2016 courses.

Finance & Administration / Fundraising

Finance & Administration


For October, expenses were $295,806 versus the plan of $446,259. The $150k variance has two main causes. The first is timing: hiring delays ($12k), audit fees ($11k), travel expenses ($14k), and WikiCon expenses ($24k). The second is our decision to postpone certain expenses until we have a better grasp on our funding flow: moving/expansion plans ($62k), design services ($13k), and equipment for new staff ($6k).

Wiki Education Foundation, Year to Date Expenses to October 2015.

Year-to-date expenses are $1,035,551 versus the plan of $1,288,252. The $253k variance combines $102k of ongoing savings from prior months with this month’s $150k savings. The prior-month savings come from “Promotional Items” ($7k) and our “All Staff Meeting” ($4k), plus delays and/or temporary postponements of “Personnel” hirings ($14k), “Staff Development” ($13k), “Creative Design” ($9k), “Equipment” associated with hires ($5k), and staff “Recruiting” efforts ($50k).

Our current spending level is averaging 80% of plan.

Fundraising

  • The Development team held its first external fundraising event. Graciously hosted by Judith Barnett in her Washington, D.C. residence, the event brought together more than 30 people, including select Wiki Ed board and staff, current and prospective supporters, members of academia, and other thought leaders in the open education space. We also welcomed Dipayan Ghosh, Senior Policy Advisor at the Office of Science and Technology Policy (OSTP) of the White House. Wiki Ed Board members and staff made meaningful connections and discussed potential future collaborations and funding opportunities. Remarks by Frank shared Wiki Ed’s mission with a new audience and encouraged our guests’ support for the Year of Science. One immediate outcome of our October event was an invitation to host a second event, a luncheon in New York, in the first week of November.
  • In addition to the event and the conference in D.C., the development team has been researching upcoming grant opportunities, and working with development consultant Brenda Laribee on strategies to maximize its efforts.


On October 12, the Wiki Education Foundation’s board held its first in-person meeting of Fiscal Year 2015/16. After Executive Director Frank Schulenburg and Director of Program Support LiAnna Davis reported on the current state of the organization, the board discussed the organization’s future development work, followed by reports from the Mission / Vision, Nominating, and Audit Committees.

Office of the ED

  • Current priorities:
    • Securing funding for upcoming major programmatic initiatives
  • October 9–11 marked the 2nd annual WikiCon USA in Washington, D.C. Renée coordinated travel and logistics for those attending the conference, including staff, volunteers, and the Wiki Ed board. In cooperation with our partners, the National Archives and the local Wikimedia chapters (DC and NYC), we provided a space where Wikipedians, staff from cultural institutions, and educators could connect with each other, report on past projects, and catalyze new ideas and collaborations. The videos from the conference can be found on YouTube.
  • During October, Frank supported Wiki Ed’s development team through meetings with connectors and potential funders.

Visitors and guests

  • Martin Rulsch, Wikimedia Steward
  • TJ Bliss, Hewlett Foundation
  • Dr. Amin Azzam, UCSF

by Eryk Salvaggio at November 23, 2015 06:07 PM

Wikimedia Foundation

Wikimedia Foundation, Wikimedia Deutschland urge Reiss Engelhorn Museum to reconsider suit over public domain works of art

Photo by jelm6, freely licensed under CC BY 2.0.

On October 28, the Reiss Engelhorn Museum in Mannheim, Germany, served a lawsuit against the Wikimedia Foundation and later against Wikimedia Deutschland, the local German chapter of the global Wikimedia movement. The suit concerns copyright claims related to 17 images of the museum’s public domain works of art, which have been uploaded to Wikimedia Commons. The Wikimedia Foundation and Wikimedia Deutschland are reviewing the suit, and will coordinate a reply by the current deadline in December.

The Wikimedia Foundation and Wikimedia Deutschland stand firmly in our commitment to making public works freely and openly available. Public institutions such as galleries and museums serve a similar mission, and have historically been our allies in making the world’s knowledge accessible to all. With this lawsuit, the Reiss Engelhorn Museum is limiting public access to culturally important works that most of the world would otherwise not be able to access.

The paintings, portraits, and other works of art at issue in this case are housed in the Reiss Engelhorn Museum, but exist freely in the public domain. However, German copyright law may apply to photographs of works in the public domain, depending on a number of different factors, including the artist who created the work, the amount of skill and effort that went into the photograph, creativity and originality in the photograph, and the actual art itself. The Reiss Engelhorn Museum asserts that copyright applies to these particular images because the museum hired the photographer who took some of them and it took him time, skill, and effort to take the photos. The Reiss Engelhorn Museum further asserts that because of their copyrights, the images of the artwork cannot be shared with the world through Wikimedia Commons.

The Wikimedia Foundation and Wikimedia Deutschland believe that the Reiss Engelhorn Museum’s views are mistaken. Copyright law should not be misused to attempt to control the dissemination of works of art that have long been in the public domain, such as the paintings housed in the Reiss Engelhorn Museum. The intent of copyright is to reward creativity and originality, not to create new rights limiting the online sharing of images of public domain works. Moreover, even if German copyright law is found to provide some rights over these images, we believe that using those rights to prevent sharing of public domain works runs counter to the mission of the Reiss Engelhorn Museum and the City of Mannheim and impoverishes the cultural heritage of people worldwide.

Many cultural institutions have made it their mission to make their collections more accessible to people around the world. In October, the Museum für Kunst und Gewerbe in Hamburg, Germany made its collection available for free online. Amsterdam’s Rijksmuseum has provided free online access to all of its paintings, including the ability to download and use the reproductions under the CC0 Public Domain Dedication license. In Denmark, SMK (Statens Museum for Kunst, The National Gallery of Denmark) has released its digital images and videos under the CC BY license. The British Library and the Japan Center for Asian Historical Records jointly released more than 200 Japanese and Chinese prints into the public domain.

These cultural institutions are upholding the values of the public domain and protecting the right to take part in our cultural heritage. The Reiss Engelhorn Museum’s attempt to create new copyright in public domain works goes against European principles on the public domain.

In a Communication on August 11, 2008, the European Commission wrote: “it is important to stress the importance of keeping public domain works accessible after a format shift. In other words, works in the public domain should stay there once digitised and be made accessible through the internet.” This was reinforced by the Europeana Charta of 2010 that reads: “No other intellectual property right must be used to reconstitute exclusivity over Public Domain material. The Public Domain is an integral element of the internal balance of the copyright system. This internal balance must not be manipulated by attempts to reconstitute or obtain exclusive control via regulations that are external to copyright”.

Over the years, the Wikimedia movement has enjoyed rich partnerships with museums and galleries around the world through the GLAM-Wiki initiative, which helps cultural institutions share their resources with the world through collaborative projects with experienced Wikipedia editors. The relationships have allowed millions of people from around the globe to access and enjoy institutional collections in places they may never have the chance to visit. Wikimedia Deutschland alone has worked with more than 30 museums in Germany to make their collections freely available to anyone, anywhere through the Wikimedia projects. These partnerships are part of a vital effort to allow cultural institutions and Wikimedia to serve their missions of free knowledge and shared culture.

People around the world use Wikipedia to discover and understand the world around them. Thanks to the Internet, many traditional barriers to knowledge and learning have disappeared. Denying online access to images in the public domain prevents people from exploring our shared global cultural heritage. We urge the Reiss Engelhorn Museum to reconsider its position and work with the Wikimedia community to make their public domain works more broadly available.

Michelle Paulson, Legal Director
Geoff Brigham, General Counsel
Wikimedia Foundation

A German-language statement from Wikimedia Deutschland is available on their blog. A full list of the images affected is on the Wikimedia Commons.

by Michelle Paulson and Geoff Brigham at November 23, 2015 05:15 PM

Tech News

Tech News issue #48, 2015 (November 23, 2015)

2015, week 48 (Monday 23 November 2015)

November 23, 2015 12:00 AM

November 22, 2015


Impact of Wikimania Mexico 2015 on Wikidata

Recently Wikidata celebrated its third birthday. For the occasion I reran the map generation script that I have talked about before, to see what had changed in the geo coordinate landscape of Wikidata!

I found, well, Mexico blossomed!

The image on the left is from June 2015 and the one on the right is from October 2015; Wikimania was in July 2015!

I will be keeping an eye out for what happens on the map around Esino Lario in 2016 to see what impact the event has on Wikidata again.

Full maps

by addshore at November 22, 2015 05:54 PM

Myanmar coordinates on Wikidata by Lockal & Widar

In a recent blog post I showed the amazing apparent effect that Wikimania’s location had on the coordinate location data in Mexico on Wikidata. A comment on the post by Finn Årup Nielsen pointed out a massive increase in data in Myanmar (Burma). I had previously spotted this increase but chose not to mention it in the post. But now, after a quick look at some items and edit histories, I have found who we have to thank!

The increase in geo coordinate information around the region can clearly be seen in the image above. As with the Mexico comparison this shows the difference between June and October 2015.

Finding the source of the new data

So I knew that the new data was focused around Q836 (Myanmar) but starting from that item wasn’t really going to help. So instead I zoomed in on a regular map and found a small subdivision of Myanmar called Q7961116 (Wakema). Unfortunately the history of this item showed its coordinate was added prior to the dates of the image above.

I decided to look at what linked to the item, and found that there used to be another item about the same place, which now remains as a redirect: Q13076630. This item was created by Sk!dbot but did not have any coordinate information before being merged, so still no luck for me.

Bots generally create items in bulk, meaning it was highly likely that the items on either side of Q13076630 would be about the same topic. Loading Q13076629 (the previous item) revealed that it was also in Myanmar. Looking at the history of this item then revealed that coordinate information was added by Lockal using Widar!

Estimating how much was added

So with a few quick DB queries we can find out how many claims were created for items stating that they were in Myanmar as well as roughly how many coordinates were added:

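-- Claims created by user 53290 after the cutoff timestamp
-- whose edit summary mentions Q836 (Myanmar)
SELECT count(*) AS count
FROM revision
WHERE rev_user = 53290
  AND rev_timestamp > 2015062401201
  AND rev_comment LIKE '%wbcreateclaim-create%'
  AND rev_comment LIKE '%Q836%';

-- Same filter, but for edit summaries mentioning P625 (coordinate location)
SELECT count(*) AS count
FROM revision
WHERE rev_user = 53290
  AND rev_timestamp > 2015062401201
  AND rev_comment LIKE '%wbcreateclaim-create%'
  AND rev_comment LIKE '%P625%';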

Roughly 16,000 new country statements and 19,000 new coordinates, all imported from Burmese Wikipedia.
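For anyone who wants to try the counting approach without database access, here is a minimal sketch. It assumes a toy in-memory SQLite table standing in for the real `revision` table; the sample rows, the `count_edits` and `build_demo_db` helpers, the edit-summary strings and the 14-digit cutoff are all invented for illustration and are not real Wikidata data.

```python
import sqlite3

# Hypothetical sample rows: (rev_user, rev_timestamp, rev_comment).
SAMPLE_REVISIONS = [
    (53290, "20150701000000", "/* wbcreateclaim-create:1| */ [[Property:P17]]: [[Q836]]"),
    (53290, "20150702000000", "/* wbcreateclaim-create:1| */ [[Property:P625]]: 16.6, 95.18"),
    # Before the cutoff, so it must not be counted.
    (53290, "20150601000000", "/* wbcreateclaim-create:1| */ [[Property:P17]]: [[Q836]]"),
    # A different user, so it must not be counted.
    (11111, "20150703000000", "/* wbcreateclaim-create:1| */ [[Property:P17]]: [[Q836]]"),
    # Not a claim creation, so it must not be counted.
    (53290, "20150704000000", "/* wbsetlabel-add:1|my */ some label"),
]


def build_demo_db():
    """Create an in-memory table with only the three columns the queries use."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision (rev_user INTEGER, rev_timestamp TEXT, rev_comment TEXT)"
    )
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?)", SAMPLE_REVISIONS)
    return conn


def count_edits(conn, user_id, cutoff, needle):
    """Count claim-creation edits by user_id after cutoff whose summary contains needle."""
    row = conn.execute(
        """
        SELECT count(*) FROM revision
        WHERE rev_user = ?
          AND rev_timestamp > ?
          AND rev_comment LIKE '%wbcreateclaim-create%'
          AND rev_comment LIKE ?
        """,
        (user_id, cutoff, f"%{needle}%"),
    ).fetchone()
    return row[0]


conn = build_demo_db()
# Assumed cutoff in YYYYMMDDHHMMSS form, compared as text.
country_count = count_edits(conn, 53290, "20150624000000", "Q836")
coord_count = count_edits(conn, 53290, "20150624000000", "P625")
print(country_count, coord_count)
```

Matching on edit summaries is only a rough estimate: it counts edits whose summary mentions the ID, not statements that still exist, which is why the figures above are given as approximate.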

Many thanks Lockal!

by addshore at November 22, 2015 05:53 PM

November 21, 2015

Weekly OSM

weekly 278



A map for friends of good music [1]


      • Christoph Hormann presents two new satellite images which can be used to infer data for OpenStreetMap. The images show Greenland and the North Frisian tidal flats.
      • User shares some tips on JOSM that can help while mapping Japan.


      • In the paper “Assessing the quality of OpenStreetMap data”, Lindy-Anne Siebritz and George Sithole compare OSM with official South African data.
      • User capiscuas reports on his efforts to clean up the administrative boundaries in Jakarta.
      • A mapping party was organised by Arun Ganesh in Bangalore. This event was part of the osmgeoweek celebrations. On a drizzly Sunday morning, six people set out on foot scouring the by-lanes of Indiranagar. They were searching for three things to map: street lamps, trash-dumping areas and dustbins.
      • Jothirnadh describes some problems with the data in his home town, Rajahmundry, Andhra Pradesh, India, caused by image alignment and an import of medical facilities.
      • The community around TeachOSM, MapStory and AmericanGeo is launching a first version of an #OpenStreetMap Surveyor badge at GeoBadges.org.
      • The Sozialhelden association and the World Health Organization (WHO) are launching the worldwide campaign MapMyDay on December 3. With this campaign both groups call on people to map the accessibility of public buildings on Wheelmap.

OpenStreetMap Foundation

      • The Italian local representation of the Wikimedia Foundation is applying to become a Local Chapter of the OSMF.
      • Every member of the OpenStreetMap Foundation can stand as a candidate for the board. Candidates need to apply by November 21st, 16:00 UTC. All users can put questions to the candidates; the answers are posted in the wiki.


      • A Flash Map Mob takes place in Sevilla, Spain on November 30th.
      • IPIS Research and OpenStreetMap Belgium invite mappers to a Missing Maps Marathon on Dec. 9th 2015. This event will focus on mapping data for the Republic of the Congo.
      • Wikimedia Italia is hosting the annual Italian OSM users event in Bologna, Italy on Saturday, November 21 and Sunday, November 22. For more information please visit here.

Humanitarian OSM


      • Andy Allan introduces his new map style “Spinal Map” on his blog. It uses metal umlauts and was built by Richard Fairhurst at Andy’s request.
      • Grant for Paul to improve Map …
      • The Amadeu Antonio Foundation and PRO ASYL are keeping track of hostilities and demonstrations against refugees in a chronicle and visualize them on a map.
      • Sorted Cities extracts buildings of a selected map section and orders them by size and relation to each other. The resulting outlines are presented as PNG.
      • As the discussion about rendering tertiary roads continues, a new proposal has been made.
      • The company Uncharted Software visualizes traffic between approx. 6500 relay servers of the Tor network on a map. If used appropriately, Tor allows users to browse the web anonymously.


      • The transport information system autobahn.NRW has been discontinued. It has been replaced by www.verkehr.nrw.de, which is based on OSM.

Open Data

      • There’s a discussion in the forum about the quality of emergency access points in OSM. User DD1GJ compared the data in OSM to the KWF dataset and found many deviations. (Automatic translation)


      • The European Court of Justice, to which a lawsuit between the Free State of Bavaria and an Austrian map publisher was referred, has ruled that non-digitized topographic maps are subject to ancillary copyright for databases.


      • User aarthy writes a diary entry about working on a remote control plugin that can help set up JOSM with everything a mapper requires to work on a specific project that has special instructions. This allows mappers to concentrate directly on mapping rather than following setup instructions.
      • User Tordanik reports (in the German OSM subforum) on the results of Google Summer of Code 2015, in which user B4stis worked on better 3D rendering in OSM2World.
      • The website of Vespucci (the well-known OSM editor for Android devices) at vespucci.io has recently been overhauled, including the documentation.
      • User Visaman asks how to implement APRS in OSM-based navigation software for use on custom hardware.
      • Tim Wisniewski announces his new choropleth map plugin for LeafletJS on Twitter.
      • Bjørn Sandvik writes about mapping grid-based statistics with the help of OpenLayers and other JavaScript tools.
      • Due to popular demand, the developers of Mapsforge (an offline map display framework for Android and Java) published a detailed guide to map creation with proper coastline support.
      • Paul Norman summarizes the complexity of the OSM Carto map style in his blog by giving facts and numbers.
      • SK53 describes how he creates a dataset of populated areas from OpenStreetMap data in his blog. Since landuse=residential/commercial/... is not widely used, he used the network of residential roads. He tested his method in several regions — also in the US (due to the TIGER import).
        Alternative approaches are discussed in later blogs here and here.
      • Releases
        Software     Version   Release Date   Comment
        OpenLayers   3.11.1    2015-11-12     Highlight of version 3.11 is support of Mapbox vector tiles.
        Atlas        1.1.18    2015-11-13

Did you know …

      • … GraphHopper with height profile and heights directly in the map?
      • … the Nebelkarte (fog map) for Switzerland? A slider on the right adjusts the height.
      • Latitude 0, longitude 0 (known as “null island”) has always been the target of misplaced JOSM imports. However, now they’ve got competition – there’s a genuine OpenStreetMap feature there! It even has its own web page.
      • MapOnShirt is another crowdfunding project, offering stunning clothes with designs made from maps. Unlike other projects, it lets the user freely choose the map section.
      • Wikipedia keeps a record of the abominable acts of terror which took place in Paris on November 13 and visualises the locations on a map.

Other “geo” things

      • The Knight Foundation awards 20 new projects $35,000 each, including projects in the area of OpenStreetMap. (It was also the Knight Foundation that supported the development of iD back in 2013.)
      • A map is worth a thousand words.
      • QGIS (previously known as “Quantum GIS”) is a cross-platform free and open-source desktop geographic information system (GIS) application that provides data viewing, editing, and analysis. The project now has a Flickr group where users’ creations are displayed.
      • Google Maps Insider
      • Info and teaching about satell…
      • map-disputes
      • Maps.me received an award in Russia. (Russian)
      • A man disappeared without a trace after being in a pub with a friend. Nine years later his remains were discovered in his car, which was found by chance in a nearby pond. The car had been visible on a Google Maps satellite image for years.


weeklyOSM is brought to you by … 


by weeklyteam at November 21, 2015 06:32 AM

Wikimedia Foundation

‘Semana i’ student editing in Mexico City improves the Spanish Wikipedia


Short montage of student work during Reto Wikimania. Video by Thelmadatter, freely licensed under CC BY-SA 4.0.

Semana i (in English, “i Week”) is a term that invokes curiosity among those of us here at the Monterrey Institute of Technology and Higher Education, as well as among people, organizations and institutions that are dedicated to academic, cultural, social service, technological, business and industrial matters. The goal of Semana i is to engage students with technology by replacing their normal classroom activities with intensive project-based work that has real-world applications for one week. These activities are called retos (in English, “challenges”). Almost 100 students chose to contribute content to Wikimedia projects as their challenge. From September 21 to 25, 2015, students and staff at two campuses in Mexico City worked for five days to improve coverage of areas of the south and west of the city both on Commons and on Spanish Wikipedia. Both campuses led Wikiexpeditions, building on the success of the school’s first Wikiexpedition to Tepoztlán in March 2015. Both campuses dedicated time to the creation of articles about these areas.

Mexico City campus

The Mexico City (South) campus focused its energies on the southern boroughs of Xochimilco and Tlalpan, which both have a mix of urban and rural areas. In particular, Xochimilco is known for its network of canals, a World Heritage Site and all that is left of an extensive lake system that once covered the Valley of Mexico. While these canals have been extensively photographed, one of the main objectives of this Wikiexpedition was to photograph lesser-known but still important sites, such as colonial-era chapels and other historic monuments, as well as the borough’s archaeological museum and the Acuexcomatl Environmental Education Center. Students were also encouraged to take pictures of everyday life as they encountered it during their wanderings.

The approximately 75 students who participated were divided into small groups, each with a different area of Xochimilco and/or Tlalpan to cover. About twelve of these participants were foreign students, primarily from Europe, who were spread out among the groups so that each group could experience its exploration from two points of view. Students had to decide how to cover their areas and divide their work. Most opted to stay together as a group or pair off rather than explore individually. One of the objectives of the project was to write good descriptions of the places they photographed. While teachers and organizers were able to provide some documentation, much was not easily available, and a number of groups took the initiative to visit borough offices for information and to ask caretakers. Some even received tours. The Acuexcomatl Center gave its group a tour and waived the admission fee, stating that it welcomed the effort to share what the center does online.


Video clip of Sebastián Flores Farfán speaking to students. Video by Thelmadatter, freely licensed under CC BY-SA 4.0.

One of the participants was the granddaughter of the official chronicler of the Xochimilco borough, Sebastián Flores Farfán, who, when he heard of our efforts, asked to come to the closing of the event. Of course he was most welcome, and he had a chance to hear some of the student reports. These reports gave very valuable feedback on the event. Some students stated that they enjoyed the project as it gave them a chance to see aspects of their city they did not know existed, and they enjoyed using photography as a learning tool. One complaint was that authorities prevented students from photographing some landmarks, often wanting some kind of official permission before they would cooperate.

All participating students were required to upload 50 photographs to Commons (or video clips worth 3 photos each), and all then had the choice to upload another 50 or work on a short article. In total, the participants uploaded 5,264 photographs and 8 videos and worked on 36 articles. Almost all of these articles were created or expanded in Spanish Wikipedia, but some articles such as Xochimilco and Niñopa were translated into French, Swedish and Danish.

Santa Fe campus

Like last semester, the Santa Fe campus again participated with Wikimedia Commons and Wikipedia, but with a more concrete objective. This time, the focus was on the traditional neighborhood of San Ángel, located in the southwest of Mexico City. Along with other areas such as Coyoacán, Mixcoac and Xochimilco, it is an emblematic section of the city and an “obligatory” place to visit for those who live in the city as well as for national and international tourists. Despite the march of urban sprawl, it has preserved many of the historic monuments, customs and traditions that define it.

When we organized the Wikipedia San Ángel challenge, we did not expect the level of interest we received from students. At first we thought thirty places would be sufficient, but this was not the case. There was so much interest in the project that we had to raise this number to forty. This clearly demonstrates the advantages that working with a site like Wikipedia offers. This is also clear from the references that students used to create articles, a mixture of electronic sources along with books and magazines.

Since contributing to Wikipedia should not be taken lightly, before the San Ángel visit we chose several sites of interest, looking through Wikipedia to see what was already covered. Students found that a number of sites had only incomplete articles, were mentioned only in the article on San Ángel… or neither. Foreign students found that articles on the area were missing in languages such as German and Polish.

Students were first trained by library staff in research techniques, including what constitutes a reliable source. They then learned about Wikipedia during a visit by Leigh Thelmadatter of the Mexico City Campus, who brought her enthusiasm for the project as well as a solid understanding of the Wikipedia universe, motivating the students.

Wednesday was dedicated to the “expedition” to San Ángel, arriving at the heart of the neighborhood, the Plaza San Jacinto. From there students wandered the area taking photographs of what caught their attention, such as monuments, typical buildings, various installations, people, animals and other aspects that make San Ángel unique. Each student was asked to upload fifteen photographs to Wikimedia Commons. In the end, sites that were covered photographically included La Bombilla Park, the Risco House, the Library of Mexico’s Revolutions, the Diego Rivera and Frida Kahlo Studio House Museum, the Marquesa de Selva Nevada House, the Casa Blanca of San Ángel, the Soumaya Museum (San Ángel), the Isidro Fabela Cultural Center, the Del Carmen Market, and others. There were difficulties, such as students needing permission to take photographs, but these were often resolved by the students themselves, who also received help with references from the locals of San Ángel. In the end, this campus uploaded 543 photographs to Wikimedia Commons, contributed ten articles to Spanish Wikipedia, and improved the existing article on San Ángel.

Thursday and Friday were dedicated to revising, categorizing and writing the descriptions of the photographs for Wikimedia Commons, along with working on articles to be created or expanded. For this aspect, we worked not in a classroom but in the library’s new learning commons, which allowed students to concentrate on their work such that many lost track of time.

Leigh Thelmadatter, Wiki Learning Tec de Monterrey
Alavaro Alvarez, Wiki Learning

by Leigh Thelmadatter and Alavaro Alvarez at November 21, 2015 01:35 AM

November 20, 2015

Wikimedia Foundation

Three state museums open the doors to Wikimedia Spain and host concerts for Wikipedia

Triedit-a-thon at National Archaeological Museum. Photo by Barcex, freely licensed under CC BY-SA 3.0.

Wikimedia Spain is working at three state museums in Madrid—the Museum of Romanticism, the Museo del Traje (Costume Museum) and the National Archaeological Museum—through its first Wikipedian in residence, Ruben Ojeda. Wikipedians in residence work within cultural institutions to enable them to initiate a productive relationship with Wikipedia and its community, both during the residency and after. The model was launched as part of the Wikimedia GLAM projects (Galleries, Libraries, Archives and Museums) which seek to encourage institutions and their professionals to contribute to the Wikimedia projects.

In this case, besides promoting the Wikimedia projects in these three institutions, various activities open to the public are being carried out with success. We have hosted two concerts of classical music, where we learned quite a bit. For example, it is not beneficial to have an audience present: the recordings are degraded when there is outside noise from people’s cellphones or even just coughing. These problems unfortunately doomed the first recording. However, the seven recordings from the second concert, held with no audience, were captured and donated under a free license to Wikimedia Commons, where they can be used on any of the Wikimedia projects. Furthermore, after the concert, an additional 20 recordings were donated by one of the musicians of Wasei Duo. We will host a third concert on November 24.

Wikimedia Spain has also held fourteen Wikipedia training sessions and one edit-a-thon that explain how Wikipedia works and generate content related to the museums and their collections, which focus on the nineteenth century in Spain, fashion and archaeology.

The Museum of Romanticism has cataloged more than 16,000 museum pieces and offers collections of paintings, miniatures, furniture, decorative arts, prints, drawings and photographs, which offer a broad panorama of the arts during the Romantic era in Spain. The Museo del Traje has cataloged fashion, costumes and ethnography, with a collection of over 170,000 pieces and documents; these collections date from the Middle Ages to the contemporary fashion of Spain. The National Archaeological Museum has been, since 1867, the leading Spanish institution in the preservation of historical pieces. Its permanent collection includes over 15,000 items from Prehistory, Early history, Roman Spain, Greece, Egypt and the Near East, Middle Ages and Modern Age.

The development of this initiative has been possible thanks to a grant from the Wikimedia Foundation, without which it would have been impossible to carry out because of the reduction of financial resources that has negatively affected cultural institutions in Spain. It began on September 7, 2015 and will run until January 7, 2016. Similarly, following the contact with these three museums, Wikimedia Spain signed a memorandum of understanding with the General Department of National Museums of the Ministry of Education, Culture and Sports, which will allow similar initiatives related to the Wikimedia projects to develop in Spanish museums.

Rubén Ojeda
Wikimedia Spain

by Ruben Ojeda at November 20, 2015 10:17 PM

Gerard Meijssen

What kind of box is #Wikipedia

At #Wikidata we know about likely issues at #Wikipedia. The problem is that #Wikipedia does not seem to care. If Wikipedia is about quality, then at some stage these likely issues become important to tackle; they are the easiest way of improving quality.

There are three scenarios:
  • Wikipedia is incorrect, and Wikidata knows about a correct alternative
  • Wikidata is wrong and needs improvement
  • Both Wikidata and Wikipedia have it wrong
At present, Wikipedia is a black box: communication may go in, but it is neither obvious nor visible that quality improvement suggestions are taken seriously. It follows that when Wikipedia sees Wikidata as a foreign body, at some stage all the quality suggestions become toxic and it gets out of the box. Such a box has a name.

by Gerard Meijssen (noreply@blogger.com) at November 20, 2015 07:10 AM

November 17, 2015

Wikimedia Foundation

Community Digest—Marc Venot and the potential of a new pivot language for Europe; news in brief

Ido Kongreso en Desau 1922.jpg
An international Ido conference in Germany, 1922. Image from Alfred Neussner, public domain.

I switched to Ido Wikimedia because it’s a language that goes into several fundamentals of thinking, something that can be called “scientific”, and has the potential of becoming the pivot language for Europe.

Marc Venot may have only started editing Wikimedia projects in January 2015, but in those eleven months he has accumulated nearly 400,000 edits, over 359,000 of which have been made in the Ido language, a constructed language variant of the well-known Esperanto. We interviewed him to gather insight into his editing experience as well as to get to know the culture of Ido language projects.

The Ido-language flag. Image by Lucas Larson, public domain.

Marc hails from Vancouver in Canada, a major city on the country’s west coast, where he owns several studio apartments that he rents out. Like many Wikimedia editors, Marc was familiar with Wikipedia and used its information often, but was only compelled to join after finding several errors. Although he started with editing the French Wikipedia and Wiktionary, as an ardent linguist his main passion quickly moved to editing Ido Wikimedia projects.

According to him, Ido is a language that hits several fundamentals of thinking and is something that can be called scientific. In fact, Marc believes that it can become the pivot language for Europe, although he’s open to another language like Lojban taking this role.

As one of only four sysops on Ido Wikipedia, Marc stresses that his role is to keep some coherence, notably in the categories and templates. He feels that Ido needs more advanced translation tools to help add more content easily to the different Ido projects. To keep the Ido Wikipedia and Wiktionary updated, Marc reads newspapers in the six languages on which Ido is based. He also looks for words that the Wiktionary currently does not include and how new entries can be added or existing ones can be modified.

Commenting on his Ido Wikimedia colleagues, Marc referred to Artomo, hailing from Finland, and João Xavier of Brazil as pillars because of their immense contributions and supportive gestures.

Syed Muzammiluddin
Wikimedia community volunteer

In brief

Apollo 13 Flown Silver Robbins Medallion (SN-354).jpg
The medallion from the ill-fated Apollo 13 mission. Image by Godot13, public domain.

WikiCup ends: The English Wikipedia’s WikiCup has ended with Godot13 taking the top prize on the strength of his 330 featured pictures, a significant percentage of which were high-quality scans of monetary currency. Of note are the scans of the medallions from the Project Gemini and Apollo space missions (above), the up to 100 trillion mark currency issued in the Weimar Republic in 1920–24, and the emergency currency issued in French Oceania during the Second World War.
French tragedy: Wikimedians responded to the recent terror attack in Paris with articles in 77 languages; the French-language article has 257 references.
New books being uploaded to Wikisource: The monthly This Month in GLAM newsletter has reported on various GLAM (galleries, libraries, archives, and museums) initiatives from around the world, including the uploading of the Mussolini Library to the Italian-language Wikisource. According to the author, the library is “one of the most relevant source[s] of information about [the] fascism period … in Italy, and the digitization of its materials is supposed to be very useful for historians all over the world.” In similar news, the CIS-A2K out of India reports that one thousand Marathi-language books have been put under free licenses and uploaded to the language’s Wikisource: “Right after our Prime Minister Narendra Modi recommended to read the autobiography of Benjamin Franklin as it contains a lot of messages for [the] common man, a lady walked to us once and asked if she can read this [in] Marathi. Be it such autobiographies or a poetry book like Chetavali, such books that were published … should not be kept closed as … many readers are searching for such books.”
Community Wishlist Survey: The Wikimedia Foundation’s (WMF) Community Wishlist Survey has started. As laid out in the Signpost by the WMF’s Danny Horn, the survey “gives everybody the opportunity to propose fixes and solutions and determine which ideas have the most support. It’s an exciting (and slightly terrifying) prospect.”
Wikimania 2016 scholarship ambassadors needed: Applicants are needed to review and process the thousands of applications that will come in for the coming year’s Wikimania. More information is available on the Wikimania 2016 wiki.
Content translation: Content Translation, a tool that makes it easier to translate Wikipedia articles across languages, has now been used in the creation of 30,000 articles. You can join an online office hour on 25 November to give your feedback.

Welcome to your community digest! This is a weekly feature for the blog, and we would like to invite you to take part in putting it together. This digest of Wikimedia community news will pull together items from around the globe to provide a venue for your updates and a diverse roundup of events. It aims to emulate and supplement already-existing community news outlets.
If your Wikimedia community has a milestone, cool new project, or quirky occurrence, please leave a message on our tip line, send me an email, or drop a message at my talk page.

Ed Erhart, Editorial Associate
Wikimedia Foundation

by Syed Muzammiluddin and Ed Erhart at November 17, 2015 08:21 PM

Check out these new features and extensions kicked off at Google Summer of Code 2015

Image by Faebot, freely licensed under OGL 1.0.

Google Summer of Code and Outreachy are two software development internship programs that Wikimedia participates in every year. For the last nine years, college students have applied to be a part of the coding summer, one of many outreach programs operated by the Wikimedia Foundation. After being accepted, they work on their project from May to August and are invited to a Mentor Summit at Google in November.

For the first time, all Wikimedia projects that passed the evaluation were immediately deployed in production or Wikimedia Labs. Take a look!

Reinventing Translation Search

Search translations.png

Search translations. Screenshot by Phoenix303, freely licensed under CC BY-SA 4.0.

TranslateWiki is a popular translation platform used by many projects across Wikimedia and several times as many outside it. Originally developed single-handedly by Niklas Laxström, the platform has expanded significantly since its launch in 2006. This project aims to add a search feature to the Translate extension. Without a search feature, it is difficult for translators to find specific messages they want to translate. Traversing all the translations or strings of the project is inefficient. Also, translators often want to check how a specific term was translated in a certain language across the project. This is solved by the special page Special:SearchTranslations. By default, translators can find the messages containing certain terms in any language and filter by various criteria. After searching, they can switch the results to the translations of said messages, for instance to find the existing, missing or outdated translations of a certain term. You can check it out here. Dibya Singh was the project intern.

Crosswatch. Screenshot by Sitic, freely licensed under CC0 1.0.

Cross-wiki watchlist

Crosswatch is a cross-wiki watchlist for all Wikimedia wikis. The goal of the project is to help editors who are active in several wikis to monitor changes and generally to provide a better watchlist for all editors. Among other things, crosswatch includes cross-wiki notifications, dynamic filtering and the ability to show diffs for edits. As an external tool which uses OAuth to retrieve the watchlists on behalf of the user, it doesn’t have the same constraints as MediaWiki and can experiment with the design and functionality of a watchlist without breaking existing workflows or code. Its design is much closer to the mobile watchlist than to the classical MediaWiki watchlist layout; there is, however, an option to use the traditional layout. Crosswatch can show a unified watchlist for all wikis or a watchlist subdivided into wikis. One of the predominant features is the native support to show a diff for an edit. The project was completed by Jan Lebert.

Wikivoyage PageBanner extension

PageBanner extension Screenshot. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

Wikivoyage is a wiki about travel and holds rich details related to visiting places. This wiki has a special preference for showing page-wide banners at the top of each of its articles to enhance their aesthetic appeal. An example of such a banner can be seen here. These banners are traditionally shown using a template on the wiki. The banners shown through templates, however, had a few shortcomings: they did not deliver an optimum-size banner for each device, did not render properly on mobile devices, were too small on small mobile screens, and could not show more than one level or a table of contents inside the banner. The project is all about addressing these issues and adding capabilities through a MediaWiki extension to take the banner experience to the next level. You can test it out here. Summit Asthana was the project intern.

Language Proofing Extension for VisualEditor

LanguageTool extension screenshot. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

LanguageTool is an extension for VisualEditor that enables language proofing support in about twenty languages. This includes spelling and grammar checking. Before this tool, VisualEditor relied on the browser’s native spelling and grammar checking tools. LanguageTool itself is an open source spelling and grammar proofing software created and maintained by Daniel Naber. This extension is an integration of the tool into VisualEditor. You can test this feature here and learn more about it here. Ankita Kumari completed the project.

Newsletter Extension for MediaWiki

Newsletter extension for MediaWiki. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

Many Wikimedia projects and developers use newsletters to broadcast recent developments or relevant news to other Wikimedians. But having to find a newsletter main page and then subscribing to it by adding your username to a wiki page doesn’t really sound appealing.
The main motivation of this project is to offer a catalog with all the newsletters available in a wiki farm, and the possibility to subscribe/unsubscribe and receive notifications without having to visit or be an active editor of any wiki. You can see this project in action here and learn more about it here. Tina Johnson was the intern for this project.

Flow support in Pywikibot

Flow extension to Pywikibot. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

This was a project to add support for Flow, MediaWiki’s new discussion framework, to Pywikibot, a Python framework widely used for bots operating on Wikimedia wikis. To accomplish this task, a module was implemented in Pywikibot with classes mapping to Flow data constructs, like topics and posts. Code supporting Flow-related API calls was also added, and tests were included for loading and saving operations. As it stands, Pywikibot-driven bots can now load Flow content, create new topics, reply to topics and posts, and lock and unlock topics. Learn more about this task here. This project was completed by Alexander Jones.

OAuth Support in Pywikibot

OAuth extension to Pywikibot. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

MediaWiki supports OAuth v1.0a as a method of authentication via the OAuth extension. This project adds OAuth v1.0a support to Pywikibot. Pywikibot may be used as an OAuth application for MediaWiki sites that have the OAuth extension installed and configured properly. Developers may use Pywikibot to authenticate accounts, using OAuth instead of a password as an alternative login method. This project also includes switching the HTTP library from httplib2 to requests, as well as unit tests related to OAuth authentication and its integration with Pywikibot. All integration builds of Pywikibot now test OAuth on Travis CI (Ubuntu) and Appveyor (Win32). This enables ‘logged in’ tests to be performed on some wiki sites, including beta wiki, which is deployed on Beta Cluster and is an environment where passwords are not considered secure. Learn more about this project here. The project was completed by Jiarong Wei.

Extension to identify and remove spam

SmiteSpam extension. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

SmiteSpam is a MediaWiki extension that helps wiki administrators identify and delete spam pages. Because wikis are openly editable, they make great targets for spammers. From product advertisements to absolute garbage, any kind of spam turns up on wikis. While accurate detection of a spam wiki page is an open problem in the field of computer science, this extension tries to detect possible spam using some simple checks: How frequently are external links occurring in the text? Are any of the external links repeating? How much wikitext is present on the page? The extension does a reasonably good job of finding particularly bad pages in a wiki and presents them to the administrators. They can see a list of pages, their creators, how confident SmiteSpam is of them being spam, the creation time of the page and options to delete the page and/or block the creator. They can also mark users as “trusted”. Pages created by trusted users are ignored by SmiteSpam and will hence reduce the number of false positives in the results. Vivek Ghaisas completed the project.
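The three checks described above can be illustrated with a small sketch. This is not SmiteSpam's code (the extension itself runs inside MediaWiki); it is a hypothetical Python illustration, and the function name `spam_score`, the URL pattern, the 20-word threshold and the equal weighting are all assumptions made for the example.

```python
import re
from collections import Counter

# Deliberately simple pattern for external links; an assumption for this sketch.
URL_PATTERN = re.compile(r"https?://[^\s\]\|<>\"]+")


def spam_score(wikitext):
    """Return a rough score between 0 and 1 built from three simple checks."""
    words = wikitext.split()
    if not words:
        return 0.0
    links = URL_PATTERN.findall(wikitext)

    # Check 1: how frequently do external links occur in the text?
    link_density = min(1.0, len(links) / len(words))

    # Check 2: are any of the external links repeating?
    repeated = sum(count - 1 for count in Counter(links).values())
    repeat_ratio = repeated / len(links) if links else 0.0

    # Check 3: how much wikitext is present besides the links?
    # The 20-word threshold is hypothetical.
    non_link_words = [w for w in words if not URL_PATTERN.search(w)]
    thin_text = 1.0 if links and len(non_link_words) < 20 else 0.0

    # Hypothetical equal weighting of the three checks.
    return round((link_density + repeat_ratio + thin_text) / 3, 3)


if __name__ == "__main__":
    page = "Buy now http://spam.example/a http://spam.example/a http://spam.example/a"
    print(spam_score(page))
```

A page with no external links scores 0.0 here, while a short page that repeats the same link scores high. A real detector would also need the "trusted users" exclusion the post mentions.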

VisualEditor Graph module

VE graph extension. Screenshot by Frédéric Bolduc and others, freely licensed under CC BY-SA 3.0.

ve-graph is a module within the Graph extension that aims to bring graph editing tools to VisualEditor in order to bridge the gap between editors and Vega, the visualization engine powering graphs in MediaWiki pages. Before, the only way for users to create and maintain graphs was to directly alter their specification in raw wikitext, which was not only hard to grasp for beginners but also very prone to errors. Those errors would simply render the graph unusable without offering any kind of feedback to the user as to what went wrong. With ve-graph, it is now possible to display graphs within VisualEditor and open up an interface to edit graph types, data and padding. The UI also offers a way to edit the raw JSON specification within VisualEditor without having to switch to the classic wikitext editor, in case more advanced users want to tweak settings not supported by the UI. This first step serves as a stepping stone for many possibilities with graph editing within VisualEditor, and there are a lot of ways in which ve-graph can be improved and expanded. This project is live, and you can see a demo here. Frédéric Bolduc completed the project.


Niharika Kohli, Wikimedia Foundation

by Niharika Kohli at November 17, 2015 07:30 AM

November 16, 2015

Wikimedia Foundation

Become an international music phenomenon with Wikipedia


Going viral on Imgur today: creating your debut music album with Wikipedia. Have you tried it yet?

To craft your own debut album artwork, follow these steps:

  1. Hit “random article” on the English Wikipedia to find your “band name”
  2. Find a random Wikiquote page, and select three to five words from it to form your “album title”
  3. Use this randomising tool to find a piece of public domain artwork from Wikimedia Commons for your album art—you have three tries
  4. Edit it all together in the image editing software of your choice!

Here is a selection of albums we’ve made:






Joe Sutherland, Communications Intern
Ed Erhart, Editorial Associate
Wikimedia Foundation

Our thanks go to the rest of the Communications team for their album covers. All underlying images are in the public domain.

by Joe Sutherland and Ed Erhart at November 16, 2015 08:38 PM

Wiki Education Foundation

Wiki Ed is coming to Michigan!

This week, Wikipedia Content Expert Ian Ramjohn and I will visit various campuses in Michigan to present our upcoming Wikipedia Year of Science to science instructors.

Writing a Wikipedia article for a classroom assignment is a great way for students to learn the ins and outs of a topic. In a science learning environment, students research their topic and develop crucial science communication skills to explain what they’ve learned for a general audience. These communication-intensive assignments teach students fact-based writing, which will serve them whether or not they pursue a scientific career.

If you know someone at one of the following institutions who is interested in Wikipedia assignments or science communication, please send them a registration link below, or email me at jami@wikiedu.org.

Tuesday, November 17
  • University of Michigan, 4–5:30 p.m. in the Clark Library Instructional Space
Wednesday, November 18
  • Grand Valley State University, 3–4:30 p.m. in 3068 JHZ
Thursday, November 19
  • Michigan State University, 2–4 p.m. in the Reference Instruction Room in the Main Library
    • No registration necessary

Photo: From Wikimedia Commons collection of Michigan postcards: “Detroit and Canada Tunnel, near American Portal, Detroit, Mich (65791)” by Tichnor Brothers, Publisher – Boston Public Library Tichnor Brothers collection #65791. Licensed under Public Domain via Wikimedia Commons.

by Jami Mathewson at November 16, 2015 06:42 PM

The Roundup: Cinema and Gender

The Wiki Education Foundation has known that classroom assignments are a great way to improve content. When it comes to underrepresented topics on Wikipedia, students can make real, meaningful impacts.

Here’s a great example from students in Dr. Laura Horak’s Topics in Cinema and Gender course at Carleton University. These students shared their knowledge about female directors and filmmakers. For example:

  • Shirley Barrett: A student expanded an article on an Australian film director, which had languished as a stub for four years. Starting from a single line of biography and a list of films, they included information on her work and important themes.
  • Miwa Nishikawa: Working on a pre-existing article, another student editor fleshed out the biography of this award-winning Japanese screenwriter and director. Her works include Wild Berries, which earned her two “Best New Director” awards and a “Best Screenplay” award in Japan. There’s now more information about her films and her writing career.
  • Nancy Grant: A student created a new article about this English-Québécois film producer, who produced more than 27 short and feature-length films. The article makes use of images from Wikimedia Commons to illustrate their work.
  • Fanta Régina Nacro: The first female film director from Burkina Faso, Fanta Régina Nacro is known for the 2004 film La Nuit de la Verité (Night of Truth).

Thanks to these students for their fantastic contributions to Wikipedia!

Photo: “16mm filmhjul” by Holger.Ellgaard, own work. Licensed under CC BY-SA 3.0 via Wikimedia Commons.

by Eryk Salvaggio at November 16, 2015 05:00 PM

Jeroen De Dauw

EntityStore and TermStore for Wikibase/Wikidata

I’m happy to announce the public release of two new PHP libraries that provide services around Wikibase, the software behind Wikidata. They are called QueryR EntityStore and QueryR TermStore.

Both these libraries provide persistence and lookup services for specific Wikibase data. These services are built on top of Doctrine DBAL, so they support various databases, including in-memory SQLite.

QueryR EntityStore

This component has two database tables: one for storing items and one for storing properties. Each has a blob field, plus some indexed fields to allow finding entities by various criteria. These tables are an internal implementation detail that is not exposed, and are just listed here to give you an idea of what can be done with the library.

All services are constructed via the EntityStoreFactory class:

use Queryr\EntityStore\EntityStoreFactory;
$factory = new EntityStoreFactory(
    $dbalConnection,
    new EntityStoreConfig( /* optional config */ )
);

$dbalConnection is a Connection object from Doctrine DBAL.

For writing values, you will need either ItemStore or PropertyStore.

$itemStore = $factory->newItemStore();
$propertyStore = $factory->newPropertyStore();

The main write methods are “store document” and “remove document by id”.

$itemStore->storeItemRow( $itemRow );
$itemStore->deleteItemById( $itemId );

Note that $itemRow is of type ItemRow, which is defined by this component. ItemRow represents all values in a row of the items table. It does not require a fully instantiated Wikibase DataModel EntityDocument object; you just need the JSON.

Next to ItemRow there is also ItemInfo, which is identical, apart from not having the JSON. (Internally these share code via the package private trait ItemRowInfo.)

Here are some examples of how entities can be looked up. To get a full list, look at the services you can construct via the store, and their interfaces.

Fetching an Item by id:

$q42 = $itemStore->getItemRowByNumericItemId( 42 );

Property data type lookup:

$lookup = $factory->newPropertyTypeLookup();
$propertyType = $lookup->getTypeOfProperty( $propertyId );

Get cheaply retrievable info on the first 100 items:

$itemInfoList = $itemStore->getItemInfo( 100, 0 );

Restrict the result to items of type “book”, assuming 424242 is the numeric id of “book”:

$itemInfoList = $itemStore->getItemInfo( 100, 0, 424242 );

QueryR TermStore

This component also has two database tables: one for labels and one for aliases. Again, these tables are an internal implementation detail that is not exposed, and are just listed here to give you an idea of what can be done with the library.

All services are constructed via the TermStoreFactory class:

use Queryr\TermStore\TermStoreFactory;
$factory = new TermStoreFactory(
    $dbalConnection,
    new TermStoreConfig( /* optional config */ )
);

Writing to the store:

$writer = $factory->newTermStoreWriter();

$writer->storeEntityFingerprint( $entityId, $fingerprint );
$writer->dropTermsForId( $entityId );

Looking up an EntityId based on terms:

$idLookup = $factory->newEntityIdLookup();

$idLookup->getItemIdByLabel( $languageCode, $labelText );
$idLookup->getItemIdByText( $languageCode, $termText );
$idLookup->getIdByLabel( $languageCode, $labelText );

(See the EntityIdLookup interface for all methods and their documentation.)

Looking up a label based on EntityId and language:

$labelLookup = $factory->newLabelLookup();
$labelLookup->getLabelByIdAndLanguage( $entityId, $languageCode );

(See the LabelLookup interface for documentation.)


QueryR EntityStore is available on Packagist as queryr/entity-store, and QueryR TermStore is available as queryr/term-store. They both support PHP 5.5 and later, including PHP 7 and HHVM. For more detailed instructions, including things such as release notes and how to run the tests, see their respective readme files.


To avoid potential confusion, I’d like to explicitly state that Wikimedia Deutschland, my current employer, was not in any way involved in the development of these libraries. They have been written by me as a personal project.

Contributions in the form of pull requests or issue submission are welcome.

by Jeroen at November 16, 2015 03:12 PM

Magnus Manske

More mixin’, more matches

Mix’n’match has seen some updates in the past few days. There are about 170K new entries, in several catalogs:

Also, there is a brand-new import tool that anyone (who has made at least one mix’n’match “edit”) can use! Just paste or upload a tab-delimited text. Note that the tool is, as of yet, untested with “production” data, so please measure twice or thrice before importing.
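One way to "measure twice" is to sanity-check the tab-delimited text locally before pasting it in. The sketch below is only an illustration: the post does not describe the columns the import tool expects, so it assumes nothing beyond a consistent field count per row and a unique, non-empty first field (treating the first field as an identifier is an assumption). The function name `check_tab_delimited` is hypothetical.

```python
import csv
import io
from typing import List


def check_tab_delimited(text: str) -> List[str]:
    """Return a list of human-readable problems found in tab-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        return ["no rows found"]

    problems = []
    expected = len(rows[0])  # the first row sets the expected field count
    seen = set()
    for number, row in enumerate(rows, start=1):
        if len(row) != expected:
            problems.append(
                f"row {number}: expected {expected} fields, got {len(row)}"
            )
        first = row[0].strip()
        if not first:
            problems.append(f"row {number}: empty first field")
        elif first in seen:
            problems.append(f"row {number}: duplicate first field {first!r}")
        seen.add(first)
    return problems


if __name__ == "__main__":
    sample = "1\tAlice Example\tauthor\n2\tBob Example\tactor\n"
    print(check_tab_delimited(sample) or "looks consistent")
```

An empty result only means the file is internally consistent; it says nothing about whether the entries themselves are correct.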

by Magnus at November 16, 2015 02:02 PM

Gerard Meijssen

#Wikidata - #India and the #Peshwa culture

CIS announced in its newsletter a large donation of Marathi books about the Peshwa culture. It is hard to overestimate the relevance of this gift. It makes knowledge available to 73 million people. It provides sources to the history of a large part of India. This is the text:
1000 Marathi books by Marathi-language non-profit to come online on Marathi Wikisource with Open Access

As the Maharashtra Granthottejak Sanstha (MGS), a non-profit organization working for the preservation of the "Peshwa" culture in Maharashtra, and based in Pune, India, celebrated its 121st anniversary recently, the organization relicensed 1000 books for Marathi Wikisource under CC-by-SA 4.0 license so that the books could be digitized and be made available for millions of Marathi readers. Avinash Chaphekar from the organization signed a document permitting Wikimedians to digitize the books on the Wikisource. On this special occasion of the anniversary, a three-day book exhibition was organized starting October 30.

Answering our question "Could you please share with us your ideas of opening these invaluable books for Wikisource? How they are going to be useful for the online readers to learn about the Peshwas?", Mr Chaphekar says:

“These books are of historical importance and cover topics that are rarely covered well anywhere else. This information should reach to more people. Right after our Prime Minister Narendra Modi recommended to read the autobiography of Benjamin Franklin as it contains a lot of messages for a common man, a lady walked to us once and asked if she can read this Marathi. Such books that were published by the Sanstha should not be kept closed as a lot many readers are searching for such books. We might not have a very great presence in the media or the Internet. How does any reader who does not know us buy a book? If these books are available online they could at least find and read them”
As I follow what is new, I often check what Wikidata has to say. What I find is often a lack of information. There is a wealth of data about minor nobility from the Netherlands. Given the major relevance of a nawab of Awadh or indeed a Peshwa, many improvements can be made to acknowledge the relevance of a major culture.

by Gerard Meijssen (noreply@blogger.com) at November 16, 2015 05:34 AM

Tech News

Tech News issue #47, 2015 (November 16, 2015)


November 16, 2015 12:00 AM

November 15, 2015

Gerard Meijssen

#Wikipedia on #Syria and #Iraq in the light of #Paris

There is no excuse for what happened, for what happens. Now that the news of Paris has sunk in, let's consider the other side. The other side are the people of Syria and Iraq, countries where many people suffer beyond belief. They are from places that have a rich history and brought us many notable people, and even when we look for it, we will not be able to find it in Wikipedia or Wikidata.

If there is one thing that is most often true about an "enemy", it is that you do not know them for who they are. Our true enemy is not the people from Syria or Iraq; it is the people that describe themselves as Daesh. By their own definition they are apart from Syrians and Iraqis.

This distinction is important, and it does not help that we know so little in any language about Syria, Iraq, the notable people, the history, the culture. The lack of knowledge is often seen as a necessary component of discrimination and the associated belief that the other is the enemy.

The war is now uncomfortably close; it hit Paris, and who is next? Refugees have arrived in Europe and they have their story to tell. To understand these stories, it is important that Wikipedia has enough information to fill in the background. It is vital that Wikipedia and Wikidata know about those who are not the enemy.

by Gerard Meijssen (noreply@blogger.com) at November 15, 2015 09:45 AM

November 13, 2015

Wikimedia Foundation

Nepal’s victorious monument photos include the birthplace of Buddha

Outside view
This blog post was written by a member of the Wikimedia community and not by an employee of the Wikimedia Foundation. The views expressed are the author’s alone and are not necessarily held by the Foundation or the community as a whole.

In September, Wiki Loves Monuments 2015—the largest photography competition in the world—was held. Globally, it attracted more than 6,200 competitors from 33 countries who together uploaded more than 230,000 photographs (see the blog’s previous coverage).

Nepal took part in the competition for the third time in 2015, and it proved to be a success with 107 participants submitting over 1,535 photographs that highlighted the beauty of the country. All have been released under free licenses.

The international winners will be announced in December 2015. Here are the top 10 pictures that will represent Nepal, as decided by our jury:

Ranipokhari during chhath festival.jpg
1st: Ranipokhari during Chhath by Enfeeyano, freely licensed under CC BY-SA 4.0.

Lumbini (Birth Place of Gautam Buddha).jpg
2nd: Maya Devi Temple, the birthplace of Gautama Buddha, by Rajeshdulal, freely licensed under CC BY-SA 4.0.

MG 7595Boudhanath Stupa.jpg
3rd: Boudhanath by Ronixdhungana, freely licensed under CC BY-SA 4.0.

Lord Bishnu-Shesh Narayan.JPG
4th: Vishnu sculpture in pond near Sheshnarayan Temple by Ksssshl, freely licensed under CC BY-SA 4.0.

55 window palace, Bhupatindra Malla, 1754 AD.jpg
5th: 55 window palace in Bhaktapur by Uzzool, freely licensed under CC BY-SA 4.0.

Bhadrakali temple pokhara.JPG
6th: Bhadrakali Temple in Pokhara by Dhurba Gurung, freely licensed under CC BY-SA 4.0.

Patan Krishna Mandir, Mangal bazar.jpg
7th: Patan Krishna Mandir in Patan by Udhabkc, freely licensed under CC BY-SA 4.0.

Taleju Mandir.jpg
8th: Taleju Temple in Kathmandu by Snagina, freely licensed under CC BY-SA 4.0.

Bhaktapur-Bhairava Mandir am Taumadhi Tole-02-gje.jpg
9th: Bhaktapur Bhairav Mandir by Gerd Eichmann, freely licensed under CC BY-SA 4.0.

Nuwakot Palace (5).jpg
10th: Nuwakot Palace in Nuwakot by Bijay Shrestha, freely licensed under CC BY-SA 4.0.

Biplab Anand, Wikimedians of Nepal

by Biplab Anand at November 13, 2015 10:46 PM

Ukrainian Wikipedia reaches 600,000 articles

Outside view
This blog post was written by a member of the Wikimedia community and not by an employee of the Wikimedia Foundation. The views expressed are the author’s alone and are not necessarily held by the Foundation or the community as a whole.

Reflections in a flask of Methylene Blue
Methylene blue, one of many redox indicators. Photo by Amanda Slater, freely licensed under CC BY-SA 2.0.

After twelve years, the Ukrainian-language Wikipedia has passed another milestone—600,000 articles.

The article reaching the milestone was Окисно-відновні індикатори (Redox indicators), substances that are used in chemistry to determine the equivalence point of a redox reaction.

As the article explains, the indicators used are most often organic substances with redox properties, as well as metal-organic compounds in which the oxidation number of the metal changes on reaching a certain chemical potential. In both cases, the structural change is accompanied by a change in the colour of the substance.

The articles for the previous hundred-thousand milestones are Гойтосир (Goitosyros), Список країн за видобутком вугілля (List of countries by coal production), Шумейко-Роман Олена Олександрівна (Olena Shumeiko-Roman), Міяма (Miyama, Fukuoka), and Електронний газ (Free electron model). The last of these, the 500,000th article, was created on May 12, 2014.

There are currently 2,386 active editors of Ukrainian Wikipedia, that is, users who have made at least one edit in the last 30 days. Wikimedia Ukraine, the local chapter in the country, works on involving new editors by organising wikiworkshops (currently Wikifest:Luhanshchyna: Wikipedia editing workshops series in Luhanshchyna), getting students to write articles as an educational activity, and answering questions.

Recently the English Wikipedia has also passed another milestone—5 million articles. We do believe that encyclopaedic articles about all important phenomena, events, personalities, and things must exist in the Ukrainian language as well; the better quality they have, the better it is for everyone.

We call on you to start editing Wikipedia—create your article today!

Vira Motorko, Wikimedia Ukraine

The post was originally published on the Wikimedia Ukraine blog.

by Vira Motorko at November 13, 2015 10:15 PM

Pete Forsyth, Wiki Strategies

Superprotect removed: A milestone, and what it means

Wikimedia Executive Director Lila Tretikov and Trustee Jimmy Wales presented yesterday on the strategic direction for the organization. Screenshot from video licensed CC BY-SA, Wikimedia Foundation.

Yesterday the Superprotect user right, which I have long opposed, was removed from Wikimedia’s servers. In announcing the removal, the Wikimedia Foundation characterized Superprotect  — accurately and appropriately, I would say — as having “set up a precedent of mistrust” among Wikimedia volunteers and Foundation personnel.

This action is striking and significant. The written announcement says it resolves “a symbolic point of tension,” and sets the stage for addressing underlying problems.

While this is unequivocally good news — and the reception in the Wikimedia-L  and Wiki Tech email lists has been unanimously positive — there are two notable aspects to the announcement, which may provide insight into the underlying issues referenced.

Here is Executive Director Lila Tretikov’s statement. (I emphasize the parts relevant to my commentary below):

We wanted to remove Superprotect. Superprotect set up a precedent of mistrust, and this is something it was really important for us to remove, to at least come back to the baseline of a relationship where we’re working together, we’re one community, to create a better process. To make sure we can move together faster, and to make sure everybody is part of that process, everybody is part of that conversation, and not just us at the Wikimedia Foundation.
(from the monthly Metrics & Activities meeting; video footage)

Two notable things revealed in this announcement:

1. The simple solution to a “really important” problem took a very long time

Once again, the news is good. But the costs of Superprotect were tremendous. Most strikingly, in a New York Times op-ed column, Wikipedia expert Andrew Lih attributed the losing campaigns of the only three popularly-elected Wikimedia Trustees last summer to Superprotect.

And the benefits of Superprotect were nonexistent.

Even so, from August 2014 until November 2015, Foundation leadership was conspicuously silent on the removal of Superprotect. It made no formal statement even after receiving a letter requesting that it be disabled, signed by 1,000 people (which I wrote). The closest thing to a formal statement — a series of comments in August 2014 from Wikipedia founder and Wikimedia trustee Jimmy Wales — falsely (and oddly) claimed that Superprotect was removed at that time.

It’s possible this will be the first and last formal statement from the organization about the removal of Superprotect, so we may never know for sure why a “really important” and really simple action took more than a year to implement. My best guess is that the whole thing is regarded as an embarrassment, and that the removal came about only when it finally became clear that ignoring the issue and declining to use the feature wouldn’t make the acrimony go away.

It might or might not be significant that the announcement came a day before a quarterly Board meeting — the second attended by the three Trustees who won on platforms that were critical of Superprotect. I have heard from a number of staff that internal discussions about Superprotect have been intense and ongoing, so it’s probably safe to assume that the organization has undergone some valuable growth behind the scenes.

2. Foundation leadership (still) gets the power dynamics backwards

The final phrase in Tretikov’s announcement, above, repeats a fundamental misunderstanding of the Foundation’s mandate (which I have called out before): she aims to ensure that “everybody is [included in decision-making], and not just us at the Wikimedia Foundation.” The phrasing evokes the words of former Board chair Jan-Bart de Vreede, whose justification for the initial launch of Superprotect included the following:

I hope that all of you will be a part of this next step in our evolution. But I understand that if you decide to take a wiki-break, that might be the way things have to be. Even so, you have to let the Foundation do its work and allow us all to take that next step when needed.

While the Foundation undeniably has — and should have — the ability to make certain decisions unilaterally, the Wikimedia Foundation would not exist — and would not have the ability to raise more than $60 million a year in individual donations — if not for the efforts of hundreds of thousands of volunteers. Those volunteers are making decisions every day, about how to collaboratively assemble the world’s knowledge. The volunteer community makes so many decisions, in so many venues and in so many languages, that the Foundation (and, truly, everyone) has difficulty even keeping track of them all, much less participating in them.

The bottom line is, the Wikimedia Foundation must continually prove its own worthiness to participate in a much bigger culture of decision-making. This is something all of us who want to contribute to the Wikimedia ecosystem must do; every individual Wikipedian, and every organization that aims to participate in our shared vision. (Including, of course, me and Wiki Strategies.)

The Wikimedia Foundation does indeed have some special responsibilities, and there are some special privileges that should go along with them. Those privileges include the ability to take unilateral action in a variety of (insufficiently defined) areas. But the conceptual slip from “entity with special responsibilities” to “entity which dictates who gets to participate and how” is a bad one, and one Foundation leadership has made too often.

Focusing on the relatively small decisions about launching software features (which is what prompted Superprotect to begin with) misses the bigger picture by a mile. The bigger point: the Wikimedia world exists only by virtue of the collective efforts of hundreds of thousands of volunteers, and could not exist in a recognizable form without them. Any organization that aims to support the Wikimedia vision loses sight of that reality at its own peril, and with inherent risks to the ecosystem around it.

Note: Final three paragraphs were expanded and edited a few hours after initial publication. -PF

by Pete Forsyth at November 13, 2015 09:21 PM

Weekly OSM

weekly 277



OSM Restriction Validator [1]



  • In the continuing “Mapper in the Spotlight” series, Escada publishes his interview with the mapper Dave Corey from Ireland.
  • User Glassman wants to rewrite the welcome page to encourage more participation from the newly registered users on openstreetmap.org. He is asking for more suggestions.
  • The Dutch community discusses (English) the data from the 3dShapes import. The data is from around 2002 and of course the landscape has changed since then (but was it accurate before?). The discussion continues on whether the data should be imported from newer sources or should be manually updated.
  • John Willis explains why the current OSM-based maps are not useful for map-based navigation in Japan. He further elaborates about this in another post.
  • Joost Schouppe has some statistics on the evolution of data density in the different Belgian regions.
  • Joost Schouppe reports on the meeting with “Trage Wegen”. This organisation keeps an inventory of footways, even those that no longer exist, which makes it hard to use their data. Still, the two groups will try to meet on an inventory day, as these days are similar to mapping parties; they might exchange data and perhaps also meet some potential mappers. (Dutch) (automatic translation)
  • User JAT86 explains that he has left the Wikimapia project in favour of OpenStreetMap, and gives the reasons for his change.
  • Ramyaragupathy reports about Mapbox’s progress in the correction of Japanese road alignments (also reported previously).


  • Andy Wilson would like to import buildings in Austin, Texas. See also the wiki page for the proposed import.
  • Nicholas Doiron asks for help, tips or feedback on a dataset containing townships in Myanmar.

OpenStreetMap Foundation

  • Board Elections – there are four seats available, but only four candidates have applied so far? Candidates can stand for election until 21st November, 16:00 UTC. In previous years there has been a “late run” of applications. Here are the questions for the candidates.


  • Originating in 1999, the annual GISday, which was started by Jack Dangermond (president and co-founder of Esri), will this year take place on the 18th of November. It will feature many events all over the world.
  • A Mapathon in Madrid will be held on the 21st of November 2015 (English).

Humanitarian OSM

  • OpenStreetMap for Disaster Management, a Slideshare presentation by Yantisa Akhadi, Team Manager at Humanitarian OpenStreetMap Team Indonesia.


  • “WakeUp Earthians” started a petition on change.org to ask Google Maps and OpenStreetMap to display the Indian state of Jammu and Kashmir on maps as India would like to see it recognised (the situation is complicated).  (via @Anonymaps)
  • [1] User MorbZ announces (via the OSM Forum) that he has developed a Restriction Analyzer (about turn restrictions) which uses the Overpass API. One of its main features is the ability to find unnecessary restrictions.
  • The map in this blog shows all the bridges in the Hamburg area. Clicking on the name of a bridge opens an information box, which links to a separate detailed blog post about that particular bridge.
  • Mapworks is a tool that helps design fast and pretty maps using data from the wonderful OpenStreetMap. This alpha version allows maps to be downloaded as PNG images and SVG files.
  • The blog “Maps curious about food” shares the idea of making maps with food. It also shows maps of food production, etc. (English).
  • Patricio Gonzalez Vivo created a live map view showing the current position of the ISS on the ground. (via @miichidk)

Open Data



  • In a detailed blog post, Susan Crawford explains the history and background of the Android app Nearby Explorer Online. It’s an “audio map” that helps blind and visually impaired people with outdoor navigation using either Google or OpenStreetMap data.
  • On Graphhopper’s blog there is a description of how they solve the famous “Traveling Salesman” problem.
  • Eike Beer reports how to set up a development version of OpenRailwayMap which uses the vector tiles provided by openrailwaymap.org.
  • SHTOSM reports that the OSM data for Russia has now reached 1.7 Gigabytes, and explains how to process smaller extracts of it. (Russian) (automatic translation)
  • Releases
    Software    Version    Release Date    Comment
    libosmium   2.5.2      2015-11-06      One day after version 2.5.1
    Locus       3.13.1     2015-11-08      Shortly after version 3.13.0, maybe a bugfix release

Did you know …

  • HOT kits – They comprise laptops, GPS units, a printer-scanner and an assortment of pieces to hook it all together, and they allow people to continue contributing to OpenStreetMap even after the H.O.T. team has left.
  • … the QA tool “Node network analysis” for analysing Belgian and Dutch cycle and walking networks.

Other “geo” things

  • Mapbox blogs about a technical way to sharpen Landsat 8 satellite images.
  • There has been a lot of attention on the diminishing Arctic sea ice in recent years. The goal of this blog post by Bjørn Sandvik was to show the changing sea ice month by month, and even day by day.
  • OpenSolarMap is a crowdsourced project to find the best roofs to place solar panels. It’s one of the projects of the Climate Change Challenge. Don’t hesitate to start! Click the roof!


weeklyOSM is brought to you by … 

by weeklyteam at November 13, 2015 07:46 PM

Wikimedia Foundation

“Community revitalization”—working together to strengthen the Hebrew Wiktionary

Outside view
This blog post was written by a member of the Wikimedia community and not by an employee of the Wikimedia Foundation. The views expressed are the author’s alone and are not necessarily held by the Foundation or the community as a whole.

The closing event of the first Hebrew Wiktionary course in Israel, held at the Academy of the Hebrew Language in Jerusalem. Photo by Chen Davidi-Almog, freely licensed under CC BY-SA 4.0.

Hebrew Wiktionary statistics, 2014–2015. Infographic by Itzike, freely licensed under CC BY-SA 4.0.

2015 has been a pivotal year for the Hebrew Wiktionary, thanks to the small community and Wikimedia Israel joining forces to strengthen the project. By collaborating with each other and with the Academy of the Hebrew Language, we started an educational course and brought new editors into the project.


The Hebrew Wiktionary community can be described as independent, but limited in size. The level of activity in the project in 2014 was very low—428 edits in July and 154 in August, for instance—and leaned heavily on a small but stable group of writers. Their work was divided between creating entries and necessary regular maintenance of the project; very few new writers joined in. We felt that the project needed a “shot in the arm” and could use new blood to drive policy innovation and to integrate new writers into the activity.

We believe that the chapter, being an external entity that is nevertheless familiar with and committed to the Wikimedia communities, could play an important role in revitalizing the community and helping it become independent and prosperous. Four key elements enabled the revitalization of the community:

  1. Finding common ground
  2. Relationships in the community
  3. Mutual support
  4. Continuity

Finding common ground

It started with an initial inter-organizational thought process aimed at promoting a joint project. Last December we contacted the Academy of the Hebrew Language, Israel’s highest institution for the study of the Hebrew language, whose decisions are officially binding.

Recruiting an institution of such prestige has created a sense of pride among the Hebrew Wiktionary’s volunteers and greatly helped in recruiting new volunteers. A shared vision between the Academy of the Hebrew Language and the Wiktionary community—to contribute to the Hebrew language and expand the Hebrew Wiktionary community—has created a natural and meaningful connection and serves as a golden opportunity for promoting the project. The academy, Wiktionary volunteers, and the chapter are now working hand in hand on the production of a tutorial that combines the teaching of practical editing and the fundamentals of professional lexicography. The tutorial was selected to be the first step in an outline for the establishment of an independent and growing community.

Relationships in the community

It was necessary to cooperate with community members to create the new editors’ course. The first step was finding contacts. We tried locating them through user pages, the village pump, and editors on other projects, but in the end the most successful strategy was going through the project’s only bureaucrat.

Integrating new people into the activity of an independent community requires us to help them adapt to the community’s spirit and courses of action. Fortunately, the three editors who offered to help us were willing to do so—a crucial step towards starting this project.

Mutual support

We began with a meeting between the three contributors and the activity coordinator of Wikimedia Israel for initial introduction and to come up with an objective and vision for the Wiktionary course and Hebrew Wiktionary. The volunteers discussed ways to improve the project and created a list of tasks to be performed before the course begins. The course served as an excellent opportunity for the rejuvenation of the project, and it seems that most of the active editors took it upon themselves to make it a success. New help pages were written, lists of missing entries were created, etiquette was formulated for the project, new officials were appointed, the full guide for the Wiktionary reader was updated, and even the project’s logo was updated.

Over the following few months, updates on the progress of the course were posted regularly in the Wiktionary buffet. The volunteers also came to the aid of new writers, initiating them and commenting on the entries written during the course.

The second meeting included representatives from the Academy of the Hebrew Language and defined the work plan of the course. It was decided that the course would be jointly taught by volunteers and representatives of the Academy. The syllabus was created together in order to create a compatible course structure with lessons that complement each other in terms of the technical and content aspects of writing.

The behind-the-scenes work invested in producing the course was very intensive. Holding weekly sessions allowed an emphasis on continuity, individual effort, commitment, and perseverance on the part of course participants, yet the volunteers were called upon to work around the clock on monitoring, initiation, creating lesson plans, and teaching, as well as reviewing the assignments that participants prepared at home. Their investment and commitment are undoubtedly the reasons for the course’s success.


Continuity

For a community active in a specific content field to survive and prosper, and to stay motivated for further work and creativity, it is important to continuously maintain high levels of activity. We felt it was important to keep the volunteers active, to recruit the course graduates, and to keep the project lively.

The closing event of the first Hebrew Wiktionary course in Israel. The event was held at the Academy of the Hebrew Language in Jerusalem. Photo by Chen Davidi-Almog, freely licensed under CC BY-SA 4.0.

We therefore decided, after a few months in which the project editors continued to further improve and implement course deliverables, to start working on the next project. Volunteers expressed a desire to hold a missing entries writing marathon that would encourage course participants to stay in the project. Eventually, a decision was made to launch a first-of-its-kind prize competition for the writing of lexicographic entries. Additional important objectives of the contest were to advertise the project to the general public and the crowds of Hebrew language lovers, as well as expanding the community even further.

First, a message appeared in the Wiktionary Village Pump, inviting editors to participate in planning and producing the competition, with an invitation to participate in the first planning meeting, held at Wikimedia Israel’s offices. Later on, a Wiktionary volunteer created a list of 500 (!) entries missing from the dictionary. At the same time, a project page for the competition was created. Volunteers were assigned different functions such as Contest Secretaries, Referees, and people to monitor and provide assistance through the help counter. Both new and veteran volunteers joined in to participate in this project, primarily to assist and answer questions of participants in the contest, and also for the judgment and filtration of the entries. After discussions and consultations with Academy members, criteria for the right way to write an entry in Wiktionary were created and published for the first time ever, and the entries were judged based on these criteria.

Linguists and lexicographers who are members of the Academy of the Hebrew Language joined the Judgment Committee, as well as language experts who are well-known in Israel.

At present, we are working on the production of a festive community gathering for Wiktionary editors. We now place an emphasis on encouraging and advancing projects initiated by the community itself. While the Wikimedia Israel chapter will support and help, it will be strictly up to the volunteers to lead the projects. In addition, we hope to launch one more course in early 2016.

We hope that the movement that began during the past year will continue to develop towards an independent community that takes its own initiative.

In conclusion, during 2015, close to 300 new entries were written and about 130 were extended. Some 50 contributors participated in the projects (of which 33 are new). Compared to last year, there was an increase of 750% in new editors’ activity and an increase of 480% in the number of active editors in the project. Compared to the same months last year, the number of edits increased by almost 1,600%.

Community is the fundamental basis for the existence of all Wikimedia projects. It is our job, as an organization in the Wikimedia movement, to help communities to prosper. In order for a community to thrive, tight work relations are required among the community members and between the community and the Chapter. Finding common ground is also necessary and in this case it was the commitment of the volunteers, the Hebrew Academy and the Chapter, to the efforts for expanding the project. Volunteer communities don’t always find in themselves the strength to promote younger generations, and this was the main role that we took upon ourselves when assisting in executing the various activities that would bring new editors. Providing support can be described as the major role of Wikimedia Israel.

Chen Davidi-Almog, Activity and Resources Coordinator
Wikimedia Israel

by Chen Davidi at November 13, 2015 07:23 PM

Wikimedia UK

Somerville, science and Wikipedia

By William Skelton (engraver); Charles Reuben Ryley (artist) (The Bodleian Libraries, Oxford) [CC BY 4.0], via Wikimedia Commons

From Martin Poulter, Bodleian Libraries Wikimedian in Residence. Also published on the Bodleian Libraries blog.

In the early 19th century, Mary Somerville was a celebrity scientist. One of her works was the best-selling science book of the time, until overtaken by Darwin’s On the Origin of Species. She published on astronomy, biology, atomic theory, and physical geography, at a time when scientific publications by women were rare. She tutored the computing pioneer Ada Lovelace, and introduced Lovelace to Charles Babbage.

Somerville’s name lives on, not least in the Oxford college named in her honour, yet not many people today know of her achievements. One reason is that it is hard to find her works. Her publications have been digitised, but the digitisation process only created images – rather than text, which is easy to find, search within and quote.

On 12 October, a group brought together by University of Oxford IT Services and the Bodleian Libraries started to change that. Working together on Wikisource, a sister-project of Wikipedia, we published a definitive transcription of a Mary Somerville paper from 1826. Being totally open access, the paper incidentally now meets modern funders’ requirements for scientific research outputs.

We also began that day a transcription of Somerville’s Preliminary Dissertation on the Mechanism of the Heavens, a book described by the astronomer John Herschel as ‘by far the best condensed view of the Newtonian philosophy which has yet appeared’. The transcription was finished this month, and it and other texts are available through Somerville’s Wikisource profile.

This event was just one in a larger programme of events celebrating the bicentenary of Ada Lovelace. The events involved Oxford staff, interested members of the public and experienced Wikipedians, a couple of whom participated remotely. We were lucky to have two excellent guest speakers in Prof Ursula Martin and Prof Sylvia McLain. Each event improved a different aspect of open knowledge about women’s achievements in science and related fields.

In the edit-a-thon and improve-a-thon, we created 8 Wikipedia articles about notable women scientists — some living, some historic — and improved a further 16 existing articles. Creating an article from scratch can be a time-consuming process, but fixing clumsy wording or adding a cited fact is relatively quick.

Wikipedia’s new visual editor works like a word processor, so new users can write or improve articles without having to learn wiki code. We found that this makes Wikipedia editing much quicker to learn, quicker to do, and hence more enjoyable.

Our goal was not just to write biographies but to improve the web of knowledge to fairly represent women’s achievements. Amongst the non-biography articles we improved were those on Mary Somerville’s bestseller On the Connexion of the Physical Sciences and on the Finkbeiner test — important reading for anyone who writes about women scientists!

The 4th workshop was an image-a-thon, looking at Wikipedia’s sister project, Wikimedia Commons, and at some images of women scientists not yet used in Wikipedia. We also uploaded images from copyright-free sources, improving a total of 20 Wikipedia articles.

A call for material for the image-a-thon drew responses from private collections and cultural institutions. Among the contributions from The John Johnson Collection of Printed Ephemera is the accompanying illustration from a museum ticket, which now illustrates the Women in science article.

The records of what we did, including the articles edited, images uploaded and feedback are all openly available. More importantly, the process is ongoing. There are more articles that need expanding, re-wording, illustrating, or creating and anyone can join in: see the project pages for more details.

We will keep in touch with the participants and hope they continue as Wikipedians. The feedback from participants includes, ‘I found the session really useful and fun’ and ‘found it a very rewarding and useful experience and would like to continue contributing’ among similar comments, indicating that for some, we have started a habit.

by Martin Poulter at November 13, 2015 10:39 AM

This month in GLAM

This Month in GLAM: October 2015

by Admin at November 13, 2015 10:18 AM

Wikimedia Foundation

Internal security incident identified and resolved at the Wikimedia Foundation

Love padlocks
A security incident with the Wikimedia Foundation’s Mailman mailing list system was identified and addressed today. Photo by Petar Milošević, freely licensed under CC BY-SA 4.0.

On November 12, the Wikimedia Operations team identified a security incident on the Wikimedia Foundation’s Mailman mailing list system that resulted in the breach of four staff email accounts. We immediately investigated the incident, addressed the underlying vulnerabilities, and took steps to remedy the situation.

To our knowledge, the affected accounts have now been secured, and the security incident has been resolved. As part of our commitment to transparency, we are sharing an overview of this incident and how we responded.

How did this happen?

An account with legitimate access to the server hosting our mailing list system obtained passwords from configuration files. A number of those passwords were then tested against staff email accounts and matched in four cases.

What has been done to fix it?

We immediately locked the four affected staff accounts, changed affected passwords, and applied additional security measures. We also locked the account believed to have been behind the breach and have terminated all future access from that account to internal systems. At this time, we have no evidence of other production services being impacted. Out of an abundance of caution, we are in the process of regenerating all passwords stored by our mailing list system. If you use your Mailman password for other accounts, we recommend that you change your password for those accounts.

The Wikimedia Foundation takes the privacy of staff and users very seriously. We will continue to monitor our systems and implement additional security measures to prevent this from happening again.

Mark Bergsma, Director of Technical Operations*
Michelle Paulson, Legal Director*
Wikimedia Foundation

*We would like to thank the various teams, including Ops, Performance, Communications, Legal, Office IT, and Community Advocacy, that worked together throughout the day to expeditiously investigate and resolve this issue.

by Mark Bergsma and Michelle Paulson at November 13, 2015 01:01 AM

November 12, 2015

Wikimedia Foundation

How can you write an open access encyclopedia in a closed access world?

Outside view
This blog post was written by an individual who is not affiliated with the Wikimedia Foundation. The views are the author’s alone and are not necessarily held by the Foundation or the Wikimedia community.

Open Access logo PLoS white.svg
Photo by PLoS, modified by Wikipedia users Nina, Beao, and JakobVoss, freely licensed under CC0 1.0.

I was glad to join a panel discussion organized by Pete Forsyth of Wiki Strategies last week at the Wikimedia Foundation, which was inspired by a debate between Michael Eisen, a founding member of the Open Access (OA) publishing movement and leading OA journal PLOS, and Jake Orlowitz, Head of The Wikipedia Library. Below is an elaboration of many of the points I was glad to insert into this conversation.

Collaborators not competitors

I have been delighted to see Jake and others find ways for the Wikimedia Foundation to provide the Wikimedia community access to otherwise restricted content. Back in 2010, Credo Reference was the first paywalled resource to provide Wikipedians with free access to our content (an online aggregation of subject encyclopedias). Our motivations were entirely to create good public relations. Our lead investor, Bela Hatvany (founder of two library companies, CLSI and SilverPlatter) was fond of the expression, “A competitor is just someone you’ve not learned how to collaborate with yet.” In his view, one of the principal objectives of a company should be to add value to their customers’ world.

So, while Wikipedia, in some people’s eyes, would be seen as our competitor (and one which had by all measures ‘won the battle’ for the attention of users in need of online reference), I have always thought that they were an important player in the lives of our customers and their users, and that we should find ways to collaborate with them. We did not expect any quid pro quo regarding citations from Wikipedia to Credo Reference, though we did occasionally look at whether there was any growth in such references. It was also important, to my way of thinking, to realize that when you consider any group of Wikipedia editors, you don’t really know who they might be in their non-Wikipedian lives. The 400 prolific editors we provided access to were people that are passionate about creating good encyclopedia entries for the benefit of all. Having such people think well of Credo was valuable even if we never knew who they were.

Critical trade-offs

Michael Eisen raised a key point in the forum: whether or not primarily closed-access publisher Elsevier would gain more from their donation of access than Wikipedia would. I think Elsevier’s goals in this arrangement are entirely about public-relations benefits. It allows them to assert their support for openness. They, too, want scientists to think well of them. They even have open access journals that they want to promote. I don’t think the bits of traffic they might get from this arrangement would have entered into their calculations at all. And certainly, as Eisen asserted, it does not represent any potential loss of revenue that might have come from those 45 Wikipedians.

For Wikipedia, content and links should be driven by what best serves their users. Anything that increases Wikipedians’ access to paywalled material is, in my opinion, a positive. Another Wikipedia Library program has been to encourage libraries to designate selected Wikipedians as “Visiting Scholars” who can thereby be given access to paywalled content within those libraries (including Elsevier’s ScienceDirect database). Having access to resources without a blind-spot hiding paywalled content from their view should improve Wikipedia entries: access to that literature can increase an editor’s understanding of a field even if they don’t cite that literature. Personally I think that there are certain paywalled sources which should be cited in a good encyclopedia entry so that the user is aware of them and can assess the reliability of claims in the Wikipedia article.

Michael Eisen also raised an interesting point about cases where reliable sources for a particular assertion in an article are “fungible” and could equally be supported by a comparable open resource. Jay Jordan, former CEO of OCLC, was fond of saying “Discovery Without Delivery is a Disservice to Users.” This is a good principle: unless there is a compelling reason to back up assertions by a paywalled source, an equally reliable open source should be used. Providing references to sources which all readers can access is simply a better user experience.

However, Michael’s suggestion that Elsevier will gain materially from links to sources in ScienceDirect is, in my opinion, not only wrong, but runs the risk of taking our eyes off what I think can best serve the battle for open access at this point.

The bigger problem

The paywalled business model for scholarly communication was first made clear to me by Jan Velterop at a conference back in 1998 when he was managing director of Academic Press and I was president of SilverPlatter. He said, “Each and every scholarly article is its own little monopoly.” It has stuck with me ever since and is the core problem which both the Gold and Green paths to Open Access address (Gold meaning an OA journal; Green meaning an OA repository). Elsevier does not need traffic from Wikipedia to shore up their paywalled products. The paywalled business model succeeds by exploiting monopoly control of a sufficient portion of scholarly articles to allow the owners of such products to set whatever prices they want. The vast majority of their customers simply have to pay. The buyer has almost no market power or legitimate alternatives.

The Open Access world has made enormous strides in addressing this problem. I salute those achievements—they represent significant progress on the economics of academic publishing. But what is the current state of those economics? I was reminded of it, viscerally, just this week. I was having dinner with a Chief Librarian at a major research university. His university back in 2004 had an annual subscription budget of $8 million. This year the price for that same set of subscriptions—without any substantive increase in the number of journals—is $14.4 million. This is an average annual price increase of 5.5%, more than double the inflation rate in those years.

Gold OA publishers (now about 13% of all scholarly journals) are completely open upon publication. Publishers like PLOS provide a clear and successful “existence proof” that a non-monopoly publishing system for academic research can work. In fact, most paywalled publishers, perhaps to hedge their bets or to compete with PLOS, are now offering their own Open Access journals.

But at the same time, there’s a real lag in adoption of the Green path by which scholars open up a version of an article so that it’s accessible to all even though it was published in a paywalled journal. 78% of scholarly publishers worldwide give some form of blanket permission for authors to archive a version of their articles in an accessible place, and even the 22% that don’t can be pressured to allow such archiving.

Both Green and Gold depend on the understanding that scholars have of the issues: Green so that they archive papers they’ve already written or continue to submit to paywalled journals, and Gold so they know that they have excellent choices of where to publish and can reasonably choose Gold OA publishers in the future.

Influencing and educating researchers, one at a time

It is said that there are three times you can capture the attention of an academic researcher:

  • When they are trying to get a paper published
  • When they are trying to show to others (e.g. tenure committees, colleagues, in-laws) that their work is influential
  • When someone important to them recognizes their work (e.g. cites or comments about it).

There is an opportunity here for both Gold Open Access publishers and Wikipedia. Reference sections in your articles are lists of important experts in their fields. Each of these citations is an opportunity to educate an expert about the opportunity for openness. So both Gold OA publishers and Wikipedia have an asset which they can deploy in an effort to educate scholars about Open Access.

For OA publishers, the cited sources of articles you are about to publish were written by experts in your field of publication. That’s why they’re being referenced. You have advance news that they are about to be referenced. This is sufficiently newsworthy that a scholar is likely to open an email leading with this announcement (rather than all the spam they regularly get). You can use this message to make them aware of your new article’s author. If you know their referenced article is not yet open, you could advocate that they deposit it in a repository or an author’s website, even giving them guidance on whom to turn to for help. You can alert them to the benefits of opening up their article (more downloads, more citations). You can advocate that the next time they write an article they might consider your journal.

For Wikipedia, your references to authors are now gaining in importance to them. The Openness agenda could be well served if someone could mine Wikipedia for the most referenced sources which could be opened but are still closed. Imagine a “Leaderboard” which called out the scholars with the most references in Wikipedia which have not been made open.

Coming back to the question of links from Wikipedia to Elsevier sources: I point out that such links are, yes, links to Elsevier. But they are also a connection to scholars, many of whom are in need of education about Open Access. I say, bring on lots of those links (where the sources serve the content need of Wikipedia) and then organize an effort to make those sources open.

Another ally

Another ally in this use of referenced sources is a growing number of university presses. The librarian I spoke to this week is one of the 30% of head librarians who are also responsible for their university’s press. This means that these presses are right at the point-of-pain where the monopoly pricing practices of paywalled scholarly publishing hit the university. This year when my librarian friend went to get his budget increased to cover the $800K price increases from subscription journals, the president of the university said he’d have to present this increase request to the faculty senate so that the faculty would know (for the first time apparently) the costly impact of the scholarly publishing system on the budget of the university. The university press was included in that budget presentation to provide an educational piece on Open Access. These university presses, like the Gold OA publishers, have a decided self-interest in converting scholars one-by-one to the open access business model. Some are now running scholarly publishing offices that, among other things, work with researchers to assess their prior publications and find opportunities to open up articles they’ve authored.

If these three groups—Gold Open Access publishers, Wikipedians, and University Presses—can actively seek out ways to motivate individual scholars regarding archiving and publishing, then 10 years from now the librarian I met with this week will be able to go to the faculty senate and report on year-on-year savings rather than asking for huge budget increases due to monopoly-based price increases. Let’s find ways to assess accessibility of references, build tools that can automate the process of finding and pinging authors of important paywalled sources, and transform the economics of scholarly publishing to the benefit of science and scholarship.

John G. Dove, Consultant, Paloma & Associates

by John G. Dove at November 12, 2015 11:31 PM

Wikimedia Highlights, October 2015

Wikimedia Highlights, October 2015 lead image.jpg

Here are the highlights from the Wikimedia blog in October 2015.

Wikipedia’s global impact recognized with Spain’s Princess of Asturias Award ceremony

Princess of Asturias awards 2015 - Wikimedia España members, Wikipedia editors and representatives.jpg
Princess of Asturias prizes 2015 – Wikimedia España members, Wikipedia editors and representatives. Photo by Ruben Ortega, freely licensed under CC BY-SA 3.0.

On Friday, October 23, Wikipedia was formally presented with the highly esteemed Princess of Asturias Award for International Cooperation, honoring the Wikimedia movement vision of allowing everyone, everywhere to freely share in the sum of all knowledge.

Wikipedia Founder Jimmy Wales accepted the award on behalf of Wikipedia at the Princess of Asturias Foundation ceremony in Oviedo, Spain, presided over by the King and Queen of Spain. Jimmy was joined by Lila Tretikov, Executive Director of the Wikimedia Foundation, Patricio Lorente, Chair of the Board of Trustees of the Wikimedia Foundation, and Wikimedians Ravan Jaafar Altaie of Iraq, Lourdes Cardenal of Spain, and Jeevan Jose of India.

Creating change one step at a time: Miguel Zuñiga Gonzalez

Miguel Zúñiga González-2.jpg
Miguel Zuñiga Gonzalez first started as a Wikipedia volunteer in 2006. Today, Gonzalez combines this passion with his love of teaching, and works with university students on improving Wikipedia’s coverage of medicine. Photograph by Victor Grigas, Wikimedia Foundation, freely licensed under CC BY-SA 3.0.

Miguel Zuñiga Gonzalez is an architect, a university teacher, and a Wikipedia volunteer. A native of Mexico City, Gonzalez sat down with us during the 2015 Wikimania conference held in the Mexican capital to briefly discuss his work with medical students on the Spanish-language Wikipedia, and share his views on Wikipedia’s current concerns and its future chances.

One concern Miguel has is the lack of involvement of the general population with the project. However, Gonzalez sees the possibility of change in the younger generations. “Young people are more inclined to share their knowledge with the world. Students, actually, get very motivated when they publish their work on Wikipedia as part of a paper. For them, it’s a challenge. Telling them that it will be for the whole world to review gives them even more of a reward.”

There are advantages to this approach. “When students do research for their articles, they find practical uses from what they’re learning at the moment, such as the ability to search scientific databases,” says Miguel. “When a reference is properly used, it also gives them a chance to expand their knowledge. When they improve things even a little bit, by adding five or six references to an article, they make a big contribution by ensuring everything is reviewed.”

Your October milestones include Wikidata’s 15 millionth item

Wikidata Birthday Cake First Cut.jpg
Wikidata hit 15 million items this month, not long before its third birthday celebrations in Berlin. Photo by Jason Krüger, freely licensed under CC BY-SA 4.0.

While the English Wikipedia’s 5 million article milestone just missed the cut, several other Wikimedia projects celebrated milestones of their own in the month of October.

On October 27, Wikidata hit 15 million items—and two days later celebrated its third birthday, with celebrations in Berlin. It’s now the third most-active Wikimedia project behind the English Wikipedia and Wikimedia Commons, with around 5,900 active users in June 2015.

Several language editions of Wikipedia crossed major article milestones—on October 12, the Georgian Wikipedia became the 54th project to reach 100,000 articles, and on the 29th, the Azerbaijani Wikipedia joined them in that club as the 55th.

Early in the month, the Catalan Wiktionary hit the 150,000 entry milestone, while the Hungarian Wiktionary reached 300,000 entries towards the end of October—only the 18th Wiktionary to reach this particular milestone.

Bots again played a role in growing some of the smaller Wikimedia projects this month, with the Min Nan Wikipedia growing by ten thousand articles in just eight days.

District court grants government’s motion to dismiss Wikimedia v. NSA, appeal expected

Stop surveillance poster.jpg
Spread the word about inappropriate surveillance. Art by Rich Black, CC BY 3.0.

On October 23, 2015, a federal district court granted the government’s motion to dismiss Wikimedia v. NSA, the Foundation’s lawsuit challenging the U.S. National Security Agency’s (NSA) use of “Upstream” mass surveillance.

Unfortunately, the court did not actually rule on whether the NSA’s upstream surveillance is legal or illegal. Judge T.S. Ellis III, the presiding judge, dismissed the case on standing grounds. The court held that the Foundation’s complaint did not plausibly allege that the NSA was monitoring the Foundation or other plaintiffs’ communications. Additionally, the court referenced the U.S. Supreme Court decision in Clapper v. Amnesty International, although, in the Foundation’s opinion, the facts before the court were dramatically different from the ones that were before the Supreme Court in Amnesty.

The Wikimedia Foundation respectfully disagrees with the Court’s decision to dismiss. The Foundation believes that its claims have merit. In consultation with its lawyers at the ACLU, the Foundation will review the decision and expects to appeal to the Fourth Circuit Court of Appeals.

The Wikimedia Foundation would like to thank our skilled and dedicated pro bono counsel at the American Civil Liberties Union (ACLU) and Cooley, LLP for their dedication and hard work on behalf of the Wikimedia movement.

Making Chinese Wikipedia more ethnologically diverse

2010 07 14820 5420 Amis Folk Center Art of Taiwan Cobblestones Taiwan.JPG
Wikimedia Taiwan collaborated with National Chengchi University to improve the Chinese Wikipedia’s coverage of ethnology. Photo by Lord Koxinga, freely licensed under CC BY-SA 3.0.

Students in Taiwan are making the Chinese Wikipedia more ethnologically diverse through a collaborative program with Wikimedia Taiwan, an affiliate organization of the Wikimedia movement, and National Chengchi University (NCCU) in Taipei.

The students learned how to edit through a lecture at the beginning of the course. Students edited the ethnological content in their own sandboxes and had oral presentations to outline their articles and share their progress. Students were allowed to use translated information from the English Wikipedia but had to improve it to meet academic standards.

In short, readers of the Chinese Wikipedia now have 40 new articles on ethnic minorities to learn from, including the Flemish, Mongo, and Bemba peoples.

You can read more about this project.

Andrew Sherman, Digital Communications Intern, Wikimedia Foundation

Photo Montage: “Wikidata Birthday Cake First Cut” by Jason Krüger, freely licensed under CC BY-SA 4.0.; “Miguel Zúñiga González-2” by Victor Grigas, Wikimedia Foundation, freely licensed under CC BY-SA 3.0.; “Stop surveillance poster” by Rich Black, CC BY 3.0.; “Princess of Asturias awards 2015 – Wikimedia España members, Wikipedia editors and representatives” by Ruben Ortega, freely licensed under CC BY-SA 3.0.; “2010 07 14820 5420 Amis Folk Center Art of Taiwan Cobblestones Taiwan” by Lord Koxinga, freely licensed under CC-by-SA 3.0.; Collage by Andrew Sherman.

For versions in other languages, please check the wiki version of this report, or add your own translation there!

by Andrew Sherman at November 12, 2015 08:23 PM

Wiki Education Foundation

Wiki Ed attending the National Women’s Studies Association conference this week

Educational Partnerships Manager, Jami Mathewson

At the end of 2014, Wiki Ed started an educational partnership with the National Women’s Studies Association (NWSA). The goal of our initiative was to get more women’s studies courses, as well as female students, involved in writing Wikipedia content as an assignment. Some estimates place Wikipedia’s contributors at 90% male. The high percentage of women at American and Canadian universities means our courses, in which 68% of the students are women, can help curb the prevalent systemic bias that accompanies a homogeneous contributor base. We devoted time and energy to this partnership because we expected students in those courses to help close important content gaps on Wikipedia.

During the current fall 2015 term alone, Wiki Ed is supporting 25 courses related to women and gender (for a list of relevant courses, see Wiki Ed’s Dashboard). For context, that’s 16% of all the courses and disciplines in the Classroom Program right now. So far, those students have edited more than 250 articles and have created more than 25, and much more student editing will take place in November and early December. I consider this a huge success, and I’m incredibly grateful for all the support the NWSA has given in encouraging their members to think of Wikipedia as an important medium that deserves to reflect the available academic literature related to women’s studies.

We will continue targeting this discipline and looking for more instructors to join our programs. This week, Outreach Manager Samantha Erickson will attend the National Women’s Studies Association’s annual conference. One of our long-time instructors, Dr. Jennifer Mikulay, will join Samantha to share her experiences assigning her students at an all-women’s college to illustrate Wikipedia. They will be looking for NWSA members who are eager to learn about Wikipedia’s role in shaping public knowledge, and I hope they will meet as many dedicated instructors as we have over the first incredible year of this partnership. If you’re attending NWSA’s meeting in Milwaukee, please look for Samantha and Jennifer in the following places to talk about how your students can make a huge impact on Wikipedia:

Exhibitor’s booth:
  • Friday, November 13: 9 a.m. – 7 p.m.
  • Saturday, November 14: 9 a.m. – 6 p.m.
“Using Wikipedia as a teaching tool in women’s studies classes” presentation:
  • Sunday, November 15: 9:30 a.m. – 10:45 a.m., Wisconsin Center 103C (LCD)

We look forward to seeing more growth in this partnership and in Wikipedia’s coverage of women.

Jami Mathewson
Educational Partnerships Manager

by Jami Mathewson at November 12, 2015 07:41 PM

November 11, 2015

Wikimedia Foundation

Content Translation helped create 30,000 new Wikipedia articles this year

Content Translation helps volunteers translate articles between different-language Wikipedias. Screenshot from “Content Translation Screencast” video. Screencast by Pau Giner, freely licensed under CC BY-SA 4.0.

Weekly article creation, deletion and in-progress trends for Content Translation for the English Wikipedia. Photo by Runa, freely licensed under CC0 1.0.

The number of articles created with the Content Translation tool recently crossed the 30,000 mark.[1] This tool is being used by more than 7000 editors to translate Wikipedia articles into many languages.

As per our recent observations, on average more than 1000 new articles are created each week using Content Translation. The number of articles deleted, as part of the normal article review process, comes to around 7% per week. Compared to new articles created using the usual editing tools, this figure is considerably lower. As the tool is designed to create a good article by reusing existing content (in another language), this is an encouraging outcome and supports the assumption that the initial translated version is good enough in content quality to merit retention.

Challenges about content syntax and improvement

Ever since the tool was first deployed as a beta feature in January 2015, the development team has made an active attempt to monitor the articles being created and examine how well they fit into the respective wikis, primarily in terms of their internal structure and code: categories, links, templates, footnotes, general wiki syntax handling, and so on. Content Translation, by its inherent nature, transforms text between diverse Wikipedia projects, and this can lead to issues caused by differences between the projects in the use of templates, references and markup.

As the article creation and deletion statistics demonstrate, the general observation is that the new articles appear to fit well. However, keeping the wiki syntax clean is a considerable challenge, and new issues are being uncovered through regular use of the tool. Over the year we fixed numerous bugs in the handling of categories, templates and footnotes, but we know that many still remain. We are thankful to the editors who report bugs in this area and help us understand and fix them.

Improvements to machine translation

Machine translation improvements have been a recurring request from many users of Content Translation ever since the tool was made available. Until recently, Apertium was the only MT service that was available for Content Translation. Since 4 November 2015, however, the Yandex machine translation service has been available for users of the Russian Wikipedia, where Content Translation is especially popular, and can be used when translating Wikipedia pages from English to Russian using Content Translation (see the announcement).

The translation service will be accessible via a freely available API, and the translated content returned by the service is freely licensed according to Wikipedia policy for use in Wikipedia articles. As the interaction between Content Translation and the translation service happens on the server side, no personal information from the user’s device is sent to Yandex. The translated content can be modified by users, just like usual content on wiki pages. The information about the modifications is also available publicly under a free license through the Content Translation API for anyone to develop and improve translation services (from university research groups and open source projects to commercial companies, anyone!). More information about this translation service is available on MediaWiki and in the Content Translation FAQ. For more details about the interactions between Content Translation and the Yandex translation service, please see this image.

Enhancements have also been made to the Apertium machine translation service. As a result of recent changes, eight new language pairs are now covered by Apertium. The new pairs, which also appear in the complete list, are:

  • Arabic – Maltese (both directions)
  • Breton – French
  • Catalan – Esperanto
  • French – Esperanto
  • Romanian – Spanish
  • Spanish – Esperanto
  • Spanish – Italian (both directions)
  • Swedish – Icelandic (both directions)

Upcoming plans and office hour

In our last update, we informed our readers about the article suggestion feature, which provides users with a list of articles that can be translated into a certain language. Sometime soon, it will be possible for users to create collections that can be used for translathons or similar shared editing activities. If you have participated in such an event or organized one that involved article creation through translations, we would like to learn from you (via this form) more details about how Content Translation’s article lists can support this activity.

Please join us for the next online office hour on 25 November 2015 at 1300 GMT. We welcome your comments and feedback on the Content Translation project talk page and Phabricator.

Runa Bhattacharjee, Amir Aharoni, Language Engineering, Wikimedia Foundation

by Runa Bhattacharjee and Amir E. Aharoni at November 11, 2015 09:02 PM

November 10, 2015

Wiki Education Foundation

First Featured Article from a Wiki Ed Visiting Scholar: Boroughitis

We’re excited to announce the first Featured Article resulting from a Visiting Scholar since the Wiki Education Foundation took on administering the program.

In mid-October, the Wikipedia article about Boroughitis was designated a “Featured Article” (FA) thanks to the work of George Mason University Visiting Scholar Gary Greenbaum.

Starting in the 1890s, the development of commuter suburbs in New Jersey led small, dissatisfied segments of existing townships to use an obscure law to break away and form separate boroughs. The trend was most pronounced in Bergen County, which today is still divided into 70 distinct municipalities (see the map). The phenomenon became so common that it became known as “boroughitis” or “borough mania.” As of 2014, New Jersey has 565 municipalities — more per capita than any other state.

Gary notes that he grew up in one of these boroughs, Woodcliff Lake, “established in 1894.”

Only about 0.1% of Wikipedia articles have the FA designation, which is reserved for content considered the best available on Wikipedia. Last year, Gary brought 17 articles to Featured Article status, including high-traffic subjects such as Babe Ruth, William H. Seward, and James A. Garfield.

The Visiting Scholars program connects experienced Wikipedia editors with academic libraries. For more information, see the Visiting Scholars page.

Ryan McGrady
Community Engagement Manager


Photos: Header image: “Bergen Passaic 1872“. Licensed under Public Domain via Wikimedia Commons. Map: “Bergen County, NJ municipalities labeled” by No machine-readable author provided. ChrisRuvolo assumed (based on copyright claims). – No machine-readable source provided. Own work assumed (based on copyright claims). Licensed under Public Domain via Wikimedia Commons.


by Ryan McGrady at November 10, 2015 05:00 PM

November 09, 2015

Wiki Education Foundation

The Roundup: Animal behavior

Students in Dr. Susan Alberts’ Evolution of Animal Behavior course at Duke University are learning about how animals solve problems. They’ve also been creating and expanding Wikipedia articles based on what they learn. These students are increasing the variety and quality of content related to animal behavior on Wikipedia.

Here are just a few examples of their work:

  • Students delved into the evolutionary reasons behind a phenomenon known as distraction display, in which a bird feigns a broken wing to draw a predator away. Students expanded the article from 362 to 1,339 words, and from 10 to 25 references.
  • Students expanded the article on communal roosting, a behavior commonly seen in bats. They turned a 50-word article without references into one with more than 2,200 words, and 21 references.
  • Outlining some of the evolutionary origins of predators, students created a new article, pursuit predation, consisting of 1,366 words and 17 references. We hope their predatory instincts kick in and they “pursue” this topic further.
  • Students expanded the article on Osteophagy, a condition in which herbivores with calcium and phosphate deficiencies eat bones. The students’ work, however, wasn’t deficient at all! They “boned up” the article from 38 words and 1 reference to 862 words and 12 references.

Thanks to these student editors for their great contributions to Wikipedia!

Photo: “Pronghorn run – Flickr – USDAgov” by Mark Gocke/USDA. Licensed under CC BY 2.0 via Wikimedia Commons.

by Eryk Salvaggio at November 09, 2015 04:59 PM

Tech News

Tech News issue #46, 2015 (November 9, 2015)


November 09, 2015 12:00 AM

November 08, 2015

Jeroen De Dauw

Rewindable PHP Generators

Today I was refactoring some code in one of my libraries and ended up replacing a named Iterator class with a Generator. To my surprise this changed behaviour, which I noticed thanks to a broken test: one verifying that I could iterate through the iterator multiple times. Good thing I wrote that! And so I found out that you cannot rewind a Generator once it has started running. By extension, you can iterate through it only once. If you try to rewind it or iterate a second time, you’ll get an Exception. Gah!

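$generator = $myGeneratorFunction();
iterator_to_array($generator); // the first pass works
iterator_to_array($generator); // boom!

$generator = $myGeneratorFunction();
iterator_to_array($generator); // the first pass works
$generator->rewind(); // boom!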

Luckily, this is trivial to get around. Just make an Iterator that recreates the Generator on rewind and delegates to it. I’ve created an (extremely) small library that holds such an adapter, so I can tackle this problem without cluttering my other libraries with such a random infrastructure concern. You can of course use it as well if you want.

$generator = new RewindableGenerator( $myGeneratorFunction );
iterator_to_array($generator); // works as expected
$generator->rewind(); // works as expected

The library is called Rewindable Generator. You can install it with Composer using the package name jeroen/rewindable-generator.
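
The adapter idea described above (recreate the Generator on rewind and delegate to it) can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the class name RecreatingIterator is hypothetical, and the sketch assumes PHP 8.0 or later for the mixed return type.

```php
<?php

// Hypothetical adapter for illustration only; not the class shipped in
// jeroen/rewindable-generator. Assumes PHP 8.0+ (mixed return type).
final class RecreatingIterator implements Iterator
{
    /** @var callable Returns a fresh Generator each time it is called. */
    private $generatorFunction;

    /** The Generator currently being delegated to, if any. */
    private ?Generator $generator = null;

    public function __construct(callable $generatorFunction)
    {
        $this->generatorFunction = $generatorFunction;
    }

    // Instead of rewinding the old Generator, discard it and build a new one.
    public function rewind(): void
    {
        $this->generator = ($this->generatorFunction)();
    }

    public function current(): mixed
    {
        return $this->inner()->current();
    }

    public function key(): mixed
    {
        return $this->inner()->key();
    }

    public function next(): void
    {
        $this->inner()->next();
    }

    public function valid(): bool
    {
        return $this->inner()->valid();
    }

    // Lazily create the Generator if a method is called before rewind().
    private function inner(): Generator
    {
        if ($this->generator === null) {
            $this->rewind();
        }

        return $this->generator;
    }
}

$letters = new RecreatingIterator(function () {
    yield 'a';
    yield 'b';
});

$firstPass = iterator_to_array($letters);
$secondPass = iterator_to_array($letters); // no exception: rewind() built a fresh Generator
```

Note that the adapter takes the generator function rather than a Generator instance; that is what lets it start over. The trade-off is that the function body runs again on every pass, so any side effects or expensive work inside it are repeated.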

by Jeroen at November 08, 2015 08:25 AM

November 07, 2015

Weekly OSM

weekly 276


New OSM Carto Map Style at osm.org [1] Data: OpenStreetMap-Contributors, Image CC-BY-SA 2.0 OpenStreetMap.org


  • Revision 8969 of JOSM has introduced a new behaviour. When the user splits a way, it prompts the user to choose the part of the way that should inherit the history.
  • Mapbox published a map indicating the lack of footpaths parallel to roads. There is also a blog post and two Mapbox Github issues with more information. The source code of the project is public.
  • Mapbox announced three million square kilometres of the freshest, highest resolution 30 cm imagery available from DigitalGlobe. Mapbox CEO Eric Gundersen writes: “And of course it’s all 100% open for tracing in OpenStreetMap.”
    weeklyOSM says: Thank you, Mapbox! :)
  • Martijn van Exel blogs about the new JOSM plugin for fixing missing and wrong one-way streets. Just a month after its release, 20% of the missing road tiles have already been resolved.
  • Pierre Béland has created the NoFeature style for JOSM. It highlights objects that have no “major tag” relating them to an OSM feature such as highway, building, natural, landuse, amenity, etc.
  • Karussell (the developer of GraphHopper) raised the problem of parsing units in OpenStreetMap tags such as maxweight. Many comments and suggestions on this have been posted in his diary.


  • escada’s diary entry “A Mapper in the Spotlight: Clifford Snow” shares Clifford Snow’s experience in OSM mapping and discusses his achievements and contributions to OSM. Here is Clifford’s heat map.

  • bdiscoe shares his ideas for making a better heat-map for highlighting the contributions of the mappers.
  • Belgian Mapper of the Month: Olivier Roussel. Here escada writes about Olivier’s contributions and experience with OpenStreetMap.
  • Joost Schouppe shares his three week road trip experience with Mapillary.
  • Open geo interview with Gregory Marler (aka Living with Dragons), the man who almost single-handedly mapped Durham, England in OpenStreetMap.
  • Andy Allan points out in his blog that most developers contribute to the OSM project for less than a year (similar to the pattern observed amongst mappers).


  • Planning (English) has started for the Chemnitz Linux Days (Chemnitzer Linux-Tage), to be held on 19 and 20 March 2016. There is also a call for presentations.

Humanitarian OSM

  • Camilla Mahon reported on the Mapbox blog that Mapbox worked with DigitalGlobe to get up-to-date satellite imagery for the earthquake-affected areas in Eastern Afghanistan. Mikel Maron writes in his blog about the participation of Mapbox’s data team, co-led by Maning Sambale, in the local HOT activities.
  • The Daily Utah Chronicle reports on geography students participating in the Hurricane Patricia crisis response mapping. See also a blog post by Mapbox about the hurricane and the related HOT activities, and this report in the Salt Lake Tribune, which also covers the Afghanistan earthquake response.
  • In a video presentation, Mikel Maron outlines the key events and discussions in the evolution of HOT from 2005-2010.
  • User “dekstop” tries to quantify HOT participation inequality. It’s complicated!


  • OpenRailwayMap now visualises a significant number of Austrian railway signals.
  • [1] OSM Carto 2.36.0 was released and has been used since October 30th at OSMF’s tile-servers, which provide the tiles at osm.org. The new colour scheme was developed during the Google Summer of Code by Mateusz Konieczny. One very notable change will be the new colour style of the highways. See also the posting at OSMF blog, the mailing lists Talk, Talk-de (English) and at the German forum (English). Some of the discussion at Talk-gb is about creating a tile-server serving tiles with the old colour scheme. Richard Fairhurst notes that actually maintaining such a tile-server (supporting a layer displayed at OpenStreetMap.org) would need a lot of time and effort.
  • Within the framework of “Arriving in Berlin”, four refugees and residents at the Haus Leo (English) created a map of Berlin for newcomers showing contact points such as aid organizations, offices, doctors, free Wi-Fi and police. The project coordinator, Ralf Rebmann, says: “This mapping project, which was implemented with uMap, is a collaboration between the Berliner Stadtmission and the Haus der Kulturen der Welt and it would not have been possible in this form without OpenStreetMap.”

Open Data


  • OpenGIS.ch is crowdfunding a QGIS 2.5D rendering. It still needs about 4,500 €, ~5,000 US$, ~3,200 £.
  • Based on the GraphHopper routing engine, a new API for route optimization is available, currently in beta testing.
  • The navigation app “EB Dirigo” from Elektrobit is now available as an APK download (English) and on Google Play. The app is “free”, but not “Free”.
  • Simon Poole forked the JOSM standard template and added other often used tags, amongst other things, amenity=social_facility. This template is also being used in Vespucci.
  • BRouter version 1.3.2 was released on November 1st, 2015.
  • GeometaLab of Hochschule Rapperswil, Switzerland announced: “We’re actually evaluating the many @Docker versions of #Nominatim #DockerHub and analyzing performance bottlenecks!”
  • The Node.js module csvgeocode can convert a CSV list of addresses into geographic coordinates. Services like Google, Mapbox, OSM Nominatim, Mapzen or Texas A&M’s geocoder can be used as backends.
  • Yuri Astrakhan asks on the dev mailing list about the use of vector tiles and triggers a discussion well worth reading.
  • SQLite version 3.9.2 was released on November 2nd, 2015.
  • TobWen raises an issue about the rendering of Mapnik 3 compared with its predecessor Mapnik 2.x.
  • Do be careful with the choice of the names for your variables!
  • Releases

    Software          Version   Release Date   Comment
    Route Converter   2.17      2015-10-23
    JOSM              8969      2015-10-30
    BRouter           1.3.2     2015-11-01
    SQLite            3.9.2     2015-11-02
    Atlas             1.1.17    2015-11-02

Did you know …

  • Dave Hansen regularly creates “Whole US” Garmin map downloads optimised by location, level of detail and the size of the memory card you’re using. Dave informs the talk-us mailing list when updates are available. The notification contains the download details and a short FAQ.

Other “geo” things


weeklyOSM is brought to you by … 

by Jinal Foflia at November 07, 2015 05:32 AM

Wiki Education Foundation

Welcome, Tanya!

Tanya I. Garcia, Ph.D.

I’m pleased to introduce Tanya I. Garcia as the Wiki Education Foundation’s new Director of Programs!

Tanya has a deep understanding of the U.S. higher education system. She has nine years of state policy experience, seven years with national organizations, five years of institutional experience, and two years working in the philanthropic sector. She’s served as State Policy Officer for Lumina Foundation, Senior Policy Analyst for the State Higher Education Executive Officers (SHEEO) Association, and Policy Analyst at the New Mexico Higher Education Department (NMHED).

As Director of Programs, Tanya will direct the program team (Community Engagement, Classroom Program, and Educational Partnerships) to achieve high impact while also driving the long-term strategic vision of our organization’s programmatic work. We are excited to work with her to scale and improve our current programs.

In her private life, Tanya enjoys cooking and eating food from almost every continent, reading short stories, biking with abandon, and traveling to expand her comfort zone.

Welcome, Tanya!

Frank Schulenburg
Executive Director

by Frank Schulenburg at November 07, 2015 12:48 AM

November 05, 2015

Wikidata (WMDE - English)

Q167545: Wikidata celebrated its third birthday

Wikidata celebrated its third birthday on October 29th. The project went online in 2012 and a lot has happened since.

As it happened, the birthday coincided with the project being awarded a prize by Land der Ideen, so a proper party for volunteers and everyone involved with the project was in order.

There were cake and silly birthday hats, but above all this was an occasion to look at the past, present, and future.

Denny Vrandečić and Erik Möller used a video message to talk about the genesis and development of Wikidata.

Community members Magnus Manske and Maarten Dammers talked about their work for Wikidata in GLAM and science. And Lydia Pintscher not only looked back on a successful year behind us, but also gave us a peek into the future that lies ahead for the project.

In order to experience Wikidata there was a little exhibition of projects that use it: From Histropedia which visualizes timelines to Ask Platypus, a project that parses questions about the knowledge of the world according to Wikidata using natural language.

No birthday would be complete without presents. The software developers in particular had worked hard to improve parts of Wikidata for this special date. To give you just two examples:

  • https://www.wikidata.org/wiki/Special:Nearby shows nearby items in Wikidata and invites you to improve structured data knowledge in your neighborhood
  • A machine learning model called ORES helps to identify vandalism and can be used as a tool by administrators

These are only two new features released for the birthday party. There is much, much more to come for the Wikidata project next year and we’ll talk about it at length in another post.

Wikidata has data in its name. However, and this was more than obvious at the birthday party, it’s about more than just cold numbers. As in all collaborative projects, people are at the core of it all. Those behind or around Wikidata have love in their hearts for something that may at first sound as abstract as “structured data for Wikimedia projects and beyond”.

Upon exiting the party, guests could add themselves to a board and leave a tiny love letter to Wikidata. “I love Wikidata because… with machine-readable data, machines can do the heavy lifting for me,” one guest wrote. The last three years were all about building a foundation for machine-readable data. Let the heavy lifting begin in all the years to come. Q167545!

by Jens Ohlig at November 05, 2015 11:09 AM

Wikimedia Foundation

News on Wikipedia: Editors document Kansas City Royals’ World Series win

The Royals celebrate after winning the 2015 #WorldSeries.
The World Series was first played in 1903; this year’s contest was the 111th time it has been played. Photo by Arturo Pardavila III, freely licensed under CC BY 2.0

Yesterday, the Kansas City Royals defeated the New York Mets in the 111th annual World Series, Major League Baseball’s best-of-seven championship series. It is the most prestigious competition on the baseball calendar in the United States, and attracted capacity crowds throughout all of its five games as the Royals emerged 4–1 victors—their first World Series title in thirty years.

The Royals qualified as the winners of the American League, while the Mets took their place thanks to winning the National League. The Mets beat the Chicago Cubs to win the right to play in the World Series, upsetting many fans of Back to the Future Part II, the 1989 film that predicted the Cubs would win the whole thing in 2015. It also disappointed Cubs fans who hoped their team would break through and appear in their first World Series since 1945 and win for the first time since 1908.

Wikipedia’s article on the 2015 World Series was viewed almost 44,000 times on the day of the final, and ultimately deciding, game at Citi Field in New York City. The Royals’ starting pitcher was Edinson Volquez, whose father had died days earlier. He gave up two runs to the Mets, but the Royals scored two runs of their own in the nominal final inning to tie the game. In extra innings, they scored a crucial five additional runs to cement their victory in the game and series.

They are the first team in over two decades to win a World Series after losing in the previous year’s World Series.

The Most Valuable Player, an award given to the best player from either side, went to the Royals’ catcher Salvador Pérez. Pérez is a three-time MLB All-Star who had a .364 batting average during the series—including two important hits in the final game—and directed the Royals’ pitching staff through all five games.

Edits made per hour to the “2015 World Series” article throughout the World Series itself. Photo by Joe Sutherland, CC BY-SA 2.0.

Among the four WikiProjects covering the article was WikiProject Baseball, a task force with more than 63,000 articles under its jurisdiction. 268 of these articles are classed as “featured”, the highest honour that can be bestowed upon an article by the community.

The article itself has been edited almost 1,200 times since its creation; as with those on most current sporting events, the vast majority of these were made during the event itself. 818 edits were made during the World Series (between October 27 and November 3, in UTC), almost three quarters of all the edits.

The community also had to contend with vandalism to articles related to the World Series finalists during their run-up to the final. During the Mets’ win in the National League Division Series against the LA Dodgers, New York’s Rubén Tejada had his leg broken in a slide by opposing second baseman Chase Utley, an incident that resulted in vandalism to Utley’s Wikipedia article from anonymous editors. These changes, however, lasted only minutes; they were swiftly reverted by various members of WikiProject Baseball, including EricEnfermero and Dodgers fan Spanneraol.

Other headlines


The aircraft, EI-ETJ, flown during Kogalymavia Flight 9268. Photo by Sergey Korovkin 84, freely licensed under CC BY-SA 4.0

  • Kogalymavia Flight 9268, a chartered passenger flight from Sharm el-Sheikh International Airport in Egypt to Pulkovo Airport in Saint Petersburg, Russia, crashed in northern Sinai on October 31. The cause of the crash is still under investigation, with the airline, Metrojet, ruling out a technical fault and suggesting external factors are to blame. It is the deadliest air crash in the history of Russian aviation, killing all 224 on board.
  • The Justice and Development Party of Turkey (AKP), led by Ahmet Davutoğlu, regained its parliamentary majority following a snap general election in the country on November 1. Turnout was high at 85.2 percent, with AKP winning 317 of the 550 available seats; it also received almost half of the total public vote. It means the previous parliament is officially the shortest in the Grand National Assembly’s history at just five months.
  • On November 3, Michelle Payne of Australia became the first female jockey to win the Melbourne Cup. She and her horse, Prince of Penzance, beat out Max Dynamite (ridden by veteran jockey Frankie Dettori) to win the 155th edition of the annual horse race.
  • A nightclub fire in Bucharest, Romania, killed 32 people and injured a further 179 on October 30. The fire was thought to be caused by pyrotechnics set off by a heavy metal band playing in the club on the night.
  • On October 29, Saudi blogger Raif Badawi was awarded the 2015 Sakharov Prize for his work in defense of freedom of thought and human rights. He is the creator of the website Free Saudi Liberals, a forum discussing a number of issues related to religion in his home country. He is currently serving a sentence imposed by a Saudi high court in punishment for several crimes, including apostasy.

Joe Sutherland
Communications Intern
Wikimedia Foundation

by Joe Sutherland at November 05, 2015 12:18 AM

November 04, 2015

Wikimedia Foundation

Wikimedia Research Newsletter, October 2015

Vol: 5 • Issue: 10 • October 2015

Student attitudes towards Wikipedia; Jesus, Napoleon and Obama top “Wikipedia social network”; featured article editing patterns in 12 languages

With contributions by: Jonathan Morgan, Morten Warncke-Wang, Piotr Konieczny, and Tilman Bayer

Students value Wikipedia both for quick answers and for detailed explorations

Reviewed by Jonathan Morgan

This paper[1] reports findings from a survey of Norwegian secondary school students about their use of Wikipedia in the context of their coursework. The survey of 168 students between the ages of 18 and 19 consisted of 33 Likert scale questions and two free response questions. The goal was to assess how Wikipedia figured into students’ literacy practices, a concept that encompasses students’ and teachers’ attitudes towards the resources they use to learn and the social context in which they engage with those resources, as well as the process by which they read, remember, and understand the information provided by each resource.

The main finding of the study is that students’ attitudes towards Wikipedia are overwhelmingly positive, but they find the information presented in Wikipedia less trustworthy than their official course materials. Although 90% of respondents rated their textbooks as more trustworthy, they cited the ease of finding factual information (such as dates, names, etc) as a key reason for preferring Wikipedia. They also reported that Wikipedia was better than their textbooks at explaining the “big picture” of a given topic, as well as facilitating more in-depth exploration. In the words of one survey respondent: “If you need to, you can read elaborations about a given topic, or you can just read the summary if that is what you need.”

These findings suggest that the primary advantage that Wikipedia offers to students is its flexibility: it allows students to find quick answers and more detailed accounts with equal ease. The findings also suggest that both students and teachers would benefit from a better understanding of how to critically evaluate the quality of information presented in Wikipedia and other open online information resources.

The study also confirmed findings from previous studies: that the vast majority of students use Wikipedia to supplement their official course resources (textbooks, etc), that most of them access Wikipedia via Google search, and that English-speaking students tend to seek information on the English-language Wikipedia first, regardless of their first language or national origin.

Jesus, Napoleon, and Obama top the “Wikipedia social network”

Reviewed by Piotr Konieczny

A (conference?) paper titled “Beyond Friendships and Followers: The Wikipedia Social Network”[2] applies social network theory to the analysis of relationships between the subjects of Wikipedia biographical articles. Using Wikidata and Wikipedia metadata, the authors produce a number of findings. Some of them will not be unexpected to readers, such as that “By far the largest occupational groups are politicians and football players”, or “The page with the most mentions of persons is Rosters of the top basketball teams in European club competitions” (with 4,694 mentions of 1,761 different persons). The most referenced persons are Jesus and Napoleon, followed by Barack Obama, Muhammad, Shakespeare, Adolf Hitler, and George W. Bush. Over four fifths of the links in Wikipedia are to male persons, which roughly reflects the gender distribution of Wikipedia biographies; a similar distribution confirms that most of the biographies focus on the 19th and 20th centuries. The authors, however, do not dwell on the social science implications of their findings, but merely suggest that their tool can be used to refine Wikipedia categories and disambiguation tools. The findings are interesting from the perspective of alternate approaches to categorization, as they may suggest possible new categories that haven’t yet been created by human editors, and perhaps provide a mathematical model of how Wikipedia categories can be created.

“Exploration of Online Culture Through Network Analysis of Wikipedia”

Reviewed by Piotr Konieczny

This paper[3] also uses social network theory, as well as Hofstede’s cultural dimensions theory, Schwartz’s Theory of Basic Human Values, and McCrae’s Five factor model of personality, to ask research questions about the concept of online culture; in particular, whether it is universal or differs for various national cultures. It focused on 72 Featured Articles in 12 languages (unfortunately, the authors do not explain any reasons for choosing those particular 12 languages over the others); discounting bots, the authors analyzed more than 150,000 editors and 250,000 edits. The authors find that most Wikipedia edits are what they call self-loops, or individual editors making edits to the same articles they have edited before, without their editing being interrupted by edits by another editor. They fail to make any comment on what that really means for the vision of Wikipedia as a collaborative environment. The authors find significant differences in editing patterns between certain Wikipedia projects, though this reviewer finds the description of said differences (focusing on a case study of one Japanese and one Russian article) rather curt. Similarly, their discussion of how the results fit (or don’t) with the established theories of Hofstede and others is interesting, but rather short; that unsatisfying brevity may however be due to editorial requirements (the entire paper is only 3.5k words long, instead of the more common average of about 8k). The authors conclude that “new dimensions of online culture can be explored from directly observed online behavior”, something that one hopes they’ll revisit themselves, together with their dataset, in a longer paper that will do proper justice to it.


Vandalism detection research neglects smaller languages

Reviewed by Morten Warncke-Wang

A paper at the 19th International Conference on Circuits, Systems, Communications and Computers (CSCC)[4] provides an overview of research on vandalism detection in Wikipedia, with a focus on the usage of machine learning. One of the paper’s conclusions is that future research should aim for language independence, as little progress has been made outside of the English, German, French, and Spanish Wikipedia editions.

Automatic quality assessment using the “collaboration network”

Reviewed by Morten Warncke-Wang

“Measuring Article Quality in Wikipedia Using the Collaboration Network”[5] is a paper that proposes an improved model of co-authorship to be used in predicting the quality of Wikipedia articles. Trained on a stratified sample of articles from the English Wikipedia, it is shown to outperform several baselines. Unfortunately, the dataset used for evaluation omits Start-class articles for no apparent reason, and uses the latest revision of an article, which might differ considerably from the revision that received the quality rating.

Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.


Mean amount of content added per edit, per editor’s experience level (illustration from “607 Journalists”)

  • “607 Journalists: An evaluation of Wikipedia’s response to and coverage of breaking news and current events”[6] See also blog post
  • “Wiki is not paper: Fixing and breaking the ‘news’ on Wikipedia”[7] From the abstract: “The case studies include the “Barack Obama” article, which is used to investigate the establishment and maintenance of the “fact” that Obama is described as an ‘African American,’ despite his mixed-race heritage. … The second case study uses the article on the 2008 war in the Georgian province of South Ossetia to investigate the transnational and transcultural pitfalls of ‘bias’ in the writing of a ‘neutral’ article. The final case examines the decision to publish controversial material by examining the article on the 2006 Muhammad cartoons controversy. This article was crucial on Wikipedia in establishing the protocol in publishing such images.”
  • “User interaction with community processes in online communities”[8] From the abstract: “We find that articles that are deleted from Wikipedia differ from those that are not in many significant ways. We also find, however, that most deleted articles are deleted extremely hastily, often before they have time to develop. We use our data to create a model that can predict with high precision whether or not an article will be deleted. … We propose to deploy a system utilizing this model on Wikipedia as a set of decision-support tools to help article creators evaluate and improve their articles before posting. … English Wikipedia’s Articles for Creation provides a protected space for drafting new articles, which are reviewed against minimum quality guidelines before they are published. We explore the possibility that this drafting process, which is intended to improve the success of newcomers, in fact decreases newcomer productivity in English Wikipedia, and offer recommendations for system designers.”
  • “Detecting Vandalism on Wikipedia across Multiple Languages”[9]
More recent publications
  • “Spillovers in Networks of User Generated Content: Pseudo-Experimental Evidence on Wikipedia”[10] From the abstract: “[On the German Wikipedia, the featuring of an article on the main page does] affect neighboring articles substantially: Their viewership increases by almost 70 percent. This, in turn, translates to increased editing activity. Attention is the driving mechanism behind views and short edits. Both outcomes are related to the order of links, while more substantial edits are not.” See also by the same author: “Spillovers in Networks of User Generated Content”
  • “Peer Effects in Collaborative Content Generation: The Evidence from German Wikipedia”[11] From the abstract: “editors who contribute to the same articles and exchange comments on articles’ talk pages work in collaborative manner sometimes discussing their work. They can, therefore, be considered as peers, who are likely to influence each other. In this article, I examine whether peer influence, measured by the average amount of peer contributions or by the number of peers, yields spillovers to the amount of individual contributions.”
  • “Wikipedia Page View Reflects Web Search Trend”[12] (see also datasets, slides) From the abstract: “We found frequently searched keywords to have remarkably high correlations with Wikipedia page views.”
  • “Wikipedia edition dynamics”[13] From the abstract: “It is argued that the probability to edit is proportional to the editor’s number of previous editions (preferential attachment), to the editor’s fitness and to an ageing factor.” See also by the same authors: “The dynamic nature of conflict in Wikipedia”
  • “Cultural Similarity, Understanding and Affinity on Wikipedia Cuisine Pages”[14] See also “Mining cross-cultural relations from Wikipedia – A study of 31 European food cultures”
  • “The influence of network structures of Wikipedia discussion pages on the efficiency of WikiProjects”[15] From the abstract: “The evaluation suggests that an intermediate level of cohesion with a core of influential users dominating network flow improves effectiveness for a WikiProject, and that greater average membership tenure relates to project efficiency in a positive way.”
  • “Technological Nudges and Copyright on Social Media Sites”[16] From the abstract: “Using an adapted taxonomy, this article identifies the technological features on predominant social media sites—Facebook, YouTube, Twitter and Wikipedia—that encourage and constrain users from engaging in generative activities. Notwithstanding the conflicting narrative painted by recent litigation around copyright in relation to content on social media sites, I observe that some of the main technological features on social media sites are designed around copyright considerations.” (However, the paper never mentions that Wikipedia’s content is under a free license.) “In contrast to the other social media sites, I note that Wikipedia does not allow its users to comment on content; hence there is little room for this alternative form of modification.”
  • “The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction”[17]
  • “Students’ use of Wikipedia as an academic resource — Patterns of use and perceptions of usefulness”[18] (survey of 1658 undergraduate students) From the abstract: “87.5% of students report using Wikipedia for their academic work, with 24.0% of these considering it ‘very useful’. Use and perceived usefulness of Wikipedia differs by students’ gender; year of study; cultural background and subject studied. Wikipedia mainly plays an introductory and/or clarificatory role in students information gathering and research.”
  • “Snooping Wikipedia Vandals with MapReduce”[19] From the abstract: “[Using] MapReduce … we are able to explore a very large dataset, consisting of over 5 millions articles [actually pages on enwiki, including non-articles] collaboratively edited by 14 millions authors, resulting in over 8 billion pairwise interactions. We represent Wikipedia as a signed network, where positive arcs imply constructive interaction between editors. We then isolate a set of high reputation editors (i.e., nodes having many positive incoming links) and classify the remaining ones based on their interactions with high reputation editors.”
  • “An agent-based model of edit wars in Wikipedia: How and when consensus is reached”[20] From the abstract: “We show that increasing the number of credible or trustworthy agents and agents with a neutral point of view decreases the time taken to reach consensus, whereas the duration is longest when agents with opposing views are in equal proportion.” See also last issue’s review of a different numerical model of edit wars: “More newbies mean more conflict, but extreme tolerance can still achieve eternal peace”


  1. Blikstad-Balas, Marte (2015). ““You get what you need” : A study of students’ attitudes towards using Wikipedia when doing school assignments”. Scandinavian Journal of Educational Research 3831 (October): 1–15. Closed access
  2. Johanna Geiß, Andreas Spitz, Michael Gertz: Beyond Friendships and Followers: The Wikipedia Social Network PDF
  3. Park Sung Joo, Kim Jong Woo, Lee Hong Joo, Park Hyunjung, Han Deugcheon, and Gloor Peter. Exploration of Online Culture Through Network Analysis of Wikipedia. Cyberpsychology, Behavior, and Social Networking, ahead of print. DOI:10.1089/cyber.2014.0638 Closed access
  4. Hamiti, Mentor; Susuri, Arsim; Dika, Agni. “Machine Learning and the Detection of Anomalies in Wikipedia” (PDF). Proceedings of the 19th International Conference on Circuits, Systems, Communications and Computers. 
  5. de La Robertie, Baptiste; Pitarch, Yoann; Teste, Olivier. “Measuring Article Quality in Wikipedia Using the Collaboration Network” (PDF). 
  6. Joseph R. B. Sutherland: 607 Journalists: An evaluation of Wikipedia’s response to and coverage of breaking news and current events. Dissertation, Aberdeen Business School – Robert Gordon University, April 2015 PDF
  7. Lyons, J. Michael: Wiki is not paper: Fixing and breaking the “news” on Wikipedia. Dissertation, Indiana University, 2015, 206 pages; [1] Closed access
  8. Gelley, Shoshana Bluma. User interaction with community processes in online communities. Dissertation, Polytechnic Institute of New York University, 2015 [2] Closed access
  9. Khoi-Nguyen Dao Tran: Detecting Vandalism on Wikipedia across Multiple Languages. Thesis submitted for the degree of Doctor of Philosophy, The Australian National University, May 2015 PDF
  10. Kummer, Michael E. (2014-12-29). Spillovers in Networks of User Generated Content: Pseudo-Experimental Evidence on Wikipedia. Rochester, NY: Social Science Research Network. 
  11. Olga Slivko: Peer Effects in Collaborative Content Generation: The Evidence from German Wikipedia. Discussion Paper No. 14-128, Centre for European Economic Research (ZEW). December 22, 2014, updated March 3, 2015 PDF
  12. Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. The 2015 ACM Web Science conference (WebSci15). Oxford, UK, June 28 – July 1, 2015. Authors’ copy
  13. Gandica, Y.; F. Sampaio dos Aidos, J. Carvalho (2014-12-30). “Wikipedia edition dynamics”. arXiv:1412.8657 [physics]. 
  14. Paul Laufer: Cultural Similarity, Understanding and Affinity on Wikipedia Cuisine Pages. Master Thesis, TU Graz, August 2014 PDF
  15. Xiangju Qin, Pádraig Cunningham, Michael Salter-Townshend: The influence of network structures of Wikipedia discussion pages on the efficiency of WikiProjects. Social Networks Volume 43, October 2015, Pages 1–15 DOI:10.1016/j.socnet.2015.04.002 Closed access
  16. Tan Ms, Corinne (2015). “Technological Nudges and Copyright on Social Media Sites”. Intellectual Property Quarterly (1): 62–78. 
  17. Grundkiewicz, Roman; Junczys-Dowmunt, Marcin (2014-09-17). “The WikEd Error Corpus: A Corpus of Corrective Wikipedia Edits and Its Application to Grammatical Error Correction”. In Adam Przepiórkowski, Maciej Ogrodniczuk (eds.). Advances in Natural Language Processing. Lecture Notes in Computer Science. Springer International Publishing. pp. 478–490. ISBN 978-3-319-10888-9.  Closed access
  18. Neil Selwyn, Stephen Gorard: Students’ use of Wikipedia as an academic resource — Patterns of use and perceptions of usefulness. The Internet and Higher Education, Volume 28, January 2016, Pages 28–34 DOI:10.1016/j.iheduc.2015.08.004 Closed access
  19. Michele Spina, Dario Rossi, Mauro Sozio, Silviu Maniu, Bogdan Cautis: Snooping Wikipedia Vandals with MapReduce. 2015 IEEE International Conference on Communications (ICC), DOI:10.1109/ICC.2015.7248477. PDF (authors’ copy)
  20. Arun Kalyanasundaram, Wei Wei, Kathleen M. Carley, James D. Herbsleb: An agent-based model of edit wars in Wikipedia: How and when consensus is reached. Proceedings of the 2015 Winter Simulation Conference, L. Yilmaz, W. K V. Chan, I. Moon, T. M. K. Roeder, C. Macal, and M. D. Rossetti, eds. PDF.

Wikimedia Research Newsletter
Vol: 5 • Issue: 10 • October 2015
This newsletter is brought to you by the Wikimedia Research Committee and The Signpost

by Tilman Bayer at November 04, 2015 07:59 AM

November 03, 2015

Semantic MediaWiki

Semantic MediaWiki 2.3 released

October 25 2015. Semantic MediaWiki 2.3, the next version after 2.2, has now been released. This new version brings many improvements and new features, such as improvements to SPARQLStore, which has now reached full feature parity with the SQLStore. It also provides improvements to the "rebuildData.php" script. The search patterns for datatypes "telephone" and "e-mail" were extended and comparator support was added for datatype "date". A very important though still experimental feature of this version is query dependency detection, which allows queries to be updated automatically. See also the release page for information on further improvements and new features. Additionally this version fixes a lot of bugs and brings stability and performance improvements. Automated software testing was further expanded to assure software stability. See the page Installation for details on how to install and upgrade.

by Kghbln at November 03, 2015 08:47 PM

Wikimedia Tech Blog

Your October milestones include Wikidata’s 15 millionth item

Wikidata hit 15 million items this month, not long before its third birthday celebrations in Berlin. Photo by Jason Krüger, freely licensed under CC BY-SA 4.0.

While the English Wikipedia’s 5 million article milestone just missed the cut, several other Wikimedia projects celebrated milestones of their own in the month of October.

On October 27, Wikidata hit 15 million items—and two days later celebrated its third birthday, with celebrations in Berlin. It’s now the third most-active Wikimedia project behind the English Wikipedia and Wikimedia Commons, with around 5,900 active users in June 2015.

Several language editions of Wikipedia crossed major article milestones—on October 12, the Georgian Wikipedia became the 54th project to reach 100,000 articles, and on the 29th, the Azerbaijani Wikipedia joined them in that club as the 55th.

Early in the month, the Catalan Wiktionary hit the 150,000 entry milestone, while the Hungarian Wiktionary reached 300,000 entries towards the end of October—only the 18th Wiktionary to reach this particular milestone.

Bots again played a role in growing some of the smaller Wikimedia projects this month, with the Min Nan Wikipedia growing by ten thousand articles in just eight days.

Other selected milestones

October 3
The Catalan Wiktionary has reached 150,000 entries.
The Serbian Wiktionary has reached 40,000 entries, as a bot has added over 5,000 entries in the last 24 hours.

The Maithili Wikipedia has reached 1,000 articles.

The Oriya Wikipedia has reached 10,000 articles.

The Tamil Wikipedia has reached 70,000 articles.
The Min Nan Wikipedia has reached 50,000 articles.[1]

The Hindi Wikibooks has reached 200 book modules.

The Alemannic Wikipedia has reached 20,000 articles.
The Georgian Wikipedia has reached 100,000 articles.

The Min Nan Wikipedia has reached 60,000 articles.[1]

The Asturian Wikipedia has reached 30,000 articles.
The Emilian-Romagnol Wikipedia has reached 5,000 articles.

The Japanese Wikisource has reached 5,000 text units.

The Sardinian Wikipedia has reached 5,000 articles.
The Croatian Wiktionary has reached 30,000 entries.

The Hungarian Wiktionary has reached 300,000 entries.
Wikidata has reached 15,000,000 items.

The Azerbaijani Wikipedia has reached 100,000 articles.

Joe Sutherland
Communications intern
Wikimedia Foundation


  1. This is primarily due to bot activity.

by Joe Sutherland at November 03, 2015 07:02 PM


MassAction MediaWiki extension

MassAction is a MediaWiki extension that allows users to perform mass actions on targets through a special page, making use of the job queue. Its development started at some point in 2014 and a very rough experimental version is now available. Below are the basics.

Basic Concepts

  • Tasks are individual mass actions, made up of smaller actions that are applied to multiple targets using matchers. For example: replace the word ‘hello’ with ‘goodbye’ on all wiki pages that have a title containing the word ‘Language’ and are not redirects.
  • Actions are processes that can be applied to Targets to alter some of the data that they contain, for example, change the title (move), change an article text etc.
  • Matchers or Filters are sets of rules that are used to match certain targets, for example all articles that contain the word ‘hello’.
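
To make the three concepts concrete, here is a minimal Python sketch of the example task given in the first bullet (replace ‘hello’ with ‘goodbye’ on non-redirect pages whose title contains ‘Language’). The names Page, Task, matchers and actions are invented for this illustration; they are not the extension’s actual PHP classes or database schema.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for a target (here: a wiki page)."""
    title: str
    text: str
    is_redirect: bool = False


# A matcher decides whether a target is in scope; an action returns a changed copy.
Matcher = Callable[[Page], bool]
Action = Callable[[Page], Page]


@dataclass
class Task:
    """A mass action: every action is applied to every target that passes all matchers."""
    matchers: List[Matcher] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)

    def find_targets(self, pages: List[Page]) -> List[Page]:
        return [p for p in pages if all(m(p) for m in self.matchers)]

    def apply(self, page: Page) -> Page:
        for action in self.actions:
            page = action(page)
        return page


task = Task(
    matchers=[
        lambda p: "Language" in p.title,
        lambda p: not p.is_redirect,
    ],
    actions=[
        lambda p: Page(p.title, p.text.replace("hello", "goodbye"), p.is_redirect),
    ],
)

pages = [
    Page("Language basics", "hello world"),
    Page("Language redirect", "hello", is_redirect=True),
    Page("Other", "hello"),
]
changed = [task.apply(p) for p in task.find_targets(pages)]
```

Keeping matchers and actions as separate lists mirrors the way the special page lets any number of each be attached to one task.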

All of these concepts are stored in new database tables (seen below).


The main interaction with the extension is done through a special page. This page allows the creation of tasks as well as the viewing of previously created tasks and various actions such as saving changes.

A wireframe showing task creation can be seen below. This allows basic information about the task to be entered, such as what type of target we want to change (this could be an Image, Article, Wikibase item etc.). It also allows for a summary of the changes that will be made.

The lower sections of the page allow an unlimited number of Actions and Matchers/Filters to be added.

The version of the special page that allows users to view tasks is slightly different and can be seen below.

The main differences here are that no new data can be added, it is simply presented, and that a Task state and list of targets are now present.

Upon creation of a Task the Task will make its way through various states (seen below).

Once the targets have been found they will appear in the targets list on the special page and users will be able to either save changes individually or save a whole list of changes.

Current State

The code is currently stored on Wikimedia’s Gerrit tool and is mirrored onto GitHub. All issues are now tracked in Phabricator and the current workboard can be found here.

A screenshot of the current special page for task creation can be seen to the right.

Of course at this early stage lots of things are missing and I hope I find the time to work on this over the next year:

by addshore at November 03, 2015 06:07 PM

Review of the big Interwiki link migration

Wikidata was launched on 30 October 2012 and was the first new project of the Wikimedia Foundation since 2006. The first phase enabled items to be created and filled with basic information: a label – a name or title, aliases – alternative terms for the label, a description, and links to articles about the topic in all the various language editions of Wikipedia.

On 14 January 2013, the Hungarian Wikipedia became the first to enable the provision of interlanguage links via Wikidata. This functionality was slowly enabled on more sites until it was enabled on all Wikipedias on the 6th March.

The side bar that these interlanguage links are used to generate can be seen to the right.

Before Wikidata

Before Wikidata phase 1 came to be, every single language version of an article contained a list of interwiki links at the bottom, maintained manually or by bots. These links had a habit of conflicting with one another due to the way that they were maintained. An example of the list maintained in wikitext can be seen below for https://fr.wikipedia.org/wiki/La_Nuit_des_rois_(homonymie). The interwiki links are at the bottom of the content in the style [[LANG:TITLE]].
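
As a rough illustration of the [[LANG:TITLE]] style, the following Python sketch pulls such links out of a piece of wikitext. The wikitext here is made up for the example (it is not the content of the French page), and the regular expression is a simplification that assumes short lowercase language codes.

```python
import re

# Matches links such as [[de:Beispiel]]: a short language code, a colon, a title.
# Simplified on purpose; real language codes and link syntax have more cases.
INTERWIKI = re.compile(r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]")

# Made-up wikitext for illustration only.
wikitext = """Some article content with a normal [[link]].

[[de:Beispiel]]
[[en:Example]]
[[it:Esempio]]
"""

# Each match is a (language code, title) pair.
links = INTERWIKI.findall(wikitext)
```

The ordinary [[link]] is skipped because it has no language prefix, which is the only thing separating an interwiki link from a normal one in this simplified pattern.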

As said above, this list was maintained on every article. By the looks of things this article has 7 other language versions, so there are 8 lists of links in total. So say you want to add a new article in another language: you should really add said article to all 8 lists, meaning 8 additional edits. Alternatively you could wait for bots to notice that the articles are related and add the links in all of the relevant places.

As people generally didn’t care about these lists on other wikis, or even know about them, the bots took over. See below…

Above is the history for the Uzbek language article for January 31st, 31-yanvar. Only 2 of the 34 edits shown above were made by people, the rest by bots maintaining the list of interwiki links, and this screenshot doesn’t even show the whole history for the article (see here).

The Migration through bots

The whole migration was essentially carried out by a fleet of bots. Each of these bots did one of the following things:

  • Find an interwiki link on an article and add it to Wikidata (leaving the interwiki link on the article).
  • Find an interwiki link on an article, add it to Wikidata and remove it from the article.
  • Find an interwiki link on an article that is already stored in Wikidata and remove it from the article.

My bot, Addbot, focused on the last one of these, trying to remove redundant data. Due to the simplicity of this task the bot made over 15 million edits during the switch-on of Phase 1.
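
The third kind of edit can be sketched roughly as follows. This is not Addbot’s actual code: the function name, the dictionary standing in for the links stored on the Wikidata item, and the sample page are all assumptions made for illustration.

```python
import re

INTERWIKI_LINE = re.compile(r"^\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]\s*$")


def remove_redundant_links(wikitext, wikidata_links):
    """Drop interwiki lines whose (language, title) pair is already in wikidata_links.

    wikidata_links is a dict of language code -> title, standing in for the
    links stored on the Wikidata item. Links that differ are left alone.
    """
    kept = []
    for line in wikitext.splitlines():
        match = INTERWIKI_LINE.match(line.strip())
        if match and wikidata_links.get(match.group(1)) == match.group(2).strip():
            continue  # redundant: Wikidata already provides this link
        kept.append(line)
    return "\n".join(kept).rstrip() + "\n"


page = "Article text.\n\n[[de:Beispiel]]\n[[en:Example]]\n[[it:Altro]]\n"
cleaned = remove_redundant_links(
    page, {"de": "Beispiel", "en": "Example", "it": "Esempio"}
)
```

Only exact matches are removed, so a link that disagrees with Wikidata (the Italian one here) stays on the page for a human or another bot to resolve.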

An example interwiki link migration edit can be seen below or found here.

Other notable bots in the migration effort include: Legobot and EmausBot.

Migration conflict?

I leave a question mark above as there was not really any conflict with the migration, simply surprise. Bots helping to migrate data to Wikidata were regularly blocked on various projects in the first week or so. Although the Wikidata team, the community and the bot operators all tried to get the message out there about the migration before it happened, some people of course did not hear!

Most of the projects that the bots needed to edit fell under the Wikimedia global bot policy; other projects, for example the German Wikipedia, needed individual consultation and approval.


The biggest visible impact on all Wikimedia projects was the decrease in bot edits post migration. This was due to hundreds of interwiki bots no longer needing to run and the link lists being maintained in a single place.

The graph below, provided by stats.wikimedia.org, shows a large spike in edits in early 2013. This was the flood of edits to remove from articles the interwiki links that were already provided by Wikidata. The number of edits then drops after the main part of the migration.

If you focus on the green line you can see that post migration the number of bot edits fell to below half of the number prior to the migration.



Yay, Wikidata!

P.S. I wrote this post very quickly; if you spot any errors please comment!

by addshore at November 03, 2015 06:07 PM

Wikidata Map – 19 months on

The last Wikidata map generation, as last discussed here and as originally created by Denny Vrandečić, was on the 7th of November 2013. Recently I have started rewriting the code that generates the maps, stored on GitHub, and boom, a new map!

The old code

The old version of the wikidata-analysis repo, which generated the maps (along with other things) was terribly inefficient. The whole task of analysing the dump and generating data for various visualisations was tied together using a bash script which ran multiple python scripts in turn.

  • The script took somewhere between 6 and 12 hours to run.
  • At some points this script needed over 6GB of memory to run. And that was when Wikidata was much smaller; it probably wouldn’t even run any more.
  • All of the code was hard to read, follow and understand.
  • The code was not maintained and thus didn’t actually run any more.

The Rewrite

The initial code that generated the map can mainly be found in the following two repositories which were included as sub-modules into the main repo:

The code worked on the MediaWiki page dumps for Wikidata and relied on the internal representation of Wikidata items, and thus as this changed everything broke.

The wda repository pointed toward the Wikidata-Toolkit which is written in Java and is actively maintained, and thus the rewrite began! The rewrite is much faster, easily understood and easily expandable (maybe I will make another post about it once it is done)!

The change to the map in 19 months

Unfortunately, according to the current settings of my blog I cannot upload the 2 versions of the map, so I will instead link to the Twitter post announcing the new map as well as the images used there (not full size).

The tweet can be found here.

Wikidata map 7 Nov 2013

Wikidata map 3 June 2015

As you can see, the bottom map contains MORE DOTS! Yay!

Still to do

  • Stop the rewrite of the dump analyser using somewhere between 1 and 2GB ram.
    • Problem: Currently the rewrite takes the data it wants and collects it in a Java JSON object, writing to disk only once the entire dump has been read. Because of this lots of data ends up in this JSON object and thus in memory, and as we analyse things more this problem is only going to get worse.
    • Solution: Write all data we want directly to disk. After the dump has fully been analysed read all of these output files individually and put them in the format we want (probably JSON).
  • Make all of the analysis run whenever a new JSON dump is available!
  • Keep all of the old data that is generated! This will mean we will be able to look at past maps. Previously the maps were overwritten every day.
  • Fix the interactive map!
    • Problem: Due to the large amount of data that is now loaded (compared with when the interactive map last worked 19 months ago) the interactive map crashes all browsers that try to load it.
    • Solution: Optimise the JS code for the interactive map!
  • Add more data to the interactive map! (of course once the task above is done)
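
The proposed fix for the memory problem (write data straight to disk, then reshape it afterwards) could look something like the sketch below. It is written in Python for brevity rather than the Java of the actual rewrite, and the entity layout and the "coord" field name are assumptions made for the example.

```python
import json
import os
import tempfile


def analyse(entities, out_path):
    """Pass 1: write one small JSON line per coordinate as soon as it is found."""
    with open(out_path, "w", encoding="utf-8") as out:
        for entity in entities:          # entities can be a lazy iterator over the dump
            coord = entity.get("coord")  # hypothetical field name
            if coord is not None:
                out.write(json.dumps({"id": entity["id"], "coord": coord}) + "\n")


def collect(out_path):
    """Pass 2: read the intermediate file back and build the final structure."""
    with open(out_path, encoding="utf-8") as lines:
        return {rec["id"]: rec["coord"] for rec in map(json.loads, lines)}


# A tiny stand-in for the dump.
dump = iter([
    {"id": "Q1", "coord": [52.5, 13.4]},
    {"id": "Q2"},
    {"id": "Q3", "coord": [48.9, 2.4]},
])
path = os.path.join(tempfile.mkdtemp(), "coords.jsonl")
analyse(dump, path)
result = collect(path)
```

During the first pass only one entity is held in memory at a time; the second pass then only has to hold one kind of output at once, rather than every analysis result together.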

Maps Maps Maps!

by addshore at November 03, 2015 06:07 PM

Removing use of Mediawiki’s API ‘rawmode’ in Wikibase

Rawmode was a boolean value used to determine if an API result formatter in Mediawiki needed extra metadata in order to correctly format the result output. The main use of said metadata was in the XML output of the Mediawiki API. How hard can removing it be? This is the story of the struggle to remove the use of this single boolean value from the Wikibase codebase.


The first commit for this task was made on the 6th July 2015 and the final commit was about to be merged on the 27th August. So the whole removal took just under 2 months.

During these two months roughly 60 commits were made and merged working towards removal.

Overall 9290 lines were removed and 5080 lines were added.

I’m glad that is all done. (This analysis can be found on Google sheets). Sorry there are not more pictures in this post…..

Reason for removal

Well, rawmode is being removed from MediaWiki to reduce API complexity. Instead of callers having to check what the API formatters need, the formatters will just accept all metadata, use what they need and discard the rest.
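
The idea can be sketched in a few lines of Python. This is only an illustration of the principle, not MediaWiki’s code: the convention that metadata keys start with an underscore and the "_element" key are assumptions made for this sketch.

```python
# Assumption for this sketch: metadata keys are marked with a leading underscore.
def is_metadata(key):
    return isinstance(key, str) and key.startswith("_")


def strip_metadata(value):
    """What a formatter that needs no metadata (e.g. JSON) can do: discard it."""
    if isinstance(value, dict):
        return {k: strip_metadata(v) for k, v in value.items() if not is_metadata(k)}
    if isinstance(value, list):
        return [strip_metadata(v) for v in value]
    return value


# The producer always attaches metadata, without asking which formatter is in use.
result = {
    "values": ["Earth", "Terra"],
    "_element": "alias",  # hypothetical hint an XML formatter could use for tag names
}

json_view = strip_metadata(result)        # metadata discarded
xml_tag = result.get("_element", "item")  # metadata used
```

With this arrangement the code building the result no longer needs a boolean telling it which kind of formatter is on the other end.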

The change to “Finish killing ‘raw mode'” can be seen on Gerrit and has been around since April of this year. The relevant task can be found on Phabricator.

Process overview

The first step on the path was to remove the old serialization code from Wikibase (otherwise known as the lib serialization code) and replace all usages with the new WikibaseDataModelSerialization component. This component was already used in multiple other places in the code but not in the API due to its reliance on the way the lib serialization code handled the rawmode requirement of the API at the time.

Removal of the lib serialization code was the first of the two major parts of the process and after around 50 commits I managed to remove it all! Hooray for removing 6000 lines with no additions in a commit…

The next and final step was to make the ResultBuilder class in Wikibase always provide metadata for the API and to remove any dirty hacks that I had to introduce in order to kill the lib code. Again this was done over the course of multiple commits, mainly adding tests for the XML output which at the time was barely tested. Finally a breaking change had to be made to remove lots of the hacks that I had added and the final uses of raw mode.

The final two commits can be seen at http://gerrit.wikimedia.org/r/#/c/227686/ and http://gerrit.wikimedia.org/r/#/c/234258/

Final notes

Look! You can even see this on the GitHub code frequency graph (just….)

You can also find my draft API break announcement post here.

— edit —

It looks like I did break 1 thing incorrectly: https://phabricator.wikimedia.org/T110668 , though a fix is on its way out to the beta site! :)

by addshore at November 03, 2015 05:43 PM

Wikimedia Grafana graphs of Wikidata profiling information

I recently discovered the Wikimedia Grafana instance. After poking it for a little while here are some slightly interesting graphs that I managed to extract.

The graphs

Now most of these are quite pointless; at this early stage I am still trying to work out what half of the data stored in Graphite means. But they are still interesting to look at…

The image above relates to one of my recent blogposts. It shows the removal of the lib serialization code from Wikibase and the switch to the new DataModel serialization code. The top graph shows that the Lib code was totally removed and is no longer called at all. The lower graph shows that the DataModel serialization code was in fact called in various parts of the code base prior to the switch of the API, although the API is clearly the biggest user of the serialization code.

For quite a while now I have wanted to know which of the Wikidata API modules get the most use and which are not used at all / barely used. This graph shows that the wbgetclaims and wbgetentities modules are the most used, which makes sense, as accessing data happens more often than writing data. Because of the 1:1000 sample rate from the profiling, however, the details stored for the less used modules are rather useless.
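
A quick back-of-the-envelope calculation shows why sampling hurts the rarely used modules most. The sketch assumes, as a simplification, that every call is kept independently with probability 1/1000 (the real sampling may well work per request instead), and the call counts are invented.

```python
import math


def relative_error(true_calls, sample_rate=1 / 1000):
    """Relative standard error of a call count estimated from independent sampling."""
    expected_samples = true_calls * sample_rate
    return math.sqrt((1 - sample_rate) / expected_samples)


busy = relative_error(10_000_000)   # heavily used module: about 1% error
quiet = relative_error(2_000)       # rarely used module: about 71% error
```

With only a couple of samples expected, the figure for a quiet module is mostly noise, while the busy modules are measured quite precisely.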

Now this is quite a pointless graph, simply the construction counts of what I assume are 2 of the most used objects in the datamodel.

The WikibaseLuaBindings class contains the methods that Lua binds to in order to access data. This graph shows us that the most used method is getEntityId and that getDescription is barely used.

This is again quite a pointless graph but it does show us the general construction amounts for our DataValue objects and thus we can see which is generally used more. This makes the composition of objects very apparent, with GlobeCoordinateValue constructions basically matching LatLongValue constructions, as they use each other.

The Legacy Deserializer code is in a separate component and is used to deserialize old / legacy versions of entities that are still stored in the database. If we could look at this over a period of a year for example rather than a month we would probably expect this to slowly trend down. The graph above shows a month which doesn’t really show us anything other than that it is still used!

The DataModel Services library is new and hence had no use prior to the deployment clearly visible above. You can read a blog post about the component here. Below you can see green and purple lines vanishing around the time of this deployment as again code moves from using deprecated classes in Lib to this new component.

The dashboard

Grafana is quite a nice tool, although it is a shame that the whole UI just seemed to be a bit broken, which is very similar to my first impressions of most of these graphing tools…

As a result of not being able to create dashboards using the tool I had to resort to downloading some sample JSON snippets from other dashboards and trying to glue them all together.

The JSON used to generate the graphs above can be seen below. To use it, go to http://grafana.wikimedia.org/#/dashboard/file/empty.json, click on ‘Search’ in the top right, then click on ‘Import’ and load the JSON! File downloadable here.

{
  "id": null,
  "title": "Xhprof General",
  "originalTitle": "Xhprof General",
  "tags": [],
  "style": "dark",
  "timezone": "utc",
  "editable": true,
  "hideControls": false,
  "sharedCrosshair": true,
  "rows": [
    {
      "title": "xhprof $class $method",
      "height": "350px",
      "editable": true,
      "collapse": false,
      "panels": [
        {
          "id": 13,
          "span": 12,
          "type": "graph",
          "x-axis": true,
          "y-axis": true,
          "scale": 1,
          "y_formats": [],
          "grid": {
            "max": null,
            "min": null,
            "leftMax": null,
            "rightMax": null,
            "leftMin": 0,
            "rightMin": null,
            "threshold1": null,
            "threshold2": null,
            "threshold1Color": "rgba(216, 200, 27, 0.27)",
            "threshold2Color": "rgba(234, 112, 112, 0.22)",
            "thresholdLine": false
          },
          "resolution": 100,
          "lines": true,
          "fill": 0,
          "linewidth": 2,
          "points": false,
          "pointradius": 5,
          "bars": false,
          "stack": false,
          "spyable": true,
          "options": false,
          "legend": {
            "show": true,
            "values": true,
            "min": false,
            "max": true,
            "current": false,
            "total": true,
            "avg": true,
            "hideEmpty": true,
            "alignAsTable": true
          },
          "interactive": true,
          "legend_counts": true,
          "timezone": "browser",
          "percentage": true,
          "zerofill": true,
          "nullPointMode": "null",
          "steppedLine": false,
          "tooltip": {
            "value_type": "cumulative",
            "query_as_alias": true,
            "shared": false
          },
          "targets": [
            {
              "target": "aliasByNode(sortByMaxima(MediaWiki.xhprof.$class.$method.$what.$value), 2, 3, 4, 5)",
              "hide": false
            }
          ],
          "aliasColors": {},
          "aliasYAxis": {},
          "title": "",
          "datasource": "graphite",
          "renderer": "flot",
          "annotate": {
            "enable": false
          },
          "seriesOverrides": [],
          "leftYAxisLabel": "",
          "height": "",
          "timeFrom": null,
          "timeShift": null,
          "links": []
        }
      ],
      "showTitle": true
    }
  ],
  "nav": [
    {
      "type": "timepicker",
      "collapse": false,
      "enable": true,
      "status": "Stable",
      "time_options": [],
      "refresh_intervals": [],
      "now": true,
      "notice": false,
      "nowDelay": ""
    }
  ],
  "time": {
    "from": "now-30d",
    "to": "now"
  },
  "templating": {
    "list": [
      {
        "type": "query",
        "datasource": "graphite",
        "refresh_on_load": false,
        "name": "class",
        "options": [],
        "includeAll": true,
        "allFormat": "wildcard",
        "refresh": true,
        "query": "MediaWiki.xhprof.{Wikibase,Wikidata}*",
        "current": {
          "text": "",
          "value": ""
        }
      },
      {
        "type": "query",
        "datasource": "graphite",
        "refresh_on_load": false,
        "name": "method",
        "options": [],
        "includeAll": true,
        "allFormat": "wildcard",
        "refresh": true,
        "query": "MediaWiki.xhprof.$class.*",
        "current": {
          "text": "",
          "value": ""
        }
      },
      {
        "type": "query",
        "datasource": "graphite",
        "refresh_on_load": false,
        "name": "what",
        "options": [],
        "includeAll": true,
        "allFormat": "wildcard",
        "refresh": true,
        "query": "MediaWiki.xhprof.$class.$method.*",
        "current": {
          "text": "null",
          "value": "null"
        }
      },
      {
        "type": "query",
        "datasource": "graphite",
        "refresh_on_load": false,
        "name": "value",
        "options": [],
        "includeAll": true,
        "allFormat": "wildcard",
        "refresh": true,
        "query": "MediaWiki.xhprof.$class.$method.$what.*",
        "current": {
          "text": "null",
          "value": "null"
        }
      }
    ],
    "enable": true
  },
  "annotations": {
    "list": [
      {
        "name": "Show deployments",
        "datasource": "graphite",
        "showLine": true,
        "iconColor": "rgba(203, 244, 238, 0.82)",
        "lineColor": "rgba(31, 134, 168, 0.59)",
        "iconSize": 9,
        "enable": false,
        "target": "exclude(aliasByNode(deploy.*.count,-2),\"all\")"
      }
    ],
    "enable": true
  },
  "refresh": "1m",
  "version": 6,
  "hideAllLegends": false
}

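
The four template variables in the dashboard (class, method, what, value) are substituted into the panel’s target to form a Graphite metric path. The Python sketch below imitates that substitution to show what the resulting path looks like; the class name is a made-up placeholder, and treating ‘All’ as a plain "*" under "allFormat": "wildcard" is an assumption about how the tool expands it.

```python
from string import Template

# The panel target from the dashboard, minus the Graphite function wrappers.
target = Template("MediaWiki.xhprof.$class.$method.$what.$value")

# Hypothetical selection: one class chosen, everything else left on "All".
# With "allFormat": "wildcard", "All" is assumed to expand to "*".
selection = {
    "class": "Wikibase_Example_Class",
    "method": "*",
    "what": "*",
    "value": "*",
}

metric_path = target.substitute(selection)
```

Each variable’s own query is one path level shorter than the next, which is why choosing a class narrows the list of methods on offer, and so on down the chain.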

And who knows, I might be able to pull some more things out soon……

by addshore at November 03, 2015 05:43 PM

Un-deleting 500,000 Wikidata items

Since some time in January of this year I have been on a mission to un-delete all Wikidata items that were merged into other items before the redirect functionality of Wikidata existed. Finally I am done (well nearly). This is the short story…


Earlier this year I pointed out the importance of redirects on Wikidata in a blog post. At the time I was amazed at how the community nearly said that they were not going to create redirects for merged items…. but thank the higher powers that the discussion just swung in favour of redirects.

Redirects are needed to maintain the persistent identifiers that Wikidata has. When two items relate to the same concept, they are merged and one of the identifiers must then be left pointing to the identifier now holding the data of the concept.

Listing approach

Since Wikidata began there have been around 1,000,000 log entries deleting pages, which equates to roughly the same number of items deleted, although some deleted items may also have been restored. This was a great starting point. The basic query to get this result can be found below.

SELECT * FROM logging where log_type= 'delete' and log_action = 'delete' and log_namespace = 0

I removed quite a few items from this initial list by looking at items that had already been restored and were already redirects. To do this I had to find all of the redirects!

SELECT p1.page_title AS title, rd_title,
p1.page_namespace as namespace, rd_namespace
FROM page as p1
LEFT JOIN redirect ON ((rd_from=p1.page_id))
LEFT JOIN page as p2 ON ((p2.page_namespace=rd_namespace) AND (p2.page_title=rd_title))
WHERE p1.page_is_redirect = '1'
AND p1.page_namespace = 0

At this stage I could probably have tried to remove more items depending on whether they currently exist, but there was very little point. In fact it turned out that there was very little point in the above query, as prior to my run very few items were un-deleted in order to create redirects.

The next step was to determine which of the logged deletions were actually due to the item being merged into another item. This is fairly easy as most cases of merges used the merge gadget on Wikidata.org. So if the summary matched the following regular expression, I would assume the item was deleted due to being merged into / a duplicate of another item.

/(same as|duplicate|merge)/i

And of course in order to create a redirect I would have to be able to identify a target, so, match Q id links.
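
Put together, the classification could look roughly like this in Python. The item ID pattern is an assumption made for this sketch (it is not necessarily the expression used in the actual run), and the log summaries are invented.

```python
import re

MERGE_SUMMARY = re.compile(r"(same as|duplicate|merge)", re.IGNORECASE)
# Assumed pattern for item IDs; not necessarily the one used in the original run.
ITEM_ID = re.compile(r"\bQ[1-9]\d*\b")


def merge_target(deleted_id, summary):
    """Return the item a deleted item was merged into, or None if undetermined."""
    if not MERGE_SUMMARY.search(summary):
        return None
    candidates = [q for q in ITEM_ID.findall(summary) if q != deleted_id]
    # Only trust summaries that name exactly one other item.
    return candidates[0] if len(set(candidates)) == 1 else None


# Made-up log summaries for illustration.
a = merge_target("Q100", "Duplicate of [[Q42]]")
b = merge_target("Q100", "Test item, not notable")
c = merge_target("Q100", "Merged with Q7 and Q8")
```

Refusing to guess when a summary names several items keeps ambiguous cases out of the automated run, at the cost of leaving them for manual review.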


I then had a fairly nice list, although it was still large, but it was time to actually start trying to create these redirects!

Editing approach

So firstly I should point out that such a task is only possible while using an Admin account, as you need to be able to see deleted revisions / un-delete items. Secondly it is not possible to create a redirect over a deleted item and also not possible to restore an item when that would create a conflict on the site, for example due to duplicate site links on items or duplicate joined labels and descriptions.

I split the list up into 104 different sections, each containing exactly 10,000 item IDs. I could then fire up multiple processes to try and create these redirects to make the task go as quickly as possible.
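
Splitting a list into fixed-size sections is simple enough to sketch in Python. The ID list below is a stand-in; the real one came from the deletion log.

```python
def chunks(items, size=10_000):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in ID list for illustration.
item_ids = [f"Q{n}" for n in range(1, 25_001)]
sections = list(chunks(item_ids))
```

Each section can then be handed to its own process, and a section that fails part-way can be re-run without touching the others.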

The process of touching a single ID was:

  1. Make sure that the target of the merge exists. If it does not, log to a file; if it does, continue.
  2. Try to un-delete the item. If the un-deletion fails, log to a file; if it is successful, continue.
  3. Try to clear the item (as you can only create redirects over empty items). This either results in an edit or no edit; it doesn’t really matter.
  4. Try to create the redirect. This should never fail! If it does, log to a fail file that I can clean up afterwards.
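The four steps above can be sketched in code. Everything here is hypothetical: FakeWiki is an in-memory stand-in so that the example runs on its own, and its method names (exists, undelete, clear, create_redirect) are assumptions, not the API of any real client library.

```python
class FakeWiki:
    """In-memory stand-in for an admin API client (hypothetical interface)."""

    def __init__(self, existing, restorable):
        self.existing = set(existing)      # items that currently exist
        self.restorable = set(restorable)  # deleted items that can be restored
        self.redirects = {}

    def exists(self, item_id):
        return item_id in self.existing

    def undelete(self, item_id):
        if item_id not in self.restorable:
            return False
        self.existing.add(item_id)
        return True

    def clear(self, item_id):
        # Emptying an item may or may not produce an edit; either is fine.
        pass

    def create_redirect(self, source, target):
        self.redirects[source] = target
        return True


def process_item(wiki, source, target, log):
    """Run the four steps for one deleted item; return True on success."""
    if not wiki.exists(target):                     # step 1
        log.append(("missing-target", source))
        return False
    if not wiki.undelete(source):                   # step 2
        log.append(("undelete-failed", source))
        return False
    wiki.clear(source)                              # step 3
    if not wiki.create_redirect(source, target):    # step 4
        log.append(("redirect-failed", source))
        return False
    return True
```

Keeping a separate log entry per failure reason makes it easy to pull out, say, only the items that could not be un-deleted.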

The approach on the whole worked very well. As far as I know there were no incorrect un-deletions and nothing failing in the middle.

The first of the two snags I hit was that the rate at which I was editing caused the dispatch lag on Wikidata to increase. There was no real solution to this other than to keep an eye on the lag and to stop editing if it ever rose above a certain level.

The second snag was that the run caused multiple database deadlocks during the final day, although again this was not really a snag as all the transactions recovered. The deadlocks can be seen in the graph below:
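Watching the lag could in principle be automated along these lines. This is a sketch only: get_lag is a hypothetical callable that returns the current lag in seconds, and the threshold and polling interval are made-up parameters, not values from the actual run.

```python
import time


def wait_for_lag(get_lag, max_lag_seconds, poll_seconds=30.0, sleep=time.sleep):
    """Pause until get_lag() is at or below max_lag_seconds.

    Returns the number of times the function had to sleep.
    """
    waits = 0
    while get_lag() > max_lag_seconds:
        sleep(poll_seconds)
        waits += 1
    return waits
```

Calling this before each edit (or each small batch of edits) would pause a process while the lag is high and let it resume once the lag drops again.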

The result

  • 500,000 more item IDs now point to the correct locations.
  • We have an accurate idea of how many items have actually been deleted due to not being notable / being test items.
  • The reasoning for redirects has been reinforced in the community.

Final note

One of the steps in the editing approach was to attempt to un-delete an item and, if un-deleting failed, to log the item ID to a log file.

As a result I have now identified a list of roughly 6000 items that should be redirects but cannot currently be un-deleted in order to be created.

See https://phabricator.wikimedia.org/T71166

It looks like there is still a bit of work to be done!

Again, sorry for the lack of images :/

by addshore at November 03, 2015 05:42 PM

November 02, 2015

Wiki Education Foundation

Teaching (more than just) writing with Wikipedia

Zach McDowell, who has taught with Wikipedia at the University of Massachusetts Amherst, shares his experiences, challenges and successes with the assignment. These notes were condensed from his WikiConference USA 2015 presentation. 

I’ve been teaching with Wikipedia for going on five years now. It hasn’t always been an “easy” experience, but it has easily been the most rewarding tool I’ve used in the classroom.

Not only is Wikipedia an excellent pedagogical tool, but it’s also incredibly relevant to college students. Recently, I asked a group of 20 students if they had ever used Encyclopedia Britannica. Only one raised their hand. When I asked how many of them had used Wikipedia in the last week, all of them raised their hands. However, when I asked how many have ever edited Wikipedia, it was down to one. Wikipedia is the only encyclopedia they know. It’s where they are getting their information, even though few ever examine how Wikipedia works.

Not just writing, not just wikis

Wikipedia is a strong pedagogical tool for teaching writing. It forces students to cite properly, practice peer review, participate in “public” writing, and learn how to format a literature review. As articles evolve over time, students return to them, learning editing skills and understanding the writing process.

The quality of the work at the end of these classes has been, in my experience, outstanding.

Teaching with Wikipedia opens a space to teach more than just writing, or whatever topic might be at hand. With its collaborative and connected nature of authorship, and the systemic biases that plague many online environments, Wikipedia is an incredible place for students to learn about collective intelligence, epistemology, combating systemic biases, and digital literacy through embodied practice.

What works (incredibly well)

Teaching with Wikipedia is great for instructors, because it creates a space where students are actively engaged with nonstop peer review in a public setting. This puts instructors in the role of shepherd rather than dictator.

As an instructor, I have received positive feedback for these classes, resulting in teaching awards for a required class that, taught traditionally, most students find “boring.”

Students are invested in these projects. Can you imagine a student getting excited about writing an annotated bibliography? Only with Wikipedia:

“I’ve learned more in this class than many others… I’ve never done as much research as I did in this class, but it was really fun and now my work is published online.” (This student made nearly 50 citations in her article).

They get excited about their assignments, and get excited about Wikipedia too:

“Not to be overdramatic, but [your] class changed my worldview. It was a great exercise in writing, but for me it drove home a sort of globalism that I had always understood but never really experienced before. People get annoyed when I talk about Wikipedia now because I can’t shut up about how incredible (and under appreciated) a tool it is.”

Four learnings

I started teaching Wikipedia through my interest in open-access activism and commons-based outreach groups. I wanted to get students involved with knowledge commons. I wanted to see them invested and excited about the quality of their projects, ensuring their critical engagement with the subject matter and their daily lives. Integrating Wikipedia-based assignments instead of traditional papers helps teach a few things at once:

1. Collective Intelligence

Wikipedia creates new ways for the students to understand ideas about the creation of knowledge and of collective intelligence. I ask my students to think about their projects the way “parents” would, as no parent can control the child forever. Eventually, the community takes responsibility for caretaking. This leads to interesting metaphors for writing and digital labor, but the idea is the same. The editor is not an “authoritarian” author type that is solely responsible for knowledge creation. They’re part of a team, albeit an important part, that helps to “birth” the article from a collection of knowledge they’ve gathered. Students find themselves relying on co-editors and peer reviewers to give extensive feedback and contribute to the article.

2. Epistemology

Through this process, students begin to better understand epistemology (theories of knowledge) in general, seeing wiki links and bibliographies as traces of a history of the knowledge they are participating in accounting. Most students are told to “never use” Wikipedia, but by understanding how Wikipedia works, they learn that Wikipedia is often a great starting point, and to keep digging through a bibliography (as many academics do, whether in traditional papers, or even Wikipedia).

3. Digital Literacy

Wikipedia creates a space where students begin to question the validity and verifiability of information. One of the first exercises is evaluating a Wikipedia article, before they suggest edits. Through understanding how Wikipedia uses citations, and the hierarchy of knowledge, students begin to question what they use every day. Not to say that Wikipedia isn’t just as reliable as anything else (it often is), but it makes for an excellent exercise in digital literacy.

4. Countering Systemic Biases

Most students are unaware of the systemic bias problems that plague Wikipedia. Although Wikipedia remains “the encyclopedia anyone can edit,” most have never even made a small grammar correction, let alone created a username. In my courses, the majority of students are women, and find Wikipedia lacking in articles about feminist, media, and gender issues that they themselves are passionate about. After my first course, one of my students reflected in an interview, “I think that if the gender gap was advertised more, it would make women want to edit more.”

Challenges / room for improvement

Most of the challenges to writing with Wikipedia are actually not technical. Students these days “get it.” A quick training program gets them up to speed on how to edit, use Talk pages, and take part in the technical side of Wikipedia projects. However, students take longer to understand the community behind Wikipedia, and how to write neutrally. The same is true of faculty members, but in reverse: they often “understand” the difficulties and how to approach a new community (and how to write neutrally), but have issues with the technical side of Wikipedia. Robust training helps.

The one major hurdle for getting students involved in Wikipedia is the community. It is a big, crazy, wild community. The community is not actually the “problem” – the vast majority of users my students interacted with were kind, thoughtful, and understanding. My students understood that these were mostly volunteers that were giving them their time, and were thankful for their help.

Students simply had a hard time grokking what is happening within the community, and the crazy process behind it. This is literally a foreign community to them, and they often do not have the “intercultural” experience necessary to participate.

Folks at Wiki Ed are an incredible and invaluable resource for both instructors and students. I might not have made it through my first course without them, and I’m a huge geek. Understanding the Wikipedia community is complex. There are a lot of rules, suggestions, and styles that aren’t as easy to explain as the technical side of editing Wikipedia. The problem seems to lie in the structure of the community and its often confusing governance systems. Things aren’t “cut and dry” and often require extra handholding.

Wiki Ed is great for everyone

There are a lot of great things going for teaching with Wikipedia, and I’m writing this because I advocate it every chance I can. Students actually care about their projects (which, for me, is more than enough); they are learning about complex knowledge theory and digital literacy; they’re combating systemic biases on the world’s largest (and free) encyclopedia. It isn’t perfect — is anything? But folks at the Wiki Education Foundation are working to actively improve it.

Community ties are growing stronger, training is getting more robust, and tools are getting stronger for organizing and evaluating work. Teaching with Wikipedia is good for everyone (students, teachers, and Wikipedia) and is getting better all the time.

Photo: “Zach McDowell at the National Archives” by Eryk (Wiki Ed). Own work. Licensed under CC BY-SA 4.0 via Wikimedia Commons.

by Zach McDowell at November 02, 2015 05:00 PM

Wikimedia UK

My Week in Happy: Why I interviewed Helen Arney

This post was originally written by Zoe E Breen for Cheeruplove.com. It is available here

Well, “Why wouldn’t you want to interview Helen Arney?”, you might ask?

Helen Arney (Photo: Vera de Kok)

Of course she is super-smart, funny and chic, that’s undeniable. Which is why, when I was booking my tickets for Festival of the Spoken Nerd at The Lowry, I was struck by the fact that she did not have a Wikipedia page dedicated to her.

Almost a year before the gig, I’d been to a Wiki Edit workshop run for Manchester Girl Geeks by Wikimedia UK.

From this experience I learned two things:

  1. Editing Wikipedia is really pretty easy
  2. More than 80% of Wikipedia editors are male (according to some research)

What did I do with this knowledge? Pretty much nothing until I noticed that Helen Arney didn’t have a Wikipedia page.

Then I remembered something.

Fellow Manchester Girl Geek Karen Pudner (@kpudner) had created a Wikipedia page for code-breaker Joan Clarke, who worked alongside Alan Turing on the Enigma Project at Bletchley Park.

Karen started the Wikipedia page in 2013 having attended a previous Manchester Girl Geeks Wiki Edit Day.

This was the year before Joan’s contribution to the team at Bletchley Park was recognised in The Imitation Game. Since then the page has been added to and edited by dozens of other users.

If another girl geek could write a woman into Wikipedia then maybe I could give it a shot?

I was so excited by the prospect that it was with some abandon that I launched into writing my first lines of words on Wikipedia.

So I’ve made a start on Helen Arney’s page, which is currently described as a ‘Singer stub’. If you would like to add an edit of your own I’d be extremely happy.

So, why did you interview Helen Arney?

As well as her obvious fabness (see above), I thought it would be lovely to have something that I had written linked to from the Wikipedia page. And reciprocally linked (of course!).

You can read what Helen had to say on physics, funnyness and frocks right here.

by Richard Nevell at November 02, 2015 12:43 PM

Tech News

Tech News issue #45, 2015 (November 2, 2015)


November 02, 2015 12:00 AM

November 01, 2015

Gerard Meijssen

#Wikipedia versus the sum of all knowledge II

A question was raised again: "Whatever happened to Wikipedia, the encyclopedia anyone can edit?" It was meant to be a rhetorical question. It assumes that everyone can edit Wikipedia and do "all" things necessary. It is funny because in practice it has never been true.

It never mattered. Wikipedia had people operating bots, adding sources, adding images, adding templates, and only because of this cooperation was Wikipedia functional. When an editor did not know how to do something, he had to learn new skills or have someone else do the job for him.

Wikipedia is a living project. Things change and consequently the skills needed evolve as well. Sometimes new technology is disruptive and old technology is grandfathered; no longer potent, no longer relevant. 

Three years ago Wikidata made its first appearance. From the start it was disruptive. It replaced the old interwiki links and we all benefited from a much more robust technology. This is, however, a niche area of Wikipedia, so nobody complained.

Wikidata has ambitions; it has the potential to serve the sum of all available knowledge. To achieve this, data from many sources, often Wikipedias, was harvested over the years and found its place in one integrated environment. At this stage, selected areas of information may be served to Wikipedia from Wikidata.

We are at a stage where Wikidata is increasingly the objectively best place for particular fields of information and where a local Wikipedia becomes a backwater, becomes stagnant. People who care about external sources, for instance, moved to Wikidata a long time ago because it was much more inclusive. It allowed for easy cooperation and comparison with external sources. It had VIAF link to Wikidata instead of Wikipedia.

The issues we will face will be similar to the ones at Commons. Wikidata is a project separate from Wikipedia. It has its own set of rules, its own set of priorities. Bluntly speaking, its user interface sucks bigtime for newbies and it is hard to grasp many concepts. Have a look at a page like this. It may prove disruptive to Wikipedians in a big way.

The problem we face is that for "grey beards" like me, them olden days are gone. New technology that is obviously superior will replace the current crop of tools. It must do so because expectations of service and quality change. Wikipedia is increasingly used from a mobile phone and we are stuck in so many ways with desktop (not even laptop) technology. 

The sum of all knowledge may be edited by anyone who cares to in the Wikisphere. It may become increasingly easy to do so when we care about the user experience for our editors and are willing to let go of all the cruft we accumulated over the years.

by Gerard Meijssen (noreply@blogger.com) at November 01, 2015 02:42 PM

Content Translation Update

November 1 CX Update: Starred Suggestions and Translation Interface Bug Fixes

After a delay in the deployment of new features to Wikimedia sites for the last couple of weeks, this week we are back to normal deployment schedule, and we have several significant updates.

The major new feature is the ability to mark suggested articles as something that you want to translate later by “starring” them (task description), as well as discarding suggestions in which you are not interested. This update is another step towards making sophisticated and personalized lists of articles to translate, which are designed to help translators be more efficient in completing the coverage of encyclopedic topics in their languages. For more details about the state of the Translation Suggestions, see the recently published Wikimedia Blog post: Article suggestions—a new feature for Content Translation.

Other than that, these bug fixes were deployed:

  • Images without captions were not properly published, which was confusing because they appeared in the translation view but not in the article. This is now fixed. (bug report)
  • Adding a red link was not, by itself, triggering auto-saving. Now it does. (bug report)
  • Long words in the titles of the columns in the translation interface were shown only partially. Now they are wrapped so that the whole title is visible. (bug report)

by aharoni at November 01, 2015 11:28 AM