November 24, 2017

Sam Wilson

Display Title extension

The MediaWiki Display Title extension is pretty cool. It uses a page’s display title in all links to that page. That might not sound like much, but it’s really useful to only have to change the title in one place, and have it show correctly all over the wiki. (This is much the same as Dokuwiki with the useheading configuration variable set to 1).

This is the sort of extension that I really like: it does a small thing, but does it well, and it makes sense as an addition to the core software. It’s not trying to do something completely different and just sit on top of or inside MediaWiki. It’s also not something that everyone would want, and so does belong as an extension and not an addition to core (even though the display title feature is part of core).

The other thing the Display Title extension provides is a parser function for retrieving the display title of any page: {{#getdisplaytitle:A page name}}, so you can use the display title without creating a link.

by Sam Wilson at November 24, 2017 02:05 AM

November 22, 2017

Wikimedia Foundation

Community digest: WikiProject Military history’s featured articles; Galician Wikipedia’s painting contest; news in brief

WikiProject Military history stands behind one-fifth of English Wikipedia’s featured articles

Photo by Alfred T. Palmer, public domain.

Featured articles are considered the pinnacle of article evolution on Wikipedia: “the best articles Wikipedia has to offer.” It may take several months of community assessment for an article to get this status. Last month, WikiProject Military history found that they had surpassed a long-held project goal of one thousand featured articles. This number makes up nearly 20% of all the featured articles on the English Wikipedia.

The “WikiProject,” a group of editors who formally organize themselves around a topic, includes task forces that coordinate collaboration on particular areas of military history. Some task forces cover general topics such as military aviation or intelligence; others cover nations and regions (African, Asian, Ottoman); and others cover periods and conflicts such as the American Civil War, the Crusades, or classical warfare.

Most of the project’s featured articles go through an internal A-class review process, which helps editors prepare their articles for a featured article star. The project’s contest department gives twenty-six points to every editor who works on a featured article from scratch. These points count toward the project’s monthly and annual Military history writing contests.

For newcomers, the project academy presents several guideline pages to help write a featured article, understand the FA criteria, write a large-scope article and more. And when a member writes a featured or an A-class article, they get featured in the Project’s monthly newsletter, The Bugle.

Other ways to keep the members motivated include the annual awards such as the Military historian of the year, Military history newcomer of the year, WikiChevrons, and other titles.

Started as WikiProject Battles in 2003, the project widened its scope and eventually settled on the name WikiProject Military History. Since its inception, the project has remained one of the largest and most active projects on the English Wikipedia.

Krishna Chaitanya Velaga, Coordinator, WikiProject Military History


Galician Wikipedia celebrates day of literature with a painting contest

Galician Literature Day is an annual public holiday observed on 17 May, where the locals of Galicia, Spain, celebrate their language, literature and writers. For the upcoming year, the Galician Wikipedia community has decided to celebrate the day by holding a painting competition to draw the author of the year for 2018: María Victoria Moreno.

“It is difficult, sometimes, to find images to illustrate Wikipedia articles, especially portraits of people who died in the past eighty years,” says user Elisardojm, administrator on the Galician Wikipedia and one of the competition organizers. “The photos available are usually copyrighted… We would like to promote other ways for people to contribute to the Wikimedia movement.”

To join the competition, a participant needs to create a portrait of María Victoria Moreno, the writer honored on Galician Literature Day 2018, and share their work under a free license. Works should be uploaded to the relevant category on Wikimedia Commons by 31 December 2017. “It can be a drawing or any other form of art,” says Elisardojm. “The works uploaded later than the end of December will be appreciated but will not be considered for the competition.”

User HombreDHojalata from the Galician Wikipedia came up with the idea for a contest, which was then carried through to reality by Elisardojm and other Wikipedians in the community. The volunteer team is now working on promoting the contest using local media in Galicia and sharing it with art groups on the internet.

In brief

by Македонец, CC BY-SA 4.0.

Collaboration between Brazil and Macedonia to share photos from Wiki Loves Contests: Last month, the Wikimedia communities in Brazil and Macedonia held two exhibitions of the winning photos in the 2017 Wiki Loves Contests. The Brazil Wikimedia Community held an exhibition of the winning photos from Macedonia, and the Shared Knowledge group in Macedonia organized an exhibition of the Brazilian photos. All photos had QR codes leading to Wikipedia articles that explain the monument pictured.

Call for volunteers for the Wikimania program committee: The committee will help put together the program and schedule for Wikimania 2018, the annual conference of the Wikimedia movement, held on July 18-22 in Cape Town, South Africa. Committee member responsibilities include helping promote the call for presentations, recruiting speakers, and reviewing program submissions. Review dates this year will be in the March-April timeframe. More details on Wikimedia-l.

FDC update: The funds dissemination committee of the Wikimedia Foundation has announced the elected position holders for this year. Bishakha Datta is the new committee Chair while Liam Wyatt is the Vice-Chair and Katherine Bavage is the new committee Secretary. More details on Wikimedia-l. Also last weekend, the committee met in Madrid to assess and provide recommendations on the Annual Plan Grants provided by the community individuals and groups. More on meta.

Metrics meeting for November and December: Since the Wikimedia Metrics and Activities meetings for November and December fall on popular holiday weeks in the United States and other parts of the world, organizers will combine both in one meeting that will be held on Thursday, 14 December, from 6:00 – 7:00 PM UTC (11 AM – 12 PM PT). More about Metrics meetings on meta.

Wishlist survey 2017 starts: The Foundation’s Community Tech team builds features and makes changes that active Wikimedia contributors want, and the Wishlist Survey sets the team’s agenda for the year. The third annual Community Wishlist Survey started early this month and the vote on proposals will start on 27 November. More details on meta.

First local WikiConference in Italy: Trento in Italy hosted the first local WikiConference for the Wikimedia movement in the country. The conference was an opportunity for the editor community to attend workshops on the use of templates and internet bots, and to discuss topics of concern such as the gender gap on Wikipedia and the provision of safe spaces for refugees online.

Milestones and anniversaries: This month, the Latvian Wikipedia celebrated article number 80,000, while the Albanian Wikipedia celebrated their 70,000th one. On a different side of the world, the Maithili Wikipedia community held an event to celebrate the Maithili Wikipedia day. Congratulations to all involved!

Samir Elsharbaty, Writer, Communications
Wikimedia Foundation

by Krishna Chaitanya Velaga and Samir Elsharbaty at November 22, 2017 09:34 PM

Gabriel Thullen on bringing offline Wikipedia to West African schools

Photo by Gabriel Thullen, CC BY-SA 4.0.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the Internet can access free and open knowledge.

Over the coming months, Anne will be interviewing people who work to remove access barriers for people across the world. In her first conversation for the Wikimedia Blog, Anne chatted with Emmanuel Engelhart (aka “Kelson”), a developer who works on Kiwix, an open source software which allows users to download web content for offline reading. In this installment, she interviews Gabriel Thullen, a Geneva (Switzerland) Wikimedian, previous Wikimedia CH board member, and school teacher who has worked with schools across West Africa to test the Kiwix offline Wikipedia during the 2016–2017 school year. As Gabriel writes in the Wikimedia Education newsletter, “These schools are in cities with limited access to the Internet and in small towns with little or no electricity, no cell phone coverage, and no Internet.”

Both Anne and Gabriel recently participated in the OFF.NETWORK Content Hackathon to advance Kiwix and its distribution of offline Wikipedia. Their conversation is below. You can also read the other parts of this ongoing series.

Anne Gomez: Tell me about what started your interest and involvement with offline Wikipedia? When was it?

Thullen: I consider that there were two phases in my involvement with offline Wikipedia. I was fascinated by the presentations at Wikimania 2012, about Kiwix and Wikipedia Zero. I really got into using Kiwix while helping distribute it at the WMCH booth during the 2013 Wikimania in Hong Kong, and was able to successfully secure a mandate from my employers for the 2014-2015 school year to investigate the pedagogical opportunities opened up by the use of offline Wikipedia and Kiwix. Unfortunately, we waited over a year for the Kiwix plug computers to be produced so that by the time the hardware was delivered to Geneva, the funds had dried up and the school year was over.

The second phase of my involvement started with a trip to Senegal in Summer 2014. As coach of a team of Senegalese wheelchair basketball players playing in different European championships, I was involved in organizing a tournament in Dakar. At the same time, I was checking up on the distribution of a few hundred math books I had shipped to Senegal, and I met with a few teachers and government officials. Having surmised the usefulness of offline Wikipedia for schools with little or no Internet access, I had brought a few dozen USB keys pre-loaded with Kiwix and the French Wikipedia. In all, a few weeks were spent meeting with schools and other facilities like the US Peace Corps training base in Thiès and the Naval Academy in Dakar. They were all quite enthusiastic and receptive when I shared Kiwix with them.

Gomez: What’s been the biggest surprise for you over the years?

Thullen: That must be the detrimental effect of smartphones on what I consider to be basic computer literacy, such as knowing how to use a keyboard and mouse, or knowing the difference between input and output and between local and remote storage. I have been teaching computer science since the mid-eighties, before the web was developed at CERN here in Geneva. Over the years, the students knew more and more about computers before coming to school, but I have noticed a sharp reversal in this trend over the past two years, as an increasing number of students no longer have a computer at home; everything is done with mobile devices and a smart TV.

Gomez: Smartphones have transformed the way people can access the internet. How has this changed the landscape and the way you view offline access?

Thullen: My experience with offline access is limited to Africa. I have a research project which is currently stuck in limbo due to the rules and regulations of my employer, which absolutely forbid the use of smartphones in class by middle school students. I now feel that we need a two-pronged approach to providing offline access for smartphones. Users should be able to install resources directly on their device, or else they should be able to easily connect to a mobile WiFi hotspot which provides these resources.

Gomez: How do you see these devices impacting the future of educational resources?

Thullen: Making educational resources available to smartphones and similar devices only makes sense if the educators are ready and willing to make use of them, and that means that those who develop them need to work closely with the educational community. I see a huge potential for offline resources where I work, in Swiss schools. Switzerland is one of the most connected countries in the World, but in a learning environment you sometimes need access to certain resources but not to the Internet or to any other form of electronic communication – think of exams, for example. I am really excited to be part of what could turn out to be a paradigm shift in the way of teaching and evaluating students.

Gomez: You’ve brought Kiwix on a wifi hub (Raspberry Pi) and thumb drives to pass out to teachers in Senegal. Help me understand the context you’re working in… what makes one better than the other?

Thullen: The Raspberry Pi Kiwix hotspot I built turned out to have a few limitations: it needs a very regular power supply, which means that it is not available 24 hours a day, every day. On the other hand, a USB key is always available and needs no power source. One other factor that needs to be considered is that when users connect to the WiFi, they lose any Internet connexion they may have had. As many people in West Africa communicate by Facebook Messenger, this effectively isolates them from their friends and families. They might be willing to do this for a few minutes, but it might not be a long term solution.

The Raspberry Pi WiFi hotspot and the USB flash drives do not address the same needs. A WiFi hotspot can be used by a group in the same physical location whereas the USB sticks can be used individually and at any time.

Gomez: You and I have talked about the challenges of getting teachers to use and share Wikipedia offline, especially with technical challenges. Thumb drives can easily be wiped and repurposed for storing and sharing other types of content. How are you working with teachers to demonstrate the value of Wikipedia and Kiwix?

Thullen: Music takes up a lot of space. Videos take up even more space. Text takes up very little space. My colleagues have to understand just how much information can be stored on a USB thumb drive if you leave out the videos or the HD photos. It is not obvious to them, and it takes time and practice using the Kiwix program before users are convinced that the whole Wikipedia encyclopedia is on that little drive, that “it is really all there,” to quote a colleague. What I found out during my trips to Senegal is that a lot of teachers there do not know about Wikipedia, so the first step is to ensure that they are familiar with the encyclopedia, and that no matter how much they know about a subject, they can always find more information. That demonstrates the inherent value of the offline encyclopedia so that they are less likely to delete it and reuse the key for other purposes.

Gomez: What’s the hardest part of the technology to deal with?

Thullen: I have now hit a technological wall in West Africa. Please bear with me as I explain, as this probably applies to most other countries to one degree or another.

When I last went to Senegal, over 60% of the computers I saw were running Windows XP, [Editor’s note: Microsoft has released five versions of their OS since XP was released in 2001.] mostly in areas with little or no Internet access. There are two limitations of Windows XP which complicate the distribution of the Kiwix offline Wikipedia reader. The maximum USB size recognized under XP is 32 GB, and the latest version of the French Kiwix & Zim files are just a little bit more than what can be loaded onto a 32 GB flash drive. Kiwix also uses a single index file which is approaching the 4 GB maximum file size for FAT 32 systems, and Windows XP uses FAT 32 flash drives. You need to know that the index file for the English language Wikipedia far exceeds this limit. To make a long and complicated story short, the latest version of the French Kiwix Wikipedia download I can share with my African colleagues is the one from June 2016…

I keep on getting a lot of well-intentioned suggestions: upgrade to Windows 7! – use a Mac! – use Linux! – don’t use FAT 32! – etc. That is fine for us computer geeks, but it definitely is not an option in many parts of the world. As the old saying goes: “if it ain’t broken don’t fix it” – and that applies to computers as well.

Gomez: What’s the hardest challenge with content?

Thullen: When I present Wikipedia to my African colleagues, online or offline, I can’t help noticing the dearth of information about Africa. There is more information about the village of Genthod in Switzerland (2’700 inhabitants) than about the city of Thiès in Senegal (260’000 inhabitants). Our notability criteria are heavily biased towards what used to be called the “Global North”, and this is flagrant when examining the offline content that I am actively promoting in West African schools. The second-hand printed textbooks that are often used are centered on the French, Belgian or Swiss curriculum. My dream is that our online encyclopedia will soon be more inclusive and better reflect the diversity of our world.

Gomez: What do you see as the future of your on-the-ground work?

Thullen: The Wikipedia encyclopedia is built up by the base, that is by the end-users. The distribution of offline resources should follow the same pattern that made Wikipedia so successful. The distribution should be a large scale community effort, not something managed by a small committee. As for the future of my on-the-ground work, I would like to be able to spend more time and energy helping local communities get organized so that they can distribute the offline resources themselves. Each local community will then be responsible for centralizing the downloads, copying and/or distributing the Kiwix files, training end users, and financing the purchase of the USB flash drives.

Gomez: What’s one thing the Wikimedia Foundation could do to help?

Thullen: There are certain situations where the Foundation could be instrumental in getting Kiwix to those who would benefit – I was thinking about what is happening right now in Puerto Rico. I imagine that the large scale destruction they experienced touched the schools and libraries as well. As a teacher, I am quite concerned about the future of those thousands of students in the affected areas, and I am quite sure that not all will be able to go to the continental US. The Foundation could come forward with a solution which could be implemented immediately and at low cost, thus providing the students with the resource material they need to be able to continue successfully with their studies. This would raise public awareness and support for the Kiwix program.

Gomez: Where do you learn more and share information about offline access?

Thullen: Kiwix.org, obviously. There are relatively few projects centered on developing offline access. Like most of the free or open source software projects, the main problem most people have, ourselves included, is knowing that they exist. You could start by searching for projects like “Pirate Box” or “KA Lite” or “Project Gutenberg”.

Gomez: What resources exist for people who want to know more?

Thullen: After trying what was mentioned above, those looking for more resources can google “OER” (Open Educational Resources). A more original approach is to look up digital educational resources from the 1990’s since most of them were offline – try old computer magazines or old education related magazines. The WayBackMachine could probably help you find what you are looking for.

Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

by Anne Gomez at November 22, 2017 06:16 PM

Wiki Education Foundation

A trip to the Windy City

Did you know that for some people in Chicago, the term “Windy City” refers not to wind speed but to the constantly shifting political winds within the region? Until my trip there last weekend, neither did I, and in the last month over 12,000 people have also learned about this nickname by reading Wikipedia’s article about the topic.

Heading to the American Studies conference last week, I had no idea of the diverse conversations I would get myself into. From discussing rap, hip hop and performance, to US history, film, photography, comics, or jazz, to Asian-American cinematography and visual culture, my booth at times felt as if it were floating on the winds of the Windy City, heading into directions exciting and unknown. But in all my conversations there was a constant theme of empowerment – instructors excited to help students understand their fields and the relevance of their work all while connecting it to historical themes and narratives. I loved getting a chance to ground that excitement into action – having students write and update Wikipedia allows them to become contributors, not just consumers.

This assignment provides a setting to discuss the real world implications of class topics and teaches students that their voices matter. For example, I talked with one instructor interested in the history of comics, a space that is usually seen as male dominated, both from the creator standpoint and in terms of the characters. She expressed a desire for her students to understand the feminist themes that run throughout comics and also an aspiration to help change the common narrative into one more inclusive of female representation and stories. Luckily, we are currently supporting a course working to update the biographies of American women in comics – a project new courses can expand upon using our “editing Wikipedia articles on biographies” guide. Those students are only in the middle of their assignments and their work has already been viewed over 7,000 times on Wikipedia! Many instructors had also heard about our recent partnership announcement and were excited to get involved.

While in Chicago, I also had the opportunity to meet with instructors at the University of Illinois at Chicago and present to them about our tools and support here at Wiki Education. While there, we had a few great discussions about the benefits and the risks of the Wikipedia assignment. The benefits: the project provides an authentic writing experience for students, allows instructors to engage in public outreach in their field, and provides students with 21st century literacy skills, among other things. But some of the drawbacks we discussed include concerns about student privacy and the difficulty for instructors of evaluating such a dynamic project. Luckily, while our Dashboard stores student first and last names for instructors, student Wikipedia usernames can be as anonymous as they want, helping to ensure student privacy. And Wiki Education just released a new assessment rubric for grading a Wikipedia contribution. We are always working to update our resources in response to instructor concerns.

During my trip it rained, it snowed, and it was pretty windy (and cold!), but that didn’t stop great conversations from flowing. Thanks to everyone who joined me. If you have any questions about how your students could work on an assignment like this, please contact us at contact@wikiedu.org.

by Samantha Weald at November 22, 2017 05:18 PM

Jeroen De Dauw

The Fallacy of DRY

DRY, standing for Don’t Repeat Yourself, is a well-known design principle in the software development world.

It is not uncommon for removal of duplication to take center stage via mantras such as “Repetition is the root of all evil”. Yet while duplication is often bad, the well-intended pursuit of DRY often leads people astray. To see why, let’s take a step back and look at what we want to achieve by removing duplication.

The Goal of Software

First and foremost, software exists to fulfill a purpose. Your client, which can be your employer, is paying money because they want the software to provide value. As a developer it is your job to provide this value as effectively as possible. This includes tasks beyond writing code to do whatever your client specifies, and might best be done by not writing any code. The creation of code is expensive. Maintenance of code and extension of legacy code is even more so.

Since creation and maintenance of software is expensive, the quality of a developer’s work (when just looking at the code) can be measured in how quickly functionality is delivered in a satisfactory manner, and how easy to maintain and extend the system is afterwards. Many design discussions arise about trade-offs between those two measures. The DRY principle mainly situates itself in the latter category: reducing maintenance costs. Unfortunately applying DRY blindly often leads to increased maintenance costs.

The Good Side of DRY

So how does DRY help us reduce maintenance costs? If code is duplicated, and it needs to be changed, you will need to find all places where it is duplicated and apply the change. This is (obviously) more difficult than modifying one place, and more error prone. You can forget about one place where the change needs to be applied, you can accidentally apply it differently in one location, or you can modify code that happens to be the same at present but should nevertheless not be changed due to conceptual differences (more on this later). This is also known as Shotgun Surgery. Duplicated code tends to also obscure the structure and intent of your code, making it harder to understand and modify. And finally, it conveys a sense of carelessness and lack of responsibility, which begets more carelessness.

Everyone who has been in the industry for a little while has come across horrid procedural code, or perhaps pretend-OO code, where copy-paste was apparently the favorite hammer of its creators. Such programmers indeed should heed DRY, because what they are producing suffers from the issues we just went over. So where is The Fallacy of DRY?

The Fallacy of DRY

Since removal of duplication is a means towards more maintainable code, we should only remove duplication if that removal makes the code more maintainable.

If you are reading this, presumably you are not a copy-and-paste programmer. Almost no one I ever worked with is. Once you know how to create well designed OO applications (i.e. by knowing the SOLID principles), are writing tests, etc., the code you create will be very different from the work of a copy-paste programmer. Even when adhering to the SOLID principles (to the extent that it makes sense) there might still be duplication that should be removed. The catch here is that this duplication will be mixed together with duplication that should stay, since removing it makes the code less maintainable. Hence trying to remove all duplication is likely to be counterproductive.

Costs of Unification

How can removing duplication make code less maintainable? If the costs of unification outweigh the costs of duplication, then we should stick with duplication. We’ve already gone over some of the costs of duplication, such as the need for Shotgun Surgery. So let’s now have a look at the costs of unification.

The first cost is added complexity. If you have two classes with a little bit of common code, you can extract this common code into a service, or if you are a masochist extract it into a base class. In both cases you got rid of the duplication by introducing a new class. While doing this you might reduce the total complexity by not having the duplication, and such an extraction might make sense in the first place, for instance to avoid a Single Responsibility Principle violation. Still, if the only reason for the extraction is reducing duplication, ask yourself if you are reducing the overall complexity or adding to it.

Another cost is coupling. If you have two classes with some common code, they can be fully independent. If you extract the common code into a service, both classes will now depend upon this service. This means that if you make a change to the service, you will need to pay attention to both classes using the service, and make sure they do not break. This is especially a problem if the service ends up being extended to do more things, though that is more of a SOLID issue. I’ll skip going over the results of code reuse via inheritance to avoid suicidal (or homicidal) thoughts in myself and my readers.

DRY = Coupling

— A slide at DDDEU 2017
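The coupling cost can be made concrete with a minimal sketch. Every name below (SlugGenerator, ArticleUrlBuilder, UploadNamer) is hypothetical and invented for illustration: two classes that each used to carry their own copy of a few lines now both depend on one extracted service.

```php
<?php
declare(strict_types=1);

// Hypothetical shared service, extracted only to remove duplication.
final class SlugGenerator
{
    public function slugify(string $text): string
    {
        $lower = strtolower(trim($text));
        // Collapse every run of non-alphanumeric characters into one dash.
        $slug = (string) preg_replace('/[^a-z0-9]+/', '-', $lower);
        return trim($slug, '-');
    }
}

// First consumer: builds article URLs.
final class ArticleUrlBuilder
{
    private SlugGenerator $slugs;

    public function __construct(SlugGenerator $slugs)
    {
        $this->slugs = $slugs;
    }

    public function urlFor(string $title): string
    {
        return '/articles/' . $this->slugs->slugify($title);
    }
}

// Second consumer: names uploaded files.
final class UploadNamer
{
    private SlugGenerator $slugs;

    public function __construct(SlugGenerator $slugs)
    {
        $this->slugs = $slugs;
    }

    public function fileNameFor(string $label): string
    {
        return $this->slugs->slugify($label) . '.png';
    }
}
```

If someone later changes slugify() because upload names should use underscores, every article URL changes as well. With two private copies of those three lines, that change would have stayed local.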

The coupling increases the need for communication. This is especially true in the large, when talking about unifying code between components or applications, and when different teams end up depending on the same shared code. In such a situation it becomes very important that it is clear to everyone what exactly is expected from a piece of code, and making changes is often slow and costly due to the communication needed to make sure they work for everyone.

Another result of unification is that code can no longer evolve separately. If we have our two classes with some common code, and in the first a small behavior change is needed in this code, this change is easy to make. If you are dealing with a common service, you might do something such as adding a flag. That might even be the best thing to do, though it is likely to be harmful design wise. Either way, you start down the path of corrupting your service, which now turned into a frog in a pot of water that is being heated. If you unified your code, this is another point at which to ask yourself if that is still the best trade-off, or if some duplication might be easier to maintain.
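As a rough illustration of the flag scenario, consider the sketch below. The function names and the formatting rules are made up for this example; the point is only the shape of the two options.

```php
<?php
declare(strict_types=1);

// Unified version: one shared function, bent to fit a second caller via a flag.
function formatAmount(float $amount, bool $forInvoice = false): string
{
    if ($forInvoice) {
        // Behavior only the invoice caller wants: no thousands separator,
        // and a currency code appended.
        return number_format($amount, 2, '.', '') . ' EUR';
    }
    return number_format($amount, 2, '.', ',');
}

// Duplicated version: each caller owns a tiny function that can evolve alone.
function formatAmountForWebPage(float $amount): string
{
    return number_format($amount, 2, '.', ',');
}

function formatAmountForInvoice(float $amount): string
{
    return number_format($amount, 2, '.', '') . ' EUR';
}
```

Both approaches produce the same output today. The difference shows up at the next change request: the flagged function gains another branch that every caller has to reason about, while the two small functions can diverge without touching each other.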

You might be able to represent two different concepts with the same bit of code. This is problematic not only because different concepts need to be able to evolve individually, it’s also misleading to have only a single representation in the code, which effectively hides that you are dealing with two different concepts. This is another point that gains importance the bigger the scope of reuse. Domain Driven Design has a strategic pattern called Bounded Contexts, which is about the separation of code that represents different (sub)domains. Generally speaking it is good to avoid sharing code between Bounded Contexts. You can find a concrete example of using the same code for two different concepts in my blog post on Implementing the Clean Architecture, in the section “Lesson learned: bounded contexts”.

DRY is for one Bounded Context

— Eric Evans in Good Design is Imperfect Design
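A small sketch of two concepts that share an implementation only by coincidence. The class names and the 50-character limit are assumptions chosen for illustration and do not come from the blog post referenced above.

```php
<?php
declare(strict_types=1);

// Concept one: a user's display name. Today: 1 to 50 characters.
final class DisplayNameRule
{
    public function isValid(string $name): bool
    {
        $length = strlen(trim($name));
        return $length >= 1 && $length <= 50;
    }
}

// Concept two: a product title. Today it happens to have the same limits,
// but nothing in the domain ties it to the display name rule.
final class ProductTitleRule
{
    public function isValid(string $title): bool
    {
        $length = strlen(trim($title));
        return $length >= 1 && $length <= 50;
    }
}
```

Merging these into one generic rule would remove a few duplicated lines, but it would also hide that there are two concepts, and it would force both to change whenever one of them does.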


for ($i = 0; $i < 4; ++$i) {
    doAction($i);
}

How many times is doAction called? 3 times? 4 times? What about in this snippet:


This demonstrates how removing duplication at a very low level can be harmful. If you have a good higher level example that can be understood unambiguously without providing a lot of extra context, do leave a comment.
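The contrast can be sketched in runnable form. This assumes the loop body is a single doAction call and that the comparison snippet is the unrolled equivalent of the loop; both are assumptions, and the closure below is a hypothetical stand-in for doAction that simply records its arguments.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for doAction(): records each argument it receives.
$calls = [];
$doAction = function (int $i) use (&$calls): void {
    $calls[] = $i;
};

// Loop form: the reader has to evaluate the start value, the bound and the
// comparison operator to work out how many calls happen.
for ($i = 0; $i < 4; ++$i) {
    $doAction($i);
}
$loopCalls = $calls;

// Unrolled form (assumed): the calls are duplicated, but their number and
// arguments are visible at a glance.
$calls = [];
$doAction(0);
$doAction(1);
$doAction(2);
$doAction(3);
$unrolledCalls = $calls;
```

Both forms record the same four calls; the unrolled one just does not make the reader work for it.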


Duplication itself does not matter. We care about code being easy (cheap) to modify without introducing regressions. Therefore we want simple code that is easy to understand. Pursuing removal of duplication as an end-goal rather than looking at the costs and benefits tends to result in a more complex codebase, with higher coupling, higher communication needs, inferior design and misleading code.

by Jeroen at November 22, 2017 06:06 AM

Why Every Single Argument of Dan North is Wrong

Alternative title: Dan North, the Straw Man That Put His Head in His Ass.

This blog post is a reply to Dan’s presentation Why Every Element of SOLID is Wrong. It is crammed full with straw man argumentation in which he misinterprets what the SOLID principles are about. After refuting each principle he proposes an alternative, typically a well-accepted non-SOLID principle that does not contradict SOLID. If you are not that familiar with the SOLID principles and cannot spot the bullshit in his presentation, this blog post is for you. The same goes if you enjoy bullshit being pointed out and broken down.

What follows are screenshots of select slides with comments on them underneath.

Dan starts by asking “What is a single responsibility anyway?”. Perhaps he should have figured that out before giving a presentation about how it is wrong.

A short (non-comprehensive) description of the principle: systems change for various different reasons. Perhaps a database expert changes the database schema for performance reasons, perhaps a User Interface person is reorganizing the layout of a web page, perhaps a developer changes business logic. What the Single Responsibility Principle says is that ideally changes for such disparate reasons do not affect the same code. If they did, different people would get in each other’s way. Possibly worse still, if the concerns are mixed together and you want to change some UI code, suddenly you need to deal with, and thus understand, the business logic and database code.

How can we predict what is going to change? Clearly you can’t, and this is simply not needed to follow the Single Responsibility Principle or to get value out of it.

Write simple code… no shit. One of the best ways to write simple code is to separate concerns. You can be needlessly vague about it and simply state “write simple code”. I’m going to label this Dan North’s Pointlessly Vague Principle. Congratulations sir.

The idea behind the Open Closed Principle is not that complicated. To partially quote the first line on the Wikipedia Page (my emphasis):

… such an entity can allow its behaviour to be extended without modifying its source code.

In other words, when you ADD behavior, you should not have to change existing code. This is very nice, since you can add new functionality without having to rewrite old code. Contrast this to shotgun surgery, where to make an addition, you need to modify existing code at various places in the codebase.
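A minimal sketch of that idea in Python. The exporter classes and `export_all` are hypothetical names chosen for illustration, not taken from the presentation or from any real codebase. New behavior arrives as a new class, while `export_all` stays untouched.

```python
from typing import Protocol


class Exporter(Protocol):
    """Anything that can turn a title into some output format."""

    def export(self, title: str) -> str: ...


class PlainTextExporter:
    def export(self, title: str) -> str:
        return title


class HtmlExporter:
    # Added later: no existing class or function had to change for this.
    def export(self, title: str) -> str:
        return f"<h1>{title}</h1>"


def export_all(title: str, exporters: list[Exporter]) -> list[str]:
    # Closed for modification: this loop never needs to know concrete types.
    return [exporter.export(title) for exporter in exporters]
```

With shotgun surgery, adding the HTML format would instead mean editing an `if`/`elif` chain in every place that exports something.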

In practice, you cannot gain full adherence to this principle, and you will have places where you will need to modify existing code. Full adherence to the principle is not the point. Like with all engineering principles, they are guidelines which live in a complex world of trade-offs. Knowing these guidelines is very useful.

Clearly it’s a bad idea to leave code in place that is wrong after a requirement change. That’s not what this principle is about.

Another very informative “simple code is a good thing” slide.

To be honest, I’m not entirely sure what Dan is getting at with his “is-a, has-a” vs “acts-like-a, can-be-used-as-a”. It does make me think of the Interface Segregation Principle, which, coincidentally, is the next principle he misinterprets.

The remainder of this slide is about the “favor composition over inheritance” principle. This is really good advice, which has been well-accepted in professional circles for a long time. This principle is about code sharing, which is generally better done via composition than inheritance (the latter creates very strong coupling). In the last big application I wrote I have several hundred classes and less than a handful inherit concrete code. Inheritance has a use which is completely different from code reuse: sub-typing and polymorphism. I won’t go into detail about those here, and will just say that this is at the core of what Object Orientation is about, and that even in the application I mentioned, this is used all over, making the Liskov Substitution Principle very relevant.

Here Dan is slamming the principle for being too obvious? Really?

“Design small, role-based classes”. Here Dan changed “interfaces” into “classes”, which results in a line that makes me think of the Single Responsibility Principle. More importantly, there is a misunderstanding about the meaning of the word “interface” here. This principle is about the abstract concept of an interface, not the language construct that you find in some programming languages such as Java and PHP. A class forms an interface. This principle applies to OO languages that do not have an interface keyword such as Python and even to those that do not have a class keyword such as Lua.

If you follow the Interface Segregation Principle and create interfaces designed for specific clients, it becomes much easier to construct or invoke those clients. You won’t have to provide additional dependencies that your client does not actually care about. In addition, if you are doing something with those extra dependencies, you know this client will not be affected.
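As a rough illustration in Python, which has no interface keyword: the sketch below uses `typing.Protocol` to describe two small roles. All class and function names are hypothetical. The client `word_count` only asks for the reading role, so it can be invoked with a tiny stand-in that knows nothing about writing.

```python
from typing import Protocol


class PageReader(Protocol):
    def read(self, title: str) -> str: ...


class PageWriter(Protocol):
    def write(self, title: str, text: str) -> None: ...


class InMemoryWiki:
    """One class that happens to satisfy both small interfaces."""

    def __init__(self) -> None:
        self._pages: dict[str, str] = {}

    def read(self, title: str) -> str:
        return self._pages.get(title, "")

    def write(self, title: str, text: str) -> None:
        self._pages[title] = text


class FixedTextReader:
    """A stand-in for tests: one method, no storage, no writing."""

    def __init__(self, text: str) -> None:
        self._text = text

    def read(self, title: str) -> str:
        return self._text


def word_count(reader: PageReader, title: str) -> int:
    # This client only reads, so it only depends on the reading role.
    return len(reader.read(title).split())
```

Changes to how pages are written cannot affect `word_count`, because it never sees `write`.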

This is a bit bizarre. The definition Dan provides is good enough, even though it is incomplete, which can be excused by it being a slide. From the slide it’s clear that the Dependency Inversion Principle is about dependencies (who would have guessed) and coupling. The next slide is about how reuse is overrated. As we’ve already established, this is not what the principle is about.

As to the Dependency Inversion Principle leading to DI frameworks that you then depend on… this is like saying that if you eat food, you might eat non-nutritious food such as sand, which is not healthy. The fix is not to reject food altogether, it is to not eat food that is non-nutritious. Remember the application I mentioned? It uses dependency injection all the way, without using any framework or magic. In fact, 95% of the code does not bind to the web-framework used due to adherence to the Dependency Inversion Principle. (Read more about this application)
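Dependency injection without a framework can be as plain as passing collaborators to a constructor. The following Python sketch is not code from the application mentioned above; `Greeter` and `build_production_greeter` are hypothetical names used only to show the shape.

```python
import datetime
from typing import Callable


class Greeter:
    """Depends on an abstraction (a clock callable), not on the system clock."""

    def __init__(self, now: Callable[[], datetime.datetime]) -> None:
        self._now = now

    def greeting(self) -> str:
        return "Good morning" if self._now().hour < 12 else "Good afternoon"


def build_production_greeter() -> Greeter:
    # The composition root: the only place that knows the concrete dependency.
    return Greeter(datetime.datetime.now)
```

A test can construct `Greeter` with a lambda returning a fixed time, so no container, annotation, or magic is involved.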

That attitude explains a lot about the preceding slides.

Yeah, please do write simple code. The SOLID principles and many others can help you with this difficult task. There is a lot of hard-won knowledge in our industry and many problems are well understood. Frivolously rejecting that knowledge with “I know better” is an act of supreme arrogance and ignorance.

I do hope this is the category Dan falls into, because the alternative of purposefully misleading people for personal profit (attention via controversy) rustles my jimmies.

If you’re not familiar with the SOLID principles, I recommend you start by reading their associated Wikipedia pages. If you are like me, it will take you practice to truly understand the principles and their implications and to find out where they break down or should be superseded. Knowing about them and keeping an open mind is already a good start, which will likely lead you to many other interesting principles and practices.

by Jeroen at November 22, 2017 05:54 AM

November 21, 2017

Wikimedia Performance Team

The journey to Thumbor, part 3: development and deployment strategy

In the last blog post I described where Thumbor fits in our media thumbnailing stack. Introducing Thumbor replaces an existing service, and as such it's important that it doesn't perform worse than its predecessor. We came up with a strategy to reach feature parity and ensure a launch that would be invisible to end users.


In Wikimedia production, Thumbor was due to interact with several services: Varnish, Swift, Nginx, Memcached, Poolcounter. In order to iron out those interactions, it was important to reproduce them locally during development. Which is why I wrote several roles for the official Mediawiki Vagrant machine, with help from @bd808. Those have already been useful to other developers, with several people reaching out to me about the Varnish and Swift Vagrant roles. While at the time it might have seemed like an unnecessary quest (why not develop straight on a production machine?) it was actually a great learning experience to write the extensive Puppet code required to make it work. While it's a separate codebase, subsequent work to port that over to production Puppet was minimal.

This phase actually represented the bulk of the work, reproducing support for all the media formats and special parameters found in Mediawiki thumbnailing. I dedicated a lot of attention to making sure that the images generated by Thumbor were as good as what Mediawiki was outputting for the same original media. In order to do that, I wrote many integration tests using thumbnails from Wikimedia production, which were used as reference output. Those tests are still part of the Thumbor plugins Debian package and ensure that we avoid regressions. They use a DSSIM algorithm to visually compare images and make sure that what Thumbor outputs doesn't visually diverge from the reference thumbnails. We also compare file size to make sure that the new output isn't significantly heavier than the old.
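The shape of such a reference check can be sketched in Python as follows. This is not the code from the Thumbor plugins test suite: the dissimilarity function is a crude mean absolute pixel difference used as a stand-in, not DSSIM, and both thresholds are made-up values.

```python
from typing import Sequence


def mean_abs_difference(a: Sequence[int], b: Sequence[int]) -> float:
    """Crude dissimilarity between two equal-length grayscale pixel lists (0-255).

    Stand-in for illustration only; this is not DSSIM.
    Returns 0.0 for identical images and 1.0 for black versus white.
    """
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / (255 * len(a))


def thumbnail_matches_reference(
    candidate_pixels: Sequence[int],
    reference_pixels: Sequence[int],
    candidate_bytes: int,
    reference_bytes: int,
    max_dissimilarity: float = 0.01,  # hypothetical threshold
    max_size_ratio: float = 1.1,  # hypothetical: at most 10% heavier
) -> bool:
    visually_close = (
        mean_abs_difference(candidate_pixels, reference_pixels) <= max_dissimilarity
    )
    not_heavier = candidate_bytes <= reference_bytes * max_size_ratio
    return visually_close and not_heavier
```

Keeping the visual check and the size check separate makes it obvious which of the two a failing thumbnail violated.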


The next big phase of the project was to create a Debian package for our Thumbor code. I had never done that before and it wasn't as difficult as some people make it out to be (I imagine the tooling has gotten significantly better than it used to be), at least for Python packages. However, in order to be able to ship our code as a Debian package, Thumbor itself needed to have a Debian package, which wasn't the case at the time. Some people had tried on much older versions of Thumbor but never reached the point where it was put in Debian proper. Since that last attempt, Thumbor added a lot of new dependencies that weren't packaged either. @fgiunchedi and I worked on packaging it all and successfully did so. And with the help of Debian developer Marcelo Jorge Vieira who pushed most of those packages for us into Debian, we crossed the finish line recently and got Thumbor submitted to Debian unstable.

One advantage of doing this is that it makes deployment of updates really straightforward, with the integration test suite I mentioned earlier running in isolation when the Debian package is built. With those Debian packages done, we were ready to run this on production machines.

But the more important advantage is that by having those Debian packages in Debian itself, other people are using the exact same versions of Thumbor's dependencies and Thumbor itself via Debian, thus greatly expanding the exposure of the software we run in production. This increases the likelihood that security issues we might be exposed to are found and fixed.


Trying to reproduce the production setup locally is always limited. The full complexity of production configuration isn't there, and everything is still running on the same machine. The next step was to convert the Vagrant Puppet code into production Puppet code. Which allowed us to run this on the Beta cluster as a first step, where we could reproduce a setup closer to production with several machines. This was actually an opportunity to improve the Beta cluster to make it have a proper Varnish and Swift setup closer to production than it used to have. Just like the Vagrant improvements, those changes quickly paid off by being useful to others who were working on Beta.

Just like packaging, this new step revealed bugs in the Thumbor plugins Python code that we were able to fix before hitting production.


The Beta wikis only have a small selection of media, and as such we still hadn't been exposed to the variety of content found on production wikis. I was worried that we would run into media files that had special properties in production that we hadn't run into during the whole development phase. Which is why I came up with a plan to dual-serve all production requests to the new production Thumbor machines and compare output.

This consisted of modifications to the production Swift proxy plugin code we have in place to rewrite Wikimedia URLs. Instead of sending thumbnail requests to just Mediawiki, I modified it to also send the same requests to Thumbor. At first completely blindly, the Swift proxy would send requests to Thumbor and not even wait to see the outcome.
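A minimal Python sketch of the fire-and-forget idea, assuming the two backends can be represented as plain callables. This is not the actual Swift proxy plugin code, and `dual_serve` is a hypothetical name.

```python
import threading
from typing import Callable


def dual_serve(
    path: str,
    primary: Callable[[str], int],
    shadow: Callable[[str], int],
) -> int:
    """Answer from `primary`; mirror the request to `shadow` without waiting."""

    def mirror() -> None:
        try:
            shadow(path)
        except Exception:
            # A failing shadow backend must never affect the real response.
            pass

    # Daemon thread: the caller never joins it and never sees its outcome.
    threading.Thread(target=mirror, daemon=True).start()
    return primary(path)
```

The response returned to the user depends only on `primary`, which is what keeps the experiment invisible.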

Then I looked at the Thumbor error logs and found several files that were problematic for Thumbor and not for Mediawiki. This allowed us to fix many bugs that we would have normally found out about during the actual launch. This was also the opportunity to reproduce and iron out the various throttling mechanisms.

To be more thorough, I made the Swift proxy log HTTP status codes returned by Mediawiki and Thumbor and produced a diff, looking for files that were problematic for one and not the other. This allowed us to find more bugs on the Thumbor side, and a few instances of files that Thumbor could render properly that Mediawiki couldn't!
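Such a diff could look roughly like this in Python, assuming the logged status codes have already been collected into one dictionary per backend, keyed by file path. The function name and data layout are assumptions, not the format of the real logs.

```python
def diff_status_codes(
    mediawiki: dict[str, int], thumbor: dict[str, int]
) -> dict[str, tuple[int, int]]:
    """Return path -> (mediawiki_status, thumbor_status) where the two differ.

    Only paths seen by both backends are compared.
    """
    return {
        path: (mediawiki[path], thumbor[path])
        for path in sorted(mediawiki.keys() & thumbor.keys())
        if mediawiki[path] != thumbor[path]
    }
```

Entries where only the second status is an error point at Thumbor bugs; entries where only the first is an error are files that Thumbor handles and Mediawiki does not.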

This is also the phase where under the full production load, our Thumbor configuration started showing significant issues around memory consumption and leaks. We were able to fix all those problems in that fire-and-forget dual serving setup, with no impact at all on production traffic. This was an extremely valuable strategy, as we were able to iterate quickly in the same traffic conditions as if the service had actually launched, without any consequences for users.


With Thumbor running smoothly on production machines, successfully rendering a superset of thumbnails Mediawiki was able to, it was time to launch. The dual-serving logic in the Swift proxy came in very handy: it became a simple toggle between sending thumbnailing traffic to Mediawiki and sending it to Thumbor. And so we did switch. We did that gradually, having more and more wikis' thumbnails rendered by Thumbor over the course of a couple of weeks. The load was handled fine (predictable, since we were handling the same load in the dual-serving mode). The success rate of requests based on HTTP status codes was the same before and after.
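A per-wiki toggle of this kind can be sketched in a few lines of Python. `ThumbnailRouter` and the backend labels are hypothetical and only illustrate the gradual switch, not the real proxy configuration.

```python
class ThumbnailRouter:
    """Routes each wiki's thumbnail traffic to one of two backends."""

    def __init__(self) -> None:
        self._thumbor_wikis: set[str] = set()

    def enable_thumbor(self, wiki: str) -> None:
        # Gradual rollout: wikis are moved over one at a time.
        self._thumbor_wikis.add(wiki)

    def roll_back_all(self) -> None:
        # One switch sends everything back to the old backend.
        self._thumbor_wikis.clear()

    def backend_for(self, wiki: str) -> str:
        return "thumbor" if wiki in self._thumbor_wikis else "mediawiki"
```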

However after some time we started getting reports of issues around EXIF orientation. A feature we had integration tests for. But the tests only covered 180 degrees rotation and not 90 degrees (doh!). The Swift proxy switch allowed us to quickly switch traffic back to Mediawiki. We did so because it's quite a prevalent feature in JPGs. We fixed that one large bug, switched the traffic back to Thumbor and that was it.
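A sketch of what fuller coverage might look like, using small nested lists instead of real JPEGs. It covers the four non-mirrored EXIF orientation values (1, 3, 6 and 8); the helper names and the `EXPECTED` table are made up for this example and are not the project's actual tests.

```python
Grid = list[list[int]]


def rotate_90_cw(grid: Grid) -> Grid:
    """Rotate a grid of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def apply_exif_orientation(grid: Grid, orientation: int) -> Grid:
    """Return the upright image for the non-mirrored EXIF orientation values."""
    # Number of clockwise quarter turns needed to display the image upright.
    quarter_turns = {1: 0, 6: 1, 3: 2, 8: 3}
    if orientation not in quarter_turns:
        raise ValueError(f"unsupported orientation: {orientation}")
    for _ in range(quarter_turns[orientation]):
        grid = rotate_90_cw(grid)
    return grid


STORED = [[1, 2], [3, 4]]
EXPECTED = {
    1: [[1, 2], [3, 4]],  # already upright
    3: [[4, 3], [2, 1]],  # 180 degrees
    6: [[3, 1], [4, 2]],  # 90 degrees clockwise
    8: [[2, 4], [1, 3]],  # 90 degrees counter-clockwise
}
```

Looping over every key of `EXPECTED` in a test means a missing case, such as the 90 degree rotations, shows up as a gap in the table rather than as a bug report after launch.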

Some minor bugs surfaced later regarding much less common files with special properties, which we were able to fix very quickly and deploy safely and easily with the Debian package. But we could have avoided all of those bugs too if we had been more thorough in the dual-serving phase. We were only comparing HTTP status codes between Mediawiki and Thumbor. However, rendering a thumbnail successfully doesn't mean that the visual contents are right! The JPG orientation could be wrong, for example. If I had to do it again, I would have run DSSIM visual comparisons on the live dual-served production traffic between the Mediawiki and Thumbor outputs. That would have definitely surfaced the handful of bugs that appeared post-launch.


All in all, if you do your homework and are very thorough in testing locally and on production traffic, you can achieve a very smooth launch replacing a core part of infrastructure with completely different software. Despite the handful of avoidable bugs that appeared around the launch, the switch to Thumbor went largely unnoticed by users, which was the original intent, as we were looking for feature parity and ease of swapping the new solution in. Thumbor has been happily serving all Wikimedia production thumbnail traffic since June 2017 in a very stable fashion. This concludes our journey to Thumbor :)

by Gilles (Gilles Dubuc) at November 21, 2017 09:54 PM

Wikimedia Tech Blog

How we designed a more fluent and visually consistent experience for translators

Photo by Marco Verch, CC BY 2.0.

Translation is a key workflow for content creation in many wikis. Content translation has already helped editors create more than 250,000 Wikipedia articles. Not only have experienced editors and expert translators become more productive with the tool, but it has also lowered the barrier to entry for new editors.

Although content translation is currently used by many users, the tool is still provided as a beta feature, and there are still many aspects to polish. In particular, we wanted to tackle some pending work accumulated over time around consistency of the visual style, simplification of key workflows, and support for small screens and touch devices.

The translation dashboard as it looks after the design refresh. Image via Phabricator, license.

The arrival of a new member to the Global Collaboration team was a good opportunity to deal with this design debt. Petar Petković started as an intern in June 2017 and has been leading the development efforts with the collaboration of other members of the team as well as people from other teams at the Wikimedia Foundation.

More consistent visual style

We wanted to update the visual style of content translation in order to better align with the Wikimedia design style guide. The style guide is an effort started by the Wikimedia design team, aimed at helping designers and developers across the Wikimedia movement to create more consistent products. The style guide is still in its early stages, and we welcome interested people to get involved and contribute to the style guide repository.

Although the style guide is far from complete, we wanted to start applying some of the style guide’s initial concepts to existing products in order to learn more about how utilizing the style guide’s principles works in practice.

Changes in colors, spacing, and iconography are small details that add to a more consistent visual experience. Image via Phabricator, license.

Content translation has been updated to make use of the colors defined by the Wikimedia style guide color palette. This not only brings visual consistency but helps to achieve a higher degree of accessibility, since colors in the palette were chosen to provide a high level of contrast. In addition, spacing, layering, icons and UI controls were updated to provide a more clear and consistent experience.

As a result, the information in the content translation dashboard is now more pleasant to read, even under sub-optimal light conditions or for those with sight issues. The content translation dashboard is also more consistent visually with some of the new Wikimedia projects, including the different mobile products.

Simplifying key workflows

Picking an article to translate is a key step in the translation process. From early user research sessions we learned that users are motivated by working on topics that interest them, and by the impact their translations can make for readers. In addition, we knew that making the process more fluent would lower the barrier to start more translations.

Our initial dialog for starting a translation was a basic form that did not consider many of the above aspects. We redesigned the process to both simplify it and better guide the user.

The new workflow to start a translation consists of three steps. GIF via Imgur, license.

In the new process, we surface articles that the user recently edited, since the user may be interested in translating them. We also indicate the number of languages in which the articles are available and the number of views the article receives. These are two indicators of the relevance of the topic and the potential impact a translation may have.

The selection of an article is now more guided, providing clear visual cues for each stage of the process. In addition, the amount of information to be provided by the user has been reduced to make the process more fluent. Users can quickly search for an article to translate rather than feeling like they are filling out an elaborate form.

This is the old dialog we are saying goodbye to.
Image via Commons, license.

Finally, the main action to create a new translation is now shown as the most prominent action in the translation dashboard. This has resulted in the rearrangement of the view controls and the language selectors leading to a clearer visual hierarchy.

Supporting small screens and touch devices

Content translation was designed to support the current translation process of Wikipedia articles. Since this process requires heavy text input and consulting multiple information pieces, it has not been optimised for mobile devices.

While we were not aiming at supporting a full mobile experience with this design refresh, we have been identifying several smaller areas that could make life better for those translators using small screens and touch devices. For example, those using a low resolution laptop with a trackpad or even a tablet.

On small screens, the information modules get rearranged, and controls become more compact. Image via Phabricator, license.

Following a responsive design approach, the translation dashboard now makes better use of the available space. Information panels get rearranged and controls such as the language or view selectors get compacted or expanded depending on the screen size. We also corrected the few instances where the only way to perform an action required using hover or accessing an element with a small active area, which limited the use of touch interactions.

Next steps

Most of this work has been focused on the translation dashboard of content translation, which is the main view to pick articles to translate, and manage your existing translations. The main translation view where users translate the content of an article can also benefit from a similar intervention, and we have identified aspects to adjust on it next.

Feel free to provide your feedback about the improvements on content translation in the project talk page, as well as making any proposal for the more general efforts on the design style guide.

Pau Giner, Senior User Experience Designer, Audiences Design
Wikimedia Foundation

by Pau Giner at November 21, 2017 07:18 PM

Wiki Education Foundation

Teaching rhetoric in digital environments

Cathy Gabor, an Associate Professor in the Department of Rhetoric and Language at the University of San Francisco (USF), has her students edit Wikipedia entries in her Rhetoric 295 class. In 2017, she won the Innovation in Teaching with Technology Award at USF.

I regularly teach a Transfer Year Seminar (TYS), which is just like a First Year Seminar, but the students are all transfer students to the University of San Francisco (USF). Regardless of how many writing classes transfer students have taken at previous institutions, USF requires students to take one writing class at our institution in order to fully understand the writing traditions and expectations of our university. The TYS classes are taught on specialty topics, picked by the professors. My class is called NewMedia/YouMedia: Writing in Electronic Environments (Rhetoric 295). I designed my class around this topic in hopes that it would not feel like any writing class the students had previously taken.

Cathy Gabor
File:Cathy Gabor.jpg, by Cathygaborusf, CC BY-SA 4.0, via Wikimedia Commons.

In my TYS class, I ask students to develop research questions, engage in the successes and frustrations of the research process, and produce researched writing. In previous semesters, students wrote “term papers” which ended up in my bottom drawer: great work exposed only to an audience of one (me). In order to give the students a public platform and wider audience for their work, I redesigned the research paper project: now my students create or significantly edit Wikipedia articles.

Most students do find the class very different from previous required writing classes and most students identify the Wikipedia editing project as both fun and challenging at the same time. Anjelica’s reflective comment is fairly representative: “The Wikipedia assignment was the most interesting and was where I’ve done the most extensive research I have ever done. This taught me how to discuss with others, find valid sources, and write in a neutral tone which was a great challenge.”

The Wikipedia editing project fulfills several key learning goals. First, students work in groups, and, therefore, learn how to think about collaborative work metacognitively. They assess their group members’ individual and collective strengths; they decide when to split up the tasks at hand and when co-authoring will work better; and, they learn how to navigate disagreement within the group about the direction of the research. This small group navigation prepares the students to negotiate in real time with other Wikipedians who are editing and contributing to the same article. And, of course, the recursive process of researching-writing-rewriting-re-researching helps students conceive of how knowledge is actually made for academic and public audiences.

Before each semester starts, I identify terms related to rhetoric and digital writing that are underdeveloped in Wikipedia. Then, I narrow down the list and present my students with four choices of Wikipedia terms. They form four groups and start researching and editing. They have worked on entries from Produsage to Fake News. Once in a while, I will come up with a term that does not have a Wikipedia page but needs one. For example, I noticed that there was no page for a cornerstone concept of Jesuit rhetoric: Eloquentia Perfecta. I worked with the Wiki Education Liaison to create a shell for this entry, and my students produced the first-ever Wikipedia page on Eloquentia Perfecta.

One student, a marketing major named Chelsey Eckhardt, worked on that new Eloquentia Perfecta entry. When she signed up for that group, she said aloud to me and her classmates, “I hope I don’t regret choosing this entry.” By the end of the project, Chelsey realized that both Wikipedia and the principles of Eloquentia Perfecta had become “integrated into her life,” including her part-time job and her other classes. Chelsey explains the impact in this short video:

Faculty across the disciplines could easily employ a Wikipedia editing project because Wikipedia supports entries in every field. Such a project could precede, follow, or replace a traditional term paper. In other words, students might begin their research in a group format and contribute to a Wikipedia entry and then produce an individually-authored paper based on the research already done. Likewise, students could write their own term papers first and then collaborate with classmates to edit a Wikipedia entry related to their term paper subject. Either of these models would give students (and faculty) a platform for comparing how evidence gets presented for academic versus public audiences. Finally, faculty may decide, as I did, that students have had and will have plenty of traditional research assignments and use a Wikipedia-authoring project to engender student enthusiasm for taking writing and research beyond the classroom.

Image: File:University Center – University of San Francisco – San Francisco, CA – DSC02659.JPG, by Daderot, Public Domain Dedication, via Wikimedia Commons.

by Guest Contributor at November 21, 2017 05:19 PM


Central User Administration in MediaWiki. Top 10 Questions about LDAP and Active Directory.

The central authentication server decides who is allowed to access MediaWiki. Image: Modern Bouncers by Xxinvictus34535 (CC BY-SA 4.0), via Wikimedia Commons.

Today, a connection to a central authentication server is standard for a corporate wiki. In the following introduction, we explain the most important background, processes and concepts.

Beyond a certain size companies manage users and user groups centrally in one common directory. Of course, MediaWiki can also be connected to such a central directory with the help of LDAP, which greatly simplifies the life of administrators. In this and in the following articles we would like to explain the basics and the current developments. We will start with a few basic questions.


1. What is LDAP and what is the AD?

The Lightweight Directory Access Protocol is a network protocol. That is a specific exchange format of data – in our particular case, between the wiki and a central user directory. It is comparable to other standardized languages such as SQL. Just as you address certain databases like MySQL with SQL, there are also user directory services that can be controlled via LDAP.

The Active Directory is the user directory service from Microsoft, where the specific rights management of a company is located and which in turn provides an LDAP interface.

There are several alternatives to Microsoft’s AD, especially in the open source area, such as Apache Directory Server, Novell eDirectory or OpenLDAP. As long as these systems support the LDAP protocol, a connection faces no major hurdles.


2. Why should a wiki be connected to a central user administration via LDAP?

It makes sense because user administration is then not organized separately in the wiki; instead, access can be regulated centrally and uniformly. For example, users of a certain group can be allowed to see only the parts of the wiki that are relevant for that group.

In addition, users can log into the wiki with their company-wide password, so they only have to remember one username and password. This is a useful function, especially if your company has password policies that force users to change their password frequently.


3. What is single-sign-on?

This is another useful feature that can be implemented once the wiki has been connected via LDAP: users only have to log in to the company’s network once and are then automatically logged into the wiki. Not only do they have a single password, they also no longer have to type it into the wiki’s login form.


4. What are the steps of an automated login?

In the first step the user is authenticated, that is user name and password are queried and it is checked whether the inputs are correct. If so, the user gets access to the wiki.

Then the second step takes the form of authorization: the system looks at how the user’s path is built and which attributes he has. From this, for example, the groups, the language, the e-mail address, and additional attributes are drawn for the wiki.
Sometimes, the central user directory is only used to check usernames and passwords. In this case, groups and rights can be managed from within the wiki.
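The two steps can be sketched in Python as follows. The directory here is a hypothetical in-memory dictionary standing in for a real LDAP server, and passwords are kept in plain text only to keep the sketch short; a real setup would delegate the password check to the directory itself.

```python
import hmac
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DirectoryEntry:
    password: str  # plain text only for the sake of the sketch
    attributes: dict[str, str] = field(default_factory=dict)
    groups: list[str] = field(default_factory=list)


# Hypothetical in-memory stand-in for a central user directory.
DIRECTORY = {
    "jdoe": DirectoryEntry(
        "s3cret", {"mail": "jdoe@example.com", "language": "de"}, ["Sales"]
    ),
}


def authenticate(username: str, password: str) -> bool:
    """Step 1: are the user name and password correct?"""
    entry = DIRECTORY.get(username)
    if entry is None:
        return False
    return hmac.compare_digest(entry.password.encode(), password.encode())


def authorize(username: str) -> dict:
    """Step 2: which groups and attributes does the wiki take over?"""
    entry = DIRECTORY[username]
    return {"groups": list(entry.groups), **entry.attributes}


def login(username: str, password: str) -> Optional[dict]:
    if not authenticate(username, password):
        return None
    return authorize(username)
```

If the directory is only used for checking credentials, `login` would stop after `authenticate` and the wiki would look up groups in its own database.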


5. What are group rights? Can group rights be taken over from the LDAP / AD?

Both in the wiki and in the central user directory, the permissions of the users (e.g. read or write permissions) are defined via groups to which the respective users are assigned. When connecting the wiki, it is recommended to take over the groups from the central directory and then assign the wiki-specific rights (for example, deleting wiki articles) to them in the wiki. If this takeover is not desired, user rights management can continue to be done in the wiki.
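In code, that division of labor might look like the Python sketch below: group membership comes from the directory, while the rights attached to each group are defined on the wiki side. The group names, the mapping, and the rights table are all hypothetical examples.

```python
# Hypothetical mapping from directory groups to wiki groups.
GROUP_MAP = {"Wiki-Admins": "sysop", "Editors": "editor"}

# Hypothetical rights attached to wiki groups, maintained in the wiki.
WIKI_RIGHTS = {
    "sysop": {"read", "edit", "delete"},
    "editor": {"read", "edit"},
    "user": {"read"},
}


def wiki_groups(directory_groups: list[str]) -> set[str]:
    """Translate directory groups; every logged-in user is also in 'user'."""
    mapped = {GROUP_MAP[g] for g in directory_groups if g in GROUP_MAP}
    return mapped | {"user"}


def rights_for(directory_groups: list[str]) -> set[str]:
    rights: set[str] = set()
    for group in wiki_groups(directory_groups):
        rights |= WIKI_RIGHTS[group]
    return rights
```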


6. Are there also disadvantages to connecting a wiki, e.g. with regard to security?

The connection only uses the directory service (such as the AD) for read-only purposes, and the information exchanged between the wiki server and the directory server is queried via secure connections and is not forwarded directly to the outside. Of course, any connection to external systems carries the risk of additional security vulnerabilities, but the safeguards we have made are in line with current standards and we do not expect the data to be misused.

The organizational argument against LDAP connectivity is more of an issue in many cases: depending on how the organization is structured, user management is most likely run as a centralized service. And a directory server is of course a sensitive tool. That is, typically not everyone has easy access, not even the individual departments, but only a specific administrator, who then creates the users, the rights and so on. And that means you have to initiate a process for creating a user, which can be very time-consuming and tedious. That is simply a matter of how the organization is set up. If it is unproblematic and handled quickly, it is not an issue.


7. To what extent does MediaWiki itself have to be prepared for the connection?

MediaWiki needs an extension that communicates with the directory server. The basic installation of MediaWiki is not able to do that. At most it provides so-called hooks. These are designated places in the code to which you can attach the various authentication extensions (also for other systems such as OpenID, WordPress, etc.). Said extension has already been developed by the community, is available on mediawiki.org (search for “LDAP”) and needs to be installed before the connection is made.


8. What are the biggest difficulties and hurdles for a connection?

Some companies have very complex user directories. A user path that always follows the same pattern, e.g. account name, department, country name and domain, is relatively easy to connect. But the user domains of companies have often been built up according to other criteria or have grown historically. Then there can be difficult requirements, where users from a variety of places should be allowed access to the wiki, and we need to think about which distinctive criteria identify those users. Or the company has a collective user (that is, an account that several natural persons use), which is to be resolved in the wiki into individual users.

Overall, the query should be formulated as accurately as possible for the users who can log on. Especially for large international companies you need a contact person on the customer side who knows the local directory services well and can give us the significant attributes that should be queried. Otherwise, the performance may suffer.
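To make "formulated as accurately as possible" a little more concrete, here is a minimal sketch in Python of a narrow LDAP search filter that matches one account inside one group. It only builds the filter string and does not talk to a directory. The attribute names (sAMAccountName, memberOf) are the usual Active Directory ones; the group DN and the function names are assumptions made up for this illustration, not part of the extension mentioned above.

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_filter(username: str, group_dn: str) -> str:
    """Build a filter that matches one user account inside one group.

    Hypothetical helper: a real deployment would use the attributes
    named by the customer's directory administrator.
    """
    return (
        "(&(objectClass=user)"
        f"(sAMAccountName={escape_filter_value(username)})"
        f"(memberOf={escape_filter_value(group_dn)}))"
    )


# Example with an invented group DN.
print(build_user_filter("j.doe", "CN=wiki-users,OU=Groups,DC=example,DC=com"))
```

Restricting the filter to the attributes that actually distinguish wiki users keeps the directory from scanning entries that could never log on, which is where the performance concern above comes from.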

Another challenge is the distribution of user management across different servers, for example if a company merger has taken place and users of all companies involved (with different directories and servers) should access the wiki. Here you can clarify, for example with a pre-query in the form of a switch, which organization the user comes from and which server the wiki should query accordingly.
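The "switch" idea can be sketched as a simple lookup: the login form asks for the organization first, and the wiki then picks the matching directory server. This is only an illustration in Python; the organization keys, server addresses and function name are invented for the example.

```python
# Hypothetical mapping from organization to its directory server.
DIRECTORY_SERVERS = {
    "company-a": "ldaps://ldap.company-a.example",
    "company-b": "ldaps://ldap.company-b.example",
}


def pick_directory_server(organization: str) -> str:
    """Return the directory server to query for the chosen organization."""
    try:
        return DIRECTORY_SERVERS[organization]
    except KeyError:
        # Refuse unknown organizations instead of guessing a server.
        raise ValueError(f"unknown organization: {organization!r}") from None


print(pick_directory_server("company-b"))
```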


9. What information does Hallo Welt! need before the deployment of a connection?

Of course, we need the address and path to the directory server (“LDAP server”) and typically a so-called proxy user. This is a (non-real) user whose password never expires and which has the task and authorization to read all information about specific users from the directory service; it acts as a kind of communication channel with the directory service.


10. Is technical knowledge necessary for the connection?

Yes, definitely. In principle, such a connection should only be tackled by an expert, or at least by someone who already has experience in the field of LDAP and AD.

The post Central User Administration in MediaWiki. Top 10 Questions about LDAP and Active Directory. appeared first on BlueSpice Blog.

by Anja Ebersbach at November 21, 2017 10:03 AM


Monkey selfies and Technollamas

The monkey selfie is back thanks to a This American Life program. While it mostly deals with Slater vs PETA and gets that right, its coverage of Wikipedia’s role is more questionable. Techdirt has the details of that:


From the copyright nerd POV the most interesting fallout is Technollama’s attempt to do an analysis of the case under UK law:


While I broadly agree with their analysis (although I think they underestimate the differences between civil and common law copyright) a lot rests on the statement “If we believe Slater’s own telling of the story”. The reality is Slater’s telling of the story has been inconsistent. The initial version had the monkey picking up the camera and the whole thing being unplanned. There are reasons to be sceptical of the camera on a tripod claim. In particular one of the shots shows Slater resting his left hand on a tripod. I don’t exactly travel light in photography terms but I don’t carry more than one tripod unless I have a car with me (and even then the second tripod is a mini one). Other photos in the series were taken at different heights, which again suggests a tripod wasn’t used. Technollama also argues that Slater selected the lens aperture. It’s possible. With wide angle lenses it’s hard to judge the depth of field well enough to tell. However the exposure (check the lighting) jumps around a fair bit between pics depending on how much of the monkey is in shot (most obvious by looking at how light the leaves are in the background). That is a fairly clear sign of the camera controlling the exposure (a human would be more likely to underexpose a touch to try and avoid blowing the highlights before trying to bring the shadow detail out in post).

Post brings us to Slater’s actions after the picture was taken. My feeling is that this is Slater’s strongest case. None of the images are at the camera’s native resolution or even the same ratio as the camera’s native resolution, suggesting some rotation and cropping. It’s impossible to say if the colour balance has been changed. Does rotation and cropping qualify for copyright? Perhaps, although the UK’s Intellectual Property Office says, “it seems unlikely that what is merely a retouched, digitised image of an older work can be considered as ‘original’”. Does cropping and rotating count as merely retouching the older monkey-produced image? How would the courts rule? I don’t think there is any direct case-law yet.

In the meantime we are getting a bunch of emails to OTRS blaming Wikipedia for Mr Slater’s issues and financial position. This is, I’d argue, somewhat unfair. The raising of the issue of the image’s copyright status started with Techdirt, not us. More broadly the problem is due to the changing nature of the wildlife photography market. We now live in a world where you have a bunch of people who can afford high end camera gear and actively enjoy taking it to strange places and taking pictures of wildlife with it. While these people have always existed, in the past it wasn’t easy for them to offer their images for sale. Now it is. Being in the right place with a decent camera and the ability and willingness to sell your photos isn’t worth what it once was.

by geniice at November 21, 2017 06:21 AM

November 20, 2017

Wikimedia Cloud Services

Ubuntu Trusty now deprecated for new WMCS instances

Long ago, the Wikimedia Operations team made the decision to phase out use of Ubuntu servers in favor of Debian. It's a long, slow process that is still ongoing, but in production Trusty is running on an ever-shrinking minority of our servers.

As Trusty becomes more of an odd duck in production, it grows harder to support in Cloud Services as well. Right now we have no planned timeline for phasing out Trusty instances (there are 238 of them!) but in anticipation of that phase-out we've now disabled creation of new Trusty VMs.

This is an extremely minor technical change (the base image is still there, just marked as 'private' in OpenStack Glance). Existing Trusty VMs are unaffected by this change, as are present ToolForge workflows.

Even though any new Trusty images represent additional technical debt, the WMCS team anticipates that there will still be occasional, niche requirements for Trusty (for example when testing behavior of those few remaining production Trusty instances, or to support software that's not yet packaged on Debian). These requests will be handled via Phabricator requests and a bit of commandline magic.

by Andrew (Andrew Bogott) at November 20, 2017 06:32 PM

Wiki Education Foundation

Roundup: Native American Heritage month

In the United States, November is Native American Heritage Month. It is a month to recognize the rights and achievements of indigenous peoples and to provide a platform for the sharing of Native cultures and traditions. In light of this month, we’re highlighting student work in Carwil Bjork-James’ course at Vanderbilt University, titled Human Rights of Indigenous Peoples.

One student significantly expanded the article on Lean Bear, a Cheyenne peace chief who lived from 1813–1864. This student’s additions now give context for Lean Bear’s tribal governance role, the notable peace deals in which he took part, and his murder. Lean Bear was chosen to be a part of the Council of Forty-four, a council that promoted peace between Cheyennes and white settlers. He took part in the signing of the Treaty of Fort Wise in 1861 to this end. Lean Bear also met then-President Abraham Lincoln a few years later at the White House, and asked that the President denounce violence against native peoples. About a year after his request, Lean Bear was murdered by troops of the 1st Colorado Regiment under the command of Lieutenant George Eayre, who were ordered to kill Cheyennes on sight. While there are no confirmed photos of Lean Bear, the student also added a photo that is believed to be Lean Bear. Caption of the photo uploaded by the student: “Cheyenne Peace Chief believed to be Lean Bear. Taken 1863, in Washington D.C.”

Another student created a new article about the Protection of Native American sites in Florida. As the article discusses, the looting of Native American sites is a big problem in the state. Not only does this illegal removal of artifacts threaten archaeological projects in the area, but it also threatens efforts to preserve cultural heritage and Native histories in general. Florida law enforcement agencies are trying to develop enforcement practices to prevent and prosecute such actions, but the sheer number of sites makes this difficult. So far, they have been focusing protection efforts on the most threatened sites. Recent bills in the House and Senate also create new threats for these already vulnerable sites. Fortunately, though, these bills have brought more awareness to this issue.

Among the many other articles that students contributed to in Bjork-James’ course, another student focused on the Red Power movement, a social justice movement led by American Indian youth in the 1960s and ’70s. The movement used civil disobedience and confrontation to demand policies and programs in support of Native rights, in events such as the Occupation of Alcatraz, the Trail of Broken Treaties, and the Wounded Knee incident. Its legacy includes many bills and laws passed to protect rights of American Indian groups, a greater awareness of American Indian civil rights, and an increasing sense of pride among American Indian communities.

Having students contribute to Wikipedia is a valuable way of rounding out histories of marginalized groups in an open-access, digital environment. A Wikipedia assignment also provides students with great skills to take with them in their future endeavors, both academically and professionally. If you’d like to teach with Wikipedia or learn more, reach out to us at contact@wikiedu.org. Or visit our informational page.

Image: File:Tribal Flags at Eagle Butte, SD.JPG, by National Park Service, Public Domain, via Wikimedia Commons.

by Cassidy Villeneuve at November 20, 2017 05:34 PM


Shocking tales from ornithology

Manipulative people have always made use of the dynamics of ingroups and outgroups to create diversions from bigger issues. The situation is made worse when misguided philosophies are peddled by governments that place economics ahead of ecology. The pursuit of easily gamed targets such as GDP is easy since money is a man-made and controllable entity. Nationalism, pride, other forms of chauvinism, the creation of enemies and the magnification of war threats are all effective tools in the arsenal of Machiavelli for use in misdirecting the masses. One might imagine that the educated, especially scientists, would be smart enough not to fall into these traps but cases from recent history will dampen hopes for such optimism.

There is a very interesting book in German by Eugeniusz Nowak called "Wissenschaftler in turbulenten Zeiten" (or scientists in turbulent times) that deals with the lives of ornithologists, conservationists and other naturalists during the Second World War. Preceded by a series of recollections published in various journals, the book was published in 2010 but I became aware of it only recently while translating some biographies into the English Wikipedia. I have not yet actually seen the book (it has about five pages on Salim Ali as well) and have had to go by secondary quotations in other content. Nowak was a student of Erwin Stresemann (with whom the first chapter deals) and he writes about several European (but mostly German, Polish and Russian) ornithologists and their lives during the turbulent 1930s and 40s. Although Europe is pretty far from India, there are ripples that reached afar. Incidentally, Nowak's ornithological research includes studies on the expansion in range of the collared dove (Streptopelia decaocto) which the Germans called the Tuerkentaube, literally the "Turkish dove", a name with a baggage of cultural prejudices.

Nowak's first paper of "recollections" notes that: [he] presents the facts not as accusations or indictments, but rather as a stimulus to the younger generation of scientists to consider the issues, in particular to think “What would I have done if I had lived there or at that time?” - a thought to keep as you read on.

A shocker from this period is a paper by Dr Günther Niethammer on the birds of Auschwitz (Birkenau). This paper (read it online here) was published when Niethammer was posted to the security at the main gate of the concentration camp. You might be forgiven if you thought he was just a victim of the war. Niethammer was a proud nationalist and volunteered to join the Nazi forces in 1937 leaving his position as a curator at the Museum Koenig at Bonn.
The contrast provided by Niethammer, who looked at the birds on one side while ignoring inhumanity on the other, provided novelist Arno Surminski with a title for his 2008 novel - Die Vogelwelt von Auschwitz - i.e. the birdlife of Auschwitz.

G. Niethammer
Niethammer studied birds around Auschwitz and also shot ducks in numbers for himself and to supply the commandant of the camp Rudolf Höss (if the name does not mean anything please do go to the linked article / or search for the name online).  Upon the death of Niethammer, an obituary (open access PDF here) was published in the Ibis of 1975 - a tribute with little mention of the war years or the fact that he rose to the rank of Obersturmführer. The Bonn museum journal had a special tribute issue noting the works and influence of Niethammer. Among the many tributes is one by Hans Kumerloeve (starts here online). A subspecies of the common jay was named as Garrulus glandarius hansguentheri by Hungarian ornithologist Andreas Keve in 1967 after the first names of Kumerloeve and Niethammer. Fortunately for the poor jay, this name is a junior synonym of  G. g. anatoliae described by Seebohm in 1883.

Meanwhile inside Auschwitz, the Polish artist Wladyslaw Siwek was making sketches of everyday life in the camp. After the war he became a zoological artist of repute. Unfortunately there is very little that is readily accessible to English readers on the internet.
Siwek, artist who documented life at Auschwitz before working as a wildlife artist.
Hans Kumerloeve
Now for Niethammer's friend Dr Kumerloeve, who also worked in the Museum Koenig at Bonn. His name was originally spelt Kummerlöwe and he was, like Niethammer, a doctoral student of Johannes Meisenheimer. Kummerlöwe and Niethammer made journeys on a small motorcycle to study the birds of Turkey. Kummerlöwe's political activities started earlier than Niethammer's: he joined the NSDAP (German: Nationalsozialistische Deutsche Arbeiterpartei = The National Socialist German Workers' Party) in 1925 and started the first student union of the party in 1933. Kummerlöwe soon became part of the Ahnenerbe, a think tank meant to give "scientific" support to the party's ideas on race and history. In 1939 he wrote an anthropological study on "Polish prisoners of war". At the museum in Dresden which he headed, he thought up ideas to promote politics and he published them in 1939 and 1940. After the war, it is thought that he went to all the European libraries that held copies of this journal and purged them of his article. (Anyone interested in hunting it down should look for copies of Abhandlungen und Berichte aus den Staatlichen Museen für Tierkunde und Völkerkunde in Dresden 20:1-15.) According to Nowak, he even managed to get his hands (and scissors) on copies held in Moscow and Leningrad!

The Dresden museum was also home to the German ornithologist Adolf Bernhard Meyer (1840–1911). In 1858, he translated the works of Charles Darwin and Alfred Russel Wallace into German and introduced evolutionary theory to a whole generation of German scientists. Among Meyer's amazing works is a series of avian osteological works which use photography and depict birds in nearly-life-like positions - a less artistic precursor to Katrina van Grouw's 2012 book The Unfeathered Bird. Meyer's skeleton images can be found here. In 1904 Meyer was eased out of the Dresden museum because of rising anti-semitism. Meyer does not find a place in Nowak's book.

Nowak's book includes entries on the following scientists: (I keep this here partly for my reference as I intend to improve Wikipedia entries on several of them as and when time and resources permit. Would be amazing if others could pitch in!).
In the first of his "recollection papers" (his 1998 article) he writes about the reason for writing them - the obituary for Prof. Ernst Schäfer was a whitewash that carefully avoided any mention of his wartime activities. And this brings us to India. In a recent article in Indian Birds, Sylke Frahnert and others have written about the bird collections from Sikkim in the Berlin natural history museum. In their article there is a brief statement that "The collection in Berlin has remained almost unknown due to the political circumstances of the expedition". This might be a bit cryptic for many but the best read on the topic is Himmler's Crusade: The true story of the 1939 Nazi expedition into Tibet (2009) by Christopher Hale. Hale writes about Himmler:
He revered the ancient cultures of India and the East, or at least his own weird vision of them.
These were not private enthusiasms, and they were certainly not harmless. Cranky pseudoscience nourished Himmler’s own murderous convictions about race and inspired ways of convincing others...
Himmler regarded himself not as the fantasist he was but as a patron of science. He believed that most conventional wisdom was bogus and that his power gave him a unique opportunity to promulgate new thinking. He founded the Ahnenerbe specifically to advance the study of the Aryan (or Nordic or Indo-German) race and its origins
From there Hale goes on to examine the motivations of Schäfer and his team. He looks at how much of the science was politically driven. Swastika signs dominate some of the photos from the expedition - as if it provided for a natural tie with Buddhism in Tibet. It seems that Himmler gave Schäfer the opportunity to rise within the political hierarchy. The team that went to Sikkim included Bruno Beger. Beger was a physical anthropologist but with less than innocent motivations although that would be much harder to ascribe to the team's other pursuits like botany and ornithology. One of the results from the expedition was a film made by the entomologist of the group, Ernst Krause - Geheimnis Tibet - or secret Tibet - a copy of this 1 hour and 40 minute film is on YouTube. At around 26 minutes, you can see Bruno Beger creating face casts - first as a negative in Plaster of Paris from which a positive copy was made using resin. Hale talks about how one of the Tibetans put into a cast with just straws to breathe from went into an epileptic seizure from the claustrophobia and fear induced. The real horror however is revealed when Hale quotes a May 1943 letter from an SS officer to Beger - ‘What exactly is happening with the Jewish heads? They are lying around and taking up valuable space . . . In my opinion, the most reasonable course of action is to send them to Strasbourg . . .’ Apparently Beger had to select some prisoners from Auschwitz who appeared to have Asiatic features. Hale shows that Beger knew the fate of his selection - they were gassed for research conducted by Beger and August Hirt.
SS-Sturmbannführer Schäfer at the head of the table in Lhasa

In all, Hale makes a clear case that the Schäfer mission had quite a bit of political activity underneath. We find that Sven Hedin (Schäfer was a big fan of him in his youth. Hedin was a Nazi sympathizer who funded and supported the mission) was in contact with fellow Nazi supporter Erica Schneider-Filchner and her father Wilhelm Filchner in India, both of whom were interned later at Satara, while Bruno Beger made contact with Subhash Chandra Bose more than once. [Two of the pictures from the Bundesarchiv show a certain Bhattacharya - who appears to be a chemist working on snake venom at the Calcutta snake park - one wonders if he is Abhinash Bhattacharya.]

My review of Nowak's book must be uniquely flawed as I have never managed to access it beyond some online snippets and English reviews. The war had impacts on the entire region, and Nowak's coverage is limited; there were many other interesting characters, including the Russian ornithologist Malchevsky, who survived German bullets thanks to a fat bird observation notebook in his pocket! In the 1950s Trofim Lysenko, the crank scientist who controlled science in the USSR, sought Malchevsky's help in proving his own pet theories - one of which was the idea that cuckoos were the result of feeding hairy caterpillars to young warblers!

Issues arising from race and perceptions are of course not restricted to this period or region. One of the less glorious stories of the Smithsonian Institution concerns the honorary curator Robert Wilson Shufeldt (1850–1934) who in the infamous Audubon affair made his personal troubles with his second wife, a grand-daughter of Audubon, into one of race. He also wrote such books as America's Greatest Problem: The Negro (1915) in which we learn of the ideas of other scientists of the period like Edward Drinker Cope! Like many other obituaries, Shufeldt's is a classic whitewash.

Even as recently as 2015, the University of Salzburg withdrew an honorary doctorate that they had given to the Nobel prize winning Konrad Lorenz for his support of the political setup and racial beliefs. It should not be that hard for scientists to figure out whether they are on the wrong side of history even if they are funded by the state. Perhaps salaried scientists in India would do well to look at the legal contracts they sign with their employers, the state, more carefully.

PS: Mixing natural history with war sometimes led to tragedy for the participants as well. In the case of Dr Manfred Oberdörffer who used his cover as an expert on leprosy to visit the borders of Afghanistan with entomologist Fred Hermann Brandt (1908–1994), an exchange of gunfire with British forces killed him although Brandt lived on to tell the tale.

by Shyamal L. (noreply@blogger.com) at November 20, 2017 11:42 AM

Wikimedia Performance Team

The journey to Thumbor, part 2: thumbnailing architecture

Thumbor has now been serving all public thumbnail traffic for Wikimedia production since late June 2017.

In a previous blog post I explained the rationale behind that project. To understand why Thumbor is a good fit, it's important to understand where it fits in our overall thumbnailing architecture. A lot of historical constraints come into play, and Thumbor could be adapted to meet them.

The stack

Like everything we serve to readers, thumbnails are heavily cached. Unlike wiki pages, thumbnails are cached the same way for readers and editors. Our edge is Nginx providing SSL termination, behind which we find Varnish clusters (both frontend and backend), which talk to OpenStack Swift - responsible for storing media originals as well as thumbnails - and finally Swift talks to Thumbor (previously MediaWiki).

The request lifecycle

Nginx concerns itself with SSL and HTTP/2, because Varnish as a project decided to draw a line about Varnish's concerns and exclude HTTP/2 support from it.

Varnish concerns itself with having a very high cache hit rate for existing thumbnails. When a thumbnail isn't found in Varnish, either it has never been requested before, or it fell out of cache for not being requested frequently enough.

Swift concerns itself with long-term storage. We have a historical policy - which is in the process of being reassessed - of storing all thumbnails long-term. This means that when a thumbnail isn't in Varnish, there's a high likelihood that it's found in Swift, which is why Swift is first in line behind Varnish. When it receives a request for a missing thumbnail from Varnish, the Swift proxy first checks if Swift has a copy of that thumbnail. If not, it forwards that request to Thumbor.

Thumbor concerns itself with generating thumbnails from original media. When it receives a request from Swift, it requests the corresponding original media from Swift, generates the required thumbnail from that original and returns it. This response is sent back up the call chain, all the way to the client, through Swift and Varnish. After that response is sent, Thumbor saves that thumbnail in Swift. Varnish, as it sees the response go through, keeps a copy as well.
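The lifecycle described above can be sketched as a small in-memory simulation in Python. This is purely illustrative and is not Wikimedia's code: the class and method names are invented, dictionaries stand in for Varnish and Swift, and the "thumbnail" is just a string. One simplification to note: here the thumbnail is stored in Swift before the response is returned, whereas in production Thumbor saves it after the response has been sent.

```python
class ThumbnailStack:
    """Toy model of the Varnish -> Swift -> Thumbor fall-through."""

    def __init__(self, originals):
        self.varnish = {}                      # edge cache, may evict entries
        self.swift_thumbs = {}                 # long-term thumbnail storage
        self.swift_originals = dict(originals) # original media, also in Swift
        self.log = []                          # records which layer answered

    def _thumbor_render(self, name, width):
        # Thumbor fetches the original from Swift and generates the thumbnail.
        original = self.swift_originals[name]
        return f"{original}@{width}px"

    def get(self, name, width):
        key = (name, width)
        if key in self.varnish:
            self.log.append("varnish hit")
            return self.varnish[key]
        if key in self.swift_thumbs:
            self.log.append("swift hit")
            thumb = self.swift_thumbs[key]
        else:
            self.log.append("thumbor render")
            thumb = self._thumbor_render(name, width)
            self.swift_thumbs[key] = thumb     # saved for long-term storage
        self.varnish[key] = thumb              # Varnish keeps a copy on the way out
        return thumb


stack = ThumbnailStack({"Cat.jpg": "cat-bytes"})
stack.get("Cat.jpg", 320)   # first request: rendered by Thumbor
stack.get("Cat.jpg", 320)   # second request: served from Varnish
stack.varnish.clear()       # simulate the thumbnail falling out of cache
stack.get("Cat.jpg", 320)   # third request: served from Swift
print(stack.log)
```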

What's out of scope

Noticeably absent from the above is uploading, extracting metadata from the original media, etc. All of these are still MediaWiki concerns at upload time. Thumbor doesn't try to handle all things media; it is solely a thumbnailing engine. The concern of uploading, parsing and storing the original media is separate. In fact, Thumbor goes as far as trying to fetch as little data about the original from Swift as possible, seeking data transfer efficiency. For example, we have a custom loader for videos that leverages FFmpeg's support for range requests, only fetching the frames it needs over the network, rather than the whole video.
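For readers unfamiliar with range requests, the sketch below shows what one looks like at the HTTP level, using Python's standard library. It is not the custom video loader (which relies on FFmpeg to issue such requests); it only builds a request object carrying a Range header and does not send it. The URL and the function name are made up for the example.

```python
import urllib.request


def build_range_request(url: str, start: int, length: int) -> urllib.request.Request:
    """Build (but do not send) a GET request for `length` bytes from `start`."""
    end = start + length - 1  # the end offset of an HTTP byte range is inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


# Hypothetical URL; asks for 500 bytes starting at offset 1000.
request = build_range_request("https://swift.example/originals/video.webm", 1000, 500)
print(request.get_header("Range"))
```

A server that supports ranges answers such a request with only the requested bytes, which is what lets a loader avoid downloading the whole video.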

What we needed to add

We wanted a thumbnailing service that was "dumb", i.e. didn't concern itself with more than thumbnailing. Thumbor definitely provided that, but was too simple for our existing needs, which is why we had to write a number of plugins for it, to add the following features:

  • New media formats (XCF, DJVU, PDF, WEBM, etc.)
  • Smarter handling of giant originals (>1GB) to save memory
  • The ability to run multiple format engines at once
  • Support for multipage media
  • Handling the Wikimedia thumbnail URL format
  • Loading originals from Swift
  • Loading videos efficiently with range requests
  • Saving thumbnails in Swift
  • Various forms of throttling
  • Live production debugging with Manhole
  • Sending logs to ELK
  • Wikimedia-specific filters/settings, such as conditional sharpening of JPGs

We also changed the images included in the Thumbor project to be respectful of open licenses and wrote Debian packages for all of Thumbor's dependencies and Thumbor itself.


While Thumbor was a good match on the separation of concerns we were looking for, it still required writing many plugins and a lot of extra work to make it a drop-in replacement for MediaWiki's media thumbnailing code. The main reason is that Wikimedia sites support types of media files that the web at large cares less about, like giant TIFFs and PDFs.

In the next blog post, I'll describe the development strategy that led to the successful deployment of Thumbor in production.

by Gilles (Gilles Dubuc) at November 20, 2017 06:59 AM

Tech News

Tech News issue #47, 2017 (November 20, 2017)


November 20, 2017 12:00 AM

November 19, 2017

Gerard Meijssen

#Wikidata vs #Wikipedia - Rukmini Maria Callimachi

Mrs Callimachi won more than the Polk Award; she is both a journalist and a poet, and her awards are not limited to journalism. One of them, the Michael Kelly Award, is hidden in the Wikipedia article on Michael Kelly.

This article is about how Wikidata and English Wikipedia can help each other. The Wikipedia article lists seven awards and this makes it easy to add other award winners for them as well.

Thanks to Magnus' awarder, this is fairly easy, but some awards hide out as part of an article and the award has to be added in Wikidata. It may be one reason why later awards are missing. The religious award she is said to have won is a different award with a similar name. The award and the organisation that confers it had to be created.

The point: we can compare the data in a Wikipedia with what we have on Wikidata. They should match. When they do not, there is an issue. Copying the data from Wikipedia is easy and it is the obvious thing to do. When Wikipedians decry the quality of Wikidata, they should reflect on why this is the case. When we collaborate, we will slowly but surely improve our quality. In the final analysis our aim is the same: to share in the sum of all knowledge.

by Gerard Meijssen (noreply@blogger.com) at November 19, 2017 09:14 AM

November 18, 2017

Gerard Meijssen

#Wikipedia vs #Wikidata - the George Polk Awards

Some Wikipedians consider Wikidata inferior, so much so that they agitate towards a policy that bans Wikidata in "their" Wikipedia. They are welcome to their opinion.

I do bulk imports from Wikipedia and all the time I suffer the consequences. Some three to four percent of their data is wrong for all kinds of reasons, reasons that are manageable with proper tooling.

The George Polk Award is an award for journalism and it got my attention again because the International Consortium of Investigative Journalists received it for their work on the Panama Papers. I noticed that many of the people listed as having been awarded the Polk Award did not have articles in Wikipedia, that many of the links in the list of award winners pointed to the wrong person and that many award winners did not even have a "red link".

I am in the process of checking all the links and adding the date for the award. I found many issues, among them a civil war general and many other false friends. I am adding items for the people who do not have an English article, and I have to check each of them because several do have articles in other languages. It is a lot of work and it is not as useful as it could be because Wikipedia hates Wikidata and we do not collaborate, we do not work together.

There is a Listeria list of winners and slowly but surely it will contain information that is similar to the English Wikipedia list article. Similar but not the same:
  • the false friends will not be there
  • there will be no red or black links
  • people who won the award twice will be missing
Why do this, why spend so much time on one big list? Well, in this day and age of "fake news" we should celebrate journalism, but having all this information in Wikidata allows for all kinds of tools as well. We can check for false friends, we can check if the articles on the award winners include the award but also if there are "winners" who are not known in this list and in the source available for the George Polk winners.

I am not a Wikipedian and truthfully I hate the endless and senseless bickering that is going on. So let me work on the data, make it available to tools. Now you Wikipedians, you may choose not to show Wikidata data in your infoboxes but you will not make your errors go away without collaboration. Yes, you can quote a source but when your data is not in line with what the source states, having a source does not do you good, effectively you provide fake information.

My request to the reasonable people at Wikipedia and Wikidata: let us work together and see how we can improve quality. Let's link wiki links (blue, red and black) to Wikidata and improve the quality of what is on offer first.

by Gerard Meijssen (noreply@blogger.com) at November 18, 2017 07:40 AM

November 17, 2017

Wikimedia Foundation

What do donors think about our fundraising? Five things we learned.

Photo by Webwizzard, CC BY-SA 4.0.

The Wikimedia Foundation is a non-profit with a big mission: to provide free and open knowledge to everyone around the world. By constantly working to improve our fundraising appeals and strategies, we ensure that mission is never at risk. This year, we partnered with Lake Research Partners to conduct an online survey reaching out to over 1,000 donors in the United States, Canada, Australia, and Great Britain. Our goal was to better understand what drives readers of the English Wikipedia to support the encyclopedia.

Here are five things we learned:

#1. Why do people give to Wikipedia? The number one reason is that they find it really useful.

The majority of donors across all four countries surveyed use Wikipedia several times per week or more. Donors also stated that they stand behind the Wikimedia Foundation’s vision of supporting free knowledge for all.

#2. Donors stated that the number of fundraising appeals they received over the past year was about right.

We learned that donors recall seeing a few fundraising messages from Wikipedia, and that they consider the number of times they see donation requests to be an appropriate amount. This is good news for our online fundraising team, whose goal is to educate readers about our mission, motivate them to become donors, and then get out of their way.  Last year, we eliminated millions of banner impressions from the site in order to maximize campaign efficiency while mitigating disruption of the reading experience as much as possible.

#3. Past donors have a variety of reasons for not contributing again.

The main reason past donors chose not to donate again is budgetary. In addition, past donors expressed a desire to learn more about what their money would be used for: about 1 in 5 donors said the main reason they didn’t plan to donate again was either because they didn’t know what their donation would be used for, or they had already made a donation in that calendar year. We agree that showing donors how their money is used is incredibly important; transparency is one of our core values, and we are proud to be recognized as one of the most transparent non-profit organizations in the world. Our annual plan and operating budget are developed through open processes, subject to community feedback and Board approval, and always available to the public for full review. Last year’s annual plan is available for you to read.

#4. Donors said that they prefer receiving information about causes they care about via email and social media.

While showing banners on Wikipedia is still a highly efficient way to reach our readers, email has over the last year become an increasingly important part of the Wikimedia Foundation’s fundraising revenue model. We send out far fewer emails per donor than the average non-profit each year.  (The average non-profit sends 49 emails per year per subscriber; we send about 10.) Also, compared to industry standards, Wikimedia emails receive extraordinary engagement. This year, we’re also putting more of our resources toward social media. We’re offering Facebook frames for users to show their support, and will be sharing details on the importance of our individual donor model to Wikipedia’s social followers.

#5. The vast majority of donors trust Wikipedia more because it’s operated by a non-profit organization.

Our donors value the Wikimedia Movement’s commitment to neutrality and understand the importance of keeping Wikipedia independently funded. A major goal for the fundraising team is to educate donors about why it is important to maintain our non-profit status. Wikimedia’s websites are the closest thing the world has to a public internet, and we know our integrity is rooted in our independence.

What’s next?

As we enter into one of the busiest and most important periods of the year, the fundraising team will use these survey findings to further improve our campaign strategy as well as the content and design of our fundraising appeals. Moreover, as part of our commitment to transparency, we are looking at new and improved ways we can communicate about our mission to our donors and share more information about why their donations have an impact. If you have ideas and additional feedback—we are listening! Please reach out to us at fr-creative[at]wikimedia[dot]org.

If you are interested in learning more about our survey results, we invite you to visit this page.

Jessica Robell, Senior Global Campaign Manager, Fundraising
Wikimedia Foundation

by Jessica Robell at November 17, 2017 07:32 PM

Wiki Education Foundation

Collaborating at the Wikimedia Diversity Conference

Earlier this month, I had the privilege to attend the Wikimedia Diversity Conference in Stockholm, Sweden. Hosted by Wikimedia Sverige, the conference brought together around 80 participants from 43 different countries. We share one common goal: to ensure the Wikimedia projects represent the full diversity of human knowledge.

Wikimedia Diversity Conference 2017 Group Picture. Image: File:Wikimedia Diversity Conference 2017 – Group Pic.jpg, by AbhiSuryawanshi, CC BY-SA 4.0, via Wikimedia Commons.

Wiki Education’s work focuses primarily on the English Wikipedia, which gets 49% of all the views to Wikipedia worldwide. In other words, people worldwide go to the English Wikipedia to access knowledge almost as often as they go to all the other language Wikipedias combined.

That’s why it’s so critical that the information on the English Wikipedia cover diverse topics. The sum of all human knowledge needs to include topics traditionally under-covered by Wikipedia’s volunteer editing community. English Wikipedia’s well-documented gender bias (there’s even a Wikipedia article about it) is one example of the systemic bias permeating Wikipedia, and one important way to help address these concerns is through education programs like the one we run at Wiki Education.

In our Classroom Program, university students who are studying race, gender, and sexuality contribute to previously underdeveloped Wikipedia articles on these topics as part of their coursework. In doing so, students learn key media literacy, critical thinking, and research skills while also grappling with the theoretical side of knowledge production, and Wikipedia gets more diverse content in these topic areas.

Director of Programs LiAnna Davis co-led a skill-sharing workshop at the Diversity Conference on scaling diversity programs. 
Image: File:Wikimedia Diversity Conference, Stockholm, Day 2 by Dyolf77 DSC 6892.jpg, by Dyolf77, CC BY SA, via Wikimedia Commons.

At the Diversity Conference, I was honored to co-lead a skill-sharing session on scaling diversity programs. With the Wikimedia Foundation’s Tighe Flanagan, we shared our experience in planning diversity programs like the education program. I mapped out how Wiki Education used process mapping to create a scalable solution to our program. Reem Al-Kashif from the education program in Egypt, Camelia Boban from the WikiDonne User Group, and Stella Sessy Agbley from Wiki Loves Africa in Ghana all presented tools that helped them scale their programs as part of our session as well.

We also support the work of existing Wikipedia editors in our Visiting Scholars program, where we provide a university login for editors interested in improving a topic area. Two of our Visiting Scholars, who edit in diversity areas, were also at the conference. Jackie Koerner, who improves Wikipedia articles on disability topics thanks to her role as a Visiting Scholar at San Francisco State University’s Longmore Institute on Disability, wrote a great piece about her experiences at the conference. Rosie Stephenson-Goodnight, who works on articles about women writers thanks to resources from her Visiting Scholar position at Northeastern University, also led a session on gender diversity mapping. I was also honored to have a great working breakfast with Jackie, Rosie, and Cal Poly Pomona librarian Kai Smith, who is considering hosting a Visiting Scholar, about the program and how libraries can be more involved in Wikipedia work.

Wikimedia Sverige put together a fantastic program that included lots of small group discussion, which was really great for getting the perspectives of other conference participants. We spent a fair amount of time grappling with what the new Wikimedia strategic direction means for diversity, and I particularly enjoyed a small group discussion about how to include diversity in our success metrics moving forward. Hearing from others globally how they’ve worked through diversity challenges really inspired me, and the group’s energy to make Wikimedia projects more diverse in both content and contributors was incredible. Many thanks to Wikimedia Sverige for hosting a great event, and to my fellow attendees for making it a productive event filled with informative discussions, learning, and sharing.

by LiAnna Davis at November 17, 2017 04:39 PM

Weekly OSM

weeklyOSM 382



The ADFC (German Cyclist Association) holds a mirror map up to the administration in Dresden, Germany. 1 | © Mapbox, © OpenStreetMap contributors, ODbL © Nils Larson


  • Yuri Astrakhan restarted the discussion on the Talk mailing list about the tool for mechanical edits (it is now called Sophox). As before, many perceive Yuri as unreasonable, and he tries to ignore all the unwritten rules in OSM.
  • Enrico asks on the Tagging mailing list how to tag a religious mural painting, “Affresco votivo” in Italian.
  • Through a Twitter moment, user sev_osm showcased over 10 outreach and training workshops on CarteInnov run by Les Libres Geographes (LLG) and local OSM mappers. This project, supported by the International Organization of Francophonie (OIF), aims at building an open map of Digital Innovation stakeholders in Haiti, Madagascar and 12 African countries. The data is hosted in OSM and edited with MapContrib.
  • You can cast your vote on the revised version of the Fire Hydrant Extensions proposal until November 27th. The original version did not achieve the required majority.
  • Voting on the Metro Mapping proposal has started and will end on November 24th. Several people, including companies which already map stations, criticize the proposal on the Tagging mailing list.
  • Gabe Appleton presents his suggestion for the timing of traffic lights and expects constructive criticism.
  • BeKri complains (de) (automatic translation) in the German forum about a remarkably large number of new Wheelmap users in Munich, who very often enter POIs twice or place them at wrong locations. Holger from Wheelmap promises a new editor which can only edit wheelchair=* and toilet:wheelchair=* tags.


  • The developers of Jungle Bus ask on Twitter for help translating their app.
  • The Atom feed interface of Pascal Neis’ Suspicious OpenStreetMap Changesets now accepts an optional bbox parameter. When it is used, the country parameter can be omitted. (Source)
  • User ChristianA wrote a blog post about the drawbacks of open data: less outdoor mapping and more armchair mapping.
  • An incorrect wikidata link in OSM caused an area of the Bronx to be mislabelled on Mapbox Vector tiles.
  • Kathmandu Living Labs invited the current Women Leaders in Technology (WLiT, http://wlit.org.np/) fellowship holders to join a special OSM session. Participants gained their first experiences in the use of OSM through hands-on mapping. They also had a chance to learn about mapping and OSM, as well as possible ways to contribute to the global OSM database.
  • Sterling Geo describes in their blog how they compared Sentinel-2 data with OSM land use data.


  • The company Cybo, which operates a yellow pages directory, wants to offer its POI data to OpenStreetMap. There are two parallel threads on the Imports and Talk-us mailing lists. The coordinates seem to have been retrieved using Google’s geocoding API and cannot be used by OSM for legal reasons.

OpenStreetMap Foundation

  • Just under three years ago, a member initiative proposed to reorganize and limit the terms of office for OSMF board members. Simon Poole writes to the Osmf-talk mailing list about getting a clear statement from the board on addressing this in 2018.
  • Data Working Group has published its activity report (PDF) for the third quarter of 2017.
  • Members of FOSSGIS e.V., the de-facto local OSM chapter in Germany, discuss a grant application (de) (automatic translation) to sponsor a secondary production server for the German instance of the Overpass API (overpass-api.de).
  • This year’s meeting of the OSM Foundation members will take place on December 9th via IRC. Details can be found in the invitation email. Two new board members must be elected. OSMF members can register as candidates until November 25th. Details on the elections are in the OSM wiki, and there is also a page for voter questions.

Humanitarian OSM

  • HOT announces an activation due to the Iran-Iraq earthquake.
  • John Whelan asks on the HOT mailing list for assistance in testing building mapping projects in Malawi.
  • OSM Zambia received a “microgrant” from HOT that made it possible to carry out extensive mapping training in Zambia’s schools.
  • HOT has published a call for applications for a position to develop a mapping visualisation tool which creates videos visualising edits in an area. Application deadline is 17th November.
  • A BBC radio feature talks about Missing Maps, mapathons, armchair mapping, Red Cross, mapping buildings, roads, and rivers.


  • Developers of OSM Carto discuss whether restaurants, pubs, and similar amenities should be rendered as small brown squares on zoom level 17, as is already done for most shops (in pink).
  • [1] The ADFC (member of the ECF) tries to use an OSM based map to get the Dresden administration to do more for cyclists. Nils Larsen, the author of the map, has announced a “how to” post which will be published shortly.
  • User Komzpa submitted a pull request to OSM Carto which adds rendering of place=* on areas. Some people fear that it supports the duplication of place=* tags in OSM (currently mainly mapped on nodes).


  • Pieter Vander Vennet shows how to write your own OsmAnd routing profile.


  • Richard Fairhurst describes in his user diary how to debug Lua scripts of Osm2pgsql and OSRM without re-running a time-consuming import on each iteration.

Did you know …

  • … that you can create new nodes in JOSM while you are in the extrusion mode (key X) by doing a double-click? It saves pressing A and X. (via a thread (de) in the German OSM forum)
  • … Latest Changes on OpenStreetMap by Martin Raifer? It shows all objects changed in an area in the last seven days.

Other “geo” things

  • CityLab reported: London’s Oxford Street will become a pedestrian zone. An 800 m stretch will be closed to vehicles by 2018. This is also a challenge for OpenStreetMap.
  • An interview by MARTECHSERIES with Javier de la Torre, the founder of CARTO (including the basic map of OSM).
  • In its latest issue, the Spanish online magazine NOSOLOSIG reports on the first maps of the Palaeolithic period. The map carved in a stone, which clearly contains geographical references, was found during excavations in the cave of Abauntz near the village of Arraitz about 20 km north of Pamplona, Spain, and is therefore called “the map of Abauntz”. A translation of the article with deepl is worthwhile. 😉 (es) (automatic translation)
  • Gulf News reports that the 2nd “Louvre” was opened in Abu Dhabi. Of course, it is already recorded in OSM. The tagging is obviously still not complete. 😉
  • Theodolite, an app for “Augmented Reality”, which according to prMac contains excellent OSM data even in “the middle of nowhere” is coming for iPhone X.
  • A tourist from Milan got stuck with his vehicle in a medieval alley in Como because he trusted his satnav more than his eyes. The map data was not from OpenStreetMap, but the incident inspired the local OSM community to recheck the area for possible mistakes.

Upcoming Events

Where | What | When | Country
online via Mumble | OpenStreetMap Foundation public board meeting | 2017-11-16 | everywhere
Digne-les-Bains | Cartopartie : Libre information sur les services de santé | 2017-11-16 | france
Fort Collins | CSU Ger Community Mapping Center Mapathon Colorado State University | 2017-11-16 | united states
Suzhou | OpenStreetMap Youth Promotion | 2017-11-18 | china
Helsinki | HOT-OSM Finland Geoweek Mapathon 2017 | 2017-11-18 | finland
Brazil | Brasília OpenStreetMap Edit challenge begins with a focus on road characteristics | 2017-11-19 | brazil
London | Raincatcher Mapathon for Tanzania | 2017-11-21 | united kingdom
Lüneburg | Mappertreffen | 2017-11-21 | germany
Nottingham | Pub Meetup | 2017-11-21 | united kingdom
Edinburgh | Pub meeting | 2017-11-21 | united kingdom
Viersen | OSM Stammtisch Viersen | 2017-11-21 | germany
Lübeck | Lübecker Stammtisch | 2017-11-23 | germany
Apach (Moselle) – Schengen (Luxembourg) – Perl (Saarland) | 1. OpenSaar-Lor-Lux Stammtisch | 2017-11-24 | germany
Bremen | Bremer Mappertreffen | 2017-11-27 | germany
Bonn | 100! Bonner Stammtisch | 2017-11-28 | germany
Brussels | Missing Maps @ MSF/HI | 2017-11-28 | belgium
Dusseldorf | Stammtisch | 2017-11-29 | germany
Lima | State of the Map LatAm 2017 | 2017-11-29 to 2017-12-02 | perú
Yaoundé | State of the Map Cameroun 2017 | 2017-12-01 to 2017-12-03 | cameroun
Dar es Salaam | State of the Map Tanzania 2017 | 2017-12-08 to 2017-12-10 | tanzania
Rome | FOSS4G-IT 2018 | 2018-02-19 to 2018-02-22 | italy
Bonn | FOSSGIS 2018 | 2018-03-21 to 2018-03-24 | germany
Poznań | State of the Map Poland 2018 | 2018-04-13 to 2018-04-14 | poland
Milan | State of the Map 2018 (international conference) | 2018-07-28 to 2018-07-30 | italy

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Nakaner, Polyglot, SK53, Softgrow, Spanholz, Spec80, YoViajo, derFred, jinalfoflia, sev_osm.

by weeklyteam at November 17, 2017 08:16 AM

November 16, 2017

Wikimedia Foundation

You’re a researcher without a library: What do you do?

Photo by Michael D Beckwith, CC0.

The world of publishing is evolving frantically, while it remains frustratingly fragmented and prohibitively expensive for many. If you’re a student who just left your academic library behind only to discover you are now locked out of the stacks; a startup researching water usage in Africa that keeps hitting paywalls; a local nonprofit that studies social change activism, but all the latest papers cost $30 per read… This article is for you.

Some ideas (a few here, more than twenty-five in the full post):

  • Use your library
  • Preprints and repositories
  • Buy it
  • Alternative sharing methods
  • Publisher donations

That’s all, really. That’s literally all you can do. It’s kind of sad that such a patchwork of partially suitable options exists. You have to be a research ninja to navigate the obstacle course of paywalls. Who is looking out for those going stag into the wide world of knowledge? The open access movement is, and each year they move the needle towards a future in which every paper is at least free to read, if not free to reuse and share. You should support open access for your own selfish research needs, for the benefit of those even less fortunate than you, and for the untold discoveries that could erupt from a truly open, global, collaborative research ecosystem. Until then, pick your hack.

Read the full post over on Medium.

Jake Orlowitz, Wikimedian

While Jake works with us over here at the Wikimedia Foundation, this post was written in an entirely volunteer capacity. The views and opinions expressed are those of the author alone. The text of this post is available under the CC BY-SA 4.0 license.

by Jake Orlowitz at November 16, 2017 08:57 PM

Wiki Education Foundation

Representing women’s histories on Wikipedia at NWSA

This week, Wiki Education will join Women and Gender Studies scholars in Baltimore at the National Women’s Studies Association’s (NWSA) annual meeting. NWSA was Wiki Education’s first partner, as they saw an alignment between their mission to improve public understanding of women’s studies and our proven track record of training university students to add missing academic scholarship to Wikipedia.

Since launching that partnership in the Fall 2014 term, we’ve supported 169 courses related to women and gender studies. Nearly 4,000 students have added 2.47 million words to Wikipedia about the topics they were studying in class. That work has received 270 million page views from Wikipedia’s readers all over the world.

We’re proud of the contributions students are making to increase Wikipedia’s equity. As those in powerful positions repeatedly minimize women’s contributions in the workplace, it’s more important than ever to highlight the powerful ideas women have championed for decades. This year’s theme honors the Movement for Black Lives and encourages feminist scholars to stand in solidarity and support the movement. I encourage instructors to join me at one of the locations below to brainstorm assignment ideas to improve Wikipedia’s representation of black women.

Opportunities to Connect

I’ll be in the exhibit hall all weekend, encouraging faculty to join the Classroom Program and use Wikipedia as a teaching tool. Wikipedia Expert Shalor Toncray will join me today and tomorrow, so if she has supported you and your students this term or in the past, this is a rare opportunity to stop by and talk to her in person!

In the exhibit hall, we’ll introduce attendees to our new pilot, Wikipedia Fellows. NWSA is one of the participating partners whose members will join the interdisciplinary effort to train content experts how to edit Wikipedia.

Lastly, please join me on Friday, November 17th, from 8:00–9:15am in the Key Ballroom 9 at the Hilton Baltimore. There, I’ll present about the contributions students have made to Wikipedia and facilitate a discussion about why women and gender studies students should bring their voices to Wikipedia.

by Jami Mathewson at November 16, 2017 05:58 PM

Gerard Meijssen

#Wikidata - women in red - May Wright Sewall

On Twitter, it was mentioned that archival material of Mrs May Wright Sewall was being worked on. When you read the Wikipedia article, it becomes all too obvious how notable she was. She founded multiple organisations and was known for her suffragist ideas.

The article introduces these organisations and, consequently, to indicate the relations, new items have to be created in Wikidata. I only did two and I added her husbands, men that supported her in her undertakings.

By adding these new organisations, it becomes possible to link more people to them. They thereby gain notability and it becomes more likely that at some stage they will get their article as well. At the very least, the new people and organisations added to Wikidata complete the tapestry of information about an age gone by.

by Gerard Meijssen (noreply@blogger.com) at November 16, 2017 08:02 AM

November 15, 2017

Wiki Education Foundation

Tips for Grading a Wikipedia assignment

As the term begins to wind down, many of our instructors are beginning to assess the work their students contributed to Wikipedia. Just as the Wikipedia assignment differs from traditional writing assignments, so does the grading of these projects. Over the past several terms, our instructors have regularly asked for more guidance on grading their Wikipedia assignments, which is why we’ve developed a sample assessment rubric that instructors can use to evaluate the quality of their students’ contributions to Wikipedia. We encourage instructors to adapt this table to the specific needs of their individual assignments and welcome feedback so we can continue to improve it.

The following are important points to consider as you begin grading your Wikipedia assignment:

  • Never grade student work based on what sticks on Wikipedia. Sometimes student work is reverted. This may happen for a number of reasons, but remember, nothing is ever lost on Wikipedia. You can always find their work in their contribution history.
  • Quality is far more important than quantity. What a student contributes to an article will depend on the sources available and Wikipedia’s existing coverage of the subject at hand. A 300 word contribution may be as critical to a subject as a 1000 word entry.
  • Creating a new entry is not more work than contributing to an existing article. The vast majority of our students work on existing entries on Wikipedia.
  • Finally, there are multiple paths to improving Wikipedia. Students can improve articles in a variety of ways, from adding new content to updating sources to restructuring an article.

Navigating contribution histories on Wikipedia can be tedious, which is why the Course Dashboard is such a critical part of the Wikipedia project. Remember, the Dashboard will allow you to view your students’ contributions to Wikipedia in a variety of ways. We recommend that you check in on student contributions regularly to see where they are in the Wikipedia assignment.

While it’s up to our individual instructors to assess the work their students did on Wikipedia, we know that their overall impact on the world’s largest encyclopedia deserves an A+!

Best wishes for a productive grading season!


by Helaine Blumenthal at November 15, 2017 05:45 PM

Wikimedia Tech Blog

The future of offline access to Wikipedia: The RACHEL example

Photo by Marie-Lan Nguyen, CC BY 2.5.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the Internet can access free and open knowledge.

Over the coming months, Anne will be interviewing people who work to remove access barriers for people across the world. Here, she speaks with Jeremy Schwartz, the Executive Director for World Possible. World Possible’s Wikipedia entry describes it as “a non-profit organization that makes and distributes RACHEL (Remote Area Community Hotspot for Education and Learning), software that hosts offline free educational content such as Khan Academy, Wikipedia, Project Gutenberg and others via Wi-Fi on a Raspberry Pi computer.”

You can also read the other interviews in this series, including a chat with Emmanuel Engelhart of Kiwix.


Anne Gomez: Let’s start with the product your organization is known for. What is RACHEL?

Jeremy Schwartz: RACHEL stands for Remote Area Community Hotspot for Education & Learning. It’s a portable plug-and-play server which stores educational websites and makes that content available over any local (offline) wireless connection. It is our method of distributing copies of digital resources (book, video, and website copies) to offline communities. We, along with a handful of others, have to create those copies through scripting and host them on OER2Go.org.


Gomez: Tell me about what started your interest and involvement with offline educational resources? When was it?

Schwartz: In 2008, I finished my undergrad career on a boat: I spent 110 days aboard the MV Explorer where I traveled the world. My favorite classes were ICT & Global Change, and Human Rights & Ethics. As we stopped at different ports, I saw education as the big missing link between the two fields. Without connectivity, there had to be a way to distribute educational content to these communities. It turns out some folks at Cisco were already working on a project that filled this gap, which we formalized in 2009 as World Possible.


Gomez: What’s been the biggest surprise for you over the years?

Schwartz: The lack of growth of Internet connectivity. When we started this in 2008, we were sure that by 2018 there would be global Internet coverage. In fact, the rate of Internet adoption continues to slow. It’s contrary to frequent headlines about every “next thing” I read, and continues to surprise me.


Gomez: Smartphones have transformed the way people can access the internet. How has this changed the landscape and the way you view offline access? How do you see these devices impacting the future of educational resources?

Schwartz: Smartphones continue to be a game-changing enabler, not just in how users access and consume content, but how we think about connectivity in general. I don’t think we understand the scale of the offline world here. It’s half of the world’s population still. To me, teaching digital literacy to this population is becoming more critical. Knowing how to navigate the internet, produce digital content, and survive in an online world has become a key life skill. Offline access provides the first incentive for someone to want to know more, and to realize there is more out there. We hope our work encourages folks to buy that first smartphone, and to do something more with it than text and play games. We hope they see digital content as a learning tool which can answer their questions and change their livelihood.


Gomez: What hasn’t changed?

Schwartz: Our world is still pretty ripe with great ideas, but failed executions. This work hasn’t become any easier over the years, it still depends on dedicated people choosing to make content available, teachers to teach that content, and students who are free to learn. We’ve watched a number of ‘start-ups’ get great attention and fanfare, only to die out as quickly as they came trying to serve this market. The offline world still lacks the element of rapid scale our technology community has come to expect.


Gomez: You’ve been working through chapters on the ground in a number of countries, as well as in the US prison system. Can you share why you believe this model works?

Schwartz: We view our role here as exposing people to what is possible with technology, and letting them run with it.  If we can’t find someone in a country who thinks the project is worthwhile, I can pretty much guarantee you it isn’t.

Our technology gives us a unique reach into the offline world. We use that reach to find someone exceptional and empower them with technological support to build something their communities need. Our model is to find these people who are experienced and enthusiastic in bringing RACHEL to their countries and have the skills to localize it with relevant content, introduce it to lots of people, and train teachers in how to use it. Then we bring them on as a chapter leader, and that chapter ultimately becomes its own independent non-profit. So far our chapter leaders have been recognized by Facebook, Internet Society, and the Nelson Mandela Young African Leaders Initiative.


Gomez: You create customized content in partnership with local entities, tailored for that country. Why is this important? Is it good design, legal requirements, etc?

Schwartz: Our chapter teams definitely create content in partnership with local entities, or just collect it from various local entities and package it together. I constantly take for granted how much of the Internet is in English. There are brilliant educators in every town around the world, but they have to be empowered by the Internet to share their lessons. We’re trying to give them that platform.

People often think about how we can align western educational content to local context, but the reality is, we need to have educators creating local content. Think about the different message you would get as a student if we dubbed over Chinese videos with English language and subtitles and told you this was the way to learn. Teaching should be done by a community, for a community. If the teaching isn’t perfect, a next generation should improve upon it–the answer can’t always be let’s go find how someone else taught it. There’s a process to learning we’re trying to enable.


Gomez: The device you’re primarily working with is more robust than the Raspberry Pi, with bigger storage, range, and battery backup. Why do you prefer this? Why does it work for the schools and teachers you’re working with?

Schwartz:  This move was really in response to a technical need. We needed a device which solved our users’ power issues, while also providing more robust capabilities to serve multiple users. We still offer RACHEL-Pi, but it can only connect to around 10 users at a time and it does not have a battery. With RACHEL-Plus, as many as 20-50 users can connect at the same time, and it can run for 5-8 hours on a battery. Many of the schools using RACHEL have inconsistent or no power, but they can still use RACHEL-Plus all day long and charge it at home at night.

Finally, RACHEL-Plus has far greater processing power, which allows us to layer on features such as web-based content updates and data tracking. The Raspberry Pi has bottlenecks around microSD card read speeds (USB 2.0 interface), WiFi service (1×1 w/ internal antenna), and microSD card storage capacity (64/128GB).


Gomez: What’s the future look like for RACHEL? How do you see your work evolving?

Schwartz: [We hope that] RACHEL will become the Internet hub for our offline communities. We continue to innovate not just around content, but also the user experience with RACHEL. We hope to teach digital literacy, and part of that is extending the “Web 2.0” services we can deliver locally on RACHEL, concepts like message boards, e-mail, cloud storage, global search, and more. We are also evaluating ways to deliver small bits of data through physical couriers; this could provide asynchronous internet connectivity and content updates.

Even if the promise of free, ubiquitous Internet does become a reality in the developing world, we still see a role for RACHEL, especially in schools and areas that want to focus on providing access to learning without the distractions of the Internet. In the developed world, we’ve already seen RACHEL adopted by prisons, schools, and even by parents who want their kids to access an educational slice of the Internet at home without full access to Google and social media.


Gomez: What’s one thing the Wikimedia Foundation could do to help?

Schwartz: Helping us spread the word about the availability of RACHEL. There is an immense need globally for offline solutions for the foreseeable future. Even in a place with limited bandwidth, caching content locally for multiple users is immensely valuable. The more users we can get on the platform, the more we can fundraise, develop, and inspire our users.


Gomez: Where do you learn more and share information about offline access? What resources exist for people who want to know more?

Schwartz:  This is still an area of growth for our sector. More recently, we’ve started to join the conference circuit and find ourselves learning about lots of other great efforts. mEducation alliance, Mobile Learning Week, ICT4D, OpenEd, and Kloster’s Forum have all been good experiences for us. While none of them focus exclusively on offline solutions, all of them usually have tracks or specializations around the topic.

We also invite everyone to visit our website at www.worldpossible.org to read more about RACHEL, ask questions and share ideas in our community forums, and purchase RACHEL from our online store.

Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

by Anne Gomez at November 15, 2017 05:36 PM

Gerard Meijssen

#Wikipedia - #Retraction exposing big issues in #science

When a scientific paper is published, it is read and cited by other scientists to further science. It is read and cited by Wikimedians to write articles and share the sum of all knowledge. The Wikicite project provides better tooling for using these papers as sources in Wikipedia articles; it is one of the more relevant developments in combatting fake news in Wikipedia.

However, there is an issue with a substantial number of papers: they were retracted. There are all kinds of possible reasons, but the bottom line is that they are not to be used as a source in Wikipedia because their findings are false.

The challenge: which papers are retracted, how are retractions and the reasons for retractions modelled, and how will we find these papers in the Wikipedia sources? Knowing retractions and acting on them will be a fine art; one publisher in South Africa, for instance, was pressed to retract a book exposing the president. There will be so many issues exposed once retractions become part of the Wikipedia workflow. Failing to make them part of it would be the worst we can do: we would not be sharing the sum of all knowledge, we would be sharing the sum of what we are told.
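The "finding" step can be pictured as a simple lookup: pull the DOIs out of an article's wikitext and compare them with a list of retracted DOIs. The sketch below is only an illustration under that assumption. The function names, the sample wikitext and the DOIs are invented, the DOI pattern is deliberately rough, and a real model of retractions would also need to record the reason, the date and who retracted the paper.

```python
import re

# Rough DOI pattern for illustration; real DOIs are more varied than this.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s|}\]<]+")


def cited_dois(wikitext):
    """Return the set of DOIs mentioned in a page's wikitext, lower-cased."""
    return {m.group(0).rstrip(".,;").lower() for m in DOI_PATTERN.finditer(wikitext)}


def retracted_citations(wikitext, retracted):
    """Return the cited DOIs that also appear in the set of retracted DOIs."""
    return cited_dois(wikitext) & {d.lower() for d in retracted}


# Hypothetical sample data: two citations, one of which has been retracted.
sample = (
    "{{cite journal |title=A |doi=10.1000/example.1}} and "
    "{{cite journal |title=B |doi=10.1000/example.2 }}"
)
retracted = {"10.1000/EXAMPLE.2"}

flagged = retracted_citations(sample, retracted)
print(sorted(flagged))
```

Matching on DOIs only catches sources that carry one; books such as the South African example would need a different identifier.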

by Gerard Meijssen (noreply@blogger.com) at November 15, 2017 07:25 AM

November 14, 2017

Wikipedia Weekly

Wikipedia Weekly #126 – Introduction to Wikidata

This episode is an introduction to Wikidata, where Andrew and Rob walk through some of the history and current applications of Wikidata. We discuss the basics of items, statements, properties, identifiers, and qualifiers. Some of the tools we cover include Quickstatements, Petscan, Reasonator, SQID and Scholia, as well as Wikidata Query and SPARQL. We discuss the “game” interfaces in engaging Wikidata and how Wikidata is currently used (or not) in Wikipedia infoboxes. (Errata: all mentions of 2013 as the start of Wikidata should instead be 2012. We regret the error.)
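For readers new to these terms, the following is a minimal sketch of how items, properties, statements and qualifiers relate to each other. It is not taken from the episode and is not how Wikidata is implemented: the `Item` and `Statement` classes are hypothetical, and the second statement uses a placeholder value and year.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One claim about an item: a property, a value, and optional qualifiers."""
    property_id: str                                 # e.g. "P31" (instance of)
    value: str                                       # an item ID such as "Q5", or a literal
    qualifiers: dict = field(default_factory=dict)   # property ID -> value


@dataclass
class Item:
    """A Wikidata-style item: an identifier, a label, and its statements."""
    item_id: str
    label: str
    statements: list = field(default_factory=list)

    def values_of(self, property_id):
        """Return all values stated for the given property."""
        return [s.value for s in self.statements if s.property_id == property_id]


# Q42 (Douglas Adams) is an instance of (P31) human (Q5).
adams = Item("Q42", "Douglas Adams")
adams.statements.append(Statement("P31", "Q5"))

# A statement refined by a qualifier: educated at (P69) with an end time (P582).
# The value and the year are placeholders, not real data.
adams.statements.append(Statement("P69", "Q-PLACEHOLDER", {"P582": "1974"}))

print(adams.values_of("P31"))
```

Keeping qualifiers on the statement rather than on the item reflects the idea that a qualifier refines one specific claim.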

Participants: Andrew Lih (User:Fuzheado), Rob Fernandez (User:Gamaliel)


Opening music: “At The Count” by Broke For Free, is licensed under CC-BY-3.0; Closing music: “Things Will Settle Down Eventually” by 86 Sandals, is licensed under CC-BY-SA-3.0

All original content of this podcast is licensed under CC-BY-SA-4.0.

by admin at November 14, 2017 08:58 PM

Wikimedia Tech Blog

What is technical debt, and why should you pay attention to it?

Photo by Avij, public domain.

What is technical debt?

As put simply by Dan Rawsthorne:

Although simplistic, the spirit of this definition is generally at the core of most industry experts’ own definitions. Technical debt is a metaphor that software developers use to refer to “short-term compromises” to a software project’s “code or design quality,” which in turn “make[s] the product more difficult for someone else to continue to develop, test, and maintain in the future.” A key component of the technical debt metaphor is the idea that for any type of debt, some amount of interest needs to be paid. Fundamentally, if there’s no potential for interest to accrue, then you’re not dealing with technical debt; you just have incomplete work. (Examples of this would include features not delivered, bugs not fixed, etc.)

Technical debt is important for software developers to consider because code that is hard to work with generally hampers developers’ productivity and results in less stable code.

All too often the term “technical debt” ends up being applied to a wide range of issues, and as such, becomes unmanageable. Let’s talk about what technical debt isn’t: Technical debt is NOT a bug or the lack of a feature. Technical debt is not just a fancy name for “sloppy code”.

Technical debt is generally not visible to a user of the system. It’s visible to developers and those that have to work with the source code in some capacity. That being said, technical debt can contribute to bugs and other user facing/impacting issues.

Is technical debt a code-only thing?

Until this point, I’ve discussed technical debt within the context of code. However, technical debt can be found throughout the technology stack, especially in SaaS (software as a service) organizations like Wikimedia. Outdated operating systems, lagging patch deployments, and old libraries and compilers can all contribute to a system’s overall technical debt. Much like the impact of code-based technical debt on developers, infrastructure-based technical debt can make it more difficult and unpleasant for admins and sysops to do their work.

Much of the technical debt blog series will be applicable to technical debt found throughout the stack. However, most of the examples and specifics here will be written from a developer’s/code perspective (because I’m a developer.)

What is the impact of technical debt?

Okay, so we’ve got technical debt. So what?  This is really where the financial debt analogy starts to fail us a bit.  Unlike financial debt, technical debt isn’t easy to measure either in terms of the cost of repayment or the cost of interest.  That’s because some of the costs of technical debt are unknown upfront. Applying this to the financial analogy, it would be like signing up for a range of debt with an undefined interest rate for an undefined amount of time.  Does that sound like a good idea?

On the plus side, technical debt doesn’t necessarily incur any interest if the code in question requires no change or maintenance.  For example, if you pay off the debt in full via refactoring, before it has slowed you down, it could end up being an interest-free loan.  That being said, I’d venture to say that a significant portion of legacy technical debt doesn’t fall into this category.  This is because much of the technical debt was identified retroactively due to the impact it already has had.

With technical debt, the interest costs are those associated with working with the code in its current, less-than-ideal state. This could amount to the “extra” time spent in development, or to bugs introduced due to the poor code design or implementation. Another impact that is less obvious, but potentially very significant, is that of code avoidance. An example of code avoidance might be the development of new extensions versus extending an existing extension or the core code-base itself. The impact of avoidance can be very costly both in terms of development and operations.
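To make the interest idea concrete, here is a toy model. It is a sketch, not a measurement method: it assumes the extra cost per change is known and constant, which, as noted above, is rarely true in practice. The function names and all of the numbers are made up for illustration.

```python
def interest_paid(changes, extra_hours_per_change):
    """Total extra hours spent because the code is harder to work with."""
    return changes * extra_hours_per_change


def repayment_pays_off(expected_changes, extra_hours_per_change, refactor_hours):
    """True when the expected interest exceeds the one-off cost of refactoring."""
    return interest_paid(expected_changes, extra_hours_per_change) > refactor_hours


# Code that is never touched accrues no interest.
print(interest_paid(0, 3))

# Ten changes at three extra hours each, compared with a 20-hour refactor.
print(repayment_pays_off(10, 3, 20))

# Only two such changes, compared with the same 20-hour refactor.
print(repayment_pays_off(2, 3, 20))
```

The model ignores code avoidance entirely, which is one reason real interest costs are harder to estimate than this suggests.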

All technical debt is bad—right?

In my experience, most of the discussions related to technical debt, especially the higher-level ones, work from the premise that technical debt is bad and should be avoided at all costs. However, I don’t share that premise. Not unlike financial debt, technical debt is a tool that can be used to accomplish more things in the nearer term than would otherwise be possible. That said, much of our technical debt is more in line with consumer credit card debt than mortgages or business loans.

Our goal at the Wikimedia Foundation is to transform how we manage our technical debt in order to reduce it and make the debt we do incur be conscious value-driven debt.  To help in that, it probably makes sense to take a deeper dive into defining different kinds of technical debt.  For that I turn to a piece from Steve McConnell, author of several books including Code Complete, which does a nice job creating a technical-debt taxonomy that is simple to understand but also helps when making decisions about technical debt.  He refers to debt in this way:

  • I. Debt which is incurred unintentionally
  • II. Debt which is incurred intentionally
    • II.A. Short-term debt, usually incurred reactively, for tactical reasons
      • II.A.1. Individually identifiable shortcuts (like a car loan)
      • II.A.2. Numerous tiny shortcuts (like credit card debt)
    • II.B. Long-term debt, usually incurred proactively, for strategic reasons

In this technical debt taxonomy, the first distinction is between unintentional and intentional debt.  This distinction is worth spending some time on.

Unintentional debt is … well, unintentional.  It’s not a deliberate decision to incur some debt in order to deliver something faster or easier.  Although Steve McConnell labels this as “due to low quality work”, I’d propose that it’s actually more so related to decisions or approaches that were based on the understanding/ability at the time.  An example of this is the selection of a design pattern or architecture that seemed reasonable at the time of development, but has since then been identified as complex and difficult to implement and maintain.

Intentional debt on the other hand is known when it is incurred.  That doesn’t necessarily mean it’s tracked or thought of as debt, but it requires a conscious decision to cut a corner or pursue a less desirable approach.  This is the category that we have the most opportunity to actively avoid through our decisions.

In many circumstances, intentional debt is taken on due to external schedule pressures; however, at the Wikimedia Foundation we don’t generally have the same external pressures. That’s not to say that we don’t have self-imposed schedule pressures, but it seems that we are in a better position to resist intentional technical debt that is driven by external pressures. That being said, there are other reasons that could drive the intentional accumulation of technical debt, such as the desire to get timely user/community feedback on new features. This perhaps could be called “prototype” driven technical debt.

Within intentional debt we can further break things down into short-term debt vs long-term debt. In other words, how long (in calendar months) we think it will take to repay the debt. Generally speaking, the longer-term the debt is, the more strategic the decision was in nature. An example of this is the selection of a development language like Objective-C, which is inherently more complex to work with, over Swift, which is a more modern and simpler language to work with.

The shorter-term debt tends to be more reactive and in many cases doesn’t provide much value for the debt it creates, such as not following a coding convention that eases readability.  This is the kind of debt that is most avoidable. We’ll cover more about how the taxonomy is used in our next two blog posts in this series.


In this post I’ve focused mostly on defining what technical debt is, the impact that it has, and a way to categorize it in a way that supports making decisions. In my next post, I’ll talk more about  how we can avoid incurring additional technical debt. I’ll then wrap up this series by discussing how to address currently accrued technical debt.

Interested in joining the discussion? Feel free to join the Technical Debt SIG, attend a regularly scheduled code health office hours session, and/or join us on IRC in #wikimedia-releng using the keyword “CH-TechDebt”.

Jean-Rene Branaa, Senior QA Analyst, Release Engineering
Wikimedia Foundation

by Jean-Rene Branaa at November 14, 2017 06:10 PM

Wiki Education Foundation

Announcing Wikipedia Fellows

Wiki Education is excited to announce a new pilot: Wikipedia Fellows.

Wikipedia is a resource people use every day to better understand the world. In a time when terms like “alternative facts” and “fake news” have become shorthand for a wide range of political, educational, and epistemological challenges to public knowledge, it’s crucial that we ensure the quality of our most popular source of information.

Academics are passionate about sharing knowledge and understand this exigency, but may be unsure how to participate on Wikipedia. Wikipedia is a unique writing environment, and navigating its rules, norms, and processes can be challenging for new users. That’s why Wiki Education exists. We have years of experience supporting students contributing to Wikipedia for a classroom assignment, and we often hear from academics asking for a similar support infrastructure to help them contribute their own expertise.

The Wikipedia Fellows pilot does just that.

Wiki Education is working with three of our partner associations, the American Sociological Association, Midwest Political Science Association, and National Women’s Studies Association, to recruit a small number of their members for the first cohort of this interdisciplinary pilot. From January through April 2018, participants will take advantage of Wiki Education staff expertise and training infrastructure, adapted for this project, to learn how to contribute to Wikipedia. The cohort and Wiki Education staff will have regular group meetings to discuss editing Wikipedia and to collaborate together both as learners and contributors, sharing insights from each of their backgrounds and perspectives. By the end, each person will make a substantial improvement to at least two articles.

For more information about the pilot, including some additional details for potential applicants, see the Wikipedia Fellows page of our website or email wikipediafellows@wikiedu.org. If you’re a member of one of the participating associations and would like to apply to be a Wikipedia Fellow, fill out the application here.

by Ryan McGrady at November 14, 2017 05:20 PM

November 13, 2017

Wikimedia Tech Blog

Wikipedia iOS app named an Editor’s Choice

Image by Rafael Fernandez, modified by Carolyn Li-Madeo/Wikimedia Foundation, CC BY-SA 4.0

When you have a need for knowledge, Wikipedia is there for you and the billions of others who access the free encyclopedia each month. With more and more people doing so via mobile devices, we’ve re-built our mobile apps over the last eighteen months—and as a result, the Wikipedia iOS app is among the newest to hold the “Editors’ Choice” distinction in Apple’s iTunes store.

The move comes a year after the Wikimedia Foundation re-launched its iOS app, which featured major design and usability improvements with a particular focus on user accessibility and iOS-specific features.

“It’s really great to have our team’s work recognized,” said Josh Minor, Senior Product Manager at the Wikimedia Foundation. “I’m especially happy to see that emphasizing accessibility, one of our organization’s core values, was a major factor in Apple’s decision.” These features included:

  • Using VoiceOver, users can navigate Wikipedia by voice and gesture
  • Users who are colorblind, have contrast sensitivity, or other similar visual issues will find that the Wikipedia iOS app is WCAG AA compliant, features smart color inversion, and has several different appearance themes
  • Those with less-than-perfect eyesight can take advantage of dynamic type, which increases text size across the entire app, not just for articles.

Image modified by Carolyn Li-Madeo/Wikimedia Foundation, CC0.

Approximately 19 percent of all visitors to Wikipedia use the iOS operating system, and the associated Wikipedia app is used by millions of users. The current version has a rating of 4.5 stars and is available to anyone in over 150 countries and territories around the world. Available in 95 language versions of Wikipedia, the iOS app and its Android counterpart aim to assist in Wikipedia’s core mission—becoming the sum of all knowledge—by bringing that knowledge to users’ fingertips.

“It’s been great to see the team’s diligent, incremental work paying off,” says Corey Floyd, Engineering Manager at the Foundation. “And while it is an honor to receive this distinction from our partner, it has been even more amazing to see the impact that the product has had for our users and the love that they have for the app.”

Apple’s Editors’ Choice designation is given to apps that the App Store’s editors feel are particularly worth a download and meet a variety of criteria.

Work on the Wikipedia iOS app will continue. One planned feature is synced reading lists, which will allow users to organize Wikipedia articles to read offline, and sync them across devices, even across iOS and Android.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

by Ed Erhart at November 13, 2017 09:51 PM

Wiki Education Foundation

Roundup: Sociology

We all live together in a society composed of increasingly smaller groups. Even if you focus on the individual level, people still take part in various social relationships and belong to one or more cultures. Putting these groups together forms a complex and intricate web. A sociologist can spend their entire career studying just one facet of this web in a given location. In Fall 2017, Florida International University professor Alfredo García’s classes looked at the Basic Ideas of Sociology and used their knowledge to expand Wikipedia’s coverage of sociology related topics.

Have you ever heard of a closed community? Closed communities are societies that intentionally try to sequester themselves away, thereby limiting exposure to other cultures. One student expanded this article to include more information about the concept and give examples of closed countries, such as North Korea. Other articles edited concerned topics such as self-estrangement, primary socialisation, and the cultural transformation theory. These sociology students also looked at the hand wave, a nonverbal communication that can be used as a form of greeting, farewell, or to gain the attention of a person or crowd.

Students also worked on an article about social class differences in food consumption. Prior to this course, the article contained no information about food deserts, areas that lack easy access to grocery stores or other markets that would provide fresh, healthy, and affordable foods to local citizens. Food deserts most frequently occur in poorer areas where inhabitants can lack transportation to grocery stores and funding to purchase better quality food items. Less healthy and cheaper food items, such as fast food and highly processed salty and sugary goods, are often more accessible in a food desert. Thus, people living in these areas must often choose a diet that will not provide them with the nutrition they require. This in part contributes to another aspect that the student included in the article – that some foods are believed to be more “upper class” because of the cost to obtain and prepare the foods, while other, cheaper foods are considered “lower class”.

For some, Wikipedia is the easiest way to learn about a new concept or topic, which is why contributions by teachers using the site as an educational tool can make such a big difference. If you would like to include Wikipedia editing as a learning tool with your class, contact Wiki Education at contact@wikiedu.org to find out how you can gain access to tools, online trainings, and printed materials.

Image: File:Colin Firth – 66th Venice International Film Festival, 2009 (1).jpg, by nicolas genin, CC BY-SA 2.0, via Wikimedia Commons.

by Shalor Toncray at November 13, 2017 06:04 PM

Wikidata (WMDE - English)

WikidataCon 2017

WikidataCon, the first conference dedicated to the Wikidata community, took place in Berlin on October 28th and 29th. How to summarize this event? It was a huge success, both from the point of view of the community and the organization team, and we hope that you enjoyed the event as much as we did.

We can already deliver some amazing numbers: 200 attendees, including 50 helpers, 78 speakers, and 50 scholars, not including 20+ plushies. We had more than one hundred sessions with a total of 40 hours of video recorded, we had 20 birthday presents, we had 43 countries represented and almost as many sweets on the sweets table!

We are especially happy that WikidataCon may very well be one of the best documented Wikimedia events. Thanks to the wonderful people at C3VOC, recordings of presentations were online just hours after they were given! The list of all recordings is impressive.

Topics that were discussed at the weekend include:

  • An overview of Wikidata’s past, present and future
  • Wikidata for culture and heritage (GLAM)
  • Wikidata for education and science
  • Improving the quality of Wikidata
  • The Query Service, an open door to Wikidata
  • Diving to the technical deeps: the tools around Wikidata
  • Wikidata and the Wikimedia projects

Keep watching the ever-growing documentation!

Apart from making WikidataCon a well documented event, we focussed on making it your event in as many ways as possible: a programme created by the community with a high level of participation (almost half of the attendees also gave talks or presentations) — all in all a conference to give more access to more knowledge to more people.

Giving more people more access also means the little things like inclusiveness and a friendly atmosphere for everyone. We are happy that 200 people came to Berlin to celebrate Wikidata with us.

WikidataCon will return in 2019. In 2018, celebrations of Wikidata’s 6th birthday will be organized by the community. One thing that WikidataCon has shown us all is that this community is excellent. Thank you, Dankeschön, Merci, Q2728730!

by Jens Ohlig at November 13, 2017 04:17 PM

Tech News

Tech News issue #46, 2017 (November 13, 2017)


November 13, 2017 12:00 AM

November 10, 2017

Wikimedia Foundation

As Wikipedia Asian Month returns, meet one of its top contributors: J. Patrick Fischer

J. Patrick Fischer, on the left. Photo by Burkhard Mücke, CC BY-SA 4.0.

On a random day two years ago, I was translating articles about East Timor from the English to the Chinese Wikipedia. The English articles were short and lacked information about the country—but on the German Wikipedia, there was an article for pretty much all the entries about East Timor.

While I don’t understand German, it was clear that the German articles had reliable sources, tables, and templates that I could re-use on other wikis. I opened the German version of “Barique Administrative Post”, called “Barique Subdistrict” at the time, and found an unexpectedly well-written and high-quality article with photos and customized maps. This and many other similar articles were developed by one editor: J. Patrick Fischer.

I have seen many Wikipedians with a passion for niche topics, but this was just way beyond my expectations. Unsurprisingly, I later found that some articles on the English Wikipedia, as well as the geographical articles on the Tetum Wikipedia (the official language of East Timor), were written by him too.

J. Patrick Fischer is a German Wikipedian who has been editing for twelve years. He is one of two Wikipedia Asian Ambassadors (recognized by several Asian affiliates), and was the most prolific participant in the Wikipedia Asian Month on the German Wikipedia in the past two years.

Wikipedia Asian Month is an annual global culture and editing event, held every November. It started in 2015, and last year 50 language versions of Wikipedia participated in the event along with more than 2,000 Wikipedians all over the world. Every year, the contest rewards one or two top contributors on each participating Wikipedia.

I went to Patrick’s talk page, and left a message to show how much I appreciate his wonderful efforts. It was my first time talking to him.

Unsurprisingly, when Wikipedia Asian Month started on the German Wikipedia, Patrick quickly became the most active German Wikipedian in the event. Nine new articles were contributed by him in 2015, and 58 more in the 2016 edition. However, Wikipedia Asian Month was not the first step in Patrick’s journey in writing Wikipedia articles about East Timor.

I learned that he had been to East Timor, one year after it had declared independence from Indonesia. The trip, only nine days long, kindled an intense interest in the country and helped drive him to contribute over 2,000 articles about it on German Wikipedia.

Outside Wikipedia, Patrick also contributes to Wikinews and Wikivoyage about East Timor. His work has inspired many German scholars and development workers who work professionally on East Timor and use the content that Patrick has contributed as a first point of information.

Patrick told me about a Timor-Leste-Conference in Berlin that he attended where almost everyone could recognize him because of his Wikipedia contributions. In order to add to the content quality, he made many connections with locals, including one of the president’s advisors (at the time of the conference) and the photographer of Timor-Leste’s first lady. These have helped him obtain images for Wikimedia Commons about East Timor, ranging from politicians’ portraits to remote villages.

Meeting Patrick helps me envision how Wikipedia can make the world different. Nowadays, East Timor has gone from a country that you could find nearly nothing about online to one with more information than many other countries. Most of it has come from just one person’s dedication.

Wikipedians from around the world are now participating in Wikipedia Asian Month to help increase the continent’s presence on the world’s largest encyclopedia. You can join them now by registering your name and finding where your passion will take you.

Addis Wang, Organizer
Wikipedia Asian Month

by Addis Wang at November 10, 2017 07:27 PM

November 09, 2017

Weekly OSM

weeklyOSM 381



How did you contribute with new features 1 | © Pascal Neis

About us

  • SunCobalt has written a parser in Python for the wiki calendar based on the new hCalendar format. The parser has already been successfully integrated into OSMBC so that the OpenStreetMap calendar will remain a part of WeeklyOSM even after 2017-12-31. Big thanks to Thomas on behalf of our readers as well.


  • Pascal Neis posted an enthusiastic tweet celebrating one million members who have contributed at least one changeset to OSM. The responses suggest that figure is a little high if one takes into account unsuccessful edits (empty changesets), and that the actual mark of one million data contributors will be passed in the next couple of months.
  • Antoine Riche reports (fr) about a procurement market to add and maintain bike-related data in OSM around Paris, France. A mapper has been hired and the ongoing work is tracked using #CartoVeloIDF. Carto’Cité will handle connections with the OSM and biking communities. In the light of the recent discussions on paid mapping, they make a point of documenting the whole process on the wiki.
  • Strava has updated its heatmap and written an article about the update. Evaluating three trillion latitude/longitude points (10 terabytes of raw data), the heatmap can also be used in iD and JOSM.
  • User ypid reported in his blog that there are nearly 1 million recorded opening hours in OSM.


  • [1] Pascal Neis’ tool “How did you contribute to #OpenStreetMap?” was updated. Pascal marked the changes in the tweet image. A blog post has been announced, but not yet published.
  • Since October 23rd, members of Project EOF, supported by OIF, carried out seven “OSM, FreeGIS and open data” workshops in Niamey (Niger), Ouagadougou (Burkina-Faso), Saint-Louis (Senegal), Lomé (Togo) as well as Saint-Marc, Cap-Haitien and Port-au-Prince (Haiti) and also 15 one-day actions mixing conferences, technical sessions and mapathons. Follow this on Twitter through #ActionOifProjetEOF.
  • The OSM community was recently awarded at the DINACon conference on digital sustainability. The announcement includes the list of this year’s OSM Awards, awarded at SotM 2017.
  • The Nigerian OpenStreetMap community now has a Twitter account.


  • After Paris and Nice, Vincent Frison plans to import building heights in Montpellier, France. He created a dedicated wiki page to document the process, based on different ground models and open data provided by the city.

OpenStreetMap Foundation

  • The provisional minutes of the meeting of the Licensing Working Group of OSMF on November 2nd are online. Main topics are the directive on the use of the “State of the Map” trademark and data protection issues in connection with the new EU General Data Protection Regulation (GDPR).


  • The State of the Map LatAm 2017 program is online! You can find out more about the activities here.

Humanitarian OSM

  • The International Federation of Red Cross and Red Crescent Societies (IFRC) joined Missing Maps. It aims not only to encourage engagement and support in mapping activities but also to improve data literacy and skills across the IFRC.
  • The IFRC and the International School of Geneva hosted a Missing Maps event for secondary (high) school students.
  • User MKnight writes (de) (automatic translation) a diary post about the user interface of Tasking Manager version 3.


  • Sven Geggus, maintainer of the German fork of OSM Carto, reports about its current state and the recent difficulties.
  • The developers of OSM Carto discuss a proposal to increase the minimum zoom level of some features whose rendering currently starts on zoom levels 5 to 7. Proposals for different rendering based on the density and realities of different countries were rejected because of limits of the rendering tool.


  • User baditaflorin wrote a diary entry on how to create a translation file to convert shapefiles to OSM data. Furthermore, Cygnus is used for conflation.


  • Bryan Housel merged a pull request bringing OpenStreetCam imagery into the iD editor.
  • Carto users increasingly ask to quickly render large geographic datasets with the PostGIS-Mapnik toolchain. Paul Ramsey provides an accurate performance analysis of the various processing steps, along with available optimisations and future development plans.
  • User Convergence presented a new 3D renderer on Reddit with New York as an example.

Did you know …

Other “geo” things

  • Steve tweets about the free WiFi finder that he built using Mapbox’s iOS store locator and his city Alberta, Canada’s open data.
  • Jinal Foflia tweeted a picture at the San Francisco International Airport that is using Mapbox maps with OpenStreetMap data.
  • The Guardian newspaper uses OpenStreetMap data to illustrate the effects of rises in global sea level on various major cities around the world.
  • Gerry McGovern writes about navigation-induced traffic jams in England, and more broadly about the consequences of digitalisation on modern society.
  • TechCrunch reported the acquisition by Mapbox of MapData, a neural network-based mapping startup. An augmented reality-based map SDK should be coming next year.
  • Dimi (@sztanko) tweeted a face-off between Bing Maps and Google StreetView cars.
  • An article in “The Conversation”, an online magazine, discusses the importance of ‘openness’ for the digital economy and mentions OpenStreetMap. It compares the internet’s revolution in knowledge power with Gutenberg’s first printing press in 1440.

Upcoming Events

Where | What | When | Country
Tokyo | 東京!街歩き!マッピングパーティ:第13回 大宮八幡宮 | 2017-11-11 | Japan
Santiago | Reunión OSM Chile Noviembre 2017 | 2017-11-11 | Chile
Rennes | Réunion mensuelle | 2017-11-13 | France
Taipei | OSMGeo Week Taipei | 2017-11-13 | Taiwan
Zurich | Stammtisch Zürich | 2017-11-13 | Switzerland
Lyon | Rencontre mensuelle | 2017-11-14 | France
Nantes | Réunion mensuelle | 2017-11-14 | France
Heidelberg | Disastermappers Missing Maps mapathon to support OSM GeoWeek | 2017-11-14 | Germany
Karlsruhe | Stammtisch | 2017-11-15 | Germany
Ulm | Mapping Munyu (OSM Geo Week) | 2017-11-15 | Germany
Rennes | Cartopartie humanitaire de Bamako (OSM Geo Week) | 2017-11-15 | France
Greeley | UNC Mapathon University of Northern Colorado | 2017-11-15 | United States
online via Mumble | OpenStreetMap Foundation public board meeting | 2017-11-16 | everywhere
Digne-les-Bains | Cartopartie : Libre information sur les services de santé | 2017-11-16 | France
Fort Collins | CSU Ger Community Mapping Center Mapathon Colorado State University | 2017-11-16 | United States
Helsinki | HOT-OSM Finland Geoweek Mapathon 2017 | 2017-11-18 | Finland
Bonn | 100! Bonner Stammtisch | 2017-11-21 | Germany
Lüneburg | Mappertreffen | 2017-11-21 | Germany
Nottingham | Pub Meetup | 2017-11-21 | United Kingdom
Edinburgh | Pub meeting | 2017-11-21 | United Kingdom
Lübeck | Lübecker Stammtisch | 2017-11-23 | Germany
Perl(DE)-Apach(FR)-Schengen(LU) | 1. OpenSaar-Lor-Lux Stammtisch | 2017-11-24 | Luxembourg
Lima | State of the Map LatAm 2017 | 2017-11-29 to 2017-12-02 | Peru
Yaoundé | State of the Map Cameroun 2017 | 2017-12-01 to 2017-12-03 | Cameroon
Dar es Salaam | State of the Map Tanzania 2017 | 2017-12-08 to 2017-12-10 | Tanzania
Bonn | FOSSGIS 2018 | 2018-03-21 to 2018-03-24 | Germany
Poznań | State of the Map Poland 2018 | 2018-04-13 to 2018-04-14 | Poland
Milan | State of the Map 2018 (international conference) | 2018-07-28 to 2018-07-30 | Italy

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, PierZen, Polyglot, SK53, Spanholz, Spec80, TheFive, derFred, jcoupey, jinalfoflia, keithonearth, sev_osm, wambacher.

by weeklyteam at November 09, 2017 08:19 PM

Wikimedia Foundation

Three principles in CDA 230 that make Wikipedia possible

Photo by James Petts, CC BY-SA 2.0.

Imagine an internet without social media, conversations, or the rich stores of free knowledge created by Wikipedia editors. An internet without content created and shared by anyone. That’s the internet we’d have without Section 230 of the U.S. Communications Decency Act, which provides important legal protections for websites that host user-generated content.

The U.S. Senate Commerce Committee recently approved the Stop Enabling Sex Traffickers Act (S. 1693, or SESTA), a bill intended to address online sex trafficking which would threaten core protections granted by Section 230. The House of Representatives has considered similar changes in recent months. As lawmakers reexamine parts of Section 230, it’s important to remember the law’s goal and essential elements.

The Wikipedia we know today simply would not exist without Section 230. User-driven projects could not thrive if websites were subject to greater liability for user content, and certainly could not be supported by a small nonprofit organization like the Wikimedia Foundation. For that reason, we have some serious concerns about the potential impact of SESTA and other amendments to Section 230. That’s why our Executive Director emphasized Section 230’s importance for Wikipedia’s hundreds of thousands of volunteer contributors in a recent campaign by the Electronic Frontier Foundation, and why we submitted a letter to the Senate Commerce Committee expressing the importance of Section 230 for the Wikimedia projects. The current bill does not reflect the careful balance that preserves small, nonprofit community projects like Wikipedia.

Here is how this balance works.

  1. Website operators need freedom to review content without legal risks

The fundamental goal of Section 230 is to keep the internet free and safe by encouraging operators to host free expression and remove problematic content without the disincentive of possible lawsuits.

SESTA introduces a vague standard for website operators that expands liability for “knowing” support of certain criminal activity. This will encourage websites to avoid gaining knowledge about content (to avoid liability) instead of actively engaging in content moderation.

As currently drafted, SESTA would amend the federal sex trafficking statute (18 U.S.C. § 1591) to state that participation in a sex trafficking venture occurs when a party, such as a website, is “knowingly assisting, supporting, or facilitating” a sex trafficking crime. While clearer than the broad “knowing conduct” standard that appeared in earlier versions of SESTA, this language could potentially make websites unintentionally liable for facilitating criminal activity if they engage in proactive, yet imperfect, monitoring efforts.

The Wikimedia projects are maintained by the collective monitoring efforts of thousands of volunteers worldwide, who promptly remove vandalism and other content that does not follow the community policies. The Wikimedia Foundation Terms of Use provide another baseline for the Wikimedia projects, and content that violates those Terms is often removed by volunteer users, or may be reported to the Foundation for removal. The Wikimedia Foundation is able to rely on this community self-governance model, in large part, due to the existing clear protections provided by Section 230.

The ambiguity of the “knowledge” standard in SESTA poses problems for any website that welcomes user-generated content, including those operating on a community self-governance model. Any new laws affecting websites must be drafted carefully to state website operators’ obligations and responsibilities when they are alerted to unlawful content. There must be clear methods for reporting such content to the appropriate law enforcement authorities, without accidentally triggering liability. Without these clear guidelines, website operators will be unable to review new content without risking significant new liability.

  2. The internet needs consistent national standards, not state-specific rules

The internet is built on connectedness, and its advantages come from the ability to share information across borders. Section 230 is a federal law, providing one single, clear standard across all 50 states for when websites can and cannot be held liable for the speech of their users. Website operators in the United States should not have to navigate 50 different, potentially incompatible state rules.

SESTA would amend Section 230 to allow, for the first time, civil and criminal liability for websites under state law as well as federal law in cases where the federal sex trafficking law has also been broken. This improves upon an earlier version of the bill, which would have allowed for much broader liability under state law. Website operators should not have to monitor and attempt to comply with differing laws in all 50 states. Doing so would require substantial time and resources just to stay aware of new laws and ensure compliance, which would be particularly difficult for a small company or nonprofit like the Wikimedia Foundation. It also would put operators in an impossible bind if two states passed laws with contradictory requirements.

The latest version of SESTA avoids the most troubling consequences that could result from competing state standards. Amendments to Section 230 must not upend the balance and predictability of a single national standard that websites have relied on for over 20 years.

  3. The law should not create barriers to smaller website operators and new innovation

The original goal of Section 230 was to provide legal protection for website operators and create room for new forms of innovation. Over 20 years later, these protections remain most crucial for small and emerging platforms.

When plaintiffs target online speech, they often go after the website, not the speaker. It can be difficult to track down individual users, and suing a website may appear to be more lucrative. For two decades, Section 230 has protected websites with a shield from civil liability for user-created content. Critically, Section 230 does not prevent websites from being held responsible for their own actions—websites that are directly involved in illegal activities can already be prosecuted by the Department of Justice. However, SESTA would open up websites to more liability under federal and state law, likely resulting in increased litigation. Some of these lawsuits will be legitimate responses to improper conduct by websites; others may simply target the website over the speaker as an easier way to attack online speech. Even if these lawsuits are meritless, getting them dismissed demands significant time and resources.

Small internet companies, startups, and nonprofit websites like the Wikimedia projects lack the resources to defend against a flood of lawsuits. Websites shouldn’t be sued into the ground, or afraid to even launch, simply because of holes in Section 230’s protections. Any amendments to Section 230 must take into account their effects not just on large, well-funded tech companies, but on startups and nonprofit organizations as well.


We believe that Congress got the balance right when it passed Section 230 back in 1996. When users go online, they are responsible for their own words and actions. When websites act in good faith to keep their communities healthy and free of toxic content, they know that they can do so without undertaking new risks. When websites engage in unlawful conduct themselves, Section 230 provides no shield from prosecution. For over two decades, Section 230 has encouraged good-faith content moderation, under a single federal standard, and protected not only large websites, but also small startups and nonprofits. We urge Congress to avoid disrupting the balance that has made projects like Wikipedia possible.

Leighanna Mixter, Technology Law and Policy Fellow, Legal
Wikimedia Foundation

This post is also available on Medium.

by Leighanna Mixter at November 09, 2017 07:29 PM

Wiki Education Foundation

Giving local history a global audience on Wikipedia

University of Mississippi student Skylar Sandroni was already excited about working with Wikipedia in a college class, even before beginning Robert Cummings’s Writing with Wikipedia course. An enthusiastic friend, who had done a Wikipedia assignment previously, recommended the course to her. The curriculum itself also sparked interest for the English major, who is all too familiar with traditional academic assignments.

“He was excited about it, and it made me excited about it,” she says of this initial exposure. “At the time, I was pretty tired of writing the typical paper. So changing the medium of writing was a very interesting concept to me.”

As part of his course learning objectives, Robert wanted students to gain confidence and fluency in collaborative online writing environments.

“I firmly believe that our students need practical experiences writing in team environments, collaborating online over networks, resolving conflict online, building consensus around facts, producing (rather than merely consuming) information, learning to adapt to an external rhetorical framework, and learning to write beyond academic conventions,” he says. “And, oh yeah, it’s fun!”

Skylar took this new writing opportunity as a chance to better understand and to build upon her city’s history. She chose to focus her research on a well-known public figure where she lives in Oxford, Mississippi: American chef and recipient of the James Beard Award, John Currence.

“Before starting this project I hadn’t known the profundity of accepting a James Beard Award, only that it was another notch in the belt buckle,” Skylar says of her early research. “Only the best chefs in America can claim this title, and I ended up doing a bit more research about Currence and found no one had written an article about him!”

“This article has value as Currence is an award-winning chef, who has had success in blending Louisiana influences into contemporary (US) Southern cooking,” Robert says of Skylar’s work. “By creating a freely available encyclopedia article on Currence, a global audience will now be able to identify him and his work, and his regional significance, and a local audience will be able to better understand the significance of his career within the context of larger cuisine traditions.”

Thus, not only does the Currence article benefit curious Mississippi residents wishing to know more about local history, but as Robert says, because of Wikipedia’s platform, this local history has a global audience.

“Sometimes our faculty feel that our campus and our students can be a bit isolated,” Robert says. “Writing for Wikipedia is like running your own micro-internationalization program within your classroom. Outside of Study Abroad programs—which would expose students to one selected environment—there is no better way to help students engage different perspectives than asking them to write for a global platform.”

With this initial fire to learn more and to tell Currence’s story, Skylar began familiarizing herself with the ins and outs of Wikipedia editing. “I knew virtually nothing about Wikipedia much less writing on the forum,” Skylar says of her previous understanding of the platform. Editing Wikipedia had seemed to her “like a club that we weren’t invited to.” Skylar soon found, however, that writing an article was a natural process, yielding an outcome that exceeded her expectations. Even after encountering technical challenges, such as understanding the difference between hyperlinks and citations, Skylar completed the course more knowledgeable of the platform and proud of the work she had produced.

As the instructor, Robert also had a positive experience with Wiki Education’s infrastructure of support.

“The Dashboard tool was fantastic. It helped me, and, more importantly, my students, visualize our contributions in one space. It was much easier for us to understand our collective and individual impacts on Wikipedia by using the Dashboard. It became the focal point of our classroom as we worked together and individually to make contributions to Wikipedia.”

Skylar embraced the collaborative editing process, accepting input from other Wikipedians about her work. Fellow editors deemed parts of her article promotional, to which she gracefully concedes: “Looking back on it, they were right.” Skylar looks forward to how her final product may continue to evolve this way.

“I enjoyed writing this article and it has become something that I can be truly proud of,” she says. “I think it will be exciting to look back five years from now and see how much my article can and will change!”

Robert’s reflections on what students had learned over the course of the semester are consistent with Skylar’s experience.

“In addition to becoming much more aware of how knowledge is produced through Wikipedia, students became much more confident about writing collaboratively and standing up to challenges on the appropriateness of their contributions,” he says. “They learned to admit when they made mistakes, but along the way they began to better understand why certain mistakes happened. They also learned a different approach to research: they quickly understood that on most pages they absolutely needed to have their facts in order before they could make contributions which would stick.”

Robert isn’t new to Wikipedia; he literally wrote the book on teaching writing in the age of Wikipedia. But this was the first class he’d done with the support of Wiki Education.

“Thirteen years in, writing for Wikipedia is just as fun, just as challenging, and just as innovative as it was in 2004,” he says. “Only now, I have the tremendous help of Wiki Education!”

If you’re interested in learning more about the free educational resources Wiki Education provides, reach out to us at contact@wikiedu.org.

Interview with Skylar Sandroni by Ryan McGrady; interview with Robert Cummings by LiAnna Davis; and blog text by Cassidy Villeneuve.

by Cassidy Villeneuve at November 09, 2017 06:00 PM

Gerard Meijssen

Judith Butler in #Brazil - a reaction in the #Wiki way

When the news has it that an effigy of Mrs Judith Butler was burned in Brazil, it is time to give some attention to Mrs Butler. There is information about her and the papers she published, and one way of adding to the relevance of Mrs Butler is by increasing the number of people she is connected to.

In 2012 she was awarded the Lyssenko award. Adding that date and the other award winners works in two ways; Mrs Butler is better connected but the other award winners are better connected as well.

There is an article for Mrs Butler in English Wikipedia, but given that it is a French think tank that conferred this award, chances are that not everyone on this list has an English article. There are projects that suggest articles to write. Adding awards in this way may feed those projects. I hope so. For me that would be the best outcome that could be achieved.

by Gerard Meijssen (noreply@blogger.com) at November 09, 2017 08:57 AM

#Wikipedia - Ischia International Journalism Award & the Polk Award

When people win awards, they often win multiple awards. Harrison Salisbury won several awards, not only the Polk Award. The Ischia award did not have a date associated with it. I used Awarder and the data from the Italian Wikipedia because that was most convenient.

There was no article for Mr Salisbury in Italian and consequently there was no date associated with him. Mr Salisbury is represented with a red link. It indicated 1990 and it was an easy manual edit.

As you can imagine, that red link could link to the information about Mr Salisbury on Wikidata. Showing this information to those who are interested in writing a Wikipedia article in Italian does provide pertinent information, information that should coincide with the new article. By comparing the information in Wikidata and in existing Wikipedia articles you know that the article is likely to be correct.

by Gerard Meijssen (noreply@blogger.com) at November 09, 2017 06:30 AM

This month in GLAM

This Month in GLAM: October 2017

  • Australia and New Zealand report: Adding Australian women in research to Wikipedia
  • Brazil report: Integrating Wikimedia projects into the Brazilian National Archives GLAM
  • Bulgaria report: Botevgrad became the first wikitown in Bulgaria
  • France report: Wiki Loves Monuments; Opérations Libres
  • Germany report: GLAMorous activities in October
  • Italy report: Experts training on GLAM projects
  • Serbia report: Wikipedian in residence at Historical Archives of Subotica; Model of a grain of wheat exclusively digitized for Wikimedia Commons; Cooperation of the Ministry of Culture and Information and Wikimedia Serbia – GLAM presentations and workshops for museums, archives and libraries
  • Spain report: Women Writers Day
  • Sweden report: Swedish Performing Arts Agency; Connected Open Heritage; Internetmuseum; More Working life museums
  • UK report: Scotland’s Libraries & Hidden Gems
  • Ukraine report: Wikitraining for Librarians; Library Donation
  • USA report: trick or treat
  • Wikidata report: WikidataCon & Birthday
  • WMF GLAM report: News about Structured Commons!
  • Calendar: November’s GLAM events

by Admin at November 09, 2017 02:21 AM

November 08, 2017

Wikimedia Tech Blog

Wikimedia Foundation appoints Toby Negrin as Chief Product Officer

Photo by Myleen Hollero/Wikimedia Foundation, CC BY-SA 3.0.

The Wikimedia Foundation is excited to announce the appointment of Toby Negrin as Chief Product Officer. Toby has been leading the Foundation’s product development in an interim capacity since April. He brings nearly 20 years of experience of integrating data, research, and design to produce effective and popular products.

The Wikimedia Foundation is the nonprofit organization that supports Wikipedia and the other Wikimedia projects. Together, Wikipedia and the Wikimedia projects are visited by more than a billion unique devices every month. The Wikimedia Foundation is driven by its mission to build a world in which every single person can freely share in the sum of all human knowledge.

As Chief Product Officer, Toby will lead one of the organization’s largest departments in collaborating with the Wikimedia communities on the development of the software products which operate Wikipedia and the other Wikimedia projects. He will be responsible for cultivating a culture of excellence, engagement, and sustainability within the product engineering teams. Toby will also work with others in the organization’s leadership on responding to the product needs and requests of the Wikimedia communities.

“Toby has demonstrated a style of leadership focused on outcomes, collaboration with the Wikimedia community, and respect for Wikimedia values,” said Katherine Maher, Executive Director of the Wikimedia Foundation. “He has a deep interest in partnership with our communities, an instinct for assembling and cultivating teams, and a record of supporting staff as they develop and explore new skills and roles. I am thrilled to have him leading the Wikimedia Foundation’s product development efforts.”

Since joining the Wikimedia Foundation in 2013, Toby has led an expansion of our analytics efforts, built a new team focused on people who utilize the content developed by Wikimedia’s volunteers, and championed the development of intuitive mobile interfaces and mobile applications. Toby played a critical role in the creation of the organization’s Community Tech team, which develops features and tools based on feedback from active members of the communities. He was also a part of the creation of the New Readers program, a cross-departmental effort to help new people discover and benefit from Wikimedia’s projects.

Prior to joining the Wikimedia Foundation, Toby led analytics efforts at DeNA, a mobile social games company. At DeNA, he partnered with colleagues in Japan and China to build global dashboards used to track gaming performance around the world. He also held roles at Yahoo! related to cloud platforms, anti-abuse efforts, and content moderation. Toby grew up in Los Angeles and the UK before landing in the San Francisco Bay Area, and worked in software development at startups in Sweden and The Netherlands.

“The Wikimedia movement is an amazing collaboration among editors, developers, chapters and other organizations around the globe,” Toby said. “I am excited and honored to be part of the movement and look forward to partnering with our communities to build the tools and experiences for the next 15 years.”

Toby graduated from the NIMBAS Graduate School of Management and University of California – Santa Cruz. In his free time, he enjoys spending time winemaking and grape growing, running and hiking Bay Area trails, playing tennis, and supporting the Golden Gate Philharmonic youth orchestra.

by Wikimedia Foundation at November 08, 2017 08:47 PM

Wiki Education Foundation

Monthly Report, September 2017


  • In August, we announced a new partnership with the American Studies Association (ASA), which promotes the development and dissemination of interdisciplinary research on U.S. culture and history in a global context. We are excited to welcome more American Studies classes into our Classroom Program through this partnership.
  • We also announced a new partnership with the National Communication Association (NCA), which advances Communications as a discipline that studies all forms, modes, media, and consequences of communication through humanistic, social scientific, and aesthetic inquiry. Communication Studies students have been improving Wikipedia through our Classroom Program for many years, and we’re thrilled to see more growth in this content area due to this partnership.
  • We placed a new Visiting Scholar with the Deep Carbon Observatory (DCO). Andrew Newell, who in his day job is an Associate Research Professor in the Marine, Earth, and Atmospheric Sciences department at North Carolina State University, specializing in rock magnetism, will be developing Wikipedia’s coverage of topics relevant to deep carbon science in his role as Visiting Scholar.
  • There have been many notable contributions to Wikipedia by Visiting Scholars this month. But George Mason University Visiting Scholar Gary Greenbaum’s three promotions to featured article (his work on the Waterloo Medal, William Henry Harrison’s 1840 presidential campaign, and the British florin) in a single month is especially remarkable.
  • Product Manager Sage Ross has developed a new tool for measuring quality, not just quantity, of impact of student work on Wikipedia. Article development data, or what we call ‘structural completeness’, can be plotted at the scale of thousands of articles at once. As a result, the Dashboard now has a tool that enables Wiki Education staff to build graphs of the impact of entire terms.


Educational Partnerships

Jami speaks with a potential instructor at the American Political Science Association conference.

Early in the month, Educational Partnerships Manager Jami Mathewson, Outreach Manager Samantha Weald, and Classroom Program Manager Helaine Blumenthal attended the American Political Science Association conference here in San Francisco. We spent the week discussing the massive gap between what experts know and what the public understands about political science. So many experts research important, interesting topics about governments and political behavior, yet the information can be difficult to understand if you haven’t spent a career devoted to it. That’s where Wikipedia—and assignments that enable political science students to share distilled knowledge there—comes in. We’re excited to continue working with even more political science professors in our programs.

UC Berkeley students edit Wikipedia.

Samantha visited the University of California, Berkeley and joined faculty at the American Cultures Center, a center committed to fostering academic excellence and civic engagement around issues critical to America’s dynamic ethnic, racial, and sociocultural landscape. The American Cultures Center has a set of engaged scholarship courses that take the study of important issues outside of the classroom, aiming to provide opportunities for students to participate in collaborative social justice projects alongside community organizations like Wiki Education. At the workshop, Samantha met with instructors teaching in global studies, cultural anthropology, environmental design, and bioengineering among others, with courses primed and ready to participate in the Classroom Program.

Jami returned to Washington, D.C. to meet with academic association staff and encourage new Wikipedia initiatives they can use to engage their members in public scholarship. While in town, she joined faculty at Howard University’s Center for Excellence in Teaching, Learning, and Assessment (CETLA) to run a Wikipedia workshop, and Wiki Education looks forward to future collaborations with the center and instructors in attendance.

We’re thrilled to announce two new partnerships this month. We will begin a Wikipedia initiative with the American Studies Association (ASA), an organization promoting the development and dissemination of interdisciplinary research on U.S. culture and history in a global context. We are launching a new partnership with the National Communication Association (NCA), an organization advancing Communication as the discipline that studies all forms, modes, media, and consequences of communication through humanistic, social scientific, and aesthetic inquiry. ASA and NCA are eager to encourage their members to participate in Wiki Education’s Classroom Program to increase the availability of information about American Studies and Communication from multiple perspectives on Wikipedia.

Classroom Program

Status of the Classroom Program for Fall 2017 in numbers, as of September 30:

  • 255 Wiki Ed-supported courses were in progress (142, or 56%, were led by returning instructors)
  • 3,653 student editors were enrolled
  • 61% of students were up-to-date with the student training modules
  • Students edited 1,380 articles, created 4 new entries, and added about 210,000 words to Wikipedia.

The Fall 2017 term is well underway, and our students are beginning to dive into their Wikipedia assignments. They’re creating their user accounts, taking the training modules that will prepare them for their later contributions, and generally familiarizing themselves with a site they likely thought they knew well already. New courses are continuing to roll in as schools on the quarter system begin their fall terms, and we’re even beginning to think forward toward Spring 2018.

Continuing our efforts to provide our instructors with a variety of ways to connect with the Wiki Education team, we held two sessions of Office Hours earlier this month. During these sessions, instructors are invited to join members of the Classroom Program team through a video chat interface to ask any questions about their Wikipedia assignments. It’s also a unique opportunity for instructors to meet and interact with other instructors using Wikipedia in their courses. While we love the chance to share our expertise, it’s even more satisfying when instructors are able to advise one another on what works best for them. We’ll be holding Office Hours throughout the Fall term as students enter the different stages of their assignment.

As always, we’re excited to see the range of contributions our students will make this term. With courses ranging from Behavioral Ecology to the History of Science in Latin America, we know our students will continue to fill in important content gaps on Wikipedia.

With the 2020 summer Olympics only three years away, there’s still plenty of time to soak up information about the culture of Japan. Thankfully one of the students in Elizabeth Bonsignore’s Introduction to Information Science class is here to help! This student expanded the article’s coverage of Japanese religions, namely Shintoism and Buddhism. Now readers will be able to quickly find basic information about these religions in this article that will whet their appetite to learn more. Who wouldn’t be fascinated to learn that the early Japanese were initially drawn more to Buddhism’s art, as it was hard for most to understand the religion’s difficult philosophical messages? Or that Shintoism was present in Japan before the country’s introduction to Buddhism in the 6th century C.E.?

While we’re on the subject of Japanese culture, many may recognize the phrase “Gotta catch ’em all!” from the well-loved Pokémon franchise. This franchise isn’t limited to animated films, television, and video games; it also has its own trading card game where players can pit their own “pocket monsters” (or Pokémon) against those of other players. Another student from Bonsignore’s class expanded the article on the trading card game to give more information on recent additions. The game’s popularity seems to be ever increasing, and players of all ages can even participate in official tournaments held by the Pokémon Company International.

Species articles provide a rich vein for students to create new articles or expand existing ones. And students in Jim Cohen’s Ecology class did just that. Alternanthera philoxeroides is an aquatic plant which is native to Argentina but has spread around the world, hitching a ride on cargo ships. A student in the class expanded the fairly short Wikipedia article, adding information about many of the impacts this invasive species has on vegetation, animals, and human activities. They expanded the article to look at some of the measures meant to control the spread of this species. Other students in the class created articles on other plant species—Verbena stricta, Tanacetum huronense, Platanthera hookeri and Halenia deflexa. Some took a different approach, creating a List of waterfalls of the Delaware Water Gap and an article about Genetic ecology. Yet others expanded existing articles.

Image: Miniatur aus dem böhmischen Cantionale des Literaten-Chors in Chrudim, 1570.jpg, public domain, via Wikimedia Commons.
Image: Artistic indoor pictures.jpg, by Nate Merrill, CC BY-SA 2.0, via Wikimedia Commons.

Butterfly species also got a boost from students in Joan Strassmann’s Behavioral Ecology class. Although the class has just gotten started, students in the class have already made major expansions to the Grizzled skipper, Indian mealmoth, and Anthocharis cardamines articles.

Two students in Ericka Menchen-Trevino’s Understanding Media class uploaded images as part of their coursework. One added a freely licensed photo from Flickr, while another uploaded a brush drawing from the 1800s.

Community Engagement

Community Engagement Manager Ryan McGrady split his time between developing materials for the Future of Facts Wikipedia Fellows pilot program outlined in our Annual Plan and working with new Visiting Scholars, sponsors, and prospective sponsors at different stages of the onboarding process.

Andrew Newell, new Visiting Scholar
Image: RockMagnetist.tif, by RockMagnetist, CC BY-SA 4.0, via Wikimedia Commons.

We were glad to announce a new Visiting Scholar, Andrew Newell, placed with the Deep Carbon Observatory (DCO). Andrew is an Associate Research Professor in the Marine, Earth, and Atmospheric Sciences department at North Carolina State University, specializing in rock magnetism. On Wikipedia, he edits as User:RockMagnetist, a long-time contributor and administrator. If you’ve read about geophysics-related subjects on Wikipedia, there’s a very good chance you could find his username somewhere in the articles’ edit histories. With the DCO, he will be developing Wikipedia’s coverage of topics relevant to deep carbon science. Read more about this collaboration in the announcement on our blog.

Existing Visiting Scholars produced some great work this month. We featured a pair of articles by Eryk Salvaggio on our blog: Oregon black exclusion laws and an event which spurred one of those laws, the Cockstock Incident. We mentioned in last month’s report that the Cockstock Incident appeared as a Did You Know on Wikipedia’s Main Page. This month, the article about Oregon black exclusion laws likewise appeared on the Main Page. Eryk also started the article about black cowboys, adding yet another Did You Know to his collection: “[Did You Know] that in the American West from the 1860s to the 1880s, nearly 25% of cowboys were black?”

Danielle Robichaud’s term as Visiting Scholar at McMaster University ended some months ago. One of the great examples of her contributions was the Canadian Indian residential school system article, which she continued to develop and shepherd through the Featured Article Candidates process. Now, the exceptional entry has finally been promoted as an example of Wikipedia’s finest articles.

Catharine Sedgwick
Image: Catherine Sedgwick (crop).png, public domain, via Wikimedia Commons.

Rosie Stephenson-Goodknight, Visiting Scholar at Northeastern University, continued to improve Wikipedia’s coverage of women writers. Among others, this month she added Louisa Susannah Cheves McCord, Minnie S. Davis, and Catharine Sedgwick.

We know that George Mason University Visiting Scholar Gary Greenbaum is a skilled Featured Article writer, but three promotions in a single month is a lot even for him. The Waterloo Medal, William Henry Harrison’s 1840 presidential campaign, and the British florin all joined the highest level of quality this month.

Finally, our newest Visiting Scholar, Paul Thomas at the University of Pennsylvania, whom we had not yet formally announced in September, hit the ground running with a Featured Article about the first-century poem Astronomica by Manilius.

Program Support


In September, we focused on revamping our Editing Wikipedia brochure, the main guidebook for student editors in our program. The updated 2017 edition was published at the end of the month, and contains minor wording tweaks and updated screenshots to make how to edit Wikipedia clearer for our program participants. We also published a new discipline-specific handout, Editing Wikipedia articles about History, which is available as both a PDF and in print for participating instructors.

As part of our ongoing efforts to improve our help resources, we also updated the text of some of our online training modules for students.

Digital Infrastructure

The main areas of Digital Infrastructure work in September were: exploratory data visualization of students’ impact on article quality; bringing our new Desk.com-based ticketing system into full-scale use; and improvements to the training system on the global Programs & Events Dashboard. We also saw an influx of first-time code contributors; the Dashboard codebase has now reached (and passed) the milestone of 50 different contributors.

Structural completeness before-and-after Spring 2017, for the 1,156 already-existing articles that our students added at least 6,000 bytes of content to. The Dashboard can now produce similar graphs on demand for any set of courses.

As part of our preparation for our first Wikimedia Annual Plan Grant, we wanted to come up with an automatable metric that we could use to get at the quality, and not just the quantity, of our programs’ impacts on Wikipedia. We’ve been using what we call ‘structural completeness’, based on the ORES artificial intelligence system, to visualize the development of individual articles. This month, Product Manager Sage Ross took that ORES data and started organizing and plotting it at the scale of thousands of articles at once, visualizing the before-and-after distribution of article quality (according to ORES) for an entire term. As a result, the Dashboard now has a tool to build graphs of the impact of entire semesters. Sage also worked toward repeating Kevin Schiroo’s summer 2016 data science experiments, using data from more recent semesters.
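
To make the idea concrete, here is a minimal Python sketch of how a before-and-after distribution could be built from ORES output. It assumes that a ‘structural completeness’ score is a weighted sum of the ORES article-quality class probabilities; the weights, the function names, and the bin width are illustrative assumptions, not the Dashboard’s actual code.

```python
from collections import Counter

# Hypothetical weights, one per ORES article-quality class (an assumption,
# chosen so that scores fall between 0 and 100).
CLASS_WEIGHTS = {"Stub": 0, "Start": 20, "C": 40, "B": 60, "GA": 80, "FA": 100}


def completeness(probabilities):
    """Collapse per-class probabilities into a single 0-100 score."""
    return sum(CLASS_WEIGHTS[name] * p for name, p in probabilities.items())


def distribution(scores, bin_width=10):
    """Count how many scores fall into each bin, keyed by the bin's lower edge.

    A score of exactly 100 is placed in the top bin rather than a bin of its own.
    """
    top_bin = 100 - bin_width
    counts = Counter(
        min(int(score // bin_width) * bin_width, top_bin) for score in scores
    )
    return dict(sorted(counts.items()))


# Usage: compute one distribution from the revisions at the start of term and
# another from the revisions at the end, then plot the two side by side.
```

Comparing the two dictionaries bin by bin shows whether articles shifted toward the higher-quality end over the term.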

The Classroom Program team worked out most of the kinks with our new Desk.com-based ticketing system. It is now the default tool we use to respond to support requests and time-sensitive issues for current courses.

Following up on a lot of excitement generated at Wikimania, Sage ported all the basic Programs & Events Dashboard training modules over to meta.wikimedia.org, which means that they can be translated and updated on the wiki. Sage also overhauled the code for importing and updating this on-wiki training data, which was necessary to keep up with the increasing scale of modules and translations being created.

For the Wiki Education training modules, Sage addressed one of the top requests from student feedback; the ‘Evaluating Articles & Sources’ module now includes a live on-wiki tutorial for newcomers to practice adding citations.

Finance & Administration / Fundraising

Finance & Administration

For the month of September, expenses were $143,761 versus the approved budget of $172,420. The $28k variance can be attributed to some staffing vacancies ($10k) and less than anticipated spending on professional services ($6k), travel ($6k), and general operating expenses ($6k).

Our year-to-date expenses of $442,050 are also less than our budgeted expenses of $537,758, by about $95,700. Areas where our spending was significantly under budget include staffing ($16.4k), professional services ($41.8k), travel ($15.7k), and general operating expenses ($21.6k).


Fundraising

In September, we received $375,000 in general operating support from the William and Flora Hewlett Foundation. TJ Bliss, Director of Development and Strategy, continued to cultivate relationships with funders who we felt might be interested in Wiki Education because of our work on the Future of Facts, Guided Editing, and Sustaining Science initiatives. We also began exploring funders that are interested in supporting journalism, as we understand journalism fairly broadly to include citizen-created/collaborative/non-credentialed type work, such as Wikipedia.

We also submitted our first Annual Plan Grant to the Wikimedia Foundation’s Funds Dissemination Committee (FDC). The FDC is a volunteer group that determines large grants to support activities that further the mission of the Wikimedia movement.

Office of the ED

Current priorities:

  • Preparation of the upcoming strategic planning meeting in October
  • Reorganization of Wiki Education’s financial and administrative support

In September, Frank hired Özge Gündoğdu as Wiki Education’s new half-time Office Manager and Executive Assistant. Özge will help with everything that’s needed to ensure the smooth day-to-day operation of our office. She will also support Frank with organizing board and all-staff meetings. Frank also signed a contract with Denise Donovan, who will provide financial services to Wiki Education on an hourly basis. Bill Gong, who was actively recruited for and accepted a position as Director of Finance and Business within UC Berkeley, agreed to stay on in an hourly position as Financial Advisor.

Also in September, Frank worked closely with Director of Programs LiAnna Davis on finalizing the FDC Annual Plan Grant application.

by Cassidy Villeneuve at November 08, 2017 06:57 PM

Running a newbie-friendly free software project

Our main tech project, the Wiki Education Dashboard, is a free and open-source web application. Built with Ruby on Rails and React.js, it’s a codebase that started in late 2014 as an ‘agency’ project: we hired the firm WINTR to develop the software we needed. For its first year, it was almost exclusively built through full-time, paid development — with the exception of some language support features that were contributed early on by Wikimedia Foundation engineer Adam Wight. For our 2016-17 annual plan — with a much smaller budget for paid development — we wanted to explore the prospect of building a volunteer development community around the project.

We recently reached the milestone of 50 different contributors to the Dashboard’s code repository on GitHub — about 2/3rds of whom I would describe as newbies. Progress has been gradual and we’re still working on making the project better for newcomers. But this seems like a good time to reflect on what I’ve learned about running a newbie-friendly free software project.

Newcomer-friendly issues

The most important thing is to have some straightforward, self-contained issues lined up. This should come as no surprise to Wikipedians, who have been keen on the newbie-attracting power of typos and redlinks to missing articles for years. My advice:

  • Identify specific, easily-understood issues that only require understanding one part of the system. Knowing where and how to start is often the biggest challenge with a new codebase, even for an experienced developer.
  • Tag issues with the skills and/or technologies involved. This helps people find issues that they are confident enough to dive into. For the Dashboard, we use four categories for this: Design, HTML/CSS, Ruby/Rails, and Javascript/React. Some issues span multiple areas, but for first-time contributions especially, the fewer skills involved, the better.
  • Refactoring and other code quality issues make great issues for newcomers.
  • Document the knowledge you have about how to solve an issue. Which files need to be changed? Can you break it into several steps? Any good resources that would be particularly helpful?
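
As an illustration of why skill tags help, here is a small Python sketch that picks first-issue candidates for a newcomer from a list of tagged issues. The four skill labels are the ones mentioned above; the issue records, the extra labels, and the function name are hypothetical, and this is not how the Dashboard’s issue tracker is actually queried.

```python
# The four skill categories used for Dashboard issues.
SKILL_LABELS = {"Design", "HTML/CSS", "Ruby/Rails", "Javascript/React"}


def first_issue_candidates(issues, known_skills):
    """Return issues whose skill labels are all known, fewest skills first.

    Each issue is a dict with a "labels" list. Labels outside SKILL_LABELS
    are ignored, and issues with no skill label at all are skipped.
    """
    known = set(known_skills)

    def skills_of(issue):
        return set(issue["labels"]) & SKILL_LABELS

    matches = [i for i in issues if skills_of(i) and skills_of(i) <= known]
    # sorted() is stable, so issues with the same skill count keep their order.
    return sorted(matches, key=lambda issue: len(skills_of(issue)))
```

Sorting by the number of skills involved reflects the point above: for a first contribution, the fewer skills an issue needs, the better.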

Some of the biggest barriers:

  • Getting a development environment set up is hard. Even beyond project-specific requirements, the diversity of systems, tutorials and ways of setting up particular tools means that some people run into trouble installing MySQL, Ruby, Node.js, or some other core dependency. And when that happens, it can be very hard to debug.
  • Javascript package management and build tools are complex and error prone. As of late 2017, the Javascript ecosystem seems to be settling down a little bit, but new and interesting environment-specific bugs still come up pretty frequently when newbies install all the assorted Javascript dependencies and build the project assets.
  • For a complex project like this one, creating the local data needed to test and play with an issue is often harder than fixing the code.

Finding contributors

When it comes to attracting contributors, I’ve tried a lot of different things — and a lot of them have worked! We’re fortunate to be part of the Wikimedia technical community, which participates in things like Outreachy, Google Summer of Code, and Google Code-In — all of which have brought in great contributors. For several months, I focused on working with the AgileVentures community, which connects self-taught web developers with nonprofit tech projects. After the initial effort of getting to know the community and documenting our project, AgileVentures has become a steady source of contributions and contributors. I signed up for CodeTriage on a lark one day, more than two years ago, and recently connected with the first contributor who found their way to the Dashboard project from there. We’ve had contributions from people who use the Wikimedia Programs & Events Dashboard and want to scratch their own itch, and from dev bootcamp students assigned to contribute to an open source project. (I’ve also gotten lots of help from my local Ruby community, Seattle.rb.)

My tentative advice for finding contributors is to get the word out in a lot of different places, and take a long-term approach to helping new contributors get started whenever they show up. For the Dashboard at least, and probably for many similar web app projects that serve a particular community, nearly all the contributors got involved because of outreach efforts; waiting for contributors to simply show up on their own would not have worked.

The good and the bad

I’m really pleased with our progress on making the Dashboard a newbie-friendly project, and at this point I think it’s been worth the (very significant) time investment. But it’s not a short-term solution to getting more done: with most contributors I spend about as much time mentoring as it would take me to do the same tasks myself, until they’ve spent a few months with the Dashboard codebase. Setting aside newbie-friendly issues can also increase the lead time between knowing about a problem and deploying a fix, especially for really easy things. But beyond the code contributions themselves, I’ve also noticed some more abstract benefits:

  • I’ve started to think about code changes — bug fixes, new features, code quality improvements — in terms of how easy they are to describe, and to plan code changes in chunks that make each description as simple as possible. This leads to better decoupling of different parts of the codebase!
  • I’ve also stopped thinking about the implementation in the friendlier parts of the codebase, freeing up some space in my head for other things.
  • What makes the project easier for newcomers also makes it easier for experienced developers to get up to speed quickly.
  • I learn a lot when I have to explain things!

Best of all, some people stick around and keep contributing to the project, and some (whether they stick around very long or not) have brought skills that I don’t have to the project. I’ve learned an enormous amount from volunteer contributors, and the codebase is much better for it.

Want to get involved with the development of the Dashboard? Let me know! Or send a general inquiry to contact@wikiedu.org.

Image: File:Hackathon at Wikimania 2017 – KTC 55.jpg, by Katie Chan, CC BY-SA 4.0, via Wikimedia Commons.

by Sage Ross at November 08, 2017 04:58 PM

Wikimedia Cloud Services

Automated OpenStack Testing, now with charts and graphs

One of our quarterly goals was "Define a metric to track OpenStack system availability". Despite the weak phrasing, we elected to not only pick something to measure but also to actually measure it.

I originally proposed this goal based on the notion that VPS creation seems to break pretty often, but that I have no idea how often, or for how long. The good news is that several months ago Chase wrote a 'fullstack' testing tool that creates a VM, checks to see if it comes up, makes sure that DNS and puppet work, and finally deletes the new VM. That tool is now running in an (ideally) uninterrupted loop, reporting successes and failures to graphite so that we can gather up long-term statistics about when things are working.
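
For readers who want a feel for the shape of such a test, here is a minimal Python sketch of one pass through a check sequence, with each result formatted as a line in Graphite's plaintext protocol ("path value timestamp"). It is not Chase's tool: the step names, the metric paths, and the function names are assumptions made for illustration, and the sketch only builds the metric lines rather than sending them anywhere.

```python
import time


def graphite_line(path, value, timestamp):
    """Format one metric in Graphite's plaintext protocol."""
    return f"{path} {value} {int(timestamp)}\n"


def run_fullstack_once(steps, now=time.time):
    """Run (name, callable) steps in order and stop at the first failure.

    A step passes if it returns a truthy value and fails if it returns a
    falsy value or raises. Returns (overall_ok, metric_lines).
    """
    timestamp = now()
    lines = []
    overall_ok = True
    for name, step in steps:
        try:
            passed = bool(step())
        except Exception:
            passed = False
        # Hypothetical metric path; 1 means success, 0 means failure.
        lines.append(graphite_line(f"fullstack.{name}.success", int(passed), timestamp))
        if not passed:
            overall_ok = False
            break
    lines.append(graphite_line("fullstack.run.success", int(overall_ok), timestamp))
    return overall_ok, lines
```

In a real loop, the steps would be functions that create the VM, wait for it to come up, check DNS and puppet, and delete the VM, and the resulting lines would be written to a Graphite listener after every pass. A real tool would presumably also try to delete the VM even when an earlier step fails, which this sketch does not do.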

In addition to the fullstack test, I wrote some Prometheus tests that check whether or not individual public OpenStack APIs are responding to requests. When these services go down the fullstack test is also likely to break, but other things are affected as well: Horizon, the openstack-browser, and potentially various internal Cloud Services things like DNS updates.

All of these new stats can now be viewed on the WMCS API uptimes dashboard. The information there isn't very detailed but should be useful to the WMCS staff as we work to improve stability, and should be useful to our users when they want to answer the question "Is this broken for everyone or just for me?"

by Andrew (Andrew Bogott) at November 08, 2017 12:28 PM

Gerard Meijssen

#Wikidata as a Wiki versus the data consumers’ perspective

Wikidata is a Wiki. It follows that many people with many agendas add data to Wikidata. It is a continuous process and as is usual in a Wiki, all contributions that fit the notability requirements of the project are welcome.

The consumers' perspective seen from a Wiki point of view is a bit awkward. There is nothing but active contributors that work towards any of the quality considerations. Even when there is a reasonable quality for some, it may not be enough for others.

Both Wikipedia and Wikidata are Wikis. Both have issues from a consumers' perspective. They are already explicitly integrated through the interwiki links and implicitly through the Wiki links. One of Magnus's tools makes this visible.

When you then consider George Polk and the George Polk Award it becomes obvious that Wikis have an issue from a data consumer's perspective. In some Wikipedia articles the two are conflated. In others there is a separate list of award winners. Many of the award winners do not have an article and some of the award winners refer to the wrong person. Wikidata could do with more data; the data was imported from Wikipedia and several of the wrong persons are still wrong in Wikidata.

Both Wikipedia and Wikidata consume each other's data. Both are Wikis. There is no superiority in either project but they could compare their data and curate the differences.

by Gerard Meijssen (noreply@blogger.com) at November 08, 2017 09:53 AM

November 07, 2017

Wiki Education Foundation

Wiki Education will be at American Studies this week

This week, I’ll be traveling to Chicago to join American Studies instructors at the American Studies Association annual meeting. The trip is part of our new partnership with the American Studies Association so I’m excited to help answer any questions that members might have about Wikipedia generally or our work here at Wiki Education.

As this year’s theme is Pedagogies of Dissent, I’m especially excited for some lively discussions about the role Wikipedia can play as both a pedagogical tool and as a means of accessing and participating in political and intellectual discourse. As an informal educational tool of its own, Wikipedia reaches billions of people each month globally, and thousands locally. Students who engage in the Wikipedia assignment research knowledge gaps in their field, assess and determine the weight of academic arguments, and ground their contributions with peer-reviewed sources. And even more incredibly, this assignment gives students the opportunity to change the way the public learns about politically relevant topics.

While in town, I’ll be hosting a workshop at the University of Illinois at Chicago and hosting a booth in the exhibit hall at ASA. Details about each are below; I hope you’ll join us!

University of Illinois at Chicago Workshop

  • Thursday, 11/9 1:00-3:30pm
  • Daley Library, Room 1-470
  • The Library is open to the public.
  • RSVP here to attend!

Exhibit Booth

  • Thursday, 11/9 – Opening Reception, 7:00-8:00pm
  • Friday, 11/10 – Booth hours, 9:30am-5:30pm
  • Saturday, 11/11 – Booth hours, 9:30am-5:30pm
  • Sunday, 11/12 – Booth hours, 8:30-11:00am

by Samantha Weald at November 07, 2017 06:58 PM

Wikimedia UK

Help reduce the #GenderGap and win prizes in the Women in Red World Contest

Women in Red world logo – image by Susan Barnum CC BY-SA 4.0

WikiProject Women in Red is holding a biographical article creation contest throughout the month of November. They aim to create 2000 new biographical articles by the end of the month on women from every country and occupation on the planet as part of a project effort to increase diversity and the percentage of women’s biographies on Wikipedia (which is just 17.15% in relation to men at present).

The total prize fund is over $4500 (over £3000), and Wikimedia UK are offering Amazon voucher prizes valued at £250 (top prize £150) for the Wikimedians who write the most satisfactory new articles on British women that are rated Start Class (1.5k bytes) or better.

UK Wikimedians may also create articles on women from any country and compete for the prizes for women of different continents and occupations. Work done on British women may also count towards the European prize for most article creations, of which WMF are offering $200 in prizes.

To take part in the contest, Wikimedians should enter their names in the participants section of the contest page and check out the list of missing biographies of women from the Oxford Dictionary of National Biography (ODNB) which you can see here. You can find missing biographies specifically of British women here. Newly created articles should be added to the bottom of the contest page. If you are competing for prizes, also list your entries in the United Kingdom section on the page for Europe during the contest and on the prize claims page for most new British biographies at the end of the contest.

User:Dr. Blofeld, the contest organiser, says ‘we’ll accept any UK women bios, but the emphasis is really on those notable missing dictionary entries, particularly the ODNB and the Welsh Dictionary of Biography. In just a week, over 600 articles have been produced worldwide already, but at present not many editors are doing British entries. Here is a chance to significantly increase our proportion of British women biographies and target really notable missing articles. Even if you only have time to create one or two entries, everything counts’.

So we need the UK Wikimedia community to get involved and contribute more new biographies of notable women. Let’s get editing!

by John Lubbock at November 07, 2017 05:36 PM

Wikimedia Foundation

How to design for low-bandwidth users: a conversation with members of Wikimedia’s Global Reach, Audiences, and Performance teams

Photo by Vaido Otsar, CC BY-SA 4.0.

It has been five years since it first became possible to download a full text version of English Wikipedia through the OpenZIM format. In the intervening years, there have been many additional performance improvements to Wikipedia to improve the experience of low-bandwidth users. In this chat, Melody Kramer talks to several members of the Wikimedia Foundation’s Audiences, Global Reach, and Performance teams about ways to improve access for users with low or limited bandwidth.


  1. I’ve seen a fair bit of chatter on Twitter about the need for news and information websites that are designed for low-bandwidth users (primarily for use during disasters like hurricanes or cyclones or floods, when lots of people lose electricity.) There are a few examples—in the US, both CNN and NPR have text-only sites, The Age in Australia has a similar set-up, and Twitter and Google News have both released low-bandwidth versions of their sites—but most news and social sites have lots of pictures, Javascript add-ons and ads.

    Your team has done a lot of research into how to design sites for low-bandwidth users—whether they’re accessing content during a natural disaster or whether they routinely have limited access to the Internet based on cost or other factors. Can you share a little bit about what you’ve learned about these use cases?

Dan Foy: A few years ago, we created a service to provide access to Wikipedia articles through a combination of SMS and USSD technologies. It worked by allowing users to name a search term, then presented a short list of choices for disambiguation and then another list of sections within the article.

Once chosen, the content was sent via concatenated SMS, which allowed about a screen full of content to be received, with the option to continue reading. We conducted a two-month pilot with Airtel in Kenya, which allowed us to test performance and gauge interest in the service. We reached over 80,000 sessions within two months, and saw a lot of repeat usage. Average unique subscribers used the service around 9 times per month. Even after the pilot officially ended, usage continued to grow organically past 100,000 total uses just by word of mouth.
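
As a rough illustration of what concatenated SMS implies for content delivery, the Python sketch below splits a block of text into message parts. It assumes plain GSM 7-bit text, where a single message holds 160 characters and each part of a concatenated message holds 153 because of the header that links the parts together. The function name is hypothetical, and the source does not describe how the pilot service actually chunked article text.

```python
def split_concatenated_sms(text, single_limit=160, part_limit=153):
    """Split text into SMS parts under GSM 7-bit limits (an assumption).

    Text that fits in one message is returned as a single part; longer text
    is cut into parts of at most part_limit characters each.
    """
    if len(text) <= single_limit:
        return [text]
    return [text[i:i + part_limit] for i in range(0, len(text), part_limit)]
```

A "screen full" of content would then be a fixed number of such parts, with the next batch sent only when the reader asks to continue.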

Jorge Vargas: We also conducted a USSD-only pilot in Argentina. Through this pilot, we realized that this was a complicated method to use because it required the user to have some digital skills (and a lot of patience) to navigate through the platform. Despite the very poor UX USSD offers, the fact that it can be used on any kind of phone without the need of any data is a huge value when facing access issues. We learned that with a proper marketing campaign and some sort of capacity building or using manuals, we can elevate usage of this tool.


  2. What kinds of performance improvements has Wikipedia made so that the site is accessible for low-bandwidth users?

Anne Gomez: We have improved the site loading time for low-bandwidth users in a number of ways recently, including showing users the first section of an article at the top and improving the way photographs load on mobile to reduce data usage.

Gilles Dubuc: Performance is an important topic, and the Foundation has a dedicated team for it. Our Performance Team constantly releases performance improvements to the existing platform, which benefit low-bandwidth users in greater absolute terms than people with fast internet connections. When we make the wikis faster by 5 percent, it might not be felt by people with fast connections, but for someone with a slow internet connection, it might represent seconds of waiting time saved per pageview.

Our synthetic testing platform, currently based on WebPageTest, simulates low bandwidth conditions, which allows us to keep an eye on the evolution of performance for low-bandwidth users. This helps us both to catch performance regressions when they happen and to quantify the impact of improvements we make.

We are also in the process of opening our first caching data center in Asia to improve the performance for users in that region. While bandwidth is an important factor, poor performance can also come from high latency. Since we’re limited by the speed of light and the physical distance between users and our servers, we have to get closer to them. The decision to open this new data center is directly derived from performance data collected with probes in those regions. With this real world data, our Technical Operations team was able to identify the best physical location possible to achieve maximum impact. This new location is expected to open in late 2017/early 2018 and we’ve already set up additional performance metric measurements focused on Asia in order to assess the before/after impact of this big change.
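
The speed-of-light limit can be put into numbers with a short Python sketch. It assumes that signals in optical fiber travel at roughly two thirds of the speed of light in a vacuum and that the cable follows a straight line, so the result is a lower bound rather than a prediction. The function name and the example distances are illustrative.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FACTOR = 2 / 3               # assumed slowdown in optical fiber


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time, in milliseconds, over a given distance."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


# A server 10,000 km away costs at least about 100 ms per round trip,
# while one 1,000 km away costs at least about 10 ms.
```

Because loading a page usually takes several round trips before any content arrives, that per-trip floor adds up, which is why moving caches closer to readers helps even when bandwidth stays the same.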

As for past achievements, it’s best to look at the trend of our core performance metrics over long periods of time. While we sometimes get big wins with big visible changes, such as the effect transitioning to HHVM has had on article save time (cutting it in half), hundreds of small performance improvements over a long period of time have had an even bigger impact. While HHVM brought us from an average of 6 seconds to save an article edit to 3 seconds in 2014, we have since been able to reduce the average edit save time to less than 900 milliseconds. This is the result of a constant focus on performance at the Foundation. This culture, applied to many small individual engineering decisions, leads to tremendous performance improvements over time.

The long-term impact we’ve had on front end performance is less clear. Last year we fixed a number of issues that were previously skewing those metrics and we’re in the process of overhauling our real-user metrics. We know it hasn’t worsened, but we can’t claim it has improved. But maintaining current performance is a challenge in itself, as the wikis are more and more feature-rich. We work with all teams releasing new software, as well as volunteers, to ensure that feature releases don’t impact performance negatively. This is critical for users with bad internet connections, who would be disproportionately affected by performance regressions. So far, the dozens of identified performance regressions – that are often the result of unforeseeable side effects – which the Performance Team has caught since its inception have all been fixed quickly.

Measuring performance as it is experienced by users and interpreting the data correctly is a significant challenge in itself, and you need to have been consistently good at it for a long period of time in order to claim performance gains with absolute certainty. This is even more true for users with bad internet access. The classic example is that for some users the internet connection is so bad that a given request ends before completion, never reaching the stage where it can send us data about its performance. In essence, the worst experience can’t be measured. And when we improve performance to the point that those users can start having a working experience, it might still be a slow one, and make it look like performance has worsened on average (because we start getting metrics for these users, but they’re slower than average). In practice, those users went from having an experience so bad it didn’t work at all to having a slow, albeit working, experience, which was obviously an improvement. Thankfully web performance is a very active field and the companies developing browsers are constantly releasing new performance-related APIs, which we leverage whenever we can to understand performance better.
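The measurement paradox described here can be shown with a toy calculation. This is a minimal sketch with made-up load times (none of these numbers are Wikimedia data): requests that never complete report nothing, so they are missing from the "before" average.

```python
from statistics import mean

# Hypothetical page load times in seconds. None means the request never
# completed, so no metric was ever reported for that user.
before = [1.0, 1.2, 0.8, None, None]

# After an improvement every user is faster, and the two users who
# previously failed now get a working (but slow) page and report data.
after = [0.9, 1.1, 0.7, 6.0, 7.0]


def reported_mean(samples):
    """Average only the samples that were actually reported."""
    return mean(s for s in samples if s is not None)


mean_before = reported_mean(before)
mean_after = reported_mean(after)
print(f"reported mean before: {mean_before:.2f}s")
print(f"reported mean after:  {mean_after:.2f}s")
```

Every individual user is better off in the second list, yet the reported average gets worse, because the slowest users only became measurable after the improvement.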


  1. If organizations wanted to make their sites accessible for low-bandwidth users, where do you recommend they begin?

Anne Gomez: A lot of people who are cost-conscious with their data use proxy browsers such as UC and Opera Mini. These browsers strip out most of the data-heavy content and features, including removing JavaScript, which is essential for most modern sites to operate. Without getting too deep into the technical ways that they do this, it’s important for brands with a global presence to make sure that their sites work well in these browsers. Even if the functionality is limited, relative to the full site, users of these browsers shouldn’t have a broken experience.

Jorge Vargas: Having a no-pictures version of Wikipedia was something we had with Wikipedia Zero in the initial stages. I think it could be an accessible way to get to Wikipedia for low-bandwidth users—perhaps involving an opt-in or out option. That said, I’m not sure there would be a huge difference, as articles are usually heavier on the text side. There are definitely pros and cons for this.

Olga Vasileva: As Anne pointed out, we implemented lazy loading of images on the mobile website.  This means that images load as the user scrolls down the page.  If a user only views the initial sections of the page – they do not download the data for images below the fold.  For many websites where users are not likely to read the entire page – lazily loading images or other content is an efficient way of saving data for their users.

Gilles Dubuc: We have to measure the performance as experienced by those low-bandwidth users accurately first. We’ve seen examples in the industry where things started with a good intent (e.g. making a text-only version of the website), but the execution was poor, with the “light” website loading megabytes of unnecessary JavaScript libraries because that’s what their developers were used to working with.

Developing a website focused on low-bandwidth users requires a drastically different approach than developing a website focused on being feature-rich. Not that those two objectives are incompatible, but performance/lightness is difficult to achieve after the fact by retrofitting an existing website. It has to be a core concern from day one and requires great discipline that goes beyond just getting things to work. This is why you usually see that those projects are separate websites, because it’s easier to achieve when starting from scratch. The ideal is having a single website that does what’s best for low-bandwidth users by adapting the experience for them, of course. And much like accessibility, improving the experience for low-bandwidth conditions usually makes the experience more pleasant for users with high bandwidth internet as well.


  1. I realize we’re talking about websites, but there are also ways to think about USSD and SMS. How have you thought about those platforms when thinking about conveying information to the end user?

Jack Rabah: We are currently exploring a partnership to offer free Wikipedia via SMS and voice with a global mobile service company. This collaboration will deliver Wikipedia content to MNO subscribers, free of charge, through the interactive SMS and voice capabilities of their platform. We are exploring this as a pilot in order to learn more about how well this works in practice. From the lessons we learn from this pilot, we hope to eventually make this service widely available to reach the billions of people who have mobile phones, but cannot afford access to the internet.

Jorge Vargas: USSD is an interesting way to bring information to the end user. It works with really low bandwidth, and there is no need for a smartphone. The problems are that the amount of text that can be retrieved is strongly limited (just two or three lines are displayed), that a timeout requires reconnecting after a certain time, and that the UX is not very friendly. Facebook and Twitter also have USSD platforms – it’s a very small audience, but a very specific one that could be served.


  1. What about preloading content on mobile? What kinds of things can be done technically?

Jorge Vargas: We can preload the Wikipedia app on smartphones and tablets. With the app, we can also preload a file with an offline version of a Wikipedia (ZIM files, built by Kiwix). Ideally, we would be able to preload curated “packages” or “collections”, but this content curation is yet to be explored. We could have packages with information on responding to natural disasters, for example. The only ZIM files that are more content-specific are the ones for Wikimed.

Anne Gomez: To build on what Jorge said, we’re learning from our initial research around that feature that people are looking for small, specific content packages to be on their devices, which is something we aren’t currently able to offer. You can see that research linked here under “Research findings.” We’d love to be able to create and offer smaller, more focused packages of files based on a topic or area of interest, in any language and are investigating what that might look like and how we could support our readers and editing community in building exactly what they need.


  1. I know you’re also investigating the possibility of changes to the mobile website to support intermittent connection. Can you talk a little bit about how to support users with intermittent connections?

Anne Gomez: Users with intermittent connections exist all over the world. Even in the most connected cities, there are still gaps in coverage. It’s really frustrating when you’re browsing the web waiting for some part of the site to load, the connection drops, and the entire page disappears. Beyond that, we know from our research that some people who are cost-conscious about their data usage open browser tabs when they’re on wifi to read later when they don’t want to pay for internet. We want to support those people.

Olga Vasileva: We have recently begun a project that will allow users to access the mobile website even during intermittent connections.  For example, if their connection is spotty or they leave a wifi zone, they will still be able to read articles within the website – they can hit the back button and access the articles they read before or, tentatively, save articles that they would like to read when offline.  We will also be improving the messaging for users in these circumstances – they will be able to know which portions of the content they can access, as well as which portions of the content are unavailable while the user is offline.  The project will also aim at making the website more cost-friendly to users by focusing on using less data when loading a page.


  1. Where can web devs go to learn more about this or stay abreast of what your team is up to?

Jorge Vargas: They can always reach us at globalreach[at]wikimedia[dot]org to learn more about our work and what our team is up to.

Olga Vasileva: Our current projects are listed in the Reading Web Team project page.

Interview by Melody Kramer, Senior Audience Development Manager, Communications
Wikimedia Foundation

by Melody Kramer at November 07, 2017 04:53 PM

WMF Release Engineering

Tech talk: Selenium tests in Node.js

Who 👨‍💻

Željko Filipin, Engineer (Contractor) from Release Engineering team. That's me! 👋

What 📆

Selenium tests in Node.js. We will write a new simple test for a MediaWiki extension. An example: https://www.mediawiki.org/wiki/Selenium/Node.js/Write

When ⏳

Tuesday, October 31, 16:00 UTC (E766).

Where 🌍

The internet! The event will be streamed and recorded. Details coming soon.

Why 💻

We are deprecating the Ruby Selenium framework (T173488).

See you there!

Video 🎥

YouTube, Commons (coming soon)

by zeljkofilipin (Željko Filipin) at November 07, 2017 11:32 AM

Gerard Meijssen

#Wikipedia; Héctor Rondón did not win the #Polk Award

This is Héctor Rondón; he pitches for the Cubs. He did not win the George Polk Award; Héctor Rondón Lovera did.

This is a common mistake; it happens all the time, and it is where Wikidata may make a positive difference to Wikipedia. It just requires a different mindset to see why this is the right solution at this time. There are some loud Wikipedians that abhor Wikidata. This is an easy and obvious method that will improve Wikipedia and there is no sane argument why this would not work.

These Wikipedians do not even have to notice that this is done; we can hide it from them and still do a world of good. Not just for English Wikipedia but for all Wikipedias. Ehm, for the readers of all Wikipedias.

by Gerard Meijssen (noreply@blogger.com) at November 07, 2017 07:01 AM

November 06, 2017

Wiki Education Foundation

Roundup: Latin American history

Modern society and culture did not spontaneously come about, which is why it’s so important to examine all types of history and culture. Classes that focus on Latin American history are so important for this reason, as they present the opportunity to expand our knowledge and awareness of diverse cultures. Several Wiki Education-affiliated classes have explored topics related to Latin America in the past year. Two of these courses were held by educators at Wesleyan University (Corinna Zeltsman’s Survey of Latin American History) and New College of Florida (Sarah Hernandez’s Latin American Social Theory).

There are different ways to showcase the many facets of a culture and to pass along powerful messages about its importance – one of these ways is through the written word, as Zeltsman’s class could tell you. The Guayaquil Group, as these students explored, was an Ecuadorian literary group that formed in the 1930s as a result of the oppression of the Ecuadorian mestizo people. Its primary members were Joaquín Gallegos Lara, Enrique Gil Gilbert, Demetrio Aguilera Malta, José de la Cuadra, and Alfredo Pareja Diezcanseco, all of whom used social realism as a way of showing the real Ecuadorian montubio and cholo.

Students in Hernandez’s class examined Latin American literature, prompting one student to write an article on Clotilde Betances Jaeger, a feminist writer and journalist who was active in New York’s Puerto Rican intellectual community. Her advocacy for the rights of minority children and education raised awareness around these issues and challenged the traditional conservative roles of Hispanic women; Betances Jaeger felt that it was important for women to get involved in the political independence of Puerto Rico.

Examining the past is always important, as academic and anthropologist Néstor García Canclini would tell you. His work examines modernity, postmodernity, and culture from the Latin American perspective, and he is considered to be one of the foremost scholars in this area. He also coined the phrase “cultural hybridization,” which his article describes as a phenomenon that “materializes in multi-determined scenarios where diverse systems intersect and interpenetrate.” Students from Hernandez’s class also expanded the article on Machi (shaman) to include information on gender. Machis are traditional healers within the Mapuche culture of Chile and Argentina and while they’re typically female, males can also become Machis and are called Machi Weye. Machi, especially male and transgender Machi, face discrimination if the people around them do not feel like they are fitting into traditional gender roles – despite being seen as religious leaders in their communities.

Students and educators have a wealth of knowledge that is surpassed only by their passion to learn and teach, two things that are incredibly well suited to the task of Wikipedia editing as an educational assignment. If you’re interested in taking part, please contact Wiki Education at contact@wikiedu.org to find out how you can gain access to tools, online trainings, and printed materials.

Image: File:Mapuche Machis.jpg, Source: Chile Collector, Public Domain, via Wikimedia Commons.

by Shalor Toncray at November 06, 2017 05:29 PM

Wikimedia Tech Blog

So -happy to meet you: Advanced searching techniques on Wikimedia sites

Image by Camdiluv, modified with color inversion, CC BY-SA 2.0.

This summer, MediaWiki user This, that and the other (a.k.a. TTO) created a ticket in Phabricator to report that our search results seem to be random when a query begins with a hyphen.

That is in fact a reasonable interpretation of what happens, say, on English Wiktionary when you are searching for a suffix like -happy (as in trigger-happy) or -minded (as in fair-minded): you get about 5.3 million results, and you may get them in the same order for different suffixes, or maybe you get a different order for the same suffix a few seconds later. For some suffixes, like -in-law (as in sister-in-law), you get the entry for that suffix, followed by the similarly (or maybe differently!) ordered 5.3 million results. This doesn’t happen with prefixes, like pre-, un-, Ægypto-, oö-, or polydeoxyribo- or words with internal hyphens, like trigger-happy, fair-minded, or sister-in-law. What gives?

The answer could be the plot to a heroic fantasy adventure film—Clash of the Syntaxes! Okay, maybe not—but I still want to hear Liam Neeson (YouTube) say, “Release the klikken!”


In a previous post, I talked about how people asking questions on Wikipedia inadvertently fell afoul of the single-character wildcard ? (now \? to prevent the problem). Something similar is happening here because that hyphen, used to indicate a suffix, is also used to negate a search term. So searching for -happy on Wiktionary returns all 5.3 million entries that do not have happy in them—which is most of them. Similarly, most entries do not have minded in them, so searching for -minded gives a similarly huge, mostly useless list.

The order appears random because there’s no meaningful basis for ranking them—they all lack happy or minded to the same degree—and whatever arbitrary criterion is used to sort one batch of about 5.3 million results is used to sort the other batch. When the order of the results changes, that’s probably because the search got routed to a different server, which sorts everything in a slightly different but equally arbitrary order.[1] Over time, the results will change on any given server as well, as the search index is updated and optimized through normal use.

In the case of -in-law, there is a second search pathway activated—an exact title match—which puts the entry for -in-law at the top of the big pile of 5.3 million results that do not contain in-law.

Shriek, shriek, bang, bang

To make matters more complicated, there’s another way to negate a search: with an exclamation point (!), also called bang, shriek, pling, or, long ago, ecphoneme—no, really!

The exclamation point is also a letter in some African languages, where it usually stands for an alveolar click. Words can start with such clicks, as in the language names !Kung and ǃXóõ, both of which have entries in English Wiktionary. And of course there are many articles in English Wikipedia with titles or redirects that start with an exclamation point—like the title of the article on the punk band !Action Pact!, or easy-to-type redirects to titles that start with an inverted exclamation point such as !Uno!, which redirects to ¡Uno! (As with internal hyphens, internal exclamation points—as in P!nk, The Amaz!ng Meeting, or L!VE TV—don’t cause any problems.)

As a result, searching for -happy or !happy gives the same results, as does searching for -minded or !minded. However, !in-law gets one fewer result than -in-law because there is no exact match. Conversely, !Kung gets one more exact-match result than -Kung.
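The interplay between negation and the exact title match can be sketched in a few lines of Python. This is a toy model, not the actual search code: the function names, the tiny page collection, and the choice to treat in-law as a single indexed token are all assumptions made for illustration.

```python
def parse_term(raw):
    """Split a raw query into (negated, word); a leading '-' or '!' negates."""
    if len(raw) > 1 and raw[0] in "-!":
        return True, raw[1:]
    return False, raw


def search(query, pages):
    """Return matching titles from a dict mapping title -> set of indexed words."""
    negated, word = parse_term(query)
    results = []
    # Second pathway: an exact title match goes to the top of the list.
    if query in pages:
        results.append(query)
    for title, words in pages.items():
        if title == query:
            continue  # already listed via the exact title match
        # Keep pages that contain the word, or that lack it if negated.
        if (word.lower() in words) != negated:
            results.append(title)
    return results


# Hypothetical miniature wiki.
PAGES = {
    "trigger-happy": {"trigger", "happy"},
    "fair-minded": {"fair", "minded"},
    "sister-in-law": {"sister", "in-law"},
    "-in-law": {"in-law"},
    "cat": {"cat"},
}

print(search("-happy", PAGES))
print(search("-in-law", PAGES))
print(search("!in-law", PAGES))
```

In this toy collection, -happy and !happy return the same pages, while -in-law returns one more result than !in-law because only the former matches a title exactly.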

Now we know what’s happening, and why, but what can we do about it?


It’s often hard to divine users’ intent from the scant evidence provided in a query. In the case of TTO’s query -happy, we know that a good result would have been either an entry for the suffix -happy, or at least a list of entries that contained -happy. On the other hand, I like to search English Wikipedia for -the to see how many articles don’t contain the—there are over 71 thousand! (I like to see how far I can get down the list of most frequent words in English before running out of Wikipedia articles. There are dozens of articles without any of the top 50 most frequent words in them—though, boringly, they tend to be sports rosters. It’s not a great hobby, but it keeps me off the streets.)

My colleague David Causse pointed out some of these use cases in the Phabricator ticket—including my odd hobby and another use case I hadn’t thought of: on small, growing wikis, editors may meaningfully look for all pages that don’t contain some particular text. It can be very difficult to know what people are trying to do when you have fewer than ten characters on which to base a determination.

The best bet in this case seems to be to provide lexicographically sophisticated users who want to search for suffixes with some advice to turn them into sophisticated searchers as well.

Search nerds, level up!

An obvious approach that doesn’t actually work is to put quotes around the hyphenated search term, such as “-happy” or “-minded”. The text inside the quotes is treated somewhat literally—so searching for “hoping” will not match hope, hoped, and hopes, the way that searching for quoteless hoping will. However, hyphens are still ignored in quoted searches, so that, for example, “well-known” and “well known” match each other.

The quotes do at least block the negating powers of the hyphen. For “-minded” this works out reasonably well on Wiktionary because minded isn’t used much except in the kind of compounds you might be looking for: closed-minded, open-minded, simple-minded, etc. Similarly, “!Kung” ignores the exclamation point. For “-happy”, however, there are too many other uses cluttering up the results.

Among the lesser-known search keywords is insource:. It searches the raw wikitext and it allows regular expressions (also regexes or regexps), which can be both powerful and costly. Regexes, which are marked with /slashes/, in addition to allowing complex pattern searching, are also very literal. By default, even upper and lower case versions of the same letter do not match! Adding i after the regex tells it to be case insensitive.

Regexes also allow us to search for that hyphen or exclamation point, with a query like insource:/-minded/i or insource:/-happy/i—however, we probably shouldn’t search like that.

Marcia! Marymarcia! Matamarcianos!

Regexes are very expensive to search with because they don’t use an index. A search index knows, for example, that the word happy appears as word #137, #492, and #517 in document #5943, etc., etc.—making it easy to find all articles with happy in them. When using regexes, each document must be scanned letter by letter to see if there’s a match. With over five million articles on the English Wikipedia and over five million entries on the English Wiktionary, such scans usually take much too long and users get a warning that their search did not complete, along with their incomplete results.
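The difference between an index lookup and a letter-by-letter scan can be sketched as follows. This is a simplified illustration in Python, assuming a naive tokenizer and three made-up documents; the real search index is far more elaborate.

```python
from collections import defaultdict
import re


def build_index(docs):
    """Map each lower-cased word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(doc_id, []).append(position)
    return index


def lookup(index, word):
    """Find documents containing a word with one dictionary access."""
    return sorted(index.get(word.lower(), {}))


def regex_scan(docs, pattern):
    """Find documents matching a regex by reading every document in full."""
    compiled = re.compile(pattern)
    return sorted(doc_id for doc_id, text in docs.items() if compiled.search(text))


# Hypothetical documents.
DOCS = {
    1: "She was trigger-happy and slap-happy.",
    2: "A happy ending.",
    3: "Nothing to see here.",
}
INDEX = build_index(DOCS)

print(lookup(INDEX, "happy"))
print(regex_scan(DOCS, r"-happy"))
```

The lookup cost does not depend on how many documents exist, whereas the scan has to touch all of them; note also that the tokenizer drops the hyphen, so only the scan can tell -happy from happy.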

Also because of that letter-by-letter scanning, regexes can make unintended matches. The pattern /marcia/ actually won’t match the name Marcia (YouTube) because regexes are so literal, and without the i at the end m is not the same as M. Worse, that regex does match Marymarcia, matamarcianos, and artemarcialistas. Crafting exactly the right regex is as much art as science and beyond the scope of this blog post, but we will be getting to an alternative that works much better than showing 5.3 million maximally irrelevant results.
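The same literalness is easy to reproduce with Python's re module. Note the assumption: Python's regex syntax is not identical to the dialect used by insource:, so this only illustrates the case-sensitivity and substring behaviour, with re.IGNORECASE standing in for the trailing i.

```python
import re

words = ["Marcia", "Marymarcia", "matamarcianos", "artemarcialistas"]

# Case-sensitive and unanchored: "Marcia" is missed, the substrings are hit.
case_sensitive = [w for w in words if re.search(r"marcia", w)]

# Case-insensitive: every word in the list now matches.
case_insensitive = [w for w in words if re.search(r"marcia", w, re.IGNORECASE)]

print(case_sensitive)
print(case_insensitive)
```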

Turning the tables

Surprisingly, part of our problem can become part of our solution. In the case of suffixes like -happy, we know the suffix, when it occurs, will be indexed as plain happy. We can use that to our advantage. While most Wiktionary entries that contain happy do not contain -happy, (a) all entries that contain -happy do in fact contain happy, and (b) there are a lot fewer than 5.3 million of them.

We can split our final search into two parts:

  • "happy" which limits the collection of entries the regex needs to scan to fewer than 3,000—much better than the full 5.3 million!—and,
  • insource:/-happy/i which further restricts results to those that contain the hyphen right before happy, with i at the end because case doesn’t matter.

Our final queries look like this:

  • "happy" insource:/-happy/i
  • "minded" insource:/-minded/i
  • "in-law" insource:/-in-law/i

The quotes around the first term aren’t strictly required, but they do filter out some additional results since plain minded, for example, will also match mindedness and mindedly, which we may not be interested in.
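The two-part query can be pictured as a cheap filter followed by an expensive one. The sketch below is a toy model under stated assumptions: the function name and the three entries are hypothetical, and stage 1 fakes the index lookup with a simple tokenizer rather than a real search index.

```python
import re


def two_stage_search(docs, index_word, pattern):
    """Filter by an indexed word first, then regex-scan only the survivors."""
    word = index_word.lower()
    # Stage 1: stand-in for the index lookup ("happy"). The tokenizer
    # splits on hyphens, so "trigger-happy" is indexed as "trigger", "happy".
    candidates = {
        title: text
        for title, text in docs.items()
        if word in re.findall(r"\w+", text.lower())
    }
    # Stage 2: the literal, case-insensitive scan (like insource:/-happy/i),
    # run only on the candidates from stage 1.
    regex = re.compile(pattern, re.IGNORECASE)
    matches = [title for title, text in candidates.items() if regex.search(text)]
    return matches, len(candidates)


# Hypothetical entries.
DOCS = {
    "trigger-happy": "Adjective: trigger-happy, too ready to shoot.",
    "happy": "Adjective: happy, feeling joy.",
    "cat": "Noun: a small feline.",
}

matches, scanned = two_stage_search(DOCS, "happy", r"-happy")
print(matches, "after scanning", scanned, "of", len(DOCS), "entries")
```

The regex never sees the entry for cat, which is the whole point: the index shrinks the pile before the slow scan starts.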

Read The Fantastic Manual

While this is a lot of added complexity, it is also a lot of added power and precision—and there are many more options and methods for bending search to your will. A good way to learn more is to peruse[2] the documentation and then just try things out for yourself!


1. Once this post is more than, oh, about 37 seconds old, it’s quite possible that some of the details could be out of date. Eventually someone may create an entry for -minded or -happy—but the general idea will still be the same.

2. Either of the conflicting senses will do.

Trey Jones, Senior Software Engineer, Search Platform
Wikimedia Foundation

by Trey Jones at November 06, 2017 03:54 PM

Wikimedia UK

WikiFeed project to create custom newsfeeds from Wikimedia data

Image by Jwslubbock CC BY-SA 4.0

Fako Berkers and Edward Saperia have been working together on a project called “WikiFeed”. It’s a framework that allows you to create custom algorithmic newsfeeds using data from Wikipedia and Wikidata.

These open algorithms could be used to discover news stories in niche areas, suggest new collaborative approaches to editorial policy, and probably other things its designers haven’t thought of yet!

Saperia told us that he was thinking about how we consume news, and that while the Wikipedia homepage is not generally thought of as news, its In The News section is probably one of the most viewed news platforms online. He said ‘News is in the news right now. Choosing headlines is a political act. I was interested in whether you could approach editorial in an open, collaborative way.’

You can see more information about the project on its Wikipedia project page here. You can see an example of an algorithmic feed here: the Recently Edited Women WikiFeed shows articles about women, ranked by which have had the most recent edits.

It’s still in a very early stage, but for the first time next weekend (11-12 Nov) its developers are inviting people to come round to Newspeak House and play with it. Remote participation is also possible and there will be two sessions, on Saturday and Sunday, from 1-4pm.

Sign up to the event page here.

by John Lubbock at November 06, 2017 02:04 PM

Tech News

Tech News issue #45, 2017 (November 6, 2017)


November 06, 2017 12:00 AM

November 05, 2017

Semantic MediaWiki

Help:Embedded format

Embedded format: Embed selected articles.

Provided by: Semantic MediaWiki
Added: 0.7
Removed: still supported
Requirements: none
Format name: embedded
Authors: Markus Krötzsch
Categories: misc

The result format embedded is used to embed the contents of the pages in a query result into a page. The embedding uses MediaWiki transclusion (like when inserting a template), so the tags <includeonly> and <noinclude> work for controlling what is displayed.



| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| source | text | empty | Alternative query source |
| limit | whole number | 50 | The maximum number of results to return |
| offset | whole number | 0 | The offset of the first result |
| link | text | all | Show values as links |
| sort | list of texts | empty | Property to sort the query by |
| order | list of texts | empty | Order of the query sort |
| headers | text | show | Display the headers/property names |
| mainlabel | text | no | The label to give to the main page name |
| intro | text | empty | The text to display before the query results, if there are any |
| outro | text | empty | The text to display after the query results, if there are any |
| searchlabel | text | ... further results | Text for continuing the search |
| default | text | empty | The text to display if there are no query results |

Format specific

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedformat | text | h1 | The HTML tag used to define headings |
| embedonly | yes/no | no | Display no headings |

The embedded format introduces the following additional parameters:

  • embedformat: this defines which kinds of headings to use when pages are embedded, may be a heading level, i.e. one of h1, h2, h3, h4, h5, h6, or a description of a list format, i.e. one of ul and ol
  • embedonly: if this parameter has any value (e.g. yes), then no headings are used for the embedded pages at all.


The following creates a list of recent news posted on this site (like in a blog):

 {{#ask:
  [[News date::+]]
  [[language code::en]]
  |format=embedded
  |sort=news date
  |searchlabel= <br />[view older news]
 }}

This produces the following output:

Semantic MediaWiki 2.5.5 (SMW 2.5.5) has been released today as a new version of Semantic MediaWiki.

This maintenance release provides bugfixes and improvements. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

SMWCon Fall 2017 Registration open.

The registration for SMWCon Fall 2017 in Rotterdam, the Netherlands (October 4-6, 2017) is now open. All interested participants can now register at the registration site. Note that the Early Bird period ends on September 6, 2017.

The conference is organised by ArchiXL, Wikibase Solutions, the Open University in the Netherlands and Open Semantic Data Association (OSDA).

For more information on this and the conference, see the SMWCon Fall 2017 homepage.

Semantic MediaWiki 2.5.4 (SMW 2.5.4) has been released today as a new version of Semantic MediaWiki.

This new version brings a security fix for special page "SemanticMediaWiki". It also provides an improvement for software testing, other bugfixes and further increases platform stability. Since this release provides a security fix it is strongly advised to upgrade immediately! Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.
[view older news]

Note: The newline (<br />) is used to put the further results link on a separate line.


  • Note that by default this result format also adds all annotations from the embedded pages to the page they are embedded in. Starting with Semantic MediaWiki 2.4.0 (released on 9 July 2016 and compatible with MW 1.19.0 - 1.27.x), this can be prevented for annotations made via the #set and #subobject parser functions by setting the embedonly parameter to "yes". In-text annotations will continue to be embedded, so they need to be migrated to the #set parser function to prevent this from happening.
  • Note that embedding pages may accidentally include category statements if the embedded articles have any categories. Use <noinclude> to prevent this, e.g. by writing <noinclude>[[Category:News feed]]</noinclude>. Starting with Semantic MediaWiki 3.0.0 (release date unknown; compatible with MW 1.27.0 - 1.30.x), category statements will automatically be filtered from transcluded content, so this workaround is no longer necessary.
  • Note that Semantic MediaWiki will take care that embedded articles do not import their semantic annotations, so these need not be treated specifically.
  • Note that printout statements have no effect on embedding queries.


You cannot use the embed format to embed a query from another page if that query relies on the magic word {{PAGENAME}}.

by Kghbln at November 05, 2017 06:14 PM

Gerard Meijssen

#Wikidata - There is no such thing as a free lunch

Mrs Adriane Fugh-Berman wrote a paper called "Why lunch matters: Assessing physicians' perceptions about industry relationships". There is no such thing as a free lunch and arguably this is exactly what Wikidata is offering to the bio-medical industry.

All the bio-medical papers find their home in Wikidata and there is no mechanism, there is nothing to indicate the many erroneous papers, there is nothing to indicate that specific substances have been banned from use as a medical substance. When Wikipedia is to use Wikidata for information it will be so bad.

Mr Martin Keller is a psychiatrist whose reputation was for sale. "His" paper Efficacy of paroxetine in the treatment of adolescent major depression: a randomized, controlled trial has been thoroughly debunked.

At Wikidata there seems to be the notion that facts like this are an affront to its neutrality. It is why there is no mention on the item for Mr Keller; "significant event" "ghostwriting author" was removed.

The problem is that without sufficient debunking potential for ghostwriting authors, their products and their ill effect, there is no possibility to establish the veracity of the bio-medical facts that have been imported in Wikidata. It is vital to the integrity of the Wikidata project that the Mr Kellers of this world are seen for what they are: frauds.

by Gerard Meijssen (noreply@blogger.com) at November 05, 2017 02:34 PM


Refactoring around WatchedItem in MediaWiki

The refactoring started as part of [RFC] Expiring watch list entries. After an initial draft patch was made touching all of the necessary areas, it was decided that refactoring first would be a good idea, as the change initially spanned many files. It is always good to do things properly ® instead of pushing forward in a hacky way and increasing technical debt.

The idea of a WatchedItemStore was created that would remove lots of logic from the WatchedItem class as well as other watchlist database related code that was dotted around the code base such as in API modules and special pages.

The main patches can be seen here.

Moving the logic

Firstly logic was removed from WatchedItem with backward compatible fallbacks left behind wrapping the logic in WatchedItemStore. This essentially turned WatchedItem into a value object.

During this stage it was discovered that most of the logic from the class did not actually need access to a full Title object (the kitchen sink of MediaWiki). Recently a TitleValue object had been introduced and this much smaller class provided everything that was needed, but although the two classes have several methods in common, they did not implement a shared interface, thus LinkTarget was introduced.
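To make the shape of this concrete, here is a minimal sketch in Python rather than MediaWiki's PHP. The method names, field names and the watchlist_key helper are assumptions made for illustration; they are not the real MediaWiki signatures. The point is only that a small shared interface lets watchlist code accept a lightweight title, and that WatchedItem can be a plain value object.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class LinkTarget(Protocol):
    """The small set of methods the watchlist code needs from a title."""

    def get_namespace(self) -> int: ...

    def get_db_key(self) -> str: ...


@dataclass(frozen=True)
class TitleValue:
    """Lightweight, immutable title: only a namespace ID and a DB key."""

    namespace: int
    db_key: str

    def get_namespace(self) -> int:
        return self.namespace

    def get_db_key(self) -> str:
        return self.db_key


@dataclass(frozen=True)
class WatchedItem:
    """Value object: carries data only, with no database logic."""

    user_id: int
    target: LinkTarget
    notification_timestamp: Optional[str] = None


def watchlist_key(user_id: int, target: LinkTarget) -> str:
    # Hypothetical helper: works with anything satisfying LinkTarget.
    return f"{user_id}:{target.get_namespace()}:{target.get_db_key()}"


item = WatchedItem(user_id=7, target=TitleValue(0, "Main_Page"))
print(watchlist_key(item.user_id, item.target))
```

Any heavier title class would only need the same two methods to be usable in the same places.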

Secondly queries and logic were extracted from other classes and brought into WatchedItemStore to be shared, but to start with this only included code that exclusively dealt with the watchlist. Code that combines the watchlist and recent changes for example will likely end up living in a different class.


Testing is important and of course was one of the targets of the refactoring. Prior to the refactoring there was essentially 0% test coverage for watchlist related code. After the refactoring line coverage was roughly 95% with a combination of unit and integration phpunit tests that have been clearly separated.




This test coverage has been achieved by injecting all services through the constructor of the WatchedItemStore which allows them to be mocked during tests.

Injecting a LoadBalancer instance allows mock Database services to be used, and thus during unit tests all DB calls can be asserted while not calling a real DB. This is new and has not really been done in MediaWiki core before, and a test strategy such as this only really works if integration tests are also in place.

As can be seen in the image above 2 callbacks are also defined in the constructor. These are callbacks to static methods which are hard to mock. For these two static methods the production code calls the callback defined in the class. Phpunit tests can call a method to override this callback with a custom function allowing testing.

Using the ServiceLocator

Recently a ServiceLocator was introduced in MediaWiki. WatchedItemStore will be one of the first classes to use this locator once services can be overridden within tests. This will allow the removal of the singleton pattern from within WatchedItemStore.

Caching review

Basic in-process caching was added to WatchedItemStore as some of the logic extracted from the User class included caching. As a result the importance of this caching is now being investigated and the actual use of the current caching can be seen at https://grafana.wikimedia.org/dashboard/db/mediawiki-watcheditemstore

Currently roughly 15% to 20% of calls to getWatchedItem result in a cached item being retrieved with the other ~80% causing db hits.

The low number of cache hits is likely due to the fact that the cache is currently only a per-request cache. Longer-term caching would need something more advanced, allowing cache items / keys to be tagged so that they can be purged.
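A per-request cache of this kind can be sketched as a plain dictionary that lives only as long as the store object. This is an illustrative Python sketch, not the MediaWiki implementation; the class name, the loader argument and the hit and miss counters are assumptions.

```python
class RequestCachedStore:
    """In-process cache: it is thrown away when the request (object) ends."""

    def __init__(self, loader):
        self._loader = loader  # function that performs the real DB lookup
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get_watched_item(self, user_id, namespace, db_key):
        key = (user_id, namespace, db_key)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        item = self._loader(user_id, namespace, db_key)
        self._cache[key] = item
        return item


def fake_loader(user_id, namespace, db_key):
    # Stand-in for a database query.
    return {"user": user_id, "namespace": namespace, "title": db_key}


store = RequestCachedStore(fake_loader)
store.get_watched_item(1, 0, "Main_Page")  # miss: first lookup in this request
store.get_watched_item(1, 0, "Main_Page")  # hit: same key, same request
store.get_watched_item(1, 0, "Sandbox")    # miss: different key
print(store.hits, store.misses)
```

A cache like this only helps when the same item is requested more than once within a single request, which would explain a low hit rate.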


The review process worked well throughout all related patches. Generally 2 people created the patches and then a mixture of people put the patches through a few rounds of review before later being merged by one of 3 or 4 people. The review was likely so smooth as the changes generally were just refactoring.

One issue that was run into a few times was TitleValue doing a strict check to make sure that the namespace ID that it is constructed with is an int. The MediaWiki DB abstraction layer will return int columns as strings, thus this caused exceptions in initial versions before int casts were added.
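The failure mode is easy to reproduce in miniature. The sketch below is Python, not the actual PHP; the class name, the row keys and the exception type are assumptions chosen for illustration.

```python
class StrictTitleValue:
    """Title that refuses anything but a real int as its namespace ID."""

    def __init__(self, namespace, db_key):
        if not isinstance(namespace, int) or isinstance(namespace, bool):
            raise TypeError(
                f"namespace must be int, got {type(namespace).__name__}"
            )
        self.namespace = namespace
        self.db_key = db_key


def title_from_row(row):
    # The DB layer hands back integer columns as strings, so cast first.
    return StrictTitleValue(int(row["wl_namespace"]), row["wl_title"])


row = {"wl_namespace": "0", "wl_title": "Main_Page"}  # as returned by the DB

try:
    StrictTitleValue(row["wl_namespace"], row["wl_title"])
except TypeError as error:
    print("without cast:", error)

title = title_from_row(row)
print("with cast:", title.namespace, title.db_key)
```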

Also, while trying to add more advanced caching, MediaWiki’s lack of common cache interfaces caused a bit of pain, and as a result an RFC has been started about a common interface or potentially using PSR-6. https://phabricator.wikimedia.org/T130528

A similar method of refactoring could probably be applied to much of MediaWiki, particularly storage areas.

by addshore at November 05, 2017 01:11 PM

Wikidata Map May 2016 (Belarus & Uganda)

I originally posted about the Wikidata maps back in early 2015 and have followed up with a few posts since looking at interesting developments. This is another one of those posts covering the changes since the last post, so late 2015, to now, May 2016.

The new maps look very similar to the naked eye and the new ‘big’ map can be seen below.

So while at the 2016 Wikimedia Hackathon in Jerusalem I teamed up with @valhallasw to generate some diffs of these maps, in a slightly more programmatic way than in my posts following up the 2015 Wikimania!

In the image below all pixels that are red represent Wikidata items with coordinate locations and pixels that are yellow represent items added between October 27, 2015 and April 2, 2016 with coordinate locations. Click the image to see it full size.
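The colouring rule can be sketched in a few lines. This is only an illustration of the idea in Python, assuming each map has already been reduced to a set of (x, y) pixel positions that contain at least one item; it is not the script that was used at the hackathon, and the function and variable names are made up.

```python
RED = (255, 0, 0)
YELLOW = (255, 255, 0)


def colour_diff(old_pixels, new_pixels):
    """Map each pixel of the new map to red (already there) or yellow (new)."""
    return {
        pixel: (RED if pixel in old_pixels else YELLOW)
        for pixel in new_pixels
    }


# Hypothetical pixel sets for the two snapshots.
october_2015 = {(10, 10), (11, 10)}
april_2016 = {(10, 10), (11, 10), (40, 25)}

colours = colour_diff(october_2015, april_2016)
print(colours[(40, 25)] == YELLOW)
```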

The area in eastern Europe with many new items is Belarus and the area in eastern Africa is Uganda. Some other smaller clusters of yellow pixels can also be seen in the image.

All of the generated images from April 2016 can be found on Wikimedia Commons at the links below:

by addshore at November 05, 2017 12:45 PM

Gerard Meijssen

#Wikimedia - I endorse having a #strategy as it is good to have one

Having a strategy is great. There are objectives and there is an idea how to get there. As the Wikimedia Foundation formulates its strategy, it is complicated. Complicated by necessity because it involves so many interests, people who invested so much of themselves in their project(s), people who speak so many different languages, languages that define them, people with different backgrounds because they define them as well. The strategy must be complicated because it aims to reconcile all these people and the organisations that represent them.

When you are a Wikimedian, it helps when your vision coincides with the vision implicit in this big strategy. I was asked to present at the Wikimedia Nederland conference; I presented a historic view on information gathering and sharing. The presentation was given in English because it was the one common language in the room.

I love presentations, but talking with people I love even more. I was asked for the strategies behind the things that I do, the things I value. The Luc Hoffman award is an example. It does not have a Wikipedia article but the subject, the science, is of real relevance in this time of climate change. The idea of associating links (blue, red and black) is a non-confrontational way to bring Wikidata value to Wikipedia. Adding all the USAmerican alumni from en.wp categories will allow us to keep up with what they hold and know about even more USAmerican alumni. There is method behind the madness.

Now that the Wikimedia strategy goes to the next phase, I hope for many user stories; stories explaining what we are going to do and for whom. I also hope that technical considerations will not prevent innovation and improvements. In the end that is not what a strategy is. It is the hope for the bright future we deserve in our Wikimedia movement.

by Gerard Meijssen (noreply@blogger.com) at November 05, 2017 07:15 AM