March 24, 2018

Gerard Meijssen

#WeMissTurkey - The Shihab and Ma'an families

The Shihab family succeeded the Ma'an family because of a marriage, and because the male line of succession had come to an end. No complete name is recorded for Mrs Shihab-Ma'an, but her name is exactly what is needed to link the two families and the succession of power.

When you consider history, it is often told through conflicts and the succession of office holders (Fakhr-al-Din II and three of his sons were executed). But conflict is not the only thing that shaped history: relations through marriage prevented many conflicts and allowed for cultural development in times of peace.

Mrs Shihab-Ma'an was married to Haydar Shihab, and her son Mulhim became the next Emir. My big question: does anyone have a name for her? She must be notable, as she links the two families.

by Gerard Meijssen (noreply@blogger.com) at March 24, 2018 07:35 PM

Andre Klapper

Statistics, Google Code-in, Gitlab, Bugzilla

by aklapper at March 24, 2018 05:42 PM

Weekly OSM

weeklyOSM 400



400 editions of Wochennotiz/weeklyOSM! Celebrate this milestone with us | Image © Pyrog / Wikimedia CC BY-SA 3.0


  • Russell Deffner starts a second RFC for the proposed shop=cannabis tag.
  • The voting phase on the proposal about aviation obstacle lights is currently open. This proposal involves the precise tagging of lights on tall obstacles (towers, buildings) that serve as collision avoidance for aircraft in flight.
  • Another proposal, on the subject of contractors (for example, for the construction of an office building), was published by Christopher Baze as the tag contractor=. The voting is open until April.
  • Carlos Brys, from Argentina, reopens the discussion in the forum about the tags for various paving types, especially the "Brazilian" type (a variant of the Portuguese paving) which was previously mentioned on the Tagging mailing list.
  • Geochicas launched their project on Twitter called #LasCallesDeLasMujeres (the streets of women), in which they generated a map to analyse the gender balance in street nomenclature in Latin America and Spain, linked to the Wikipedia biographies of women who have a street named after them.
  • The proposal about pedestrian connections between ski lifts and pistes has been approved.
  • Strava has updated its heatmap imagery. Besides no longer displaying places with very little activity, the maximum zoom level is now limited to 12, which will make it harder to trace or align ways in most cases due to the heavily pixelated heatmap.


  • OpenStreetMap has reached one million registered map contributors!
  • Geohipster interviewed someone behind the Twitter account Anonymaps.
  • Levente Juhász tweets about his new open-access publication about OSM contribution patterns and how to foster long-time mappers.
  • Maurizio Napolitano (user napo) published a proposal on the Wiki for "Mappers in Residence". This suggests replicating in OSM the Wikipedian in Residence experience: a volunteering idea designed to help institutions or companies donate their information to the open knowledge world.
  • “OpenStreetMap, MapaNica and the Buses in Nicaragua” is the title of a presentation recently held in Esteli, Nicaragua, that explains the mapping process of bus routes in Managua and Estelí. More info is available on the event’s blog post.
  • OpenCageData interviewed Andy Mabbett, who is active in both the Wikidata and OpenStreetMap communities. He speaks about the connections between the two.
  • Valentina Carraro and Bart Wissink wrote a paper about minorities’ under-representation on OpenStreetMap, and the risk of real-world inequalities seeping into potentially democratic crowdsourced projects. The full article is behind a paywall.


  • In Portugal, a massive amount of data seems to have been imported for two months without prior agreement or documentation.
  • Ilya Zverev got hold of a NavAds dataset of over 59,000 fuel stations worldwide. The data is currently being reviewed and prepared for the import; progress can be followed on the Import mailing list and on the dedicated wiki page.

OpenStreetMap Foundation

  • Michael Reichert suggests on the OSMF-Talk mailing list that some inactive or less active sections from the OSM forum without moderators be closed to improve use of moderators’ time.
  • On March 15th the OSMF board met. Nakaner has published a German report in the OSM-Forum. Christoph comments in his user blog on the discussion about the translation of the OSMF website.


  • Lars Lingner reminds (de) (automatic translation) about the upcoming OSM Hackweekend, which will take place in the DB-Mindbox in Berlin on April 14th-15th.
  • The next Grazer Linuxtage will take place on 27-28 April in Graz / Austria at the premises of the FH Joanneum, Automotive Engineering course. Among many other contributions, a JOSM workshop can also be attended. Entrance is free.
  • The 6th Annual Conference organised by OpenStreetMap France will take place from 1 to 3 June in Pessac on the campus of the University of Bordeaux-Montaigne. Presentations can now be submitted.
  • On April 7th the general meeting of the Swiss OSM Association takes place at the Office for Geoinformation in Solothurn followed by a mapping party.

Humanitarian OSM

  • sev_osm reports in his diary (fr) (automatic translation) in detail on a workshop he did in Dakar with several members of the local OSM community, including teaching various advanced editing techniques, handling several available imageries, the OSM ecosystem and its governance.
  • The minutes of the March 1st board meeting of HOT US Inc. have been published.
  • Biondi Sima explains what the Indonesian HOT team plans to do next as part of the InAWARE project, after completion of the Jakarta project.


  • James2432 works on rendering craft businesses in OSM Carto.

Open Data

  • The Uruguayan gvSIG Community and GeoForAll Iberoamérica are organising the 5th Free and Open Source Geographic Information Technologies Conference, which will take place in Montevideo this October.
  • The municipality of Tirana agreed to donate geospatial data to OpenStreetMap using an open license.
  • The "Asia Foundation" shows in a photo blog how the Nepalese community celebrated Open Data Day. Kathmandu Living Labs organized a mapathon where many participants were introduced to the possibilities of OpenStreetMap.



  • A history analysis platform for OpenStreetMap is currently being developed at HeiGIT. It will make OSM data from the full history of edits more easily accessible. Check out the project‘s first ohsome Nepal Dashboard preview.
  • An OSM map as wallpaper on your smartphone? This web service operated by Alvar Carto makes it possible.
  • In a report (de) (automatic translation) the Potsdamer Neueste Nachrichten describes the start-up Calimoto. Their navigation app for motorcyclists is characterised by its preference for curvy routes.
  • In the OSM-Forum a new editor was introduced, which can primarily add new 3D buildings. It comes from the creators of PlaceMaker, a paid plug-in for SketchUp that imports 3D geodata into SketchUp and prepares it. The developers also uploaded a tutorial video.


  • Starting May 1st, the public Geofabrik downloads will no longer contain metadata about OSM users. Accessing the complete extracts will require logging in with an OSM account.


  • Simon Poole explains in a blog post what Vespucci version 10.2 will bring.

Did you know …

  • … a script to show the results of a database query on a map?
  • … the flosm OSM theme maps? There’s a new site to create your own individual map themes.
  • … OpenMapTiles? With this set of tools you can create vector world maps for your own hosting or for offline use. The program is available as free software at OpenMapTiles.org.
  • … this video that illustrates our mapping activity in OpenStreetMap since 2006?

OSM in the media

  • An article by CityLab on the gender distribution among mappers is being discussed on the diversity mailing list and Reddit. While commenters extend a welcome to mappers of all genders and backgrounds, the statistics cited in the article are met with scepticism.

Other “geo” things

  • Google Maps is following OSM and now also offers data for wheelchair users. For the time being, however, only in selected cities and with a lot less information than OSM has available.
  • According to a World Bank estimate, half of the world’s urban population lives on unnamed streets. The "Plus-Codes" from Google are an alternative to the usual addresses (city, street, house number). They are also an alternative to proprietary systems like "what3words".
  • Google wants to make it easier for game developers to create games similar to Pokémon Go using the Google Maps API.
  • In an article on SciDev, the principles of humanitarian mapping are discussed.

Upcoming Events

    Where What When Country
    Bremen Bremer Mappertreffen 2018-03-26 germany
    Graz Stammtisch Graz 2018-03-26 austria
    Rome Incontro mensile 2018-03-26 italy
    Essen Mappertreffen 2018-03-27 germany
    Dusseldorf Stammtisch 2018-03-28 germany
    Osaka もくもくマッピング! #15 2018-03-28 japan
    Paraná Creando mapas colaborativos libres ParanaConf 2018-03-29 argentina
    Urspring Stammtisch Ulmer Alb 2018-03-29 germany
    Rennes conférence découverte d’OpenStreetMap 2018-03-30 france
    Lima Yo mapeo 2018-03-31 peru
    Montreal Les Mercredis cartographie 2018-04-04 canada
    Stuttgart Stuttgarter Stammtisch 2018-04-04 germany
    Bochum Mappertreffen 2018-04-05 germany
    Dresden Stammtisch Dresden 2018-04-05 germany
    Solothurn 2018 SOSM AGM and mapping party 2018-04-07 switzerland
    Rennes Cartographie des rivières 2018-04-08 france
    Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
    Disneyland Paris Marne/Chessy Railway Station FOSS4G-fr 2018 2018-05-15-2018-05-17 france
    Bordeaux State of the Map France 2018 2018-06-01-2018-06-03 france
    Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy
    Dar es Salaam FOSS4G 2018 2018-08-29-2018-08-31 tanzania
    Bengaluru State of the Map Asia 2018 (effective date to confirm) 2018-11-17-2018-11-18 india

    Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, Polyglot, Rogehm, SK53, SeleneYang, Spanholz, Spec80, Tordanik, YoViajo, derFred, jinalfoflia, sabas88, sev_osm.

by weeklyteam at March 24, 2018 02:11 PM

March 23, 2018

Wikimedia Foundation

Add your photos of spring to the sum of all knowledge

Photo by Marcelochal, CC BY-SA 4.0.

Do you have photos of spring sitting on your hard drive or camera? We have a modest proposal for you.

Wikimedia Commons is the photo repository of the Wikimedia movement. It holds most of the images you’ll find on Wikipedia. The site is full of volunteer photographers who have donated their work to the pursuit of the sum of all knowledge.

That’s where you can come in. As the weather gets warmer (in the Northern Hemisphere, at least), animals come out of hibernation, the birds return to the trees, and the flowers start to bloom. To mark the season, those volunteers are hosting a contest that you can enter.

There are only a few simple requirements:

  • Photographs must be the work of the nominator.
  • Photographs must be newly uploaded to Commons during the challenge submission period, i.e. March 2018 UTC (but may have been taken years or decades earlier).
  • Please do not nominate lots of similar photos; instead choose just your best and most varied images.

How do you go about uploading images, you ask? We have you covered! Here are the steps in detail.

  • Go to the Upload Wizard. Create an account if need be, taking note that we record very little of your personal information.
  • Upload your image. Please remember that it has to have been taken by yourself (“This file is my own work”).
  • Release, or donate, the rights. We’d suggest the CC BY-SA license, which allows anyone to use it for pretty much any purpose so long as they both attribute you and re-share it under a similar license.
  • Describe it. Use a descriptive title, explain what the photo shows, and try to add a category (e.g., what flower did you photograph?).
  • Upload it!

To enter it in the contest, add the file name to the contest page alongside all the other file names. (That’s the “File:Name name name.jpg” part.) If you need help, don’t be afraid to reach out on the site’s help desk.


Feel free to use this post for inspiration — all of the photos here were entered into this month’s contest. And again, all were taken by Wikimedia Commons’ cohort of volunteer photographers.

Photo by Syed sajidul islam, CC BY-SA 4.0.


Photo by Neptuul, CC BY-SA 4.0.


Photo by Maasaak, CC BY-SA 4.0.


Photo by Cintia Soledad Peri, CC BY-SA 4.0.


Photo by Anna Saini, CC BY-SA 4.0.


Photo by Well-Informed Optimist, CC BY-SA 4.0.


Photo by BohunkaNika, CC BY-SA 4.0.


Photo by Zeynel Cebeci, CC BY-SA 4.0.


Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

Need more? Check out Wikimedia Commons’ other ongoing contests.

by Ed Erhart at March 23, 2018 04:37 PM

March 22, 2018


Spike in Adam Conover Wikipedia page views | WikiWhat Episode 4

This post relates to the WikiWhat YouTube video entitled “Adam Conover Does Not Like Fact Checking | WikiWhat Episode 4” by the channel Cntrl+Alt+Delete. It would appear that the video went slightly viral over the past few days, so let’s take a quick look at the impact that had on the Wikipedia page views for Adam’s article.

The video was published back in January, and although the viewing metrics are behind closed doors, it has had a lot of activity in the past five days (judging by the comments).

It is currently the most viewed video in the WikiWhat series, at 198,000 views, whereas the other three videos (John Bradley, Kate Upton & Lawrence Gillard Jr.) have only 6,000 views between them.

The sharp increase in video views translates rather well into Wikipedia page views for the Adam Conover article.

Generated at https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2018-02-28&end=2018-03-20&pages=Adam_Conover|Talk:Adam_Conover|User:Adam_Conover|User_talk:Adam_Conover

Interestingly, this doesn’t just show a page view increase for the article, but also for the talk page and Adam Conover’s user pages, all of which are shown in the video.

It’s a shame that 200,000 YouTube views translate to only roughly 15,000 views on Wikipedia, but it is still interesting to see the effect videos such as this can have on the visibility of the site.

You can watch the page views for any Wikipedia page using the Pageviews tool.
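The same numbers can also be pulled programmatically from the Wikimedia REST API, which backs the tool linked above. The sketch below only builds the request URL and sums a response; the helper names are my own, and the actual HTTP fetch is left to the reader:

```python
def pageviews_url(article, project="en.wikipedia", access="all-access",
                  agent="user", start="20180228", end="20180320"):
    """Build a Wikimedia REST API URL for daily per-article page views."""
    base = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
    return f"{base}/{project}/{access}/{agent}/{article}/daily/{start}/{end}"

def total_views(items):
    """Sum the 'views' counts from the API response's 'items' list."""
    return sum(item["views"] for item in items)

# The endpoint returns JSON shaped like
# {"items": [{"article": "Adam_Conover", "views": 123, ...}, ...]}
print(pageviews_url("Adam_Conover"))
```

The default date range matches the Pageviews link above (28 February to 20 March 2018).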

by addshore at March 22, 2018 09:20 PM

Wiki Education Foundation

Steve Jankowski is the University of Windsor’s Wikipedia Visiting Scholar

I’m happy to announce Steve Jankowski as Wikipedia Visiting Scholar at the University of Windsor!

Steve Jankowski
Image: File:Greyscale image of Steve Jankowski.jpg, Textaural, CC BY-SA 4.0, via Wikimedia Commons.

Steve is a PhD candidate in the Joint Graduate Program of Communication and Culture at York and Ryerson Universities. He holds an MA in Communication from the University of Ottawa, and a BDes from the York/Sheridan Design Program. His research explores historical and contemporary issues concerning the political design of encyclopedic knowledge.

Wikipedia plays a key role in Steve’s research, but he’s also an active volunteer contributor. As User:Textaural, he has contributed to several topics related to encyclopedias and encyclopedism, like the biography of Vincent of Beauvais, who wrote the important 13th-century encyclopedia, the Speculum Maius. He has also made several contributions to Wikisource.

With his new access to the University of Windsor’s collections, Steve will develop articles about topics like cross-border culture, First Nations history in the Essex County/Detroit River area, early French history of the Detroit River area, Windsor/Sandwich’s Underground Railroad, and First Nations sports history.

“I was thrilled to accept the position of Wikipedia Visiting Scholar at the University of Windsor,” Steve said. “I am excited to dive into the university’s electronic and physical archives to find unique sources, voices, and stories about the local history and cross-border culture of the Windsor and Detroit River area. My interest in this kind of work stems largely from my academic experience as a graduate student. In 2009, I started studying the production of knowledge on Wikipedia as the topic of my Masters thesis. I have carried this focus forward into my PhD dissertation where I am investigating how the designs of general encyclopedias — like those of the Encyclopedia Britannica or Wikipedia — rely on the techniques of both inclusion and exclusion to communicate knowledge. Also during my Masters I worked as a research assistant for Dr. Boulou de B’béri on his ‘Promise Land Project,’ a multi-year project that amplified the recognition of Black history within the neighbouring municipality of Chatham-Kent, Ontario. My goal is to take what I have learned from these experiences and apply them to this new and exciting role.”

Working with Steve at the University of Windsor is librarian Heidi Jacobs, who said that “Leddy Library’s Centre for Digital Scholarship is excited to be partnering with the Wikipedia Visiting Scholars program. We think this partnership offers exciting opportunities to build on faculty and librarian research and classroom work with Wikipedia and to support our commitment to both preserve and make accessible the diverse history of the Windsor/Essex County/Detroit River area. We’re very much looking forward to working with Steve and Wikipedia over the next year.”

Although Visiting Scholars is typically a role performed remotely, “visiting” in a virtual sense, Steve is local to the university and thus will be able to take advantage of the library’s physical collections, in addition to digital. He will also have access to historians and other scholars on campus who can support his research.

If you are a Wikipedia contributor interested in forming a relationship with an academic library, or if you work at a university and would like to learn more about how a Visiting Scholar could increase the impact of your library’s collections, visit the Visiting Scholars section of our website.

Header image: File:OdetteBldg University of Windsor.jpg, Mikerussell, CC BY-SA 3.0, via Wikimedia Commons.

by Ryan McGrady at March 22, 2018 04:22 PM

Wikimedia Foundation

Building a better #1Lib1Ref

Photo by Lëa-Kim Châteauneuf, CC BY-SA 4.0.

Every year, librarians around the world take part in #1Lib1Ref, the annual event that asks us to imagine “a world where every librarian adds one reference to Wikipedia”. Running from 15 January to 3 February, this year’s campaign saw 824 editors make over 6,500 edits across 22 languages — more than in the campaign’s first two years combined.

What made this year’s campaign so successful? We see two components that stood out: technical tools that provided an easier onboarding experience for newcomers, and community sharing techniques that brought the campaign to the attention of more people.


Tech tools provided a simplified path to editing for newcomers. Citation Hunt, which helps editors quickly and easily locate snippets of text in need of a citation, played a key role. The tool now includes category search features to help you home in on topics of interest in your own language. The Leaderboard spotlights the most prolific editors, promoting friendly competition.

Another returning favorite was the Hashtag tool, used to track edits made as part of the campaign. The interface makes it easy to thank other editors for the edits they’re making, almost as soon as they happen. The new date limiter marked a dramatic mid-campaign improvement in our tracking capabilities.

A new feature in this year’s campaign was the explainer video created in SimpleShow by Felix Nartey. This friendly video gave newcomers an easy introduction to the campaign, and provided a morale boost when shared via social media.

Video by Felix Nartey/Jessamyn West/Wikimedia Foundation, CC BY-SA 4.0. Due to browser limitations, the video will not play on Microsoft Edge, Internet Explorer, or Safari. Please try Mozilla Firefox instead, or watch it directly on Wikimedia Commons.


An even greater influence on this year’s campaign was social sharing by members of the libraries and Wikimedian communities. We notified community organizers, listservs, and US state library associations, and wrote up blog posts and a Coffee Kit to help spread the word. From there the community at large took over, translating materials into local languages and sharing them further. The full list of postings, blogs, and events includes over 70 items.

The same holds true for social media sharing. We were active on Twitter and Facebook, including posting two Twitter Moments, but much of the sharing was driven by the community. Nearly 4,400 campaign-related tweets from 40 countries around the world reached an audience of over 5.5 million people. The Wikipedia + Libraries Facebook group was also a hub of activity, with 23 percent membership growth during the campaign. Librarians also talked about the campaign across other social platforms.

Library events—and inter-library competition!—were great drivers of engagement. There were over 30 events hosted by libraries around the world. Highlights included a total of 604 edits from six competing libraries in the Montreal area, and an amazing 1160 edits from the State Library of Queensland.

Image by JacintaJS, CC BY-SA 4.0.


The outcome of this year’s outstandingly broad campaign? An incredible diversity of content. Articles on topics from over 120 countries were improved, across 22 language Wikipedias. Subjects ranged from ‘1997’ to the ‘1896 Summer Olympics’, from ‘audiobook’ to ‘Zaragoza’, and everywhere in between. We also celebrated #1lib1ref’s 10,000th edit, made on Serbian Wikipedia.

Map from OpenStreetMap (Open Data License), data from Wikidata.

Looking forward

We’ve identified some key ways to make next year’s #1Lib1Ref even better. Big news: for the first time, you can participate in the brand-new #1Lib1Ref South, scheduled for this May at the request of contributors in Australia and South America.

Here’s how else you can get involved!

  • Help create a worklist tool to compile lists of thematically linked articles for editathons or group editing.
  • Transcode the campaign video into alternate formats to improve accessibility.
  • Translate campaign materials into other languages to improve sharing in local contexts in advance of next year’s campaign.
  • Get involved in other great editing campaigns, like Art+Feminism and AfroCROWD.

Jessamyn West, Vermont Mutual Aid Society

See our 2018 lessons document for a full report.

by Jessamyn West at March 22, 2018 02:55 PM

Magnus Manske

Why I didn’t fix your bug

Many of you have left me bug reports, feature requests, and other issues relating to my tools in the WikiVerse. You have contacted me through the BitBucket issue tracker (and apparently I’m on phabricator as well now), Twitter, various emails, talk pages (my own, other users, content talk pages, wikitech, meta etc.), messaging apps, and in person.

And I haven’t done anything. I haven’t even replied. No indication that I saw the issue.

Frustrating, I know. You just want that tiny thing fixed. At least you believe it’s a tiny change.

Now, let’s have a look at the resources available, which, in this case, is my time. Starting with the big stuff (general estimates, MMMV [my mileage may vary]):

24h per day
-9h work (including drive)
-7h sleep (I wish)
-2h private (eat, exercise, shower, read, girlfriend, etc.)
=6h left

Can’t argue with that, right? Now, 6h left is a high estimate, obviously; work and private can (and do) expand on a daily, fluctuating basis, as they do for all of us.

So then I can fix your stuff, right? Let’s see:

-1h maintenance (tool restarts, GLAM pageview updates, mix'n'match catalogs add/fix, etc.)
-3h development/rewrite (because that's where tools come from)
=2h left

Two hours per day is a lot, right? In reality, it’s a lot less, but let’s stick with it for now. A few of my tools have no issues, but many of them have several open, so let’s assume each tool has one:

/130 tools (low estimate)
=55 sec/tool

That’s enough time to find and skim the issue, open the source code file(s), and … oh time’s up! Sorry, next issue!
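The back-of-the-envelope budget above is easy to reproduce in a few lines (this is just a sketch of the post’s own estimates, nothing more):

```python
# Daily time budget, following the estimates in the post.
hours_left = 24 - 9 - 7 - 2          # minus work, sleep, private
dev_hours = hours_left - 1 - 3       # minus maintenance, development/rewrite

tools = 130                          # low estimate of tool count
seconds_per_tool = dev_hours * 3600 / tools

print(hours_left, dev_hours, round(seconds_per_tool))  # 6 2 55
```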

So instead of dealing with all of them, I deal with one of them. Until it’s fixed, or I give up. Either may take minutes, hours, or days. And during that time, I am not looking at the hundreds of other issues. Because I can’t do anything about them at the time.

So how do I pick an issue to work on? It’s an intricate heuristic computed from the following factors:

  • Number of users affected
  • Severity (“security issue” vs. “wrong spelling”)
  • Opportunity (meaning, I noticed it when it got filed)
  • Availability (am I focused on doing something else when I notice the issue?)
  • Fun factor and current mood (yes, I am a volunteer. Deal with it.)

No single event prompted this blog post. I’ll keep it around to point to, when the occasion arises.

by Magnus at March 22, 2018 11:26 AM

March 21, 2018

Wikimedia Foundation

How to hold workshops that are fully remote accessible


Photo via Pixabay, CC0.

When Design Researcher Abbey Ripstra returned to the U.S. after a research trip to learn more about how people in Nigeria and India were accessing information, her first order of business was to figure out how to apply the research findings her cross-functional team had uncovered.

“The team had to understand large amounts of content together, collaboratively figure out which findings to focus on, create ideas to address those prioritized findings, and then evaluate and prioritize our own ideas to start working on them,” she says. “And we wanted to incorporate many perspectives and expertise, because that would help our team make better decisions, and come up with more systematic and potentially more impactful solutions.”

But Ripstra and her team working on the New Readers contextual inquiry were not all located in the same city, or even the same country. Because of the need to work quickly and efficiently, they decided to bring together some people in person, with one more joining online. There were challenges with this approach: the colocated teammates found it difficult to look at their computer screens, where their remote teammate was visible, so it was difficult to fully incorporate his thoughts into the discussion.

When Ripstra and her colleague Grace Gellerman were given the opportunity to run another series of workshops for the New Editors Experiences research project, they decided to iterate and improve the workshops that they had previously designed. We interviewed them about what they learned and how they approached holding workshops that were fully remote-accessible, even for colocated employees.

Why did you decide to hold fully remote-accessible workshops?

Grace: I had facilitated the New Readers workshops where most of the participants were in the conference room in San Francisco and one participated via Google Hangout from India. In trying to make the meeting as remote-friendly as possible, we lost out on some interactions that could have happened in the room. During the next set of workshops — for the New Editors Experience — I wanted to level the playing field for all participants, so everyone joined remotely. I’ll admit: the 100 percent remote format led to some challenges for me as a facilitator. That said, we iterated on what we learned in the first set of workshops.

You are working with a team of 17 people from seven teams across the Wikimedia Foundation, spread over 12 time zones. That’s tough to schedule. Could you talk a little bit about how you structured these workshops to make them work across all of these locations?

Abbey: We looked at an international calendar and focused on scheduling times that were the least intrusive for participants. For example, in the Pacific Time Zone — where the majority of participants reside — we asked people to join a bit early – 8 a.m. instead of 9 a.m. In the time zones furthest from the Pacific Time Zone, folks were asked to join slightly after their typical working hours. Some people got up early, and some stayed at work a little later. We all made adjustments to our normal working times to be able to participate.

How did you ask people to prep in advance?

Abbey: We made sure everyone had access to the software we used, along with headphones. We also asked people to place a large monitor beside their laptop so they could see the multiple windows they’d need to have open. People also did some significant homework, so that they’d be able to make decisions and fully participate during the four-hour-long workshops.

Grace, you talk about facilitation. Could you talk about the way you think about facilitating groups across time zones? How is it different than in-person?

Grace: In my work with teams on how they collaborate, I’ve found that when people meet in a conference room, they tend to look at the other people in the room rather than at the Google Hangout. Remote participants then start to communicate with each other via the Chat feature. In the workshops, I wanted to remove the temptation to look at other people in the physical room, as well as prevent the Chat from becoming a back channel. I even kicked a participant out of the physical room when she mistakenly joined me in the conference room where I was facilitating.

When people participate fully in person, they are able to communicate with gesture: knowing when someone is done talking and when you can start, assuming that everyone can see what you can see and hear what you can hear. Once only a few people are remote in a group that is used to being fully in person, it is hard for the remote folks to get all the important information the colocated people are getting. Observing gestures, both by and of remote folks, is limited, and timing can be off because of delays or troublesome headphones or mics. It is also hard for the in-person folks to fully welcome and communicate with remote folks, even when everyone (including those in the room) individually dials into the video meeting.

Regular meetings with both in-person and remote participants bring challenges. Speaking only into the computer is counterintuitive to our natural in-person communication styles, and it is very hard to train ourselves not to look at the person directly across the table whom we are addressing, but instead to talk into the computer so everyone can hear and see the same amount. That setup turned out to be a first, not very good, attempt at leveling the playing field. This was one of the reasons we decided to take on the challenge of being fully remote for these workshops.

Grace, you mention making adaptations to the meeting norms.  What were some of the issues that caused you to make changes?

In the first workshop, a participant who did not have their camera on interrupted Abbey. I could not anticipate the interruption because, without video, participants became disembodied voices. In the next workshop, I asked participants, bandwidth permitting, to turn their cameras on when speaking, to allow for non-verbal communication.

To persuade participants to be more present in the meeting, we asked that they limit using the Chat to raising their hands to get into the queue or to indicate when they had a direct response to something someone else said. The result of this is to bring backchannel conversations to the forefront so everyone can participate.

As participants were speaking, others would say “+1” in the Chat. If the speaker spoke for a while, what inspired the +1 was no longer obvious. So I asked participants to indicate whom the support was for and give a high-level summary. For example, “+1 Abbey, for limiting work in process.”

What kind of successes have stemmed from your approach that may not have been achieved with a traditional in-person setup?

The week-long breaks between the five workshops allowed the information to percolate and be processed and perspectives to be shared.  Allowing that much time in an in-person workshop would have been prohibitive.

What were the biggest takeaways that might apply to future efforts to work collaboratively across time zones?

Grace: Fully remote meetings can prevent the behaviors, common in hybrid meetings, that create two classes of participants.

Having elapsed time between workshops allows for teams to “percolate” or process information in between workshops. This helps the greater team make better decisions.

Abbey: Re-focusing back channel conversations into the foreground enables everyone to participate and take in the content.

Fully remote participation saves the Foundation travel costs.

Doing iterative retros helped us to adapt these processes to work for a greater number of people.

Interview by Melody Kramer, Senior Audience Development Manager, Communications
Wikimedia Foundation

by Melody Kramer at March 21, 2018 07:03 PM

Wiki Education Foundation

Students bring their passions to public fora through a Wikipedia assignment

Dr. Clare Talwalker is a Continuing Lecturer in International and Area Studies at the University of California, Berkeley. Last term, she taught with Wikipedia in her course, Ethics and Methods for the Global Poverty and Practice Minor. Here, she reflects on the course and why she will continue to teach with Wikipedia in future GPP classes.

Dr. Clare Talwalker.
File:ClareTalwalker.jpg, Ctalwalker, CC BY-SA 4.0, via Wikimedia Commons.

For ten years the students in my Global Poverty and Practice (GPP) class have done hefty literature reviews to prepare for their “practice experience” (PE) – the minimum 240 hours over 6 weeks they must complete (required, but not granted course credit) with any state or non-state organization combating poverty or inequality. They research the history and social context of their PE organization and the debates surrounding its sector. For example, a student preparing for an ecotourism organization in Bali might research the state of fishing livelihoods and the marine environment there; she will also research the scholarly work on ecotourism – its emergence, its strengths, its pitfalls, and its blind spots.

The work has been grueling for student and teacher alike, because guiding 20-40 students each semester through that many different bodies of scholarship is a massive task. Yet, this is important work. Grueling background research turns out to be just the brakes GPP students secretly want as they try to avoid the hubris in their desire for social change.

Early last Fall (2017), having already met the fresh cohort of GPP students, I happened to attend a Wiki Education seminar where an idea struck: why not have GPP students bring all their research to Wikipedia?! It was so exciting a thought I wondered why none of us GPP faculty had considered it before – given our own emphasis on students bringing their learning and passion to public fora.

At first, my students were skeptical. We were told to stay away from Wikipedia in high school, they told me. Why would we write for it in college?

But the semester closed out with significant additions to a range of articles. I encouraged students to add not only empirical but also conceptual material they found in scholarly sources. For example, they added:

So again this semester, GPP students (>70 of them) are mining scholarly databases, using what they find to fuel up Wikipedia articles related to their range of interests. Here are three reasons why I find this so inspiring!

One: GPP students may build on each other’s work in the coming semesters, returning to some of the same articles and slowly improving many important parts of the Wikipedia universe – those focused on poverty and inequality (history, social/cultural context, debates about interventions) around the world.

Two: GPP students learn about the importance of summarizing and synthesizing knowledge in this unique public forum – of finding, clarifying and repackaging what is already researched. This is a valuable corrective to our engulfment by simply more and more information.

Three: GPP students get to collaborate with each other and with other Wikipedians; their authorship is buried under usernames and behind less-visited History tabs. This is an important lesson in appreciating the pleasure and value of joint results over individual performance and renown.

It’s now a new semester. My course is training a fresh and bigger cohort of editors who are chomping at the bit, revving to go. Reservation poverty, Education in Ghana, Seed saving, European migrant crisis, Malnutrition in India, Barangay Health Volunteers, Employment discrimination, Corporate social responsibility, Peruvian Welfare State, and more… here we go!

Header image: File:FullSizeRender.jpg-4.jpg, Zpwilliams, CC BY-SA 4.0, via Wikimedia Commons.

by Guest Contributor at March 21, 2018 04:01 PM

March 20, 2018

Wikimedia Tech Blog

Pre-university students contribute to Wikimedia in Google Code-in 2017

Google Code-in 2015 grand prize winners. Photo by Florian Schmidt, CC BY-SA 4.0.

Google Code-in is an annual contest for 14–17-year-old students exploring free and open source software projects via small tasks in the areas of code, documentation, outreach, research, and design. Students who complete tasks receive a digital certificate and a t-shirt from Google, while the top students in every participating organization get invited to visit Google’s headquarters in California, United States.

For the fifth time, Wikimedia was one of 25 participating organizations, offering mentors and tasks.

Many students also summarized their experience in Google Code-in in blog posts, expressing why the contest is a helpful opportunity to get to know free and open source software development.

To list only some of the students’ achievements in the many different software projects and programming languages that the Wikimedia community uses:

… and many, many more.

We would like to congratulate our winners Albert Wolszon and Nikita Volobuiev, our finalists David Siedtmann, Rafid Aslam and Yifei He, and our many hard-working students on their valuable contributions to making free knowledge available for everybody. We hope you enjoyed the experience as much as we did, and hope to see you around on Internet Relay Chat, mailing lists, tasks, and patch comments after the end of this edition of Google Code-in, too. A list of all winners across participating organizations is available.

We would also like to thank all our mentors for their commitment — the time spent on weekends, coming up with task ideas, working together with students and quickly reviewing contributions, and for providing helpful feedback for potential improvements in the next round.

Last but not least, thanks everybody for your friendliness, patience, and help provided.

Wikimedia always welcomes contributions to improve free and open knowledge. Find out how you can contribute.

Andre Klapper, Bug Wrangler, Developer Relations
Wikimedia Foundation

Graph by Andre Klapper, CC0.

by Andre Klapper at March 20, 2018 04:55 PM

Victory in Italy: Wikimedia wins lawsuit against former Minister of Defense

Photo by Andreas Tille, CC BY-SA 4.0.

Today, we are happy to announce that on Feb. 19, 2018, the Court of Appeals of Rome ruled in favor of the Wikimedia Foundation in Previti v. Wikimedia Foundation. This ruling protects the community editing model and enables the work of all the volunteers in the Wikimedia movement.

In 2012, Cesare Previti, a former Italian Minister of Defense, sued the Wikimedia Foundation for hosting a Wikipedia article he alleged contained defamatory information. Mr. Previti sent a general letter demanding that the article be deleted, without clearly identifying what content was defamatory or linking to where it was hosted, and subsequently filed the suit requesting its removal when the Wikimedia Foundation did not take it down.

In 2013, the Civil Court in Rome ruled in our favor. The court held that as a hosting provider, the Wikimedia Foundation cannot be held liable for the content of Wikipedia articles, which it does not control. The court also noted that both the Foundation and the Wikipedia sites themselves provide information about the open and collaborative nature of the encyclopedia.

Mr. Previti appealed the decision, claiming that the Foundation did not just host information created by third parties, but also actively participated in the creation and management of content. The Court of Appeals of Rome has now affirmed the lower court’s decision.

In a ruling that provides strong protection for Wikipedia’s community governance model, the Court once again recognized that the Wikimedia Foundation is a hosting provider, and that the volunteer editors and contributors create and control content on the Wikimedia projects. The Court also made clear that a general warning letter, without additional detail about the online location, unlawfulness, or the harmful nature of the content as recognized by a court, does not impose a removal obligation on a hosting provider like the Wikimedia Foundation.

Finally, the Court took notice of Wikipedia’s unique model of community-based content creation, and the mechanisms by which someone can suggest edits or additions to project content. It found that Wikipedia has a clear community procedure for content modification, which Mr. Previti should have used to address his concerns. He could have reached out to the volunteer editors, provided reliable sources, and suggested amendments to the article,  instead of sending a general warning letter to the Foundation.

As a result of the court’s ruling, the article will remain on the projects, and Mr. Previti will pay the Wikimedia Foundation some of the expenses incurred in defending the lawsuit and appeal.

This ruling is a victory not just for the Foundation, or editors of Italian Wikipedia, but for Wikimedians everywhere, and their ability to make accurate, well-sourced information freely available. Wikipedia is created, edited, supported, and managed by its global community of volunteer editors, contributors, and translators. The Wikimedia Foundation’s status as a hosting provider allows it to provide a platform for these volunteers to share their work with the entire world in nearly three hundred different languages. Decisions like this one, which recognize and reaffirm that status, are important for the growth of open source communities in general, and especially the future dissemination of free knowledge on the Wikimedia projects.

Jacob Rogers, Legal Counsel
Emine Yildirim, Legal Fellow

The Foundation would like to extend special thanks to Marco Berliri and his team at Hogan Lovells, who assisted us with their excellent representation throughout this case.


by Jacob Rogers and Emine Yildirim at March 20, 2018 04:31 PM

Luc Héripret on the Orange Foundation’s Digital Schools project

Cameroon’s Reunification Monument, uploaded during Wiki Loves Monuments 2013. The Orange Foundation supported the event. Photo by Steve Mvondo, CC BY-SA 3.0.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the internet can access free and open knowledge.

Over the coming months, Anne will be interviewing people who work to remove access barriers for people across the world. In this installment, she interviews Luc Héripret, who leads the Orange Foundation’s work in Africa, which supports education initiatives across the continent. Through the Digital Schools program, active in 532 schools in 12 countries across the continent, the Orange Foundation has funded the initial development of WikiFundi, an offline editing tool for schools. They also work to promote gender equality through initiatives such as Women’s Digital Centres for education.

Gomez: In your own words, could you tell us about the Orange Foundation? What are your goals? How do you relate to the Orange Group?

Héripret: The Orange Foundation is the philanthropic branch of the Orange Group. Our three domains are: education, health and culture. We act in the thirty countries where the group is present and through our fifteen local foundations. Our major focus is on digital education.

Could you describe your role at the Orange Foundation?

I have three roles:

  1. I am coordinating our philanthropic actions in Africa.
  2. I drive different philanthropy programs, such as digital schools, villages, and maternal and infant health.
  3. I coordinate our actions in digital content.

How did the Digital Schools program come to be? What does a deployment consist of? How is it run?

At the beginning, the idea of the program was to offer digital content on a tablet rather than books for the same cost, because educational material on a digital medium allows access to thousands more pieces of content.

We also wanted to go offline with the tablets, as data connection is costly in Africa. And we wanted to use a central server in each school to simplify the update process and to be able to have interactive tools like KA Lite [an offline, open source version of the educational platform Khan Academy]. The program is designed, coordinated and funded by the Orange Foundation at the group level and is targeted, as are all of our programs, at lower-income populations in our target markets.

The local Orange Foundation then implements the program locally, coordinating with the Ministry of Education and, if needed, with local non-government organizations. Local Orange volunteers follow the program in the schools on a technical level. A local educational adviser is hired by the Orange Foundation to select relevant content, coordinate with the ministry, and follow schools’ educational progress.

Can you tell us a little more about how you’ve seen students using Wikipedia through your program? Can you share a specific story?

They use Wikipedia to discover their own country, and to learn science. For example, a young teacher in Madagascar discovered Wikipedia (and the digital world!) through our digital kit. Her first search was “Human Body” and she found a general article about the human body with an illustration. Without any help, she then zoomed with her fingers so the picture was full screen. She said that it was exactly what she needed for her teaching the next day.

Smartphones have transformed the way people can access the internet. How has this changed the landscape and the way you view offline access? How do you see these devices impacting the future of educational resources?

Smartphones are beginning to change internet access in Africa, but there is still a long way to go before that access reaches a larger number of people. Offline access still has some years to really grow. Smartphones can be a good tool, but I think tablets are a better tool for real learning.

What are you most proud of in your work in Africa?

The number of people reached by our actions: we’ve reached 130,000 children in schools and 250,000 people in rural villages.

What has been your biggest surprise over the years?

My biggest surprise was to see how easily children and professors use KA Lite or Wikipedia on tablets, in many different ways but always efficiently.

When we last spoke, you talked about building an ecosystem of resources for education, ranging from digitizing resources, to hosting a resource hub, to deploying hardware. Can you describe the ecosystem as it is now, and your vision for the future?

For launching our digital schools, we decided to use only open source content, apart from the local content given by Ministries of Education. So we gathered a lot of open source content through the Web over the past four years, and we support the creation of open source content with an annual call for projects. We also digitize local books if we have the authorization. We have launched a website to put online all the content we gathered, so everybody can benefit from the library. The site is collaborative: all users can add their own content.

Our Foundation network uses the online library to add content to their local servers, which they then send to the schools. To add the content to the servers in a simple way, we developed a small piece of software called Edupi, which acts like a local drive. On the hardware side, we send the hardware once a year to all the local foundations concerned. They assemble the kits locally with Orange volunteers and bring them to the schools, with a two-day training for each school. In the future, we would like local teams to have complete autonomy in running the program.

As we mentioned (and you may know), the Wikimedia volunteer communities play a major part in the on-the-ground projects we work on. What ways can the Orange Foundation engage with these communities to expand collaboration via programs or projects?

We would love to work with Wikimedia volunteers, either on the content side or on helping to train and follow teachers and students. For example, this was recently done with the Wikichallenge we supported, organised by Wiki in Africa with the help of local Wikimedia volunteers.

When you say “the content side,” what do you mean? Would you like more content to be written? Curated?

Both. They lack local content adapted to their culture, and searching for relevant content can be painful. A kind of specialized Wikipedia extracted from the original material would be very useful.

Who do you rely on to learn more about offline educational resources? What resources  (conferences, people, spaces) exist for people who want to know more?

Interview by Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

by Anne Gomez at March 20, 2018 03:52 PM

Wikimedia Cloud Services

Running red-queen-style

I've spent the last few months building new web servers to support some of the basic WMCS web services: Wikitech, Horizon, and Toolsadmin. The new Wikitech service is already up and running; on Wednesday I hope to flip the last switch and move all public Horizon and Toolsadmin traffic to the new servers as well.

If everything goes as planned, users will barely notice this change at all.

This is a lot of what our team does -- running as fast as we can just to stay in place. Software doesn't last forever -- it takes a lot of effort just to hold things together. Here are some of the problems that this rebuild is solving:

  • T186288: Operating System obsolescence. Years ago, the Wikimedia Foundation Operations team resolved to move all of our infrastructure from Ubuntu to Debian Linux. Ubuntu Trusty will stop receiving security upgrades in about a year, so we have to stop using it by then. All three services (Wikitech, Horizon, Toolsadmin) were running on Ubuntu servers; Wikitech was the last of the Foundation's MediaWiki hosts to run on Ubuntu, so its upgrade should allow for all kinds of special cases to be ignored in the future.
  • T98813: Keeping up with PHP and HHVM. In addition to being the last wiki on Trusty, Wikitech was also the last wiki on PHP 5. Every other wiki is using HHVM and, with the death of the old Wikitech, we can finally stop supporting PHP 5 internally. Better yet, this plays a part in unblocking the entire MediaWiki ecosystem (T172165) as newer versions of MediaWiki standardize on HHVM or PHP 7.
  • T168559: Escaping failing hardware. The old Wikitech site was hosted on a machine named 'Silver'. Hardware wears out, and Silver is pretty old. The last few times I've rebooted it, it's required a bit of nudging to bring it back up. If it powered down today, it would probably come back, but it might not. As of today's switchover, that scenario won't result in weeks of Wikitech downtime.
  • T169099: Tracking OpenStack upgrades. OpenStack (the software project that includes Horizon and most of our virtual machine infrastructure) releases a new version every six months. Ubuntu packages up every version with all of its dependencies, and provides a clear upgrade path between versions. Debian, for the most part, does not. The new release of Horizon is no longer deployed through an upstream package at all, but instead is a pure Python deploy starting with the raw Horizon source and requirements list, rolled into Wheels and deployed into an isolated virtual environment. It's unclear exactly how we'll transition our other OpenStack components away from Ubuntu, but this Horizon deploy provides a potential model for deploying any OpenStack project, any version, on any OS. Having done this I'm much less worried about our reliance on often-fickle upstream packagers.
  • T187506: High availability. The old versions of these web services were hosted on single servers. Any maintenance or hardware downtime meant that the websites were gone for the duration. Now we have a pair of servers with a shared cache, behind a load-balancer. If either of the servers dies (or, more likely, we need to reboot one for kernel updates) the website will remain up and responsive.
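
One piece of the Horizon rebuild described above, installing a Python service into an isolated virtual environment, can be sketched with the standard library alone. This is only an illustration, not the actual deploy tooling used for Horizon; the directory names here are made up for the example:

```python
# Minimal sketch: create an isolated virtual environment and confirm that
# its interpreter reports the venv itself as sys.prefix, i.e. packages
# installed there stay separate from the system Python.
# (with_pip=False keeps the example fast and fully offline.)
import subprocess
import tempfile
import venv
from pathlib import Path

target = Path(tempfile.mkdtemp()) / "demo-venv"   # illustrative path
venv.EnvBuilder(with_pip=False).create(target)

# Ask the venv's own interpreter where its prefix is.
prefix = subprocess.run(
    [str(target / "bin" / "python"), "-c", "import sys; print(sys.prefix)"],
    capture_output=True, text=True, check=True,
).stdout.strip()

print(Path(prefix).resolve() == target.resolve())
```

A wheel-based deploy of the kind described above then pre-builds all dependencies as wheels on a build host, so the install into such a venv can run without network access, e.g. with `pip install --no-index --find-links <wheel-dir>`.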

Of course, having just moved Wikitech to HHVM, the main Wikimedia cluster is being upgraded from HHVM to PHP 7, and Wikitech will soon follow suit. The websites look the same, but the race never ends.

by Andrew (Andrew Bogott) at March 20, 2018 02:49 PM

Gerard Meijssen

#WeMissTurkey - the geography and organisation of the #Ottoman Empire

This map shows the development of the Ottoman Empire over time. Its accuracy may be disputed but it is among the best Wikimedia has to offer at this time.

This animated GIF is really good at what it does. With all the basic parts available, it becomes possible to expand on these maps. The Ottoman Empire was divided into "eyalets", and these were divided into "sanjaks". The size and the composition of these eyalets changed over time. An animation of these changes helps in understanding developments in, for instance, the Balkans.

Sanjaks are currently being added to Wikidata, and this proves to be not that straightforward. Most of them do not have an article in any language. The spelling of the same sanjak differs in places, and for some eyalets a modern interpretation is sought in order to lend some "legitimacy" to later developments; in one instance even the mention of the composite sanjaks is deliberately missing.

The governance of the Ottoman Empire was obviously organised along the lines of these eyalets and sanjaks. The eyalets were headed by "beylerbeys" and the sanjaks by "sanjak-beys". These offices were largely non-hereditary, and for quite some time they were held largely by people originating from the Balkans.

When you consider the administrative organisation of the Ottoman Empire, there is a list of all the Sultans and their Grand Viziers. For the successions of other important functionaries there is still a lot that can be done.

If you are willing to help, please do. Adding labels in other languages, particularly Turkish, will make a real difference. Adding missing humans to Wikidata and linking them into a succession of functionaries will help a lot. It enables the provision of lists, and they may be used in any language. And if you are able to hack maps, that would be really important; it is how all this information may come together.

by Gerard Meijssen (noreply@blogger.com) at March 20, 2018 10:39 AM

March 19, 2018

Wiki Education Foundation

When students write Wikipedia articles about law

Millions of users visit Wikipedia every month, looking for information on a wide variety of topics. Many of those users then take that information into account when making political and behavioral decisions. That’s why Wiki Education is committed to improving Wikipedia’s coverage of topics relevant to an informed citizenry. Our Future of Facts initiative encourages participants in our programs to improve articles in subject areas like public policy, political science, history, environmental science, sociology, and law.

John Kleefeld’s students at University of Saskatchewan improved and created Wikipedia articles about law topics. By incorporating a Wikipedia assignment into his course, Kleefeld encouraged students to analyze “legal judgments from multiple perspectives—e.g., literary, historical, sociological, political, or jurisprudential”. Teaching with Wikipedia not only engages students in an enriching, new project, but also improves a resource that millions use.

Students worked on a number of articles including the British Columbia Civil Liberties Association, a list of court cases involving the association, and judicial review in Canada.

The British Columbia Civil Liberties Association formed in 1962 to protect and extend civil liberties and human rights. It is a non-partisan charitable society that uses a variety of means to achieve its mission, such as litigation, lobbying, events, and publications. Its lawyers and pro bono counsel work at all levels of Canadian courts, and the association has participated in a number of court cases. Thanks to the student who improved this article, you can find a list of these court cases in a new article that didn’t exist on Wikipedia before this student created it.

Wikipedia’s article about judicial review in Canada also didn’t exist before a student in Kleefeld’s class wrote it. The article describes the intricacies of the judicial review process in Canada, a process which allows individuals to challenge governmental actions and decisions as a protection against abuse of power. The article outlines the basic principles of the concept and key legislation that determines jurisdiction in the application of review. The article also outlines the controversy and differences in opinion surrounding this part of Canadian law.

Students are uniquely positioned to improve Wikipedia as a course assignment. Through their institution, they have access to peer-reviewed research which is normally restricted from the public behind paywalls. Contributing this information to the most accessed online encyclopedia in the world provides Wikipedia’s readers with knowledge to which they may not otherwise have had access. Students become knowledge producers, instead of merely consumers, and they walk away from the assignment with greater skills to identify trustworthy information in their everyday lives.

To learn about how you and your students can get involved, visit teach.wikiedu.org or reach out to contact@wikiedu.org.

Header image: File:BCCLA at Vancouver Pride Festival.jpg, Guilhem Vellut, CC BY 2.0, via Wikimedia Commons.

by Cassidy Villeneuve at March 19, 2018 04:11 PM

Wikimedia Foundation

Have you ever read Wikipedia’s article on Lady Gaga?

Screenshot from SMP Entertainment via Vimeo, CC BY 3.0.

Lady Gaga. Hrithik Roshan. Sonam Kapoor. Taylor Swift. Bradley Cooper. Jennifer Lawrence.

All are household names; they’re superstars of Hollywood, Bollywood, or the music industry. Combined, tens of millions of people read the English-language Wikipedia articles about them last year — articles that were authored or co-authored by the same person.

Meet Wikipedia user “FrB.TG,” also known as Frankie. A student in Germany, Frankie describes himself as having an interest in “contemporary music and cinema”—and it certainly shows in his long history of crafting Wikipedia’s articles about celebrities. Swift, Roshan, and the others are only the proverbial tip of the iceberg.


One thing many people do not know about Wikipedia is its tiered series of quality review processes. At the bottom end of the scale are “stubs,” extremely short articles that have little information about the subject. At the top are “featured” articles and lists, which have undergone a peer review process from fellow editors and are “considered to be some of the best articles Wikipedia has to offer.”

Only about five thousand articles, roughly one out of every thousand articles on the entire encyclopedia, have reached this status. Frankie has nine, plus an additional thirty-four featured lists. (They are counted separately).

Frankie draws his subjects from the media he consumes. “Like most typical youngsters,” he tells me, “I frequently listen to modern music and watch contemporary films.”

It doesn’t take much for an artist to grab his attention—but once they do, Frankie starts researching. He starts by watching and/or listening through much of what they’ve produced, then doing more serious research into their backstories and personal lives. Frankie only comes to Wikipedia to add what he’s learned after he feels this research is complete, which allows him to include citations to reliable sources to reassure readers that the information they are reading is verifiable.

This can be more difficult than you’d think. “Modern artists are always challenging to write about [because of a] lack of literature and scholarly sources,” he says. “It is difficult to write a comprehensive biography of them as their success is so recent. Sometimes it [can be] hard to distinguish the reliable sources from the nonsense printed about them, which makes it hard to get at the truth.”

Of all the people Frankie has written about, one stands out as particularly special: Lady Gaga. Frankie worked closely with two other editors to rewrite her Wikipedia biography over a period of two years. It was the final piece in a six-article series about her, of which he had a hand in five. “Of all the actors/singers I have written about,” he says, “Gaga is my absolute favorite.” He continues:

I like mostly everything about her—her outrageous style, her voice, her performances and, of course, her music, although her latest stuff is … not as fascinating. Her article was very challenging to write. When I took it up, it was in half-decent shape, unbalanced between over-detailing and glaring omission of some very important information. I had the challenge of both trimming and expanding it (thankfully I had two other editors to help me). Unlike any of my other work, it went through two peer reviews, several rewrites, months of work, and help from many other editors to become what it is now—a featured article! In the end, it was all worth it. It will always be among my best achievements on Wikipedia.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

You can also read this post on our Medium publication.

by Ed Erhart at March 19, 2018 03:48 PM

Sumana Harihareswara

Here Are Some Grants You Could Apply For

When I tell people about grants they could get to help them work on open source/open culture stuff, sometimes they are surprised because they didn't know such grants existed. Here are some of them!

Grants with deadlines:

  • Urgent: August 1st is the deadline for the Knight Prototype grant which "helps media makers, technologists and tinkerers take ideas from concept to demo. With grants of $35,000, innovators are given six months to research, test core assumptions and iterate before building out an entire project."
  • Also coming up fast: August 4th is your deadline to apply for the Open Society Fellowship, which gives you about USD$80,000-100,000 to work on a project for a year.
  • September 30th is the deadline for Individual Engagement Grants applications. IEG projects "support Wikimedians to complete projects that benefit the Wikimedia movement. Our focus is on experimentation for online impact. We fund individuals or small teams to organize, build, create, research or facilitate something that enhances the work of Wikimedia's volunteers." The maximum grant request is USD$30,000.
  • If you're a woman working on a tech project that will benefit girls and women in tech, check out The Anita Borg Systers Pass-It-On (PIO) Awards, which range from USD$500-$1000. The next round opens for applications on August 6th.
  • It looks like November 2014 is the deadline to apply for the Drupal Community Cultivation Grants: "to support current and future organizers and leaders of DrupalCamps, Drupal Meetups, Drupal Sprints, Drupal coalitions, and other creative projects that are spreading information within the Drupal community and educating individuals outside the community about Drupal... Grant awards will range from several hundred to several thousand dollars per project".
Grants that you can apply for anytime:

This partially overlaps with the list that OpenHatch maintains on its wiki (and which I or someone else ought to update), and I have not even scratched the surface really. So anyway, yes, if you need some financial help to do better or more work in open stuff, take a look!

March 19, 2018 11:25 AM

Tech News

Tech News issue #12, 2018 (March 19, 2018)


March 19, 2018 12:00 AM

March 18, 2018

Megha Sharma

Outreachy Chapter 6: Towards the end

It still feels like yesterday when I was jumping on seeing my name in the list of selected interns. Yes, these 3 months have passed quickly, rather too quickly.

Today when I’m writing my last blog of the series, I’m both nostalgic and happy. Nostalgic — on collecting all the moments by traveling down the memory lane; Happy — because I was able to pull it off successfully!

When my mentors looked at my implementation plan for the first time, they were of the opinion that it wouldn’t fit in, simply because it was too much! And as it became more and more complicated over time, I joined that opinion-group too.

But one thing that kept me going was the dream of seeing it live. I always dreamt of the proud feeling I’d get when users started using it. It kept me going even at 2 am, or when I had already worked for 12 hours but still had one bug left to resolve. And see, 3 months later, I’m here with the tool complete! Yes, it works! And it is beautiful. (This much hard work deserves a little bit of dramatization :P.)

Reflecting on the journey, I realize how this internship has transformed me from a reckless college student into a super-patient developer. Every bug has taught me to ‘stay calm and code’; every complex problem has taught me how important optimization is; and every meeting has taught me that a way always exists.

Hence, it won’t be wrong to say that this journey has been a hell of a learning experience for me, from both technical and non-technical perspectives.

So today with a heavy heart, I formally say goodbye to Outreachy Round 15 on a good note.

Wait, it doesn’t mean I won’t be writing anymore. There are a lot more blogs to come. So, do stay tuned! :)

by Megha Sharma at March 18, 2018 07:05 PM

Weekly OSM

weeklyOSM 399


Tactile map

Example of a tactile map from TouchMapper 1 | Image from TouchMapper

About us

  • The Czech weeklyOSM team decided to move (automatic translation) all content out of the OSM wiki, due to continued editing conflicts.


  • At FOSS4GUK, the British bird protection society, RSPB, presented their work on using drones to monitor bird reserves (abstract). Some of the collected data was added to OSM.
  • Wiki fiddlers are changing the definition of the aeroway=runway tag.
  • How would you map a house number like 9 3/4? The Tagging mailing list weighs the use of Unicode ( ½ ) versus ASCII (1/2) and also discusses how other map providers handle this. Mappers from different regions pitch in with their local experience in mapping such fractions.
  • Matthijs Melissen starts a thread on the tagging mailing list about the upcoming change in the rendering of boundaries announced by the developers of OSM Carto, the default stylesheet of osm.org. The discussion that follows tries to distinguish tagging issues (how to accurately tag different features) from mere tagging-for-the-renderer. Most participants agree that the information for the accurate rendering of boundaries worldwide is already there, and that the renderer can aggregate them on its own instead of asking for duplication of information.
  • Gregory Marler has started making OSM Diaries, talking to a camera when he goes out mapping. They’re not intended to be tutorials, but a casual glance at what one mapper notices or how he uses different tools. Hopefully, they’re intriguing or inspirational, and will continue if popular.


  • Bryan Housel requests support on Twitter to generate an index of local resources for the various OSM communities, in order to help new mappers find relevant information easily.
  • Harry Wood reports about London OpenStreetMap Q&A events, a new format that is becoming quite successful. The events feature a few presentations and plenty of time for chats among the participants.
  • Heather Leson summarizes the recently held discussion on ‘OSM and Gender’ and some of the ways in which it could be taken forward.
  • Nathalie Sidibe is the newest Mapper of the Month! She shares her experience of mapping places in Mali with minimal internet access and how OSM empowers her activism.
  • Selene Yang examined the gender ratio among the nominees and winners of OSM Awards and invited the community to continue work in growing more diverse and inclusive. The answers to her diary entry try to find the possible causes of women’s under-representation in OSM, and most agree that removing these barriers would benefit the whole community.


  • Ilya Zverev would like to import 59,000 gas stations from NavAds’ database into OSM. Christoph Hormann points out that the database spans across several countries and therefore, integration of the respective local communities is necessary.

OpenStreetMap Foundation

  • The minutes of the Licensing Working Group meeting of March 8th are online. Topics included the Basic Data Protection Ordinance and the protection of minors.
  • The chat minutes of the Communications Working Group meeting of March 8th are online.


  • OSM Ireland will hold its first Annual General Meeting (AGM) on March 24th to set up a formal organization. So far, it has been a loose collective of like-minded people contributing to OSM for over a decade.

Humanitarian OSM

  • The University of Northern Colorado’s Geography and GIS Club maps out first responder routes in Bogota, Colombia to prepare for when disaster strikes.
  • On the occasion of International Women’s Day, HOT throws light on some of the gender equality initiatives within the HOT community – including those funded by Micro grants and Device grants. It also suggests ways in which one could support gender equality in HOT and in the OSM community.
  • Aerial imagery clearly shows that militias are burning down entire villages in DR Congo, causing people to flee and seek refuge in Uganda. It is unclear why the violence has suddenly flared up. Political motivations are suspected.
  • In Belgium, the third National Missing Maps Mapathon will be organised across 8 university campuses on Saturday, March 24th, by the National Committee of Geography, in collaboration with OpenStreetMap Belgium to improve the map in Northern Nigeria and enable MSF to fight the ongoing Meningitis C epidemic.


  • The Wheelmap now uses (de) parking data of Parkopedia, a commercial project.


  • The Openrouteservice for Disaster Management by HeiGIT now provides OSM updates more frequently. Africa, South America and South Asia are available as stable instances with hourly update intervals.


  • The application phase for participating in this year’s Google Summer of Code is now open. The OSM-Blog invites students who would like to participate in open source projects in the OSM environment.


  • Version 0.10 of the open source routing engine GraphHopper has been released on March 8th.

Did you know …

  • … the open source geo data search engine Geoseer? The query for “OpenStreetMap” returns this record.
  • … TouchMapper, a service that creates custom 3D maps?

OSM in the media

  • The Unna district in central North Rhine-Westphalia, Germany received recent aerial imagery, which OpenStreetMap is allowed to use to improve the map.

Other “geo” things

  • Google Maps for iOS now supports public transport routing for passengers in a wheelchair.
  • A co-founder shares an article about how Mapbox started.

Upcoming Events

Where What When Country
Lannion 3e concours de contributions OpenStreetMap 2018-03-01-2018-03-23 france
Lüneburg Lüneburger Mappertreffen 2018-03-20 germany
Cologne Bonn Airport Bonner Stammtisch 2018-03-20 germany
Nottingham Pub Meetup 2018-03-20 united kingdom
Karlsruhe Stammtisch 2018-03-21 germany
Toulouse Réunion mensuelle 2018-03-21 france
Cologne Bonn Airport FOSSGIS 2018 2018-03-21-2018-03-24 germany
Lübeck Lübecker Mappertreffen 2018-03-22 germany
Quezon City MAPAbabae: OpenStreetMap Workshop with Women and for Women 2018-03-22 philippines
Turin MERGE-it 2018 2018-03-23-2018-03-24 italy
8 university campuses Third National Mapathon 2018-03-24 belgium
Bremen Bremer Mappertreffen 2018-03-26 germany
Graz Stammtisch Graz 2018-03-26 austria
Rome Incontro mensile 2018-03-26 italy
Essen Mappertreffen 2018-03-27 germany
Dusseldorf Stammtisch 2018-03-28 germany
Osaka もくもくマッピング! #15 2018-03-28 japan
Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
Disneyland Paris Marne/Chessy Railway Station FOSS4G-fr 2018 2018-05-15-2018-05-17 france
Bordeaux State of the Map France 2018 2018-06-01-2018-06-03 france
Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy
Dar es Salaam FOSS4G 2018 2018-08-29-2018-08-31 tanzania
Bengaluru State of the Map Asia 2018 (effective date to confirm) 2018-11-17-2018-11-18 india

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, Polyglot, Rogehm, Spec80, SrrReal, TheFive, Tordanik, YoViajo, derFred, jinalfoflia, sev_osm.

by weeklyteam at March 18, 2018 07:09 AM

March 16, 2018

Wiki Education Foundation

Achieving student learning objectives with a Wikipedia assignment

The Stanford Graduate School of Education published a study in 2016 that found that young people have trouble when it comes to “civic online reasoning.” Researchers defined the phrase in terms of students’ ability to identify credible sources online, to distinguish advertisements from news articles, and to understand where information came from. The study reports that despite young peoples’ supposed social media savviness, they have difficulty with all of these aspects of informational and media literacy.

The researchers were primarily concerned by how rapidly misinformation about civic issues can spread and therefore threaten democracy. Being able to identify, and then disregard, misinformation is a valuable skillset for the everyday citizen in the age of fake news.

Instructors in our Classroom Program agree that these skills are important to their students’ education and everyday lives. In our survey of Fall 2017 instructors, 87% of respondents said a Wikipedia assignment is more effective for media and information literacy skills than a traditional assignment. Classroom Program participant Dr. Edward Benoit, III spoke to Wikipedia’s role in academia, declaring,

“Higher education aims to both prepare students for their careers, and develop well-rounded citizens in society. … Contemporary students must become proficient information users and creators to succeed in society. This includes understanding how resources like Wikipedia function in the world, dispelling myths of Wikipedia’s quality, learning about information access limitations, and more.”

Understanding where knowledge comes from, and who engages in that production and why, are all student learning outcomes with a Wikipedia assignment. Students understand the historical mechanisms that influence authorship. They are made aware of what is represented on the world’s most accessed online encyclopedia, and what topics or people are missing. They identify the gaps and make efforts to remedy them. “The exercise had students thinking more carefully about how knowledge is reported,” Dr. Nora Haenn reflected. And Dr. Kyra Gaunt remarked that justifying “notability and inclusion is one of the best skills members of marginalized groups should have in academia and in a post-truth world.”

Wiki Education’s Classroom Program provides instructors with the tools to teach with Wikipedia. In doing so, they meet a number of student learning objectives. In 2016, we conducted a study of what students learn when they improve Wikipedia as a course assignment. From this research, Dr. Zachary McDowell concluded that a Wikipedia assignment engages students in digital literacy and critical research; writing for a public audience; and collaboration. And they’re more motivated than in a traditional assignment! 81% of survey respondents agreed that the experience is more effective than a traditional assignment for online communication skills. And over half of respondents agree that a Wikipedia assignment is more effective for self-directed learning skills. Among these skills, instructors also cited others that a Wikipedia assignment fosters in students. Namely, students

  • explore the intricacies of academic integrity and plagiarism
  • communicate to diverse nonspecialist audiences
  • are excited and invested in their work, and have an increased sense of motivation and ownership
  • gain a better understanding of quality sourcing and its importance
  • gain skills to write for a wide and diverse audience
  • understand the difference between encyclopedic writing and theoretical/argumentative writing
  • have an impact beyond the assignment, which provides an outward “real world” focus to the application of course material.

“The more students write for Wikipedia, the more they understand its strengths and weaknesses,” longtime Classroom Program participant Dr. Joan Strassmann said. “They learn where knowledge comes from and how it is transmitted. They learn the power of evidence and gain suspicion when there is none. It should be the first but not the last place they look for information. Its importance cannot be overstated.”

Not only are students gaining important digital literacy skills that will equip them for civic conversations happening in the public sphere, but they can feel empowered in engaging in this way. To learn more about teaching with Wikipedia or to get you and your students involved, visit teach.wikiedu.org or send your questions to contact@wikiedu.org.

by Will Kent at March 16, 2018 04:06 PM

March 15, 2018

Neha Jha

The End of My Internship

My internship has officially ended but for me, this is not the end. It is just the beginning of my open source journey. I started contributing to Wikimedia around October 2017. Since then, I have fallen in love with the community. There are so many amazing people to learn from. From the beginning to the very end, everyone has been so motivating and helpful.

In the last week, I have tried to focus more on documentation. I learned the importance of documenting my code at the beginning of my internship.

Outreachy has given me the best three months of my life. It feels amazing to work on applications that will be used by thousands of users. During the internship, I had two mentors who helped me through every step. A huge shoutout to them. I really hope to keep working with them.

My next post will be about my thoughts about the application phase and the internship period.

by Neha Jha at March 15, 2018 07:35 PM

Wiki Education Foundation

Improving biographies of women on Wikipedia

Only about 17% of biographies on Wikipedia are about women. That’s a problem. If one of the most popular sources of information worldwide is not representative of all people, the millions of readers who look to Wikipedia every month aren’t getting the full picture of knowledge out there.

As part of Women’s History Month, we’re featuring work by students in Dr. Ariella Rotramel’s Feminist Theory course at Connecticut College. They improved and created Wikipedia articles on a variety of topics last fall, including the biographies of three women who, thanks to these students, now have much more comprehensive Wikipedia articles that detail their lives, careers, and achievements.

Beatrice Fox Auerbach was president and director of department store G. Fox & Co., a labor reform pioneer, philanthropist, and educator. During her position as an executive at G. Fox & Co. from 1938 to 1959, she fought for a 40-hour work week, retirement program, and infrastructure that would support women in business and management. She sold G. Fox & Co. in 1965, and famously declared, “One thing you can be certain of is that I won’t be spending it on yachts and horses, but for the benefit of the people.” She founded the Beatrice Fox Auerbach Foundation, an organization that made contributions to local hospitals as well as educational programs.

Rosemary Park was a scholar, advocate for women’s education, and a President of two colleges. She was President at Connecticut College from 1947 to 1962. And she was President at Barnard College, where she focused on reforming the curriculum and encouraging women to pursue subjects in the sciences, believing them to be as capable of success as their male counterparts. She was the first woman to be President at both colleges. In 1967, Park became vice-chancellor of UCLA, where her husband was a professor. She was also an active Board Member, advisor, director, and trustee for a number of organizations throughout her life, such as The Mystic Oral School for the Deaf, the Connecticut Arboretum, and the American College for Girls. Her Wikipedia article exists thanks to a student in this Fall 2017 course.

Jewel Plummer Cobb was a biologist, cancer researcher, professor, dean, and administrator. She advocated for better representation of women of color in academia, and particularly in the sciences. Throughout her career, she established resources and programs that supported students and faculty of color, including scholarship programs for minority students. Cobb’s research focused on melanoma in humans and mice and she received a number of grants for her research in cell growth. She has been recognized for her contributions in the sciences as well as the programs she supported at both Connecticut College and California State University, Fullerton.

When students improve Wikipedia as a classroom assignment, they not only gain valuable skills, but their work benefits public knowledge about previously underrepresented topics. If you’d like to know how you can get involved, visit our informational page. Also, read Dr. Ariella Rotramel’s own reflections on this Fall 2017 course here.

Header image: File:A student in Connecticut College’s GWS 306 stands in front of her poster on Jewel Plummer Cobb.jpg, Alphareductaze, CC BY-SA 4.0, via Wikimedia Commons.

by Cassidy Villeneuve at March 15, 2018 04:21 PM

Wikimedia Foundation

Why the world reads Wikipedia: What we learned about reader motivation from a recent research study

Photo by Michael Mandiberg, CC BY-SA 4.0.

“لماذا تقرأ هذه المقالة اليوم؟”, “আপনি কেন এ নিবন্ধটি আজ পড়ছেন?”, “為什麼你今天會讀這篇條目?”, “Waarom lees je dit artikel vandaag?”, “Why are you reading this article today?”, “Warum lesen Sie diesen Artikel gerade?”, “?למה אתה קורא את הערך הזה היום”, “यह लेख आज आप क्यों पढ़ रहे हैं?”, “Miért olvasod most ezt a szócikket?”, “あなたは今日何のためにこの項目を読んでいますか?”, “De ce citiți acest articol anume astăzi?”, “Почему вы читаете эту статью сегодня?”, “Por qué estas leyendo este artículo hoy?”, “Чому Ви читаєте цю статтю сьогодні?”

This is the question we posed to a sample of Wikipedia readers across 14 languages (Arabic, Bengali, Chinese, Dutch, English, German, Hebrew, Hindi, Hungarian, Japanese, Romanian, Russian, Spanish, Ukrainian) in June 2017[1] with two goals in mind: to gain a deeper understanding of our readers’ needs, motivations, and characteristics across Wikipedia languages, and to verify the robustness of the results we observed in English Wikipedia in 2016. With the help of Wikipedia volunteers, we collected more than 215,000 responses during this follow-up study, and in this blog post, we will share with you what we learned through the first phase of data analysis.

First, why is understanding readers’ needs important?

Every second, 6,000 people view Wikipedia pages from across the globe. Wikipedia serves a broad range of the daily information needs of these readers. Despite this, we know very little about the motivations and needs of this diverse user group: why they come to Wikipedia, how they consume its content, and how they learn. Knowing more about how this group uses the site allows us to ensure that we’re meeting their needs and developing products and services that help support our mission.

Why didn’t we address this question earlier?

It’s incredibly hard to collect this kind of data at scale, and we first had to build the capacity to do so. Over the past several years, we have laid the foundation for this kind of research. Starting in 2015, the Wikimedia Analytics team made the storage and analysis of webrequest logs possible. These logs, which are stored for 90 days, provide an opportunity for deeper analyses of reader behavior. However, analyzing actions can be difficult on a site at Wikipedia’s scale: every second, we can easily receive 150,000 requests from readers loading pages. Without knowing what questions we want to answer or which reader characteristics we are interested in, analyzing webrequest logs resembles searching for a needle in a haystack. The key to our puzzle came in 2015 with the arrival of QuickSurveys, the Wikimedia Foundation’s microsurvey tool, which lets us interact directly with people using Wikipedia. For this study, we combined qualitative user surveys (via QuickSurveys) with quantitative data analysis (via webrequest logs) to make sense of our readers’ needs and characteristics.
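As a rough sketch of how survey answers and request logs can be combined: responses and log rows can be linked through a shared session token, and behavioral measures (here, simply pageview counts) aggregated per reported use case. All field names and values below are hypothetical illustrations, not the study’s actual schema or analysis.

```python
# Hypothetical join of microsurvey responses with webrequest log rows
# via a shared session token. Field names (token, answer, url) are
# illustrative only.
from collections import defaultdict

survey_responses = [
    {"token": "a1", "answer": "in-depth"},
    {"token": "b2", "answer": "fact lookup"},
]
webrequests = [
    {"token": "a1", "url": "/wiki/Lighthouse"},
    {"token": "a1", "url": "/wiki/Optics"},
    {"token": "b2", "url": "/wiki/Boiling_point"},
]

# Count pageviews per survey token.
views = defaultdict(int)
for req in webrequests:
    views[req["token"]] += 1

# Average pageviews per reported use case: use case -> [views, respondents].
totals = defaultdict(lambda: [0, 0])
for resp in survey_responses:
    t = totals[resp["answer"]]
    t[0] += views[resp["token"]]
    t[1] += 1

avg_views = {k: v[0] / v[1] for k, v in totals.items()}
print(avg_views)  # {'in-depth': 2.0, 'fact lookup': 1.0}
```

The same pattern extends to any behavioral feature (session length, time of day, referrer class) once the join key is in place.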

What we learned

In 2016, we built the first taxonomy of Wikipedia readers, quantified the prevalence of various use cases for English Wikipedia, and gained a deeper understanding of readers’ behavioral patterns associated with every use case. (The details of the methodology are described in our peer-reviewed publication on this topic.) A year later, when we replicated this study and extended it to other language editions, we put the same survey questions from 2016 in front of readers across 14 languages.  More specifically, we asked readers about

  1. Their information needs (Were they looking up a specific fact or trying to get a quick answer? Getting an overview of the topic? Or getting an in-depth understanding of the topic?),
  2. Their familiarity with the topic (Were they familiar or unfamiliar with the topic they were reading about?), and
  3. The source of motivation for visiting the specific Wikipedia article on which they were shown the survey (Was it a personal decision, or inspired by media, a conversation, a current event, a work or school-related assignment, or something else?).

Below is what we have learned so far. (Note that all the results below are debiased based on the method described in the appendix of our earlier research to correct for various forms of possible representation bias in the pool of survey respondents.)
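To illustrate the idea behind such debiasing (a toy sketch, not the authors’ exact method): one standard correction is inverse propensity weighting, where each response counts for 1/p, with p the estimated probability that a reader with those characteristics responds to the survey at all. The propensity values below are invented for illustration.

```python
# Toy inverse-propensity weighting of survey answers. Propensities are
# made-up numbers; the paper's appendix describes the actual estimation
# from behavioral features.
responses = [
    ("fact lookup", 0.8),   # (answer, estimated response propensity)
    ("fact lookup", 0.8),
    ("overview", 0.4),
    ("in-depth", 0.2),
]

weighted = {}
for answer, propensity in responses:
    # A reader with response probability p stands in for 1/p readers.
    weighted[answer] = weighted.get(answer, 0.0) + 1.0 / propensity

total = sum(weighted.values())
shares = {answer: w / total for answer, w in weighted.items()}
# Raw counts would say 50% fact lookup; after weighting, the rarely
# responding in-depth readers count for more of the estimated share.
```

The effect is that groups who rarely answer surveys are no longer underrepresented in the reported percentages.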

Information needs

“I am reading this article to (pick one) [look up a specific fact or get a quick answer, get an overview of the topic, or get an in-depth understanding of the topic]”

The charts below summarize users’ information need across the 14 languages we studied.[2]

From these graphs, we see that on average around 35 percent of Wikipedia users across these languages come to Wikipedia to look up a specific fact, 33 percent come for an overview or summary of a topic, and 32 percent come to read about a topic in depth. There are important exceptions to this general observation that require further investigation: Hindi’s fact lookup and overview reading are the lowest among all languages (at 20 percent and 10 percent, respectively), while in-depth reading is the highest (almost 70 percent). It is also interesting to note that Hebrew Wikipedia readers have the highest rate of overview reading (almost 50 percent).


“Prior to visiting this article (pick one) [I was already familiar with the topic, I was not familiar with the topic and I am learning about it for the first time]”

We repeat the same kind of plot as above, but now for the question that asked respondents how familiar they were with the article on which the survey popped up.

The average familiarity with the topic of the article in question is 55 percent across all languages. Bengali and Chinese Wikipedia users report much lower familiarity (almost 40 percent), while Dutch, Hungarian, and Ukrainian users report very high familiarity (over 65 percent). Further research is needed to understand whether these are fundamental differences between the reader behavior in these languages or whether such differences are the result of cultural differences in self-reporting.


“I am reading this article because (select all that apply) [I have a work or school-related assignment, I need to make a personal decision based on this topic (e.g. to buy a book, choose a travel destination), I want to know more about a current event (e.g. a soccer game, a recent earthquake, somebody’s death), the topic was referenced in a piece of media (e.g. TV, radio, article, film, book), the topic came up in a conversation, I am bored or randomly exploring Wikipedia for fun, this topic is important to me and I want to learn more about it (e.g., to learn about a culture), Other.]”

Finally, we look at the sources of motivation leading users to view articles on Wikipedia.

These are the results:

Among the seven motivations the users could choose from, intrinsic learning is reported as the highest motivator for readers, followed by wanting to know more about a topic that they had seen from media sources (books, movies, radio programs, etc.) as well as conversations. There are some exceptions: In Spanish, intrinsic learning is followed by coming to an article because of a work or school assignment; in Bengali by conversation and current event. Hindi has the lowest motivation by media score (10%), while Bengali has the highest motivation by intrinsic learning.

What can we conclude from this research?

We still need time to further analyze the data to understand and compare the behavior of users based on the responses above. We encourage careful examination of the above results, avoiding conclusions that the analysis may not support. Based on the above results we can confidently say a few things:

  • Results of the survey for English Wikipedia are consistent with the 2016 results. (phew!) This could mean results for the other languages in the 2017 survey are also consistent over time. Follow-up studies will be needed to validate this.
  • On average, 32 percent of Wikipedia readers come to the project for in-depth reading, 35 percent come for intrinsic learning, and both numbers can be as high as 55 percent for some languages. Wikipedia is a unique place on the Internet where people can dive into content and read and learn, just for the purpose of learning and without interruption. It is important for further content and product development to cherish this motivator and acknowledge the needs of the users to learn for the sake of learning.
  • Media such as books, radio, movies, and TV programming play an important role in bringing readers to Wikipedia.
  • We do see major differences in information needs and motivations across languages, especially in the case of Hindi readers. Further research is needed to understand and explain such differences.
  • The differences in reported numbers for familiarity with the content in Dutch, Hungarian, and Ukrainian Wikipedias can speak to fundamental differences in reader needs and behavior in these languages or cultural differences in self-reporting on familiarity. Further research is needed to shed light on these differences.

What’s next, and how can you help?

We have started the second phase of analysis for some of the languages. If you observe interesting patterns in the data in this blog post that you think we should be aware of and look into, please call it out. If you have hypotheses for some of the patterns we see, please call them out. While we may not be able to test every hypothesis or make sense of every pattern observed, the more eyes we have on the data, the easier it is for us to make sense of it. We hope to be able to write to you about this second phase of analysis in the near future. In the meantime, keep calm and read on!

Florian Lemmerich, RWTH Aachen University and GESIS – Leibniz Institute for the Social Sciences
Bob West, École polytechnique fédérale de Lausanne (EPFL) and Research Fellow, Wikimedia Foundation
Leila Zia, Senior Research Scientist, Wikimedia Foundation



This research is a result of an enormous effort by Wikipedia volunteers, researchers, and engineers to translate, verify, collect, and analyze data that can help us understand the people behind Wikipedia pageviews and their needs. We would like to especially thank the Wikipedia volunteers who have acted as our points of contacts for this project and helped us with the translation of the survey to their languages, going through the verification steps with us, and keeping their communities informed about this research.



[1] The choice of languages for this study is the result of the following considerations: We ideally wanted at least one language from each of the major language families, and we wanted languages whose communities welcomed this study and supported us. We chose the following languages: Arabic (right-to-left, covering large parts of the Middle East and North Africa), Dutch (at Wikimedia Netherlands’ request), English (to repeat the results from the initial survey), Hindi (at the New Readers team’s request), Japanese (one of the CJKV languages, about which we know very little despite the high traffic the language brings to Wikimedia projects), and Spanish (a Romance language which helps us understand South American users and their needs to a good extent). All the other languages were added after at least one person from their community responded with interest to our call on the Wikimedia-l mailing list.

[2] The language codes used in the plots are as follows: ar (Arabic), bn (Bengali) , zh (Chinese), nl (Dutch), en (English), de (German), he (Hebrew), hi (Hindi), hu (Hungarian), ja (Japanese), ro (Romanian), ru (Russian), es (Spanish), uk (Ukrainian).

by Florian Lemmerich, Bob West and Leila Zia at March 15, 2018 04:06 PM

Gerard Meijssen

#Wikipedia - throwing the baby out with the bath water

Dear Asaf; there are no pet peeves. There is only my wish for us to be the best we can.

When YouTube is to use Wikipedia to give a background to its offerings, there will be a lot where Wikipedia falls short. We do not offer information on May Ying Welsh, for instance. We do not know about the Pardes Humanitarian Prize, and do we report on the current Dalit protests in Maharashtra?

It is not a peeve when I notice how many errors can be found in Wikipedia, particularly in lists, and people do not concentrate on the differences between what Wikipedia knows and what is known elsewhere. This is particularly sad because time invested in curating these differences is well spent, and it is imho the most effective defence against fake news and fake facts.

When my question is "will YouTube use more than just English", you know as well as I do that English Wikipedia is less than 50% of what our audience read. When the other half does not deserve consideration, it is more than a peeve. It is in these other languages where the danger of fake news is even worse.

Basic facts in any NPOV article are the same in any language. Where they differ is where you can expect misinformation. With curated basic information available, it is possible to use natural language technology to provide at least some basic information. You have expressed that this is not something for the Wikimedia Foundation to be interested in (Cebuano remember?).

Asaf; you may hold the keys to what I post on the Wikimedia mailing list and you may privately consider me problematic. However, it is your excess in public ridicule and lack of arguments that is a disservice to what we aim to achieve; it is why we face off. In this you represent an attitude that will not see us provide the best we can offer in a changing landscape, where we now have an opportunity to become relevant in debunking the worst of what YouTube has to offer.

by Gerard Meijssen (noreply@blogger.com) at March 15, 2018 10:06 AM

Wikimedia UK

Introducing Wikimedia UK’s new Programme Coordinator

Hannah (far left) in India with ‘FarmHer’ cards to give thanks to women farmers on International Women’s Day 2017.

By Hannah Evans, Programme Coordinator

Hello out there Wikimedia Community…

I’m the new Programme Coordinator for the coming year, working alongside our three other coordinators across the UK. I’ll be concentrating on our Wikimedian in Residence programmes as well as our educational outreach projects. Both of these areas fill me with great anticipation, and frankly, after a week of intense familiarisation (reading lots of articles and Wikimedia documents) with the Wikimedia movement – I’m eager to get cracking and join in the mission to improve the sum of all human knowledge…

A little introduction of where I’ve come from… I have worked for the past few years focusing on youth and community engagement in solving social justice issues locally and globally. I have done this at the grass roots with a youth charity in Rajasthan, working in network engagement for the campaigns team of a youth-led International Development agency, and as an eager activist with Amnesty International throughout my student years, focusing on active participation of students in building campaigns.

More recently, through my time in India and working in youth-focused programmes, I have been enthralled witnessing the empowering nature of Non-Formal Education. I believe the Wikimedia movement is a great demonstration of this as an informal educational platform, accessible in different formats, available to all.

Through my work in programmes at Wikimedia UK, I’m looking forward to continuing Wikimedia UK’s drive towards greater accessibility and diversity on Wikimedia platforms, and I’m especially looking forward to getting to know the people behind these initiatives and working closely together in the future.

I’m very open to ideas and just saying hello to people in the community – so do introduce yourself – and we’ll see where this year takes us.

To an open 2018,


by John Lubbock at March 15, 2018 08:45 AM

March 14, 2018

Wikimedia Tech Blog

How we’re using machine learning to visually enrich Wikidata

An algorithmic image analysis recommends this image for Pseudocordylus melanotus—the common crag lizard. Photo by Amada44, CC BY 3.0.

Wikidata is a multilingual project by design. The project allows contributors to add structured knowledge in every human language, and acts as a central repository of structured data for Wikipedia and its sister projects. As powerful tools to share knowledge without language barriers, images are very important within Wikidata.

Images can also help illustrate the content of an item in a language-agnostic way to external data consumers. However, a large proportion of Wikidata items lack images: for example, as of today, more than 3.6 million Wikidata items are about humans but only 17 percent of them have an image. More generally, only 2.5 million of 45 million Wikidata items have an image attached.

We recently started a research project to help people find relevant images to add to Wikidata items. The project uses algorithmic image analysis and the richness of linked open data to discover and recommend relevant, high-quality, free-licensed pictures for Wikidata items that don’t already have an image attached.

The number of images added to wiki projects since the beginning of 2015. Graph by Miriam Redi, data collected by Magnus Manske, CC0.

The number of Wikidata items is rapidly growing. There are now 2.5 million images in Wikidata, more than on English Wikipedia. In the past three years, the number of images contributed to Wikidata has grown at a much faster rate than in its sister projects.

While the volume of visual knowledge in Wikidata is now large relative to other projects, these 2.5 million images represent a tiny fraction of the material needed to visually represent all entries in a collaborative knowledge base. About 95 percent of Wikidata items currently lack an image statement. Although some types of entries—such as bibliographic items—don’t require an image, many do. In categories of items like ‘people’, only 17 percent of items have images. The same is true for ‘species’, where only 8 percent of entries have images. Many of these would benefit from a high-quality, relevant image.


Finding the right images is expensive!

Adding images to Wikidata items can be a tedious process. Editors adding visual contributions might have to search for the right picture among various repositories of free-licensed images. Our aim with this research project is to help make it easier for editors to find an appropriate image.

What we did

We designed an algorithm to automatically discover and recommend potentially relevant and high-quality images for pictureless Wikidata items. This consists of two simple steps:

  1. Relevant image discovery: First, given a Wikidata entry without an image, we search Wikipedia and its sister projects for potentially relevant image candidates. We retrieve all images in pages linked to the item, and we also pull all images returned by querying Wikimedia Commons with the item label. We then exclude from this set all images whose title does not match the item label (e.g., we would retain Mont Blanc and Dome du Gouter.jpg for the Mont Blanc entry). In the future, we are planning to design more complex algorithms to measure the relevance of an image to a Wikidata item (i.e., the extent to which the image depicts the item).
  2. Quality image ranking: To find the ‘best’ pictures among those discovered in the first step, we rank them according to their intrinsic photographic quality. To do so, we first need to score images in terms of photographic quality. We do this automatically, using recent computer vision techniques: we train a classifier, i.e. a convolutional neural network (CNN), to distinguish between high- and low-quality images. More specifically, we provide the classifier with examples of Quality images from Commons and random (lower-quality) Commons images. The CNN automatically learns from the image pixels how to classify quality images (More info about the model). On average, our model correctly says whether an image is high quality around 78 percent of the time.
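As a rough illustration, the two steps above can be sketched in a few lines of Python. The function names, inputs, and scores here are hypothetical; the actual pipeline queries live wiki pages, Commons search, and a trained CNN quality classifier.

```python
# Hypothetical sketch of the two-step pipeline: candidate discovery by
# label matching, then ranking by a (here faked) CNN quality score.

def discover_candidates(item_label, linked_page_images, commons_results):
    """Step 1: keep only images whose file title mentions the item label."""
    candidates = set(linked_page_images) | set(commons_results)
    label = item_label.lower()
    return sorted(img for img in candidates if label in img.lower())

def rank_by_quality(candidates, quality_score):
    """Step 2: order candidates by predicted photographic quality, best first."""
    return sorted(candidates, key=quality_score, reverse=True)

# Toy run using the Mont Blanc example from the text; the scores are made up.
images = discover_candidates(
    'Mont Blanc',
    ['Mont Blanc and Dome du Gouter.jpg', 'Chamonix valley.jpg'],
    ['Mont Blanc from Aiguille du Midi.jpg'],
)
fake_scores = {
    'Mont Blanc and Dome du Gouter.jpg': 0.9,
    'Mont Blanc from Aiguille du Midi.jpg': 0.7,
}
ranked = rank_by_quality(images, lambda img: fake_scores.get(img, 0.0))
print(ranked)  # best candidate first
```

Note that 'Chamonix valley.jpg' is dropped in step 1 because its title does not contain the item label, mirroring the filtering rule described above.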

Example of images for species entries ranked by quality. Screenshot, CC BY-SA 3.0. Individual image credits are available on Commons.

Some examples of species items without images, together with our candidate images ranked by quality, can be found on Meta-Wiki. While this project is currently in a pilot stage, we are planning to feed these image recommendations into existing tools for Wikidata visual enrichment, such as FIST and File Candidates.

Evaluation: good images are ranked in the top three

To get an idea of the effectiveness of our methodology for Wikidata visual enrichment, we performed an early evaluation based on historical data from Magnus’ Wikidata Distributed Game. This platform allows editors to choose the best image for a Wikidata item given a set of candidate images. We retrieved Distributed Game data for around 66K Wikidata items of various categories. For each item, we get the set of candidate images proposed, as well as the picture manually selected by the user. We then run our algorithm on these items: we discover relevant candidates and rank them by quality. We find that around 76% of the time, our algorithm ranks the manually chosen image in the top three.

This tells us that, using this algorithm, we may substantially reduce the search space for Wikidata visual enrichment. In most cases, we could filter out bad images and present editors with just three pictures to inspect for the visual enrichment of a Wikidata item.
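For concreteness, the top-three evaluation amounts to a simple hit-rate computation. The item IDs and rankings below are made up for illustration; the real study used the ~66K Distributed Game items described above.

```python
# Illustrative top-k hit rate: how often does the image an editor actually
# chose appear among the algorithm's k highest-ranked candidates?

def top_k_hit_rate(ranked_candidates, chosen, k=3):
    """Fraction of items whose manually chosen image appears in the top k."""
    hits = sum(1 for item, ranking in ranked_candidates.items()
               if chosen[item] in ranking[:k])
    return hits / len(ranked_candidates)

ranked_candidates = {
    'Q1': ['a.jpg', 'b.jpg', 'c.jpg', 'd.jpg'],  # chosen image ranked 3rd: hit
    'Q2': ['x.jpg', 'y.jpg', 'z.jpg', 'w.jpg'],  # chosen image ranked 4th: miss
}
chosen = {'Q1': 'c.jpg', 'Q2': 'w.jpg'}
print(top_k_hit_rate(ranked_candidates, chosen))  # 0.5
```

In the actual evaluation this figure came out at around 0.76, i.e. 76% of items had the manually chosen image in the top three.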

Beyond Commons: Flickr

The aim of this research is to find the best possible pictorial representation of a Wikidata item. While Wikimedia Commons is the largest repository of free-licensed images in the world, and many Commons files are extremely valuable pieces of content, other image repositories such as Flickr or Unsplash also contain high-quality free images. In a small-scale experiment based on image analysis, we discovered that only 0.1% of free Flickr images (of monuments) are already on Commons. In the future, we could leverage our technologies to discover and import high-quality free-licensed images from Flickr.

Beyond Wikidata: Wikipedia

For the pilot stage of this project, we focused on Wikidata as the main collaborative repository for structured data. In the future, we would like to build on existing techniques to help with the visual enrichment of Wikidata’s sister projects such as Wikipedia. Learning from existing data, we could discover high-quality images that are relevant to Wikipedia articles or sections of articles, and recommend them to editors willing to use more images for knowledge sharing.

How to get involved

Inspect and play with some of the recommendations for Wikidata items of people by checking out our labs pages on this (1, 2). And read more about this work on our Meta-Wiki page.

Miriam Redi, Research Scientist
Wikimedia Foundation

by Miriam Redi at March 14, 2018 10:42 PM

Wiki Education Foundation

Understanding course concepts in broader contexts

Dr. Jennifer Butler Modaff is an Associate Professor of Communication Studies at the University of Wisconsin, La Crosse. She taught with Wikipedia last Fall in her Organizational Communication courses. Here, she discusses her processes and take-aways from the term.

Dr. Jennifer Butler Modaff
Image: File:J Butler Modaff Headshot for Blog.jpg, by JButlerModaff, CC BY-SA 4.0, via Wikimedia Commons.

I first read about Wiki Education’s assignment support in the Spring 2017 term. I was intrigued by the possibility of using a Wikipedia assignment in my introductory-level organizational communication course. In that course, students receive a great deal of exposure to the theoretical aspects of the course concepts, but aren’t always as able to use or explain the concepts and theories in terms of anything but the formal definitions that they have been given. I hoped this assignment would remedy that problem while giving students a creative outlet for their ideas.

After completing the instructor orientation, though, I had serious reservations. While the modules were helpful, I worried that I wasn’t technologically savvy enough to engage the product on my own, let alone walk other students through it. I also worried that my entire semester project relied upon a platform and technology that was supported outside of my university. I wasn’t quite sure what I would do if the assignment failed due to technology, but I decided to rely on the Wiki Education name.

After introducing the assignment, I engaged my students in a discussion about how and why Wikipedia hasn’t traditionally been considered a credible source in academia. We then used a critical framework to discuss the idea of whom information has traditionally been available to, and who has been excluded from knowledge production and dissemination. This led to several very interesting discussions on why their participation in a Wikipedia project was important. These discussions helped differentiate this assignment from just another thing that they needed to complete in their college careers.

We also spent time discussing how to translate the skills they were going to learn into resume building skills. These early discussions helped to create a sense of buy-in that was invaluable in motivating the students to work through the struggles they experienced with the technology and the writing requirements of Wikipedia. Throughout the semester, I reminded students of the ‘real-world’ applications of the assignment in terms of working with new platforms, writing styles, and dealing with feedback from a variety of sources.

Finally, I made sure that I fully integrated the assignment into our class material so that it wasn’t just a group project outside of class. At the end of the semester, I had groups take the ideas that they had been writing about and find one teachable skill for which they could train the class. Learning to evaluate and edit Wikipedia truly became a semester long project that students had to think about and work on in a way that traditional papers and projects don’t necessarily lend themselves to. I was able to follow their progress through my course page on Wiki Education’s Dashboard. Throughout the whole project, I knew who was contributing and staying current on their trainings, which allowed me to maintain an active presence in the group’s work and any problems that they were encountering. I graded their training sessions using completion by due dates; students received either full credit or no credit. Although this was a small portion of their final grade, it did serve as motivation to keep them on track.

That first semester I set the assignment up per Wiki Education’s template. That worked, but students had quite a bit of useful feedback on how to improve the assignment. Their biggest idea was that, instead of finding random articles to practice their skills on, they practice all of the trainings on the article they were assigned for the semester. They felt this eliminated any sense of busy work and gave them the opportunity to watch their contributions to the articles have an impact over the semester. I am currently using this feedback and already have a sense that the final project is going to be stronger.

Students also suggested that they should have the opportunity to share ideas with me before putting them on article talk pages as they weren’t fully comfortable in the beginning with the idea that all of their work was publicly viewable. I added several planning documents to the assignment so that I was able to give feedback early in the assignment, although later in the semester they were more confident in their skills and topic and enjoyed the possibility of receiving feedback from individuals outside our classroom. I have formalized those planning documents into steps that I am now referring to as minor, moderate, and major edits. Minor edits are those that address grammar and check to see if links are indeed live. For moderate edits, they work on polishing or eliminating unnecessary aspects of the current article, update resources, and work on their ideas for new writing. Then major edits are their actual written contributions to the Wikipedia article. This seems to break the assignment into more manageable chunks and as previously mentioned allows them to practice the skills they are learning in the trainings in a productive manner.

I absolutely will use the Wikipedia project in future classes although I do intend to continue modifying the assignment as I receive more feedback from students and topics evolve on Wikipedia. I can honestly say this has been as much of a learning experience for me as it has for my students.

Header image: File:Hoeshler Tower.jpg, by TheTrashMan, CC BY-SA 4.0, via Wikimedia Commons.

by Guest Contributor at March 14, 2018 04:49 PM

March 13, 2018

Wikimedia Foundation

James Heilman on expanding the reach of Wikipedia’s medical information

Photo by James Heilman, CC BY-SA 4.0.

Senior Program Manager Anne Gomez leads the New Readers initiative, where she works on ways to better understand barriers that prevent people around the world from accessing information online. One of her areas of interest is offline access, as she works with the New Readers team to improve the way people who have limited or infrequent access to the Internet can access free and open knowledge.

Over the coming months, Gomez will be interviewing people who work to remove access barriers for people across the world. In this installment, she interviews emergency room physician James Heilman, an active contributor to WikiProject Medicine and a member of the Wikimedia Foundation Board of Trustees. Heilman has also worked to get medical information into Internet-in-a-Box, a physical device that provides Wikipedia and other content in areas with no or limited internet access. His conversation with Gomez is below.

Gomez: We’ve talked about your goals of creating a production storefront for the Internet-in-a-Box with medical content. Could you describe your vision for our readers?

Heilman: The goal of the project is to allow anyone to package Wikipedia and other content for use in offline devices, such that people without internet access can browse these educational resources. Currently we have to balance the cost of storage against the amount of storage space that typical users need; keeping the devices inexpensive prevents us from providing all of Wikipedia. While we have been concentrating on medical content as of late, the idea for a “production storefront” is for all areas of knowledge — not just medicine.

Basically the device I envision is made up of modules of content. Different regions and use cases need different content in different languages. I would love to see a simple interface where people can go through the collection of available modules clicking on all the ones they wish, in order to build the exact content library and exact size they want. Then with another click, they could either have the image file delivered to them via the internet ready to be written to a microSD card or have a microSD card with that content mailed to them. Ideally such a platform would also allow people to contribute modules of content and allow feedback by the wider community on both the modules and collections present.

You’ve been working on WikiProject Medicine for a long time, which has been building great medical content in many languages. Tell me about what got you started with offline Wikipedia. When was it?

My first work in offline Wikipedia was in collaboration with Kiwix and the Swiss Wikimedia chapter in 2015. They had developed offline versions of Wikipedia, which for English ran to nearly 70 GB, simply too big to easily fit in a phone’s memory. We wanted to see whether an app closer to one gigabyte in size, geared specifically towards medicine, would be successful. The app we developed turned out to be very successful in terms of downloads and user engagement. We now have versions of the app in 10 languages, and these have been downloaded more than a quarter of a million times. Additionally, around 80% of the downloads are from the developing world.

The next question was how we could push this further. At Wikimania Italy in 2016, I met a few folks (specifically Tim Moody and Adam Holt) from Internet-in-a-Box. They showed me the system they were working on, which appeared perfect for getting health-related content out to the developing world. Within a few months, we had a prototype which cost about US$400 to build and weighed about a kilogram. Over the year the hardware improved such that the devices now weigh less than 25 grams and cost under US$40.

As you’ve started working in the offline space, what’s been the biggest surprise?

The biggest surprise has been the unmet potential that exists within this space. It is surprising that this technology is not already widespread as it appears to elegantly meet such an important need. It is surprising that only a handful of NGOs and basically no commercial entities exist within the space of packaging and providing offline access to educational resources. Many potential applications beyond that of medicine remain completely unserved.

What’s different about working for offline products as opposed to online content? What’s the same?

We are starting from the same content base. The efforts are simply to expand the reach of what we have already spent years creating. What is new is that we are moving into the hardware space, which comes with all the complexities of assembling, selling, and shipping something. There is also the issue of getting updates out to those with already existing devices.

You’ve been involved with researchers from Columbia looking into field studies with the Internet-in-a-Box. How did that connection happen? What are you hoping to get from those efforts?

The partnerships with Columbia and Mount Sinai came about via Adam Holt from Internet-in-a-Box and Lane Rasberry, a fellow member at Wiki Project Med Foundation. Partnerships with institutions, I believe, are key for the success of these efforts. We not only need people to bring devices to where they are needed most but to study the impact that they have.

The Wikimedia community can collect the content and package it for distribution, but we need local partners to deploy it and administration professionals and researchers to be neutral in measuring the impact and efficacy. Without such research we are not going to be able to convince the wider public health community that this is an important development that they should engage with.

To date the Internet-in-a-Box has been deployed and studied in the Dominican Republic, Guatemala, and in Syrian refugee camps. Most recently, our collaboration with Columbia and Mount Sinai global health medical faculty has led to the deployment of Internet-in-a-Box for the Rohingya refugee crisis in Bangladesh.


What’s one thing the Wikimedia Foundation could do to help?

The Wikimedia movement’s funding of Kiwix has been very useful. This has been critical for getting videos to work within the offline apps. With respect to further efforts, I think getting offline functionality working within the Wikipedia app would be good. And then, of course, helping with raising awareness and maybe the selling of these devices via the Wikipedia store. I have built and shipped the first 100 devices from my home office, but this is not a scalable model, as you can imagine. Finally, coming back to my initial comments regarding a production storefront, this is an idea I would love to stay involved with, but simply do not have the technical abilities to turn into a reality.

Where do you learn more and share information about offline access? What resources exist for people who want to know more?

Anne Gomez, Senior Program Manager, Program Management
Wikimedia Foundation

by Anne Gomez at March 13, 2018 08:24 PM

Wiki Education Foundation

“Totally sold!” Why UTA instructors are excited to teach with Wikipedia

Last Friday, I hosted two online workshops for instructors and librarians at the University of Texas at Arlington who wanted to learn more about Wiki Education programs. At the end of my first presentation one instructor said they were “totally sold” on participating… here’s why!

Teaching with Wikipedia is a service learning assignment

When students work with their instructors and librarians to cite peer reviewed journal articles in their quest to update articles on Wikipedia, they are making pay-walled research available to the public. Most people in the world don’t have access to this knowledge – so the Wikipedia assignment acts as an opportunity for service learning, helping students understand how knowledge gatekeeping works, and giving them a direct path for what they can do to help rectify the problem.

Hosting a Visiting Scholar helps close content gaps on Wikipedia

When university libraries sign on for our Visiting Scholars program, they provide access to research for those most capable of making an impact on Wikipedia: long term Wikipedia contributors. And in 2018 as part of our Future of Facts initiative, we hope to work with universities and libraries to host Visiting Scholars in areas relevant to an informed citizenry. This means empowering Wikipedia’s strongest editors to close gaps in important subject areas like public policy, political science, law, history, environmental science, and sociology.

Each of these programs is powerful in its own way, and together, we are making a huge impact. Since 2010 Wiki Education has supported over 2,000 courses, with over 43,000 students updating 60,000 different articles. Together, they’ve added 40 million words to Wikipedia. That’s 138,000 pages, the equivalent of 26 volumes of a printed encyclopedia, shared with everyone, for free.

Despite being totally sold, attendees still had questions… They wanted to know how best to select topics for students to work on, how to grade student work, how to talk to students about using proper sources, and how they could track the output of a course or scholar. Luckily, the Wiki Education Dashboard provides support for all of this. We have orientations and resources for instructors around selecting articles, a rubric for grading, and more. We have trainings for students around selecting proper sources and using citations on Wikipedia, as well as subject-specific handouts for when an individual field might differ. And at the end of the day, the Dashboard can track all the work that a student, course, or scholar contributes to Wikipedia. Read more about the functionality of the Dashboard here.

If you are interested in hosting an online or in-person workshop for colleagues at your university, or if you just want to learn more about how to get involved, please reach out at contact@wikiedu.org.

Header image: OEWeek_Weald_20180309, by Michelle Reed, CC BY 2.0, via Flickr.

by Samantha Weald at March 13, 2018 04:23 PM

March 12, 2018

Wikimedia Tech Blog

Confound it!—Supporting languages with multiple writing systems

“Let us … confound their language, that they may not understand one another…” —from the biblical story of the Tower of Babel. Painting by Pieter Brueghel the Elder via the Google Art Project, public domain.

In this post, we’ll dip into examples of several multi-script languages, with a deeper dive into Serbian and Chinese, which have interestingly different needs. We’ll try to get a better sense of the complications that arise from supporting readers, editors, and searchers in multi-script languages, and briefly get to know some of the tools that help make it all possible. While the subject can be complicated and the tools are undoubtedly complex, handling multi-script languages well is an essential part of providing information to people in a form that they can readily use.

Making software engineers cry

Depending on your definition of “language” and your definition of “support”, the Wikimedia Foundation supports a bit shy of 300 languages across more than 800 projects. That’s a lot of languages, and the variation across those languages—and the complexity of supporting them—can be staggering.[1] It’s enough to strike fear in the hearts of software engineers everywhere.

Typical software engineer’s reaction when confronted with the complexity of human language. Image by Guillaume Benjamin Armand Duchenne via Wellcome Images, CC BY 4.0.

Human languages, however, don’t care one iota how hard they make software engineers’ lives, so in addition to the baffling variability between languages, there is often considerable variability within a language. English dialects seem able to proliferate without end, and there are many differences in words and phrases[2] (elevator vs lift) and spelling[3] (color vs colour) just between the standard American and British varieties. But at least we all use the same writing system.

Not so in other languages! Let’s take a look…

Serbian—Cyrillic and Latin

Serbian is one of the standard forms of the Bosnian-Croatian-Montenegrin-Serbian language. It can be written in either the Cyrillic or Latin alphabets. While having two scripts complicates matters, the correspondence between the Cyrillic and Latin alphabets is mercifully exact, which makes converting between the two relatively straightforward.
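Because the correspondence is exact, the Cyrillic-to-Latin direction can be captured by a simple character lookup. Here is a minimal sketch in Python; the table and function name are illustrative only, and MediaWiki's real language converter handles much more, including exceptions that must not be converted.

```python
# Minimal sketch of Serbian Cyrillic-to-Latin transliteration.
# Each Cyrillic letter maps to exactly one Latin letter or digraph.

CYR_TO_LAT = {
    'а': 'a', 'б': 'b', 'в': 'v', 'г': 'g', 'д': 'd', 'ђ': 'đ',
    'е': 'e', 'ж': 'ž', 'з': 'z', 'и': 'i', 'ј': 'j', 'к': 'k',
    'л': 'l', 'љ': 'lj', 'м': 'm', 'н': 'n', 'њ': 'nj', 'о': 'o',
    'п': 'p', 'р': 'r', 'с': 's', 'т': 't', 'ћ': 'ć', 'у': 'u',
    'ф': 'f', 'х': 'h', 'ц': 'c', 'ч': 'č', 'џ': 'dž', 'ш': 'š',
}

def sr_cyr_to_lat(text):
    out = []
    for ch in text:
        lat = CYR_TO_LAT.get(ch.lower())
        if lat is None:
            out.append(ch)  # pass non-Cyrillic characters through unchanged
        else:
            # Preserve capitalization; note Љ becomes the digraph Lj, not LJ.
            out.append(lat.capitalize() if ch.isupper() else lat)
    return ''.join(out)

print(sr_cyr_to_lat('Ћирилица'))   # Ćirilica
print(sr_cyr_to_lat('Изгубљени'))  # Izgubljeni
```

The Latin-to-Cyrillic direction is nearly as simple, except that the digraphs lj, nj, and dž must be matched before their single-letter components, which is one reason real converters work on longest-match rules rather than single characters.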

A highway sign outside Belgrade, Serbia, showing both Cyrillic and Latin script. Photo by Jeff Attaway, CC BY 2.0.


Enter language converter!

On the Serbian Wikipedia, most articles are written in Cyrillic, though some are written in Latin. If you haven’t set any language preferences on the Serbian Wikipedia, then where English Wikipedia has “Main Page | Talk” near the upper left of the page, Serbian Wikipedia has “Главна страна | Разговор | Ћир./lat.” Under “Ћир./lat.” you have three options: “Ћир./lat.”, “Ћирилица”, and “Latinica”. That’s our language converter in action!

Not too surprisingly, “Latinica” converts the page to Latin text, “Ћирилица” (which in Latin script is “Ćirilica”) gives you Cyrillic, and the default, “Ћир./lat.”, gives you however the text was originally written. Logged in users can set a preference so that they normally see their preferred script.

Even with the relatively straightforward transliteration, there are complications. For example, in the article about Serbian-American actress Sasha Alexander on Serbian Wikipedia, her stage name is provided in English. It wouldn’t be helpful to have the specifically English version of her name converted along with the general text on the page when transliterating to Cyrillic. In this case, the language converter is smart enough to know not to transliterate inside language-specific templates. There’s also -{special markup}- available that can block the conversion for any bit of text. It often gets used for standard abbreviations like units such as km (kilometer) or mm (millimeter), and cardinal directions in coordinates (e.g., the N and E in “44°48′N 20°28′E”). There’s also a magic word[4] to block title conversion: __NOTC__ (also available in Cyrillic as __БЕЗКН__), which gets used for domain names, abbreviations and initialisms, scientific and technical terms, etc.

Complicated is as complicated does—editing and search

So you’re cruising around the Serbian Wikipedia, with your preference for Cyrillic set, reading about your favorite TV show with a controversial ending, Изгубљени (English: Lost), and you decide to add a little detail or correct a minor typo. When you get to the edit page, you discover that the article is actually titled Izgubljeni and it’s written in Latin script, which you aren’t so comfortable reading or writing. Bummer.

Printing it out would not help, even with the nice binding.
Photo by CollegeDegrees360, CC BY-SA 2.0.

Help is coming! It’s a very complicated issue, though. Do you convert the entirety of every article, from Cyrillic to Latin and back, every time someone wants to edit it in a different script? Or do you try to identify just what they changed in one script and convert it to the majority script of the article? What about cases unlike Serbian—oooo, foreshadowing!—where the conversion is good, but far from perfect? Fortunately for me, that’s not my problem—whew! But the WMF Parsing team has plans.[5]

Similarly, searching in a given script generally only finds matches in the same script. So on Serbian Wikipedia, searching for Izgubljeni gives a few dozen results, while searching for Изгубљени returns several hundred. This one is my problem, as I’m part of the WMF Search Platform team. I’m working on a plugin for our search engine that will not only merge the Cyrillic and Latin scripts in the search index, but will also do some basic stemming, which lets searches for one form of a word return related forms—like hope, hoped, and hoping in English. Speaking of hope, I also hope in the future to be able to bridge part of the mixed-script gap for other languages and projects where the conversion is, like that for Serbian, relatively straightforward.
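To make the idea concrete, here is a toy sketch (hypothetical code, not the actual search plugin) of folding Serbian Cyrillic to Latin so that text in either script lands on the same terms in a search index:

```javascript
// Toy illustration only (not the actual search plugin): fold Serbian
// Cyrillic to Latin so text in either script indexes to the same terms.
// The map covers just the letters needed for the example word; a real
// mapping would list the full alphabet, including the digraph letters
// љ (lj), њ (nj), and џ (dž).
const cyrToLat = {
    'и': 'i', 'з': 'z', 'г': 'g', 'у': 'u',
    'б': 'b', 'љ': 'lj', 'е': 'e', 'н': 'n',
};

function foldToLatin(text) {
    return Array.from(text.toLowerCase())
        .map(ch => cyrToLat[ch] || ch)
        .join('');
}
```

With a fold like this, both Изгубљени and Izgubljeni normalise to izgubljeni, so a query in either script can match titles in both.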

Less straightforward transliteration—Uzbek, Kazakh, and Crimean Tatar

Not all transliteration systems are as straightforward as Serbian. To varying degrees, Uzbek (Wikipedia), Kazakh (Wikipedia),[6] and Crimean Tatar (Wikipedia)—all Turkic languages—need more complicated support for their Cyrillic/Latin transliteration, up to and including regular expressions and lists of exceptions that just can’t be handled in any straightforward way.

For these languages, the difficulties of reading, editing, and searching are greater than for Serbian, because any automatic conversion has to be significantly more clever, which also makes it more likely to make mistakes.

More straightforward transliteration—Inuktitut and Shilha

Language converter, of course, supports scripts other than Cyrillic and Latin. The transliteration for Kazakh includes Arabic as well. Other scripts, including some you may be unfamiliar with, are supported, too! For example…

Inuktitut (Wikipedia) is an Inuit language spoken in Canada and written in both Latin and Inuktitut syllabics. Fortunately the mapping between the two is straightforward, like Serbian.

A stop sign in Iqaluit, Nunavut, featuring Inuktitut and English. Photo by Sébastien Lapointe, CC BY-SA 3.0.

Shilha is a Berber language spoken in Morocco and written in Arabic, Latin, and Tifinagh. Its Wikipedia is small and still in the incubator, but it already uses language converter to support Latin/Tifinagh transliteration, in part because the mapping is straightforward.

Arabic, Tifinagh, and Latin script at a store in Morocco. Photo by Mohamed Amarochan, CC BY-SA 3.0.


Confounded, confused, and confuzzled—Chinese characters

The Chinese[7] Wikipedia (language code zh) uses the language converter to transform its text into several varieties, including those of mainland China (zh-cn), Hong Kong (zh-hk), Macau (zh-mo), Singapore (zh-sg), and Taiwan (zh-tw). These varieties can have small differences in punctuation and in a few particular words, but the largest split is between whether they use Traditional or Simplified Chinese characters. Chinese Wikipedia also supports the language codes zh-hant and zh-hans, which are generic Traditional and Simplified Han (Chinese) characters, respectively; we’ll use those codes for our next example.

Image by Kjoonlee and Tomchen1989, Arphic Public License (2001 version).

For those who don’t read Chinese, the difference between Traditional and Simplified characters can be subtle, but it’s easier to see when you can compare the exact same text in the two systems. Open the article for “Wikipedia” rendered in Traditional (維基百科) and Simplified (维基百科) characters in adjacent tabs in your browser. Flip back and forth between them and notice that the Traditional variants of characters are often a bit more complex and have a few more strokes, and look a bit darker on the screen as a result. Simplified characters are—well—a bit simpler looking. Punctuation, like periods, commas, and quotes, are also a bit different.

The Chinese language and Chinese Wikipedia together have a number of additional complexities that make the situation even more challenging:

  • The mapping between Traditional and Simplified is not one-to-one—in multiple ways! Some words that are a single Traditional character are written with two Simplified characters. A Traditional character that’s part of a multi-character word or phrase might get converted to a different Simplified character as part of that phrase than it would if it were on its own.
  • Chinese is written without spaces, making it hard for a computer to break it into words (so that it can make decisions about how to transliterate). As a simple analogy in English—written without spaces—the string “MARGARITA” could be “Margarita” or “Marga Rita”. Context provides clues: “IWANTTODRINKAMARGARITA.” vs “HERFIRSTANDMIDDLENAMESAREMARGARITA.”—but computers are terrible at context.
  • Chinese Wikipedia, unlike Serbian, often has a mix of Traditional and Simplified characters in a given article. Not just in the same article or sentence, but in the same name!

An example I found of the last problem: “UEFA Champions League Final” appears on Chinese Wikipedia in Traditional characters (歐洲冠軍聯賽決賽), in Simplified characters (欧洲冠军联赛决赛), and in a mix of the two (欧洲冠军联赛決賽). In that last instance, the last two characters are Traditional, the rest are Simplified. It strikes me as very odd because the last and third-from-last characters are the same!—so two different versions of the same character are used in the name of a soccer/football[8] league.

In the Bad Old Days, searching for any of these variants would only find articles that contained those specific characters. As of spring 2017, the situation has improved considerably because we convert all text on Chinese Wikipedia to Simplified characters[9] before indexing them for search.
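As a toy illustration of that normalisation step (hypothetical code, not the actual indexing pipeline), and with the caveat that real Traditional-to-Simplified conversion must handle the multi-character words and context described above, which a per-character map cannot:

```javascript
// Toy illustration only: a per-character map ignores multi-character
// words and context, which real conversion must handle. The map covers
// just the characters in the example name.
const tradToSimp = { '歐': '欧', '軍': '军', '聯': '联', '賽': '赛', '決': '决' };

function toSimplified(text) {
    return Array.from(text).map(ch => tradToSimp[ch] || ch).join('');
}
```

All three variants of the name, Traditional, Simplified, and mixed, fold to the same string 欧洲冠军联赛决赛, so a search for any one of them can find the others.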

Editing Chinese Wikipedia is still pretty complicated. Unlike Serbian’s Cyrillic/Latin situation—where you could presumably study really hard for a few weeks and become passingly familiar with the few dozens of characters in the script you don’t already know—Traditional and Simplified Chinese have thousands of different characters to learn.

In conclusion…

…there is no conclusion! Well, this blog post is about to end, but the road to fully supporting all the languages of Wikipedia and its sister projects—for reading, editing, and searching—is probably never-ending.

But that shouldn’t be disheartening—every day we come a little bit closer to a world in which every single human being can freely share in the sum of all knowledge. It may never be perfect, but it’s always getting better.

Trey Jones, Senior Software Engineer, Search Platform
Wikimedia Foundation


1. For a very brief but very entertaining review of just some of that complexity, see Roan Kattouw’s lightning talk, given at linux.conf.au 2017 (“Human language wats”): YouTube video, slides on Wikimedia Commons.

2. And of course there is nearly endless variety to be found in English around the world: Appalachian English has sigogglin; Australian English has fair dinkum; Canadian English has namaycush; East African English has boda-boda; Hawaiian English has makai; Indian English has burra-khana; Indonesian English has gotong-royong; Irish English has knawvshawl; Maryland English has moonack; Namibian English has oukie; Philippine English has kilig; Quebec English has sugar pie; Singapore English has taxi uncle; and Texas English has whomperjawed.

3. This is only exacerbated by the fact that English spelling is horrible. The relevant technical term is “orthographic depth”, which is a rough sense of how much a spelling system is WYSIWYG. In the English Wikipedia article on orthographic depth, English is the only example in the category “irregular”. The French Wikipedia article specifically calls out English -ough. It’s a travesty.

4. That’s a rather technical term—follow the link!

5. See C. Scott Ananian’s slides for his Wikimania 2017 presentation on multi-script editing. The slides include his speaker notes, so while it’s not as good as being there, it’s still quite good and full of useful information.

6. Kazakhstan is currently planning to officially shift from the Cyrillic to Latin alphabets by 2025, and the language converter will need to adapt to that change. An earlier proposal, from October 2017, involved a lot of apostrophes, and was widely criticized. (See an article from The New York Times.) Very recently, a new version was announced that favors acute accents and digraphs. (See an article from Kazinform.) Tables showing the Oct 2017 and Feb 2018 transliterations are on Commons.

7. The label “Chinese” is itself complicated because it can refer to many languages, which can differ as much from each other as the Romance languages do—meaning they are often mutually unintelligible. The Chinese Wikipedia is written in Modern Written Chinese, which is based on the varieties of Chinese spoken throughout China. See also the article on Chinese Wikipedia, on English Wikipedia.

8. To-may-to, to-mah-to. I already said that English is a mess.

9. The software libraries available to handle segmenting Chinese text into words operate on Simplified characters, so converting everything to Simplified first allowed us to also index articles by actual words, rather than by character n-grams; n-grams are much better than nothing, but not great. For more info on the process, you can read my write up for that project.

by Trey Jones at March 12, 2018 06:05 PM

Tech News

Tech News issue #11, 2018 (March 12, 2018)


March 12, 2018 12:00 AM

March 11, 2018

Megha Sharma

Chapter 5: Code & Code Till It Lasts

Firstly, readers, apologies for being this late! But the journey became a roller-coaster ride and all my time went into controlling its pace. With every passing day, this project has become more and more challenging and, believe me, the rate has been exponential, if not more :P.

But the love for this tool and the support of my mentors have kept me going. For the past few weeks, I’ve been drowning in the implementation of the proposed features. For more details about the status of the project, you can refer to the workboard on Phabricator.

Looking at it, it seems that the wide scope of the project is what made the coding phase this gigantic, but that’s not the whole truth. Performance was an equal contributor. To give you an idea of the extent, I will share a small story with you -

As a part of the Graphs module, I was implementing a graph which showed Impact of an editor with respect to time. The original proposed formula for impact was —

Impact = [(Average page views per day)*(contribution by editor till time x)]/(total contribution by all users till time x) summed over all articles in which the user has contributed

Looks like a simple scenario, right? But when it comes to implementing it, and in such a way that it loads up within the response time, it becomes a hell of a task! Why? Because Wikipedia has sooo manyyy users that calculating the total contribution of all the users in real time, with respect to time, is next to impossible! To give you some statistics, it took approximately an hour to load for an average editor. Just imagine what its performance would be for regular editors!!

Ah, I know you might be wondering how I solved it? Well, by achieving the same thing in a different way! In other words, by acting smart :). I pre-computed the average page views per day and the total contribution by all users, and cached these for all of the user’s pages. Then this constant factor was used along with the variable factor of the user’s own contribution for generating real-time graphs.
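A hypothetical sketch of that caching idea (the names and stub data sources here are illustrative, not the tool’s actual code):

```javascript
// Hypothetical sketch of the optimisation: the expensive, slowly changing
// factors per page (average daily views, total contribution by all users)
// are computed once and cached; only the editor's own contribution is
// combined with them at request time.
const cache = new Map();

function constantFactor(page, fetchAvgViews, fetchTotalContribution) {
    if (!cache.has(page)) {
        cache.set(page, {
            avgViews: fetchAvgViews(page),       // avg page views per day
            total: fetchTotalContribution(page), // contribution by all users
        });
    }
    return cache.get(page);
}

// Impact = sum over pages of (avg views * user contribution) / total contribution
function impact(pages, userContribution, fetchAvgViews, fetchTotalContribution) {
    let sum = 0;
    for (const page of pages) {
        const c = constantFactor(page, fetchAvgViews, fetchTotalContribution);
        sum += (c.avgViews * userContribution(page)) / c.total;
    }
    return sum;
}
```

On repeated requests, only userContribution changes, so the graph can be regenerated without re-scanning every user’s history.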

Impact graph for user Acdixon: showing the impact of his contributions in the last year on different pages to which he contributed!

By now, you might have realized the ‘why’ behind the title. Anyhow, it was fun implementing a plethora of features and I’m looking forward to new twists and turns :)

Stay tuned!

by Megha Sharma at March 11, 2018 05:52 PM

Wikimedia Foundation

Ten months later: People of Turkey still denied access to Wikipedia

The Library of Celsus, located today near Selçuk, Turkey. Photo by Benh, CC BY-SA 3.0.

It has been ten months since the block of Wikipedia in Turkey. For almost a year, the 80 million people of Turkey have been denied access to information on topics ranging from medicine, to history, to current events on Wikipedia. After ten months, and in the midst of the school term, the need to restore access to Wikipedia in Turkey becomes more urgent every day.

The Wikipedia block in Turkey has had a profound effect on many citizens of Turkey. We have heard from professionals, students, and academics in Turkey and around the world who mourn the loss of this “source of the sources.” Students have lost a helping hand for homework. Teachers have lost a valuable education tool. Academics have lost a starting place for research. Professionals have lost a resource for understanding their industries. All over Turkey, people can no longer turn to Wikipedia for answers to the questions they may have.

This block also limits the spread of knowledge about Turkish history and culture around the world. The more than 300,000 articles on Turkish Wikipedia contain valuable information about Turkey’s history, culture, and geography—written for Turkish speakers, by Turkish speakers. Thousands of everyday people in Turkey have contributed to Turkish Wikipedia on topics from Ankara to Zonguldak. When access to Wikipedia is restricted, the development of this collaborative project suffers. And it makes it harder for the rest of us to learn about Turkey directly from the experiences and interests of the people living there.

Since April, the Wikimedia Foundation has made restoring access to Wikipedia in Turkey one of our highest priorities. We have pursued the legal remedies available to us in Turkish courts. We have filed a petition with the Turkish Constitutional Court alleging a violation of Turkish and international law regarding free expression and freedom of the press, but have seen no action from the court to date.

We have followed the situation closely with those directly affected, communicating with known community members and free knowledge advocates. We have endeavored to understand the reasons for the block from points of contact we have within the Turkish government. Our efforts have had a single purpose: lifting the ban while remaining committed to our values of free expression, openness, and neutrality.

For more than 16 years, volunteer editors around the world have built Wikipedia. Today, it is an incredible resource and one of the world’s most popular and beloved websites. Every month, more than 200,000 volunteers contribute to Wikipedia across hundreds of languages. These volunteers make good-faith efforts to cover all sides of any given issue, even controversial ones, consistent with Wikipedia’s values and policies of neutrality and reliable sourcing.

Wikipedia is made possible by everyday people who donate their time from every corner of the globe, sharing a commitment to neutral, verifiable information, free to all.

This is an utterly unique process in the world. No other internet platform or knowledge community works in this way. The Wikimedia movement is global, distributed, and interconnected — which means that censorship in one part of the world is felt acutely by us all. Censorship runs counter to our shared mission of free knowledge for every single person.

The Wikimedia Foundation will continue to advocate for everyone’s right to read and share knowledge freely and without restriction, including in Turkey. In the meantime, we stand with the Wikimedia community in Turkey and remain committed to our efforts to restore access in the country. We will continue to work to lift the ban, in accordance with Wikimedia values. We will work to spread awareness of how Wikipedia works. Throughout these efforts, we will maintain our opposition to censorship, and support for freedom of information and expression around the world.

For now, we encourage people everywhere to continue contributing to our core mission of making knowledge freely available to everyone. And to the Wikipedia community in Turkey, and those around the world following this issue: we are with you.

Juliet Barbara, Communications Consultant
Wikimedia Foundation

Editor’s note: This article was originally published in November 2017. It was updated in March 2018 with a new date and time period for #WeMissTurkey.

by Juliet Barbara at March 11, 2018 05:01 AM

March 10, 2018

Weekly OSM

weeklyOSM 398



Access the Overpass API with Python, to download specific OSM data and visualise it. 1 | © Picture by Nikolai Janakiev

About us

  • We invite anyone with an OSM account to propose topics for weeklyOSM and/or to write a summary for them. The regular members of the weeklyOSM team will translate them into the other languages.


  • François Lacombe invites mappers to vote on his improved proposal about hydropower water supplies. Several comments from the first round of voting have been taken into account.
  • Frederik Ramm is looking for help to fight the growing number of SEO firms flooding OSM with ‘not useful’ data.


  • Ilya Zverev calls for nominations for the 3rd edition of OpenStreetMap Awards. You could also volunteer for the selection committee.
  • bdiscoe publishes a ranking of OSM data contributions with short notes of the typical mapping activity of individual users.
  • Nikolai Janakiev shows in a very well built-up explanation how to access the Overpass API with Python, to download specific OSM data and visualise it.
  • Geochicas, an OpenStreetMap Women community, published their first annual report 2016-2017.
  • Geochicas celebrate the Women’s International Day with a series of stories #MujeresMapeandoElMundo (WomenMappingTheWorld) about women through history in geography and cartography.
  • 36 percent of all downloads of StreetComplete come from Germany.
  • The diversity-talk mailing list is back.
  • Jose opens the Paraguayan section of the forum with a call for reviving the osmparaguay website.


  • Raymond Nijssen asks for the best way to import roof shapes and materials from Saint Maarten Island. These data come from a mapathon organised by The Netherlands’ Red Cross after Hurricane Irma. The import sounds straightforward because it contains OSM object IDs as well as the new roof detail.
  • Jozef Riha is coordinating the import of address data for Slovakia, provided by the Ministry of Interior. There is no automated procedure so far, so it’s not formally an import; still, the import mailing list remains the best place to ask for help and feedback.
  • Stefan Keller would like to do a semi-automated import of 1,500 car sharing stations of the Swiss car-sharing provider Mobility.
  • Multi Modal informs the import mailing list of an import for water areas in the Dutch Rhineland.
  • User bdiscoe points out some bad imports of data on woods and waterways that he found while running his program find_small_displacement on Japan.

OpenStreetMap Foundation

  • OSMF has updated the OpenStreetMap trademark policy, effective January 1st 2018.
  • The protocol of the Engineering Working Group was published.
  • The Membership Working Group of OSMF met on 2nd March. The minutes of the meeting are available online. MWG’s protocol and a discussion on Github show that the MWG is working on displaying the OSMF membership of a mapper on their openstreetmap.org profile if the mapper explicitly requests it.


  • The organisers of the 2018 State of the Map, taking place in Milan, have opened a community survey for the choice of the conference talks. The survey is open until March 20th. Everyone is welcome to vote!
  • The State of the Map Latam 2018 working group is pleased to announce a call for logo designs. The deadline for submissions is March 15th.


  • An article, recently published in the International Journal of Geo-Information, demonstrates how participatory mapping practices can be combined with collaborative digital mapping techniques for better disaster management and rural development. It draws lessons from a pilot study conducted in a flood-prone lower river basin in Western Nepal.

Open Data

  • YEKA, the Youth Mappers’ chapter in Nicaragua celebrated the Open Data Day (#ODD18) in Managua with a mapathon.
  • OpenStreetMap featured in the 2018 Open Data Day celebrations in Nepal and Düsseldorf, with efforts led by Kathmandu Living Labs in Nepal and the OSM Stammtisch in Germany respectively.


  • The offline navigation app MapFactor is now available on iOS.
  • OpenRouteService introduces new border restrictions that allow users to avoid crossing any external border of the Schengen Area in route planning. Also, routes can now be exported as GeoJSON.
  • Rinigus gives an update on the status of OSM Scout Server after more than one year.


  • Jason Remillard put together a spam detector for OSM. It now needs training data, so Jason is asking mappers to report spam changesets. This tool could then become part of an automated spam detection process.
  • There are plans to redirect HTTP requests for planet.openstreetmap.org to HTTPS from May 7th. Users of Osmosis and curl should be aware that these programs can’t follow the redirect.


  • iD version 2.7.0 is published. It now supports more background imagery and comes with an improved editor for turn restrictions. All updates can be found on github.
  • The new JOSM version 18.02 is released. It supports private data layers and ESRI projections. For more details, see changelog.

Other “geo” things

  • The Dutch State Archives have published 4700 historical maps.
  • Mapillary’s #CompleteTheMap challenge has come to an end. The first three places went to Brisbane, San Jose and Milano. To learn more about the challenge, you can read an entry in the Mapillary blog.
  • Currently, if satellites in space malfunction or run out of fuel, they are decommissioned. NASA and the U.S. Defense Advanced Research Projects Agency (DARPA) are working on projects to build robotic arms that could be used to repair or refuel the satellites.
  • Microsoft has published an app that allows visually impaired people to get a better picture of the environment via 3D audio. Information about the places is served by OpenStreetMap.
  • The national map of Switzerland is getting more and more accurate but still does not include street names or addresses, as the Tagesanzeiger reports.
  • A map showing the birthplace of over 6,000 notable women born in Latin America and listed on Wikipedia but with no article in the Spanish language edition.

Upcoming Events

Where What When Country
Lannion 3e concours de contributions OpenStreetMap 2018-03-01-2018-03-23 france
Brussels OSMBE Official Meeting & Meetup 2018-03-09 belgium
Buenos Aires Geobirras 2018-03-09 argentina
Kyoto 幕末京都マッピングパーティ#02:月の明かりと大獄と 2018-03-10 japan
Tokyo 東京!街歩き!マッピングパーティ:第17回特別編 旧東海道品川宿 2018-03-11 japan
Rennes Réunion mensuelle 2018-03-12 france
Lyon Rencontre libre mensuelle 2018-03-13 france
Nantes Réunion mensuelle 2018-03-13 france
Bochum Mappertreffen 2018-03-15 germany
Mumble Creek OpenStreetMap Foundation public board meeting 2018-03-15
Rapperswil 9. Micro Mapping Party Rapperswil 2018-03-16 switzerland
Yokkaichi 第1回 富洲原マッピングパーティ 2018-03-17 japan
Hanoi Hà Nội OSM Mapathon 2018 2018-03-18 vietnam
Rennes Cartopartie bâtiments en 3D 2018-03-19 france
Lüneburg Lüneburger Mappertreffen 2018-03-20 germany
Cologne Bonn Airport Bonner Stammtisch 2018-03-20 germany
Nottingham Pub Meetup 2018-03-20 united kingdom
Karlsruhe Stammtisch 2018-03-21 germany
Toulouse Réunion mensuelle 2018-03-21 france
Cologne Bonn Airport FOSSGIS 2018 2018-03-21-2018-03-24 germany
Lübeck Lübecker Mappertreffen 2018-03-22 germany
Turin MERGE-it 2018 2018-03-23-2018-03-24 italy
Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
Disneyland Paris Marne/Chessy Railway Station FOSS4G-fr 2018 2018-05-15-2018-05-17 france
Bordeaux State of the Map France 2018 2018-06-01-2018-06-03 france
Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy
Dar es Salaam FOSS4G 2018 2018-08-29-2018-08-31 tanzania
Bengaluru State of the Map Asia 2018 (effective date to confirm) 2018-11-17-2018-11-18 india

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, Polyglot, Rogehm, SK53, SeleneYang, Spanholz, Spec80, SrrReal, TheFive, derFred.

by weeklyteam at March 10, 2018 01:09 PM

March 09, 2018

Brion Vibber

String concatenation garbage collection madness!

We got a report of a bug with the new 3D model viewing extension on Wikimedia Commons, where a particular file wasn’t rendering a thumbnail due to an out-of-memory condition. The file was kind-of big (73 MiB) but not super huge, and should have been well within the memory limits of the renderer process.

On investigation, it turned out to be a problem with how three.js’s STLLoader class was parsing the ASCII variant of the file format:

  • First, the file is loaded as a binary ArrayBuffer
  • Then, the buffer is checked to see whether it contains binary or text-format data
  • If it’s text, the entire buffer is converted to a string for further processing

That conversion step had code that looked roughly like this:

var str = '';
for (var i = 0; i < arr.length; i++) {
    str += String.fromCharCode(arr[i]);
}
return str;

Pretty straightforward code, right? Appends one character to the string until the input binary array is out, then returns it.

Well, JavaScript strings are actually immutable — the “+=” operator is just shorthand for “str = str + …”. This means that on every step through the loop, we create two new strings: one for the character, and a second for the concatenation of the previous string with the new character.

The JavaScript virtual machine’s automatic garbage collection is supposed to magically de-allocate the intermediate strings once they’re no longer referenced (at some point after the next run through the loop) but for some reason this isn’t happening in Node.js. So when we run through this loop 70-some million times, we get a LOT of intermediate strings still in memory and eventually the VM just dies.

Remember this is before any of the 3D processing — we’re just copying bytes from a binary array to a string, and killing the VM with that. What!?

Newer versions of the STLLoader use a more efficient path through the browser’s TextDecoder API, which we can polyfill in node using Buffer, making it blazing fast and memory-efficient… this seems to fix the thumbnailing for this file in my local testing.
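A rough sketch of that decode-in-one-call approach (an assumed helper, not STLLoader’s exact code):

```javascript
// Rough sketch (assumed helper, not STLLoader's exact code): decode the
// whole buffer in one native call instead of appending one character at
// a time.
function bufferToString(arrayBuffer) {
    if (typeof TextDecoder !== 'undefined') {
        // Browsers and recent Node versions have TextDecoder built in.
        return new TextDecoder('utf-8').decode(arrayBuffer);
    }
    // Older Node fallback: Buffer.from() can wrap an ArrayBuffer directly.
    return Buffer.from(arrayBuffer).toString('utf-8');
}
```

Because the decoding happens in native code, no intermediate JavaScript strings pile up on the heap.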

Just for fun though I thought, what would it take to get it working in Node or Chrome without the fancy native API helpers? Turns out you can significantly reduce the memory usage of this conversion just by switching the order of operations….

The original append code results in operations like: (((a + b) + c) + d) which increases the size of the left operand linearly as we go along.

If we instead do it like ((a + b) + (c + d)) we’ll increase _both_ sides more slowly, leading to much smaller intermediate strings clogging up the heap.

Something like this, with a sort of binary bisection thingy:

function do_clever(arr, start, end) {
    if (start === end) {
        return '';
    } else if (start + 1 === end) {
        return String.fromCharCode(arr[start]);
    } else {
        var mid = start + Math.floor((end - start) / 2);
        return do_clever(arr, start, mid) +
               do_clever(arr, mid, end);
    }
}

return do_clever(arr, 0, arr.length);

Compared to the naive linear append, I’m able to run through the 73 MiB file in Node, and it’s a bit faster too.

But it turns out there’s not much reason to use that code — most browsers have native TextDecoder (even faster) and Node can fake it with another native API, and those that don’t are Edge and IE, which have a special optimization for appending to strings.

Yes that’s right, Edge 16 and IE 11 actually handle the linear append case significantly faster than the clever version! It’s still not _fast_, with a noticeable delay of a couple seconds on IE especially, but it works.

So once the thumbnail fix goes live, that file should work both in the Node thumbnailer service *and* in browsers with native TextDecoder *and* in Edge and IE 11. Yay!

by brion at March 09, 2018 06:14 PM

This month in GLAM

This Month in GLAM: February 2018

by Admin at March 09, 2018 04:08 PM

Wikimedia UK

Data on the history of Scottish witch trials added to Wikidata

North Berwick Witches – the logo for the Survey of Scottish Witchcraft database (Public Domain, via Wikimedia Commons)

By Ewan McAndrew, Wikimedian in Residence at the University of Edinburgh

The first Wikidata in the Classroom assignment at the University of Edinburgh took place last semester on the Data Science for Design MSc course. Two groups of students worked on a project to import the Survey of Scottish Witchcraft database into Wikidata to see what possibilities surfacing this data as structured linked open data could achieve.

Meeting the information & data literacy needs of our students

The Edinburgh and South East Scotland City Region has recently secured a £1.1bn City Region deal from the UK and Scottish Governments. Out of this amount, the University of Edinburgh will receive in the region of £300 million towards making Edinburgh the ‘data capital of Europe’ through developing data-driven innovation. Data “has the potential to transform public and private organisations and drive developments that improve lives.” More specifically, the university is being trusted with the responsibility of delivering a data-literate workforce of 100,000 young people over the next ten years; a workforce equipped with the data skills necessary to meet the needs of Scotland’s growing digital economy.

The implementation of Wikidata in the curriculum therefore presents a massive opportunity for educators, researchers and data scientists alike; not least in honouring the university’s commitment to the “creating, curating & dissemination of open knowledge”. A Wikidata assignment allows students to develop their understanding of, and engagement with, issues such as: data completeness; data ethics; digital provenance; data analysis; data processing; as well as making practical use of a raft of tools and data visualisations. The fact that Wikidata is also linked open data means that students can help connect to and leverage from a variety of other datasets in multiple languages; helping to fuel discovery through exploring the direct and indirect relationships at play in this semantic web of knowledge. This real-world application of teaching and learning enables insights in a variety of disciplines; be it in open science, digital humanities, cultural heritage, open government and much more besides. Wikidata is also a community-driven project so this allows students to work collaboratively and develop the online citizenship skills necessary in today’s digital economy.

At the Data Science for Design MSc’s “Data Fair” on 26th October 2017, researchers from across the university presented the 45 masters students in Design Informatics with approximately 13 datasets to choose from, to be worked on in groups of three. Happily, two groups were enthusiastic about importing the university’s Survey of Scottish Witchcraft database into Wikidata (the choice of database to propose was suggested by a colleague).

This fabulous resource began life in the 1990s before being realised in 2001-2003. It had as its aim to collect, collate and record all known information about accused witches and witchcraft belief in early modern Scotland (from 1563 to 1736) in a Microsoft Access database, and to create a web-based user interface for the database. Since 2003, the data has remained static in the Access database, and so students at the 2017 Data Fair were invited to consider what could be done if the data were exported into Wikidata, given multilingual labels and linked to other datasets. Beyond this, what new insights & visualisations of the data could be achieved?

Packed house at the Data Fair for the Data Science for Design MSc course – 26 October 2017 (Ewan McAndrew, CC-BY-SA)

The methodology

A similar methodology to managing Wikipedia assignments was employed, making the transition from managing a Wikipedia assignment to managing a Wikidata assignment an easy one. The two groups of students underwent a 1.5-hour practical induction on working with Wikidata and third-party applications such as Histropedia (“the timeline of everything”) before being introduced to the Access database. They then discussed collaboratively how best to divide the task of analysing and exporting the data before deciding that one group would work on (1) importing records for the 3,219 accused witches while the other group would work on (2) the import of the witch trial records and (3) the people associated with these trials (lairds, judges, ministers, prosecutors, witnesses etc.).

The groups researched and submitted their data models for review. Once their models had been checked and agreed upon, the students were ready to process the data from the Access database into a format Wikidata could import (making use of the handy Wikidata plug-in on Google Spreadsheets). Upon completion of the import, the students could then choose how to visualise this newly added data in a number of ways; such as maps, timelines, graphs, bubble charts and more. The students finished their project by showcasing their insights and data visualisations in various mediums at the end of project presentation day on the 30th of November 2017.
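As a concrete illustration of that export step, here is a minimal, hypothetical Python sketch of turning database rows into QuickStatements-style commands, a common text format for bulk Wikidata imports. The field name `name` and the sample rows are invented for illustration; P31 (“instance of”) and Q5 (“human”) are real Wikidata identifiers, but the students’ actual data model and spreadsheet tooling may well have differed.

```python
# Hypothetical sketch: converting rows exported from the Access database
# into QuickStatements V1 commands for bulk import into Wikidata.
# "name" is an illustrative field; P31 = "instance of", Q5 = "human".

def rows_to_quickstatements(rows):
    """Emit QuickStatements commands: CREATE a new item, then attach
    an English label (Len) and an 'instance of: human' claim to it."""
    commands = []
    for row in rows:
        commands.append("CREATE")
        commands.append(f'LAST\tLen\t"{row["name"]}"')  # English label
        commands.append("LAST\tP31\tQ5")                # instance of: human
    return "\n".join(commands)

sample = [{"name": "Agnes Sampson"}, {"name": "Isobel Gowdie"}]
print(rows_to_quickstatements(sample))
```

The resulting text block can be pasted into the QuickStatements tool, which creates one Wikidata item per `CREATE` line.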

Here are two data visualisation videos they produced:

North Berwick witches – the logo for the Survey of Scottish Witchcraft database (Public Domain, via Wikimedia Commons)

The way forward

We now have 3,219 items of data on the accused witches in Wikidata (spanning 1563 to 1736). We also now have data on the 2,356 individuals involved in trying these accused witches. Finally, we have records of the 3,210 witch trials themselves. This means we can link and enrich the data further by adding location data, dates, occupations, places of residence, social class, marriages, and penalties arising from the trials.

The hope is that this project will aid the students’ understanding of data literacy through the practical application of working with a real-world dataset and help shed new light on a little-understood period of Scottish history. This, in turn, may help fuel discoveries by dint of surfacing this data and linking it with other related datasets across the UK, across Europe and beyond. As the Survey of Scottish Witchcraft’s website itself states: “Our list of people involved in the prosecution of witchcraft suspects can now be used as the basis for further inquiry and research.”

The power of linked open data to share knowledge between different institutions, between geographically and culturally separated societies, and between languages is a beautiful thing. Here’s to many more Wikidata in the Classroom assignments.


by Ewan McAndrew at March 09, 2018 01:12 PM

March 08, 2018

Wiki Loves Monuments

2015 Earthquake in Nepal: A Wake-up Call for Monument Documentation

The main goal of Wiki Loves Monuments is to achieve visual documentation of monuments from around the world on Wikipedia. Monument documentation through Wiki Loves Monuments is not only useful for helping people get to know the cultural heritage of different regions; its photographs can also prove to be among the few publicly available, freely usable online images of a monument that is later destroyed or damaged.

The devastating earthquake that shook Nepal in 2015 claimed thousands of lives and left thousands more injured, but in addition to human loss it shook many monuments and shattered them to pieces. This was a huge loss for Nepal’s cultural heritage.

Three years later, a huge amount of effort has gone into the restoration of damaged monuments, and there is still a lot to be done. We asked Nirmal Dulal, one of the organizers of WLM in Nepal, about the impact of the 2015 earthquake on monuments and how it changed the WLM competition in Nepal.

Dharahar (aka Bhimsen Tower) in Kathmandu before and after the 2015 earthquake. (Photos by Arnsh locus and बिजय पोख्रेल CC BY-SA 4.0)

Tell us about the earthquake and its impact on monuments in Nepal.

The earthquake that happened on April 25, 2015 with a magnitude of 7.8Mw and its strong aftershock on May 12 with a magnitude of 7.3Mw destroyed several monuments. There is a rather long list of partially to fully damaged monuments among which are Kasthamandap in Kathmandu Durbar Square, a UNESCO World Heritage Site, and the Dharahara tower, built in 1832. Also, the earthquake caused the Manakamana Temple in Gorkha, which was previously damaged in an earlier quake, to tilt several more inches. The northern side of Janaki Mandir in Janakpur was damaged. Several temples in Kathmandu valley, including Kasthamandap, Panchtale temple, the top levels of the nine-story Basantapur Durbar, the Dasavatar temple and two dewals located behind the Shiva Parvati temple were demolished by the quake. Some other monuments including the Taleju Bhawani Temple partially collapsed. Jaya Bageshwari Temple in Gaushala and some parts of the Pashupatinath Temple, Swayambhunath, Boudhanath Stupa, Ratna Mandir, inside Rani Pokhari, and Durbar High School have been destroyed.

In Patan, the Char Narayan Mandir, the statue of Yog Narendra Malla, a pati inside Patan Durbar Square, the Taleju Temple, the Hari Shankar, Uma Maheshwar Temple and the Machindranath Temple in Bungamati were destroyed. In Tripureshwor, the Kalmochan Ghat, a temple inspired by Mugol architecture, was destroyed and the nearby Tripura Sundari also suffered significant damage. In Bhaktapur, several monuments, including Changunarayan Temple, the Phasi Deva temple, the Chardham temple and the 17th century Vatsala Durga Temple were fully or partially destroyed. Several of the churches in the Kathmandu valley were also damaged.

Was there any special activity after the earthquake by WLM Nepal? Did the local team ask people to go back and take photos of monuments again?

Right around the time the earthquake happened, we were preparing for the 2015 edition of Wiki Loves Earth. All of a sudden, we found ourselves engaged in earthquake relief programs. We did ask people to take photos of destroyed monuments just after the earthquake; hundreds of photographs were uploaded during that time, and the news of this activity was covered by dozens of national and international news portals and reports. In 2016, we successfully organized the WLM competition, which has by far seen the highest number of photographs uploaded in WLM Nepal.

How did the earthquake change WLM competitions in Nepal in the following year? Did you change anything in the contest after earthquake?

Thankfully while the earthquake was devastating and damaging to a significant portion of Nepal’s heritage, it did not destroy or even affect everything. In fact, 5 monumental sites out of 8 UNESCO World Heritage sites in Nepal were not affected by the earthquake. The logistics of the competition didn’t change much in the following year but there was huge spike in the number of participants and the number of photos received during the competition. We went from a little over 1500 photos in 2015 to more than 8500 submissions in 2016. In some sense, the earthquake shook our competition too, in a good way. Perhaps, the loss and damage to so much heritage was a wake-up call to underline the importance of documenting our heritage and a reminder that no matter how old a monument is and how many disasters it has withstood there could still be something stronger, more devastating, more fatal so it’s important to document whatever we have.

What is the status of damaged monuments? How is the restoration activity going?

Most of the damaged monuments are now undergoing restoration. The restoration is being carried out by the Department of Archaeology of Nepal, with support from UNESCO and various other concerned parties. It’s been going well, but these kinds of activities take time. So far, 30% of the monuments are fully restored and the rest are in the process of being restored.

Is there anything else you’d like to share with us?

It makes me proud to say that because of the WLM photo contest we were able to store almost 17,000 photos of monuments in Nepal, before and after the earthquake, in the open database of Wikimedia Commons.

We did lose something precious in this earthquake, but on a positive note, it gave us a chance to appreciate the importance of the documentation process. It has also made us more mindful of our resources. For example, unlike in our early years, we are avoiding the use of paper promotion, under the theme: 1 ream of paper = 6% of a tree and 5.4 kg of CO2 in the atmosphere; 3 sheets of A4 paper = 1 liter of water. To acquire a couple of thousand pictures, we can’t afford to sacrifice hundreds of trees. So, we have started an innovative new campaign to achieve our goals by:

  • Rigorous Emailing
  • Using Social Media
  • Tell a friend campaign
  • Central notice across Wikimedia projects; to be directed to our site etc.
  • Media coverage

Besides that, this year (in WLM 2017) we introduced a Photoride session to cover as many monumental sites as possible with a team of professional photographers. As a result, we traveled more than 80 km uptown and visited 4 major locations and more than 161 monument sites in Nepal. It’s been fun, and we are looking forward to the 2018 edition of the competition and many more photos of Nepal’s monuments to come.

Special thanks to Nirmal for answering our questions and also for the great work he is doing along with the other members of the WLM team in Nepal.

by Mohammad at March 08, 2018 09:26 PM

Wikimedia Tech Blog

New Wikipedia feature gives you the power to choose whatever language you want

The Wall of Love in Paris shows the phrase “I love you” in 250 languages of the world. Photo by Lystopad, CC BY-SA 4.0.

Few websites are as massively multilingual as Wikipedia. An article on Wikipedia may be available in several languages—and readers and editors may want to view that article in a language other than the one their browser automatically selects.

This creates some challenges for the Wikimedia Foundation’s Language team, which formed in 2011. We want to make sure that both readers and editors can always select the language they want—and just released a new feature to English Wikipedia that will make the language selection process easier.

In this post, I explain the history of interlanguage links on Wikipedia and detail how other organizations approach multilingual readers. But first, let’s look at why people may want to read an article in a particular language.

Serving multilingual readers

On Wikipedia, millions of articles are available in more than one language. Many are available in more than a hundred languages. For example, you can read the article about the jazz musician Louis Armstrong in 120 languages, the article about the Indonesian singer Anggun in 141, and the article about Beijing, the capital of China, in 218.

People have different reasons for wanting to read an article in a particular language. Most readers simply want to read an article in the language that they know best, but even that is often a challenge: many people search for the topic that interests them on a search engine, and then wind up on Wikipedia through their search results. But the search engine doesn’t necessarily bring them to the article in the language that they want.


At that point, if an article is available in their language, they may click on their language name in the sidebar. But what if the article is available in fifty languages? As our research has shown, finding their language name in such a long list is difficult, and many people are not even aware that a Wikipedia in their language exists.

It could be quite a long list (left sidebar). GIF from the English Wikipedia’s article on Earth, text available under CC BY-SA 3.0.

How different websites resolve this problem

There are different methods for showcasing language availability. As part of our research we explored existing approaches.

When a website is available in just a handful of languages, it is convenient to simply show a list of those languages’ names. But when the number of languages increases, using a plain list becomes problematic. Listing also requires you to think about how to order the languages: for example, where will Japanese appear? At “J”, according to its English name, between Italian and Korean? At “N”, by its native name “Nihongo”? Or perhaps towards the end of the list, because its own name is not written in the Latin alphabet?

On some websites and apps, the names of the languages are all written in the same language and sorted alphabetically; common machine translation websites, for example, work like that. This may seem convenient, but in fact, finding a language in a list of more than twenty items takes quite a few seconds, despite the alphabetical sorting. And if the computer is set to work in Hebrew, and it is used by somebody who knows only English, it will be very difficult to set English as the target language for translation if “English” is written as “אנגלית”.

Clearly, when the user can select from so many languages, there are many possible solutions. So how did we approach this challenge?

Sorting doesn’t come overnight

Wikimedia’s Language team has been working on this language sorting and selection problem since 2012, when we designed and released the first version of the Universal Language Selector extension (ULS). The ULS was initially made not as a way to move between versions of an article, but as a generic design for making it easy to select a language from a long list of languages in various contexts. Its first use was setting user preferences related to language.

The design included dividing the complete list of languages by the continents on which they are spoken, then further dividing them by writing system, and sorting alphabetically within each writing system. Another section was added at the top, with the languages that are most likely to be known to the reader: the languages on which they previously clicked, the language of their operating system and browser, and the languages of their country.

The panel also showed a search box, which allows users to find the language they need as quickly as possible, and in any language. So even if you don’t know Japanese and cannot type in it on your computer keyboard, you can type “Japanese” in English or “Ιαπωνικά” in Greek and find “日本語”.
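The idea behind that search box can be sketched as a lookup over language names written in many languages. The tiny index below is a hand-made illustration, not the real ULS data; the actual selector draws on a much larger multilingual database of language names.

```python
# Minimal sketch of ULS-style language search: a language can be found by
# typing its name in any language. This toy index is illustrative only.

NAME_INDEX = {
    "japanese": "ja",   # English name
    "nihongo": "ja",    # romanized native name
    "日本語": "ja",      # native name
    "ιαπωνικά": "ja",   # Greek name
    "greek": "el",
    "ελληνικά": "el",
}

def find_language(query):
    """Case-insensitive prefix search over all known names;
    returns the matching language codes."""
    q = query.strip().lower()
    return sorted({code for name, code in NAME_INDEX.items()
                   if name.startswith(q)})

print(find_language("Japa"))  # ['ja']
print(find_language("Ιαπ"))   # ['ja']
```

Because every name maps to the same language code, typing “Japanese”, “Nihongo” or “Ιαπωνικά” all lead to 日本語.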

ULS was soon adopted for use on Wikidata, in the Translate and Content Translation extensions, in the Upload Wizard, and in other locations on Wikimedia sites. However, the ULS didn’t tackle what may be the most visible and challenging context for language selection: moving to a version of the article that you’re reading in another language.

The only significant attempt to change the design of interlanguage links was made in 2010, when the list of languages was made completely hidden by default in an attempt to reduce visual clutter. The complete hiding of the links caused the number of clicks to drop by about 75%, and after several weeks this change was reverted.

The new compact language links.

This problem began to be addressed again in 2014, when Niharika Kohli, who is now a software engineer at the Wikimedia Foundation, adapted ULS to interlanguage links as part of her project with OPW, a program now known as Outreachy. This feature compacted the complete list of languages in which a page is available to at most nine items. This number was chosen according to a common guideline in design and psychology: The Magical Number Seven, Plus or Minus Two, the number of objects an average human can hold in working memory. The rest of the languages are shown behind a “More” button, which opens a ULS panel with all the languages in which the article is available.

The languages for the initial nine-item list were chosen according to the same criteria as the languages for the top section in the ULS panel. The highest priority is given to the languages that the user had clicked previously. This helps users avoid scanning the list again and again for their usual languages, and optimizes for repeated use, which has the largest impact on regular visitors to the site. In addition to previously clicked languages, languages are added according to the user’s operating system settings and the languages spoken in the country from which the user is connecting.

This feature was enabled as a beta feature in 2014, and the team started collecting feedback from the editors who enabled it.

A common theme in the feedback was the need to adapt the feature for Wikipedia editors, whose needs are different from those of casual readers. While casual readers usually want to read the article in the language that they know best, people may also want to read an article in a different language for other reasons. For example, the article in their language may be too short, and they want to try reading it in another language to learn more. Wikipedia editors may also want to look at the article in a language that is related to the article’s topic. For example, they may want to look at an article about a city in Tunisia in the Arabic Wikipedia even if they don’t know Arabic. This is useful for finding more images, comparing the article’s length and structure, finding the native spelling of names, and so on. They may also want to find out in which languages that article has “featured article” status.

Based on this feedback, the feature was modified to prioritize languages for a user’s initial list based on several more criteria: languages from the user’s Babel userboxes, which many Wikipedia editors use to tell the world about the languages they know; languages in which the article is featured; and languages that are used in the article’s text. It also started indicating languages in which the article is featured with an appropriate icon (usually a star). Many visual tweaks were also made: for example, the division into sections by continent was removed when it’s not needed—when the list is too short for the sections to be useful, or when the panel is showing search results.

How language links will look now, on desktops. GIF from the English Wikipedia’s article on John Maynard Keynes, text available under CC BY-SA 3.0.
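Putting the criteria above together, a rough sketch of how such a nine-item list could be assembled might look like the following. The signal ordering mirrors the description in this post (previously clicked languages first); the exact logic and weighting in the real Compact Language Links code may differ.

```python
# Rough sketch of assembling a compact language list from the signals
# described above. Signal names and sample data are illustrative.

def compact_list(available, previously_clicked, babel, os_langs,
                 country_langs, featured, limit=9):
    prioritized = []
    # Walk the signals in priority order, keeping only languages the
    # article is actually available in, without duplicates.
    for group in (previously_clicked, babel, os_langs,
                  country_langs, featured):
        for lang in group:
            if lang in available and lang not in prioritized:
                prioritized.append(lang)
    # Pad with the remaining available languages, then cap at `limit`.
    for lang in available:
        if lang not in prioritized:
            prioritized.append(lang)
    return prioritized[:limit]

langs = compact_list(
    available=["en", "de", "fr", "he", "ru", "ja", "es", "it",
               "pt", "nl", "ar", "zh"],
    previously_clicked=["he", "ru"],
    babel=["en", "he"],
    os_langs=["en"],
    country_langs=["fr"],
    featured=["de"],
)
print(langs)  # previously clicked first, then Babel, OS, country, featured
```

In this toy run the user’s usual languages (Hebrew, Russian) lead the list, and the remaining slots are filled from the other signals before falling back to the article’s full language list.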

Some Wikipedia editors also improved the database of languages by territory, which is maintained as part of the CLDR project. This improves the relevance of languages that are shown in the initial compact list according to geolocation, and it is also a fine example of how Wikipedians collaborate with other open data projects.

In June 2016, the team began gradually moving Compact Language Links out of beta status in different projects. Showing a compact language list had a notable impact: after one year, the percentage of users who click the interlanguage links almost doubled across all languages. Traffic through interlanguage links into all languages has grown, including languages that are smaller and aren’t tied to any country, such as Esperanto.

In February 2018, the English Wikipedia became the last Wikipedia in which the feature was taken out of beta. The English Wikipedia is the most read Wikimedia project; it is read by at least some people in all countries, and it acts as a gateway to Wikipedia in many other languages, so it’s particularly important that its interlanguage links are as optimized as possible for the global audience.

What does the future hold for the interlanguage links design? There are no solid plans to change anything at the moment, but it’s pretty clear that Compact Language Links is only the first step in the redesign of the interlanguage links. Future changes may include:

  • Showing links to all the languages, rather than just the languages in which the article is available. Actual implementation of this will require proper research and design, but this is supposed to let the users know that there is a Wikipedia in their language; currently, languages with fewer articles have a lower chance of showing up in the interlanguage links list. Links to languages in which the article is not available can lead the user to a list of basic facts using Article Placeholder or Wikidata, or to creating a translated article using Content Translation.
  • Showing the list of interlanguage links in a more prominent location on the page.
  • Redesigning the different elements near the language list: the gear icon for language settings, the “more” button, and Wikidata’s “Edit links” element.
  • Making the algorithm for prioritizing the languages common for the desktop site, the mobile site, and the mobile apps.

Wikipedia is already one of the web’s most linguistically diverse sites, and better design for its languages list may uncover an even bigger potential for language diversity.

Amir Aharoni, Product Analyst, Language team (Editing)
Wikimedia Foundation

by Amir E. Aharoni at March 08, 2018 03:24 PM

Lorna M Campbell

Celebrating Wikipedians for International Women’s Day

Today is International Women’s Day, and as a passionate Wikipedian, I want to give a shout out to everyone who is working so hard to address the gender balance of articles on Wikipedia. It’s no secret that Wikipedia has a gender problem; here’s the encyclopedia’s own article on the topic: Gender bias on Wikipedia. On the English-language Wikipedia, only 17.49% of biographical articles are about women. This figure is better for some languages, such as Welsh, which achieved gender equilibrium in 2016, and worse for others. This disparity is hardly surprising given that women account for only around 10-15% of editors on English Wikipedia. What is perhaps less well known is that, all over the world, editors, Wikimedia chapters, and Wikipedia projects, such as Wiki Women in Red, are working really, really hard to change this.

At the University of Edinburgh we are fortunate to have an amazing Wikimedian in Residence, Ewan McAndrew, who works tirelessly not only to embed open knowledge in the curriculum, but also to redress the gender imbalance of contributors by encouraging more women to become editors, and to improve the representation, coverage and esteem of articles about women in science, art, literature, medicine and technology.  Ewan regularly runs WikiWomen in Red editathons and events for International Women’s Day and Ada Lovelace Day to name but a few.

In my own fledgling Wikipedia editing career I’ve created a whole seven new articles, all of them about women, and I wouldn’t have been able to do this without Ewan’s support and guidance. Seven articles might seem like a drop in the ocean, but those little drops can ripple out and have an unexpected effect.

Last week in school, as part of The Young Women’s Movement, my daughter and her class were asked to write a couple of lines about a person who inspired them.  This is what my daughter wrote.

I choose my mum because she wants to empower women by making Wikipedia pages for them, she makes wikipedia pages for women who played a big part in history but aren’t known that well. #Gomymum

I hope by the time she is old enough to become an editor herself the gender balance of Wikipedia will have improved a good deal, but in the meantime, here’s a shout out to some of the amazing women who are helping to make that a reality:  Lucy Crompton-Reid, Daria Cybulska, Marshall Dozier, Charlie Farley, Josie Fraser, Gill Hamilton, Melissa Highton, Susan Ross, Ann-Marie Scott, Jo Spiller, Sara Thomas, Alice White, and all my Wikimedia UK colleagues and fellow Board members, past and present.

by admin at March 08, 2018 08:40 AM

March 07, 2018

Wiki Education Foundation

Ocean scientists sign on to share knowledge with the world

Earlier this month, I attended the Ocean Sciences Meeting, a venue for marine scientists to share knowledge and research across disciplines, including geology, physics, and chemistry. The meeting is co-hosted by the American Geophysical Union (AGU), The Oceanography Society (TOS), and the Association for the Sciences of Limnology and Oceanography (ASLO), whose members’ lives and careers are dedicated to understanding and conserving the world’s largest ecosystem. I spoke with dozens of university instructors, graduate students, undergraduates, and industry professionals, and the ethos toward Wikipedia was overwhelmingly positive. One researcher “confessed” to reading Wikipedia as a quick refresher on topics related to his research, and another instructor even remarked that Wikipedia is “one of the great things of the information age.”

At Wiki Education, we couldn’t agree more. Wikipedia is a valuable source of information about topics ranging from current events to biographies of famous people to geological evidence of climate change. While the encyclopedia serves as a valuable refresher for professionals already embedded in a field of study, it’s often the primary or only source people have access to when looking to learn from others’ expertise. If Wikipedia is where the general public accesses scientific knowledge, then we should work hard to make it as comprehensive, accurate, and up-to-date as possible.

In our Classroom Program, university instructors are improving Wikipedia by assigning students to write articles as a part of their course curriculum. Students access Wiki Education’s online trainings and other tools to learn how to turn academic research into a thorough, well-cited Wikipedia article. Students are motivated to do good work because they see purpose behind their hours of labor, research, and writing. Instructors help share knowledge with the world—knowledge that otherwise might end up in a recycling bin.

I also attended the American Geophysical Union’s Fall meeting in December 2017. The event is open to their members beyond oceanographers, and thousands gathered under the theme: What will you discover?. Now that we’ve recruited several of AGU’s members, I’m hopeful they will work with their students to discover the wider implications of freeing knowledge to share with the world. We’re actively looking for more geology and oceanography students to improve Wikipedia, making information about the earth more readily accessible to its inhabitants. If you’re an instructor looking for a meaningful, real-world activity for students, visit our website for steps to get involved or email us at contact@wikiedu.org.

by Jami Mathewson at March 07, 2018 05:01 PM

March 06, 2018

Wikimedia Foundation

Wikimedia releases eighth transparency report

Photo by Flicka, CC BY-SA 3.0.

In the last six months of 2017, thousands of volunteers took the time to edit and add content to Wikipedia and the other Wikimedia projects. Each edit was a step towards achieving the Wikimedia movement’s vision of providing free access to the sum of human knowledge.

As the projects grow, the Wikimedia Foundation receives requests from private parties and government entities to delete or change content, or to disclose nonpublic information about users. At the center of our guiding principles are our commitments to transparency, privacy, and freedom of expression, which is why we push back on any requests that are inappropriate or fall short of our stringent standards.

Twice a year, we publish our transparency report, which details the number of requests we received, their types, countries of origin, and other information. The report also includes an FAQ and stories about interesting and unusual requests.

The report focuses on five main types of requests:

Content alteration and takedown requests. From July to December of 2017, we received 343 requests to alter or remove project content, seven of which came from government entities. Once again, we granted zero of these requests. The Wikimedia projects thrive when the volunteer community is empowered to curate and vet content. When we receive requests to remove or alter that content, our first action is to refer requesters to experienced volunteers who can explain project policies and provide them with assistance.

Copyright takedown requests. Wikimedia projects feature a wide variety of content that is freely licensed or in the public domain. However, we occasionally receive Digital Millennium Copyright Act (DMCA) notices asking us to remove content that is allegedly copyrighted. All DMCA requests are reviewed thoroughly to determine if the content is infringing a copyright, and if there are any legal exceptions, such as fair use, that could allow the content to remain on the Wikimedia projects. From July to December of 2017, we received 12 DMCA requests. We granted two of these. This relatively low number of DMCA takedown requests for an online platform is due in part to the high standards of community copyright policies and the diligence of project contributors.

Right to erasure. From July to December of 2017, the Wikimedia Foundation received one request for content removal that cited the right to erasure, also known as the right to be forgotten. We did not grant this request. The right to erasure in the European Union was established in 2014 by a decision in the Court of Justice of the European Union. As the law now stands, an individual can request the delisting of certain pages from appearing in search results for their name. The Wikimedia Foundation remains opposed to these delistings, which negatively impact the free exchange of information in the public interest.

Requests for user data. Sometimes, the Wikimedia Foundation receives requests for nonpublic user data from government entities, organizations, and individuals. These requests can range from an informal email to a formal court order or subpoena. From July to December of 2017, we received 14 requests for user data and partially complied with one of them. Unlike many other online platforms, we collect very little nonpublic information about our users, so we often do not have data that is responsive to the requests. We will only produce information if a request is legally valid and follows our requests for user information procedures and guidelines.

Emergency disclosures. Very rarely, the Wikimedia Foundation will be made aware of concerning information on the projects, such as suicide or bomb threats. Under these extraordinary circumstances, we may voluntarily disclose information to the proper authorities, consistent with our privacy policy, to help resolve the issue in a peaceful manner. Additionally, we provide an emergency request procedure for law enforcement to seek information that may prevent imminent harm. From July to December of 2017, we voluntarily disclosed information in 13 cases, and provided data in response to one emergency request, for a total of 14 emergency disclosures.

The Wikimedia Foundation is committed to transparency. We invite you to read the full transparency report online, where you will find more data, frequently asked questions, and interesting stories from the last six months. We’ll also be distributing an updated version of our print edition at upcoming events, beginning in early March. Additionally, feel free to read the blog posts about our past transparency reports for more insight on how the information in this report compares to reports from the past.

Jim Buatti, Legal Counsel
Leighanna Mixter, Legal Counsel
Aeryn Palmer, Senior Legal Counsel
Wikimedia Foundation

The transparency report would not be possible without the contributions of Siddharth Parmar,  Jacob Rogers, Jan Gerlach, Katie Francis, Rachel Stallman, Eileen Hershenov, James Alexander, Shunsuke Terakado, Emine Yildirim, Matt Wes, Niharika Malhotra, and the entire Wikimedia communications team. Special thanks to Dan Douglas for help in preparing this blog post, and to the entire staff at Oscar Printing Company.

by Jim Buatti, Leighanna Mixter and Aeryn Palmer at March 06, 2018 08:59 PM

Amir E. Aharoni

The Curious Problem of Belarusian and Igbo in Twitter and Bing Translation

Twitter sometimes offers machine translation for tweets that are not written in the language that I chose in my preferences. Usually I have Hebrew chosen, but for writing this post I temporarily switched to English.

Here’s an example where it works pretty well. I see a tweet written in French, and a little “Translate from French” link:

Emmanuel Macron on Twitter.png

The translation is not perfect English, but it’s good enough; I never expect machine translation to have perfect grammar, vocabulary, and word order.

Now, out of curiosity I happen to follow a lot of people and organizations who tweet in the Belarusian language. It’s the official language of the country of Belarus, and it’s very closely related to Russian and Ukrainian. All three languages have similar grammar and share a lot of basic vocabulary, and all are written in the Cyrillic alphabet. However, the actual spelling rules are very different in each of them, and they use slightly different variants of Cyrillic: only Russian uses the letter ⟨ъ⟩; only Belarusian uses ⟨ў⟩; only Ukrainian uses ⟨є⟩.

Despite this, Bing gets totally confused when it sees tweets in the Belarusian language. Here’s an example from the Euroradio account:

Еўрарадыё euroradio Twitter double.png

Both tweets are written in Belarusian. Both of them have the letter ⟨ў⟩, which is used only in Belarusian, and never in Ukrainian or Russian. The letter ⟨ў⟩ is also used in Uzbek, but Uzbek never uses the letter ⟨і⟩. If a text uses both ⟨ў⟩ and ⟨і⟩, you can be certain that it’s written in Belarusian.

And yet, Twitter’s machine translation suggests to translate the top tweet from Ukrainian, and the bottom one from Russian!

An even stranger thing happens when you actually try to translate it:

Еўрарадыё euroradio Twitter single Russian.png

Notice two weird things here:

  1. After clicking, “Ukrainian” turned into “Russian”!
  2. Since the text is actually written in Belarusian, trying to translate it as if it was Russian is futile. The actual output is mostly a transliteration of the Belarusian text, and it’s completely useless. You can notice how the letter ⟨ў⟩ cannot be transliterated.

Something similar happens with the Igbo language, spoken by more than 20 million people in Nigeria and other places in Western Africa:

 4  Tweets with replies by Ntụ Agbasa   blossomozurumba    Twitter.png

This is written in Igbo by Blossom Ozurumba, a Nigerian Wikipedia editor, whom I have the pleasure of knowing in real life. Twitter identifies this as Vietnamese—a language of South-East Asia.

The reason for this might be that both Vietnamese and Igbo happen to be written in the Latin alphabet with addition of diacritical marks, one of the most common of which is the dot below, such as in the words ibụọla in this Igbo tweet, and the word chọn lọc in Vietnamese. However, other than this incidental and superficial similarity, the languages are completely unrelated. Identifying that a text is written in a certain language only by this feature is really not great.

If I paste the text of the tweet, “Nwoke ọma, ibụọla chi?”, into translate.bing.com, it is auto-identified as Italian, probably because it includes the word chi, and a word that is written identically happens to be very common in Italian. Of course, Bing fails to translate everything else in the Tweet, but this does show a curious thing: Even though the same translation engine is used on both sites, the language of the same text is identified differently.
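As a toy illustration of the alphabet-based reasoning above, the marker letters could be checked like this. The function name and rules here are my own, and real language identification relies on statistical models over far more evidence than a handful of characters:

```python
# Illustrative only: guess among Belarusian, Ukrainian, and Russian by the
# distinguishing Cyrillic letters discussed in the post. Real language
# identifiers use statistical models, not single-character rules.

def guess_cyrillic_language(text):
    """Return an ISO 639-1 code guess, or None if inconclusive."""
    lowered = text.lower()
    has_u_short = "ў" in lowered  # among the three, only Belarusian uses ⟨ў⟩
    has_i = "і" in lowered        # Belarusian and Ukrainian use ⟨і⟩, Russian never does
    if has_u_short and has_i:
        return "be"  # Belarusian
    if "є" in lowered:
        return "uk"  # only Ukrainian uses ⟨є⟩
    if "ъ" in lowered:
        return "ru"  # only Russian uses ⟨ъ⟩
    return None

print(guess_cyrillic_language("Еўрарадыё і навіны"))  # → be
```

Even this crude check would have avoided labeling the Euroradio tweets as Ukrainian or Russian, which is what makes Bing’s misidentification so puzzling.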

How could this be resolved?

Neither Belarusian nor Igbo is supported by Bing. If Bing is the only machine translation engine that Twitter can use, it would be better to skip translation entirely and not offer anything than to offer this strange and meaningless output. Of course, Bing could start supporting Belarusian; it has a smaller online presence than Russian and Ukrainian, but their grammars are so similar that it shouldn’t be that hard. But what can be done until that happens?

In Wikipedia’s Content Translation, we don’t give exclusivity to any machine translation backend, and we provide whatever we can, legally and technically. At the moment we have Apertium, Yandex, and YouDao, in languages that support them, and we may connect to more machine translation services in the future. In theory, Twitter could do the same and use another machine translation service that does support the Belarusian language, such as Yandex, Google, or Apertium, which started supporting Belarusian recently. This may be more a matter of legal and business decisions than a matter of engineering.

Another thing for Twitter to try is to let users specify the languages in which they write. Currently, Twitter’s preferences only allow selecting one language, and that is the language in which Twitter’s own user interface will appear. It could also let users say explicitly which languages they write in. This would make language identification easier for machine translation engines. It would also make some business sense, because it would be useful for researchers and marketers. Of course, it must not be mandatory, because people may want to avoid providing too much identifying information.

If Twitter or Bing Translation were free software projects with a public bug tracking system, I’d post this as a bug report. Given that they aren’t, I can only hope that somebody from Twitter or Microsoft will read it and fix these issues some day. Machine translation can be useful, and in fact Bing often surprises me with the quality of its translation, but it has silly bugs, too.

by aharoni at March 06, 2018 07:39 PM

Wikimedia Foundation

Community digest: Conversations with Wikimedia women; Arabic Wikinews and Global Voices collaboration; news in brief

Women in the Wikimedia movement: Conversations with communities

Photo by Victor Grigas/Wikimedia Foundation, CC BY-SA 3.0.

In an effort to celebrate Women’s History Month, and commemorate International Women’s Day, the Wikimedia Foundation’s Community Engagement department is hosting a series of conversations about women in the Wikimedia movement.

We’d like to take this opportunity to highlight women’s roles in the many spheres of the movement they occupy, to learn what challenges they face, and how they are thriving in their work.

The discussions will address women’s insights in three different spheres: Women in Wikimedia programs, Women in Wikimedia technical spaces, and Women in Wikimedia leadership.

Invited community members will be presenting at these three events, and everyone from across the movement is invited to participate in these discussions, which will be streamed on YouTube as follows:

The goal of these discussions is not only to inform our community about women’s roles in this movement, but to better understand the challenges and inequalities that they may encounter. The gender gap in the Wikimedia movement is a known and documented issue. Conducting surveys and holding discussions with female contributors, among other tools, help us understand the nuances behind these issues.

The discussions can also help raise the awareness about the amazing women currently contributing to the movement in different capacities.

The latest movement-wide survey reveals that while women constitute less than 15 percent of the total contributors, the gap is smaller by 10% when it comes to leadership roles in affiliate groups and program coordination. Having a better understanding of human experiences is invaluable for our work towards more diverse Wikimedia projects where we can replicate these success stories in different communities.

María Cruz, Communications and Outreach Manager, Community Engagement


Announcing a collaboration between Wikinews and Global Voices Lingua

We are happy to announce a collaboration between the Arabic Wikinews community and Global Voices Lingua to support Arabic digital content.

Global Voices Lingua is a project managed by a global volunteer community that translates what our editors and writers report from 167 countries, working in our entirely virtual non-profit newsroom. The community also translates content from other Global Voices projects: Advox, Rising Voices, and NewsFrames.

Wikinews is a Wikimedia project that relies on a community of volunteers to present reliable, unbiased, and relevant news. The project started in 2004, and as of February it hosted 4,076 Arabic news posts, making it the twelfth largest Wikinews.

The idea of this collaboration started during WikiArabia 2017, the annual meetup of the Arabic Wikimedia community, in Cairo.

Since the beginning of this collaboration, Arabic Wikinews has risen to 10th place, thanks to the import of 3,682 Arabic Lingua posts via جار الله [Jarallah]’s bot.

Adding a copy to Wikinews will give the posts a new life, based on the open editing nature of Wikinews. Arabic Lingua content will also be available offline on Wikimedia platforms via the ZIM file format, viewable in Kiwix.

Arabic Lingua editors and contributors will be supporting the Arabic content on Wikipedia. We are working on extending this collaboration to other languages!

Mohamed ElGohary, Global Voices Lingua Manager.

This post originally appeared on Global Voices’ blog, where it is licensed under CC BY 3.0. It has been edited for publication on the Wikimedia blog.


In brief

Steward election results announced: Stewards are users with complete access to the wiki interface on all public Wikimedia wikis, including the ability to change any and all user rights and groups. New stewards are regularly elected by the global Wikimedia community and the elections are organized by existing stewards and run for roughly three weeks. On 28 February, five newly elected stewards were announced from a diverse group of Wikimedians around the world. More information on Wikimedia-l.

Global preferences available for testing: The GlobalPreferences extension allows user preferences to be set for all wikis in a wiki farm. This means that users can change their preferences for all linked wikis without having to visit each in turn. Local exceptions can be set for individual wikis. The extension is now ready for testing on the English Wikipedia, the German Wiktionary and the Hebrew Wikipedia before making it available in all wikis. The GlobalPreferences extension is one of the items from the 2016 community wishlist. More on how to test the beta edition is on Wikimedia-l.

Creative Commons invites Wikimedians to attend their annual global summit: The annual Global Summit brings together an international community of leading technologists, legal experts, academics, activists, and community members who work to promote the power of open and future of the Commons worldwide. Planned Wikimedia-related events include a keynote by Katherine Maher, the Wikimedia Foundation’s executive director, in addition to several discussions and sessions on how Wikimedia and Creative Commons work together. Wikimedians are invited to participate and registration is now open.

Wikidata hits 50 million (but not really): On 23 February, Wikidata’s Q50000000 was created—an item page about El Refugio, a village in the Mexican state of Durango, in the municipality of Mapimí. Perhaps strangely, though, this does not mean that Wikidata now has 50 million items. Wikimedian Denny Vrandečić wrote on Facebook that it is “merely” just a number, and that “there are about 45 [m]illion items currently.”

Swedish embassies host Wikipedia editathons on International Women’s Day: On 8 March, International Women’s Day, the Swedish embassies in several cities around the world will host and support Wikipedia editing workshops that will focus on increasing the gender diversity in Wikipedia’s content. The event locations include Israel, Egypt, Indonesia, Colombia, and more.

Samir Elsharbaty, Writer, Communications
Wikimedia Foundation


by María Cruz, Mohamed ElGohary and Samir Elsharbaty at March 06, 2018 07:31 PM

March 05, 2018

Vinitha VS

Beginning, not the end

“When you reach the end of what you should know, you will be at the beginning of what you should sense.”

–Kahlil Gibran

Officially, Outreachy 15 has come to an end. Yet, this is more like a beginning for me, than an end. The knowledge I have gained in these months has given me more confidence and courage to explore deeper and wider.  This blog is mainly for those girls who are still doubtful about applying to Outreachy or who have to face the fears of a beginner. I am going to tell you about some fears you might face, which I too had in the beginning. I got an opportunity to be mentored by awesome mentors. Hope you get to meet yours too.

When I heard of Outreachy I wanted to try my best and get selected, but I also feared exposing my ignorance. There was always a fear of “what if I do something wrong?” For everyone in this position, the only words you need to remind yourself of are “It is alright to be wrong.” We have always been told to do right, but here you have an excuse. Writing wrong code while learning to code has never harmed anyone. No one who started out started perfectly. The more mistakes you make, the more you know what not to do. You only need to take care not to repeat the same mistakes. Try to make new ones! After making a lot of mistakes, you gain enough confidence: confidence to make many more mistakes fearlessly. During this process, you learn how to write code that works.

How to begin? There’s nothing wrong with asking for help about how to begin, but there is an even better way. Identify the bugs tagged as ‘easy to solve’ in the project’s git codebase and try fixing them. You may not know the programming language used, and you may fail miserably, but you have made the first move in the right direction. Try to understand what the code is doing, and search online for anything you don’t understand. This can be very intimidating in the beginning. Trust me when I say that you are still on the right path. The more time you spend with the code, the better you will begin to understand it. If you are able to fix the bug, well and good. When you are a beginner, you will usually take more time. If you are not able to fix it, start asking specific questions like “I tried to do <this> and it is failing <here>. Can I get some help?” You will usually get help at this stage. If you don’t, there is nothing to worry about; everything is still fine. Keep tweaking the code and keep asking. The more effort you put in, the better the chances that you are heard.

If there is one skill worth acquiring before you start, it is a basic knowledge of git, because fixing bugs begins with cloning the code and working on it. The good news is that it won’t take you much time to learn the basics of git; initially, you only need to get the code onto your system, and there are many online tutorials, blogs, and videos that can help you get started.
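As an illustration only, the basic moves look something like this. The repository path, file, and branch name are all made up; in practice you would `git clone` the URL from your project’s contribution docs instead of creating a throwaway local repository:

```shell
# Illustrative only: a throwaway local repository stands in for a real
# project clone (in practice: git clone <project-url>).
set -e
rm -rf /tmp/outreachy-demo
mkdir -p /tmp/outreachy-demo
cd /tmp/outreachy-demo
git init -q .
git config user.email "you@example.com"   # identity needed to commit in a fresh repo
git config user.name  "Demo User"

# Work on a topic branch named after the bug you picked.
git checkout -q -b fix/easy-bug
echo "print('hello')" > fix.py
git add fix.py
git commit -q -m "Fix easy bug: add fix.py"
git log --oneline                          # shows your new commit
```

The pattern of branching per bug keeps each fix isolated, which makes it easy to submit and revise one change at a time.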

There is no right time or wrong time to start. If you have some time, start by checking the participating organizations and find the project that excites you.  Don’t give up. The hard work is totally worth it.


by vinithavs at March 05, 2018 08:04 PM

Wikimedia Foundation

On Wikipedia, Black Panther won the Olympic gold

Photo by Gage Skidmore, CC BY-SA 2.0.

Did you know that you can view Wikipedia’s most popular articles from each day or month?

Well, now you do. And the most popular article on the English Wikipedia last month was the Marvel superhero film Black Panther, which was released in the United States on 16 February to widespread critical and popular acclaim. Rotten Tomatoes summarizes the reviews, 97% of which it measured as positive, by stating that “Black Panther elevates superhero cinema to thrilling new heights while telling one of the [Marvel Cinematic Universe‘s] most absorbing stories—and introducing some of its most fully realized characters.”

Beyond the article on the film, Black Panther influenced three other entries in the top 25 most popular articles: Michael B. Jordan, who portrayed the film’s antagonist; Chadwick Boseman, who played the protagonist (#17); and the article about the comic book version of Black Panther (#23).

The second most-popular article was about Sridevi. A Bollywood actress widely known for being the industry’s first woman superstar, Sridevi passed away on 24 February. Having appeared in major films in five different Indian languages, Wikipedia’s article on Sridevi notes that she had “pan-Indian appeal” and was once voted the country’s best actress from the last one hundred years.

The Winter Olympics came, perhaps surprisingly, only in third place. Norway came away with the most medals overall and the most gold medals, the latter tied with Germany.


Here are the top ten most popular articles from February 2018. For the full list, head over to Topviews Analysis.

  1. Black Panther (film), 8,163,571
  2. Sridevi, 6,359,702
  3. 2018 Winter Olympics, 4,828,193
  4. Elon Musk, 3,471,643
  5. Exo (band), 3,415,620
  6. Sylvester Stallone, 3,147,866
  7. Deaths in 2018, 2,983,212
  8. Shaun White, 2,830,665
  9. The Cloverfield Paradox, 2,407,298
  10. Michael B. Jordan, 2,257,866
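
Rankings like this can also be pulled programmatically from Wikimedia’s public Pageviews REST API, which tools such as Topviews build on. A minimal sketch of the monthly “top articles” endpoint, with a helper name of my own:

```python
# Hypothetical helper (name my own) building the public Wikimedia Pageviews
# REST API endpoint for the most-viewed articles of a given month.

def top_articles_url(project, year, month):
    """Build the monthly 'top viewed articles' endpoint URL for a wiki."""
    return (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/top/"
        f"{project}/all-access/{year}/{month:02d}/all-days"
    )

# For February 2018 on the English Wikipedia:
print(top_articles_url("en.wikipedia", 2018, 2))
# → https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia/all-access/2018/02/all-days

# Fetching that URL (requires network) returns JSON with a ranked list:
# import json, urllib.request
# data = json.load(urllib.request.urlopen(top_articles_url("en.wikipedia", 2018, 2)))
# for entry in data["items"][0]["articles"][:10]:
#     print(entry["rank"], entry["article"], entry["views"])
```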


Some notes related to these:

  • Elon Musk (#4): Pageviews to the article about the business magnate and entrepreneur spiked around the launch of Falcon Heavy, a launch vehicle designed by Musk-owned SpaceX.
  • Exo (band) (#5): Wikimedia UK, an independent organization devoted to promoting the Wikimedia movement in the UK, sparked a Twitter feud between fans of the Korean pop bands Exo and BTS. BTS won the poll there in a landslide, but Exo took the pageviews crown.
  • The Cloverfield Paradox (#9): Regardless of your opinion on the film itself, Netflix’s strategy to release it right after the Super Bowl certainly garnered attention — enough to put it on this list. And while The A.V. Club’s Alex McLevy has argued that this “stunt … derailed basically all the promotional steam” for Altered Carbon, a Netflix series released two days before Cloverfield, the show slotted in at #15.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

*We’ve removed articles with above 90% or below 10% mobile views, as this is usually an indicator of being a false positive. See more.

You can also view this post with embedded tweets on our Medium.

by Ed Erhart at March 05, 2018 07:27 PM

Wiki Education Foundation

Roundup: American Women’s and Gender History

History is full of strong, complicated women. No matter where you look in history, the odds are high that you will find a woman who made a significant difference. Whether their contributions were cheered or jeered, you cannot deny that some of the most fascinating people in history happened to be women. In some instances these women risked ostracism or worse, as their actions went against the expected gender roles of their time and culture. It is therefore unsurprising that Tamar Carroll’s class at the Rochester Institute of Technology chose American Women’s and Gender History as its focus.

One such interesting woman is Kateri Tekakwitha, an Algonquin–Mohawk tribeswoman who chose to go against the expectations of her tribespeople in order to convert to Catholicism and pursue a lifetime of chastity. Refusing to marry her family’s choice, Tekakwitha joined a mission in 1677 and in 2012 was canonized as a saint, making her the first Native American woman of North America to be canonized by the Roman Catholic Church. Students also worked on articles about women such as Lotte Reiniger, who created The Adventures of Prince Achmed in 1926, now considered to be the oldest surviving animated feature film. Reiniger is considered to be the foremost pioneer of silhouette animation and her work has influenced many creative professionals, which can be seen in films such as the opening for the 1940 Disney masterpiece Fantasia.

American Chemist, Nutritionist, and Professor Katharine Blunt. (Image uploaded by a student)

Another focus of the class was women who worked with science and the community, as in the case of Mary Richmond and Kate Gordon Moore. Richmond was a social worker active in the late 1800s and early 1900s, whose works still impact social work education today. Gordon Moore was a psychologist who focused on color vision and memory in her early career. Her research focus shifted twice during her career, from color vision to education and then to the imagination. The article for Katharine Blunt was also edited to include mention that Blunt — a chemist, professor, and nutritionist — worked within the field of home education in the early 1900s, a field that offered many women the chance to enter the male-dominated academic community of the time.

Students also focused on women with more complicated histories and careers. Sybil Neville-Rolfe is one such woman: a noted social hygienist and a founder of the Eugenics Society. Neville-Rolfe believed that a person’s worth was determined by their genetics: a person with good genetics would become successful and someone with poor genetics would not, regardless of their background. In contrast to some eugenicists, however, Neville-Rolfe did not assume that unmarried mothers or prostitutes were automatically genetically deficient. She instead believed that these women were victims of the poor morals of the era and argued that steps should be taken to provide them with better education and support. Neville-Rolfe would likely have held the same opinion of another woman the students wrote about, Sarah Grosvenor. Grosvenor lived during the colonial era, which lacked the high-tech health care that many of us enjoy today. She became pregnant in her late teens while in a secret relationship with an unmarried man in his twenties. As pregnancy out of wedlock was heavily frowned upon and her lover was allegedly uninterested in marrying her, Grosvenor sought an abortion from her local physician. Just over a week after the procedure was completed and Grosvenor miscarried, the young woman grew sick and died, likely because the procedure was performed in unsanitary conditions.

Are you interested in having your class participate in Wiki Education’s program? If so, contact us at contact@wikiedu.org to find out how you can gain access to tools, online trainings, and printed materials.

Header images: File:Katharine Blunt.png, by ConnColl Collections, CC BY-SA 4.0, via Wikimedia Commons. File:Lotte Reiniger 1939.jpg, public domain, via Wikimedia Commons. File:Penelope Barker (1728 – 1796).jpg, public domain, via Wikimedia Commons.

by Shalor Toncray at March 05, 2018 04:52 PM

Wikimedia Cloud Services

Labs and Tool Labs being renamed

(reposted with minor edits from https://lists.wikimedia.org/pipermail/labs-l/2017-July/005036.html)


  • Tool Labs is being renamed to Toolforge
  • The name for our OpenStack cluster is changing from Labs to Cloud VPS
  • The preferred term for projects such as Toolforge and Beta-Cluster-Infrastructure running on Cloud-VPS is VPS projects
  • Data Services is a new collective name for the databases, dumps, and other curated data sets managed by the cloud-services-team
  • Wiki replicas is the new name for the private-information-redacted copies of Wikimedia's production wiki databases
  • No domain name changes are scheduled at this time, but we control wikimediacloud.org, wmcloud.org, and toolforge.org
  • The Cloud Services logo will still be the unicorn rampant on a green field surrounded by the red & blue bars of the Wikimedia Community logo
  • Toolforge and Cloud VPS will have distinct images to represent them on wikitech and in other web contexts

In February when the formation of the Cloud Services team was announced there was a foreshadowing of more branding changes to come:

This new team will soon begin working on rebranding efforts intended to reduce confusion about the products they maintain. This refocus and re-branding will take time to execute, but the team is looking forward to the challenge.

In May we announced a consultation period on a straw dog proposal for the rebranding efforts. Discussion that followed both on and off wiki was used to refine the initial proposal. During the hackathon in Vienna the team started to make changes on Wikitech reflecting both the new naming and the new way that we are trying to think about the large suite of services that are offered. Starting this month, the changes that are planned (T168480) are becoming more visible in Phabricator and other locations.

It may come as a surprise to many of you on this list, but many people, even very active movement participants, do not know what Labs and Tool Labs are and how they work. The fact that the Wikimedia Foundation and volunteers collaborate to offer a public cloud computing service that is available for use by anyone who can show a reasonable benefit to the movement is a surprise to many. When we made the internal pitch at the Foundation to form the Cloud Services team, the core of our arguments were the "Labs labs labs" problem and this larger lack of awareness for our Labs OpenStack cluster and the Tool Labs shared hosting/platform as a service product.

The use of the term 'labs' in regards to multiple related-but-distinct products, and the natural tendency to shorten often used names, leads to ambiguity and confusion. Additionally the term 'labs' itself commonly refers to 'experimental projects' when applied to software; the OpenStack cloud and the tools hosting environments maintained by WMCS have been viable customer facing projects for a long time. Both environments host projects with varying levels of maturity, but the collective group of projects should not be considered experimental or inconsequential.

by bd808 (Bryan Davis) at March 05, 2018 04:44 AM

Frank Schulenburg

Which gear do the top Commons photographers use?

Today, I was curious to know which gear Commons photographers use. As Wikimedia Commons doesn’t provide any data on camera types (most other photography sites do), I decided to collect the data myself. Now, that being said, it might be obvious that I had to make some choices. Every day, many thousands of photos get […]

by Frank Schulenburg at March 05, 2018 03:41 AM

Tech News

Tech News issue #10, 2018 (March 5, 2018)

2018, week 10 (Monday 05 March 2018)

March 05, 2018 12:00 AM

March 04, 2018

Weekly OSM

weeklyOSM 397



Population density in Europe based on the distribution of petrol stations | © Picture by Dominic Royé


  • Rory McCann asks on the talk mailing list how to connect rivers that flow through lakes, thus forming a network suitable for routing calculations. Extending the riverways into the lake works fine, but the extra segment in the lake should be excluded for determining the length of the actual river.
  • Two tags are in use for passenger information in railway stations. The tagging mailing list discussed the pros and cons of standardising on the more widely used tag passenger_information_display=yes.
  • Robert Whittaker announced a tool for comparing OSM Post Office data with that recently released by the company now running post offices in the UK. The newly-released data has also been republished here by Owen Boswarva.


  • Dominic Royé tweeted a map that displays population density through the number of gas stations in Europe. A further tweet shows the same data for the whole world. The original joke, for those unfamiliar, is here.
  • Several users on the Talk-IT mailing list discuss (automatic translation) the use of information from Google StreetView for OSM mapping. It’s clear to everyone that plain copying is not allowed by the terms of service, and that explicitly listing StreetView among the sources for an edit is a call for legal trouble.
  • Alleged problems in OSM are discussed (de) (translation) again on the German forum. The use of the OSM-Carto style as the standard map is increasingly under attack for a number of perceived weaknesses. The thread was prompted by Serge Wroclawski’s blog entry (reported here earlier).
  • The University of Malta and the University of New York have published a paper dedicated to the automatic creation of game puzzles using Wikipedia articles, OpenStreetMap data and Wikimedia Commons.
  • User SeleneYang published a diary entry (translation) about the International Gender Representation Survey led by Geochicas, an OSM women’s mapping group.

OpenStreetMap Foundation

  • Freemap Slovakia has applied to become an official OSM Local Chapter.
  • The OSM-US Board has completed this year’s election process. For the first time, the board used the Single Transferable Vote procedure.


  • A “call for logo” has been announced for the State of the Map LatAm 2018, which will be held in Buenos Aires, Argentina. The deadline for submissions is March 15th.
  • OSM Burundi organized its State of the Map on February 23 and announced a partnership with SmartBurundi providing free 3G SIM cards for OSM contribution.
  • Tickets for State of the Map 2018 are now on sale, at the Early Bird discount rate. It’s time to start planning the trip to Milan for the end of July!

Humanitarian OSM

  • Fanga e. V. aims to provide young people in Burkina Faso (West Africa) with training and access to medical care. A HOT Tasking Manager project has been set up to help update the map there.
  • A recent HOT project succeeded in mapping Dar es Salaam’s smallest decision making structures, the “shina”. These administrative units, unmapped until November 2017, are now available on an official map and will help improve public health services, emergency response and decision making for local authorities and community members. They’re not yet in OSM though – the highest level boundary currently there is at the “subward” level such as Makangarawe here.
  • The International Journal of Geo-Information has published (PDF) a scientific paper by Wei Liu et al. on mapping flood-prone regions in Nepal with the help of OSM.


  • Omniscale (a map hosting company using OSM data) has released a new outdoor map style. The map is available as WMTS and WMS and is updated live. (via Twitter)
  • Rescue services in Luxembourg use (amongst other services) OpenStreetMap. It is the most up-to-date map in the area shown here. (via Twitter)

Open Data

  • Sidorela Uku reports about the first successes of the Albanian OSM community, which received geodata from the city of Tirana under an OSM-compatible license, and hopes that further cities will follow this model.
  • The English Environment Agency announced which areas are targeted for new Lidar imagery this winter. This is part of the programme to achieve complete coverage of England by 2020.
  • This year’s global Open Data Day will take place on Saturday, March 3. OSM is also represented, e. g. in Düsseldorf.


  • The web application “is OSM up-to-date?” by Francesco Frassinelli checks the last edit of nodes and ways, based on the assumption that older data is more likely to be outdated. The application produces a colour overlay that makes this old data easy to spot.
  • During the most recent Karlsruhe hack weekend Hartmut added support for uploading Umap export files into his MapOSMatic instance. This now allows for easy printing of custom maps created with Umap.


  • Frederik Ramm has added an option to “planet-dump-ng” (which generates the planet dumps from OSM that all other data extracts rely on). The new functionality optionally allows the removal of user metadata. This may be necessary because of the GDPR, and will also reduce the size of planet dumps.
  • Version 3.6.2 of Mapnik’s NodeJS bindings is the last one supporting Windows, unless a new maintainer can be found.
  • The geodata analysis team of HeiGIT is currently developing a historical OpenStreetMap analysis platform. The goal is to make OSM data from the past more readily available for various types of data analysis tasks at the global level.
  • Mapbox Navigation SDK for iOS v0.14.0 uses Amazon Polly–powered spoken instructions by default for most languages, cleans up notifications, and displays map labels in the local language by default.
  • Mapbox has published revamped developer documentation about its Maps SDK for Unity.
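
The announcement does not name the new planet-dump-ng option, but the kind of anonymisation involved is easy to illustrate: the per-element user and uid attributes are dropped while geometry, versions and timestamps survive. A hypothetical Python sketch on a tiny .osm fragment:

```python
import xml.etree.ElementTree as ET

# A tiny, hand-written OSM XML fragment; real planet files are processed by
# planet-dump-ng, but the same anonymisation fits in a few lines.
OSM_XML = """<osm version="0.6">
  <node id="1" lat="52.5" lon="13.4" user="alice" uid="7" version="2"
        timestamp="2017-06-01T12:00:00Z"/>
</osm>"""

def strip_user_metadata(xml_text):
    """Drop the per-element 'user' and 'uid' attributes, keep everything else."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        for attr in ("user", "uid"):
            element.attrib.pop(attr, None)
    return ET.tostring(root, encoding="unicode")

anonymised = strip_user_metadata(OSM_XML)
# The node keeps its id, coordinates, version and timestamp, but no user data.
```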


  • The latest version of “OpenStreetMap Carto” (the default style on the OpenStreetMap.org website) has been released. 4.8.0 now also shows some historical tags like Wayside shrines and Forts (and discussion continues about other historical things such as castles).
  • QGIS 3.0 ‘Girona’ has been released for Windows, MacOS X, Linux and Android.
  • For most OSM applications you will find the current release, as always updated by Wambacher, on the OSM Software Watchlist.

OSM in the media

  • Several German news websites published a brief article on OpenRailwayMap.

Other “geo” things

  • A pilot project from DataSeed has succeeded in mapping high-voltage towers in Pakistan, using machine learning to speed up manual mapping. This strategy, called Intelligence Augmentation, allowed speed-ups of 16x and 19x while maintaining the quality level of manual mapping. An example of a mapped power line is here.
  • Google Maps rolls out the open source and royalty-free “Plus Codes” (formerly Open Location Code). There’s speculation on the effect on proprietary suppliers in that area.
  • Spiegel online praises (translation) the map system of Here (BMW, Daimler and Audi) in an article and compares it with its major competitors.
  • Bloomberg writes about the companies competing to supply highly detailed maps for autonomous driving, including Google. This is heavily discussed on Hacker News.
  • The German magazine “Binnenschifffahrt” presents the digital shipping assistant (translation), an app using OSM maps for inland waterway logistics that will be tested from autumn this year.
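
A Plus Code is just a base-20 positional encoding of latitude and longitude, with each digit pair refining the cell by a factor of 20. The sketch below is a simplified, illustration-only encoder for the standard 10-digit code; the reference implementations live in Google’s open-location-code repository:

```python
# Base-20 digit set with easily confused letters removed.
ALPHABET = "23456789CFGHJMPQRVWX"

def encode(lat, lng):
    """Encode a WGS84 coordinate as a 10-digit Plus Code, e.g. '6FG22222+22'."""
    lat = min(max(lat + 90.0, 0.0), 180.0 - 1e-10)  # shift latitude to 0..180
    lng = (lng + 180.0) % 360.0                     # shift/wrap longitude to 0..360
    digits = []
    resolution = 20.0                               # first digit pair: 20x20 degree cells
    for _ in range(5):                              # five pairs = 10 digits
        d_lat, d_lng = int(lat // resolution), int(lng // resolution)
        digits += [ALPHABET[d_lat], ALPHABET[d_lng]]
        lat -= d_lat * resolution
        lng -= d_lng * resolution
        resolution /= 20.0                          # each pair refines by factor 20
    return "".join(digits[:8]) + "+" + "".join(digits[8:])

# The null-island coordinate (0, 0) encodes to "6FG22222+22".
```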

Upcoming Events

Where What When Country
Lannion 3e concours de contributions OpenStreetMap 2018-03-01-2018-03-23 france
Montreal Les Mercredis cartographie 2018-03-07 canada
Stuttgart Stuttgarter Stammtisch 2018-03-07 germany
Praha/Brno/Ostrava Kvartální pivo 2018-03-07 czech republic
London OpenStreetMap Q&A London 2018-03-07 united kingdom
Berlin 117. Berlin-Brandenburg Stammtisch 2018-03-08 germany
Munich Münchner Stammtisch 2018-03-08 germany
Brussels OSMBE Official Meeting & Meetup 2018-03-09 belgium
Buenos Aires Geobirras 2018-03-09 argentina
Kyoto 幕末京都マッピングパーティ#02:月の明かりと大獄と 2018-03-10 japan
Tokyo 東京!街歩き!マッピングパーティ:第17回特別編 旧東海道品川宿 2018-03-11 japan
Rennes Réunion mensuelle 2018-03-12 france
Lyon Rencontre libre mensuelle 2018-03-13 france
Nantes Réunion mensuelle 2018-03-13 france
Bochum Mappertreffen 2018-03-15 germany
Mumble Creek OpenStreetMap Foundation public board meeting 2018-03-15
Rapperswil 9. Micro Mapping Party Rapperswil 2018-03-16 switzerland
Yokkaichi 第1回 富洲原マッピングパーティ 2018-03-17 japan
Rennes Cartopartie bâtiments en 3D 2018-03-19 france
Cologne Bonn Airport FOSSGIS 2018 2018-03-21-2018-03-24 germany
Turin MERGE-it 2018 2018-03-23-2018-03-24 italy
Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
Disneyland Paris Marne/Chessy Railway Station FOSS4G-fr 2018 2018-05-15-2018-05-17 france
Bordeaux State of the Map France 2018 2018-06-01-2018-06-03 france
Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy
Dar es Salaam FOSS4G 2018 2018-08-29-2018-08-31 tanzania
Bengaluru State of the Map Asia 2018 (exact date to be confirmed) 2018-10-01-2018-10-31 india

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, Peda, Polyglot, Rogehm, SK53, SeleneYang, SomeoneElse, Spanholz, Spec80, YoViajo, derFred, jinalfoflia, sev_osm.

by weeklyteam at March 04, 2018 06:14 PM

Gerard Meijssen

Oostvaardersplassen - unintended consequences

When there are too many animals and there is too little for them to eat, they die. This happens regularly in winter in the Oostvaardersplassen, a nature reserve in the Netherlands.

The Oostvaardersplassen were created to provide a place for geese to feed. It takes deer, cattle and horses to prevent woodland from developing. Geese like their grass short, which is why these animals were released in the Oostvaardersplassen.

In the past there have been proposals to provide more room for the animals because in winter they die in huge numbers. Providing room is not possible through an unending cycle of adding new grounds to the Oostvaardersplassen, but it is possible to make a connection to the Veluwe and extend this to the nature alongside the Dutch rivers, connecting even further into Germany. This plan, which was actively developed, was shot down at the last moment by politicians.

"Animal lovers" bringing hay to feed some of the animals caused such upheaval among the animals that Staatsbosbeheer now prefers to feed them itself. When pressed in the past, it announced what it would do: shoot the animals and bring the numbers down by half, maybe even more. Nature will respond positively after such a catastrophe; it will invigorate the reserve and make the Oostvaardersplassen less of a meadow.

Natural predation, for example by a pack of wolves, would make a difference. Wolves are finding their way into the Netherlands; they only have to find their way to the Oostvaardersplassen and call it home.

by Gerard Meijssen (noreply@blogger.com) at March 04, 2018 07:44 AM

March 02, 2018

Wikimedia Foundation

Oscars predictions made easy with Wikipedia

Image via Pixabay, CC0.

It’s Oscars weekend in the United States, meaning that millions of fans will be able to see if their favorite film, actor, director, or even sound editor gets recognized for their singular work.

A common parlor game in the media is to predict who will win. (For one example, see FiveThirtyEight.)

We’re here today to join in with our own data. The Wikimedia Foundation—the nonprofit organization that supports Wikipedia, other Wikimedia free knowledge websites, and our mission of free knowledge for all—releases anonymized pageview data for every page on Wikipedia.

Even better, a team of volunteers built a public-facing tool that takes that data and allows you to do all sorts of fun things with the data.
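
Behind both the data release and the tool sits the same documented per-article pageviews REST route, so a comparison like ours can also be scripted. A Python sketch (the sample payload and its numbers are fabricated for illustration):

```python
from urllib.parse import quote

# Documented per-article route of the Wikimedia Pageviews REST API.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def pageview_url(article, start, end, project="en.wikipedia"):
    """Daily user pageviews for one article; start/end are YYYYMMDD strings."""
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/user/{title}/daily/{start}/{end}"

def total_views(payload):
    """Sum the 'views' field over the items of one API response."""
    return sum(item["views"] for item in payload["items"])

url = pageview_url("Dunkirk (2017 film)", "20170101", "20180301")

# Fabricated response fragment, just to show the shape of the JSON.
sample = {"items": [{"views": 1200}, {"views": 800}]}
```

Fetching `url` with any HTTP client and feeding the JSON to `total_views` reproduces one bar of the comparison.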

We decided to use it to compare this year’s best picture nominees.


We chose two time periods for comparison. The first ran from 1 January 2017, the first day on which a film could be released and be eligible for this year’s awards, to 1 March 2018, which you may also recognize as yesterday.

That time period sees Christopher Nolan’s Dunkirk cleaning house. The film, which focuses on the World War II evacuation of Allied forces from the beaches of Dunkirk, France, far outpaces the rest of the crowd. In fact, it had more than double the pageviews of second-place The Shape of Water, a fantasy-drama film about a custodian at a secret government lab falling for a captured humanoid-amphibian creature.

Note: Those weird jumps at the bottom left of the graph are for The Post. Why? Wikipedia’s article on that film was titled The Papers until August 2017, and this tool does not handle renames very well. We’ve added the approximately 200,000 views The Papers got to The Post’s total, which moved it above Lady Bird.

But what if we restricted the time period to only after the nominations were announced on 23 January 2018?

Answer: things change. Rather drastically, in fact. The Shape of Water, having been released in December 2017, retains most of its pageviews and takes first place. Dunkirk moves from first to last, partly because it came out months earlier.

What does it mean? We’re not quite sure. Intriguingly, The Shape of Water has the most total nominations (thirteen) of any film in this year’s Oscars; Dunkirk has the second most (eight).

  • The Shape of Water (film), 2,056,978
  • Three Billboards Outside Ebbing, Missouri, 1,639,093
  • Call Me by Your Name (film), 1,324,580
  • Get Out, 1,140,808
  • Phantom Thread, 1,028,114
  • Lady Bird (film), 993,391
  • The Post (film), 952,441
  • Darkest Hour (film), 720,019
  • Dunkirk (2017 film), 706,613

Given all this, who would you choose as the favorite? Let us know in the comments or by tagging us on Twitter.

And if you want to use Wikipedia pageviews to predict another category, you can use the pageview tool for all of the nominees.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

This post is also available on our Medium.

by Ed Erhart at March 02, 2018 10:41 PM

Wikimedia Foundation appoints Tony Sebro as Deputy General Counsel

Photo of Tony Sebro.

Photo by Myleen Hollero/Wikimedia Foundation, CC BY-SA 4.0.

The Wikimedia Foundation is excited to announce the appointment of Tony Sebro as Deputy General Counsel. Tony joins the Foundation after working since 2011 at the Software Freedom Conservancy—a public charity that acts as the home for more than 40 free and open source software projects. He brings more than 15 years of technological, open source, and business strategy experience to the organization.

The Wikimedia Foundation is the nonprofit organization that supports Wikipedia and other free knowledge projects. Together, Wikipedia and the Wikimedia projects are visited by more than a billion unique devices every month. The Wikimedia Foundation is driven by its mission to build a world in which every single person can freely share in the sum of all knowledge.

“We are incredibly fortunate and excited to be able to have such a talented attorney join us who brings a magic mixture of legal expertise in nonprofit governance and compliance, technology, and open source — areas that are critical to the Wikimedia Foundation’s work,” said Eileen Hershenov, General Counsel for the Wikimedia Foundation. “We are looking forward to Tony maintaining strong and supportive interactions with the executive staff, affiliate organizations, and volunteer communities.”

As Deputy General Counsel, Tony will work closely with the General Counsel to coordinate all of the organization’s legal activities. He will oversee the department’s day-to-day operations and advise and counsel on several areas of the law, including intellectual property, free speech, and privacy. Tony will also support the global expansion of the Foundation’s initiatives and legal strategy, including coordination with foreign counsel, like-minded organizations, and individuals to advance Wikimedia’s legal interest in support of its mission, vision, values and goals.

While at Software Freedom Conservancy, Tony negotiated contracts, managed legal risk, enforced the GPL on behalf of Linux copyright holders, and advised Conservancy’s member projects. He also spent the last two years as a coordinator for the Outreachy initiative, which offers paid remote internships with free and open source software and free culture organizations to people underrepresented in tech.

Earlier in his career, Tony handled both business development and legal issues relating to intellectual property—in private practice with the PCT Law Group and Kenyon & Kenyon LLP—and as a business development professional and licensing executive with IBM’s Technology and Intellectual Property Group. In 2017, Tony received the O’Reilly Open Source Award for his exceptional impact in free software.

“I’m excited about joining an organization that’s dedicated to collecting knowledge of all kinds and making it free to any individual who wishes to access it,” Tony said. “It has been a pleasure getting to know the organization’s exceptionally talented group of lawyers. I look forward to standing alongside them as we continue to defend and advocate for the world’s preeminent free knowledge projects.”

Tony received his J.D. and his M.B.A. from the University of Michigan and his B.S. from the Massachusetts Institute of Technology. He is a member of the New York bar, has registered to practice before the U.S. Patent and Trademark Office and has served on the boards of multiple non-profit organizations. In his spare time, Tony enjoys music production, rooting for DC-area sports teams, and spending time with his wife, Beth, and son.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that supports Wikipedia and the other Wikimedia free knowledge projects. The Wikimedia Foundation operates the technology behind Wikipedia and related sites, supports the global volunteer communities that make Wikipedia possible, and raises funds to support the Wikimedia movement. The Wikimedia Foundation does not control editorial content on Wikipedia or any other Wikimedia projects. That is up to a global movement of volunteers, more than 200,000 of whom edit Wikipedia in a given month. Based in San Francisco, California, the Wikimedia Foundation is a 501(c)(3) charity that is funded primarily through donations and grants.

Wikimedia Foundation press contact

by Wikimedia Foundation at March 02, 2018 06:39 PM

Wiki Education Foundation

Why we’re celebrating women filmmakers this Oscar season

Greta Gerwig has become the fifth woman to be nominated for Best Director in the Academy Awards’ 90-year history. And only one woman, Kathryn Bigelow, has ever won the award. In a Q&A after a screening of her film Lady Bird, Gerwig noted the importance of women storytellers and why their work should be recognized: “I think women tend to focus on stories that men don’t have the privilege of seeing. … It felt like there was this whole world left to be explored that had been largely undocumented.”

Women and people of color have historically been underrepresented in the film industry. Take the famous Bechdel test, a standard that a number of the top movies in the business fail. The test asks three things of a film:

  • Does it feature two female characters?
  • Do those two characters talk to each other?
  • Do they talk about something other than a man?

Considering how simple these criteria sound, you may be surprised by how many top blockbusters failed this year.

The Bechdel test doesn’t focus on representation beyond female characters, so women in the Hollywood film industry have imagined the next Bechdel tests to hold the industry accountable. These tests focus on representation behind the camera, as well as in front of it. They also question thematic elements of storytelling and if a movie upholds sexist or racist stereotyping.

Since the success of stories directed by women and featuring strong female characters this year, it looks as though the film industry may be moving toward a more equity-focused approach. The success of Black Panther, for example, is proving that better representation of people of color and strong female characters in film and production teams are not only highly desired, but also extremely profitable.

As mentioned in this FiveThirtyEight analysis of the importance of Black Panther character Shuri, when people see themselves represented in film, they are inspired to pursue industries and careers they might not have believed were open to them before. These blockbusters have a wide audience, and therefore a huge influence.

Wikipedia, similarly, has a big impact on the public. About 450 million users access the site every month. But like the film industry, stories and biographies of men are disproportionately represented. Only about 17% of the biographies on Wikipedia are about women. And more than 80% of volunteers who regularly contribute information to the site are men.

At Wiki Education, we’re committed to increasing Wikipedia’s coverage of all histories and engaging more voices in that knowledge production. 68% of students who learn how to edit Wikipedia in our program are women. And when they improve and create biography articles of women, they’re exposed to careers and fields that are traditionally thought of as male dominated. They also help to improve Wikipedia’s coverage of these women’s lives and accomplishments. Improving representation of careers is particularly relevant for women in STEM as shown in this study from 2013.

Representation, whether it be in movies or on Wikipedia, matters!

That’s why we’re proud to feature student work this week from Jennifer Nichols’ Fall 2016 course at the University of Arizona. As seen in the examples below, and throughout the course, students improved Wikipedia articles about women directors, many of whom are women of color. Documenting the accomplishments and impact women have had in the film industry is important work. Not only can young women read about potential careers and see what could be possible for them, but it helps to round out Wikipedia’s coverage of women throughout history.

Ava DuVernay is a film director, producer, and screenwriter from the United States. She’s best known for directing Selma, which received Academy Award nominations for Best Picture and Best Original Song. She’s the first African-American woman director to have had her film nominated for Best Picture, although she was not nominated for her directorial role. DuVernay has directed a variety of films, from big-budget blockbusters like A Wrinkle in Time to documentary work like her film 13th. She’s also produced and directed videos for advertising campaigns, as well as music videos for artists like Jay-Z.

Matilde Landeta was a filmmaker active during the Golden Age of Mexican cinema in the early 20th century. She was the first woman to work as a director during that period. Landeta first encountered the film industry visiting her brother, actor Eduardo Landeta, on set. She was first offered a job as a make-up artist, and eventually the job of script supervisor in 1932. She became an assistant director in 1945 and worked with some of the greats of the period. Neither production companies nor labor unions offered support for her own directorial career. So in 1948, Landeta sold her car and put her house up as collateral for a loan to start her own production company. Her first two films were boycotted by distribution companies and her career suffered a blow. She continued writing screenplays (over 100 shorts!), but did not return to directing until several decades later. A student significantly expanded the article about her to include all of these details!

Dawn Porter is a documentary filmmaker and founder of production company Trilogy Films. Porter went to school to become an attorney, then began her film career working as an executive producer for various documentary films. Her latest film, Trapped, examines TRAP Laws (“Targeted Regulation of Abortion Providers”) in the southern United States. These laws disproportionately limit access to abortion for poor women and women of color.

Students improve Wikipedia articles in all disciplines in our Classroom Program. To learn about how you and your students can make Wikipedia more representative of all people, visit our information page or reach out to contact@wikiedu.org with questions.

Header image: File:Ava DuVernay, David Oyelowo and Colman Domingo Febrauary 2015.jpg, usbotschaftberlin, public domain, via Wikimedia Commons.

by Cassidy Villeneuve at March 02, 2018 06:35 PM

Gerard Meijssen

#Wikidata - Vikram Patel and missing awards

Mr Patel received many awards; for one of them, the illustration shows what the award looks like. This award was missing from the Wikipedia article. Another award, the Chalmers Medal of the Royal Society of Tropical Medicine and Hygiene, was mentioned, but there is no information about other award winners.

Other awards for this society had their own item but no enrichment had taken place. They are now at least linked and award winners with a Wikipedia article are now also linked.

Linking people through awards, their employers, and their education provides an entry point to a subject like "Tropical Medicine and Hygiene". When this is a subject that matters to you: Mr Patel is the first one listed as having received the Chalmers Medal; the award started in 1923, so you can add all of its recipients to Wikidata or write Wikipedia articles about these notable people.

When you consider notability, would it not be an argument to use against the Wikipedia deletionists when there is plenty of information at Wikidata?

by Gerard Meijssen (noreply@blogger.com) at March 02, 2018 04:16 PM


Documenting is the key to thrive

It's no secret I am a fairly new member of the Wikimedia movement. As you can see here, I only started contributing on September 11, 2017, encouraged by Outreachy's application process. Since then, I have made a total of 2,254 edits, including translations, daily notes about my internship, and the production of small videos to illustrate my Translation quick guide. I was completely immersed in one single aspect of the movement—technical translations on MediaWiki.org—for three full months, and I am aware I still have a lot to learn.

But I believe there is no need to have a great deal of experience with Wikimedia projects to point out that recruiting volunteers and organizing work on Wikipedia and doing the same on MediaWiki.org require different strategies—actually, if there is something this internship proved, it's that sometimes you need novices who don't quite know what to expect to help you find the most obvious points of failure. It's like that QA joke:

It makes me think about a recurrent trend in the software development world: calling users "idiots" if they don't behave like we predict. I disagree with this kind of mindset completely [1]: if so many people are having difficulties, maybe the problem isn't them but the design[2][3]. We cannot build systems and processes hoping that people will magically understand what we had in mind while developing them. We need to make them crystal clear—with no distraction whatsoever—so we can retain most of the intended audience without running into perfectly avoidable problems.

"The Wikimedia movement is a volunteer movement: we edit and translate (and to a fairly large degree, do technical development and technical documentation) in our spare time, but the roads to becoming a translator are difficult to find, and it's difficult to get engaged in the movement as a translator rather than as a an editor who sooner or later ends up helping out with translation." — T158296

Here is an interesting fact: the Wikimedia movement isn't a group of people deeply involved with free and open-source software, but an organization deeply driven by open knowledge that happens to develop free and open-source software to support its operations (and, as a result, has a small, dedicated community with ties to FOSS). This is an important distinction that, I suspect, makes all the difference when thinking about recruiting efforts for technical translation.

I was discussing with my mentors how difficult it is for a person who is not familiar with the Wikimedia movement to contribute as a technical translator. In Bringing documentation to light[4], I point out that translation efforts are unorganized on MediaWiki.org, which somewhat reflects the overall work culture within the movement: take a task you would like to spend time on (after all, you are doing this in your spare time); try to complete it as well as you can; contact us in case you run into any problems. We have chapters, the Wikimedia Foundation, user groups, and thematic organizations, but my impression is that those things are there to support the movement, and generally every member not deeply associated with them works (or is encouraged to work) independently.

Although this strategy may work well with Wikipedia[5], it has some downsides: you end up with a set of unwritten rules and conventions that are not easy to catch and are usually passed from one person to another informally (as organizing them takes a lot of effort, and it's easy to forget about this kind of task when you wish to dedicate your time to things you consider more important). Let's face it: most people don't like to document their work. It's a boring, repetitive task, often not as exciting as coding or editing. And the way the Wikimedia movement works brings another challenge to the equation: even if they do document their work somehow, the lack of solid guidelines to help them complete this task (a style guide, an indication of where to make it available, how to publicize it) could result in not-so-useful documentation. The dimension of the movement itself makes it difficult to find this kind of information. As a result, information is extremely fragmented.

Therefore, it's not surprising that some of the most engaged editors will end up as occasional translators sooner or later. They belong to a group that needs to be fundamentally familiar with the way the movement works, the software that supports the operations of every single project. They have this willingness to help the movement because they know how important it is to make content available in multiple languages. But here is something that often happens with people like me and them, people who aren't "actual translators" but have the required fluency for this kind of task: we are inexperienced (especially if this is our first contribution in the free and open-source software world and we are in charge of this role alone!) and frequently we don't put much thought into long-term consequences. We just want to get things done[6].

That is, of course, a recipe for disaster. Maybe someday you will become busier and won't have time to contribute to that specific project. Consequently, you probably won't have time to explain to someone else all the work you've done, and that will require them to make some guesses. It won't take long until inconsistencies begin to appear, making users extremely confused—particularly if said project follows the trend of only getting bigger and bigger every day as we speak.

With that, we finally get to the reasons why I was so adamant and persistent about the idea of translation teams. After all, this is what most of the FOSS projects do, and being FOSS is one of the most fundamental traits of MediaWiki.

So I tried to create a Brazilian Portuguese team with new contributors only. I made a call on Twitter, being absolutely direct: I need some people to help me with this over the course of the next two weeks. I'll issue a certificate of participation for your help. I had a better response than with my attempts to publicize the role of technical translators: with this approach, three people contacted me, two created MediaWiki.org accounts, and one followed through—and they said they did this because they had always wanted to be a volunteer translator, but never heard back from the projects they had tried to join.

Another surprise was my fiancé offering his help. He is an Information Systems undergrad and always wanted to contribute to free and open-source projects, but never had. He usually practices his English with Duolingo, but was growing tired of it. He saw this as an opportunity to combine business with pleasure.

And it worked. Both of my teammates have different schedules and sometimes they can't contribute as much as they want (and I was busy doing other things during this period, so it happened with me as well), but together we increased the translation rate of selected pages from 24% to 65%. I am really happy with this!

So as you read this, I have already interviewed both of them about their experiences and am writing my final report. As my internship ends on March 5 and this is my second-to-last bi-weekly report, my next one will probably address my thoughts, conclusions and recommendations to the Wikimedia community. But this doesn't mean it's the end of my blog—I would say it's only the beginning. I certainly enjoyed writing this much again, and I still have—and will have—a lot of things to share.

  1. Just to make clear: I am not saying the Wikimedia movement does that. It's just a phrase I have heard a lot in programming classes and social platforms. ↩︎

  2. Another reason this bothers me a lot is its ableist roots. ↩︎

  3. There is an article called The myth of the stupid user that resonates with a lot of what I think about this particular subject. ↩︎

  4. This piece was published on the Wikimedia blog last week, by the way! Yay for me! ↩︎

  5. With some reservations, I would say—it's not unheard of for people to complain about how difficult or confusing it is to begin contributing when you don't know any Wikimedian/Wikipedian, and how unwelcome they feel. ↩︎

  6. This is a mistake I regret making when translating Mastodon, especially because I ended up involved with the translation of other related projects (mobile apps, for instance). When this internship ends, I plan on talking to other Brazilian Portuguese translators to write down our conventions and style guide. ↩︎

by Anna e só at March 02, 2018 02:10 PM

Documentar é a chave para prosperar

Não é segredo algum que sou um membro relativamente novo do movimento Wikimedia. Como você pode observar aqui, só comecei a contribuir a partir de 11 de setembro de 2017, encorajada pelo processo de candidatura do Outreachy. Desde então, fiz um total de 2,254 edições incluindo traduções, notas diárias sobre o meu estágio e uma produção de pequenos vídeos para ilustrar o meu Guia rápido de tradução. Eu fiquei completamente imersa em apenas um aspecto do movimento — traduções técnicas no MediaWiki.org — por três meses, e tenho consciência de que ainda tenho muito a aprender.

Mas eu acredito que não há a necessidade de ter muito tempo de experiência com projetos Wikimedia para apontar que recrutar voluntários e organizar trabalho na Wikipédia e fazer isso no MediaWiki.org requer estratégias diferentes — na verdade, se há algo que este estágio provou ser certo é que às vezes você precisa de novatos que não sabem muito bem o que esperar para ajudá-lo a encontrar os mais óbvios pontos de falha. É como aquela piada de controle de qualidade:

Isso faz com que eu pense a respeito de uma tendência recorrente no mundo do desenvolvimento de software: chamar os usuários de "idiotas" se eles não se comportam como previmos. Discordo completamente com esse tipo de linha de pensamento [1] — se tantas pessoas estão tendo dificuldades, talvez o problema não seja elas, mas o design[2][3]. Não podemos construir sistemas e processos esperando que pessoas magicamente entendam o que tínhamos em mente ao desenvolvê-los. Precisamos deixá-los bem claros — sem quaisquer distrações — para que possamos reter o máximo de nossa audiência pretendida sem esbarrar em problemas perfeitamente evitáveis.

"O movimento Wikimedia é um movimento voluntário: nós editamos e traduzimos (e, até certo grau, fazemos desenvolvimento e documentação técnica) em nosso tempo livre, mas os caminhos para se tornar um tradutor são difíceis de encontrar, e é difícil se envolver no movimento como tradutor ao invés de um editor que mais cedo ou mais tarde acaba ajudando com a tradução." — T158296

Eis um fato interessante: o movimento Wikimedia não é um grupo de pessoas profundamente envolvidas com software livre, mas uma organização profundamente guiada por conhecimento aberto que por coincidência desenvolve software livre para dar suporte às suas operações (e tem uma pequena e dedicada comunidade com laços no software livre como resultado). Essa é uma distinção importante quando falamos sobre recrutar esforços para tradução técnica.

I was discussing with my mentors how difficult it is for someone unfamiliar with the Wikimedia movement to contribute as a technical translator. In Colocando a documentação em foco ("Putting documentation in focus")[4], I point out that translation efforts on MediaWiki.org are disorganized, which reflects the working culture within the movement: pick a task you would like to spend time on (after all, you are doing this in your free time); try to complete it as best you can; contact us if you run into any problems. We have chapters, the Wikimedia Foundation, user groups, and thematic organizations, but my impression is that these exist to support the movement, and generally every member who is not deeply associated with them works (or is encouraged to work) independently.

Although this strategy may work well for Wikipedia[5], it has some downsides: you end up with a set of unwritten rules and conventions that are not easy to grasp and are usually passed from one person to another informally (since organizing them takes a lot of effort, and it is easy to forget about that kind of task when you would rather devote your time to things you consider more important). Let's face it: most people don't like documenting the work they do. It is a boring, repetitive task, often not as exciting as coding or editing. And the way the Wikimedia movement works adds another challenge to the equation: even when documentation exists in some form, the absence of solid guidance to help people complete the task (style guides, where to make it available, how to publicize it) can result in documentation that is not very useful. The sheer size of the movement alone makes this kind of information hard to find. As a result, information is extremely fragmented.

So it is not surprising that some of the most engaged editors sooner or later end up becoming occasional translators. They belong to a group that fundamentally needs to be familiar with how the movement works and with the software that supports each project's operations. They are willing to help the movement because they know how important it is to make content available in multiple languages. But here is something that often happens to people like me and them, people who are not "real translators" but have the fluency required for this kind of task: we are inexperienced (especially if this is our first contribution in the free software world and we are working alone!) and we often don't think much about long-term consequences. We just want to get that task done[6].

This is, of course, a recipe for disaster. Maybe one day you get busier and no longer have time to contribute to that particular project. Consequently, you probably won't have time to explain all the work you did to someone else, which will force them to make assumptions. It won't take long before inconsistencies start to appear, leaving users extremely confused, especially if said project keeps growing larger and larger.

With that, we finally arrive at the reasons why I have been so insistent and inflexible about the idea of translation teams. After all, this is what most free software projects do, and being free software is one of MediaWiki's most fundamental characteristics.

So I tried to create a Brazilian Portuguese team made up only of new contributors. I made a call on Twitter and was very direct: I need people to help me with this over the next two weeks, and I will issue a certificate of participation for your help. I got a better response than from my earlier attempts to publicize the technical translator role: with this strategy, three people contacted me, two created accounts on MediaWiki.org, and one persisted, saying they did so because they had always wanted to be a volunteer translator but had never heard back from the other projects they had tried to join.

Another surprise was my fiancé offering to help. He is an Information Systems undergraduate who had always wanted to contribute to free software projects but never had. He usually practices English with Duolingo, but was getting tired of it, and saw this as a chance to mix business with pleasure.

And it worked. My two teammates have different routines and sometimes cannot contribute as much as they would like (and I was busy with other things during this period, so it happened to me too), but together we raised the translation rate of the selected pages from 24% to 65%. I am quite happy with that!

So by the time you read this, I will have already interviewed both of them about their experiences and will be writing my final report. Since my internship ends on March 5 and this is my penultimate report, the next one will probably cover my opinions, conclusions, and recommendations for the Wikimedia community. But that doesn't mean the end of my blog; I would say it is just the beginning. I have certainly loved writing this much again, and I still have (and will have) many things to share.

  1. To be clear: I am not saying that the Wikimedia movement does this. It is just a phrase I have heard in several programming classes and on social media. ↩︎

  2. Another reason this bothers me: its ableist roots. ↩︎

  3. There is an article called The myth of the stupid user that expresses much of what I think about this subject. ↩︎

  4. This text was published on the Wikimedia blog last week, by the way! Yay! ↩︎

  5. With some reservations, I would say: it is not hard to find people complaining about how confusing or difficult it is to start contributing when you don't know a Wikipedian or Wikimedian, or about how unwelcome they feel. ↩︎

  6. This is a mistake I regret making when translating Mastodon, especially because I ended up involved in translating other related projects (mobile apps, for example). When the internship ends, I plan to talk with other translators and document our conventions and style guide. ↩︎

by Anna e só at March 02, 2018 11:17 AM