An update on our monthly reports

21:09, Tuesday, 29 September 2020 UTC

Since Wiki Education started in 2014, we’ve been making the ED’s monthly reports to the board public. We’ve invested a lot of staff time into creating and distributing these reports in three formats: a designed PDF uploaded to Wikimedia Commons, a wikitext version published on the Wikimedia Foundation’s Meta wiki, and a full text version on this blog. Archives of all of these are listed on our page on Meta, and we also announced their availability on the English Wikipedia’s Education Noticeboard and our social media.

In June 2020, we released a survey accompanying our report and asked people who read it to fill it out. After more than a month of being up, the survey has had zero responses outside our own board, leading us to conclude that we can reduce the burden that goes into distributing the reports across different channels.

Organizationally, Wiki Education values transparency. That’s why we’ll keep making our monthly reports to the board accessible to the public. However, from now on, we’ll distribute them in only one format: a PDF uploaded to Wikimedia Commons, linked from Wiki Education’s page on the Wikimedia Foundation’s Meta wiki. If you’re interested and you have an account on the Meta wiki, we encourage you to add the Wiki Education page on Meta to your watchlist so you’ll be notified when we add new reports.

By Srishti Sethi, Senior Developer Advocate

Overview

Small Wiki Toolkits is an initiative that focuses on building technical capacity in smaller language wikis and a global community of practice for discussing and resolving the challenges of these wikis. 

The Wikimedia projects are powered by the MediaWiki software and extensions. Community-built tools, bots, gadgets and templates enhance the software experience and support a variety of workflows. A visible example in the user interface is an infobox, which is generated through templates. Tools, bots, templates and gadgets play an essential role for the curation, maintenance and growth of content. Technical contributors across wikis perform a wide variety of tasks, such as site configuration requests (e.g. adding a custom namespace, changing a logo), importing gadgets and templates, software development, documentation, reporting bugs and translating technical news.

About a third of all edits to the Wikimedia projects come from tools and bots that help wikis with site maintenance, content moderation, editing, and more. Most of them use the Wikimedia Cloud Services (WMCS) infrastructure. In the screenshot below, you can see that 7.5% of the Punjabi wiki’s edits in March 2020 came from tools and bots hosted via WMCS. To develop and use impactful tools and workflows, communities need technical contributors. Hence the Small Wiki Toolkits initiative seeks to support smaller wikis that cannot address unmet technical needs on their own, by developing toolkits and recommendations on technical topics and by organizing workshops to equip people with the skills to make their work on wikis easier and more effective.

Generated via https://wmcs-edits.wmflabs.org/#wmcs-edit, Srishti Sethi, CC BY-SA 4.0

This initiative was kicked off at Wikimania last year. Since its inception, over 200 people have participated in 16 informational sessions, technical workshops, and group discussions at six Wikimedia events (local, regional, and international), online and offline. Through these events, the initiative engaged people from many smaller wikis all over the world.

Activities and highlights

Let’s take a look at the different formats that the initiative experimented with in its first year:

Brainstorming sessions to understand the technical challenges smaller wikis face.

In a typical brainstorming session, participants would discuss the situation in their wiki community, problems they would like to solve as technical contributors, and gaps in their workflows (tools, processes, skills, etc.). Participants shared ideas for how to enable more people to do technical work and get better at working together across wikis. Five such group discussions helped gather challenges and ideas for possible solutions from around 30 smaller language wikis, including Breton, Macedonian, and Basque in the European region; Indic and Arabic wikis; and indigenous-language wikis from the North American region.

Participants from one of the Indic workshops, Satdeep Gill, CC BY-SA 4.0

In a nutshell, the challenges reported were:

  • Lack of awareness and knowledge to use tools and technologies (e.g., anti-vandalism tools, bots, Phabricator for bug reporting, etc.)
  • Lack of specific technical skills (e.g., connecting infoboxes to Wikidata, using Lua modules, using templates, etc.)
  • Difficulty keeping up with new changes that break the wikis
  • Cumbersome wiki workflows (e.g., importing templates or gadgets from one wiki to another)
  • Lack of technical resources in different languages
  • Localization/translation issues in many areas

People from a few wikis that are just getting started and working to grow content also participated in these conversations. Addressing the needs of such wikis currently feels out of scope for this initiative; Small Wiki Toolkits is focusing on smaller wikis that have reached a tipping point in their content development and are now looking for technical help to expand further. In these sessions, participants from different wikis connected to collaborate on longstanding issues. Some sessions also surfaced success stories; for example, immediately after one such session, a community member felt encouraged to apply for global interface rights and received them.

Technical workshops and informational sessions

The Small Wiki Toolkits initiative’s members developed and curated a few teaching and learning resources on a wide variety of technical topics. These were documented on the Small Wiki Toolkits landing page on Meta-Wiki. Guidelines were also developed for individuals interested in creating toolkits, keeping in mind that potential trainers and learners might use them; for example, slides should include extensive notes.

Ten workshops were conducted on eight technical topics: developing user scripts and gadgets, working with Wikimedia APIs, writing templates in Lua, generating Wikidata infoboxes, leveraging Wikimedia Cloud Services, translating pages via the Translate extension, writing Wikidata queries, and using Phabricator. Seven of these workshops were part of Wikimania 2019; the remaining three formed a workshop series designed specifically for the Indic community. Some topics were selected based on the needs of community members, gathered through a survey.

Wikimania 2019 hackathon opening ceremony, Mike Peel, CC BY-SA 4.0

One of the participants at Wikimania live-tweeted a workshop.

These workshops showed promising results. Attendees were engaged and gained mastery of the topics. As per the Indic workshop series report, participants’ skills and awareness in technical areas increased by 30% on average.

Two participants provided feedback: 

The session was useful and moreover it was in Hinglish (A mix of Hindi and English), and I feel it was more adaptive for me to understand. The workshop started with the basics, which really helped me. Workshop was so interesting that some of the attendees were eager to learn more even after workshop time limits. I think it should not [be limited] to one session.

Overall experience of [the] workshop was excellent, and I would say that such [a] kind of workshop should be organised frequently.

Starter kit for smaller language wikis

One of the solutions proposed in response to the technical challenges during a brainstorming session was to develop a set of resources, tools, and recommendations in technical areas relevant to smaller wikis that are just getting started. With continuous feedback from experts from many different wikis, the Small Wiki Toolkits initiative developed a starter kit for smaller language wikis. After receiving final feedback from participants of the Celtic Knot Conference 2020, it was released and announced on the mailing lists. Overall, the starter kit received a good response; since it was published, it has received 1,234 pageviews, a daily average of 41.

Looking ahead

The Small Wiki Toolkit initiative identified a vast technical skill gap in small wiki communities. These communities need to rely on outside experts to address technical issues. A post-workshop series survey reflected the participants’ continued interest in attending workshops.

For the second year, the initiative is planning monthly technical skill-sharing workshops for the smaller wikis, beginning with the South Asian region and its eight countries. This project will be a collaboration between Indic-TechCom and the Wikimedia Foundation’s Developer Advocacy team. These workshops will allow participants to exchange challenges in the brainstorming session format discussed above, learn new skills, and connect with members from other communities. Exploring use cases and needs for automation, and a program around developing and running bots, are further topics under consideration.

Another experimental idea is to allow community members who are interested in attending a technical workshop to take a prerequisite course via an external course provider and obtain a certificate for it. For example, to write user scripts and gadgets, one must be familiar with JavaScript and jQuery fundamentals. Without prior knowledge of these topics, the learning curve for a novice programmer jumping straight into user scripts could be very steep; this idea would help reduce it.

If you are interested in joining and supporting the Small Wiki Toolkits initiative, please add your name to the members list. Ideas and feedback are welcome on the talk page of the initiative.

About this post

Featured image credit: Participants at WikiArabia 2019 in Marrakesh, Noureddine Akabli, CC BY-SA 4.0

Diving into Wikipedia’s ocean of errors

15:57, Monday, 28 September 2020 UTC

How we went from an error counter to fixing our code

By Jon Robson, Staff Software Engineer, The Wikimedia Foundation

A while back I wrote about debugging an issue on mobile armed only with an error counter and referrer information. I’m pleased to report that we are now tracking and fixing client-side JavaScript errors for Wikimedia wikis, which is providing more error-free experiences to our end users. In this blog post I want to document what led us to prioritize this work, how we went about implementing a solution, and what we learned from the experience. If you are planning to log client errors in your own project, you may find the summary section at the end of the article useful.

Preparing for the voyage

You may have recently heard about how Wikipedia has made a recommendation to adopt Vue.js, or how we are redesigning the desktop site. As you can imagine, migrating an entire front-end stack comes with certain challenges. We were diving into a large technical project with no knowledge of the existing landscape in which we were working. In fact, much of our code hadn’t been touched in some time. In addition, our mobile error counting had jumped from 10k a minute to 40k a minute with no obvious cause¹, signalling that we had introduced a bug somewhere. It was clear to the developers tasked with this work that sophisticated error handling was a requirement for this project.

We had little idea what we would uncover through our explorations of client-side errors. Image credit: Thistlegorm deck and train parts.jpg, Woodym555, CC BY-SA 3.0

Luckily this dovetailed nicely with the scaling up of our Event Platform client, and with a global pandemic during which our focus has shifted to site stability. Existing solutions such as Raven and Sentry were considered, but given the size of the client libraries relative to the rest of the software, and out of a desire to minimize additional tools for developers, we decided to roll our own client code and send unmodified errors and stacktraces to our existing Kafka-Logstash logging infrastructure².

Preparing for launch was just the start

Before I even got involved in the work, there was an amazing collaboration between the security, performance, operations, analytics, and product infrastructure teams to get everything in place. This effort is tracked on our public task management system for those who are interested in the details.

Thanks to the collaboration and planning of these teams, the code for catching errors and sending them to the server ended up being relatively small. This effort culminated in a launch on the Hawaiian Wikipedia, one of our less-visited wikis. The challenge now became how to do this at scale.

Diving into the unknown

We maintain multiple projects, and our biggest, Wikipedia, alone exists in 293 languages. It’s an understatement to say that our ecosystem is complex. You’ve likely read about how bots are widely used in Wikipedia, but many of our users also rely on bespoke tooling to manage a variety of editing activities, provided by browser extensions and user gadgets (special wiki pages that allow users to write and run their own JavaScript).³

To roll out further, we had to ensure that bugs were getting fixed, that noise from user gadgets and scripts did not drown out the signal from more important errors, and, most importantly, that traffic to the endpoint stayed low enough that it wouldn’t bring down our services.

We started small, on the wiki for the software that runs our sites: mediawiki.org. From a product perspective, even though the audience was small, it gave us a hint of what to expect as we rolled out further. The errors logged were a good way to capture very prominent problems, e.g. errors occurring on every page view. Many of these errors had very actionable stack traces, meaning we could file bugs and fix them, which we did. Others, however, were more cryptic. “Script error.” was one of the most prominent and the most unclear to fix.

We were getting “Script error.” because code was being loaded from across our websites: many of our users were loading gadgets from other wikis. So while the error could be explained, it was not very actionable without a stack trace. However, it did provide the file which caused the error, which could be linked to the associated wiki page. As in my bug-hunting adventures in my previous blog post, this could be used to manually reproduce those errors in exploratory testing.
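For readers unfamiliar with this browser behaviour: when a script loaded from another origin throws, the browser redacts the details passed to window.onerror unless the script is served with CORS headers and loaded with the crossorigin attribute. A minimal sketch of the behaviour (illustrative only, not our actual instrumentation):

// A cross-origin script loaded without CORS produces a redacted report:
// message is "Script error.", and source, line, and column are empty.
window.onerror = function ( message, source, line, column, error ) {
    console.log( message, source, line, column, error );
};

// Serving the script with an Access-Control-Allow-Origin header and loading
// it via <script src="https://other.example/gadget.js" crossorigin="anonymous">
// restores the full message and stack trace.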

However, the errors we were seeing on mediawiki.org represented a small drop in a large ocean of errors for our projects. Despite the rate of errors for these smaller wikis being sufficiently low, we knew that there were more errors out there waiting to be explored, the kind that only come from really stretching the use of our software; for example, editing a wiki page with complicated markup, or copying and pasting a table with thousands of rows! We needed to roll out to larger wikis to help prioritize bug fixing in our product workflow, by identifying where in our complex code ecosystem our users were commonly hitting problems. But where to begin?

We looked at traffic first to guide us. We wanted wikis bigger than our current ones, but not too big. Thanks to the error counter we had introduced on our mobile site, we were able to get guesstimates of the actual error rates on mobile for candidate wikis by looking at referrer traffic. Combining this information with the actual error rates being ingested for the Hawaiian Wikipedia and mediawiki.org, and with the traffic for all our wikis, we were able to predict the volume of errors for our candidate wikis on desktop too. We chose a wiki which would tell us more about the errors our users were actually facing without overwhelming our event platform, and decided to enable error logging on the Catalan-language Wikipedia.

A JavaScript error occurs for an end-user deep down inside the developer console, hidden away from the user’s view and previously not seen by our engineers.
Image credit: Screenshot own work, Jon Robson, CC BY-SA 4.0

Lesson one: Not all user errors are equal

Running the client-side error tracking on Catalan Wikipedia was really helpful for prioritizing the rollout. Due to the larger volume of traffic we went from around 2,000 errors a day to 40,000, but this taught us a lot. Certain errors we had seen in the same software on our smaller websites were now occurring more frequently, which helped us prioritize them.

What became apparent, however, was that certain errors were being repeated by a single IP, something we hadn’t been able to notice with such low traffic. At one point, 38,830 of the 48,175 errors logged on Catalan Wikipedia in a 12-hour period came from a single IP address, belonging to a user running some badly maintained gadgets they had included almost a decade ago.

In another interesting development, from the stack traces we identified a bug related to our map feature which only occurred on slow connections. Unfortunately, when the bug occurred, it was executed in an interval timer, so it appeared in high volumes. A patch was provided and that bug was squashed. It turned out this accounted for 50% of our errors on mobile, and our error-counting graph adjusted accordingly.

As for other bugs, many of them came from faulty and forgotten scripts that users were running. After reaching out to those users, we managed to clean those up.

Over the course of 7 days we went from 30,000 errors per 3 hours to a manageable 735.
Image credit: Screenshot own work, Jon Robson, CC BY-SA 4.0

Once these two incidents were fixed, the error rate came down to a very manageable and respectable 735 errors every 3 hours. A lesson was learned, and we began to limit the number of errors we logged from the same user session to five. With that obstacle out of the way, we felt confident enough to roll out further.
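The cap itself is simple. A minimal sketch, with sendToIntakeService standing in for whatever transport your instrumentation uses (the names here are illustrative, not our production code):

// Illustrative only: stop reporting after five errors from one session.
var MAX_ERRORS_PER_SESSION = 5;
var errorsLogged = 0;

function maybeLogError( errorEvent ) {
    if ( errorsLogged >= MAX_ERRORS_PER_SESSION ) {
        return; // one noisy power user can no longer flood the pipeline
    }
    errorsLogged++;
    sendToIntakeService( errorEvent ); // hypothetical transport function
}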

Lesson Two: Not all page JavaScript is equal

With errors limited, the list of errors was much more manageable; however, it was clear that many of the stack traces were not useful. In certain cases, the code was clearly coming from places outside of our control. Interestingly, we also saw a lot of bugs from non-Wikimedia domains (Google Translate, for example) and from browser extensions like GreaseMonkey that allow the running of locally written JavaScript. Some of these errors did not have any information about stack traces or the source of the error. This noise unfortunately made it difficult to identify real errors, so we decided to exclude any errors without information on their origin. If a bug falls and nobody knows what file it came from, is it really a bug?
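The filter can be as simple as refusing to log anything without an origin; a sketch with illustrative field names:

// Illustrative only: skip errors that carry neither a source file nor a stack.
function shouldLog( errorEvent ) {
    return Boolean( errorEvent.file_url || errorEvent.stack_trace );
}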

We decided, however, to defer the filtering of browser extensions and non-Wikimedia sites, as this could be taken care of with Logstash filtering if needed, and the information was useful to know. Still, a bug has been filed about reconsidering that in the future, potentially sending such errors to a different channel.

Lesson Three: Some errors are the same

The end users of mediawiki.org and Catalan Wikipedia were not fully representative of all our users, so a logical next step was to enable error tracking on one of our larger language projects written in a right-to-left script. Hebrew Wikipedia was an obvious choice, as the community there had previously volunteered for earlier deployments to help us catch errors before they reached production, meaning that if we saw client errors there, we would have time to block the bug before it could impact other projects.

When we enabled error tracking on Hebrew Wikipedia, the bump in errors was not as significant as it might have been. We didn’t learn much from this deployment, other than that we were getting closer to the finish line. Sometimes a deployment just gives you the validation you need to continue.

After this, we expanded coverage further to Wikimedia Commons, our site for uploading images. As we rolled out further, we benefited more from our scale: we noticed problems in our post-processing normalization. Some errors are prefixed with “Uncaught” and some are not. For example, the error TypeError: $submitButton.submitOnEnter is not a function is the same as Uncaught TypeError: $submitButton.submitOnEnter is not a function. This meant that similar errors were not being grouped.
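Normalizing the message before grouping fixes this. Roughly (a sketch, not our exact post-processing):

// Strip the optional "Uncaught " prefix so that the same error reported by
// different browsers groups under a single message.
function normalizeMessage( message ) {
    return message.replace( /^Uncaught /, '' );
}

normalizeMessage( 'Uncaught TypeError: $submitButton.submitOnEnter is not a function' );
// -> 'TypeError: $submitButton.submitOnEnter is not a function'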

After a month of rollout, a significant milestone was reached when our infrastructure surfaced our first deployment blocker not reported by an end user, and it was swiftly fixed. Hurray!

Future voyages

The majority of the issues that we are now surfacing relate to user gadgets. User gadgets are a historic artifact of the wiki ecosystem that predates browser extensions; they have allowed many editors to self-serve, and many of our editors depend on them. Some of these user-written gadgets run by default for all users, including anonymous users, so as we roll out further we have to be careful to address the errors ourselves or communicate with our editors to get them fixed.

We’ll continue this approach with further wikis, but this will take time. At the time of writing we are seeing about 60,000 errors a day from 10 of the over 2000 sites we maintain.

While hopefully obvious, one important point is that now that we have the ability to identify client-side errors, we must use this information to fix bugs and to block deployments where necessary. Just adding the tool is not enough. This requires a lot of socializing of the change, which is partly why I am writing this blog post. I have also started attending a weekly triage meeting and including these errors in our existing triaging processes. My hope is that all teams working on software for our projects will have a workflow for triaging and addressing such bugs without my assistance.

Before you jump in on your own project

I’ve linked to the associated tasks for these takeaways so you can see further discussion and implementation details if you are interested.

  • If you are not sure how many errors your site will generate, make an educated guess by counting errors first. You could also use page views to restrict error ingestion to only certain pages.
  • Limit the number of errors you track from a single IP. One faulty script from one power user could mislead you and overwhelm your data collection, particularly if it runs regularly, e.g. via setInterval or a scroll event handler!
  • Make sure you only track errors from domains you care about, and consider checking the stack trace, while logging or post-processing, for hints that the script originated from Firefox or Chrome extensions or from user scripts.
  • Remember when grouping and filtering errors that some errors may be prefixed with “Uncaught”. Normalize your messages before grouping and displaying them in a developer tool.
  • Some user scripts will have no associated file URI or stack trace. In my opinion, these are not worth caring about so avoid logging them.
  • If you load code from external domains these will be anonymized with “Script error”. Try to replicate those issues through exploratory testing or if possible run local mirrors. It may even make sense to filter out those errors.
  • Make sure your engineers have workflows for triaging new errors during and after deployments and ensure fixing and slowing down is part of your workflow.

How volunteers can help

  • If you are a Wikimedia staff member or volunteer and have sufficient permissions you can use our JavaScript dashboard to help identify and fix bugs caused by gadgets or your own code.
  • If you are an editor, you can help make sure errors get the attention they need by doing your part. Please review your user scripts that live on the wikis you frequent. Please blank them if you are not using them, use gadgets if you can, and ensure they are not throwing errors on every page you view.

Footnotes

¹ For those interested: we eventually traced the increase to a bug in some incorrectly configured code using the Mapbox library, which triggered on slow connections, leading to errors being thrown inside a setInterval function in high quantities. The full bug report can be read at https://phabricator.wikimedia.org/T257872
² The meeting notes can be found on the Wikimedia etherpad for those interested: https://etherpad.wikimedia.org/p/clients-error-logging
³ An example of a gadget which allows users to place comments onto images shown on file description pages can be viewed at https://commons.wikimedia.org/wiki/Help:Gadget-ImageAnnotator. The code lives at https://commons.wikimedia.org/wiki/MediaWiki:Gadget-ImageAnnotator.js

About this post

This post also appeared on https://jdlrobson.com/posts/

Featured image credit: Christoph Wolff, son of Dr. Michael Wolff, scuba diving at Crystal Bay, Nusa Penida, Indonesia at the age of 12, HenryHiggens, CC BY-SA 3.0

Tech News issue #40, 2020 (September 28, 2020)

00:00, Monday, 28 September 2020 UTC
2020, week 40 (Monday 28 September 2020)

weeklyOSM 531

10:53, Sunday, 27 September 2020 UTC

15/09/2020-21/09/2020

lead picture

‘How did you contribute’ celebrated its 10th birthday [1] | © Pascal Neis | Map data © OpenStreetMap contributors

Mapping

  • Alex started a discussion on what to consider when mapping a street’s width and the different variants currently used. In the discussion, Tobias Zwick, from StreetComplete, referred to a helpful app for measuring with your phone, using augmented reality.
  • After the approval of amenity=funeral_hall, user Vollis is proposing the subtag funeral_hall=*, for a building or room used for funeral ceremonies ancillary to a funeral director’s shop or a crematorium.
  • Andrew Wiseman, from Apple, has updated (es) > en the challenges on MapRoulette for Bolivia using new data. The challenges can be found here.
  • Michael Montani, from the UN Department of Operational Support, invited (fr) > en OSM contributors, on various African local discussion lists, to help support UN missions. Mappers can do this by participating in the Tasking Manager jobs they have created to map highways, waterways, land cover, land use, places, and any other relevant points of interest.
  • User stragu warned users of OsmAnd Maps for iOS about a bug that shifts POIs without notice, due to incorrect rounding of coordinates. The issue affects versions 3.10 to 3.14, but was promptly addressed and is solved in version 3.80.

Community

  • After a long break, the Belgian community continues its choice of the ‘Mapper of the Month’. This month it is Jacques Fondaire, aka jfonda, from Belgium.
  • [1] The widely-used tool ‘How did you contribute to OpenStreetMap’ celebrated its 10th anniversary – we congratulate and say thank you, Pascal Neis, for this and also for all the other nice stats you have added over the years!
  • The 35th edition of the Geomob Podcast features an interview with Sarah Hoffmann who, as a software developer, is responsible for the maintenance of Nominatim, the most important software for geocoding used by OpenStreetMap. Sarah explains the technical aspects and challenges of managing a technically complex and publicly visible open source project.
  • Maning Sambale notes that diversity and inclusion are fundamental for OSM’s spread and growth, but seldom make it to centre stage. Plenary sessions such as State of the Country at SotM Asia 2016 and GeoLadies-PH plenary at Pista ng Mapa 2019 aim at increasing the audience listening to community stories. The second Pista ng Mapa (November 2020, online) hopes to feature more stories, maps and posters from the community.
  • OSM contributor hocu wrote about a changeset discussion where he (once more) realised that, thanks to OSM contributor Alikam, the OSM data is more up to date than the official sources, IETT (the governing body of Istanbul’s transport network) in this case.

OpenStreetMap Foundation

  • Christoph Hormann took a look at OSMF’s strategic reorientation and its changed way of working and communicating over the past year, and expresses concern and frustration about what he views as insufficient attention to the long-term consequences and risks of this.
  • Rory McCann (᚛ᚏᚒᚐᚔᚏᚔᚋ᚜ 🏳🏳️‍🌈) wrote in his blog about his activities in August 2020 and reports, among other things, about his activities on the OSMF Board of Directors and the Communication Working Group. mmd complained in a comment that apparently no formal candidate selection process (for the new paid positions on the OSMF) has taken place. Instead, he suspects that this was ‘all done behind the scenes’. Rory confirmed, in his response, that OSMF ‘had a preferred candidate in mind, rather than an open interview process’.

Events

  • Last weekend members from the Italian OSM community gathered (it) > en in Limone, Piemonte for a weekend dedicated to OSM. On Saturday, they gave talks and shared their OSM experiences and useful tools for mapping. Sunday was dedicated to mapping excursions around the area using OSM-based phone apps.
  • On Monday 21 September, the New Caledonian chapter of the cycling association ‘Droit au Vélo’ hosted (fr) a mapping party in its office in Nouméa. The event focused on mapping cycling infrastructure and welcoming new mappers.
  • The Hungarian colleagues have moved their meetings and ‘regulars’ table’ to the Internet.
  • A virtual mapathon will be held (it) > en on 30 September by the Italian OSM community. During the mapathon participants will draw the maps of the 78 municipalities of Benevento based on satellite images using OpenStreetMap. The event is a collaboration between Valerio De Luca (Map For Future Roma (it) > en ), Luciano Amodio (ThinkSannio (it)), and Nicola De Innocentis (GeoPillole), with the support of Wikimedia Italia (it) > en .

Humanitarian OSM

  • The European Youth Humanitarian OpenStreetMap is planning a two-day online workshop on 15 and 16 October, aimed at students and teachers. The participants will learn to use JOSM and will map the streets and houses on the island of Terceira (Azores), as pre-disaster mapping.
  • The app MapSwipe is being extended to be able to compare the completeness of OSM data with airborne or satellite imagery. A first prototype was developed and tested in the LOKI project at Heidelberg University. The use case is building footprints, as they are important inputs to earthquake risk exposure models.
  • HOT detailed their strategy for achieving equal pay and organisational gender equality in a news post on International Equal Pay Day.
  • HOT is currently in a hiring push, with many jobs available, to scale up their staff to carry out the Audacious Project to map an area across 94 countries, home to one billion people.

Education

  • Daniel Feldmeyer et al. have published an article with the title ‘Using OpenStreetMap Data and Machine Learning to Generate Socio-Economic Indicators’ in the International Journal of Geo-Information. A remarkable sentence from the abstract: ‘OSM provides an unparalleled standardised global database with a high spatio-temporal resolution. Surprisingly, the potential of OSM seems largely unexplored in this context’. In their study, they used machine learning to predict four exemplary socio-economic indicators for communities based on OSM data.

Maps

  • Camille Scheffler announced (fr) > en on Talk-fr that magOSM (fr) > en , a project of services linked to OSM thematic data from the Magellium company, offers four new thematic layers at the level of France (fr):
    1. Train Itineraries
    2. Social structures
    3. Construction Proposals
    4. Constructions in progress
  • Jean-Louis Zimmermann mentioned (fr) > en on Talk-fr that OSM-fr publishes a demo map of thematic layers known only to the initiated. Further announcements are planned, and comments are welcome. The different layers, such as territories, schools, and transportation, are selected simply on the map from a list.
  • Google Maps will soon show COVID-19 risk areas in its mobile versions so that users can avoid them. This does not mean that routing will automatically avoid these areas.

Open Data

  • The Digital Elevation Model (DEM) of Italy at 10 m resolution is available (it) > en for the whole country as open data and downloadable here.
  • The Baden-Württemberg mobility data platform MobiData BW started (de) > en operations on 14 September 2020. On the new website (de) > en , the available data and interfaces are clearly described and are available for further use.

Software

  • A new WordPress plugin called ‘Out of the Block’ has been published. Benefiting from WordPress’s Gutenberg editor, the plugin tries a different approach to adding locations on a map and other user actions compared to existing plugins.
  • Google is a heavy user of open source tools. They are using iD for the historical map project Kartta Labs. Kartta Labs uses a customised version of iD, which lets everyone draw the historical landscape of a city based on georeferenced old paper maps.

Releases

  • An update of OsmAnd has been released. For iOS, version 3.80 introduces application profiles with independent settings, the ability to import and export profile settings, the capability to download online maps to cache, and fixes a crash that appeared while starting navigation. On Android, version 3.8 brings an updated plan a route function, a new appearance menu for tracks, and improved search algorithms.

Did you know …

  • … you can download (de) > en geological maps for all of Germany to your smartphone, with this detailed guide, and use them offline with OsmAnd?
  • … about Taginfo, the site which allows you to explore the tags and values used in OSM? Several local versions are available for you to explore at a country level.
  • … there is a step-by-step guide (it) > en on how to display parking spaces reserved for people with disabilities in OsmAnd?

OSM in the media

  • Just van den Broecke informed (nl) > en us that the live streaming web show, which ran April to June 2020, will return, running on the first Thursday each month (19:00-21:00 CEST), starting on 1 October.
  • Rina Chandran reported about HOT’s activities in Indonesia, where handwashing stations were mapped to help fight COVID-19.

Other “geo” things

  • Geospatial Media and Communications will present the Geospatial World Awards on 6 October, virtually for the first time due to the COVID-19 pandemic. Since its launch in 2007, the awards have been presented to more than 200 individuals and organisations for remarkable innovations and ideas in global geodesy.
  • After a successful trial showed enormous potential, Ramblers have embarked on the Mapping Scotland’s Paths project.

Upcoming Events

Where What When Country
Montrouge Soirée de fin de projet “Ça reste ouvert” 2020-09-24 france
Bratislava Missing Maps Mapathon Bratislava #9 2020-09-24 slovakia
Munich TUM – Mapping Party 2020-09-24 germany
online HOT Working Groups 101 Community Webinar 2020-09-25 united kingdom
Düsseldorf Düsseldorfer OSM-Stammtisch 2020-09-25 germany
Helsinki State of the Map Suomi 2020 2020-09-26 finland
online FOSSGIS OSM Communitytreffen 2020-09-27 germany
Salt Lake City / Virtual OpenStreetMap Utah Map Night 2020-09-29 united states
Zurich Missing Maps Mapathon Zürich 2020-09-30 switzerland
Ulm + virtuell Covid-19-Mapathon 2020-10-01 germany
San José Civic Hack & Map Night 2020-10-01 united states
Taipei OSM x Wikidata #21 2020-10-05 taiwan
London Missing Maps London Mapathon 2020-10-06 united kingdom
Stuttgart Stuttgarter Stammtisch 2020-10-07 germany
Berlin 148. Berlin-Brandenburg Stammtisch 2020-10-09 germany
Michigan Michigan Online Meetup 2020-10-12 USA
Cologne Bonn Airport 132. Bonner OSM-Stammtisch (Online) 2020-10-13 germany
Munich Münchner Stammtisch 2020-10-13 germany
online 2020 Pista ng Mapa 2020-11-13-2020-11-27 philippines
online FOSS4G SotM Oceania 2020 2020-11-20 oceania

Note: If you would like to see your event here, please put it into the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by AnisKoutsi, Anne Ghisla, Climate_Ben, MatthiasMatthias, MichaelFS, Nordpfeil, NunoMASAzevedo, PierZen, Polyglot, Rogehm, Supaplex, TheSwavu, YoViajo, alesarrett, derFred, richter_fn.

Should I substr(), substring(), or slice()?

23:00, Friday, 25 September 2020 UTC

What’s the deal with these string methods, and how are they different?

String substr()

str.substr(start[, length])

This method takes a start index and, optionally, a number of characters to read from that start index, with the default being to read until the end of the string.

'foobar'.substr(2, 3); // "oba"

The start parameter may be a negative number, for starting relative from the end.

Note that only the first parameter of substr() supports negative numbers. This is in contrast to most methods you may be familiar with that support negative offsets, such as String#slice() or Array#slice(). The second parameter may not be negative. In fact, it isn’t an end index at all. Instead, it is the (maximum) number of characters to return.
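For example:

'foobar'.substr(-3);    // "bar" (start counts back from the end)
'foobar'.substr(-3, 2); // "ba"
'foobar'.substr(2, -1); // ""    (a negative length is treated as 0)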

But, in Internet Explorer 8 (and earlier IE versions), the substr() method deviates from the ECMAScript spec. Its start parameter doesn’t support negative numbers. Instead, these are silently ignored and treated as zero. (I noticed this in 2014, shortly before Wikimedia disabled JavaScript for IE 8.)

IE 8:

'faux'.substr( -1 ); // "faux"

Standard behaviour:

'faux'.substr( -1 ); // "x"

And, the name and signature of substr() are deceptively similar to those of the substring() method.

String substring()

str.substring(start[, end])

This method takes a start index, and optionally an end index. At a glance, a very simple and low-level method. No relative lengths, negative offsets, or any other trickery. Right?

Behold! The two parameters automatically swap if start is larger than end.[1]

'foobar'.substring(1, 4); // "oob"
'foobar'.substring(4, 1); // "oob", also!

Unexpected values such as null, undefined, or NaN are silently treated as zero. For substring() this also applies to negative numbers.
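For example:

'foobar'.substring(-2, 3);  // "foo"  (-2 is treated as 0)
'foobar'.substring(NaN, 3); // "foo"  (NaN is treated as 0)
'foobar'.substring(4, -1);  // "foob" (-1 becomes 0, then swaps with 4)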

And, of course, the name and signature of substring() are deceptively similar to substr().

String slice()

str.slice(start[, end])

This method takes a start index, and optionally an end index that defaults to the end of the string. Either parameter may be a negative number, which is interpreted as a relative offset from the end of the string.

I found no defects in browsers or JavaScript engines implementing this method. And it has been around since the beginning of time.

Its only weakness is also its greatest strength — full support for negative numbers.

One might think this can be ignored for cases where you only intend to work with positive numbers. You’d be right, until you write code like the following:

start = something.indexOf(needle); // returns -1 if needle not found.
remainder = str.slice(start); // oops, -1 means something else here!

The notion of negative offsets was confusing to me when I first learned it. But, over the years, I’ve come to appreciate it and it actually became second nature to think about offsets in this way. If you’re unfamiliar, see the examples below.

Conclusion

First, let us compare these methods once more:

str = 'foobarb…z';

// Strip start "foo" > "barb…z"
str.slice(3);
str.substring(3);
str.substr(3);

// Strip end "z" > "foobarb…"
str.slice(0, -1);
str.substring(0, str.length - 1);
str.substr(0, str.length - 1);

// Strip "foo" and "z" > "barb…"
str.slice(3, -1);
str.substring(3, str.length - 1);
str.substr(3, str.length - 3 - 1); // 👀

// Extract start > "foo"
str.slice(0, 3);
str.substring(0, 3);
str.substr(0, 3);

// Extract end > "z"
str.slice(-1);
str.substring(str.length - 1);
str.substr(str.length - 1); // Compat
str.substr(-1); // Modern

// Extract 4 chars at [3] > "barb"
str.slice(3, 3 + 4);
str.substring(3, 3 + 4);
str.substr(3, 4); // 👀

None of these seem unreasonable, in isolation. It’s nice that slice() allows negative offsets. It’s nice that substring() may limit the damage of accidentally negative offsets. It’s nice that substr() allows extracting a specific number of characters without needing to add to the start index.

But having all three? That can incur a very real cost on development in the form of doubt, confusion, and inevitably mistakes. I don’t think any of these is worth that cost over some minute localised benefit.

I find that substr() and substring() cast doubt on surrounding code. I need to second-guess the author’s intentions when reviewing or debugging such code. That is wasteful even, or especially, when they (or I) used one correctly.

But what about unit tests? Well, there’s sufficient overlap between the three that a couple of good tests may very well pass. It’s easy to forget exercising every possible value for a parameter, especially one that is passed through to a built-in. The question isn’t whether the built-in works. The question is – did we use the right one?

The ubiquitous signature of slice() is well understood. It is a de facto standard in technology, seen in virtually all programming languages, and it applies to strings, arrays, and sequences of all sorts. As such, that’s the one I tend to prefer.

But more important than which one you choose, I think, is the act of choosing itself. Eliminating the others from your work environment reduces cognitive overhead in development, with one less worry whilst reading code, and one less decision when writing it. [2]


  1. This “argument swapping” behaviour in substring() has existed since the original JavaScript 1.0 as implemented in Netscape 2 (1996), and reverse-engineered by Microsoft in IE 3. The behaviour was briefly removed by Netscape 4 with JavaScript 1.2 in June 1997, but that same month the misfeature finished its fast-tracked standardisation as part of ECMAScript 1. Thus, the misfeature returned in 1998 with the release of Netscape 4.5 and JavaScript 1.3, which aligned itself with the new specification. ↩︎
  2. In 2014, I wrote a lengthy code review about the string methods which, after much delay, I used as the basis for this article. ↩︎

For the third year in a row, Amazon has renewed its commitment to the Wikimedia Endowment, a permanent fund dedicated to ensuring the long-term future of Wikipedia and the other Wikimedia free knowledge projects. With this gift, Amazon reinforces the importance of ensuring people everywhere can create and share free knowledge.  

With millions of students learning from home, teachers adapting their curricula to online education, and people looking for up-to-date information about the spread of COVID-19, the world has turned to Wikipedia as a critical source of knowledge during the pandemic. The need to conserve and expand the world’s largest free knowledge ecosystem has become even more acute. 

Corporate donations, like those from Amazon today, allow the Wikimedia Foundation, the nonprofit that operates Wikipedia, to prepare for uncertain times and ensure Wikipedia and the other Wikimedia free knowledge projects remain available for everyone to use, even during urgent crises. Companies regularly use Wikimedia projects and our freely-licensed content to power their work and further advance their goals. Commensurate support from these organizations helps sustain our projects and mission of delivering free knowledge to the world.

“Wikipedia is one of the largest and most beloved websites in the world, with hundreds of millions of people seeking information from its articles every month. The Wikimedia Foundation is committed to keeping Wikipedia and the other Wikimedia projects free and available to all, and gifts like this latest contribution from Amazon help us do just that,” said Amy Parker, Endowment Director at the Wikimedia Foundation. 

The Foundation has raised $64 million for the Endowment since its inception, thanks to gifts from generous individual, corporate, and foundation donors. To learn more about how you can support the Wikimedia Endowment, visit www.wikimediaendowment.org or email endowment@wikimedia.org.

CI now updates your deployment-charts

09:44, Friday, 25 September 2020 UTC

If you're making changes to a service that is deployed to Kubernetes, it sure is annoying to have to update the helm deployment-chart values with the newest image version before you deploy. At least, that's how I felt when developing on our dockerfile-generating service, blubber.

Over the last two months we've added

And I'm excited to say that CI can now handle updating image versions for you (after your change has merged), in the form of a change to deployment-charts that you'll need to +2 in Gerrit. Here's what you need to do to get this working in your repo:

Add the following to your .pipeline/config.yaml file's publish stage:

promote: true

The above assumes the defaults, which are the same as if you had added:

promote:
  - chart: "${setup.project}"           # The project name
    environments: []                    # All environments
    version: '${.imageTag}'             # The image published in this stage

You can specify any of these values, and you can promote to multiple charts, for example:

promote:
  - chart: "echostore"
    environments: ["staging", "codfw"]
  - chart: "sessionstore"

The above values would promote the production image published after merging to all environments for the sessionstore service, and only the staging and codfw environments for the echostore service. You can see more examples at https://wikitech.wikimedia.org/wiki/PipelineLib/Reference#Promote

If your containerized service doesn't yet have a .pipeline/config.yaml, now is a great time to migrate it! This tutorial can help you with the basics: https://wikitech.wikimedia.org/wiki/Deployment_pipeline/Migration/Tutorial#Publishing_Docker_Images

This is just one step closer to achieving continuous delivery of our containerized services! I'm looking forward to continuing to make improvements in that area.

Wikimedia’s Event Data Platform – Event Intake

20:54, Thursday, 24 September 2020 UTC

By Andrew Otto, Staff Site Reliability Engineer

In our previous post in this series, we described how Wikimedia manages and distributes event schemas. Now that we’ve got that nailed down, let’s look into how we use them to ensure events conform to their schemas.

Kafka has client implementations in many languages, but those implementations don’t do any schema validation. If we were using Avro with Confluent’s Schema Registry, we’d need custom Serializers (e.g. KafkaAvroSerializer or this Python AvroProducer) that know how to contact a specific Schema Registry service before they can produce data. Fortunately for us, we don’t need to do any special serialization, as we don’t need a schema to serialize JSON data. However, we’d like to enforce that event data in Kafka topics always contain events that validate with a specific schema. 

Even though Kafka clients exist in many languages, the quality and freshness of those client implementations vary. One of Wikimedia’s main Kafka use cases is producing event data from MediaWiki, which is written in PHP. Our MediaWiki application servers handle around 15,000 requests per second. Kafka is pretty efficient at producing messages, but the bootstrap and connection time for a new Kafka producer isn’t quite as fast. Because PHP follows a shared-nothing, per-request execution model, MediaWiki would have to create and connect a new Kafka producer client for each of these requests, which would add serious overhead and latency to the application servers handling them.

We also need to intake events from external sources, like browsers and mobile apps, which don’t have access to internal Kafka clusters.

Sometimes you need to produce data to Kafka without a Kafka client. Confluent has a Kafka-HTTP REST Proxy to handle situations like these. Wikimedia has EventGate.

EventGate

EventGate is a NodeJS library and HTTP service that receives event data in HTTP POST requests, validates the event, and then produces it. The validate and produce logic are pluggable, but the original intention was for EventGate to validate events against JSONSchemas looked up from $schema URIs, and produce them to Kafka.

When used as an HTTP service, the API is very simple:

POST /v1/events

The POST body should be a JSON-serialized array of events (or just a single event object) to validate and produce. It is expected that each event contains information that the provided validate and produce logic can use to do their jobs. The provided EventValidator implementation of validate expects that events have a schema URI field (default $schema) that can be used to look up the event’s schema for validation. The provided Kafka produce functionality expects that a configured field in the event contains the destination ‘stream’ name, which is used to determine the Kafka topic that the event should be produced to.
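As an illustration, producing a single event might look like the following. The endpoint, schema URI, and stream name here are hypothetical, and the stream field is assumed to live at meta.stream:

// Illustrative only: POST one event to an EventGate instance.
fetch( 'https://eventgate.example.org/v1/events', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify( {
        $schema: '/my_service/my_event/1.0.0',   // used to look up the JSONSchema
        meta: { stream: 'my_service.my_event' }, // used to pick the Kafka topic
        greeting: 'hello world'
    } )
} );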

Note that the API itself does not include information about what topic an event should be produced to, or what schema the body of events should be validated with. This is intentional. Even though EventGate was built for use with Kafka, we wanted to keep the main functionality generic and pluggable, and so the API had to be generic too. We also wanted to be able to batch events of multiple types and destinations in a single HTTP request, so an API parameter that specified e.g. a single schema would not help, since the POSTed events could all have different schemas.

The provided implementations in the main EventGate repository work well. They solve the problem of finding event schemas at runtime and using them to validate events, and they can produce to Kafka. But they are missing a crucial requirement: the ability to restrict the types of events that make it into a Kafka topic.

Stream configuration

Confluent’s REST Proxy doesn’t do this either. Instead, it relies on you setting specific topic configurations that specify the name of a ‘strategy’ implementation that (by default) maps from the topic name to the name (subject) of a schema in the schema registry. Your producer implementation must know how to look up topic configurations and interact with the schema registry. This is all fine if you are using Java, as Confluent expects you to. But what if you are using a different language? You’ll now not only need a good Kafka client, but also an implementation of e.g. io.confluent.kafka.serializers.subject.TopicNameStrategy in your language.

Wikimedia solves this by creating what we call ‘stream configuration’, and using that to map from stream names (which are just an abstraction of Kafka topic names) to configuration about those streams, including the title of the JSONSchema that is allowed. JSONSchemas have a title, and we expect that all schema versions of the same lineage have the same title. The configuration for a stream has a schema_title setting. When our EventGate validate function gets an event to validate, it extracts both the $schema URI and the stream name from the event. The stream name is used to lookup stream configuration, from which the schema_title setting is compared with the schema’s title. If they match, and if the event validates against its JSONSchema, it is allowed to be produced.
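In rough pseudocode, the check described above looks like this (a sketch, not the actual EventGate implementation; the helper functions are hypothetical):

// Sketch of the stream-configuration check.
async function validateForStream( event ) {
    const schema = await fetchSchema( event.$schema );
    const streamConfig = await fetchStreamConfig( streamNameOf( event ) );

    // A stream only accepts events whose schema lineage matches its config.
    if ( schema.title !== streamConfig.schema_title ) {
        throw new Error( 'Schema not allowed in this stream' );
    }
    // Then validate the event against the JSONSchema itself.
    return validateAgainstSchema( event, schema );
}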

We also use stream configuration for other things, including arbitrary configuration of remote producer clients. This allows product engineers and analysts to change certain behaviors of data producers, like the sampling rate, via configuration rather than code deployment.

Wikimedia’s stream configuration is useful for us, but its implementation would have been difficult to make generic for outside users, so we did not include it in the main EventGate library. We also had the need for some other Wikimedia specific customizations of EventGate. Since EventGate is pluggable, our implementation is in a separate repository that uses EventGate as a dependency. 

Pros and cons

I’ve talked a lot about how Confluent does things, and why we’ve done things differently. However, I’d like to acknowledge that Confluent is building software for use by many different organizations, which is not easy to do. The Schema Registry and all of the custom code needed to use it do have one huge advantage: you can use a Kafka client to produce messages directly to Kafka, and still get all the automated schema validation done for you in your application code. Kafka clients have some amazing features (exactly once, transactions, etc.) that are not available if you are using an HTTP proxy. The downside is that you need Confluent-specific code available in your language of choice.

Wikimedia’s polyglot and distributed nature makes doing things the Confluent way difficult. EventGate and our decentralized schema model work for us, and we think there are other organizations out there that might find our approach useful. See the EventGate README for more on how to use EventGate.

EventGate, jsonschema-tools, and everything else that Wikimedia creates is always 100% Free Open Source software. We’d love any thoughts and contributions you might have, so please reach out with questions or issues or pull requests.

Acknowledgements

Wikimedia’s Event Data Platform would not be possible without the numerous Open Source projects it is built on. I’d like to take a moment to thank a few of the critical ones.

About this post

This is the final part of a three-part series. Read part 1; read part 2.

Featured image credit: A stile at the bottom of Arreton Down, Editor5807, CC BY 3.0

Viann N. Nguyen-Feng, PhD, MPH, is an assistant professor in the Department of Psychology at the University of Minnesota, Duluth. She serves as core faculty in the counseling/clinical track and directs the Mind-Body Trauma Care Lab.

Viann Nguyen-Feng
Viann Nguyen-Feng (photo by Bobby Rogers, used with permission of subject; all rights reserved).

In late Fall 2019, I was doing my usual perusal of the Association for Psychological Science’s (APS) e-newsletter with its Weekly Wiki feature: “Wikipedia says: ‘In psychology and other social sciences, the contact hypothesis suggests that intergroup contact under appropriate conditions can effectively reduce prejudice between majority and minority group members.’ Did they get it right? Read on.”

Although the APS Wikipedia Initiative has been active since 2015 or so, that day I decided that, yes, I wanted to read on. Many of the previously highlighted terms had appeared more social- or cognitive-focused than my area of work in counseling psychology (pop quiz: what does Wikipedia say about executive function? Anchoring? Self-categorization theory? Spatial memory?). But I then realized that my students and I could also find spaces to contribute; anyone can help make psychological science on Wikipedia as complete and accurate as possible.

Selecting psychological statistics for Wiki Education

The question then shifted to determining into which Spring 2020 class I could incorporate a semester-long Wiki Education assignment. Two of my courses involved too many direct-service hours (Internship, Assessment II) to add another assignment without kindling overwhelming student stress. My remaining course option was Advanced Statistics II.

Involving a dozen early graduate students in compiling information for a Wikipedia article felt daunting. Statistics in psychology has traditionally — and understandably — been taught with a focus on methodology without intensive writing components. Further, the Psychology WikiProject, a categorization of psychology-related articles targeted for improvement, did not appear to have a psychological statistics topic. 

With curiosity about statistics as an option, I reached out to Dr. Helaine Blumenthal (Wikipedia Student Program Manager) and Dr. Ian Ramjohn (Senior Wikipedia Expert), who provided guidance on completing Wiki Education’s instructor orientation. The orientation was only 30 minutes long, yet it enabled me to see that Wikipedia “isn’t just for words.” There was space for multimedia, photography, and illustration courses and, most relevantly, for courses in which students visualize data and could create original graphical figures. Wiki Education even provided guidance on designing a media contribution assignment. And thus, Wiki Education entered my Advanced Statistics II syllabus.

Making it happen

My first Wiki Education statistics semester took off with feedback and advice from Helaine, Ian, and my assigned psychology Wiki Education mentor, Dr. Patricia J. Brooks. I landed on having students complete three manuscript-style written reports for which they needed to provide their own data. Reports included interpretation of data analyses performed in “lab” and required at least one table or figure. In turn, students were assigned to provide three Wiki Education media contributions related to each report topic discussed during the semester. My class of nine students enrolled in Spring 2020 (an unprecedented semester, as you all know) uploaded 32 original files to Wikimedia Commons.

I did not realize that psychological statistics was an uncommon Wiki Education course until that particularity was mentioned during an end-user testing meeting of new Dashboard features. The listed courses on the APS Wiki Education Campaign indeed seem to suggest a paucity of statistics classes — perhaps Advanced Statistics II at the University of Minnesota, Duluth was it. However, I hope psychological statistics instructors will come to recognize that Wiki Education is certainly a place for you and your students. 

Visualization of statistical concepts, particularly in areas quite relevant to the social sciences (e.g., mediation, moderation), appears quite needed. Most of my students contributed graphical statistical models using general or specific examples, though color-coded equations and a table were also in the assortment. All students learned how to write captions that communicated concise, accurate, and understandable information to the general public. Currently, more than four months after the conclusion of the course, 31 files remain in Wikipedia articles. These files comprise 26% of all APS Wiki Education Campaign files currently retained in articles and demonstrate a 97% overall retention rate for the course (vs. a typical 53% of all files, exclusive of this statistics course).

As the idiom goes, a picture is worth a thousand words — or, in the case of psychological statistics, a thousand numbers.

Getting student feedback

By incorporating Wiki Education into Advanced Statistics II, my students saw that statistics could be applied outside of the classroom, as they indicated on midterm and end-of-term evaluations. Learning statistical theories and procedures was not simply an intellectual exercise, but a means to understanding other domains and an opportunity to share seemingly difficult concepts with others in a meaningful way:

“I think I will find myself using some of these teaching methods in the future! For example, I am a big advocate for being able to teach a concept/skill to someone else as a demonstration that you really understand that skill.”

“I really enjoyed learning the material and have been able to apply it in my work and daily life.”

“She really helps us engage with the information. I feel like I can take what I’ve learned in this class and apply it outside of the course.”

Because the media contributions were cumulative assignments of in-lab material, Wiki Education seemed well integrated into the course rather than a random side assignment. Approaching statistics as an applied laboratory and public engagement concept was initially challenging, yet grew on students over time:

“I really enjoyed how the course was set up once we knew how to handle labs.”

“I also like the way the weekly labs were set up as it required applying the information that we had just learned.”

“The design of the course has been different than what many of us are traditionally used to, but I think that the design has applied learning built into it with immediate feedback on how we did. It’s still hard to get used to the fact… but the labs seem to be a good check on our knowledge in a more applied way.”

By creating a space for students to share their knowledge, the students also appeared to increase their confidence in statistics:

“Personally, I think that I was able to grow in confidence in my statistical knowledge. When the first labs were completed, I felt like I knew nothing, and now when I perform a lab I feel like I know what I turned in is great!”

“This semester I have learned so much about statistics. I’ve gained a newfound interest and appreciation for it. In the past, I had previous teachers who pretty much made me lose my interest in statistics and made me feel like I would never be able to get it. But by taking this class with you I’ve learned so much and I’ve been able to apply statistics in my own projects, and am now even able to help other students understand it! …I’m really happy I took this class!!!”

China yesterday blocked the Wikimedia Foundation’s application for observer status at the World Intellectual Property Organization (WIPO), the United Nations (UN) organization that develops international treaties on copyright, IP, trademarks, patents and related issues. As a result of the block, the Foundation’s application for observer status has been suspended and will be reconsidered at a future WIPO meeting in 2021.

China was the only country to raise objections to the accreditation of the Wikimedia Foundation as an official observer. Their last-minute objections claimed Wikimedia’s application was incomplete, and suggested that the Wikimedia Foundation was carrying out political activities via the volunteer-led Wikimedia Taiwan chapter. The United Kingdom and the United States voiced support for the Foundation’s application.

WIPO’s work, which shapes international laws and policies that affect the sharing of free knowledge, impacts Wikipedia’s ability to provide hundreds of millions of people with information in their own languages. The Wikimedia Foundation’s absence from these meetings further separates those people from global events that shape their access to knowledge.

“The Wikimedia Foundation operates Wikipedia, one of the most popular sources of information for people around the world. Our organization can provide insights into global issues surrounding intellectual property, copyright law, and treaties addressed by WIPO that ensure access to free knowledge and information,” said Amanda Keton, General Counsel of the Wikimedia Foundation. “The objection by the Chinese delegation limits Wikimedia’s ability to engage with WIPO and interferes with the Foundation’s mission to strengthen access to free knowledge everywhere. We urge WIPO members, including China, to withdraw their objection and approve our application.”

A wide range of international and non-profit organizations as well as private companies are official observers of WIPO proceedings and debates. These outside groups offer technical expertise, on-the-ground experience, and diversity of opinions to help WIPO with its global mandate.

“The Wikimedia Foundation calls on the member states of WIPO to reconsider our application for observer status and encourages other UN member states to voice their support for civil society inclusion and international cooperation,” said Keton.

The Wikimedia Foundation provides the essential infrastructure for free knowledge and advocates for a world in which every single human being can freely share in the sum of all knowledge.

Production Excellence #23: July & August 2020

18:10, Wednesday, 23 2020 September UTC

How’d we do in our quest for operational excellence last month? Read on to find out!

📈   Incidents

4 documented incidents in July, and 2 documented incidents in August. [1] Historically, that's about average for this time of year. [5]

For more about recent incidents see Incident documentation on Wikitech, or Preventive measures in Phabricator.


📊   Trends

Take a look at the workboard and look for tasks that could use your help.
https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Summary over recent months:

  • ⚠️ July 2019 (4 of 18 tasks left): one task closed.
  • ⚠️ August 2019 (1 of 14 tasks left): no change.
  • ⚠️ September 2019 (3 of 12 tasks left): two tasks closed.
  • October (6 of 12 tasks left): no change.
  • November (3 of 5 tasks left): no change.
  • December (3 of 9 tasks left): two tasks closed.
  • January 2020 (5 of 7 tasks left): no change.
  • February (2 of 7 tasks left): two tasks closed.
  • March (2 of 2 tasks left): no change.
  • April (10 of 14 tasks left): one task closed.
  • May (7 of 14 tasks left): four tasks closed.
  • June (10 of 14 tasks left): four tasks closed.
  • July 2020: 13 of 24 new tasks survived the month of July and remain open today.
  • August 2020: 37 of 53 new tasks survived the month of August and remain open today.

Recent tally:

  • 72 open, as of Excellence #22 (Jul 23rd).
  • -16 closed, of the previous 72 recent tasks.
  • +13 opened and survived July 2020.
  • +37 opened and survived August 2020.
  • 106 open, as of today (Sep 23rd).

Previously, we had 72 open production errors over the recent months up to June. Since then, 16 of those were closed. But the 13 and 37 errors that survived July and August raise our recent tally to 106.

The workboard overall (including tasks from 2019 and earlier) held 192 open production errors on July 23rd. As of writing, the workboard holds 296 open tasks in total. [4] This +104 increase is largely due to the merged backlog of JavaScript client errors, which were previously untracked. Note that we backdated the majority of these JS errors under “Old”, so they are not amongst the elevated numbers for July and August.


🎉   Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


👊🍺 Tyler: “You know man, it could be worse! […]” Narrator: “[but] I was close... to being complete.”

Tyler: “Martha's polishing the brass on the Titanic. It's all going down, man. […] Evolve! Let the chips fall where they may.”
Narrator: “What!?” Tyler: “The things you own..., they end up owning you.”

Footnotes:
[1] Incidents. – https://wikitech.wikimedia.org/wiki/Incident_documentation
[2] Tasks created. – https://phabricator.wikimedia.org/maniphest/query…
[3] Tasks closed. – https://phabricator.wikimedia.org/maniphest/query…
[4] Open tasks. – https://phabricator.wikimedia.org/maniphest/query…
[5] Wikimedia incident stats. – https://codepen.io/Krinkle/full/wbYMZK

Wikipedia editing in an upper-level biology elective

16:15, Tuesday, 22 2020 September UTC

Heather Olins is an Assistant Professor of the Practice in the Biology Department and Environmental Studies Program at Boston College. Here she reflects on her students’ experience with Wikipedia editing in an upper-level elective course, and adjustments she is making to the project in its second iteration.

Heather Olins

I teach a biology elective course called Deep Sea Biology. My primary goal in this course of mostly senior bio majors is to increase awareness of, and enthusiasm for, the deep sea. I don’t particularly care if 5 years after my course students still remember what depth corresponds to the ocean’s abyssopelagic zone, but I hope that students are still thinking about how little we know about the ocean’s depths, that they remember the ways in which their own lives are indirectly affected by Earth’s oceans, and that they consider the impact their daily actions have on the marine realm.

Because of these course goals, I have chosen to forego a traditional final exam, and instead have students take on a final project. I’ve done this in a number of ways, but in the Spring of 2020, I added Wikipedia editing (through Wiki Education) to the course for a few reasons. I wanted something that was an authentic contribution to the world beyond our classroom. When students know that I am the only person reading their paper, they feel a different level of engagement and ownership than they do for work that will, ultimately, be publicly accessible. One of the themes I stress over and over again is how poorly understood the deep sea is. We spend a lot of time reading the scientific literature for the course, which is accessible to us through our institutional journal subscriptions, but much of this information is not accessible to the public. Translating and summarizing some of that information for a general audience seemed like a valuable academic exercise, and sharing it on Wikipedia felt like a valuable contribution. I could tell my students that they were going to play a role in making our knowledge about the deep sea more accessible to the general public. I could tell that this responsibility and challenge was something that appealed almost instantly to many of my students. They all use Wikipedia regularly, and have heard mixed messages about whether or not it is a “good” source, so in addition to the contributions they would make, this project would help them better understand how Wikipedia works.

Now that I’ve gone through the Wikipedia editing process once, I’ve learned a few things that will hopefully make things smoother. I want to separate out the research from the editing more than I did in the past. Trying to do both things simultaneously was frustrating for some students. The Wiki Education platform has lots of great tutorials, but having students do these too early (before they’ve selected a topic, for example) left some students feeling like the Wiki Education platform was a lot of busy work… I’m hoping that condensing these tutorials and assigning them right when students need that information will help.

This semester (Fall 2020) I am formatting the project in 3 parts:

1. Background research: Students choose their topic, read primary source documents, and create an annotated bibliography.

2. Wikipedia editing: Students summarize and translate their research into content that can be added to Wikipedia. For some topics this might just be adding some references and citations, for others it may be a page reorganization, and for some it could involve adding substantial text to their topic’s page (or even creating a new page). 

3. Get creative: Students propose a creative way to share what they’ve learned about their topic, that they will submit digitally for peer review before final submission. This semester especially, I wanted students to be excited and energized by their projects so I’m leaving this pretty open. I might end up with children’s books, music videos, art projects… who knows!

The following are some quotes from my students last semester about the Wikipedia editing project.

  • With the Wiki Education project, I thought it was a cool way to combine research and teamwork while making an impact by providing updated information to people browsing Wikipedia. I enjoyed being in groups for the project and would not want to complete the task alone. Some of the training felt tedious but it was also helpful to understand how to actually make edits and utilize the software.
  •  I did enjoy the Wikipedia assignment, it was cool to be behind the scenes like that and get to make a contribution!
  • I really enjoyed the Wikipedia project. It was interesting to study just one deep sea organism, and although we didn’t get to finish, it helped a lot with scientific writing and research.
  • I liked doing the research and editing the article myself, and it made me feel like I was contributing to something bigger than had I just done the research and written a paper just for a professor to read and then nothing came from it.
  • I think Wikipedia was a good way to expose those who do not know a lot about the deep sea to the greater community. It showed that people search this and look through this information.
  • I think I’ll probably remember the Wikipedia project — I now have a lot of oddly specific knowledge of an obscure genus of octopus, which should be fun at parties.
  • I think the Wikipedia project was actually fun and it was a good experience in finding proper sources and adding to articles.
  • I also really enjoyed the Wikipedia project, the team, and that it was something I could feel good about because others benefitted from our efforts.

From Gerrit to Gitlab: join the discussion

19:45, Monday, 21 2020 September UTC

By Tyler Cipriani, Manager, Release Engineering

There is a lot of Wikimedia code canonically hosted by the Wikimedia Gerrit install. Gerrit is a web-based git repository collaboration tool that allows users to submit, comment on, update, and merge code into its hosted repositories. 

Gerrit’s workflow and user experience are unique when compared to other popular code review systems like GitHub, Bitbucket, and GitLab. Gerrit’s method of integration is focused on continuous integration of stacked patchsets that may be rearranged and merged independently. In Gerrit there is no concept of feature branches where all work on a feature is completed before it’s merged to a mainline branch—the only branch developers need to worry about is the mainline branch. The consequence of this is that each commit is a distinct unit of change that may be merged with the mainline branch at any time. 

The primary unit of change for GitHub and other review systems is the pull request. Thanks to the proliferation of GitHub, pull requests (synonymous with “merge requests”) have become the de facto standard for integration. The type of continuous integration used by Gerrit can allow for more rapid iteration by closely aligned teams but might be hostile to new contributors.

Following an announcement in 2011, in early 2012 Wikimedia moved from Subversion to Git and chose Gerrit as the code review platform. The following summer a consultation resulted in affirming that Wikimedia development was staying on Gerrit “for the time being”. Since 2012, new Open Source tools for git-based code review have continued to evolve. Integrated self-service continuous integration, easy repository creation and browsing, and pull requests are used for development in large Open Source projects and help define user expectations about what a code review should do.

Gerrit’s user interface has improved — particularly with the upgrade from version 2 to version 3 — but Gerrit is still lacking some of the friendly features of many of the modern code review tools like easy feature branch creation, first-class self-service continuous integration, and first-class repository navigation. Meanwhile, the best parts of Gerrit’s code review system — draft comments, approvals, and explicit approvers — have made their way into other review systems. Gerrit’s unique patchset workflow has a lot of advantages over the pull request model, but, maybe, that alone is not a compelling enough reason to avoid alternatives.

Enter GitLab

Earlier this year, as part of the evaluation of continuous integration tooling, the Wikimedia Foundation’s Release Engineering team reviewed GitLab’s MIT-licensed community edition (CE) offering and found that it met many of the needs for our continuous integration system—things like support for self-service pre- and post-merge testing, a useful ACL system for reviewers, multiple CI executors supporting physical hosts and Kubernetes clusters, support for our existing git repositories, and more.

GitLab has been adopted by comparable Open Source entities like Debian, FreeDesktop.org, KDE, Inkscape, Fedora, and the GNOME project.

GitLab is a modern code review system that seems capable of handling our advanced CI workflows. A move to GitLab could provide our contributors with a friendly and open code review experience that respects the principles of freedom and open source.

Feedback

As shepherds of the code review system, the Release Engineering team reached the stage of evaluations where we need to gather feedback on the proposal to move from Gerrit to GitLab. The Wikimedia Gerrit install is used in diverse ways by over 2,500 projects. To reach an equitable decision about whether or not GitLab is the future for our code hosting, we need the feedback of the technical community.

On 2 September 2020, we announced the beginning of the GitLab consultation period. We invite all technical contributors with opinions about code review to speak their mind on the consultation talk page.

From now until the end of September 2020, a working group composed of individuals from across our technical communities will be collecting feedback and responding on the consultation talk page. Following this consultation period, the working group will review the feedback it has received, and it will produce a summary, recommendation, and supporting deliverables.

It’s difficult to make decisions collaboratively, but those decisions are stronger for the effort. Please take the time to add a topic or add to the discussion — our decision can only be as strong as our participation.

About this post

Featured image credit: Vulpes vulpes Mallnitz 01, Uoaei1, CC BY-SA 4.0

What is a good input method?

08:40, Monday, 21 2020 September UTC

As more and more people enter the Malayalam digital world, the lack of any formal training in Malayalam computing becomes more apparent. People sometimes just search the web for input methods, ask friends, or use whatever comes with the devices they have. Since I am myself the author of two input methods, people sometimes ask me too. This essay is about the characteristics of a good input method, to help people make the right choice.

Tech News issue #39, 2020 (September 21, 2020)

00:00, Monday, 21 2020 September UTC
2020, week 39 (Monday 21 September 2020)

weeklyOSM 530

09:39, Sunday, 20 2020 September UTC

08/09/2020-14/09/2020

Lead picture: I want to work with OSM data! – Which tool do I need? 1 | © HeiGIT gGmbH

Mapping

  • Michał Brzozowski asked, on OSM-talk, for some examples of good paid mapping. He has received many responses from mappers sharing their experiences and opinions.
  • Crowd2Map Tanzania organised and hosted training sessions for YouthMappers chapters at the Institute of Rural Development Planning Dodoma (IRDP) and the University of Dodoma (UDOM) on open geospatial data and open source geospatial software.
  • ‘Ça reste ouvert’, the French original ‘Staying open’, closes (fr) > en on 30 September 2020, with the satisfaction of the work done by the team and the certainty of having made a contribution to the big picture. Its creators will then launch the development of the OpenStreetMap community’s ‘project of the month’: a new ‘energy catalyst’ tool for more collaborative adventures!
  • François Lacombe published a very detailed proposal, with photos and examples, to create three new tags to better describe pumps used for liquids.
    Please feel free to comment.
  • François Lacombe is also proposing the new tag man_made=utility_pole and asks for comments. His proposal aims to review and complete man_made=utility_pole with existing utility=* for poles intended for other activities than power transmission and distribution.

Community

  • Allan Mustard, aka apm-wa, published a diary entry about his new camera setup for Mapillary imagery and the recent decisions of the OSMF Board about funding the development of OSM editors (namely iD, Potlatch and StreetComplete) and the hire of a full-time system reliability engineer.
  • Branko Kokanovic has created a Mapnik-based tile server where names are internationalised. He has retrieved the names from Wikidata and Wikipedia. The project is available on GitLab.
  • Florian Lainez is proposing (fr) > en an OSM-Fr Association newsletter, which would include news from the association itself such as communications or board decisions, general OSM updates and project of the month (fr) > en related news.

Imports

  • Sous-surveillance.net was created 7 years ago and has stored in their database the location of around 20,000 cameras, mostly located in France and Belgium. OSM currently has the location of around 80,000 cameras over the entire world. After a successful test import in March 2020, for the city of Brussels, in September 2020 authorisation has been granted to import the full dataset into OSM.

OpenStreetMap Foundation

  • The OSMF announced that their 14th Annual General Meeting will be held online on 12 December. More details will be available soon on the dedicated wiki page.

Events

  • On Sunday, 4 October 2020, 25 Swiss castles invite you to the 5th National Castles Day. The Wikimedia CH association and Swiss OpenStreetMap (SOSM) will complement the event with a Swiss ‘Burgentag’. They ask for the mapping of castle ruin sites and initiated a photo challenge to upload photos of castles missing from Wikimedia Commons.

Humanitarian OSM

  • Disaster Map Foundation, a Jakarta-based non-government organisation, has launched ‘MapaKalamidad.ph’, a web-based, flood reporting platform to help disaster authorities gather life-saving information during calamities in the Philippines. The project was done in partnership with the Office of Civil Defense (OCD), National Disaster Risk Reduction and Management Council, Pacific Disaster Center, and Humanitarian OpenStreetMap Team.
  • HOT’s 2020 Summit has moved online. It will be held virtually across multiple time zones on 4 December, in conjunction with the Understanding Risk conference. The theme is ’10 Years of HOT: The Past, Present and Future of Humanitarian Mapping’. Proposals are currently being accepted for sessions.
  • Eugene Chong has written a blog post entitled ‘How Can OpenStreetMap be Used to Track UN Sustainable Development Goals?’.
  • Flip Science reporter Mikael Angelo Francisco gave a first-hand account of joining in on a Missing Maps mapathon to map villages in Nigeria.
  • Mikel Maron has written an article entitled ‘Typhoon season in Japan: Crisis Mapping and OpenStreetMap’.
  • Thomson Reuters Foundation News covered the work of the humanitarian mapping community supporting COVID-19 responses in communities around the world.

Education

  • Stratofortress explained, in his instructable, how to create custom-stylised maps using OpenStreetMap.

Maps

  • Have you ever thought ‘Gee, I wonder what would happen if you took #OpenStreetMap data for North America, and coloured the image based on the distance to the nearest building?’ If yes, Rory McCann can show you the results.
  • Omri Wallach presented a 3D map which shows the highest population density centres of the world.
  • Have you ever thought about the distribution of street suffixes such as streets, boulevards, avenues, and other roads in a city? Lansingography gave us some examples from Lansing, Michigan, USA.

Open Data

  • Have you ever wondered what GTFS is and how to use it to complete OpenStreetMap? Noémie Lehuby, from Jungle Bus, has published (fr) > en a long blog post explaining the format, the steps to follow and the tools that can be used to map public transport routes and stops in OpenStreetMap from a GTFS open dataset.

Software

  • [1] Marcel Reinmuth, from HeiGIT in Heidelberg, drew attention, in a blog post, to a decision tree that helps you choose which of the various tools from ohsome suits your needs. The flowchart can be downloaded as a PDF.
  • JOSM is now also available in Farsi (a language spoken in Iran and elsewhere).
  • This year Nominatim celebrates its 11th birthday. On 11 November 2009, the OpenStreetMap (OSM) homepage used Nominatim as its main search engine for the first time. Since then OSM has grown enormously and with it the need for a geocoder based on OSM data. The OSM Nominatim servers alone now serve more than 30 million queries per day. Read more about the present and future for Nominatim.

Programming

  • Paul Norman explained, on the Dev discussion list, that the upgrade of Ironbelly, the primary site planet server, to Ubuntu 20.04 revealed a bug in the software that generates the weekly planet dump. He also described the various actions taken to fix the problem.
  • Stefa called on all JOSM plugin authors to follow them in transforming their icons to SVG (or to reuse JOSM core icons if suitable).

Releases

Did you know …

  • … the interactive Geo Open Accessibility Tool (GOAT)? This tool allows dynamic analysis of walking and cycling accessibility to different destinations (e.g. supermarkets, schools).
  • … that every user gets a blog on OpenStreetMap? Go ahead and write your first blog post!
  • … Pascal Neis has updates on Trends and Changesets?
  • … the bike travel wiki? Here you will find (de) > en references to maps and route planners, but also to everything else that is of interest to cyclists.

OSM in the media

  • GeoSpatial World showed which interesting discoveries can be made during walks through your own or a foreign city. OsmAnd supports the navigation, if necessary, with maps downloadable for offline use.
  • Researchers at the Technion (Israel Institute of Technology) have developed an innovative mapping system for blind pedestrians (we reported earlier). Their study examined the possibility of using OpenStreetMap to map spatial data relevant to blind pedestrians while calculating optimised walking routes.
  • The director of Ramblers Scotland (Ramblers is the largest hillwalking organisation in GB, ~100 years old with ~120k+ members) says OSM ‘has the most complete public map of Scotland’s paths that is currently available’.

Other “geo” things

  • Grant Slater reports that there is, finally, a free RTK / NTRIP Broadcaster in London.
  • The United Nations has decided to establish a new UN Global Geodetic Centre of Excellence (GGCE) in the city of Bonn, Germany.
  • The Karlsruhe-based software company Disy Informationssysteme GmbH presented (de) > en the new version of their data analytics, reporting and GIS platform ‘Cadenza’. The extension of the analytics functionalities with the provision of an integrated routing function and the POI search are two of the essential innovations.
  • Matt Burgess presented the best privacy-friendly alternatives to Google Maps, which is arguably the easiest mapping service to use, but that doesn’t mean it’s the most secure.
  • The earth observation company 4 Earth Intelligence (4EI) has published information packages that provide an overview of countries in six layers (demography, land cover, points of interest, major events, transport and wealth index). The Country Intelligence Data Suite, derived from satellite imagery and other resources such as the World Bank, OpenStreetMap, census data and historical archives, was created to support economic analysis, policy-making and reporting on the SMART Sustainable Development Goals (SDGs). The information is provided as a mixture of point, line and polygon features and is suitable for use in desktop mapping software or geographic information systems (GIS).

Upcoming Events

Where | What | When | Country
Kabul / Online | Why OSM and how to Contribute into it on Software Freedom Day 2020 | 2020-09-18 | Afghanistan
Nottingham | Nottingham pub meetup | 2020-09-22 | United Kingdom
Bratislava | Missing Maps Mapathon Bratislava #9 | 2020-09-24 | Slovakia
Munich | TUM – Mapping Party | 2020-09-24 | Germany
Alice | HOT Working Groups 101 Community Webinar | 2020-09-25 | United Kingdom
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-09-25 | Germany
Helsinki | State of the Map Suomi 2020 | 2020-09-26 | Finland
Salt Lake City / Virtual | OpenStreetMap Utah Map Night | 2020-09-29 | United States
Zurich | Missing Maps Mapathon Zürich | 2020-09-30 | Switzerland
Ulm + virtual | Covid-19-Mapathon | 2020-10-01 | Germany
San José | Civic Hack & Map Night | 2020-10-01 | United States
Taipei | OSM x Wikidata #21 | 2020-10-05 | Taiwan
London | Missing Maps London Mapathon | 2020-10-06 | United Kingdom
Stuttgart | Stuttgarter Stammtisch | 2020-10-07 | Germany
Berlin | 148. Berlin-Brandenburg Stammtisch | 2020-10-09 | Germany
Alice | 2020 Pista ng Mapa | 2020-11-13 to 2020-11-27 | Philippines
Alice | FOSS4G SotM Oceania 2020 | 2020-11-20 | Oceania

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it, where appropriate.

This weeklyOSM was produced by AnisKoutsi, Anne Ghisla, Climate_Ben, MatthiasMatthias, MichaelFS, Nordpfeil, NunoMASAzevedo, PierZen, Polyglot, Rogehm, TheSwavu, derFred, richter_fn

Research quantifies Wiki Education’s impact to articles

20:36, Thursday, 17 2020 September UTC

In their recent paper on “content growth and attention contagion” (free preprint), researchers Kai Zhu, Dylan Walker and Lev Muchnik use Wiki Education’s Wikipedia Student Program as a natural experiment to study the effects of a sudden “shock” of new editing activity. They look at the set of about 3,300 articles that were substantially expanded by student editors during the Fall 2016 term, compared to a control group of similarly developed articles that weren’t worked on by any of our classes. The main idea is to look at what happens after the classes have ended and the students have (largely) stopped editing.

Increased pageviews after the “shock” of student contributions (Figure 2 from the paper, used with permission. All rights reserved).

One major effect is an increase in pageviews — up 12%, on average, compared to the control group, and the more an article expanded, the larger the impact on pageviews. Not only that, but there’s a “spillover”, with increased pageviews for the articles linked to from the students’ articles — especially for newly added links. Along with more pageviews, these students’ articles got more edits from more unique editors after the class ended. As with the pageview effect, the more an article was expanded during the Fall 2016 term, the more edits and editors it was likely to attract afterwards. This increased attention for the students’ articles represents a combination of more traffic coming from related articles that link to them as well as more direct traffic from search results.

Their findings underscore the impact of our work, and reinforce a few ideas that are at the heart of what we do. First, they show the impact and scale of our Wikipedia Student Program: student editors are doing so much to fill Wikipedia’s content gaps, especially in academic content areas, that it’s possible to study their activity and find statistically significant effects. (In an average month, our program participants represent nearly 20% of English Wikipedia’s new active editors.)

It also shows that the work students are doing isn’t just an “academic exercise”; it has an audience, and it matters enough that people are both reading it and building on it. Because of the way articles lead readers to explore the network of related articles, this “attention contagion” for substantially improved articles spills over to related articles, and this suggests that a particularly effective strategy is to focus on clusters of underdeveloped but important articles. That happens naturally in many of our Student Program courses, where the instructor guides their students to work on underdeveloped articles related to the course content. We’re also doubling down on this strategy for our Wiki Scholars & Scientists courses, where a cohort of experts comes together to learn how to apply their knowledge to a particular topic area — such as our recent courses focused on the COVID-19 pandemic.

Research like this is an important part of the Wikimedia ecosystem. Not only does this help quantify the impact of Wiki Education’s work, but it also offers insight for other Wikimedia program leaders interested in focusing editors on a cluster of underdeveloped but important articles. Wiki Education is always interested in independent researchers using our programs as a focus of academic study; for information on data available, reach out to us through our contact us page.

Wikimedia’s Event Data Platform – JSON & Event Schemas

19:54, Thursday, 17 2020 September UTC

By Andrew Otto, Staff Site Reliability Engineer

In the previous post, we talked about why Wikimedia chose JSONSchema instead of Avro for our Event Data Platform. This post will discuss the conventions we adopted and the tooling we built to support an Event Data Platform using JSON and JSONSchema.

For Avro serialized streaming event data, the event’s Avro schema is needed during both production and consumption of the event. For JSON events, the schema is not strictly needed to produce or consume the data. JSON records include minimal ‘schema’ information, including the structure of the record and the field names. Field type information is what is missing.
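
For instance, both of the following records are valid JSON, and nothing in the records themselves says whether datafield1 should always be a string or a number; only a schema can pin that down (the field name is the hypothetical one used in the examples below):

{ "datafield1": "42" }
{ "datafield1": 42 }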

If we are using JSON, why do we need schemas at all? Producers and consumers don’t need them. What does?

There are two answers to this question.

  1. We want to validate that all incoming events of a specific type conform to a schema at produce time, so that consumers can rely on it being consistent.
  2. Schemas are useful for data integration, e.g. ingesting event data into an RDBMS or elsewhere.

To do either of these, we need to be able to find the schema of an event.

Confluent does this with opaque schema IDs which are used to look up schemas in their remote Schema Registry service. This means that each schema gets assigned its own unique integer ID inside of the Schema Registry (just like a unique ID in a database). This ID carries no meaning outside of a particular Schema Registry installation. 

But we’d like the ability to look up a schema without access to a centralized registry. Wikimedia does this for JSONSchemas with versioned URIs.

Schema URIs

JSONSchema documents themselves have their own ‘meta schemas’. The meta schema of a JSONSchema document is identified using the $schema keyword, which is usually set to the official URL of a JSONSchema version, e.g. ‘http://json-schema.org/draft-07/schema#’. JSONSchema validator implementations use this to know which version of JSONSchema to use when interpreting a JSONSchema document. 

So $schema is used to identify and look up the meta schema of a JSONSchema document.
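
To make this concrete, here is what a hypothetical draft-07 JSONSchema for the user_create event used below might look like. Note that its own $schema keyword identifies the meta schema it is written against, not an event schema:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "coolsoftware/user_create",
  "type": "object",
  "properties": {
    "datafield1": { "type": "string" }
  },
  "required": ["datafield1"]
}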

This kind of lookup is exactly what we need for our JSON event data: we want to look up the schema of the JSON event document. We add the $schema keyword to our event data itself and set it to the URI of the event’s JSONSchema. Here’s an example event:

{
  "$schema": "http://schema.example.org/coolsoftware/user_create.json",
  "datafield1": "value1"
}

With the $schema present in every event and set to a URI pointing at the event’s schema, any user of the event can get its schema at will. Wikimedia’s EventGate HTTP -> Kafka proxy uses $schema to validate events before they are produced to Kafka. We use the same field to look up the JSONSchema and convert it to a Hive or Spark schema during ingestion into Hadoop. We’ll come back to use cases like this later.
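
As a rough sketch of what this convention enables (an illustration in Python, not EventGate’s actual implementation; the URL is the hypothetical one from the example above):

# Illustrative only: fetch an event's schema via the URI the event
# itself carries, then validate the event against it.
import requests                  # pip install requests
from jsonschema import validate  # pip install jsonschema

event = {
    "$schema": "http://schema.example.org/coolsoftware/user_create.json",
    "datafield1": "value1",
}

schema = requests.get(event["$schema"]).json()
# Raises jsonschema.ValidationError if the event does not conform.
validate(instance=event, schema=schema)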

Decentralized Schema Repositories

In the first post in this series, I described how Wikimedia code and data should be decentralized. But the example event above had a fully qualified URL for its $schema, which would require anyone who wanted to look up that event’s schema to do so from a centralized address. This also couples the schema lookup to that remote website. If the schema website goes down, schema lookup will fail.

Git is great at decentralization, so we keep all schemas in git repositories. This means that users can find schemas wherever the schema repository is cloned: in your local file system, somewhere in a distributed file system, or somewhere at a remote web address. Since these locations will all clone the same git repository, we can assume that they have the same relative directory hierarchy and choose to only use relative $schema URIs in event data.

Our example event would instead look like this:

{
  "$schema": "/coolsoftware/user_create.json", 
  "datafield1": "value1" 
}

We’ve removed the reference to a centralized web address…but what now? This surely isn’t enough to look up a schema.

It isn’t! We will need to prefix the $schema URI with a base address or path. If we want to read the schema from a local clone of our schema repository, we can prefix the relative $schema URI with a file:// path like file:///path/to/schema_repo/coolsoftware/user_create.json. If the schema is hosted at a remote HTTP address, we can prefix it with that domain like http://schema.example.org/coolsoftware/user_create.json.

This is especially useful in development environments while developers create and change schemas and producer code. Their environments can configure a local schema base path to their clone of the schema repository. This also helps to decouple services in production. A critical service that needs to look up schemas can have a local copy of the schema repository, whereas a less critical service that might want to do more dynamic schema lookups can use a remote web address to the schema repository.
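
A minimal sketch of that prefixing logic might look like the following (the base paths and addresses are assumptions for illustration, not a real configuration):

# Illustrative only: resolve a relative $schema URI against an ordered
# list of configured base URIs, trying a local clone first.
import json
import urllib.request

SCHEMA_BASE_URIS = [
    "file:///path/to/schema_repo",   # local clone of the schema repository
    "http://schema.example.org",     # remote copy, as a fallback
]

def fetch_schema(relative_uri: str) -> dict:
    """Return the first schema found for relative_uri among the base URIs."""
    for base in SCHEMA_BASE_URIS:
        try:
            with urllib.request.urlopen(base + relative_uri) as response:
                return json.load(response)
        except OSError:
            continue  # not available here; try the next base URI
    raise LookupError("no schema found for " + relative_uri)

schema = fetch_schema("/coolsoftware/user_create.json")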

Versioned Schemas

Code changes over time and so does data. As the data changes, so must the schemas. In an event-driven architecture, event data is immutable and captures the historical record of all changes. We need a way to modify event schemas so that all historical event data of the same schema lineage will validate with the latest version of that schema. We also want to be able to look up the specific schema version an event was originally written with.

Wikimedia distributes schema files in git repositories, but we intentionally don’t use git for versioning of those files. We want every version of a schema to be readily available for on-demand use by code. Instead of naming files after the schema title, we name them with their semantic version. The JSONSchema title keyword is used to name the full schema lineage, and we keep immutable versioned files in a directory hierarchy that matches the schema title. 

For example, an event schema that is modeling user account creations might be titled ‘coolsoftware/user/create’. The various versions of this schema would live in our schema repository in the coolsoftware/user/create/ directory with semantically versioned filenames like ‘1.0.0.json’, etc. An example schema repository directory tree might look like:

schemas
└── coolsoftware
  ├── user
  │  ├── create
  │  │  ├── 1.0.0.json
  │  │  └── 1.0.0 -> 1.0.0.json
  └── product
    └── order
      ├── 1.0.0.json
      ├── 1.0.0 -> 1.0.0.json
      ├── 1.1.0.json
      └── 1.1.0 -> 1.1.0.json

These schemas will be addressed by URIs. To hide the file format extension in those URIs, we create extensionless symlinks to each version, e.g. 1.0.0 -> 1.0.0.json. 

Each schema version will identify itself using the JSONSchema $id keyword set to its specific versioned relative base URI inside the schema repository. The coolsoftware/user/create/1.0.0.json file will have "$id": "/coolsoftware/user/create/1.0.0". (Using $id in this way also has advantages for reusing schema fragments using $ref pointers.)
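
For example, the 1.0.0 schema file might look like the following hypothetical document (the /fragment/common path is made up here, purely to illustrate reuse via $ref):

{
  "$id": "/coolsoftware/user/create/1.0.0",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "coolsoftware/user/create",
  "type": "object",
  "allOf": [
    { "$ref": "/fragment/common/1.0.0#" }
  ],
  "properties": {
    "datafield1": { "type": "string" }
  }
}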

Events will have their $schema field set to a relative schema URI that now includes the schema title and the schema version, matching exactly the $id field of the versioned schema. A coolsoftware/user/create event now might look like:

{ 
  "$schema": "/coolsoftware/user/create/1.0.0",
  "datafield1": "value1"
}

Code can use the $schema URI to look up the schema for an event from a local or remotely hosted schema repository.

This versioned schema hierarchy inside of a git repository allows us to decentralize the versioned schemas and refer to them with base location agnostic relative URIs.

jsonschema-tools

Each versioned file in our repository contains a full JSONSchema. But this means that whenever a developer wants to make a new version, they have to copy and paste the full previous version into their new version file and make the changes they want. This could result in subtle copy/paste bugs or undetected schema differences.

Also, every file in a schema lineage that shares the same major version must be fully compatible with the others. The only type of change that is allowed for full compatibility is adding optional fields. This means no field deletion and no field renaming (which is essentially a deletion).
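
For example, continuing the hypothetical coolsoftware/user/create lineage, a 1.1.0 that only adds an optional newfield stays compatible with 1.0.0; deleting or renaming datafield1 would not be allowed:

{
  "$id": "/coolsoftware/user/create/1.1.0",
  "title": "coolsoftware/user/create",
  "type": "object",
  "properties": {
    "datafield1": { "type": "string" },
    "newfield": { "type": "string" }
  },
  "required": ["datafield1"]
}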

Rather than relying on developers to enforce these rules, Wikimedia has developed jsonschema-tools: a library, command line interface, and set of tests that manage the schema development lifecycle.

Instead of manually keeping copies of each schema version, jsonschema-tools generates schema version files from a single ‘current’ schema file. A developer can modify the current schema file, update the version in the $id field, and still keep the previous version files around. It will also (by default) attempt to dereference any JSON $ref pointers so that the generated version schema files require no dereferencing or lookups at runtime. jsonschema-tools also exports a set of tests that can be run in your schema repository to enforce compatibility between the versions, as well as rules and conventions. 

Continuing with the coolsoftware/user/create schema example, if a developer wants to add a new field, they would edit the coolsoftware/user/create/current.json file, add the field, and bump the version in the $id field to "$id": "/coolsoftware/user/create/1.1.0". jsonschema-tools will handle generating a static coolsoftware/user/create/1.1.0.json file (as well as some handy symlinks). If your schema repository has been configured properly, running npm test will recurse through your schema repository and ensure that schema versions with the same title are compatible with each other.

jsonschema-tools does a little more than what is described in this post. Check out the documentation for more information.

This all might seem like a lot just to manage schemas, but a big advantage here is that schema versions are decentralized and immutable. jsonschema-tools does move some of the complexity of developing schemas to the developer environment. However, once a schema change is merged, all that is needed for other developers or code to access that schema is a URI and a clone of the schema repository (either locally or accessible via HTTP).


Now that we’ve made versioned and compatible schema files accessible, how do we use them to ensure that the events produced into Kafka are valid? Wikimedia has developed an HTTP event intake and validation service and library based on JSONSchema URIs: EventGate. The next post will go into how EventGate works, how Wikimedia uses it, and how you can use it together with your schema repository to build your own custom event intake service.

About this post

This is part 2 of a 3-part series. Read part 1; read part 3.

Featured image credit: Vue aérienne raprochée du grand récif de Gatope et son trou bleu, Kévin Thenaisie, CC BY-SA 4.0

By Pavithra Eswaramoorthy

Every summer, budding technologists from around the world come together to contribute to Free and Open Source software. At the epicenter of this contribution season are Google Summer of Code and Outreachy — remote activities and internship programs with the goal of getting more people involved in FOSS.

Google Summer of Code is a program for university students, and Outreachy is a program that supports members of underrepresented communities in technology. Participants start by completing an application, where they submit a project proposal for a FOSS organization. Selected participants are then paired with experienced mentors for a period of 3 months, and they spend the summer working on their projects.

Experiences and achievements

Wikimedia has participated as a mentoring organization for GSoC since 2006 and for Outreachy since 2013. This round, we received record-breaking student participation with over 130 total proposal submissions from 13 different countries!

In May, Wikimedia welcomed 17 selected participants and kicked off the community bonding phase with a ‘Welcome Party’ at the Wikimedia Remote Hackathon 2020.

GSoC and Outreachy welcome party at Wikimedia Hackathon 2020

Participants and mentors worked on projects ranging from test engineering to data science. Webdriver IO, MediaWiki’s test automation framework, was evaluated and upgraded. New features were added to the Commons Android App. Wiki Education Dashboard’s error handling mechanism and efficiency were enhanced. The Page Forms extension, Proofread Page extension, GDrive-to-Commons uploader tool, and WikiContrib tool were improved. 

Some new projects were also developed, including a bot for support on Wikimedia’s Zulip, a tool to correct false depicts claims on Wikimedia Commons, and a tool for automating tasks for Wikimedia databases. 

Learn more about the projects and outcomes:

Participate in the next round

Contributing to Free and Open Source Software can be fun and rewarding. You can work alongside experienced community members, contribute to real-world projects, and witness the impact of your work first-hand. Moreover, there are various other areas to contribute besides code! For instance, every project needs documentarians, designers, and community builders.

Wikimedia always welcomes new contributors. 🙂 To get involved, select an area that is interesting to you: https://www.mediawiki.org/wiki/How_to_contribute. Programs like Google Summer of Code, Google Season of Docs, and Outreachy can provide a platform to accelerate your FOSS journey. Learn about Wikimedia’s outreach programs: https://www.mediawiki.org/wiki/Outreach_programs 

If you are undecided, have any questions, or just want to hang out with Wikimedians, say “Hi!” to us on Zulip!

Note: The initial applications for Outreachy 2020 winter edition are now open. We are also looking for projects and mentors. For more details: https://lists.wikimedia.org/pipermail/wikitech-l/2020-August/093759.html

Thank you, organizers, coordinators, and mentors!

This year brought some unique challenges. The COVID-19 pandemic affected people around the world and there were abrupt changes to personal schedules. Yet, our mentors were always ready to help and support the participants. The program organizers at Google and Outreachy adapted to the changes quickly and kept the programs running smoothly. Their hard work was the primary reason behind the success of this season, so thank you!

About this post

A note about the programs featured in this post: Outreachy is an internship program, whereas Google Summer of Code is considered an activity; see: https://developers.google.com/open-source/gsoc/faq#is_gsoc_considered_an_internship_a_job_or_any_form_of_employment

Featured image credit: Sea otters holding hands, John Robertson, CC BY 2.0

Spring 2020: Reflections on a term to remember

17:26, Tuesday, 15 2020 September UTC

A time of transition

To say that the spring 2020 academic term was unlike any other is a gross understatement. The pandemic abruptly upended normal life, and the students and instructors in our program were no exception. Mid-way through the term, just as most of our students were about to begin drafting their contributions and moving into the article main space, campuses closed nationwide, and students had to return home for the remainder of the academic year. Amid this turmoil, Wiki Education was thankfully able to continue providing uninterrupted support, and we reached out to all of our program participants to see how we could help.

Despite the unprecedented disruptions that the spring 2020 term saw, our students still did outstanding work, and Wiki Education supported 409 courses — our largest number to date. In fact, we quickly learned that the Wikipedia assignment became a lifeline for many of our instructors as they struggled to transition to remote learning. As one instructor wrote, “It actually was a perfect thing that I already had this scheduled and in the syllabus. It gave us an online component before we ‘needed’ it!” Others decided to run a Wikipedia assignment as a way to keep students engaged as their courses went online. We were truly grateful that we could provide some degree of stability for our instructors and students amid a tumultuous time, and despite it all, our students made substantial contributions to Wikipedia.

Across the 409 courses we supported in Spring 2020, almost 7,500 students collectively contributed more than 5 million words to Wikipedia, edited more than 6,000 articles, and created more than 500 entirely new entries. They covered subjects ranging from African American History to Prokaryotic Processes, and their work was collectively viewed almost 270 million times in the term alone.

A source of empowerment

We were also gratified to see that the Wikipedia assignment continued to be a source of motivation and empowerment for our students, as reported by instructors in our post-term survey. One instructor recounted the following anecdote: “Because of the pandemic, our university library was slower than usual. One student, frustrated that one of the books she wanted to read for the Wikipedia assignment was not immediately available at our university library, got on a bicycle and went to borrow a book in a city library in her town. I’ve never seen a student do that. I was happy to hear that, though I tried to hide my reaction in front of the student!” Another instructor reported, “One thing that struck me is I actually had a student come up and thank me for assigning this as their final. I was floored because I’ve never had a student actually thank me for assigning them work or a final. This student really loved writing for Wikipedia and showing off their work to their family.”

Wiki Education has long been committed to filling in Wikipedia’s content gaps and especially those related to issues of equity, and we’re pleased to see that many of our students were also motivated by a desire to fill in Wikipedia’s equity gaps. “The project,” noted one instructor, “also increased their understanding of representation — who is included, who is not, and whose absence we notice. In many ways this grew out of the understanding that absences need to be addressed and that information is power (the Women in Red project). So thank you.”

Many of our students work on biographies of women to help close Wikipedia’s gender gap. One such student described their experience writing about Andrea Ivory: “Personally, I learned a lot when I was younger just going on random Wikipedia rabbit holes. It was really cool doing a deep-dive into Andrea Ivory’s backstory and working with my teammates to prove that she is, in fact, well known. And, more importantly, her work should be spotlighted regardless of how much media coverage she gets. Keeping my own experiences in mind, I was excited to go live with an article about such an important figure, because you never know whose Wikipedia rabbit hole session may lead them to know more about Andrea Ivory!”

Just as important as filling in Wikipedia’s equity gaps is making the population that engages with Wikipedia editing more equitable. The students who participate in our program are representative of college campuses more broadly, and roughly 60% of our students are women. More surprising, however, is the number of female instructors who decide to run Wikipedia assignments. About 73% of our new instructors and about 60% of our returning instructors identify as women, a number that is far greater than the roughly 31% of women that make up academia more generally. Wikipedia has long struggled with a fairly homogeneous editing community — composed largely of men — and through our efforts, we are changing the face of who writes Wikipedia and who decides what content should be included on the world’s largest online encyclopedia.

Not simply an assignment, but a way to teach

When considering a Wikipedia assignment for the first time, instructors are often concerned that their students will spend more time learning how to contribute to Wikipedia than the content of the course. To the contrary, learning the ins and outs of Wikipedia can truly complement a course’s broader objectives. According to one instructor, “I started this assignment with the hope of highlighting the difference between academic knowledge (history, in my case) and Wikipedia knowledge. I ended up teaching, more effectively than otherwise, the key processes of history research, instead of teaching students what Wikipedia is. In other words, Wikipedia assignment helped me teach history, more so that a conventional research seminar can do.”

Another instructor remarked, “The Wikipedia project served as a great introduction to Women’s and Gender Studies as it incorporated the material we had discussed in class in a tangible, realizable way. It showed how the theories and practices we covered are not just concepts or academic discourse but are real, applicable, and powerful.”

The Wikipedia assignment transforms students from passive learners to active producers of knowledge. In a time when many feel helpless and powerless to change the world around them, this is no small feat. As one instructor put it, “Wikipedia has become a key pedagogical tool for me in my classes. I still start each semester anxious about how everything will go, but I’m increasingly impressed with what the students accomplish and what they get out of it! It’s a source of community and a source of empowerment and a space of learning.”

We all learned a lot from the challenges of the spring 2020 term, and we know that many of those challenges will persist as we embark on our fall 2020 adventure. What remains constant though is the potential our students and instructors have to truly make a difference through their engagement with Wikipedia.

Another year of WMAU tech stuff

07:54, Monday, 14 2020 September UTC

Fremantle

· Wikimedia · WMAU · wikis · system administration ·

The Wikimedia Australia AGM was yesterday, and I'm signed up for another 12 months of being the tech nerd on the Committee, running the wikis and whatnot. I quite like doing it, although I'm not always very good at keeping up to date with everything — it's nice to run a MediaWiki installation outside of the WMF world, just to get a feel for what it's like and which annoyances crop up. I'm going to try to improve things this year (better mobile view; stats reporting maybe; and generally keeping everyone up to date with things).

A new tool implies changes for me.

05:48, Monday, 14 2020 September UTC

 

A list like this is wonderful. It has always been the kind of list where I either import only the existing office holders or attempt to "do them all". Typically I did the first few, making a point to include the incumbent.

I added the template {{PositionHolderHistory|id=Q**}} to all the items for an office in my African politicians project. The template goes on the item's talk page and, like the Listeria lists, it shows past and present office holders.
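For illustration, the template call on a talk page is a single line of wikitext (the item ID below is a made-up placeholder, not one from the project):

    {{PositionHolderHistory|id=Q1234567}}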

I still prefer my method of including the "red links" in Wikidata, but it is a wiki and there is so much more to do. What I started to do with office holders from Togo is that, for those where I will not link predecessors and successors, I will at least show the dates they were in office.

It looks much better in Listeria too. 

Thanks, GerardM

Tech News issue #38, 2020 (September 14, 2020)

00:00, Monday, 14 2020 September UTC
2020, week 38 (Monday 14 September 2020)

Naturalists in court and courtship

11:58, Sunday, 13 2020 September UTC
The Bombay Natural History Society offers an interesting case in the history of amateur science in India, and there are many little stories hidden away that have not quite been written about, possibly due to the lack of publicly accessible archival material. Interestingly, two of the founders of the BNHS were Indians, and hardly anything has been written about them in the pages of the Journal of the Bombay Natural History Society, where even lesser-known British members have obituaries. I suspect that this lack of obituaries can be traced to the political and social turmoil of the period. Even a major two-part history of the BNHS by Salim Ali in 1978 makes no mention of the Indian founders. Both the founders were doctors with an interest in medical botany and were connected to other naturalists not just because of their interest in plants but also perhaps through their involvement in social reform. The only colleague who could have written their obituaries was the BNHS member Dr Kanhoba Ranchoddas Kirtikar, who probably did not because of his conservative views and a consequent fall-out with the reformists. This is merely my suspicion, and it arises from reading between the lines while editing the relevant entries on the English Wikipedia. There are some rather interesting connections.

Sakharam Arjun
Dr Sakharam Arjun (Raut) (1839-16 April 1885) - This medical doctor with an interest in botanical remedies was for some time a teacher of botany at the Grant Medical College, but his name perhaps became better known after a historic court case dealing with child marriage and women's rights, that of Dadaji vs. Rukhmabai. Rukhmabai had been married off at the age of 11 and stayed with her mother and step-father Sakharam Arjun. When she reached puberty, she was asked by Dadaji to join him. Rukhmabai refused and Sakharam Arjun supported her. It led to a series of court cases, the first of which was decided in Rukhmabai's favour. This rankled the Hindu conservatives, who believed that this was a display of the moral superiority of the English. The judge had in reality found fault with English law and had commented on the patriarchal and unfair system of marriage that had already been questioned back in England. A subsequent appeal was ruled in favour of Dadaji, and Rukhmabai was ordered to go to his home or face six months in prison. Rukhmabai was in the meantime writing a series of articles in the Times of India under the pen-name of A Hindoo Lady (I wish there were a nice online Indian newspaper archive), and she declared that she would rather take the maximum prison penalty. This led to further worries, with Queen Victoria and the Viceroy jumping into the fray. Max Müller commented on the case, while Behramji Malabari and Allan Octavian Hume (now retired from ornithology; there may be another connection, as Sakharam Arjun seems to have been a member of the Theosophical Society, founded by Hume and others before he quit it) debated various aspects. Somewhat surprisingly, Hume tended to be less radical about reforms than Malabari.

Dr Rukhmabai
Dr Edith Pechey
Dr Sakharam Arjun did not live to see the judgement, and he probably died early due to the stress it created. His step-daughter Rukhmabai became one of the earliest Indian women doctors and was supported in her cause by Dr Edith Pechey, another pioneering English woman doctor, who went on to marry H.M. Phipson. Phipson of course was a more famous founder of the BNHS. Rukhmabai's counsel included the lawyer J.D. Inverarity, who was a big-game hunter and BNHS member. To add to the mess of BNHS members in court, there was (later Lt.-Col.) Kanhoba Ranchoddas Kirtikar (1850-9 May 1917), a student of Sakharam Arjun and like him interested in medicinal plants. Kirtikar, however, became a hostile witness in the Rukhmabai case and supported Dadaji. Rukhmabai, in her writings as a Hindoo Lady, indicated her interest in studying medicine. Dr Pechey and others set up a fund to support her medical education in London. The whole case caused a tremendous upheaval in India, with divisions across multiple axes - nationalists, reformists, conservatives, liberals, feminists, Indians, Europeans - everyone seems to have got into the debate. The conservative Indians believed that Rukhmabai's defiance of Hindu customs was the obvious result of a western influence.

J.D.Inverarity, Barrister
and Vice President of BNHS (1897-1923)
Counsel for Rukhmabai.
It is somewhat odd that the BNHS journal carries no obituary whatsoever for this Indian founding member. I suspect that the only one who may have been asked to write an obituary would have been Kirtikar, and he may have refused given his stance in court. Another of Sakharam Arjun's students was a Gujarati botanist named Jayakrishna Indraji, who perhaps wrote India's first non-English botanical treatise (at least the first that seems to have been based on the modern scientific tradition). Indraji, rather sadly, seems to be largely forgotten except in some pockets of Kutch, in Bhuj. I recently discovered that the organization GUIDE in Bhuj has tried to bring Indraji back into modern attention.

Atmaram Pandurang
The other Indian founder of the BNHS was Dr Atmaram Pandurang Tarkhadkar (1823-1898) - This medical doctor was a founder of the Prarthana Samaj in 1867 in Bombay. He and his theistic reform movement were deeply involved in the Age of Consent debates raised by the Rukhmabai case. His organization seems to have taken Max Muller's suggestion that the ills of society could not be cured by laws but by education and social reform. If Sakharam Arjun is not known enough, even less is known of Atmaram Pandurang (at least online!), but one can find another natural history connection here - his youngest daughter, Annapurna "Ana" Turkhud, tutored Rabindranath Tagore in English, and the latter was smitten. Tagore wrote several poems to her in which she is referred to as "Nalini". Ana however married Harold Littledale (3 October 1853-11 May 1930), professor of history and English literature and later principal of the Baroda College (Moreshwar Atmaram Turkhud, Ana's older brother, was a vice-principal at Rajkumar College Baroda - another early natural history hub), and if you remember an earlier post where his name occurs - Littledale was the only person from the educational circle to contribute to Allan Octavian Hume's notes on birds! Littledale also documented bird trapping techniques in Gujarat. Sadly, Ana did not live very long and died in her thirties in Edinburgh somewhere around 1891.

It would appear that many others in the legal profession were associated with natural history - we have already seen the case of Courtenay Ilbert, who founded the Simla Natural History Society in 1885. Ilbert lived at Chapslee House in Simla - still a carefully maintained heritage home (that I had the fortune of visiting recently) owned by the kin of Maharaja Ranjit Singh. Ilbert was involved with the eponymous Ilbert Bill, which allowed Indian judges to try cases involving Europeans - a step forward in equality that also led to rancour. Other law professionals in the BNHS included Sir Norman A. Macleod and S. M. Robinson. We know that at least a few marriages were mediated by associations with the BNHS, and these include: Norman Boyd Kinnear married a relative of Walter Samuel Millard (the man who kindly showed a child named Salim Ali around the BNHS); R.C. Morris married Heather, daughter of Angus Kinloch (another BNHS member, who lived near Longwood Shola, Kotagiri). Even before the BNHS, there were other naturalists connected by marriage - Brian Hodgson's brother William was married to Mary Rosa, the sister of S.R. Tickell (of Tickell's flowerpecker fame); Sir Walter Elliot (of Anathana fame) was married to Maria Dorothea Hunter Blair, while her sister Jane Anne Eliza Hunter Blair was married to Philip Sclater, a leading figure in zoology. The project that led to the Fauna of British India was promoted by Sclater and Jerdon (a good friend of Elliot) - these little family ties may have provided additional impetus.


In 2014, someone in London asked me if I had heard of an India-born naturalist named E.K. Robinson. At that time I did not know of him, but it turns out that Edward Kay Robinson (1857?-1928), born in Naini Tal, was the founder of the British (Empire) Naturalists' Association. He fostered a young and promising journalist who would later dedicate a work to him - To E.K.R. from R.K. - Rudyard Kipling. Now E.K.R. had an older brother named Phil Robinson, who was also in the newspaper line and became famous for his brand of Anglo-Indian nature writing - a style that was more prominent in the writings of E.H. Aitken (Eha). Yet Phil - Philip Stewart Robinson - despite books like In my Indian Garden and Noah's ark, or, "Mornings in the zoo." Being a contribution to the study of unnatural history, is not a well-known name in Indian natural history writing. One reason for his works being unknown may be the infamy that Phil achieved from affairs aboard ships between India and England, which led to a scandalous divorce case and bankruptcy.

weeklyOSM 529

09:59, Sunday, 13 2020 September UTC

01/09/2020-07/09/2020


Wheelmap.org celebrates its 10th birthday – Happy birthday! 1 | © Wheelmap.org | data © OpenStreetMap contributors

About us

  • Our editorial system now gives you the opportunity to post your articles directly into our system and to suggest an initial draft. We may make slight editorial changes to adapt the text to our language style. Translation into 12 languages will be done by us.

Mapping

  • The French ‘Project of the Month’ (fr) > en has published a graphical interim review of what has been achieved. The project still has 18 days to run and aims for the complete mapping of defibrillators (AEDs). PanierAvide also gives (fr) > en a few tips for further work.
  • In September 2010, the Berlin-based association Sozialhelden e.V. launched (de) > en the Wheelmap.org project. Ten years later, Wheelmap has become the largest online map of wheelchair accessible places with over one million entries. In September, the team invites you to join in the celebrations under the motto ‘Wheelmap turns 10’. The campaign is taking place in the Wheelmap social networks and as a digital mapping action. They are also launching a small mapping campaign with the aim of getting people to each evaluate ten places on the Wheelmap that still have unknown wheelchair accessibility.
  • Andrew Harvey proposes the new tag shelter_type=rock_shelter to mark the difference between the already existing natural=cave_entrance and a shallow cave-like opening at the base of a bluff or cliff, which may be used to protect yourself from the weather.
  • In his blog post (de) > en, Robhubi deals with different types and names of watercourses that lead to mills in Austria and Germany. He evaluates entries in OpenStreetMap. WSHerx rates the publication as ‘impressively thorough work and convincing argumentation’.

Community

  • On 1 September, Amazon Web Services (AWS) released an episode of their documentary series Now Go Build, which highlighted the work done by the Humanitarian OpenStreetMap Team in the Philippines, especially in mapping the town of Guagua, in the province of Pampanga. Several members of the OSM-PH community, however, observed that there were missing and problematic narratives in the video related to the story it tells of geospatial and humanitarian workers in the country. Therefore, they prepared and released this statement (pdf).
  • Martin Koppenhoefer asked on osmf-talk whether Brexit will have an effect on OSM and wondered if the foundation could be moved to another European country. The core database, website and API are already hosted in the Netherlands (since late 2018), but there is still some user data stored in the United Kingdom (Forum, Piwik, Foundation wiki, mailing lists). Additionally, AWS is used for some data storage needs; the data is primarily stored on AWS Ireland.
  • AI is not a magic bullet. Even for a company as resourceful as Facebook, the results of an image recognition algorithm still need to be manually verified. Some mappers using the results of MapwithAI bypassed the upload limits and mass-uploaded without reviewing the results. As a result, users have discovered (zh-tw) strange ‘roads’ on Rudy Map, the most important offline map dataset in Taiwan, and have reported them in the Facebook user group. Cartographers of a mountaineering group have also had to remove non-existent roads in the mountains, or correct streams without water in the dry season, which were wrongly classified as roads.
  • Prajwal Shrestha, a student of the Institute of Agriculture and Animal Science-Lamjung, Tribhuvan University (Lamjung Campus), Nepal, shared his excitement about joining the OSM community through the creation of a new YouthMappers chapter (Agri-Mappers Lamjung) on his university campus.
  • Facebook has released a third update to Daylight, their complete, downloadable preview of OpenStreetMap data (57 GB). Michal Migurski gives some insights in his blog and provides relevant links.

OpenStreetMap Foundation

Events

  • The 2020 Pista ng Mapa (Festival of Maps) organisers from the OpenStreetMap and FOSS4G communities in the Philippines are calling for presenters and workshop proposals on Open Data, Mapping, and FOSS4G for presentation at their 2020 conference this November. The call is open until 30 September. Also, this month, they are holding a map and poster making contest with the theme ‘Mapa para sa lahat’ (Maps for everyone). Check out their website for details. Prizes await winners in two categories: students and professionals.
  • Alessandro Sarretta presented a poster entitled ‘OpenStreetMap: an opportunity for Citizen Science’ at the ECSA 2020 conference, showing the many ways OpenStreetMap and citizen science can help each other.

Humanitarian OSM

  • Jean-Marc Liotier criticised HOT’s use of the number of mapped buildings as a measure of the ‘state of OpenStreetMap in Africa’. At the same time, he acknowledged that criticising bad metrics is easy, but finding good ones is difficult and he agreed to participate in an ‘open working group on OpenStreetMap metrics’.
  • Taichi Furuhashi (User MAPconcierge) looked back (ja) on the crisis mapping conducted in response to heavy rain in Japan and summarised points to be improved. Over 1800 people participated in the task, and over 65,000 buildings were mapped.

Maps

  • Technion (Israel Institute of Technology) has developed a mapping system to assist blind people in navigating cities.

Open Data

  • The Croatian State Geodetic Administration has published (hr) > en new aerial imagery for the eastern and southern parts of Croatia taken in 2019, and extended new topographic 1:25,000 map coverage.

Software

Programming

Releases

  • The changelog for the stable version #17013 of JOSM lists, among many others, the addition of the Serbian language with Latin script, GPX routes as a separate layer, and a dark mode via the plugin FlatLaf as major changes.

Did you know …

  • OSM Streak? Do you still need a few days to get your free membership at the OSM Foundation? Ilya Zverev’s Telegram bot OSM Streak reminds you with a small daily task. 😉
  • … there is a list of the OSM mailing lists? However, many of them are orphaned.
  • … OSM Ireland has its own site for mapping tasks that can be solved from home? The focus is on buildings.

Other “geo” things

  • Allan Mustard says the article by Maite Vermeulen, Leon de Korte and Henk van Houtum in The Correspondent presents an ‘excellent case study of political bias in maps’.
  • When US companies were ordered to stop working with Huawei, Google Maps was one of the apps that was dearly missed. Now the Chinese smartphone company has released TomTom’s navigation app on Huawei AppGallery. With OsmAnd, Here WeGo and maps.me there were already some alternatives available in the store.

Upcoming Events

Where What When Country
Potsdam 147. Berlin-Brandenburg Stammtisch 2020-09-10 germany
Munich Münchner Treffen 2020-09-10 germany
Zurich 121. OSM Meetup Zurich 2020-09-11 switzerland
Leoben Stammtisch Obersteiermark 2020-09-12 austria
San José National Day of Civic Hacking 2020-09-12 united states
Ashurst Trek View New Forest Pano Party 2020-09-13 united kingdom
Montrouge Rencontre des contributeurs de Montrouge et alentours 2020-09-14 france
Cologne Bonn Airport 131. Bonner OSM-Stammtisch (Online) 2020-09-15 germany
Lüneburg Lüneburger Mappertreffen 2020-09-15 germany
Salt Lake City / Virtual OpenStreetMap Utah Map Night 2020-09-15 united states
Kabul / Online Why OSM and how to Contribute into it on Software Freedom Day 2020 2020-09-18 afghanistan
Nottingham Nottingham pub meetup 2020-09-22 united kingdom
Bratislava Missing Maps Mapathon Bratislava #9 2020-09-24 slovakia
Düsseldorf Düsseldorfer OSM-Stammtisch 2020-09-25 germany
Salt Lake City / Virtual OpenStreetMap Utah Map Night 2020-09-29 united states
Alice 2020 Pista ng Mapa 2020-11-13-2020-11-27 philippines
Alice FOSS4G SotM Oceania 2020 2020-11-20 oceania

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by AnisKoutsi, GOwin, MatthiasMatthias, Nordpfeil, Polyglot, Rogehm, TheSwavu, derFred, richter_fn.

A blog post about the Wikimedia Design style guide was recently published.

The origin was a conversation with David Golberg about the construction of open participatory design systems and some parallels with architecture.

This Month in GLAM: August 2020

21:13, Friday, 11 2020 September UTC
  • Albania report: Wikivoyage edit-a-thon – Editing Albania and Kosovo’s travel destinations
  • Brazil report: Open innovation and dissemination activities: wrapping up great achievements on a major GLAM in Brazil
  • Czech Republic report: First Prague Wiki Editathon held in Prague
  • Estonia report: Virtual exhibition about Polish-Estonian relations. Rephotography and cultural heritage
  • Germany report: KulTour in Swabia and 8000 documents new online
  • India report: Utilising Occasion for Content donation: A story
  • Netherlands report: WMIN & WMNL collaboration & Japanese propaganda films
  • Serbia report: Enriching Wiki projects in different ways
  • Sweden report: Free music and new recordings of songs in the public domain; Autumn in the libraries; Yes, you can hack the heritage this year – online!
  • Uganda report: Participating in the African Librarians Week (24-30 May 2020)
  • UK report: Spanish metal and …
  • USA report: Wiknic & Black Artists Matter & Respect Her Crank
  • WMF GLAM report: Wikipedia Library, new WikiCite grant programs, and GLAM office hours
  • Calendar: September’s GLAM events

I just published a VS Code language extension that adds syntax highlighting for the Stuttgart Finite State Transducer (SFST) formalism. You can download and install the extension, and the source code is available too. I learned how to write a language extension when I attempted the OpenType feature file support, so I thought of applying that learning to SFST, which I regularly use for the Malayalam morphology analyser project.
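For those curious, a purely declarative syntax-highlighting extension needs little more than a package.json manifest that registers the language and points at a TextMate grammar. A minimal sketch follows; the extension name, file extension, and paths are illustrative assumptions, not necessarily those of the published extension:

    {
      "name": "sfst-syntax",
      "version": "0.0.1",
      "engines": { "vscode": "^1.40.0" },
      "contributes": {
        "languages": [
          { "id": "sfst", "aliases": ["SFST"], "extensions": [".fst"] }
        ],
        "grammars": [
          {
            "language": "sfst",
            "scopeName": "source.sfst",
            "path": "./syntaxes/sfst.tmLanguage.json"
          }
        ]
      }
    }

The referenced grammar file then defines the regex-based scopes that VS Code maps to theme colours.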

Best Responsive MediaWiki Skins

00:00, Friday, 11 2020 September UTC

Are you looking for a mobile-friendly, responsive skin for your wiki? I've got you covered with a comparison of the best skins.

I believe that your wiki should serve mobile devices well from the start, but it can be difficult to find a skin that fits this purpose. Do these mobile-friendly skins even exist? Let's find out and take a look at the best picks for MediaWiki you should consider.

Overview

Even today, there are not many choices when it comes to mobile and responsive skins. I've done the research for you, and in this blog post I will look at the available alternatives that serve the content of your wiki best on mobile devices.

All of the mobile and responsive skins for MediaWiki I will mention in this blog post integrate with both Semantic MediaWiki and VisualEditor. In my opinion, these two requirements need to be met. In addition, these skins are under active development, meaning that support for new major releases of MediaWiki is added within a short period of time. The latter is good to know if you would like the next major upgrade of your MediaWiki instance to be smooth when it comes to the skin used.

Let us now look at the selection of mobile and responsive skins for MediaWiki I made for you.

Chameleon

The Chameleon skin comes by default with a layout offering two navigation bars enclosing the wiki's content at the top and bottom, with the first holding the menu for actions on the right and the latter holding the tool menu. This is different from what you are accustomed to with the classic wiki skins for MediaWiki, but I do not think that is a bad thing.

It is the most versatile and flexible skin when it comes to customizing the look and feel of your wiki. Moreover, it can easily be extended with further features from the Bootstrap framework. If you are into creating a unique impression for your wiki, then this skin is your choice.

Chameleon skin
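As a taste of what enabling a skin looks like, here is a minimal sketch for Chameleon in LocalSettings.php, assuming the skin has already been installed via Composer; the layout file path is a made-up illustration, so check the skin's documentation for the current steps:

    wfLoadSkin( 'chameleon' );
    $wgDefaultSkin = 'chameleon';

    # Optional: swap in a custom layout definition to rearrange the
    # skin's components (navbars, sidebar, etc.). Illustrative path:
    $egChameleonLayoutFile = __DIR__ . '/layouts/custom-layout.xml';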

Foreground

The Foreground skin serves a fixed layout providing one navigation bar at the top, above the wiki's content, which also holds the tool menu. The menu for actions is located at the top right of the content area.

It is one of the skins with a clear and straight appearance, and it serves by default the features provided by the Foundation framework. If you are happy with the layout and just want to make a couple of changes to the CSS, then you should pick this skin.

Foreground skin

Pivot

The Pivot skin serves a fixed layout resembling that of the well-known classic Vector and Monobook skins with a sidebar on the left of the screen. The user menu is toggled in and out on the right side of the screen. The menu for actions is located at the top right of the content area.

This is a mobile and responsive skin offering the classic wiki appearance, serving by default the features provided by the Foundation framework. If you are into the classic wiki layout and just want to make a couple of changes to the CSS, then this skin is your pick.

Pivot skin

Tweeki

The Tweeki skin serves a flexible layout, by default providing one navigation bar at the top, above the wiki's content, which also holds the menus for tools and actions, with the exception of the edit action, which is placed as a button at the top right of the content area.

It is a skin with a clean, aesthetic layout, by default bringing the features provided by the Bootstrap framework. If you like the layout provided by the Tweeki skin and just want to make a couple of changes to the CSS, then this skin is definitely something for you. If you would like to build extensively upon the skin, it is also a very nice choice.

Tweeki skin

Adoption

Let us now look at basic data such as usage numbers, the web framework used, and other information:

Skin        Usage  Since  Framework
Chameleon   215    2013   Bootstrap 4
Foreground  208    2014   Foundation 5
Tweeki      68     2016   Bootstrap 3
Pivot       38     2017   Foundation 5

The data was sourced from WikiApiary.com, MediaWiki.org and GitHub.

The public usage numbers of these mobile and responsive MediaWiki skins do not suggest large-scale adoption. However, we need to remember that private wiki instances running MediaWiki are not covered at all, and from experience these are much more likely to use one of the skins mentioned here. Still, the numbers provide a pretty good overview of how the skins compare in usage, and their overall adoption has been steadily increasing since their inception.

Interestingly, but not surprisingly, none of the mobile and responsive skins mentioned here is deployed on wikis run by the Wikimedia Foundation, the organization behind Wikipedia. One reason is that, due to their focus on being mobile and responsive, these skins provide features that may break content on Wikipedia; features like accordions, carousels or tabs are even unwanted within the scope of Wikipedia. Another reason is that they were not developed by the Wikimedia Foundation, which therefore cannot assure direct control over their code base and release process, an important factor for Wikipedia and related projects.

This leads to the question: which mobile skin is used for Wikipedia, and why wasn't it considered in this blog post? The skin is called Minerva Neue, and it is best used with the MobileFrontend extension. Together they piggy-back on the classic MediaWiki skins we all know, Vector and MonoBook. In my experience, this solution to a mobile skin for MediaWiki is generally viewed by users with resigned acceptance. It also offers little room for adaptation to an individual wiki's requirements.
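For comparison, a minimal sketch of this Wikipedia-style mobile setup in LocalSettings.php, assuming both the extension and the skin are already installed:

    wfLoadExtension( 'MobileFrontend' );
    wfLoadSkin( 'MinervaNeue' );

    # Serve Minerva to visitors in the mobile view.
    $wgDefaultMobileSkin = 'minerva';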

Conclusion

All mobile and responsive skins for MediaWiki I have covered in this blog post are recommended and can cheerfully be used for your wiki.

In the end, your decision will depend not just on your individual use case for running a wiki, but also on which kind of changes you would like to make to the skin, on your personal technical skill level, and on how much you would like to get involved in making them yourself.

All of these skins can be used out of the box, with you probably wanting to make only a few adaptations to the visual appearance. Two of the skins, Chameleon and Tweeki, allow for more or less invasive changes to the layout and visual appearance and require a medium to high level of technical skill. The other two, Foreground and Pivot, are fixed in their layout but can still be changed in their visual appearance; the changes needed here can be made by people with a low to medium level of technical skill.

I hope you enjoyed reading this blog post and got something out of it. Please have a look at the follow-up blog posts covering the mobile and responsive skins mentioned here.

Hosting and Support

All of Professional.Wiki’s hosting plans offer the mobile and responsive skins for MediaWiki covered by this blog post. Also professional support is available to help you get the most out of your wiki. We will be very happy to assist you and make a difference for you!

Further reading