Production Excellence #28: January 2021

07:31, Saturday, 20 February 2021 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📈 Incidents

1 documented incident last month. That's the third month in a row that we are at or near zero major incidents – not bad! [1] [2]

Learn about recent incidents at Incident status on Wikitech, or Preventive measures in Phabricator.

💡 Did you know: Our Incident status page provides a green-yellow status reflection over the past ten days, with a link to the most recent incident doc if there was any during that time.

📊 Trends

This January saw a small recovery in our otherwise negative, upward trend. For the first time in twelve months, more reports were closed than new reports outlived the previous month without resolution. What happened twelve months ago? In January 2020, we also saw a small recovery during the otherwise upward trend before and after it.

Perhaps it's something about the post-December holidays that temporarily improves the quality, and/or reduces the quantity, of code changes. Only time will tell if this is the start of a new positive trend, or merely a post-holiday break. [3]

While our month-to-month trend might not (yet) be improving, we do see persistent improvements in our overall backlog of pre-2019 reports. This is in part because we generally don't file new reports there, so it makes sense that it doesn't go back up, but it's still good to see downward progress every month, unlike with reports from more recent months which often see no change month-to-month (see "Outstanding errors" below, for example).

This positive trend on our "Old" backlog started in October 2020 and has consistently progressed every month since then (refer to the "Old" numbers in red on the below chart, or the same column in the spreadsheet). [3][4]

📖 Outstanding errors

Summary over recent months:

  • ⚠️ July 2019 (2 of 18 issues left): no change.
  • ⚠️ August 2019 (1 of 14 issues): no change.
  • ✅ September 2019 (0 of 12 issues): Last two tasks were resolved (-2).
  • ⚠️ October 2019 (4 of 12 issues): One task resolved (-1).
  • ⚠️ November 2019 (1 of 5 issues): no change.
  • ⚠️ December 2019 (2 of 9 issues): Two tasks resolved (-2).
  • ⚠️ January 2020 (2 of 7 issues): no change.
  • ⚠️ February 2020 (1 of 7 issues left): One task resolved (-1).
  • March 2020 (2 of 2 issues left): no change.
  • April 2020 (9 of 14 issues left): no change.
  • May 2020 (6 of 14 issues left): One task resolved (-1).
  • June 2020 (7 of 14 issues left): no change.
  • July 2020 (9 of 24 new issues): no change.
  • August 2020 (22 of 53 new issues): One task resolved (-1).
  • September 2020 (13 of 33 new issues): One task resolved (-1).
  • October 2020 (31 of 69 new issues): Four tasks fixed (-4).
  • November 2020 (14 of 38 new issues): no change.
  • December 2020 (19 of 33 new issues): Three tasks resolved (-3).
  • January 2021: 7 of 50 new issues survived the month and remained unresolved (+50; -43)
Recent tally
160 issues open, as of Excellence #27 (4 Feb 2021).
-15 issues closed since, of the previous 160 open issues.
+7 new issues that survived January 2021.
152 issues open, as of today (16 Feb 2021).
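
The tally above is simple bookkeeping, and it can be checked in a few lines. The following is a minimal sketch; the function name `tally` is made up for illustration, and the figures are the ones quoted in this post.

```python
def tally(previous_open: int, closed: int, new_survivors: int) -> int:
    """Return the open-issue count after one reporting period."""
    return previous_open - closed + new_survivors


# Figures from the January 2021 tally above.
open_now = tally(previous_open=160, closed=15, new_survivors=7)
print(open_now)  # 152

# Share of January's 50 new reports that were resolved within the month.
resolved_share = (50 - 7) / 50
print(f"{resolved_share:.0%}")  # 86%
```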

January saw +50 new production errors reported in a single month, which is an unfortunate all-time high. However, we've also done remarkably well on addressing 43 of them within a month, when the potential root cause and diagnostics data were still fresh in our minds. Well done!

For the on-going month of February, there have been 16 new issues reported so far.

Take a look at the workboard and look for tasks that could use your help!

View Workboard

🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


[1] Incident status, Wikitech.
[2] Wikimedia incident stats by Krinkle, CodePen.
[3] Month-over-month, Production Excellence spreadsheet.
[4] Open tasks, Wikimedia-prod-error, Phabricator.

Production Excellence #27: December 2020

18:35, Thursday, 04 February 2021 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📈 Incidents

1 documented incident in December. [1] In previous years, December typically had 4 or fewer documented incidents. [3]

Learn about recent incidents at Incident documentation on Wikitech, or Preventive measures in Phabricator.

📊 Trends

Month-over-month plots based on spreadsheet data. [4] [2]

📖 Outstanding errors

Take a look at the workboard and look for tasks that could use your help.

Summary over recent months:

  • ⚠️ July 2019 (2 of 18 issues left): no change.
  • ⚠️ August 2019 (1 of 14 issues): no change.
  • ⚠️ September 2019 (2 of 12 issues): One task resolved (-1).
  • ⚠️ October 2019 (5 of 12 issues): no change.
  • ⚠️ November 2019 (1 of 5 issues): no change.
  • ⚠️ December 2019 (4 of 9 issues): no change.
  • ⚠️ January 2020 (2 of 7 issues): no change.
  • February 2020 (2 of 7 issues left): no change.
  • March 2020 (2 of 2 issues left): no change.
  • April 2020 (9 of 14 issues left): no change.
  • May 2020 (7 of 14 issues left): no change.
  • June 2020 (7 of 14 issues left): no change.
  • July 2020 (9 of 24 new issues): no change.
  • August 2020 (23 of 53 new issues): no change.
  • September 2020 (13 of 33 new issues): One task resolved (-1).
  • October 2020 (35 of 69 new issues): Four issues fixed (-4).
  • November 2020 (14 of 38 new issues): Five issues fixed (-5).
  • December 2020: 22 of 33 new issues survived the month and remained unresolved (+33; -11)
Recent tally
149 as of Excellence #26 (15 Dec 2020).
-11 closed of the 149 recent issues.
+22 new issues survived December 2020.
160 as of 27 Jan 2021.

🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


[1] Incident documentation 2020, Wikitech.
[2] Open tasks, Wikimedia-prod-error, Phabricator.
[3] Wikimedia incident stats by Krinkle, CodePen.
[4] Month-over-month, Production Excellence spreadsheet.

Production Excellence #26: November 2020

18:34, Thursday, 04 February 2021 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📈 Incidents

Zero documented incidents in November. [1] That's the only month this year without any (publicly documented) incidents. In 2019, November was also the only such month. [3]

Learn about recent incidents at Incident documentation on Wikitech, or Preventive measures in Phabricator.

📊 Trends

The overall increase in errors was relatively low this past month, similar to the November-December period last year.

What's new is that we can start to see a positive trend emerging in the backlogs where we've shrunk issue count three months in a row, from the 233 high in October, down to the 181 we have in the ol' backlog today.

Month-over-month plots based on spreadsheet data. [4]

📖 Outstanding errors

Take a look at the workboard and look for tasks that could use your help.

Summary over recent months:

  • ⚠️ July 2019 (2 of 18 tasks): One task closed (-1).
  • ⚠️ August 2019 (1 of 14 tasks): no change.
  • ⚠️ September 2019 (3 of 12 tasks): no change.
  • ⚠️ October 2019 (5 of 12 tasks): no change.
  • ⚠️ November 2019 (1 of 5 tasks): no change.
  • ⚠️ December 2019 (3 of 9 tasks left): no change.
  • January 2020 (3 of 7 tasks left): One task closed (-1).
  • February (2 of 7 tasks left): no change.
  • March (2 of 2 tasks left): no change.
  • April (9 of 14 tasks left): no change.
  • May (7 of 14 tasks left): no change.
  • June (7 of 14 tasks left): no change.
  • July 2020 (9 of 24 new tasks): no change.
  • August 2020 (23 of 53 new tasks): Three tasks closed (-3).
  • September 2020 (14 of 33 new tasks): One task closed (-1).
  • October 2020 (39 of 69 new tasks): Six tasks closed (-6).
  • November 2020: 19 of 38 new tasks survived the month and remain open today (+38; -19)
Recent tally
142 as of Excellence #25 (23 Nov 2020).
-12 closed of the 142 recent tasks.
+19 survived November 2020.
149 as of today, 15 Dec 2020.

The on-going month of December has 19 unresolved tasks so far.

🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof

❝   The plot "thickens" as they say. Why, by the way? Is it a soup metaphor? ❞


[1] Incident documentation 2020, Wikitech.
[2] Open tasks, Wikimedia-prod-error, Phabricator.
[3] Wikimedia incident stats, Krinkle, CodePen.
[4] Month-over-month, Production Excellence (spreadsheet).

Perf Matters at Wikipedia in 2015

00:33, Thursday, 31 December 2020 UTC

Hello, WANObjectCache

This year we achieved another milestone in our multi-year effort to prepare Wikipedia for serving traffic from multiple data centres.

The MediaWiki application that powers Wikipedia relies heavily on object caching. We use Memcached as a horizontally scaled key-value store, and we’d like to keep the cache local to each data centre. This minimises dependencies between data centres, and makes better use of storage capacity (based on local needs).

Aaron Schulz devised a strategy that makes MediaWiki caching compatible with the requirements of a multi-DC architecture. Previously, when source data changed, MediaWiki would recompute and replace the cache value. Now, MediaWiki broadcasts “purge” events for cache keys. Each data centre receives these and sets a “tombstone”, a marker lasting a few seconds that limits any set-value operations for that key to a minuscule time-to-live. This makes it tolerable for recache-on-miss logic to recompute the cache value using local replica databases, even though they might have several seconds of replication lag. Heartbeats are used to detect the replication lag of the databases involved during any re-computation of a cache value. When that lag is more than a few seconds (a large portion of the tombstone period), the corresponding cache set-value operation automatically uses a low time-to-live. This means that large amounts of replication lag are tolerated.

This and other aspects of WANObjectCache’s design allow MediaWiki to trust that cached values are not substantially more stale than a local replica database, provided that cross-DC broadcasting of tiny in-memory tombstones is not disrupted.
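
To make the tombstone idea concrete, here is a minimal Python sketch of a single data centre's cache. It is not WANObjectCache's code (which is PHP and considerably more involved); the class name, method names, and every constant below are assumptions chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# All four values are illustrative assumptions, not MediaWiki's settings.
HOLDOFF_SECONDS = 10.0   # how long a tombstone lasts after a purge
TOMBSTONED_TTL = 1.0     # tiny TTL forced while a tombstone is active
MAX_TRUSTED_LAG = 3.0    # replication lag above this is treated as "high"
LAGGED_TTL = 30.0        # low TTL for values computed from a lagged replica


class LocalDcCache:
    """Toy model of one data centre's cache with purge tombstones."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._values: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expiry)
        self._tombstones: Dict[str, float] = {}          # key -> tombstone expiry

    def purge(self, key: str) -> None:
        """Handle a broadcast purge: drop the value and leave a tombstone."""
        self._values.pop(key, None)
        self._tombstones[key] = self._clock() + HOLDOFF_SECONDS

    def set(self, key: str, value: Any, ttl: float, replica_lag: float = 0.0) -> float:
        """Store a value and return the TTL that was actually applied."""
        tombstone_expiry = self._tombstones.get(key)
        if tombstone_expiry is not None and tombstone_expiry > self._clock():
            ttl = min(ttl, TOMBSTONED_TTL)  # recently purged: do not trust for long
        if replica_lag > MAX_TRUSTED_LAG:
            ttl = min(ttl, LAGGED_TTL)      # source data may be stale
        self._values[key] = (value, self._clock() + ttl)
        return ttl

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None when it is missing or expired."""
        entry = self._values.get(key)
        if entry is None or entry[1] <= self._clock():
            return None
        return entry[0]
```

The point of the sketch is that a purge never writes a new value itself. It only shortens the lifetime of whatever is recomputed shortly afterwards, so a value built from a lagging replica cannot linger.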

First paint time now under 900ms

In July we set out a goal: improve page load performance so our median first paint time would go down from approximately 1.5 seconds to under a second – and stay under it!

I identified synchronous scripts as the single-biggest task blocking the browser, between the start of a page navigation and the first visual change seen by Wikipedia readers. We had used async scripts before, but converting these last two scripts to be asynchronous was easier said than done.

There were several blockers to this change, including the use of embedded scripts by interactive features. These were partly migrated to CSS-only solutions. For the other features, we introduced the notion of “delayed inline scripts”. Embedded scripts now wrap their code in a closure and add it to an array. After the module loader arrives, we process the closures from the array and execute the code within.
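
The real mechanism lives in browser JavaScript, but the queue-then-drain pattern behind “delayed inline scripts” is language-neutral. Below is a rough Python sketch of the idea; the class and method names are hypothetical and do not correspond to MediaWiki's actual code.

```python
from typing import Callable, List


class DelayedScripts:
    """Queue closures until the loader is ready, then run them in order."""

    def __init__(self) -> None:
        self._pending: List[Callable[[], None]] = []
        self._loader_ready = False

    def push(self, closure: Callable[[], None]) -> None:
        """Called by each embedded script instead of running its code directly."""
        if self._loader_ready:
            closure()  # loader already present: run straight away
        else:
            self._pending.append(closure)

    def loader_arrived(self) -> None:
        """Called once when the module loader has finished loading."""
        self._loader_ready = True
        pending, self._pending = self._pending, []
        for closure in pending:
            closure()


log: List[str] = []
queue = DelayedScripts()
queue.push(lambda: log.append("feature A"))
queue.push(lambda: log.append("feature B"))
queue.loader_arrived()                       # drains the queue in order
queue.push(lambda: log.append("feature C"))  # runs immediately
print(log)  # ['feature A', 'feature B', 'feature C']
```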

Another major blocker was the subset of community-developed gadgets that didn’t yet use the module loader (introduced in 2011). These legacy scripts assumed a global scope for variables, and depended on browser behaviour specific to serially loaded, synchronous scripts. Between July 2015 and August 2015, I worked with the community to develop a migration guide, and after a short deprecation period, the legacy loader was removed.

Hello, WebPageTest

Previously, we only collected performance metrics for Wikipedia from sampled real-user page loads. This is super and helps detect trends, regressions, and other changes at large. But, to truly understand the characteristics of what made a page load a certain way, we need synthetic testing as well.

Synthetic testing offers frame-by-frame video captures, waterfall graphs, performance timelines, and above-the-fold visual progression. We can run these automatically (e.g. every hour) for many urls, on many different browsers and devices, and from different geo locations. These tests allow us to understand the performance, and analyse it. We can then compare runs over any period of time, and across different factors. It also gives us snapshots of how pages were built at a certain point in time.

The results are automatically recorded into a database every hour, and we use Grafana to visualise the data.

In 2015 Peter built out the synthetic testing infrastructure for Wikimedia, from scratch. We use the open-source WebPageTest software. To read more about its operation, check Wikitech.

The journey to Thumbor begins

Gilles evaluated various thumbnailing services for MediaWiki. The open-source Thumbor software came out as the most promising candidate.

Gilles implemented support for Thumbor in the MediaWiki-Vagrant development environment.

To read more about our journey to Thumbor, read The Journey to Thumbor (part 1).

Save timing reduced by 50%

Save timing is one of the key performance metrics for Wikipedia. It measures the time from when a user presses “Publish changes” while editing until the user’s browser starts to receive a response. During this time, many things happen. MediaWiki parses the wiki-markup into HTML, which can involve page macros, sub-queries, templates, and other parser extensions. These inputs must be saved to a database. There may also be some cascading updates, such as the page’s membership in a category. And last but not least, there is the network latency between the user’s device and our data centres.

This year saw a 50% reduction in save timing. At the beginning of the year, median save timing was 2.0 seconds (quarterly report). By June, it was down to 1.6 seconds (report), and in September 2015, we reached 1.0 seconds! (report)

The effort to reduce save timing was led by Aaron Schulz. The impact that followed was the result of hundreds of changes to MediaWiki core and to extensions.

Deferring tasks to post-send

Many of these changes involved deferring work to happen post-send. That is, after the server sends the HTTP response to the user and closes the main database transaction. Examples of tasks that now happen post-send are: cascading updates, emitting “recent changes” objects to the database and to pub-sub feeds, and doing automatic user rights promotions for the editing user based on their current age and total edit count.

Aaron also implemented the “async write” feature in the multi-backend object cache interface. MediaWiki uses this for storing the parser cache HTML in both Memcached (tier 1) and MySQL (tier 2). The second write now happens post-send.

By re-ordering these tasks to occur post-send, the server can send a response back to the user sooner.
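
As a rough illustration of the re-ordering, the sketch below queues non-essential work during a request and runs it only after the response has gone out. It is a simplified Python model, not MediaWiki's implementation; `DeferredUpdates` and `handle_edit` here are illustrative names, and the task list mirrors the examples given above.

```python
from typing import Callable, List, Tuple

Task = Callable[[], None]


class DeferredUpdates:
    """Collects work during a request and runs it after the response is sent."""

    def __init__(self) -> None:
        self._tasks: List[Tuple[str, Task]] = []

    def add(self, name: str, task: Task) -> None:
        self._tasks.append((name, task))

    def run_all(self) -> List[str]:
        """Run queued tasks in the order they were added; return their names."""
        done: List[str] = []
        while self._tasks:
            name, task = self._tasks.pop(0)
            task()
            done.append(name)
        return done


def handle_edit(events: List[str]) -> None:
    deferred = DeferredUpdates()

    events.append("save revision")  # must finish before we can respond
    deferred.add("cascading updates", lambda: events.append("cascading updates"))
    deferred.add("recent changes", lambda: events.append("recent changes"))
    deferred.add("rights promotion", lambda: events.append("rights promotion"))

    events.append("send response")  # the user stops waiting here
    deferred.run_all()              # post-send work happens afterwards


events: List[str] = []
handle_edit(events)
print(events)
# ['save revision', 'send response', 'cascading updates',
#  'recent changes', 'rights promotion']
```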

Working with the database, instead of against it

A major category of changes was improvements to database queries. For example, reducing lock contention in SQL, refactoring code in a way that reduces the amount of work done between two write queries in the same transaction, splitting large queries into smaller ones, and avoiding use of database master connections whenever possible.

These optimisations reduced the chances of queries being stalled, and allowed them to complete more quickly.

Avoid synchronous cache re-computations

The aforementioned work on WANObjectCache also helped a lot. Whenever we converted a feature to use this interface, we reduced the amount of blocking cache computation that happened mid-request. WANObjectCache also performs probabilistic preemptive refreshes of near-expiring values, which can prevent cache stampedes.
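
A probabilistic preemptive refresh can be sketched as follows. The linear ramp used here is an assumption made for illustration and is not necessarily the formula WANObjectCache uses; the function names and the `refresh_window` parameter are likewise hypothetical.

```python
import random
from typing import Callable


def refresh_chance(remaining_ttl: float, refresh_window: float) -> float:
    """Chance of refreshing early, rising linearly as expiry approaches."""
    if remaining_ttl <= 0:
        return 1.0  # already expired: always recompute
    if refresh_window <= 0 or remaining_ttl >= refresh_window:
        return 0.0  # far from expiry: never recompute early
    return 1.0 - remaining_ttl / refresh_window


def should_refresh(
    remaining_ttl: float,
    refresh_window: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide, per request, whether this caller recomputes the value now."""
    return rng() < refresh_chance(remaining_ttl, refresh_window)


print(refresh_chance(60, 30))  # 0.0
print(refresh_chance(15, 30))  # 0.5
print(refresh_chance(0, 30))   # 1.0
```

Because only a fraction of requests refresh the value while it is still valid, the recomputation is spread out over time rather than triggered by every request at the moment of expiry.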

Profiling can be expensive

We disabled the performance profiler of the AbuseFilter extension in production. AbuseFilter allows privileged users to write rules that may prevent edits based on certain heuristics. Its profiler would record how long the rules took to inspect an edit, allowing users to optimise them. The way the profiler worked, though, added a significant slowdown to the editing process. Work began later in 2016 to create a new profiler, which has since been completed.

And more

Lots of small things, including fixing the User object cache, which existed but wasn’t working. We also now avoid caching values in Memcached if computing them is faster than the Memcached latency required to fetch them!

We also improved latency of file operations by switching more LBYL-style coding patterns to EAFP-style code. Rather than checking whether a file exists, is readable, and then checking when it was last modified – do only the latter and handle any errors. This is both faster and more correct (due to LBYL race conditions).
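
The difference between the two styles is easy to see side by side. MediaWiki itself is written in PHP, so the Python below is only an illustration of the pattern, with made-up function names.

```python
import os
from typing import Optional


def mtime_lbyl(path: str) -> Optional[float]:
    """Look Before You Leap: several filesystem calls, with gaps between them."""
    if not os.path.exists(path):
        return None
    if not os.access(path, os.R_OK):
        return None
    # The file could be deleted right here, after the checks passed.
    return os.path.getmtime(path)


def mtime_eafp(path: str) -> Optional[float]:
    """Easier to Ask Forgiveness than Permission: one call, errors handled."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return None
```

The EAFP version makes a single system call and has no window in which the file can disappear between the check and the read.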

So long, Sajax!

Sajax was a library for invoking a subroutine on the server, and receiving its return value as JSON from client-side JavaScript. In March 2006, it was adopted in MediaWiki to power the autocomplete feature of the search input field.

The Sajax library had a utility for creating an XMLHttpRequest object in a cross-browser-compatible way. MediaWiki deprecated Sajax in favour of jQuery.ajax and the MediaWiki API. Yet, years later in 2015, this tiny part of Sajax remained popular in Wikimedia's ecosystem of community-developed gadgets.

The legacy library was loaded by default on all Wikipedia page views for nearly a decade. During a performance inspection this year, Ori Livneh decided it was high time to finish this migration. Goodbye Sajax!

Further reading

This year also saw the switch to encrypt all Wikimedia traffic with TLS by default.

Mentioned tasks: T107399, T105391, T109666, T110858, T55120.

Tech News issue #53, 2020 (December 28, 2020)

00:00, Monday, 28 December 2020 UTC

Runnable runbooks

18:59, Tuesday, 15 December 2020 UTC

Recently there has been a small effort on the Release-Engineering-Team to encode some of our institutional knowledge as runbooks linked from a page in the team's wiki space.

What are runbooks, you might ask? This is how they are described on the aforementioned wiki page:

This is a list of runbooks for the Wikimedia Release Engineering Team, covering step-by-step lists of what to do when things need doing, especially when things go wrong.

So runbooks are each essentially a sequence of commands, intended to be pasted into a shell by a human. Step by step instructions that are intended to help the reader accomplish an anticipated task or resolve a previously-encountered issue.

Presumably runbooks are created when someone encounters an issue, and, recognizing that it might happen again, helpfully documents the steps that were used to resolve said issue.

This all seems pretty sensible at first glance. This type of documentation can be really valuable when you're in an unexpected situation or trying to accomplish a task that you've never attempted before. Just about anyone reading this probably has some experience running shell commands pasted from online tutorials, setup instructions for a program, and so on.

Despite the obvious value runbooks can provide, I've come to harbor a fairly strong aversion to the idea of encoding what are essentially shell scripts as individual commands on a wiki page. As someone whose job involves a lot of automation, I would usually much prefer a shell script, a Python program, or even a "maintenance script" over a runbook.

After a lot of contemplation, I've identified a few reasons that I don't like runbooks on wiki pages:

  • Runbooks are tedious and prone to human errors.
    • It's easy to lose track of where you are in the process.
    • It's easy to accidentally skip a step.
    • It's easy to make typos.
  • A script can be code reviewed and version controlled in git.
  • A script can validate its arguments, which helps to catch typos.
  • I think that command line terminal input is more like code than it is prose. I am more comfortable editing code in my usual text editor as opposed to editing in a web browser. The wikitext editor is sufficient for basic text editing, and visual editor is quite nice for rich text editing, but neither is ideal for editing code.

I do realize that MediaWiki does version control. I also realize that sometimes you just can't be bothered to write and debug a robust shell script to address some rare circumstances. The cost is high and it's uncertain whether the script will be worth such an effort. In those situations a runbook might be the perfect way to contribute to collective knowledge without investing a lot of time into perfecting a script.

My favorite web comic, xkcd, has a few things to say about this subject:

"The General Problem" xkcd #974. "Automation" xkcd #1319. "Is It Worth the Time?" xkcd #1205.

Potential Solutions

I've been pondering a solution to these issues for a long time. Mostly motivated by the pain I have experienced (and the mistakes I've made) while executing the biggest runbook of all on a regular basis.

Over the past couple of years I've come across some promising ideas which I think can help with the problems I've identified with runbooks. I think that one of the most interesting is Do-nothing scripting. Dan Slimmon identifies some of the same problems that I've detailed here. He uses the term "slog" to refer to long and tedious procedures like the Wikimedia Train Deploys. The proposed solution comes in the form of a do-nothing script. You should go read that article; it's not very long. Here are a few relevant quotes:

Almost any slog can be turned into a do-nothing script. A do-nothing script is a script that encodes the instructions of a slog, encapsulating each step in a function.


At first glance, it might not be obvious that this script provides value. Maybe it looks like all we’ve done is make the instructions harder to read. But the value of a do-nothing script is immense:

  • It’s now much less likely that you’ll lose your place and skip a step. This makes it easier to maintain focus and power through the slog.
  • Each step of the procedure is now encapsulated in a function, which makes it possible to replace the text in any given step with code that performs the action automatically.
  • Over time, you’ll develop a library of useful steps, which will make future automation tasks more efficient.

A do-nothing script doesn’t save your team any manual effort. It lowers the activation energy for automating tasks, which allows the team to eliminate toil over time.
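
For readers who haven't seen one, a do-nothing script might look roughly like the sketch below. This is not the script from Dan Slimmon's article; the three steps are hypothetical placeholders, and nothing here performs any real action.

```python
from typing import Callable, List

Prompt = Callable[[str], str]


# Each step only prints instructions and waits for the operator (hypothetical steps).
def announce_deploy(prompt: Prompt) -> None:
    print("Tell the team channel that the deploy is starting.")
    prompt("Press Enter once announced: ")


def sync_code(prompt: Prompt) -> None:
    print("Run the sync command from your runbook on the deploy host.")
    prompt("Press Enter once the sync has finished: ")


def check_logs(prompt: Prompt) -> None:
    print("Watch the error dashboard for a few minutes.")
    prompt("Press Enter if error rates look normal: ")


STEPS: List[Callable[[Prompt], None]] = [announce_deploy, sync_code, check_logs]


def run(steps: List[Callable[[Prompt], None]], prompt: Prompt = input) -> int:
    """Walk through every step in order and return how many were completed."""
    for number, step in enumerate(steps, start=1):
        print(f"[{number}/{len(steps)}] {step.__name__}")
        step(prompt)
    return len(steps)


# Dry run with a stub prompt; call run(STEPS) to be prompted for real.
completed = run(STEPS, prompt=lambda message: "")
```

Any one of these functions can later be replaced with code that performs the step automatically, without touching the others.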

I was inspired by this and I think it's a fairly clever solution to the problems identified. What if we combined the best aspects of gradual automation with the best aspects of a wiki-based runbook? Others were inspired by this as well, resulting in tools like braintree/runbook, codedown and the one I'm most interested in, rundoc.

Runnable Runbooks

My ideal tool would combine code and instructions in a free-form "literate programming" style. By following some simple conventions in our runbooks we can use a tool to parse and execute the embedded code blocks in a controlled manner. With a little bit of tooling we can gain many benefits:

  • The tooling will keep track of the steps to execute, ensuring that no steps are missed.
  • Ensure that errors aren't missed by carefully checking / logging the result of each step.
  • We could also provide a mechanism for inputting the values of any variables / arguments and validate the format of user input.
  • With flexible control flow management we can even allow resuming from anywhere in the middle of a runbook after an aborted run.
  • Manual steps can just consist of a block of prose that gets displayed to the operator. With embedded markup we can format the instructions nicely and render them in the terminal using Rich [7]. Once the operator confirms that the step is complete then the workflow moves on to the next step.
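
To give a feel for how little tooling the core idea needs, here is a minimal sketch that pulls fenced code blocks out of a Markdown runbook and runs them one at a time with confirmation. It is not how rundoc or any of the other tools work internally; the function names, the `bash` tag convention, and the use of `sh -c` are all assumptions made for this example.

```python
import subprocess
from typing import Callable, List, Optional

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example fence-safe


def extract_steps(markdown: str, language: str = "bash") -> List[str]:
    """Return the body of every fenced code block tagged with the given language."""
    steps: List[str] = []
    current: Optional[List[str]] = None  # lines of the open block, or None
    for line in markdown.splitlines():
        stripped = line.strip()
        if current is None:
            if stripped == FENCE + language:
                current = []
        elif stripped == FENCE:
            steps.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return steps


def run_steps(
    steps: List[str],
    start: int = 0,
    confirm: Callable[[str], str] = input,
    shell: str = "sh",
) -> int:
    """Run steps from `start`; return the index of the next step still to run.

    Stops when the operator declines a step or a step exits non-zero, so the
    returned index can be passed back in as `start` to resume later.
    """
    for index in range(start, len(steps)):
        print(f"--- step {index + 1}/{len(steps)} ---\n{steps[index]}")
        if confirm("Run this step? [y/N] ").strip().lower() != "y":
            return index
        result = subprocess.run([shell, "-c", steps[index]])
        if result.returncode != 0:
            print(f"step {index + 1} failed with exit code {result.returncode}")
            return index
    return len(steps)
```

Returning the index of the first step that did not complete is what makes resuming possible: an aborted run can be restarted with `run_steps(steps, start=n)`.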

Prior Art

I've found a few projects that already implement many of these ideas. Here are a few of the most relevant:

The one I'm most interested in is Rundoc. It's almost exactly the tool that I would have created. In fact, I started writing code before discovering rundoc but once I realized how closely this matched my ideal solution, I decided to abandon my effort. Instead I will add a couple of missing features to Rundoc in order to get everything that I want and hopefully I can contribute my enhancements back upstream for the benefit of others.



[1]: "runbooks"
[2]: "Train deploys"
[3]: "Do-nothing scripting: the key to gradual automation by Dan Slimmon"
[4]: "runbook by braintree"
[5]: "codedown by earldouglas"
[6]: "rundoc by eclecticiq"
[7]: "Rich python library"

Changes and improvements to PHPUnit testing in MediaWiki

10:32, Wednesday, 25 November 2020 UTC

Building off the work done at the Prague Hackathon (T216260), we're happy to announce some significant changes and improvements to the PHP testing tools included with MediaWiki.

PHP unit tests can now be run statically, without installing MediaWiki

You can now download MediaWiki, run composer install, and then composer phpunit:unit to run core's unit test suite (T89432).

The standard PHPUnit entrypoint can be used, instead of the PHPUnit Maintenance class

You can now use the plain PHPUnit entrypoint at vendor/bin/phpunit instead of the MediaWiki maintenance class which wraps PHPUnit (tests/phpunit/phpunit.php).

Both the unit tests and integration tests can be executed with the standard phpunit entrypoint (vendor/bin/phpunit) or if you prefer, with the composer scripts defined in composer.json (e.g. composer phpunit:unit). We accomplished this by writing a new bootstrap.php file (the old one which the maintenance class uses was moved to tests/phpunit/bootstrap.maintenance.php) which executes the minimal amount of code necessary to make core, extension and skin classes discoverable by test classes.

Tests should be placed in tests/phpunit/{integration,unit}

Integration tests should be placed in tests/phpunit/integration, while unit tests go in tests/phpunit/unit; both are discoverable by the new test suites (T87781). It sounds obvious to write this now, but a nice side effect is that by organizing tests into these directories it's immediately clear to authors and reviewers what type of test one is looking at.

Introducing MediaWikiUnitTestCase

A new base test case, MediaWikiUnitTestCase, has been introduced with a minimal amount of boilerplate: a @covers validator, checks that globals are disabled and that the tests are in the proper directory, and the default PHPUnit 4 and 6 compatibility layer. MediaWikiTestCase has been renamed to MediaWikiIntegrationTestCase for clarity.

Please migrate tests to be unit tests where appropriate

A significant portion of core's unit tests have been ported to use MediaWikiUnitTestCase, approximately 50% of the total. We have also worked on porting extension tests to the unit/integration directories. @Ladsgroup wrote a helpful script to assist with automating the identification and moving of unit tests, see P8702. Migrating tests from MediaWikiIntegrationTestCase to MediaWikiUnitTestCase makes them faster.

Note that unit tests in CI are still run with the PHPUnit maintenance class (tests/phpunit/phpunit.php), so when reviewing unit test patches please execute them locally with vendor/bin/phpunit /path/to/tests/phpunit/unit or composer phpunit -- /path/to/tests/phpunit/unit.

Generating code coverage is now faster

The PHPUnit configuration file now resides at the root of the repository, and is called phpunit.xml.dist. (As an aside, you can copy this to phpunit.xml and make local changes, as that file is git-ignored, although you should not need to do that.) We made a modification (T192078) to the PHPUnit configuration inside MediaWiki to speed up code coverage generation. This makes it feasible to have a split window in your IDE (e.g. PhpStorm), run "Debug with coverage", and see the results in your editor fairly quickly after running the tests.

What is next?

Things we are working on:

  • Porting core tests to integration/unit.
  • Porting extension tests to integration/unit.
  • Removing legacy testsuites or ensuring they can be run in a different way (passing the directory name for example).
  • Switching CI to use new entrypoint for unit tests, then for unit and integration tests.

Help is wanted in all areas of the above! We can be found in the #wikimedia-codehealth channel and via the phab issues linked in this post.


The above work has been done and supported by Máté (@TK-999), Amir (@Ladsgroup), Kosta (@kostajh), James (@Jdforrester-WMF), Timo (@Krinkle), Leszek (@WMDE-leszek), Kunal (@Legoktm), Daniel (@daniel), Michael Große (@Michael), Adam (@awight), Antoine (@hashar), JR (@Jrbranaa) and Greg (@greg) along with several others. Thank you!

Thanks for reading, and happy testing!

Amir, Kosta, & Máté

Production Excellence #25: October 2020

05:50, Tuesday, 24 November 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📈 Incidents

2 documented incidents in October. [1] Historically, that's just below the median of 3 for this time of year. [3]

Learn about recent incidents at Incident documentation on Wikitech, or Preventive measures in Phabricator.

📊 Trends

Month-over-month plots based on spreadsheet data. [4]

📖 Outstanding errors

Take a look at the workboard and look for tasks that could use your help.

Summary over recent months:

  • ⚠️ July 2019 (3 of 18 tasks): One task closed.
  • ⚠️ August 2019 (1 of 14 tasks): no change.
  • ⚠️ September 2019 (3 of 12 tasks): no change.
  • ⚠️ October 2019 (5 of 12 tasks): One task closed.
  • ⚠️ November 2019 (1 of 5 tasks): Two tasks closed.
  • December (3 of 9 tasks left): no change.
  • January 2020 (4 of 7 tasks left): no change.
  • February (2 of 7 tasks left): no change.
  • March (2 of 2 tasks left): no change.
  • April (9 of 14 tasks left): One task closed.
  • May (7 of 14 tasks left): no change.
  • June (7 of 14 tasks left): no change.
  • July 2020 (9 of 24 new tasks): One task closed.
  • August 2020 (26 of 53 new tasks): Five tasks closed.
  • September 2020 (15 of 33 new tasks): Two tasks closed.
  • October 2020: 45 of 69 new tasks survived the month of October and remain open today.
Recent tally
110 as of Excellence #24 (23rd Oct).
-13 closed of the 110 recent tasks.
+45 survived October 2020.
142 as of today, 23rd Nov.

For the on-going month of November, there are 25 new tasks so far.

🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof

 👤  Howard Salomon:

❝   Problem is when they arrest you, you get put on the justice train, and the train has no brain. ❞  


[1] Incident documentation 2020, Wikitech
[2] Open tasks in Wikimedia-prod-error, Phabricator
[3] Wikimedia incident stats by Krinkle, CodePen
[4] Month-over-month, Production Excellence (spreadsheet)

CI now updates your deployment-charts

23:46, Tuesday, 17 November 2020 UTC

If you're making changes to a service that is deployed to Kubernetes, it sure is annoying to have to update the helm deployment-chart values with the newest image version before you deploy. At least, that's how I felt when developing on our dockerfile-generating service, blubber.

Over the last two months we've added

And I'm excited to say that CI can now handle updating image versions for you (after your change has merged), in the form of a change to deployment-charts that you'll need to +2 in Gerrit. Here's what you need to do to get this working in your repo:

Add the following to your .pipeline/config.yaml file's publish stage:

promote: true

The above assumes the defaults, which are the same as if you had added:

promote:
  - chart: "${setup.projectShortName}"  # The project name
    environments: []                    # All environments
    version: '${.imageTag}'             # The image published in this stage

You can specify any of these values, and you can promote to multiple charts, for example:

promote:
  - chart: "echostore"
    environments: ["staging", "codfw"]
  - chart: "sessionstore"

The above values would promote the production image published after merging to all environments for the sessionstore service, and only the staging and codfw environments for the echostore service. You can see more examples at
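
To make the defaulting rules described above concrete, here is a small Python sketch of how the shorthand could be expanded into explicit entries. It is only an illustration of the behaviour described in this post, not the CI's actual implementation: the helper names `default_entry` and `expand_promote` and the project name `example-service` are hypothetical.

```python
from typing import Any


def default_entry(project_short_name: str) -> dict[str, Any]:
    """Defaults as described in the post (the function itself is hypothetical)."""
    return {
        "chart": project_short_name,  # stands in for ${setup.projectShortName}
        "environments": [],           # an empty list means all environments
        "version": "${.imageTag}",    # the image published in this stage
    }


def expand_promote(promote: Any, project_short_name: str) -> list[dict[str, Any]]:
    """Expand `promote: true` or a list of partial entries into full entries."""
    if promote is True:
        return [default_entry(project_short_name)]
    if not promote:
        return []  # promotion not requested
    # Each partial entry overrides only the keys it specifies.
    return [{**default_entry(project_short_name), **entry} for entry in promote]


entries = expand_promote(
    [
        {"chart": "echostore", "environments": ["staging", "codfw"]},
        {"chart": "sessionstore"},
    ],
    project_short_name="example-service",
)
for entry in entries:
    print(entry["chart"], entry["environments"] or "all environments")
```

In this sketch the sessionstore entry keeps the empty `environments` list, which corresponds to "all environments" in the description above.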

If your containerized service doesn't yet have a .pipeline/config.yaml, now is a great time to migrate it! This tutorial can help you with the basics:

This is just one step closer to achieving continuous delivery of our containerized services! I'm looking forward to continuing to make improvements in that area.

From student to professor: Amanda Levendowski

17:23, Monday, 16 2020 November UTC

This fall, we’re celebrating the 10th anniversary of the Wikipedia Student Program with a series of blog posts telling the story of the program in the United States and Canada.

Amanda Levendowski was a law school student 10 years ago when her professor assigned her to edit a Wikipedia article as a class assignment, part of the pilot program of what is now known as the Wikipedia Student Program. She tackled the article on the FAIR USE Act, a piece of failed copyright reform legislation introduced by Rep. Zoe Lofgren. And she was hooked.

“It felt so impactful to be able to contribute to this repository of knowledge that everyone I knew was using and leave behind something valuable,” Amanda says.

When her class ended, she wasn’t done with Wikipedia. She developed an independent study in law school to create the article about revenge porn because she was writing a scholarly piece about it and noticed that there wasn’t a Wikipedia article about the problem.

“That article has been viewed more than 1 million times — it’s probably gonna have more views than any piece of scholarship I write for the rest of my life,” she says.

She continued editing herself, even appearing in a 2015 “60 Minutes” piece about editing Wikipedia. (“There was a lot of footage that was understandably left on the cutting-room floor, but I’ll always remember wryly responding to Morley Safer when he suggested that copyright law was a little outdated and maybe a little boring — I think I said something like, ‘I’m sure many of your producers who rely on fair use would disagree.’ Who says that to Morley Safer?!” she recalls.) But she attributes her ongoing dedication to Wikipedia in part to Barbara Ringer.

“The year I graduated from law school, I overhauled the article about Ringer, the lead architect of the 1976 Copyright Act, the law around which much of my professional life revolves, during a WikiCon edit-a-thon,” she explains (the hero image on this blog post is of Amanda speaking at WikiConference USA in 2014). “There is something meditative about making an article better, about sharing an untold story, that I couldn’t resist wanting to continue experiencing alongside my students. And in the process, I found this stunning quote from Ringer about how the public interest of copyright law should be ‘to provide the widest possible access to information of all kinds.’ It’s hard to hear that and not think of Wikipedia and its mission.”

And now the student has become a professor herself. Amanda’s an Associate Professor of Law and Director, Intellectual Property and Information Policy Clinic at the Georgetown University Law Center. And she assigns her students to edit Wikipedia as a class assignment, of course.

One such student is Laura Ahmed, who is interested in the intersection of intellectual property and privacy law. Laura, who graduated in spring 2020, was both excited and nervous to tackle a Wikipedia assignment, making improvements to current Supreme Court case Google v. Oracle America, on the copyrightability of APIs and fair use.

“It is almost certainly going to have a substantial impact on software development in the United States, so I think it’s important for the information that is out there about the case to be accurate. That is what made me so nervous about it; it’s such a critical issue and I wanted to be sure that anything I was saying about it was adequately supported by facts,” she says. “Amanda was really great though about helping me get started and build up my confidence to edit the page. When we were editing, COVID-19 had just caused the Supreme Court to postpone several arguments, including this case. So Amanda suggested I start there, and once I’d made that one change it felt easier to go into the substance of the case and change some of the article to better reflect the legal arguments that are being made in the case.”

While Laura found the time constraints of a class assignment challenging, she thought the assignment was critical for both Wikipedia’s readers and her own hands-on learning as a law student.

“This assignment made me really think critically about what I’ve learned in law school and how I can use that knowledge in productive, but unexpected ways,” Laura explains. “When you’re a law student, you tend to forget that a lot of legal concepts aren’t common knowledge. So a lot of cases on Wikipedia really could benefit from a first or second year law student going in and just clarifying what the court actually said or what has actually happened with a case. It’s a nice reminder that we have more to contribute than we think.”

This reflection is exactly what Amanda experienced as a student herself, and is now seeing as an instructor. She reflects back on the American Bar Association’s Model Rules of Professional Conduct: “As a member of a learned profession…a lawyer should further the public’s understanding of and confidence in the rule of law and the justice system because legal institutions in a constitutional democracy depend on popular participation and support to maintain their authority.”

“It’s hard to imagine a more powerful way to further the public’s understanding of law and justice than by empowering law students to improve Wikipedia articles about those laws: it teaches the public, but it also teaches the students the twin skillsets of editing and the value of giving knowledge back to our communities,” Amanda says. “This community isn’t perfect, but I’m so inspired by the many, many volunteers who are striving to make it better. I’m proud to include myself and my students among them, and I’m excited to see where we are another decade out.”

Image: Geraldshields11, CC BY-SA 3.0, via Wikimedia Commons

Arabic and the Web

00:00, Monday, 16 2020 November UTC

I remember a Wikipedia workshop organized by the Institute of Computer Science at the University of Oxford. The question was why, when there are around half a billion Arabic speakers, Arabic content is less than 5%, and of that five percent perhaps only a third is useful. Whatever answers and proposed solutions that question found, the road to supporting more content is still long.

Arabic speakers are peoples who master multilingualism, perhaps unlike American or European peoples, for example. You will always find those who have mastered a second language, such as Algerians who speak French and Egyptians who speak English.

In the history of languages, when the mother tongue comes second or third, we spend time learning a language instead of learning science. Many fall behind in their knowledge if they don't master the language. The rest do not succeed because they are not able to understand the culture of the language.

But in reality, how many people live in Algeria? How many contributors are from Algeria? And how many Algerians add encyclopedic content?

I can't answer here, but I have retrieved the 140-page report of the study in which I shared my thoughts and which was conducted by Oxford - whose excellent analyses I recommend.

In summary: we need to focus on the important objectives to define, organize and direct the work on this topic.

Permanent, adaptation and recurrence. I am optimistic about our future at this time.

(This text was compiled, with light editing, from several replies in a conversation that took place on a social media page.)

Wikicite from the ground up: references

11:23, Sunday, 15 2020 November UTC
When a point is made in Wikipedia, or a statement is made in Wikidata, best practice is to include a reference. The same is true in a scholarly paper, where the references are typically found in a references section.

Wikicite is a project that brings many scholarly papers into Wikidata. As beautiful as it is, it is a top-down process. As an ordinary editor, there is a lot that you can do to enrich the result.

The paper, "Can trophic rewilding reduce the impact of fire in a more flammable world?" has a DOI, and its PDF includes a reference section. It takes a lot of effort to add the authors and the papers it cites to Wikidata, but the visibility of the paper improves and so does the visibility of the papers it cites. The Scholia shows that at this time, this paper is not used as a reference in Wikipedia.

There is now a template that retrieves information from Wikidata for its reference data. It will be great when it is widely adopted because it provides an additional pathway from Wikipedia to the used references and the information relating to the reference.

So what can we do to improve the quality of the data in Wikidata? First, the processes that import the bulk of new data are crucial; they are essential and need to be appreciated as such. The next part is enabling a community to improve the data. A recent paper explained what can be done with a top-down approach. All kinds of decisions were made for us and the result feels like a one-off project.

When ORCID is considered to be our partner, it makes sense to invite people registered at ORCID to contribute to Wikidata. Their papers can be uploaded from ORCID into Wikidata, and they can link their co-authors and references themselves. As they do this while logged into ORCID, we are assured of their personal involvement and can use this as a reference.

The quality of such a reference is better than our current references that came with a link to an "author name string". Who knows whether the disambiguation was correct? When a paper is linked to at least one known ORCID person with public information, we have a link we can verify and consequently it becomes a link we can trust. Once the link with a person with an ORCID identifier is established, we can ask them to acknowledge the changes that happen to their papers. Our quality is enhanced and a sense of community with ORCID is established.

Thanks, GerardM

Wikicite from the ground up: "Trophic rewilding"

11:15, Sunday, 15 2020 November UTC
In nature conservation, trophic rewilding and trophic cascades are important topics. When an animal like the howler monkey is no longer around, it no longer distributes the seeds of trees. The likely effect is that in time plants are no longer part of the ecosystem. Reintroducing a howler monkey restores the relation; it is considered an example of trophic rewilding.

At Wikipedia there is no article about trophic rewilding. As someone famously said, references are the most important part of a Wikipedia article, so let's start by finding references.

There is a longstanding process of importing data about scholarly papers, all kinds of scholarly papers. Some of them have "trophic rewilding" in their title. Trophic rewilding was not known as a subject so it was easy enough to look for "trophic rewilding" and add it as a subject. Slowly but surely the Scholia representation evolves. More papers means more authors and more authors known to have collaborated on multiple publications. More citations are found for these papers and by inference they have a relation to the subject.

The initial set of data is already good enough to get a grasp of the subject, but when you want more, you can use Scholia to look for missing data, such as missing authors. The author disambiguator aids in finding papers for the missing author. With such iterations, the Scholia for trophic rewilding becomes more complete.

Another avenue to improve the coverage of a subject is by adding "cites work" in Wikidata for a paper like this one. Not all cited works are known to Wikidata, but the effect can be impressive. NB: the citations are often found in a PDF and not in the article.
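
For anyone who prefers to prepare such "cites work" statements in bulk, here is a minimal Python sketch that builds tab-separated lines in the QuickStatements V1 style (item, property, value). P2860 is the Wikidata property "cites work". The helper name `cites_work_statements` is hypothetical, and the item IDs are placeholders for illustration only, not the real items for the paper and its references.

```python
CITES_WORK = "P2860"  # Wikidata property "cites work"


def cites_work_statements(citing_qid: str, cited_qids: list[str]) -> str:
    """Build one tab-separated statement line per cited work."""
    lines = []
    seen = set()
    for cited in cited_qids:
        if cited == citing_qid or cited in seen:
            continue  # skip self-citations and duplicates
        seen.add(cited)
        lines.append(f"{citing_qid}\t{CITES_WORK}\t{cited}")
    return "\n".join(lines)


# Placeholder item IDs; replace them with the items you actually want to link.
print(cites_work_statements("Q4115189", ["Q13406268", "Q15397819", "Q13406268"]))
```

Cited works that are not yet known to Wikidata would need items of their own before they can be linked this way.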

Slowly but surely, all the scholarly references to be used for a new article become available. You can use a template in the article to link to the (evolving) Scholia. The best bit is that you can add this template to an existing Wikipedia article as well, providing a scholarly rabbit hole for interested readers.

Thanks, GerardM

weeklyOSM 538

10:27, Sunday, 15 2020 November UTC



Peruvian vaccination bases on an OSM map. 1 | © Ministerio de Salud, Perú | map data © OpenStreetMap contributors |

About us

  • Since issue #537 we have been publishing in the Polish language. We are very happy to welcome our new colleagues and hope that this service in Poland will inspire even more people to contribute to OpenStreetMap when they can read our news in their native language. Witamy, polska drużyno. 😉


  • Pascal Neis has just updated his ‘Unmapped Places’ using OSM data from 30 October 2020.
  • Christina Ludwig, Sascha Fendrich and Alexander Zipf report about their study on ‘Regional variations of context‐based association rules in OpenStreetMap’. This study investigates the variability of association rules extracted from OSM across different geographic regions and their dependence on different context variables, such as the number of OSM mappers.
  • The Spanish Red Cross is organising an online Mapathon (es) on Thursday 19 November from 17:00 to 19:00, to help people in Burundi vulnerable to natural disasters, armed conflict and epidemics.
  • OSM Ireland now offers a Tasking Manager that organises the simultaneous mapping of buildings close to each other, by multiple mappers, by assigning the mappers different squares to work on. Another possibility is to start your own projects anywhere in the world.
  • PoliMappers report on their effort to introduce new and interested people to the world of geospatial collaborative projects in the Politecnico di Milano campus of Piacenza.
  • Brian M. Sperlongano (ZeLonewolf) announced that his proposal boundary=special_economic_zone, to tag an area in which the business and trade laws are different from the rest of the country or state, is now open for voting until 24 November.
  • With the outbreak of COVID-19 in March 2020 the German-speaking Telegram group started voting on and publishing a weekly mapping focus. Along with the German version, they recently made available an English version of their wiki page to inspire more people to contribute to the weekly changing mapping challenges. amenity=car_sharing is the current ‘Weekly Focus’. Please enter your own ideas in the ‘Focus Idea Depot’ and vote on them in the Telegram group ‘OSM de’.


  • YouthMappers has appointed a new cohort of regional ambassadors for 2020–2021.
  • You can now read the answers to questions from the AskMeAnything thread on Reddit with some of the OSM Foundation Board members.

OpenStreetMap Foundation

  • Jonathan Beliën, one of the OSMF Microgrant Program recipients, has submitted his final report for the Road Completion Project. The project focused on software for conflating open data road networks against OpenStreetMap roads in Belgium. The code is open and can be used to achieve similar results anywhere in the world.

Local chapter news

  • Due to the lockdown in place since late October, OSM Ireland has come up with a schedule for November to make focused progress on the osmIRL_buildings project. Each day, there is a different town or group of towns to be mapped using the HOT OSM task manager.
  • mapeadora reported (es) > en in her blog about the official opening of the YouthMappers chapter at the Universidad Autónoma del Estado de México (UAEM), Faculty of Geography, on the Campus Toluca.


  • Code The City’s 21st hack event invites everyone to use mapping, software, open data, or programming tools on Saturday 28 November and Sunday 29 November to create, update, digitise and modernise maps of our locales. The event will take place online and coding is, despite the name, only a small part of what is planned – take a look at the agenda here.
  • The OSM Geography Awareness Week is taking place from 15 to 21 November. ‘OSMGeoWeek’ is a week when teachers, students, community groups, organisations, and map lovers around the world can join together to celebrate geography and OpenStreetMap. Consider planning a mapathon, a webinar to show off your latest project, a career panel to talk about how your organisation uses OSM, or a workshop to teach others what you know. Follow #osmgeoweek, and share your experiences using the #osmgeoweek hashtag. When editing in OpenStreetMap add the #osmgeoweek2020 hashtag to your changesets to be included in the metrics. Add your event, or find one, at!
  • State of the Map Japan 2020 (ja) and FOSS4G Japan 2020 (ja) were jointly held on 7 and 8 November. The summary (ja) of tweets and each video are now online (SotMJ (ja), FOSS4G (ja)).
  • Following the success of the mapping party held in September, the Chair of Urban Structure and Transport Planning of the Technical University in Munich is organising another mapping party to be held on 18 November at 18:15.

Humanitarian OSM

  • The HOT Disaster Services Team offers an update on the disaster responses the team is currently supporting or preparing to support around the world, as well as detailing ways the community can help.
  • Ramani Huria has been training students in Dar es Salaam and equipping them with industrial and technical skills for the 21st century while generating vital high-precision, low-cost data for flood prediction and preparedness. Through the students’ work, Ramani Huria has mapped more than 10,000 flood data points in eight weeks.
  • Crowd2Map, which was created in 2015 by Janet Chapman, has enabled volunteers to map almost five million buildings in Tanzania. This made it possible for welfare workers to locate and save 3000 girls from female genital mutilation and bring them to safe houses, where they can also receive education.


  • [1] The Peruvian Ministry of Health has published (es) > en its vaccination data points on an OSM base map.

Open Data

  • terrestris shows us several different possibilities for visualising the SRTM-30 elevation model.
  • The European Data Portal has 2259 (and counting) open datasets for Romania.


  • Jochen Topf gave an overview of the first ten years of Taginfo, a service developed and maintained by himself, and describes some of the recently added features.
  • Trufi Association has created a new multi-modal bike app. It has information about the cycling road network, the public transport network, at what times bicycles can be carried on it, and how busy the vehicles are, enabling the app to propose combination routes that no other routing engine can. Maps and POIs are based on OpenStreetMap and the routing is based on OpenTripPlanner. The app can be adapted for use in any city.
  • MIERUNE Inc has experimentally released (ja) > en an address and facility search service, ‘MIERUNE Search’ (ja). This service focuses on geo services. OSM is used (ja) > en for some data, such as POIs.
  • Dongha Hwang (LuxuryCoop) has created a Taginfo instance for South Korea.
  • Hartmut (maposmatic) added some new ‘OpenOrienteeringStyles’ to ‘OpenOrienteeringMap’, the easy Street-O map creation tool. You can quickly and easily set a map, add controls, and create a print-ready, high quality vector PDF. If you have any comments, leave them at the end.
  • Researchers from the Federal University of Paraná (UFPR), the University of Maryland (UMD), and the University of Florida published a free e-book on QGIS (pt) in October. QGIS is a free and open-source cross-platform desktop geographic information system that supports viewing, editing, and analysing geospatial data. The book is intended for both students and professionals.


  • Walter Nordmann (wambacher) has started rebuilding the OSM Software Watchlist (a list of the current release status of OSM software products).
  • Marcus Wolschon announced the highlights of the Vespucci 15.1 Beta, released on 28 October.

Did you know …

  • … the osm-in-realtime website by James Westman?
  • … Taginfo now has a chronology tab? You can use it to see how often a tag has been used in the past.
  • … that Geofabrik is hosting Taginfo instances for each country (plus some regions) and continent, even Antarctica?

Other “geo” things

  • I Hate Coordinate Systems! has a very informative overview of common questions and pitfalls encountered when working with coordinate systems.
  • Topi Tjukanov started the #30DayMapChallenge, a daily social mapping project for every day of November 2020.
  • Hoefler&Co has compiled a cartography font collection, with 70 fonts recommended for maps and inspired by mapmaking.
  • Udo Urban, from the German Research Centre for Artificial Intelligence (DFKI), presented (de) > en the project ‘TreeSatAI – Artificial Intelligence with Earth Observation and Multi-Source Geodata’.
  • Mike Darracott reported about Yorkshire Wildlife’s use of MGISS cloud technology to map and help protect habitats.

Upcoming Events

Where | What | When | Country
Online | 2020 Pista ng Mapa | 2020-11-13 to 2020-11-27 | philippines
Cologne Bonn Airport | 133. Bonner OSM-Stammtisch (Online) | 2020-11-17 | germany
Berlin | OSM-Verkehrswende #17 (Online) | 2020-11-17 | germany
Lyon | Rencontre mensuelle (virtuelle) | 2020-11-17 | france
Cologne | Köln Stammtisch ONLINE | 2020-11-18 | germany
Munich | TUM Mapping Party | 2020-11-18 | germany
Online | Missing Maps Mapathon Bratislava #10 | 2020-11-19 | slovakia
Online | FOSS4G SotM Oceania 2020 | 2020-11-20 | oceania
Bremen | Bremer Mappertreffen (Online) | 2020-11-23 | germany
Derby | Derby pub meetup | 2020-11-24 | united kingdom
Salt Lake City / Virtual | OpenStreetMap Utah Map Night | 2020-11-24 | united states
Düsseldorf | Düsseldorfer OSM-Stammtisch [1] | 2020-11-25 | germany
London | Missing Maps London Mapathon | 2020-12-01 | united kingdom
Stuttgart | Stuttgarter Stammtisch (online) | 2020-12-02 | germany
Taipei | OSM x Wikidata #23 | 2020-12-07 | taiwan
Michigan | Michigan Online Meetup | 2020-12-07 | usa

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, Climate_Ben, MatthiasMatthias, Nordpfeil, PierZen, Polyglot, Rogehm, Sammyhawkrad, TheSwavu, derFred, k_zoar, richter_fn.

Tesseract OCR web interface

09:08, Saturday, 14 2020 November UTC

I prepared a web frontend for Tesseract OCR to do optical character recognition for Malayalam. This application uses Tesseract.js, a JavaScript port of Tesseract. You can use images with English or Malayalam content. Use the editor and the spellchecker to proofread the recognized text. Your image does not leave your browser, since the recognition is done in the browser and does not use any remote servers. Source code:

Fixing a bug in Malayalam ya, ra, va sign rendering

11:20, Friday, 13 2020 November UTC

In Malayalam, the Ya, Va and Ra consonant signs pose an interesting problem when they appear together. The Ra sign (്ര, also known as reph) is a prebase sign, meaning it goes to the left side of the consonant or conjunct to which it applies. The Ya sign (്യ) and Va sign (്വ) are postbase signs, meaning they go to the right side of the consonant or conjunct to which they apply. So, if a Ra sign and a Ya sign both follow a consonant or conjunct, the Ra sign goes to the left and the Ya sign remains on the right.
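
To see what such a cluster looks like in stored text, here is a short Python sketch that builds one and lists its code points. Each of these signs is encoded as the virama (U+0D4D) followed by the consonant letter. The base consonant ക (KA) and the typing order, Ra sign before Ya sign, are assumptions chosen for illustration. The left or right placement is decided later by the font and shaping engine, not by the stored order.

```python
import unicodedata

VIRAMA = "\u0D4D"            # MALAYALAM SIGN VIRAMA
RA_SIGN = VIRAMA + "\u0D30"  # ്ര, drawn to the left of the base (prebase)
YA_SIGN = VIRAMA + "\u0D2F"  # ്യ, drawn to the right of the base (postbase)
VA_SIGN = VIRAMA + "\u0D35"  # ്വ, drawn to the right of the base (postbase)

KA = "\u0D15"  # example base consonant ക (an arbitrary choice)

# Stored (logical) order: base consonant, then Ra sign, then Ya sign.
cluster = KA + RA_SIGN + YA_SIGN

for ch in cluster:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print(cluster)
```

How the final line is drawn depends on the installed font, which is where a rendering bug like this one would show up.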

Open Practice in Practice

18:33, Thursday, 12 2020 November UTC

Last week I had the pleasure of running a workshop on open practice with Catherine Cronin as part of City University of London’s online MSc in Digital Literacies and Open Practice, run by the fabulous Jane Secker. Both Catherine and I have run guest webinars for this course for the last two years, so this year we decided to collaborate and run a session together. Catherine has had a huge influence on shaping my own open practice, so it was really great to have an opportunity to work together. We decided from the outset that we wanted to practice what we preach, so we designed a session that would give participants plenty of opportunity to interact with us and with each other, and to choose the topics the workshop focused on.

We began with a couple of definitions of open practice, emphasising that there is no one hard and fast definition and that open practice is highly contextual and continually negotiated. We then asked participants to suggest what open practice meant to them by writing on a shared slide. We went on to highlight some examples of open responses to the COVID-19 pandemic, including the UNESCO Call for Joint Action to support learning and knowledge sharing through open educational resources, Creative Commons Open COVID Pledge, Helen Beetham and ALT’s Open COVID Pledge for Education, and the University of Edinburgh’s COVID-19 Critical Care MOOC.

We then gave participants an opportunity to choose what they wanted us to focus on from a list of four topics: 

  1. OEP to Build Community – which included the examples of Femedtech and Equity Unbound.
  2. Open Pedagogy –  including All Aboard Digital Skills in HE, the National Forum Open Licensing Toolkit, Open Pedagogy Notebook, and University of Windsor Tool Parade
  3. Open Practice for Authentic Assessment – covering Wikimedia in Education and Open Assessment Practices.
  4. Open Practice and Policy – with examples of open policies for learning and teaching from the University of Edinburgh. 

For the last quarter of the workshop we divided participants into small groups and invited them to discuss:

  • What OEP are you developing and learning most about right now?
  • What OEP would you like to develop further?

The groups then came back together to feed back and share their discussions.

Finally, to draw the workshop to a close, Catherine ended with a quote from Rebecca Solnit, which means a lot to both of us, and which was particularly significant for the day we ran the workshop, 3rd November, the day of the US elections.

Rebecca Solnit quote

Slides from the workshop are available under open licence for anyone to reuse and a recording of our session is also available:  Watch recording | View slides.

10 years of teaching with Wikipedia: Jonathan Obar

17:34, Thursday, 12 2020 November UTC

This fall, we’re celebrating the 10th anniversary of the Wikipedia Student Program with a series of blog posts telling the story of the program in the United States and Canada.

Jonathan Obar was teaching at Michigan State University ten years ago when he heard some representatives from the Wikimedia Foundation would be visiting. As the governance of social media was central to Jonathan’s research and teaching, he looked forward to the meeting.

“To be honest, I was highly critical of Wikipedia at the time, assuming incorrectly that Wikipedia was mainly a problematic information resource with few benefits beyond convenience,” he admits. “How my perspective changed during that meeting and in the months that followed. I was taught convincingly the distinction between Wikipedia as a tool for research, and Wikipedia as a tool for teaching. Clearly much of the controversy has always been, and remains, about the former. More to the moment, was the realization about the possibilities of the latter. Banning Wikipedia is counter-productive if teaching about the internet is the plan. The benefits of active, experiential learning via Web 2.0 are as convincing now as they were then.”

Jonathan should know: He joined the pilot program of what’s now known as the Wikipedia Student Program, and ten years later, he’s still actively teaching with Wikipedia. Jonathan incorporated Wikipedia assignments into his classes at Michigan State, the University of Toronto, the University of Ontario Institute of Technology, and now at York University, where he’s been since 2016. Not only has Jonathan taught with Wikipedia himself, he also spearheaded efforts to expand the program within Canada.

“The opportunity to work with Wikimedia and now Wiki Education continues to be one of the more meaningful academic experiences I’ve been fortunate enough to encounter these last ten years,” he says. “I’ve connected more than 15 Communication Studies courses to the Education Program, and in each course I’ve worked with students eager to learn about Wikipedia, happy when they learn how to edit, and thrilled when their work contributes to the global internet. As a Canadian recruiter for the Education Program I had the privilege to work with more than 35 different classes operating across Canada, meeting and learning with different instructors, while also sharing a fascination with Wikipedia.”

As an early instructor in the program, Jonathan experienced the evolution of our support resources, from the original patchwork wiki pages to the now seamless Dashboard platform with built-in training modules. He appreciates how much easier it has become to teach with Wikipedia in the 10 years he’s been doing it. He notes that the training he received as an early instructor a decade ago talked about source triangulation; now, the information literacy environment online requires these skills more than ever.

“Students consistently emphasize how Wikipedia assignments help them develop information and digital literacies, which they view as essential to developing their knowledge of the internet,” Jonathan says. “The students are correct as learning about Wikipedia and its social network helps to address many disinformation and misinformation challenges.”

Professor Jonathan Obar, at left, with student Andrew Hatelt and Writing Prize Coordinator Jon Sufrin of York University.

In 10 years, many moments stand out for Jonathan, particularly in the support he’s received and interactions he’s had with Wikipedia’s volunteer community. But he points to one student’s work as being a particular favorite: A York University student in his senior undergraduate seminar created the article on the “Digital Divide in Canada”, including passing through the “Did You Know” process to land on Wikipedia’s main page. York University also recognized the student’s work, giving him the senior undergraduate writing prize, over more than 20,000 other students across 20 departments and programs in the Faculty.

“The recognition by the university emphasizes not only that the community is starting to acknowledge the value of Wikipedia, but perhaps also that the student’s work, supported by the program, helped inform that perspective,” he says.

Jonathan is teaching two more classes this year as part of our program, one on Fake News, Fact-Finding, and the Future of Journalism and one on Information and Technology.

“After attending that meeting all those years ago, I was convinced that Wikipedia was one of the most effective tools for eLearning available (and it remains that way),” he says. “I hope to continue teaching with Wikipedia, and with the Wikipedia Student Program, for many years to come.”

Hero image credit: Alin (Public Policy), CC BY-SA 3.0, via Wikimedia Commons; In-text image credit: Jon Sufrin, on behalf of Faculty of LA&PS, York University, CC BY-SA 4.0, via Wikimedia Commons

The Listeria Evolution

09:40, Thursday, 12 2020 November UTC

My Listeria tool has been around for years now, and is used on over 72K pages across 80 wikis in the Wikimediaverse. And while it still works in principle, it has some issues, and, being a single PHP script, it is not exactly easy to adapt to new requirements.

Long story short, I rewrote the thing in Rust. The PHP-based bot has been deactivated, and all edits by ListeriaBot since 2020-11-12 (marked as “V2”, example) are made by the new version.

I tried to keep the output as compatible with the previous version as possible, but some minute changes are to be expected, so there should be a one-time “wave” of editing by the bot. Once every page has been updated, things should stabilize again.

As best as I can tell, the new version does everything the old one did, but it can do more already, and has some foundations for future expansions:

  • Multiple lists per page (a much requested feature), eliminating the need for subpage transclusion.
  • Auto-linking external IDs (eg VIAF) instead of just showing the value.
  • Multiple list rows per item, depending on the SPARQL (another requested feature). This requires the new one_row_per_item=no parameter.
  • Foundation to use other SPARQL engines, such as the one being prepared for Commons (as there is an OAuth login required for the current test one, I have not completed that yet). This could generate lists for SDC queries.
  • Portability to generic wikibase installations (untested; might require some minor configuration changes). Could even be bundled with Docker, as QuickStatements is now.
  • Foundation to use the Commons Data namespace to store the lists, then display them on a wiki via Lua. This would allow lists to be updated without editing the wikitext of the page, and no part of the list is directly editable by users (thus, no possibility of the bot overwriting human edits, a reason given to disallow Listeria edits in main namespace). The code is actually pretty complete already (including the Lua), but it got bogged down a bit in details of encoding information like sections which is not “native” to tabular data. An example with both wiki and “tabbed” versions is here.
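To illustrate the difference the one_row_per_item parameter makes, here is a minimal Python sketch. It is not Listeria's actual (Rust) code: the function name build_rows and the shape of the input rows (plain dictionaries with an "item" key) are assumptions made purely for illustration.

```python
def build_rows(results, one_row_per_item=True):
    """Turn SPARQL-like result rows into table rows.

    `results` is a list of dicts, each with an "item" key plus other
    columns (a hypothetical input shape, not Listeria's real one).
    """
    if not one_row_per_item:
        # One table row per result row: an item may appear several times.
        return [dict(r) for r in results]

    # Otherwise merge all result rows that share the same item,
    # collecting the distinct values seen for each column.
    merged = {}
    for r in results:
        row = merged.setdefault(r["item"], {"item": r["item"]})
        for key, value in r.items():
            if key == "item":
                continue
            values = row.setdefault(key, [])
            if value not in values:
                values.append(value)
    return list(merged.values())
```

With two result rows for the same item, the default mode yields a single row whose other columns hold lists of values, while one_row_per_item=False keeps both rows separate.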

As always with new code, there will be bugs and unwanted side effects. Please use the issue tracker to log them.

This Month in GLAM: October 2020

22:38, Wednesday, 11 2020 November UTC
  • AfLIA Wikipedia in African Libraries report: Wikipedia in African Libraries Project
  • Brazil report: Abre-te Código hackathon, Wikidata related events and news from our partners
  • Finland report: Postponed Hack4FI GLAM hackathon turned into an online global Hack4OpenGLAM
  • France report: Partnership with BNU Strasbourg
  • Germany report: Coding da Vinci cultural data hackathon heads to Lower Saxony
  • India report: Mapping GLAM in Maharashtra, India
  • Indonesia report: Bulan Sejarah Indonesia 2.0; Structured data edit-a-thon; Proofreading mini contest
  • Netherlands report: National History Month: East to West, Dutch libraries and Wikipedia
  • New Zealand report: West Coast Wikipedian at Large
  • Norway report: The Sámi Languages on wiki
  • Serbia report: Many activities are in our way
  • Sweden report: Librarians learn about Wikidata; More Swedish literature on Wikidata; Online Edit-a-thon Dalarna; Applications to the Swedish Innovation Agency; Kulturhistoria som gymnasiearbete; Librarians and Projekt HBTQI; GLAM Statistical Tool
  • UK report: Enamels of the World
  • USA report: American Archive of Public Broadcasting; Smithsonian Women in Finance Edit-a-thon; Black Lunch Table; San Diego/October 2020; WikiWednesday Salon
  • Calendar: November’s GLAM events

How helping others edit Wikipedia changes lives

17:07, Tuesday, 10 2020 November UTC

This fall, we’re celebrating the 10th anniversary of the Wikipedia Student Program with a series of blog posts telling the story of the program in the United States and Canada.

When we started what is now the Wikipedia Student Program, we wanted to create support for students and instructors participating in the program. An initial plan involved supporting a new volunteer role within Wikipedia: the Campus Ambassador, who would help support participants in-person.

We sought out people who would be newbie-friendly faces on campus, helping students learn the basics of Wikipedia. Paired with a more Wikipedia-experienced Online Ambassador to answer technical questions, many Campus Ambassadors hadn’t edited Wikipedia prior to this role. While we’re no longer using the Ambassador model, we note the role itself had a profound impact on at least two people whose involvement on Wikipedia began as Campus Ambassadors in 2010: Max Klein and PJ Tabit.

Max Klein
Max Klein, today (the hero image on this post is of Max in 2011). Image courtesy Max Klein.

“That was basically the entire jumping off point for my whole career,” Max says. “I’ve made a living out of being knowledgeable about Wikipedia and contributing to the ecosystem, mostly through bots and data projects.”

Max taught a student-led class at the University of California at Berkeley that he and collaborator Matt Senate decided to build out entirely on the Wikipedia project namespace. He also served as an Ambassador for other courses. After graduating from Berkeley in 2012, Max’s first job was as a Wikimedian-in-Residence for OCLC, teaching librarians to contribute to Wikipedia. Then Wikidata became a project.

“Wikidata legitimized and exponentiated the idea that Wikipedia could be about data as well as articles,” Max says. “That is a useful way to get involved if you are more, let’s say, numerically-minded. That allowed me to get involved in a way where I could start immediately with large individual contributions. However today I recognize that the best projects merge all the different perspectives of the users, the aesthetes, the editors, and the programmers.”

He built a bot that contributed bibliographic and autographic data from the Library of Congress to Wikipedia, then helped build the WikiProject Open Access Citation Bot. In 2015, Max piloted the Wikipedia Human Gender Indicators, the first automated documentation of biography-gender representation across all language Wikipedias. He helped create an AI-powered version of HostBot to find the best newcomers. Then he supported the Citizens and TechLab experiment to see if wiki-thanking by other users led editors to contribute more. Today, Max is starting project “Humaniki” to provide data and tools to assist systemic-bias-focused editing.

In other words, Max has done a lot from his initial start as an Ambassador!

“It’s defined my career and values,” he says. “Wikipedia is one of the few remaining sites that hold the promise of what we thought the internet would be at the turn of the millennium. We knew entertainment and commerce would come online, but the promise of libraries and public parks and civic-engagement coming on-line has found less of a foothold. Luckily Wikipedia is still ticking showing what a non-commercial internet could be like. I’m motivated by the feeling of collaborating on public-good, socially important projects with humans all around the world.”

PJ Tabit
PJ Tabit in 2011, at a training for the program.

While Max branched out from his work with the program to other areas of Wikipedia work, PJ has continued to be involved with the educational efforts. He originally got involved when starting graduate school in public policy at George Washington University.

“It seemed like an exciting opportunity to work on something related to what I was studying and involving one of the most visited websites on the internet,” PJ says.

After supporting courses on campus at GW, PJ traveled to India in 2011 to support the Wikimedia Foundation’s efforts to replicate the program there. When a working group was formed to find a new home for the program, PJ volunteered. And when Wiki Education as a new organization was formed, PJ was elected to the board, initially serving as treasurer. Since 2017, PJ has been Wiki Education’s board chair.

“Simply, I think the work is critical,” PJ says. “Wikipedia stands out as a source of reliable factual information on the internet, and Wiki Education, through the Student Program, helps Wikipedia become more representative, accurate, and complete. I am extremely proud of what this organization and program accomplish.”

PJ points to the scale of Wiki Education’s program and impact as a key success marker over the last decade. He noted that when we were first starting out in 2010, we couldn’t have imagined that 20% of English Wikipedia’s new active editors would come from this program.

And his involvement over the last decade has meant a lot to PJ personally as well.

“I have made amazing friends that I likely would never have met if not for Wikipedia,” he says. “My involvement with Wiki Education and the Student Program have also given me an understanding and deep respect for how Wikipedia gets made, which I would not have gained as just a reader of the site.”

Both Max and PJ hope to see a future in which Wikipedia reflects fewer and fewer systemic biases.

“Wiki Education has made tremendous progress toward ensuring Wikipedia is representative, accurate, and complete, but clearly there is much more to do,” PJ says. “I hope that we eventually resolve Wikipedia’s systemic biases and that it truly represents the sum of all human knowledge.”

“I hope that Wikipedia lives for another 20 years, and beyond. But I also hope that Wikipedia can be a platform for change vis-a-vis the problems of gender, economic, racial, and political justice,” Max says. “I think it’s already stepping in this direction with amazing editors who increase its coverage and fight misinformation. Obviously an encyclopedia can only do so much (although it’s quite a lot despite its medium). Still I imagine there is another project beyond Wikipedia, like Wikidata hinted at, that can utilize the pattern of collaboration that’s existed and has been so fruitful. I don’t know what it is yet, I’ve been thinking about it for 10 years, but I believe it’s there in the future.”

Natasha in St Malo earlier this year.

In October we recruited for a role that we have long known will be critical to the sustainability of Wikimedia UK’s vital work. Having a Head of Development and Communications gives us a strategic approach to our public image, fundraising, and external outreach. We wanted the role to be in senior management, leading a new team consisting of Katie Crampton, our Communications and Governance Assistant, and another new role that we’re currently recruiting for, a Fundraising Development Coordinator. Though we had to postpone recruitment for the Head of Development and Communications due to lockdown, we’re pleased to announce that one month ago we found a candidate who we think is the perfect fit: Natasha Iles.

With a background in the corporate world, Natasha took a career change into the Third Sector over ten years ago knowing she wanted to make a broader, more positive impact with her skills. Since Natasha’s first charity role as a sole fundraiser and marketeer, she has developed to lead both fundraising and communications functions as an active member of senior management. Natasha holds a Diploma in Fundraising and is a member of the Chartered Institute of Fundraising.

When asked about her goals while working with us, Natasha outlined her aims for our new Development and Communications team to continue to increase visibility of our amazing programmes and activities across the UK. Natasha will also work to diversify our income streams. Like us, Natasha feels that increasing our profile and the positive impact of our work is vital to ensuring we continue breaking down the barriers to accessing and contributing to free knowledge.

Though she’s only been with us a few weeks, we’ve already seen incredible work from Natasha. To say she hit the ground running is a bit of an understatement! We’re very excited for everything she’s bringing to the team.

Outreachy report #14: October 2020

00:00, Monday, 09 2020 November UTC

Application review

We were able to review all applications by extending the review period. This also led to an unplanned experimentation with the contribution period length that we’re still trying to tune perfectly after the essay questions were implemented back in 2018. There’s a fine line between making it too long for projects that require simpler contributions and too short for those that ask for more complex ones.

Communication and planning

Sage and I had a “decompress” meeting to discuss what went right and wrong during this application/review period and to set up short and long term goals for Outreachy. For the first time in my two years working on Outreachy we were able to build strategies keeping in mind the long term health of the program, and I attribute that in part to the fact Sage and I have been sharing more responsibilities. Taking a lot of weight off Sage’s shoulders and transferring it to mine directly translates into more getting done and expanding the program’s horizons as never before.


I’m resuming my involvement with the development of Outreachy’s website. The project’s documentation has greatly improved since the last time I set up a local environment, as well as the automation of tests and scenarios to explore. I was able to test setting up a new environment in different systems (Fedora 32 and 33, Ubuntu 20.04 and 20.10), writing down every single dependency needed to build the environment and explore it. Thanks to Jamey and Sage upgrading dependencies, I was able to overcome a specific issue I was running into in all systems (failure to compile psycopg2, a known bug with Python 3.8).

I’ve been focusing on understanding flows related to the mentor roles, which leads us to…

Mentor interviews

Sage and I have been discussing improving the mentor documentation for a few months, and one of the best ways to start thinking about that is interviewing new Outreachy mentors to understand how and why they took interest in the program, how the onboarding process in their community went, what issues they’ve run into during the process to become a mentor, and in which ways we can improve our own onboarding.

I sent an email to the mentors mailing list encouraging mentors to contact me for either an asynchronous email interview or a synchronous video or text chat. The response was better than I expected: I have 12 interviews in my schedule in the next 10 days. However, none of our volunteer interviewees are Outreachy alums–I’ll have to send emails to specific mentors to see if we can schedule interviews with those in that group too.


I accepted two invitations for live events this November:

  • LKCAMP, a Linux kernel study group, invited me to participate in LKConf to talk specifically about Outreachy internships from an alum and organizer perspective on November 17th.
  • Casa Hacker invited me to talk about free software as a whole on November 18th – we’ll discuss concepts, ideas, latest events. This is more of a general livestream to help people understand free software communities.

weeklyOSM 537

10:43, Sunday, 08 2020 November UTC



Daily updated corona incidences per county 1 | © | map data © OpenStreetMap contributors | © RKI, DIVI Intensivregister, BKG, LVermGeo Rlp 2020


  • TheFive reported that weeklyOSM often receives messages alerting us to map errors. In a small blog post he points people (de) > en to the map notes system and encourages people to participate.
  • Yuu Hayashi, who published (ja) > en a draft of a scheme for route mapping of Japanese historical paths, is asking how to manage a long-term mapping project with multiple members, and whether there is a form of monitoring that works better than updating mapping progress on OSM Wiki. He also asked if there is a better way to decide on a scheme for mapping a series of features than the Tag proposal process.
  • The user Vollis proposed the tag amenity=chapel_of_rest for a room or building where families and friends can say goodbye to a deceased person before his or her funeral. This proposal is now up for vote until 18 November.
  • Privatemajory, Luke proposed the tag electricity=[grid, generator, yes, no] to indicate the electricity source used in a dwelling, a general building or a settlement. After a short voting period (29 and 30 October) the voting was stopped due to a formal error and the proposal is again in RFC state and open for comments.


  • The MapRoulette team pointed out, on Twitter, that a MapRoulette user box can be added to OpenStreetMap wiki user profile pages.
  • The video ‘4 tools to start with OpenStreetMap‘, by Captain Mustache, is now available under a free Creative Commons BY licence on the PeerTube OpenStreetMap France instance. (fr)
    • In another video, he answers the question, ‘OpenStreetMap? What is it?’ (fr)
  • DeBigCs examined the claim that poorer areas around Dublin are less completely mapped than wealthier ones and the reasons for this.
  • Jennings Anderson writes in his blog about ‘OSMUS Community Chronicles’, exploring the growth and temporal mapping patterns in North America.
  • Nuno Caldeira is committed to the correct attribution of maps based on OSM and he has criticised Mapbox many times about the incorrect attribution function on their map service. This time, he praises Mapbox customer Flickr, which has managed to use correct attribution, even on the smallest maps. So, small size seems to be just an excuse and one can clearly add visible attribution on any map.
  • OpenStreetMap US published its newsletter for November 2020.

OpenStreetMap Foundation

  • You can find the key dates for the upcoming OSMF Annual General Meeting 2020 here.
  • Mikel Maron would like to revitalise diversity and inclusion in the OSM Foundation; in his blog post he calls on all those who have been less represented so far not to be shy but to contact him.

Local chapter news

  • FLOSSK, the Local chapter for OSM Kosovo, signed a Memorandum of Cooperation with the LUMBARDHI Foundation. This cooperation will serve for the exchange of knowledge, capacities, and resources for digitalisation, as well as the provision of materials and publications for free use by the public. FLOSSK and LUMBARDHI will cooperate in the digitisation of Zëri newspaper, TAN newspaper, and completion of the digital archive of Rilindja newspaper, as well as Jeta e Re, Përparimi and Çevren magazines, which will also be public with free access.
  • Maggie Cawley, Martijn van Exel, and Steven Johnson report on the OpenStreetMap US Charter Project Program.
  • The OpenStreetMap France Blog described (fr) > en a cartographic portal initially developed by the OSM Cameroon Association. This interactive visualiser/downloader of OSM data (OSMdata (fr)) allows you to visualise different OSM thematic layers, defined by Jean Louis Zimmermann and grouped into 16 geothematic layers. The open source code for the portal is on Github.


  • As part of National Heritage Week in Ireland, the local OSM community has decided to focus on the historic town of Clonmel. The first step was to quickly map from satellite imagery; in order to supplement this a Mapillary stream was also taken, COVID-19 compliant with mask from inside a car, using a camera attached to the window so that it could capture both sides of the road.

Humanitarian OSM

  • Russell Deffner, from HOT, is asking for assistance in mapping Izmir, Turkey. On 30 October a magnitude 7.0 earthquake struck the region encompassing south/southeast Greece and western Turkey, with the epicentre being the city of Izmir, home to about 4 million residents.


  • Sven Geggus has had trouble with the capacity tag on OpenCampingMap, wrote a blogpost about it, and is trying to engage with the community, on the tagging mailing list, to clarify ‘the meaning of the capacity tag for tourism=camp_site‘.
  • AcquaMAT is a project powered by CleaNAP, based in Naples. It is creating a crowdsourced map of drinking water points scattered in all the cities of Europe, with the aim of promoting the use of public water, thus reducing the purchase of plastic bottles for water.
    The map allows you to geolocate to see if there are points in the immediate vicinity of streets or squares of the city. Help the project by mapping water points that work and those that do not work, through a reporting form on the site.
  • [1] The map of Germany by sven_s8 (de), from the NETGIS (de) > en office in Trier, visualises the incidence of COVID-19, updated daily, as well as the intensive care bed situation (DIVI Intensive Care Register (de)) in districts or independent cities. It uses various open data interfaces and, of course, OpenStreetMap. The OSM data are imported (de) via a map of the Federal Office of Cartography and Geodesy. The application uses the UMN Mapserver and PostgreSQL/PostGIS in the backend.


  • Deutsche Bahn has updated (de) their information portal about active and future construction projects. The start page shows where all of the projects are located on an OSM-based map.


  • The JOSM issue tracker reached ticket #20000. The issue, a bug in the Wikipedia plugin, was fixed a few hours later.
  • sold MAPS.ME to Daegu Limited for 1,557 million RUB (£15.3 million). They had acquired the mobile app and its services in 2014 for 542 million RUB (£5.3 million).
    The app has been installed more than 140 million times and has ten million active users.
  • An updated version of mod_tile, the classic raster tile stack of OpenStreetMap, has been released by Felix Delattre, from the German Research Centre for Geosciences (GFZ). They packaged this software and included it as libapache2-mod-tile and renderd in Debian so that it will automatically be part of upcoming Debian and Ubuntu releases, and they are now asking for help with testing.


  • Quincy Morgan reported the updates to iD in v2.19.4 (#2931).
  • Tobias Zwick compared the download times for StreetComplete before and after he reworked the download to exclusively use the OSM API, instead of individual Overpass queries, in this chart. User mmd commented, on OSM Slack, that a similar reduction in download times might have been achieved through the performance improvements he developed for Overpass a year ago but which still haven’t been merged. The StreetComplete changes have been released in v26.0-beta1.

Did you know …

OSM in the media

  • The Times of India reported that the OSM community in Kerala has created geospatial open data maps of all local government bodies in the state, numbering over 1200.

Other “geo” things

  • The Open Geospatial Consortium (OGC) has adopted a new international standard, opening the way to a common format for cartographic description.
  • If the world were a piano roll, this is what it would sound like.
  • Marios Kyriakou created a YouTube video showing the entire changelog of QGIS 3.16 (Hannover). There is a lot to show in those 12 minutes, so it’s blazingly fast. If you prefer a slower overview you can also watch this screencast in Spanish made by Patricio Soriano from Asociación Geoinnova. In this one, the first 15 minutes are introduction and installation.
  • In Quantarctica, a collection of Antarctic geographical datasets, version 4 is intended to offer expanded theme coverage and newer datasets, with more capabilities. Therefore, help is needed to identify the community’s requirements. The questionnaire takes a maximum of ten minutes to complete and will be very helpful in developing the next version of Quantarctica.

Upcoming Events

Where | What | When | Country
Online | State of the Map Japan 2020 Online | 2020-11-07 | Japan
Taipei | OSM x Wikidata #22 | 2020-11-09 | Taiwan
Salt Lake City / Virtual | OpenStreetMap Utah Map Night | 2020-11-10 | United States
Munich | Münchner Stammtisch | 2020-11-11 | Germany
Zurich | 123. OSM Meetup Zurich | 2020-11-11 | Switzerland
Berlin | 149. Berlin-Brandenburg Stammtisch (Online) | 2020-11-12 | Germany
Online | 2020 Pista ng Mapa | 2020-11-13 to 2020-11-27 | Philippines
Cologne Bonn Airport | 133. Bonner OSM-Stammtisch (Online) | 2020-11-17 | Germany
Berlin | OSM-Verkehrswende #17 (Online) | 2020-11-17 | Germany
Cologne | Köln Stammtisch ONLINE | 2020-11-18 | Germany
Online | FOSS4G SotM Oceania 2020 | 2020-11-20 | Oceania
Derby | Derby pub meetup | 2020-11-24 | United Kingdom
Salt Lake City / Virtual | OpenStreetMap Utah Map Night | 2020-11-24 | United States
Düsseldorf | Düsseldorfer OSM-Stammtisch [2] | 2020-11-25 | Germany
Taipei | OSM x Wikidata #23 | 2020-12-07 | Taiwan

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it, where appropriate.

This weeklyOSM was produced by AnisKoutsi, Joker234, Lejun, MatthiasMatthias, MichaelFS, Nordpfeil, PierZen, Polyglot, Rogehm, TheSwavu, YoViajo, alesarrett, derFred, richter_fn.

Moving Plants

08:13, Friday, 06 2020 November UTC
All humans move plants, most often by accident and sometimes with intent. Humans, unfortunately, are only rarely moved by the sight of exotic plants. 

Unfortunately, the history of plant movements is often difficult to establish. In the past, the only way to tell a plant's homeland was to look for the number of related species in a region to provide clues on their area of origin. This idea was firmly established by Nikolai Vavilov before he was sent off to Siberia, thanks to Stalin's crank-scientist Lysenko, to meet an early death. Today, genetic relatedness of plants can be examined by comparing the similarity of DNA sequences (although this is apparently harder than with animals due to issues with polyploidy). Some recent studies on individual plants and their relatedness have provided insights into human history. A 2015 study that traced the baobabs of India to their geographical origins in East Africa, and a 2011 study on coconuts, are hopefully just the beginnings. These demonstrate ancient human movements which have never received much attention in most standard historical accounts.
Inferred transfer routes for Baobabs - source

Unfortunately there are a lot of older crank ideas that can be difficult for untrained readers to separate. I recently stumbled on a book by Grafton Elliot Smith, a Fullerian professor who succeeded J.B.S. Haldane but descended into crankdom. The book "Elephants and Ethnologists" (1924) can be found online and it is just one among several similar works by Smith. It appears that Smith used a skewed and misapplied cousin of Dollo's Law. According to him, cultural innovations tended to occur only once and were then carried along with human migrations. Smith was subsequently labelled a "hyperdiffusionist", a disparaging term used by ethnologists. When he saw illustrations of Mayan sculpture he envisioned an elephant where others saw at best a stylized tapir. Not only were they elephants, they were Asian elephants, complete with mahouts and Indian-style goads and he saw this as definite evidence for an ancient connection between India and the Americas! An idea that would please some modern-day Indian cranks and zealots.

Smith's idea of the elephant as emphasised by him.
The actual Stela in question
 "Fanciful" is the current consensus view on most of Smith's ideas, but let's get back to plants. 

I happened to visit Chikmagalur recently and revisited the beautiful temples of Belur on the way. The "Archaeological Survey of India-approved" guide at the temple did not flinch when he described an object in the hand of a carved figure as being maize. He said maize was a symbol of prosperity. Now maize is a crop that was imported to India, and by most accounts only after Columbus reached the Americas in 1492 and the Portuguese made sea incursions into India in 1498. In the late 1980s, a Swedish researcher identified similar carvings (actually another one at Somnathpur) from 12th century temples in Karnataka as being maize cobs. It was subsequently debunked by several Indian researchers from IARI and from the University of Agricultural Sciences where I was then studying. An alternate view is that the object is a mukthaphala, an imaginary fruit made up of pearls.
Somnathpur carvings. The figures to the left and right hold the purported cobs in their left hands. (Photo: G41rn8)

The pre-Columbian oceanic trade ideas however do not end with these two cases from India. The third story (and historically the first, from 1879) is that of the sitaphal or custard apple. The founder of the Archaeological Survey of India, Alexander Cunningham, described a fruit in one of the carvings from Bharhut, a fruit that he identified as custard-apple. The custard-apple and its relatives are all from the New World. The Bharhut Stupa is dated to 200 BC and the custard-apple, as quickly pointed out by others, could only have been in India post-1492. The Hobson-Jobson has a long entry on the custard apple that covers the situation well. In 2009, a study raised the possibility of custard apples in ancient India. The ancient carbonized evidence is hard to evaluate unless one has examined all the possible plant seeds and what remains of their microstructure. The researchers however establish a date of about 2000 B.C. for the carbonized remains and attempt to demonstrate that it looks like the seeds of sitaphal. The jury is still out.
The Hobson-Jobson has an interesting entry on the custard-apple
I was quite surprised that there are not many writings that synthesize and comment on the history of these ideas on the Internet and somewhat oddly I found no mention of these three cases in the relevant Wikipedia article (naturally, fixed now with an entire new section) - pre-Columbian trans-oceanic contact theories

There seems to be value in someone putting together a collation of plant introductions to India along with sources, dates and locations of introduction. Some of the old specimens of introduced plants may well be worthy of further study.

Introduction dates
  • Pithecellobium dulce - Portuguese introduction from Mexico to Philippines and India on the way in the 15th or 16th century. The species was described from specimens taken from the Coromandel region (i.e. type locality outside native range) by William Roxburgh.
  • Eucalyptus globulus? - There are some claims that Tipu planted the first of these (See my post on this topic). It appears that the first person to move eucalyptus plants (probably E. globulus) out of Australia was Jacques Labillardière. Labillardière was surprised by the size of the trees in Tasmania. The lowest branches were 60 m above the ground and the trunks were 9 m in diameter (27 m circumference). He saw flowers through a telescope and had some flowering branches shot down with guns! (original source in French) His ship was seized by the British in Java around 1795 and released in 1796. All subsequent movements seem to have been post 1800 (i.e. after Tipu's death). If Tipu Sultan did indeed plant the Eucalyptus here he must have got it via the French through the Labillardière shipment. The Nilgiris were apparently planted up starting with the work of Captain Frederick Cotton (Madras Engineers) at Gayton Park(?)/Woodcote Estate in 1843.
  • Muntingia calabura - when? - I suspect that Tickell's flowerpecker populations boomed after this, possibly with a decline in the Thick-billed flowerpecker.
  • Delonix regia - when?
  • In 1857, Mr New from Kew was made Superintendent of Lalbagh and he introduced in the following years several Australian plants from Kew including Araucaria, Eucalyptus, Grevillea, Dalbergia and Casuarina. Mulberry plant varieties were introduced in 1862 by Signor de Vicchy. The Hebbal Butts plantation was established around 1886 by Cameron along with Mr Rickets, Conservator of Forests, who became Superintendent of Lalbagh after New's death - rain trees, ceara rubber (Manihot glaziovii), and shingle trees(?). Apparently Rickets was also involved in introducing a variety of potato (kidney variety) which got named as "Ricket". - from Krumbiegel's introduction to "Report on the progress of Agriculture in Mysore" (1939) [Hebbal Butts would be the current day Airforce Headquarters]

Further reading
  • Johannessen, Carl L.; Parker, Anne Z. (1989). "Maize ears sculptured in 12th and 13th century A.D. India as indicators of pre-columbian diffusion". Economic Botany 43 (2): 164–180.
  • Payak, M.M.; Sachan, J.K.S. (1993). "Maize ears not sculpted in 13th century Somnathpur temple in India". Economic Botany 47 (2): 202–205.
  • Pokharia, Anil Kumar; Sekar, B.; Pal, Jagannath; Srivastava, Alka (2009). "Possible evidence of pre-Columbian transoceanic voyages based on conventional LSC and AMS 14C dating of associated charcoal and a carbonized seed of custard apple (Annona squamosa L.)". Radiocarbon 51 (3): 923–930.
  • Veena, T.; Sigamani, N. (1991). "Do objects in friezes of Somnathpur temple (1286 AD) in South India represent maize ears?". Current Science 61 (6): 395–397.
  • Rangan, H., & Bell, K. L. (2015). Elusive Traces: Baobabs and the African Diaspora in South Asia. Environment and History, 21(1):103–133. doi:10.3197/096734015x1418317996982 [The authors however make a mistake in using Achaya, K.T. Indian Food (1994) who in turn cites Vishnu-Mittre's faulty paper for the early evidence of Eleusine coracana in India. Vishnu-Mittre himself admitted his error in a paper that re-examined his specimens - see below]
Dubious research sources
  • Singh, Anurudh K. (2016). "Exotic ancient plant introductions: Part of Indian 'Ayurveda' medicinal system". Plant Genetic Resources. 14(4):356–369. doi:10.1017/S1479262116000368. [Among the claims here is that Bixa orellana was introduced prior to 1000 AD, on the basis of Sanskrit names which are assigned to that species. The paper does not indicate the basis or original dated sources. The author works in the "International Society for Noni Science"!]
  • The same author has rehashed this content with several references and published it in no less than the Proceedings of the INSA - Singh, Anurudh Kumar (2017) Ancient Alien Crop Introductions Integral to Indian Agriculture: An Overview. Proceedings of the Indian National Science Academy 83(3). There is a series of cherry-picked references, many of the claims of which were subsequently dismissed by others or remain under serious question. In one case there is a claim for early occurrence of Eleusine coracana in India, to around 1000 BC. The reference cited is in fact a secondary one. The original work was by Vishnu-Mittre; the sample was rechecked by another group of scientists, who clearly showed that it was not even a monocot, and Vishnu-Mittre himself accepted the error. The original paper was Vishnu-Mittre (1968). "Protohistoric records of agriculture in India". Trans. Bose Res. Inst. Calcutta. 31: 87–106. and the re-analysis of the samples can be found in Hilu, K. W.; de Wet, J. M. J.; Harlan, J. R. (1979). "Archaeobotanical Studies of Eleusine coracana ssp. coracana (Finger Millet)". American Journal of Botany. 66 (3):330–333. Clearly INSA does not have great peer review and has gone with argument by claimed authority.
  • PS 2019-August. Singh, Anurudh K. (2018). Early history of crop presence/introduction in India: III. Anacardium occidentale L., Cashew Nut. Asian Agri-History 22(3):197-202. Singh has published another article claiming that cashew was present in ancient India well before the Columbian exchange, with "evidence" from J.L. Sorenson: a sketch purportedly made from a Bharhut stupa balustrade carving, the original of which is not found here, and a carving from Jambukeshwara temple with a "cashew" arising singly and placed atop a stalk that rises from below like a lily! He also claims that some Sanskrit words and translations (from texts/copies of unknown provenance or date) confirm ancient existence. I accidentally asked about whether he had examined his sources carefully and received a rather interesting response which I find very useful as a classic symptom of the problems of science in India. More interestingly, I learned that John L. Sorenson is well known for his affiliation with the Church of Jesus Christ of Latter-day Saints. Apparently part of Mormon foundations is the claim that Mesoamerican cultures were of Semitic origin, and much of the "research" of their followers has attempted to bolster support for this by various means. Below is the evidence that A.K. Singh provides for cashew in India.

Worth examining the motivation of Sorenson through the life of a close associate  -  here

Authorship Highlighting improvements

17:53, Thursday, 05 2020 November UTC

We recently launched an awesome new feature to the Dashboard’s Authorship Highlighting, thanks to volunteer open source developer Bailey McKelway. Bailey is a full-stack developer who recently graduated from Fullstack Academy in New York City, and judging by the sophisticated work he’s done on the Dashboard, he’s got a strong software development career ahead of him. Here’s Bailey to explain his new feature (and he also wrote a technical post about it on his blog). – Sage Ross, Chief Technology Officer

Demonstration of scrolling to the first highlighted contribution by a student
Click the arrow to scroll to an editor’s highlighted contributions.
Demonstration of scrolling back to the top after the last contribution is reached
Once you’ve reached the last contribution, click again to scroll back to the first one.
Demonstration of switching between students
Check for different students’ contributions as you scroll through the page.

So you may have noticed there is a new feature within the Authorship Highlighting view. Now you will be able to scroll to a user’s revisions just by clicking a button.

This makes it much easier to find revisions that students have made to articles. All you have to do is click the arrow next to the user’s name at the bottom and the page will scroll to the user’s revisions!

Clicking the arrow for the first time scrolls the selected student’s first edit to the top of the page. Each further click scrolls to the next revision that is not currently in view.

After you click the arrow at the last revision, the page scrolls back up to the first revision.

If you switch to a different user, then the feature is smart enough to scroll to the new editor’s closest edit below the current edit. If there are no edits below the current edit then it scrolls to the first edit made by the editor.

If you click the arrow when all of the editor’s revisions are already in view, the page will “bump” to signify that there are no other edits to scroll to.
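For readers curious how rules like these fit together, here is a minimal sketch in Python. It is not the Dashboard's actual code (the Dashboard runs in the browser); the function names `next_scroll_target` and `switch_target`, and the idea of modelling each edit by its vertical page offset in pixels, are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def next_scroll_target(
    edit_tops: Sequence[int],
    current: Optional[int],
    view_top: int,
    view_height: int,
) -> Optional[int]:
    """Pick the page offset to scroll to when the arrow is clicked.

    edit_tops: hypothetical vertical offsets of the selected editor's edits.
    current: offset of the edit we last scrolled to, or None before the first click.
    Returns None to signal a "bump" (nothing else to scroll to).
    """
    if not edit_tops:
        return None
    ordered = sorted(edit_tops)
    view_bottom = view_top + view_height

    def in_view(top: int) -> bool:
        return view_top <= top < view_bottom

    # Every edit is already visible: bump instead of scrolling.
    if all(in_view(top) for top in ordered):
        return None
    # First click: go to the editor's first edit.
    if current is None:
        return ordered[0]
    # Otherwise go to the next edit further down that is not in view.
    for top in ordered:
        if top > current and not in_view(top):
            return top
    # Past the last edit: wrap around to the first one.
    return ordered[0]


def switch_target(new_edit_tops: Sequence[int], current: int) -> Optional[int]:
    """After switching editors, pick the new editor's closest edit below
    the current one, falling back to their first edit."""
    if not new_edit_tops:
        return None
    ordered = sorted(new_edit_tops)
    for top in ordered:
        if top > current:
            return top
    return ordered[0]
```

With edits at offsets 100, 900 and 1700 and a 500-pixel viewport, repeated clicks in this sketch visit 100, 900 and 1700 and then wrap back to 100. Switching at offset 900 to an editor whose edits sit at 50 and 1200 lands on 1200.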

Hope you all enjoy the new feature!

– Bailey McKelway

Semantic MediaWiki 3.2.0 released

16:03, Wednesday, 04 2020 November UTC

September 7, 2020

Semantic MediaWiki 3.2.0 (SMW 3.2.0) has been released today as the next release of Semantic MediaWiki.

It is a major release. Please refer to Semantic MediaWiki 3.2.0 for further information.

A Wikipedian six years in the making

17:20, Monday, 02 2020 November UTC

In 2014, I joined Wiki Education as Program Manager for the Wikipedia Student Program. Six years later, I can now proudly call myself a real Wikipedian!

Though I had never edited Wikipedia myself before joining Wiki Education, I believed whole-heartedly in its mission of making knowledge free and accessible to all and was thrilled to be part of a team attempting to bridge the gap between Wikipedia and academia. I had come from academia myself, having completed a Ph.D. in History from UC Berkeley in 2012, and I was excited to bring my own expertise and skills to Wikipedia. Right away, I began learning the ins and outs of editing and had soon racked up edits on talk pages as Helaine (Wiki Ed). I slowly but surely became a member of the Wikipedia community, but still I had not made any content contributions in the article main space. I could speak about notability with the best of them and had shepherded thousands of students and instructors through their Wikipedia assignments, but I had yet to take those first baby steps myself.

Finally, in September of this year, I decided to take that leap. There was no better way to do so than with one of our own Wiki Scholars courses. I enrolled as a student in a course specifically devoted to improving content around COVID-19 led by my wonderful colleague Ian Ramjohn.

I chose to write a topic near and dear to my heart: how the pandemic has affected people with disabilities. I am blind myself, and while I have fared relatively well during this tumultuous period, I wanted to make sure the world had access to information about how COVID has impacted an already vulnerable community.

As User:Hblumen I got to work, scouring the internet for scant information on how the pandemic has affected people with disabilities, and finally encountered both the challenges and the heights new editors face when contributing to Wikipedia for the first time. As a blind editor, in particular, I learned that the VisualEditor is not at all accessible with screen readers and that references are tricky as well. I was glad that I had learned wikicode all those years ago when I joined the team. I also learned that, while I had not contributed article content to Wikipedia, I already knew a great deal and mostly just needed the motivation and confidence boost to make those first edits.

By the end of the course, I was incredibly proud to have written Impact of the COVID-19 pandemic on people with disabilities. I was dismayed but unsurprised to find a paucity of information on the topic, but I’m hopeful that my article sparks others to think about how COVID has affected populations already at high risk for a host of physical, emotional, and socioeconomic disadvantages.

Thank you to Ian and to my fellow Wiki Scholar participants for helping this would-be Wikipedian take those final critical steps. For years I have read comments from students and instructors on the pride and satisfaction that comes with seeing your edits live on Wikipedia, and now I truly understand how gratifying it is to contribute to public knowledge.

Interested in taking a course like the one Helaine took? See our current course offerings.

weeklyOSM 536

12:04, Sunday, 01 2020 November UTC


Wikimap with all geotagged Wikipedia articles 1 | © Louis Jencka, Wikidata, Wikipedia | map data © OpenStreetMap contributors


  • User Darafei updated the Disaster.Ninja tool, which was developed to assist HOT in their activation process, but could also be useful for the general mapping community.
  • Robert Delmenico published a proposal to substitute the existing tags man_made=* with the new tag artificial=* in order to make the language more gender-neutral. The proposal is open for comments.
  • Jeroen Hoek and Supaplex have made a proposal for parking=street_side for tagging areas suitable, or designated for, parking, which are directly adjacent to the carriageway of a road and can be reached directly from the roadway without having to use an access way. The proposal is now open for comments.
  • Brian Sperlongano (User ZeLonewolf) has published a proposal for a new tag boundary=special_economic_zone to map Special Economic Zones. The proposal is now open for comments.
  • Alter Geosystems explains (es) > en how OpenStreetMap data can be enriched with Wikidata knowledge. They also link to the Wikidata Query Service, a facility for running queries.
  • PanierAvide reports (fr) > en about the most successful project of the month of the French OSM community so far, the mapping of defibrillators.


  • Labian Gashi has won the DINAcon 2020 Award in the category ‘Best Newcomer’ for his JOSM plugin ‘NeTEx Converter‘. Stefan Keller, from the HSR Rapperswil/Ostschweizer University of Applied Sciences, as well as specialists from SBB, supervised the project. The ‘NeTEx Converter’ converts OpenStreetMap data into the Network Timetable Exchange (NeTEx) format, (a CEN standard), which is ‘designed for the efficient exchange of complex transport data’. The plugin also checks rudimentary indoor routing within stations.
  • coolmule0 has been mapping since July of last year and has summarised a beginner’s experience of OSM in a blog post. Among other things, they discuss Mapcarta and the wiki article about building=terrace.
  • DeBigC blogged that he discovered a little brother of the infamous Melbourne skyscraper in Dublin and traced it down to a typo.

OpenStreetMap Foundation

  • If you are intending to run for an OSM Foundation seat, don’t forget that the deadline is Saturday 7 November 2020.
  • Rory McCann, member of the OSMF Board of Directors, proposes an amendment to Article 91 of the OSMF Constitution. In future, it should be possible for board committees to include members who are not board members.
  • User Nakaner is proposing a resolution for the upcoming Annual General Meeting of the OSM Foundation on 12 December 2020. About 80 supporters are necessary before 4 November.
  • Some OpenStreetMap Foundation board members will host an Ask me Anything (AMA) on Reddit. All questions can be asked. The AMA will start 9 November at 16:00 CET. Questions can be raised from 2 November in the AMA thread.
  • Between concern and disappointment, Severin Menard outlined his opinion of OSMF’s development since the last elections. He refers to Christoph Hormann’s (Imagico) blog post that we covered earlier.
  • The OSMF-Talk mailing list discussed two proposals from the OSMF board of directors to harden OSMF against hostile takeovers by big/bad companies. Well-known employees of Facebook and Mapbox argued against these proposals.
    • Rory McCann proposes that the OSMF bylaws include a provision that memberships expire, and votes become invalid if a member cannot freely exercise his or her rights or is contractually bound (e.g., with the employer) in the exercise of those rights. The opponents believe that this is practically impossible to prove. Proponents believe that the non-usability of the clause does no harm and that one must assume that companies are evil.
    • Tobias Knerr proposes a member resolution on minimum requirements for new members. In the future, the board should reject membership applications if the interested party has not made significant contributions to OSM. Here, too, there is headwind from the American business environment.


  • The next Geomob will take place online on 17 November 2020. Signing up for an invite (Zoom URL) is necessary.
  • A mixed physical-digital Missing Maps Mapathon is planned (de) > en for 30 November. If the epidemiological situation permits, the physical part will take place in Wabern (Bern, Switzerland) at swisstopo.
  • On Monday and Tuesday (2 and 3 November) the biennial conference GeOnG will take place for the seventh time and participants will join in over 30 live sessions around the topics of technology and information management in the humanitarian and development sector. This year’s theme is ‘People at the heart of Information Management: promoting responsible and inclusive practices’. Check the full agenda here.

Humanitarian OSM

  • Marcel Reinmuth provided a cross-sectional analysis about mapping physical access to health care for older adults in sub-Saharan Africa and the implications for the COVID-19 response.
  • Jikka Defiño reports about the collection of field data for the PhilAWARE disaster risk reduction project and training in the Philippines.
  • The Mapping Power campaign was featured in Mapillary’s Blog, explaining how students across the YouthMappers network are using augmented and volunteer mapping through Mapillary, Map With AI, TeachOSM, and HOT to improve Sierra Leone’s electrical grid and connect rural communities.


  • Diego Alonso explained (es) > en how to download Sentinel images with QGIS.
  • flo2154 provided (de) > en his first MapComplete theme, which displays benches amenity=bench and other elements tagged with bench=yes.
  • derstefan is looking (de) > en for beta testers for the new OpenTopoMap-Garmin maps. The temporary address is
  • cquest presented (fr) a map showing the areas affected by the health curfew in France.

Open Data

  • Russian startup company Geoalert has published Urban Mapping, the first open dataset of automatically traced building footprints covering Russia. To achieve this the company used Mapbox Satellite imagery, which Mapbox has explicitly permitted others to auto-trace using machine learning algorithms. Although Mapbox has quite poor coverage in Russia in terms of image quality and timeliness, for some regions the ‘Urban Mapping’ datasets significantly surpass the current count of OSM buildings. Currently there are three regions available via the links on GitHub: Chechnya, Tyva and Moscow.


  • Simon Legner reports that the Java version of Osmpbf, a library for reading and writing OSM PBF files, is now available from Maven Central.
  • Erick de Oliveira Leal explained how to enable the Strava High Resolution Layer in OpenStreetMap (JOSM or iD). Editor’s note: at present there is no permission to use this Strava layer for OSM, so you run the risk of having your edits removed.
  • Guillaume Rischard, maintainer of the Editor Layer Index, suggests abandoning the ELI and for iD to use the background layer list from JOSM (we have covered previous discussions of this).
  • Sarah Hoffmann, aka lonvia, reports that the download server for Photon now has ready-to-use database dumps for over 200 countries.


  • QGIS 3.16.0 ‘Hannover’ has been released. It brings new options for 3D mapping, mesh generation from other data types, additional spatial analysis tools, symbology and user interface enhancements.

Did you know …

  • … the list of English exonyms for foreign toponyms?
  • [1] … Wikimap, a map showing the location of all geotagged Wikipedia articles?

Other “geo” things

  • Matthias Schwindt, from GPS Radler, presents (de) > en three models of the robust outdoor Garmin Montana 700 series in practical tests and helps you to decide which one is the right one.
  • Google AI recently launched an open-source, browser-based toolset, which was created to enable the virtual exploration of city transitions from 1800 to 2000 in a three-dimensional view.
  • Jonathan Amos, a BBC Science correspondent, reported about Norway’s funding of satellite maps of the world’s tropical forests.
  • Seán Lynch informed us of his decision to make OpenLitterMap available as open source (GPLv3).
  • David Hambling, from BBC Future, poses the question of what the world would do without GPS.
  • The Fraunhofer Institute for Industrial Engineering (FhG-IAO) offers (de) as a result of the Communal Innovation Center (KIC@bw) (de) > en a full-text download of the practice-oriented guideline ‘Communal Data for Future-Oriented Urban Development’, which is intended to provide orientation knowledge and show fields of application, options for action and development possibilities. The data, generated by administrative digitisation or the use of digital offerings in public spaces, can provide municipalities with potential for improving the quality of life, reducing the number of resources used, cutting costs, improving citizen services, or making administrative processes more efficient, and thus making a significant contribution to municipal development.

Upcoming Events

Where | What | When | Country
Bratislava | Meeting Missing Maps CZ & SK [1] | 2020-10-31 | Slovakia
London | Missing Maps London Mapathon | 2020-11-03 | United Kingdom
Stuttgart | Stuttgarter Stammtisch (online) | 2020-11-04 | Germany
Bochum | Bochum OSM-Stammtisch (Online) [2] | 2020-11-05 | Germany
Dresden | Dresdner OSM-Stammtisch | 2020-11-05 | Germany
Online | State of the Map Japan 2020 Online | 2020-11-07 | Japan
Taipei | OSM x Wikidata #22 | 2020-11-09 | Taiwan
Salt Lake City / Virtual | OpenStreetMap Utah Map Night | 2020-11-10 | United States
Munich | Münchner Stammtisch | 2020-11-11 | Germany
Zurich | 123. OSM Meetup Zurich | 2020-11-11 | Switzerland
Berlin | 149. Berlin-Brandenburg Stammtisch (Online) | 2020-11-12 | Germany
Online | 2020 Pista ng Mapa | 2020-11-13 to 2020-11-27 | Philippines
Cologne Bonn Airport | 133. Bonner OSM-Stammtisch (Online) | 2020-11-17 | Germany
Lüneburg | Lüneburger Mappertreffen | 2020-11-17 | Germany
Berlin | OSM-Verkehrswende #17 (Online) | 2020-11-17 | Germany
Cologne | Köln Stammtisch ONLINE | 2020-11-18 | Germany
Online | FOSS4G SotM Oceania 2020 | 2020-11-20 | Oceania

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Lejun, MatthiasMatthias, MichaelFS, Nakaner, Nordpfeil, NunoMASAzevedo, PierZen, Rogehm, TheSwavu, derFred, richter_fn.

English Malayalam Translation using OpusMT

11:40, Sunday, 01 2020 November UTC

SMC has started a machine translation service for English-Malayalam. This system uses Hugging Face transformers with OpusMT language models for translation. OPUS-MT provides pre-trained neural translation models trained on OPUS data. These models can seamlessly run with the OPUS-MT translation servers that can be installed from the OPUS-MT GitHub repository. The translation service is powered by the Marian Neural MT engine. The quality of the machine translation depends on the availability of a parallel corpus.