en.planet.wikimedia

December 14, 2017

Wikimedia Foundation

‘Monumental’ winners from the world’s largest photo contest showcase history and heritage

Janepop Atirattanachai jumped in a car with some friends and road-tripped over 500 km for this second-place photo. When the sky cleared at the perfect moment, Atirattanachai captured a ray of sunshine descending upon the royal pavilion in Thailand’s Khao Sam Roi Yot National Park. They also won ninth place, which can be found farther down in this post.

A timeless moment of an aging man walking through a historic area of Comacchio, Italy, on a foggy and bitterly cold winter day won fifteenth place. Francesco, the photographer and a native of Comacchio, was able to spot this man just early enough to get into position and wait for him to pass.

Long-time volunteer Wikimedian Diego Delso was in Montreal for Wikimania, the major annual conference of the Wikimedia movement, but snuck out of his hotel early one morning to capture the interior of the Notre-Dame Basilica—and eighth place in the contest. “I was actually the first visitor who got into the church that day,” he said. “The church was, for a minute, only for me.”

———

These are three of the fifteen winning photos from this year’s Wiki Loves Monuments, an annual photo competition recognized by the Guinness Book of World Records as the world’s largest photo competition. The contest, now in its sixth year, focuses on “monuments,” which the organizers broadly define as structures recognized by a local authority as being of particular value to cultural heritage.

This year’s rendition lived up to the billing: over 254,000 photos were submitted by just over 10,000 photographers. A plurality of the photos were of Ukrainian subjects, followed by Armenian.

The top images were winnowed down by the federated nature of the contest, as Wiki Loves Monuments is primarily organized on a national level by people just like you. Up to ten winners from each of the fifty-four national competitions were advanced to an international jury, whose results are listed here. Five of those nations were entirely new to the contest, while seven were participating for the seventh time.

For more information on the winning photos and the 2017 competition, go to www.wikilovesmonuments.org. The remaining winners follow below.

———

The Hindu deity Khandoba is worshipped with turmeric, bel fruit-leaves, onions and other vegetables. Here, at a revered temple in Pune, India, followers are showering one another with turmeric powder. This photo from PKharote won first place in the international competition.

Simone Letari was unhappy with the lighting conditions available outside, so they ventured inside Verrucole Castle and discovered this vintage scene, which netted them sixth place.

Long-time architectural photographer Mostafa Meraji submitted this image of Iran’s Tabātabāei House, built in the 1880s for a wealthy carpet trader, and received fourteenth place in return.

“Around 20,000 people were visiting [Bangladesh’s Baitul Mukarram National Mosque], decorated in beautiful teal and gold, for weekly [Friday] prayers,” says amateur photographer Azim Khan Ronnie. “Thousands of people come together [here] to pray over several floors of one of the biggest mosques in the world.”  This presented both opportunity and difficulty; Ronnie needed many attempts to find the right composition. That effort yielded Wiki Loves Monuments’ third-place photo.

Unlike the photographer behind the other shot of Verrucole Castle to place in the international competition, Iris, who took this image, visited the area when the lighting outside was a bit better. The unexpectedly fantastic conditions, in fact, gave Iris lighting that shot out of the sky "like a razor through the heart," as they put it, and carried the image straight into fifth place.

 

Janepop Atirattanachai was the only photographer with more than one image in the top fifteen. Unlike the other one, this photo did not require a road trip. Atirattanachai lives near Wat Benchamabophit, a Buddhist temple in Bangkok, Thailand, and photographs it often. They used that experience to capture the temple, framed by an arched gateway entrance and highlighted by a setting sun.

Brighton, in the United Kingdom, built the massive West Pier in the 1860s, during what its Wikipedia article calls “a boom in pleasure pier building.” It’s rather less pleasurable today, given that it has been derelict since 1975 and in a state of collapse since 2002. Still, that was no obstacle to Matthew Hoser, who used the rusting piers to frame the outermost ruins of what was a concert hall one century ago. “I think what makes my photo stand out is its simplicity,” Hoser said. “The long exposure … allowed me to smooth out the water and get a slightly smoky effect at the shore. This really allowed the strong lines of the derelict pier to stand out. I also like the contrasting textures of the pebbly sand and the soft water, and how a couple of pigeons managed to stay still for the full eight seconds!” The result came in eleventh place.

Roulette and poker tables feature in Martin Kraft‘s twelfth-place photo of the casino inside Wiesbaden, Germany’s Kurhaus. It was actually shot in 2013, right around the time Kraft became part of the Wikimedia movement.

The fourth-place image, from Manadily, shows a staircase in Cairo’s Indian-inspired Baron Empain Palace.

Dmytro Balkhovitin traveled to Mestia with a few friends a while back and had to walk around for quite some time before finding this vantage point of its Svaneti towers, which won thirteenth place. "I like how the Svani towers are in harmony with the mountain peaks," Balkhovitin said, "and [how the] illumination of the towers is combined with clouds at sunset."

Seventh place is an exterior shot of Baitul Mukarram National Mosque from user Jubair1985, with a solitary person reading in between two of the pillars. The building was also featured in third place.

Thomas Adams’ process in taking this tenth-place image of the famed Sydney Opera House, seen from the perhaps equally famous Sydney Harbour Bridge, is a story unto itself. “I had been out all night on my own taking shots of the city when I thought I could get a shot of the Opera House,” Adams told us in comments that have been lightly edited for clarity.

Colourful themed light projections light the opera house’s sails for most of the night, but I already had images of those and wanted something different. I had to wait till 11:30pm for the coloured lights to be switched off, leaving just white light. The image was a 32 second exposure at f11 ISO 100 taken from the middle of the Harbour Bridge … The middle of the bridge was the toughest spot for a long exposure—every time a truck or train went past, the vibrations ruined the shot. Still, it was also the best spot to get a great angle of the house.

To make the situation more interesting, my camera battery was about to die and shot after shot was being ruined by the trucks and trains.  Then a passerby happened to walk past, a visitor to Sydney, and asked for his photo to be taken. I didn’t have the heart to refuse … and got a nice shot of him with the Opera House in the background.

So then it was back to trying to get the shot of the opera house with … a dying battery and getting a clear 30 seconds without any heavy vehicles crossing the bridge. I was wondering if I might have to come back another night to try again, but after a few more attempts I finally got it—with moments to spare before the battery finally died.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

Did you love these photos? Check out last year’s winners for more.

All of the images in this post are freely licensed under Creative Commons’ CC BY-SA 4.0, with the exception of twelfth place (Kurhaus), which is CC BY-SA 3.0. In short, this means that you can use any of the images in this post for any reason, so long as you attribute the photographers and share any remixes under the same license.

by Ed Erhart at December 14, 2017 03:18 PM

Wiki Loves Monuments

The winners of 2017

After another exciting year of Wiki Loves Monuments, the time has come that we share the winners of this year’s international finale. In the 2017 edition, more than 245,000 photos of built heritage sites around the world, from Agra to Zagreb, were submitted to our 54 national competitions. By participating, the more than 10,000 photographers donated their images to be used on Wikipedia and sister projects, giving a beautiful spark to free knowledge. On behalf of all readers of Wikipedia: thank you!

Out of these submissions, 489 were selected for consideration by our international jury. You can explore these diverse images on our main overview page. The jury assessed, considered and ranked these photos based on our usual criteria: usefulness for Wikipedia, technical quality and originality. A report with their full assessment will be published at a later time. Today, we’re proud to share with you their final selection of the 15 winners of Wiki Loves Monuments along with some of the stories of the people behind these photos. We’re especially excited that for several of the winning contributors, this was their first time contributing to a Wikimedia project. We hope to see their continued contributions in the coming years.

If you still long for more heritage stories and photos, take a look at our Instagram or Twitter accounts.

Enjoy!

 

First place. The grand prize goes this year to India! The Khandoba temple in Pune, India is a wonderful sight on any day, but this photo is an amazing combination of activity, color and history. During the Bhandara festival, celebrants shower yellow turmeric powder on the crowd around the temple. The play of colors makes it feel as though you can almost smell the turmeric through the photo. (PKharote, CC BY-SA 4.0)

 

Second place. The second prize is quite literally a hidden gem: a royal pavilion hidden in a cave in the mountains of Thailand. Janepop travelled more than 500 km with his friends to arrive at exactly the right moment and take this great shot in perfect light conditions. (BerryJ, CC BY-SA 4.0)

 

Third place. The Baitul Mukarram Mosque in Dhaka is the 10th largest mosque in the world, and is photographed here during peak hour, the Jummah prayer in the early afternoon. After many photos, Azim managed to find exactly the right angle and moment to show the magnitude of the mosque in action, as well as key architectural elements. (Azim Khan Ronnie, CC BY-SA 4.0)

Fourth place. The spiral stairs in the tower of the Baron Empain Palace (also known as the Hindu palace) in Cairo, Egypt. The palace was inspired by Angkor Wat (Cambodia) and built in the early 20th century out of concrete. The winding staircases can be somewhat hypnotizing, but the distortions at the top will bring you back to reality. (Manadily, CC BY-SA 4.0)

Fifth place. A great use of light emphasizes the beauty of the area surrounding Verrucole Castle in Tuscany, Italy. Iris likes to observe strange phenomena in the sky, and wanted to capture one in the act. (Iris.gonelli, CC BY-SA 4.0)

Sixth place. It’s hard to imagine, but this is the same Verrucole Castle that took fifth place! Simone managed to get into the top ten with an entirely different photo: a rare interior shot of one of the exhibitions in the castle, showing what life was like in medieval times. The jury characterized it as a ‘historical scene where the servants seem to have just left’. (Simone Letari, CC BY-SA 4.0)

Seventh place. A lonely reader in (yes, again!) the Baitul Mukarram National Mosque in Dhaka, Bangladesh gives a peek into a totally different side of the same building. The reader is not just decorative, but provides a helpful measure of scale to the nested arches. (Jubair1985, CC BY-SA 4.0)

Eighth place. Diego Delso has been a long-time contributor to Wikimedia and Wiki Loves Monuments, and managed to blow away jurors once again with his magnificent photograph of the Notre-Dame Basilica in Montreal, Canada. Diego was unhappy with his earlier photo of this basilica, and was more satisfied with this one ten years later. Apparently, so was the jury. (Diego Delso, CC BY-SA 4.0)

Ninth place. A great symmetric photo of Wat Benchamabophit, with only the sun peeking over one of the sides, is the second submission by Janepop to make it into the top fifteen. He had already photographed it many times, as it has great architectural features, and this time it definitely paid off. (BerryJ, CC BY-SA 4.0)

Tenth place. Australia joined Wiki Loves Monuments for the first time this year, and with the Sydney Opera House, the country makes an impressive entry into the top ten. Thomas was already interested in night photography, and waited until the middle of winter, when the Opera House was open after dark, so that the internal light would provide the best opportunity. (Alphacontrol, CC BY-SA 4.0)

Eleventh place. Let’s face it: this derelict pier in Brighton, UK is not exactly a famous beauty. But decaying heritage, too, is a valuable reminder of our past, and this massive pier from the 1860s is a memento of a time when pleasure piers were built in many places. The quiet sea and rusty remains provide some sense of timelessness. (Matthew Hoser, CC BY-SA 4.0)

Twelfth place. This casino hall is located in one of the spas of the spa town of Wiesbaden, Germany. While gambling definitely goes on in this establishment, it looks more like a library. The setting is calm, but with plenty of details to discover. (Martin Kraft, CC BY-SA 3.0)

Thirteenth place. These Svaneti towers in Mestia, Georgia date from the 9th to 12th centuries and are typical of the region. Dmytro had to walk around for quite a while to find exactly the right spot and light conditions to capture them in this composition. (Dmytro Balkhovitin, CC BY-SA 4.0)

Fourteenth place. This image shows an element of ‘Tabātabāei House’ in Kashan, Iran. The house was built in the 1880s for a wealthy carpet trader and is an example of Iranian residential architecture. With thousands of photos donated by Mostafa this year alone, we’re happy to reward this one with fourteenth place. (Mostafameraji, CC BY-SA 4.0)

Fifteenth place. This photo is great at capturing an emotional state of being, and the foggy, cold winter day becomes very tangible as the old man walks down the stairs in the historical center of Comacchio, Italy. (Francesco-1978, CC BY-SA 4.0)

Did you enjoy these images? Continue browsing the winning pictures of all national competitions in 2017!

by Lodewijk at December 14, 2017 03:14 PM

Gerard Meijssen

A purposeful #strategy for #Wikidata

A strategy for Wikidata? Obviously, it is all about having a purpose. It is not about policies, nor about what we need or expect of others; it is about the purpose you, I, and others have for collaborating on an inclusive wiki and data project.

The implications of making the purposes of our community rule supreme are huge. Purpose, like so many other things, can be measured. When people have a purpose for Wikidata and actually use it, their need for quality is self-evident. They will invest their time and effort in fulfilling their purpose. The one question is how to fit in the many purposes that exist for Wikidata.

Take for instance the objective of Lsjbot for a rich Wikipedia in the Cebuano language. He uses data from an external database to create articles. Data from these articles is later imported into Wikidata through the Cebuano Wikipedia. Some see this as controversial because of the need to integrate data that often already exists. The purpose is obvious: rich information in the Cebuano language. The solution is obvious as well: let Lsjbot use the data at Wikidata to generate the information for the Cebuano Wikipedia. GeoNames is happy to collaborate with us on this, so when we care to collaborate and welcome its data at the front door, we can mix'n'match the data into Wikidata, curate it where necessary, and share the improved quality widely, not only on the Cebuano Wikipedia.

The Biodiversity Heritage Library Consortium is working extremely hard to expose its work to the general public. Over a million illustrations found their way to Flickr. Fae imported many of these to Commons, and most if not all of the associated publications can be read on the Internet Archive or on its website. Their content is awesome; check for instance their Twitter account. We can import all the BHL books into Wikidata, and we are importing all associated authors using Mix'n'Match. The images are in Commons, but how is this brought together? How do we add value for the BHL and, as important, for our shared public?

The Internet Archive is a Wikimedia partner. It provides essential services for us with its "Wayback Machine", which is how we can still refer to references that used to be online. Another venture of the Internet Archive is its Open Library. What we already do for the Open Library is link its authors, and by inference its books, to the libraries of the world through VIAF. We could share this information with the Wikipedias so that their readers may find books they can read. (Talk about sharing the sum of all knowledge.)

Both the IA and the BHL want people to read. They (also) provide scientific publications that may be read to prove the points Wikipedia authors make in articles. Both can be big players in strengthening the value of citations in WikiCite. At this time WikiCite's strength is particularly in the biomedical field, and it is already attracting bright people to Wikidata. As data from other fields finds its way in, people like Egon and Siobhan will find their way too. This will make Wikidata even more inclusive.

To make this future work, and to become more inclusive, we should trust people more, particularly when they indicate why they use Wikidata. The Black Lunch Table is a great example. The description at Wikidata says: "visual artists of the African diaspora initiative that includes Wikipedia editathons and outreach". One way of knowing how effective this initiative is, is the history page of its Listeria list. It shows a steady growth in the information added. When you analyse it further, you find artists added and selected for new editathons. Truly a great example of Wikidata having a purpose.

A strategy based on purpose is a strategy based on trust. Not blind trust, but the kind of trust where it is seen that people are committed to improving the quantity, quality, and usefulness of the data they identify with.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at December 14, 2017 11:08 AM

December 13, 2017

Wikimedia Performance Team

The journey to Thumbor, part 1: rationale

We are currently in the final stages of deploying Thumbor to Wikimedia production, where it will generate media thumbnails for all our public wikis. Up until now, MediaWiki was responsible for generating thumbnails.

I started the project of making Thumbor production-ready for Wikimedia a year and a half ago and I'll talk about this journey in a series of blog posts. In this one, I'll explain the rationale behind this project.

Security

The biggest reason to change the status quo is security. Since MediaWiki is quite monolithic, deployments of MediaWiki on our server fleet responsible for generating thumbnails aren't as isolated as they could be from the rest of our infrastructure.

Media formats being a frequent security breach vector, it has always been an objective of ours to isolate thumbnailing more than we currently can with MediaWiki. We run our command-line tools responsible for media conversion inside firejail, but we could do more to fence off thumbnailing from the rest of what we do.

One possibility would have been to rewrite the MediaWiki code responsible for thumbnailing, turning it into a series of PHP libraries, that could then be run without MediaWiki, to perform the thumbnailing work we are currently doing - while untangling the code enough that the thumbnailing servers can be more isolated.

However such a rewrite would be very expensive and when we can afford to, we prefer to use ready-made open source solutions with a community of their own, rather than writing new tools. It seemed to us that media thumbnailing was far from being a MediaWiki-specific problem and there ought to be open source solutions tackling that issue. We undertook a review of the open source landscape for this problem domain and Thumbor emerged as the clear leader in that area.

Maintenance

The MediaWiki code responsible for thumbnailing currently doesn't have any team ownership at the Wikimedia Foundation. It's maintained by volunteers (including some WMF staff acting in a volunteer capacity). However, the number of contributors is very low and technical debt is accumulating.

Thumbor, on the other hand, is a very active open-source project with many contributors. A large company, Globo, where this project originated, dedicates significant resources to it.

In the open source world, joining forces with others pays off, and Thumbor is the perfect example of this. Like other large websites leveraging Thumbor, we've contributed a number of upstream changes.

Maintenance of Wikimedia-specific Thumbor plugins remains, but those represent only a small portion of the code, the lion's share of the functionality being provided by Thumbor.

Service-oriented architecture

For operational purposes, running parts of the wiki workflow as isolated services is always beneficial. It enables us to set up the best fencing possible for security purposes, where Thumbor only has access to what it needs. This limits the amount of damage possible in case of a security vulnerability propagated through media files.

From monitoring, to resource usage control and upstream security updates, running our media thumbnailing as a service has significant operational upsides.

New features

3rd-party open source projects might have features that would have been low priority on our list to implement, or considered too costly to build. Thumbor sports a number of features that MediaWiki currently doesn't have, which might open exciting possibilities in the future, such as feature detection and advanced filters.

At this time, however, we're only aiming to deploy Thumbor to Wikimedia production as a drop-in replacement for MediaWiki thumbnailing, targeting feature parity with the status quo.

Performance

Where does performance fit in all this? For one, Thumbor's clean extension architecture means that the Wikimedia-specific code footprint is small, making improvements to our thumbnailing pipeline a lot easier. Running thumbnailing as a service means that it should be more practical to test alternative thumbnailing software and parameters.

Rendering thumbnails as WebP to user agents that support it is a built-in feature of Thumbor and the most likely first performance project we'll leverage Thumbor for, once Thumbor has proven to handle our production load correctly for some time. This alone should save a significant amount of bandwidth for users whose user agents support WebP. This is the sort of high-impact performance change to our images that Thumbor will make a lot easier to achieve.
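
Serving WebP only to user agents that support it typically keys off the request's Accept header. As a hedged illustration of the idea (not Thumbor's actual code; the function name and threshold behavior are made up for this sketch):

```python
def pick_thumbnail_format(accept_header, default_format="jpeg"):
    """Return 'webp' when the client advertises support, else the default.

    Simplified sketch: real content negotiation also weighs q-values,
    and the cache in front must vary its key on the Accept header so
    WebP and JPEG responses are stored separately.
    """
    accepted = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "image/webp" in accepted:
        return "webp"
    return default_format

# A Chrome-style Accept header negotiates to WebP:
print(pick_thumbnail_format("image/webp,image/apng,*/*;q=0.8"))  # webp
# A client that doesn't list image/webp falls back:
print(pick_thumbnail_format("image/png,image/*;q=0.9"))          # jpeg
```

The important operational detail hinted at in the docstring is the `Vary: Accept` response header: without it, a cache like Varnish could serve a WebP thumbnail to a client that cannot decode it.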

Conclusion

Those many factors contributed to us betting on Thumbor. Soon it will be put to the test of Wikimedia production where not only the scale of our traffic but also the huge diversity of media files we host make thumbnailing a challenge.

In the next blog post, I'll describe the architecture of our production thumbnailing pipeline in detail and where Thumbor fits into it.

by Gilles (Gilles Dubuc) at December 13, 2017 04:14 PM

The journey to Thumbor, part 2: thumbnailing architecture

Thumbor has now been serving all public thumbnail traffic for Wikimedia production since late June 2017.

In a previous blog post I explained the rationale behind that project. To understand why Thumbor is a good fit, it's important to understand where it fits in our overall thumbnailing architecture. A lot of historical constraints come into play, and Thumbor had to be adapted to meet those needs.

The stack

Like everything we serve to readers, thumbnails are heavily cached. Unlike with wiki pages, in fact, there is no caching distinction between readers and editors for thumbnails. Our edge is Nginx providing SSL termination, behind which we find Varnish clusters (both frontends and backends), which talk to OpenStack Swift - responsible for storing media originals as well as thumbnails - and finally Swift talks to Thumbor (previously MediaWiki).

The request lifecycle

Nginx concerns itself with SSL and HTTP/2, because the Varnish project decided that HTTP/2 support fell outside of Varnish's concerns.

Varnish concerns itself with having a very high cache hit rate for existing thumbnails. When a thumbnail isn't found in Varnish, either it has never been requested before, or it fell out of cache for not being requested frequently enough.

Swift concerns itself with long-term storage. We have a historical policy - which is in the process of being reassessed - of storing all thumbnails long-term. This means that when a thumbnail isn't in Varnish, there's a high likelihood that it's found in Swift, which is why Swift is first in line behind Varnish. When it receives a request for a missing thumbnail from Varnish, the Swift proxy first checks whether Swift has a copy of that thumbnail. If not, it forwards the request to Thumbor.

Thumbor concerns itself with generating thumbnails from original media. When it receives a request from Swift, it requests the corresponding original media from Swift, generates the required thumbnail from that original and returns it. This response is sent back up the call chain, all the way to the client, through Swift and Varnish. After that response is sent, Thumbor saves that thumbnail in Swift. Varnish, as it sees the response go through, keeps a copy as well.
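
The layered lookup described above can be sketched in a few lines. Everything here is a stand-in (dict-like caches and a `render` callable), not the real Varnish, Swift, or Thumbor APIs:

```python
def serve_thumbnail(path, varnish, swift, thumbor):
    """Sketch of the request lifecycle: edge cache first, then long-term
    store, then render on demand. `varnish` and `swift` are stand-ins
    with get/put methods; `thumbor` has a render method."""
    thumb = varnish.get(path)            # edge cache: hottest thumbnails
    if thumb is not None:
        return thumb
    thumb = swift.get(path)              # long-term store of rendered thumbnails
    if thumb is None:
        thumb = thumbor.render(path)     # generate from the original media
        swift.put(path, thumb)           # (in production this save happens
                                         # after the response is sent)
    varnish.put(path, thumb)             # edge keeps a copy as it passes through
    return thumb
```

A second request for the same path then hits Varnish, and a request that fell out of Varnish but not Swift never reaches Thumbor, which matches the ordering rationale given above.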

What's out of scope

Noticeably absent from the above are uploading, extracting metadata from the original media, and so on. All of these are still MediaWiki concerns at upload time. Thumbor doesn't try to handle all things media; it is solely a thumbnailing engine. The concern of uploading, parsing and storing the original media is separate. In fact, Thumbor goes as far as trying to fetch as little data about the original from Swift as possible, seeking data transfer efficiency. For example, we have a custom loader for videos that leverages FFmpeg's support for range requests, only fetching the frames it needs over the network, rather than the whole video.
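
To illustrate the range-request idea, here is a small hypothetical helper pair for building a Range header and parsing the Content-Range answer. This is not the actual loader code, just the HTTP mechanics it relies on:

```python
def range_header(offset, length):
    """Build a Range header requesting `length` bytes starting at `offset`.
    Byte ranges in HTTP are inclusive on both ends."""
    return {"Range": "bytes=%d-%d" % (offset, offset + length - 1)}

def parse_content_range(value):
    """Parse a Content-Range response header such as
    'bytes 0-1023/146515' into (start, end, total_size)."""
    unit, _, spec = value.partition(" ")
    if unit != "bytes":
        raise ValueError("unsupported range unit: %r" % unit)
    rng, _, total = spec.partition("/")
    start, _, end = rng.partition("-")
    return int(start), int(end), int(total)
```

A range-aware loader issues requests like this for just the byte windows containing the frames it needs, instead of downloading the entire video before extracting a single still.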

What we needed to add

We wanted a thumbnailing service that was "dumb", i.e. didn't concern itself with more than thumbnailing. Thumbor definitely provided that, but was too simple for our existing needs, which is why we had to write a number of plugins for it, to add the following features:

  • New media formats (XCF, DJVU, PDF, WEBM, etc.)
  • Smarter handling of giant originals (>1GB) to save memory
  • The ability to run multiple format engines at once
  • Support for multipage media
  • Handling the Wikimedia thumbnail URL format
  • Loading originals from Swift
  • Loading videos efficiently with range requests
  • Saving thumbnails in Swift
  • Various forms of throttling
  • Live production debugging with Manhole
  • Sending logs to ELK
  • Wikimedia-specific filters/settings, such as conditional sharpening of JPGs
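
As an illustration of what the URL-handling plugin has to do, a thumbnail URL of the familiar shape can be parsed as below. The pattern is a simplification for this sketch, not the production one: the real format has more variants, such as page and language parameters for PDFs and DjVu files.

```python
import re

# Illustrative pattern for URLs like
# /wikipedia/commons/thumb/a/ab/Example.jpg/320px-Example.jpg
THUMB_RE = re.compile(
    r"^/(?P<project>[^/]+)/(?P<site>[^/]+)/thumb/"
    r"(?P<shard1>[0-9a-f])/(?P<shard2>[0-9a-f]{2})/"
    r"(?P<filename>[^/]+)/(?P<width>\d+)px-(?P<thumbname>[^/]+)$"
)

def parse_thumb_url(path):
    """Return the URL's components as a dict, or None if the path
    isn't a thumbnail URL of this simplified shape."""
    m = THUMB_RE.match(path)
    if m is None:
        return None
    parts = m.groupdict()
    parts["width"] = int(parts["width"])
    return parts
```

From these components a plugin can locate the original (`filename` under its hash shards) and know what to render (the requested `width`).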

We also changed the images included in the Thumbor project to be respectful of open licenses and wrote Debian packages for all of Thumbor's dependencies and Thumbor itself.

Conclusion

While Thumbor was a good match on the separation of concerns we were looking for, it still required writing many plugins and a lot of extra work to make it a drop-in replacement for MediaWiki's media thumbnailing code. The main reason being that Wikimedia sites support types of media files that the web at large cares less about, like giant TIFFs and PDFs.

In the next blog post, I'll describe the development strategy that led to the successful deployment of Thumbor in production.

by Gilles (Gilles Dubuc) at December 13, 2017 04:14 PM

The journey to Thumbor, part 3: development and deployment strategy

In the last blog post I described where Thumbor fits in our media thumbnailing stack. Introducing Thumbor replaces an existing service, and as such it's important that it doesn't perform worse than its predecessor. We came up with a strategy to reach feature parity and ensure a launch that would be invisible to end users.

Development

In Wikimedia production, Thumbor was due to interact with several services: Varnish, Swift, Nginx, Memcached, Poolcounter. In order to iron out those interactions, it was important to reproduce them locally during development. Which is why I wrote several roles for the official MediaWiki Vagrant machine, with help from @bd808. Those have already been useful to other developers, with several people reaching out to me about the Varnish and Swift Vagrant roles. While at the time it might have seemed like an unnecessary quest (why not develop straight on a production machine?) it was actually a great learning experience to write the extensive Puppet code required to make it work. While it's a separate codebase, subsequent work to port that over to production Puppet was minimal.

This phase actually represented the bulk of the work, reproducing support for all the media formats and special parameters found in MediaWiki thumbnailing. I dedicated a lot of attention to making sure that the images generated by Thumbor were as good as what MediaWiki was outputting for the same original media. In order to do that, I wrote many integration tests using thumbnails from Wikimedia production, which were used as reference output. Those tests are still part of the Thumbor plugins Debian package and ensure that we avoid regressions. They use a DSSIM algorithm to visually compare images and make sure that what Thumbor outputs doesn't visually diverge from the reference thumbnails. We also compare file size to make sure that the new output isn't significantly heavier than the old.
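
The two checks the test suite performs can be sketched as follows. Note that this toy uses a plain mean pixel difference as a stand-in, whereas the real suite relies on a proper DSSIM metric, and both threshold values here are invented for illustration:

```python
def images_similar(ref_pixels, new_pixels, max_mean_diff=0.02):
    """Toy stand-in for a perceptual comparison: mean absolute difference
    over grayscale pixel values in [0, 1]. A DSSIM metric, as used by the
    actual suite, models human perception far better than this."""
    assert len(ref_pixels) == len(new_pixels)
    total = sum(abs(a - b) for a, b in zip(ref_pixels, new_pixels))
    return total / len(ref_pixels) <= max_mean_diff

def size_acceptable(ref_bytes, new_bytes, max_growth=0.10):
    """Flag thumbnails significantly heavier than the reference output."""
    return len(new_bytes) <= len(ref_bytes) * (1 + max_growth)
```

With reference thumbnails from production on one side and freshly generated Thumbor output on the other, a regression shows up either as a perceptual divergence or as an unexpectedly heavier file.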

Packaging

The next big phase of the project was to create a Debian package for our Thumbor code. I had never done that before and it wasn't as difficult as some people make it out to be (I imagine the tooling has gotten significantly better than it used to be), at least for Python packages. However, in order to be able to ship our code as a Debian package, Thumbor itself needed to have a Debian package. Which wasn't the case at the time. Some people had tried with much older versions of Thumbor but never reached the point where it was put in Debian proper. Since that last attempt, Thumbor added a lot of new dependencies that weren't packaged either. @fgiunchedi and I worked on packaging it all and successfully did so. And with the help of Debian developer Marcelo Jorge Vieira, who pushed most of those packages into Debian for us, we crossed the finish line recently and got Thumbor submitted to Debian unstable.

One advantage of doing this is that it makes deployment of updates really straightforward, with the integration test suite I mentioned earlier running in isolation when the Debian package is built. With those Debian packages done, we were ready to run this on production machines.

But the more important advantage is that by having those Debian packages into Debian itself, other people are using the exact same versions of Thumbor's dependencies and Thumbor itself via Debian, thus greatly expanding the exposure of the software we run in production. This increases the likelihood that security issues we might be exposed to are found and fixed.

Beta

Trying to reproduce the production setup locally is always limited. The full complexity of production configuration isn't there, and everything is still running on the same machine. The next step was to convert the Vagrant Puppet code into production Puppet code. Which allowed us to run this on the Beta cluster as a first step, where we could reproduce a setup closer to production with several machines. This was actually an opportunity to improve the Beta cluster to make it have a proper Varnish and Swift setup closer to production than it used to have. Just like the Vagrant improvements, those changes quickly paid off by being useful to others who were working on Beta.

Just like packaging, this new step revealed bugs in the Thumbor plugins Python code that we were able to fix before hitting production.

Pre-production

The Beta wikis only have a small selection of media, and as such we still hadn't been exposed to the variety of content found on production wikis. I was worried that production held media files with special properties that we hadn't run into during the entire development phase, which is why I came up with a plan to dual-serve all production requests to the new production Thumbor machines and compare the output.

This consisted of modifications to the production Swift proxy plugin code we have in place to rewrite Wikimedia URLs. Instead of sending thumbnail requests only to MediaWiki, I modified it to also send the same requests to Thumbor. At first this was done completely blindly: the Swift proxy would send requests to Thumbor and not even wait to see the outcome.
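In essence, the fire-and-forget duplication looks something like the sketch below. This is illustrative only, not the actual Swift proxy plugin code; the hostnames and helper names are made up:

```python
import threading
import urllib.request

def mirror_request(url, timeout=5):
    """Send a request to the shadow backend without blocking the caller."""
    def _fire():
        try:
            urllib.request.urlopen(url, timeout=timeout)
        except Exception:
            pass  # fire-and-forget: errors on the shadow path are ignored
    threading.Thread(target=_fire, daemon=True).start()

def handle_thumbnail(path):
    # Serve the user from the primary backend (MediaWiki)...
    primary_url = "https://mediawiki.example/thumb" + path
    # ...and blindly mirror the same request to the shadow backend (Thumbor),
    # without waiting for its response.
    mirror_request("https://thumbor.example/thumb" + path)
    return primary_url
```

The caller returns immediately with the primary response; the shadow backend sees identical traffic but can never affect users.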

Then I looked at the Thumbor error logs and found several files that were problematic for Thumbor but not for MediaWiki. This allowed us to fix many bugs that we would otherwise have found out about during the actual launch. It was also an opportunity to reproduce and iron out the various throttling mechanisms.

To be more thorough, I made the Swift proxy log the HTTP status codes returned by MediaWiki and Thumbor and produced a diff, looking for files that were problematic for one and not the other. This allowed us to find more bugs on the Thumbor side, as well as a few instances of files that Thumbor could render properly but MediaWiki couldn't!
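Conceptually, the diff boils down to something like this toy sketch, with a made-up log format (this isn't our actual tooling):

```python
def diff_status_logs(lines):
    """Return requests where the two backends disagreed on HTTP status.

    Each line is assumed to be "status_mw status_thumbor request_path".
    """
    mismatches = []
    for line in lines:
        mw, thumbor, path = line.split(maxsplit=2)
        if mw != thumbor:
            mismatches.append((path, mw, thumbor))
    return mismatches

logs = [
    "200 200 /thumb/a.jpg",
    "200 500 /thumb/weird.tiff",  # problematic for Thumbor only
    "500 200 /thumb/odd.xcf",     # Thumbor renders what MediaWiki can't!
]
```

Scanning the mismatch list surfaces exactly the files that behave differently between the two renderers.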

This is also the phase where under the full production load, our Thumbor configuration started showing significant issues around memory consumption and leaks. We were able to fix all those problems in that fire-and-forget dual serving setup, with no impact at all on production traffic. This was an extremely valuable strategy, as we were able to iterate quickly in the same traffic conditions as if the service had actually launched, without any consequences for users.

Production

With Thumbor running smoothly on production machines, successfully rendering a superset of the thumbnails MediaWiki could, it was time to launch. The dual-serving logic in the Swift proxy came in very handy: it became a simple toggle between sending thumbnailing traffic to MediaWiki and sending it to Thumbor. And so we switched, gradually, having more and more wikis' thumbnails rendered by Thumbor over the course of a couple of weeks. The load was handled fine (predictably, since we were handling the same load in the dual-serving mode), and the success rate of requests, based on HTTP status codes, was the same before and after.

However, after some time we started getting reports of issues around EXIF orientation, a feature we had integration tests for. But the tests only covered 180-degree rotation, not 90-degree rotation (doh!). The Swift proxy switch allowed us to quickly send traffic back to MediaWiki, which we did because EXIF orientation is quite a prevalent feature in JPGs. We fixed that one large bug, switched the traffic back to Thumbor, and that was it.
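To see why a 180-degree test case doesn't cover the 90-degree path: a 90-degree rotation swaps an image's width and height, while a 180-degree rotation preserves them, so the two exercise different code. A toy illustration on a nested-list "image" (not Thumbor's actual rotation code):

```python
def rotate_180(img):
    """Rotate a row-major pixel grid by 180 degrees (dimensions unchanged)."""
    return [row[::-1] for row in img[::-1]]

def rotate_90_cw(img):
    """Rotate by 90 degrees clockwise (width and height swap)."""
    return [list(row) for row in zip(*img[::-1])]

img = [[1, 2, 3],
       [4, 5, 6]]  # 2 rows x 3 columns
```

A test suite that only feeds in 180-degree-oriented images never touches the dimension-swapping branch, which is exactly where our bug hid.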

Some minor bugs surfaced later regarding much less common files with special properties; we were able to fix them very quickly, and deploy the fixes safely and easily with the Debian package. But we could have avoided all of those bugs too if we had been more thorough in the dual-serving phase: we were only comparing HTTP status codes between MediaWiki and Thumbor, and rendering a thumbnail successfully doesn't mean that the visual contents are right! The JPG orientation could be wrong, for example. If I had to do it again, I would have run DSSIM visual comparisons on the live dual-served production traffic between the MediaWiki and Thumbor outputs. That would definitely have surfaced the handful of bugs that appeared post-launch.
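Real DSSIM needs a proper image library, but the principle can be sketched with a crude pixel-wise comparison on grayscale matrices. This is illustrative only; the function names and threshold are made up:

```python
def mean_abs_diff(img_a, img_b):
    """Crude stand-in for DSSIM: mean absolute pixel difference (0 = identical)."""
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    return sum(abs(a - b) for a, b in zip(flat_a, flat_b)) / len(flat_a)

def visually_same(img_a, img_b, threshold=1.0):
    # Both renderers returned 200, but do the pixels actually agree?
    # A mis-oriented thumbnail would produce a large difference here
    # even though its HTTP status code looks perfectly healthy.
    return mean_abs_diff(img_a, img_b) <= threshold
```

Running a check like this over dual-served traffic would catch exactly the class of bug, same status code but wrong pixels, that status-code diffing misses.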

Conclusion

All in all, if you do your homework and are very thorough in testing locally and on production traffic, you can achieve a very smooth launch replacing a core part of infrastructure with completely different software. Despite the handful of avoidable bugs that appeared around the launch, the switch to Thumbor went largely unnoticed by users, which was the original intent, as we were looking for feature parity and ease of swapping the new solution in. Thumbor has been happily serving all Wikimedia production thumbnail traffic since June 2017 in a very stable fashion. This concludes our journey to Thumbor :)

by Gilles (Gilles Dubuc) at December 13, 2017 04:14 PM

Wikimedia Foundation

What can we glean from OCLC’s experience with library staff learning Wikipedia?

Photo by Tony Webster, CC BY 2.0

In June 2016, a “Knight News Challenge” award for innovation in the library sector went to OCLC, the global nonprofit library cooperative, to strengthen ties between Wikipedia and libraries. Now a year in, the OCLC Wikipedia + Libraries: Better Together project has had a major impact on Wikipedia literacy among librarians in the United States, having introduced hundreds of staff to Wikipedia’s policies and community practices. Here’s more about this innovative approach to Wikipedia outreach and library work, and what you can learn from it for “GLAM”—galleries, libraries, archives, and museums—education projects in your own communities.

———

At the center of the 18-month project is WebJunction, OCLC's library learning program, which is researching and delivering Wikipedia awareness and educational opportunities, including a nine-week online professional development course for US public library staff. To support the project, OCLC brought on a Wikipedian-in-residence: Monika Sengul-Jones, an advocate for Wikipedia literacy in education and a communication and media studies scholar working full-time with the Seattle-based WebJunction team. She facilitates project research and outreach, including the design and delivery of the course.

With Sengul-Jones’s support, the project has successfully introduced more than six hundred library staff to the inner workings of Wikipedia with webinars and a nine-week live online course that took place during the fall of 2017. In the coming months, the project will continue to support libraries as they put plans of engagement into action, and will be hosting Citations Needed, a free live webinar open to all on January 10, 2018, on the eve of the #1lib1ref campaign.

With the scale and impact of this program in the United States, we asked the Wikimedia Foundation’s Alex Stinson to investigate: what can we glean from OCLC’s approach to library staff learning Wikipedia? His conversation with Sengul-Jones is below.

Monika Sengul-Jones. Photo via Monika Sengul-Jones, CC BY-SA 4.0.

Alex: How does it feel to be done with the Webjunction course? It seems to have gone well: every report I get from you, the team, and the volunteers who help with the course seems to be quite exciting.

Monika: Thank you, Alex! It has been an honor to join the OCLC Wikipedia + Libraries project team to strengthen ties between Wikipedia and libraries. Indeed, the nine-week course just wrapped up and it feels like we’ve accomplished something big—and that it’s not over—I feel momentum. What’s most notable is the palpable excitement and shift in perception among library staff. At enrollment, more than seventy percent of the attendees reported they’d never edited Wikipedia and did not consider the encyclopedia relevant to their library. Amongst the library staff we’ve engaged in this project, we’ve seen a mind shift. Now staff are saying ‘yeah, I now know why I should care, I’m excited and knowledgeable enough to tell my colleagues and patrons why they should care, too.’ What’s also made the recent course experience really special is supporting library staff from many different locations, with 299 enrollees from 45 states and 7 countries. We focused on outreach to public libraries, as in the US 80% of public libraries are small or rural.

Alex: I am incredibly excited that the OCLC WebJunction course focused on working with public libraries. As you probably already know, some of the Wikimedia movement’s most successful at-scale projects have been with public libraries (the work in Catalonia, for example), yet when we put the IFLA Public Library Opportunity Paper together, we had a hard time finding examples beyond a small handful of highly visible “case studies”. Did you discover any exciting approaches to “doing Wikipedia in public libraries” when preparing the project? Do you feel like the class generated any new ideas?

Monika: Yes! Public libraries! So full of ideas!

Let me illustrate some of these ideas with a story from our sixth live session. In a quick in-class activity, participants brainstormed titles for Wikipedia-related events at the library. They came up with a couple dozen creative titles, like:

  • Wikipedia is a Verb and a Noun: Learn How and Why
  • Wiki Your Tree: How to Use Wikipedia for Genealogy
  • Fantastic Facts and Where to Find Them, On and Offline
  • Get Smart: Wikipedia for Research
  • Gnomes vs Elves: Wikipedia Holiday Throwdown

Don’t these sound like fun? The titles capture participants’ interests with Wikipedia, too. Information literacy. Staff awareness of what’s new with Wikipedia. Genealogy. Local history and culture. Wikipedia for research. And to explain the last suggestion, participants took to the title of “gnome” editor, which is what veteran Wikipedians have sometimes called editors who make little improvements.

In terms of examples, the ball is in their court, so to speak. The final assignment was to design a plan of action. Some early adopters have already put plans into action (which I wrote about here). In brief, librarians at Morton-James Public Library in Nebraska City, Nebraska, have begun to use Wikipedia to engender digital literacy skills and research discovery in outreach workshops they run with students. A few libraries have hosted edit-a-thons and panel discussions, and others had conversation starters with their directors and staff. That is just a glimpse of what's to come—I think the GLAM-Wiki community is in for a real treat. In the coming months we'll be supporting and sharing the implementation of these plans.

Alex: There is an odd power dynamic that “doing outreach with libraries” creates, especially when librarians are under increased pressure to expand their services without additional resources. This is especially true if you are not a librarian and you are working for an organization that benefits a lot from libraries (like Wikipedia or OCLC). We both have academic researcher backgrounds that make us familiar with library work, but I always feel a bit on the edge of the library profession during outreach because I have never been “in the trenches”.  How did you navigate these tensions? What did you learn about working with libraries when you are not a librarian?

Monika: Yes, thanks for asking. While I am not a librarian, I joined a project team of library professionals at OCLC. Sharon Streams (@thinktower), the director of WebJunction, and Merrilee Proffitt (@MerrileeIAm), senior program officer and a longtime champion of Wikipedia, designed an incredible project that I was lucky to join last March.

The first months of the project were spent listening and learning. For me, that was by listening to my team and public library staff. The OCLC WebJunction team—which has nearly twenty years of experience in delivering online ongoing professional education to library staff—assured me that if you approach public library staff with a thoughtful request for information, staff are generous with their answers and want to help you learn. Sure enough, I found it easy to connect because library staff were eager to have their voices heard! I did two dozen phone interviews, went to my public library to observe daily life, had virtual chats with libraries across the US, joined public library listservs, viewed recordings of panels and webinars, and read books and articles written by library professionals about their work and the history of public libraries.

This kind of listening was fruitful because once we understood existing perceptions of Wikipedia by library staff, we could then consider why the online encyclopedia might matter to their missions and visions. The project design and delivery ultimately emerged from this research, and so there was a long lead time for course design. As I mentioned earlier, we discovered that more than seventy percent of library staff reported they had never edited, but libraries provide many services and use sophisticated technical tools. I also learned that library staff try out things, but with Wikipedia, many had not realized the changes that Wikipedia has gone through over the past decade. So our program emerged as a cultural introduction to Wikipedia for libraries to help them understand why it might matter to them now.

Alex: I really like that approach: really deeply listening and engaging with the community that you want to connect with. It seems like that research really worked out well: You profiled librarians about how they use Wikipedia in a blog series. What stood out?  Can you share a few of their stories?

Monika: That’s right. The Librarians Who Wikipedia interview series on WebJunction began as a way to start connecting the dots between library staff across the US who were already engaged in Wikipedia, and to show others who were eager to learn how the alignment between libraries and Wikipedia works. It’s been a ‘show, don’t tell’ approach, turning research into meaningful awareness. And it’s continuing; we will be publishing more stories in the months to come.

Each story is eye-opening! I’ll share a few: Susan Barnum, a public service librarian at El Paso Public Library in Texas, has edited Wikipedia to answer patron reference questions. Someone asked for references about the history of Chihuahuita, a border neighborhood in El Paso. She saw there was no Wikipedia article, so she shared the references she compiled by editing Wikipedia. Doing this, she reached more than 1,500 (and counting) additional information seekers who viewed the page. Susan’s done much more on Wikipedia—but this one example encapsulates her visionary way of using Wikipedia to serve her community and beyond.

In Pennsylvania, Allison Frick, a youth services librarian, hosted a one-hour information literacy program highlighting Wikipedia while providing basic internet literacy instruction. In Florida, Paul Flagg, a newly minted MLIS graduate on staff at Tampa-Hillsborough County Public Library, was concerned with misrepresentation and inequalities online and coordinated Wiki-Equality editing events at his library.

Alex: What’s been the response?

Monika: Library staff appreciate hearing from their peers, definitely. We've gotten enthusiastic feedback about the articles, and then about the library staff who made guest presentations during webinars. Our live guest presenters were incredible and I must recognize them, in order of appearance: Bob Kosovsky, János McGhie, Susan Barnum, Kerry Raymond, Jacinta Sutton, Rajene Hardeman, and Sherry Antoine.

Participants also learned from the thoughtfulness and savviness of fifteen Wikimedians who accepted our invitation to join the training program as Wikipedia guides. Allow me to give them a special shout-out for their incredible participation in enabling human-to-human connections, in alphabetical order: Avery Jensen, Alexandre Hocquet, Gamaliel, JacintaJS, Jackie Koerner, Kerry Raymond, Librarygurl, Megalibrarygirl, Megs, Merrilee, PersnicketyPaul, Rachelwex, slowking4, Sodapopinski7, and Vizzylane. Both the guides and the guest presenters engaged in live sessions and course spaces over our nine weeks (and beyond) together. Feedback so far leads me to believe the learning has been multidirectional.

Alex: One of the overwhelmingly common experiences I have when doing outreach with librarians is the absolute delight and surprise that comes from library communities when they realize the values and practices Wikimedians have adopted for shepherding knowledge are very similar to theirs. I’m wondering: What kinds of topics resonated the most with the groups you worked with?

Monika: Yes! This exactly. Library staff have been delighted to discover there's a community behind Wikipedia, and to actually get to know some of its members as presenters and guides. That human-to-human connection is key: it helped bring alive the ideal that Wikimedia is a community committed to expanding free access to knowledge, and from there staff could see how that community aligns with libraries.

Hearing about the editorial processes and policies was interesting for library staff. Many appreciated the spirit of the five pillars.

“The five pillars…. are awesome [and] remind me a lot of what I love about working in the public library,” wrote a librarian from a small library. “Free information, no opinions, suggestions when offered and lots of people who love to help people find information.”

Wikipedians may be too close to the project to realize how many unusual and exciting aspects this otherwise ubiquitous online platform has, which sometimes makes it hard to see the humans who make it work. For me it's been exciting to facilitate library staff's ah-ha moments about the community and the vision of Wikimedia in a structured learning environment.

Alex: What was the most bizarre/interesting/funny thing that happened during the development/deployment of the course?

Monika: You know, librarians are supportive and sharp. They have "superior evaluation skills," as one Wikipedian who participated in the course as a guide observed. And as I mentioned, the favored pillar is the fourth: "Treat others with civility and respect." More of this! Participants asked right away how to tag other editors and let them know they've done something. They really took to "thanking," sharing wiki-love, and appreciating barnstars. The experience with them has been so affirming.

Alex: What areas of our community did you see as “surprises” which we need to communicate better as Wikimedians?

Monika: It shouldn't come as a surprise, in fact, that library staff are smart and accustomed to learning new informational systems. So what might surprise other Wikimedians is that it's not for lack of skills that library staff haven't participated in the Wikipedia project. Rather, somewhere along the line they just never got engaged as Wikipedia grew. So we're giving them the cultural know-how, the reason, and the confidence to trust Wikipedia and believe they can and should participate (be BOLD) by demonstrating what's in it for them and their communities, and by building human-to-human connections. This approach is a win-win for Wikimedia and public libraries.

One of our assignments was to observe the Teahouse. Let me share two comments:

“In following the fourth pillar of Wikipedia, more than one Teahouse Wikipedian conversed with the obviously frustrated new editor in a calm and rational manner. This was impressive and encouraging to me as a new editor.”

“I am most impressed by the time that [Teahouse] editors put into their thoughtful responses, many making sure that the question is properly answered and the solution or information offered is understood. I am starting to believe that Wikipedians are all really reference librarians at heart.”

This is huge! When librarians see themselves in Wikipedians, they have trust in the community, confidence that they can decide for themselves what to do next, and know where to go for help.

That said, in terms of user experience, a few have gotten turned around navigating wiki-markup on Talk and community pages. Let’s implement VE on all Wikimedia pages—not just articles—and make it easier to communicate and share wiki-love.

Alex: What visualization or illustration best represents the work that you have been doing with the course?

Monika: I love the WebJunction live online webinar experience. Picture this: hundreds of people at hundreds of libraries across the US, online at the same time. This is scaled adult learning at its best: it's a classroom, a town hall, and a radio show! Our live sessions (which were recorded and will be shared with course materials on WebJunction under CC BY-SA 4.0 in spring 2018) consist of beautiful slides, PDF learner guides to contextualize the content, animated instructors and guest presenters, a closed captioner transforming voices into text, and an active chat channel for conversation during the presentations; text is flying. It's exciting. We also use annotation tools periodically during each session to enhance interactivity: annotation is the ability to use checkmarks, pointers, and highlighters on slides, on screen and all together.

Image by Monika Sengul-Jones, CC BY-SA 4.0.

Check out this screenshot, taken after participants were assigned to make their first edits: a copy edit and adding a citation. Consider it the equivalent of a group photo; it's neat.

Alex: It seems like the class provided a really powerful and engaging environment for learning about the Wikimedia community. Conversely, what can Wikipedia learn from public libraries?

Monika: Thanks for asking this question. First I want to recognize that the Wikimedia community, and Wikipedia, has already been learning from libraries because librarians are a part of the Wikimedia community! This project is building on the connections and lessons learned from the GLAM-Wiki community, Art + Feminism, AfroCROWD, Wikimedia New York City, Wikimedia D.C., the Wikipedia Library, and the new Wikipedia Library User Group. I’m also building on my own experiences and lessons learned working with the Wiki Education Foundation and feminist academics.

Looking ahead, I think continuing to support these connections between libraries and Wikimedians will help them become sustainable, built on a foundation of trust and affinity. And it's happening! A sign to me of strengthened ties is the fact that Dr. Carla Hayden, Librarian of Congress, and Katherine Maher, executive director of the Wikimedia Foundation, were both keynote speakers at OCLC's Americas Regional Council meeting on Oct. 30-31. Their keynotes were complementary. In terms of Wikimedians learning from libraries, what stood out to me was how young Wikimedia is compared to public libraries, which Katherine Maher described. And I say young even though Wikipedia is one of the early internet projects that has endured, and matured, in the messy rise of a digital networked and commercial economy. Dr. Hayden gave us a longer view of the history of libraries, which dates back hundreds of years. "We are the original search engines," she said. She also pointed to the significance of the position she holds, Librarian of Congress, as an African American woman: "I am a descendant of people who were forbidden to read," she said.

What a tribute Dr. Hayden’s presentation was to the long history of work that libraries have done—and continue to do—to ensure equitable access to information for all in the face of deep systemic injustices to too many lives. I think it’s important for Wikimedians to recognize that the successful, broad institutionalization of public libraries that we have today in the United States—where there are more public libraries than Starbucks or McDonald’s restaurants—is the outcome of sweat, advocacy, and professionalization accomplished by library staff and devoted champions of literacy for democracy. And libraries have had to redress deeply rooted, complicated historic and geographical legacies of violence, oppression and inequity—including inaccessible and segregated libraries—to achieve this.

Redress isn’t finished. Public libraries today are reinventing their professional practices to ensure access for all. Libraries provide many services—social work, for instance—because a substantial portion of the people that public libraries serve in the United States face chronic life insecurities. Public libraries partner with community agencies to address, for example, the nationwide opioid crisis. Libraries train staff on the use of naloxone to reverse overdoses. Libraries are also kitchens, feeding hungry children. I’ve heard of librarians running literacy programs in public housing, at barber shops and farmers markets—radical reinventions of the work of an information professional.

I am in awe of the ingenuity of public libraries. As we, Wikimedians, look to a future where the best of the Wikimedia movement can root more deeply and meaningfully in our networked civic life, let’s listen and learn from public libraries. I’d also invite Wikimedians to consider public libraries as co-collaborators and role models as we look forward to the exciting challenge of realizing the 2030 strategic direction for a future of knowledge as service and knowledge equity.

AS: Is there anything else you’d like to add?

MSJ: View or participate live in a WebJunction webinar! You can experience the excitement of library life with WebJunction. That might sound cheesy, but it’s true. WebJunction is able to bring a large cross-section of library staff from across the country together to learn side-by-side. This is diversity that can be missing at larger professional conferences, which can be expensive or difficult to attend. Webinars are free, accessible, and open to all (browse the webinar calendar here). Wikimedians may find themselves delighted by the ways that libraries thoughtfully address topics that matter to our community, such as civility. WebJunction’s most popular course is “Dealing with Difficult Patrons” (to view this and other free webinars, you will need to create a free WebJunction account first).

Wikimedians interested in learning more from libraries are warmly invited to attend our next Wikipedia webinar, Citations Needed: Build Your Wikipedia Skills While Building the World’s Encyclopedia. This free webinar will build on our existing momentum and support library staff interested in adding citations to Wikipedia; the event will be on January 10, 2018 at 3pm EST,  with Emily Jack from UNC Chapel Hill Libraries joining me as a guest presenter to kick off participation in the Wikipedia Library’s #1lib1ref campaign from Jan. 15 – Feb. 3, 2018.

I also wanted to mention that we share project updates—such as reminders about this webinar—in the Wikipedia + Libraries monthly newsletter, and post in the GLAM-Wiki newsletter too.

A final comment: What has become clear to me in the course of joining this project is that humans have no shortage of information about and of our worlds. But equitable access to information is an enduring problem. As Kathleen de la Peña McCook writes on page 66 of her textbook on public librarianship, equity is both "simple in concept and complex in implementation." By working together, Wikimedians and libraries are stronger and better equipped to address the complexities of access, and motivated by a shared value that "knowledge belongs to all of us."

Thanks again Alex for the interview, the support of GLAM-Wiki, and the opportunity to share more details about this project’s approach to outreach and activities.

Interview by Alex Stinson, Strategist, Community Engagement
Wikimedia Foundation

by Alex Stinson at December 13, 2017 04:01 PM

December 12, 2017

Wikimedia Tech Blog

Gnomes and trolls and hobgoblins (oh my!)—Failed queries and the vicarious fear of missing out

The Thinker by Auguste Rodin. Photo by Tammy Lo, CC BY 2.0.

Many people worry—some a little, some a lot—about how other people find information on Wikipedia and its sister projects. It’s a driving concern for me and other members of the Search Platform team, and also for many community members, like those who create and curate redirects.

But the curse of knowledge makes it difficult for more sophisticated Wikipedians to put themselves in the shoes of a wiki newbie who is confused by the sea of blue links sprinkled with red links, and who doesn't yet fully understand notability, naming conventions, talk pages, or search.

What do such newbies need, how do they look for it, and can we help them find it‽

Unearth and extract!

This vicarious fear of missing out leads many people to an awesome idea; it’s an idea that I myself had shortly after I joined the Foundation; it’s an idea others on my team had had a while before I joined them; it’s an idea that many people have proposed and re-proposed and debated and argued about and not quite come to fisticuffs over:

Let’s mine the most frequent queries that get no results, figure out what users are looking for, and arrange to provide it to them—possibly in the form of new articles or redirects from the failed query terms to appropriate existing articles.

It seems like it should be an easy win: Wikipedia gets a steady stream of useful new information, fewer searches get no results at all, and fewer newbies fail to find what they are looking for. But, as is so common, things are more complicated than they first appear.

Privacy, publicity, and prankery

The most important issue is privacy. People sometimes accidentally reveal private information in their searches. A name, email address, social media account, phone number, physical address, IP address, national ID number, credit card number, love letter, secret recipe, etc., etc.: all can be inadvertently copied from somewhere and pasted into a search box. And, given the millions of users of hundreds of Wikipedias and their sister projects, it not only can happen, it does happen, fairly regularly. So, a big raw dump of search data is out of the question.

Several automated methods of preventing the exposure of personal information have been suggested, and many may even be 99.9% effective. The problem is that a 0.1% failure rate multiplied by millions of cases is still thousands of potential privacy problems, when even one is too many.
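The back-of-envelope math is stark. With illustrative numbers (the corpus size matches the one discussed later in this post):

```python
queries_per_month = 8_600_000   # roughly one month of failed queries
filter_failure_rate = 0.001     # a hypothetical "99.9% effective" filter
leaked = queries_per_month * filter_failure_rate
# Even an excellent filter would let thousands of queries through,
# and any one of them could contain personal information.
```

That is thousands of potential privacy incidents per month from a filter that sounds nearly perfect on paper.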

It is possible to use patterns to identify likely personal information like email addresses, physical addresses, and phone numbers, but errors, unexpected formats, and unknown types of data mean there is always something that can slip through.
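A pattern-based filter might look like this minimal sketch. The patterns are illustrative, and the last usage example below shows the kind of obfuscated variant that slips right through:

```python
import re

# Two illustrative patterns; a real filter would need many more,
# and unexpected formats still defeat it.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),            # email addresses
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),  # US-style phone numbers
]

def looks_private(query):
    """True if the query matches any known personal-information pattern."""
    return any(p.search(query) for p in PII_PATTERNS)
```

`looks_private("alice at example dot com")` returns False even though a human instantly recognizes it as an email address, which is exactly the failure mode described above.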

Another proposed approach is to require a minimum number of distinct searchers (i.e., distinct IP addresses). After all, if a hundred people have searched for the same thing, it must actually be a thing, right? But a well-placed link on social media (say, in a blog post) can easily generate hundreds or thousands of queries, because some people will click on almost anything.
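The threshold itself is trivial to implement; the hard part is that distinct IPs don't guarantee genuinely independent searchers. A minimal sketch (names and numbers illustrative):

```python
from collections import defaultdict

def popular_failed_queries(log, min_distinct_ips=100):
    """log: (query, ip) pairs for searches that returned no results.

    Returns the queries issued by at least min_distinct_ips distinct IPs.
    """
    ips_per_query = defaultdict(set)
    for query, ip in log:
        ips_per_query[query].add(ip)
    return {q for q, ips in ips_per_query.items()
            if len(ips) >= min_distinct_ips}
```

Counting distinct IPs filters out a single prankster hammering the search box, but a viral link that sends a hundred real readers to a junk query sails straight past this check.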

Since Wikipedia is such a high-profile website, it's easy to imagine people trying to game such a system, whether for the fleeting fame or just "for the lulz". Aspiring celebrities could easily mobilize a relatively small number of fans to propel their brand to the top of such a list. And you never know what will strike the internet's fickle fancy, which is how we got Boaty McBoatface. Roving bands of internet trolls have famously done much worse in online polls for Mountain Dew in 2012, the Time 100 list in 2009, and countless others.[1]

Diamond in the rough or needle in a haystack?

Thus, while the expected return on mining the most frequent failed queries is often high, the potential risk is also large. Last summer I decided to do a little data spelunking to determine whether not pursuing such prospecting was foolishly abandoning a mountain of hidden gems or wisely avoiding sifting through so much digital dung.

Photo by Sad loser, CC BY-SA 4.0.

I collected a month’s worth of failed queries from English Wikipedia in May 2016 and tried to filter probable bots and other likely outliers. The result was a corpus of about 8.6 million queries, with 7.4 million unique queries. I carefully reviewed the top 100 most frequent queries, and skimmed the top 1,000.

What I found, mostly, was porn.

The most common query is the name of a porn site. It always shows up in my data when I look at poorly performing queries. In this case, I found five variants of it in the top 100, and they account for about 29% of the raw query count for the top 100.

Other common categories for these searches are other websites, internet memes, TV shows and movies, internet personalities, porn stars, people mentioned in recent news, historical figures, politicians, etc.

I don’t have any opinions on the notability of any particular person or website, but while some of these are clearly popular (based on searches), they have been deemed not notable by the English Wikipedia community. The most frequent porn site, the two most frequent internet personalities, a couple of other sites, and a couple of the internet memes show up in the deletion logs as having been created—some of them multiple times—and deleted. Two articles (with different capitalization) for the most frequent porn site were deleted in 2007, 2009, 2011, 2015, and 2016.

There were also a few typos, but almost all of them were corrected by the completion suggester (suggestions made while you type) and the “did you mean” spelling correction (suggestions made after you submit your query).

Small value, diminishing returns

Over an entire month, only 281 distinct queries passed the proposed 100-unique-IPs threshold, and the 1000th most-frequent query was only searched for 48 times.

The top 100 most frequent queries accounted for only 0.62% of the 8.7M queries, and the top 1,000 accounted for only 1.45%. The top 25,000 most frequent queries didn't quite account for 5% of all the queries, and the 25,000th had a frequency of 7. The long tail is very long.
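These shares are simple to compute from per-query counts; a hypothetical sketch of the arithmetic:

```python
def top_n_share(frequencies, n):
    """Fraction of all query volume accounted for by the n most
    frequent distinct queries, given a list of per-query counts."""
    freqs = sorted(frequencies, reverse=True)
    return sum(freqs[:n]) / sum(freqs)
```

With a heavily long-tailed distribution like this one, even very large n captures only a sliver of total volume — which is what makes mining the head of the list so unrewarding.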

In the top 1000, there were at least 10 obvious addresses, a phone number, an ISBN number, an obfuscated email address, a Twitter account, and two instances of a person + city—so aggregating over longer time periods than a month does increase the likelihood of personal data making it past an IP-count filter, validating our privacy concerns.

I gathered the same data for the following month, June 2016, and found it to be very similar. The top 10 most frequent queries were the same as in May, and 71 of the top 100 were the same, indicating that there isn’t a lot of new information at month-over-month timescales.
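The month-over-month comparison can be sketched as a simple set overlap of the top-N lists (illustrative helper, not the actual analysis code):

```python
def top_overlap(counts_a, counts_b, n=100):
    """How many of the n most frequent queries two months share,
    given a {query: count} dict for each month."""
    def top(counts):
        return set(sorted(counts, key=counts.get, reverse=True)[:n])
    return len(top(counts_a) & top(counts_b))
```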

Some of the same street addresses from May showed up in the June data, indicating to me that the query is likely coming from a link on the internet, or a better class of bot.

Interestingly, the Brexit vote took place in June 2016, and I found 8 misspellings of Brexit in the top 1000 for June—though none in the top 100. Most were corrected by either the completion suggester or the “did you mean” spelling correction.

You can find even more detail about all this in the write-up I did last summer.

What went wrong right?

Great minds think alike—so everyone who thought this sounded like a promising idea should give themselves a pat on the back, because lots of other smart people thought so, too.

So what went wrong? Nothing! What went right? My intuition is that, at least on English Wikipedia, the WikiGnomes are so far ahead of the curve that they render this mining exercise moot. There are already so many carefully created and curated articles and redirects that the vast majority of queries that otherwise would have failed, don’t. (CirrusSearch, which the Search Platform hobgoblins work on tirelessly, probably helps a bit, too.)

For other large wikis, particularly Wikipedias, I would expect similar results. Smaller wikis may have a better failed-query gem-to-dung ratio than English Wikipedia, but privacy concerns are still paramount, and the long tail is probably still very, very long.

Trey Jones, Senior Software Engineer, Search Platform
Wikimedia Foundation

Footnotes

  1. Don’t feed the trolls! That is, no links will be provided. The information is easily found, but also simultaneously tasteless and distasteful, inappropriate for polite conversation, and generally not safe for work.

by Trey Jones at December 12, 2017 04:49 PM

December 11, 2017

Wiki Education Foundation

Wiki Education recruiting Geophysicists this week at AGU

This week, Wiki Education is attending the American Geophysical Union’s (AGU) annual meeting in New Orleans. At last year’s meeting, we met dozens of scientists interested in teaching students how to communicate science through Wikipedia assignments. In the year since, we’ve supported seven resulting courses as students learned how to improve Wikipedia’s coverage of geology and earth science.

In Wiki Education’s Classroom Program, we provide the toolkit students need to become contributors to the world’s most widely read online encyclopedia. For several weeks in the semester, students identify missing pieces of a Wikipedia article related to their course, research the topic, and learn how to add well-sourced information to fill in those gaps. We’ve worked with earth science courses over the years, which is why we created a guide for students editing environmental science articles.

Geology students in our Classroom Program have tackled topics ranging from women geologists’ biographies to volcanoes of east-central Baja California to the Landsat program.

University students often question the purpose of their classroom assignments. The Wikipedia assignment presents opportunities to discuss student learning outcomes of a real-world assignment. Through such a project, students begin to understand the vast knowledge contained within academic silos that has yet to be disseminated to the broader public. Along the way, they realize their own agency in curbing these inequities. Students are helping combat the gender biases on Wikipedia that mimic the gender biases in STEM fields. They’re writing about geological formations and fluid dynamics and remote sensing. In the process of educating themselves, they’re creating opportunities for other future geologists who may not have access to libraries full of journal subscriptions. By learning how to channel their classroom labor to Wikipedia, they’re amplifying the impact of their own higher education.

If you’re attending AGU’s meeting this week, drop by the exhibit hall to discuss potential assignments with me. To join Wiki Education’s Classroom Program and teach students how to contribute to public knowledge in the earth sciences, visit teach.wikiedu.org or email us at contact@wikiedu.org.

by Jami Mathewson at December 11, 2017 11:17 PM

Roundup: Universal Human Rights Month

December is Universal Human Rights Month, and yesterday was Human Rights Day, an internationally recognized day commemorating the adoption of the Universal Declaration of Human Rights by the United Nations General Assembly in 1948. On this day, governmental and non-governmental organizations around the world raise awareness for human rights issues, meet to strategize about solving them, and host events to commemorate the day. Among the freedoms laid out in the Universal Declaration of Human Rights is fair access to education. Here at Wiki Education, we believe in accessible knowledge, in improving public literacy, and in ensuring our popular sources of information are accurate. That’s why we support instructors in teaching their students how to edit Wikipedia as an assignment. Students channel the hard work they’re already doing in the classroom into a platform that shares academic knowledge with the world.

In Jennifer Olmsted’s Fall 2016 course at Drew University, Gender and Globalization, students edited Wikipedia articles on topics surrounding globalization as it relates to work, human mobility, well-being, and other issues related to human rights.

Student editors contributed to the article on the effects of war. War impacts the culture, population, education, and economic stability of a country or people. Economic instability, destruction of social infrastructure, changes in the labor force, population loss, forced migration and displacement, decline of education, and dramatic political change all result from war, and these post-war effects can be long- or short-term.

Students also added content to the article on unpaid work. Unpaid labor is defined as work without monetary compensation, which may include work within the household, volunteer work, and interning. The United Nations Statistic Division conducted a survey that found women to be the majority of unpaid workers worldwide. Unpaid domestic work includes cooking, cleaning, caring for family members, and other drudgery. Unpaid work also includes reproductive (or childbearing) labor, the expectation that because reproduction relies primarily on female organs, women are to be the primary actors of this labor throughout their lives. The article touches upon histories of gender roles and gendered cultural values, and how they affect these dynamics. The gendered nature of unpaid work has implications for economic equality, as well as women’s participation in both private and public spheres.

Students in our programs contribute valuable information to Wikipedia, one of the most accessed sources of information out there, helping readers learn more about important issues. Through this process, they also gain crucial skills from understanding and contributing to a resource that they use all the time. If you’re interested in learning more about how to teach with Wikipedia, see our informational page or reach out to us at contact@wikiedu.org.

Header image: File:UN General Assembly hall.jpg, by Patrick Gruban, CC BY-SA 2.0, via Wikimedia Commons.

 

by Cassidy Villeneuve at December 11, 2017 07:00 PM

Wikimedia Cloud Services

Labs and Tool Labs being renamed

(reposted with minor edits from https://lists.wikimedia.org/pipermail/labs-l/2017-July/005036.html)

TL;DR

  • Tool Labs is being renamed to Toolforge
  • The name for our OpenStack cluster is changing from Labs to Cloud VPS
  • The preferred term for projects such as Toolforge and Beta-Cluster-Infrastructure running on Cloud-VPS is VPS projects
  • Data Services is a new collective name for the databases, dumps, and other curated data sets managed by the cloud-services-team
  • Wiki replicas is the new name for the private-information-redacted copies of Wikimedia's production wiki databases
  • No domain name changes are scheduled at this time, but we control wikimediacloud.org, wmcloud.org, and toolforge.org
  • The Cloud Services logo will still be the unicorn rampant on a green field surrounded by the red & blue bars of the Wikimedia Community logo
  • Toolforge and Cloud VPS will have distinct images to represent them on wikitech and in other web contexts

In February when the formation of the Cloud Services team was announced there was a foreshadowing of more branding changes to come:

This new team will soon begin working on rebranding efforts intended to reduce confusion about the products they maintain. This refocus and re-branding will take time to execute, but the team is looking forward to the challenge.

In May we announced a consultation period on a straw dog proposal for the rebranding efforts. Discussion that followed both on and off wiki was used to refine the initial proposal. During the hackathon in Vienna the team started to make changes on Wikitech reflecting both the new naming and the new way that we are trying to think about the large suite of services that are offered. Starting this month, the changes that are planned (T168480) are becoming more visible in Phabricator and other locations.

It may come as a surprise to many of you on this list, but many people, even very active movement participants, do not know what Labs and Tool Labs are and how they work. The fact that the Wikimedia Foundation and volunteers collaborate to offer a public cloud computing service that is available for use by anyone who can show a reasonable benefit to the movement is a surprise to many. When we made the internal pitch at the Foundation to form the Cloud Services team, the core of our arguments were the "Labs labs labs" problem and this larger lack of awareness for our Labs OpenStack cluster and the Tool Labs shared hosting/platform as a service product.

The use of the term 'labs' in regards to multiple related-but-distinct products, and the natural tendency to shorten often used names, leads to ambiguity and confusion. Additionally the term 'labs' itself commonly refers to 'experimental projects' when applied to software; the OpenStack cloud and the tools hosting environments maintained by WMCS have been viable customer facing projects for a long time. Both environments host projects with varying levels of maturity, but the collective group of projects should not be considered experimental or inconsequential.

by bd808 (Bryan Davis) at December 11, 2017 05:07 PM

Wikimedia Foundation

Discover a world of natural heritage through the winning images from Wiki Loves Earth

Wiki Loves Earth’s overall winner: Ogoy Island in Russia’s Lake Baikal. Photo by Sergey Pesterev, CC BY-SA 4.0.

A jaguar gazed out from a bed of shrubs. A water buffalo took a bath. A capybara family gathered in line. These are just a few of the delightful sights captured in the winners of the international Wiki Loves Earth photography competition, announced today.

Coming in first place (at top) is a ghostly image of Ogoy Island. Photographer Sergey Pesterev ventured out onto the ice of Lake Baikal, the largest freshwater lake in the world, and was commended by the contest judges for using cracks in the ice and clouds in the sky to frame the rocky center of the image.

Wiki Loves Earth focuses on natural heritage in protected areas—unique and special places like nature reserves, landscape conservation areas, national parks, and more. It asks photographers to contribute their work to Wikimedia Commons, a media repository that holds many of the photos used on Wikipedia and across the Wikimedia ecosystem. All of its content is freely licensed, meaning that it can be used by anyone, for any purpose, with few restrictions.[1]

The fourth annual contest, held earlier this year, saw a total of 131,984 uploads to Commons, a record. These were captured in at least 38 countries.[2] 15,299 different user accounts uploaded photos during the contest, nearly 14,000 of which were newly registered.

All of the entries were judged by national juries; winners there were forwarded to the international jury, composed of members from seven different nations (Germany, Serbia, Thailand, Nepal, Russia, Argentina, and Ukraine).

The contest’s second- through fifteenth-place images follow.

Second place: A jaguar in the Pantanal Conservation Area of Brazil. It was commended by one jury member for its dark background, which they saw as “show[ing] uncertainty” and accentuating the jaguar’s expression. Photo by Leonardo Ramos, CC BY-SA 4.0.

Third place: A pair of Eurasian spoonbills in the Danube Biosphere Reserve, Ukraine. This was one jury member’s favorite animal photo from the competition because of its “artwork-like reflections and great dynamism.” Photo by Ryzhkov Sergey, CC BY-SA 4.0.

Fourth place: A water buffalo getting a nice soak in Baluran National Park, Indonesia. A jury member applauded the photo’s blend of “lighting, focus and symmetry of colors”—it may “[take] a while to realize the bull is there in the mud, but then it’s all you can see.” Photo by Candra Firmansyah, CC BY-SA 4.0.

Fifth place: Two Svalbard reindeer grazing in Bünsow Land National Park, Spitsbergen, Norway. One jury member loved the “clear focus on the animals” and the “change from brown to blue” in the photo. Photo by Siri Uldal, CC BY-SA 4.0.

Sixth place: Pedra Azul (Blue Stone) with the Milky Way above it, located in the eponymous state park in Brazil. Photo by EduardoMSNeves, CC BY-SA 4.0.

Seventh place: A Bornean orangutan. Photo by Ridwan0810, CC BY-SA 4.0.

Eighth place: The Dovbush rocks seen at twilight, located in the Polyanytskiy Regional Landscape Park, Ukraine. Photo by Пивовар Павло, CC BY-SA 4.0.

Ninth place: Aerial shot of Dzharylhach National Nature Park, Ukraine. Photo by Vadym Yunyk, CC BY-SA 4.0.

Tenth place: A Phyllomedusa rohdei frog steps over its friend, seen in Brazil. Photo by Renato Augusto Martins, CC BY-SA 4.0.

Eleventh place: Railroad tracks in Lawachara National Park, Bangladesh, provide a convenient path through the forest. Photo by Pallabkabir, CC BY-SA 4.0.

Twelfth place: A peak in the Nilgiri mountains backdrops this shot of a simple suspension bridge over the Gandaki River in Nepal. Photo by Faj2323, CC BY-SA 4.0.

Thirteenth place: A northern pika peeks over a rock in Momsky National Park, Russia. Photo by Юрий Емельянов, CC BY-SA 4.0.

Fourteenth place: A recently burned-down forest on Parnitha Mountain, Greece. Photo by Stathis floros, CC BY-SA 4.0.

Fifteenth place: Capybaras near the Tietê River in São Paulo state, Brazil. Photo by Clodomiro Esteves Junior, CC BY-SA 4.0.

 

You can learn more about Wiki Loves Earth on its website, and read this year’s jury report on Commons in low and high resolution.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

Footnotes

  1. Please make sure to follow each image’s copyright tag. Many of those listed above, for instance, are available under a Creative Commons CC BY-SA license, meaning that you are free to share them for any reason so long as you give credit to the photographer and release any derivative images under the same copyright license.
  2. This total excludes countries from the unique UNESCO Biosphere Reserves campaign, for which the number of countries has not been tabulated yet.

by Ed Erhart at December 11, 2017 02:53 PM

Tech News

Tech News issue #50, 2017 (December 11, 2017)

December 11, 2017 12:00 AM

December 10, 2017

Gerard Meijssen

When #Wikidata is good for something

When #Wikidata is good for something, it shines. It does not take much prodding to find people to improve on what it does so well and consequently when Wikidata is useful, quality follows easily.

The promise of a useful Wikidata was delivered at its start, when it replaced Wikipedia's native interwiki links. Within a month the quality of those links had improved dramatically, and to this day corner cases are still being worked on, improving quality even more.

The WikiCite project is really important in many respects, and it has much more to offer. It is useful because it brings many initiatives and projects together under one roof. It is why scientific papers are included, along with their authors. More and more authors are being added, and they are often linked to ORCID, VIAF, and other external identifiers. This has great value because it allows Wikipedia articles and information maintained elsewhere to be linked. What it can be used for is limitless: end users will find new and interesting ways to turn the data into information.

If Wikidata is to be good for the Wikimedia projects, the information brought to Wikidata because of WikiCite has great potential. It largely reflects the citations in all the Wikipedias, and consequently, through links to external sources, we could know which sources are problematic, retracted, or bought by interested parties. We could, but we don't. If we did, we would provide a counterweight against propaganda and fake news.

The big thing holding us back is trust. Wikipedians need to consider a Wikidata that is not only used for links but that can also be trusted for high-level maintenance of its citations. Wikidata, in turn, has to appreciate that its information will be used, and that this use will increase its value and quality. WikiCiters have to understand that Wikidata is not a stamp collection containing only publication data; it must include information about retractions and about papers considered problematic for political or scientific reasons (or both).

If Wikidata is to be good for something, we should expand our collaboration with Cochrane, Retraction Watch, and organisations like them. There is everything to gain: quality, contributors, and relevance.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at December 10, 2017 02:07 PM

Weekly OSM

weeklyOSM 385

28/11/2017-04/12/2017

JOSM QuickLabel Screenshot

Screenshot of the new JOSM plugin QuickLabel 1 | © OpenStreetMap contributors

Mapping

  • Potentiel 3.0, a Haitian digital association, have created a 3-D model of the Zerpelin basin.
  • Daniel Koć has written a blog article on the planned changes and ideas for the rendering of (natural) protected areas.
  • Ilya Zverev writes to the talk mailing list introducing OSM streak: a website that gives you points for submitting changesets each day.
  • Maripo Goda published her 3rd JOSM Plugin “QuickLabel” and asks for feedback. This plugin shows any tag values next to objects. You can visualize the mapping progress of sub tags like cuisine or surface.
  • Mapillary introduces Mapillary Tasker, a tool that lets you invite people to help complete capture, map editing, or data verification tasks in your area.
  • Daniel Koć asks about the tagging of leisure=common and leisure=village_green, which have their origin in British law. As they are often misused, he has started a discussion on how to use the tags.

Community

  • Richard Fairhurst adds a section about education and socio-economic status to the Contributor Covenant and explains the rationale behind his text with reference to OSM.
  • Roland Olbricht queries the basis on which the proposal process on the wiki works. His concerns were heightened by the reaction to Ilya Zverev’s metro railway proposal.
  • Nicolas Chavent writes to the OSMF-talk mailing list about his perspective on why it’s important to balance the representation of the Humanitarian OpenStreetMap Team US Inc. at the OpenStreetMap Foundation Board (OSMF) Board to favour the
    diversity of OpenStreetMap perspectives in this institutional body and also writes about HOT’s Code of Conduct. Read more about the related conversations in this email thread.
  • The Advisory Board of OSM UK are dissatisfied with the Directed Editing Policy and will therefore submit their own new proposal.

OpenStreetMap Foundation

  • Contributor Glassman writes about a proposal for the OSMF to adopt a Code of Conduct.
  • Imagico.de writes about the importance of the OSMF elections, how to identify the best candidates to vote for, and their impact on the OpenStreetMap project as a whole.

Events

Humanitarian OSM

  • Alessandro Venerandi works on poverty assessment using centrality criteria in OpenStreetMap data.
  • Severin (aka sev_osm) writes a diary post where he shares his frustrations and reasons behind his resigning from HOT US Corporation.
  • In the HOT News, Rachel VanNice thanks the community for its valuable work and help this year. The many natural catastrophes presented a special challenge, and a video accompanies this Thanksgiving message. At the same time, she calls for continued and even more forceful action in the future.

Maps

  • The OpenTopoMap now arranges summits according to their dominance. For example, at zoom level 8, mountains with a dominance greater than 100 km are displayed. See this forum post for more information.

switch2OSM

  • Pokemon GO has switched to OSM data in most countries. For some players this means better map data, for others there is less detail now.
  • Garmin now uses OpenStreetMap data in version 9.1 of its Garmin Pilot app for Apple devices.

Open Data

  • The city of Cambridge (UK) have released a week’s worth of anonymised vehicle movements captured using ANPR (automatic number plate recognition).

Software

  • Kevin Arutyunyan has used the new 3D viewer in QGIS 3.0 to visualise OSM data. He comments that “it makes you notice how much data is still missing” from OSM to make accurate 3D maps.
  • Answering a question, mmd, one of the main developers of the Overpass API, explains that the upcoming version will offer a way to determine the length of paths in meters.

Programming

  • Up to now, “iD” has only been able to display satellite images at 4 brightness levels. In the upcoming version it will be possible to adjust brightness, contrast, saturation, and sharpness between 0 and 200% of the base value.
  • Komяpa is offering a $100 bounty for a mod_tile/renderd pull request. He’s hoping to improve render queue lengths on tile.openstreetmap.org.
  • In issue 59 of his podcast “Microwave”, produced in cooperation with user U-Bahnnverleih, Thomas Skowron presents his work on vector tiles, the project Grandine, the geodata format Spaten, and the problems of the rendering toolchain (raster and vector) (from 0:58:40).

Releases

  • JOSM released the new stable version 13170.
  • The new release for Mapbox’s Android SDK v5.2.0 and iOS SDK v3.7.0 comes with client-side static map snapshots, georeferenced image sources, smooth transitions between styles, unlimited annotation images, shape annotation selection and improved VoiceOver compatibility.
  • Mapbox Navigation SDK for iOS v0.11.0 has upgraded to Swift 4, added reusable step list control and removed audio feedback feature.
  • The release of MapboxDirections.swift v0.14.0 and v0.15.0 comes with toll road/motorway/ferry avoidance options, property indicating whether cars drive on the right or left and much more.

Did you know …

  • … the affiliate links for Amazon? If you’re buying your Christmas presents at Amazon, please consider using the affiliate program to help OSM 😉
  • … the geoportal of Schweich? For all who want to visit this pleasant locality, the portal, built with open-source software, offers extensive information on the geodata and map services of the municipality of Schweich.

Other “geo” things

  • QGIS 3.0 is on the horizon. There are many improvements and new features, which means the documentation needs to be updated and the developers could use your help with this.
  • The ruthenium-106 traces (de) (Google translation) measured in the atmosphere at the beginning of October are very likely to have come from the Mayak nuclear facility. Due to the low dose, however, there was no danger to the public.
  • Czech Railways (the national train operator) has stopped the distribution of 2018 agendas for employees and business partners. The agenda contains a small map on the last page showing several disputed areas, such as ‘Islamic State’ and a Russian Crimea.

Upcoming Events

Where What When Country
Denver Online High School Mitchell PoliMappers’ Adventures: One mapping quest each day 2017-12-01-Invalid date everywhere
Rennes Réunion mensuelle 2017-12-11 france
Lyon Rencontre mensuelle 2017-12-12 france
Nantes Réunion mensuelle 2017-12-12 france
Toulouse Rencontre mensuelle 2017-12-13 france
Brazil #CompleteTheMap Brasília – o evento 2017-12-13 brazil
Munich Stammtisch 2017-12-14 germany
Moscow Schemotechnika 13 2017-12-14 russia
Dresden Stammtisch 2017-12-14 germany
Berlin DB Open Data Hackathon 2017-12-15-2017-12-16 germany
Taipei OpenStreetMap Taipei Meetup 2017-12-18 taiwan
Bonn Bonner Stammtisch 2017-12-19 germany
Lüneburg Mappertreffen 2017-12-19 germany
Nottingham Pub Meetup 2017-12-19 united kingdom
Rome FOSS4G-IT 2018 2018-02-19-2018-02-22 italy
Bonn FOSSGIS 2018 2018-03-21-2018-03-24 germany
Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy

Note: If you would like to see your event here, please add it to the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Nakaner, Peda, Polyglot, SK53, SomeoneElse, Spanholz, YoViajo, derFred, jcoupey, jinalfoflia, sev_osm.

by weeklyteam at December 10, 2017 12:36 PM

December 09, 2017

Gerard Meijssen

#Wikipedia #NPOV - When there is no neutral point of view

Mr Jacobson, a climatologist at Stanford University, wrote a paper. Its findings were disputed in another paper. Jacobson maintains that the USA’s energy needs can be served exclusively with green energy. The contrarians hold that there must be a mix of conventional and green energy.

There are several issues with the latter paper: it is supported by the conventional energy industry, its results are in that industry’s best interest, and it is considered by many not to be the result of a scientific process. So much so that Jacobson went to court.

There is a big difference between an opinion piece and a scientific paper. The contrarians’ critique is that Mr Jacobson does not consider nuclear, fossil fuel, and biofuel solutions at all. They argue that excluding them could make the transition more difficult or expensive. But that is not the point. The point is that you can do without them, and that green energy is getting cheaper.

When a paper is bought by industry and the premise of the original paper is ignored, it is no longer scientific; it becomes an opinion piece. Mr Jacobson is not the first to predict the demise of “big” energy; Greenpeace has been doing it for decades.

There is no middle ground. That is why Mr Jacobson is going to court: the contrarians’ paper serves only one purpose, postponing the inevitable. It is not a scientific critique in any acceptable way.
Thanks,
     GerardM

by Gerard Meijssen (noreply@blogger.com) at December 09, 2017 10:00 PM

Wikimedia UK

Talking to Creative Commons’ Ryan Merkley about CC Search and Structured Data on Commons

Creative Commons’ Ryan Merkley and Wikimedia Foundation Exec Director Katherine Maher at Mozfest 2017 – Image by Jwslubbock CC BY-SA 4.0

CC Search beta was launched in February. This new tool incorporates ‘list-making features, and simple, one-click attribution to make it easier to credit the source of any image you discover.’ Its developer, Liza Daly, describes it as ‘a front door to the universe of openly licensed content.’

As a small organisation, Creative Commons did not have the resources to start by indexing all of the 1.1 billion openly licensed works that it estimates are available in the Commons. Liza Daly decided to start with a representative sample of about 1% of the known Commons content online, and to select about 10 million images rather than a cross-section of all media types, since the majority of CC content is images.

One issue they encountered was in making sure that all the content they would include was CC licensed, where a provider (like Flickr) hosted content that was both CC and commercially licensed. They also decided to defer the use of material from Wikimedia Commons, saying that,

‘Wikimedia Commons represents a large and rich corpus of material, but rights information is not currently well-structured. The Wikimedia Foundation recently announced that a $3 million grant from the Sloan Foundation will be applied to work on this problem, but that work has just begun.’

The Wikimedia Foundation understands that the resources available through Wikimedia Commons are not as accessible as they could potentially be as a result of the ad hoc nature of much of the metadata attached to the files people have uploaded. For example, one common query is ‘Why can’t I search Commons by date’. The problem here is ‘which date?’ Is it the stated date that the photo was taken (which could be incorrect) or the date that the file was created, which could be different?

This is why Structured Data is so important. The $3m grant that the WMF has received to implement structured data on Commons, in a similar way to how it’s structured on Wikidata, will allow for much better searching and indexing of media files.

CC Search wants to make CC content more discoverable, regardless of where it is hosted online. To do this, they decided to import the metadata of the works they are currently indexing – title, creator name, and any known tags or descriptions – with each record linking directly back to the original source so you can view and download the media. In its current, unstructured state, Wikimedia Commons is not well suited to this kind of systematic metadata import.
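As a sketch of what such an index might look like (the field names and URLs below are illustrative assumptions, not CC Search’s actual schema), each entry keeps lightweight metadata plus a link back to the hosting site:

```python
# Illustrative index records: lightweight metadata plus a source link.
# Field names and URLs are assumptions, not CC Search's real schema.
index = [
    {"title": "Sunset over harbour", "creator": "A. Photographer",
     "tags": ["sunset", "harbour"], "license": "CC BY 2.0",
     "source_url": "https://example.org/photos/1234"},
    {"title": "Forest path", "creator": "B. Walker",
     "tags": ["forest", "path"], "license": "CC BY-SA 4.0",
     "source_url": "https://example.org/photos/5678"},
]

def search(records, term):
    """Match the term against titles and tags; return links to the sources."""
    term = term.lower()
    return [r["source_url"] for r in records
            if term in r["title"].lower() or term in r["tags"]]

search(index, "forest")  # -> ["https://example.org/photos/5678"]
```

The point of the design is that the index holds only enough to find the work; viewing and downloading always go back through the `source_url`.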

It seems that Creative Commons is even looking at the possibility of using some kind of blockchain-like ledger system to record reuse of CC licensed works so that reuse can be tracked. However, this remains a longer term goal.

I asked Creative Commons CEO Ryan Merkley some questions about how the project had been progressing since its announcement and how it might work.

WMUK: How much progress has been made on CC search since the start of 2017? Have you indexed many more than the original 10 million media items?

RM: CC has hired a Director of Product Engineering, Paola Villarreal, to lead the project. We’re staffing up the team, with a Data Engineer starting soon. In addition, we’ll be pushing a series of enhancements, including adding new content, by the end of the year.

WMUK: Will you have to wait until the end of the Structured Data on Commons project to index Wikimedia content? Or does the tool only require basic metadata categories like Title, Creator, Description, and Category Tags, meaning it would be possible to start this before the end of the project?

RM: We’re happy to work with the Wikimedia Commons community on the project. In our initial conversations, we mutually decided to wait until some of that work was further along. We want to make sure our work is complementary.

WMUK: Is it still an ultimate ambition to use some kind of blockchain architecture to record reuse? Or is that potentially a goal that would require more resources than will likely be available for the foreseeable future?

RM: Not necessarily. There’s a lot of interesting work going on with the blockchain and distributed ledger projects. What’s most important to us is a complete, updated, and enhanced catalog of works and metadata that is fast and accessible.

WMUK: Can you explain how ledger entries would be created when someone reused a CC licensed work?

RM: The tools to track remix don’t exist right now. It’s something we’re really interested in, and our community wants as well. It will require new tools, and collaboration with platforms and creators.

There are so many incredible applications possible for all the data on Wikimedia Commons, and we hope that after the content is structured properly, it will become a valuable source which can be searched along with other CC content online using Creative Commons’ CC Search tool. Like a lot of the changes we would like to see in the way the Wikimedia products work, this will likely take some time, but we are hopeful that the wait will be worth it.

by John Lubbock at December 09, 2017 05:48 PM

Noella - My Outreachy 17

Coding has begun :))

Hmmm!! The long-awaited time where I get my hands dirty 😎😎

Coding has finally started, and the first thing is to set up a MediaWiki instance on Cloud VPS, which I shall use with my mentors to test changes before deployment. Believe me, it was a sweet but difficult experience. Right at the beginning, I hit multiple challenges😓😓, but luckily I have very responsive mentors (@Legoktm, @D3r1ck) who are always available to help. This is a narration of what happened.

I had never worked with cloud services before, so I was relying heavily on documentation and following the steps. I created a new instance, massmessage-test-wiki, on the massmessage project on Wikimedia Cloud VPS. That was pretty simple. The next thing was to connect to the instance and set up MediaWiki on it. I struggled for hours without succeeding, and my research produced no good output. In fact, I could not even connect to the instance😢😢. After a long struggle, I thought the problem might be my operating system, as I had no other idea. I formatted my PC and installed Ubuntu 16.04, hoping all would be well; I configured SSH on the machine and tried to log in again, but got the same error. I almost went mad at that point😠😠. I contacted my mentors, explaining what had been going on. After a chat with both mentors, we figured out the problem, and within a minute I was able to connect to the instance; MediaWiki is now up and running on the massmessage-test-wiki instance😎😎.

Actually, the solution was out of my reach: I needed shell privileges, which I could not grant myself. CHAI!!!! I wish I had contacted my mentors before formatting😥😥. It would have saved me the stress of backing up and setting up my whole PC.

At least it was a great challenge and I learnt a lot. Looking forward to greater challenges (hopefully without formatting) and to greater things to learn😋😋😋.

by Noella Teke (noreply@blogger.com) at December 09, 2017 03:58 PM

Megha Sharma

Outreachy Chapter 1: Settling in the community

When you are working for Wikimedia, the community bonding period is bound to be quite overwhelming, because the community is humongous in size and has loads of projects!

But it’s so loving, accepting and eager to help, that it is almost impossible for you to get lost. In my case too, my super awesome mentors (Gergo Tisza and Stephen La Porte) helped me at every step and made things no more difficult than a cake-walk. And yes, because of them the community bonding period went quite well for me. So, a big shout-out for them!

Coming to the details, I utilized this time to get well versed with the different chapters of Wikimedia, to set up my user page, and to settle into the community. For my project in particular (developing a ‘User Contribution Summary Tool’), I studied and analyzed the existing work that has been done in this area. This exercise helped me get a better picture of what I should include and what to leave out. Apart from this, I read and read and read! (I told you, it’s huge!)

So far, so good. But during one fine video call with my mentors, it dawned upon us that we were missing something important. But what? Any guesses? What is the most important thing before starting any altogether new project? Taking inputs from the target users!

This important realization has led me into a new direction which involves getting in touch with people and asking for their views and inputs. All in all, another interesting work! Currently, I’m on it and will soon be using the results of this exercise to frame the requirements.

To all the viewers: if you are interested in providing inputs for the same, please drop me a mail at meghasharma4910@gmail.com. Would love to see many emails coming in!

And yes, chapter 2 is on its way!

by Megha Sharma at December 09, 2017 03:02 PM

Hey, it’s quite a nice idea, quite helpful for the newbies!

Hey, it’s quite a nice idea, quite helpful for the newbies! I checked your repository, it’s awesome. Would be more than happy to give in a detailed feedback :)

by Megha Sharma at December 09, 2017 01:45 PM

Thanks a lot! Yes, many more are lined up. Will be waiting for your review :)

by Megha Sharma at December 09, 2017 01:42 PM

Wikimedia Tech Blog

The journey to Thumbor, part 3: Development and deployment strategy

Photo by Sarah Reid via Flickr, CC BY 2.0.

Part 1 | Part 2 | Part 3

In the last blog post I described where Thumbor fits in our media thumbnailing stack. Introducing Thumbor replaces an existing service, and as such it’s important that it doesn’t perform worse than its predecessor. We came up with a strategy to reach feature parity and ensure a launch that would be invisible to end users.

Development

In Wikimedia production, Thumbor was due to interact with several services: Varnish, Swift, Nginx, Memcached, Poolcounter. In order to iron out those interactions, it was important to reproduce them locally during development. Which is why I wrote several roles for the official MediaWiki Vagrant machine, with help from @bd808. Those have already been useful to other developers, with several people reaching out to me about the Varnish and Swift Vagrant roles. While at the time it might have seemed like an unnecessary quest (why not develop straight on a production machine?) it was actually a great learning experience to write the extensive Puppet code required to make it work. While it’s a separate codebase, subsequent work to port that over to production Puppet was minimal.

This phase actually represented the bulk of the work, reproducing support for all the media formats and special parameters found in MediaWiki thumbnailing. I dedicated a lot of attention to making sure that the images generated by Thumbor were as good as what MediaWiki was outputting for the same original media. In order to do that, I wrote many integration tests using thumbnails from Wikimedia production, which were used as reference output. Those tests are still part of the Thumbor plugins Debian package and ensure that we avoid regressions. They use a DSSIM algorithm to visually compare images and make sure that what Thumbor outputs doesn’t visually diverge from the reference thumbnails. We also compare file size to make sure that the new output isn’t significantly heavier than the old.
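The real suite compares decoded images with a DSSIM implementation; as a stdlib-only sketch of the shape of such a regression test, the helpers below substitute mean absolute pixel difference for DSSIM (the helper names, pixel lists, and threshold are all made up for illustration):

```python
def mean_abs_diff(ref_pixels, new_pixels):
    """Crude stand-in for DSSIM: mean absolute difference of grayscale pixels."""
    assert len(ref_pixels) == len(new_pixels)
    return sum(abs(a - b) for a, b in zip(ref_pixels, new_pixels)) / len(ref_pixels)

def thumbnails_match(ref_pixels, new_pixels, max_diff=2.0):
    """Pass if the rendered thumbnail stays visually close to the reference."""
    return mean_abs_diff(ref_pixels, new_pixels) <= max_diff

reference = [120, 121, 119, 200]   # pixels of a MediaWiki-era reference thumbnail
rerendered = [120, 122, 119, 199]  # the same thumbnail rendered by the new engine
thumbnails_match(reference, rerendered)    # -> True (tiny, invisible difference)
thumbnails_match(reference, [0, 0, 0, 0])  # -> False (a visual regression)
```

The key property is tolerance: encoder updates legitimately change bytes, so the test compares perceived similarity against a threshold rather than demanding byte-identical output.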

Packaging

The next big phase of the project was to create a Debian package for our Thumbor code. I had never done that before, and it wasn’t as difficult as some people make it out to be (I imagine the tooling has gotten significantly better than it used to be), at least for Python packages. However, in order to be able to ship our code as a Debian package, Thumbor itself needed to have a Debian package, which wasn’t the case at the time. Some people had tried on much older versions of Thumbor but never reached the point where it was put into Debian proper. Since that last attempt, Thumbor had added a lot of new dependencies that weren’t packaged either. @fgiunchedi and I worked on packaging it all and successfully did so. And with the help of Debian developer Marcelo Jorge Vieira, who pushed most of those packages into Debian for us, we crossed the finish line recently and got Thumbor submitted to Debian unstable.

One advantage of doing this is that it makes deployment of updates really straightforward, with the integration test suite I mentioned earlier running in isolation when the Debian package is built. With those Debian packages done, we were ready to run this on production machines.

But the more important advantage is that with those packages in Debian itself, other people are using the exact same versions of Thumbor and its dependencies via Debian, greatly expanding the exposure of the software we run in production. This increases the likelihood that security issues we might be exposed to are found and fixed.

Beta

Trying to reproduce the production setup locally is always limited. The full complexity of production configuration isn’t there, and everything is still running on the same machine. The next step was to convert the Vagrant Puppet code into production Puppet code. Which allowed us to run this on the Beta cluster as a first step, where we could reproduce a setup closer to production with several machines. This was actually an opportunity to improve the Beta cluster to make it have a proper Varnish and Swift setup closer to production than it used to have. Just like the Vagrant improvements, those changes quickly paid off by being useful to others who were working on Beta.

Just like packaging, this new step revealed bugs in the Thumbor plugins Python code that we were able to fix before hitting production.

Pre-production

The Beta wikis only have a small selection of media, and as such we still hadn’t been exposed to the variety of content found on production wikis. I was worried that we would run into media files that had special properties in production that we hadn’t run into in all the development phase. Which is why I came up with a plan to dual-serve all production requests to the new production Thumbor machines and compare output.

This consisted of modifications to the production Swift proxy plugin code we have in place to rewrite Wikimedia URLs. Instead of sending thumbnail requests only to MediaWiki, I modified it to also send the same requests to Thumbor. At first this was done completely blindly: the Swift proxy would send requests to Thumbor and not even wait to see the outcome.

Then I looked at the Thumbor error logs and found several files that were problematic for Thumbor and not for MediaWiki. This allowed us to fix many bugs that we would have normally found out about during the actual launch. This was also the opportunity to reproduce and iron out the various throttling mechanisms.

To be more thorough, I made the Swift proxy log the HTTP status codes returned by MediaWiki and Thumbor and produced a diff, looking for files that were problematic for one and not the other. This allowed us to find more bugs on the Thumbor side, and a few instances of files that Thumbor could render properly that MediaWiki couldn’t!
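The diffing idea can be sketched like this (hypothetical logs and a made-up helper, not the actual Swift proxy code):

```python
# Hypothetical per-backend logs: request URL -> HTTP status returned.
mediawiki = {"/thumb/a.jpg/200px-a.jpg": 200,
             "/thumb/b.tif/200px-b.tif": 500,
             "/thumb/c.pdf/200px-c.pdf": 200}
thumbor   = {"/thumb/a.jpg/200px-a.jpg": 200,
             "/thumb/b.tif/200px-b.tif": 200,   # renders what MediaWiki couldn't
             "/thumb/c.pdf/200px-c.pdf": 429}   # fails where MediaWiki succeeded

def status_diff(old, new):
    """URLs where the two backends disagreed, with both status codes."""
    return {url: (old[url], new[url])
            for url in old.keys() & new.keys() if old[url] != new[url]}

status_diff(mediawiki, thumbor)
# -> {"/thumb/b.tif/200px-b.tif": (500, 200),
#     "/thumb/c.pdf/200px-c.pdf": (200, 429)}
```

Each entry in the diff is a lead worth investigating: a (500, 200) pair is a file the new backend handles better, while a (200, 429) pair points at a bug or throttling issue on the Thumbor side.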

This is also the phase where under the full production load, our Thumbor configuration started showing significant issues around memory consumption and leaks. We were able to fix all those problems in that fire-and-forget dual serving setup, with no impact at all on production traffic. This was an extremely valuable strategy, as we were able to iterate quickly in the same traffic conditions as if the service had actually launched, without any consequences for users.

Production

With Thumbor running smoothly on production machines, successfully rendering a superset of the thumbnails MediaWiki was able to, it was time to launch. The dual-serving logic in the Swift proxy came in very handy: it became a simple toggle between sending thumbnailing traffic to MediaWiki and sending it to Thumbor. And so we did switch. We did it gradually, having more and more wikis’ thumbnails rendered by Thumbor over the course of a couple of weeks. The load was handled fine (predictably, since we had been handling the same load in dual-serving mode). The success rate of requests, based on HTTP status codes, was the same before and after.

However, after some time we started getting reports of issues around EXIF orientation, a feature we had integration tests for. But the tests only covered 180-degree rotation, not 90 degrees (doh!). The Swift proxy switch allowed us to quickly send traffic back to MediaWiki; we did so because orientation is quite a prevalent feature in JPGs. We fixed that one large bug, switched the traffic back to Thumbor, and that was it.

Some minor bugs surfaced later regarding much less common files with special properties, which we were able to fix very quickly and deploy safely and easily with the Debian package. But we could have avoided all of those bugs too if we had been more thorough in the dual-serving phase. We were only comparing HTTP status codes between MediaWiki and Thumbor, and rendering a thumbnail successfully doesn’t mean that the visual contents are right! The JPG orientation could be wrong, for example. If I had to do it again, I would have run DSSIM visual comparisons on the live dual-served production traffic between the MediaWiki and Thumbor outputs. That would have definitely surfaced the handful of bugs that appeared post-launch.

Conclusion

All in all, if you do your homework and are very thorough in testing locally and on production traffic, you can achieve a very smooth launch replacing a core part of infrastructure with completely different software. Despite the handful of avoidable bugs that appeared around the launch, the switch to Thumbor went largely unnoticed by users, which was the original intent, as we were looking for feature parity and ease of swapping the new solution in. Thumbor has been happily serving all Wikimedia production thumbnail traffic since June 2017 in a very stable fashion. This concludes our journey to Thumbor!

Gilles Dubuc, Senior Software Engineer, Performance
Wikimedia Foundation

by Gilles Dubuc at December 09, 2017 06:49 AM

The journey to Thumbor, part 2: Thumbnailing architecture

Photo by Sarah Reid via Flickr, CC BY 2.0.

Part 1 | Part 2 | Part 3

Thumbor has now been serving all public thumbnail traffic for Wikimedia production since late June 2017.

In a previous blog post I explained the rationale behind that project. To understand why Thumbor is a good fit, it’s important to understand where it fits in our overall thumbnailing architecture. A lot of historic constraints come into play, where Thumbor could be adapted to meet those needs.

The stack

Like everything we serve to readers, thumbnails are heavily cached; in fact, unlike wiki pages, thumbnails are cached identically for readers and editors. Our edge is Nginx providing SSL termination, behind which we find Varnish clusters (both frontends and backends), which talk to OpenStack Swift – responsible for storing media originals as well as thumbnails – and finally Swift talks to Thumbor (previously MediaWiki).

The request lifecycle

Nginx concerns itself with SSL and HTTP/2, because the Varnish project decided to draw a line around Varnish’s concerns and exclude HTTP/2 support.

Varnish concerns itself with having a very high cache hit rate for existing thumbnails. When a thumbnail isn’t found in Varnish, either it has never been requested before, or it fell out of cache for not being requested frequently enough.

Swift concerns itself with long-term storage. We have a historical policy – which is in the process of being reassessed – of storing all thumbnails long-term. Which means that when a thumbnail isn’t in Varnish, there’s a high likelihood that it’s found in Swift. Which is why Swift is first in line behind Varnish. When it receives a request for a missing thumbnail from Varnish, the Swift proxy first checks if Swift has a copy of that thumbnail. If not, it forwards that request to Thumbor.

Thumbor concerns itself with generating thumbnails from original media. When it receives a request from Swift, it requests the corresponding original media from Swift, generates the required thumbnail from that original and returns it. This response is sent back up the call chain, all the way to the client, through Swift and Varnish. After that response is sent, Thumbor saves that thumbnail in Swift. Varnish, as it sees the response go through, keeps a copy as well.
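The lookup order above can be sketched as a chain of caches (a toy Python model under simplifying assumptions; the real Varnish, Swift, and Thumbor interactions are of course far richer):

```python
varnish, swift = {}, {}  # toy stand-ins for the edge cache and long-term store
renders = []             # URLs the thumbnailing engine actually had to render

def thumbor_render(url):
    """Stand-in for Thumbor: generate a thumbnail from the original media."""
    renders.append(url)
    return f"thumbnail-bytes-for-{url}"

def get_thumbnail(url):
    if url in varnish:                    # edge cache hit: most requests end here
        return varnish[url]
    if url not in swift:                  # long-term store miss: must render
        swift[url] = thumbor_render(url)  # Thumbor saves the result in Swift
    varnish[url] = swift[url]             # Varnish keeps a copy as it passes through
    return varnish[url]

get_thumbnail("a.jpg/200px")  # miss everywhere: rendered exactly once
get_thumbnail("a.jpg/200px")  # served from the edge cache
len(renders)  # -> 1
```

The ordering matters: because thumbnails are stored long-term, a Varnish miss is usually a Swift hit, so the expensive render step only runs for genuinely new thumbnails.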

What’s out of scope

Noticeably absent from the above is uploading, extracting metadata from the original media, etc. All of these are still MediaWiki concerns at upload time. Thumbor doesn’t try to handle all things media; it is solely a thumbnailing engine. The concern of uploading, parsing, and storing the original media is separate. In fact, Thumbor goes as far as trying to fetch as little data about the original from Swift as possible, seeking data transfer efficiency. For example, we have a custom loader for videos that leverages FFmpeg’s support for range requests, only fetching the frames it needs over the network rather than the whole video.
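The efficiency idea can be sketched as follows (a toy stand-in for a range-based loader, not the actual Thumbor video loader):

```python
def fetch_range(blob, start, length, transferred):
    """Simulate an HTTP Range request: fetch only bytes [start, start+length)."""
    chunk = blob[start:start + length]
    transferred.append(len(chunk))
    return chunk

video = bytes(1_000_000)  # a pretend 1 MB original sitting in object storage
moved = []                # bytes actually transferred over the network

# Fetch only the region one frame needs, instead of the whole file:
frame = fetch_range(video, 0, 64 * 1024, moved)

sum(moved) < len(video)  # -> True: a small fraction of the file crossed the wire
```

For a multi-gigabyte original, fetching only the byte ranges the decoder asks for is the difference between a cheap render and saturating the network for every thumbnail request.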

What we needed to add

We wanted a thumbnailing service that was “dumb”, i.e. didn’t concern itself with more than thumbnailing. Thumbor definitely provided that, but was too simple for our existing needs, which is why we had to write a number of plugins for it, to add the following features:

  • New media formats (XCF, DJVU, PDF, WEBM, etc.)
  • Smarter handling of giant originals (>1GB) to save memory
  • The ability to run multiple format engines at once
  • Support for multipage media
  • Handling the Wikimedia thumbnail URL format
  • Loading originals from Swift
  • Loading videos efficiently with range requests
  • Saving thumbnails in Swift
  • Various forms of throttling
  • Live production debugging with Manhole
  • Sending logs to ELK
  • Wikimedia-specific filters/settings, such as conditional sharpening of JPGs

We also changed the images included in the Thumbor project to be respectful of open licenses and wrote Debian packages for all of Thumbor’s dependencies and Thumbor itself.

Conclusion

While Thumbor was a good match on the separation of concerns we were looking for, it still required writing many plugins and a lot of extra work to make it a drop-in replacement for MediaWiki’s media thumbnailing code. The main reason being that Wikimedia sites support types of media files that the web at large cares less about, like giant TIFFs and PDFs.

In the next blog post, I’ll describe the development strategy that led to the successful deployment of Thumbor in production.

Gilles Dubuc, Senior Software Engineer, Performance
Wikimedia Foundation

by Gilles Dubuc at December 09, 2017 06:48 AM

The journey to Thumbor, part 1: Rationale

Photo by Sarah Reid via Flickr, CC BY 2.0.

 Part 1 | Part 2 | Part 3

 
We are currently in the final stages of deploying Thumbor to Wikimedia production, where it will generate media thumbnails for all our public wikis. Up until now, MediaWiki was responsible for generating thumbnails.

I started the project of making Thumbor production-ready for Wikimedia a year and a half ago and I’ll talk about this journey in a series of blog posts. In this one, I’ll explain the rationale behind this project.

Security

The biggest reason to change the status quo is security. Since MediaWiki is quite monolithic, deployments of MediaWiki on our server fleet responsible for generating thumbnails aren’t as isolated as they could be from the rest of our infrastructure.

Media formats being a frequent security breach vector, it has always been an objective of ours to isolate thumbnailing more than we currently can with MediaWiki. We run our command-line tools responsible for media conversion inside firejail, but we could do more to fence off thumbnailing from the rest of what we do.

One possibility would have been to rewrite the MediaWiki code responsible for thumbnailing, turning it into a series of PHP libraries, that could then be run without MediaWiki, to perform the thumbnailing work we are currently doing – while untangling the code enough that the thumbnailing servers can be more isolated.

However such a rewrite would be very expensive and when we can afford to, we prefer to use ready-made open source solutions with a community of their own, rather than writing new tools. It seemed to us that media thumbnailing was far from being a MediaWiki-specific problem and there ought to be open source solutions tackling that issue. We undertook a review of the open source landscape for this problem domain and Thumbor emerged as the clear leader in that area.

Maintenance

The MediaWiki code responsible for thumbnailing currently doesn’t have any team ownership at the Wikimedia Foundation. It’s maintained by volunteers (including some WMF staff acting in a volunteer capacity). However, the number of contributors is very low and technical debt is accumulating.

Thumbor, on the other hand, is a very active open-source project with many contributors. A large company, Globo, where this project originated, dedicates significant resources to it.

In the open source world, joining forces with others pays off, and Thumbor is the perfect example of this. Like other large websites leveraging Thumbor, we’ve contributed a number of upstream changes.

Maintenance of Wikimedia-specific Thumbor plugins remains, but those represent only a small portion of the code, the lion’s share of the functionality being provided by Thumbor.

Service-oriented architecture

For operational purposes, running parts of the wiki workflow as isolated services is always beneficial. It enables us to set up the best fencing possible for security purposes, where Thumbor only has access to what it needs. This limits the amount of damage possible in case of a security vulnerability propagated through media files.

From monitoring, to resource usage control and upstream security updates, running our media thumbnailing as a service has significant operational upsides.

New features

3rd-party open source projects might have features that would have been low priority on our list to implement, or considered too costly to build. Thumbor sports a number of features that MediaWiki currently doesn’t have, which might open exciting possibilities in the future, such as feature detection and advanced filters.

At this time, however, we’re only aiming to deploy Thumbor to Wikimedia production as a drop-in replacement for MediaWiki thumbnailing, targeting feature parity with the status quo.

Performance

Where does performance fit in all this? For one, Thumbor’s clean extension architecture means that the Wikimedia-specific code footprint is small, making improvements to our thumbnailing pipeline a lot easier. Running thumbnailing as a service means that it should be more practical to test alternative thumbnailing software and parameters.

Rendering thumbnails as WebP to user agents that support it is a built-in feature of Thumbor and the most likely first performance project we’ll leverage Thumbor for, once Thumbor has proven to handle our production load correctly for some time. This alone should save a significant amount of bandwidth for users whose user agents support WebP. This is the sort of high-impact performance change to our images that Thumbor will make a lot easier to achieve.
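Content negotiation for that works roughly like this (a simplified sketch of Accept-header checking, not Thumbor’s actual implementation):

```python
def pick_format(accept_header, default="image/jpeg"):
    """Serve WebP only to user agents that advertise support for it."""
    if "image/webp" in accept_header.lower():
        return "image/webp"
    return default

pick_format("image/webp,image/apng,*/*")  # -> "image/webp"
pick_format("image/png,image/*;q=0.8")    # -> "image/jpeg"
```

Since the choice depends on the request's Accept header, the caching layer has to vary the cached object on that header (or a normalized form of it) so that a WebP response is never served to a browser that cannot decode it.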

Conclusion

Those many factors contributed to us betting on Thumbor. Soon it will be put to the test of Wikimedia production where not only the scale of our traffic but also the huge diversity of media files we host make thumbnailing a challenge.

In the next blog post, I’ll describe the architecture of our production thumbnailing pipeline in detail and where Thumbor fits into it.

Gilles Dubuc, Senior Software Engineer, Performance
Wikimedia Foundation

by Gilles Dubuc at December 09, 2017 06:48 AM

December 08, 2017

This month in GLAM

This Month in GLAM: November 2017

by Admin at December 08, 2017 08:48 PM

Wikimedia Foundation

Community digest: WikiFemHack, the first Wikimedia hackathon to encourage gender diversity in Greece; news in brief

Photo by Magioladitis, CC BY-SA 4.0.

WikiFemHack was a one-day event in October held to explore and address issues of the gender gap and gender bias on Wikipedia.

Organised by Wikimedia User Group Greece—in collaboration with SheSharp, a female-oriented IT group, and the Open Knowledge Foundation of Greece—the event focused on finding the roots of these serious issues and coming up with ways to mitigate them.

The main goal was to welcome women, inform them and increase their participation in the technological and programming fields, in addition to Wikipedia.

During the event, the audience was divided into two rooms. The first hosted talks from Greek and international researchers about Wikipedia’s gender gap. Speakers informed the participants about community projects that women can join to help boost gender diversity, like Thesswiki, Wiki Loves Women, WikiWomenCamp, Women in Red, and others.

Simultaneously, the second room’s audience attended an editathon (editing workshop) on the Greek Wikipedia and a hackathon, where community programmers and volunteers were there to support the participants. The editathon aimed at creating and translating profiles of women using information from Wikidata and other sources. The hackathon was more of an introductory development session in which the participants were given programming tasks in Python, with an open part involving APIs and PHP.

At the closing of WikiFemHack, we gave out awards to the participants with the best contributions, including a laptop, an Arduino kit, a USB stick, free tickets to Voxxed Days Thessaloniki, and more. Near the end of the event, results from the “Why Women Don’t Edit Wikipedia” research about the gender gap in Greece were presented.

WikiFemHack was targeted at designers, scientists, educators, and undergraduate and graduate students with near-zero experience in similar projects. The goal was to inform the participants and have them exchange views so that, through conversation, we could take a step towards mitigating the gender gap in technology.

Some of the attendees did not even know they could edit Wikipedia, so taking the time to explain the projects that were presented in an informal fashion helped everyone to get a better grasp of the talks that were hosted.

Everyone had the chance to learn about the policies and code behind Wikipedia and the other Wikimedia projects, enrich the Greek Wikipedia with content and also use their coding skills.

WikiFemHack was free to attend; the organizers provided support throughout the event, and follow-up emails were sent later to keep participants posted with community updates.

The general feedback was highly positive, and the event is likely to be repeated next year. WikiFemHack built on the positive experience we had with ThessHack and a gender gap editathon organized by the user group in 2016.

Marios Magioladitis, Wikimedia Community User Group Greece

In brief

Photo by Jamie Tubers, CC BY-SA 4.0.

Wikimedia community in Nigeria hosts Wiki Loves Africa workshop in Ilorin: On 20 and 25 November, the Ilorin Wikimedia Hub organized Wiki Loves Africa 2017 workshops and uploading sessions in Ilorin. This workshop marks the first time ever that any Wikimedia event has been held in Ilorin, the state capital of Kwara and one of the largest cities in Nigeria. More about the event and the Ilorin community is on Meta.

Music Museum of Barcelona uses Wikipedia to guide their exhibition audience: The museum is holding an exhibition to celebrate 22 recently-donated Korean instruments. “Every time we prepare for an exhibition, we analyze in parallel what content we have in Catalan Wikipedia, and what’s in other languages,” says Sara Guasteví, documentalist in the Music Museum of Barcelona in a blog post. “In a particular case, in the exhibition of Musical Sculptures, we [used Wikipedia content about the sculptures] as a part of the text for the exhibition. Since then, we have been collaborating with Amical Wikimedia and Catalan Wikimedians, and allow them to access the information generated by the museum, specialized bibliography from the museum library, and help the Wikipedians to expand or create articles on various topics.”

Wikimania 2018 scholarship committee is looking for volunteers: The Scholarship Committee for Wikimania, the annual conference of the Wikimedia movement, is an important and diverse group of volunteers who help run the Wikimania scholarship program. People from all Wikimedia communities are encouraged to apply for this position. More on the duties and membership criteria on Wikimedia-l.

Registration for Wikimedia Conference 2018 is now open: Registration for the 2018 Wikimedia Conference, which will be held in Berlin from Friday, 20 April, through Sunday, 22 April, is now open. The Wikimedia Conference organizing team is willing to “provide the applicants with important information regarding the eligibility for participation, participant number regulation, registration procedure, specifics in regards to the travel and hotel booking as well as the visa application process,” they wrote in an email to Wikimedia-l where more details are available.

New board for Wikimedia Finland: Wikimedia Finland, the independent chapter that supports the Wikimedia movement in the country, has announced its new board for 2018, with three members keeping their seats from 2017 and two new members joining.

The WikiChallenge Écoles d’Afrique kicks off in 4 Francophone African countries: The WikiChallenge African Schools is a writing contest held in primary schools in several African countries (Mali, Tunisia, Madagascar, and Guinea). The contest takes place during October and November 2017, with primary school children adding Vikidia articles using the WikiFundi software. More in the This Month in Education newsletter.

Samir Elsharbaty, Writer, Communications
Wikimedia Foundation

 

by Marios Magioladitis and Samir Elsharbaty at December 08, 2017 07:51 PM

First World War digitization project nets awards for Romanian Wikimedia community

Macreanu Iulian, project representative. Photo by staff photographer/Europeana AGM 2017, CC BY-SA 4.0.

Romania’s experience of the First World War was mixed: they joined the war on the side of the Allies, and made successful attacks into Austria-Hungary and Bulgaria before being pushed back by numerically superior and better-equipped forces. When Russia left the war after the Russian Revolution of 1917, Romania was surrounded and forced to capitulate.

A century later, six members of Romania’s Wikimedia community jumped at the chance to document their country’s wartime history. Their efforts won them the overall prize for best portfolio in a continent-wide contest run by Europeana, a publicly funded institution that supports European digital cultural heritage, for which they were recognized at an event held this week in Milan.[1]

Iulian Macreanu, known on Wikimedia by the username Macreanu Iulian, was the volunteer team’s coordinator and representative at Europeana’s ceremony. He told us that the team has been working for more than three years to digitize their country’s war history, and that the Europeana contest helped expand their ambitions. “The contest’s proposed objectives,” he said, “presented new ways and opportunities to improve our ongoing project as well as a chance to evaluate and validate our works in comparison with other similar projects.”

In addition to Macreanu, the team is composed of Strainu, the technical coordinator; WWI Aprentices, content digitizer and corrector; Accipiter Q. Gentilis, Donarius, and Nenea hartia, all content editors; and Strainubot, their bot.

A 15 cm gun in use against the Romanian Army during the First World War. Photo by unknown via Österreichische Nationalbibliothek, public domain.

In all, the team has written more than 800 articles on the Romanian Wikipedia, added more than 1,200 images to Wikimedia Commons (the media repository that hosts many of the images used on Wikipedia), added to the metadata of 250 images on Europeana’s servers, and digitized more than 45,000 characters during a transcribathon.

“We’ve really learned some valuable lessons from this contest,” Macreanu says, “that will improve our perception and skills in dealing with these kinds of projects.” These include:

  • “First, we’ve discovered that Europeana has a huge amount of valuable resources that can be easily used for developing our project and bringing it to a new level.
  • “Second, we’ve learned that it is easy and very useful to develop similar projects both on Wikipedia and Europeana, as they are very complementary from many points of view.
  • “Third, we were ‘caught’ and became enthusiastic about the new type of works we discovered on Europeana: telling stories, bringing old photos to light, transcribing old war journals, and more.
  • “But perhaps the most important lesson was that we discovered it is not so difficult to make valuable contributions towards recovering our cultural heritage, even if you are a small team, from a not so big country.”

Macreanu said that the team’s work is not done despite the award granted to them; it is “business as usual.” They have many more Wikipedia articles to write, stories to upload to Wikimedia Commons, transcriptions to add to Wikisource, and more. They hope to take the experience gained from the competition and use it to improve both their own work and that of others in their community.

Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

Looking to get involved? Visit their project page. Want to learn more? See their portfolio from the contest.

Footnotes

  1. They also won an award for the best partnership project with a Europeana partner.

by Ed Erhart at December 08, 2017 05:52 PM

Wiki Education Foundation

Adding anthropological perspectives to Wikipedia

I accompanied Educational Partnerships Manager Jami Mathewson to DC last week where we attended the American Anthropological Association’s 116th annual meeting. We spoke with dozens of instructors about the importance of disseminating academic scholarship to a wider audience. How can we not only increase coverage of anthropology topics on Wikipedia, but also add an anthropological perspective to existing articles?

Our Theories brochure, which discusses the production of knowledge as it relates to Wikipedia (the seventh most visited site in the world!), was especially popular at the booth this year. Knowledge doesn’t sit still on Wikipedia; it’s updated to reflect new scholarship and new voices. But much of the work that comes out of academia doesn’t make it onto this highly-accessed resource. We want to change that. And we see our new partnership with the American Anthropological Association as an important part of this vision.

Public engagement in academic scholarship is possible through a Wikipedia assignment. Students learn how to contribute academic content to Wikipedia and to evaluate the site for content gaps. Students ask themselves—What anthropology topics are missing on Wikipedia? Why might they be missing? What can I do to right this? In evaluating content gaps on Wikipedia and understanding how to address them, students become creators, not just consumers, of knowledge. A Wikipedia assignment also presents students with important questions about knowledge production—Who gets to engage? Whose histories are represented? Whose voices are heard?

As the public engages more and more with digital platforms, sharing academic scholarship on Wikipedia becomes an increasingly important opportunity. In our partnership with the American Anthropological Association, we hope to engage more anthropology courses in contributing to this great resource. We made some great connections at the annual meeting this year. Here’s to adding more topics in social, cultural, and linguistic anthropology to Wikipedia and engaging more voices in the production of that knowledge!

by Cassidy Villeneuve at December 08, 2017 05:06 PM

Magnus Manske

Recommendations

Reading “Recommending Images to Wikidata Items” by Miriam, which highlights missing areas of image coverage in Wikidata (despite it being the most complete site in the WikimediaVerse, image-wise) and strategies to address the issue, I was reminded of an annoying problem I have run into a few times.

My WD-FIST tool uses (primarily) SPARQL to find items that might require images, and that usually works well. However, some larger queries time out, either in SPARQL itself or in the subsequent image discovery/filtering steps. Getting a list of all items about women with image candidates occasionally works, but not reliably; all humans is out of the question.
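For a concrete sense of the kind of query involved, here is a sketch of the “women lacking an image” case, using standard Wikidata properties (P31 instance of, P21 sex or gender, P18 image); WD-FIST’s actual queries and filtering steps differ:

```sparql
# Items for women (instance of human, sex or gender: female)
# that do not yet have an image (P18).
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 ;          # instance of: human
        wdt:P21 wd:Q6581072 .    # sex or gender: female
  FILTER NOT EXISTS { ?item wdt:P18 ?image }
}
```

Run over the full set of humans, a query of this shape easily hits the public endpoint’s timeout, which is exactly the scaling problem described above.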

So I started an extension to WD-FIST: A caching mechanism that would run some large queries in a slightly different way, on a regular basis, and offer the results in the well-known WD-FIST interface. My first attempt is “humans”, and you can see some results here. As of now, there are 275,500 candidate images for 160,508 items; the link shows you all images that are used on three or more Wikipedias associated with the same item (to improve signal-to-noise ratio).

One drawback of this system is that it has some “false positive” items; because it bypasses SPARQL, it picks up some items that link to “human” (Q5), but not via “instance of” (P31). Also, matching an image to an item, or using “ignore” on an image, might not be immediately reflected on reload, but the daily update should take care of that.

Update code is here.

by Magnus at December 08, 2017 11:55 AM

December 07, 2017

Wikimedia Foundation

Wikimedia Diversity Conference 2017: Outcomes, next steps and how to join the conversation

Photo by AbhiSuryawanshi, CC BY-SA 4.0.

Over the first weekend of November, 80 Wikimedians from 43 countries, speaking close to 30 languages, gathered in Stockholm for the Wikimedia Diversity Conference. The conference was organised by Wikimedia Sweden (Sverige), with support from Wikimedia Norway (Norge), and made possible through a conference grant from the Wikimedia Foundation.

The Diversity Conference provided a thematic space for community organizers and program coordinators across the world to share experiences from working in the diversity space within the Wikimedia movement. We came together to reflect on questions such as: What are some common pain points? How can we support each other better? What skills do we need to develop to advance our work in this field? With a shared understanding of what makes each of us unique, and what brings us together, we then engaged in conversation.

Participants of the conference brought with them examples of a wide range of initiatives to increase diversity in Wikimedia projects. At the conference they were able to discuss these examples, network with each other, and share insights. Capping the group at 80 also allowed participants to gather for group dialogue aimed at harnessing our collective knowledge and jointly exploring further actions. This dialogue activity was organised as a “WikiCafé”, relying on the conversational process known as World Café, and lasted around three hours. The starting point of the conversation was the strategic direction, guided by the question “How can we move towards knowledge equity?”

Outcomes

The outcomes of this conversation are now published. Conference participants also released a statement articulating that “different communities have different needs, and currently face uneven resources and distribution of support” and that “progress is blocked primarily by lack of resources in terms of infrastructure, access, capacity and people, and perceptions of Wikimedia as a closed community.” The resource page also lists the series of problems and identified solutions on which the statement is based.

It is now time to take this conversation beyond the conference and hear from the movement at large. We are asking community members to answer three questions: What does diversity mean in your context? What specific needs should be addressed in order to advance your work? And what resources would address those needs? Please join the conversation.

Where are we going next?

The outcomes of the conversation were captured and documented on the conference portal on Meta. In the synthesis, one group of conference participants pulled out common blockers as well as possible solutions. We also identified the most critical next steps needed to move towards knowledge equity. A drafting group created a summary of the key takeaways. Conference participants and other diversity advocates are invited to engage with these outcomes, help define what diversity means in their local context, and share more thoughts online around the three questions above.

If you are interested in building inclusive communities and projects in the Wikimedia universe, connect with the Wikimedia diversity group on Meta and join the diversity conversation.

We want to hear from you! Take this outcome to your local community, project or affiliate and identify what it means in your context. Imagine a world where everyone has equal footing to share in free knowledge, and share your thoughts on it.

Sara Mörtsell, Education Manager, Wikimedia Sweden (Sverige)
María Cruz, Communication and Outreach Project Manager, Learning and Evaluation, Wikimedia Foundation

by Sara Mörtsell and María Cruz at December 07, 2017 06:07 PM

Wiki Education Foundation

Student at York University wins faculty writing prize for Wikipedia article

“Did you know… that the Canadian government invests millions of dollars in First Nations communities to close the digital divide in Canada?”

On February 8, 2017, this blurb appeared on Wikipedia’s main page in the Did You Know? section, which features interesting facts from newly developed articles. In this case, the article was created and fleshed out by Andrew Hatelt, a student in a Wiki Education-supported course.

The DYK shoutout isn’t the only recognition Andrew received for his excellent work. In fact, Andrew received a fourth-year writing prize for the article at York University, where he is a student.

In Fall 2016, as part of Jonathan Obar’s course at York University, Resistance and Subversion on the Internet, Andrew created a new Wikipedia article about the Digital divide in Canada. Throughout the course, students discussed “the open internet as a site of resistance, with Wikipedia presented as an exemplar of this resistance to institutional power structures.”

A digital divide refers to the inequality of access to internet or digital resources for citizens. This lack of access may be due to economic reasons (the high price of Wi-Fi, for example), geographical reasons (connectivity limitations in rural areas), educational reasons (lack of digital literacy), and even social reasons (differences in digital practices due to age, gender, language, or culture). Andrew’s contributions to this new article highlight efforts to close the digital divide in Canada by government and other entities, and the societal implications of those efforts.

Andrew was initially surprised at the mention of a semester-long Wikipedia project, as Wikipedia was often condemned by his high school teachers. In an interview about York University’s recognition of Andrew’s work, Professor Obar had some great things to say about the merits of a Wikipedia assignment.

“When students contribute content to Wikipedia, they not only benefit from the active learning outcomes achieved through the power of online social networks—their work lives on beyond the scope of the class, helping others to learn about, debate, and change the way we understand the world,” he said.

Jonathan Obar, Andrew Hatelt, and Jon Sufrin 
Image: File:Obar Hatelt Sufrin York University Contest Winners.jpg, Jon Sufrin, on behalf of Faculty of LA&PS, York University, CC BY-SA 4.0, via Wikimedia Commons.

The LA&PS 2017 Writing Prize that Andrew received is a “faculty-wide competition open to all students from the Faculty of Liberal Arts and Professional Studies at York University,” said Jon Sufrin, coordinator of the competition. Submissions are diverse, ranging from “reports, reflective writing, grammar assignments, feature-length journalism, and academic essays.” Professor Obar nominated Andrew’s work for the fourth-year category.

The prize is highly competitive, with about 25,000 students eligible across 21 different departments and schools.

“To be nominated for, let alone win, the competition is a significant achievement,” said Sufrin. “Overall, I’d say the fourth-year level of the LA&PS Writing Prize is usually the toughest to win. The papers are expected to be well-researched, well-composed and deep pieces of writing that showcase the best the Faculty has to offer.”

“I think it is fantastic that the work I did with Wikipedia was even considered for the award, and was completely blown away after finding out that it had won,” said Andrew.

Not only was the recognition satisfying, but the assignment itself was an enriching experience for Andrew.

“I learned and gained more from working with Wikipedia than I have from almost any other assignment I have completed,” he said. “Learning how to interact with Wikipedia’s collaborative social network, adapting to a work environment that isn’t a traditional word processor, and practicing a style of writing which isn’t common among university assignments. These are all things that I would not have experienced if I had been working on something more traditional, yet I believe having less traditional experiences like these is also an important part of growing academically.”

Sufrin also spoke of the value of a Wikipedia article as an academic assignment.

“It’s pretty clear that digital writing is going to be in demand in the future, and this kind of writing takes a specific set of skills to do well,” he said. “You have to be able to sort through all the available sources, have skills at hyperlinking, and understand how to make use of the web as a dynamic medium. Digital writing isn’t just screen prose, it’s interactive prose. All of these skills are in addition to actually being able to write something. And if you get your submission recognized by the online community of Wikipedia editors, as Andrew has, it means you’ve really done a great job.”

Through his editing efforts, Andrew has taken part in an act of digital citizenship, and he hopes to remain involved in improving the great resource that is Wikipedia.

“As time goes on and as new developments arise regarding the digital divide in Canada, I am sure that I will come across more information which would be a great fit for the Wikipedia article,” Andrew said. “In these cases I look forward to going back and working with my article again, and also hope to see others contribute to it as they discover suitable information.”

Contributing to Wikipedia has a tremendous impact—not only to a student, but also in the greater context of public literacy.

“By contributing this content, Andrew has contributed to a valuable resource that Canadians and others can use to understand one of the biggest twenty-first century challenges that we face in Canada, how to ensure that all Canadians have equal access to the internet,” said Professor Obar. “I think in giving this award, York University is not only acknowledging the quality of Andrew’s research and writing, but his efforts to improve the world outside the university walls.”

Interested in teaching with Wikipedia? Find out more at teach.wikiedu.org.

File:Winner of the 2017 Liberal Arts and Professional Studies Writing Prize, York University, Toronto.jpg, Jon Sufrin, on behalf of Faculty of LA&PS, York University, CC BY-SA 4.0, via Wikimedia Commons.

by Cassidy Villeneuve at December 07, 2017 05:27 PM

Wikimedia Foundation

Why I write science articles on Wikipedia

Gerardus Mercator placed magnetic mountains on his maps to account for compasses pointing north. Image by Gerardus Mercator, public domain.

I am a geophysics professor. I first encountered Wikipedia in 2010, when I did an online search for my specialty, rock magnetism. The top hit was a Wikipedia article. It was a sad little stub—three sentences, no references and one external link—yet it was the first resource people would turn to on this subject.

I created an account (with the user name RockMagnetist), and started editing that stub. I expanded it by a factor of about 20.[1] From there, I started looking at increasingly general articles—first paleomagnetism, then the Earth’s magnetic field, and then geophysics. I found huge holes in their coverage. The main article on geophysics was primarily a collection of lists, while major subfields like geophysical fluid dynamics and mineral physics had no article. Perhaps, I thought, I should be spending my time filling in the big picture.

This wasn’t a trivial task, as geophysics is a broad, interdisciplinary subject. It intersects with many fields, including geology, oceanography, glaciology, and space physics, and its boundaries are hard to define. There is even a field of biogeophysics. No book or series of books covers the full breadth of this subject. Given this, I came up with a novel approach for the main article. I started with basic physical phenomena, like gravity and heat flow, and wrote sections on how they related to the Earth. Then, I started to write about how geophysicists gather information on these phenomena to learn about the Earth and its surroundings, and I created some of those missing articles for the subfields. I made a lot of progress, but got to a point where adding to the geophysics article would take more research than I had time for.

Putting the geophysics project aside, I wandered about Wikipedia, contributing to hundreds of articles in the earth sciences and physics on a broad range of subjects. Often, these were the result of some call for help. Someone complained that the article on momentum was inaccessible, so I rewrote it to build up the subject from the simplest ideas to the most complex. Other times, I found some fascinating sources on a subject, like the history of geomagnetism (see the image at top). And perhaps most often, I came across a page with glaring problems and couldn’t resist the urge to fix them.

Gravity map created by NASA’s GRACE mission. Image by NASA, public domain.

Scientific biographies are a special challenge. Many Wikipedians think it unfair that there aren’t more articles on scientists, but there are good reasons for that: generally, science journalists write about research, not the people doing it. Even scientists write little about the people in their field. Often, Wikipedians can’t write a good biography on a scientist until their obituary is published. Some argue that we can still write a biography of a scientist if their work is discussed in depth in reliable, independent sources, but that is also rare. Although a distinguished scientist’s publications will have been cited thousands of times, typically each citation is a bare mention. As a result, many of the scientific biographies on Wikipedia are little more than CVs. Still, there are some surprising omissions (even the occasional Nobel Laureate with just a stub), and I have written biographies on several scientists.

So why do I do all this writing for Wikipedia? In my day job, I have published two research papers that have been cited over 100 times. The rest have been cited less frequently. Some of the Wikipedia articles I have edited, however, are read by over a million people a year. Wikipedia’s impact is such that even a stub, like that article on rock magnetism, can be the top hit for a Google search. I spent some time on the Physics Stack Exchange looking at questions on geomagnetism, and was amused to find people recommending articles that I wrote.

Recently, I became a visiting scholar with the Deep Carbon Observatory (DCO). The visiting scholars program was developed by the Wiki Education Foundation to connect Wikipedia editors with organizations that have resources for research and would like to see more articles on their subject area. Because I have worked on a lot of earth science articles, people who were trying to set up the DCO program approached me for help. At first I wasn’t interested, but when I read about the DCO, some of their work caught my interest. For example, they found that as tectonic plates dive into the interior, more carbon is taken from the Earth’s surface than is returned to the surface by volcanoes. At this rate, life will start to suffer from carbon limitations—albeit in about 100 million years.

I am getting an honorarium for this work, which is a first for the visiting scholars program. The Wiki Education Foundation hopes that incentives will lead to better use of the program’s resources and benefit Wikipedia.[2] The DCO is a loose collaboration of thousands of scientists, doing work that covers a broad range of subjects related to carbon in the Earth, including deep life. I have agreed to provide context for their work, taking care to remain neutral. So far, I have expanded a couple of scientific biographies of DCO scholars; created articles on organic minerals and extraterrestrial diamonds; and extended the list of mineralogists. I have also added a lot of material to geochemistry—I was amazed to find that most of the existing article was taken from an article in the 1911 edition of Encyclopedia Britannica! One of my next goals is to improve the geology section in the article on diamonds.

The Moon Mineralogy Mapper, a spectrometer that mapped the lunar surface. Photo by NASA, public domain.

And one of these days I’ll get back to that article on rock magnetism.

Andrew Newell (User:RockMagnetist), Wikimedian

Footnotes

  1. I must confess that it is still not a very good article. I think I’m too close to it and have too much to say, so I have trouble saying anything.
  2. The Wiki Education Foundation and I are well aware that there is a potential conflict of interest (COI). Wikipedia has a “behavioral guideline” to avoid such conflicts, and Wikipedians are responsible for following both the basic disclosure requirements and the guidelines set by the site’s community—which includes declaring their source of income on their user page and on the talk page of each page they edit (this being Wikipedia, there is a standard template for that). I created a special account for this work and I add the template to talk pages when I make major changes.

by Andrew Newell at December 07, 2017 04:10 PM

December 06, 2017

Wikimedia Tech Blog

Think twice, code once: How Wikimedia shares common functionality across different projects

Laziness is a virtue to some. “I’m not lazy. I’m just… Incredibly efficient and economical.” Mockup, CC BY 2.5. Painting, public domain. Text by various contributors via Wikipedia/”Laziness,” CC BY-SA 3.0.

Don’t repeat yourself

The Android and iOS Wikipedia apps both offer similarly polished native experiences to Wikipedians around the world. Usually, building an identical feature on different platforms requires duplicating effort, with different source code written by different people, each implementation with its own unique bugs. This means that for each new feature, the expense to develop and maintain it must be paid not once but twice, and a bug fixed on one platform isn’t necessarily fixed on the other.[1] Additionally, it is common that only a given platform’s developers are sufficiently proficient at programming for that target, so multiple parties are often involved even for seemingly trivial changes.

This is where the Wikimedia page library comes in. The page library is a new project by the Wikimedia Readers department to share development and design resources between different software projects, starting with the Wikipedia native applications. The goals of the project are to:

  • Share features and talent: New features are invested in once by a cross-team contributor pool that can focus on a single implementation providing a consistent and accurate experience across all platforms.
  • Reduce bugs and maintenance: The consolidation of duplicate code has already greatly helped to eliminate flaws and each enhancement is propagated to all platforms.
  • Improve stability: Testing page content (“anything in the browser”) for Android and iOS in a scalable way has historically been particularly challenging, flaky, and error prone. The page library has initially provided numerous DOM state unit tests for transforms as well as integration demos, and, in the long term, may also provide integration tests for transforms and visual regression integration tests at the component level.
  • Make changes easier and safer: In general, changes to common code do not require expertise in Vagrant, Android, or iOS to make, and separating platform concerns keeps effects predictable and tests performant and sustainable. Additionally, we have started to consolidate the current cascade of style inheritances across multiple projects.
  • Update page presentation with page content and independently of app updates: The library is versioned independently of the apps and may eventually be provided remotely so fixes and improvements could be published continuously on the fly.

Working together on a concerted effort maximizes the impact of each contributor and the quality of each product, and it’s not without precedent either. After an initial investment to consolidate and clarify sources into the library, every consumer reaps the benefits. The page library is currently used in production for both the Android and iOS apps for theming content, enlarging images, and much more. It’s already planned to be used by the upcoming Page Content Service, the future endpoint for Wikipedia clients including the Android, iOS, and even web apps.

The remainder of this article will discuss the development of a new page library feature, optimized image loading, also known as lazy loading.

What is lazy loading?

A simulated page of images lazy loading on a 3G Internet connection. Underlying mockup by Android Developers, CC BY 2.5. Other images by Vincent van Gogh, public domain.

There are many different strategies that can be employed to improve image loading for webpages, a common performance bottleneck that many websites must optimize for, especially for mobile users. One strategy that dramatically decreases page load time and reduces the average number of bytes sent is called lazy loading. This technique involves initially replacing large images with placeholder content, such as the grey boxes above. When a user first visits the web page, their browser only downloads the images within view. Offscreen images are then gradually loaded as the user scrolls further into the content (or never, if they only view a portion of it). Lazily loading images on demand, as opposed to eagerly loading all of them, proved enormously effective for Wikipedia users around the world, as well as for the efficiency of Wikipedia’s own data centers, when it was introduced on the mobile web over a year ago.

Jon Robson covered the Readers’ Web team’s implementation of lazy loading images last year in a fantastic piece, “How Wikimedia helped mobile web readers save on data”. The native apps share many of the same concerns as the mobile web, so there was a strong desire for this functionality on both Android and iOS. The next sections will describe differences from the original web implementation.

What’s new in lazy loading?

When we first considered adding lazy loading to the native apps, we wanted to repeat the Web team’s success but, as a previous section mentioned, only write it once. Ideally, the page library would simply reuse the web’s implementation as is, but unfortunately that was impractical. The web’s server implementation was written in PHP, which isn’t supported by frontend clients (the native apps or browsers in general), and the web’s client implementation depends on jQuery, which the native apps otherwise do not need.

Since a rewrite was necessary, the choice of which language to use quickly came to the forefront. JavaScript was pragmatically the only option as a truly universal language for smartphone apps, desktop web browsers, server farm services, and beyond, and it was a natural choice besides: it goes so well with webpage content, has a plethora of tools and an extraordinarily active community of developers, and a great quantity of the code to be shared was already written in it. According to some, JavaScript is “the lingua franca of the web”[2] for both front and backend development. Initially, the page library is for frontend usage only by the native apps. However, writing in JavaScript means that it’s already slated to be supported on the backend and in any future browser usage.

To understand the changes needed for the page library implementation, it’s helpful to first broadly revisit the current web implementation, which is delineated by server-side and client-side portions:

Plain HTML (1) is the input. The service (2) serves HTML as page content which includes transforming large images into economical grey placeholder boxes (3). The client (4) receives the content with the images already removed and may selectively re-add them by “undoing” the lazy load transform. At a high level, the page library works the same way except the service (2) is currently executed client-side too. In the future, the service (2) will move to the backend for improved performance but it will be identical code thanks to the ubiquitous quality of the page library’s JavaScript implementation.

Transform lifecycle: images to placeholders and back again

The service (2) replaces each image with a span, persisting important attributes such as the image URL as data-* attributes, which are application-specific and untouched by clients. If the client displays this transformed HTML verbatim, all large images appear as grey boxes: these are the spans. Pre-content presentation in this style is known as a skeleton screen, and it looks kind of like a comic-style drawing of a newspaper, with lines and blocks instead of legible content:

Text by various contributors via Wikipedia/”China,” CC BY-SA 3.0.

For a given image, the plain input HTML (1) may be:

<img width=100 height=200 src=picture.png>

And the transformed content HTML (3) would then be:

<span
class='pagelib_lazy_load_placeholder pagelib_lazy_load_placeholder_pending'
style='width: 100px'
data-width=100
data-height=200
data-src=picture.png>
<span style='padding-top: 200%'></span>
</span>
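The inner span’s padding-top is what reserves vertical space: percentage padding resolves against the parent’s width, so height divided by width preserves the image’s aspect ratio. Here is a minimal sketch of that geometry computation; the function name is illustrative, not the actual page library API:

```javascript
// Illustrative helper (not the real page library API): derive the
// placeholder's inline width and the inner span's padding-top from the
// image's declared dimensions. Percentage padding resolves against the
// parent's width, so (height / width) * 100 preserves the aspect ratio.
function placeholderGeometry(width, height) {
  return {
    width: `${width}px`,
    paddingTop: `${(height / width) * 100}%`
  };
}
```

For the 100×200 image above this yields a width of 100px and a padding-top of 200%, matching the transformed HTML.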

The anticipatory load distance is a couple of screens on the Android client (4), so in practice the user will often never notice images that have yet to be loaded. Placeholders are reevaluated for loading whenever the user scrolls. Loading simply means selectively undoing the transform applied by the service (2).
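The scroll-time eligibility check can be sketched as a simple comparison against a look-ahead boundary. This is a hedged illustration rather than the library’s actual code; the two-screen default mirrors the Android behavior described above:

```javascript
// Illustrative check (not the actual page library code): a placeholder
// qualifies for loading once its top edge falls within `screensAhead`
// viewport heights below the bottom of the current viewport.
function shouldLoad(placeholderTop, scrollY, viewportHeight, screensAhead = 2) {
  const loadBoundary = scrollY + viewportHeight * (1 + screensAhead);
  return placeholderTop < loadBoundary;
}
```

A scroll handler would run this over the remaining placeholders and start downloads for whichever ones now qualify.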

Text by various contributors via Wikipedia/”China,” CC BY-SA 3.0. Images left to right 1, 2, 3/public domain, CC BY-SA 3.0, CC BY-SA 3.0 IGO (respectively).

When an image is queued for downloading, the placeholder enters a loading state:

<span
class='pagelib_lazy_load_placeholder pagelib_lazy_load_placeholder_loading'
style='width: 100px'
data-width=100
data-height=200
data-src=picture.png>
<span style='padding-top: 200%'></span>
</span>

This placeholder state is styled as a pulsing grey box using performant CSS animations, cueing a quick-scrolling user that the image is coming and will soon shimmer in. To actually start the download, an image element just like the one prior to the transform is also created at this time, but it is left unattached and asynchronously loads the picture behind the scenes:

<img class=pagelib_lazy_load_image_loading width=100 height=200 src=picture.png>
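That detached-download flow can be sketched as follows. This is an assumption-laden illustration, not the library’s code: the `createImage` parameter stands in for `document.createElement('img')` so the control flow is visible without a browser DOM.

```javascript
// Illustrative sketch (not the real page library code): download via a
// detached image element. The element is created and configured but never
// attached to the document; assigning `src` starts the request, and the
// callbacks decide the placeholder's final state.
function loadDetached(src, createImage, onDone) {
  const img = createImage();                 // stand-in for document.createElement('img')
  img.onload = () => onDone('loaded', img);  // success: swap the placeholder for img
  img.onerror = () => onDone('error', null); // failure: show the error placeholder
  img.src = src;                             // assignment kicks off the download
  return img;
}
```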

Finally, after a successful download the entire placeholder is replaced with the previously detached image:

<img class=pagelib_lazy_load_image_loaded width=100 height=200 src=picture.png>

Since a user’s connection may be poor, the page library also presents the error state distinctly. The placeholder enters an error state when loading fails:

<span
class='pagelib_lazy_load_placeholder pagelib_lazy_load_placeholder_error'
style='width: 100px'
data-width=100
data-height=200
data-src=picture.png>
</span>

And a click listener is provided so the user can retry the failed download when connectivity improves. This deviates from the web implementation.
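Taken together, the placeholder class names trace a small state machine. The following is a hypothetical summary of those transitions; the real library manages them on live DOM nodes rather than through a table like this:

```javascript
// Hypothetical summary of the placeholder lifecycle as state transitions:
// pending -> loading when the download is queued; loading resolves to
// loaded or error; error -> loading again via the click-to-retry listener.
const TRANSITIONS = {
  pending: ['loading'],
  loading: ['loaded', 'error'],
  error: ['loading'],
  loaded: []                     // terminal: the image has replaced the placeholder
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```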

Placeholder structure

There are many possible implementations for placeholders which are essentially rectangles:

  • Replace the original image with a span and append a new downloaded image to the span.
    This option has the best cross-fading and extensibility but makes duplicating all the CSS rules for the appended image impractical in the wild world of possible Wikipedia content.
  • Replace the original image’s source with a transparent image and update the source from a new downloaded image.
    This option has a good fade-in and minimal CSS concerns for the placeholder and image but causes significant reflows when used with image widening.
  • Replace the original image with a span and replace the span with a new downloaded image.
    This approach, used by the web, has a good fade-in but some CSS concerns for the placeholder, particularly max-width, and causes significant reflows when used with image widening.
  • Replace the original image with a couple of spans and replace the spans with a new downloaded image.
    This is the current approach; it is about the same as the web’s but supports image widening without reflows.

Another implementation difference arose from performing what were server-side transforms on the client. Prior to displaying content, the native apps execute a variety of transforms to enhance the browsing experience. When lazy loading was considered, it was envisioned that this transform would be similar enough to the preexisting ones that perhaps only the ordering would matter. However, one important difference emerged: all the transforms require content HTML to be inflated into the DOM before they can execute, yet the nature of the lazy load transform is that it must strip all images prior to inflation to avoid eager downloads by the browser, which may download images at any time regardless of whether their elements are attached or part of a parentless tree. We solved this problem by performing the transform in a separate document entirely. This detail will vanish when the transforms are performed conventionally in the context of a headless server, like the web’s PHP implementation, but it’s one of the few integration concerns not captured internally by the library for client-side transform usage.

Lastly, since the page library transforms are all written in JavaScript, a non-JavaScript client won’t execute them, meaning that the web’s noscript fallbacks were unnecessary.

Outcome and the road ahead: the code so nice they wrote it once

On a fast connection, a user may never notice that images have been lazily loaded. On a slow connection, the user will hopefully notice only that the load time is nearly as good as on a fast connection, with the anticipatory look-ahead strategy balancing user experience against data savings. The end result is usually that the article contents appear nearly unchanged while the data savings are excellent.

A surprise bonus effect of lazily loading images was seen on economy Android devices when viewing large articles with many images. Previously, these devices would stutter, become unresponsive, or even crash! With lazy loading, content loads promptly, on demand, and doesn’t exhaust memory.

Lazy loading is available today on the Wikipedia Android app and planned for the iOS app. The page library is already an integral part of both apps and in the future will also appear as a component of the Page Content Service. There are many transforms left to consolidate across platforms, new features to write, and bugs to fix. The page library makes it possible to ship high quality features across diverse platforms at a fraction of the cost. Don’t think twice, it’s al- … Actually, do think twice. Then code only once.

Stephen Niedzielski, Software Engineer
Wikimedia Foundation

Footnotes

  1. Note too that maintenance is like bringing a car to the mechanic for regular repair: sometimes commonplace, sometimes unexpected and expensive. Software maintenance is a simple term for a recurring cost that comes with compound interest for the lifetime of a feature’s existence. It’s a big deal!
  2. A common language used to make communication possible between entities who do not share a native language.

by Stephen Niedzielski at December 06, 2017 08:19 PM

Neha Jha

Community Bonding and First Task

The community bonding period has started. I have started talking to my mentors and getting to know my role better. I have joined Wikimedia’s IRC channel to get to know the community better. In open source, people are generally very helpful. I have attended a few technical meetings and learned a thing or two from them.

I wanted to complete at least one task before starting the internship, so I started reading more about the Slim framework and the Wikimedia Slimapp library. Initially, I faced a few problems getting started with the task, but my mentor Bryan helped me. Finally, after a few days, I was able to submit my first patch for the internship. My happiness knew no bounds when I saw the uploaded patch. It feels amazing to work for Wikimedia.

I have already started working on my next task. Surely, my understanding of the project is growing day by day. Really looking forward to the three months of coding :)

by Neha Jha at December 06, 2017 07:43 PM

User:Geni

Wikipedia, right, but on the blockchain round 2

Well, it’s more a re-launch of Everipedia, which has been around for a while. Everipedia is essentially an attempt to make money by starting with a mirror of Wikipedia and then going forward with no notability standards. Income attempts consist of charging people for article-writing services and running adverts.

So far success seems to have been fairly limited: a low Alexa rank and a fairly low rate of linking from reddit, a fair amount of which focuses on stuff of interest to reddit’s nastier sub-communities. It also has the issues you would expect of something with no notability standards, and has become a popular platform for the likes of 4chan to libel people.

Still I’ve seen worse.

So where does the blockchain come in? Mostly in the form of what appears to be (or what is at least marketed as) a relaunch which uses much the same language as Lunyr: decentralised, censorship-resistant, and on the blockchain. In practice this seems to boil down to them thinking about hosting on the InterPlanetary File System (so Lunyr again).

One trick they have beaten Lunyr on is hiring Wikipedia co-founder Larry Sanger, whose name is apparently enough to get your press release read. Quite how a project that claims to be removing gatekeepers is meant to work with a guy whose previous project (Citizendium) was effectively a nested pile of gatekeepers, I’m not sure.

What this is all in aid of is an initial coin offering. Called IQ, the coin’s nominal use case appears to be buying the right to challenge edits on the wiki (although things are at the unpublished draft stage). How they plan to make that something people would want without completely breaking the wiki system is unclear but I personally doubt they will ever have enough traffic for it to matter. If they get their coin out before the ICO bubble bursts they might raise some cash (after all they have the advantage of a product that actually sort of exists) otherwise not so much.

Everipedia has been trying to compete with Wikipedia for two years now; even as someone who keeps an eye on Wikipedia competitors, it’s barely on my radar. I don’t see anything in the relaunch that would change that.

As for Lunyr, they continue to update their advertising system, but since everything is still behind a closed alpha it’s hard to say what’s going on. They are, though, apparently having issues with spotting bugs in their software due to CryptoKitties messing up the Ethereum blockchain. They’ve also apparently been removed from the Liqui cryptocurrency exchange, but I really doubt that will matter to anyone beyond the unfortunate compulsive gamblers who trade the thing.


by geniice at December 06, 2017 07:15 PM

Wiki Education Foundation

Monthly Report, October 2017

Highlights

  • In October, we formalized a partnership with the American Anthropological Association (AAA). Through this partnership, AAA will encourage its members to participate in Wiki Education’s programs, increasing the availability of information about anthropology from a global perspective on Wikipedia.
  • Classroom Program staff held their second round of Office Hours earlier this month. These sessions invite instructors to ask questions of program staff and each other about using Wikipedia in the classroom, and offer us a valuable opportunity to gauge the changing needs of our program participants.
  • We announced a new Visiting Scholar this month; Paul Thomas will work with the University of Pennsylvania to develop Wikipedia articles related to Classics. He has already done some great work since becoming a Visiting Scholar, including bringing Liber physiognomiae up to Good Article status.
  • Cassidy Villeneuve was hired this month as Wiki Education’s new full-time Communications Associate. Her role supports staff across departments and projects with general communication needs, as well as maintaining the wikiedu.org blog.

Programs

Educational Partnerships

We formalized a partnership with our newest academic association, the American Anthropological Association (AAA). AAA is an organization promoting the development and dissemination of anthropological knowledge. Through this partnership and Wikipedia initiative, AAA will encourage its members to participate in Wiki Education’s programs, increasing the availability of information about anthropology from a global perspective on Wikipedia.

Outreach Manager Samantha Weald ran two online workshops for instructors interested in learning more about Wiki Education and teaching with Wikipedia. The first webinar was with a graduate level course at George Mason University, where students were interested in “Higher Education in the Digital Age” and how Wiki Education’s work promotes open access, open educational resource-enabled pedagogy, and digital literacy. The second webinar was for a group of instructors at Louisiana State University interested in implementing these assignments in their research and writing courses.

Classroom Program

Status of the Classroom Program for Fall 2017 in numbers, as of October 31:

  • 306 Wiki Education-supported courses were in progress (169, or 55%, were led by returning instructors)
  • 5,946 student editors were enrolled
  • 58% were up-to-date with the student training
  • Students edited 3,100 articles, created 92 new entries, and added 1.05 million words.

It’s the middle of the term, which means that our students are well underway with their Wikipedia assignments. They’ve chosen their topics and are beginning to draft their contributions. With more than 5,500 students enrolled on the Dashboard, the Classroom Program is busy monitoring their work and responding to questions.

As part of our efforts to improve the quality of our support to a growing number of courses doing Wikipedia assignments, we held our second round of Wiki Education Office Hours earlier this month. During these sessions, instructors in our program have the chance to meet face-to-face with Wiki Education staff, interact with other instructors using Wikipedia in their classrooms, and ask critical questions about their Wikipedia assignments. Questions range from concerns over privacy on Wikipedia to the logistics of keeping track of student contributions. These sessions are incredibly valuable to us, and we use them to gauge the changing needs of our program participants.

If one word could define Fall 2017, it’s quality. With several terms of experience and growth behind us, we’ve focused our efforts this term on ensuring that we provide the most effective support to our program participants while refining processes to make sure that our student contributions to Wikipedia are of the highest quality. While the Fall term is just gearing up for our students, we’re already preparing for Spring 2018. We’ll continue to apply our learnings as we transition to the next term.

In 2015, the United States Supreme Court issued a judgment in Obergefell v. Hodges, ruling that state bans on same-sex marriage were unconstitutional. Many saw one of their fondest desires granted: the ability to legally marry the person they loved. This freedom is, however, not recognized by all Native American tribes, making it necessary for people such as Heather Purser to fight for tribal recognition. Purser knew from an early age that she was a lesbian but did not come out until her teens, aware that not everyone in her family and community would be supportive of her sexuality. After coming out, Purser was subjected to physical and emotional attacks; however, she didn’t allow this to discourage her, and in 2009 she began petitioning her tribal leaders to recognize same-sex marriage. Approximately two years later, her one-woman campaign ended with about 300 of her fellow tribal members voting in favor of recognizing same-sex marriage. If not for a student in California State University educator Karyl E. Ketchum’s Gender and Technoculture class, this woman’s courageous efforts might not have been recorded on Wikipedia for months, years, or possibly ever, which would have left a hole in Wikipedia’s LGBT coverage.

There are many amazing people in this world — all of us likely have several people that would fit this description, both in our personal lives and from those we’ve discovered through the media or education. Janice Forsyth is one person who merits being called “amazing”, as she has earned multiple degrees from Western University, began working as an educator while earning a PhD, wrote multiple publications and technical reports, and also competed in Western University’s badminton, cross country, and track & field teams while a student. She’s also the University’s Director of the First Nations Studies Program and has earned awards for both her time as an athlete as well as her academic and scholarly work. Forsyth’s research leans towards Aboriginal and indigenous physical culture and its history, interests that were undoubtedly inspired by her ties to the Fisher River Cree and Peguis First Nation in Manitoba, Canada, via her maternal family. Only a portion of Wikipedia’s editors actively create articles on academics, which makes this contribution from Victoria Paraschak’s Sport and Aboriginal Peoples in Canada students at the University of Windsor that much more important. If not for them, Forsyth’s impressive accomplishments may have continued to go unrecorded for an indeterminate amount of time — just as with Purser’s article.

Animals form aggregations for a wide variety of reasons, and sometimes those aggregations can have unexpected effects. When pods of Norwegian orcas encounter a school of herring, the orcas will split off a portion of a school and herd it into a tight, ball-like formation from which members of the pod will feed. This behavior is known as carousel feeding. Students in David Wilson’s Animal Behaviour class expanded a short stub on this topic into a fairly substantial article that looks at the herding and feeding behavior of the orcas, and the ecological impact of this practice.

Niphanda fusca butterfly
Image: File:Niphanda fusca9.jpg, Oleg Kosterin, via Wikimedia Commons.

The bogong moth is an Australian moth that forms aggregations for another purpose: during the heat of summer, they migrate to mountain caves where millions of them gather, in densities as high as 17,000 individuals per square meter. Such a dense gathering doesn’t simply concentrate moths; it also concentrates the elements their bodies have absorbed while feeding. In 2001, runoff from these caves, enriched as it was in moth-transported arsenic, proved toxic to native grasses. The bogong moth article was one of the many articles about butterflies and moths expanded by students in Joan Strassmann’s Behavioral Ecology class. Working from little more than a stub, a student in the class expanded Wikipedia’s coverage of this moth into a fairly comprehensive article that discusses its social behavior, diet, distribution, migration, and life cycle, among other things.

Niphanda fusca butterfly
Image: File:Niphanda fusca7.jpg, Oleg Kosterin, via Wikimedia Commons.

Butterflies are usually thought of as delicate and beautiful. The marsh fritillary, Polygonia c-album, Niphanda fusca, and Gonepteryx rhamni all fit this picture, but many other lepidopterans have an entirely less benign reputation. The eastern spruce budworm, Choristoneura fumiferana, is one of the most destructive native insect pests in North America; outbreaks are capable of defoliating millions of hectares of forest. The codling moth is a major pest of apples and pears. Spodoptera litura, commonly known as the tobacco cutworm or cotton leafworm, is a pest of a wide variety of crops, including (as the common names suggest) tobacco and cotton, as is Heliothis virescens, commonly known as the tobacco budworm. The larvae of the cabbage looper attack cruciferous vegetables, while those of the almond moth feed on dried fruit, grains, and nuts. These articles are among the very many that were substantially expanded by students in the Behavioral Ecology class.

Drawing by Giorgio Vasari
Image: File:Page from “Libro de’ Disegni”- 1.jpg, by Giorgio Vasari, public domain, via Wikimedia Commons.
Artwork by Tobias Schutz
Image: File:Harmonia macrocosmi cum microcosmi.jpg, by Tobias Schutz, public domain, via Wikimedia Commons.

Students in Alison Marshall’s class, The Scientress, have been working to expand articles on female scientists. In October, students created new articles on Verena Tunnicliffe, a Canadian marine biologist known for her work on hydrothermal vents, and Patricia Woolley, an Australian zoologist who works on dasyurid marsupials. Several students in Sara Galletti’s Art in Renaissance Italy class uploaded excellent photographs. One student uploaded an artwork that was created by Tobias Schutz in 1654, while another added a photograph they took of a Giorgio Vasari drawing at the National Gallery of Art.

Two other students also uploaded images they took at the National Gallery of Art, which show a deep appreciation of the building’s architecture — itself a lovely example of creativity.

The lobby of National gallery of Art East Building
Image: File:National Gallery East Building Lobby.pic.jpg, by Atsushi Hu, CC BY-SA 4.0, via Wikimedia Commons.
Walkway to West Building and Cascade Cafe in National Gallery of Art, Washington D.C.
Image: File:Walkway to West Building and Cascade Cafe.jpg, by Evelyn y, CC BY-SA 4.0, via Wikimedia Commons.  

Community Engagement

The first page of a 1505 copy of the Liber Physiognomiae.
Image: File:Liber Physiognomiae Michael Scot.jpg, by Jean Petit (1505, original),
Gen. Quon (2016, touch-ups), public domain, via Wikimedia Commons.

Community Engagement Manager Ryan McGrady began the month announcing a new Visiting Scholar at the University of Pennsylvania, Paul Thomas. Editing as User:Gen. Quon, Paul is a prolific Wikipedian who has written more than 250 Good Articles, but he did not have access to key scholarship in one of his major areas of interest: Classics. We were excited to be able to connect him with Penn’s vast library resources and its Department of Classical Studies, which has provided undergraduate and graduate programs for more than 200 years. He has already written some impressive articles since becoming a Visiting Scholar. This month, he brought the article about the Liber physiognomiae, a book written in the 13th century about physiognomy, the evaluation of a person’s personality based on their outward appearance, up to Good Article status.

Other Visiting Scholars were busy this month, as well. Deep Carbon Observatory Visiting Scholar Andrew Newell had two articles appear in the Did You Know section of Wikipedia’s Main Page, extraterrestrial diamonds and organic mineral:

 

An artist’s conception of extraterrestrial diamonds next to a hot star.
Image: SpaceNanoDiamonds.jpg, by NASA/JPL-Caltech, public domain, via Wikimedia Commons.

 

[Did You Know] … that carbon-carrying minerals are known as organic minerals, except for some that were considered inorganic before 1828?

 

[Did You Know] … that extraterrestrial diamonds in meteorites preserve their history from before the Solar System formed?


Two high-visibility articles improved by George Mason University Visiting Scholar Gary Greenbaum were promoted to Featured Article. The first was Spiro Agnew, the 39th Vice President of the United States under Richard Nixon. The second was Casey Stengel, a professional baseball player turned manager who led the Yankees to seven world championships.

Spiro Agnew
Image: Spiro Agnew.jpg, public domain, via Wikimedia Commons.

Rosie Stephenson-Goodknight, Visiting Scholar at Northeastern University, continued her run of creating great new articles about women writers, including physician Lucy M. Hall, who wrote about health topics for magazines and periodicals, and Julia C. R. Dorr, an American author of poetry and prose who published her work in literary publications as well as in many of her own books.

Eryk Salvaggio at Brown University created an article about the Canales investigation, a congressional hearing in 1919 concerning criminal conduct by the Texas Rangers. The article also appeared as a Did You Know:

[Did You Know] … that the Canales investigation heard 80 witnesses describing murders, kidnappings, and other crimes committed by the Texas Rangers in the early 20th century?

Ryan spent time this month working with Educational Partnerships Manager Jami Mathewson and Communications Associate Cassidy Villeneuve to develop communications materials and strategies for our 2018 Wikipedia Fellows pilot, and coordinating with current and prospective Visiting Scholars and sponsors at varying stages of the onboarding process.

Program Support

Communications

Communications Associate Cassidy Villeneuve

In October, Cassidy Villeneuve was hired as Communications Associate. Her role will support staff across departments and projects in general communication needs.

So far, that has included working with Ryan and Jami in preparing communications for Wiki Education’s upcoming pilot, Wikipedia Fellows. Cassidy worked with Samantha and Jami to develop an updated landing page for new instructors as well. Cassidy has also been working on updating text on our support site, ask.wikiedu.org, to more efficiently address issues instructors and students often encounter.

And in further efforts to improve our training resources, Cassidy worked with Wikipedia Content Expert Ian Ramjohn in developing a new training module for students to find appropriate Wikipedia articles that need improvement.

Blog posts:

Digital Infrastructure

In October, Product Manager Sage Ross focused on infrastructure improvements and a wave of upgrades to the Dashboard’s student training modules and course creation features.

Our websites — wikiedu.org, the Dashboard, ask.wikiedu.org, and several others — have now switched over to free “Let’s Encrypt” certificates, with improved HTTPS behavior to protect the data of all our visitors. Our key infrastructure has also been updated to the latest stable version of Debian GNU/Linux, which will help keep our servers secure. In the process of performing these upgrades, we also discovered and subsequently fixed a set of critical bugs related to the Dashboard’s automated edit features.

Refreshes of the most important student training modules went live in mid-October, and feedback from students so far has been very positive. Our growing collection of subject-specific Wikipedia editing handouts has also been integrated into the assignment wizard, letting instructors select which topics are relevant and include them automatically in the assignment timeline. Other recent improvements include more comprehensive monitoring of Wikipedia’s deletion processes, reminders for newly registered users to set an email on their Wikipedia account (so that they can reset their password if it gets lost), and more data transferring automatically from the Dashboard to Salesforce to reduce the amount of manual data entry required for new courses.

Data analysis

In October, Jami spent the month working with a pro-bono data scientist to do a deep dive into our documented data and find interesting trends about our day-to-day work. Here are a few of our findings:

  • Consistently since we started Wiki Education in Fall 2013, 44.5% of the new instructors joining the Classroom Program return to teach again within the academic year. This is exciting because we’ve spent those years increasing the number of students and instructors we support while keeping approximately the same number of staff members supporting the Classroom Program. One of our priorities is to make the Classroom Program more efficient, scaling up the impact to Wikipedia while decreasing the cost per student. Seeing that the same proportion of instructors have a good experience and choose to incorporate Wikipedia into another course shows our success in those attempts to scale.
  • We investigated the efficacy of various outreach forums we use to recruit new instructors into the Classroom Program, and found that conferences are the most reliable way to produce active courses. When those conferences are related to one of our educational partners, they’re even more effective. This confirms we should continue our current strategy of recruiting potential program participants at partners’ conferences. The academic associations’ support of our work is meaningful, and it’s effective to find instructors where they already are: academic conferences.
  • One of the most useful findings is that our limited timeframe (only going back 4 years) does not allow us to find many statistically significant trends. Importantly, we have now confirmed that while we’ll continue to track data we believe is important to the way we run our programs, we do not need to hire a full-time data analyst in the near future.

Finance & Administration/Fundraising

Finance & Administration

Expenses chart for Wiki Education Foundation, month of October 2017

For the month of October, expenses were $142,286 versus the approved budget of $180,322. The $38k variance can be attributed to staffing vacancies ($12k) and less-than-anticipated spending on professional services ($13k) and travel ($13k).

Expenses chart for Wiki Education Foundation, year to date as of October 2017

Our year-to-date expenses of $602,207 are also less than our budgeted expenses of $718,080 by $115,873. Areas where our spending was significantly under budget include staffing ($29k), professional services ($42k), travel ($27k), and general operating expenses ($17k).

Fundraising

In October, TJ Bliss, Director of Development and Strategy, continued to cultivate relationships with funders who we felt might be interested in Wiki Education because of our work on the Future of Facts, Guided Editing, and Sustaining Science initiatives. Several funders have expressed interest in funding Wiki Education as a means of improving the accuracy and scope of content on Wikipedia in key subject areas like science and public policy. No proposals were requested and no new grants were received during the month of October.

We submitted a final grant report for the Simple Annual Plan Grant we received from the Wikimedia Foundation. The grant, which covered some expenses from May 1, 2017 to September 30, 2017, enabled us to wrap up our student learning outcomes project, finish up the Classroom Program’s spring term, and create a plan of action for our Future of Facts campaign. We thank the Wikimedia Foundation, WMF staff, and the Simple APG committee for their support during the grant process.

Office of the ED

Current priorities:

  • Summarizing the results of the strategic planning meeting prior to the next in-person meeting in January
  • Re-aligning internal processes after restructuring finance & administration
  • Finalizing the audit and tax return for last fiscal year
Board and staff members at the 2-day strategic planning meeting in Half Moon Bay

In October, board and senior leadership met in Half Moon Bay for a two-day, in-person strategic planning kick-off meeting. In the weeks leading toward the meeting, a strategy taskforce consisting of three board and two staff members had collected comprehensive input from both the board and staff on (1) forces and trends in our external environment that will define our work in the upcoming years, (2) our organization’s own strengths and weaknesses, and (3) our fundamental beliefs and values. This analysis served as the starting point for two days of intense deliberations about Wiki Education’s future direction. In a joint effort and based on the analysis, board and senior leadership developed a shared understanding of where to take the organization over the next three years. The results of this in-person meeting will be documented by staff in a high-level summary. This summary will serve as the foundation for the final strategy document and is intended to be confirmed by the board in its January meeting.

* * *

Image: File:Walkway to West Building and Cascade Cafe.jpg, by Evelyn y, CC BY-SA 4.0, via Wikimedia Commons.

by Cassidy Villeneuve at December 06, 2017 05:34 PM

December 05, 2017

Erik Zachte

Wiki Loves Monuments 2017

In 2017, Wiki Loves Monuments (WLM) was once again one of the Wikimedia community initiatives that attracted the most attention.

Here are further statistics on that contest. The charts follow the layout of earlier years. The data have been aggregated from the WLM stats tool wikiloves (which itself does great reporting) into spreadsheets 1 and 2.

Participating countries in 2016 WLM contest

Map of countries participating in Wiki Loves Monuments 2017

 

 


Some charts are about image uploads.
One is about image uploaders, also known as contributors.


Countries

Participants-v3


Participants per year to Wiki Loves Monuments contest (click to zoom)

With 52 participating countries, 10 more than in 2016, the 2017 contest ranks a close second after 2013, when 53 countries participated (see the first table). Six countries participated for the first time: Australia, Croatia, the Dutch Caribbean (Aruba participated as a country earlier), Finland, Saudi Arabia, and Uganda.

In the eight years since WLM started, these countries have participated most often:
7x France, Germany, Norway, Russia, Spain, Sweden
6x Austria, Israel, Italy, Netherlands, Slovakia, Ukraine

The contest ran in different countries during different periods (mostly because different calendars are in use, and the aim is to run the contest for a full calendar month).

 


Uploads

In 2017, a total of 245,168 images were uploaded, 12% fewer than in 2016.

WLM_uploads_per_year_2010-2017-v2


In 2017 Ukraine contributed the most images: 37,592

WLM_uploads_by_country_2017-v2

WLM_uploads_by_country_cumulative_top20_2010_2017-v2

WLM_uploads_by_country_year_by_year_2010_2017-v2


Contributors

Whereas in 2016 India and the United States tied for first place with 1,784 uploaders each, this year the United States easily took first place with 1,418 contributors; India came second with 1,130.

WLM_contributors_by_country_2017-v2

WLM_uploaders_by_country_year_by_year_2010_2017_top_10_v2


The following chart is new this year. We know for each country how many people contributed images, but we don’t know how many of those people were foreign visitors (given the theme of the contest, probably most of those were tourists). This proportion may vary widely per country. That caveat is actually relevant for all of the charts presented here, but more so for this one, as demographics are an explicit part of the equation. It shows how in statistics, just as in photography, point of view and perspective can greatly influence the final picture.

WLM_contributors_per_million_population_2017
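As a toy illustration of the normalization behind this chart (the country names and figures below are made up, not actual WLM data):

```python
# Toy sketch of the per-capita normalization used in the chart above.
# These figures are hypothetical, NOT real contest data.
contributors = {"Atlantis": 120, "Erewhon": 1500}
population = {"Atlantis": 2_000_000, "Erewhon": 60_000_000}

# Contributors per million inhabitants: raw counts divided by population in millions.
per_million = {c: contributors[c] / (population[c] / 1_000_000) for c in contributors}
print(per_million)  # {'Atlantis': 60.0, 'Erewhon': 25.0}
```

Note how the smaller country ends up higher on a per-capita basis despite having far fewer contributors in absolute terms, which is exactly the point-of-view shift the chart demonstrates.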


Edit activity on Commons

Two Wikistats diagrams: every year the Wiki Loves Monuments contest brings peak activity on Commons. The second peak earlier in the year, visible mostly since 2014, is the result of the Wiki Loves Earth contest (which has somewhat different dynamics: WLE runs for a longer period than WLM, and in 2017 WLE had 52% more contributors than WLM but 47% fewer uploads). See also: this Wiki-loves yearly results page.

Charts also available on Wikimedia Commons


PlotUploadsCOMMONSupdated

Data for 2017 are not yet available for the last chart.

by Erik at December 05, 2017 06:10 PM

Wiki Education Foundation

Roundup: American Literature Since 1865

Before television, before film, even before radio, we had the written word. It was used not only to communicate but also to entertain and educate. It’s unsurprising that our love for literature has endured over time, and it looks like it will survive the digital era as well. (Physical paper copies of literature are struggling, but have still managed to maintain a presence.) This past spring, Shealeen Meaney’s class at Russell Sage College worked on expanding Wikipedia’s coverage of American Literature Since 1865. They looked not only at written literature but also at what led to the creation of some written works, as in the case of Uncle Remus.

Uncle Remus is a folktale character and the narrator of an 1881 collection of folktales compiled by Joel Chandler Harris, an American journalist, fiction writer, and folklorist. After its release the book was well received, and Mark Twain himself read it to his children. In recent years, the collection and the character of Uncle Remus have come to be seen as controversial, both because of the dialect Harris chose to represent plantation slave speech and because the Uncle Remus character is seen as an example of the racism and condescension that African Americans faced during the period. Harris himself hoped that the work would stand as a companion to Harriet Beecher Stowe’s Uncle Tom’s Cabin, which he saw as a defense of slavery despite Stowe clearly writing it as an anti-slavery novel.

Students also expanded the article on Zora Neale Hurston’s short story “Sweat“, which deals with themes of domestic abuse, feminism, and survival. The work was published in 1926, while Hurston was an anthropology student at Barnard College of Columbia University, and centers on Delia, a washerwoman who is subjected to constant emotional and physical abuse from her husband, Sykes. During the course of the story Sykes tries to murder Delia via a poisonous snake he brings into the house, only for Sykes to die by snake bite instead. The class also expanded articles on two other short stories, “Babylon Revisited” by F. Scott Fitzgerald and “A White Heron” by Sarah Orne Jewett. These two stories deal with very different environments, as the Fitzgerald story is set in Paris after the stock market crash of 1929, while Jewett’s story takes place in the Maine woods in the late 1800s.

Students and educators have a wealth of knowledge that’s surpassed only by their passion to learn and teach, two things that are incredibly well suited to the task of Wikipedia editing as an educational assignment. If you’re interested in taking part, please contact Wiki Education at contact@wikiedu.org to find out how you can gain access to tools, online trainings, and printed materials.

Image: File:Old Plantation Play Song, 1881.jpg, by Frederick S. Church and James H. Moser, public domain, via Wikimedia Commons.

by Shalor Toncray at December 05, 2017 05:33 PM

Wikimedia Foundation

Why Tímea Baksa writes about Korean culture on Wikipedia

Photo by Tímea Baksa, CC BY-SA 4.0.

Tímea Baksa has been editing the Hungarian Wikipedia since 2005, picking up languages, cultures, and new friends along every step of the way. Over 135,000 edits later, she has written 56 featured articles on the encyclopedia (English equivalent), which undergo a peer review process to prove that they are the best Wikipedia has to offer. That total means that Baksa has had a hand in 6% of the entire site’s featured articles.

Most of those articles focus on South Korea and its distinctive culture, ranging widely from history to economy, geography, transportation, literature, K-pop, and various TV series. Naturally, we wondered how she got into this hobby.

———

What caused you to start editing Wikipedia?

Like many editors, I wanted to check an article about Tarkan, my then-favourite singer, and the article was subpar in quality. I decided that he deserved a good article, and that started me on my Wikipedia journey. I made a lot of mistakes as a beginner, of course, but it became my first featured article later.

What got you interested in Korean culture and East Asia more generally?

I can actually thank Wikipedia for my “Asian fever”. It all started with watching a Jet Li movie, after which I decided to take a look at the Hungarian article of the actor. Again, just like with my first edit, I found it to be in a miserable state. I decided to improve it, and so I went through Jet Li’s full filmography—watching all of his movies while gathering data for the article—and at the end of Fearless, a Jay Chou song was playing. It was so good, I immediately looked him up; in the process, I fell in love with his music and the Chinese language. And obviously I had to write his article too. (Naturally, both of the Hungarian-language articles on Jet Li and Jay Chou are now featured.) After that, like a snowball rolling down a hill, I ended up learning Korean and Japanese. Korean culture really struck close to my heart and I ended up researching a lot, buying history books and reading journals, which always results in sharing my new-found knowledge in the form of articles.

What do you feel like you get out of editing Wikipedia? What keeps you coming back, day after day?

I get to know about a lot of things, because Wikipedia naturally makes you click on internal links. When I write an article, I usually write auxiliary articles as well (filling in “red links” so the topic is clearer to the reader). This sometimes drags me into totally different topics. For example, when I was writing about the history of the Turkish war of independence (en), some essential basic notions were red links, like “capitulation”. Similarly, I noticed that the article on “seafood” was completely missing when I was working on the article on Korean cuisine. I love that it is like a puzzle. The more articles I write, the more I learn. It’s addictive.

What kind of benefits do you get out of the Hungarian Wikipedia’s featured article process?

The process is not flawless and can garner criticism. Sometimes I think that because I worked a lot on an article and found enough resources to make it featured, it ought to be among the best immediately. Still, it’s good to see my articles go through peer review. Frankly, when you write something, there will always be errors—and it’s difficult to see your own mistakes, even if you re-read your text a hundred times (which I do).

How do you choose a topic to write about?

Usually they come totally randomly, or because I am into a topic off Wikipedia. That’s almost always the case with music and film related articles—if I really start to like an artist, chances are you’ll see that artist’s article on the Hungarian Wikipedia become featured pretty quickly. Other times I will systematically work on finishing a certain group of articles. For example, right now I’m into Seoul’s transportation system—which includes over nine hundred train stations on the Seoul metro that have to be written and organised. When I get tired of a topic, I temporarily switch to something completely different, so I don’t get bored. I also watch out for missing articles that are advertised on top of the personal watchlist page of editors in the Hungarian Wikipedia; sometimes it’s fun to dive into something totally unrelated.

What is your favorite article you’ve written?

This is a tough question—I have many favourites for different reasons. I am really proud of the Turkish war of independence article, because it took me quite some time to gather all that information, and it’s a one-of-a-kind article on the Hungarian-language internet. You cannot find this topic in such detail anywhere else in Hungarian.

I’m also proud of my article on hangul (en), the Korean alphabet, as it was reviewed and approved by two Hungarian scholars of Korean studies. I think this is important because it shows that Wikipedia articles can please scholars as well. I do everything I can to be as accurate as possible, unbiased, and neutral, even when I write about my favourite things.

My top favourite articles are my “sweethearts”, like Japanese heavy metal legends X Japan (en) or Secret Garden, my favorite Korean TV drama. I take writing about pop culture as seriously as writing about history or economy, and I do not believe that entertainment articles are worth less than “scientific” ones. People want information on these, too. And we should be a reliable source in every topic that interests our readers.

Where will you go from here? What kind of articles will you write next?

It will depend on my mood, I guess. Korea is still on the agenda; I have a bunch of red links to fill. This is sadly not the most popular topic to write about, so it’s basically me and one other Korea-loving editor trying to fill in the gaps on Korean culture on the Hungarian Wikipedia. Japan came back into my field of vision recently, and there the biggest problem is that articles are inconsistent and mostly poorly written. In addition, many do not follow our transcription rules (we transcribe East Asian languages into a Hungarian system rather than using the English-oriented transcription). I will have to work on that, too. I also need to get back to working with photos: my “side job” on my home wiki is image patrolling. I started cleaning up our image database of copyright violations, and it’s a huge amount of work, ongoing for years now.

Anything you’d like to add?

Writing for Wikipedia has changed my life in so many ways. I got to learn two new languages, was introduced to new cultures, and met wonderful people from around the world thanks to this encyclopedia. I would like to encourage people who hesitate to push the edit button not to be afraid of starting this journey. Yes, you will encounter mishaps; perhaps some editors will not be very polite and welcoming. But you can shut all that out. If you seriously love knowledge, enjoy sharing what you know and what you learn along the way, and count learning among your beloved pastimes, Wikipedia is the perfect place for a hobby that will also benefit others.

Interview by Ed Erhart, Senior Editorial Associate, Communications
Wikimedia Foundation

“Why I …” is an ongoing series at the Wikimedia Blog. We want to hear what motivates you to contribute to Wikimedia sites: send us an email at digitalmedia [at] wikimedia [dot] org if you know of someone who wants to share their story about what gets them to write articles, take photographs, proofread transcriptions, or beyond.

by Ed Erhart at December 05, 2017 04:00 PM

Noella - My Outreachy 17

Community bonding

Now, Outreachy has started and it's time to get to work. 

I am trying to get to know my mentors well and to set up for development. Even though I have some personal problems and instability in my life right now, I am trying to push on with the work😥😥. I am eager for the coding period to come so I can start work fully😛. Hopefully by then, I will be more stable.

I created a Phabricator task requesting a new project, “MassMessage-testing”, on Cloud VPS for testing; it was granted and resolved by Brian Davis. The project will be used by my mentors and me for testing changes during the Outreachy period. All I need to do now is set up MediaWiki on it😌😌.

by Noella Teke (noreply@blogger.com) at December 05, 2017 02:44 PM

December 04, 2017

Wikimedia Tech Blog

The world’s most popular audio file format arrives at Wikimedia

We’re ‘launching’ support for MP3. Photo by the National Museum of the U.S. Navy, public domain.

Until this month, no Wikimedia site supported the world’s most popular audio file format, MP3, because the technology for encoding and decoding these files was encumbered by restrictive patents.

With the expiry of these patents, however, we are now supporting MP3 uploads for trusted users on Wikimedia Commons—a free media repository that hosts the majority of the images, videos, and audio recordings for Wikipedia and other Wikimedia projects.

———

The Wikimedia community cares deeply about copyright and free knowledge; indeed, Wikipedia was founded as a free culture project and has relied on free and open source software from its earliest days.

The reasons for Wikimedia’s stance on free software are both ideological and practical. Ideologically, we support keeping the internet and the software that runs it as open, transparent, and freely available as possible. Instead of walled gardens controlled by a small handful of large companies, we want to see the internet continue to flourish as a public resource that anyone can participate in, with as few barriers to entry (and software licensing fees) as possible. Practically, we believe that free and open source software offers significant advantages in better security and privacy for our users.

The problem with MP3s was their restrictive patents. Because MP3 was covered by 20 different patents issued over a span of many years, the format did not become patent-free until earlier this year, despite the fact that the first MP3 specification was published back in 1993. Our commitment to free (as in freely licensed, not just free of charge) and open source architecture meant that we could not support MP3 until its patents expired. We instead added support for other free file types, like Ogg Vorbis, WAV, FLAC, and MIDI.

The question of MP3 on Wikimedia projects goes back at least as far as 2004, when Jimmy Wales, the founder of Wikipedia, sent an email to the wikitech-l mailing list:

The policy is that we must avoid file formats that can not be used by legal free software. Yes, we can make special allowances in some special cases.  Yes, we have to make some judgments about the patent status of some popular formats. … But that’s what our policy has to be (i.e. can the format be used by legal free software?), not ‘any open or common file format’.

Thirteen years later, we’re happy to say that MP3’s patents have expired. In preparation for supporting the format, our tech teams held several discussions with the volunteer Commons editor community. Both sides agreed that the potential for misuse was high. Wikimedia sites generally only accept public domain or freely licensed files. Nearly all popular music today, though, is still copyrighted.

With this in mind, we came to a consensus that MP3 support will initially be limited to a small roll-out on Wikimedia Commons alone. Within that roll-out, only a small subset of user groups (administrators, extended uploaders, and image reviewers) are allowed to upload MP3 files. We hope that this beta period will allow us to monitor the effects of the roll-out on Commons and adapt our strategy accordingly before enabling MP3 uploading on other projects as well.
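As a toy sketch of the gating just described (this is not MediaWiki’s actual code, and the group names here are illustrative placeholders), the check boils down to combining the file’s extension with the uploader’s group memberships:

```python
# Toy sketch of the MP3 upload gating described above. NOT actual
# MediaWiki code; the group names are hypothetical placeholders.
ALLOWED_MP3_GROUPS = {"sysop", "extended-uploader", "image-reviewer"}

def may_upload(filename: str, user_groups: set) -> bool:
    """Allow MP3 uploads only for trusted groups; other formats pass through."""
    if filename.lower().endswith(".mp3"):
        # True only if the user belongs to at least one trusted group.
        return bool(ALLOWED_MP3_GROUPS & user_groups)
    return True

# may_upload("song.mp3", {"user"})   -> False  (ordinary user, MP3 blocked)
# may_upload("song.mp3", {"sysop"})  -> True   (trusted group, MP3 allowed)
# may_upload("photo.jpg", {"user"})  -> True   (non-MP3 formats unaffected)
```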

Wikimedia Commons has one other benefit as well: as a common file repository for the entire Wikimedia ecosystem, audio files uploaded there can be used on all other Wikimedia projects.

We hope that our support of MP3 will make it easier for our users to contribute audio content to the projects—whether it be word pronunciations for Wiktionary, audiobooks for Wikisource, or classical music recordings for Wikipedia.

Here are some of the first MP3 files that have been uploaded to Commons so far. Enjoy!

Ryan Kaldari, Senior Engineering Manager, Community Tech
Wikimedia Foundation

by Ryan Kaldari at December 04, 2017 09:25 PM

Addshore

Wikidata Map November 2017

It has only been 4 months since my last Wikidata map update post, but the difference on the map in these 4 months is much greater than the diff shown in my last post covering 9 months. The whole map is covered with pink (additions to the map). The main areas include Norway, Germany, Malaysia, South Korea, Vietnam and New Zealand to name just a few.

As with previous posts varying sizes of the images generated can be found on Wikimedia Commons along with the diff image:

July to November in numbers

In the last 4 months (roughly speaking):

  • ~9.8 million new items have been created, an increase of roughly 30%
  • ~400 new properties to describe those items
  • ~1.4 million new links to wikimedia project articles (sitelinks)
  • ~150 million new statements on items, an increase of 78%, from ~191 million
    • Of these statements only ~7 million lack references, leaving ~143 million of the new statements with references.
  • This brings the total number of references from ~265 million to ~643 million, with only ~39 million of those referencing back to Wikipedia
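A quick back-of-the-envelope check shows that these rounded figures are mutually consistent:

```python
# Sanity check of the rough Wikidata growth figures quoted above
# (all values in millions, read off the graphs by eye).
old_statements = 191        # statements before the 4-month period
new_statements = 150        # statements added
growth = new_statements / old_statements  # ~0.785, matching the quoted ~78%
referenced_new = new_statements - 7       # ~143 M new statements carry references
old_refs, new_refs = 265, 643
added_refs = new_refs - old_refs          # ~378 M new references in total
# added_refs exceeds referenced_new because a single statement
# can cite several references.
print(referenced_new, added_refs)  # 143 378
```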

All of these numbers were roughly pulled out of graphs by eye. The graphs can be seen below:

by addshore at December 04, 2017 07:10 PM

Wikimedia Foundation

Things my professor never told me about Wikipedia

Photo by Beko, CC BY-SA 4.0.

Have you ever had to defend Wikipedia to a professor (or a parent or friend)? This collection of published studies and reports may help make the case for a more enlightened (but still critical) use of the world’s most popular reference website. Wikipedia is a uniquely comprehensive, freely accessible, regularly reliable, well-referenced starting point for further research — so use it wisely.

Away from stigma

The first step is admitting that everyone, from students to doctors, uses Wikipedia. We need to change the conversation from one of abstinence to intelligent information consumption.

“Using Wikipedia to Teach Information Literacy”, College and Undergraduate Libraries

“Wikipedia is increasingly becoming the go-to reference resource for the newest generation of students…Librarians and faculty should help remove the stigma associated with Wikipedia by embracing this Website and its imperfections as a way to make information literacy instruction valuable for the twenty-first-century student.”

“Wikipedia and the University, a case study”, Teaching in Higher Education

“Our conclusion is that whilst Wikipedia is now unofficially integrated into universities, it is not ‘the’ information resource as feared by many and that an enlightened minority of academics have attempted to assimilate it into their teaching.”

“Faculty Perception of Wikipedia in the California State University System”, New Library World

“Overall, faculty perceptions of Wikipedia have shifted in Wikipedia’s favor and that some faculty members create interesting and unique assignments that involve Wikipedia or Wikipedia-like work.”

“In Defense of Wikipedia”, 100 Law Library Journal

“Research instructors should teach students to use Wikipedia properly, rather than trying to convince them not to use it…Wikipedia can be used to help teach the importance of evaluating sources.”

“How and why do college students use Wikipedia?”, Journal of the American Society for Information Science and Technology

“This study supports the knowledge value of Wikipedia, despite students’ cautious attitudes toward Wikipedia. The study suggests that educators and librarians need to provide better guidelines for using Wikipedia, rather than prohibiting Wikipedia use altogether.”

“Employing Wikipedia for good not evil: innovative approaches to collaborative writing assessment”, Assessment & Evaluation in Higher Education

“Unwarranted stigma is attached to the use of Wikipedia in higher education due to fears that students will not pursue rigorous research practices because of the easy access to information that Wikipedia facilitates.”

Towards integration

Wikipedia is more than just a place to consume: it’s a forum for practicing vital skills of information literacy and digital citizenship. Wikipedia and its volunteers practice rigorous research, and learning how to edit can improve those skills.

“Wikipedia: When College Students Have an Audience, Does Their Writing Improve?”, EdTech Magazine

“When asked to contribute to a wiki — a space that’s highly public and where the audience can respond by deleting or changing your words — college students snapped to attention, carefully checking sources and including more of them to back up their work… their audience — the Wikipedia community — was quite gimlet-eyed and critical…by teaching them to use Wikipedia, they became much better users of the tool. Instead of blindly consuming the content, they understand where the research comes from and how it gets there. In the past, we’ve told them not to use Wikipedia. That’s insane. Rather than saying, ‘It doesn’t have a place in the academy,’ let’s explain to students how it can be used as a tertiary resource. It’s not the end-all and be-all of research, but it’s incredibly useful.”

“The new information literate: Open collaboration and information production in schools”, International Journal of Computer-Supported Collaborative Learning

“Writing for a non-teacher audience is motivating.. creating a public information resource is associated with a sense of responsibility that promotes critical engagement with information… This sense of responsibility provides an ideal context for practicing information literacy skills like identifying information needs, searching for, and assessing information sources…producing information for others in online environments can give young people a starting point for reflecting on where information comes from; such experiences…require students to reflect on the nature of information production…If we want to develop a more local, shared sense of responsibility, continuing efforts to incorporate public information production in classrooms should include opportunities for students to support and challenge one another in justifying and critiquing claims, as is done by co-authors on Wikipedia.”

“Wikipedia: The ‘Intellectual Makerspace’ of Libraries”, Programming Librarian

“We need to see Wikipedia as a makerspace in its own right…Instead of creating a physical item, we have an intellectual makerspace. We need to encourage the activity by teaching editors the value of some information sources over others, how to write for an encyclopedia and how to deal with conflict in virtual environments.”

To improve society

Wikipedia’s mission is to share the sum of all human knowledge. Sometimes this resolves trivia and bar bets, and sometimes it results in new inventions or medical cures. The potential is unlimited.

“Amplifying the Impact of Open Access: Wikipedia and the Diffusion of Science”, arXiv.org

“In most of the world’s Wikipedias, a journal’s high status (impact factor) and accessibility (open access policy) both greatly increase the probability of referencing…the chief effect of open access policies may be to significantly amplify the diffusion of science, through an intermediary like Wikipedia, to a broad public audience.”

“Science is Shaped by Wikipedia: Evidence from a Randomized Control Trial”, SSRN

“As the largest encyclopedia in the world, it is not surprising that Wikipedia reflects the state of scientific knowledge. However, Wikipedia is also one of the most accessed websites in the world, including by scientists, which suggests that it also has the potential to shape science. Incorporating ideas into a Wikipedia article leads to those ideas being used more in the scientific literature. We find that fully a third of the correlational relationship is causal, implying that Wikipedia has a strong effect in shaping science.”

For a full review of the new role Wikipedia can play in education, see our Guide for Research Libraries. For a digestible overview of how to use Wikipedia for regular research, check out the Research Help Guide.

Jake Orlowitz, Wikimedian

This post was originally published on Medium, and its text is available under the CC BY-SA 4.0 license. While Jake works with us here at the Wikimedia Foundation, this was written in an entirely volunteer capacity. The views and opinions expressed are those of the author alone.

by Jake Orlowitz at December 04, 2017 06:36 PM

iOS feature helps you read Wikipedia in the dark

Image by Rafael Fernandez, modified by Carolyn Li-Madeo/Wikimedia Foundation, CC BY-SA 4.0.

Over the past year, people using the Wikipedia iOS app have repeatedly told us that they wanted a night-time reading mode so they could read Wikipedia in the dark.

The story often went like this: “I love to explore corners of Wikipedia, but the screen is too bright and my nightly read-a-thons are keeping my partner awake.”

Beyond the requests—the many, many requests—building the nighttime reading mode feature aligned with our recent efforts to make the iOS app more accessible for users with visual impairments. Last year, we added VoiceOver support, which allows users to navigate Wikipedia by voice, as well as Dynamic Type support, which allows users to control the text size across the app. Adding a nighttime mode would also help users with color sensitivities (such as to blue light) read more comfortably.

Here’s what we did.

Designing “night-time reading mode.”

Over the past year, designers across the Wikimedia Foundation have been working together to develop the Wikimedia style guide, an ongoing and collaborative project which aims to bring consistency across the Foundation’s many projects.

When the iOS team began thinking about what colors would go into the “nighttime reading mode” experience and creating new color palettes that support comfortable reading in a variety of light settings, we started by building on the existing colors in the WMF color palette. The colors in our “dark reading mode palette” were developed collaboratively by the Reading team’s designers and are extensions of the colors in the Wikimedia Foundation’s style guide. WMF colors became the anchors for the secondary palettes, with an emphasis on WCAG conformance and accessibility.
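WCAG conformance for a color palette can be checked programmatically. The snippet below implements the WCAG 2.0 relative-luminance and contrast-ratio formulas; it is our own illustration, not the team’s actual tooling:

```python
def _channel(c: int) -> float:
    """Linearize an 8-bit sRGB channel per the WCAG 2.0 formula."""
    c = c / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    """Relative luminance of an (r, g, b) color, 0.0 (black) to 1.0 (white)."""
    r, g, b = (_channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio, from 1:1 (identical colors) to 21:1 (black on white)."""
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background yields the maximum ratio of 21:1;
# WCAG AA requires at least 4.5:1 for normal body text.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A palette tool built on this would simply reject any text/background pairing that falls below the 4.5:1 (AA) or 7:1 (AAA) thresholds.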

Engineering and design working together

The iOS engineering team works closely with the design team throughout all stages of feature design. By defining the problem together—and coming up with a shared language—we’re able to clearly communicate technical restrictions to the design team, which helps inform a technical solution that’s potentially easier to implement. Hearing design’s approach also helps shape our early technical decision-making and sets us up for success.

In using this process for our “dark mode” reading themes, we were able to distill every color used in the app into a set of named colors. This was our shared language for discussing changes, which helped with rapid iteration by making it easier to clearly identify what color or set of colors was incorrect in a given screenshot. It also helped pave the way for implementing another highly requested feature—Sepia mode. With the palette already defined, adding another theme was an easy win.

What we heard and what’s next

The feedback from users has been tremendous. Users gave this version our highest star rating ever. MacRumors and other Apple tech sites ran positive reviews about the update and Apple featured it in their “Best App Updates” collection. Dark theme is especially popular, with two-thirds of users choosing that theme.

And as we continue to improve the app, we’ll be making sure it stays accessible.

If you want to help the team continue making a great app together, we always need beta testers. You can sign up here. You can also help write code or join the mobile-l mailing list for updates on new versions and features.

Josh Minor, Senior Product Manager, Reading Product
Joe Walsh, Software Engineer, Mobile Apps
Carolyn Li-Madeo, User Experience Designer, Audiences Design
Wikimedia Foundation

by Josh Minor, Joe Walsh and Carolyn Li-Madeo at December 04, 2017 05:46 PM

Net neutrality is essential for access to knowledge

Photo by Victor Grigas/Wikimedia Foundation, CC BY-SA 3.0.

Last week, the Federal Communications Commission (FCC) published a proposal to deregulate Internet Service Providers (ISPs) and to eliminate the U.S. net neutrality rules. After the agency announced its vision for the internet earlier this year, we submitted a letter to the Commission, urging it to keep the Open Internet rules in place. Now that the drastic extent of this sweeping deregulation has been communicated, we want to explain why net neutrality matters for Wikipedia and for free knowledge in general.

Net neutrality is essential to the preservation of an open internet for the United States and around the world. The FCC’s net neutrality rules protect internet users’ access to the internet’s wealth of information and their ability to collaborate via the internet by specifically prohibiting ISPs from discriminating among websites or applications by blocking or slowing some, or prioritizing traffic from others in exchange for a fee. They help ensure that everyone with an internet connection can connect to the services, applications, and content of their choice and that ISPs do not abuse their market power or local monopoly position to restrict the free and open flow of information. This openness is an important prerequisite for the Wikimedia Foundation to further its mission to “empower and engage people to collect and develop educational content and to share it globally”—Wikipedia and the other Wikimedia projects are built by thousands of volunteers who collaborate to edit articles in real time and contribute and curate images and data. To do so, they need to be able to freely and equally connect to the Wikimedia websites. In addition, they need to be able to read original and verifiable sources online with information about the topic to be included in Wikipedia.

We believe that everyone should be able to freely access knowledge, and that this should not be a question of affluence, either in the United States or in the rest of the world. To ensure that all internet users can participate in knowledge and access Wikipedia as well as other sources of information, we support rules that keep the internet open. Net neutrality is essential to prevent a new digital divide from opening between those internet users who can afford to pay to access all information online and those who cannot. The FCC’s current net neutrality rules help make sure that every internet user in the United States can access any information found online. If these protections are rolled back and ISPs are allowed to block or throttle traffic from websites and applications, not all subscribers may be able to afford access to the content they have today.

ISPs should not decide what their subscribers see online. The Indian telecoms regulator TRAI recognizes this in the Recommendations on Net Neutrality, which it released this week. Net neutrality creates a level playing field for diverse content providers and a plurality of voices online. New content providers, especially non-profit projects and maybe even the next Wikipedia, depend on rules against blocking, throttling, and paid prioritization to be able to offer their services to a wide audience via the internet. A roll-back of the current net neutrality rules would negatively affect many people’s ability to find the information they seek online and would instead create a new digital divide. It would also set a bad example for regulators across the world at a time when people from developing regions are coming online at impressive rates. We believe that everyone should be able to participate in knowledge, on Wikipedia and elsewhere on the internet. Strong protections for net neutrality are an essential part of this vision and we urge the FCC to keep the existing framework in place.

Jan Gerlach
Public Policy Manager, Legal
Wikimedia Foundation

You can read our letter to the FCC, join policy conversations on our public policy mailing list, and follow our work at @wikimediapolicy.

You can also read this blog post on Medium.

by Jan Gerlach at December 04, 2017 04:52 PM

Tech News

Tech News issue #49, 2017 (December 4, 2017)

2017, week 49 (Monday 04 December 2017)

December 04, 2017 12:00 AM

December 03, 2017

Addshore

Wikidata Map May 2016 (Belarus & Uganda)

I originally posted about the Wikidata maps back in early 2015 and have followed up with a few posts since, looking at interesting developments. This is another one of those posts, covering the changes between the last post (late 2015) and now, May 2016.

The new maps look very similar to the naked eye and the new ‘big’ map can be seen below.

So while at the 2016 Wikimedia Hackathon in Jerusalem, I teamed up with @valhallasw to generate some diffs of these maps, in a slightly more programmatic way than my posts following the 2015 Wikimania!

In the image below all pixels that are red represent Wikidata items with coordinate locations and pixels that are yellow represent items added between October 27, 2015 and April 2, 2016 with coordinate locations. Click the image to see it full size.

The area in eastern Europe with many new items is Belarus and the area in eastern Africa is Uganda. Some other smaller clusters of yellow pixels can also be seen in the image.
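The red/yellow colouring described above boils down to a simple per-pixel comparison of the two rendered maps. Below is a minimal sketch in pure Python, assuming the two maps are available as 2D grids of booleans (True = an item with a coordinate location renders at that pixel); the function and variable names are mine, not the actual tool's:

```python
# Colour constants: red = item present in the older render,
# yellow = pixel lit only in the newer render, black = empty background.
RED, YELLOW, BLACK = (255, 0, 0), (255, 255, 0), (0, 0, 0)

def diff_maps(old_pixels, new_pixels):
    """Return an RGB grid marking old items red and newly added items yellow.

    old_pixels / new_pixels: equally sized 2D grids of booleans,
    True where an item with a coordinate location renders.
    """
    out = []
    for old_row, new_row in zip(old_pixels, new_pixels):
        row = []
        for old, new in zip(old_row, new_row):
            if old:
                row.append(RED)       # already present in the older dump
            elif new:
                row.append(YELLOW)    # added between the two dumps
            else:
                row.append(BLACK)     # no item at this pixel
        out.append(row)
    return out

# Tiny demo grid: one pre-existing item, two new ones.
old = [[True, False], [False, False]]
new = [[True, True], [False, True]]
print(diff_maps(old, new))
```

A real run would read the pixel grids out of the two PNG renders with an imaging library before diffing, but the comparison logic is the same.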

All of the generated images from April 2016 can be found on Wikimedia Commons at the links below:

by addshore at December 03, 2017 05:25 PM

Wikidata Map October 2016

It has been another 5 months since my last post about the Wikidata maps, and again some areas of the world have lit up. Since my last post, at least 9 noticeable areas have appeared with many new items containing coordinate locations. These include Afghanistan, Angola, Bosnia & Herzegovina, Burundi, Lebanon, Lithuania, Macedonia, South Sudan and Syria.

The difference map below was generated using Resemble.js. The pink areas show areas of difference between the two maps from April and October 2016.

Who caused the additions?

To work out what items exist in the areas with a large amount of change, the Wikidata query service can be used. I adapted a simple SPARQL query to show the items within a radius of the centre of each area of increase. For example, Afghanistan used the following query:

#defaultView:Map
SELECT ?place ?placeLabel ?location ?instanceLabel
WHERE
{
  wd:Q889 wdt:P625 ?loc .
  SERVICE wikibase:around {
      ?place wdt:P625 ?location .
      bd:serviceParam wikibase:center ?loc .
      bd:serviceParam wikibase:radius "100" .
  }
  OPTIONAL { ?place wdt:P31 ?instance }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
  BIND(geof:distance(?loc, ?location) as ?dist)
} ORDER BY ?dist


The query can be seen running here and above. The items can then be clicked on directly and their history loaded.

The individual edits that added the coordinates can easily be spotted.

Of course this could also be done using a script following roughly the same process.
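For instance, a script could template the same wikibase:around query for any country item and radius before sending it to the query service. A sketch of that templating step follows; `build_around_query` is my own name for illustration, not part of any tool used in this post, and the query text simply mirrors the one above:

```python
def build_around_query(qid, radius_km=100):
    """Build a wikibase:around SPARQL query centred on the given item's
    coordinate location (P625), matching the Afghanistan example above."""
    return f"""#defaultView:Map
SELECT ?place ?placeLabel ?location ?instanceLabel
WHERE
{{
  wd:{qid} wdt:P625 ?loc .
  SERVICE wikibase:around {{
      ?place wdt:P625 ?location .
      bd:serviceParam wikibase:center ?loc .
      bd:serviceParam wikibase:radius "{radius_km}" .
  }}
  OPTIONAL {{ ?place wdt:P31 ?instance }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
  BIND(geof:distance(?loc, ?location) as ?dist)
}} ORDER BY ?dist"""

# Afghanistan (Q889), as in the query above:
query = build_around_query("Q889")
print("wd:Q889" in query)
```

The resulting string could then be submitted to the Wikidata Query Service SPARQL endpoint (https://query.wikidata.org/sparql) with any HTTP client, and the item histories inspected from the results.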

It looks like many of the areas of mass increase can be attributed to Reinheitsgebot (Magnus Manske), due to a bot run in April 2016, and many of the coordinates in Lithuania to KrBot, due to a bot run in May 2016.

October 2016 maps

The October 2016 maps can be found on commons:

Labs project

I have given the ‘Wikidata Analysis’ tool a speedy reboot over the past weeks and generated many maps for many old dumps that are not currently on Wikimedia Commons.

The tool now contains a collection of date-stamped directories which contain the data generated by the Java dump-scanning tool, as well as the images that are then generated from that data using a Python script.

by addshore at December 03, 2017 05:24 PM

Wikidata Map Animations

Back in 2013 maps were generated almost daily to track the immediate usage of the then new coordinate location within the project. An animation was then created by Denny & Lydia showing the amazing growth which can be seen on commons here. Recently we found the original images used to make this animation starting in June 2013 and extending to September 2013, and to celebrate the fourth birthday of Wikidata we decided to make a few new animations.

The above animation contains images from 2013 (June to September) and then 2014 onwards.

The gap between September 2013 and 2014 could be what resulted in the visible jump in brightness of the gif. This jump could also be explained by different render settings used to create the maps; at some point we should go back and generate standardized images for every week / month that coordinates have existed on Wikidata.

The whole gif and the individual halves can all be found on commons under CC0:

The animations were generated directly from the png files using the following ImageMagick command, where -delay 10 shows each frame for 10/100ths of a second and -loop 0 makes the gif loop forever:

convert -delay 10 -loop 0 *.png output.gif
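The same invocation is easy to script when regenerating the animations; here is a small sketch that builds the argument list for a directory of frames (the function name is mine, and the subprocess call is left commented out since it requires ImageMagick to be installed):

```python
from pathlib import Path

def gif_command(frame_dir, out="output.gif"):
    """Build the ImageMagick argv used above: 0.1 s per frame, infinite loop,
    frames taken in sorted (chronological filename) order."""
    frames = sorted(str(p) for p in Path(frame_dir).glob("*.png"))
    return ["convert", "-delay", "10", "-loop", "0", *frames, out]

cmd = gif_command(".")
# import subprocess
# subprocess.run(cmd, check=True)  # uncomment where ImageMagick is available
print(cmd[:5])
```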

These animations use the “small” images generated in previous posts such as Wikidata Map October 2016.

by addshore at December 03, 2017 05:24 PM

Wikidata Map July 2017

It’s been 9 months since my last Wikidata map update, and once again we have many new noticeable areas appearing, including Norway, South Africa, Peru and New Zealand to name but a few. As with the last map generation post, I once again created a diff image so that the areas of change are easily identifiable, comparing the data from July 2017 with that from my last post in October 2016.

The various sizes of the generated maps can be found on Wikimedia Commons:

Reasons for increases

If you want to have a shot at figuring out the cause of the increases in specific areas then take a look at my method described in the last post using the Wikidata Query Service.

People’s discoveries so far:

  • Sweden & Denmark – Most probably Wiki Loves Monuments imports of ancient monuments.
  • Yemen – it looks like it’s the result of a stub-creation bot on the Cebuano Wikipedia, adding geographical places with coordinates, resulting in another bot feeding them to Wikidata.

I haven’t included the names of those that discovered reasons for areas of increase above, but if you find your discovery here and want credit just ask!

by addshore at December 03, 2017 05:24 PM

Wikibase docker images

This is a belated post about the Wikibase docker images that I recently created for the Wikidata 5th birthday. You can find the various images on docker hub and matching Dockerfiles on github. These images combined allow you to quickly create docker containers for Wikibase backed by MySQL and with a SPARQL query service running alongside updating live from the Wikibase install.

A setup was demoed at the first WikidataCon event in Berlin on the 29th of October 2017; the demo can be seen at roughly 41:10 in the video below.

The images

The ‘wikibase’ image is based on the new official mediawiki image hosted on the Docker store. The only current version, which is also the version demoed, is for MediaWiki 1.29. This image contains MediaWiki running on PHP 7.1, served by Apache. Right now the image does some sneaky auto-installation of the MediaWiki database tables, which may disappear in the future to make the image more generic.

The ‘wdqs’ image is based on the official openjdk image hosted on the Docker store. This image also has only one version: the current latest version of the Wikidata Query Service, which is downloaded from Maven. This image can be used to run the Blazegraph service, as well as an updater that reads from the recent changes feed of a Wikibase install and adds the new data to Blazegraph.

The ‘wdqs-frontend’ image hosts the pretty UI for the query service, served by nginx. This includes autocompletion and pretty visualizations. There is currently an issue which means the image will always serve the examples for Wikidata, which will likely not work on your custom install.

The ‘wdqs-proxy’ image hosts an nginx proxy that restricts external access to the wdqs service, making it read-only and adding a (not currently configurable) time limit for queries. This is very important: if the wdqs image is exposed directly to the world, people can also write to your Blazegraph store.

You’ll also need a MySQL server for Wikibase to use; the default mysql or mariadb images work for this, as covered in the example below.

All of the wdqs images should probably be renamed, as they are not specific to Wikidata (which is where the ‘wd’ comes from), but right now the underlying repos and packages use the wd prefix rather than a wb prefix (for Wikibase), so we will stick with them.

Compose example

The example below configures volumes for all locations with data that should or could persist. Wikibase is exposed on port 8181, the query service UI on 8282, and the query service itself (behind the proxy) on 8989.

Each service has a network alias defined (which probably isn’t needed in most setups), but while running on WMCS it was required to work around some bad name resolution.

version: '3'

services:
  wikibase:
    image: wikibase/wikibase
    restart: always
    links:
      - mysql
    ports:
      - "8181:80"
    volumes:
      - mediawiki-images-data:/var/www/html/images
    depends_on:
      - mysql
    networks:
      default:
        aliases:
          - wikibase.svc
  mysql:
    image: mariadb
    restart: always
    volumes:
      - mediawiki-mysql-data:/var/lib/mysql
    environment:
      MYSQL_DATABASE: 'my_wiki'
      MYSQL_USER: 'wikiuser'
      MYSQL_PASSWORD: 'sqlpass'
      MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
    networks:
      default:
        aliases:
          - mysql.svc
  wdqs-frontend:
    image: wikibase/wdqs-frontend
    restart: always
    ports:
      - "8282:80"
    depends_on:
      - wdqs-proxy
    networks:
      default:
        aliases:
          - wdqs-frontend.svc
  wdqs:
    image: wikibase/wdqs
    restart: always
    build:
      context: ./wdqs/0.2.5
      dockerfile: Dockerfile
    volumes:
      - query-service-data:/wdqs/data
    command: /runBlazegraph.sh
    networks:
      default:
        aliases:
          - wdqs.svc
  wdqs-proxy:
    image: wikibase/wdqs-proxy
    restart: always
    environment:
      - PROXY_PASS_HOST=wdqs.svc:9999
    ports:
      - "8989:80"
    depends_on:
      - wdqs
    networks:
      default:
        aliases:
          - wdqs-proxy.svc
  wdqs-updater:
    image: wikibase/wdqs
    restart: always
    command: /runUpdate.sh
    depends_on:
      - wdqs
      - wikibase
    networks:
      default:
        aliases:
          - wdqs-updater.svc

volumes:
  mediawiki-mysql-data:
  mediawiki-images-data:
  query-service-data:

Questions

I’ll vaguely keep this section up to date with Qs & As, but if you don’t find your answer here, leave a comment, send an email, or file a Phabricator ticket.

Can I use these images in production?

I wouldn’t really recommend running any of these in ‘production’ yet, as they are new and not well tested. Various things, such as upgrades for the query service and for mediawiki / wikibase, are also not yet documented very well.

Can I import data into these images from an existing wikibase / wikidata? (T180216)

In theory, although this is not documented. You’ll have to import everything using an XML dump of the existing MediaWiki install, and the configuration will have to match on both installs. When importing from an XML dump, the query service will not be updated automatically, and you will likely have to read the manual.

Where was the script that you ran in the demo video?

There is a copy in the github repo called setup.sh, but I can’t guarantee it works in all situations! It was specifically made for a WMCS debian jessie VM.

Links

by addshore at December 03, 2017 02:50 PM

December 02, 2017

Vinitha VS

Time to bond..

The woods are lovely, dark and deep

but I have promises to keep

and miles to go before I sleep..

-Robert Frost

Meeting people has always been exciting.. This time it is twice as exciting. I got to meet the mentors who took time out to review the code I had written and suggested edits. I am deeply moved by their simplicity and care. I am also sending mails and talking to people I have never met, but I believe that, because they are part of an open source community, they will be willing to help when they have time. The world seems more beautiful when connecting with more people. Getting selected is an amazing feeling, but what is more amazing is having some great people to work with..

Another thing I am  doing  is more research  about the task at hand. My mentors have given some ideas and I think unlike other projects, my task requires coming up with effective ideas and reading about what has already been done. This period is so exciting as I am  reading to actually know more and implement an effective system. Knowing more about what to do made me think about other possible ways to implement the tasks. I am confident that I will come up with some idea which will make it all work :).

The internship period will begin soon. I will actually be writing code which people can use. The journey has already begun.. Let it bear the best of all fruits 🙂

by vinithavs at December 02, 2017 03:52 PM

Shyamal

Shocking tales from ornithology

Manipulative people have always made use of the dynamics of ingroups and outgroups to create diversions from bigger issues. The situation is made worse when misguided philosophies are peddled by governments that place economics ahead of ecology. The pursuit of easily gamed targets such as GDP is easy since money is a man-made and controllable entity. Nationalism, pride, other forms of chauvinism, the creation of enemies and the magnification of war threats are all effective tools in the arsenal of Machiavelli for use in misdirecting the masses. One might imagine that the educated, especially scientists, would be smart enough not to fall into these traps but cases from recent history will dampen hopes for such optimism.

There is a very interesting book in German by Eugeniusz Nowak called "Wissenschaftler in turbulenten Zeiten" (or Scientists in Turbulent Times) that deals with the lives of ornithologists, conservationists and other naturalists during the Second World War. Preceded by a series of recollections published in various journals, the book was published in 2010, but I became aware of it only recently while translating some biographies into the English Wikipedia. I have not yet actually seen the book (it has about five pages on Salim Ali as well) and have had to go by secondary quotations in other content. Nowak was a student of Erwin Stresemann (with whom the first chapter deals) and he writes about several European (but mostly German, Polish and Russian) ornithologists and their lives during the turbulent 1930s and 40s. Although Europe is pretty far from India, there are ripples that reached afar. Incidentally, Nowak's ornithological research includes studies on the expansion in range of the collared dove (Streptopelia decaocto), which the Germans called the Türkentaube, literally the "Turkish dove", a name with a baggage of cultural prejudices.

Nowak's first paper of "recollections" notes that: [he] presents the facts not as accusations or indictments, but rather as a stimulus to the younger generation of scientists to consider the issues, in particular to think “What would I have done if I had lived there or at that time?” - a thought to keep as you read on.

A shocker from this period is a paper by Dr Günther Niethammer on the birds of Auschwitz (Birkenau). This paper (read it online here) was published when Niethammer was posted to the security at the main gate of the concentration camp. You might be forgiven if you thought he was just a victim of the war. Niethammer was a proud nationalist and volunteered to join the Nazi forces in 1937 leaving his position as a curator at the Museum Koenig at Bonn.
The contrast provided by Niethammer, who looked at the birds on one side while ignoring the inhumanity on the other, provided novelist Arno Surminski with the title for his 2008 novel Die Vogelwelt von Auschwitz, i.e. the birdlife of Auschwitz.

G. Niethammer
Niethammer studied birds around Auschwitz and also shot ducks in numbers for himself and to supply the commandant of the camp Rudolf Höss (if the name does not mean anything please do go to the linked article / or search for the name online).  Upon the death of Niethammer, an obituary (open access PDF here) was published in the Ibis of 1975 - a tribute with little mention of the war years or the fact that he rose to the rank of Obersturmführer. The Bonn museum journal had a special tribute issue noting the works and influence of Niethammer. Among the many tributes is one by Hans Kumerloeve (starts here online). A subspecies of the common jay was named as Garrulus glandarius hansguentheri by Hungarian ornithologist Andreas Keve in 1967 after the first names of Kumerloeve and Niethammer. Fortunately for the poor jay, this name is a junior synonym of  G. g. anatoliae described by Seebohm in 1883.

Meanwhile inside Auschwitz, the Polish artist Wladyslaw Siwek was making sketches of everyday life  in the camp. After the war he became a zoological artist of repute. Unfortunately there is very little that is readily accessible to English readers on the internet.
Siwek, an artist who documented life at Auschwitz before working as a wildlife artist.
Hans Kumerloeve
Now for Niethammer's friend Dr Kumerloeve, who also worked in the Museum Koenig at Bonn. His name was originally spelt Kummerlöwe and he was, like Niethammer, a doctoral student of Johannes Meisenheimer. Kummerloeve and Niethammer made journeys on a small motorcycle to study the birds of Turkey. Kummerlöwe's political activities started earlier than Niethammer's: he joined the NSDAP (German: Nationalsozialistische Deutsche Arbeiterpartei = The National Socialist German Workers' Party) in 1925 and started the first student union of the party in 1933. Kummerlöwe soon became part of the Ahnenerbe, a think tank meant to give "scientific" support to the party's ideas on race and history. In 1939 he wrote an anthropological study on "Polish prisoners of war". At the museum in Dresden which he headed, he thought up ideas to promote politics and published them in 1939 and 1940. After the war, it is thought that he went to all the European libraries that held copies of this journal (anyone interested in hunting it down should look for copies of Abhandlungen und Berichte aus den Staatlichen Museen für Tierkunde und Völkerkunde in Dresden 20:1-15) and purged them of his article. According to Nowak, he even managed to get his hands (and scissors) on copies held in Moscow and Leningrad!

The Dresden museum was also home to the German ornithologist Adolf Bernhard Meyer (1840–1911). In 1858, he translated the works of Charles Darwin and Alfred Russel Wallace into German and introduced evolutionary theory to a whole generation of German scientists. Among Meyer's amazing works is a series of avian osteological works which uses photography and depict birds in nearly-life-like positions - a less artistic precursor to Katrina van Grouw's 2012 book The Unfeathered Bird. Meyer's skeleton images can be found here. In 1904 Meyer was eased out of the Dresden museum because of rising anti-semitism. Meyer does not find a place in Nowak's book.

Nowak's book includes entries on the following scientists: (I keep this here partly for my reference as I intend to improve Wikipedia entries on several of them as and when time and resources permit. Would be amazing if others could pitch in!).
In the first of his "recollection papers" (his 1998 article) he writes about the reason for writing them  - the obituary for Prof. Ernst Schäfer  was a whitewash that carefully avoided any mention of his wartime activities. And this brings us to India. In a recent article in Indian Birds, Sylke Frahnert and others have written about the bird collections from Sikkim in the Berlin natural history museum. In their article there is a brief statement that "The  collection  in  Berlin  has  remained  almost  unknown due  to  the  political  circumstances  of  the  expedition". This might be a bit cryptic for many but the best read on the topic is Himmler's Crusade: The true story of the 1939 Nazi expedition into Tibet (2009) by Christopher Hale. Hale writes about Himmler: 
He revered the ancient cultures of India and the East, or at least his own weird vision of them.
These were not private enthusiasms, and they were certainly not harmless. Cranky pseudoscience nourished Himmler’s own murderous convictions about race and inspired ways of convincing others...
Himmler regarded himself not as the fantasist he was but as a patron of science. He believed that most conventional wisdom was bogus and that his power gave him a unique opportunity to promulgate new thinking. He founded the Ahnenerbe specifically to advance the study of the Aryan (or Nordic or Indo-German) race and its origins
From there Hale goes on to examine the motivations of Schäfer and his team. He looks at how much of the science was politically driven. Swastika signs dominate some of the photos from the expedition - as if it provided for a natural tie with Buddhism in Tibet. It seems that Himmler gave Schäfer the opportunity to rise within the political hierarchy. The team that went to Sikkim included Bruno Beger. Beger was a physical anthropologist but with less than innocent motivations although that would be much harder to ascribe to the team's other pursuits like botany and ornithology. One of the results from the expedition was a film made by the entomologist of the group, Ernst Krause - Geheimnis Tibet - or secret Tibet - a copy of this 1 hour and 40 minute film is on YouTube. At around 26 minutes, you can see Bruno Beger creating face casts - first as a negative in Plaster of Paris from which a positive copy was made using resin. Hale talks about how one of the Tibetans put into a cast with just straws to breathe from went into an epileptic seizure from the claustrophobia and fear induced. The real horror however is revealed when Hale quotes a May 1943 letter from an SS officer to Beger - ‘What exactly is happening with the Jewish heads? They are lying around and taking up valuable space . . . In my opinion, the most reasonable course of action is to send them to Strasbourg . . .’ Apparently Beger had to select some prisoners from Auschwitz who appeared to have Asiatic features. Hale shows that Beger knew the fate of his selection - they were gassed for research conducted by Beger and August Hirt.
SS-Sturmbannführer Schäfer at the head of the table in Lhasa

In all, Hale makes a clear case that the Schäfer mission had quite a bit of political activity underneath. We find that Sven Hedin (of whom Schäfer was a big fan in his youth; Hedin was a Nazi sympathizer who funded and supported the mission) was in contact with fellow Nazi supporter Erica Schneider-Filchner and her father Wilhelm Filchner in India, both of whom were later interned at Satara, while Bruno Beger made contact with Subhash Chandra Bose more than once. [Two of the pictures from the Bundesarchiv show a certain Bhattacharya, who appears to be a chemist working on snake venom at the Calcutta snake park; one wonders if he is Abhinash Bhattacharya.]

My review of Nowak's book must be uniquely flawed, as I have never managed to access it beyond some online snippets and English reviews. The war had impacts on the entire region, and Nowak's coverage is limited; there were many other interesting characters, including the Russian ornithologist Malchevsky, who survived German bullets thanks to a fat bird observation notebook in his pocket! In the 1950s Trofim Lysenko, the crank scientist who controlled science in the USSR, sought Malchevsky's help in proving his own pet theories, one of which was the idea that cuckoos were the result of feeding hairy caterpillars to young warblers!

Issues arising from race and perceptions are of course not restricted to this period or region. One of the less glorious stories of the Smithsonian Institution concerns the honorary curator Robert Wilson Shufeldt (1850–1934), who in the infamous Audubon affair turned his personal troubles with his second wife, a grand-daughter of Audubon, into a matter of race. He also wrote such books as America's Greatest Problem: The Negro (1915), in which we learn of the ideas of other scientists of the period like Edward Drinker Cope! Like many other obituaries, Shufeldt's is a classic whitewash.

Even as recently as 2015, the University of Salzburg withdrew an honorary doctorate that they had given to the Nobel prize winning Konrad Lorenz for his support of the political setup and racial beliefs. It should not be that hard for scientists to figure out whether they are on the wrong side of history even if they are funded by the state. Perhaps salaried scientists in India would do well to look at the legal contracts they sign with their employers, the state, more carefully.

PS: Mixing natural history with war sometimes led to tragedy for the participants as well. In the case of Dr Manfred Oberdörffer who used his cover as an expert on leprosy to visit the borders of Afghanistan with entomologist Fred Hermann Brandt (1908–1994), an exchange of gunfire with British forces killed him although Brandt lived on to tell the tale.

by Shyamal L. (noreply@blogger.com) at December 02, 2017 03:51 PM

December 01, 2017

Wikimedia Foundation

Esra’a Al Shafei joins Wikimedia Foundation Board of Trustees

Image by Takasakiyama, public domain.

The Wikimedia Foundation today announced the appointment of Esra’a Al Shafei, a prominent human rights activist and a passionate defender of free expression, to the Wikimedia Foundation Board of Trustees.

A native of Bahrain, Esra’a’s work aims to increase and protect free speech, promote expression for youth and underrepresented voices, and improve the lives of LGBTQ people in the Middle East and North Africa. She founded and directs Majal, a network of online platforms that amplify under-reported and marginalized voices.

“Esra’a shares Wikimedia’s foundational belief that shared knowledge can facilitate shared understanding,” said Wikimedia Foundation Executive Director, Katherine Maher. “Her achievements exemplify how intentional community building can be a powerful tool for positive change, while her passion for beautiful and engaging user experiences will only elevate our work. We are so fortunate to have her perspective in support of our global Wikimedia communities.”

Esra’a founded Majal in 2006 as Mideast Youth, at the time a series of blogs bringing a voice to marginalized and underrepresented young people across the Middle East. Today, the organization’s team helps build communities that celebrate, protect, and promote diversity and social justice. Their endeavors include CrowdVoice.org, which curates crowdsourced media to contextualize social movements throughout the world; Mideast Tunes, the largest web and mobile app showcasing underground musicians in the Middle East and North Africa who use music as a tool for social change; and Ahwaa.org, an open discussion platform for Arab LGBTQ individuals that uses game mechanics to protect and engage its community.

“When I first encountered Wikipedia shortly after obtaining an internet connection in the early 2000s, I felt that the true purpose of the internet was realized. With Wikipedia, I accessed research regarding persecuted communities in my home country and the wider region: ethnic and religious minorities whom we were discouraged from learning about, and whose histories and beliefs were dictated to us from a singular government perspective. Wikipedia’s open source and crowdsourcing practices would inspire the platforms I built to advocate for underrepresented communities, and the internet would shape my life’s work in advocating for freedom of expression and identity around the world,” said Esra’a.

Esra’a received the Berkman Award for Internet Innovation from the Berkman Klein Center for Internet & Society at Harvard Law School in 2008 for “outstanding contributions to the internet and its impact on society.” The World Economic Forum listed her as one of “15 Women Changing the World in 2015.” She has won the “Most Courageous Media” Prize from Free Press Unlimited, and the Monaco Media Prize, which acknowledges innovative uses of media for the betterment of humanity.

She has been featured in Fast Company as one of the “100 Most Creative People in Business;” in The Daily Beast as one of the 17 bravest bloggers worldwide; and in Forbes’ “30 Under 30” list of social entrepreneurs making an impact in the world.

Esra’a was a keynote speaker at Wikimania 2017 in Montreal, the annual conference centered on the Wikimedia projects.

“Esra’a brings tech expertise and a valuable perspective to the Board – coming from a region where access to information is not taken for granted. I was impressed by her talk during Wikimania 2017 on ‘Experiences from the Middle East: Overcoming Challenges and Serving Communities’. I think her experience in that region will be important to our efforts around the globe.” said Nataliia Tymkiv, Governance Chair for the Board.

Esra’a is a senior TED Fellow, an Echoing Green fellow, and a Director’s Fellow at the MIT Media Lab. She received a Shuttleworth Foundation Fellowship in 2012 for her work on CrowdVoice.org.

Esra’a joins nine other Foundation Trustees who collectively bring expertise in the Wikimedia community, financial oversight, governance, and organizational development; and a commitment to advancing Wikimedia’s mission of free knowledge for all.

She was approved unanimously by the Wikimedia Foundation Board of Trustees. Her term is effective December 2017 and will continue for three years. Please see the Wikimedia Foundation Board of Trustees for complete biographies.

About the Wikimedia Foundation

The Wikimedia Foundation is the non-profit organization that supports and operates Wikipedia and its sister free knowledge projects. Wikipedia is the world’s free knowledge resource, spanning more than 45 million articles across nearly 300 languages. Every month, more than 200,000 people edit Wikipedia and the Wikimedia projects, collectively creating and improving knowledge that is accessed by more than 1 billion unique devices. This all makes Wikipedia one of the most popular web properties in the world. Based in San Francisco, California, the Wikimedia Foundation is a 501(c)(3) charity that is funded primarily through donations and grants.

Wikimedia Foundation press contact

Samantha Lien
Communications Manager
press@wikimedia.org

by Wikimedia Foundation at December 01, 2017 09:00 PM

Wiki Education Foundation

Welcome, Will!

I’m happy to announce a new staff member at Wiki Education, Will Kent. Will’s role is unique, as we have two staff members taking leave in the next year. Will has graciously agreed to join the team as a Program Manager, providing interim coverage for the Classroom Program beginning this week, and coverage for the Visiting Scholars Program and Wikipedia Fellows pilot later in the spring.

Will holds a bachelor’s degree in anthropology from Tufts University and a master’s degree in library and information science from the University of Illinois. In his previous role as a librarian at Loyola University in Chicago, Will advocated for open access and supported engagement with Wikipedia on campus. He is familiar with the Wikipedia editing community, having facilitated edit-a-thons during his time in Chicago. Will’s diverse skillset makes him an excellent fit for this unique position.

Initially, Will’s responsibilities include responding to instructor questions and providing support as they participate in the Classroom Program. Later this spring, he will pick up management of the Visiting Scholars Program, an initiative where we pair universities and their resources with existing Wikipedia editors. Will’s role helps facilitate general communication between the Wikipedia editing community, instructors, and students. He’ll also wrap up our Wikipedia Fellows pilot program, working with us to determine whether we should repeat the pilot.

Outside of work, you can find Will scurrying around the Bay Area on his bicycle, exploring neighborhoods, going to shows, cooking, and bridging the gap between real life and the internet.

Glad to have you join our team, Will!

by LiAnna Davis at December 01, 2017 04:52 PM

Wikimedia Tech Blog

Wikimedia Foundation funds six Outreachy interns for round 15

Photo by Victor Grigas, CC BY-SA 3.0.

Wikimedia has accepted six interns through the Outreachy program, who will work on a wide variety of Wikimedia projects (five coding projects and one translation project) with help from twelve Wikimedia mentors from December 2017 through March 2018.

Outreachy is an internship program coordinated twice a year by Software Freedom Conservancy with the goal of bringing people from backgrounds underrepresented in tech into open source projects. Seventeen open source organizations are participating in this round and will be working with a total of 42 interns, six of them with Wikimedia. Accepted interns will be funded by the Wikimedia Foundation’s Technical Collaboration team, which coordinates Wikimedia’s participation in various outreach programs.

Below are details of the accepted projects:

Translation outreach: User guides on MediaWiki.org – Intern: Anna e só, Goiânia, Brazil; Mentor(s) Johan Jönsson, Benoît Evellin

There is a lot of technical documentation in English around Wikimedia projects and topics hosted on MediaWiki.org. This documentation is usually translatable but is rarely translated, and seldom into more than a few languages. Becoming and staying engaged as a translator is also difficult. This project will help develop outreach strategies to reach potential translators, with the ultimate goal of providing technical documentation to readers in a language they are comfortable with.

User contribution summary tool – Intern: Megha Sharma, Punjab, India; Mentor(s) Gergő Tisza, Stephen LaPorte

Being able to take a sneak peek at your contributions can be rewarding in the open source world and keeps you motivated to stay involved. On Wikipedia, due to its highly collaborative nature, it is not easy for editors to take credit for the value they have added to an article. This project is a first step toward tackling this problem through a new tool that would allow Wikipedia editors to view a summary of their contributions.

Improvements to Grants review and Wikimania scholarships web apps – Intern: Neha Jha, New Delhi, India; Mentor(s) Niharika Kohli, Bryan Davis

Wikimedia Grants review and Wikimania scholarship web applications are two similar platforms that enable users to submit scholarship applications and for administrators to review and evaluate them. This project will help make planned improvements to these apps.

Refactoring the MediaWiki MassMessage Extension – Intern: Noella, Douala, Cameroon; Mentor(s) Alangi Derick, Legoktm

The MassMessage extension allows a user to send a message to a list of pages via the special page Special:MassMessage. Large wikis already use this extension: Wikimedia Commons, MediaWiki, Meta-Wiki, Wikipedia, Wikidata, and others. This project is about paying down the technical debt that has accumulated over the years the extension has been in use, bringing it in line with current MediaWiki standards.

Improve Programs & Events Dashboard support for Art+Feminism 2018 – Intern: Candela Jiménez Girón, Berlin, Germany; Mentor(s) Sage Ross, Jonathan Morgan

Programs & Events Dashboard helps organize and track group editing projects on Wikipedia and other wikis (such as edit-a-thons). One of the major use cases of this dashboard is the Art+Feminism project, a worldwide program of edit-a-thons that takes place in March every year. A range of problems was identified during the 2017 edition, and new features were requested afterward. This project will focus on improving the Dashboard, with help from organizers, to better support the 2018 edition.

Automatically detect spambot registration using machine learning like invisible reCAPTCHA – Intern: Vinitha VS, Telangana, India; Mentor(s) Gergő Tisza, Stephen LaPorte

Wikimedia’s current captchas can easily be cracked by spam bots, yet it often takes a human multiple attempts to solve them. The statistics show a failure rate of around 30%, and we don’t know what percentage of this is due to bots. Current captchas are problematic because they allow registration by bots while troubling people with visual impairments or limited English skills. This project aims to develop a revised captcha system that is friendlier to humans and harder for bots to crack.

You can stay up to date with the progress of these projects through the reports our interns will write on their personal blogs. Our interns are quite excited about the opportunity and dedicated to making their projects successful. One of the accepted interns wrote in their blog post before the results were out: “Even if my name doesn’t appear in the Outreachy’s interns page this Thursday I want to make a pin to celebrate what I made in the last two months. I dedicated a lot of time to contribute to Wikimedia and all the things I learned there were really important. I don’t regret anything.”

We would like to thank everyone who applied to Wikimedia. We received a total of fourteen robust proposals, out of which we chose six. Wikimedia mentors spent a great deal of time during the application period mentoring candidates, reviewing their pull requests, and giving them feedback on their proposals.

Stay tuned for updates on Wikimedia’s participation in outreach programs. If you are interested in applying for the next round of Outreachy, remember that applications will be due in the last week of March 2018.

Srishti Sethi, Developer Advocate, Developer Relations
Wikimedia Foundation

This post has been updated to add the name of one of the six Outreachy interns.

by Srishti Sethi at December 01, 2017 04:36 PM

Weekly OSM

weeklyOSM 384

21/11/2017-27/11/2017


The currently most active OpenStreetMap contributors in almost realtime 1 | © Frederik Ramm

Mapping

  • Martijn van Exel is working on a new version of MapRoulette and asks users to name the feature they miss the most.
  • Selfish Seahorse asks on the mailing list how to correctly tag a street with a barrier but a double-sided passage for pedestrians and cyclists.
  • The Telenav Mapping Team is active in Ecuador and invites the community to collaborate. Activities will initially focus on Quito, Guayaquil, Cuenca, Machala and Loja.
  • On Reddit the question was raised if it is useful that StreetComplete adds the tag cycleway=no to roads that were reviewed with the app.

Community

  • [1] Frederik Ramm has published a visualisation website which shows the currently most active OpenStreetMap users in almost realtime.
  • Edward Betts has written a tool that makes it easy for OSM newcomers to find a mailing list relevant to them from the OSM environment. Local mailing lists could be added to the tool.
  • Geochicas celebrate their first anniversary!
  • This week SotMLatam 2017 is underway in Lima! Mappers from communities in Costa Rica, Chile, Colombia, Bolivia, Argentina, Mexico and Peru are meeting at the Faculty of Social Sciences of the National University of San Marcos in Peru. Public officials and researchers in technology and open data are also participating, exploring solutions for a range of topics.

OpenStreetMap Foundation

  • Four people are now candidates for the two vacancies on the OSMF board of directors. The additional candidates are Paul Norman, who is seeking re-election, and David Dean. Although the discussions on the osmf-talk mailing list can be heated, dedicated mappers will certainly find good arguments for their election decision. The discussions revolve around: gender equality (although it remains unclear how this equality is to be measured); the possible influence of HOT US Inc. on the OSMF; the possible influence of companies; and the under-representation of areas where few people are mapping. See also the Q&A with candidates on the wiki.
  • The Data Working Group released a draft Directed Editing policy (formerly known as ‘Organised Editing Policy’ or ‘Paid Editing Policy’). There is a short discussion on the Talk mailing list and a longer discussion on the OSMF-Talk mailing list where some people express their concern that the policy will impede the work of humanitarian mapping activities.
  • After many years of activity, the OSM community in the United Kingdom announced the decision to form an official OSM Local Chapter. The formally registered organisation plans to continue the good work on improving data collection, import and use, and on strengthening the local community.

Humanitarian OSM

  • Nate Smith reports about HOT’s technology planning for 2018: more recognition and validation of volunteer roles; development of further partnerships for more training and tool development; more involvement of the community in the planning of core HOT tools and processes.
  • ThinkWhere, a Scottish GIS company, worked with HOT – the Humanitarian OpenStreetMap Team – to develop version 3 of the Tasking Manager.
  • The University of the Philippines is using OSM to develop resilience and natural disaster planning for its campuses.

Maps

  • OpenStreetMap is used in newly published paper Michelin maps for the islands of La Reunion, Martinique and Guadeloupe.

switch2OSM

  • Milad Moradi points out an MSc thesis on the quality of OSM for the Canadian road network. Hongyu Zhang from the University of Western Ontario finds OSM data is of comparable quality to other sources.

Open Data

  • The UK Finance Minister Philip Hammond announced that a new Geospatial Commission will be formed to maximise the value of all UK government data linked to location. The mention of a possible release of the national map agency’s (Ordnance Survey) MasterMap has prompted a great deal of commentary: here, here and here; a discussion on talk-gb; and a statement from OSM-UK.

Programming

  • Oleksiy Muzalyev writes that he has written a simple tool to report averaged values of ele tags on OpenStreetMap.
  • Simone Primarosa (aka simoneepri) published a tool that extracts GeoJSONs of the boundaries of OpenStreetMap and other open databases.

OSM in the media

  • Bike Citizens Mobile Solutions, a Graz software company (automatic translation), helps cities to better understand cycling. Their app allows cyclists to share issues with the cycle network, generating data that provides information about behaviour, obstacles and route choices in urban cycling. Kofler explains: “We help the city planners to better understand the cycling traffic and thus adjust the infrastructure accordingly.”

Other “geo” things

  • Richard Fairhurst uses a railway map to offer a few hints about poor cartographic choices.
  • Helios Pro, an augmented reality app for iPhone and iPad, uses, among other data sources, OpenStreetMap buildings to render a 3D reconstruction of buildings.
  • ESA will soon be launching Galileo satellites number 19 to 22 from Kourou. (de) (automatic translation)
  • GIS finds its space on campus at the Center for Geospatial Analysis (CGA), housed on the second floor of the Earl Gregg Swem Library. It is intended as a space for anyone working on any type of GIS project.
  • Ahmed Loai Ali from the University of Bremen looks for participants for a study on the influence of human cognition on data classification in OSM.
  • Inside Culture, a programme on the Irish Radio station RTE1, provides an introduction to the Situationists, and also covers the recent 4th World Conference of Psychogeography (4wcop) in Huddersfield and psychogeography in Dublin. OSMer Tim Waters’ blog gives more links to 4wcop talks.
  • Wired suggests that “cartography is the new code” and that a shortage of skilled cartographers is developing.
  • The weekly journal “Der Spiegel” published an article “How to dream” discussing four interesting books (also for the wish list 😉 ) (de) (automatic translation). Note that all the books are also available in English:
    • Alastair Bonnett, New Views, The World Mapped Like Never Before: 50 maps of our physical, cultural and political world
    • Edward Brooke-Hitching, The Phantom Atlas: The Greatest Myths, Lies and Blunders on Maps
    • Georg Braun, Franz Hogenberg, Cities of the World, The Colored Tables from 1572-1617
    • Jasmine Desclaux-Salachas, The Art of Cartographics
  • The MIT Media Lab Emerging Worlds programme discusses on their blog the suitability of various address schemes for India.

Upcoming Events

Where What When Country
Lima State of the Map LatAm 2017 2017-11-29-2017-12-02 perú
London Pub meet-up 2017-11-30 united kingdom
Yaoundé State of the Map Cameroun 2017 2017-12-01-2017-12-03 cameroun
Denver Online High School Mitchell Foundation board elections 2017-12-02 everywhere
Dortmund Mappertreffen 2017-12-03 germany
Grenoble Rencontre groupe local 2017-12-04 france
Rostock Rostocker Treffen 2017-12-05 germany
Albuquerque MAPABQ (join us!) 2017-12-06 united states
Stuttgart Stuttgarter Stammtisch 2017-12-06 germany
Montreal Les Mercredis cartographie 2017-12-06 canada
Antwerp Missing Maps @ IPIS 2017-12-06 belgium
Praha – Brno – Ostrava Kvartální pivo 2017-12-06 czech republic
Dresden Stammtisch 2017-12-07 germany
Berlin 114. Berlin-Brandenburg Stammtisch 2017-12-08 germany
Thessaloniki Ελληνικό OSM Party 2017-12-08 greece
Dar es Salaam State of the Map Tanzania 2017 2017-12-08-2017-12-10 tanzania
Denver Online High School Mitchell Foundation board elections 2017-12-09 everywhere
online via IRC Foundation Annual General Meeting 2017-12-09 everywhere
Rennes Réunion mensuelle 2017-12-11 france
Lyon Rencontre mensuelle 2017-12-12 france
Nantes Réunion mensuelle 2017-12-12 france
Toulouse Rencontre mensuelle 2017-12-13 france
Munich Stammtisch 2017-12-14 germany
Moscow Schemotechnika 13 2017-12-14 russia
Berlin DB Open Data Hackathon 2017-12-15-2017-12-16 germany
Rome FOSS4G-IT 2018 2018-02-19-2018-02-22 italy
Bonn FOSSGIS 2018 2018-03-21-2018-03-24 germany
Poznań State of the Map Poland 2018 2018-04-13-2018-04-14 poland
Milan State of the Map 2018 (international conference) 2018-07-28-2018-07-30 italy

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Laura Barroso, Nakaner, Polyglot, SK53, Spanholz, YoViajo, derFred, jinalfoflia.

by weeklyteam at December 01, 2017 08:39 AM

Wikipedia Weekly

Wikipedia Weekly #127 – WikidataCon 2017

This Wikipedia Weekly podcast episode was recorded in Berlin, Germany, following WikidataCon 2017. This roundtable of five attendees reflects on the first-ever conference dedicated to Wikidata. We’re joined by guests Stacy Allison-Cassin, W.P. Scott Chair in E-Librarianship at York University in Canada, and Wikimedian of the Year 2016 Rosie Stephenson-Goodknight.

Participants: Andrew Lih (User:Fuzheado), Liam Wyatt (User:Wittylama), Stacy Allison-Cassin (User:Smallison), Rosie Stephenson-Goodknight (User:Rosiestep), Rob Fernandez (User:Gamaliel)

Opening music: “At The Count” by Broke For Free is licensed under CC-BY-3.0; closing music: “Things Will Settle Down Eventually” by 86 Sandals is licensed under CC-BY-SA-3.0.

All original content of this podcast is licensed under CC-BY-SA-4.0.

by admin at December 01, 2017 04:51 AM

November 30, 2017

Wiki Education Foundation

Wiki Education partners with American Anthropological Association

Wiki Education has launched another educational partnership—with the American Anthropological Association (AAA). AAA will encourage its members to participate in Wiki Education’s Classroom Program, where students write Wikipedia articles as a classroom assignment. Wikipedia is the public’s first stop for information about anthropology, and it’s an important part of AAA’s mission to ensure accuracy in the content people find when they search online for information about anthropology topics. Students will add anthropological perspectives to existing articles, contributing to a more complete representation of academic topics.

As part of this new partnership, Wiki Education is attending AAA’s annual meeting this week in Washington, D.C. to encourage attendees to join our programs. We will help instructors design a meaningful assignment for students, give them access to the Wiki Education Dashboard, provide support materials about how to contribute to Wikipedia, and otherwise help students as they improve Wikipedia’s coverage of anthropology.

If you’re attending the AAA meeting, please join me and Communications Associate Cassidy Villeneuve in the exhibit hall to learn how to join Wiki Education’s Classroom Program. For more information about how to work together to improve Wikipedia, email contact@wikiedu.org.

by Jami Mathewson at November 30, 2017 07:04 PM