July 22, 2018

One particularly interesting topic discussed during the Hackathon Technical Debt session (T194934) was that of the contagious aspect of technical debt. Although this makes sense in hindsight, it's not something that I had really given much thought to previously.

The basic premise is that existing technical debt can have a contagious effect on other areas of code. One aspect of this is that developers new to the MediaWiki code base may use existing code as a pattern for new code development. If that code has technical debt, the technical debt could get replicated in other areas of code.

This can be overcome with both education about desired patterns and sharing the technical debt state of existing code. It's not clear how best to accomplish the latter, but perhaps it's as simple as a comment in the code, once the debt has been identified and is being tracked in Phabricator.
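As a minimal sketch of the "comment in the code" idea, the Python script below scans source text for a marker comment that names a Phabricator task. The marker format `TECHDEBT(T<number>)`, the task number T12345, and the function name are assumptions made up for illustration; they are not an existing MediaWiki convention.

```python
import re

# Hypothetical marker format: "TECHDEBT(T12345): free-text note".
# The T-number is assumed to be the Phabricator task tracking the debt.
MARKER = re.compile(r"TECHDEBT\((T\d+)\):?\s*(.*)")


def find_debt_markers(source):
    """Return (line_number, task_id, note) for each marker found in source."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            found.append((number, match.group(1), match.group(2).strip()))
    return found


if __name__ == "__main__":
    # Made-up sample file content; T12345 is not a real debt task.
    sample = "\n".join([
        "<?php",
        "// TECHDEBT(T12345): uses global state, do not copy this pattern",
        "function legacyHelper() {}",
    ])
    for number, task, note in find_debt_markers(sample):
        print(f"line {number}: {task} - {note}")
```

A marker like this would let a new developer see, at the point of copying, that the code is not meant to serve as a pattern, and a report built from the scan could be compared against the tasks tracked in Phabricator.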

Another aspect of the contagion effect (perhaps more of a compound effect), is the result of maintaining code with existing technical debt. As bugs are fixed or minor features added, those changes can, in effect, result in a spreading of the technical debt. Of course this doesn't always need to be the case, but it can be, if one is not careful.

I'd like to get your thoughts on this topic and your past experiences working with and around technical debt.

Thoughts/Questions:

  • Are some areas of code more contagious than others?
  • What are some ways to mark technical debt as such?
  • What do you do when you need to work on code with significant technical debt?

It has been a while since the last mediawiki_selenium release! 💎

I have just released version 1.8.1. 🚀

Notable changes:

  • Required Ruby version is 2.x
  • Upgrade selenium-webdriver to 3.2
  • Integration tests use Chrome instead of PhantomJS
  • Added license to readme file
  • Documented Sauce Labs usage in readme file
  • Updated Special:Preferences/reset page

I would like to thank several contributors that have improved the gem since the last release: @hashar, @Rammanojpotla, @demon and @thiemowmde! 👏

Between 2017-11-20 and 2017-12-01, the Wikimedia Foundation ran a direct response user survey of registered Toolforge users. 141 email recipients participated in the survey which represents 11% of those who were contacted.
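As a rough check on the response rate, the number of people contacted can be bounded from the two published figures. The sketch below assumes the 11% figure was rounded to the nearest whole percent; the function name is made up for illustration, and the result is an estimate, not a number reported by the survey.

```python
import math


def contacted_range(respondents, rounded_percent):
    """Return (low, high) population sizes consistent with a rounded percentage.

    Assumes rounded_percent was rounded to the nearest whole number,
    so the true share lies in [rounded_percent - 0.5, rounded_percent + 0.5).
    """
    lowest_share = rounded_percent - 0.5
    highest_share = rounded_percent + 0.5
    # The smallest share corresponds to the largest possible population.
    high = math.floor(respondents * 100 / lowest_share)
    # The share must stay strictly below highest_share, hence the + 1.
    low = math.floor(respondents * 100 / highest_share) + 1
    return low, high


if __name__ == "__main__":
    low, high = contacted_range(141, 11)
    print(f"between {low} and {high} people contacted")
```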

Demographic questions

Based on responses to demographic questions, the average [1] respondent:

  • Has used Toolforge for 1-3 years
  • Developed 1-3 tools & actively maintains 1-2 tools
  • Spends an hour or less a week maintaining their tools
  • Programs using Python and/or PHP
  • Does 80% or more of their development work locally
  • Uses source control
  • Was not a developer or maintainer on Toolserver

[1]: "Average" here means a range of responses covering 50% or more of responses to the question. This summarization is coarse, but useful as a broad generalization. Detailed demographic response data is available on wiki.

Qualitative questions

  • 90% agree that services have high reliability (up time). Up from 87% last year.
  • 78% agree that it is easy to write code and have it running on Toolforge. Up from 71% last year.
  • 59% agree that they feel they are supported by the Toolforge team when they contact them via cloud mailing list, #wikimedia-cloud IRC channel, or Phabricator. This is down dramatically from 71% last year, but interestingly this question was left unanswered by 36% of respondents.
  • 59% agree that they receive useful information via cloud-announce / cloud mailing lists. Up from 46% last year.
  • 52% agree that documentation is easy to find. This is up from 46% last year and the first time crossing the 50% point. We still have a long way to go here though!
  • 96% find the support they receive when using Toolforge as good or better than the support they received when using Toolserver. Up from 89% last year.
  • 50% agree that Toolforge documentation is comprehensive. No change from last year.
  • 53% agree that Toolforge documentation is clear. Up from 48% last year.

Free form responses

The survey included several free form response sections. Survey participants were told that we would only publicly share their responses or survey results in aggregate or anonymized form. The free form responses include comments broadly falling into these categories:

  • Documentation (58 comments)
  • Platform (48 comments)
  • Workflow (48 comments)
  • Community (17 comments)
  • Support (6 comments)

Documentation

Comments on documentation included both positive recognition of work that has been done to improve our docs and areas that are still in need of additional work. Areas with multiple mentions include need for increased discoverability of current information, better getting started information, and more in depth coverage of topics such as wiki replica usage, Kubernetes, and job grid usage.

There were also comments asking for a self-service version control system and pre-installed pywikibot software. Both of these are offered currently in Toolforge, so these comments were classified as missing or difficult to find documentation.

Platform

Comments about the Toolforge platform have been subcategorized as follows:

  • Software (26 comments)
    • The majority of software comments were related to a desire for newer language runtime versions (PHP, Java, nodejs, Python) and more flexibility in the Kubernetes environment.
  • Database (10 comments)
    • Database comments include praise for the new Wiki Replica servers and multiple requests for a return of user managed tables colocated with the replica databases.
  • Reliability (10 comments)
    • Reliability comments included praise for good uptime, complaints of poor uptime, and requests to improve limits on shared bastion systems.
  • Hardware (2 comments)
    • (sample too small to summarize)

Workflow

  • Deploy (12 comments)
    • The major theme here was automation for software deployment including requests for full continuous delivery pipelines.
  • Debugging (10 comments)
    • People asked for better debugging tools and a way to create a more full featured local development environment.
  • Monitoring (10 comments)
    • Monitoring comments included a desire for alerting based on tracked metrics and tracking of (more) metrics for each tool.
  • Setup (10 comments)
  • Files (6 comments)
    • Improved workflows for remote editing and file transfer are desired.

Community

Comments classified as community related broadly called for more collaboration between tool maintainers and better adherence to practices that make accessing source code and reporting bugs easier.

Support

Support related comments praised current efforts, but also pointed to confusion about where to ask questions (irc, email, phabricator).

To be blunt, what Mr Jacobs is talking about is one or more steps removed from the Wikimedia reality. His story is important and indicates that a specific type of source exists and is available for study. Mr Jacobs informs us about the importance of Twitter for the Zulu language.

Mr Jacobs is an academic and the reality of Zulu Wikipedia is that only a few days ago we celebrated article number 1000 for the Zulu Wikipedia. What the Zulu Wikipedia needs is high school students writing in Zulu. Writing about what is important to them, what is important to their curriculum and to their world.

Just consider what one high school could do. Now consider what ten high schools could do. Compare that with one academic or what all the Zulu students currently in university could do.

Yes, history has been written so far and it does report in a biased way. If the Zulu language is to gain a foothold in the Wikimedia world, we need many people involved in writing the most basic information first. Once there is a basis, the sources Mr Jacobs mentions become relevant in a Zulu Wikipedia.
Thanks,
      GerardM

July 21, 2018

The exhibition “Changing Tartu in Four Views” was the first from Tartu Art Museum to end up in Wikipedia—but it wasn’t the last one. Photo by Merli-Triin Eiskop, CC BY-SA 4.0.

 

While Estonian museums hold nearly nine million items, only 3.5% of them are displayed in exhibitions. The case isn’t significantly different in other countries: museums have limited display space and can only open up a fraction of their collections to the public on certain occasions. Many of the preserved items may never even leave their storage rooms, which sometimes have poor temperature and humidity conditions. In some cases, the undisplayed collections include the work of well-known artists.

We started to actively engage museums in 2017, which helped us collaborate on several projects. Tartu Art Museum, the second biggest art museum in Estonia, was one of the first to join the initiative. We used the opportunity to feature the artworks uploaded as part of the Europeana 1914–1918 project, which lets Wikipedia readers take a closer look at the possible influence of World War I on Estonian art.

The decisive years prior to Independence: Moments from the Estonian art of 1914–18, our first virtual exhibition, is divided into three sections: an intro, an image gallery, and a guest book. The image gallery displays 46 pieces of art from the collections of the Tartu Art Museum and the Art Museum of Estonia, accompanied by descriptions in English and Finnish in addition to Estonian.

Helped by several Wikipedians, the author of this post worked with Merli-Triin Eiskop from the Tartu Art Museum and Stina Sarapuu from the Art Museum of Estonia to get the pieces assembled.

Digitization helps get more artworks seen by the public. However, it will take a long time before many pieces get online. Even when people are ready to volunteer, it is not yet the norm for museums to be willing to share their original reproductions with the world rather than limiting access to downsized samples.

When we get to the point where images are no longer locked away, another challenge arises. How can we present this material and build context for it?

This is where Wikipedia can step in: not just the online encyclopedia, but all of the Wikimedia movement, together with the community that keeps it running. So far we have been focusing on collecting images and media files. When museums are ready to provide some images, we happily bring them to Wikimedia Commons, the free media repository, and link them to data items on Wikidata. With more museums donating their collections, the harder part will be not only showcasing the artworks but doing anything at all with the files after they are uploaded.

With digitalization and with open sharing, we are rapidly moving towards an era where much of the human culture will be freely accessible to every human being. That allows us to step on the shoulders of our forerunners and reach even higher.

But that also poses new challenges. Will we make sure that this information is indeed made available for reuse so that we can actually build on it? And can we provide the means to organize it in a meaningful way so that it can be found by the people looking for it?

Ivo Kruusamägi, Wikimedian from Estonia

In brief

Krishna Chaitanya Velaga. Photo by Saileshpat, CC BY-SA 4.0.

The Wikipedia Library names Krishna Chaitanya Velaga star coordinator: Krishna Chaitanya Velaga was named by The Wikipedia Library (TWL) as the star coordinator for January to March 2018, for his dedicated work that inspired the creation of the Hindi Wikipedia Library. Velaga has engaged in several outreach and partnership activities to support the creation of new TWL branches in his country.

Velaga recalls that he found out about The Wikipedia Library around the time he started editing Wikipedia. That was a result of his quest to contribute to military history articles on Wikipedia. Velaga joined the program as a user when he was granted access to four publications that helped him write about military history. He used this material to improve one featured article in addition to several A-Class articles.

Velaga then realized that no publisher in India had partnered with TWL. He signed up as a partnership coordinator and spoke at several conferences including the Hindi Wikipedia Conference and OpenCon 2018 New Delhi, which he helped organize.

As an outreach coordinator, Velaga is currently trying to spread awareness about the need for reliable sources. He is also trying to establish a new partnership with the British Council in India for a new visiting scholars program, and he is leading two 1Lib1Ref training courses in India.

Photo by Ssofija, CC BY-SA 4.0.

Shared Knowledge group in Macedonia hold their first WikiGap event in Skopje: On 20 June 2018, Shared Knowledge group in Macedonia held a big editathon in Skopje as part of the WikiGap initiative to help increase gender diversity on Wikipedia. The event was attended by 50 participants who worked on improving 100 articles on the Macedonian Wikipedia.

“We focused on creating a list of women from Macedonia who had an impact on the social, cultural and public life,” Toni Ristovski from Shared Knowledge told us. “The event went live under the title ‘Prominent women from Macedonia.’ We used the Macedonian Encyclopedia as a main source, which gave us material for more than 400 women biographies, two thirds of whom don’t exist on the Macedonian Wikipedia. So, the main focus of this edit-a-thon was creating women profiles and sharing them with the world.”

The four-hour event was held in collaboration with the Swedish embassy in Skopje, UN Women and the Ministry of Labor and Social Affairs.

Photo by Conrad Nutschan, Public domain.

A group of Wikimedian Jules Verne fans take a tour to explore his heritage: From 4 to 6 May 2018 a group of researchers, librarians, publishers, and Wikimedians came together in Braunschweig, Germany to explore the ways and heritage of Jules Verne.

The symposium took place at the delightful building of Braunschweigische Stiftungen, the Gerloffsche Villa, and was organised by Wikimedian Brunswyk and supported by Wikimedia Deutschland in collaboration with Stiftung Braunschweigischer Kulturbesitz and the Jules-Verne-Club Bremerhaven. Read more about the event on the German Wikipedia.

Samir Elsharbaty, Writer, Communications
Wikimedia Foundation

Image showing the infobox in source editor, and the resulting infobox on Commons.

Mike Peel is a UK Wikimedian who is running a session on the development of multilingual infoboxes on Wikimedia Commons at Wikimania 2018 in Cape Town. You can see the session information here.

_____________________________________________________

Wikimedia Commons is a multilingual repository holding multimedia related to all possible topics. All language Wikipedias, as well as other projects, rely on the content hosted on Commons. However, MediaWiki is monolingual by default, so the de facto language on Commons is English, with category names in English, even for non-English topics.

As a result, it can be difficult for non-English speakers to understand the context and scope of a category. Sometimes manual descriptions and translations are present, along with various different utility templates, but the amount varies dramatically between categories.

Wikidata content, however, is inherently multilingual. Topics have Q-identifiers, and statements are made using properties, all of which are translatable. Structured Data on Commons (a Wikimedia project to ‘Wikidatafy’ Commons and improve its searchability) will soon use this system on file description pages – but it can also make the categories significantly easier to use by providing information relevant to the category in the user’s preferred language through an infobox.

Implementation

{{Wikidata Infobox}} is designed to be one infobox that works for all topics in all languages. It is written using parser functions, and it primarily uses [[User:RexxS]]’s Lua module [[Module:WikidataIB]] to fetch the values of nearly 300 properties from Wikidata, along with their qualifiers (and more can be added on demand). The label language is based on the user’s selected language.
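The idea of choosing a label by the user's selected language can be sketched as follows. This is not how Module:WikidataIB is implemented (that module is written in Lua); the data layout, the fallback order, and the function name below are assumptions for illustration only.

```python
def pick_label(labels, user_language, fallbacks=("en",)):
    """Pick a label for user_language, falling back to other languages.

    labels maps language codes to label text, loosely modelled on a
    Wikidata item holding one label per language (simplified assumption).
    Returns (language, text), or (None, None) if nothing matches.
    """
    for code in (user_language, *fallbacks):
        if code in labels:
            return code, labels[code]
    return None, None


if __name__ == "__main__":
    # Made-up example labels for a single item.
    labels = {"en": "Cape Town", "de": "Kapstadt", "af": "Kaapstad"}
    print(pick_label(labels, "de"))  # label exists in the user's language
    print(pick_label(labels, "zu"))  # no Zulu label, so the fallback is used
```

Returning the language code alongside the text lets a caller show that a fallback was used, which is one way a missing translation could be made visible to readers.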

The values are then displayed in various formats such as strings, links, dates, numbers with units, and so on, as appropriate. The main image, as well as flags and coats of arms, are also displayed, along with the Wikidata description of the topic.

Coordinates are displayed with Geohack and links to display the coordinates of all items in the category. Maps are displayed in the user’s language using Kartographer, with the map zoom level based on the area property for the topic. Links to Wikipedia, Wikivoyage, etc. are displayed where they are available in the user’s language. Links to tools such as Reasonator and Wikishootme are also included.

For categories about people, the infobox automatically adds the birth, death, first name and surname categories, along with tracking categories like ‘People by Name’. Authority control IDs are also displayed as external links, and ID tracking categories can also be automatically added.

Poster on Template:Wikidata Infobox by Mike Peel – image CC BY-SA 4.0

Roll-out

You can easily add the infobox to categories that have a sitelink on Wikidata: just add {{Wikidata Infobox}}!

The infobox was started in January 2018, with several test categories. An initial discussion on Commons’ Village Pump was very positive, and by the end of February it had been manually added to 1,000 categories, increasing to 5,000 by mid-March. Work on a bot to deploy the template was started in February, and was approved by the community by the end of April, when around 10,000 categories had infoboxes. The bot roll-out was started slowly to catch any issues with the infobox design, and particularly increases in the server load – but no server load issues arose.

In parallel, over 500,000 new commons sitelinks were added to Wikidata using P373 (the ‘commons category’ property) and monument ID matching, and many links to category redirects have been fixed. This has also caused many new interwiki links to be displayed in Commons categories.

In mid-June 2018, with the use of Pi bot and friends to add the infobox to categories, uses of the Wikidata infobox passed 1 million.

Next steps

The infobox continues to evolve and to gain new features. For example, multilingual interactive maps using Kartographer were quickly added to the infobox, making them available in over 600,000 categories that display a map. More properties are being added to the box, although striking a balance between keeping the infobox small and adding relevant new properties is an ongoing discussion.

The infobox is not used where other boxes such as {{Creator}} and “category definition” are already in use; this could potentially change so that there is a uniform look across all categories. It is also not used for taxonomy categories due to different taxonomy structures on Commons and Wikidata.

Over 4 million Commons categories do not yet have a sitelink on Wikidata, so there is plenty of scope to add the infobox to more categories! The infoboxes will update and grow as more information and translations are added to Wikidata – so if you see wrong or untranslated information in them, correct it on Wikidata!

WikidataIB and the other tools used here (or even the entire infobox!) can easily be installed on other Wikimedia projects – providing that there is community consensus to do so!

10/07/2018-16/07/2018


Orientation of streets in European cities 1 | © Geoff Boeing, Tobias Kunze © map data OpenStreetMap contributors

Mapping

  • Warin, who noticed a new mapper trying to map a golf course, complained to the tagging mailing list about the current wiki page describing how to map a golf course. That triggered a lengthy discussion on the mailing list, where we discovered that it makes sense to consider even the grass height in recognising the tee area.
  • Leo Gaspard would like the landuse=highway tag to be used more often as he wants road areas to be tagged along with their surfaces. However, as noted during the conversation, the widely used area:highway=* includes an optional surface tag already.
  • Danilo reported on the Swiss mailing list (de) his experience with creating his own aerial images with a paraglider. He mentioned the project opendronemap.org, a toolchain for processing aerial imagery. His example result (large file) looks promising.
  • How many traffic light nodes should be added for complex intersections? This question was raised (de) in the German forum. The conclusion was to tag traffic lights for pedestrians with highway=crossing and crossing=traffic_signals to avoid multiple time penalties when routing cars over an intersection with multiple traffic lights. The possibility to add the direction of traffic lights was also mentioned.
  • The Open Knowledge Lab Karlsruhe has created a map that shows farm shops, market places and vending machines for food and milk in Austria, Germany, Switzerland and some neighbouring regions. They explain the map and future plans in the German forum (de). If you want to join the development or make your own map for your region, you can find the project on GitHub.
  • Warin61 has created a proposal for ephemeral water, i.e. water that is only present for short duration.
  • The voting for the tagging proposal aeroway=highway_strip is under way.

Community

  • The OSM API is currently running slower than usual because the master OSM server is moving from Imperial College, London to the Equinix data centre in Amsterdam. In the transition period, a slower server in York, England is serving as the database master server (we reported earlier). The OWG is looking for volunteers from the Amsterdam OSM community to help during the move on 25 and 26 July.
  • OSM US has not found the right person for the new position of an executive director and is still waiting for more applications.
  • OSM Belgium has chosen Lionel Giard (OSM Anakil) as mapper of the month and published an interview with him about his way of contributing.
  • chris66 found an object during his field survey which was treated as a land mine by the police. His finding resulted in the arrival of an Ordnance Disposal Unit and was reported in a local newspaper. He received tips in the always-helpful German forum (de) (automatic translation) on the different ways to tag a crater.
  • The voting for the OSM Awards ends on July 26th. Please vote if you have not done so already.
  • The media platform TriplePundit published a story about the benefits of collaborative problem-solving. It mentioned the well-known code creation and mapping efforts for disaster relief but also forward-looking development aid such as openRMS, an open source enterprise electronic medical record system platform for resource-constrained environments.
  • [1] Rixx published a blog post about the orientation of streets in European cities. He wrote a Python script, available on GitHub, which shows the directions in which a city’s streets run most often. He continues the work of Geoff Boeing, who wrote a blog post "Comparing City Street Orientations" using his OSMnx Python library. In a description of OSMnx, Christoph linked to a number of blog posts about OSMnx use cases ranging from figure-ground diagrams and isochrone maps to network-based spatial clustering.

    Mapbox caught up with its own version and presented a slippy map with street orientations. In a blog post they explained their implementation, which is available on GitHub.

  • The company Uber uses OSM internally for fare calculations and to optimise driver and rider matching as they have explained in the forum. The team in Palo Alto will start fixing bugs found in OSM by Uber drivers in the Delhi region in India as a test. According to their forum post, they will share the profiles of the editors on Uber’s OSM page, will follow the Organised Editing Best Practices and Indian guidelines, and do not plan to make large-scale, machine-generated edits.
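The street-orientation idea from the item about the posts by Rixx and Geoff Boeing can be sketched without any mapping library: compute a compass bearing for each street segment and count the bearings in equal-width bins. This is not the code from either post; the function names and the segment format (start and end latitude/longitude) are illustrative assumptions.

```python
import math


def bearing(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360


def orientation_histogram(segments, bins=36):
    """Count segment bearings per bin; each segment counts in both directions."""
    width = 360 / bins
    counts = [0] * bins
    for lat1, lon1, lat2, lon2 in segments:
        forward = bearing(lat1, lon1, lat2, lon2)
        for angle in (forward, (forward + 180) % 360):
            # Shift by half a bin so that bin 0 is centred on north.
            # The final modulo guards against floating-point edge cases.
            index = int(((angle + width / 2) % 360) // width) % bins
            counts[index] += 1
    return counts


if __name__ == "__main__":
    # Two made-up segments: one running north-south, one east-west.
    segments = [(0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]
    print(orientation_histogram(segments, bins=4))
```

Counting every segment in both directions treats streets as undirected, which gives the histogram the point symmetry typical of polar orientation plots. A fuller version would probably weight each segment by its length, which this sketch leaves out.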

OpenStreetMap Foundation

  • The OpenStreetMap Foundation has published a report highlighting OSM and OSMF related updates. Regular readers of the weeklyOSM may have heard some of the updates already but there is also a lot of new information such as working group insights in the report.

Events

Maps

  • The outdoor map provided by Andy Allan’s commercial Thunderforest Platform is now rendering difficulty classifications based on sac_scale.
  • Skinfaxi introduces a new service of meaningfully cut out topographic maps for Sweden and the southern neighbouring countries on a scale of 1:50.000, which can be downloaded as PDF. Unfortunately there is hardly any documentation (example as PDF).

switch2OSM

  • User jbelien has published an article in the OSM user diaries on why one should use OSM instead of Google. The article was originally written for the Open Summer of Code 2018 in Belgium. If you prefer the article in French, voilà.

Programming

  • Have you ever been impatiently waiting for your large changeset to finish uploading? Then you might be interested in the blog post from mmd. He wrote about a new, faster changeset upload implementation that has been developed and is available for testing now. Instructions on how to help with testing are included in the blog post.
  • A new version of GeoPandas, an extension for Python’s data science package Pandas, was just released. GeoPandas extends Pandas’ data types and allows spatial operations on geometric types that would otherwise require a spatial database such as PostGIS. The new version 0.4.0 improves the performance and behaviour of the overlay functionality. Further, there is a long list of other new features and bug fixes.

Releases

  • The version 1.3.0 of Maputnik, a free visual style editor for maps, has been released.
  • Wambacher’s software list was updated as of July 17. The almost endless list shows the current versions of OSM related software.

Did you know …

  • … that Thomas Konrad created a website showing the coverage of buildings in Austria compared to public data? While he considers Austria as a whole nearly complete (with coverage greater than 85 percent), there are still areas with coverage below 50 percent where some work needs to be done.
  • … of Pascal’s website Unmapped Places? The intention of this tool is to detect under-represented regions a.k.a. “unmapped” places in OpenStreetMap. Basically it checks whether a road can be found within 700 m around a place= tag or otherwise it will be flagged as unmapped.
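The 700 m rule described for Unmapped Places can be approximated in a few lines. This is only a sketch of the stated rule, not Pascal's implementation: it measures the distance to individual road nodes rather than to road geometries, it uses a spherical Earth, and the function names are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_unmapped(place, road_nodes, radius_m=700):
    """True if no road node lies within radius_m of the place node."""
    lat, lon = place
    return not any(
        haversine_m(lat, lon, road_lat, road_lon) <= radius_m
        for road_lat, road_lon in road_nodes
    )


if __name__ == "__main__":
    # Made-up coordinates: one road node near the place, one far away.
    place = (0.0, 0.0)
    print(is_unmapped(place, [(0.0, 0.005)]))  # node well within 700 m
    print(is_unmapped(place, [(0.0, 0.01)]))   # node more than 1 km away
```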

Other “geo” things

  • The German blog beyond-print published a post (de) about the start-up cartida that is offering art prints of maps based on OSM data in different styles and sizes. Currently they only ship to Germany, Austria, France, Netherlands, Belgium and Luxembourg but they plan to add more countries.
  • govtech.com, an online portal covering information technology’s role in state and local governments, reported that Google’s new API restrictions will have minimal impact on government bodies. From July 16 onwards, Google Maps requires an official API key and a valid credit card attached to the account in order to work. However, state and local governments prefer using paid products from ESRI, Mapbox and Boundless Spatial for their work.
  • Mississippi’s Board of Licensure for Professional Engineers and Surveyors sued the company Vizaline, which provides banks with polygons on satellite images for use in granting loans. The Board says that it should be the state entity solely responsible for land surveying.
  • It is no secret that new edits do not always improve the data in OSM. A research study now provided some more insight. The conclusion was that the completeness and positional precision of features can be improved up to 14% if you take the history data of objects into account.
  • Mongabay, an online platform for environmental topics, published an article about MapHubs. MapHubs is a commercial platform storing maps and spatial data for viewing and analyzing. Paying users can use public and private data sets to create customised maps showing deforestation and generate time-lapse videos in websites called "minihubs".
  • The website visualising data published the 53rd episode of its "the little of visualisation design" series. The series focuses on small design choices and demonstrates them with a sample project in each episode. Using an article from The Guardian, the website explains in the current episode the use of thumbnail map images that help orientate the viewer.

Upcoming Events

Where | What | When | Country
Bamako | OSM training at Faculté Sciences et Technologies | 2018-07-16 to 2018-07-20 | Mali
Mumble Creek | OpenStreetMap Foundation public board meeting | 2018-07-19 | everywhere
Heidelberg | End of semester Mapathon | 2018-07-19 | Germany
Essen | Mappertreffen | 2018-07-21 | Germany
Tokyo | 東京!街歩き!マッピングパーティ:第21回 増上寺 | 2018-07-21 | Japan
Greater Manchester | More Joy Diversion | 2018-07-21 | United Kingdom
Nottingham | Pub Meetup | 2018-07-24 | United Kingdom
Düsseldorf | Stammtisch | 2018-07-25 | Germany
Lübeck | Lübecker Mappertreffen | 2018-07-26 | Germany
Manila | Maptime! Manila | 2018-07-26 | Philippines
Milan | State of the Map 2018 (international conference) | 2018-07-28 to 2018-07-30 | Italy
Stuttgart | Stuttgarter Stammtisch | 2018-08-01 | Germany
Bochum | Mappertreffen | 2018-08-02 | Germany
Amagasaki | みんなのサマーセミナー:地図、描いてますか?描きましょう! | 2018-08-05 | Japan
London | Missing Maps Mapathon | 2018-08-07 | United Kingdom
Munich | Münchner Stammtisch | 2018-08-08 | Germany
Urspring | Stammtisch Ulmer Alb | 2018-08-09 | Germany
Dar es Salaam | FOSS4G & HOT Summit 2018 | 2018-08-29 to 2018-08-31 | Tanzania
Buenos Aires | State of the Map Latam 2018 | 2018-09-24 to 2018-09-25 | Argentina
Detroit | State of the Map US 2018 | 2018-10-05 to 2018-10-07 | United States
Bengaluru | State of the Map Asia 2018 | 2018-11-17 to 2018-11-18 | India
Melbourne | FOSS4G SotM Oceania 2018 | 2018-11-20 to 2018-11-23 | Australia

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Nakaner, Polyglot, Rogehm, SK53, SunCobalt, YoViajo, derFred, jinalfoflia.

When there are many Listeria lists that you follow, and when you care about the development of the subject, it is wonderful to see so much activity related to Africa. As more people care to work on African politicians or "administrative territorial entities", the Listeria lists that also exist on Wikipedias in African languages will be updated as well.

When the Listeria lists become part of the main body of a Wikipedia, the politicians and entities will be found. When the info boxes as presented at the Celtic knot conference follow, slowly but surely quality content in quantity about Africa will no longer be a mirage.
Thanks,
      GerardM

July 20, 2018

Photo via Tanya Capuano.

This week, the Wikimedia Foundation announced a new member and leadership appointments to its Board of Trustees. Tanya Pine Capuano, recently Chief Financial Officer of the digital marketing company G5 in Bend, Oregon, will be the newest member of the Board of Trustees. The Board also appointed María Sefidari as Chair, and Christophe Henner as Vice Chair to lead the Board of Trustees. The announcement was made at the 2018 Wikimania conference, the annual celebration of Wikipedia, free knowledge, and the global Wikimedia community, held this year in Cape Town, South Africa.

The Wikimedia Foundation’s Board of Trustees oversees the Wikimedia Foundation and its work, and serves as the organization’s ultimate corporate authority. As an incoming Trustee, Tanya will serve a three-year term effective immediately.

Tanya fills the seat formerly held by Kelly Battles, whose term on the Board concludes this month along with that of longtime member Alice Wiegand. The Board thanks its outgoing Trustees for their service to the Board and the Wikimedia movement and mission.

Tanya has wide-ranging experience including strategy, mergers and acquisitions, and financial planning and analysis in technology from her roles at Intuit, Hewlett-Packard and G5. She has also served on several nonprofit Boards supporting education including Education Pioneers, Los Altos Educational Foundation, and “I Have a Dream” Foundation’s San Francisco chapter, which she also co-founded.

In addition to her deep commitment to education, especially expanding access to higher education, Tanya brings with her a passion for Wikimedia’s values and vision.

Tanya joins Wikipedia founder Jimmy Wales, Board Chair María Sefidari, Vice Chair Christophe Henner, and Board members Esra’a Al Shafei, Raju Narisetti, Dr. James Heilman, Dr. Dariusz Jemielniak, and Nataliia Tymkiv.

Newly appointed Board Chair, María Sefidari, succeeds Christophe Henner, who will serve as Vice Chair, the role María previously held. María is a professor in the Digital Communications, Culture and Citizenship Master’s degree program of Rey Juan Carlos University at the MediaLab-Prado. Born in Madrid, Spain, where she still lives today, María served on the Wikimedia Foundation Board from 2013 to 2015 and re-joined the Board in 2016.

“The Wikimedia movement has been an important part of my life for over a decade and it is a great honor to be able to serve it as Chair of the Wikimedia Foundation Board of Trustees,” María said. “The Wikimedia 2030 movement strategy we are in the midst of developing is the most significant and expansive discussion about our long-term future we have undertaken since our founding. We have much to accomplish in the upcoming year to be ready to implement our new strategy, and I am thrilled to be able to contribute as Board Chair.”

The Wikimedia Foundation Board of Trustees

  • María Sefidari, Board Chair
  • Christophe Henner, Vice Chair
  • Jimmy Wales, Founder of Wikipedia
  • Dr. Dariusz Jemielniak
  • Esra’a Al Shafei
  • Dr. James Heilman
  • Nataliia Tymkiv
  • Raju Narisetti
  • Tanya Capuano

About Tanya Capuano

Tanya Pine Capuano most recently was the chief financial officer (CFO) of the digital marketing company G5 in Bend, Oregon.

Originally from San Jose, California, she has wide-ranging experience including strategy and financial planning and analysis in technology. In addition to her recent role at G5, she previously held leadership positions at Intuit, Hewlett-Packard and APM Management Consultants/CSC Healthcare. She has also supported numerous education initiatives throughout her career; including serving on the boards of Education Pioneers, Los Altos Educational Foundation, and “I Have a Dream” Foundation San Francisco, an organization whose San Francisco chapter she co-founded.

She is very involved with Stanford University alumni life, having earned a Bachelor’s in economics, a master’s in education, and a Master of Business Administration from the university. After graduating, she worked as the university’s Director of Alumni Relations for the Graduate School of Education and Development Director for the Initiative on Improving K-12 Education. She has also served on the board of the Stanford University Graduate School of Business Alumni Association.

Tanya lives in the San Francisco Bay Area with her husband and two teenagers. They enjoy traveling as a family and experiencing the great outdoors.

About María Sefidari

María Sefidari Huici is a professor in the Digital Communications, Culture and Citizenship Master’s degree program of Rey Juan Carlos University at the MediaLab-Prado.

Born in Madrid, Spain, where she still lives today, María graduated with a Psychology degree from Universidad Complutense de Madrid, and later a Master’s degree in Management and Tourism at the Business faculty of the same university. She was a 2014 Techweek Women’s Leadership Fellow, which showcases, celebrates, and supports emerging female leaders in business and technology.

María started contributing to the Wikimedia projects in 2006, and has since served in many different roles across the Wikimedia movement. María was a founding member of Wikimedia España and Wikimujeres Grupo de Usuarias, and also created Spanish Wikipedia’s LGBT Wikiproject. She has served on several Wikimedia governance committees, including the Affiliations and Individual Engagement Grants committees. In her time on the Affiliations Committee, María served as the first Treasurer of the committee, effectively overseeing and monitoring disbursement of the committee’s budget. From 2013 to 2015, she was also a member of the Wikimedia Foundation Board.

María re-joined the Wikimedia Foundation Board in 2016 to fill a community-nominated seat vacancy, and was later re-confirmed for a second term in August 2017.

In her spare time, María travels around the world, runs wiki-workshops to engage new editors, and supports Real Madrid Club de Fútbol.

About Christophe Henner

Christophe Henner is the former Board Chair of Wikimedia France and current Chief Operating Officer (COO) of the Blade Group, a cloud computing company headquartered in France. At Blade, Christophe is scaling up operations to support the company’s transition from a start-up to a global company.

Originally from Lavaur, Christophe studied economics and law at the University of Toulouse. He has deep and varied experience across the marketing sector, holding a variety of leadership positions including Head of Marketing at the online media group, L’Odyssée Interactive, Chief Marketing Officer at an international digital media group, Webedia, and later deputy Chief Executive Officer of Webedia’s gaming division.

Christophe has been an active member of the Wikimedia community for more than 12 years. In 2007, he joined the Board of Wikimedia France and has remained an active Board member in various positions for the past ten years. Nearly three of those years on the Board were spent in leadership roles, including Chair and Vice Chair of the Board.

During his time on the Board, Christophe helped lead Wikimedia France through a significant period of growth. This included leading the development of the chapter’s brand and supporting the development of a clear organizational strategy and vision for the chapter.

Photo by Gregory Varnum/Wikimedia Foundation, CC BY-SA 4.0.

More than 700 attendees from nearly 80 countries gathered today for the start of Wikimania 2018—the annual conference celebrating Wikipedia and the Wikimedia projects, the Wikimedia free knowledge movement, and the community of volunteers who make them possible. This marks the 14th annual Wikimania, which takes place 18–22 July at the Southern Sun Cape Sun Hotel in Cape Town, where volunteers will come together to discuss and share ideas around the future of Wikipedia and free knowledge globally.

The event kicked off with an opening ceremony featuring a special presentation by a group of local Sinenjongo high school students welcoming conference attendees to Cape Town. It continued with remarks by Douglas Scott, President of Wikimedia ZA, the local South African Wikimedia affiliate and lead local organizer of this year’s conference, who introduced this year’s conference theme: Bridging knowledge gaps, the ubuntu way forward, which aims to address gaps in knowledge, particularly those about African people, cultures, and languages, on Wikipedia and the Wikimedia projects. As part of this week’s centennial celebration to commemorate the birth of Nelson Mandela, Scott announced a partnership with the Nelson Mandela Foundation to make the inspirational writings of the former South African President’s 1962 diary available to the world on Wikimedia Commons and Wikisource.

Throughout this year’s Wikimania, attendees will explore sessions related to development of Wikimedia projects in Africa, global collaborations to support the advancement of free knowledge, and opportunities to partner with galleries, libraries, archives, and museums (GLAMs). Wikimania 2018 is co-organized by the Wikimedia Foundation and Wikimedia ZA.

Wikimania 2018 will also bring together a diverse mix of attendees, including seasoned volunteer editors; researchers and data scientists; members from the medical community; librarians; and other free knowledge leaders. Confirmed keynote speakers include internet geographist Dr. Martin Dittus, who will be speaking on economic development, labour, power, participation, and representation, Joy Buolamwini, a noted artificial intelligence expert fighting to remove bias in machine learning, and Professor Sean Jacobs, an esteemed data scientist in digital culture and digital geography, alongside Wikimedia Foundation Executive Director, Katherine Maher, and Wikipedia founder, Jimmy Wales. All plenary sessions will be live streamed and available for public viewing online. More information and links to livestream sessions are available here.

“Wikipedia today is already fascinating and expansive. But it does not begin to represent the entirety of the world we live in — so much of the rich history, diversity of language, culture, and peoples of Africa is missing from the site,” said Katherine Maher, Executive Director of the Wikimedia Foundation. “We are honored to be hosted by Cape Town for this year’s Wikimania, the first ever in sub-Saharan Africa, and look forward to speaking with our global communities, South Africans, and more about how we can begin to hear the critical perspectives that are missing from Wikipedia today.”

As part of the Wikimania press conference held on Friday, Katherine was joined by Banks Baker, Head of Global Product Partnerships – Search Content at Google, to announce the outcome of a recent collaboration between the Wikimedia Foundation and Google to expand and improve the representation of knowledge in Indic languages on Wikipedia, called Project Tiger. Through the project, both organizations, working in close collaboration with the Centre for Internet and Society (CIS), Wikimedia India chapter (WMIN) and local volunteers, hosted a pilot program to increase locally relevant content available in 12 Indic languages on Wikipedia.

Google provided Chromebooks and internet access to support volunteer editors with content creation as well as insights into popular search topics on Google that lack information in Indian languages online. Through a three-month writing competition, volunteers created nearly 4,500 new Wikipedia articles across 12 languages, nearly double the initial benchmarks for the project. Based on this initial success of the pilot program, Google and the Wikimedia Foundation will hear feedback from volunteers about the program in a session on Sunday at Wikimania and further explore future implementation of these types of programs with other volunteer communities.

The annual Wikimania gathering will also provide an opportunity for volunteers to discuss Wikimedia’s future as part of Wikimedia 2030, a global consultation to define the future of the Wikimedia movement. Challenges with software localisation, the structure of data, and even new forms of knowledge that Wikimedia has defined as “verifiable” (text-based, rather than oral, for example) are some of the issues the movement is grappling with as it moves actively towards incorporating more diverse forms of knowledge within Wikipedia and the Wikimedia projects.

Wikimania also offers conference-goers time to experience the unique culture of Cape Town and join in the celebrations of Nelson Mandela’s 100th birthday. The conference’s theme, “Bridging knowledge gaps—the ubuntu way forward,” captures this spirit and finds its roots in the philosophy and way of life ubuntu.

“Ubuntu is summarised as the philosophy of ‘I am because you are,’ or alternatively ‘the belief in a universal bond of sharing that connects all humanity,’” explained Wikimedia South Africa President, Douglas Scott. “Volunteer community-driven projects like Wikipedia and what we are hoping to achieve at Wikimania in Cape Town capture this ethos well. Wikimedia South Africa is honored to have this opportunity to share this spirit with our friends and colleagues around the world.”

The Wikimedia Foundation is the nonprofit organization that supports Wikipedia, the Wikimedia free knowledge projects, and its mission of free knowledge for every single person.

To learn more about the conference, please visit: wikimania2018.wikimedia.org. You can follow the conference on #Wikimania as well as on Facebook and Twitter.

Photo by Danielarapava, CC BY-SA 4.0.

The Wikimedia Foundation has supported free access to the sum of all knowledge for fifteen years. This longstanding vision would not be possible without the dedication of community members who contribute content to the Wikimedia projects. As a global platform for free knowledge, we are sometimes approached by governments and private parties with requests to delete or change project content, or to release nonpublic user information. The Foundation consistently evaluates such requests with an eye towards protecting privacy and freedom of expression. We are committed to sharing our responses to these requests with the diverse network of Wikimedians who contribute to the projects we support.

Twice a year, we publish a transparency report outlining the number of requests we received, their types, countries of origin, and other information. The report also features an FAQ and stories about interesting and unusual cases.

A few highlights:

Content alteration and takedown requests. From January to June of 2018, we received 388 requests to alter or remove project content, six of which came from government entities. We did not make any changes to project content as a result, but often encouraged the requesters to work with the user community to address their concerns. The volunteer contributors who build, grow, and improve the Wikimedia projects follow community-created policies that ensure project content is appropriate and well-sourced. We support the community’s prerogative to determine what educational content belongs on the projects.

Copyright takedown requests. The Wikimedia community works diligently to ensure that copyrighted material is not distributed on our platforms without an appropriate free license or exception, such as fair use. Most of Wikimedia’s content is therefore freely licensed or in the public domain. When we occasionally receive Digital Millennium Copyright Act (DMCA) notices asking us to remove allegedly infringing material, we conduct thorough investigations to make sure the claims are valid. From January to June of 2018, we received 12 DMCA requests and granted two of them.

Requests for user data. The Wikimedia Foundation only grants requests for user data that comply with our requests for user information procedures and guidelines (which includes a provision for emergency conditions). The Foundation also collects little nonpublic user information in the first place, as part of our commitment to user privacy, and retains that information for a short amount of time. Of the 23 user data requests we received, only two resulted in disclosure of nonpublic user information.

In addition to updating the online report, we have also released a fully-updated version of our print edition, which we first launched in August 2017. These print versions will be available at Wikimedia Foundation events for the next year, beginning with our annual Wikimania conference in Cape Town, South Africa, which will take place in July 2018.

The Wikimedia Foundation’s biannual transparency report reaffirms our commitment to transparency, privacy, and freedom of expression. It also reflects the diligent work of the Wikimedia community members who shape the projects. We invite you to learn more about requests we received in the past six months in our comprehensive transparency report. For information about past transparency reports, please see our previous blog posts.

Jim Buatti, Legal Counsel
Leighanna Mixter, Legal Counsel
Aeryn Palmer, Senior Legal Counsel II
Wikimedia Foundation

The transparency report would not be possible without the contributions of Siddharth Parmar, Jacob Rogers, Jan Gerlach, Katie Francis, Rachel Stallman, James Alexander, Joe Sutherland, Emine Yildirim, Rachel Brown, Imogen Sealy, Yuan Tian, and the entire Wikimedia communications team. Special thanks to Anna Windemuth for help in preparing this blog post, and to the entire staff at Oscar Printing Company.

July 19, 2018

July 18, 2018

Haleigh Marcello 
Image: File:Haleigh Marcello headshot.jpg, Hemarcello, CC BY-SA 4.0, via Wikimedia Commons.

It’s rare that students return to a research paper or project after their work has been graded, and even more rare after their course has ended. That’s the power of a Wikipedia assignment. As instructors and students have recounted, when students learn how to evaluate and contribute to Wikipedia as an assignment, inspiration can reach beyond the classroom.

Haleigh Marcello learned how to improve Wikipedia in Ulrike Strasser’s course, Early Modern Women, at the University of California, San Diego. In an interview with Wiki Education, Haleigh explains why she has continued to contribute to Wikipedia – now as a hobby.

When Haleigh first heard that she would engage with Wikipedia as part of her course, she was intrigued by the digital literacy aspects of the assignment. “I very often use Wikipedia, and didn’t know much about the process behind writing the articles, besides the fact that anyone could contribute to them,” she says. “I thought it would be interesting to get sort of a behind-the-scenes look into how these articles are created.” As part of the assignment, students understand how knowledge is constructed on Wikipedia; how to evaluate information on the site for accuracy; and then how to participate in the creation and correction of articles.

The fact that her work would help round out the world’s favorite encyclopedic resource also appealed to Haleigh. “I enjoyed the idea of contributing to public knowledge. As a history major, I write a lot of essays, but they don’t often get more than a handful of readers. Knowing that my article could be read by people all over the world sounded very exciting!”

Haleigh began working to improve the biography of Sarah Crosby, the first female Methodist preacher. It was an article that was only a few sentences long.

“I had read briefly about Crosby in our class textbook, Women and Gender in Early Modern Europe by Merry Wiesner-Hanks. When it came time to pick our articles, I compiled a list of everyone that Wiesner-Hanks mentioned in her book. From there, I eliminated women who had well-done articles. I looked over the rest of the list and settled on Crosby because I felt that she was an interesting figure, especially as I began to look into her more. She started preaching before it was allowed in the Methodist religion, basically following the saying, ‘It’s easier to ask for forgiveness than to beg for permission.’ I am a big fan of historical women who went outside of the status quo, so I thought Crosby was really cool for doing this. Her article was also only five sentences long before I began working on it, and I felt sad that she was not well-represented. Overall, I was interested in her story, and felt that her article should have been longer.”

After identifying this gap in Wikipedia’s content, Haleigh took it upon herself to flesh out the biography. She contributed more than 11,000 words to the article by the end of the term.

“Writing for Wikipedia is entirely different from writing a traditional research paper,” says Haleigh. “The beginning of the process is the same; you have to find sources, mark the information you’re going to use, etc. But from there it becomes very different. If I were writing a research paper, I would have taken my sources and constructed an argument. But when writing on Wikipedia, there is no argument; you are simply stating facts. Instead, you have to take the information from the source, summarize/paraphrase it, and work it into your article. I found this to be pretty challenging, as my sources tended to jump around in time. I had to construct a timeline for Crosby, put my sources on this timeline, and then work on my article from there.”

Students master a unique skillset when engaging with Wikipedia. They must understand the norms of the editing community, as well as the nature of encyclopedic writing. Haleigh really took to the process and accepted these challenges.

“I wound up finishing Crosby’s article a couple of weeks before our class ended,” she recounts. “I didn’t want to sit around and do nothing while my peers continued to work on their articles, so I decided to look into working on another article. When I was researching for Sarah Crosby’s article, I would often read information about other prominent 18th century Methodist women. The woman who was most often mentioned was Mary Bosanquet Fletcher. Bosanquet had been mentioned in Wiesner-Hanks’s book, but I decided not to choose her article since it had more information than Crosby’s article (Bosanquet’s article was rated start-class, while Crosby’s was a stub). Since I already had the sources from working on Sarah Crosby’s article, I figured I might as well improve Mary Bosanquet Fletcher’s as well. As I read more about her, I discovered that she was a pretty amazing figure as well. She and Crosby were both female Methodist preachers, but Bosanquet was the one who convinced John Wesley, one of the founders of Methodism, to allow all women to preach. Sarah Crosby and Mary Bosanquet Fletcher are two pretty amazing figures, and I am proud to have given the world some more information about them.”

The interconnectedness of Wikipedia articles is inspiring – some say addicting – and with the tools to make a difference, Haleigh took to the task. She continues to edit, although her course ended earlier this summer.

“Most of my edits have been fixing grammar and formatting mistakes that I see when browsing through articles. Though I did go in and add some content to the article on the Greenham Common Women’s Peace Camp, and I added a lot to the article for my college at UCSD, Eleanor Roosevelt College. My goals are to continue working on these articles. I did some research on the Greenham Common Women’s Peace Camp recently, so I want to use those sources and add to the article. Additionally, I want to find some more third-party sources for the article on Eleanor Roosevelt College, as the way I have it now relies a bit too much on primary sources… And of course, I want to continue keeping my eye out for grammar and formatting mistakes.”

As a history major, the Wikipedian commitment to comprehensive and accessible knowledge is aligned with Haleigh’s personal values. And this assignment has given her the tools to put that commitment into practice and make a real impact.

“One of the reasons that I became a history major is because I believe that history is often misconstrued as boring and uninteresting, when it’s so complex and fascinating. These articles are helping people around the world gain a better understanding of history, which is my #1 goal as a history major. I love to spread knowledge.” And she’s done just that! During her course, Haleigh contributed 33,700 words to Wikipedia articles, which have been viewed 494,000 times in total. And she continues to improve a wide variety of articles on Wikipedia in her free time.

“I find the editing process to be pretty fun,” says Haleigh. “And it’s going very well! I have learned so much about Wikipedia through exploring the site and with the help of other Wikipedians.”

The recent national study published by Strada and Gallup reports that students find education to be worth the cost when coursework is relevant to their lives. When a student contributes academic content to a Wikipedia article, they understand the workings of a resource they use all the time. They also understand that their words can be accessed by millions, and they feel an increased sense of responsibility to produce quality work. Students gain critical skills in digital literacy, research, writing for a public audience, and collaboration through the process. And they build up their confidence to participate in important modes of knowledge production and transmission. A Wikipedia assignment is a chance to take on an active role in their own education.

“I didn’t expect to become as passionate about Wikipedia as I did because of this course,” Haleigh reflects. “I took the course because I thought it would be interesting and very different from a typical class, and it was. But I did not expect to come out of it with a new hobby. I would tell any students out there that if there’s a class like this offered at your university: take it! It’s a very rewarding and fun experience.”


Interested in teaching with Wikipedia? Visit teach.wikiedu.org to get started or reach out to contact@wikiedu.org with questions.


Image: File:Price Center, UCSD.jpg, Alex Hansen, CC BY 2.0, via Wikimedia Commons.

"Comparing codes of conduct to copyleft licenses": written notes for a talk by Sumana Harihareswara, delivered in the Legal and Policy Issues DevRoom at FOSDEM, 31 January 2016 in Brussels, Belgium. Video recording available. Condensed notes available at Anjana Sofia Vakil's blog.

Good afternoon. I'm Sumana Harihareswara, and I represent myself, and my firm Changeset Consulting. I'm here to discuss some things we can learn from comparing antiharassment policies, or community codes of conduct, to copyleft software licenses such as the GPL. I'll be laying out some major similarities and differences, especially delving into how these different approaches give us insight about common community attitudes and assumptions. And I'll lay out some lessons we can apply as we consider and advocate various sides of these issues, and potentially to apply to some other topics within free and open source software as well.

My notes will all be available online after this, so you don't have to scramble to write down my brilliant insights, or, more likely, links. And I don't have any slides. If you really need slides, I'm sorry, and if you're like, YES! then just bask in the next twenty-five minutes.

I. Credibility

I will briefly mention my credentials in speaking about this topic, especially since this is my first FOSDEM and many of you don't know me. I have been a participant in free and open source software communities since the late 1990s. I'm the past community manager for MediaWiki, and while at the Wikimedia Foundation, I proposed and implemented our code of conduct, which we call a Friendly Space Policy, for in-person Wikimedia technical spaces such as hackathons and conferences.

I wrote an essay about this topic last year, as a guest post on the social sciences group blog Crooked Timber, and received many thoughtful comments, some of which I'll be citing in this talk.

I am also a contributor to several GPL'd pieces of code, such as MediaWiki and GNU Mailman, on code and non-code levels. And I am the creator of Randomized Dystopia, a GPL'd web application that helps you in case you want to write scifi novels about new dystopian tyrannies that abrogate different rights.

And I have been flamed for suggesting codes of conduct; for instance, one Crooked Timber commenter called me "a wannabe politician, trying to find a way to become important by peddling solutions to non-problems." Which is not as bad as when one person replied to me on a public mailing list and said, "Deja Vue all over again. I finally understand why mankind has been plagued by war throughout its entire history...." So maybe I'm the cause of all wars in human history. But I probably won't be able to cover that today.

II. The basic comparison

So let's start with a basic "theory of change" lens. When you're an activist trying to make change in the world, whether it's via a boycott, a new app, a training session, founding an organization, or some other approach, you have a theory of change, whether it's explicit or implicit. You have an assessment of the way the world is, a vision of how you want the world to look, and a hypothesis about some change you could make, an activity or intervention you could perform to move us closer from A to B. There's a pretty common theory of change among copyleft advocates and a couple of theories of change that are common to code of conduct advocates.

A. GPL

The GPL restricts some software developers' freedom (around redistributing software and around adding code under an incompatible license) so as to protect all users' freedom to use, inspect, modify, and hack on software.

The copyleft theory of change supposes that more people will be more free if we can see, modify, and share the source code to software we depend on, and so it's worth it to prohibit enclosure-style private takeovers of formerly shared code. Because in the long run, this will enable free software developers to build on each others' work, and incentivize other developers to choose to make their software free.

B. Codes of Conduct

Now, codes of conduct, antiharassment policies, friendly space policies: They restrict some people's behavior and require certain kinds of contributions from beneficiaries, so as to increase everyone's capabilities and freedom in the long run.

One pretty popular theory of change goes like this: we will make better software and have a greater impact if more people, and more different kinds of people, find our communities more appealing to work in. One thing making an unpleasant environment and driving away contributors, especially contributors with perspectives that are underrepresented in our communities, is hurtful misbehavior in community spaces. So we'll make the trade and say that it's worth it to restrict some behavior, in order to make the environment better so more, and more varied, people can do work in our communities, and thus make more free software and make it better.

And here's another related one, very similar to the one above, but focusing on the day-to-day freedom of community participants who are marginalized. If the constraint stopping me from, for instance, speaking in an IRC channel is that I strongly suspect I'll be harassed if they know I'm a woman, and that I don't have any reason to believe I can avoid or usefully complain about that harassment, how free am I to participate in that community? Is there perhaps a way to understand a certain level of safety as a necessary prerequisite to liberty?

I realize that this is probably the one room in the world where I have the highest chance of getting into a multi-hour "what does freedom mean" bikeshedding session, so I'm going to avoid focusing on the second model there and focus more on the first one, which emphasizes the end result of more free software.

Photo of me at FOSDEM 2016, CC BY-SA Luis García Castro, https://twitter.com/luiyo/status/700938185115836416

C. Assumptions

So I am not assuming that everyone in this room is a copyleft advocate, but I am going to assume from this point forward that we in this room fundamentally understand the restrictive license argument, that we have a handle on the theory of change that it's operating on. And similarly, I'm sure there are people here who aren't so big on codes of conduct, but I'm going to assume that we fundamentally understand the theory of change behind that approach, regardless.

D. Similarities

Now let's talk about similarities. Chris Webber calls both of these approaches "added process which define (and provide enforcement mechanisms for) doing the right thing." I agree. Without this kind of gatekeeping we see free rider incentives, on other people's software work and on other people's attention and patience and emotional labor.

They are written-down formalizations of practices and values that some community members think should be so intuitive and obvious that asking people to formally offer or accept the contract is an insult, or at least an unnecessary inconvenience. And so some people counterpropose sort-of-humorous policies, such as the "Do What the Fuck You Want to" software license and "don't be a jerk" codes of conduct.

They are loci of debate and fragmentation.

Some people agree to them thoughtfully, some agree distractedly as they would to corporate clickthrough EULAs, some disagree but click through anyway (acting in bad faith), some disagree and silently leave, some disagree and negotiate publicly, some disagree and fork publicly. Some people won't show up if the agreement is mandatory; some people won't show up UNLESS it's mandatory; some people don't care either way. And, by the way, good community management requires properly predicting the proportions, and navigating accordingly.

Both copyleft licenses and codes of conduct are approaches to solving problems that became more apparent along with different people realizing they have different expectations and needs, and consider different outcomes or processes to be "fair."

These kinds of codes and licenses usually cover specific bounded events and spaces or sites, and their scope covers interpersonal or public interactions. Codes of conduct usually don't cover conversations outside community-run spaces or the beliefs you hold in your head; open source licenses' restrictions usually kick in on redistribution, not use, so they don't constrain anything you do only on your own computer.

Neither one of these approaches can rely on self-enforcement. There is some self-enforcement of both, of course. There's a perception that -- as Harald K. commented on my blog post -- "licenses more or less police themselves (or in extreme instances, are policed by outsiders) whereas codes of conduct need an internal governing structure, a new arena where political power can be exercised." My personal understanding, which I share with people like Matthew Garrett, is that there's a ton of license-breaking happening, and we need to support existing organizations like the Software Freedom Conservancy to police that misbehavior and litigate to defend the GPL. As Conservancy head Karen Sandler points out in her December essay "From a lawyer who hates litigation", "I've seen companies abuse rights granted to them under the GPL over and over again. As the years pass, it seems that more and more of them want to walk as close to the edge of infringement as they can, and some flagrantly adopt a catch-me-if-you-can attitude." And you see enough individuals in our communities acting similarly that I don't think I need to belabor this point; codes of conduct are much more productive when they're actually, you know, enforced.

And both copyleft licenses and codes of conduct restrict freedom regarding certain acts, over and above what is restricted by the law, in the interest of a long-term good, which can in both cases be construed as greater freedom. As Belle Waring says to one skeptic in the Crooked Timber comments, paraphrasing their argument: "part of your reasonable resentment is, 'I don't want to be forced to do freedom-restricting things in support of a very uncertain outcome, just because the final proposed outcome is a good one.'" I will go into that bit of argument later.

E. Differences

But these kinds of agreements are different on a few different axes, which I think are worth considering for what they tell us about open stuff community values and about our intuitions on what kinds of freedom restrictions we find easier to accept.

One is that many codes of conduct focus on in-person events such as conferences, rather than online interactions. Many of the unpleasant incidents that caused communities to adopt CoCs -- or that communities see as "let's not let that happen here" warning bells -- happen at face-to-face events. And face-to-face spaces have a much longer history and context of ways of dealing with bad behavior than do online spaces. After all, a pretty widespread reading of the core function of government and law enforcement is that they keep Us Good Guys safe by stopping The Bad Guys from committing face-to-face (or knife-to-face or chair-to-face) assault.

But there's another axis I want to explore here: whether the behavior constraint feels like a contract or whether it feels like governance. Of course, we toss around phrases like "the social contract" and use the metaphor of contract to talk about the legitimacy of government, but to an ordinary citizen, contracts and governance feel like significantly different things. To oversimplify: to a non-lawyer like me, something that feels like a contract formalizes a specific trade, something discrete and finite and a bit rare. A copyleft license feels that way to me; it specifies that if I distribute a certain artifact -- which is something I would only do after some amount of thought and work -- I then also undertake certain obligations, namely, I must also redistribute the software's source code, under the same license. And, notwithstanding edge cases, it is often easy to examine the artifact, follow a decision procedure, and determine that I have complied with the terms of the license. If I meant to comply in the first place.

On the other hand, when we make rules constraining acts, especially speech acts, it feels more like governance.

Codes of conduct serve as part of a community's infrastructure to fulfill the first duty of a government — to protect its citizens from harm — and in order to make them work, communities must develop governance processes. That is to say, "governance" is what we call it when we're explicit about who gets to make and implement rules that affect everyone in a community, and how we choose those people or get rid of them. And a governance body does not necessarily have to be a legal entity. For instance, in MediaWiki governance, there's an architecture committee that decides on large technical architectural changes, and it has no standing in the eyes of the United States government.

It takes work to evaluate whether actions have complied with rules, and that work might require asking questions of suspects, bystanders, and targets. Enforcing a code of conduct, even a narrowly scoped anti-harassment policy, often requires that someone act on behalf of a community to do this, and to implement the outcome -- be it informed by retributive, rehabilitative, transformative, or some other justice model. And it feels more like governance than contract to me if a rule applies to actions I take many times a day without deliberate planning -- such as saying something in my project's live internet chat room.

One way of thinking about this is: is there some kind of authority that the community acknowledges as having legitimate power over everyday behavior, over and above existing government with a capital G? Because, again, licenses affect certain coding and architectural decisions, but they don't preclude, for instance, everyday discussion. In fact, the social and digital infrastructure it takes to make robust and usable software, including our bug reports, our automated tests, our conversation on mailing lists, and so on, is often not covered by any particular open license -- if it were, maybe we'd be seeing a different level of pushback even from developers who are happy with copyleft as applied to their code.

F. Shortcomings of the contract model

But I think another interesting thing that happens when you compare a governance model to a contract model, regarding approaches we take to improving behavior in our communities, is seeing how governance wins. It takes a lot of work, but it has a lot of advantages.

1. Flexibility

Contracts are binary where ongoing dialogue and governance can be more flexible and responsive. If I were going to be really annoying I would compare them to compiled bytecode and to interpretable scripts. Contracts have to sort of self-contain the tests for what the contract permits, mandates, and prohibits, whereas governance mechanisms and bodies can use more general standards, which might change over time. To quote one of the commenters on my essay, Stephenson-quoter kun:

contracts explicitly restrict acts which are simply unpardonable -- not sharing the source code to your modified version of a GPL-licensed project, sexually assaulting someone at a conference -- because everyone agrees that those things are wrong and we feel confident that we can agree up-front that there can never be any extenuating circumstances in which those things are actually OK. Governance, however, can serve to 'nudge' people away from bad behaviours – poor coding standards, rudeness on mailing lists -- by giving us a standard to measure those things against without enumerating every possible violation of the standard. A governance procedure can take context into account, and is much more easily subject to improvement and revision than a contract is.

Sometimes it's the little stuff, more subtle than the booth babe/groping/assault/slur kind of stuff, that makes a community feel inhospitable to me. When I say "little stuff" I am trying to describe the small ways people marginalize each other: dominance displays, cruelty in the guise of honesty, the use of power in inhospitable ways, feeling unvalued, "jokes", clubbiness, watching my every public action for ungenerous interpretation, nitpicking, and bad faith.

Changing these habits requires a change of culture, and that kind of deliberate change in culture requires people who take up the responsibility of stewarding the culture.

And a governance approach has a lot more ability to affect culture than a contracts-only approach does.

2. Contracts give us an illusion of equality and self-containedness

As Tim McGovern said in the comments to my Crooked Timber post:

contracts have taken over as a primary way of negotiating relationships: a EULA is a replacement for a legal understanding of the relationship between two parties who are doing business. I don't, in other words, sign a EULA when I buy a pair of socks -- or even when I buy a car (Teslas excepted) because the purchase relationship is legally defined; even the followup on what can and can't be in your warranty is legally defined. But companies would rather be bound by an agreement they write than a body of law based on either commonlaw or constitutional concepts, or legislation.

Contracts presume an equality between the parties; in theory, both sides can take a breach of contract to court. In practice, of course, a EULA is a contract that masks radical inequality in power between the parties.... Governance requires wrestling with equality in a real way, on the other hand, and voluntarily submitting to an authority constituted in some fashion (over time, by people, etc.), as opposed to preserving a contractual illusion of equality.

3. Contract pretends you have choices

I recommend that, if you haven't, you check out the article "Mothering versus Contract" by Virginia Held, from Beyond Self-Interest in 1990. It suggests that perhaps we should fundamentally conceive of our interactions with others as following a paradigm of motherhood rather than of contract -- one truth this approach acknowledges is that by default most interactions in your life are opt-out rather than opt-in, if there's any opting or choice at all.

Yes, there's the freedom to fork. But realistically, if you want to get things done, you have to collaborate with others, and we need to accede to other people's demands, in terms of interface compatibility, learning and speaking fluent English, and all sorts of other needs. A FLOSS project with a thriving ecology of contributors is far more valuable than a nearly identical chunk of code with only a couple of voices available to help out, and thus the finite amount of human attention limits our ability to make effective forks. We're more interdependent than independent, and acknowledging that as a fundamental truth complicates the contracts-y libertarian narrative potentially beyond usefulness.

III. Lessons

I hope that my analysis helps give some vocabulary and frameworks for understanding arguments around these issues, and that we can use them to develop more effective arguments.

A. Freedom tradeoff comparisons

The first step might be this: if you're trying to get your community to adopt a code of conduct, you might benefit by looking at other freedom-restricting tradeoffs the community is okay with, so you can draw out that comparison.

Or with UX (user experience) -- design is the art of taking things away, and when you're advocating for better user experience, which often involves reducing the number of visible ways to do things, consider comparing your approach to one of the freedom tradeoffs that your interlocutor is already okay with, such as the fact that your community has standardized on a single version control system. A single way for that kind of user to interact.

B. Artifacts

And if you're trying to build a code of conduct consensus in your community, it might help to start by talking, not about day-to-day behavior, but about artifacts that people think of as artifacts. Talk about the things we make, like slide decks for presentations, articles on your wiki. That can get people on the same page as you, in case they're not yet ready to think of the community itself as an artifact we make together.

C. Theory of change

If you're an advocate for a new initiative, licensing, code of conduct, or something else, understand your own theory of change, and build mental models to help you understand the people who disagree with you. Understand what part of the theory of change they disagree with, and gather data to counter it.

And, incidentally, this lens will also help you appreciate other complementary approaches that will help you achieve your goals. As Mike Linksvayer says: "Of course I think that copyleft advocates who really want to ensure people have software freedom rather than just being enamored of a hack should be always on the lookout for cheaper and/or socialized enforcement (as implied above, control of distribution channels that matter, and state regulation)."

So why might people oppose codes of conduct? Here are a few ideas:

  • they might disagree on whether the goal makes sense
  • or on whether codes of conduct, when enforced, make the situation more conducive to diverse populations and to net growth in community -- have your research close at hand!
  • or on what the biggest problems you're facing are, and whether they're community recruitment and retention

As Chris Webber notes, "there's an argument that achieving real world social justice involves a certain amount of process, laying the ground for what's permitted and isn't, and (if you have to, but hopefully you don't) a specified direction for requiring compliance with that correct behavior." The addendum is that, as Alberto Brandolini said, "The amount of energy necessary to refute bullshit is an order of magnitude bigger than to produce it." So part of the mental model you're trying to understand is what the person you're arguing with is trying to maximize, and another part is whether you agree on how to maximize it.

Paul Davis, the Ardour BDFL, commented on my Crooked Timber post, "The dilemma for a mid-size project like mine is that the overhead of developing and maintaining a CoC seems like just another thing to do amidst a list of things that is already way too long, and one that addresses a problem that we just don't have (yet)." He said he's more worried about technical, architectural decisions causing developer loss.

So, for instance, you could argue with Paul: what genuinely causes developer loss? And what priorities should you have, given your goals?

D. A fresh set of governance needs and questions

CoC adoption drives the adoption of explicit governance mechanisms, as Christie Koehler has recently explored in depth in her post "The complex reality of adopting a meaningful code of conduct"... but we have many open questions that the legal and policy community within free and open source could really help with.

For instance, it's great that we have people like Ashe Dryden and organizations like Safety First PDX helping develop standards and advising organizers on developing and enforcing codes of conduct, but should we actually be centralizing this kind of reporting, codification and enforcement across the FLOSS ecosystem? Different subcommunities have different needs and standards, but just as OSI has helped us stave off the worst possibilities of license proliferation, maybe we should be avoiding the utter haphazardness of Code of Conduct proliferation.

And -- given how interconnected our projects are -- what if single open source projects are the wrong size or shape or scope for this particular aspect of stewardship and governance?

I'd very much appreciate thoughts on this from other folks in future devroom talks or blog posts -- if you tell me this is the kind of thing we talk about on the FLOSS Foundations mailing list then maybe I'll have to bite the bullet and go ahead and subscribe.

IV. Other thoughts + Conclusion

A. Comments on my CT piece

The comments on my Crooked Timber piece had many fine insights, on enforcement, culture, exit, voice and loyalty, fairness, and the consent of the governed. They're worth reading.

B. Hospitality to liberty spectrum

In addition to the contract-governance contrast, I think it's also worth thinking about the spectrum of liberty versus hospitality. The free software movement really privileges liberty, way over hospitality. And for many people in our movement, free speech, as John Scalzi put it, is the ability to be a dick in every possible circumstance. Criticize others in any words we like, and do anything that is not legally prohibited.

Hospitality, on the other hand, is thinking more about right speech, just speech, useful speech, and compassion. We only say and do things that help each other. The first responsibility of every citizen is to help each other achieve our goals, and make each other happy.

I think these two views exist on a spectrum, and we are way over to one side, the liberty side, as a community, and moving closer to the middle would help everyone learn better and would help us keep and grow our contributor base, and help make it more diverse. And to the extent that comparing codes of conduct to copyleft licenses helps some people put new initiatives in perspective, balancing the relationship between rights and responsibilities, perhaps that can also help shift our culture into one that's more willing to be hospitable. I hope.

C. This feels like a potentially insoluble problem

William Timberman said in Crooked Timber comments, "how does a socialist persuade a libertarian that coherence and the common good is sometimes a legitimate constraint on individual freedom?" And the answer is that I don't know, but I hope it is a soluble problem, and I hope I've opened up some avenues for exploration on that topic. Thank you.

Photo by Zack McCune/Wikimedia Foundation, CC BY-SA 4.0.

The Wikimedia Foundation has announced a partnership with Kiwix, the free and open-source software solution that enables offline access to educational content, to expand and improve access to Wikipedia and other Wikimedia projects globally. This partnership will include a $275,000 contribution to Kiwix to further enhance offline access to Wikipedia in parts of the world where consistent, affordable internet connectivity presents a significant barrier to accessing Wikipedia.

“Our hope is that one day everyone will have access to the internet, and eliminate the need for other offline methods of access to information,” said Kiwix CEO Stephane Coillet-Matillon. “But we know that there are still serious gaps in internet access globally that require solutions today. Kiwix is a tool to start fixing things right now.”

The Wikimedia Foundation and Kiwix have had a long-standing collaborative relationship to expand access to Wikipedia around the world. This includes recent support to Kiwix and WikiProject Medicine to improve the availability of offline Wikipedia medical content, as well as improvements to the Kiwix desktop experience.

Through this partnership, the two organizations will collaborate to create a long-term strategy for third party reuse of Kiwix’s free access platform, fix longstanding code debt, improve Kiwix’s usability across mobile platforms including Android, and integrate Kiwix’s and the Wikimedia Foundation’s technical operations more closely for improved Wikipedia offline experiences.

“As part of the 2030 direction for Wikimedia’s future, we’re thrilled to be partnering with Kiwix to invest in solutions to address one of the critical barriers to participating in Wikipedia globally: reliable internet access,” said Anne Gomez, Senior Program Manager at the Wikimedia Foundation. “We have made a commitment as an organization to actively address the challenges and barriers to reaching our global Wikimedia vision: a world in which everyone can freely share in knowledge. Today marks an important step toward realizing that commitment.”

The Wikimedia vision is global: a world in which everyone can freely share in the sum of all knowledge. While there has been a significant reduction in high mobile data costs and other barriers to participating in Wikipedia, more than half the world’s population is not yet online.

Today, Kiwix sits at the heart of the offline ecosystem with more than 3 million users from more than 200 countries. It can store millions of Wikipedia articles from any of Wikipedia’s nearly 300 languages along with thousands of books and videos on a single flash drive or microSD card for access on smartphones and computers. Kiwix has also worked with nonprofits such as the Orange Foundation, Human Rights Foundation, Internet in a Box, WikiFundi, and Digisoft to scale distribution of offline education materials around the world to students, teachers, and the general public.

More information about the Wikimedia Foundation’s work to expand access and participation to Wikipedia globally, including information about this partnership with Kiwix, can be found in the Wikimedia Foundation’s 2018-2019 annual plan.

About the Wikimedia Foundation

The Wikimedia Foundation is the nonprofit organization that supports and operates Wikipedia and its sister free knowledge projects. Wikipedia is the world’s free knowledge resource, spanning more than 45 million articles across nearly 300 languages. Every month, more than 200,000 people edit Wikipedia and the Wikimedia projects, collectively creating and improving knowledge that is accessed by more than 1 billion unique devices every month. This all makes Wikipedia one of the most popular web properties in the world. Based in San Francisco, California, the Wikimedia Foundation is a 501(c)(3) charity that is funded primarily through donations and grants.

About Kiwix

Kiwix is open-source software that brings internet content to millions of people without internet access – be it because of cost, poor infrastructure or even censorship. Websites like Wikipedia, TED talks, the Gutenberg library and many more can be stored and browsed as if users were online. Kiwix is available in more than 100 languages, and runs on all major desktop and mobile platforms. Based in Lausanne, Switzerland, Kiwix Association is a registered Swiss Verein that is funded solely through donations and grants. For more information, see http://www.kiwix.org.

This map comes courtesy of the UN to Commons. It was downloaded in 2007 by Jeroen. The language on the map is French, while Wikidata has much of its data in English. The names in French are mostly the same, but that is for someone else to consider.

Many of the articles on "administrative territorial entities" are written by a small group of people. I want to single out Shevon Silva, whose user page shows the amount of work that went into adding stubs for so many African territories. The important thing about data is this: once it is there, you can change it in any way necessary.

When data gets entered into Wikidata, certain Wikipedia things are not possible; a "human settlement" is not an "administrative territorial entity". Such conflations need to be undone in Wikidata. Obviously the human settlement is located directly only in that administrative territorial entity, and in others only by inference. Attributes like "inception date" and links to other human settlements that are part of a sub-prefecture are for someone else to add and get right. Another consideration is historic administrative territorial entities, particularly those of historic countries.

At this time it is important to celebrate what we have and to morph it into a format that can be used on any and all of our projects. Once it is available in all the Wikipedias, it will generate more and more links, and this will put Africa on the map.
Thanks,
      GerardM


July 17, 2018

 

There are villages in the Ecuadorian Andes that are so small you cannot find them on a map. Cajas Juridica is one such place, located just 13km north of the equator. But two engineering students, Joshua Salazar and Jorge Vega, and the staff of Yachay Tech University have figured out a way to give discarded CRT TV screens a second life, using Kiwix—an offline Wikipedia reader—to bring Wikipedia to these communities. Josh Salazar told us more about their Offline-Pedia project.

Tell me about what started your interest and involvement with offline Wikipedia: are you a Wikipedian?

Since I was a kid I have always loved encyclopedias, especially the digital ones because of the multimedia content; I was a really big fan of Encarta. I remember that in the school I attended, there was a computer with Encarta which they used to rent. I really loved to surf through all of its topics and stuff, and every kid in that school did as well. My interest really started there, because one of the main problems was that they rented the computers, and also there was no internet connection. A cheap, available-for-anyone computer including the whole of Wikipedia would have been perfect in such places! That’s why I became really motivated to make Wikipedia available for rural communities such as the one where I grew up.

I have only edited some small sections of a couple of articles so far… but I’m looking forward to having more time to write once I finish my bachelor’s degree.

What is Offline-Pedia? How does it work?

Offline-Pedia started as a project focused on setting up computers with Wikimedia content, using low cost and recycled materials, such as wood for the case and old CRT TVs as screens, for rural communities.

The usage of free hardware and free software (a Raspberry Pi; Raspbian, a Debian-based OS; and Kiwix) makes each device very cheap to build. By reusing an old TV, everything can cost around USD 100 per device. Thus, one of the main goals is to upcycle old CRT TVs and other kinds of compatible screens.

Why CRT TVs? Well, because in Ecuador, at the end of this year (2018), the main TV broadcast signal will switch from analog to digital, and a lot of TVs will then become electronic waste. With the project, we aim to solve two problems: the difficulty of accessing the internet in rural communities, and handling the future electronic waste by reusing the obsolete TVs.

How did you find out about Kiwix?

I have known of its existence since my last year of high school, but I actually don’t remember exactly where I read about it. I’ve always been subscribed to free software and Linux newsletters and pages, so maybe it was in one of those.

What’s the hardest part of the technology to deal with (for you on the one side, and for the end users on the other)?

Compiling the program source code from scratch on the Raspberry Pi. You know, there are always more dependencies that rely on other dependencies to work, or you forget the folder paths of the previously installed packages, and so on… but it was fun. After making it work for the first time, just recently actually, we found out that precompiled versions of Kiwix (but just the server) existed. I became a little upset because I had even missed some classes to compile it in time to fulfill the promise we made to the community. Anyway, the Kiwix guys later also shared with me a beta automatic installer and downloader for preparing microSD disk images for the Raspberry Pi. That will save us a lot of time for the next Offline-Pedias we will be preparing!

The technical difficulties with the users were more related to the ‘techno phobia’ of the elderly people, but the youth and children really loved it!

What happened when people started using your box? What lessons can you share with other people who might want to do similar projects?

They started to look for literature! ‘Romeo and Juliet’ in the Gutenberg Project Library, and the biography of William Shakespeare in Wikipedia. That was something we would never have imagined!

I encourage people to replicate the project wherever they are, because there are always curious people, and helping them access the biggest reference platform in the world makes you feel very inspired, accomplished, and happy.

What’s been the biggest surprise? The biggest challenge?

The young people looking for English literature was very unexpected. That demonstrated to us that we don’t really know what the people of the community want from us. So, a big challenge will be to properly learn from the people. But luckily, Sergio Minniti, a professor at our university, Yachay Tech, who develops ‘critical technical practice’ prototypes with students, decided to help us. He is a sociologist and will help us start a second phase of the project: ethnographic studies to better understand the actual needs of the people involved with the project.

Photo via Joshua Salazar, CC BY-SA 4.0.

Offline-Pedia also includes other contents: what are people most interested in, and what would you like to see made available?

Offline-Pedia contains the default tools provided by Raspbian (Scratch, Wolfram Mathematica, and so on) and as much as we could download from the Kiwix ZIM packages available (Vikidia, PhET chemistry and physics simulations, the Gutenberg library; all except the TED talks, which were very large).

People really loved having offline Wikipedia access, because it basically has articles about whatever they want to know about.

I would like to implement offline wiki software to allow people to have a direct wiki experience: to create, edit, and discuss the content. Later, when we return periodically to check whether everything is running nice and smooth, we can take the newly created content and upload it to Wikipedia. That would expand Wikipedia’s range of action to zones where the internet isn’t yet available.

How do you see these devices impacting the lives of people they are shared with?

I hope people will become more motivated to read and look for answers whenever they have doubts. Since I started to give speeches with Jorge Vega, another member of the Offline-Pedia project, I have wanted people to understand the importance of learning about science and technology, and how those aspects are vital for the economic, academic, and human development of our communities and country. But most importantly, I want to make them aware of the enormous amount of knowledge freely available on Wikipedia to every single human on Earth.

What’s next for your project? If you dreamed big, where should it go?

We want to work together with the Ministry of Education of Ecuador to replace the expensive projects they have, and also to develop some kind of software that could allow people to learn the minimum topics required for completing the Basic Education Program of Ecuador’s Education Curriculum.

But my biggest dream is to facilitate access to Wikimedia content in every single place on Earth where there is no internet access.

How else can Kiwix or the Wikimedia movement help you?

Spreading the word! We will be preparing a video blog and tutorials so that any tech enthusiast can download the required packages, download the blueprints for cutting the wood for the box on a CNC machine, and understand the installation process. People will also be able to share their experiences wherever they have installed an Offline-Pedia device working with Kiwix, along with any further content that we may develop in the meantime.

What resources exist for people who want to know more?

Right now, we just have the video that the cool guys of the Communication Office of our University helped us record while installing the device in the first rural community. Soon we will be finishing a blog with everything related to the project.

Interview by Stéphane Coillet-Matillon, Kiwix

Video at top by University Yachay Tech, CC BY-SA 4.0.

July 16, 2018

Photo by Rupika Sharma/Wikilover90, CC BY-SA 4.0.

Over the last few years, the annual Wikimedia Conference has seen many more individuals in emerging communities. This year, Wikimedians from 79 countries, representing nearly one hundred movement affiliates, contributed to the event’s cultural, regional, and language diversity.

Community member Rupika Sharma interviewed several of these attendees to get their thoughts on the Wikimedia movement, their own communities, the Wikimedia 2030 strategy process, and how their involvement with it all will change over the next five years. They include:

  • Felix Nartey, Open Foundation West Africa
  • Sam Oyeyele, Wikimedia User Group Nigeria
  • Coenraad Loubser, Wikimedia South Africa
  • Liang-Chih Shang Kuan, Taiwan chapter/user group
  • Rodrigo Barbano Tejera, Wikimedia Uruguay and Wikimedia Digitization User Group
  • Nahid Sultan, Wikimedia Bangladesh

Here’s more about Coenraad. Tune back over the next few weeks to read the rest of the interviews as they are published.

———

What are your personal and community’s plans and goals in the contexts of the global free knowledge movement and the Wikimedia 2030 strategy?

We were one of the earliest chapters, we came from the 11th chapter and it has been 10 years.  It has been volunteer work mostly, more of a same thing over and over again, [like] two or three weekly edit-a-thons in Cape Town. Some online communities and some of the South African Wikimedia [groups] are growing but very slowly and there has not been a push or outreach.

Most people in South Africa have no idea that they can click, edit and change things. Most people are very surprised when they find this. It goes deeper than that. People don’t even understand what the internet is. Many people think that it is just a way to use Facebook or social media channels. It takes a few years for an internet service providing agency, there is a curve, you see people start using internet more and more, the longer they have engaged with it, the longer they have access to it.

The usage of data starts increasing when justifiably people have been on the internet for four or five years. So much of South Africa still hasn’t reached that point. So maybe for twenty percent of the country that has been there for enough years. But probably in the next three to four years we are going to have the majority of the country reach that point. A goal for me is to make sure when that happens, people understand the way Wikipedia fits in and what Wikipedia is and this is a movement that they can be a part of.

What’s special about this conference is that it brings together such diverse people from all walks of life, from all industries, from all countries, all languages, all genders, all religions. I don’t think there is any other group on this planet that reaches so far and wide. I think people most capable of bridging the divides of this world are brought together on this planet. That is something really really special to be proud of.

So when you think about it, how difficult can it be to get volunteers and get people involved. So that is one side of it and other on the other hand, there is one part of diversity. There are two main groups of people, so everybody is passionate about either contributing, supporting something for a good cause or actually have a time and cause on mind. People who want to do something but they are not sure what, and the people who know exactly what they want to do. And then there are other two groups—that’s the people who know how to do certain things as opposed to people who know what they want to do.

And that is the special thing about the Wikimedia Conference: because Wikimedia Deutschland [Germany] is so big and so well funded, as opposed to 95 per cent of the rest, they can afford to employ staff, and they are becoming skilled at knowing how; the volunteers generally know what. So, bringing those two together and creating a space in which we can learn from each other is the magic that happens here.

It is one thing to attend a session and see a presentation, and then another thing to go out there, you realize that this is how you do things. Also, you learn about the balance between learning and practising. And the important thing about practising is that there are skilled people at hand and as you are trying something and making a mistake and there is something you don’t understand, you realize you can ask for assistance. Also, you build personal relationships, so that you know who to contact. And this person knows you through all the noise of the mailbox!

How was your experience at the conference? What did you learn, and how would you implement that in your community back home?

There are equal parts personal growth and skills that will assist our community. So, in personal growth I realized that even though I need to spend more time listening and paying attention to other people’s needs and trying to find ways on how to assist them, I still find it hard to do that. I still have the urge to push my own agenda and to do what I think is the best rather than other people. And this is partly also why I joined this movement, in order to learn how to do this better. But I also joined because I can push things and can get things done. So it is about finding the middle foot, the way we can push and the way we listen. So, the strategy 2030 speaks to me about how I can push. We have volunteers and we are doing things but we are not actively doing outreach, we are not actively engaging with the partners in order to do projects. We are not looking for new sources of revenues in order to fund and drive some of these projects. So, those are three things that I am pushing for.

As told to Rupika Sharma (User:Wikilover90), Wikimedia community member

This interview has been minimally edited, preserving as many of the interviewee’s words as possible.

Why do some people contribute to Wikipedia? Conversely, why don’t others? Ever since Wikipedia became a self-aware community, this question has vexed those who participate in it, and would like to see more people pitch in and help build the encyclopedia. After all, Wikipedia was created by a community of individuals with diverse interests and motivations. Some stay for a short while, and others stay much longer, but no one can stay forever. For this reason, the community must analyze itself and attempt to address the problems which hold it back. But this is a very, very difficult topic to grapple with.

In mid-June, an editor named Ziko van Dijk, who happens to be one of the longest-running active contributors, posed a version of this question on a Facebook group for Wikipedia editors called Wikipedia Weekly. In the post, van Dijk noted the difficulty of finding new contributors, and speculated that a big reason is “simply that most people don’t like the hobby that is Wikipedia”; it’s a rather abstruse pursuit. Few people enjoy writing, and those who do prefer to express themselves, rather than impersonally collate facts. Meanwhile, other “occupations” on Wikipedia, such as clerical work involving categorizing pages, are similarly unappealing. Therefore, in his view, existing Wikipedians must be clearer about what being a Wikipedian really means.

A discussion ensued, and weeks later, the thread had grown to more than 100 comments, with numerous current and former editors, including Wikimedia Foundation personnel, weighing in. I was a participant near the beginning, and in returning to the thread last week, I found the discussion as a whole a fascinating and perhaps useful compilation of views about Wikipedia’s problems recruiting new editors and retaining existing ones. This blog post is an attempt to summarize some of the more interesting arguments; the following are presented without judgment as to their correctness, but simply to describe the views in circulation:

Why aren’t there more people joining Wikipedia in the first place?

  • Many people simply do not know that they can edit Wikipedia. This seems difficult to believe, when Wikipedia is one of the most-visited sites in the world and has been for more than a decade, but the fact remains: we can’t assume that everyone who reads Wikipedia understands how its articles come to be written in the first place.
  • As van Dijk suggests, most people are not writers. Despite the rise of social media, few people write very much or at length—Instagram is bigger than Twitter, and most people who use Twitter simply read, rather than tweet. Moreover, the kind of writing necessary to produce Wikipedia articles is slow, laborious, and exhausting. However energizing a Wikipedian might find the work involved, it’s not hard to see why others might find it enervating.
  • Those who do write tend toward personal expression, sharing opinions and experiences. Wikipedia is the opposite of this: it’s not a place to write what you know, but a place to record what others have written about what they know. Similarly, most who write like to have their name attached to it—even if it’s not their real name. But Wikipedia is not a place for brand-building; it’s a matter of policy that Wikipedia articles are unattributed to their authors, only to the sources the authors used to compile them.
  • Those who try may be surprised that Wikipedia places unexpected restrictions on what they can write. You can’t just copy material from another source into Wikipedia wholesale, for example. And the range of acceptable sources is fairly limited. Wikipedia’s content rules are complex, and many of them are non-intuitive for those not steeped in Wikipedia’s community.
  • Some who try writing or editing an article may have just one topic they really care about, and are uninterested in going beyond that to work on many articles. Once they’ve said their piece, or tried and failed, their interest in the project has been exhausted.
  • A lot of what’s involved in contributing to Wikipedia amounts to clerical work. For many people, this sounds like, well, work. People who work in information jobs, especially, may find that Wikipedia is not a break from the kind of tasks they have to do in their real jobs, so Wikipedia feels too much like more of the same.
  • Potential contributors may associate Wikipedia merely with writing, and not with the myriad other tasks necessary to build the encyclopedia. These include contributing photographs and illustrations, coding templates and writing software, curating information, reviewing content, or patrolling new changes to keep articles free from vandalism or nonsense. You can be a Wikipedian even if you never write an article! But this isn’t readily apparent.
  • Wikipedia is simply too difficult to understand, and finding your way around can be head-spinning. As one participant put it: “Wikipedia is a maze without walls.”

Even if they want to join, the barriers to contributing are quite high

  • Wikipedia now has more than 5.6 million articles: all of the “low-hanging fruit” has been picked and there are fewer opportunities to create new articles. Meanwhile, expanding or revising existing articles may be less enticing to new contributors than the possibility of creating new ones. This is not at all to say that Wikipedia has created all or even most of the articles that it should eventually include, but it does mean these remaining opportunities are likely to be on more esoteric topics.
  • Wikipedia’s rules are very difficult to discover and master. There is no comprehensive list, nor a clear order in which they should be read. Should you begin with Policies and guidelines, Key policies and guidelines, or List of policies and guidelines? Who knows? And once you’ve found them, they can take a while to read, not to mention internalize.
  • Another potential problem is a lack of clear goals for the Wikipedia community: back when Wikipedia was much smaller, it was easier to say that the goal was to get to 50,000 articles, 100,000 articles, or 1 million articles. Growing the encyclopedia is no longer the focus—that seems to happen almost on its own these days—but what goal replaces it? Reach? Quality? It’s not clear.
  • The “confidence factor” may play a role in a few ways. One is simply that by getting started editing, one exposes oneself to evaluation, judgment, and criticism for one’s work. That’s not inherently a lot of fun. Additionally, with so much already written, new contributors may be reluctant to “interfere” with the work of those who have come before. After all, Wikipedia seems to have done quite well without their input, so why start now?

Harassment is a problem, but how much of a problem?

  • A recurring theme in the discussion was the degree to which harassment, especially of women, on Wikipedia is really a problem. Many editors have experienced it or seen it, but disagreement exists about whether it is a truly pervasive problem that is turning off potential contributors, or if the worst examples are rare but memorable.
  • Prevalence of harassment is difficult to measure for the same reason that crimes of violence often are: victims may be unlikely to report it, because doing so is daunting, and more so when the default assumption of Wikipedia discussions is that they occur in public. Were ANI to feature a private reporting feature, perhaps this would be mitigated.
  • A related question: don’t you have to contribute to Wikipedia first in order to experience harassment? The thinking being, it doesn’t really make sense to discuss harassment in terms of new editors. Still, it’s possible would-be contributors have heard horror stories. And regardless of the reality on the ground (or the page) you can be certain this is a topic that will come up when these questions are raised.
  • Lastly, was Wikipedia ever a friendlier place than it is now? One suggestion was: no, it only seemed that way because there were more wide open spaces between content and there were fewer opportunities for contention and confrontation. Also, because Wikipedia had not yet become a global brand, there was less vandalism, and fewer COI problems. It doesn’t change anything now, but it’s interesting to consider.

What might some potential solutions look like?

There are as many potential solutions as there are problems. Maybe more? Here is a short list of ideas floated in the discussion thread, relating to the explanations listed above. As before, they are presented without judgment, but in some cases with a little bit of supplementary commentary mixed in.

  • Wikipedia’s information pages must explain better what participation means before new users sign up. Wikipedia:Introduction is intended to be the starting point, but it doesn’t really offer any context for what to do. Not only is a better community portal for first-time editors a possible solution, but perhaps “better” isn’t the same for everyone, and there should be more than one point of entry based on one’s background or intentions.
  • Spotlight other things people can do than simply edit articles: patrol changes, review articles for GA or FA status, contribute photos, produce cartography, create templates, write bots, or fix grammar and spelling. A “101 ways to contribute” video or similar presentation could help spread awareness.
  • Better integration of tools from the community; VisualEditor is the WYSIWYG editing interface new contributors are encouraged to try, and Wikipedia Teahouse is the place for new editors to ask questions of veterans, but you can’t use the VisualEditor at the Teahouse.
  • For those who want recognition for their contributions to Wikipedia, perhaps Wikipedia’s articles could be re-designed slightly to include randomized lists of contributors to the article. Every once in a while, you would get to see your name in lights. (Un-discussed: what if you don’t want your name in lights?)
  • “Stop over-policing contributions and under-policing behavior”. This is a fascinating insight, but also one that appears to run counter to the long-observed community advice to “focus on the edit, not on the editor”.
  • Stop pretending that everyone should be an editor, and find ways to support those who are. Additionally, find out why current contributors do so, and find ways for Wikipedia’s support teams and infrastructure to better nurture these motivations. Showcase stories of editors explaining why they are personally motivated to contribute.
  • More outreach projects to specific communities who are actually likely to edit Wikipedia: in science, literature, and especially at libraries.
  • Find ways to surface specific tasks to be done within different modes of contribution. Twitter, Facebook, and Reddit all have feeds with new content to consume, but Wikipedia has no such centralized resource, whether communal or individualized. A new editor-focused dashboard was a popular suggestion in the 2016 Community Wishlist Survey, but not much has happened with it recently.

Ultimately, to borrow a phrase from academic work, mentioned in the thread: “further research in this area is needed”. Hopefully, in the meantime, discussions like this can help shape more rigorous explorations of this subject matter, and point toward solutions that benefit Wikipedia and its contributors, present and future.

Photograph of 2012 Wikimania participants via Helpameout licensed under Creative Commons.

July 14, 2018

The German and the English Wikipedia conflict over the "administrative territorial entities" of the Gambia. I was told to remove entries that I made to Wikidata because they were "Falschinformationen" (false information). The German article is much better written but the English article indicates that the German information is likely to be outdated.

A discrepancy like this is obviously not best solved by insisting on "your" solution. The point that I have been making quite often is that such differences are commonplace and require proper sourcing. The obvious source will not be found on a university website; it will be found in governmental information of the Gambia.

Making information about Africa available in Wikidata makes the errors, the inconsistencies and the lack of data in the Wikipedias more visible. This is not solved by considering your "own" data to be best; it is solved by proving that information is up to date. According to the English Wikipedia, the Upper River Division no longer exists; it has largely been replaced by the Basse Local Government Area.

My question: what does it take for the Wikipedias to take their inconsistencies seriously?
Thanks,
      GerardM

Support for "minority" languages was the subject of the Celtic Knot conference. I have watched some of the presentations and find that there is a lot more to supporting minority languages from a Wikidata point of view than just adding missing labels. A vital strength of any Wikipedia lies in the relations between its articles, which make it possible to find subjects of interest.

"Minority languages" is a misnomer; what we mean is that the Wikipedias are small. They lack articles, structure is missing and subjects of interest are not found. Subjects have the same relations in any language, and consequently lists expressing these relations can be shared using Wikidata in any language, including "minority" languages. Missing labels need not be an issue; this is expressed nicely in this list of subdivisions of Egypt, where the labels for most of them are only available in the Arabic script. A nice invitation for people to add labels in the Latin and any other script.

The Welsh Wikipedia makes use of "Listeria" lists in its main space and as a consequence, all items in these lists can be found. They are available in a context, associated information may be available and they link to articles in other languages. The Welsh Wikipedia also implemented the "Article Placeholder" and in this way provides even more information for specific subjects.

When you consider Africa and information about Africa, there is no Wikipedia that provides adequate information. The data is incomplete, unstructured and often out of date. It is easy enough to improve on the quality of the data in Wikidata and when the information is updated in many Listeria lists on many Wikipedias, the impact is great.

The lack of coverage of subjects about Africa is huge. Less than 1% of the humans we cover are from Africa, and we do not have up to date information about "administrative territorial entities" like provinces and districts. In my AfricaGap project only a limited range of subjects get some attention at this time. Obviously there is more that could be done. African cinema is one subject that is of interest to a group of Wikimedians. When they write their articles it will eventually translate to Wikidata and information about movies, actors and directors may be shown in Listeria lists in all the African language Wikipedias. This may generate interest from an African public for our projects.

There is only one purpose for Wikidata and Wikipedia: to find a public, a use case for the data, the articles, the information we provide. The one challenge we face is in both the quantity and quality of our articles and data.
Thanks,
     GerardM

“On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach”

Reviewed by Maik Stührenberg

This paper[1] is thoroughly structured and combines the theory of web genres with dialogue theory to examine Wikipedia talk pages. Since Wikipedia is a web genre, “Wikicussions” (as the authors call them) form a subgenre. In this context, talk pages are examined further, including the quality of cooperation between Wikipedia users, which can be linked to social differentiation regarding roles and statuses of Wikipedians (content- vs. administration-related users). These group-related processes can be seen as a mediating layer between external parameters (system requirements for Wikipedia’s user community) and the structure and dynamics of Wikipedia’s subgenres.

The authors argue that, unlike face-to-face dialogue, Wikicussions stand out due to a publicly available common ground (a concept derived from dialogue theory), which may provide a reason for the structures they found.

The paper is enriched with a number of high-quality figures that support and underpin the findings.

Graph between November 2000 and November 2015 clearly demonstrating that most posts come from registered users

Frequency distribution of talk posts over time within the German Wikipedia (blue: registered users; red: anonymous users; green: bots; black: all users). Unsigned posts (without timestamps) are excluded. Posts dated by posters outside of the valid time-frame (before the date of creation of the discussion or after the date of its download) are also excluded. (Figure 7 from the paper, by Alexander Mehler, Rüdiger Gleim, Andy Lücking, Tolga Uslu and Christian Stegbauer, CC BY-SA 4.0)

“How Sudden Censorship Can Increase Access to Information”

Reviewed by Bri and Tilman Bayer

Our intuition might tell us that government censorship causes reduced access to online information. But recent research indicates that the effect can be exactly the opposite. Using data gathered from Wikipedia page views and other sources, researchers William Hobbs and Margaret Roberts found that:

[…] citizens accustomed to acquiring this [forbidden] information will be incentivized to learn methods of censorship evasion […] millions of Chinese users acquire[d] virtual private networks, and subsequently […] began browsing blocked political pages on Wikipedia, following Chinese political activists on Twitter, and discussing highly politicized topics such as opposition protests in Hong Kong.[2]

Specifically, the authors studied the impact of a block of Instagram in China on September 29, 2014, following protests in Hong Kong, on Chinese Wikipedia pages that were already blocked in the country. (This predates the 2015 total block of the Chinese Wikipedia and the switch of all Wikimedia sites to full encryption with HTTPS around the same time, which made such per-page blocking impossible.) The list of censored Chinese Wikipedia pages with the largest increase in views “shows that new viewers accessed pages that had long been censored including those related to the 1989 Tiananmen Square protests”,[2] i.e. “viewing patterns that would be more typical of new users who had just jumped the firewall, rather than of old VPN users who had presumably consumed this information long ago.”[2] Here is an excerpt of the full list examined in the research, the top 10 for the second day of the block, linked here to their English Wikipedia equivalents:

  1. People’s Republic of China blocked websites list
  2. Jiang Zemin
  3. Radio Australia
  4. Hu Jintao
  5. Zeng Qing
  6. Wang Weilin (Tank Man)
  7. Li Peng
  8. Tiananmen Square Incident
  9. Zhou Yongkang
  10. Wu’erkaixi (June 4 leader)

The researchers propose to name this phenomenon the “gateway effect”, a “mechanism through which repression can backfire inadvertently, without political or strategic motivation”,[2] because it incentivizes people to learn how to evade censorship and thus “have more, not less, access to information and begin engaging in conversations, social media sites, and networks that have long been off-limits to them.”[2] They distinguish it from the Streisand effect, where individuals specifically seek out information that is being hidden.

The second author of the study, Margaret Roberts, is also the author of Censored: Distraction and Diversion Inside China’s Great Firewall (Princeton University Press, 2018; print ISBN 978-0-691-17886-8, e-book ISBN 978-1-400-89005-7).

Marketing, social media, and Wikipedia

Reviewed by Barbara Page

This study was able to “characterize” the interests of Wikipedia editors and the editors’ social media activity on Twitter to facilitate:

A marriage between editor editing topics and Twitter (and possibly Facebook) will result in targeted marketing tailored just for you!
(Photo: Harland Quarrington/MOD, OGL)

“[…] building rich user profiles, which can be conveniently used in order to provide personalized contents and offers.” and “[…], i.e., the detection of the user’s core interests and, therefore, allows for product and service recommendations far more tailored than those stemming from other (usually) extemporary actions on the Internet, like flight ticket purchases and hotel reservations. In this light, it is important to notice that such a profiling potential associated to social login remains nowadays largely unused and enabling its exploitation is one of the main goals of the present work.”[3]

Conferences and events

See the community-curated research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

WMF research showcase

Recent presentations at the monthly Research showcase hosted by the Wikimedia Foundation included the following:

“Conversations Gone Awry: Detecting Early Signs of Conversational Failure”

Presentation slides (video)

Antisocial behavior, including harassment and personal attacks, can occur in online social systems. A new paper[4] by seven researchers from Cornell University, Jigsaw, and the Wikimedia Foundation describes how undesirable exchanges might be predicted from the start of a conversation, which may help prevent a discussion from deteriorating. One of the authors also gave an interview published on the Wikimedia Foundation’s blog,[supp 1] and the paper was covered in popular media; see In the media § In brief.

Case studies in the appropriation of ORES

From the announcement (by Aaron Halfaker):

Presentation slides about the use of the ORES platform (video)

ORES is an open, transparent, and auditable machine prediction platform for Wikipedians to help them do their work. It’s currently used in 33 different Wikimedia projects to measure the quality of content, detect vandalism, recommend changes to articles, and to identify good faith newcomers. The primary way that Wikipedians use ORES’ predictions is through the tools developed by volunteers. These javascript gadgets, MediaWiki extensions, and web-based tools make up a complex ecosystem of Wikipedian processes – encoded into software.

The presentation covered “three key tools that Wikipedians have developed that make use of ORES”: Wikidata’s damage detection models, exposed through Recent Changes; Spanish Wikipedia’s PatruBOT; and WikiEdu tools from User:Ragesoss that incorporate article quality models.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer
  • “On the Effects of Authority on Peer Motivation: Learning from Wikipedia”[5] – From the abstract: “We show that lateral authority, the legitimacy to resolve task‐specific problems, is welcomed by members of an organization in the resolution of coordination conflicts, the more so (1) the fiercer the conflict to be resolved, (2) the higher the competence‐based status of the authority, (3) the lower the tenure of, and (4) the more focused the organizational members are. Analyzing the discussion behavior of members of Wikipedia between 2002 and 2014, we corroborate our allegations empirically by analyzing 642,916 article–discussion pages.”
  • “A Comparison of the Historical Entries in Wikipedia and Baidu Baike”[6] – From the abstract: “This research purposefully chose 6 entries and developed a framework to evaluate their performance in accuracy, breadth, depth, informativeness, conciseness and objectiveness. The result shows that: Wikipedia is superior in most cases while Baidu Baike is a little better in the entries on Chinese history. The operating mechanism is the main reason for it.”
  • “Sentiments in Wikipedia Articles for Deletion Discussions”[7] – From the abstract: “We performed sentiment analysis on 37,761 AfD discussions with 156,415 top-level comments and explored relationship between outcomes of the discussion and sentiments in the comments. Our preliminary work suggests: discussion that have keep or other outcomes have more than expected positive sentiment, whereas discussions that have delete outcomes have more than expected negative and neutral sentiment. This result shows that there tends to be positive sentiment in the comment when Wikipedia users suggest not to delete the article.”
  • “‘What are these researchers doing in my Wikipedia?’: ethical premises and practical judgment in internet-based ethnography”[8] – From the abstract: “The article reflects on the heuristics that guided the decisions of a 4-year participant observation in the English-language and German-language editions of Wikipedia. […] it interrogates the technological, social, and legal implications of publicness and information sensitivity as core ethical concerns among Wikipedia authors. The first problem area of managing accessibility and anonymity contrasts the handling of the technologically available records of activities, disclosures of personal information, and the legal obligations to credit authorship with the authors’ right to work anonymously and the need to shield their identity. The second area confronts the contingent addressability of editors with the demand to assure and maintain informed consent.” (See also the Wikipedia essay “What are these researchers doing in my Wikipedia?”)
  • “Digging Wikipedia: The Online Encyclopedia As a Digital Cultural Heritage Gateway and Site”[9] – From the abstract: “[…] this article introduces Wikipedia as a digital gateway to and site of an active engagement with cultural heritage. We have developed the open source and freely available analysis architecture Contropedia [website] to examine already existing volunteer user-generated participation around cultural heritage and to promote further engagement with it. Conceptually, we employ the notion of memory work, as it helps to treat Wikipedia’s articles, edit histories, and discussion pages as a rich resource to study how cultural heritage is received and (re)worked in and across languages and cultures. […] The analysis facilitated by Contropedia […] sheds light on the contentious articulation of perspectives on tangible and intangible heritage grounded by conflicting conceptions of events, ideas, places, or persons. Technologically, Contropedia combines techniques based on mining article edit histories and analyzing discussion patterns in talk pages to identify and visualize heritage-related disputes within an article, and to compare these across language versions.” (cf. earlier coverage: “‘Contropedia’ tool identifies controversial issues within articles”; “Towards better visual tools for exploring Wikipedia article development – the use case of ‘Gamergate controversy’”)
  • “Use of Louisiana’s Digital Cultural Heritage by Wikipedians”[10] – From the abstract: “This case study details an analysis of Wikipedia links to online resources from Louisiana cultural heritage institutions [also known among Wikimedians as GLAMs] in order to determine what types of cultural heritage resources users are citing on Wikipedia, what is the content of the Wikipedia articles with Louisiana CHI citations, and how this can influence the work of CHI. The results of the study include findings that digital library items and archival finding aids are the most cited sources from cultural heritage institutions on Wikipedia and are particularly popular for Louisiana-specific Wikipedia articles on society and the social sciences and culture and the arts.”
  • “The Conceptual Correspondence between the Encyclopaedia and Wikipedia”[11] – From the abstract: “This study […] focuses on the roles and attributes of both printed encyclopaedias and Wikipedia. First, we analyse the roles and attributes of an encyclopaedia by conducting a review of research related to them. Then we analyse whether or not Wikipedia fulfills the same roles and has the same attributes as the encyclopaedia by reviewing academic work that investigates and analyses Wikipedia from various perspectives. The results show that Wikipedia does not conceptually correspond to an encyclopaedia, except in cases where people use it for one-time searches. In the world of digital media, Wikipedia does not have the same status that the encyclopaedia holds in the world of print media.”
  • “Structural Differentiation in Social Media: Adhocracy, Entropy, and the ‘1 % Effect’”[12] – From the text: “Over the study period (2001–2010), we observed 235,701,162 edits completed by 22,792,847 unique contributors. Of these, 19,680,637 users were anonymous, identified only by their unique IP addresses. The rest (3,112,210) were registered users who were logged into their respective accounts. […] logged-in users were the clear minority group, yet they contributed far more edits than the anonymous users—all told, those logged-in individuals were responsible for almost two-thirds (68%) of the observed revisions. Even more importantly, the top 1% of all contributors were responsible for 77% of the collaborative effort based upon the extent to which the text of articles was actually changed (i.e., the contribution delta). [… The] simple answer to research question 2 (RQ2), ‘What is the social mobility (or its inverse, elite “stickiness”) of functional leaders on Wikipedia over time?’ is that on average, across the entire 9.5-year period, an individual who was a top contributor at a given point in time had a 40% probability of remaining in the top contributor group 5 weeks later. Twenty weeks later, that individual would have a 32% chance of still being a top contributor, and after 30 weeks, this figure would be at 28%.”

    In a press release by Purdue University, one of the authors commented: “What we saw is that a clear leadership has emerged, but it’s a leadership that cycles. We have a group of individuals who shape the content by working the hardest and clocking the most hours. The agenda is shaped by these people, and they’re driven by a sense of mission, much like political or religious movements.”[supp 2]
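The contributor counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch using the figures in the quotation; the variable names are illustrative, and the derived shares are computed here rather than taken from the book chapter.

```python
# Figures as quoted from the text (study period 2001-2010).
total_edits = 235_701_162
total_contributors = 22_792_847
anonymous = 19_680_637
registered = 3_112_210

# The two groups should add up to the reported number of unique contributors.
assert anonymous + registered == total_contributors

# Derived values (not stated in the source): how small the registered
# minority is, and the mean number of edits per contributor.
registered_share = registered / total_contributors
edits_per_contributor = total_edits / total_contributors

print(f"Registered share of contributors: {registered_share:.1%}")
print(f"Mean edits per contributor: {edits_per_contributor:.1f}")
```

The mean hides the skew the authors describe: according to the quoted passage, the top 1% of contributors account for 77% of the collaborative effort.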

References

  1. Mehler, Alexander; Gleim, Rüdiger; Lücking, Andy; Uslu, Tolga; Stegbauer, Christian (January 30, 2018). “On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach” (PDF). Glottometrics (RAM-Verlag, published January 2018). 40: 1–45. ISSN 1617-8351. OCLC 7493144471. Archived (PDF) from the original on June 28, 2018. Retrieved June 28, 2018 – via ResearchGate.  Open access
  2. a b c d e Hobbs, William R.; Roberts, Margaret E. (April 2, 2018). “How Sudden Censorship Can Increase Access to Information”. American Political Science Review (Cambridge University Press): 1–16. ISSN 0003-0554. OCLC 7435466814. doi:10.1017/S0003055418000084.  Closed access
  3. Torrero, Christian; Caprini, Carlo; Miorandi, Daniele (April 9, 2018). “A Wikipedia-based approach to profiling activities on social media”. arXiv:1804.02245v2 [cs.IR].  Free to read
  4. Zhang, Justine; Chang, Jonathan P.; Danescu-Niculescu-Mizil, Cristian; Dixon, Lucas; Yiqing, Hua; Thain, Nithum; Taraborelli, Dario (May 14, 2018). “Conversations Gone Awry: Detecting Early Signs of Conversational Failure”. arXiv:1805.05345v1 [cs.CL].  Free to read
  5. Klapper, Helge; Reitzig, Markus (May 7, 2018). “On the Effects of Authority on Peer Motivation: Learning from Wikipedia” (PDF). Strategic Management Journal (John Wiley & Sons). OCLC 7586436764. doi:10.1002/smj.2909. Retrieved June 28, 2018.  Open access
  6. Shang, Wenyi (March 15, 2018). “A Comparison of the Historical Entries in Wikipedia and Baidu Baike”. In Chowdhury, Gobinda; McLeod, Julie; Gillet, Val; et al. Transforming Digital Worlds. International Conference on Information (iConference 2018; March 25–28 at Sheffield, United Kingdom). Lecture Notes in Computer Science. 10766 (Online ed.). Cham, Switzerland: Springer International Publishing AG. pp. 74–80. ISBN 978-3-319-78105-1. OCLC 7357407865. doi:10.1007/978-3-319-78105-1_9.   Closed access
  7. Xiao, Lu; Sitaula, Niraj (March 15, 2018). “Sentiments in Wikipedia Articles for Deletion Discussions”. In Chowdhury, Gobinda; McLeod, Julie; Gillet, Val; et al. Transforming Digital Worlds. International Conference on Information (iConference 2018; March 25–28 at Sheffield, United Kingdom). Lecture Notes in Computer Science. 10766 (Online ed.). Cham, Switzerland: Springer International Publishing AG. pp. 81–86. ISBN 978-3-319-78105-1. OCLC 7357407963. doi:10.1007/978-3-319-78105-1_10.   Closed access
  8. Pentzold, Christian (May 3, 2017). “‘What are these researchers doing in my Wikipedia?’: ethical premises and practical judgment in internet-based ethnography” (PDF). Ethics and Information Technology (Springer Science+Business Media, published May 5, 2017) 19 (2): 143–155. ISSN 1388-1957. OCLC 7039749181. doi:10.1007/s10676-017-9423-7. Archived (PDF) from the original on June 28, 2018. Retrieved June 28, 2018 – via ChristianPentzold.de.  Free to read
  9. Pentzold, Christian; Weltevrede, Esther; Mauri, Michele; Laniado, David; Kaltenbrunner, Andreas; Borra, Erik (March 13, 2017). Scopigno, Roberto, ed. “Digging Wikipedia: The Online Encyclopedia as a Digital Cultural Heritage Gateway and Site” (PDF). Journal on Computing and Cultural Heritage. Special Issue on Digital Infrastructure for Cultural Heritage, Part 1 (New York: Association for Computing Machinery, published April 14, 2017) 10 (1): 5:1–5:19. ISSN 1556-4673. OCLC 7006965721. doi:10.1145/3012285. Retrieved June 28, 2018 – via ResearchGate.  Free to read
  10. Kelly, Elizabeth Joan (November 28, 2017). “Use of Louisiana’s Digital Cultural Heritage by Wikipedians”. Practical Communication. Journal of Web Librarianship (Taylor & Francis) 12 (2): 85–106. ISSN 1932-2909. OCLC 7566358637. doi:10.1080/19322909.2017.1391733.  Closed access
  11. Yamada, Shohei (December 29, 2017). “The Conceptual Correspondence between the Encyclopaedia and Wikipedia”. Journal of Japan Society of Library and Information Science (Japan Society of Library and Information Science) 63 (4): 181–195. ISSN 1344-8668. OCLC 7261862873. doi:10.20651/jslis.63.4_181.  Closed access
  12. Matei, Sorin Adam; Britt, Brian C. (September 21, 2017). “Analytic Investigation of a Structural Differentiation Model for Social Media Production Groups”. In Alhajj, Reda; Glässer, Uwe. Structural Differentiation in Social Media: Adhocracy, Entropy, and the ‘1 % Effect’. Lecture Notes in Social Networks (1st ed.). Cham, Switzerland: Springer Nature. pp. 73, 75. ISBN 978-3-319-64424-0. ISSN 2190-5436. LCCN 2017948031. OCLC 7138124671. doi:10.1007/978-3-319-64425-7_5. 
Supplementary references:
  1. Zhang, Justine; Chang, Jonathan (June 13, 2018). “‘Conversations gone awry’—the researchers figuring out when online conversations get out of hand”. Wikimedia Blog (Interview). Interviewed by Melody Kramer; Dario Taraborelli. Wikimedia Foundation. Archived from the original on June 28, 2018. Retrieved June 28, 2018. 
  2. Bush, Jim (November 6, 2017). “Results of Wikipedia study may surprise”. Purdue News Service and Agricultural Communications (Press release). West Lafayette, Indiana: Purdue University. OCLC 7177119166. Archived from the original on June 28, 2018. Retrieved June 28, 2018. 

Wikimedia Research Newsletter
Vol: 8 • Issue: 06 • June 2018
This newsletter is brought to you by the Wikimedia Research Committee and The Signpost


03/07/2018-09/07/2018

Screenshot: MapRoulette 3 [1]

Mapping

  • Jeremiah Rose asks, on the tagging mailing list, if it is feasible to provide links to accessible restaurant menus. He suggests two possibilities for the tag syntax: url:menu=* or url:accessible_menu=*.
  • OpenStreetMap announced, via an official blog post, that Bing Streetside imagery is available in the online editor iD. Streetside is now the third street-level imagery source available in iD, alongside OpenStreetCam and Mapillary. The blog post says that Streetside will “probably become available” in other editors like JOSM. Before using this imagery, please note that the licence from Bing has certain restrictions.
  • Martijn van Exel has released the third version of MapRoulette. MapRoulette gives you a random task in OSM to work on. Since 2013, about 1.5 million tasks to improve OSM have been completed. Martijn is blogging about the new features in a series of posts. His first post covers the introduction of location, category and text-based filters.
  • On the German OSM forum (in German), a user came across a trap street on another map. What should OSM show? Let the cat out of the bag? An interesting discussion ensues.
  • If you like saunas, you might be interested in the new proposal, which tries to bring order to the current muddle.
  • A member of Amazon’s Logistics team has asked (de) in the German forum how best to bring missing turn restrictions to the attention of OSM mappers. They fear that OSM notes might be disregarded.

Community

  • On the talk mailing list, Frederik Ramm mentioned a scientific paper about imports into OSM and their effect on the local mappers’ communities. The answers mention the varied impact of imports around the world and in the history of OSM, and therefore exclude any universal recommendation for imports.
  • Pablo Sanxiao does a good job of explaining (es) (automatic translation) that OpenStreetMap is not actually a map, but rather a database filled with geospatial data, continuously updated and improved by volunteer mappers and other actors alike.
  • [1] Martijn van Exel also introduced his new tool Meet Your Mapper. It shows you the mappers and their number of contributions in any area based on a relation ID as a table. It also allows you to export the statistics.
  • Only 11% of the streets in Rosario (Argentina) are named after a woman.

OpenStreetMap Foundation

  • The OpenStreetMap web site and API will be read-only for around 30 minutes starting at 10:00 UTC on Sunday 15th July, while the services are migrated to an alternate data centre. (Tweet)
  • Cesium, a US-based company offering services around 3D maps and geospatial data analysis, announced that it is now a corporate OSMF member.

Events

  • FOSS4G Tokai will be held on Aug 24th and 25th at Aichi University.

Humanitarian OSM

  • HOT held Community Knowledge Sharing Webinars from July 11th to 16th to share the experiences of 15 HOT communities on the topics of gender, youth, advocacy, data integration and Leave No One Behind.

Maps

  • The uMap 1.0.0 release candidate was announced on the mailing list. Major changes are the merger of Leaflet.storage and django-storage, the move to Django 2.x (which no longer supports Python 2.x), an improved permission panel, easier customisation, filtering and Unicode markers.
  • A thread on the searchandrescue subreddit draws attention to the combined pitfalls of incomplete hiking trail data (in OSM, but not only there) and renderers’ grouping choices. The discussion was prompted by the case of hikers who followed a challenging hiking trail from OSM that the app they were using rendered exactly like an easy trail.

switch2OSM

  • As our colleagues from the OSM-US Twitter team have noticed, ESRI has created a vector map from OpenStreetMap data, styled to match OSM’s default carto style, which can be viewed online.
  • Bing Maps is now using building data from OSM in its proprietary maps. If you zoom in, you will see OSM-based buildings and an OSM attribution. If available, Bing uses height information from OSM to render 2.5D building models.

Open Data

  • Opendata.ch, the local Swiss chapter of the Open Knowledge Foundation, has hosted its annual conference. As the Swiss newspaper Netzwoche wrote (de), the topics included the usual challenges, open data politics and an outlook, but also lessons learnt from a project that struggles to finance itself sustainably because of on-off interest in the project.
  • Mateusz Konieczny has written a new tutorial on how to use Overpass-Turbo. It is intended to be useful even for people with no knowledge of OSM and no programming skills.

Software

  • Daniel from Mapbox wrote a diary post about how to run the full RoboSat pipeline on your own imagery, using drone imagery from the OpenAerialMap project and taking the area of Tanzania as an example.
  • Mateusz Konieczny describes how screenshots can be generated using a relatively simple Python script. Easy-to-update screenshots are likely to be useful in documentation such as tutorials.

Programming

  • Michael Spreng has started to implement a better integration of the Overpass API in uMap and is looking for help with the front-end.
  • Peter Karich from the company GraphHopper wrote a blog post about the visualisation of road network reachability with deck.gl. Although network reachability visualisation is nothing new, the blog post describes how you can create fancy visualisations of reachability yourself.

Did you know …

  • … the app OSMfocus? The open source app shows you the tag/value pairs of objects around you and makes it much easier to identify objects that need an update when passing by them. The source code was recently made available on GitHub and you can install the Android app from Google Play.

Other “geo” things

  • Ian Webster created Ancient Earth globe, a 4D visualisation of the Earth at various moments of the distant past, going back to 700 million years ago. Better not use this for mapping those coastlines.
  • Starting from iOS, third-party navigation apps can also use Apple’s car integration, CarPlay. Sygic announced the first such integration, which uses OSM as one of many sources. Mapbox also offers an integration via its Software Development Kit.
  • You might not do so anyway, but please do not write down names from door bell nameplates during mapping. Even your hand-written notes can be seen as a “file”, as the European Court of Justice ruled in a case against Jehovah’s Witnesses. 😉
  • The Australian website spatialsource wrote an article about MapXplorer’s new web-based un|earth:: app. The app allows you to browse through Landsat images and the changes over the last 32 years. The app supports simple image manipulation but is limited to Australia.

Upcoming Events

Where | What | When | Country
Brussels | OSMAnd route engine hacking | 2018-07-16 | Belgium
Cologne Bonn Airport | Bonner Stammtisch | 2018-07-17 | Germany
Lüneburg | Lüneburger Mappertreffen | 2018-07-17 | Germany
Moscow | Schemotechnika 17 | 2018-07-17 | Russia
Karlsruhe | Stammtisch | 2018-07-18 | Germany
Mumble Creek | OpenStreetMap Foundation public board meeting | 2018-07-19 | everywhere
Essen | Mappertreffen | 2018-07-21 | Germany
Tokyo | 東京!街歩き!マッピングパーティ:第21回 増上寺 | 2018-07-21 | Japan
Greater Manchester | More Joy Diversion | 2018-07-21 | United Kingdom
Nottingham | Pub Meetup | 2018-07-24 | United Kingdom
Dusseldorf | Stammtisch | 2018-07-25 | Germany
Lübeck | Lübecker Mappertreffen | 2018-07-26 | Germany
Milan | State of the Map 2018 (international conference) | 2018-07-28 to 2018-07-30 | Italy
Stuttgart | Stuttgarter Stammtisch | 2018-08-01 | Germany
Bochum | Mappertreffen | 2018-08-02 | Germany
Amagasaki | みんなのサマーセミナー:地図、描いてますか?描きましょう! | 2018-08-05 | Japan
Dar es Salaam | FOSS4G & HOT Summit 2018 | 2018-08-29 to 2018-08-31 | Tanzania
Buenos Aires | State of the Map Latam 2018 | 2018-09-24 to 2018-09-25 | Argentina
Detroit | State of the Map US 2018 | 2018-10-05 to 2018-10-07 | United States
Bengaluru | State of the Map Asia 2018 | 2018-11-17 to 2018-11-18 | India
Melbourne | FOSS4G SotM Oceania 2018 | 2018-11-20 to 2018-11-23 | Australia

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Nakaner, Polyglot, Rogehm, SK53, Spanholz, Spec80, SunCobalt, derFred, jinalfoflia.

July 13, 2018

Photo via Jess Wade, CC BY-SA 4.0.

By day, Dr. Jess Wade is a physicist best known for her work on “polymer-based, circularly polarising, light-emitting diodes.” But in the evenings (and on the weekends, and as other time permits) Dr. Wade is a strong advocate for increasing diversity and inclusion in STEM subjects, speaking at conferences and starting a campaign on Wikipedia to promote more early-career women scientists and women role models in STEM. Her Twitter account introduces the world to dozens of female scientists—people like molecular biologist Jenefer Blackwell and chemist Mande Holford—through their newly created Wikipedia pages. We reached out to Wade to learn more about her work on Wikipedia, where she’s currently working on a year-long project to craft a page a day about an “awesome underrepresented group working in science and engineering.”

How did your interest in adding biographies of women scientists to Wikipedia begin?

I’m tirelessly pro-equality, and spend all my free time working with the Institute of Physics and Women’s Engineering Society (WES) to increase the representation of women in physics. About two years ago on International Women in Engineering Day I met Dr. Alice White, a Wikimedian-in-Residence working at the Wellcome Collection in London. Alice was running a wikithon training members of WES on how to use Wikipedia—uploading stories from the wonderfully archived journal The Woman Engineer onto the world’s favourite encyclopedia.

Whilst the training only lasted a couple of hours, my enthusiasm lived on! A year or so later, Alice and I applied for some money to run wikithons in schools, learned societies and universities—teaching other people how (and why) to create biographies of women in science. It wasn’t having a huge impact though—a few pages here, a couple of stubs there—but when I woke up on January 1, 2018, I figured I’d set myself a challenge: one page about an awesome underrepresented group working in science and engineering every day that year.  I guess we’re about halfway through now!

How do you decide who to include? I’m also curious about the research you do for each bio.

Sometimes I am just scrolling through Twitter and read a tweet about or by someone who sounds interesting, and look them up. Because that’s what scientists do right? When we’re inspired by an idea or a person, we do some researching to learn more.

Well, what I was noticing was that most of the time the first search result wasn’t a Wikipedia page, but a university site, or a blog, or a TED talk. These aren’t just early career noobs like me—these were Professors, award-winners, inventors. So those kind of people are high on my list of who to make. Then there are fellows of learned societies and prize winners. Sometimes I just search through the faculty at particular universities—maybe ones I’m collaborating with or visiting—and sometimes I just see people do a talk or read their papers. There isn’t much logic.

The Twitter account @blackphysicists has helped me find heaps of African-American scientists who are wiki-worthy. Then I spend a couple of hours searching for impartial sources—citations written when they’ve won an award, news reports about their research, announcements when they receive promotion. There are heaps of newspapers archived online for free, which helps, especially when trying to work out women’s maiden names. I love when I can find where they went to school, what their parents did and who inspires them—that sounds weird I know—I think it helps readers understand that scientists are just normal people who found something they love.

What has the reception been on Twitter? With the scientists? Within the scientific community?

It has been great! I have to admit for every sad thing I read about the toxicity of social media and Twitter, there are heaps of positives on there too. Sometimes people send me suggestions and ask for me to make the pages of their professors. Sometimes teachers respond with how happy they were to learn to share a story with their class. When we hosted a wikithon at Imperial, where I work, lots of members of faculty emailed suggestions of people who deserved a page. I think the stream of friendly tweets has inspired other people too—editing Wikipedia is so easy and rewarding that everyone can get involved, whether it is from their lab, bedroom or office.

Have you run into any difficulties? If so, what?

At first I wasn’t great at sourcing impartial references—I relied on blogs or interviews, and as any experienced wiki editor will know, that won’t cut it when it comes to editorial review! (Editor’s note: For more information, see the English Wikipedia’s pages about verifiability and identifying reliable sources.) But I’ve learnt in the past six months. I get really sad when my pages get nominated for deletion—not because of the time investment, but because the people I’m creating are genuinely brilliant and don’t deserve to have their notability questioned. The notability criteria are a bit time-expired: academics are usually only deemed worthy if they are a Professor at an impressive university, have a high number of citations, or have won an internationally recognized prize.

Whilst that’s all well and good, it doesn’t take into account the amount of public engagement early career scientists are doing—they aren’t just sitting in their labs but writing books, newspaper articles, and doing TED talks. The notability criteria for early career faculty who are significant public figures haven’t been written yet. I’ve recently become “autopatrolled”, so I don’t have to have my pages reviewed, but before then I noticed that it would take longer for the pages of African-American women to get reviewed. So I just got better at making their pages to avoid any criticism.

How does your Wikipedia work fit into the larger outreach work you do?

How does it fit into my life would be a better question! Really I talk about it with all the schools and academics I meet, encouraging them to share biographies and images on Wikipedia. I’m trying to get more scientists to upload diagrams and lab photos onto Wikicommons. The overarching mission is to help the Institute of Physics on their decade-long campaign to increase girls’ participation in physics—their Improving Gender Balance project has just started to show big improvements when the whole school is on board with tackling gender stereotyping. Most outreach activities are aimed just at girls (the big “Girls in STEM” drive that seems to be worldwide at the moment), but won’t have any lasting effect unless parents and teachers are involved as well. The IOP have shown that working with teachers from every department (from psychology to physics to geography) has the most chance of helping girls recognise their ability and potential. We also need to improve careers advice, so girls and boys get up-to-date insight about the kinds of things people with degrees in physics and maths do. Wikipedia pages help a lot there!

What’s been your favorite biography to work on? Why?

Gladys West, the African American woman born in the ‘30s who did the original mathematics for GPS. She didn’t even know she was doing it! I saw a tweet about her in Black History Month, made the page that evening, and then a few months later she got picked by the BBC as one of their 100 women.

My first proper page was Kim Cobb, a climate scientist at Georgia Tech who studies corals. I’ll always look to her as an inspiration—she is a professor with 4 kids who still finds time to travel the world to collect samples with her lab. Roma Agrawal, the structural engineer from Mumbai who did A-Levels in London before completing a degree in physics—she’s great. Roma was responsible for designing and building the top of the Shard (the tallest building in London) – and has recently written her first book, Built. It’s amazing how much of their lives I can remember! I’m writing this on a plane with no wi-fi.

Is there anything else you’d like to add?

I am just grateful to every editor at Wikipedia for being patient whilst I found my feet and helping me out when I make spelling mistakes and grammatical errors. I’m forever in debt to Alice White, who first showed me how to wiki, then helped me to show others.

Melody Kramer, Senior Audience Development Manager, Communications
Wikimedia Foundation

Elysia Webb
Image: File:Elysia Webb headshot.jpg, Enwebb, CC BY-SA 4.0, via Wikimedia Commons.

Elysia Webb first started editing Wikipedia as a graduate student at the University of Florida in January 2017. She improved the Wikipedia article about the Florida bonneted bat in Emily Sessa’s Wiki Education-supported course, Principles of Systematic Biology. A few months later, she reflected on our blog about the experience, writing,

“The semester that I signed up for my Wikipedia account, I surpassed a thousand edits. Now that I knew how to edit articles and had the tools to do so, I had no excuse not to.”

Now, more than a year later, Elysia continues to contribute to Wikipedia. Since that January, she has made more than 10,000 edits and has created more than 40 new articles. Elysia cites a few different reasons for why she’s stuck around. She feels a responsibility to improve articles that need work and she has the expertise to do it well.

“I have continued to edit Wikipedia because there continue to be articles that need someone’s input,” she tells Wiki Education in an interview. “I primarily work on bat articles, for example. Before I began editing Wikipedia, 3 out of 4 bat-related articles were ‘stub class’ or had only three sentences or fewer. They were really non-informative and frankly did not do justice to the fascinating lives that bats live and the impact that they have on both the natural world and ours. With the edits I’ve made over the past 18 months, about 57% of bat articles are currently stub class, which is really a significant improvement. I’ve also created articles for over 40 species of bats that were previously not represented on Wikipedia at all. However, that still leaves me with almost 900 articles that lack nearly all content and dozens more missing articles that need to be written. I’ve got my work cut out for me, certainly, and it’s the kind of self-appointed project that will be measured in years, not weeks. I try to focus on short-term goals, though, to keep this Herculean task manageable, such as creating 5 missing articles each month.”

We’ve heard from instructors that among the many student learning objectives achieved through a Wikipedia assignment, students feel inspired to delve deep into their self-chosen topics. They also gain confidence in the self-directed process that is Wikipedia editing. At the basis of being a Wikipedian is the commitment to free knowledge, as well as a love of learning.

“I find that the more I edit, the more I learn about editing, and I’m able to make more impactful changes,” says Elysia. She has refined her editing process since learning how to improve Wikipedia from Wiki Education’s resources, and has created new tools to become an even more effective contributor.

“I set up an automated system to determine which bat-related articles get the most page views each month. Once I had that up and running, I could make targeted improvements to the content that people wanted to learn more about. I also feel more comfortable making bigger and bolder changes as I’ve gained experience. It wasn’t until 10 months after the conclusion of my Wiki Education course that I brought my first article to ‘Good Article’ status. There’s a lot of work that goes into creating a ‘Good Article,’ and you certainly have to handle a lot of criticism and feedback to make a polished product.”
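The post does not say how Elysia’s system is built. As a rough illustration of the ranking step only, here is a minimal Python sketch that assumes the monthly view counts have already been collected into a dictionary. The function name `top_viewed` and the sample numbers are hypothetical, not real page-view figures.

```python
from typing import Dict, List, Tuple


def top_viewed(monthly_views: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most-viewed articles, highest first (ties ordered by title)."""
    ranked = sorted(monthly_views.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Hypothetical sample data for illustration only.
sample_views = {
    "Indian flying fox": 51000,
    "Florida bonneted bat": 4200,
    "Vampire bat": 38000,
    "Little brown bat": 9100,
}

for title, views in top_viewed(sample_views, n=2):
    print(f"{title}: {views}")
```

Sorting on the negated count with the title as a tiebreaker keeps the output stable from month to month, which makes it easier to spot articles that have newly entered the top of the list.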

Elysia’s userpage on Wikipedia shows her areas of expertise, WikiProjects she’s involved in, and other user information.

A big part of what keeps her on Wikipedia, Elysia says, is her involvement in WikiProjects.

“WikiProjects are associations of editors who enjoy working on the same subjects. I’ve become particularly involved—and somewhat akin to a groundskeeper—at the Bats Task Force, which is quite young.”

The Bats Task Force is made up of four Wikipedians so far, including Elysia. It’s a localized page on Wikipedia that tracks which bat articles have been receiving attention lately; lays out monthly goals for creating and improving certain articles; and presents a suggested layout for what information an article about a bat should include. It’s a great resource for future biology students that Elysia takes great care to maintain.

“I would advise future students to edit what they’re passionate about! Find a WikiProject and ask how you can help,” she says.

Editing Wikipedia as an assignment situates student learning in wider contexts. Students participate in knowledge production, bringing academic information (often restricted behind paywalls) to millions worldwide. Students experience increased motivation when their work has an impact outside of the classroom. They understand that their work can be accessed by any reader of English Wikipedia around the world, and feel a responsibility to get it right.

And their work can really make a difference. For example, Elysia improved the article about the Indian flying fox shortly before a Nipah virus outbreak in India.

“The flying foxes can transmit Nipah virus, and sure enough, I went to the article view history and the page views jumped dramatically from around 100 per day to up to 1,700 per day during this outbreak. It made me thankful that I had included information on diseases in the article so that people could quickly get access to the facts instead of having to sift through the dozens of research publications that I used to write the article. Preventing virus transmission can be simple and inexpensive based on the data I used to expand the article; maybe people will favor those options if they simply know that they exist instead of culling the flying foxes. While I try to pace myself in editing so that I don’t get burnt out, I’ve come to espouse the Wikipedian philosophy that ‘the deadline is now.’ The content should be written with haste and accuracy so that it exists when the need arises.”

“I also contribute in other areas from time to time based on current events and what I feel people want to know about. Immediately following the death of Barbara Bush, for example, I wrote the article for her daughter Robin, who died at 3 years old. Many people were just hearing about Robin for the first time and wanted to know more about her; the page I wrote got a quarter-million views in the month after Barbara’s death. Occasionally, I’ll come across an influential and notable woman who doesn’t have a Wikipedia article and take it upon myself to write it to combat content gender bias. Recently I wrote the article for Sherry Johnson, an incredible woman who campaigns against child marriage in the United States.”

Improving Wikipedia speaks to a big picture mission: making accurate knowledge accessible to the world.

“I enjoy editing, and I’m always surprised at how much I learn in the process of creating content for the public,” says Elysia. “It also feels altruistic—like I’m performing a public service. If someone can read an article I’ve expanded and learn that bats provide economic services to humans or I can dispel a dangerous myth that they’ve held about bats, then maybe they will care more about them and their conservation.”

“As a researcher, I acknowledge that the number of everyday people who will read a manuscript with my name on it is infinitesimal. Research is so important, of course, but it’s literally inaccessible in that it’s behind paywalls and it’s otherwise inaccessible in that it contains so much jargon that it’s unintelligible. But, thousands of people have read the articles I’ve written or expanded, and I’m building them from that same data and research written by scientists. I feel like a conduit or an interpreter; the work I do on Wikipedia feels important.”

When asked what role she thinks Wikipedia should play in higher education, Elysia says they’re a “match made in heaven.”

“Everyone wins—the students benefit by learning about scientific writing for a wider audience, and the public wins because they can access more complete and useful information. We know that pretty much everyone with internet access has used Wikipedia at some point. When Wikipedia is more reliable, thorough, and neutral, society is better for it.”


Interested in teaching with Wikipedia? Visit teach.wikiedu.org or reach out to contact@wikiedu.org with questions. Also check out Elysia’s reflections from a year ago on our blog.


Image: File:Indian flying fox cropped.jpg, Manojiritty, CC BY-SA 4.0, via Wikimedia Commons.

July 12, 2018

Many of the millions who visit Wikipedia every month are looking for information to understand their current political climate. They may be researching candidates or policies related to upcoming midterm elections. Or they may be looking to understand how our history has shaped present day affairs. Encyclopedias and other accessible knowledge resources (like newspapers) are vital to a functioning democracy. They are tools to ensure that everyone feels empowered, prepared, and well-informed enough to participate in democratic functions that protect their rights.

However, even widely used resources like Wikipedia contain gaps in their content. These gaps arise primarily due to who is fleshing out content on the site. Wikipedia is a platform that relies upon the hard work of volunteers, most of whom are young men who work on topics of interest to them. These editors don’t necessarily have access to academic databases or journals, leaving a number of gaps when it comes to academic topics. That’s why higher education students are so well-positioned to become new editors. They can use their access to academic sources to apply course concepts in a worldwide context, helping to make a resource that we all look to for information more comprehensive.

When mainstream media and politics don’t reflect the diversity of the people they are supposed to represent, alternative platforms for discourse must arise. La Raza is one such platform, a Chicano newspaper and magazine that developed in East Los Angeles in 1967 and ran for ten years. Before Dr. David-James Gonzales’ student at the University of Southern California created the Wikipedia article, the history and legacy of the newspaper wasn’t documented at all on the site. As the student writes in the article’s lead section, La Raza provided a space for Chicano activists “to document the abuses and inequalities faced by Mexican-Americans in Southern California.” Staff employed a photojournalistic approach, capturing images of “police brutality, segregation, and protests that rallied support to the Chicano cause.” The paper advocated educational reform and improved conditions for minority students in Los Angeles. It served as a vehicle for community organization, founding groups like the Barrio Union for Scholastic Community Action (BUSCA), which sought to engage Chicano parents and community leaders in improving education for Chicano students. La Raza also played a role in developing El Partido de la Raza Unida, a political party centered around Chicano nationalism.

Dr. Gonzales’ course, Borderlands in a Global Context, explores migration, migrant rights, indigenous sovereignty, transnational labor, and how national identity forms around the metaphor of border and borderlands. Over the course of the term, his students added almost 70,000 words to a variety of articles on Wikipedia. Those articles have received 2.4 million views since.


Interested in teaching with Wikipedia? Visit teach.wikiedu.org to get started or reach out to contact@wikiedu.org with questions.


Image: File:International newspaper, Rome May 2005.jpg, Stefano Corso, via Wikimedia Commons.

Photo by Matthew Henry via Unsplash, CC0.*

Most of the time, Wikimedians work together in healthy, collaborative ways. However, there are times when Wikimedia volunteers and administrators have to deal with problematic behaviour in their communities. This work can be very difficult; disputes may start in one place, and move to another. They may simmer over long periods of time, occasionally boiling over then cooling off again. This can make the work of reviewing a situation time-consuming and frustrating.

The Wikimedia Foundation’s Anti-Harassment Tools team has been working with Wikimedia administrators and contributors (English, Meta-Wiki) to identify areas where tools can help make difficult jobs less time-consuming and discouraging. The interaction timeline tool is a useful addition to the suite of tools available for looking at inter-user disputes.

The interaction timeline is a way to look at two contributors’ editing history—where they have interacted, when, and how often. It offers a clean visual summary of connections between two users. This can help add clarity when reviewing reports of harassment and abuse, and takes some of the burden off both the people reviewing problems, and the people reporting them.

To use this tool, first choose the project you would like to look at. Then enter the usernames of the two accounts you would like to see interactions for. If you have a date range you would like to look at specifically, use the date selector. From here, the tool will give you a timeline of all pages where the two accounts have intersected, as well as links to the difference page showing each edit. It can tell you how much time has elapsed between edits, as well as the bytes added or removed.
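To make the idea concrete, here is a minimal Python sketch of how such a timeline could be assembled. It is not the tool's actual implementation: the record layout (`user`, `page`, `timestamp`, `bytes`), the function name `interaction_timeline`, and the sample edits are all assumptions made up for illustration.

```python
from datetime import datetime


def interaction_timeline(edits_a, edits_b):
    """Merge two users' edits, keeping only pages that both have edited.

    Each edit is a dict with hypothetical keys: user, page, timestamp
    (a datetime) and bytes (size change, positive or negative).
    """
    shared_pages = {e["page"] for e in edits_a} & {e["page"] for e in edits_b}
    merged = sorted(
        (e for e in edits_a + edits_b if e["page"] in shared_pages),
        key=lambda e: e["timestamp"],
    )
    timeline = []
    previous = None
    for edit in merged:
        # Time elapsed since the previous edit in the merged timeline.
        gap = None if previous is None else edit["timestamp"] - previous["timestamp"]
        timeline.append({**edit, "since_previous": gap})
        previous = edit
    return timeline


# Invented sample data for two hypothetical accounts.
alice = [
    {"user": "Alice", "page": "Bat", "timestamp": datetime(2018, 7, 1, 12, 0), "bytes": 120},
    {"user": "Alice", "page": "Fox", "timestamp": datetime(2018, 7, 2, 9, 0), "bytes": -15},
]
bob = [
    {"user": "Bob", "page": "Bat", "timestamp": datetime(2018, 7, 1, 12, 30), "bytes": -120},
]

timeline = interaction_timeline(alice, bob)
for entry in timeline:
    print(entry["timestamp"], entry["user"], entry["page"], entry["bytes"], entry["since_previous"])
```

Only the page both accounts edited ("Bat") appears in the result; the edit to "Fox" is dropped because the second account never touched that page.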

Currently, a way to generate wikitext about the tool’s results is under development. You can suggest ideas about the best way to summarize the information for better noticeboard harassment reports. This tool has several other feature requests in Phabricator; if you have a suggestion for development, or are interested in helping translate the interface to more languages, please drop by the Phabricator task board for the tool.

If you are one of the many people who help deal with disputes, harassment and problems between users on Wikimedia projects, your work is invaluable. Hopefully this tool can reduce the time and strain that comes from reviewing long talk pages and contribution histories. The editor interaction analyser, Intertwined, and Intersect can be used in conjunction with the interaction timeline to gain a fuller picture than one tool alone.

Please add the timeline to your toolbox, and let us know what features or fixes you would like to see in the future!

Sydney Poore, Trust and Safety Specialist
Wikimedia Foundation

*Unsplash’s licensing terms changed on 5 June 2017. This photo was published on 25 April 2016.

Having listened to a YouTube presentation on Article Placeholder, I am seriously disappointed. There are a few statements in there that show a lack of understanding of the functionality of the Reasonator. It is dismissed for all the wrong reasons and as a result there are a lot of missed opportunities.

What is missed is that Reasonator, as it is, provides superior representation in any language. It is a tool that helps with missing labels from within the tool. Missing descriptions in Reasonator do not need to be a problem; there is automated functionality that has shown its merits in many languages. Do compare the representation of Wikidata data, and the structured representation will be seen to be richer, with the inclusion of maps, images and data linked to the subject in question.

What is particularly galling is that Reasonator is dismissed because "it is an external tool". Before work on the Article Placeholder started, it would have been easy enough to adopt functionality as provided by this external tool and it would not have been an external tool, an obvious argument AFTER the fact.

Where Reasonator provides texts, it is done based on little scripts. This is seen as problematic, as it is seen as a drain on the community. Templates, on the other hand, may be a part of the Article Placeholder, and they have the same problem.

For me the bottom line is not so much about the Article Placeholder but the lack of usability of Wikidata. It is only because of Reasonator that it is easy and obvious to work on the subjects I work on. I have not spent hours learning how to query; Reasonator instantly provides me with the results in any context, like the missing "Districts of Djibouti".
Thanks,
        GerardM


July 11, 2018

Today I released two new json files: one file with demographics data from World Bank, and a second file with a subset of the first, augmented with Wikimedia page view counts.

Both complement the visualization Wikipedia Views Visualized (aka WiViVi), but can be used in other contexts as well.

1) World Bank demographics data

This file world-bank-demographics.json resulted from harvesting World Bank API files.

It contains yearly (!) figures for four metrics (more could be added rather easily):

– population counts,
– percentage internet users,
– percentage mobile subscriptions,
– GDP per capita.

I used this demographics file to publish a set of charts (many more on meta).

[Charts: World Bank internet users per 100, by region; World Bank mobile subscriptions per 100, by region]

Details: World Bank files have different formats (some csv, some json) and use a variety of indexes (some use ISO 3166-1 alpha-2 codes, others ISO 3166-1 alpha-3). Script 1) first does normalization, then data are aggregated, filtered, indexed.
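As a rough illustration of the normalize, filter and index steps (aggregation is left out), here is a minimal Python sketch. It is not the actual script: the record layout, the function names `normalize_code` and `build_index`, the tiny lookup table, and the sample values are all assumptions invented for this example.

```python
from collections import defaultdict

# Small illustrative subset of an ISO 3166-1 alpha-2 -> alpha-3 lookup;
# a real script would need the complete table.
ALPHA2_TO_ALPHA3 = {"IN": "IND", "NL": "NLD", "US": "USA"}


def normalize_code(code):
    """Return an alpha-3 code, or None if an alpha-2 code is unknown."""
    code = code.strip().upper()
    if len(code) == 3:
        return code
    return ALPHA2_TO_ALPHA3.get(code)


def build_index(records, metrics):
    """Index values as index[country][metric][year], skipping unusable rows."""
    index = defaultdict(lambda: defaultdict(dict))
    for rec in records:
        country = normalize_code(rec["country"])
        # Filter: unknown country, unwanted metric, or missing value.
        if country is None or rec["metric"] not in metrics or rec["value"] is None:
            continue
        index[country][rec["metric"]][rec["year"]] = rec["value"]
    return index


# Invented sample records (placeholder values, not World Bank figures).
records = [
    {"country": "in", "metric": "population", "year": 2016, "value": 1.0},
    {"country": "IND", "metric": "population", "year": 2017, "value": 2.0},
    {"country": "NL", "metric": "gdp_per_capita", "year": 2016, "value": 3.0},
    {"country": "XX", "metric": "population", "year": 2016, "value": 4.0},
    {"country": "US", "metric": "population", "year": 2016, "value": None},
]

index = build_index(records, metrics={"population", "gdp_per_capita"})
print(sorted(index))
```

Rows keyed by either code style end up under the same alpha-3 key, so the two "India" records above land in one entry.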

2) Global Wikipedia page views data, incl demographics data

This file datamaps-data.json contains the equivalent of 3 rather complex (*) csv files which feed WiViVi. This new format brings together demographics data and pageviews (by country, by region, and by language), and also adds additional meta info. This json format is meant for external use, as it’s much easier to parse for some than the 3 csv files which WiViVi uses itself (the csv files use nested delimiters).
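To show why nested delimiters are harder to consume than json, here is a small Python sketch. The row layout below (comma between fields, `|` between language entries, `:` between language and share) is a made-up stand-in, not the real WiViVi csv format, and the field names and numbers are invented as well.

```python
import json


def parse_nested_row(row):
    """Parse one hypothetical csv row that uses three levels of delimiters."""
    country, region, languages = row.split(",")
    shares = {}
    for pair in languages.split("|"):
        lang, value = pair.split(":")
        shares[lang] = float(value)
    return {"country": country, "region": region, "views_by_language": shares}


# Invented example row; every consumer of this format needs a custom parser.
row = "NL,Europe,nl:60|en:35|de:5"
record = parse_nested_row(row)

# The same information as json can be read back with a single standard call.
as_json = json.dumps(record)
restored = json.loads(as_json)
print(restored["views_by_language"]["nl"])
```

With json the nesting is explicit in the format itself, so consumers do not need to know which delimiter applies at which level.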

Notes:
A) Json file 1) replaces two csv files which up to now were filled from Wikipedia pages, one on population counts, one on internet users.
B) Although Wikipedia lists nowadays also use World Bank data, this is not consistently done, see talk pages here and here (sections ‘Wikipedia vs World Bank’).
C) For scripts and data files see GitHub: 1) here and 2) here 
