In late 2023, an article claimed that Wikipedia is one of the fastest websites in the USA. Flattering, right? Having measured web performance for over a decade, I couldn't help but wonder: How did they measure that? How do you know that Wikipedia is one of the fastest websites? The article does not say anything about how the measurement was done.
I went to the Web Performance Slack channel (yes, there's a dedicated place where web performance geeks hang out) and asked:
“Has anyone seen the data or the actual “study” done by DigitalSilk about the fastest loading US websites? https://www.technewsworld.com/story/craigslist-wikipedia-zillow-top-list-of-fastest-us-websites-178713.html - I can only find references to it and a screenshot, nothing else?”
Not providing references? That's not Wikipedia! We're all about citations and verifiable sources. No one on the Slack channel knew anything about how the test was run. But then, one of the channel members took action: Stoyan Stefanov emailed the journalist and actually got an answer!
Methodology
“The most visited websites based on web traffic were run through Google's PageSpeed Insights tool, to find out how long it takes for each site to load in full on average”
So, while it's flattering to see Wikipedia crowned as one of the fastest websites based on Google's PageSpeed Insights tool, I couldn't help but feel a bit tricked. They seemed to rely on the onload metric, a metric that the web performance world has regarded as outdated and poorly correlated with user experience since 2013.
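To make that concrete, the onload time is essentially the value you get from the standard Navigation Timing API in the browser. A minimal sketch:

```typescript
// Minimal sketch: reading the classic "onload" time in the browser using the
// standard Navigation Timing Level 2 API.
window.addEventListener('load', () => {
  // Defer one tick so loadEventEnd has been recorded after the load event.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType('navigation') as PerformanceNavigationTiming[];
    if (nav) {
      // loadEventEnd is measured from the start of the navigation, so this is
      // the "load in full" time the old onload metric describes.
      console.log(`onload: ${Math.round(nav.loadEventEnd)} ms`);
    }
  }, 0);
});
```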
Understanding the limitations of the onload metric, let’s shift our focus to modern metrics that better reflect real-world user experience: Google Web Vitals.
Google Web Vitals
Google Web Vitals is Google's initiative to focus on the metrics that matter to users; the metrics also feed into Google's core ranking system. Unlike the old-school onload time, the Web Vitals measure aspects of the real-world user experience.
At the moment there are three core metrics:
- Largest Contentful Paint (LCP) - when the largest element is painted on the user's screen. For Wikipedia that is very often a paragraph, but sometimes it's an image or a heading.
- Interaction to Next Paint (INP) - measures the responsiveness of the page, meaning how quickly the page responds to user interactions. For Wikipedia, responsiveness depends on the amount of JavaScript we ship and the work done in our event listeners and click handlers.
- Cumulative Layout Shift (CLS) - measures the visual stability of the page, meaning whether content moves around while it loads. For Wikipedia this happens when campaigns run and the content of the page is moved.
Google also has two other Web Vitals, metrics that are important for the user experience but not listed as core:
- Time to First Byte (TTFB) - measures the time between the request for a resource and when the first byte of the response begins to arrive. For Wikipedia, TTFB depends on where users are in the world and how far away the closest data center is.
- First Contentful Paint (FCP) - measures the time from when the user first navigated to the page until any part of the page's content (text or images) is painted on the screen. For Wikipedia this is usually text.
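If you want to observe these metrics on your own pages, Google's open source web-vitals library exposes a callback per metric. A minimal sketch; the /beacon endpoint is a hypothetical collection URL:

```typescript
// Minimal sketch: collecting the Web Vitals in the browser with Google's
// open source `web-vitals` library. The /beacon endpoint is hypothetical.
import { onLCP, onINP, onCLS, onTTFB, onFCP, type Metric } from 'web-vitals';

function report(metric: Metric): void {
  // metric.value is in milliseconds for all of these except CLS, which is a unitless score.
  navigator.sendBeacon('/beacon', JSON.stringify({ name: metric.name, value: metric.value }));
}

onLCP(report);
onINP(report);
onCLS(report);
onTTFB(report);
onFCP(report);
```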
Google pays special attention to the 75th percentile of those metrics. That percentile was chosen because "First, the percentile should ensure that a majority of visits to a page or site experienced the target level of performance. Second, the value at the chosen percentile shouldn't be overly impacted by outliers." But what does the 75th percentile mean for us at Wikipedia?
The 75th percentile at Wikipedia
Now let’s put the 75th percentile into perspective by applying it to Wikipedia’s vast global audience.
Imagine that 100 people visit Wikipedia. Each person gets a different user experience because of their device, their internet connection and how we build Wikipedia. For some users the experience will be really fast, for others it will be slower.
The 75th percentile focuses on the worst experience among the best 75%. If you take all 100 users and sort their experiences from fastest to slowest, the 75th experience is where you draw the line. This means that 75% of the users had a better or equal experience and 25% had a worse one. So, how many users are in that 25% for us? We measure unique devices rather than users, so let's use that.
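As a small illustration, here is a nearest-rank sketch of picking the 75th percentile from a list of measurements (real tooling may interpolate slightly differently):

```typescript
// Nearest-rank sketch of a percentile: sort the samples and pick the value
// below which at least `p` percent of the samples fall.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// 100 imaginary LCP measurements in milliseconds: the 75th value in the sorted
// list is the 75th percentile, and the 25 slower ones sit above the line.
const lcpSamples = Array.from({ length: 100 }, () => 500 + Math.random() * 3000);
console.log(`p75 LCP: ${Math.round(percentile(lcpSamples, 75))} ms`);
```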
Well, for Wikipedia, those 100 users are actually 1.5 billion unique devices per month and 24 billion page views.
That means if we look at the 75th percentile and see a metric move, we know that at least 6 billion page views per month (24 billion × 0.25) are affected, along with 375 million unique devices (1.5 billion × 0.25).
That is a lot of devices. Suppose we have a regression of just 100 milliseconds at the 75th percentile. That means at least 375 million devices experience the delay. Collectively, those users wait an extra 434 days. Yes, over a year of extra wait time for the users with the worst experience, because of a (tiny) 100 ms change.
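The back-of-the-envelope arithmetic looks like this:

```typescript
// Back-of-the-envelope impact of a 100 ms regression at the 75th percentile.
const uniqueDevicesPerMonth = 1_500_000_000;
const affectedDevices = uniqueDevicesPerMonth * 0.25;    // 375 million devices
const regressionMs = 100;

const extraWaitMs = affectedDevices * regressionMs;      // total extra waiting
const extraWaitDays = extraWaitMs / 1000 / 60 / 60 / 24; // ≈ 434 days
console.log(`${affectedDevices.toLocaleString()} devices wait ${Math.round(extraWaitDays)} extra days`);
```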
Is the English Wikipedia the fastest website in the USA according to Google Web Vitals?
With the metrics Google collects from different websites, you can compare sites with each other! The metrics are available per domain (not per user country), so we cannot check whether the English Wikipedia is one of the fastest websites in the USA, but we can compare the English Wikipedia against other websites with users all around the world.
However, before we do that, I want to point out that "Is the English Wikipedia the fastest website in the USA according to Google Web Vitals?" is a rather exclusionary question to ask, since:
- The English Wikipedia is used in more places than the USA
- There are many Wikipedias in other languages out there, and we should not only focus on the English Wikipedia. We need to make sure that everyone, independent of language, gets the same user experience.
Looking just at "Are we fast in the USA?" leaves out a big part of the world. So today we are going to look at the English Wikipedia compared to other websites, and then at Wikipedias all around the world, to see what kind of user experience all users get.
But first, let's talk about how Google categorises these experiences as good, needs improvement, or poor by setting specific thresholds for each metric. With Google's definitions we can see how many of our users have each kind of experience. In the data I will show, green means good, yellow means needs improvement and red means a poor experience.
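For the core metrics, Google's published thresholds are 2.5 s / 4 s for LCP, 200 ms / 500 ms for INP and 0.1 / 0.25 for CLS. A small sketch of that classification:

```typescript
// Google's published thresholds: the first number is the upper bound for "good",
// the second the upper bound for "needs improvement" (anything above is "poor").
// LCP and INP are in milliseconds, CLS is a unitless score.
const thresholds = {
  LCP: [2500, 4000], // good ≤ 2.5 s, poor > 4 s
  INP: [200, 500],   // good ≤ 200 ms, poor > 500 ms
  CLS: [0.1, 0.25],  // good ≤ 0.1, poor > 0.25
} as const;

type Rating = 'good' | 'needs improvement' | 'poor';

function rate(metric: keyof typeof thresholds, value: number): Rating {
  const [good, needsImprovement] = thresholds[metric];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs improvement';
  return 'poor';
}

console.log(rate('LCP', 1800)); // "good"
```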
We collect all the data that is available through the Chrome User Experience (CrUX) API and you can see it in our Chrome User Experience dashboard. There are a lot of metrics, so I will focus on just Largest Contentful Paint today.
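The same data behind the dashboard can be queried straight from the CrUX API. A minimal sketch, where CRUX_API_KEY is a placeholder for your own key:

```typescript
// Minimal sketch: query the Chrome User Experience Report (CrUX) API for the
// English Wikipedia's mobile LCP. CRUX_API_KEY is a placeholder.
const CRUX_API_KEY = 'YOUR_API_KEY';

async function queryLcp(origin: string): Promise<void> {
  const response = await fetch(
    `https://chromeuserexperiencereport.googleapis.com/v1/records:queryRecord?key=${CRUX_API_KEY}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        origin,
        formFactor: 'PHONE',
        metrics: ['largest_contentful_paint'],
      }),
    }
  );
  const data = await response.json();
  const lcp = data.record.metrics.largest_contentful_paint;
  // p75 in milliseconds plus a three-bin histogram (good / needs improvement / poor).
  console.log(`p75 LCP: ${lcp.percentiles.p75} ms`, lcp.histogram);
}

queryLcp('https://en.wikipedia.org');
```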
First, let's look at the actual 75th percentile Largest Contentful Paint. We compare against a couple of other websites. Lower numbers are better, and green means good. We will start with the numbers for mobile.
Mobile
This graph highlights that Wikipedia's mobile LCP performance is nearly as fast as Google's, which is quite remarkable!
We can also look at how many of our users have a slow/bad experience.
Wow, we can see that we have a smaller percentage of users with a poor experience than the rest of the sites. However, the graph still shows a small percentage of mobile users experiencing suboptimal LCP. For a website of Wikipedia's scale, that small percentage translates into millions of users, so we need to be even better!
I wonder if it's the same for desktop users? Let's look at the 75th percentile again.
Desktop
Again we can see that Wikipedia is almost the fastest, outperforming many major websites! We seem to be fast on both mobile and desktop.
Yes we are really fast! Can we open the champagne and celebrate?
Are we the fastest site known to humankind?
Well, I would take it a little easy before we start to brag. Do you remember how we calculated how many users are left out when we use the 75th percentile? I would be careful with a website with so many users. I would rather say: "The English Wikipedia is really fast compared to other websites, looking at Largest Contentful Paint at the 75th percentile for the Chrome users that Google collects metrics from".
Another way of looking at the data we get from Google is to see how many users have a poor experience using Wikipedia. By adding up the users with a needs improvement experience and those with a poor experience, we can see what percentage of users we need to move to a good experience.
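Using the histogram the CrUX API returns (see the earlier sketch), that percentage is simply everything outside the first, good, bin. The numbers below are made up for illustration:

```typescript
// Sketch: the share of users with a non-good experience is everything outside
// the first ("good") bin of the CrUX histogram. Bin densities sum to roughly 1.
interface HistogramBin {
  start: number;
  end?: number;
  density: number;
}

function nonGoodShare(histogram: HistogramBin[]): number {
  // Bin 0 = good, bin 1 = needs improvement, bin 2 = poor.
  const [, needsImprovement, poor] = histogram;
  return (needsImprovement.density + poor.density) * 100;
}

// Hypothetical LCP histogram: 92% good, 5% needs improvement, 3% poor.
const exampleHistogram: HistogramBin[] = [
  { start: 0, end: 2500, density: 0.92 },
  { start: 2500, end: 4000, density: 0.05 },
  { start: 4000, density: 0.03 },
];
console.log(`${nonGoodShare(exampleHistogram).toFixed(1)}% of users need a better experience`);
```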
First, let's look at Largest Contentful Paint again for desktop users. This time we look at the percentage of users per wiki that have a non-good experience.
And then we look at the same for mobile.
We can see that on desktop and mobile we have Wikipedias where we as developers have work to do to give more users a good experience.
As a last example, I want to share the Interaction to Next Paint data for mobile. This is interesting because here JavaScript comes into play, and there are many things we can do on our side to give the user a better experience.
We see that for almost every Wikipedia, 5% of the users have a not-so-good user experience.
Summary
Wikipedia's performance story is one of scale and precision. By focusing on Google Web Vitals, we've seen how milliseconds of delay can impact millions of users. Metrics like Largest Contentful Paint (LCP) and Interaction to Next Paint (INP) can provide valuable insights into real-world user experiences, guiding us to optimize for both mobile and desktop users.
With billions of page views monthly, even the smallest regressions in performance ripple across the globe. Yet Wikipedia stands as a benchmark of speed, rivaling even the likes of Google. This achievement underscores the importance of continuous monitoring, fine-tuning, and maintaining a user-first perspective in web development.
As we celebrate, we also need to acknowledge the challenges. Moving the needle for those users with "non-good" experiences remains our mission. By using data and ongoing analysis, we can ensure that Wikipedia stays fast, accessible, and enjoyable for everyone, everywhere.