Today’s The Fast and the Curious post covers how Chrome achieved best-in-class Speedometer scores on mobile devices, resulting in faster and smoother web experiences for Android users.
Chrome has always been about speed. Whether it’s loading pages quickly, running complex web apps smoothly, or delivering a seamless browsing experience, performance is at the heart of our browser. And we’re always looking for ways to make Chrome even faster.
Over the last two years, we have been hard at work on a number of performance improvements for Android devices. We’re excited to share some of the progress we’ve made.
Speedometer on Android
One of the key metrics we use to track Chrome’s performance is the Speedometer benchmark. This benchmark is developed in collaboration with other major web browser engines and measures how quickly Chrome can complete interactions with web pages, including parsing/rendering HTML or CSS and running JavaScript.
Since the release of Chrome M112, we’ve seen a significant increase in Speedometer 2.1 scores on Android devices [1]. In fact, on many devices, scores more than doubled, with the newest Snapdragon® 8 Elite Mobile Platform setting new records for Speedometer performance on mobile devices! These huge accomplishments are a testament to the work not only of the Chrome and Android teams, but also our silicon and SoC partners.
Since Chrome M112, Speedometer 2.1 scores have more than doubled on many Android devices. [1]
How Did We Do It?
The improvements resulted from several changes, including:
- Build optimizations: We’ve made a number of changes to the way Chrome is built, which has resulted in faster code execution tuned to modern premium Android devices and SoCs.
- V8 and Blink improvements: Many improvements to the JavaScript engine (V8) and the rendering engine (Blink) have further boosted performance.
- Scheduling, OS and SoCs: We worked closely with Android partners to optimize the way Chrome interacts with the operating system and its thread scheduling to make the best use of the silicon on the devices.
Let’s take a closer look at each of these areas.
Build optimizations
The Android device ecosystem is very diverse. From entry-level phones to the newest premium ones, Chrome needs to run well on all devices. Up until last year, we shipped the same Chrome build to all these different Android devices. The memory and disk size constraints on entry-level devices resulted in Chrome having to prioritize a small binary size. Consequently, many modern build optimizations were out of reach, as they resulted in much larger binaries.
With M113, Chrome was finally able to ship a separate higher-performance build targeting premium Android devices via the Google Play Store. While we still ship a more binary-size-constrained build to other devices, this approach allowed us to land some of those modern optimizations into the new premium build:
- By targeting 64-bit Arm instead of 32-bit Arm, we can make use of more efficient Arm instruction set features and larger 64-bit operations.
- Since binary size is less relevant on premium devices with large disks and sufficient memory, we can now compile C++ code optimized for speed (-O2 / -O3) rather than size (-Oz).
- Furthermore, we tweaked the inlining thresholds used by the compiler to enable more inlining in hot code (within and across modules), while updating the model and policy used by another compiler pass (MLGO) to reduce inlining in cold code.
- We now also apply profile-guided optimization (PGO) techniques to the build to further improve the code layout and optimization level for hot code.
- Finally, we improved cross-function code ordering by aligning Chrome’s orderfile generation with the new 64-bit build. We also now include Speedometer 3, the latest version of the industry-standard browser speed benchmark, in the workloads used to generate the orderfile.
Together, these build optimizations account for more than half of the overall Speedometer score improvements. This progress was facilitated by our collaboration with Arm, who contributed valuable insights and improvements, including to identify and address inefficiencies in Chrome’s PGO setup and inlining.
V8 and Blink improvements
Chrome continuously improves the performance of its JavaScript and web rendering engines, V8 and Blink. Most optimizations are small in individual impact, but stacked together, these improvements add up and contributed most of the remaining Speedometer impact! Notable ones include:
- We now utilize an optimized fast-path HTML parser to parse innerHTML attributes.
- V8 launched its Sparkplug compiler tier, a super fast baseline compiler that sits right above its Ignition interpreter and generates non-optimized code very quickly. Later, V8 also launched Maglev, a new mid-tier compiler that generates semi-optimized code. It takes longer to do so than Sparkplug, but much less time than Turbofan, V8’s ultra-optimizing compiler tier. All together, this new tiering hierarchy allows V8 to tier up more gradually, improving both performance and power consumption.
- We tuned our heuristics that decide when garbage collection occurs, targeting times when the rendering engine is idle or when users navigate away from pages.
- We landed many other incremental optimizations, e.g. to V8 and our parsing, style, layout, and text rendering engines.
Scheduling and OS
To achieve the best possible performance, Android partners invest heavily in tuning the operating system’s thread scheduling and frequency scaling policies, as well as improving the performance of the Silicon itself.
We worked closely with our partners to improve their tuning for Chrome and Speedometer. In particular, our collaboration with Qualcomm was very fruitful: By combining optimized scheduling policies with improved hardware performance, their newest Snapdragon 8 Elite mobile platform realized a 60-80% improvement in Speedometer 3.0 compared to its predecessor, resulting in class-leading web performance. This collaboration also highlighted important bottlenecks in Chrome’s code, such as the need for improved PGO and opportunities in V8.
Speedometer 3.0 on Snapdragon 8 Gen 3 (left) compared to Snapdragon 8 Elite (right), Chrome M131
Why do these improvements matter?
Faster Speedometer scores translate to improvements in real user interactions with web content, such as faster page loads and interactions. Back at M112, loading a Google Docs document on Pixel Tablet took more than 50% longer than it does today — that’s the effect of a doubled Speedometer score!
Chrome M112 vs. M129 on Pixel Tablet, loading a Google Doc (frame count)
[1] Speedometer 3 was released during M122, so results from Speedometer 2.1 are provided for a full picture. Measurements shown in graphs were taken on Pixel Tablet.
Posted by Eric Seckler, Software Engineer, Chrome