
Performance

ProtonAOSP includes a number of optimizations that significantly improve overall system performance:

  • Up to 18% faster app/menu/screen opening
  • 16% faster screenshot capturing
  • Up to 4x faster low-level memory management
  • Faster image loading and saving (JPEG and PNG)

Benchmark results are available to quantify these improvements. Most of the improvements are the result of empirically profiling for bottlenecks and optimizing accordingly.

The sections below describe the technical details of our optimizations. All pre-optimization profiling percentages were sourced from a Pixel 5 running ProtonAOSP 11.3.1, unless otherwise stated. “The Settings test” refers to opening and closing activities in Settings, specifically Developer Options because of the number of preferences it contains.

Native code

Most of ProtonAOSP’s performance improvements are in native components, which comprise much of the system’s performance-critical code.

Memory allocation

Android 11 switched to the Scudo memory allocator for security hardening, but this comes at the expense of performance. We instead give up the ability to detect memory usage bugs in exchange for performance by using the latest stable version of jemalloc, updated from the official repository.

In Bionic libc’s semi-realistic memory trace replay tests, jemalloc performed up to 4x better than Scudo while using nearly the same amount of memory.
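As a rough illustration of what a memory trace replay measures, the sketch below replays a recorded sequence of allocations and frees, timing the run and tracking peak live memory. The trace format, the `replay` function, and the use of `bytearray` as a stand-in for `malloc` are all assumptions made for illustration; this is not the Bionic test harness and it does not exercise jemalloc or Scudo.

```python
import time

# Hypothetical trace format: ("malloc", block_id, size) or ("free", block_id).
# This is NOT the trace format used by Bionic's replay tests.
TRACE = [
    ("malloc", 1, 64),
    ("malloc", 2, 4096),
    ("free", 1),
    ("malloc", 3, 128),
    ("free", 2),
    ("free", 3),
]


def replay(trace):
    """Replay a trace; return (elapsed seconds, peak live bytes, leaked blocks)."""
    live = {}        # block_id -> buffer standing in for an allocation
    live_bytes = 0
    peak_bytes = 0
    start = time.perf_counter()
    for op in trace:
        if op[0] == "malloc":
            _, block_id, size = op
            live[block_id] = bytearray(size)
            live_bytes += size
            peak_bytes = max(peak_bytes, live_bytes)
        elif op[0] == "free":
            _, block_id = op
            live_bytes -= len(live.pop(block_id))
        else:
            raise ValueError(f"unknown trace operation: {op[0]!r}")
    elapsed = time.perf_counter() - start
    return elapsed, peak_bytes, len(live)


elapsed, peak_bytes, leaked = replay(TRACE)
```

Running the same trace against two allocators and comparing the elapsed time and peak memory gives the kind of speed and memory comparison described above.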

Optimized zlib

We use an optimized fork of the ubiquitous zlib data compression library, zlib-ng, to improve compression and decompression performance for many use cases:

  • HTTP gzip compression
  • PNG compression (e.g. screenshot saving and image editing)
  • Android resource loading
  • ZIP archives

Combined with other improvements, this speeds up screenshot saving by 16% on the Pixel 5 and likely contributes to the faster cold app launches as well.
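The use cases listed above all reduce to the same deflate compress and decompress calls, plus CRC32 checksums. The sketch below shows that workload using Python's standard `zlib` module, which wraps whichever zlib-compatible library the interpreter was built against. It only illustrates the kind of work being accelerated; the `roundtrip` helper and the sample data are made up for this example and are not ProtonAOSP code.

```python
import zlib


def roundtrip(data, level=6):
    """Compress and decompress data; return (compressed bytes, size ratio)."""
    compressed = zlib.compress(data, level)
    restored = zlib.decompress(compressed)
    if restored != data:
        raise RuntimeError("round trip did not restore the original data")
    return compressed, len(compressed) / len(data)


# Highly repetitive sample input, so it compresses very well.
sample = b"ProtonAOSP " * 1000
compressed, ratio = roundtrip(sample)

# CRC32 is used for integrity checks by formats such as gzip, PNG, and ZIP.
checksum = zlib.crc32(sample)
```

Because callers only see the compress, decompress, and checksum interfaces, a faster drop-in implementation can speed up every one of these paths without changes to the calling code.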

Bionic libc

Bionic libc includes string and memory routines used by nearly every process on Android. We replace these commonly used functions with faster versions ported from Arm’s arm-optimized-routines project.

Global ThinLTO

The LLVM/Clang compiler includes support for ThinLTO, a DSO-wide (i.e. program-wide or library-wide) optimization that can improve performance significantly in many cases because it improves the compiler’s ability to inline code.

We added Android 12’s experimental mode for enabling ThinLTO globally across most components in the system and turned it on by default. This improves app launch performance by ~2%, on top of the gains from the individual components that already had ThinLTO enabled.

SIMD

On modern CPUs, leveraging SIMD is key to maximizing performance in compute-heavy workloads such as image and data compression.

We have either updated or switched to accelerated forks of the following libraries in order to take (more) advantage of NEON SIMD and/or other ARMv8.2-A extensions (e.g. CRC32 and polynomial multiplication):

Reduced debugging

We’ve disabled the statsd daemon, which collects diagnostic statistics that are normally unused on ProtonAOSP. In our testing, statsd itself accounted for 0.04% of CPU time in the Settings test, and clients serializing and sending stats for collection added further overhead (over 0.02%).

Similarly, we disabled debug tracing in ART to save 33% of the 0.1% CPU time spent checking whether specific trace tags are enabled.
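To show where this kind of cost comes from, the sketch below models a tracer whose every trace point first checks a bitmask of enabled tags, and a variant where tracing is compiled out so the check never runs. The `Tracer` class, the tag constants, and the `compiled_in` switch are hypothetical names for illustration; this is a simplified model under those assumptions, not ART's tracing code.

```python
# Hypothetical tag bits, chosen for illustration only.
TAG_GRAPHICS = 1 << 1
TAG_RUNTIME = 1 << 14


class Tracer:
    """Toy tracer that counts how often the tag-enabled check runs."""

    def __init__(self, enabled_tags=0, compiled_in=True):
        self.enabled_tags = enabled_tags
        self.compiled_in = compiled_in
        self.checks = 0    # number of tag checks performed
        self.events = []   # names of recorded trace events

    def is_enabled(self, tag):
        # This check runs at every trace point, even when the tag is off.
        self.checks += 1
        return (self.enabled_tags & tag) != 0

    def begin(self, tag, name):
        if not self.compiled_in:
            return  # tracing removed: no check, no event
        if self.is_enabled(tag):
            self.events.append(name)


# Tracing built in, but only the graphics tag is enabled at runtime.
tracer = Tracer(enabled_tags=TAG_GRAPHICS)
tracer.begin(TAG_RUNTIME, "gc")      # check runs, nothing is recorded
tracer.begin(TAG_GRAPHICS, "draw")   # check runs, event is recorded

# Tracing disabled at build time: trace points cost nothing extra.
disabled = Tracer(enabled_tags=TAG_GRAPHICS, compiled_in=False)
disabled.begin(TAG_GRAPHICS, "draw")
```

Even with a tag turned off, the enabled tracer still pays for the check at every trace point, which is the cost that removing the trace points avoids.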

Compiler optimizations

Google enabled additional compiler optimizations (-O3) for some components in Android 12. While this can cause breakage in some cases, we followed suit with the components that Google deemed safe to optimize:

Compiler update

We use a newer version of the Clang compiler from Android 12: Clang 12.0.4. This does not make much of a difference by itself, but it helps global ThinLTO work better by avoiding compiler bugs and taking advantage of newer LLVM optimizations.

Miscellaneous

Some other miscellaneous optimizations in native libraries have been ported from Android 12:

Java code

Although most of our optimizations target native code, higher-level Java code has also been optimized based on simpleperf profiles.

Percentages in the following sections refer to global CPU time in the Settings test. In indented sections, the percentage is a fraction of the parent items.
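As a worked example of this convention, a nested percentage can be converted to a share of global CPU time by multiplying it with its parent's share. The helper name `global_share` is made up for this sketch; the figures are the ART trace tag numbers from the Reduced debugging section (33% of a 0.1% global cost).

```python
def global_share(parent_global_pct, child_pct_of_parent):
    """Convert a nested percentage into a percentage of global CPU time."""
    return parent_global_pct * child_pct_of_parent / 100.0


# Parent item: 0.1% of global CPU time. Nested item: 33% of that parent.
saved = global_share(0.1, 33)   # roughly 0.033% of global CPU time
```

The same multiplication applies at deeper nesting levels: each level scales the share of the item above it.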

Framework & core services

UI services

Native bindings

Benchmarks

Raw benchmark results and analysis spreadsheets can be found in the Google Drive folder. Most of the results are from a Pixel 5.
