Open Bug 1943145 Opened 24 days ago Updated 2 days ago

Our Firefox performance tests on Android aren't making use of Baseline profiles, so Firefox is unrealistically slow in them when compared to a Firefox installed from the Play Store

Categories

(Testing :: Performance, enhancement, P1)

Tracking

(Not tracked)

People

(Reporter: aglavic, Assigned: aglavic)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [fxp])

Attachments

(1 obsolete file)

In this bug we are adding the speed-compile command after installing Mozilla mobile products (Fenix, GeckoView) for the mozperftest and Raptor frameworks.

We noticed that the presence of "Baseline profiles" didn't make a difference in some of the startup tests, such as the shopify-applink-startup test. Digging deeper, we noticed that functions mentioned in the Baseline profile were not ahead-of-time compiled. This is different from what happens when you install Firefox Nightly from the Play Store.

We dug deeper, doing some system-wide profiling using simpleperf.

Installing from Play Store and then launching: https://share.firefox.dev/4gkV9DW
The profile shows that the frames starting with org.mozilla.fenix.onboarding.OnboardingFragment during the Fenix launch come from base.odex, the o in odex indicating that this function was compiled ahead-of-time. The profile also shows what happens during the dex2oat invocation, in particular it contains art::OptimizingCompiler::Compile.

The dumpsys package dexopt command confirms that the "speed-profile" was used:

% adb shell dumpsys package dexopt | grep -A 1 org.mozilla.fenix
  [org.mozilla.fenix]
    path: /data/app/~~-_YqvF5MqD0Nfzhe5zvPOw==/org.mozilla.fenix-r7YImhCqQD3ABWxGJGwtsQ==/base.apk
      arm64: [status=speed-profile] [reason=install-dm]

In contrast, here's what happens when you install Fenix Nightly using adb install: https://share.firefox.dev/40wA2IB
This profile shows that the frames starting with org.mozilla.fenix.onboarding.OnboardingFragment during the Fenix launch come from base.vdex, the v in vdex here meaning that this function was running in the ART interpreter. And the dex2oat invocation does not contain art::OptimizingCompiler::Compile, it only shows verification work, no compilation.

The dumpsys package dexopt command confirms that the "speed-profile" was not used:

% adb shell dumpsys package dexopt | grep -A 1 org.mozilla.fenix
  [org.mozilla.fenix]
    path: /data/app/~~S2czri8VhkZ57O7rM4bLEA==/org.mozilla.fenix-WtFUhWarfx2ErfKm_0KBOQ==/base.apk
      arm64: [status=verify] [reason=install]

However, on this rooted Pixel 6 running Android 13, running adb shell pm compile -m speed-profile -f org.mozilla.fenix right after adb install also does not appear to do any compilation! So I'm not sure if it works. We should run adb shell dumpsys package dexopt | grep -A 1 org.mozilla.fenix in CI so that we can see whether the Baseline profile was respected.
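A minimal sketch of how that CI check could look, assuming the harness already captures the text of adb shell dumpsys package dexopt (for example via subprocess). The function name parse_dexopt_status is hypothetical and not part of mozperftest or raptor; it only parses the output format shown in the transcripts above.

```python
import re

# Matches status lines such as
#   "arm64: [status=speed-profile] [reason=install-dm]"
_STATUS_RE = re.compile(r"^\s*(\w+): \[status=([\w-]+)\] \[reason=([\w-]+)\]")
# Matches package headers such as "[org.mozilla.fenix]".
_HEADER_RE = re.compile(r"\[([\w.]+)\]")


def parse_dexopt_status(dumpsys_output, package):
    """Return {abi: (status, reason)} for `package` from dumpsys text.

    Hypothetical helper: `dumpsys_output` is the captured stdout of
    `adb shell dumpsys package dexopt`.
    """
    results = {}
    in_package = False
    for line in dumpsys_output.splitlines():
        header = _HEADER_RE.fullmatch(line.strip())
        if header:
            # Track whether we are inside the section of the wanted package.
            in_package = header.group(1) == package
            continue
        if in_package:
            match = _STATUS_RE.match(line)
            if match:
                abi, status, reason = match.groups()
                results[abi] = (status, reason)
    return results
```

A test run could then fail early, or at least log a warning, when the status for org.mozilla.fenix is verify rather than speed-profile.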

Ok I think it's all figured out now. Rather than running cmd package compile -m speed-profile, we should run profgen extractProfile on the host, and then run adb install-multiple instead of adb install.

Backstory:

  1. The Baseline profile inside the apk at assets/dexopt/baseline.prof is ignored when installing an apk.
  2. There are two ways to tell Android to use a Baseline profile: Either by using install-multiple with the Baseline profile in a .dm file ("Dex Metadata") [1], or by broadcasting androidx.profileinstaller.action.INSTALL_PROFILE [2] which is handled by the profileinstaller library that we include in fenix. The Play Store does the former [3], the Macrobenchmark framework does the latter [4].
  3. Baseline profile versions: The profile at assets/dexopt/baseline.prof must always have version v0_1_0_p. The profile in the .dm file provided to adb install-multiple must be of the version that the targeted Android understands. Starting with Android 12, that's v0_1_5_s [5].
  4. .dm files are just zip files with two files in them: primary.prof and primary.profm.
  5. Transcoding from v0_1_0_p to v0_1_5_s happens in these places:
    1. When submitting an APK to the Play Store, Google's servers transcode the profile found in the APK into all versions that might be needed. The Play Store fetches the .dm file of the right version when it installs the app. [6]
    2. When using Macrobenchmark / profileinstaller, the ProfileTranscoder class inside profileinstaller [7] probably does the transcoding.
    3. If you want to run install-multiple yourself, you can transcode the profile on the host by running profgen extractProfile (from the Android SDK). You specify the destination format in the command line invocation. [8]
  6. profgen is in the Android SDK. profgen extractProfile takes an apk as the input and creates a dm file as the output.
  7. When running adb install-multiple myfenix.apk myfenix.dm, the filenames before the .apk and the .dm must match exactly.

All of this information comes from this page in the documentation: https://developer.android.com/topic/performance/baselineprofiles/manually-create-measure

[1] adb install-multiple is mentioned in multiple sections of the document.
[2] mentioned under "Broadcast with androidx.profileinstaller"
[3] mentioned under "Use install-multiple with DexMetadata".
[4] see androidx/benchmark/macro/ProfileInstallBroadcast.kt
[5] see section "Profile formats and platform versions"
[6] mentioned under "Use install-multiple with DexMetadata": "Play generates a tuple by transcoding profiles packaged as v0_1_0_p to every known profile version in use to deliver the correct version."
[7] see androidx/profileinstaller/ProfileTranscoder.java
[8] described under "Use install-multiple with profgen or DexMetaData"
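Points 4 and 7 of the backstory can be checked on the host before calling adb install-multiple. The following is a rough sketch under those two assumptions only (a .dm file is a zip containing primary.prof and primary.profm, and the base names must match); check_install_pair is a hypothetical helper name, not an existing API.

```python
import zipfile
from pathlib import Path

# Entries a .dm ("Dex Metadata") archive is expected to contain,
# per point 4 of the backstory above.
EXPECTED_DM_ENTRIES = {"primary.prof", "primary.profm"}


def check_install_pair(apk_path, dm_path):
    """Return a list of problems; an empty list means the pair looks usable."""
    apk, dm = Path(apk_path), Path(dm_path)
    problems = []
    if apk.suffix != ".apk" or dm.suffix != ".dm":
        problems.append("expected one .apk and one .dm file")
    # The part of the filename before the extension must match exactly.
    if apk.stem != dm.stem:
        problems.append(f"base names differ: {apk.stem!r} vs {dm.stem!r}")
    if not zipfile.is_zipfile(dm):
        problems.append(f"{dm.name} is not a zip archive")
    else:
        with zipfile.ZipFile(dm) as archive:
            missing = EXPECTED_DM_ENTRIES - set(archive.namelist())
        if missing:
            problems.append(f"{dm.name} lacks: {sorted(missing)}")
    return problems
```

This only inspects file names and archive entries; it does not validate the profile version inside primary.prof.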

I tried profgen extractProfile + install-multiple locally and it totally works!

% ~/.mozbuild/android-sdk-macosx/cmdline-tools/12.0/bin/profgen extractProfile --apk /Users/mstange/Downloads/target.arm64-v8a\(1\).apk  --output-dex-metadata /Users/mstange/Downloads/target.arm64-v8a\(1\).dm --profile-format V0_1_5_S
% adb install-multiple /Users/mstange/Downloads/target.arm64-v8a\(1\).apk /Users/mstange/Downloads/target.arm64-v8a\(1\).dm 
Success
% adb shell dumpsys package dexopt | grep -A 1 org.mozilla.fenix
  [org.mozilla.fenix]
    path: /data/app/~~ttHRMX3che3RLM_Tf7pn1g==/org.mozilla.fenix-qcG3MkjrZdHKQvDTtx9BVg==/base.apk
      arm64: [status=speed-profile] [reason=install-dm] [primary-abi]
        [location is /data/app/~~ttHRMX3che3RLM_Tf7pn1g==/org.mozilla.fenix-qcG3MkjrZdHKQvDTtx9BVg==/oat/arm64/base.odex]
  [org.mozilla.geckoview.test_runner]

Profile: https://share.firefox.dev/4aEZZuf

The output of arm64: [status=speed-profile] [reason=install-dm] matches what I got in the Play Store scenario in comment 2!
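For the test harnesses, the manual steps above could be expressed as argument lists that are later passed to subprocess.run. This is only a sketch: build_install_commands is a hypothetical name, the profgen flags are copied from the invocation above, and the path to profgen is left to the caller because its location on CI hosts is not settled.

```python
from pathlib import Path


def build_install_commands(profgen, apk, profile_format="V0_1_5_S"):
    """Return the three argv lists: extract profile, install, verify.

    The .dm file is placed next to the apk with the same base name,
    which install-multiple requires.
    """
    apk = Path(apk)
    dm = apk.with_suffix(".dm")
    extract = [
        str(profgen), "extractProfile",
        "--apk", str(apk),
        "--output-dex-metadata", str(dm),
        "--profile-format", profile_format,
    ]
    install = ["adb", "install-multiple", str(apk), str(dm)]
    # The harness can inspect this output for status=speed-profile.
    verify = ["adb", "shell", "dumpsys", "package", "dexopt"]
    return [extract, install, verify]
```

Passing lists rather than a shell string avoids the quoting issues that file names such as target.arm64-v8a(1).apk cause on the command line.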

Summary: Add speed-compile command after installing mozilla browsers on android → Our Firefox performance tests on Android aren't making use of Baseline profiles, so Firefox is unrealistically slow in them when compared to a Firefox installed from the Play Store

(mentioning this again outside of Element at least): what is a bit concerning/confusing is that we apparently didn't see any improvement in telemetry (e.g. https://glam.telemetry.mozilla.org/fenix/probe/perf_startup_cold_main_app_to_first_frame/explore?aggType=avg&currentPage=1&ref=2024011216&timeHorizon=ALL) when the automated baseline from bug 1887651 landed, even though those builds would be installed from the Play Store and should properly respect the profiles, right?
But we did see a dip back in January 2024 from the manual baseline.

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #5)

(mentioning this again outside of Element at least): what is a bit concerning/confusing is that we apparently didn't see any improvement in telemetry (e.g. https://glam.telemetry.mozilla.org/fenix/probe/perf_startup_cold_main_app_to_first_frame/explore?aggType=avg&currentPage=1&ref=2024011216&timeHorizon=ALL) when the automated baseline from bug 1887651 landed, even though those builds would be installed from the Play Store and should properly respect the profiles, right?
But we did see a dip back in January 2024 from the manual baseline.

I think it would be easier to see changes caused by this if we had separate telemetry for the first startup after an update vs. all startups. I wouldn't be surprised if the local profiles and cloud profiles would mostly compensate for the lack of automated baseline profiles over time.

Attachment #9461121 - Attachment is obsolete: true

(quick update)

I'm having trouble locating profgen on the Bitbar hosts. There is no cmdline-tools folder. The closest I found was /builds/worker/android-sdk-linux/tools/bin, but that only contained the following:

['avdmanager', 'apkanalyzer', 'screenshot2', 'uiautomatorviewer', 'jobb', 'sdkmanager', 'monkeyrunner', 'lint', 'archquery']

For reference, my own ~/.mozbuild/android-sdk-macosx/cmdline-tools/17.0/bin contains:

[apkanalyzer, d8, profgen, resourceshrinker, screenshot2, avdmanager, lint, r8, retrace, sdkmanager]

There is a bit of overlap, which is why I thought this was the closest folder. It is possible that profgen is in a completely unrelated folder. I currently have a try push that rglobs the entire SDK directory to search for profgen; I will see if that yields anything. Otherwise I will ping relops/Bitbar about this.
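A sketch of what such a search could look like, assuming the SDK root is passed in by the caller (on the Bitbar hosts that would presumably be /builds/worker/android-sdk-linux). find_tool is a hypothetical helper, not the code used in the try push.

```python
from pathlib import Path


def find_tool(sdk_root, name="profgen"):
    """Recursively search `sdk_root` for files named `name`.

    Also matches variants with an extension (e.g. a .bat launcher),
    since only the stem is compared.
    """
    root = Path(sdk_root)
    return sorted(
        path
        for path in root.rglob(name + "*")
        if path.is_file() and path.stem == name
    )
```

An empty result means the tool is not anywhere under that directory.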

We may have to introduce profgen as a toolchain via https://android.googlesource.com/platform/tools/base/+/refs/heads/mirror-goog-studio-master-dev/profgen/ (not sure if there's another repo that might be better, and/or whether we should build profgen ourselves).

edit: confirmed that the Bitbar Linux host currently does not have profgen anywhere (much older SDK version, maybe?)

Ah okay, so it seems we can directly download the command-line tools from https://developer.android.com/studio#command-tools, and they contain profgen. We can potentially either make this a toolchain task or upload it to tooltool. First I'm going to see what relops has to say regarding the Bitbar hosts.

sparky has pointed out we have a toolchain task already for the sdk https://firefox-ci-tc.services.mozilla.com/tasks/LdHgKv2DREeenMxPNU_qlQ

it is a large archive so ideally we find a better solution (e.g. bitbar side), but I can try seeing if the toolchain's profgen works in the meantime

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #9)

sparky has pointed out we have a toolchain task already for the sdk https://firefox-ci-tc.services.mozilla.com/tasks/LdHgKv2DREeenMxPNU_qlQ

it is a large archive so ideally we find a better solution (e.g. bitbar side), but I can try seeing if the toolchain's profgen works in the meantime

I think there is an incompatibility with the JDK version on the host, e.g.
https://firefox-ci-tc.services.mozilla.com/tasks/Z2_shX_WQ3GaFpPH3RXi7Q/runs/0/logs/public/logs/live.log#L551-552

[task 2025-02-07T00:04:28.158Z] This tool requires JDK 17 or later. Your version was detected as 1.8.0_382.
[task 2025-02-07T00:04:28.158Z] To override this check, set SKIP_JDK_VERSION_CHECK.

Setting that environment variable didn't seem to help, however:
https://firefox-ci-tc.services.mozilla.com/tasks/DTu1kZ09TKmy2hQqFi_uVQ/runs/0/logs/public/logs/live.log#L501-515

[task 2025-02-07T02:59:15.595Z] Error: A JNI error has occurred, please check your installation and try again
[task 2025-02-07T02:59:15.595Z] Exception in thread "main" java.lang.UnsupportedClassVersionError: com/android/tools/profgen/cli/MainKt has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
[task 2025-02-07T02:59:15.595Z] 	at java.lang.ClassLoader.defineClass1(Native Method)
[task 2025-02-07T02:59:15.595Z] 	at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
[task 2025-02-07T02:59:15.595Z] 	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
[task 2025-02-07T02:59:15.595Z] 	at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
[task 2025-02-07T02:59:15.595Z] 	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
[task 2025-02-07T02:59:15.595Z] 	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
[task 2025-02-07T02:59:15.595Z] 	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
[task 2025-02-07T02:59:15.595Z] 	at java.security.AccessController.doPrivileged(Native Method)
[task 2025-02-07T02:59:15.595Z] 	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
[task 2025-02-07T02:59:15.595Z] 	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
[task 2025-02-07T02:59:15.595Z] 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
[task 2025-02-07T02:59:15.595Z] 	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
[task 2025-02-07T02:59:15.595Z] 	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:621)

Looks like this might need to be done on the Bitbar side after all. I will ping relops.
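To read the stack trace above: class file major versions map to Java releases as major minus 44 (from Java 5 onwards), so 52.0 is Java 8 and 55.0 is Java 11. A small sketch that extracts both numbers from such an error message follows; the function names are hypothetical.

```python
import re


def java_release_for_class_version(major):
    """Map a class file major version to a Java release (Java 5 onwards)."""
    if major < 49:
        raise ValueError(f"unsupported class file major version: {major}")
    return major - 44


def parse_class_version_error(message):
    """Return (java_needed_by_class, java_supported_by_runtime), or None.

    Parses the text of a java.lang.UnsupportedClassVersionError.
    """
    needed = re.search(r"class file version (\d+)\.\d+", message)
    supported = re.search(r"class file versions up to (\d+)\.\d+", message)
    if not (needed and supported):
        return None
    return (
        java_release_for_class_version(int(needed.group(1))),
        java_release_for_class_version(int(supported.group(1))),
    )
```

For the log above this yields (11, 8): the profgen main class was compiled for at least Java 11 while the host runs Java 8, which is why skipping the launcher's JDK check could not help.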

Kash, can you update the bug with the current state? Do we have a newer JDK available on these machines now?

Flags: needinfo?(kshampur)

:aerickson is working on it, tracked here https://mozilla-hub.atlassian.net/browse/RELOPS-1293

Flags: needinfo?(kshampur)

(In reply to Kash Shampur [:kshampur] ⌚EST from comment #12)

:aerickson is working on it, tracked here https://mozilla-hub.atlassian.net/browse/RELOPS-1293

Previously it was on JDK 8; we are going to try JDK 17.
