Closed Bug 1865886 Opened 1 year ago Closed 10 months ago

Use aarch64 builds for PGO profile generation on Android

Categories

(Core :: Performance Engineering, task)

Tracking

RESOLVED FIXED
124 Branch
Tracking Status
firefox123 --- fixed
firefox124 --- fixed

People

(Reporter: jnicol, Assigned: jrmuizel)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: perf-alert, Whiteboard: [sp3])

Attachments

(5 files, 1 obsolete file)

Currently we use x86 instrumented builds to generate profile data for arm32 shippable builds, and x86_64 instrumented builds for aarch64 shippable builds.

I see a 3 to 4% improvement in Speedometer3 score for an aarch64 build when the profile data comes from an aarch64 instrumented build rather than an x86_64 one. I generated the aarch64 profile data locally, by hacking the android_emulator_pgo.py script to run locally and performing the profile generation in an emulator on my M2 MacBook.

In terms of performing these builds in CI:

  • Recent android arm64 system images cannot be emulated on an x86_64 host. However, we may be able to use an older android version system image. Prior to bug 1718341 we used an arm32 image, but we switched away for a reason.
  • Recent x86_64 system images can obviously be run on an x86_64 host, and are able to run aarch64 binaries. However, we need to use the google_apis variant, and I'm not sure whether that's okay.
  • Perhaps we can run an aarch64 system image on an aarch64 Linux host.
Depends on: linux-arm64-ci

As well as bug 1677963 blocking the "run on an arm64 linux host" option, there are also not yet officially-supported arm64 linux emulator builds: https://issuetracker.google.com/issues/242699119

Recent android arm64 system images cannot be emulated on an x86_64 host. However, we may be able to use an older android version system image. Prior to bug 1718341 we used an arm32 image, but we switched away for a reason.

This also doesn't appear to work. While I can get an arm32 SDK 24 emulator image to launch, I cannot get any arm64 image to launch, even on a local machine.

On SDK levels 24 and 25 it gets stuck in a bootloop. Nothing in the logs stands out as the root cause, but searching for most of the warnings turns up other people who also failed to launch emulators.

And on SDK 26 onwards it exits immediately with this error:

qemu-system-aarch64-headless: PCI bus not available for hda
VERBOSE | Done with QEMU main loop

Searching for this error, again, turns up others failing to launch arm64 emulators.

And on SDK 28 onwards it's a lot more explicit:

PANIC: Avd's CPU Architecture 'arm64' is not supported by the QEMU2 emulator on x86_64 host.
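
The failure signatures above can be told apart mechanically. A minimal sketch, assuming the emulator output has been captured as text; the function name and the category labels are hypothetical, and only the two matched substrings come from the logs quoted above:

```python
def diagnose_emulator_log(log_text):
    """Classify an emulator launch failure from its captured output."""
    # Explicit refusal quoted above for SDK 28 onwards.
    if "is not supported by the QEMU2 emulator" in log_text:
        return "unsupported-arch"
    # Immediate QEMU exit quoted above for SDK 26 onwards.
    if "PCI bus not available for hda" in log_text:
        return "qemu-pci-error"
    # Bootloops (SDK 24 and 25) leave no single telltale line here.
    return "unknown"
```

A bootloop would still need a boot timeout to detect, since it produces neither message.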

Whiteboard: [sp3]

(In reply to Jamie Nicol [:jnicol] from comment #0)

  • Recent x86_64 system images can obviously be run on an x86_64 host, and are able to run aarch64 binaries. However, we need to use the google_apis variant, and I'm not sure whether that's okay.

This approach works, and gives a 4.35% improvement on try: https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=6dcce504a500bb2c227f42066d0635c63fd5531d&newProject=try&newRevision=10b59290544284b699b973c9b92e60c66c53650c&framework=13&page=1

However, the profile generation needs to be run on a device with KVM access, which is blocked on bug 1545497. Without KVM, recent android emulator images (especially google_apis variants) are unusable: they take upwards of 30 minutes to boot, frequently crash, and time out when attempting to install an APK.
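
A worker can check for KVM access before trying to boot the emulator. A minimal sketch, assuming a Linux host where KVM is exposed at the usual /dev/kvm device node; the function name is hypothetical and this is not taken from the CI scripts:

```python
import os


def kvm_available(device="/dev/kvm"):
    """Return True if the KVM device node exists and is readable and writable."""
    # The emulator needs read/write access to the node for hardware acceleration.
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if not kvm_available():
        print("No usable KVM device; expect very slow or failing emulator boots.")
```

Failing early on this check is cheaper than waiting out a 30 minute boot that ends in a timeout.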

I should also note that I needed to use an android-31 google_apis x86_64 emulator image for this to work, which only supports running aarch64 binaries (and x86_64 of course), not arm32. From SDK 31 onwards the emulators are 64-bit only. So we can only get this win for aarch64 builds, not arm32. That's the vast majority of our users though, so that's fine.
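
One way to confirm which binaries an image accepts is to read its ABI list, for example from the `ro.product.cpu.abilist` system property via `adb shell getprop`. A sketch, assuming the property value is a comma-separated string; the helper names are hypothetical and the sample value is illustrative, not captured from the android-31 image discussed here:

```python
def supported_abis(abilist):
    """Split a comma-separated ABI list into individual ABI names."""
    return [abi.strip() for abi in abilist.split(",") if abi.strip()]


def can_install(apk_abi, abilist):
    """Return True if the APK's native ABI appears in the image's ABI list."""
    return apk_abi in supported_abis(abilist)


# Assumed sample for a 64-bit-only x86_64 image that also runs aarch64 binaries.
SAMPLE_ABILIST = "x86_64,arm64-v8a"

if __name__ == "__main__":
    for abi in ("arm64-v8a", "armeabi-v7a", "x86"):
        print(abi, can_install(abi, SAMPLE_ABILIST))
```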

Depends on: 1875490
Attached file Bug 1865886. Use kvm to repack avd. (obsolete)
Assignee: nobody → jmuizelaar
Status: NEW → ASSIGNED
Depends on: 1876089
Component: Performance → Performance Engineering
Depends on: 1876337
Attachment #9375829 - Attachment description: Bug 1865886. kvm android 31 google_apis → Bug 1865886. Switch to the android-31 emulator package and add google_apis.
Attachment #9375830 - Attachment description: Bug 1865886. use kvm to repack avd → Bug 1865886. Use kvm to repack avd.
Attachment #9375831 - Attachment description: Bug 1865886. Add arm and aarch64 instrumented jobs → Bug 1865886. Add aarch64 instrumented jobs.
Attachment #9375832 - Attachment description: Bug 1865886. add arm and aarch64 profile generation jobs on x86_64 emulator → Bug 1865886. Add aarch64 profile generation jobs on x86_64 emulator.
Attachment #9375833 - Attachment description: Bug 1865886. Use arm/aarch64 profile generate jobs for shippable builds → Bug 1865886. Use aarch64 profile generate jobs for shippable builds.
Summary: Use aarch64/arm32 builds for PGO profile generation on Android → Use aarch64 builds for PGO profile generation on Android

It looks like this breaks the x86-32 PGO run. I get:
INFO - Failed to install /builds/worker/fetches/geckoview-test_runner.apk on None: ADBProcessError args: /builds/worker/fetches/android-sdk-linux/platform-tools/adb wait-for-device install /builds/worker/fetches/geckoview-test_runner.apk, exitcode: 1, stdout: adb: failed to install /builds/worker/fetches/geckoview-test_runner.apk: Failure [INSTALL_FAILED_NO_MATCHING_ABIS: Failed to extract native libraries, res=-113]

Ah, and the reason is in comment 4: "From SDK 31 onwards the emulators are 64-bit only". So we'll need to use separate versions of the emulator for the 32-bit and aarch64 profile runs.
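
The split could be expressed as a lookup from the instrumented build's ABI to an emulator setup. This is only a sketch of the idea: both emulator labels are hypothetical placeholders rather than real task or package names, and the rule that only aarch64 moves to the android-31 image is an assumption based on the patch titles in this bug:

```python
# Hypothetical labels, not real toolchain or task names.
EXISTING_EMULATOR = "existing-emulator"
SDK31_EMULATOR = "android-31-google_apis-x86_64"


def emulator_for(instrumented_abi):
    """Pick an emulator setup for a profile generation run."""
    # aarch64 binaries need the 64-bit-only android-31 google_apis image.
    if instrumented_abi == "arm64-v8a":
        return SDK31_EMULATOR
    # Everything else, including 32-bit x86, stays on the existing emulator,
    # which avoids INSTALL_FAILED_NO_MATCHING_ABIS on the android-31 image.
    return EXISTING_EMULATOR
```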

Attachment #9375829 - Attachment description: Bug 1865886. Switch to the android-31 emulator package and add google_apis. → Bug 1865886. Add an android-31 emulator package and add google_apis.
Attachment #9375830 - Attachment is obsolete: true
Pushed by jmuizelaar@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7bf33179c4b8
Add an android-31 emulator package and add google_apis. r=geckoview-reviewers,glandium,m_kato
https://hg.mozilla.org/integration/autoland/rev/df1b87110c9d
Add aarch64 instrumented jobs. r=geckoview-reviewers,glandium,m_kato
https://hg.mozilla.org/integration/autoland/rev/e2529b2ccc91
Add aarch64 profile generation jobs on x86_64 emulator. r=glandium
https://hg.mozilla.org/integration/autoland/rev/7cc03b5ce4a9
Use aarch64 profile generate jobs for shippable builds. r=glandium
Depends on: 1877194
Pushed by jmuizelaar@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e2c2461d3c62
Add an android-31 emulator package and add google_apis. r=geckoview-reviewers,glandium,m_kato
https://hg.mozilla.org/integration/autoland/rev/6f85a2b71370
Add aarch64 instrumented jobs. r=geckoview-reviewers,glandium,m_kato
https://hg.mozilla.org/integration/autoland/rev/b1155fcccab5
Add aarch64 profile generation jobs on x86_64 emulator. r=glandium
https://hg.mozilla.org/integration/autoland/rev/252218f93e3a
Use aarch64 profile generate jobs for shippable builds. r=glandium
Flags: needinfo?(jmuizelaar)

This fixes the ModuleNotFoundError: No module named 'mozinfo' error.

Comment on attachment 9375833 [details]
Bug 1865886. Use aarch64 profile generate jobs for shippable builds.

Beta/Release Uplift Approval Request

  • User impact if declined: Users would miss out on a 5% improvement on Speedometer3
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: Bug 1877194, Bug 1875490, Bug 1876337
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This changes the set of functions for which PGO works. The risk for change in behaviour should be very low.

The biggest risk is around the needed infrastructure changes. It's possible this could break the build when uplifted to beta or cause some other weirdness. Any problems should be easy to fix though.

  • String changes made/needed: n/a
  • Is Android affected?: Yes
Attachment #9375833 - Flags: approval-mozilla-beta?
Attachment #9375829 - Flags: approval-mozilla-beta?
Attachment #9375831 - Flags: approval-mozilla-beta?
Attachment #9375832 - Flags: approval-mozilla-beta?
Attachment #9376992 - Flags: approval-mozilla-beta?

I should note that this appears to increase the APK size by about 400k.

Comment on attachment 9375833 [details]
Bug 1865886. Use aarch64 profile generate jobs for shippable builds.

Approved for 123 beta 6, thanks.

(Bug 1875490 landed during the 123 nightly cycle and this won't need an uplift)

Attachment #9375833 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9375829 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9375831 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9375832 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9376992 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

(In reply to Pulsebot from comment #17)

Pushed by smolnar@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d2a5bbf7755f
Fix merge with changes from bug 1874234. CLOSED TREE

== Change summary for alert #41365 (as of Tue, 06 Feb 2024 17:33:29 GMT) ==

Improvements:

| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
| --- | --- | --- | --- | --- | --- |
| 15% | cnn-ampstories largestContentfulPaint | android-hw-a51-11-0-aarch64-shippable-qr | cold webrender | 1,448.58 -> 1,226.73 | Before/After |
| 4% | instagram ContentfulSpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,117.23 -> 1,070.02 | Before/After |
| 4% | instagram ContentfulSpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,122.40 -> 1,078.40 | Before/After |
| 3% | instagram SpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,406.50 -> 1,367.73 | Before/After |
| 2% | espn loadtime | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 918.10 -> 895.78 | Before/After |
| ... | ... | ... | ... | ... | ... |
| 2% | booking LastVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 2,047.40 -> 2,001.91 | Before/After |

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=41365
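
The Ratio column is consistent with the relative drop between the old and new absolute values. A small sketch that recomputes it, assuming all of these are lower-is-better metrics and that the ratio is simply (old - new) / old; the alert does not state the exact formula it uses, and the function name is hypothetical:

```python
def improvement_percent(old, new):
    """Relative reduction from old to new, as a percentage of old."""
    return (old - new) / old * 100.0


# (test, old, new) taken from the alert summary above.
ROWS = [
    ("cnn-ampstories largestContentfulPaint", 1448.58, 1226.73),
    ("instagram ContentfulSpeedIndex", 1117.23, 1070.02),
    ("instagram SpeedIndex", 1406.50, 1367.73),
    ("espn loadtime", 918.10, 895.78),
    ("booking LastVisualChange", 2047.40, 2001.91),
]

if __name__ == "__main__":
    for test, old, new in ROWS:
        print(f"{test}: {improvement_percent(old, new):.1f}%")
```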

Keywords: perf-alert