Closed Bug 1663700 Opened 4 years ago Closed 6 months ago

Enable PGO profile generation for child processes on Android

Categories

(GeckoView :: General, task, P2)

Unspecified
All

Tracking

(Performance Impact:high, firefox119 wontfix, firefox120 wontfix, firefox121 fixed)

RESOLVED FIXED
121 Branch
Performance Impact high
Tracking Status
firefox119 --- wontfix
firefox120 --- wontfix
firefox121 --- fixed

People

(Reporter: Gijs, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: perf, perf-alert, Whiteboard: [geckoview:2022h2?][sp3])

Attachments

(1 file)

Android now runs PGO, yay!

It manually dumps profiling info because unlike desktop there's no clear "exit and shutdown" hook - here.

Unfortunately, we only do the manual dumping of info in the parent process. I suspect this means we never get profiling info out of the child process. That's a shame, because that's where most of the gecko/spidermonkey wrangling of web content will be happening (esp. on Android which I'm pretty sure will be using very little DOM / JS code in the parent process). So I suspect we might be able to get some perf benefits from enabling it there.

Note that sandboxing might be a little in the way here - on desktop we just disable sandboxing; bug 1553850 covers a more thorough fix.

This was the original android PGO issue, Bug 632954
I wonder if we should move this to Firefox Build System ?

(In reply to Andrew Creskey [:acreskey] [he/him] from comment #1)

This was the original android PGO issue, Bug 632954
I wonder if we should move this to Firefox Build System ?

Maybe, though this will likely need code changes in the compiled binary (even if behind a MOZ_PROFILE_GENERATE ifdef) in order to work...

Note bug 1602810 for rust code, which is waiting on a rustc upgrade.

See Also: → 1602810
Priority: -- → P2
Whiteboard: [qf]

I saw that FX82 is using Rust 1.43. Not likely that it'll bump to 1.46?

Severity: -- → N/A
Type: defect → task
Whiteboard: [geckoview:2022h2?]

Agi says we might not be collecting profiles for child processes for PGO (i.e. any DOM/JS code is likely not getting profiled for PGO). There could be an easy perf improvement here. Might not be that much work if all we have to do is figure out a way to write the profile at the right time.

Whoever investigates this PGO bug should also take a look at bug 1722266 about PGO on 32-bit Android devices.

See Also: → 1722266
Performance Impact: ? → P2
Keywords: perf
Whiteboard: [geckoview:2022h2?] → [geckoview:2022h2?][sp3]

Profile data gets dumped to disk here, but this doesn't get a chance to execute for child processes on Android because we terminate the process early here. This happens in response to this call in ContentParent::MarkAsDead(), called from ContentParent::MaybeShutdownProcess().

Commenting out the System.exit() call during onDestroy() allows the profile data to get written to disk, which speeds us up a lot. Perhaps therefore skipping this call in MOZ_PROFILE_GENERATE builds is the solution. However, there may be unintended consequences to not terminating the process early, and perhaps even in the limited scope of the MOZ_PROFILE_GENERATE builds we mustn't do that.

The System.exit() call was introduced in bug 1696460, however prior to that we called Process.killProcess() from onUnbind() which is effectively the same. That goes back as long as we have been using Android Services rather than fork() to launch child processes (bug 1314466).

There are some comments in ContentParent which imply we may rely on that behaviour, though I'm unsure how important that is

I discussed this with Nika on matrix. The reason we shutdown content processes quickly in ContentParent::MarkAsDead on Android is because we have a finite number of Services available, therefore unbinding the service immediately allows us to launch a new process sooner when at that limit. Once we do decide to unbind a service it is important that we kill the process immediately, otherwise we could end up in a situation where we attempt to make a new binding to the same Service and Android reuses the old process which hadn't finished shutting down yet.

We do guard against that situation, and it probably doesn't matter anyway as we don't run in to it during the profile run. But to me it therefore seems safer to avoid initiating the early shutdown on the parent side here rather than only skipping the System.exit() part.

Currently Android PGO builds only use profile data from the parent
process. On Android especially a lot of interesting work only occurs
in the content process, so we are missing out on some useful
optimizations here.

This occurs because profile data gets dumped at the end of a process'
shutdown, but content processes are terminated early on Android before
that has a chance to happen. We terminate processes early due to there
being a finite number of content process Services available. We
therefore unbind the Service immediately so that it becomes available
to launch a new process with sooner. (And once a Service has been
unbound it is indeed imperative that its process is killed immediately
so that Android does not attempt to reuse the process for a new
Service binding.) Since we will not reach the process limit during the
PGO profile run we can afford to wait for the process to shut down
cleanly.

Performance Impact: medium → high
Blocks: 1863557
Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/112da6f5092c
Use content process profile data in Android PGO builds. r=nika

Comment on attachment 9362176 [details]
Bug 1663700 - Use content process profile data in Android PGO builds. r?nika

Beta/Release Uplift Approval Request

  • User impact if declined: This is a big performance improvement on Android (10%-12% improvement on speedometer 3) which would be nice for users to get ASAP
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): No code changes in shipping code, just generates more PGO prof data to feed to compiler.
  • String changes made/needed:
  • Is Android affected?: Yes
Attachment #9362176 - Flags: approval-mozilla-beta?

How close is this going to get us to our already near-the-limit APK size? Note that for 121, we're shipping AABs with a much higher limit.

Flags: needinfo?(jnicol)

geckoview_example.apk is 1.1MB bigger according to my pushes. I'm not sure how that translates to Fenix apk sizes

Flags: needinfo?(jnicol)

Is PGO enabled on all Android architectures: ARM64, ARMv7, x86, and x86_64?

Yes, though we would be able to make this change only affect certain architectures if we want

Comment on attachment 9362176 [details]
Bug 1663700 - Use content process profile data in Android PGO builds. r?nika

:jnicol it is very late in the cycle, I would prefer this rides the trains with 121 (in beta on nov 20). I understand this would be nice to have, but i dont see an urgency factor here given that it will have very little bake time in nightly after landing. We usually approve improvements like these earlier in the cycle to allow for reaction time in case of any unwanted effects.

-risks with changing compiler optimizations this late
-beta limit size is 100MB (were at 99MB) and this add ~1.1 MB
-Fx121 will increase limit to 150 MB

Attachment #9362176 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch

(In reply to Pulsebot from comment #12)

Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/112da6f5092c
Use content process profile data in Android PGO builds. r=nika

== Change summary for alert #40219 (as of Fri, 10 Nov 2023 16:23:56 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
11% speedometer3 android-hw-a51-11-0-aarch64-shippable-qr webrender 2.65 -> 2.94
5% amazon-search fcp android-hw-a51-11-0-aarch64-shippable-qr warm webrender 259.52 -> 247.68
5% amazon loadtime android-hw-a51-11-0-aarch64-shippable-qr warm webrender 278.62 -> 266.05
4% bing-search-restaurants loadtime android-hw-a51-11-0-aarch64-shippable-qr warm webrender 163.57 -> 156.41
4% amazon-search LastVisualChange android-hw-a51-11-0-aarch64-shippable-qr cold webrender 2,395.05 -> 2,292.38
... ... ... ... ...
3% allrecipes loadtime android-hw-a51-11-0-aarch64-shippable-qr warm webrender 767.83 -> 743.73

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=40219

Keywords: perf-alert
Regressions: 1865593
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: