Closed Bug 1710552 Opened 3 years ago Closed 3 years ago

Inter-process GC scheduling should account for JS engine initiated GCs

Categories

(Core :: DOM: Content Processes, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
93 Branch
Fission Milestone MVP
Tracking Status
firefox92 --- wontfix
firefox93 --- fixed

People

(Reporter: pbone, Assigned: pbone)

References

(Regressed 1 open bug)

Details

(Keywords: perf-alert, Whiteboard: fission-soft-blocker)

Attachments

(3 files, 7 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Bug 1629064 will introduce inter-process GC scheduling. We will avoid overwhelming the system with many GCs in different processes. However it ignores GCs started immediately based on memory threshold triggers. it should take these into account when scheduling GCs. In other words, a process should always be able to begin a GC if it hits a threshold (already implemented), but another process requesting a GC may have to wait for it (not implemented).

The JS engine can start GCs based on allocation thresholds (for example).
We don't ask the parent if we should run one of these, but by letting the
parent know that's what's happening it can schedule other GC requests
appropriately.

When digging into the regressions I never managed to fully track down, I noticed that I had changed how GCReasons were recorded, in more ways than intended. I have some patches in bug 1711128 to revert the change, but it got a little tangled up in the code you just landed. So that bug now includes a fix for this bug, if we want to do things the way I'm proposing over there.

Wait, no, I am wrong. Bug 1711128 does the opposite of this. Depending on which lands first, I think you'll either want your existing patch or a slight rebase on top of bug 1711128. Sorry for the confusion; hopefully I can get better clued in during our meeting a later on today.

Severity: -- → S3
Priority: -- → P2

M8 I guess, but this has a patch already

Fission Milestone: ? → M8

Or M7a, since this is kind of a regression from a M7a bug.

Fission Milestone: M8 → M7a

...and again, I misread what you very clearly stated in this bug. You're not blocking internally generated GCs; you're allowing them to block other processes' GCs. Doh! That makes sense.

(In reply to Olli Pettay [:smaug] from comment #5)

Or M7a, since this is kind of a regression from a M7a bug.

More like an edge case that the other bug doesn't cover. But the edge case is much less important, I don't expect it to hurt users' experience the way Bug 1629064 did.

Actually my patch here could introduce further edge cases. I can do Bug 1703443 first and prevent all the edge cases there.

Depends on: 1703443
Fission Milestone: M7a → MVP
Depends on: 1718184
Depends on: 1722143

Depends on D120780

Depends on D120781

Depends on D120782

time r=smaug

Depends on D120784

Comment on attachment 9232931 [details]
Bug 1710552 - pt 1. Remove a redundant statement r=smaug

Revision D120780 was moved to bug 1722143. Setting attachment 9232931 [details] to obsolete.

Attachment #9232931 - Attachment is obsolete: true

Comment on attachment 9232932 [details]
Bug 1710552 - pt 2. Make GetNextGCRunnerAction() const r=smaug

Revision D120781 was moved to bug 1722143. Setting attachment 9232932 [details] to obsolete.

Attachment #9232932 - Attachment is obsolete: true

Comment on attachment 9232933 [details]
Bug 1710552 - pt 3. Move some definitions into .cpp file r=smaug

Revision D120782 was moved to bug 1722143. Setting attachment 9232933 [details] to obsolete.

Attachment #9232933 - Attachment is obsolete: true

Comment on attachment 9232934 [details]
Bug 1710552 - pt 4. GetNextGCRunnerAction() doesn't need a deadline r=smaug

Revision D120783 was moved to bug 1722143. Setting attachment 9232934 [details] to obsolete.

Attachment #9232934 - Attachment is obsolete: true

Comment on attachment 9232935 [details]
Bug 1710552 - pt 5. Use GCRunnerFired to handle full and user inactive GCs r=smaug

Revision D120784 was moved to bug 1722143. Setting attachment 9232935 [details] to obsolete.

Attachment #9232935 - Attachment is obsolete: true

Comment on attachment 9232936 [details]
WIP: Bug 1710552 - pt 6. Run the first slice of Full and User-idle GCs in idle

Revision D120785 was moved to bug 1722143. Setting attachment 9232936 [details] to obsolete.

Attachment #9232936 - Attachment is obsolete: true

This patch also:

  • adds an assertion to KillGCRunner() to ensure it's never killed if
    needed, now that there are more calls to KillGCRunner(), some calls have
    been moved eg in nsJSEnvironment so as not to kill the runner a little
    later and keep the assertions happy.

  • IdleSchedulerChild will decline a request for a GC if there's already a
    request in flight.

  • CCGCScheduler will check if a GC is already in progress when handling the
    parents' response to a GC request.

This lets the idle scheduler know that we've initiated a GC that we didn't
ask its permission for. Eg the JS engine hit a threshold. It now uses this
info when scheduling GCs for other processes.

Depends on D120830

Depends on D120831

Attachment #9221466 - Attachment is obsolete: true

This bug is a soft blocker for Fission MVP. We'd like to fix it before our Release channel rollout, but we won't delay the rollout waiting for it.

Whiteboard: fission-soft-blocker

This is ready to land but I will wait until after the soft-freeze.

Pushed by pbone@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/eee94cab9825
pt 1. Prepare for GCs being denied r=smaug
https://hg.mozilla.org/integration/autoland/rev/09a9a01bbfff
pt 2. Inform the idle scheduler when we initiate a GC r=smaug
https://hg.mozilla.org/integration/autoland/rev/0d808485592f
pt 3. Fix some whitespace r=smaug
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch

== Change summary for alert #30931 (as of Sun, 15 Aug 2021 19:49:29 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
2% ares6 macosx1015-64-shippable-qr webrender 53.40 -> 52.24

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=30931

Regressions: 1725635
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: