Closed Bug 1713320 Opened 3 years ago Closed 3 years ago

Runaway event processing if you leave google calendar(?) open all night

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Performance Impact ?
Tracking Status
firefox-esr78 --- unaffected
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 + fixed
firefox91 --- fixed

People

(Reporter: jesup, Assigned: smaug)

Details

Attachments

(1 file)

[Tracking Requested - why for this release]:
Critical performance regression for google pages

Bas has seen this in beta; neha and I in Nightly.

Several people have been hitting major (seconds or minutes) jank when touching google calendar after it sits for hours or overnight. (This also seems to happen to all other google.com tabs (in fission) until you touch calendar and let it resolve.)

Profiles show seconds or minutes of jank, entirely in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal()

From capturing in gdb, we've seen millions or even >124 million of events stacked up to be sorted and processed, leading to the jank.

Many (but not all) of these messages seem to be DidComposite. I also saw a Key event, in a case where I went to a google search page and cursored back and forth in the input box (seeing seconds of jank for each cursor move). I only looked at a few random events, however. Bas also saw a bunch of DidComposite events.

Something is causing an amplification of events (perhaps DidComposite events) until it resolves (apparently after I let Calendar jank and then resolve itself).

Likely should be a blocker for beta going to release

It isn't surprising to see many DidComposite tasks, since those are very common in general.

Ok, lots more info. It appears the pileup of events are on a suspended manager (the Idle Manger) when Calendar has an alert up. Hitting OK on the alert causes the events to start to flush.

The events stuck in the suspended manager are all Idle Wrappers which have timed out (and have null mRunnables)

Assignee: nobody → bugs

If IdleTaskManager is suspended and non-idle task is_run, nothing seems to guarantee idle tasks get run later.
Calling UpdateCachedIdleDeadline triggers child->parent->child ipc messages if needed and ends up enabling
IdleTaskManager.

89 ships tomorrow, I am keeping the status at fix-optional in case we have a safe fix for uplift as a ride along in a dot release.

FWIW, the patch is changing behavior which has been there at least from last summer (2020).

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch

Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.

Given comment 5, do we still want to consider this for uplift to 90 or should it ride the trains?

Flags: needinfo?(bugs)

yes, I just want to have this in Nightly for a day or two before asking for uplift.

Flags: needinfo?(bugs)

The patch landed in nightly and beta is affected.
:smaug, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(bugs)

boo.

Flags: needinfo?(bugs)
Flags: needinfo?(bugs)

Comment on attachment 9224142 [details]
Bug 1713320, ensure idle tasks get run, r=Bas

Beta/Release Uplift Approval Request

  • User impact if declined: The main thread doesn't process idle tasks, but they are queued and eventually that leads to slowing down all processing
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): This is a change to behavior which has been there for years (though landing TaskController a year ago made it worse). Scheduling changes might be a bit risky, but at least this is only for the case when there aren't other tasks to process than idle tasks.
  • String changes made/needed:
Flags: needinfo?(bugs)
Attachment #9224142 - Flags: approval-mozilla-beta?

Comment on attachment 9224142 [details]
Bug 1713320, ensure idle tasks get run, r=Bas

We've got a good amount of Beta runway left this cycle. Let's get this landed and see how it goes. Approved for 90.0b4.

Attachment #9224142 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Was this a regression? Do we know what caused it?

I think our idle handling has always had this bug, and TaskController made it worse.

We'd have to do archaeology to see if this issue existed before TaskController; IIRC from working on the idle code before TaskController landed I don't think it would have happened, but that's shaky.

Performance Impact: --- → ?
Whiteboard: [qf]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: