If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Consider dispatching WindowDestroyedEvents to the idle queue

RESOLVED FIXED in Firefox 55

Status

()

Core
DOM
P2
normal
RESOLVED FIXED
5 months ago
3 months ago

People

(Reporter: mstange, Assigned: Ehsan)

Tracking

Trunk
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [qf:p1])

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

5 months ago
If you close a window with a lot of tabs, or a tab with a lot of iframes, we dispatch lots of WindowDestroyedEvents at the same time. I'm worried that these events can clog up the event queue and stop us from processing other events that are more important for responsiveness.

I know that I've seen this problem in profiles for years, but I haven't been able to reproduce it just now, by closing a window with 80 searchfox.org tabs. Maybe that's because searchfox doesn't contain iframes or plugins. Or maybe other scheduling changes have solved this problem in the meantime?

Assuming the problem still exists, could we use NS_IdleDispatchToCurrentThread for these events instead?
Since these events also free up memory, we should probably have a way to prevent these events from being deferred indefinitely. For example, we could react to memory pressure events by just flushing all pending WindowDestroyedEvents.

Comment 1

5 months ago
Yeah, those would need to use idle dispatch with timeout, to ensure some relatively fast handling.
(Not sure we get too useful memory pressures.)
But better to capture at least some profile showing this is actually an issue these days.
Blocks: 1358476
The main slowdowns I'm aware of for WindowDestroyedEvents is compartment nuking, and bug 816784 should improve that.

Updated

5 months ago
See Also: → bug 816784
(In reply to Andrew McCreight [:mccr8] from comment #2)
> The main slowdowns I'm aware of for WindowDestroyedEvents is compartment
> nuking, and bug 816784 should improve that.

I'm wondering if it will make it so that it will be guaranteed to be *much* less than a frame budget at all times after that bug.  That seems...  unlikely?  :-/

(In reply to Olli Pettay [:smaug] from comment #1)
> But better to capture at least some profile showing this is actually an
> issue these days.

Going back in memory of the bugs I have files I managed to find this profile from reloading Facebook with 50ms of WindowDestroyedEvents in two separate chunks of timeline: https://perfht.ml/2oVXMqN.  I've seen it pretty often anecdotally.  I think we should think about mitigating the case where the user input events keep coming in, and then just move this over to idle dispatch, *and* fix bug 816784 at the same time.
No longer blocks: 1358476
Depends on: 1358476
Priority: -- → P2
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p1]
Assignee: nobody → ehsan
Created attachment 8870987 [details] [diff] [review]
Dispatch WindowDestroyedEvent to the idle queue
Attachment #8870987 - Flags: review?(bugs)

Updated

4 months ago
Attachment #8870987 - Flags: review?(bugs) → review+

Comment 5

4 months ago
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ac329aeaaf9
Dispatch WindowDestroyedEvent to the idle queue; r=smaug
Backed out for various dead window related test failures.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e761a94c2dc7e5fddb9ce5b160468ac6763daf69

https://treeherder.mozilla.org/logviewer.html#?job_id=101745289&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=101745221&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=101745304&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=101745347&repo=mozilla-inbound
https://treeherder.mozilla.org/logviewer.html#?job_id=101745281&repo=mozilla-inbound
https://hg.mozilla.org/projects/cedar/rev/4ac329aeaaf94f6a9316a3f562e465db2dd24675
Bug 1361461 - Dispatch WindowDestroyedEvent to the idle queue; r=smaug
Depends on: 1368272
Depends on: 1368275
Depends on: 1368285
Depends on: 1368286
Depends on: 1370100
Depends on: 1370101
Depends on: 1370102
Comment on attachment 8870987 [details] [diff] [review]
Dispatch WindowDestroyedEvent to the idle queue

This approach proved to be impractical due to having to fix thousands of failing tests all over the tree.  The reason why those tests were failing was the changing of the timing of the inner/outer-window-destroyed events which a lot of things depend on apparently.

I ended up employing a slightly different approach.  The inner/outer-window-destroyed events is actually not the interesting part of the runnable for our purposes here since the compartment nuking is the time consuming part.  So I decided to break this task off into two phases, adding a second nuking phase which will run off of idle dispatch and will send the inner/outer-window-nuked events that can be used in the few tests that actually care about compartment nuking rather than window closing.
Attachment #8870987 - Attachment is obsolete: true
Created attachment 8875064 [details] [diff] [review]
Dispatch the compartment-nuking part of WindowDestroyedEvent to the idle queue
Attachment #8875064 - Flags: review?(bugs)

Comment 10

4 months ago
Comment on attachment 8875064 [details] [diff] [review]
Dispatch the compartment-nuking part of WindowDestroyedEvent to the idle queue


>-    nsCOMPtr<nsISupports> window = do_QueryReferent(mWindow);
>-    if (!skipNukeCrossCompartment && window) {
>-      nsGlobalWindow* win = nsGlobalWindow::FromSupports(window);
>-      nsGlobalWindow* currentInner = win->IsInnerWindow() ? win : win->GetCurrentInnerWindowInternal();
>-      NS_ENSURE_TRUE(currentInner, NS_OK);
>+        if (!skipNukeCrossCompartment) {
>+          // The compartment nuking phase might be too expensive, so do that
>+          // part off of idle dispatch.
> 
>-      AutoSafeJSContext cx;
>-      JS::Rooted<JSObject*> obj(cx, currentInner->FastGetGlobalJSObject());
>-      if (obj && !js::IsSystemCompartment(js::GetObjectCompartment(obj))) {
>-        JSCompartment* cpt = js::GetObjectCompartment(obj);
>-        nsCOMPtr<nsIPrincipal> pc = nsJSPrincipals::get(JS_GetCompartmentPrincipals(cpt));
>+          // For the compartment nuking phase, we dispatch either an
>+          // inner-window-nuked or an outer-window-nuked notification.
>+          // This will allow tests to wait for compartment nuking to happen.
>+          mTopic.ReplaceSubstring("destroyed", "nuked");
This is a bit ugly. Would be nicer to just assign the right value, then the assertion wouldn't be needed.
but up to you


> private:
>   uint64_t mID;
>+  uint32_t mPhase;
Please, some enum for this.
Attachment #8875064 - Flags: review?(bugs) → review+

Comment 11

4 months ago
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/962439237d24
Dispatch the compartment-nuking part of WindowDestroyedEvent to the idle queue; r=smaug

Comment 12

4 months ago
Backout by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6353d0ebd86f
Backout due to potentially introducing some intermittent reftest failures
The reftest failures were either showing up as shutdown timeouts or leaks, which seemed to indicate that we had ended up with a whole bunch of nuking tasks which were never run until the end of the test suite.  As a fix I pushed a change which added a timeout of 1 second to the idle dispatch to make sure the cleanup eventually runs, and that seems to have worked well:

https://treeherder.mozilla.org/#/jobs?repo=try&author=eakhgari@mozilla.com

I retriggered R1 on Linux32 debug, Ru2 on Windows 7 debug, and R and Ru on Windows 7 opt (that last one looks bad but those are unrelated intermittents.)

I'm going to reland this now.

Comment 14

4 months ago
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/575e351a12af
Dispatch the compartment-nuking part of WindowDestroyedEvent to the idle queue; r=smaug
sorry had to back this out for frequent r5 failures like https://treeherder.mozilla.org/logviewer.html#?job_id=105425298&repo=mozilla-inbound
Flags: needinfo?(ehsan)

Comment 16

4 months ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/82bdfe54a6cd
Backed out changeset 575e351a12af for causing frequent reftest assertion failures like Assertion failure: false (Ran out of memory while building cycle collector graph), at z:/build/build/src/xpcom/base/nsCycleCollector.cpp:929
(In reply to Carsten Book [:Tomcat] from comment #15)
> sorry had to back this out for frequent r5 failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=105425298&repo=mozilla-
> inbound

Sigh, if you look at the stack trace for that crash, the presence of ICCRunnerFired() which is new code that was just added by bug 1367905 a few pushes before this one, and the fact that these oranges did not go away after this backout proves that it is bug 1367905 that needed to be backed out not this one.  I'm going to reland it again.  Please don't back it out without pinging me beforehand, this has been tested on try before.
Flags: needinfo?(ehsan)

Comment 18

4 months ago
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7ae0d4245ce7
Dispatch the compartment-nuking part of WindowDestroyedEvent to the idle queue; r=smaug

Comment 19

4 months ago
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #17)
> (In reply to Carsten Book [:Tomcat] from comment #15)
> > sorry had to back this out for frequent r5 failures like
> > https://treeherder.mozilla.org/logviewer.html#?job_id=105425298&repo=mozilla-
> > inbound
> 
> Sigh, if you look at the stack trace for that crash, the presence of
> ICCRunnerFired() which is new code that was just added by bug 1367905 a few
> pushes before this one
Some new code somewhere doesn't indicate anything.


, and the fact that these oranges did not go away
> after this backout proves that it is bug 1367905
That does

> that needed to be backed
> out not this one.  I'm going to reland it again.  Please don't back it out
> without pinging me beforehand, this has been tested on try before.
So did I test bug 1367905 and we both had issues with reftests.
I asked Tomcat to backout this, since when this landed, that was the first time those failures occurred.
(In reply to Olli Pettay [:smaug] from comment #19)
> (In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from
> comment #17)
> > (In reply to Carsten Book [:Tomcat] from comment #15)
> > > sorry had to back this out for frequent r5 failures like
> > > https://treeherder.mozilla.org/logviewer.html#?job_id=105425298&repo=mozilla-
> > > inbound
> > 
> > Sigh, if you look at the stack trace for that crash, the presence of
> > ICCRunnerFired() which is new code that was just added by bug 1367905 a few
> > pushes before this one
> Some new code somewhere doesn't indicate anything.

It's not the existence of the new code but what that bug was changing (i.e., causing CC/GC to be backed up and not free up memory frequently enough resulting in an OOM) which made it obvious to me that this bug couldn't have been at fault, since this bug technically only changes the timing of compartment nuking, not when we free memory.

Comment 21

4 months ago
compartment nuking has very much to do with memory management ;)

Comment 22

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7ae0d4245ce7
Status: NEW → RESOLVED
Last Resolved: 4 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Depends on: 1371728
You need to log in before you can comment on or make changes to this bug.