Closed Bug 874502 Opened 7 years ago Closed 6 years ago

Investigate making window.close() from chrome code not truly sync

Categories

(Core :: DOM: Core & HTML, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox26 + fixed
firefox27 --- fixed

People

(Reporter: bzbarsky, Assigned: bholley)

References

Details

(Whiteboard: [qa-])

Attachments

(6 files, 4 obsolete files)

4.17 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
2.01 KB, patch
tbsaunde
: review+
Details | Diff | Splinter Review
3.22 KB, patch
jdm
: review+
Details | Diff | Splinter Review
1.76 KB, patch
mossop
: review+
Details | Diff | Splinter Review
1.90 KB, patch
mossop
: review+
Gavin
: feedback+
Details | Diff | Splinter Review
16.52 KB, patch
Details | Diff | Splinter Review
This apparently causes test failures, but we should figure out how many and whether it's worse than fixing all the things that bug 862627 would break due to changing the JSContext.

Specifically, it seems like there are three things we can do with chrome window.close():

1)  Fully async.
2)  Fully sync as now,
3)  End of microtask.

I'm fairly interested in #3 and seeing how it does both with and without bug 862627.
Let's see what tests break when we do fully async (like we do for content):

https://tbpl.mozilla.org/?tree=Try&rev=6dd1fdbc096c
(In reply to Bobby Holley (:bholley) from comment #1)
> Let's see what tests break when we do fully async (like we do for content):
> 
> https://tbpl.mozilla.org/?tree=Try&rev=6dd1fdbc096c

These try results are gone again. How bad was it?
(In reply to :Gijs Kruitbosch from comment #2)
> (In reply to Bobby Holley (:bholley) from comment #1)
> > Let's see what tests break when we do fully async (like we do for content):
> > 
> > https://tbpl.mozilla.org/?tree=Try&rev=6dd1fdbc096c
> 
> These try results are gone again. How bad was it?

IIRC it was tractable but nontrivial. I don't really remember though. Feel free to repush.
This is on the dependency chain for bug 903770. Does it really need to be? If if it does, can we up the priority?
Specifically, does this need to block bug 862627?
(In reply to Jonas Sicking (:sicking) from comment #5)
> Specifically, does this need to block bug 862627?

I'm pretty sure it does. bz knows more.
Flags: needinfo?(bzbarsky)
> Specifically, does this need to block bug 862627?

It does for mainthread, to some extent: without this change that test causes massive chrome test failures because windows die while script is expecting to be able to touch them.  Worse yet, it would likely cause similar issues with addons.

It doesn't have to block for workers, as long as we generate separate worker bindings for EventTarget and its descendants: we can simply use the WebIDL EventListener in worker bindings and keep using nsIDOMEventListener on mainthread.  All the infrastructure for that should be in place.

That said, _I_ would like us to up the priority of this...

Bobby, mind attaching the patch you tested here?
Flags: needinfo?(bzbarsky)
Err, attached the patch to the wrong bug. Here it is, along with a try push so
that we can see the failures again.
Try results are in. As before, there are several (but not too many) failures, mostly in mochitest-chrome and mochitest-browser. We just need someone to spend some time (somewhere between 1 hour and 1 day) fixing those tests to be green, and then this can land.
Bobby, So is it accurate to say that what needs to be done here is to go through those test failures and either:

* Fix legitimate breakage in browser front-end code
* Fix tests where the problem is with the test rather than product code.
(In reply to Jonas Sicking (:sicking) from comment #11)
> Bobby, So is it accurate to say that what needs to be done here is to go
> through those test failures and either:
> 
> * Fix legitimate breakage in browser front-end code
> * Fix tests where the problem is with the test rather than product code.

Correct. From the tests that I've already fixed (bundled into the attached patch), I suspect that it's mostly the latter. Getting a frontend-knowledgeable person to drive this might significantly increase efficiency.
(In reply to Bobby Holley (:bholley) from comment #12)
> Correct. From the tests that I've already fixed (bundled into the attached
> patch), I suspect that it's mostly the latter. Getting a
> frontend-knowledgeable person to drive this might significantly increase
> efficiency.

Can you summarize the behavior change that requires test changes?

I don't quite understand the test changes in your patch - do we need to remove all window.closed checks? Why do you remove a this.windowsOpen get that occurs immediately after a set?
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #13)
> (In reply to Bobby Holley (:bholley) from comment #12)
> > Correct. From the tests that I've already fixed (bundled into the attached
> > patch), I suspect that it's mostly the latter. Getting a
> > frontend-knowledgeable person to drive this might significantly increase
> > efficiency.

FWIW, bz and I decided to just split up the failures and go through them. I'm most of the way through the m-o and JP failures, and Boris is working on bc.
 
> Can you summarize the behavior change that requires test changes?

Invoking window.close() from chrome no longer tears down the window synchronously (along with all the observable effects like firing "domwindowclosed")
 
> I don't quite understand the test changes in your patch - do we need to
> remove all window.closed checks?

because window.closed is no longer set synchronously when close() is invoked from chrome.

> Why do you remove a this.windowsOpen get that occurs immediately after a set?

Because callers that do window.open(); window.close(); no longer receive "domwindowclosed" before "domwindowopened". So the value here is ephemerally 2.
I've got patches to fix the mochitest-chrome failures here. I also dug into the jetpack failure, which turned out to be a royal PITA.

In a nutshell, the panel tests do the following:
(1) close a window with win.close()
(2) listen for "domwindowclosed"
(3) in the domwindowclosed handler, synchronously install a XUL panel and register a listener to wait for the panel to appear.

With the patch in this bug, we time out in (3). This turns out to be because nsXULWindow::Destroy() first unregisters the window from the AppShellService (which synchronously fires domwindowclosed), and _then_ invokes mWindow->Show(false), which calls nsXULPopupManager::Rollup, which tears down _all_ the popups in the system. nsXULPopupManager is a singleton, and doesn't seem to have any awareness of popups on different windows - it just tears them all down.

So the three options, as I see it, are (1) to fire domwindowclosed later (perhaps even asynchronously), (2) to move the call to mWindow->Show(false) before unregistering the XULWindow, or (3) just change the test.

Making domwindowclosed completely async seems much more robust long-term, but seems pretty likely to break things. Hoisting the call to Show(false) seems like a reasonable compromise, except Neil points out that panel rollup happens differently in different widget implementations. On Mac it happens in Show(false), but on Windows it happens when the widget is destroyed. We could jigger things around, but who knows what _that_ would break.

So maybe this stuff is just intractably fickle, and we should just avoid changing things, and fix the test? Gavin, Boris, what do you guys think?

In the words of bent: "Appshell will change you".
Flags: needinfo?(gavin.sharp)
Flags: needinfo?(bzbarsky)
Actually, I guess it would be a change to the SDK, not the test. I'll attach a patch in case we decide to go that route.
Attached patch Fix addon sdk. v1 (obsolete) — Splinter Review
(In reply to Bobby Holley (:bholley) from comment #14)
> > I don't quite understand the test changes in your patch - do we need to
> > remove all window.closed checks?
> 
> because window.closed is no longer set synchronously when close() is invoked
> from chrome.

Actually, this isn't true. It looks like FinalClose() sets mIsClosed before doing any of the asynchronous stuff. So that change I made to bc isn't necessarily valid.

Boris, I've spent most of a day on this bug, and am kind of done. Can you look into that one?
Since popups are involved, Neil, thoughts?
Flags: needinfo?(enndeakin)
Bobby, can you attach the patch to fix the mochitest-chrome failures?
Flags: needinfo?(bzbarsky) → needinfo?(bobbyholley+bmo)
Attachment #797639 - Flags: review?(trev.saunders)
Attachment #797640 - Flags: review?(josh)
Flags: needinfo?(bobbyholley+bmo)
Attachment #797640 - Flags: review?(josh) → review+
Comment on attachment 797639 [details] [diff] [review]
Fix a11y tests. v1

r=me with below explained

>+          ok(!isAccessibleInCache(dlgDoc),
>+             "The document accessible for '" + aURL + "' is in cache still!");
>+        });

what prevents this from racing with SimpleTest.finish() ?
Attachment #797639 - Flags: review?(trev.saunders) → review+
> With the patch in this bug, we time out in (3). This turns out to be because
> nsXULWindow::Destroy() first unregisters the window from the AppShellService
> (which synchronously fires domwindowclosed), and _then_ invokes
> mWindow->Show(false), which calls nsXULPopupManager::Rollup, which tears
> down _all_ the popups in the system. nsXULPopupManager is a singleton, and
> doesn't seem to have any awareness of popups on different windows - it just
> tears them all down.
> 

There shouldn't under normal circumstances be popups associated with multiple windows.

Why is there here?
Flags: needinfo?(enndeakin)
(In reply to Neil Deakin from comment #24)
> > With the patch in this bug, we time out in (3). This turns out to be because
> > nsXULWindow::Destroy() first unregisters the window from the AppShellService
> > (which synchronously fires domwindowclosed), and _then_ invokes
> > mWindow->Show(false), which calls nsXULPopupManager::Rollup, which tears
> > down _all_ the popups in the system. nsXULPopupManager is a singleton, and
> > doesn't seem to have any awareness of popups on different windows - it just
> > tears them all down.
> > 
> 
> There shouldn't under normal circumstances be popups associated with
> multiple windows.
> 
> Why is there here?

There aren't. It's just an ordering thing. Because of the time at which domwindowclosed is fired, we end up adding a popup to window Foo before window Bar has finished going away (and, in particular, before it calls Rollup).
Flags: needinfo?(enndeakin)
> Can you summarize the behavior change that requires test changes?

Yes.  This test is removing a tab from a browser by calling .close() on the content window.  Then it checks .closed on the _chrome_ window.

Before this patch, since chrome code was doing the .close() on the content window it immediately went ahead and tore down the chrome window.  After this patch, teardown happens off an event, just as it would if _content_ code called .close() on the content window.  So at the point at which .closed is checked on the chrome window, that chrome window is not in fact torn down yet.

I'm going to restructure the test to give the callee function the ability to decide when to run the next test, and move the .closed test to be async.  Gavin, does that sound reasonable?
> Why do you remove a this.windowsOpen get that occurs immediately after a set?

We can keep this if we can make the test do an executeSoon between the part that closes a window and the part that opens the next one.  If you know what part that is for this test, please say so!
(In reply to Boris Zbarsky [:bz] from comment #26)
> I'm going to restructure the test to give the callee function the ability to
> decide when to run the next test, and move the .closed test to be async. 
> Gavin, does that sound reasonable?

Yeah, sounds like the right fix.
Flags: needinfo?(gavin.sharp)
Depends on: 913855
With this patch and the patch for bug 913855, the change in this bug doesn't cause any BC orange on try.  See try run at https://tbpl.mozilla.org/?tree=Try&rev=725898b62184
Attachment #801156 - Flags: review?(dolske)
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Assignee: bzbarsky → bobbyholley+bmo
Bobby, even with all the patches in this bug, there are still JP test failures: https://tbpl.mozilla.org/?tree=Try&rev=a17b71c18aa6
So, I thought there was only one JP failure, but it turns out the JP tests are actually just very bad at recovering from timeouts, so the other failures were masked.

The next problem appears to be in testTrackWindows, here:

http://mxr.mozilla.org/mozilla-central/source/addon-sdk/source/test/windows/test-firefox-windows.js#322

We were previously relying on the fact that the onActivate() and onDeactivate() handlers would never be called for windows that we call .close() on, since we'd remove all the listeners upon receiving domwindowclosed. Since this notification is now async, the activate/deactivate notifications come before domwindowclosed, so the evbent handlers get called.

I could fix the test. But I'm concerned that we'd be changing an SDK invariant that would break addons. Should we make .close() (in lib/sdk/windows/firefox.js) eagerly remove all listeners?
Flags: needinfo?(poirot.alex)
I think that's fine. It sounds quite natural to get the deactivate notification on close. (I don't think we will have an active event in common cases, right?)
I'd even say that it would be a bugfix if we weren't receiving deactivate because of the previous synchronous behavior. The important thing is to receive a deactivate for each activate event, and ensure that event order makes sense. closed sounds like something that should be fired last.

But I'd prefer people actually maintaining this code to make this call.
Flags: needinfo?(rFobic)
Flags: needinfo?(poirot.alex)
Flags: needinfo?(evold)
(In reply to Alexandre Poirot (:ochameau) from comment #33)
> I think that's fine. It sounds quite natural to get the deactivate
> notification on close. (I don't think we will have an active event in common
> cases, right?)
> I'd even say that it would be a bugfix if we weren't receiving deactivate
> because of the previous synchronous behavior. The important thing is to
> receive a deactivate for each activate event, and ensure that event order
> makes sense. closed sounds like something that should be fired last.
> 
> But I'd prefer people actually maintaining this code to make this call.

I agree with Alex, I think there firefox window tests were written to rely on sync behavior not because that is what we expected or promised add-on devs, but instead because it was easier to do so.  I don't know though, I wasn't around at the time, but I think it is safe to simply update the tests.
Flags: needinfo?(evold)
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #34)
> I think there firefox window tests were written to rely

*the
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #34)
> but I think it is safe to simply update the tests.

And easier too! I'll do that.
Fixed the test and pushed to try to see if there were any other lingering jetpack failures:

https://tbpl.mozilla.org/?tree=Try&rev=a5a692298e74
Jetpack tests are green! Let's get this stuff reviewed.
Attachment #794758 - Flags: review?(bzbarsky)
Comment on attachment 794758 [details] [diff] [review]
Remove IsCallerChrome path for tearing down windows synchronously. v1

> +++ b/browser/base/content/test/browser_bug462673.js

Drop this part; other patches cover it.

>+++ b/toolkit/components/passwordmgr/test/test_prompt_async.html

Likewise.

r=me
Attachment #794758 - Flags: review?(bzbarsky) → review+
Attachment #801156 - Flags: review?(dolske) → review+
Attachment #801156 - Attachment is obsolete: true
Attachment #801157 - Attachment is obsolete: true
Assignee: bobbyholley+bmo → bzbarsky
Depends on: 916391
Attachment #804817 - Attachment is obsolete: true
Comment on attachment 803472 [details] [diff] [review]
Fix JP test-windows to handle on{activate,deactivate} after close(). v1

Erik is apparently unavailable until next week.  He suggested some other possible reviewers, so trying them, whichever can do it first.
Attachment #803472 - Flags: review?(zer0)
Attachment #803472 - Flags: review?(rFobic)
Attachment #803472 - Flags: review?(jsantell)
Attachment #803472 - Flags: review?(evold)
Attachment #803472 - Flags: review?
Attachment #803473 - Flags: review?(zer0)
Attachment #803473 - Flags: review?(rFobic)
Attachment #803473 - Flags: review?(jsantell)
Attachment #803473 - Flags: review?(evold)
Comment on attachment 794758 [details] [diff] [review]
Remove IsCallerChrome path for tearing down windows synchronously. v1

Once the final reviews for those jetpack test fixes comes through, this can land. And once it lands, I think we should backport it to aurora.

In a nutshell, this is a change to the very finicky behavior of window.close() when invoked by chrome (so only frontend and addons are affected). The patch itself is quite small, but it changes the ordering of certain observer notifications, which has the potential to break frontend code that was relying on it. Boris and I have spent the last few weeks fixing up frontend and jetpack tests to accomodate this change.

This change is required to unblock a bunch of worker stuff that's on the critical path and is probably going to need to land within the next six weeks.

The risk to the browser proper of this change is pretty minimal. But the risk to addons is quite high. Those addons are probably going to have to change, but it would be nice if we had some breathing room where we could back it out for one cycle if we discover late-breaking bustage. And that backout probably won't be possible if we have a big pile of DOM/platform changes depending on this patch in the same cycle.

So I'd like to get this pole a bit out ahead of the other polls. As such, I think we should uplift this to aurora after this lands on central (probably tomorrow).
Attachment #794758 - Flags: approval-mozilla-aurora?
Attachment #803473 - Flags: review?(gavin.sharp)
Attachment #803473 - Flags: review?(dtownsend+bugmail)
Comment on attachment 803473 [details] [diff] [review]
Fix Jetpack to avoid firing script notifications midway through window teardown. v1

Looks fine to me, I'm not a peer.
Attachment #803473 - Flags: review?(gavin.sharp) → feedback+
Comment on attachment 803473 [details] [diff] [review]
Fix Jetpack to avoid firing script notifications midway through window teardown. v1

Review of attachment 803473 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good enough but for the love of all intermittent oranges everywhere can we please get a reliable notification for when a window is truly closed sometime some?
Attachment #803473 - Flags: review?(zer0)
Attachment #803473 - Flags: review?(rFobic)
Attachment #803473 - Flags: review?(jsantell)
Attachment #803473 - Flags: review?(dtownsend+bugmail)
Attachment #803473 - Flags: review+
(In reply to Dave Townsend (:Mossop) from comment #48)
> Comment on attachment 803473 [details] [diff] [review]
> Fix Jetpack to avoid firing script notifications midway through window
> teardown. v1
> 
> Review of attachment 803473 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looks good enough but for the love of all intermittent oranges everywhere
> can we please get a reliable notification for when a window is truly closed
> sometime some?

Can you file a bug about that? Maybe we could use xul-window-destroyed an add a subject? Or maybe we could add something totally different that's guaranteed to round-trip through the event loop?
(In reply to Bobby Holley (:bholley) from comment #49)
> (In reply to Dave Townsend (:Mossop) from comment #48)
> > Comment on attachment 803473 [details] [diff] [review]
> > Fix Jetpack to avoid firing script notifications midway through window
> > teardown. v1
> > 
> > Review of attachment 803473 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Looks good enough but for the love of all intermittent oranges everywhere
> > can we please get a reliable notification for when a window is truly closed
> > sometime some?
> 
> Can you file a bug about that? Maybe we could use xul-window-destroyed an
> add a subject? Or maybe we could add something totally different that's
> guaranteed to round-trip through the event loop?

Filed bug 917371
Attachment #803472 - Flags: review?(zer0)
Attachment #803472 - Flags: review?(rFobic)
Attachment #803472 - Flags: review?(jsantell)
Attachment #803472 - Flags: review?
Attachment #803472 - Flags: review+
Commits pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/f0341976477d74e97f205e40012443916c3fa3ad
Bug 874502 part 2. Fix addon sdk to not assume synchronous closing of windows. r=mossop

https://github.com/mozilla/addon-sdk/commit/bece20a8bd12ee2511ac3bc279e55575931780ee
Bug 874502 part 5. Fix JP test-windows to handle on{activate,deactivate} after close(). r=mossop
Comment on attachment 794758 [details] [diff] [review]
Remove IsCallerChrome path for tearing down windows synchronously. v1

Approving for Aurora uplift - what do we need to be able to tell addon developers to watch out for here?  Is there anything about this we need to be trying to broadcast or get QA coverage on while this is baking on FF26?
Attachment #794758 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I'm going to track this in case we do need to revisit for a backout at some point in Aurora.
(In reply to lsblakk@mozilla.com [:lsblakk] from comment #54)
> Comment on attachment 794758 [details] [diff] [review]
> Remove IsCallerChrome path for tearing down windows synchronously. v1
> 
> Approving for Aurora uplift - what do we need to be able to tell addon
> developers to watch out for here?  Is there anything about this we need to
> be trying to broadcast or get QA coverage on while this is baking on FF26?

There are a few visible changes - closed windows may appear in window enumeration for longer, "domwindowclosed" gets fired later (and async), etc. But I would very much doubt that this is the kind of thing that an addon developer would say "oh! yeah! I rely on that flaky behavior!". I think we should just encourage anyone using windowenumerator, windowwatcher, or "domwindowclosed" to test their addon early.
Whiteboard: [qa-]
Depends on: 918267
Blocks: 922995
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.