Closed Bug 1510382 Opened Last year Closed 11 months ago

PROCESS-CRASH | newmailaccount | application crashed [@ nsGlobalWindowOuter::SetOpenerWindow(nsPIDOMWindowOuter*, bool)]

Categories

(Thunderbird :: General, defect, critical)

defect
Not set
critical

Tracking

(thunderbird65 fixed, thunderbird66 fixed)

RESOLVED FIXED
Thunderbird 66.0
Tracking Status
thunderbird65 --- fixed
thunderbird66 --- fixed

People

(Reporter: jorgk, Assigned: farre)

References

Details

(Keywords: crash, Whiteboard: [Thunderbird-testfailure: Z all][Thunderbird-disabled-test])

Attachments

(2 files, 4 obsolete files)

In opt builds we see:
PROCESS-CRASH | newmailaccount | application crashed [@ nsGlobalWindowOuter::SetOpenerWindow(nsPIDOMWindowOuter*, bool)]

In debug builds we have:
Assertion failure: !aOriginalOpener || !aOpener || aOpener->GetBrowsingContext() == GetBrowsingContext()->GetOpener(), at /builds/worker/workspace/build/src/dom/base/nsGlobalWindowOuter.cpp:2272

https://taskcluster-artifacts.net/HWzyuLyCQU2ubUmfGcsJUQ/0/public/logs/live_backing.log show that it's from this test:
TEST-START | /builds/worker/workspace/build/tests/mozmill/newmailaccount/test-newmailaccount.js

M-C last good: ce39a152428a7f8ba5a4c82455dcf501c7
M-C first bad: f83dcdf697697c8708e8015cd5c7b0b49d
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ce39a152428a7f8ba5a4c82455dcf501c7&tochange=f83dcdf697697c8708e8015cd5c7b0b49d

That's new here:
https://hg.mozilla.org/mozilla-central/rev/cb9dec83210d#l6.12

For opt builds the crash is here:

https://taskcluster-artifacts.net/VQ68sxiZQdyhkWnL9szjnQ/0/public/logs/live_backing.log

INFO -   0  libxul.so!nsGlobalWindowOuter::SetOpenerWindow(nsPIDOMWindowOuter*, bool) [nsGlobalWindowOuter.cpp:fab70cf9e35485946adfc8a507210a6f837c64e0 : 2270 + 0x0]
INFO -   1  libxul.so!nsWindowWatcher::OpenWindowInternal(mozIDOMWindowProxy*, char const*, char const*, char const*, bool, bool, bool, nsIArray*, bool, bool, nsDocShellLoadState*, mozIDOMWindowProxy**) [nsWindowWatcher.cpp:fab70cf9e35485946adfc8a507210a6f837c64e0 : 2075 + 0x17]
INFO -   2  libxul.so!non-virtual thunk to nsWindowWatcher::OpenWindow2(mozIDOMWindowProxy*, char const*, char const*, char const*, bool, bool, bool, nsISupports*, bool, bool, nsDocShellLoadState*, mozIDOMWindowProxy**) [nsWindowWatcher.cpp:fab70cf9e35485946adfc8a507210a6f837c64e0 : 413 + 0x37]

Andreas, that shouldn't crash under any circumstance, should it?

NIing some C-C people who might want to take a look too.
Flags: needinfo?(benc)
Flags: needinfo?(afarre)
Flags: needinfo?(acelists)
Running
mozmake SOLO_TEST=newmailaccount/test-newmailaccount.js mozmill-one
locally I get this stack:

Assertion failure: !aOriginalOpener || !aOpener || aOpener->GetBrowsingContext() == GetBrowsingContext()->GetOpener(), at c:/mozilla-source/comm-central/dom/base/nsGlobalWindowOuter.cpp:2272
#01: nsWindowWatcher::ReadyOpenedDocShellItem (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nsWindowWatcher.cpp:2081)
#02: nsWindowWatcher::OpenWindowInternal (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nsWindowWatcher.cpp:998)
#03: nsWindowWatcher::OpenWindow2 (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nsWindowWatcher.cpp:413)
#04: nsGlobalWindowOuter::OpenInternal (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowOuter.cpp:7157)
#05: nsGlobalWindowOuter::OpenOuter (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowOuter.cpp:5561)
#06: nsGlobalWindowInner::Open (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowInner.cpp:4093)
#07: mozilla::dom::Window_Binding::open (c:\mozilla-source\comm-central\obj-i686-pc-mingw32\dom\bindings\WindowBinding.cpp:2423)
#08: mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::MaybeGlobalThisPolicy,mozilla::dom::binding_detail::ThrowExceptions> (c:\mozilla-source\comm-central\dom\bindings\BindingUtils.cpp:3376)
#09: CallJSNative (c:\mozilla-source\comm-central\js\src\vm\Interpreter.cpp:468)
#10: js::InternalCallOrConstruct (c:\mozilla-source\comm-central\js\src\vm\Interpreter.cpp:560)
#11: InternalCall (c:\mozilla-source\comm-central\js\src\vm\Interpreter.cpp:614)

and more, but then we're already in JS land.
I've been trying to bring gdb to bear on this, but I'm really struggling with it here. I can attach it while running, but it always seems to fail on me (a poll() call in the jsbridge times out - the debugger slows things down, so I suspect I can track down and increase the timeout as a workaround).

There's reference in the mozmill docs and source to passing in "--debugger=gdb" to run it via gdb (you can pass in extra params by setting MOZMILL_EXTRA).
But digging down through all the layers of python, I can see that that the code to use the "--debugger" option invoke this isn't actually being run.
Is this just a bug, or am I doing something wrong, or is there some fundamental reason mozmill can't run under gdb after all?
Yep, this is new code intended to replace workings of nsDocShell. Good thing is that it isn't actually replacing it at the moment, it only does the work in parallel and checks what happens is consistent.

In this particular case we've re-written how we set the opener of a browsing context. Before the opener would be set on an outer window in a method call after nsDocShell construction, but now we make it possible to pass the opener on construction and set it then. But this is something we only do for the new BrowsingContext class that we're moving nsDocShell functionality into in increments, so the old setting of opener is still there, but we want to use the fact that we know the opener that we're bound to set.

So if that assert happens something is either missing or the assert is not complete. I'll build thunderbird and take a look.

Ben, can you use rr and debug after?
Flags: needinfo?(afarre)
Only one subtest fails. If you want to run/debug is locally, please remove the line before running the test. You could also try to disable/delete the other parts since it's a very long test.
Flags: needinfo?(acelists)
Keywords: leave-open
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/48522dae2fd4
temporarily disable failing part of newmailaccount/test-newmailaccount.js. rs=bustage-fix
(In reply to Andreas Farre [:farre] from comment #3)
> Ben, can you use rr and debug after?

I think I just hit the same issue: rr is invoked instead of the executable, just as gdb (or valgrind or whatever else) is, right? That's what I'm having trouble with - the runtest.py script that thunderbird uses to run the mozmill tests just doesn't seem to handle running any debugger tool at all. I'm digging into it, but it's all a little twisty-turny down there in the depths of the mozmill modules :-) Will post an update if I have any breakthroughs, otherwise I'll find some workaround.
Whiteboard: [Thunderbird-testfailure: Z all][Thunderbird-disabled-test]
Yeah, that would be awesome. Given an rr trace this should definitely be investigatable.
Hmm, only one subtest fails here:

https://searchfox.org/comm-central/rev/5f75e643c8b41a286005bb2b1558bf06edae18e7/mail/test/mozmill/newmailaccount/test-newmailaccount.js#948

Looks like the "horrible" thing it's doing is this:
  open_content_tab_with_click(newTabLink, function(aURL) {
    return aURL.host == "localhost" && aURL.pathQueryRef == "/target.html";
  });

So somehow that calls into here:
https://searchfox.org/comm-central/rev/5f75e643c8b41a286005bb2b1558bf06edae18e7/mail/base/content/mailWindow.js#627
note the nice comment I added back then ;-)

So I guess something around that code will be wrong now. Perhaps visual inspected gets us further quicker than painful debugging. Andreas, can you please take another look.
Flags: needinfo?(afarre)
Sure thing, looking. It makes sense that it has something to do with opening windows. 

I might end up just removing that assert. I still think that it should hold, but it currently isn't necessary to hold for gecko to function.

The nice thing will be that given the possibility to test it I'm going to be able to break in SetOpenerWindow where this assert doesn't hold and actually see what's going on.
Assignee: nobody → afarre
Flags: needinfo?(afarre)
OK, so I've got a couple of ways to get the mozmill run invoking a debugger:

1) rename the Thunderbird binary and replace it with a wrapper script, eg:

    #!/bin/bash
    here=$( dirname ${BASH_SOURCE[0]} )
    gdb -ex "run" --args $here/thunderbird2 $@

2) hack the mozmill package to pass debug_args in to the runner
(for me, it's in obj-x86_64-pc-linux-gnu/_tests/mozmill-virtualenv/lib/python2.7/site-packages/mozmill/__init__.py):

in MozMill.start(), change:
        self.runner.start()
 to:
        self.runner.start(debug_args = ['gdb', '-ex', 'run', '--args'])

or whatever args should be on the commandline before the thunderbird executable.
(ideally, we'd pass these in via MOZMILL_EXTRA, but there are sooooo many mozmill layers to pass the args through, so for now, it's nasty-hack time).

This still doesn't seem to work so well for gdb - I think stdin/stdout are being co-opted by mozmill, so it's hard to tell what's going on and it doesn't seem interactive. And I've had enough digging about in mozmill for now.

But it does mean that an rr run should work just fine!

If rr ran on my machine. Which - as it turns out - it doesn't (https://github.com/mozilla/rr/issues/2034). Sigh.

Maybe this is useful to you Andreas?

But - as Jorg said - you can't really beat visual inspection. And printfs everywhere. So that's what I'm doing now.
Flags: needinfo?(benc)
Oh, I had a quick stab at running under valgrind (not so much for this bug as for others), but no dice there either.

Valgrind complained about missing support for a syscall (seccomp), then segfaulted.
A valgrind bug rather than a TB bug, but frustrating anyway.
Attached patch skip-non-asserting-tests.patch (obsolete) — Splinter Review
helper patch to skip all the non-crashy tests. "Crash early, crash often" (tm)
I have the review for the patch that fixes this over in Bug 1510928.
Keywords: leave-open
Andreas, that sadly doesn't fix the problem, the crash persists:

https://hg.mozilla.org/mozilla-central/rev/6f780fe1f6cc#l1.15
+    MOZ_DIAGNOSTIC_ASSERT(
+        !aOriginalOpener || !aOpener ||
+        nsGlobalWindowOuter::Cast(aOpener)->IsClosedOrClosing() ||
+        aOpener->GetBrowsingContext() == GetBrowsingContext()->GetOpener());

Assertion failure: !aOriginalOpener || !aOpener || nsGlobalWindowOuter::Cast(aOpener)->IsClosedOrClosing() || aOpener->GetBrowsingContext() == GetBrowsingContext()->GetOpener(), at c:/mozilla-source/comm-central/dom/base/nsGlobalWindowOuter.cpp:2238
#01: nsWindowWatcher::OpenWindowInternal (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nswindowwatcher.cpp:927)
#02: nsWindowWatcher::OpenWindow2 (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nswindowwatcher.cpp:364)
#03: nsGlobalWindowOuter::OpenInternal (c:\mozilla-source\comm-central\dom\base\nsglobalwindowouter.cpp:6677)
#04: nsGlobalWindowOuter::OpenJS (c:\mozilla-source\comm-central\dom\base\nsglobalwindowouter.cpp:5262)
#05: nsGlobalWindowOuter::OpenOuter (c:\mozilla-source\comm-central\dom\base\nsglobalwindowouter.cpp:5228)
#06: nsGlobalWindowInner::Open (c:\mozilla-source\comm-central\dom\base\nsglobalwindowinner.cpp:3677)
#07: mozilla::dom::Window_Binding::open (c:\mozilla-source\comm-central\obj-x86_64-pc-mingw32\dom\bindings\windowbinding.cpp:2452)
#08: mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::MaybeGlobalThisPolicy,mozilla::dom::binding_detail::ThrowExceptions> (c:\mozilla-source\comm-central\dom\bindings\bindingutils.cpp:3063)
#09: CallJSNative (c:\mozilla-source\comm-central\js\src\vm\interpreter.cpp:443)
#10: js::InternalCallOrConstruct (c:\mozilla-source\comm-central\js\src\vm\interpreter.cpp:535)
and so on.
Flags: needinfo?(afarre)
Right, I'll take another look.
Flags: needinfo?(afarre)
Ok, taken a long a thorough look and this is what I've found; after calling createContentWindow[1] the returned mozIDOMWindowProxy has mDocShell->mBrowsingContext->mOpener == nullptr. Looking at createContentWindow I see that it calls getContentWindowOrOpenURI[2] which in turn calls tabmail.openTab[3] to get a new tab and with that a new window. But that call to openTab doesn't pass opener an opener. Looking at openTab[4] I see a further call to openTab[5] which finally creates a tab, but still without passing opener while creating a window. The assert we see happens here[6], and to make it work the opener must be threaded to this point and presetOpenerWindow[7] needs to be called to provide the opener before the window is created.

Since this is a Thunderbird bug I'm unassigning myself, but hopefully this is enough information to be able to fix it in the right place.

[1] https://searchfox.org/comm-central/source/mail/base/content/mailWindow.js#628
[2] https://searchfox.org/comm-central/source/mail/base/content/mailWindow.js#642
[3] https://searchfox.org/comm-central/source/mail/base/content/mailWindow.js#681
[4] https://searchfox.org/comm-central/source/mail/base/content/tabmail.xml#497
[5] https://searchfox.org/comm-central/source/mail/base/content/specialTabs.js#702
[6] https://searchfox.org/comm-central/source/mail/base/content/specialTabs.js#723
[7] https://searchfox.org/mozilla-central/source/dom/webidl/MozFrameLoaderOwner.webidl#8
Flags: needinfo?(jorgk)
Attachment #9029869 - Attachment is obsolete: true
Assignee: afarre → nobody
Thanks, Andreas. I assume the patch adding
  !XRE_IsContentProcess() ||
will be dropped.

Let's see whether we can get this fixed with the information you've provided.

Any takers?
Flags: needinfo?(jorgk)
Flags: needinfo?(benc)
Flags: needinfo?(acelists)
Attached patch 1510382-pass-opener.patch (obsolete) — Splinter Review
This is based on comment #17 and "pattern matching". I have to build first to try it. Andreas, something like this?
Attachment #9030760 - Flags: feedback?(afarre)
Comment on attachment 9030760 [details] [diff] [review]
1510382-pass-opener.patch

Review of attachment 9030760 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, I'm fairly certain that this is how you should do it. Maybe we can get the opinion of some Firefox front end dev as well.
Comment on attachment 9030760 [details] [diff] [review]
1510382-pass-opener.patch

Doesn't work :-(
Attachment #9030760 - Flags: feedback?(afarre)
Attached patch 1510382-pass-opener.patch (v2) (obsolete) — Splinter Review
The previous version didn't work, in fact, it obliterated the test completely, and this version doesn't work either. I'm not sure to which object the
  .presetOpenerWindow(aArgs.opener)
should be applied.

This really isn't my area of expertise, so I'm giving up.
Assignee: nobody → jorgk
Assignee: jorgk → nobody
The crash from comment #14 is still up-to-date apart from the line numbers. Here a refresh:

Assertion failure: !aOriginalOpener || !aOpener || nsGlobalWindowOuter::Cast(aOpener)->IsClosedOrClosing() || aOpener->GetBrowsingContext() == GetBrowsingContext()->GetOpener(), at c:/mozilla-source/comm-central/dom/base/nsGlobalWindowOuter.cpp:2239
#01: nsWindowWatcher::OpenWindowInternal (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nsWindowWatcher.cpp:929)
#02: nsWindowWatcher::OpenWindow2 (c:\mozilla-source\comm-central\toolkit\components\windowwatcher\nsWindowWatcher.cpp:364)
#03: nsGlobalWindowOuter::OpenInternal (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowOuter.cpp:6682)
#04: nsGlobalWindowOuter::OpenOuter (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowOuter.cpp:5233)
#05: nsGlobalWindowInner::Open (c:\mozilla-source\comm-central\dom\base\nsGlobalWindowInner.cpp:3695)
#06: mozilla::dom::Window_Binding::open (c:\mozilla-source\comm-central\obj-x86_64-pc-mingw32\dom\bindings\WindowBinding.cpp:2452)
#07: mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::MaybeGlobalThisPolicy,mozilla::dom::binding_detail::ThrowExceptions> (c:\mozilla-source\comm-central\dom\bindings\BindingUtils.cpp:3066)
#08: CallJSNative (c:\mozilla-source\comm-central\js\src\vm\Interpreter.cpp:443)

This time I also got a JS stack using this:
  if (mDocShell) {
    xpc_DumpJSStack(true, true, true);
    MOZ_DIAGNOSTIC_ASSERT(
        !aOriginalOpener || !aOpener ||
        // TODO(farre): Allowing to set a closed or closing window as

This is what I get right before the crash:
0 onclick(event = [object MouseEvent]) ["http://localhost:43336/registration.html?product=personalized_email&quote=b28acb3c0a464d33af22":1:0]
    this = [object HTMLParagraphElement]
1 synthesizeMouse(aTarget = [cross-compartment wrapper], aOffsetX = 410.5, aOffsetY = 9, aEvent = [object Object], aWindow = [cross-compartment wrapper]) ["chrome://mozmill/content/stdlib/EventUtils.js":238:6]
    this = [object Object]
2 MozMillController.prototype.mouseEvent(aTarget = [object Object], aOffsetX = 410.5, aOffsetY = 9, aEvent = [object Object], aExpectedEvent = undefined) ["chrome://mozmill/content/modules/controller.js":516:4]
    this = [object Object]

Andreas, how does that relate to comment #17?
Yeah, so that is definitely expected. That assert triggers due to mDocShell->mBrowsingContext->mOpener == nullptr, but aOpener->mDocShell->mBrowsingContext != nullptr.

So, the patch is almost correct, sloppy reviewing from my side, sorry. Since you haven't appended the cloned element yet you'll never find the element with the id 'browser', so what you need to do is to call presetOpenerWindow on the clone!

If I do:

       // First clone the page and set up the basics.
       let clone = document.getElementById("contentTab").firstChild.cloneNode(true);
 
+      if ("opener" in aArgs && aArgs.opener)
+        clone.querySelector("browser").presetOpenerWindow(aArgs.opener);
+

instead, all tests passes!
Flags: needinfo?(jorgk)
Works for me, I'll attach the patch in your name and get it landed. I hope there's no objection. It's all your idea ;-)
Assignee: nobody → afarre
Status: NEW → ASSIGNED
Flags: needinfo?(jorgk)
Flags: needinfo?(benc)
Flags: needinfo?(acelists)
Attachment #9028766 - Attachment is obsolete: true
Attachment #9030760 - Attachment is obsolete: true
Attachment #9030835 - Attachment is obsolete: true
Attachment #9031420 - Flags: review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/f5c4404b29d2
Pass opener to specialTabs.openTab(). r=jorgk
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 66.0
Attachment #9031420 - Flags: approval-comm-beta+
You need to log in before you can comment on or make changes to this bug.