Closed Bug 1223372 Opened 6 years ago Closed 6 years ago

devtools test_throw.html causing crash on osx 10.10.5 machines about 50% of the time

Categories

(DevTools :: Console, defect)

defect
Not set
normal

Tracking

(firefox45 fixed)

RESOLVED FIXED
Firefox 45
Tracking Status
firefox45 --- fixed

People

(Reporter: jmaher, Assigned: jandem)

References

Details

Attachments

(1 file, 1 obsolete file)

I thought we were finished on osx 10.10.5, but alas there is a new issue on mochitest other (in this case we run the chrome suite, not browser-chrome) on the file devtools/shared/webconsole/test/test_throw.html:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=67186bce0bae&group_state=expanded&filter-searchStr=other

since this doesn't happen everytime, it could have been the case when I verified a green run on oct 31:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=80c5a8048328&filter-searchStr=oth

I have retriggered the 'oth' job on opt as well as opt/debug from the Oct 31 try run to see if this is debug only and something newer.

you can see the stack in the raw log file:
http://archive.mozilla.org/pub/firefox/try-builds/jmaher@mozilla.com-67186bce0bae0e3687ce2df523e03ae3d5ca0db7/try-macosx64-debug/try_yosemite_r7-debug_test-mochitest-other-bm107-tests1-macosx-build8.txt.gz

04:55:53     INFO -  Operating system: Mac OS X
04:55:53     INFO -                    10.10.5 14F27
04:55:53     INFO -  CPU: amd64
04:55:53     INFO -       family 6 model 69 stepping 1
04:55:53     INFO -       4 CPUs
04:55:53     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
04:55:53     INFO -  Crash address: 0x0
04:55:53     INFO -  Process uptime: 20 seconds
04:55:53     INFO -  Thread 0 (crashed)
04:55:53     INFO -   0  XUL!xpc::WrapperFactory::PrepareForWrapping(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<JSObject*>) [WrapperFactory.cpp:67186bce0bae : 165 + 0x0]
04:55:53     INFO -      rax = 0x0000000000000000   rdx = 0x00007fff70f751f8
04:55:53     INFO -      rcx = 0x0000000000000000   rbx = 0x00007fff70f75c50
04:55:53     INFO -      rsi = 0x0000fe000000fe00   rdi = 0x00000001062b9f5c
04:55:53     INFO -      rbp = 0x00007fff5fbfcc70   rsp = 0x00007fff5fbfcae0
04:55:53     INFO -       r8 = 0x00007fff5fbfca90    r9 = 0x00007fff72532300
04:55:53     INFO -      r10 = 0x0000000000000002   r11 = 0x0000000000004000
04:55:53     INFO -      r12 = 0x0000000109e1d000   r13 = 0x0000000113c02868
04:55:53     INFO -      r14 = 0x00007fff5fbfce00   r15 = 0x0000000109e1d018
04:55:53     INFO -      rip = 0x00000001025a4eea
04:55:53     INFO -      Found by: given as instruction pointer in context
04:55:53     INFO -   1  XUL!JSCompartment::wrap(JSContext*, JS::MutableHandle<JSObject*>, JS::Handle<JSObject*>) [jscompartment.cpp:67186bce0bae : 441 + 0xc]
04:55:53     INFO -      rbx = 0x0000000109e1d018   rbp = 0x00007fff5fbfcd80
04:55:53     INFO -      rsp = 0x00007fff5fbfcc80   r12 = 0x0000000113c02868
04:55:53     INFO -      r13 = 0x0000000109e1d000   r14 = 0x00007fff5fbfce48
04:55:53     INFO -      r15 = 0x0000000113c02800   rip = 0x0000000105dab3d1
04:55:53     INFO -      Found by: call frame info
04:55:53     INFO -   2  XUL!JS_WrapObject(JSContext*, JS::MutableHandle<JSObject*>) [jsapi.cpp:67186bce0bae : 850 + 0x12]
04:55:53     INFO -      rbx = 0x0000000132ea64c0   rbp = 0x00007fff5fbfcdb0
04:55:53     INFO -      rsp = 0x00007fff5fbfcd90   r12 = 0x00007fff5fbfcf40
04:55:53     INFO -      r13 = 0x00007fff5fbfd101   r14 = 0x00007fff5fbfce48
04:55:53     INFO -      r15 = 0x0000000109e1d000   rip = 0x0000000105d71a89
04:55:53     INFO -      Found by: call frame info
04:55:53     INFO -   3  XUL!XPCConvert::NativeInterface2JSObject(JS::MutableHandle<JS::Value>, nsIXPConnectJSObjectHolder**, xpcObjectHelper&, nsID const*, XPCNativeInterface**, bool, nsresult*) [XPCConvert.cpp:67186bce0bae : 790 + 0x5]
04:55:53     INFO -      rbx = 0x000000011fe20808   rbp = 0x00007fff5fbfcf20
04:55:53     INFO -      rsp = 0x00007fff5fbfcdc0   r12 = 0x00007fff5fbfcf40
04:55:53     INFO -      r13 = 0x00007fff5fbfd101   r14 = 0x00007fff5fbfce38
04:55:53     INFO -      r15 = 0x00007fff5fbfd0c0   rip = 0x00000001025db92f
04:55:53     INFO -      Found by: call frame info
04:55:53     INFO -   4  XUL!XPCConvert::NativeData2JS(JS::MutableHandle<JS::Value>, void const*, nsXPTType const&, nsID const*, nsresult*) [XPCConvert.cpp:67186bce0bae : 342 + 0x24]
04:55:53     INFO -      rbx = 0x00000001025dac12   rbp = 0x00007fff5fbfd050
04:55:53     INFO -      rsp = 0x00007fff5fbfcf30   r12 = 0x00007fff5fbfd094
04:55:53     INFO -      r13 = 0x00007fff5fbfd198   r14 = 0x00007fff5fbfd0b0
04:55:53     INFO -      r15 = 0x00007fff5fbfd0c0   rip = 0x00000001025dad5d
04:55:53     INFO -      Found by: call frame info
04:55:53     INFO -   5  XUL!CallMethodHelper::GatherAndConvertResults() [XPCWrappedNative.cpp:67186bce0bae : 1633 + 0x5]
04:55:53     INFO -      rbx = 0x00007fff5fbfd150   rbp = 0x00007fff5fbfd100
04:55:53     INFO -      rsp = 0x00007fff5fbfd060   r12 = 0x0000000000000000
04:55:53     INFO -      r13 = 0x00007fff5fbfd190   r14 = 0x0000000000000000
04:55:53     INFO -      r15 = 0x0000000000000012   rip = 0x0000000102649889
04:55:53     INFO -      Found by: call frame info
04:55:53     INFO -   6  XUL!CallMethodHelper::Call() [XPCWrappedNative.cpp:67186bce0bae : 1425 + 0x8]
...
ok, something changed between oct 31 and today- nothing with the test specifically, and debug only.  I wonder if this is a platform related change.  I am trying to reproduce on a loaner first.
I can reproduce this locally, but not all the time.  Working on trying to narrow this down.  It seems to reproduce as part of the entire directory of tests, not just the single test case.
this appears that test_throw.html is a red herring, it is the last test case- so something in here is causing the problems, let me try a few different builds!
oh fun the crash happened somewhere in 176 changesets from over the weekend:
https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=e1ef2be156de

At least we have it down to a single range!
bah, treeherder is being slow, might not get this narrowed down much more.
and the culprit is:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=ff1dc3e22d19

I am not sure what to do next to debug this- but I can take some instructions and give it a try.  I found that a fresh restart and then running the test right away always reproduces it.

:jandem, can you take a look at this?
Flags: needinfo?(jdemooij)
jmaher, what's the exact command you use to reproduce this? Is it just test_throw.html or do you have to run more tests? I'm on OS X 10.10.5 so it'd be great if I could repro it locally.
I had downloaded a build and test package, then ran the mochitests by hand:
python mochitest/runtests.py --app ../application/NightlyDebug.app/Contents/MacOS/firefox --utility-path bin --extra-profile-file bin/plugins --certificate-path certs --chrome devtools/shared/webconsole/test

I assume this is related to a local debug build of:
./mach test devtools/shared/webconsole/test

note- this reproduced 100% of the time as the first command run after a fresh reboot, otherwise it sometimes took a few tries to see the error.

As a related note, this isn't test_throw.html, it is a crash in the browser I think on shutdown, it just so happens to be that test_throw.html is the last test we run in that directory.
It's pretty weird, we really should have a WindowProxy object there (after unwrapping) but we're failing that assertion.

The only possibility I see is that we turned it into a dead object wrapper. I pushed a patch to Try to add some logging right before the assertion failure, to test that theory:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8bd99165b159
So we have a Window and get its WindowProxy but we get a dead object proxy instead. The old OuterObject hook went through the C++ outer window and got the WindowProxy from there.

I'm not sure how to handle this. Is it really a problem if ToWindowProxyIfWindow returns a dead object proxy?
(In reply to Jan de Mooij [:jandem] from comment #10)
> The old OuterObject hook went through the C++ outer window and got
> the WindowProxy from there.

Thinking about this more, the old OuterObject hook would return a CrossCompartmentWrapper but at some later point this wrapper could *also* be nuked, just like we're doing here.

I think returning a dead object proxy from ToWindowProxyIfWindow is okay, but we should audit all callers.

I'll also try to write a test for this.
(In reply to Jan de Mooij [:jandem] from comment #11)
> I'll also try to write a test for this.

Hm not sure how to test this.

bz, you suggested starting with a chrome document, loading another chrome document in a subframe and then getting its 'window'. Isn't that going to give us (a CCW to) a WindowProxy, instead of the Window?
> the old OuterObject hook would return a CrossCompartmentWrapper but at some later point
> this wrapper could *also* be nuked

Indeed.  What's not clear is whether immediately outerizing to a dead object proxy is OK.  I mean, it seems like we used to be "resurrecting" things that were dead-object-proxied... and now we won't.  Might be OK.

> Isn't that going to give us (a CCW to) a WindowProxy, instead of the Window?

Yes.  And then I said to navigate the subframe to a non-chrome thing.  This will create a new WindowProxy, point our CCW to that WindowProxy (I believe), and transplant the old WindowProxy to be an Xray to the new WindowProxy, right?  Then if we remove the iframe from the DOM and gc that Xray should, if I understand this stuff correctly, become a dead object wrapper.  Now we just need to trigger outerization somehow in that unloaded chrome document, and should hit this codepath.  I think.
Flags: needinfo?(khuey)
Flags: needinfo?(bobbyholley)
There's also the extra complication of the NukeReferencesToWindow enum here:

http://mxr.mozilla.org/mozilla-central/source/js/src/jsfriendapi.h#1117

So I think that technically resurrection would be the more correct behavior, but this case only comes up in the situation when we're navigating a given docshell between chrome and non-chrome content, which is totally insecure anyway, and something I'd like to prohibit altogether. So I don't really mind if we make navigated-away-from chrome documents behave a bit more flakily in these cases, though it'd be great if we could warn people somehow that they should stop doing that.
Flags: needinfo?(bobbyholley)
Are my steps in comment 13 the right way to get into this situation?
Flags: needinfo?(bobbyholley)
(In reply to Boris Zbarsky [:bz] from comment #15)
> Are my steps in comment 13 the right way to get into this situation?

Just to be clear, the issue is that we previously outerized via JSInner->C++Inner->C++Outer->JSOuter (and hence there was no CCW to nuke), and that we're now outerizing via JSInner->JSOuter (with a CCW, which gets nuked)?

In that case, yes, I think we want a chrome subframe which navigates itself to a content page, and then, while navigated away from, evals |window|.
Flags: needinfo?(bobbyholley)
Attached patch Patch (obsolete) — Splinter Review
This patch changes PrepareForWrapping to handle the dead wrapper case.

I also checked the other callers of ToWindowProxyIfWindow and they don't assume the result is a WindowProxy.

The test I added fails at the revision before bug 1125423 landed and passes after it.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Flags: needinfo?(jdemooij)
Attachment #8687581 - Flags: review?(bobbyholley)
:jandem, thanks for getting a patch up for this!  Looking forward to this landing and turning on the new 10.10.5 machines by default!
Comment on attachment 8687581 [details] [diff] [review]
Patch

Review of attachment 8687581 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/wrappers/WrapperFactory.cpp
@@ +168,5 @@
> +    // If we've got a WindowProxy or dead wrapper (from the
> +    // ToWindowProxyIfWindow call above), there's nothing special that needs
> +    // to be done here, and we can move on to the next phase of wrapping.
> +    // We handle this case first to allow us to assert against wrappers below.
> +    if (js::IsWindowProxy(obj) || JS_IsDeadWrapper(obj))

As long as we're explicitly handling the dead-wrapper case, I'd rather avoid creating a cross-compartment edge to a dead proxy. I'd be fine with JS_ReportError and returning nullptr in this case (causing wrapping to fail), or generating a local dead object proxy (by entering the local compartment, and creating one).
Attachment #8687581 - Flags: review?(bobbyholley) → review-
Attached patch PatchSplinter Review
This adds the JS_ReportError call as suggested.

I also *finally* managed to reproduce the assertion failure itself (with Components.utils.getGlobalForObject), so I could write a test for it.

This patch also fixes the devtools orange on OS X 10.10.
Attachment #8687581 - Attachment is obsolete: true
Attachment #8688421 - Flags: review?(bobbyholley)
Attachment #8688421 - Flags: review?(bobbyholley) → review+
Flags: needinfo?(khuey)
https://hg.mozilla.org/mozilla-central/rev/29001cebc62c
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 45
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.