Closed Bug 1428725 Opened 6 years ago Closed 6 years ago

Crash in `anonymous namespace''::Wrap

Categories

(DevTools :: Console, defect, P2)

x86_64
All
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: emmanuel.benzaquin, Assigned: baku)

References

Details

(Keywords: regression)

Crash Data

Attachments

(3 files, 2 obsolete files)

Attached file Test to reproduce bug (obsolete) —
This bug was filed from the Socorro interface and is
report bp-0e5b658b-fffd-423c-bc04-0c2000180104.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll `anonymous namespace'::Wrap dom/workers/RuntimeService.cpp:944
1 xul.dll JSCompartment::getOrCreateWrapper js/src/jscompartment.cpp:433
2 xul.dll JSCompartment::wrap js/src/jscompartment.cpp:479
3 xul.dll JSCompartment::wrap js/src/jscompartmentinlines.h:155
4 xul.dll js::CrossCompartmentWrapper::nativeCall js/src/proxy/CrossCompartmentWrapper.cpp:407
5 xul.dll js::Proxy::nativeCall js/src/proxy/Proxy.cpp:543
6 xul.dll TypedArray_lengthGetter js/src/vm/TypedArrayObject.cpp:1414
7 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:473
8 xul.dll js::fun_call js/src/jsfun.cpp:1239
9 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:473

=============================================================

Crash occurs in developer tools for worker when I try call console.log
on a typed array built with a SharedArrayBuffer received from the main thread.

Launch firefox on test.html
Open Debugger
Launch worker Debugger
Select Console Tab in worker developer tools ==> Crash
Hello Emmanuel, thanks for filing this bug.
I put up your test page online https://devtools-sab-crash.glitch.me/ to test it, but I can't reproduce the crash.
I do have an error in the console, but only saying that SharedArrayBuffer is undefined (they were disabled because of Spectre/Meltdown). Don't you see this error in the console ?

Also, can you be more specific on each steps ? What do you mean by "launch worker Debugger" ?

Thanks !
Flags: needinfo?(emmanuel.benzaquin)
(In reply to Nicolas Chevobbe [:nchevobbe] from comment #1)
> Hello Emmanuel, thanks for filing this bug.
> I put up your test page online https://devtools-sab-crash.glitch.me/ to test
> it, but I can't reproduce the crash.
> I do have an error in the console, but only saying that SharedArrayBuffer is
> undefined (they were disabled because of Spectre/Meltdown). Don't you see
> this error in the console ?
> 
> Also, can you be more specific on each steps ? What do you mean by "launch
> worker Debugger" ?
> 
> Thanks !

Hello Nicolas,

I use a nightly version of Firefox : 59.0a1 (2018-01-02) (64-bit)
and in about:config I have :

devtools.debugger.new-debugger-frontend = false
devtools.debugger.workers   =  true
javascript.options.shared_memory  = true

I test url https://devtools-sab-crash.glitch.me/ and I reproduce the bug

To launch worker debugger I click on script.js in main debugger ==> a new window is opened with title 
"Developer Tools - Worker - script.js"
this window has only 2 tabs : Console and Debugger
it is opened on tab Debugger

when I activate the tab Console I get the bug

Sorry for my English
Flags: needinfo?(emmanuel.benzaquin)
(In reply to emmanuel.benzaquin from comment #2)
> (In reply to Nicolas Chevobbe [:nchevobbe] from comment #1)
> > Hello Emmanuel, thanks for filing this bug.
> > I put up your test page online https://devtools-sab-crash.glitch.me/ to test
> > it, but I can't reproduce the crash.
> > I do have an error in the console, but only saying that SharedArrayBuffer is
> > undefined (they were disabled because of Spectre/Meltdown). Don't you see
> > this error in the console ?
> > 
> > Also, can you be more specific on each steps ? What do you mean by "launch
> > worker Debugger" ?
> > 
> > Thanks !
> 
> Hello Nicolas,
> 
> I use a nightly version of Firefox : 59.0a1 (2018-01-02) (64-bit)
> and in about:config I have :
> 
> devtools.debugger.new-debugger-frontend = false
> devtools.debugger.workers   =  true
> javascript.options.shared_memory  = true
> 
> I test url https://devtools-sab-crash.glitch.me/ and I reproduce the bug
> 
> To launch worker debugger I click on script.js in main debugger ==> a new
> window is opened with title 
> "Developer Tools - Worker - script.js"
> this window has only 2 tabs : Console and Debugger
> it is opened on tab Debugger
> 
> when I activate the tab Console I get the bug
> 
> Sorry for my English

Re-Hello Nicolas,

I test with another version : 57.0 (64 bits)

with same config, and I do not reproduce the bug
Attached file bug1428725.zip
New test to reproduce this bug
Attachment #8940653 - Attachment is obsolete: true
This bug is not specific to SharedArrayBuffer, in new test I can reproduce it with ArrayBuffer and Uint8Array
(In reply to emmanuel.benzaquin from comment #4)
> Created attachment 8941055 [details]
> bug1428725.zip
> 
> New test to reproduce this bug

Thanks for the test case with instructions - I can confirm the crash on Nightly 59. :baku, would you be able to help debug what's causing this?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(amarchesini)
Keywords: regression
Attached patch console.patch (obsolete) — Splinter Review
This happens because PreDispatch() fails, the ConsoleData is not set to eInUse, and when it's relased, we crash finding it as eUnused.
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Attachment #8941161 - Flags: review?(bugs)
Comment on attachment 8941161 [details] [diff] [review]
console.patch

baku, could you explain this a bit more? Do you know which bug regressed this?
(this is marked to be regression)
Flags: needinfo?(amarchesini)
I don't think this is a regression, and if it is, it's related to SharedArrayBuffer: something that cannot be serialized to string and that is disabled by default and soon removed.

Here a description of what happens:

ConsoleData has a state: a. by default Unknown; b. "in used" if owned by a ConsoleRunnable; c. "Unused"  when ready to be released. If ConsoleData is released in a state different than "Unused", we crash in debug builds.
Here we crash because the ConsoleRunnable owns ConsoleData but this is not set to 'eInUse'. And this happens because PreDispatch() doesn't complete the operation, because of a StructuredCloneAlgorithm failure.

The fix is: we should mark ConsoleData as 'eInUse' immediately when PreDispatch() is executed, because, also if there is a failure, ReleaseData() is always called, and there we don't want to see 'Unused' ConsoleData objects.
Flags: needinfo?(amarchesini)
FYI I marked it as a regression based on the testcase attached which says the crash doesn't occur in Firefox 54. I just confirmed on 57 that a crash doesn't occur as well.
Comment on attachment 8941161 [details] [diff] [review]
console.patch

But shouldn't mCallData->mStatus = ConsoleCallData::eInUse; be set in 
ConsoleCallDataRunnable constructor?
Comment on attachment 8941161 [details] [diff] [review]
console.patch

So conceptually, as far as I understand this code, ConsoleCallData should change its state when it is passed to ConsoleCallDataRunnable, not in PreDispatch.

But regression range would be nice here too, in order that we know which branches may want to have the fix.
Attachment #8941161 - Flags: review?(bugs) → review-
Attached patch console.patchSplinter Review
In the CTOR is fine as well. Initially I wanted to mark it as in used when the runnable is actually dispatched, but there is fine as well.

I think the range is related to SharedArrayBuffer because the test uses SharedArrayBuffer, which fails in ToString() and in being cloned by StructuredCloneAlgorithm.
Attachment #8941161 - Attachment is obsolete: true
Attachment #8941341 - Flags: review?(bugs)
Comment on attachment 8941341 [details] [diff] [review]
console.patch

This is for the debug-only assertion.
Attachment #8941341 - Flags: review?(bugs) → review+
Adding leave-open, so that we don't yet close this bug before actually fixing the issue.
Keywords: leave-open
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/11c801abda3c
Fix a crash in ConsoleData when StructuredCloneHolder fails to write data, r=smaug
Attached file bug1428725bis.zip
New test case for this bug

No use of SharedArrayBuffer and no use of ArrayBuffer

Just call new Uint8Array(length) and console.log(array) in worker

==> ok with Firefox 57.0.4
==> crash with Nightly 2018-01-13
Thanks for you bug test and the PDF. I can reproduce this issue.
Ok, I think I understand what's going on here and this might just be an overly-strict assert unless it reflects a GC graph invariant necessary for correctness in which case we have to think harder.

So the sequence is:
 1. debugger compartment D creates a wrapper W for a content typed array T in content compartment C
 2. while executing in D, the .length getter F is called on W
 3. F is the native JSFunction TypedArray_lengthGetter, not a wrapper, so it is called still in G
 4. TypedArray_lengthGetter uses CallNonGenericMethod() calls CCW::nativeCall because 'this' is the wrapper W
 5. CCW::nativeCall enters the compartment of unwrapped 'this', C, and then wraps the callee, F, into C
 6. the worker wrap hook release-asserts when trying to create a wrapper from D into C:
  https://searchfox.org/mozilla-central/source/dom/workers/RuntimeService.cpp#934

So the first question is why F is not also a wrapper into C, but rather a function living in D.  I think the answer might be bholley's slaughterhouse work that does some proto-swapping for builtin objects?  (ni? bholley)

Assuming F is as it should be, then the next question is whether GC correctness actually requires there to be no CCW's from D to C.  (ni? jonco)

Assuming there is no GC invariant, then the last question is if we should just remove the assert or if there is some weaker assert that would hold.
Flags: needinfo?(jcoppeard)
Flags: needinfo?(bobbyholley)
Oh, and baku's callstack: https://pastebin.mozilla.org/9076523
(In reply to Luke Wagner [:luke] from comment #20)
> Ok, I think I understand what's going on here and this might just be an
> overly-strict assert unless it reflects a GC graph invariant necessary for
> correctness in which case we have to think harder.
> 
> So the sequence is:
>  1. debugger compartment D creates a wrapper W for a content typed array T
> in content compartment C
>  2. while executing in D, the .length getter F is called on W
>  3. F is the native JSFunction TypedArray_lengthGetter, not a wrapper, so it
> is called still in G
>  4. TypedArray_lengthGetter uses CallNonGenericMethod() calls
> CCW::nativeCall because 'this' is the wrapper W
>  5. CCW::nativeCall enters the compartment of unwrapped 'this', C, and then
> wraps the callee, F, into C
>  6. the worker wrap hook release-asserts when trying to create a wrapper
> from D into C:
>  
> https://searchfox.org/mozilla-central/source/dom/workers/RuntimeService.
> cpp#934
> 
> So the first question is why F is not also a wrapper into C, but rather a
> function living in D.  I think the answer might be bholley's slaughterhouse
> work that does some proto-swapping for builtin objects?  (ni? bholley)

There's not proto-swapping per se, but if Xrays are involved, we'd mint a new length getter in the wrapper's compartment, which would be a native function rather than a wrapper. I don't recall the details of the debugger setup well enough to know whether Xrays are / should be involved with debugger compartments.

An easy way to verify that this is the cause would be to inspect the proxy handler of W and determine if it's an Xray.
Flags: needinfo?(bobbyholley)
(In reply to Luke Wagner [:luke] from comment #20)
> Assuming F is as it should be, then the next question is whether GC
> correctness actually requires there to be no CCW's from D to C.  (ni? jonco)

Both ways round are fine from a GC point of view.  We already have edges from debugger to debuggee and the GC adds fake edges from the debuggee to the debugger to ensure that debugger and debuggee compartments are swept in the same sweep group.  

The assert makes sense to me from a correctness point of view though since you don't want the debuggee to have access to anything in the debugger compartment.
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #23)
> The assert makes sense to me from a correctness point of view though since
> you don't want the debuggee to have access to anything in the debugger
> compartment.

It does seem like a nice property, but it does mean that unwitting JS debugger code can cause a hard browser crash.  Do you know if the main thread wrap hooks have anything like this?

So based on what bholley said, I was wondering if maybe the debugger JS code (that is calling the .length getter) is explicitly  .call()ing a debugger length getter.  I tried to see what the calling JS is by running the testcase in comment 18, but it hits a different assert:
  https://searchfox.org/mozilla-central/source/dom/workers/ScriptLoader.cpp#1093
> https://searchfox.org/mozilla-central/source/dom/workers/ScriptLoader.
> cpp#1093

Are you loading file:// URLs? Use a web server. The issue you are triggering is this: bug 1429127.
(In reply to Andrea Marchesini [:baku] from comment #25)
Ah hah, that fixed it.  However, ow I don't see any crash with the testcase; I just see some BROWSER and WORKER log statements in the console.
(In reply to Luke Wagner [:luke] from comment #26)
> (In reply to Andrea Marchesini [:baku] from comment #25)
> Ah hah, that fixed it.  However, ow I don't see any crash with the testcase;
> I just see some BROWSER and WORKER log statements in the console.

there is a PDF in the zip file. Follow that STR. You must open about:debugging#workers and from there you must open the web console connected to worker script.
(In reply to Luke Wagner [:luke] from comment #24)
> Do you know if the main thread wrap hooks have anything like this?

As far as I can see, no they do not.

> it does mean that unwitting JS debugger code can cause a hard browser crash

Is the debugger unwittingly executing debuggee code?  That seems like a problem.

Do you know where in the debugger this problem happens?  I looked at the stack but couldn't get much out of it.
(In reply to Jon Coppeard (:jonco) from comment #28)
> > it does mean that unwitting JS debugger code can cause a hard browser crash
> 
> Is the debugger unwittingly executing debuggee code?  That seems like a
> problem.

No, really the debugger is doing everything right here, they are just trying to access the 'length' of a typed array and our wrapper nativeCall() machinery is wrapping the callee into the debuggee's compartment.  See comment 20 for more.

> Do you know where in the debugger this problem happens?  I looked at the
> stack but couldn't get much out of it.

Yeah, that's what I'm asking about too; I can't repro any crash, but baku apparently can.
I don't know if this is particularly useful, but, when I reproduce the crash, calling DumpJSStack() says: 'void'.
luke, if you want, we can debug this together on IRC.
Flags: needinfo?(luke)
(debugged together on IRC)  So here's the debugger code callsite:
  https://searchfox.org/mozilla-central/source/devtools/server/actors/object.js#2642

So they are going out of their way to use the debugger-global typed array length getter and .call() it on a content typed array and so this assert is completely expected.  So we have three options here:
 1. just delete the failing assert; this line of code shows it's not valid
 2. try to preserve the intention of the assert by adding another disjunct allowing callees that are wrapped natives (or something like that)
 3. change the debugger code to (somehow) get the builtin content length getter and .call() that instead of the debugger length getter

3 seems like the best fix given that simple asserts are nice and the assert already holds for everything else, but I don't know how the debugger compartment can reliably get the right content getter.  bholley: did you have to solve this problem for any other chrome->content situations?
Flags: needinfo?(luke) → needinfo?(bobbyholley)
(In reply to Luke Wagner [:luke] from comment #31)
> 3 seems like the best fix given that simple asserts are nice and the assert
> already holds for everything else, but I don't know how the debugger
> compartment can reliably get the right content getter.  bholley: did you
> have to solve this problem for any other chrome->content situations?

In general, this is the problem that Xrays solve - they know how to mint a fresh getter in the caller's compartment, which is exactly what the devtools code is doing manually.

The problem is that Workers don't have any security wrappers - Eddy wanted to avoid making the XPConnect security wrapper setup operate correctly for workers when introducing this debugger machinery there, and so the compromise was to make strict rules about who touches what, and hard-assert those in the worker security wrapper. I think those asserts need to remain in place, because they're all that we have.

Really though, the issue here is that the nativeCall implementation happens to rewrap the callee into the target compartment as an implementation detail. Perhaps we could fix its API somehow, or make the worker version of CCW specialize nativeCall to work around this issue?
Flags: needinfo?(bobbyholley)
And to be clear - Xrays on workers wouldn't solve this, they'd just make that length-lookup property the default, and we'd then hit this crash even more.

And I guess it isn't quite true that the asserts are _all_ we have, since we do implement a simple opaque wrapper in the other direction: https://searchfox.org/mozilla-central/rev/74b7ffee403c7ffd05b8b476c411cbf11d134eb9/dom/workers/RuntimeService.cpp#951

However, with the lack of Xrays, I'd generally be very worried to allow the untrusted code to get references to our code, even opaque ones. So I'd like to keep the assert.
Yes, the assert seems quite useful, so let's rule out option 1.

nativeCall rewraps the callee into the compartment of 'this' to ensure the compartment invariant when executing the native; I'm not sure what we could possibly change about its behavior that would helper here.

Are either of option 2 or 3 viable?
Flags: needinfo?(bobbyholley)
(In reply to Luke Wagner [:luke] from comment #34)
> Yes, the assert seems quite useful, so let's rule out option 1.
> 
> nativeCall rewraps the callee into the compartment of 'this' to ensure the
> compartment invariant when executing the native; I'm not sure what we could
> possibly change about its behavior that would helper here.

I specifically meant nerfing the callee or replacing it with some kind of dummy object. In the non-worker chrome->content Xray world, that callee is going to be an opaque security wrapper anyway, so I can't imagine all that much interesting happening with it.

> Are either of option 2 or 3 viable?

Option 2 could work, it's basically just Option 1 but more scoped down, so I'm not super wild about it.

Option 3 would work fine conceptually but it's probably a fair amount of code. All the right data structures and APIs are available because Xrays use them, though Xrays mint funtions on the caller side, and here we want the callee side. Could add a custom API on the worker global.

Really though I think we should fix this in nativeCall somehow, so that this code can just work similarly to how Xrays work on the main thread.
Flags: needinfo?(bobbyholley)
Ah ok, nerfing sounds like it could work if callees are already something that can't really be depended on.
Baku, could you assign a priority to this ? It looks like the crash didn't happen for some time now, but I'm not used to these crash bugs.
Flags: needinfo?(amarchesini)
Luke, do you know who can work on this bug?
Flags: needinfo?(amarchesini) → needinfo?(luke)
OS: Windows 10 → All
Priority: -- → P2
bholley has the winning idea here, so maybe he could write the patch or suggest someone who could?
Flags: needinfo?(luke)
Product: Firefox → DevTools
Closing because no crash reported since 12 weeks.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Closing because no crash reported since 12 weeks.
(In reply to Sylvestre Ledru [:sylvestre] from comment #41)
> Closing because no crash reported since 12 weeks.

I experienced something that is marked related with this today with v63.0.3.
My steps:
1. Open a worker debugger window.
2. Go to console there and enter any variable name.
3. Expand the printed object -> Tab crash (Object is expanded anyway)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: