Closed Bug 1296275 Opened 8 years ago Closed 8 years ago

Intermittent webaudio/test/test_audioParamTimelineDestinationOffset.html,test_mediaDecoding.html | application crashed [@ js::CompartmentChecker::check] after Assertion failure: IsInsideNursery(obj) || !obj->asTenured().isMarked(gc::GRAY)

Categories

(Core :: Web Audio, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: jesup, Assigned: mccr8)

References

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file)

Intermittent I noticed on inbound which implies a GC issue

Assertion in CompartmentChecker::check --  GC-marked-gray object passed into JS_StealArrayBufferContents

https://treeherder.mozilla.org/logviewer.html#?job_id=34081663&repo=mozilla-inbound#L25064

JS_StealArrayBufferContents() was called from:

AudioBuffer::StealJSArrayDataIntoSharedChannels [AudioBuffer.cpp:391)
AudioBuffer::GetThreadSharedChannelsForRate [AudioBuffer.cpp:411)
AudioBufferSourceNode::SendBufferParameterToStream [AudioBufferSourceNode.cpp:703)
Terrence: what are the likely sources of this type of error/assertion, and what are the likely consequences in opt builds without the assertion?

I see bug 1291364 where bz did some work to avoid this in workers; are patterns like that what we should look to to avoid this?  (Note I haven't actually looked at the patch, just the bug.)
Rank: 23
Flags: needinfo?(terrence)
Looks like bug 1292852 is a dumping ground for assertions generated by this
See Also: → 1292852
Summary: Assertion due to a GC-marked-gray object passed into JS_StealArrayBufferContents → Intermittent dom/media/webaudio/test/test_audioParamTimelineDestinationOffset.html | application crashed [@ js::CompartmentChecker::check] after Assertion failure: IsInsideNursery(obj) || !obj->asTenured().isMarked(gc::GRAY)
Terrence has a proposed fix for this class of assertions
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Actually, this is probably a long-tail situation and the stack is pretty different.

The problem is that if we have a GC-black-to-CC-gray edge then there is a UAF bug somewhere in the system. This assertion ensures that we never pass a gray object into SpiderMonkey, which would imply that an ExposeToActiveJS was missed somewhere. There may be many sites that are missing this barrier.
Status: RESOLVED → REOPENED
Flags: needinfo?(terrence)
Resolution: DUPLICATE → ---
Going to need more room in the summary, then, even just for a webaudio dumping ground.
Summary: Intermittent dom/media/webaudio/test/test_audioParamTimelineDestinationOffset.html | application crashed [@ js::CompartmentChecker::check] after Assertion failure: IsInsideNursery(obj) || !obj->asTenured().isMarked(gc::GRAY) → Intermittent webaudio/test/test_audioParamTimelineDestinationOffset.html,test_mediaDecoding.html | application crashed [@ js::CompartmentChecker::check] after Assertion failure: IsInsideNursery(obj) || !obj->asTenured().isMarked(gc::GRAY)
(In reply to Terrence Cole [:terrence] from comment #4)
> Actually, this is probably a long-tail situation and the stack is pretty
> different.
> 
> The problem is that if we have a GC-black-to-CC-gray edge then there is a
> UAF bug somewhere in the system. This assertion ensures that we never pass a
> gray object into SpiderMonkey, which would imply that an ExposeToActiveJS
> was missed somewhere. There may be many sites that are missing this barrier.

Is there a good explanation/example of the requirements around ExposeToActiveJS that would help us figure out what's being done wrong here?  GC-black-to-CC-edge means little to me and nothing to most of our devs (I can infer a little from it), and if we're going to have this assertion there to catch real problems, it would help to have something to point people who hit it in the right direction (for example, a comment pointing to a wiki page or bug with the info, or info in the comment itself.)
Flags: needinfo?(terrence)
> Is there a good explanation/example of the requirements around ExposeToActiveJS
> that would help us figure out what's being done wrong here?

The fundamental requirement is this: If you have a JS thing that you cycle collect (in this case the things in AudioBuffer.mJSChannels) then you must ensure that ExposeToActiveJS is called any time you're relying on the GC keeping it alive.

This requirement is fairly complicated to test for, because it depends on lifetime analysis of the cycle-collected member.  So in practice we assert a simpler invariant: any time you're passing a thing to JSAPI functions that do anything interesting it needs to have been exposed to active JS.  In theory, we should really assert if a non-exposed thing is stored in a Rooted, but in practice this would have too many false positives right now (e.g. places where the store into Rooted happens right before an Expose call).
karl - is this something you could take, or at least scope the work required to deal with it?  Thanks!
Rank: 23 → 15
Flags: needinfo?(karlt)
Priority: P2 → P1
The basic fix is to call JS::ExposeObjectToActiveJS whenever you read a value out of channelArray, except for the cycle collector helper functions. The nicest way to deal with that would be a subclass of JS::Heap that does that automatically, with a color-preserving getter for the CC to use, and then change mJSChannels to use it. Terrence, what do you think about that?
> you read a value out of channelArray
Sorry, I meant mJSChannels here.
(In reply to Andrew McCreight [:mccr8] from comment #14)
Terrence is working on this in bug 1297558.
Depends on: 1297558
Terrence said his big fix will take a bit so I'll do something in the interim to stop the bleeding.
Assignee: nobody → continuation
Flags: needinfo?(terrence)
Flags: needinfo?(karlt)
I looked at 8 or so crashes and they were all in JS_StealArrayBufferContents() so I'll just patch that up.
Keywords: leave-open
Attachment #8786159 - Flags: review?(karlt) → review?(terrence)
Well, maybe it makes more sense for Terrence to review this.
Comment on attachment 8786159 [details]
Bug 1296275 - Be better about exposing to active js in AudioBuffer::StealJSArrayDataIntoSharedChannels().

https://reviewboard.mozilla.org/r/75124/#review73196

Looks good to me.
Comment on attachment 8786159 [details]
Bug 1296275 - Be better about exposing to active js in AudioBuffer::StealJSArrayDataIntoSharedChannels().

https://reviewboard.mozilla.org/r/75124/#review73198
Attachment #8786159 - Flags: review?(terrence) → review+
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8a8e5d79f09c
Be better about exposing to active js in AudioBuffer::StealJSArrayDataIntoSharedChannels(). r=terrence
I'll check this in a day or two to see if there are any residual crashes once the bulk of them have been fixed (hopefully) by my patch.
Flags: needinfo?(continuation)
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Flags: needinfo?(continuation)
Resolution: --- → FIXED
Keywords: leave-open
Comment on attachment 8786159 [details]
Bug 1296275 - Be better about exposing to active js in AudioBuffer::StealJSArrayDataIntoSharedChannels().

Approval Request Comment
[Feature/regressing bug #]: Incremental CC, found by the assertions in bug 1283634.
[User impact if declined]: Intermittent crashes.
[Describe test coverage new/current, TreeHerder]: It's been on TH for a week and no longer hits the assertions.
[Risks and why]: There was a slow trickle of hard-to-exploit UAF crashes in previous branches caused by missing ExposeToActiveJS barriers in a few places. We added an assertion that catches these in bug 1283634. We'd like to uplift the fixes to Aurora to solve the crashes 6 weeks earlier than we otherwise might. The impact is relatively low, but the patches are also extremely simple and low risk. Aurora seems like the right balance here.
[String/UUID change made/needed]: None.
Attachment #8786159 - Flags: approval-mozilla-aurora?
Target Milestone: --- → mozilla51
Comment on attachment 8786159 [details]
Bug 1296275 - Be better about exposing to active js in AudioBuffer::StealJSArrayDataIntoSharedChannels().

Crash fix, has stabilized on Nightly for ~10 days, Aurora50+
Attachment #8786159 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
See Also: 1292852
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: