crash in mozilla::dom::AudioDestinationNode::DestroyAudioChannelAgent

VERIFIED FIXED in Firefox 46

Status

()

defect
P1
critical
Rank:
7
VERIFIED FIXED
3 years ago
3 years ago

People

(Reporter: philipp, Assigned: karlt)

Tracking

({crash, regression})

42 Branch
mozilla48
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox45 affected, firefox46 verified, firefox47 verified, firefox48 verified)

Details

(crash signature)

Attachments

(2 attachments)

This bug was filed from the Socorro interface and is 
report bp-6eae7b0c-9356-4aa4-a2c4-0ad0a2160310.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::dom::AudioDestinationNode::DestroyAudioChannelAgent() 	dom/media/webaudio/AudioDestinationNode.cpp
1 	xul.dll 	mozilla::dom::AudioContext::Suspend(mozilla::ErrorResult&) 	dom/media/webaudio/AudioContext.cpp
2 	xul.dll 	nsGlobalWindow::SuspendTimeouts(unsigned int, bool, bool) 	dom/base/nsGlobalWindow.cpp
3 	xul.dll 	nsGlobalWindow::SuspendTimeouts(unsigned int, bool, bool) 	dom/base/nsGlobalWindow.cpp
4 	xul.dll 	nsGlobalWindow::SuspendTimeouts(unsigned int, bool, bool) 	dom/base/nsGlobalWindow.cpp
5 	xul.dll 	nsXMLHttpRequest::Send(nsIVariant*, mozilla::dom::Nullable<nsXMLHttpRequest::RequestBody> const&) 	dom/base/nsXMLHttpRequest.cpp
6 	xul.dll 	nsXMLHttpRequest::Send(JSContext*, mozilla::ErrorResult&) 	dom/base/nsXMLHttpRequest.h
7 	xul.dll 	mozilla::dom::XMLHttpRequestBinding::send 	obj-firefox/dom/bindings/XMLHttpRequestBinding.cpp
8 	xul.dll 	mozilla::dom::GenericBindingMethod(JSContext*, unsigned int, JS::Value*) 	dom/bindings/BindingUtils.cpp
9 	xul.dll 	js::Invoke(JSContext*, JS::CallArgs const&, js::MaybeConstruct) 	js/src/vm/Interpreter.cpp
10 	xul.dll 	Interpret 	js/src/vm/Interpreter.cpp
11 	xul.dll 	js::RunScript(JSContext*, js::RunState&) 	js/src/vm/Interpreter.cpp
12 	xul.dll 	js::Execute(JSContext*, JS::Handle<JSScript*>, JSObject&, JS::Value*) 	js/src/vm/Interpreter.cpp
13 	xul.dll 	ExecuteScript 	js/src/jsapi.cpp
14 	xul.dll 	ExecuteScript 	js/src/jsapi.cpp
15 	xul.dll 	JS_ExecuteScript(JSContext*, JS::AutoVectorRooter<JSObject*>&, JS::Handle<JSScript*>) 	js/src/jsapi.cpp
16 	xul.dll 	nsJSUtils::EvaluateString(JSContext*, JS::SourceBufferHolder&, JS::Handle<JSObject*>, JS::CompileOptions&, nsJSUtils::EvaluateOptions const&, JS::MutableHandle<JS::Value>, void**) 	dom/base/nsJSUtils.cpp
17 	xul.dll 	nsJSUtils::EvaluateString(JSContext*, JS::SourceBufferHolder&, JS::Handle<JSObject*>, JS::CompileOptions&, void**) 	dom/base/nsJSUtils.cpp
18 	xul.dll 	nsScriptLoader::EvaluateScript(nsScriptLoadRequest*, JS::SourceBufferHolder&) 	dom/base/nsScriptLoader.cpp

this cross-platform crash seems to be regressing since firefox 42 builds, but its volume is visibly spiking recently (since 2016-02-25): https://crash-stats.mozilla.com/signature/?date=%3E2015-06-01&signature=mozilla%3A%3Adom%3A%3AAudioDestinationNode%3A%3ADestroyAudioChannelAgent#graphs

on 44.0.2 it is on #24 of the top crash list with 0.5% of all crashes in this version (in 45 it's still too hard to quantify the "real" impact, since trusteer crashes are dominating the data).
Thanks for the report.

Offset 0xb0 in amd64 reports is consistent with
AudioDestinationNode::mAudioChannelAgent suggesting that mDestination is null
at
http://hg.mozilla.org/releases/mozilla-release/annotate/b6609650a911/dom/media/webaudio/AudioContext.cpp#l869
which is the situation after the AudioContext is unlinked.

So this looks like a regression from
http://hg.mozilla.org/releases/mozilla-release/rev/feb02fb51745#l6.12

The AudioContext is not removed from the nsGlobalWindow until
~AudioContext().

Perhaps, because the AudioContext is not in a usable state after Unlink(), the
safest thing would be to remove from nsGlobalWindow in Unlink().  I don't
know whether or not we have a convention re removing weak pointers from
Unlink().

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3bb587880320
Assignee: nobody → karlt
Blocks: 1180539
OS: Windows NT → All
Priority: -- → P1
Hardware: x86 → All
there's also a [@ mozilla::dom::AudioDestinationNode::CreateAudioChannelAgent ] signature spiking since the same date, like: https://crash-stats.mozilla.com/report/index/aa898ff5-d0d7-40e8-8852-6dcbe2160311

should i file a new bug for this one or are they interrelated?
Flags: needinfo?(karlt)
Thanks, philipp.  Good to know that version exists too, but it's the same root cause, so no need for a new bug.
Flags: needinfo?(karlt)
mDestination is cleared during unlink, which means that after that point the
window can't do much with the AudioContext, nor should need to do so.

Review commit: https://reviewboard.mozilla.org/r/39563/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/39563/
Attachment #8729695 - Flags: review?(ehsan)
Crash Signature: [@ mozilla::dom::AudioDestinationNode::DestroyAudioChannelAgent] → [@ mozilla::dom::AudioDestinationNode::DestroyAudioChannelAgent] [@ mozilla::dom::AudioDestinationNode::CreateAudioChannelAgent]
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

https://reviewboard.mozilla.org/r/39563/#review36239
Attachment #8729695 - Flags: review?(ehsan) → review+
Karl -- Thanks for jumping on this.  As a follow up, can we add a test?  (I don't want to hold up landing this for a test.)
Flags: needinfo?(karlt)
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/39563/diff/1-2/
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

Sorry to throw this at you again, Ehsan.  Are you happen with the disconnect from the destructor also?
Attachment #8729695 - Flags: review+ → review?(ehsan)
https://reviewboard.mozilla.org/r/39563/#review36289

::: dom/media/webaudio/AudioContext.cpp:152
(Diff revisions 1 - 2)
>  AudioContext::~AudioContext()
>  {
> +  DisconnectFromWindow();

This version keeps the disconnect in the destructor also, which is what I intended to do, but missed.

It's actually unnecessary because unlink is also called because the context and destination always make a cycle, but I don't want to risk leaving a dangling pointer if that ever changes.
Attachment #8730071 - Flags: review?(ehsan)
Flags: needinfo?(karlt)
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

https://reviewboard.mozilla.org/r/39563/#review36371

To be perfectly honest, I thought the original version of the patch did disconnect in the destructor as well, otherwise I'd have asked for it.  Thanks for the due diligence and sorry I missed this earlier!
Attachment #8729695 - Flags: review?(ehsan) → review+
Comment on attachment 8730071 [details]
MozReview Request: mochitest for bug 1255618 r=ehsan

https://reviewboard.mozilla.org/r/39651/#review36379

::: dom/media/webaudio/test/test_bug1255618.html:19
(Diff revision 1)
> +function collect_and_fetch() {
> +    SpecialPowers.forceGC();
> +    SpecialPowers.forceCC();
> +
> +    var xhr = new XMLHttpRequest();
> +    xhr.open("GET", filename, false);

I assume you're doing the sync XHR to spin the event loop.  Please add a comment to that effect.
Attachment #8730071 - Flags: review?(ehsan) → review+
Attachment #8730071 - Attachment description: MozReview Request: mochitest for bug 1255618 → MozReview Request: mochitest for bug 1255618 r=ehsan
Comment on attachment 8730071 [details]
MozReview Request: mochitest for bug 1255618 r=ehsan

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/39651/diff/1-2/
https://reviewboard.mozilla.org/r/39651/#review36379

> I assume you're doing the sync XHR to spin the event loop.  Please add a comment to that effect.

I added a comment to explain.  It's actually to suspend timeouts, as in the crash report stacks.
https://hg.mozilla.org/mozilla-central/rev/b22ef40863a0
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Hi Karl -- It looks like the test got backed out, but the fix landed.  Can you re-land the test and ask for uplift to Aurora and Beta?  (Like I said earlier, I don't want to hold up the fix for the test, especially since this is a regression.) Thanks.
Flags: needinfo?(karlt)
The test depends on bug 1257407.

Waiting to check bug 1257090 is fixed before requesting uplift.
Depends on: 1257407
Flags: needinfo?(karlt)
(In reply to Karl Tomlinson (ni?:karlt) from comment #19)
> The test depends on bug 1257407.
> 
> Waiting to check bug 1257090 is fixed before requesting uplift.

I'm not sure it makes sense to wait on a retest of bug 1257090 before uplifting this fix.  We know you found a real bug here.  If bug 1257090 still exists after testing with this fix, we can handle that separately.  If you feel strongly we should wait on a retest of bug 1257090, I'll send Jesse email letting him know this is blocked on his re-testing of bug 1257090.  Thanks!
Flags: needinfo?(karlt)
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

Approval Request Comment
[Feature/regressing bug #]: bug 1180539
[User impact if declined]: safe crash when sync XHR is used soon after
AudioContexts are CCed. #24 crash.
[Describe test coverage new/current, TreeHerder]:
New test cannot land due to bug 257407.
There is however existing test coverage that would have been likely to catch
problems with the patch. 
[Risks and why]:
Things involving CC timing are not simple to analyse, but I've looked at the
interactions between the objects being separated by this patch and they all
should not be able to cause observable effects that would be influenced by
this patch.
[String/UUID change made/needed]: none.

If beta is in a stage of complete risk aversion, then I'd understand rejecting
this patch.  I do however recommend uplift for stability improvement.
Flags: needinfo?(karlt)
Attachment #8729695 - Flags: approval-mozilla-beta?
Attachment #8729695 - Flags: approval-mozilla-aurora?
Duplicate of this bug: 1257090
Wes, Karl, the MozReview request shows 86 test jobs failed on inbound. Is this still ok to uplift? Were those intermittent failures? Thanks!
Flags: needinfo?(wkocher)
Flags: needinfo?(karlt)
From my understanding, those failures in Mozreview were from the test that was added, which was then backed out before the merge to m-c. Sounds like that test depends on another patch landing before it lands.

I can't speak to whether it'd be a good idea to uplift the patch without the test, but it seems to be doing okay on m-c.
Flags: needinfo?(wkocher)
Comment on attachment 8729695 [details]
MozReview Request: bug 1255618 remove AudioContext from global window at unlink r?ehsan

Crash fix, taking it in Aurora47 and Beta 46 (since we still have over half of beta cycle remaining).
Attachment #8729695 - Flags: approval-mozilla-beta?
Attachment #8729695 - Flags: approval-mozilla-beta+
Attachment #8729695 - Flags: approval-mozilla-aurora?
Attachment #8729695 - Flags: approval-mozilla-aurora+
Wes described the situation accurately, thanks.
Flags: needinfo?(karlt)
Socorro shows 0 crashes over the past 4 weeks, in builds after this landed.
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.