WebAudio use-after-free crash [@mozilla::MediaStreamGraphImpl::AppendMessage]

RESOLVED FIXED in Firefox 24

Status

()

defect
--
critical
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: posidron, Assigned: Ehsan)

Tracking

(Blocks 1 bug, 4 keywords)

Trunk
mozilla24
x86_64
macOS
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox21 unaffected, firefox22 disabled, firefox23 disabled, firefox24 fixed, firefox-esr17 unaffected, b2g18 unaffected)

Details

(Whiteboard: [adv-main24-])

Attachments

(5 attachments, 1 obsolete attachment)

Posted file callstack
alloc: ./content/media/webaudio/AudioDestinationNode.cpp:188

  MediaStreamGraph* graph = aIsOffline ?
*                           MediaStreamGraph::CreateNonRealtimeInstance() :
                            MediaStreamGraph::GetInstance();


free: ./content/media/MediaStreamGraph.cpp:1504

void
MediaStream::Destroy()
{
[...]
* GraphImpl()->AppendMessage(new Message(this));
[...]
}


re-use: ./content/media/MediaStreamGraph.cpp:1504

void
MediaStream::Destroy()
{
[...]
* GraphImpl()->AppendMessage(new Message(this));
}


and then:


re-use: ./content/media/MediaStreamGraph.cpp:1379

void
MediaStreamGraphImpl::AppendMessage(ControlMessage* aMessage)
{
[...]
* if (mDetectedNotRunning &&
      mLifecycleState > LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP) {
[...]



Tested with m-i changeset: 132700:cf2106c1f0c7 and the patches in bug 874915 and bug 874869.
So the stack here tells me that we first call AppendMessage() which results in |delete this;|, then we call another AppendMessage() which tries to read mDetectedNotRunning, which is now garbage.

Now, for the first call to have deleted the graph, IsEmpty() must have returned true, which means that there should _not_ be an outstanding MediaStream associated with this graph.

So this got me curious to see if I can come up with cases where mStreams _could_ be empty, and it seems to me that accesses to mStreams on the main thread (the IsEmpty() call at least) is not synchronized in any way with the graph thread, so at least in theory it should be possible for the graph thread to call UpdateStreamOrder for example which temporarily empties out mStreams, and then for an IsEmpty() check on the main thread to succeed.

roc, do you think I'm barking up the wrong tree?
Flags: needinfo?(roc)
Blocks: 875414
Ehsan thinks this bug affects everywhere we have WebRTC.
Blocks: 875402
From AppendMessage:
  if (mDetectedNotRunning &&
      mLifecycleState > LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP) {
    // The graph control loop is not running and main thread cleanup has
    // happened.

So there is no need to synchronize with the graph thread, since it has already exited. At least that's the plan :-)
Flags: needinfo?(roc)
Hmm, then I don't have any ideas, I'm afraid.
Keywords: testcase-wanted
Posted file testcase
Testcase which will crash at the given location.
I believe this is connected to bug 875402. It crashed for me with the stack attached here but sometimes also with the stack attached in bug 875402.
Assignee: nobody → ehsan
Posted file reduced testcase
(from attachment 753840 [details])
Attachment #753854 - Attachment mime type: text/plain → text/html
Christoph, I can't reproduce this bug using the reduced testcase, and having the smallest possible testcase is going to help tremendously here.  Can you please give reducing this testcase another shot?  Thanks!
Flags: needinfo?(cdiehl)
The reduced testcase works but you need to hit refresh a bit more often.
Flags: needinfo?(cdiehl)
Do a soft reload followed by a hard reload - this does trigger it every time for me.

Tested with m-i changeset: 133065:1d6a04be8c0f
OK, I finally have some idea what's happening here.

So we get a call to MediaStreamGraphImpl::AppendMessage, and we delete the graph object, but its thread is still running, which can result in any number of bad things to happen.

Now, the question is why.  I don't know that yet.
I don't actually see a way to fix this problem.  When we currently delete this in AppendMessage, we need to make sure that the thread is correctly shut down.  We can't do that there though because nsThread::Shutdown spins the event loop.  We can't do it before since, well, that ship has sailed.  I'm not sure how MSG is supposed to be properly shut down usually, and why this is not a bug for realtime MSGs.  roc, do you know how this shutdown path is supposed to work?
Flags: needinfo?(roc)
Posted patch Patch (v1) (obsolete) — Splinter Review
I think I finally figured this out!  This is caused by an imbalance in mStreams, since we currently suppress the CreateMessage if run during shutdown.  The obvious solution is to not do that for non-realtime graphs at least.

Requesting feedback from cdiehl to give him a chance to test this with all of these test cases.  It does fix the crash with the non-reduced testcase, but I have not been able to reproduce with the other two.
Attachment #755131 - Flags: review?(roc)
Attachment #755131 - Flags: feedback?(cdiehl)
I can not reproduce it anymore :-)

Tested with m-i changeset: 133183:1c67a51e0fe5 and https://bugzilla.mozilla.org/attachment.cgi?id=755131
Duplicate of this bug: 876273
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #12)
> roc, do you know how this shutdown path is supposed to work?

It's documented in LifecycleState in MediaStreamGraphImpl.h. I know it's complicated, and this complexity keeps being a problem, but I did my best to explain how it works and right now I don't know of a simpler way to do it.

When mLifecycleState reaches LIFECYCLE_WAITING_FOR_THREAD_SHUTDOWN, the following must have already happened:
-- on the MSG thread, RunThread() set LIFECYCLE_WAITING_FOR_MAIN_THREAD_CLEANUP as the last thing it did before releasing the lock on the MediaStreamGraph and exiting to the event loop.
-- then, in a stable state, the main thread took the MSG lock and transitioned mLifecycleState to LIFECYCLE_WAITING_FOR_THREAD_SHUTDOWN. Since the main thread got the lock, we know the RunThread() must have released the lock and either it has actually returned, or at least it's in a state where it won't access the MediaStreamGraph object again.

Let me know if that's still not clear.
Flags: needinfo?(roc)
Comment on attachment 755131 [details] [diff] [review]
Patch (v1)

Review of attachment 755131 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaStreamGraph.cpp
@@ +1257,5 @@
> +      // is destroyed, and if that races with stream creation, we need
> +      // to be able to depend on knowing exactly when the last stream
> +      // is destroyed, hence we cannot bypass processing the create
> +      // message during "shutdown".
> +      Run();

I think this should be unconditional. There's no particular reason why the shutdown path for realtime and offline MSGs should differ, AFAIK.
Maybe what's unclear is that LIFECYCLE_WAITING_FOR_THREAD_SHUTDOWN means that RunThread won't touch the MSG again, so effectively the MSG thread is done. But we're still waiting for XPCOM thread shutdown of that thread, hence the name.
Thanks, this makes more sense now.
Posted patch Patch (v2)Splinter Review
Attachment #755131 - Attachment is obsolete: true
Attachment #755131 - Flags: review?(roc)
Attachment #755131 - Flags: feedback?(cdiehl)
Attachment #755181 - Flags: review?(roc)
Add a comment in RunDuringShutdown that we need to add the stream to balance the pending Destroy call.
https://hg.mozilla.org/mozilla-central/rev/318254a9a66b
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla24
Mass moving Web Audio bugs to the Web Audio component.  Filter on duckityduck.
Component: Video/Audio → Web Audio
Whiteboard: [adv-main24-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.