creation/destruction of DecodedStreamData might get out of order when multiple DecodedStream::Start()/Stop() are called in a short time

RESOLVED FIXED in Firefox 48

Status

()

Core
Audio/Video: Playback
P2
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: jwwang, Assigned: jwwang)

Tracking

unspecified
mozilla48
Points:
---

Firefox Tracking Flags

(firefox46 wontfix, firefox47 wontfix, firefox48 fixed)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

(Assignee)

Description

2 years ago
Spawned from bug 1249540 comment 8.

repro steps:
1. run http://thomastortorini.free.fr/bug-currenttime/
2. wait for a while

result:
crash for hitting the assertion at https://hg.mozilla.org/mozilla-central/file/f0c0480732d36153e8839c7f17394d45f679f87d/dom/media/mediasink/OutputStreamManager.cpp#l32
(Assignee)

Updated

2 years ago
Blocks: 1249540
(Assignee)

Comment 1

2 years ago
root cause:
DecodedStreamData is created asynchronously on the main thread when DecodedStream::Start() is called and 2 consecutive DecodedStreamData creation might happen back to back without the 1st one gets destroyed first.

Here is an example call flow:
1. DecodedStream::Start() which dispatches task (a) to the main thread.
2. DecodedStream::Stop()
3. DecodedStream::Start() which dispatches task (b) to the main thread.
4. task (a) creates DecodedStreamData on the main thread which calls OutputStreamManager::Connect() to connect the input stream.
5. task (b) creates DecodedStreamData on the main thread which calls OutputStreamManager::Connect() to
connect the input stream again.
6. hit the assertion in OutputStreamData::Connect() because the stream is not disconnected yet.

We need to ensure creation/destruction of DecodedStreamData happens in the same order as DecodedStream::Start()/Stop().
(Assignee)

Updated

2 years ago
Assignee: nobody → jwwang
(Assignee)

Comment 2

2 years ago
Created attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

This greatly simplify the code because:
1. we don't have to dispatch the newly created DecodedStreamData to the work thread and store it to |mData|.
2. no need to deal with dispatch failure incurred by 1 due to task queue shutdown.
   (see: https://hg.mozilla.org/mozilla-central/file/f0c0480732d36153e8839c7f17394d45f679f87d/dom/media/mediasink/DecodedStream.cpp#l392)

Review commit: https://reviewboard.mozilla.org/r/39943/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/39943/
(Assignee)

Updated

2 years ago
Attachment #8730500 - Flags: review?(karlt)
Comment on attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

I'm not familiar with this code at all, but I would usually try to avoid a
sync runnable at all costs.

If a sync runnable is the appropriate solution, where is it or should it be
documented that the main thread must never block on the DecodedStream owner
thread?  Is that thread the MSDM thread?

Other questions in mind, with my limited understanding, are:

Why is dispatching a task from DecodedStream::Stop() not keeping the creation
and destruction in order?

Would it be possible to trigger connection on the main thread from
mozCaptureStream, instead of from the DecodedStream owner thread?

cpearce and jesup reviewed the original implementation here, so maybe they can help?

I haven't worked out how to request reviewers on other people's patches in mozreview.
Attachment #8730500 - Flags: review?(karlt) → review?(cpearce)
(Assignee)

Comment 4

2 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> If a sync runnable is the appropriate solution, where is it or should it be
> documented that the main thread must never block on the DecodedStream owner
> thread?  Is that thread the MSDM thread?
We spent lots of efforts to remove lock/monitor from our media stack to free us from worrying about deadlocks. Moreover, the general principle of media code is that main thread will never block on any other threads. So it is safe to use sync runnable to wait for something to happen on the main thread.
(Assignee)

Comment 5

2 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> Why is dispatching a task from DecodedStream::Stop() not keeping the creation
> and destruction in order?

Here is a typical sequence for what happens between DecodedStream::Start()/Stop():
1. DecodedStream::Start() posts a runnable (a) to the main thread. [1]
2. (a) calls DecodedStream::CreateData() to create DecodedStreamData on the main thread. [2]
3. DecodedStream::CreateData() posts a runnable (b) to the MDSM thread. [3]
4. (b) calls DecodedStream::OnDataCreated() on the MDSM thread to store DecodedStreamData to mData. [4]

If Stop() is called immediately after Start(), it is possible for Stop() to happen before 4. and therefore we won't destroy DecodedStreamData for mData is still null.

It is also possible for the 2nd Start() to happen before 4. and we will store the wrong DecodedStreamData (created by 1st Start()) to mData.

Therefore I would like to use sync runnable to synchronize the create/destruction of DecodedStreamData with Start()/Stop().

Btw, things would be much easier if MediaStream APIs can be called off the main thread.

[1] https://hg.mozilla.org/mozilla-central/file/4037eb98974db1b1e0b5012c8a7f3a36428eaa11/dom/media/mediasink/DecodedStream.cpp#l297
[2] https://hg.mozilla.org/mozilla-central/file/4037eb98974db1b1e0b5012c8a7f3a36428eaa11/dom/media/mediasink/DecodedStream.cpp#l358
[3] https://hg.mozilla.org/mozilla-central/file/4037eb98974db1b1e0b5012c8a7f3a36428eaa11/dom/media/mediasink/DecodedStream.cpp#l395
[4] https://hg.mozilla.org/mozilla-central/file/4037eb98974db1b1e0b5012c8a7f3a36428eaa11/dom/media/mediasink/DecodedStream.cpp#l407
(Assignee)

Comment 6

2 years ago
(In reply to Karl Tomlinson (ni?:karlt) from comment #3)
> Would it be possible to trigger connection on the main thread from
> mozCaptureStream, instead of from the DecodedStream owner thread?
We used to do that which caused several bugs due to the create/destruction of DecodedStreamData not synchronized with Start()/Stop().
Priority: -- → P2
(Assignee)

Comment 7

2 years ago
Comment on attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

Per discussion on IRC, Chris won't be able to review this patch in a short time.

Hi Kilik,
Can you review this patch for you have experience about media sink and DecodedStream? Thanks!
Attachment #8730500 - Flags: review?(cpearce) → review?(kikuo)
Attachment #8730500 - Flags: review?(kikuo) → review+
Comment on attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

https://reviewboard.mozilla.org/r/39943/#review38917

Looks good to me.
Curious to ask is there any specific reason that DecodedStreamData should be created at mainthread ?
Or it is just because now MediaStream APIs are called on mainthread, and do we have plans to refactor on that ?
(Assignee)

Comment 9

2 years ago
Yes, MediaStream APIs must be called on the main thread. Thanks!

We do have a plan to remove the restriction so the APIs can be called off the main thread. However, I've heard of any progress on that recently.

Comment 10

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/9667b7f27279
https://hg.mozilla.org/mozilla-central/rev/9667b7f27279
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox48: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
(Assignee)

Comment 12

2 years ago
This crash fix needs to uplift to 46.
Flags: needinfo?(jwwang)
(Assignee)

Comment 13

2 years ago
Comment on attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

Approval Request Comment
[Feature/regressing bug #]:1256520
[User impact if declined]:It might crash when seeking very frequently on a captured media element (mostly used in WebAudio).
[Describe test coverage new/current, TreeHerder]:TreeHerder
[Risks and why]: Low. This patch greatly simplify the code and has backed in Central for a week without problems so far.
[String/UUID change made/needed]:none
Flags: needinfo?(jwwang)
Attachment #8730500 - Flags: approval-mozilla-aurora?

Updated

2 years ago
status-firefox47: --- → affected
I was able to repro this on Beta 46 as well.
status-firefox46: --- → affected
Hi Chris, while I am able to reproduce this crash on Beta46 and Aurora47, I wanted to get another opinion on the risk of uplifting this to Aurora vs letting it ride the trains. Is this a common enough scenario? Should we let this bake some more on Nightly? Thanks!
Flags: needinfo?(cpearce)
Duplicate of this bug: 1249540
I don't think this crash should be easy to hit.

My crash for the test case gives us this signature:
https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=28&signature=mozilla%3A%3ACycleCollectedJSRuntime%3A%3ARunInStableState#tab-sigsummary

and that doesn't have many reports, so I think we can safely let this fix ride the trains.
Flags: needinfo?(cpearce)
Comment on attachment 8730500 [details]
MozReview Request: Bug 1256520 - use SyncRunnable to create DecodedStreamData synchronously to ensure the creation and destruction of DecodedStreamData happen in order.

Given that the likelihood of hitting this crash is very low, Chris agrees that we should let this one ride the trains instead of uplifting to Aurora47.
Attachment #8730500 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-

Updated

2 years ago
status-firefox46: affected → wontfix
status-firefox47: affected → wontfix
You need to log in before you can comment on or make changes to this bug.