Closed Bug 1237816 Opened 5 years ago Closed 5 years ago

Crash in MediaStreamGraph GraphDriver::SetGraphTime when accepting separate video then audio gum calls

Categories

(Core :: WebRTC: Audio/Video, defect, P1)

x86_64
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- disabled
firefox47 --- fixed
Blocking Flags:

People

(Reporter: jib, Assigned: jesup)

References

()

Details

(Keywords: regression)

Attachments

(1 file)

STR:
1. Open URL
2. Click "Start video" and accept.
3. Click "Start audio" and accept.

Crashes.

> MediaStreamGrph (40)
> #0	0x0000000103c7ce8d in mozilla::GraphDriver::SetGraphTime(mozilla::GraphDriver*, long long, long long) at /Users/Jan/moz/mozilla-central/dom/media/GraphDriver.cpp:68

void GraphDriver::SetGraphTime(GraphDriver* aPreviousDriver,
                               GraphTime aLastSwitchNextIterationStart,
                               GraphTime aLastSwitchNextIterationEnd)
{
  GraphImpl()->GetMonitor().AssertCurrentThreadOwns();
  // We set mIterationEnd here, because the first thing a driver do when it
  // does an iteration is to update graph times, so we are in fact setting
  // mIterationStart of the next iteration by setting the end of the previous
  // iteration.
  mIterationStart = aLastSwitchNextIterationStart;
  mIterationEnd = aLastSwitchNextIterationEnd;

-> STREAM_LOG(LogLevel::Debug, ("Setting previous driver: %p (%s)", PreviousDriver(), PreviousDriver()->AsAudioCallbackDriver() ? "AudioCallbackDriver" : "SystemClockDriver"));

(lldb) p PreviousDriver()
(mozilla::GraphDriver *) $0 = 0x0000000000000000
 
> #1	0x0000000103c7dbe0 in mozilla::ThreadedDriver::RunThread() at /Users/Jan/moz/mozilla-central/dom/media/GraphDriver.cpp:340
> #2	0x0000000103da5319 in mozilla::MediaStreamGraphInitThreadRunnable::Run() at /Users/Jan/moz/mozilla-central/dom/media/GraphDriver.cpp:229
> #3	0x00000001007b1dcd in nsThread::ProcessNextEvent(bool, bool*) at /Users/Jan/moz/mozilla-central/xpcom/threads/nsThread.cpp:989
> #4	0x000000010083bc37 in NS_ProcessNextEvent(nsIThread*, bool) at /Users/Jan/moz/mozilla-central/xpcom/glue/nsThreadUtils.cpp:297
> #5	0x0000000100f24c72 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) at /Users/Jan/moz/mozilla-central/ipc/glue/MessagePump.cpp:326
> #6	0x0000000100e4d025 in MessageLoop::RunInternal() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:234
> #7	0x0000000100e4cf35 in MessageLoop::RunHandler() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:227
> #8	0x0000000100e4cedd in MessageLoop::Run() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:201
> #9	0x00000001007afb4d in nsThread::ThreadFunc(void*) at /Users/Jan/moz/mozilla-central/xpcom/threads/nsThread.cpp:401
> #10	0x00000001003f9f1f in _pt_root at /Users/Jan/moz/mozilla-central/nsprpub/pr/src/pthreads/ptthread.c:212
> #11	0x00007fffa0c8b9b1 in _pthread_body ()
> #12	0x00007fffa0c8b92e in _pthread_start ()
> #13	0x00007fffa0c89385 in thread_start ()
Assignee: nobody → rjesup
backlog: --- → webrtc/webaudio+
Rank: 15
Priority: -- → P1
FYI, I think the CameraPreviewMediaStream impl was just plain wrong, and the GraphDriver change I think is a case where it isn't needed at the moment given the implementation of RemoveCallbacks in that class, but it might be in the future, and it keeps the pattern that the other drivers use.

The big thing is not using mGraph == null as a flag that it's been destroyed, since we have a lot of async things that need to happen after Destroy before we're done with the MediaStream.
Attachment #8714023 - Flags: review?(roc)
Dropping review request while I resolve some Try failures
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/1-2/
Attachment #8714023 - Flags: review?(roc)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=46b678f96954&selectedJob=16140176
green except stuff-that-isn't-us and a couple small leaks on asan/lsan
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review29743

::: dom/media/MediaStreamGraph.h:461
(Diff revision 2)
> -    NS_ASSERTION(NS_IsMainThread(), "Call only on main thread");
> +    return mDestroyed;

This change is scary. Where are we calling this off the main thread now?
Attachment #8714023 - Flags: review?(roc)
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #6)
> Comment on attachment 8714023 [details]
> MozReview Request: Bug 1237816: Make MediaStream::Destroy not block stream
> shutdown operations r?roc
> 
> https://reviewboard.mozilla.org/r/32941/#review29743
> 
> ::: dom/media/MediaStreamGraph.h:461
> (Diff revision 2)
> > -    NS_ASSERTION(NS_IsMainThread(), "Call only on main thread");
> > +    return mDestroyed;
> 
> This change is scary. Where are we calling this off the main thread now?

We previously were overloading mGraph == nullptr as an indication that the MediaStream had been destroyed.  It turns out that shutting down a stream (as we've grown the ability to do full-duplex audio) involves multiple messages and threads.  (And in fact always has to a degree, but a number of calls used GraphImpl() == nullptr as an indicator to "not bother").  Witness all the instances of "if (auto graph = GraphImpl()) { graph->EnsureNextIteration()); }" or equivalent.  Since after Destroy() there were still refs to MediaStreams around, they were in a bit of a zombie state.

This allows for a more direct way to examine the state of a stream; IsDestroyed() was mainthread-only to avoid locks.  I moved it to an Atomic to minimize the overhead associated with accessing it.

The other option would be to remember what graph listener(s) were associated with a stream (at OpenAudioInput() time), and in DestroyImpl() remove all of them (effectively run CloseAudioInput()).  Which is certainly doable.  Or add a separate MSG-only flag for the MSG side to replace the "if (...GraphImpl())"'s, and set that flag in DestroyImpl() instead of nulling mGraph.
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/2-3/
Attachment #8714023 - Flags: review?(roc)
roc - updated with an alternate approach.  I don't love caching a raw MSG pointer; we do own a SourceMediaStream, but it's not clear with mGraph nulled on Destroy() if the MediaStreamGraph itself is held open by anything - quite possibly not, in which case this isn't any good either, and we'll have to cache the info needed to undo the OpenAudioInput on Destroy.
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review29759

this looks better
Attachment #8714023 - Flags: review?(roc) → review+
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review29761
Attachment #8714023 - Flags: review+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #6)
> Comment on attachment 8714023 [details]
> MozReview Request: Bug 1237816: Make MediaStream::Destroy not block stream
> shutdown operations r?roc
> 
> https://reviewboard.mozilla.org/r/32941/#review29743
> 
> ::: dom/media/MediaStreamGraph.h:461
> (Diff revision 2)
> > -    NS_ASSERTION(NS_IsMainThread(), "Call only on main thread");
> > +    return mDestroyed;
> 
> This change is scary. Where are we calling this off the main thread now?

This was a genuine question. I didn't see where IsDestroyed was getting called off the main thread. If it's not, we wouldn't need this change.

> Since after Destroy() there were still refs to MediaStreams around, they were in a bit of a zombie state.

Sure. But that's often the easiest way to solve these problems.
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/3-4/
Attachment #8714023 - Attachment description: MozReview Request: Bug 1237816: Make MediaStream::Destroy not block stream shutdown operations r?roc → MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot
Attachment #8714023 - Flags: review?(roc)
Attachment #8714023 - Flags: review?(padenot)
I'll take whomever I can get for a review first.
The older patch padenot and I decided was too unsafe due to the issues around holding a pointer to MSG.
This hides it all within MediaStream/MediaStreamGraph
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review30213

::: dom/media/MediaStreamGraphImpl.h:635
(Diff revision 4)
> +  std::unordered_map<AudioDataListener *, uint32_t> mInputDeviceUsers;

nsDataHashtable please

::: dom/media/webrtc/MediaEngineWebRTC.h:143
(Diff revision 4)
> +  virtual void StopRecording(SourceMediaStream *aStream, bool aStop) = 0;

It's awfully confusing to have a StartRecording method with an aStart bool. What does it mean to call this with aStart false? Please document this. It's probably also a good idea to introduce an enum to clear up what "true" and "false" mean, especially in call sites.
Attachment #8714023 - Flags: review?(roc)
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/4-5/
Attachment #8714023 - Flags: review?(roc)
> > +  std::unordered_map<AudioDataListener *, uint32_t> mInputDeviceUsers;
> 
> nsDataHashtable please

Done (though we use map/unordered_map elsewhere in webrtc - perhaps the issue is memory accounting?  Or just consistency in the tree?)  The code to report the size used on this map/hash would likely be larger than the map/hash itself. :-)

> ::: dom/media/webrtc/MediaEngineWebRTC.h:143
> (Diff revision 4)
> > +  virtual void StopRecording(SourceMediaStream *aStream, bool aStop) = 0;
> 
> It's awfully confusing to have a StartRecording method with an aStart bool.
> What does it mean to call this with aStart false? Please document this. It's
> probably also a good idea to introduce an enum to clear up what "true" and
> "false" mean, especially in call sites.

Agreed; I converted the "mInUse" boolean to a count, and got rid of the parameter to Start/StopRecording.
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/5-6/
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review30275
Attachment #8714023 - Flags: review?(roc) → review+
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32941/diff/6-7/
Comment on attachment 8714023 [details]
MozReview Request: Bug 1237816: count open input sources for MediaStreams to release inputs on Destroy() r?roc,padenot

https://reviewboard.mozilla.org/r/32941/#review30327
Attachment #8714023 - Flags: review?(padenot) → review+
https://hg.mozilla.org/mozilla-central/rev/9a5ad32f49f9
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Firefox 46 is marked as affected. We're shipping in 2 weeks. Does this need to be uplifted?
Flags: needinfo?(rjesup)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #24)
> Firefox 46 is marked as affected. We're shipping in 2 weeks. Does this need
> to be uplifted?

It doesn't.  This only applied if we are to pref-on full-duplex, which we held off until 48.
Flags: needinfo?(rjesup)
You need to log in before you can comment on or make changes to this bug.