Closed Bug 1072780 Opened 9 years ago Closed 9 years ago

MediaStreamGraph cleanup from refactor

Categories

(Core :: WebRTC: Audio/Video, defect)

Other Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox32 --- unaffected
firefox33 --- unaffected
firefox34 --- fixed
firefox35 --- fixed
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: jesup, Assigned: jesup)

References

Details

(Whiteboard: [webrtc-uplift])

Attachments

(3 files, 3 obsolete files)

Bug 848954 refactored MediaStreamGraph and introduced GraphDriver; we need to land a number of cleanups for intermittents and race conditions.  CurrentDriver() from off-MSG-thread is a common problem.

Also found that mInCallback was getting set and cleared immediately due to how the DebugOnly was set up.

First patch here should underly bug 1072775 since it depends on InCallback()
Comment on attachment 8495053 [details] [diff] [review]
patch 3 - Use atomics for EnsureNextIteration to close races around CurrentDriver

roc - feedback appreciated at trying to streamline and make EnsureNextIteration safe without heavy locks (now that drivers can change under us and also sleep).

Implements roughly this:

[01:26]	jesup	considered an Atomic<bool> for EnsureNextIteration to avoid the need for CurrentDriver(), but the need to do a WakeUp() if the driver is asleep made that unappealing for now

[01:30]	jesup	Though I imagine something like "(atomic) mOneMoreIteration = true; if ((atomic) mIsAsleep) { MonitorAutoLock lock(mMonitor); CurrentDriver()->WakeUp; }" might work. Any races remaining are just un-needed WakeUps
Attachment #8495053 - Flags: feedback?(roc)
Blocks: 1071813
Comment on attachment 8495053 [details] [diff] [review]
patch 3 - Use atomics for EnsureNextIteration to close races around CurrentDriver

Review of attachment 8495053 [details] [diff] [review]:
-----------------------------------------------------------------

Why can't we use the graph monitor here?
Attachment #8495053 - Flags: feedback?(roc)
> Why can't we use the graph monitor here?

We certainly can; the previous patch lock the monitor around accesses to CurrentDriver() from other threads - but there are a lot (and even some SetCurrentDrivers() from other threads in the 'Revive()' case), so it gets messy and increases the chance that MSG threads (or system audio callback threads!) will end up blocked on a monitor held by a thread that won't be scheduled for a bit.  Atomics introduce memory barriers but don't block runnability for the MSG/callback.

mNeedAnotherIteration seems like a good use of an Atomic: lots of setters to true; once reader/setter-to-false, and a writer-reader race is ok (we just do an extra iteration, since we set false before starting an iteration, at the end of WaitForNextIteration() with this patch.)  Very rarely do we need to do the heavier-weight WakeUp().
Attachment #8495051 - Attachment is obsolete: true
Attachment #8495502 - Attachment is obsolete: true
Attachment #8495053 - Attachment is obsolete: true
Comment on attachment 8495529 [details] [diff] [review]
patch 1 - clean up CurrentDriver() use off-MSG-thread; fix InCallback()

The primary thing in this patch is support for controlling the use of CurrentDriver() from non-MSG threads, especially for a common pattern of "do something to a stream, graph->CurrentDriver()->EnsureNextIteration();"

The danger was that CurrentDriver() wasn't locked, could change on you, and was a TSAN violation to use like this.

Also some other bits of cleanup
Attachment #8495529 - Flags: review?(roc)
Attachment #8495533 - Flags: review?(roc)
Comment on attachment 8495534 [details] [diff] [review]
patch 4 - Use atomics for EnsureNextIteration to close races around CurrentDriver

Review of attachment 8495534 [details] [diff] [review]:
-----------------------------------------------------------------

This one is optional, but I think is an improvement
Attachment #8495534 - Flags: review?(roc)
Comment on attachment 8495533 [details] [diff] [review]
patch 3 - Fix up Revive() to not trigger assertions, and also to avoid Init() (blocking) on MainThread

Review of attachment 8495533 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaStreamGraphImpl.h
@@ +440,4 @@
>     */
>    void SetCurrentDriver(GraphDriver* aDriver) {
> +    MOZ_ASSERT(mDriver->OnThread() ||
> +               (mMonitor.AssertCurrentThreadOwns(), true));

This is a bit obscure. Please rewrite this as
  if (!mDriver->OnThread()) {
    mMonitor.AssertCurrentThreadOwns();
  }
Attachment #8495533 - Flags: review?(roc) → review+
Comment on attachment 8495534 [details] [diff] [review]
patch 4 - Use atomics for EnsureNextIteration to close races around CurrentDriver

Review of attachment 8495534 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not crazy about this but I'm not against it either.
Attachment #8495534 - Flags: review?(roc) → review+
Backed out:
 https://hg.mozilla.org/integration/mozilla-inbound/rev/0c8faa128dff
for fatal assertion failures in Linux64 tests like:
Assertion failure: lock->locked && pthread_equal(lock->owner, pthread_self()), at /builds/slave/m-in-l64-d-0000000000000000000/build/nsprpub/pr/src/pthreads/ptsynch.c:226

e.g.
https://tbpl.mozilla.org/php/getParsedLog.php?id=49067104&tree=Mozilla-Inbound
Comment on attachment 8495529 [details] [diff] [review]
patch 1 - clean up CurrentDriver() use off-MSG-thread; fix InCallback()

(for all patches, and they need to land with bug 1072775):

Approval Request Comment
[Feature/regressing bug #]: bug 848954

[User impact if declined]: crashes, test oranges, possible sec issues

[Describe test coverage new/current, TBPL]: some known oranges are likely due to this.  Adds considerable thread-safety assertions, which caught a bug immediately on landing (which someone had been working and had a fix for, but took them far longer to find after it caused an odd crash)

[Risks and why]: fairly low.  Cleans up locking around a number of pointers; of course in theory it may have missed something, but it added asserts about safety/locking.  Main risk would be missing a case where the assert is too draconian (had one of those on the initial landing) and causing an intermittent orange.

[String/UUID change made/needed]: none.
Attachment #8495529 - Flags: approval-mozilla-aurora?
Attachment #8495533 - Flags: approval-mozilla-aurora?
Attachment #8495534 - Flags: approval-mozilla-aurora?
Comment on attachment 8495529 [details] [diff] [review]
patch 1 - clean up CurrentDriver() use off-MSG-thread; fix InCallback()

Aurora+
Attachment #8495529 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8495533 [details] [diff] [review]
patch 3 - Fix up Revive() to not trigger assertions, and also to avoid Init() (blocking) on MainThread

Aurora+
Attachment #8495533 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8495534 [details] [diff] [review]
patch 4 - Use atomics for EnsureNextIteration to close races around CurrentDriver

Aurora+
Attachment #8495534 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.