Closed Bug 1608118 Opened 4 years ago Closed 4 years ago

sender.replaceTrack() horribly broken, flickers back and forth continuously

Categories

(Core :: WebRTC: Audio/Video, defect, P1)

72 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla74
Tracking Status
firefox-esr68 --- unaffected
firefox72 + wontfix
firefox73 + verified
firefox74 + verified

People

(Reporter: jib, Assigned: achronop)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

sender.replaceTrack appears broken like out of a bad movie.

STRs:

  1. Open https://jsfiddle.net/jib1/yukbn0ep/ and allow camera
  2. Check ☑ scramble

Expected result:

  • Video switches from camera (track 1) to consistent white noise (track 2)

Actual result:

  • Tracks (Video and white noise) alternate rapidly, flickering back and forth forever. scramble switch stops working.

Tested on Mac and Windows.

Regression range:

8:34.06 INFO: Narrowed inbound regression window from [f02f44b9, 313e9635] (3 builds) to [7c1213a7, 313e9635] (2 builds) (~1 steps left)
8:34.06 INFO: No more inbound revisions, bisection finished.
8:34.06 INFO: Last good revision: 7c1213a74d2844a2da97228e4660ffd8711b9346
8:34.06 INFO: First bad revision: 313e96351327ad2a57182dc37743cedb1d8efa78
8:34.06 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=7c1213a74d2844a2da97228e4660ffd8711b9346&tochange=313e96351327ad2a57182dc37743cedb1d8efa78

Since bug 1601034 got uplifted, this just went to release. 😞

Flags: needinfo?(docfaraday)
Priority: -- → P2

[Tracking Requested - why for this release]: Video flickers during rudimentary WebRTC call that relies on sender.replaceTrack. Seems rudimentary, but haven't had time to find and observe a regression in the wild yet.

Oddly, https://whereby.com, for which bug 1601034 was landed to solve a problem, does not appear afflicted, so it is yet unclear how widely WebRTC sites will be affected by this.

There may perhaps be some combination of factors involved, like whether the tracks are enabled or disabled at the time of the replace perhaps?

Severity: normal → critical
Priority: P2 → --
Priority: -- → P2
Flags: needinfo?(docfaraday)

Note that neither track is disabled in this case.

Do we think someone has the cycles to fix this for a future 72 dot release?

Flags: needinfo?(jib)
Flags: needinfo?(jib) → needinfo?(docfaraday)

I could try, although this may be a problem in MediaTrackGraph, which I'm only passingly familiar with.

I've verified the problem also happens with two camera tracks https://jsfiddle.net/jib1/h05w2kgr/ (I thought maybe it was limited to cam vs. non-cam, given some of the commit messages in bug 1601034, but not so).

It's caused by https://phabricator.services.mozilla.com/D56619. Dan, Paul, any idea?

Flags: needinfo?(padenot)
Flags: needinfo?(dminor)
Priority: P2 → P1

The commit of the patch in comment 8 only mentions audio tracks, but the code changes setTracks for both video and audio.

One option might be to try to restore the old code path for video, and leave audio as is, since I see no mention of testing video—tests probably pass because they look for new color, not absence of old color (guessing).

But solving it that way might be a bit messy from looking at it. Might be simpler if we can find what's wrong with the new path for video. Clearly, something is still attached to the old dom track, as it's basically sending both old and new track at once it seems. Hoping others more familiar with MTG can spot it.

Checking the offending patch I believe the problem is that we do not remove the listeners from the old mDomTrack before we replace it. I did a quick patch to do that and it seems to work. I need to investigate further to be sure this is the right fix.

Thanks for having a look Alex!

Flags: needinfo?(dminor)
Assignee: nobody → achronop
Flags: needinfo?(docfaraday)

I think I found the cause. The MediaPipelineTransmit::mListener is added in mDomTrack indirectly here. This is because it goes through 1, 2, 4 and finally 5.

The ForwardedInputTrack::SetInput() get all the direct listeners from itself (mSendTrack) and add them to the source (mDomeTrack::mTrack) when the MediaInput is set. On the other hand on ForwardedInputTrack::RemoveInput() those listeners are not removed. Thus, later on, when the track is replaced and the mSendPort is destroyed the mListener is not removed from the old track and it continues pushing data to the transmitter pipeline.

Removing the direct listeners in ForwardedInputTrack::RemoveInput(), in the same way that they are added fixes the problem. I'll push on try and test it more on Monday.

BTW this is a video only bug because it involves direct listeners which are not used in audio.

Flags: needinfo?(padenot)

Thanks Alex! Test looks green and from rudimentary testing locally it seems to fix the problem. I'll put it up for review to Paul so we can get it on Nightly asap, so we can consider it for a dot release.

I do notice ForwardedInputTrack appears used in a couple other places. Are we confident this won't break any of those places? The first fix you showed me Friday in MediaPipeline.cpp seemed smaller and also seemed to work:

--- a/media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
+++ b/media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
@@ -992,16 +992,17 @@ nsresult MediaPipelineTransmit::SetTrack
         gMediaPipelineLog, LogLevel::Debug,
         ("Reattaching pipeline to track %p track %s conduit type: %s",
          &aDomTrack, track_id.c_str(),
          (mConduit->type() == MediaSessionConduit::AUDIO ? "audio" : "video")));
   }

   if (mDomTrack) {
     mDomTrack->RemoveConsumer(mTrackConsumer);
+    mDomTrack->RemoveDirectListener(mListener);
   }

Which is the minimal and best fix for a dot-release do we think? I don't have a preference or great knowledge of this code, just wanted to raise the question. cc Paul

Flags: needinfo?(padenot)

Thank you jib. I was planning to push for review once my testing was finished.

I am done with my testing, the patch is already r+ed, I'll go on and push it.

Pushed by achronopoulos@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/96ec3a9ac5b8
Remove direct listeners from the source in ForwardedInputTrack when MediaInput is removed r=padenot

(clearing ni)

Flags: needinfo?(padenot)
See Also: 1606875
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla74

Please nominate this for Beta approval when you get a chance.

Flags: qe-verify+
Flags: needinfo?(achronop)

Comment on attachment 9121718 [details]
Bug 1608118 - Remove direct listeners from the source in ForwardedInputTrack when MediaInput is removed r=padenot

Beta/Release Uplift Approval Request

  • User impact if declined: Remote viewers in WebRTC calls may experience seizure-inducing flickering (between two video sources) that won't stop, when a website attempts to replace a video track currently being transmitted with another different video track (without renegotiating) e.g. when a user attempts to switch to another camera or to sharing their screen in place of their camera stream.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: STRs in comment 0 and comment 7 (see comment 9 on limitations of our automated test).
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk since patch is small, well-contained and adds code to remove an input that should have been removed, but wasn't for video like it was for audio, because video uses "direct" listeners, a second class of listener plumbing. Now the code removes these "direct" listeners as well, not just the regular ones. These "direct" listeners are and were cleaned up everywhere else, just not in this recently introduced replaceTrack code optimization.
  • String changes made/needed:
Attachment #9121718 - Flags: approval-mozilla-release?
Attachment #9121718 - Flags: approval-mozilla-beta?
QA Whiteboard: [qa-triaged]

Clearing NI

Flags: needinfo?(achronop)

Reproduced the initial issue using a Beta build 73.0b7 (Build id: 20200118195856).
Verified - Fixed in latest Nightly 74.0a1 (Build id: 20200121215203) using Windows 10 and Mac OS 10.15. Verified also with two camera tracks (comment 7).

Comment on attachment 9121718 [details]
Bug 1608118 - Remove direct listeners from the source in ForwardedInputTrack when MediaInput is removed r=padenot

Fixes a very ugly WebRTC regression causing flickering video playback. Approved for 73.0b9.

Attachment #9121718 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Verified - Fixed in latest Beta 73.0b9 (Build id: 20200122212931) using Windows 10 and Mac OS 10.15

Comment on attachment 9121718 [details]
Bug 1608118 - Remove direct listeners from the source in ForwardedInputTrack when MediaInput is removed r=padenot

Not planning another dot release for 72, so this will ship in 73.

Attachment #9121718 - Flags: approval-mozilla-release? → approval-mozilla-release-
Status: RESOLVED → VERIFIED
QA Whiteboard: [qa-triaged]
Flags: qe-verify+
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: