Closed Bug 1592539 Opened 5 years ago Closed 4 years ago

HTMLVideoElement cloned for PiP doesn't track the selected video track properly when playing a MediaStream


(Core :: Audio/Video: Playback, defect, P2)




82 Branch
Tracking Status
firefox82 --- fixed


(Reporter: pehrsons, Assigned: pehrsons)


(Blocks 1 open bug)



(7 files, 1 obsolete file)

HTMLVideoElement::MaybeBeginCloningVisually sets up the target element's VideoFrameContainer with the selected video track when playing a MediaStream.

Then it forgets about things.

Problem #1:
HTMLVideoElement::EndCloningVisually removes the target element's VideoFrameContainer from the target element's selected video track. Is this ever non-null?

Problem #2:
Even with problem #1 fixed, the selected video track in HTMLVideoElement::MaybeBeginCloningVisually and HTMLVideoElement::EndCloningVisually might not be the same. That's a bummer for cleaning up. It's worse for the user if the application changed the selected track and the PiP player did not follow along. A simple test case that should trigger this bug follows below:

const gumStream = await navigator.mediaDevices.getUserMedia({audio: true, video: true});
const screenTrack = (await navigator.mediaDevices.getDisplayMedia()).getTracks()[0];
const video = document.createElement("video");
video.srcObject = new MediaStream(gumStream.getTracks());;
// Start picture-in-picture of `video`

Expected: PiP player displays the screen track
Actual: PiP player displays the camera track

I wrote a jsfiddle to demonstrate this.

Pre: Ensure PiP is enabled

  1. Open
  2. (to work around bug 1590479) Right-click in the main iframe and click "This Frame"->"Show Only This Frame"
  3. Click Start. Approve the camera/mic capture in the first prompt, and select a screen or window to capture and approve the second prompt.
  4. Start PiP for the video element on the page.
  5. Click "Display Screen"

Expected: PiP player shows the screen capture
Actual: PiP player shows frames from camera and screen capture interleaved

I am not sure how the PiP player got a hold of the new track, but as expected from reading the code, there's a bug in that the PiP player is still displaying frames from the video track that is no longer selected.

According to the folks in #media, this sort of thing should probably be a release blocker.

Hey astevenson, just getting this on your radar. The solution that the folks in #media suggested was that we uplift a patch to Beta (71) that disables the PiP toggle and context menu item for videos that have srcObject !== null on it.

This means that for 71, PiP would not be available for the types of <video> elements that are receiving, for example, streams from the camera, the desktop, or any other type of MediaStream. Normal videos, like the ones on YouTube, Twitch, Amazon, Netflix, etc, will be unaffected.

We would then work to fix this bug, hopefully for 72, and re-enable the PiP toggle and context menus for that kind of <video> element.

This seemed like the least riskiest option for 71. Does that sound satisfactory?

Edit: Updated because I got the version numbers wrong.

Blocks: 1527926
No longer blocks: videopip
Flags: needinfo?(astevenson)

Mike, that makes sense to me. Just want to clarify that the patch to disable the PiP toggle + context menu for MediaStream would happen in 71 Beta (current Beta) and 72 would hopefully have the complete fix.

Flags: needinfo?(astevenson)
See Also: → 1592729
Blocks: 1532675
No longer blocks: 1527926
Priority: -- → P2

I filed bug 1592729 to land patches to disable the toggle and context menu items for MediaStream videos.

No longer blocks: 1532675

I believe this is something that needs to be fixed in the media playback layer, I'm afraid.

Component: Video/Audio Controls → Audio/Video: Playback
Priority: P2 → --
Product: Toolkit → Core

The priority flag is not set for this bug.
:bryce, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bvandyk)
Flags: needinfo?(bvandyk)
Priority: -- → P3
Assignee: nobody → apehrson

This unclutters HTMLMediaElement somewhat by putting away the logic for handling
the FirstFrameListener into MediaStreamRenderer, which is better suited for the

This will allow us to use a second MediaStreamRenderer to properly track video
tracks for a secondary VideoFrameContainer.

Depends on D87133

This allows the FirstFrameListener to be instantiated outside of
HTMLMediaElement. A future patch brings in a subclass in HTMLVideoElement.

This also renames FirstFrameListener to FirstFrameVideoOutput since that better
denotes what it is.

Depends on D87134

I have observed async calls to the MediaStreamRenderer's mWatchManager::Unwatch()
in flight after it was Shutdown(). This would cause an assertion failure in
debug. Per the comment we should do the same in Unlink as
EndSrcMediaStreamPlayback() without creating new strong refs. This conforms to
that instruction.

I am not willing to null out the other members at this point, since at least
HTMLVideoElement's relies on the selected video track to exist in some cases.

Depends on D87135

This allows different users of renderers to inject different first frame
outputs. So far there is only one, but a future patch will bring a special one
for the secondary video frame container.

Depends on D87136

The first-frame output used to only be applied when not rendering the
MediaStream, and the regular video output was applied when rendering.

The difference with this patch is when rendering -- both the first-frame and the
regular outputs are applied at the same time. The former allows one frame to go
through to the VideoFrameContainer, then the regular output takes over and lets
any frames through. Nothing in how frames are rendered should be noticable by

This allows for simpler logic for resolving the visual clone target promise in a
future patch.

Depends on D87137

This re-uses MediaStreamRenderer to render video into a secondary container, to
get identical logic for the clone target as for the clone source. The
MediaStreamRenderer can handle audio, currentTime and more as well, but those
are not used here.

Depends on D87138

Priority: P3 → P2

This test used to rely on luck with timing as waiting for the "pause" event
after pausing does not guarantee that we have rendered the frame at currentTime

This changes the test so that it seeks the original video close to the end and
waits for the "ended" event on the streamTarget, to guarantee that we have
played the video to the end of the last frame -- which should be enough for the
last frame to have reached the compositor.

Since the streamTarget and the clone are rendered the same way, we also compare
the original video to the streamTarget for sanity.

Depends on D87132

Comment on attachment 9172100 [details]
Bug 1592539 - Stabilize test_cloneElementVisually_mediastream.html. r?mconley

Revision D88248 was moved to bug 1531988. Setting attachment 9172100 [details] to obsolete.

Attachment #9172100 - Attachment is obsolete: true
Pushed by
Test video element visual cloning with a MediaStream switching tracks. r=mconley
Move the FirstFrameListener handling to MediaStreamRenderer. r=jib
Move the FirstFrameListener to VideoOutput.h. r=jib
Also null out mMediaStreamRenderer in Unlink. r=bryce
Inject a FirstFrameVideoOutput into MediaStreamRenderer. r=jib
Make MediaStreamRenderer always apply its FirstFrameVideoOutput. r=jib,karlt
Use a MediaStreamRenderer to render video tracks into a secondary container. r=jib,mconley
You need to log in before you can comment on or make changes to this bug.