Open Bug 1685099 Opened 4 years ago Updated 2 years ago

VP9 60fps Youtube-live stream constantly drops frames because webm init data triggers decoder recreation

Categories

(Core :: Audio/Video: Playback, defect, P3)

Firefox 85
defect

Tracking

()

REOPENED
87 Branch
Tracking Status
firefox87 --- fixed

People

(Reporter: wakext, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(6 files)

Attached file about:support.txt

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:85.0) Gecko/20100101 Firefox/85.0

Steps to reproduce:

Play any VP9 60fps Youtube-live stream regardless of resolution.

Actual results:

Stats show frames are being dropped constantly and the resulting playback is choppy.

Captured performance profile:
https://share.firefox.dev/3niHp0C

Expected results:

No frames should get dropped and playback should be smooth, as other combinations of codecs and resolutions such as
AVC 60 fps live streams
VP9 30 fps live streams
AV1/AVC/VP9 60 fps videos
exhibit.

Safe mode/disabling hardware acceleration or disabling just setting media.webm.enabled to false provides a workaround.

Does it only occur during the livestream, or with playing back the recorded version later as well? If it occurs with the saved stream, can you provide a specific video where the behavior occurs?

(In reply to Jon Bauman [:jbauman:] from comment #1)

Does it only occur during the livestream, or with playing back the recorded version later as well? If it occurs with the saved stream, can you provide a specific video where the behavior occurs?

So far as mentioned only on all VP9 60 fps live streams it seems, the videos and VP9 stream VoDs (don't really frequent channels which keep archives of their streams) have no issues.

A random 60 fps VoD example I with the same codec version I could find quickly: https://www.youtube.com/watch?v=_qFJsXxsRYw
Plays fine.

A random 60 fps live stream example with the same video codec, good for testing as it runs 24/7: https://www.youtube.com/watch?v=0ivd2qHSlF0
Color space might and audio codec is definitely different though but it would be odd if that's causing the issue.

Just upgraded to 85B5 manually and the playback behavior changed positively - so far in the few live streams I tried it seems the dropped frames are no longer present.

Just to add:
Seems it only works now because the decoding no longer is accelerated by the video engine as the GPU-Z sensor indicates 0% load there now.

Attached file Profile2 about:support
Attached image Profile2 stats+GPU-Z

Updated to 85B6 (and Nvidia 461.09 driver) and back to square one with the dropped frames because hw-acceleration is on again.
Added relevant files named starting with "Profile2"(also a fresh one) in the starting post.

At least the GPU clocks were stable if a bit high, the flucuating bit in the start was loading in the ads.

Performance capture from Profile2: https://share.firefox.dev/38tutRL

Bryce, can you reproduce this?

Flags: needinfo?(bvandyk)

(In reply to wakext from comment #8)

Performance capture from Profile2: https://share.firefox.dev/38tutRL

Just to be sure, that's a profile of the issue taking place? The screenshot in comment 6 shows YT with 38 frames dropped from 5337. Do you think this is correct, or would you estimate more frames are being dropped?


Technical thoughts:

In the profile runs in comment 8 and comment 0 we're creating a lot of decoders on the GPU process. We can see a lot of short lived threads that do some decoding then shutdown, and we then get other threads doing the same thing. There are threads being created and shut down every few seconds. This seems undesirable.

I am able to repro this in profiles I gather from live streams for vp9 at 60fps. I use the streams mentioned in comment 2, note that for the Rocket Beans TV link you may need to go to their YT home page (https://www.youtube.com/channel/UCQvTDmHza8erxZqDkjQ4bQQ) and find a currently live stream.

Streams from comment 2:
Live stream profile: https://share.firefox.dev/3se56er
Non-live stream profile: https://share.firefox.dev/3oExKmH

I also tried with some avc1 live streams and they do not have the problem:
Profile run of https://www.youtube.com/watch?v=5qap5aO4i9A (live avc1 30fps): https://share.firefox.dev/3oC79GP
Profile run of a live avc1 stream at 60fps (https://www.youtube.com/watch?v=k7-00koM4WE but don't think it will stay live): https://share.firefox.dev/3oBVkk7

I couldn't find any 30fps vp9 live streams, so cannot be sure the problem is isolated to 60fps vp9. I assume for now any live vp9 stream is affected.

So something is leading to us doing a lot of decoder recreation in the vp9 HW case, specifically for live streams on YT, and it looks to be having an impact on performance.

Flags: needinfo?(bvandyk)

Bryce, since you have working STR here, could you debug why this happens?

Given that it's happening every 2 seconds, then it's almost certainly every keyframe that we're deciding we need a new decoder.

The content process is running this every 2 seconds to create a new decoder: https://searchfox.org/mozilla-central/rev/31ddf859c57e812878a5f817e4097efb06de4d97/dom/media/ipc/RemoteDecoderManagerChild.cpp#316

Looking at the callstack (possibly async via promises), should hopefully let us know why we're deciding to do this. It may very well be in MediaChangeMonitor.

Flags: needinfo?(bvandyk)

(In reply to Bryce Seager van Dyk (:bryce) from comment #10)

Just to be sure, that's a profile of the issue taking place? The screenshot in comment 6 shows YT with 38 frames dropped from 5337. Do you think this is correct, or would you estimate more frames are being dropped?

Yes, everything regarding Profile2(including YT-stats + GPU-Z) is from a fresh FF-Profile (deleted Profile1, hence why another one). I even created a quick Wintogo-USB drive to try with a completely fresh Windows 10 to rule that out as well, same issue persists there too.

I'm not sure about your other question, I never considered that the Youtube stats could even show incorrect counts - but the video visibly stuttered/hitched very often iirc so it might.
Just to re-check I just went on to RBTVs stream just now and got 96 out of 5323 frames dropped with the video again visible skipping every few seconds, so the count in Profile2's stats seem a bit low in comparison.

(In reply to Matt Woodrow (:mattwoodrow) from comment #11)

Bryce, since you have working STR here, could you debug why this happens?

Given that it's happening every 2 seconds, then it's almost certainly every keyframe that we're deciding we need a new decoder.

The content process is running this every 2 seconds to create a new decoder: https://searchfox.org/mozilla-central/rev/31ddf859c57e812878a5f817e4097efb06de4d97/dom/media/ipc/RemoteDecoderManagerChild.cpp#316

Looking at the callstack (possibly async via promises), should hopefully let us know why we're deciding to do this. It may very well be in MediaChangeMonitor.

Can do. I need to clear some backlog, but should be able to look at this more later this week.

The constant recreation seems to be caused by us shutting down decoders and recreating them.

IPC decoding is fun to try and trace. Working my way back it looks like we're changing the stream ID which results in the decoders being recreated [0]. This does log some markers in profile runs, which I initially missed, but are indeed there in the above runs. These indicate the video stream is leading to the recreation.

This looks to be due to the TrackBuffersManager[1] getting new init data that differs from the prior[2]. Debugging this a bit shows we are indeed getting different init data (though not always, sometimes it repeats). I don't know if the init data is functionally different (in the sense that we actually need to recreate decoders).

I'll see if I can make any more of the init data and see if we can do anything from there.

[0] https://searchfox.org/mozilla-central/rev/03224522336f60a1a61a87e1fcd4feb0a0315a9b/dom/media/MediaFormatReader.cpp#1913
[1] https://searchfox.org/mozilla-central/rev/03224522336f60a1a61a87e1fcd4feb0a0315a9b/dom/media/mediasource/TrackBuffersManager.cpp#1372
[2] https://searchfox.org/mozilla-central/rev/03224522336f60a1a61a87e1fcd4feb0a0315a9b/dom/media/mediasource/TrackBuffersManager.cpp#1215

Got it. Youtube is sending init data which when read as a webm is functionally equivalent, i.e. all the webm fields are exactly the same. However, youtube are using a void element (an element that is ignored when parsing) to code some side channel data. This side channel appears to contain various timing information (that we don't use nor recognize).

The problem for us is we do a binary comparison of the init data[0], rather than parsing it and comparing the parsed structures. Because the side channel contains different binary data, we think we need to create new decoders, when in reality we don't.

It would be a non-trivial change, but we could try and parse the data and do a more structured comparison. I'd also need to check the MSE spec to make sure such a change wouldn't violate anything. Holding NI.

[0] https://searchfox.org/mozilla-central/rev/03224522336f60a1a61a87e1fcd4feb0a0315a9b/dom/media/mediasource/TrackBuffersManager.cpp#1217

For other devs, here are two init data that I dumped while debugging. You can open these in tools like MKVToolNix and see they are equivalent, but if you dump/hexedit the void element you can see those differ.

Assignee: nobody → bvandyk
Severity: -- → S2
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: VP9 60fps Youtube-live stream constantly drops frames → VP9 60fps Youtube-live stream constantly drops frames because webm init data triggers decoder recreation

Upping priority as I expect to look at this during the coming cycle.

Priority: P3 → P2

This should be a strict refactor (no functional changes), but makes the
boundaries of the code doing the repeat check more clear. This code will be
updated in a following patch.

This updates the repeat init data detection code in TrackBuffersManager to also
detect repeat video init data. This will also detect cases where the init
segment contains repeat video and audio init data if both the audio and video
are the same as the previous init data.

Depends on D103173

Awesome, thank you Bryce!

Pushed by bvandyk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/4e47f34e5eaf Refactor TrackBuffersManager repeat init data check into a method. r=alwu https://hg.mozilla.org/integration/autoland/rev/0b440804190c Don't reinit TrackBuffersManager on repeat video init data. r=alwu
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

This should be fixed in the most recent nightly. If you're able to test it there and let me know if you see the problem or not, that would be appreciated. If you'd prefer to wait until it's in beta, I believe that 2021-02-23 is the ship date for the beta that will contain the fix.

Flags: needinfo?(wakext)

(In reply to Bryce Seager van Dyk (:bryce) from comment #24)

This should be fixed in the most recent nightly. If you're able to test it there and let me know if you see the problem or not, that would be appreciated. If you'd prefer to wait until it's in beta, I believe that 2021-02-23 is the ship date for the beta that will contain the fix.

Are you sure it is included?
I just tried 87B1 and it seems to behave the same, 36 out of 5000 dropped frames.

Flags: needinfo?(wakext)

Just to add, looks like it improved when you scrub back in a live stream but still drops frames when it is caught up with the stream (YT's live dot indicating red).

Another try and 43 out of 5000 frames dropped when being "live", skipping back -1:00:00 it's only 5 dropped out of the next 5000 frames.

(In reply to wakext from comment #25)

(In reply to Bryce Seager van Dyk (:bryce) from comment #24)

This should be fixed in the most recent nightly. If you're able to test it there and let me know if you see the problem or not, that would be appreciated. If you'd prefer to wait until it's in beta, I believe that 2021-02-23 is the ship date for the beta that will contain the fix.

Are you sure it is included?
I just tried 87B1 and it seems to behave the same, 36 out of 5000 dropped frames.

Thanks for checking. I'm fairly certain beta 87 will have the patch -- could you capture and share a Firefox profiler run using the media preset with the problem happening in the new beta? That will help narrow down why drops are still happening.

Flags: needinfo?(wakext)

(In reply to Bryce Seager van Dyk (:bryce) from comment #27)

(In reply to wakext from comment #25)

(In reply to Bryce Seager van Dyk (:bryce) from comment #24)

This should be fixed in the most recent nightly. If you're able to test it there and let me know if you see the problem or not, that would be appreciated. If you'd prefer to wait until it's in beta, I believe that 2021-02-23 is the ship date for the beta that will contain the fix.

Are you sure it is included?
I just tried 87B1 and it seems to behave the same, 36 out of 5000 dropped frames.

Thanks for checking. I'm fairly certain beta 87 will have the patch -- could you capture and share a Firefox profiler run using the media preset with the problem happening in the new beta? That will help narrow down why drops are still happening.

Sure, the captures are both from the same new profile:
46/5000 frames dropped when live streaming: https://share.firefox.dev/2NSW46O
and
4/5000 frames dropped when skipped back in the stream to -1:00:00 and playing from there: https://share.firefox.dev/2NAExk7

Flags: needinfo?(wakext)

Thank you. This is the same issue as before, so the patch is indeed not working as desired. I'll reopen this and investigate. Could you confirm that you're using the same stream as above, so I can make sure I'm reproducing correctly?

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

I think I can reproduce. The good news is the patches above help, the bad news is there appears to be more data that youtube changes over time in the stream that I didn't catch.

I'm going to track further work here via another bug to avoid confusion. I will create it shortly and reference this bug in it.

(In reply to Bryce Seager van Dyk (:bryce) from comment #29)

Thank you. This is the same issue as before, so the patch is indeed not working as desired. I'll reopen this and investigate. Could you confirm that you're using the same stream as above, so I can make sure I'm reproducing correctly?

Yes, this was the same stream as above (RBTV) as it runs 24/7 and with VP9@60 to recreate the issue.
Thank you for further looking into it.

Unassigning bugs assigned to Bryce because he no longer works at Mozilla.

Assignee: brycebugemail → nobody
Severity: S2 → S4
Priority: P2 → P3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: