Closed Bug 1905022 Opened 5 months ago Closed 4 months ago

Video playback freezes while audio keeps going with hardware decoded video zero copy

Categories

(Core :: Audio/Video: Playback, defect, P3)

Firefox 127
Desktop
Windows
defect

Tracking

()

VERIFIED FIXED
130 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox-esr128 --- wontfix
firefox127 --- unaffected
firefox128 + wontfix
firefox129 + wontfix
firefox130 --- verified
firefox131 --- verified

People

(Reporter: austin.donisan, Assigned: sotaro)

References

(Regression)

Details

(Keywords: regression)

Attachments

(12 files)

Attached video original.mp4

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0

Steps to reproduce:

Watch a video on the hedgehealth.io site

Actual results:

After a few seconds the video freezes while the audio keeps going in the background. Pausing/playing the video jumps it to the correct spot and it plays correctly for a few seconds until it freezes again.

Expected results:

Normal video playback

Attached video noaudio.mp4
Attached video screencap.mp4

The Bugbug bot thinks this bug should belong to the 'Core::Audio/Video: Playback' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core

There is something broken with the hardware acceleration for the H.264 videos on this site. I have an XPS 7590, so the CPU is an Intel 9750H with Intel UHD 630 integrated graphics, and it also has an Nvidia GTX 1650.

This only occurs with hardware acceleration when using the Intel GPU. Force enabling the Nvidia GPU through Windows 11's settings fixes the problem. Disabling hardware acceleration within Firefox also fixes the problem.

The playback errors happen with the files downloaded locally, so this isn't about the video player on the site. Also this doesn't happen in Chrome.

Additionally there are some other weird behaviors:
If a video in one tab is frozen, playing a video in another tab temporarily unfreezes the video in the original tab.
If a video has no audio track the timer appears to stop as well.
But the no-audio version is actually still playing and jumps to the correct timestamp if you play a video in another tab.

I attached an example of a video that glitches out, a version with the audio removed, and a screencap of the glitchy behavior I'm seeing.

"Media playback" logging of playing the video, having it freeze, pausing/playing to unfreeze it, and it freezing again ~5 times or so

"Graphics" logging of doing the same thing

Can confirm that the first video "original.mp4" freezes for me on Nightly, and continues to play in Chrome.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Bisection:
Bug 1899450 - Re-enable zero copy video of hardware decoded video with all GPUs to early beta on Windows r=gfx-reviewers,jrmuizel
Differential Revision: https://phabricator.services.mozilla.com/D211950

Keywords: regression
Regressed by: 1899450
Flags: needinfo?(sotaro.ikeda.g)
Summary: Video playback freezes while audio keeps going → Video playback freezes while audio keeps going (Intel and AMD GPUs)

This bug is much older than that diff. It's been happening since at least March for me. Manually bisecting the old nightlies puts it somewhere around FF 102 (I haven't tracked it down to the exact one yet).

Another observation is that "Video Decode" in Windows Task Manager shows 0% when playing this video. Other H.264 videos show non-zero utilization when playing.

This is specific to the H.264 videos on that site. Re-encoding them fixes the problem, and I haven't been able to find videos from any other source that break like this.

Set release status flags based on info from the regressing bug 1899450

This broke in bug 1762125 when hardware decoding was re-enabled for Intel.

Thank you for reporting. I'll take the bug.

Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)

Hi Austin, can you capture a profile with Firefox Profiler using the "media" setting on the latest Nightly, as described in the instructions?
https://profiler.firefox.com/

Flags: needinfo?(austin.donisan)

https://share.firefox.dev/45QOJsw

I started logging, then loaded the video. At 0:10 and 0:20 I hit pause, waited a second, then hit play. The video froze at around 0:02, 0:12, and 0:22.

Flags: needinfo?(austin.donisan)

Thank you! Can you also attach about:support to this bug?

Flags: needinfo?(austin.donisan)
Attached file support.txt
Flags: needinfo?(austin.donisan)
Attached file support.json

Thank you! Can you also check whether the following addresses the problem for you?

  • Set pref media.wmf.zero-copy-nv12-textures = false in about:config and restart Firefox.
Flags: needinfo?(austin.donisan)

Yes that fixes the playback problem for me.

The "Video Decode" in task manager is still at 0% for the problematic videos after this change. All other videos still cause a non-zero value.

Flags: needinfo?(austin.donisan)

Attachment 9409858 [details] has only one ref frame. It might be related to the problem. The Intel hardware video decoder might not allocate enough video buffers for zero-copy video frames.

I could reproduce the problem on a Win11 PC with an Intel GPU.

Sotaro, can you please assign a severity to this bug?

Flags: needinfo?(sotaro.ikeda.g)
Severity: -- → S2
Flags: needinfo?(sotaro.ikeda.g)

By setting pref media.wmf.zero-copy-nv12-textures-force-enabled = true, I could also reproduce the problem on a Win11 PC with an NVIDIA GPU.

When the problem happens, WMFMediaDataDecoder::ProcessOutput() does not exit; it keeps calling WMFVideoMFTManager::Output(), more than 10 times in a row.

So this does not seem to be a gfx bug. Unassigning myself.

Assignee: sotaro.ikeda.g → nobody

:alwu, can you take this bug?

Flags: needinfo?(alwu)

The bug is marked as tracked for firefox128 (beta) and tracked for firefox129 (nightly). However, the bug still isn't assigned.

:jimm, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply back out the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(jmathies)

Sure, I will take a look later. Keep my NI.

Assignee: nobody → alwu
Flags: needinfo?(jmathies)

Hello Austin, it seems your profile was captured with Firefox Profiler directly instead of through about:logging, so it's lacking some important log messages. Would you mind following this instruction to capture a result again? Thanks!

Flags: needinfo?(alwu) → needinfo?(austin.donisan)

Sotaro already had me do that above. Was it not correct? Here's another version but I think I recorded it exactly the same way as before:

https://share.firefox.dev/4eXofcI

Flags: needinfo?(austin.donisan)

Hmm, so the profile you captured in comment 30 was also captured through about:logging? I still couldn't see any log information in that profile, which is very strange... Was the Logging Output in about:logging set to Logging to Firefox Profiler when doing the profiling? Thank you so much!

Flags: needinfo?(austin.donisan)
Attached video logging-capture.mp4

Yes, I attached a screen capture of me doing it again.
https://share.firefox.dev/45Tzejr

I don't understand how I could have "captured by Firefox Profiler directly", which makes me worried I still did something wrong.

Flags: needinfo?(austin.donisan)

I'm so sorry, it seems our profiler has a problem capturing logs when playing a local file; I've filed bug 1906756 for that. Your steps are exactly what we need. Would you mind trying again with this video, which is the video you uploaded to Bugzilla, so that it's not a local file on your computer? Thank you so much!

Flags: needinfo?(austin.donisan)

No worries, by the 3rd try that did cross my mind as a potential problem. Recording it with the non-local file it looks like it worked:

https://share.firefox.dev/3WeHNlO

Flags: needinfo?(austin.donisan)

Perfect, that is what I need! Does this issue reproduce on other videos as well? E.g. this video. From what you described in comment 4, I suspect that this is a graphics card/driver issue, and the reason might be the same as what Sotaro described in comment 21.

Could you check whether this build fixes the problem for you? I increased the size to 30 to see if that allows the driver to better allocate the buffer.

Flags: needinfo?(austin.donisan)

BTW, we have recently received some reports of frozen video on YouTube, which makes me wonder if they are related to this problem.

I can only reproduce it with the videos from that website (100% of them, I can provide more if that would be helpful). The video you linked plays fine for me (though oddly VLC can't play that file if I download it). I can't find any other source of videos that don't play for me in Firefox.

The custom build you uploaded does fix the problem. In Task Manager "Video Decode" is still flat at 0%, and only for these problematic videos. The Big Buck Bunny video shows 1-2% by comparison.

Two weeks ago YouTube was freezing for me just in Firefox, but that went away and I had assumed it was adblock related.

Flags: needinfo?(austin.donisan)

Jeff, as Sotaro is on vacation, I wonder if you have any idea about this? This problem seems to be a driver/graphics card bug: the reporter said that the issue doesn't happen with software decoding or with Nvidia hardware decoding. I made a custom build for the reporter that adjusts this value (I set it to 30), which fixes the issue as well.

For fixing this specific issue for the user, I think using 20 would already be enough, but I don't feel it's a good way to go. We can do that, but I have no idea whether using 20 would cause any problems. Chromium originally used 5 for that value, but that part of the code seems to have been removed from their codebase. Microsoft's documentation only says that those attributes represent upper and lower thresholds, but we don't know exactly how WMF uses those values. The potential downside of using a large value might be using too much GPU memory, I guess?

I think for this problem, we either (1) disable zero copy for this specific hardware setup, or (2) increase the upper limit of the WMF buffer size. Thanks!

Assignee: alwu → nobody
Severity: S2 → S3
Flags: needinfo?(jmuizelaar)
Priority: -- → P3

Take the wild speculation below with a grain of salt because I don't actually know anything about the topic, but here's something I chased down:

ffmpeg reports my video as using the "Baseline" profile, which I think is rare nowadays and I can't find any other videos like that. "Constrained Baseline" is usually used instead.

Using a hex editor to relabel the video as "Constrained Baseline" makes it play correctly and get hardware decoded. I have no idea about the validity of this or what happens if "Baseline" features are actually used in the encoding.

The bunny video Alastor linked is "Constrained Baseline." If I mark it as "Baseline" (which I'm guessing is legit) it still plays correctly, but "Video Decode" is now 0% in Task Manager. This actually makes sense since the Intel spec sheet doesn't say "Baseline" decoding is supported. So why does Chrome report hardware Video Decode happening?
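
For anyone who wants to check the same distinction without a hex editor, it comes down to two bytes at the start of the H.264 sequence parameter set. The following is a minimal sketch, assuming profile_idc 66 means Baseline and that constraint_set1_flag (bit 0x40 of the byte after profile_idc) marks Constrained Baseline. The function name and the sample byte values are illustrative only and were not read from the attached files.

```python
def h264_profile_name(profile_idc: int, constraint_flags: int) -> str:
    """Name an H.264 profile from the two bytes that follow the SPS NAL header.

    profile_idc is the first byte; constraint_flags is the second byte, whose
    top bits are constraint_set0_flag (0x80), constraint_set1_flag (0x40), ...
    """
    constraint_set1 = bool(constraint_flags & 0x40)
    if profile_idc == 66:
        return "Constrained Baseline" if constraint_set1 else "Baseline"
    if profile_idc == 77:
        return "Main"
    if profile_idc == 100:
        return "High"
    return f"other (profile_idc={profile_idc})"


# Illustrative byte values, not read from the attached videos.
print(h264_profile_name(66, 0x00))  # Baseline
print(h264_profile_name(66, 0x40))  # Constrained Baseline
```

Under this reading, relabeling a file from "Baseline" to "Constrained Baseline" amounts to setting that one flag bit, which says nothing about whether the bitstream actually avoids Baseline-only features.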

Attached video baseline-bunny.mp4

Set release status flags based on info from the regressing bug 1899450

(In reply to Alastor Wu [:alwu] from comment #39)

Jeff, as Sotaro is on vacation, I wonder if you have any idea about this? This problem seems to be a driver/graphics card bug: the reporter said that the issue doesn't happen with software decoding or with Nvidia hardware decoding. I made a custom build for the reporter that adjusts this value (I set it to 30), which fixes the issue as well.

Nvidia hardware decoding also caused the problem for me.

:alwu, is it possible to change WMFMediaDataDecoder::ProcessOutput() so that it exits when it gets one valid video frame? I am concerned that changing kOutputBufferSize to 30 may cause problems on some PCs.

Flags: needinfo?(alwu)

As I couldn't reproduce this issue on my side, I couldn't test it on my own build. Sotaro, when you are able to reproduce the issue, what happens inside WMFMediaDataDecoder::ProcessOutput? As the code iterates mMFTManager->Output(), I wonder what those output results are when it's stuck. If it stays stuck in the loop, does that mean WMFVideoMFTManager::Output() keeps returning OK but no output sample is returned? Could you help me check where the code gets stuck?

:alwu, is it possible to change WMFMediaDataDecoder::ProcessOutput() so that it exits when it gets one valid video frame?

Does doing that fix the problem for you?

Thanks!

Flags: needinfo?(alwu) → needinfo?(sotaro.ikeda.g)
Summary: Video playback freezes while audio keeps going (Intel and AMD GPUs) → Video playback freezes while audio keeps going with hardware decoded video zero copy

(In reply to Alastor Wu [:alwu] from comment #45)

As I couldn't reproduce this issue on my side, I couldn't test it on my own build. Sotaro, when you are able to reproduce the issue, what happens inside WMFMediaDataDecoder::ProcessOutput?

Yes, I could reproduce the problem with a local build.

As the code iterates mMFTManager->Output(), I wonder what those output results are when it's stuck.

The result was always S_OK with a valid video frame.

If it stays stuck in the loop, does that mean WMFVideoMFTManager::Output() keeps returning OK but no output sample is returned? Could you help me check where the code gets stuck?

When the problem happened, WMFVideoMFTManager::Output() was called 13 times and got stuck in IMFTransform::ProcessOutput().

(In reply to Alastor Wu [:alwu] from comment #45)

:alwu, is it possible to change WMFMediaDataDecoder::ProcessOutput() so that it exits when it gets one valid video frame?

Does doing that fix the problem for you?

Yes, D216775 addressed the problem for me, though I am not sure if the change is OK.

Flags: needinfo?(sotaro.ikeda.g)

This is a reminder regarding comment #27!

The bug is marked as tracked for firefox128 (release) and tracked for firefox129 (beta). We have limited time to fix this, the soft freeze is in 14 days. However, the bug still isn't assigned and has low priority.

(In reply to Sotaro Ikeda [:sotaro] from comment #46)

When the problem happened, WMFVideoMFTManager::Output() was called 13 times and got stuck in IMFTransform::ProcessOutput().

Do you mean that one call of WMFMediaDataDecoder::ProcessOutput() triggered WMFVideoMFTManager::Output() 13 times, which means the MFT manager returned 13 different decoded frames at once, and one of those frames took much longer than the others?

(In reply to Sotaro Ikeda [:sotaro] from comment #48)

Yes, D216775 addressed the problem for me, though I am not sure if the change is OK.

In theory, only returning one frame at a time should be fine, because our media pipeline will request more video frames if the amount of decoded frames is not enough. But I can't guarantee it won't cause any regression... Could you test your change on other videos (especially on 4k) to see if it causes any regression? Thanks!

Flags: needinfo?(jmuizelaar) → needinfo?(sotaro.ikeda.g)

(In reply to BugBot [:suhaib / :marco/ :calixte] from comment #49)

This is a reminder regarding comment #27!

The bug is marked as tracked for firefox128 (release) and tracked for firefox129 (beta). We have limited time to fix this, the soft freeze is in 14 days. However, the bug still isn't assigned and has low priority.

It is not a new regression for Intel GPUs on release. And for AMD GPUs, it only affects builds up to early beta.

(In reply to Alastor Wu [:alwu] from comment #50)

(In reply to Sotaro Ikeda [:sotaro] from comment #46)

When the problem happened, WMFVideoMFTManager::Output() was called 13 times and got stuck in IMFTransform::ProcessOutput().

Do you mean that one call of WMFMediaDataDecoder::ProcessOutput() triggered WMFVideoMFTManager::Output() 13 times, which means the MFT manager returned 13 different decoded frames at once, and one of those frames took much longer than the others?

12 frames had different timestamps, like the following, and the last WMFVideoMFTManager::Output() call did not exit, as described in comment 46.

  • {1600000,1000000}
  • {1633333,1000000}
  • {1666666,1000000}
  • {1700000,1000000}
  • {1733333,1000000}
  • {1766666,1000000}
  • {1800000,1000000}
  • {1833333,1000000}
  • {1866666,1000000}
  • {1900000,1000000}
  • {1933333,1000000}
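
Reading each pair as {value, timescale}, which is an assumption about the log format, the values are consecutive frame times in microseconds. A quick sketch showing that they line up with a 30 fps stream; the frame rate is inferred from the spacing and is not stated anywhere in the bug:

```python
from fractions import Fraction

# Timestamps copied from the list above, in microseconds.
logged_us = [1600000, 1633333, 1666666, 1700000, 1733333, 1766666,
             1800000, 1833333, 1866666, 1900000, 1933333]

FPS = 30  # assumed; inferred from the ~33333 us spacing


def frame_time_us(index: int, fps: int = FPS) -> int:
    """Presentation time of a frame in whole microseconds (truncated)."""
    return int(Fraction(index * 1_000_000, fps))


first_frame = round(Fraction(logged_us[0], 1_000_000) * FPS)
expected_us = [frame_time_us(first_frame + i) for i in range(len(logged_us))]

print(first_frame)               # 48
print(expected_us == logged_us)  # True
```

So the frames returned before the hang are simply the next frames in presentation order, with no gaps or duplicates among the logged values.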

(In reply to Alastor Wu [:alwu] from comment #50)

(In reply to Sotaro Ikeda [:sotaro] from comment #48)

Yes, D216775 addressed the problem for me, though I am not sure if the change is OK.

In theory, only returning one frame at a time should be fine, because our media pipeline will request more video frames if the amount of decoded frames is not enough. But I can't guarantee it won't cause any regression... Could you test your change on other videos (especially on 4k) to see if it causes any regression? Thanks!

I tested on my Win11 PC and did not see the problem. The results of the CI tests also seem to be fine.
https://treeherder.mozilla.org/jobs?repo=try&revision=de1327d2a67a1bd840111e26e6e309bf83bb48d9

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Sotaro Ikeda [:sotaro] from comment #53)

I tested on my Win11 PC and did not see the problem. The results of the CI tests also seem to be fine.
https://treeherder.mozilla.org/jobs?repo=try&revision=de1327d2a67a1bd840111e26e6e309bf83bb48d9

Okay, let's use D216775, but only do that for the video track when zero copy is enabled.
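
The behaviour discussed in this thread can be modelled in a few lines. This is only a toy sketch under stated assumptions, not the Gecko code: it assumes the decoder owns a fixed pool of output buffers, that a zero-copy frame keeps its buffer until the consumer releases it, and that asking for output with no free buffer is the point where the real call would hang. The class, the function names and the pool size of 12 are all hypothetical.

```python
class ToyDecoder:
    """Stand-in for a hardware decoder with a fixed pool of output buffers."""

    def __init__(self, pool_size: int, frames_ready: int):
        self.free_buffers = pool_size
        self.frames_ready = frames_ready

    def output(self) -> str:
        """Return 'frame', 'need_input', or 'would_block' (the real call hangs)."""
        if self.frames_ready == 0:
            return "need_input"
        if self.free_buffers == 0:
            return "would_block"
        self.free_buffers -= 1
        self.frames_ready -= 1
        return "frame"

    def release(self) -> None:
        """Consumer is done with a zero-copy frame; its buffer is reusable."""
        self.free_buffers += 1


def process_output(decoder: ToyDecoder, exit_after_one_frame: bool):
    """Drain decoder output; optionally stop after the first frame."""
    frames = 0
    while True:
        result = decoder.output()
        if result != "frame":
            return frames, result
        frames += 1
        if exit_after_one_frame:
            return frames, "ok"


# Draining everything in one call runs the pool dry on the 13th request.
print(process_output(ToyDecoder(pool_size=12, frames_ready=20), False))
# (12, 'would_block')

# Returning after each frame lets the consumer release buffers in between.
decoder = ToyDecoder(pool_size=12, frames_ready=20)
delivered = 0
while True:
    frames, status = process_output(decoder, True)
    if status != "ok":
        break
    delivered += frames
    decoder.release()  # consumer presents the frame, then frees it
print(delivered, status)  # 20 need_input
```

In this model, raising the pool size only moves the point at which the drain loop stalls, while returning after one frame avoids the stall regardless of pool size. Whether the real driver behaves this way is not confirmed in the bug.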

Flags: needinfo?(sotaro.ikeda.g)

Ok, thank you. I am going to update the patch.

Flags: needinfo?(sotaro.ikeda.g)
Assignee: nobody → sotaro.ikeda.g
Attachment #9413169 - Attachment description: WIP: Bug 1905022 - Exit WMFMediaDataDecoder::ProcessOutput() for video track if it gets one output → Bug 1905022 - Exit WMFMediaDataDecoder::ProcessOutput() if video track uses zero video frame copy and it gets one output
Status: NEW → ASSIGNED
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9cd1ec508d2c Exit WMFMediaDataDecoder::ProcessOutput() if video track uses zero video frame copy and it gets one output r=media-playback-reviewers,alwu
Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → 130 Branch

The patch landed in nightly and beta is affected.
:sotaro, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox129 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(sotaro.ikeda.g)

(In reply to Mayank Bansal from comment #7)

Can confirm that the first video "original.mp4" freezes for me on Nightly, and continues to play in Chrome.

This is fixed for me on the latest Nightly - the video continues to play (AMD zen3 APU)

Is this something we should uplift to ESR128? This plus bug 1909610 graft cleanly.

Flags: needinfo?(sotaro.ikeda.g)

It does not seem necessary to uplift to ESR128. Content that causes this problem is very rare.

Flags: needinfo?(sotaro.ikeda.g)

This issue did not reproduce on Windows 10 with an AMD RX580 GPU, but it did reproduce on Windows 11 with an Intel(R) UHD Graphics 630 GPU. Contrary to the tracking flags as set, this issue does reproduce on Firefox Release v103.0, v127.0 and v129.0.2, and it no longer reproduces on Beta v130.0 (RC) or Nightly v131.0a1.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
OS: Unspecified → Windows
Hardware: Unspecified → Desktop
See Also: → 1923697