NVENC-encoded H.264 sent over WebRTC to Firefox 119 on Windows results in a glitching stream - a problem which doesn't exist in Firefox 118 and prior (except for 110)
Categories
(Core :: Graphics, defect)
People
(Reporter: peinchka, Assigned: sotaro)
Attachments
(3 files)
Steps to reproduce:
Send an NVENC-encoded H.264 stream over WebRTC to Firefox 119 on Windows. Tested with a range of resolutions at 30 fps; reproducible on two different Windows machines with Nvidia GPUs (with hardware decoding active).
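For illustration only: a minimal receive-side sketch of this kind of setup. The code below is hypothetical, not the actual pages in use, and it assumes a separate signaling channel delivers the sender's SDP offer.

// Hypothetical receive-only page (not the exact production setup).
// Assumes an external signaling channel delivers the NVENC sender's SDP offer
// and that a <video> element exists on the page.
const pc = new RTCPeerConnection();
const video = document.querySelector('video');

pc.ontrack = (event) => {
  // Attach the incoming H.264 track to the <video> element.
  video.srcObject = event.streams[0];
};

async function handleOffer(offerSdp) {
  await pc.setRemoteDescription({ type: 'offer', sdp: offerSdp });
  await pc.setLocalDescription(await pc.createAnswer());
  return pc.localDescription.sdp; // returned to the sender via signaling
}

// Periodically confirm the negotiated codec really is H.264 (not a VP8/VP9 fallback).
setInterval(async () => {
  const stats = await pc.getStats();
  stats.forEach((report) => {
    if (report.type === 'inbound-rtp' && report.kind === 'video') {
      const codec = stats.get(report.codecId);
      console.log(codec && codec.mimeType, 'framesDecoded:', report.framesDecoded);
    }
  });
}, 5000);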
Actual results:
Playback glitches: frames from approximately 0.5 to 1 second earlier are randomly interleaved with the current frames, making the video stream practically unusable.
Expected results:
Playback of video should not glitch. Curiously, Firefox 119 on Android and iOS does not experience this issue, using precisely the same set-up, and Firefox 118 and 120.0beta on Windows are equally glitch-free.
Comment 1•1 year ago
The Bugbug bot thinks this bug should belong to the 'Core::Graphics' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 2•1 year ago
Can you provide a link to test this playback?
Or could you use https://mozilla.github.io/mozregression/ to find out the changeset that introduced the problem?
Apologies for the delay in responding! Alas, it isn't straightforward to provide a link to test playback, but I will set something up if absolutely necessary. Further testing has revealed this issue can occur in any recent version of Firefox on Windows (certainly, 113 onwards), though it doesn't occur in any Chrome-based browser on Windows, nor in Firefox on Android or iOS.
I've found a remedy which will hopefully shed light on the cause of the issue:
Web Developer Tools => Performance => Start recording (with "Settings" set to "Web Developer" or "Firefox").
During and after a recording, the problem completely disappears, and only reappears after closing and reopening Firefox, suggesting some setting related to H.264 hardware decoding is reset upon commencing a recording.
Many thanks in advance for taking a look at this! :)
Comment 4•1 year ago
If it's too hard to provide a link for testing, could you maybe use the tool https://mozilla.github.io/mozregression/ ? That will tell us exactly which code change caused this regression. It's pretty quick and easy to use, and provides a way to get to the root of the problem quickly.
Apologies again for the delay in responding! Here is a long list of Firefox versions and whether they function correctly for H.264 over WebRTC on Windows:
100 - yes
105 - yes
108 - yes
109 - yes
110 - no
111 to 118 - yes
119 - no
119.b1 - yes
119.b2 - yes
119.b3 - yes
119.b4 - no
119.b5 - no
119.b6 - yes
119.b7 - no
120.b1 to 120.b6 - no
Unfortunately, using the Mozregression tool, I'm unable to find a Firefox iteration that works correctly, regardless of what version numbers or dates I test between. For example, setting the last known good build as 109 and the first known bad build as 110, 12 versions are generated for testing before the bisection is declared done, none of which work correctly. Based on the above information, please suggest suitable release version numbers or dates to test between, and I'll try again. Many thanks!
Comment 6•1 year ago
Thanks for trying all those versions!
It seems like the problem might only show up maybe 80% of the time? Would that make sense? If that is the case I would suggest trying each build a few times to make sure it is either good or bad.
What happens if you set 100 as the good build, and 120 as the bad build?
After testing each build multiple times, I can confirm that when a build is bad, it invariably remains bad. When I previously stated that 120beta builds worked correctly, it turns out it wasn't H.264 over WebRTC that was being tested.
With Mozregression, I've now tried setting 100 as the good build and 120 as the bad build, but after just ten builds (which either glitched, or didn't support H.264, so reverted to VP9), the process stopped.
Please advise on what the most fruitful next step may be. Cheers!
Comment 8•1 year ago
Ten builds is actually the right amount to bisect that range down to a single day, so that seems okay, as long as you answered correctly for each build (if you got VP9 fallback and were unable to test a build, you shouldn't mark it as either good or bad).
So at the end of the process there should be text output, that output will tell us which changesets landed when the bug regressed based on the tests you just did. Copy and paste that output into this bug (in a comment or upload a file).
Ah, brilliant, thanks! I've been making the mistake of marking VP9 builds as bad! Took a long while, but I've eventually ended up with this:
Differential Revision: https://phabricator.services.mozilla.com/D162353
Reporter
Comment 10•1 year ago
Here's the final log from Mozregression in its entirety, in case it's useful.
Comment 11•1 year ago
Thanks for including the full output.
I'm a little confused by your results. The range that it seems to indicate is
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=bd8c9b74&tochange=af2c13b9
Nothing in there seems likely to have caused this.
Are you picking Skip or Retry when you get VP9? And are you sure the randomness in how the problem shows up isn't interfering with your testing?
Reporter
Comment 12•1 year ago
I selected Skip every time VP9 occurred, as I assumed that Retry would persist with VP9 - I'll go through the whole process again, but will attempt several retries before skipping versions that use VP9. Sorry, I'm not sure what you're asking in that final question...
Comment 13•1 year ago
FWIW, I've been having this same issue as described in the original post while using Wyze-Bridge with WebRTC locally. It is 100% reproducible for me. The only solutions I've found were to use either Chrome or Edge, or to disable Hardware Acceleration in Firefox.
Reporter
Comment 14•1 year ago
To clarify, the issue is also 100% reproducible for me. A few days ago, when I stated: "Further testing has revealed this issue can occur in any recent version of Firefox on Windows (certainly, 113 onwards)", it was because I was struggling to find a version that worked correctly using Mozregression (which I now know was due to marking VP9-only builds as bad). All the release versions from 111 to 118 work perfectly every time, and 119 and 120 never work (though 119.b1 to b3 and b6 are fine).
This issue looks identical, especially from the video in the first post: https://github.com/mrlt8/docker-wyze-bridge/issues/1025
Comment 15•1 year ago
Is there some logging that might help us track this down?
Reporter
Comment 16•1 year ago
Apologies for the delay! I’ve found a strange quirk which appears to provide a big clue as to the origins of this issue. If I go to C:\Users\<Username>\AppData\Roaming\Mozilla\Firefox and delete the Profiles folder and installs.ini/profiles.ini files, subsequently doing a fresh installation of 64-bit Firefox 110, 119 or 120 (any beta version of the latter) invariably displays the H.264 glitching issue. However, if I delete the profiles, install 109 (which doesn’t have the glitch), then reinstall 110, it appears to use the profile from 109 and is glitch-free, even after restarting Firefox. Similarly, upgrading directly from 109 to 119 or any of the 120 beta versions also remedies the glitch, but restarting Firefox in these cases seems to update the profile and the glitch appears.
Another foible is that no version of Firefox appears to support H.264 immediately upon installation – instead, something installs silently in the background, typically over the space of 2 or 3 minutes, to add support, though it seems quicker in more recent versions.
In light of the above, I’ve retested every release version of Firefox from 100 to 120, deleting all profiles between each installation and patiently waiting for H.264 support to appear, and can confirm that only 110, 119 and 120 have the issue. There seems to be nothing intrinsically wrong with any of these versions, but something is definitely corrupt in their profiles.
Reporter
Comment 17•1 year ago
Using the Mozregression tool with the last known good version set to 109 and the first known bad version set to 110 only resulted in bad versions and an unhelpful result. Thankfully, running the same test again with the last known good version as 108 resulted in this outcome, which looks potentially very useful, finally:
“Bug 1798242 - Check if video overlay works without ZeroCopyNV12Texture with non-intel GPUs r=gfx-reviewers,jrmuizel
The change disables ZeroCopyNV12Texture for checking if the video overlay works without ZeroCopyNV12Texture with non-intel GPU.
For now, on release, ZeroCopyNV12Texture is not enabled with non-intel GPUs. It blocks to enable video overlay with non-intel GPUs. Then it seems better to enable video overlay with non-intel GPUs on release without ZeroCopyNV12Texture if possible.
Differential Revision: https://phabricator.services.mozilla.com/D160744”
I can provide the full log upon request.
Reporter
Comment 18•1 year ago
For anyone else experiencing this issue, if you’re able to modify the HTML code of the webpage, slightly blurring a single pixel has been found to completely remedy the glitching problem in all affected Firefox versions. A line like this is all you need:
<div style="width: 1px; height: 1px; position: fixed; backdrop-filter: blur(0.01rem)"></div>
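If you can run script on the page but can't edit its markup, injecting the same element dynamically should also work; a hypothetical sketch:

// Hypothetical userscript-style injection of the same workaround element.
const fix = document.createElement('div');
fix.style.cssText =
  'width: 1px; height: 1px; position: fixed; backdrop-filter: blur(0.01rem)';
document.body.appendChild(fix);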
Updated•1 year ago
Assignee
Comment 19•1 year ago
Hi peinchka, can you capture a Firefox Profiler recording with the media settings enabled, and can you attach your about:support information to this bug?
https://profiler.firefox.com/
Comment 20•1 year ago
The result of mozregression seems to suggest this is gfx-related. Changing component for now.
Updated•1 year ago
Reporter
Comment 21•1 year ago
Reporter
Comment 22•1 year ago
Reporter
Comment 23•1 year ago
Please find attached the requested documents! :)
Comment 24•1 year ago
Per comment 17, it seems to be a gfx issue, so I'm removing the needinfo. In addition, for general media playback logging we can use about:logging with the media preset, which already has most of the log modules we use for debugging media issues. But I am not sure which log module we should use for WebRTC.
Comment 25•1 year ago
I am also having this issue with ff119. vaughn.live and ivlog.tv are two sites that it is obvious on.
The one blurred pixel does fix the jitter for me.
Comment 26•1 year ago
(In reply to Shin Ring from comment #25)
> I am also having this issue with ff119. vaughn.live and ivlog.tv are two sites that it is obvious on.
> The one blurred pixel does fix the jitter for me.
Just an additional comment. I don't think vaughn.live uses webRTC for its video, yet it is affected by this.
Reporter
Comment 27•1 year ago
The same issue exists for H.264 over WebSocket, so it's definitely not limited to WebRTC, and the deciding factor seems to be hardware decoding of H.264 on an Nvidia GPU.
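As a quick sanity check for whether hardware (power-efficient) H.264 decoding is in play on a given machine, a Media Capabilities query along these lines can be used; the codec string, resolution, bitrate and framerate below are placeholder values:

// Sketch: ask the browser whether H.264 decoding at this configuration is
// supported and power-efficient (power-efficient usually implies hardware).
// The codec string, resolution, bitrate and framerate are placeholders.
navigator.mediaCapabilities.decodingInfo({
  type: 'media-source',
  video: {
    contentType: 'video/mp4; codecs="avc1.64001f"',
    width: 1920,
    height: 1080,
    bitrate: 4000000,
    framerate: 30,
  },
}).then((result) => {
  console.log('supported:', result.supported,
              'smooth:', result.smooth,
              'powerEfficient:', result.powerEfficient);
});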
Reporter
Comment 28•1 year ago
To clarify, H.264 over WebSocket means the stream is fed to the video element via MSE (Media Source Extensions).
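In other words, the affected path looks roughly like this; a minimal sketch, where the WebSocket URL and codec string are placeholders and the server is assumed to send fragmented MP4 segments (initialization segment first):

// Minimal MSE sketch of the H.264-over-WebSocket path described above.
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  const sourceBuffer =
    mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');
  const queue = [];

  // Append queued segments one at a time, respecting the updating flag.
  const appendNext = () => {
    if (queue.length && !sourceBuffer.updating) {
      sourceBuffer.appendBuffer(queue.shift());
    }
  };
  sourceBuffer.addEventListener('updateend', appendNext);

  const ws = new WebSocket('wss://example.invalid/h264'); // placeholder URL
  ws.binaryType = 'arraybuffer';
  ws.onmessage = (event) => {
    queue.push(event.data);
    appendNext();
  };
});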
Assignee
Comment 29•1 year ago
From comment 17, Bug 1796511 seems to address the problem. But it might take a long time until it is addressed.
Assignee
Updated•1 year ago
Comment 30•1 year ago
The severity field is not set for this bug.
:bhood, could you have a look please?
For more information, please visit BugBot documentation.
Comment 31•1 year ago
Seems like you understand this issue the best. Could you set the severity field?
Assignee
Updated•1 year ago
Comment 32•11 months ago
I'm unsure how much this helps or brings awareness to the bug, but we experienced the same issue and eventually reached a similar solution, likewise applying a CSS filter property function: we discovered that blur, hue-rotate(360deg) and most of the other image-feed modifiers stop the frame jitter.
We couldn't test with GPUs other than Nvidia, but all Nvidia cards presented the same problem.
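For reference, the workaround amounts to something like the following (hypothetical selector for the affected video element):

// Hypothetical: apply a visually no-op CSS filter to the affected <video>
// element; per the comments above, this stops the frame jitter.
document.querySelector('video').style.filter = 'hue-rotate(360deg)';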
Updated•11 months ago
Comment 33•10 months ago
Hello, this is most likely a dupe of https://bugzilla.mozilla.org/show_bug.cgi?id=1817617, which has a working patch by sotaro that is currently WIP. See that bug for more details.
Assignee
Comment 34•9 months ago
peinchka, does the problem still happen with the latest Nightly?
Comment 35•5 months ago
A needinfo is requested from the reporter, however, the reporter is inactive on Bugzilla. Given that the bug is still UNCONFIRMED, closing the bug as incomplete.
For more information, please visit BugBot documentation.