Hardware accelerated UI rendering broken (Sony Vaio VPCCA2S0E gen6 gt2)
Categories
(Core :: Graphics: WebRender, defect)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox112 | --- | unaffected |
firefox113 | --- | unaffected |
firefox114 | + | fixed |
People
(Reporter: steven+mozilla, Assigned: gw)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: correctness, regression)
Attachments
(7 files)
Following a recent Nightly update, the whole UI just renders unreadably as soon as I start Firefox (screenshot attached). If I turn off hardware acceleration it renders just fine (screenshot attached).
The problem reproduces using mozregression-gui which uses a fresh profile with no add-ons.
So far I've tried only 32-bit Firefox.
Reproduction steps
- Open up Firefox.
Expected outcome
The GUI should render correctly (screenshot attached).
Actual outcome
The GUI renders unreadably (screenshot attached).
Workaround
Disable hardware acceleration (Preferences->Performance->Use recommended performance settings: off->Use hardware acceleration when available: off)
Bisection results
Last working build: dccf044c
First broken build: 1c58e874
Change that caused the issue
https://phabricator.services.mozilla.com/D173095
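For readers unfamiliar with how a bisection narrows down to a "last working" and "first broken" pair, here is a minimal sketch of the idea in Python. It is not how mozregression is implemented. The build list is hypothetical apart from the two hashes reported above, and `is_good` stands in for a manual "does the UI render correctly?" check.

```python
from typing import Callable, Sequence, Tuple


def bisect_builds(builds: Sequence[str],
                  is_good: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (last_good, first_bad).

    Assumes builds[0] is good, builds[-1] is bad, and that every build
    after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(builds[mid]):
            lo = mid
        else:
            hi = mid
    return builds[lo], builds[hi]


# Hypothetical history; only dccf044c and 1c58e874 come from this report.
history = ["aaaa0001", "aaaa0002", "dccf044c", "1c58e874", "bbbb0001", "bbbb0002"]
broken_from = history.index("1c58e874")

# Stand-in for launching each build and inspecting the UI by hand.
result = bisect_builds(history, lambda build: history.index(build) < broken_from)
print(result)
```

The "once bad, always bad" assumption only holds inside the range being searched. If a regressing change is backed out and later relanded, the search range has to be limited to one of the two breakages.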
GPU information
This is extracted from the full troubleshooting information which I've attached.
Active: Yes
Description: Intel(R) HD Graphics 3000
Vendor ID: 0x8086
Device ID: 0x0116
Driver Version: 9.17.10.4459
Driver Date: 5-19-2016
Drivers: igdumd64 igd10umd64 igd10umd64 igdumd32 igd10umd32 igd10umd32
Subsys ID: 00000000
RAM: 0
Other potentially relevant information
I can't get any useful logs (MOZ_LOG=all:5 gives no output, I can't work out a suitable set of modules for logging).
Some time ago I had a problem with WebRender on this machine. I got partway through tracking it down but never reported it. I worked round that by setting gfx.webrender.force-disabled to true and I've never set it back to false. If you think information from that investigation might be relevant then I'll dig up my notes.
Reporter | ||
Comment 3•2 years ago
|
||
This probably should be moved to product: Core, component: Graphics: WebRender, but I don't think I have permission to create bugs there.
Reporter | ||
Comment 4•2 years ago
|
||
Another workaround: instead of disabling hardware acceleration in about:preferences, the problem can be avoided by setting gfx.webrender.software to true.
I see that gfx.webrender.force-disabled has been renamed and possibly removed. So, it's possible that this is a repeat of the investigation I started some time ago (serves me right for not filing a bug report at the time).
In the troubleshooting information, under Graphics/Features, with no changes in settings (fresh profile) it says:
Compositing WebRender
With hardware acceleration completely disabled it says:
Compositing WebRender (Software)
With hardware acceleration enabled but gfx.webrender.software set to true, it says:
Compositing WebRender (Software D3D11)
The first one has broken rendering, the other two are fine.
All of this means either the right answer is to disable webrender on gen6 gt2 (undoing Bug 1638905), or to diagnose and fix the problem on this hardware.
Reporter | ||
Comment 5•2 years ago
|
||
It looks like it's different from the problem I was investigating some time ago.
I found my notes and tried my reproduction of the old issue. On build dccf044c, the website in question rendered just fine on a fresh profile without altering any settings. On 1c58e874, the rendering is damaged but it's not the catastrophe I saw when I was investigating some time ago.
So, it looks like in general, webrender for gen6 gt2 is OK, and this could be a new, specific bug.
Reporter | ||
Comment 6•2 years ago
|
||
I started poking random gfx.webrender settings. These are my results so far:
gfx.webrender.software=true fixes the problem (as previously reported).
gfx.webrender.debug.disable-batching=true has no effect
gfx.webrender.compositor=false has no effect.
gfx.webrender.max-partial-present-rects=0 has no effect.
gfx.webrender.debug.gpu-cache=true has no effect.
gfx.webrender.batched-texture-uploads=false has no effect.
gfx.webrender.blob-images=false has no effect.
gfx.webrender.dcomp-use-virtual-surfaces=false has no effect.
gfx.webrender.max-filter-ops-per-chain=1 has no effect.
gfx.webrender.multithreading=false has no effect.
gfx.webrender.use-optimized-shaders=false fixes the problem.
So, something in the shaders? I'm out of my depth here.
Reporter | ||
Comment 7•2 years ago
|
||
I should note that in my previous comment, I tried each of those settings one at a time and then reverted them.
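For anyone repeating this kind of one-pref-at-a-time test on a scratch profile, a small sketch that prints `user.js` lines for each trial may help. The pref names and values are the ones from comment 6. The helper name `user_pref_line` is made up for illustration, and the assumption is that each line is tried on its own in a test profile's `user.js` and removed before the next trial.

```python
def user_pref_line(name: str, value) -> str:
    """Format one pref as a user.js line (booleans, integers, or strings)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        literal = "true" if value else "false"
    elif isinstance(value, int):
        literal = str(value)
    else:
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        literal = f'"{escaped}"'
    return f'user_pref("{name}", {literal});'


# Settings tried in comment 6, each one applied alone and then reverted.
TRIALS = [
    ("gfx.webrender.software", True),
    ("gfx.webrender.debug.disable-batching", True),
    ("gfx.webrender.compositor", False),
    ("gfx.webrender.max-partial-present-rects", 0),
    ("gfx.webrender.use-optimized-shaders", False),
]

for name, value in TRIALS:
    print(user_pref_line(name, value))
```

Setting the same prefs by hand in about:config is equivalent; the script only saves retyping when the list is long.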
Comment 8•2 years ago
|
||
:gw, since you are the author of the regressor, bug 1823578, could you take a look? Also, could you set the severity field?
For more information, please visit auto_nag documentation.
Comment 9•2 years ago
|
||
Using software rendering is certainly a valid way to mitigate this, but I am a little curious what we did that broke on these drivers. The strange images on the left are clearly an offscreen render target that should not be making it to the picture cache tiles shown in the UI.
It's interesting that gfx.webrender.use-optimized-shaders=false fixes the problem. The most likely explanation for this rendering artifact is that one or more of the shaders is not compiling successfully in the driver. That is made somewhat harder to diagnose by the fact that ANGLE translates the shader from OpenGL to Direct3D11, so the driver may actually be having a problem with the Direct3D11 shader we're sending.
Comment 10•2 years ago
|
||
We may need to back this patch out because this is a common GPU and SWGL fallback may not be ideal. We'll be figuring this out on Monday.
Attempting to fix the shader optimizer may be easier than backing out the patch, but it will be difficult to pin down the problem in there.
We may want to make a downloadable blocklist implementation for disabling the shader optimizer as well.
Comment 11•2 years ago
|
||
I'll compare the shader differences with previous bugs we saw on Intel gen6 (https://github.com/jrmuizel/gen6-miscompilation, linked from https://github.com/servo/webrender/wiki/Driver-issues).
Comment 12•2 years ago
|
||
I'm a little confused by this bug, actually, as the regressor (bug 1823578) was backed out already?
Comment 13•2 years ago
|
||
Backed out but relanded a few days later.
Reporter | ||
Comment 15•2 years ago
|
||
> Backed out but relanded a few days later.
This matches what I saw. I saw it break one day, but by the time I got round to looking, I updated Nightly and it was fixed. Then it broke again a few days later and stayed broken. I tried to target the mozregression search to the second breakage (because, for all I knew at the time, these were two separate bugs and I didn't want to report something that had already been fixed).
Comment 16•2 years ago
|
||
I can reproduce this locally on a Gen6 0x126 with 9.17.10.3347
Comment 17•2 years ago
|
||
The bug is marked as tracked for firefox114 (nightly). We have limited time to fix this, the soft freeze is in 3 days. However, the bug still isn't assigned.
:bhood, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 18•2 years ago
|
||
Seems like this could be the same underlying cause as bug 1708937
Assignee | ||
Comment 19•2 years ago
|
||
This should be temporarily resolved when bug #1830691 lands. We'll need to work out the underlying cause of this before we can re-enable the new clip-mask rendering paths.
Assignee | ||
Comment 20•2 years ago
|
||
Steven, are you able to confirm whether the most recent nightly now works correctly on your hardware (with config settings reverted to their previous values)?
Reporter | ||
Comment 21•2 years ago
|
||
No. It's not fully fixed.
It's better: more of the UI is working (notably, many of the buttons), but there are still problems in the UI and the web page.
I've made sure I'm up to date on Nightly 2023-05-04 (20230504215417).
Disabling optimised shaders fixes it. I'll attach a couple of screenshots.
Reporter | ||
Comment 24•2 years ago
|
||
To my untrained eye, it looks like the problem is now restricted to backgrounds.
Reporter | ||
Comment 25•2 years ago
|
||
Definitely backgrounds, but also composition of text onto backgrounds.
All the images, text and so on are in the right places but backgrounds and text are the wrong colour. Sometimes it's the background that's the wrong colour. The problem with the text is that it's invisible (same colour as the background) regardless of whether the background was the correct colour.
I've attached a video which should make things clearer.
I double checked. On build dccf044c, just before bug 1823578 landed, everything was OK. On 1c58e874, just after, everything was broken. On build 45d725a4, just before bug 1830691 landed, everything was still broken. On build a62a959b, just after, things are in the right place but the background/text problem exists.
As before, turning off shader optimisation fixes the problem.
Assignee | ||
Comment 26•2 years ago
|
||
I suspect the changes to the base shader in that patch are exposing a bug in the shader optimizer or the driver. I'll create a build today that reverts more of that. I'll post a link to one or more test builds here once that's done, if it would be feasible for you to test them for me.
Reporter | ||
Comment 27•2 years ago
|
||
I may be able to test a build depending on when you send the link. I'm on UK time.
I normally run 32-bit Firefox, but the problem shows on both 64-bit and 32-bit, so make whatever's easier.
Assignee | ||
Comment 28•2 years ago
|
||
This try run will create both 32 and 64 bit builds [1].
For 32-bit Windows, the build has completed and a zip artifact can be downloaded from [2]. I believe if you unzip that to a local directory and run it, you should be able to test it without any installer etc. It should use your existing nightly profile, I think.
The 64-bit build hasn't quite completed yet.
Assignee | ||
Comment 29•2 years ago
|
||
The x64 build is now completed, available at [1].
Reporter | ||
Comment 30•2 years ago
|
||
Both zip files (32-bit and 64-bit) still show the problem (I've made a blank profile for testing, so I started them with -P "Test").
If you want to check I was running the right version, the troubleshooting information for both reports version 20230507194423.
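As a quick way to compare the build IDs quoted in this thread, here is a sketch that assumes they are timestamps in YYYYMMDDHHMMSS form. The function name `parse_build_id` is illustrative only.

```python
from datetime import datetime


def parse_build_id(build_id: str) -> datetime:
    """Parse a 14-digit build ID, assumed to be a YYYYMMDDHHMMSS timestamp."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


test_build = parse_build_id("20230507194423")  # try build from comment 30
nightly = parse_build_id("20230504215417")     # Nightly from comment 21

print(test_build.isoformat())
print("test build is newer:", test_build > nightly)
```

This only confirms which build was produced later. It says nothing about which patches each build contains.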
Assignee | ||
Comment 31•2 years ago
|
||
That does look like the correct build id, thanks for checking. Do you happen to know if the driver on your machine from 2016 is the most recent driver available? It seems very old, but then maybe that's the last supported driver for that GPU?
Assignee | ||
Comment 32•2 years ago
|
||
I wonder if the best option might be to block hw-rendering on hd3000 drivers from 2016. How does the browser performance feel on your machine in general if you have gfx.webrender.software enabled?
Comment 33•2 years ago
|
||
Would it be better to only block optimized shaders on hd3000? It looks like there's already infrastructure for doing this
https://searchfox.org/mozilla-central/rev/4e6970cd336f1b642c0be6c9b697b4db5f7b6aeb/widget/GfxInfoBase.cpp#227
Assignee | ||
Comment 34•2 years ago
|
||
I'm a bit worried that will just mask the driver bug until the next issue we hit like this. But it's probably worth doing in this case, and if we run in to it again we might block hw-wr completely.
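To make the trade-off concrete, a narrow per-device block could be sketched like this in Python. This is not the GfxInfo code: the `Gpu` type, the `GEN6_AFFECTED` set and the function name are all hypothetical, and the set holds only the two device IDs mentioned in this bug rather than the full gen6 list.

```python
from dataclasses import dataclass

INTEL_VENDOR_ID = 0x8086


@dataclass(frozen=True)
class Gpu:
    vendor_id: int
    device_id: int


# Device IDs seen in this bug only: 0x0116 (reporter) and 0x0126 (comment 16).
# A real blocklist would need the complete set of gen6 IDs.
GEN6_AFFECTED = frozenset({0x0116, 0x0126})


def allow_optimized_shaders(gpu: Gpu) -> bool:
    """Block only the shader optimizer, leaving hardware WebRender enabled."""
    return not (gpu.vendor_id == INTEL_VENDOR_ID
                and gpu.device_id in GEN6_AFFECTED)


print(allow_optimized_shaders(Gpu(0x8086, 0x0116)))  # affected device
print(allow_optimized_shaders(Gpu(0x10DE, 0x0116)))  # different vendor
```

Blocking hw-wr completely would be the same check applied to a broader feature. That removes the risk of the next driver bug but costs the hardware path on these devices.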
Reporter | ||
Comment 35•2 years ago
|
||
I'm pretty sure this is the latest driver. This is a really old processor (2nd generation Intel Core) and is not being updated (it is end-of-life).
The Intel web site (https://www.intel.com/content/www/us/en/download/17608/intel-graphics-driver-for-windows-15-28.html) lists 9.17.10.4229 from 6/5/2015 as the latest. I'm running something similar (9.17.10.4459, maybe a manufacturer variant). Comment 10 on bug 1678903 gives a similar date.
I'll have to find a site to play with to see what the performance is like. I've been running just with optimised shaders turned off since I found that worked.
Assignee | ||
Comment 36•2 years ago
|
||
Jeff, Andrew, is it easy to block optimized shaders on Windows for this device (old gen6)? What would be the right approach to do that? I think that might be the best workaround for now, since merge day is tomorrow.
Assignee | ||
Comment 38•2 years ago
|
||
A follow-up build for testing is available when you have time. This should block the shader optimization pass on your GPU.
Reporter | ||
Comment 39•2 years ago
|
||
I hunted around and found a benchmark at https://browserbench.org/MotionMark1.2/ that shows the differences.
I tried latest Nightly (20230507095340) with three settings: default ("h/w"), gfx.webrender.use-optimized-shaders: false ("unopt"), gfx.webrender.software: true ("s/w"). I also tried the builds just before and after the first breakage (default settings only) and Chrome (113.0.5672.64) and Edge (113.0.1774.35).
I should note that the first run with Firefox after starting it up gave bad results (like 1.00±300.00%) for the first test (Multiply) which distorted the results so I discarded that run and took the values from the second run. Still, this was better than Edge which just failed to run at all the first time but ran OK after refreshing the page.
Test | 20230507 h/w | 20230507 unopt | 20230507 s/w | dccf044c h/w | 1c58e874 h/w | Edge | Chrome |
---|---|---|---|---|---|---|---|
Renders correctly | no[1] | yes | no[2] | yes | no[3] | yes | yes |
Overall score | 135.67 ±8.59% | 140.15 ±7.74% | 99.84 ±8.25% | 133.46 ±9.87% | 144.30 ±8.09% | 150.29 ±4.99% | 137.85 ±7.07% |
Multiply | 30.73±29.34% | 41.31±20.87% | 60.88±22.66% | 35.62±35.50% | 68.37±18.35% | 230.61 ±4.54% | 266.18 ±5.71% |
Canvas Arcs | 380.07 ±5.12% | 395.85 ±4.04% | 204.77 ±5.28% | 376.33 ±5.48% | 391.85 ±4.31% | 160.43 ±2.14% | 112.28±10.41% |
Leaves | 149.59 ±7.84% | 148.62 ±6.32% | 93.14 ±9.72% | 147.51±15.16% | 117.65±13.03% | 92.00 ±4.35% | 53.55 ±7.44% |
Paths | 1020.45 ±4.74% | 1062.25 ±3.68% | 698.75 ±4.46% | 1072.05 ±5.60% | 1019.13 ±6.67% | 433.37 ±2.78% | 376.93 ±4.28% |
Canvas Lines | 777.33 ±8.38% | 729.07 ±7.35% | 500.22 ±3.25% | 811.53 ±4.21% | 744.90 ±4.38% | 2485.60 ±5.58% | 2474.41 ±3.47% |
Images | 9.09 ±3.74% | 8.89 ±4.40% | 30.12 ±5.94% | 8.31 ±4.70% | 8.63 ±3.18% | 53.06 ±7.30% | 41.30 ±6.24% |
Design | 47.53 ±9.01% | 48.21 ±9.04% | 8.73 ±6.83% | 39.04 ±8.24% | 45.86±10.85% | 25.65±12.59% | 36.00±22.22% |
Suits[4] | 191.60 ±3.56% | 184.38 ±5.18% | 92.55±10.40% | 180.37 ±4.45% | 198.56 ±4.50% | 52.18 ±6.19% | 58.75 ±3.77% |
The difference in benchmarks between full hardware acceleration and skipping shader optimisation is within the run-to-run variation.
[1] One of the shapes in Multiply flickers (and there are issues with text before and after the test).
[2] Many of the shapes in Multiply have stray pixels in a near-line outside the shape at some angles (maybe worth investigating separately).
[3] Many shapes in Multiply flicker and, as per the initial bug report, the UI is badly corrupted.
[4] In Firefox, there's a noticeable stutter at the start of each burst whereas in Edge it's smooth.
So the headline figure doesn't change that much between hardware and software rendering (about 140 drops to about 100), but this is because some scores drop while others, notably Images, improve. That may be worth investigating separately, since hardware rendering shouldn't make anything worse.
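The "within the run-to-run variation" reading of the overall scores can be checked with a little arithmetic. The sketch below treats each "±N%" as a symmetric interval around the score, which is a simplifying assumption and not a statement about how MotionMark derives its error figures.

```python
from typing import Tuple

Interval = Tuple[float, float]


def interval(score: float, pct: float) -> Interval:
    """Symmetric interval score ± pct% (simplifying assumption)."""
    delta = score * pct / 100.0
    return score - delta, score + delta


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Overall scores for Nightly 20230507 from the table above.
hw = interval(135.67, 8.59)     # default settings
unopt = interval(140.15, 7.74)  # gfx.webrender.use-optimized-shaders: false
sw = interval(99.84, 8.25)      # gfx.webrender.software: true

print("h/w vs unopt overlap:", overlaps(hw, unopt))
print("h/w vs s/w overlap:", overlaps(hw, sw))
```

Under that assumption the h/w and unopt intervals overlap, while the s/w interval sits entirely below the h/w one.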
Reporter | ||
Comment 40•2 years ago
|
||
Yes. The new builds (20230507234502) that block shader optimisation appear to work (I checked both 32-bit and 64-bit).
Assignee | ||
Comment 41•2 years ago
|
||
Thanks for the detailed benchmarks and testing. We'll land that patch as an interim fix, and look for a better long-term fix.
Reporter | ||
Comment 44•2 years ago
|
||
Just to close the loop, my main copy of Nightly just upgraded to 115.0a1 (20230509093033) and everything's rendering correctly (on both my normal profile and the one I'm using for testing).
The graphics section of the troubleshooting info says WEBRENDER_OPTIMIZED_SHADERS, env, blocklisted, Blocklisted by gfxInfo, Blocklisted due to known issues: bug 1829487
Assignee | ||
Comment 45•2 years ago
|
||
Excellent, thanks for confirming.