Open Bug 1808106 Opened 3 years ago Updated 1 year ago

transparent 4x4px hole in image at specific positions/conditions (mesa/crocus, a new driver for older Intel, has replaced classic mesa/i965 in Mesa 22)

Categories

(Core :: Graphics: WebRender, defect)

Firefox 108
x86_64
Linux
defect

Tracking

()

UNCONFIRMED

People

(Reporter: firefoxbugs, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: correctness)

Attachments

(6 files)

Attached file bug.htm

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:108.0) Gecko/20100101 Firefox/108.0

Steps to reproduce:

Place a picture of any size, content, or format in a div, with the size and position set to specific values as demonstrated in the attachment.

Actual results:

There is a 4px by 4px hole in the image, revealing the background behind it.

Expected results:

There obviously should be no hole.

Other info:
Reproduced/observed on three different browser installations on this machine, all with different profiles. (OS is Kubuntu 22.10.)

  1. Firefox 108.0.1 (64-bit) Snap for Ubuntu (canonical-002 -1.0) [snap version: 108.0.1-1 2022-12-16 (2211)]
  2. Firefox 108.0.1 (64-bit) extracted from firefox-108.0.1.tar.bz2 downloaded from mozilla.org, virgin profile with no extensions
  3. Tor Browser 12.0.1 (based on Mozilla Firefox 102.6.0esr) (64-bit)

This gif shows a series of screenshots with different places in which the hole can be seen. Value of "left" position is shown at the bottom. This is based on a slight variation of bug.htm. As you can see, the positive values of "left" are periodic (every 16px), while the negative values don't follow any obvious pattern and have the hole show up at different places in the picture. They are all pretty easy to reproduce, but sometimes some of the positions require a little "help" if they don't reproduce on initial loading, e.g.

  1. Move the picture with JavaScript.
  2. Press reload while the defect is not reproduced; this might reveal the defect at that location for a split second before the picture (and defect) moves back to its original spot.
  3. If the initial location doesn't reproduce the defect, move the image with JavaScript to a place where it does, then reload the page, and the defect will now appear.

List of "left" positions (in px) shown in this gif:
[-146, -144, -129, -114, -112, -110, -75, 15, 31, 47, 63, 79, 95, 111, 127, 143, 159, 175, 191, 207, 223, 239, 255, 271, 287, 303, 319, 335, 351, 367, 383, 399, 415, 431, 447, 463, 479, 495, 511]
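The periodicity claim can be checked directly against the list above. This is a minimal sketch that only does arithmetic on the reported values; the variable names are illustrative and nothing here talks to Firefox or the driver.

```python
# "left" positions (px) copied from the list in this comment.
LEFTS = [-146, -144, -129, -114, -112, -110, -75,
         15, 31, 47, 63, 79, 95, 111, 127, 143, 159, 175, 191, 207,
         223, 239, 255, 271, 287, 303, 319, 335, 351, 367, 383, 399,
         415, 431, 447, 463, 479, 495, 511]

positive = [x for x in LEFTS if x > 0]
negative = [x for x in LEFTS if x < 0]

# Distinct gaps between consecutive positive positions.
positive_steps = {b - a for a, b in zip(positive, positive[1:])}

# Offsets within a 16px period (Python's % is non-negative for a positive modulus).
positive_residues = {x % 16 for x in positive}
negative_residues = [x % 16 for x in negative]

print("positive steps:", positive_steps)
print("positive residues mod 16:", positive_residues)
print("negative residues mod 16:", negative_residues)
```

Every positive value has the form 15 + 16k, while the negative values land on several different offsets within a 16px period, which matches the "no obvious pattern" observation.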

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

This might depend on DPI settings? I don't see this on Windows at least... Could you attach the updated comment 2 test-case to the bug? Could you also attach your about:support information? Does gfx.webrender.software=true fix it for you?

Component: Widget: Gtk → Graphics
Flags: needinfo?(firefoxbugs)

As requested in comment 4. The buttons move the image left and right between the known positions. I found these positions by moving the image one pixel at a time. The initial start location can be modified with lines 22 & 40.

As requested in comment 4.

Q. This might depend on DPI settings?
A. Forcing system's font DPI has no effect on rendering.

Q. Does gfx.webrender.software=true fix it for you?
A. Yes, the bug cannot be reproduced when set to true but can be reproduced when returned to false. (Change requires restarting Firefox to take effect.) Support info provided is of course with it set to false.

Flags: needinfo?(firefoxbugs)

(In reply to firefoxbugs from comment #6)

Q. Does gfx.webrender.software=true fix it for you?
A. Yes, the bug cannot be reproduced when set to true but can be reproduced when returned to false. (Change requires restarting Firefox to take effect.) Support info provided is of course with it set to false.

Thanks. That smells like a driver bug... It might be worth upgrading mesa to see if the bug is already fixed there. I'm not familiar with the crocus driver (I'm no expert there). It seems relatively new?

Component: Graphics → Graphics: WebRender
Blocks: wr-linux
Keywords: correctness
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Summary: transparent 4x4px hole in image at specific positions/conditions → transparent 4x4px hole in image at specific positions/conditions (mesa/crocus, a new driver for older Intel, has replaced classic mesa/i965 in Mesa 22)

@gw @ahale FYI, and added to gfx-triage to discuss
"All old intel on linux" sounds bad enough for S2.

Blocks: gfx-triage
Severity: -- → S2
Flags: needinfo?(gwatson)
Flags: needinfo?(ahale)

It's unclear which hardware is affected, or whether this is a bug in the driver for all the cards, right?

Does setting gfx.webrender.debug.force-picture-invalidation (and restarting the browser) have any effect on your repro steps / how / when it occurs?

This does sound like a driver bug. I think it would be worth opening a bug upstream in mesa to see if the mesa developers have any ideas or thoughts.

Flags: needinfo?(gwatson)
No longer blocks: gfx-triage
Severity: S2 → S3

Adding this attachment to show examples found in the wild too, not just from test examples. It seems to happen frequently in thumbnails in a list of YouTube videos (either search results or channel; I think one of these is even a YouTube result in a DuckDuckGo search). As you can see, it's not always a 4x4 pixel hole, though that's a common size.

Above, there seems to be some shade cast on the crocus driver. I don't know much about it, but I don't know of another driver to use. I suppose if I have time I could find and boot older distros that don't have this driver, but then maybe those drivers would be blamed. Even if I found a different driver under which this bug didn't manifest, would that mean the problem is definitely the driver or only that I changed the circumstances of the test enough that the bug didn't manifest?

In reply to comment 11:
Q. Does setting gfx.webrender.debug.force-picture-invalidation (and restarting the browser) have any effect on your repro steps / how / when it occurs?
A. Yes. It does not make the problem go away, but it does change things a bit. And the change does not require restarting the browser. (Of course I did restart the browser to test that also, but I didn't notice the restart making a difference.)

Details (using bug.htm above):
No difference. The hole is displayed either way.

Details (using my bug-move.htm above):
With gfx.webrender.debug.force-picture-invalidation set to false (default):

  • All the negative values (-146 to -75) of left in the positions variable reliably show the hole if the starting position is a positive value and you move down to them. (Also in other conditions, the rules of which are too complicated to explain here.)
  • The higher positive values (319 to 511) reliably show the hole.
  • The lower positive values (15 to 303) show the hole IF they are the initial start positions when you first load the example in a new tab, but may or may not show under other circumstances, the rules of which are complicated and I won't bore you with them.

With gfx.webrender.debug.force-picture-invalidation set to true:

  • None of the negative values (-146 to -75) show the hole.
  • All of the positive values (15 to 511) reliably show the hole without any complicated rules.

I went back and forth between the settings a few times and what I've described seems very stable and reproducible.

Based on other debugging I've done on Intel iGPUs I'm guessing this is a tile mask bug like https://bugzilla.mozilla.org/show_bug.cgi?id=1817240 on Windows where the framebuffer appears to be made of small squares (8x4 in that bug but perhaps 4x4 on this iGPU).

The position of this hole would depend on where in the render target texture the holes appear and what order things are drawn by WebRender.

I'm not sure if this is a clear bug like that one or some other variant but 4x4 holes seem pretty specific as Intel GPUs go.
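To make the tile idea concrete, here is a minimal sketch of how a pixel maps onto a grid of fixed-size tiles. It assumes a plain row/column grid starting at the render target origin; the `Tile` type, the `tile_containing` helper, and the default 4x4 tile size are hypothetical illustrations, not WebRender or driver code.

```python
from typing import NamedTuple


class Tile(NamedTuple):
    col: int  # tile column index
    row: int  # tile row index
    x0: int   # left edge (inclusive), px
    y0: int   # top edge (inclusive), px
    x1: int   # right edge (exclusive), px
    y1: int   # bottom edge (exclusive), px


def tile_containing(x: int, y: int, tile_w: int = 4, tile_h: int = 4) -> Tile:
    """Return the grid tile that covers pixel (x, y).

    Hypothetical helper: assumes tiles are laid out on a regular grid
    anchored at (0, 0) of the render target.
    """
    col, row = x // tile_w, y // tile_h
    return Tile(col, row,
                col * tile_w, row * tile_h,
                (col + 1) * tile_w, (row + 1) * tile_h)


# Same pixel under the two tile sizes mentioned in this comment.
print(tile_containing(15, 6))        # assumed 4x4 tiles
print(tile_containing(15, 6, 8, 4))  # 8x4 tiles, as in bug 1817240
```

If a single tile were dropped or masked incorrectly, the visible hole would have exactly the tile's dimensions, and its on-screen position would shift with wherever the image lands in the render target.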

I don't have suitable hardware on hand to try to repro this bug - my only Intel GPUs on hand are Broadwell and Tigerlake.

Per my previous assessment in comment #13, I think this is technically a driver bug and there isn't much we can do about it. Furthermore the driver is deprecated at this point as far as I know.

Flags: needinfo?(ahale)