Open Bug 1714564 Opened 4 months ago Updated 2 days ago

Firefox 89 does not render anything correctly (Musl libc and Intel Xe GPU)

Categories

(Core :: Graphics, defect)

Firefox 89
defect

Tracking

()

UNCONFIRMED

People

(Reporter: nico.schottelius, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.77 Safari/537.36

Steps to reproduce:

I upgraded Firefox to firefox-89.0-r0.

Actual results:

Text input is not rendered while typing. No graphical updates happen anymore. Firefox 88 worked perfectly. Redrawing the window (focus off/focus on) shows the typed in text.

This problem basically renders firefox 100% unusable, as you cannot see anything unless you do a focus switch. No scrolling, no typing, etc.

Environment: Alpine Linux, Xorg, i3; Hardware: X1 Nano

Expected results:

It should show text / renderings.

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Gtk' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Widget: Gtk
Product: Firefox → Core

Hmm, is there any chance you could pip install --user mozregression, then mozregression --good 88 --bad 89, and see what broke it in your environment? Thank you!

Flags: needinfo?(nico.schottelius)
Component: Widget: Gtk → Graphics
Attached file pip install output
Hey @emilio,

I tried to get mozregression running, but it seems the pip install with python3 on alpine fails, see attachment.

Flags: needinfo?(nico.schottelius)

Huh, does pip install wheel help? That's very odd... William do you happen to know what might be going on in comment 3?

Flags: needinfo?(wlachance)

Installing wheel gives a better error description, seems I am missing rustc. Just installed the following packages:

  • rust
  • python3-dev
  • libffi-dev
  • cargo

I am now running mozregression --good 88 --bad 89 and will report as soon as the process is through.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #5)

Huh, does pip install wheel help? That's very odd... William do you happen to know what might be going on in comment 3?

Seems like a python issue with a bad error message (CC'ing mdroettboom, the Python Glean SDK maintainer for awareness, since that's where the issue is).

Flags: needinfo?(wlachance)

After about 30 minutes of downloading, the message is as follows:

(venv) [18:24] nb3:~% mozregression --good 88 --bad 89
**********
You should use a config file. Please use the --write-config command line flag to help you create 
**********

 0:01.47 INFO: Using date 2021-04-19 for release 89
 0:02.60 INFO: Using date 2021-03-22 for release 88
 0:10.60 INFO: Testing good and bad builds to ensure that they are really good and bad...
 0:10.60 INFO: Downloading build from: https://archive.mozilla.org/pub/firefox/nightly/2021/03/20
===== Downloaded 100% =====
36:45.07 INFO: Running mozilla-central build for 2021-03-22
36:51.57 INFO: Launching /tmp/tmpl73lb2kq/firefox/firefox
36:51.57 INFO: Application command: /tmp/tmpl73lb2kq/firefox/firefox -profile /tmp/tmphjaek32b.mozrunner
['/tmp/tmpl73lb2kq/firefox/firefox', '-profile', '/tmp/tmphjaek32b.mozrunner']
36:51.58 ERROR: Unable to start the application (error: Failed to start the process: [Errno 2] No such file or directory: '/tmp/tmpl73lb2kq/firefox/firefox')
36:51.60 ERROR: Unable to start the application (error: Failed to start the process: [Errno 2] No such file or directory: '/tmp/tmpl73lb2kq/firefox/firefox')
36:51.94 INFO: Last good revision: 7bff3dc37b071d5272e2d445a0bfcbe21aabd38d (2021-03-22)
36:51.94 INFO: First bad revision: e2fb29057e4cb7cec500f1047e110dfa1ec0f901 (2021-04-19)
36:51.94 INFO: Pushlog:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=7bff3dc37b071d5272e2d445a0bfcbe21aabd38d&tochange=e2fb29057e4cb7cec500f1047e110dfa1ec0f901

36:51.94 INFO: To resume, run:
36:51.94 INFO: /home/nico/venv/bin/mozregression --repo=mozilla-central --good=2021-03-22 --bad=2021-04-19
(venv) [19:02] nb3:~% 
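
A note on reading this log: both launches failed, so the "last good" and "first bad" revisions are just the initial --good/--bad window and not a narrowed-down range. Below is a minimal Python sketch that extracts the two revisions and flags the failed launch. The function name `summarize_bisect_log` is made up for illustration, and the regex only assumes the log format pasted above.

```python
import re

# Matches lines such as:
#   36:51.94 INFO: Last good revision: <40 hex chars> (2021-03-22)
REVISION_RE = re.compile(
    r"INFO: (Last good|First bad) revision: ([0-9a-f]{40}) \((\d{4}-\d{2}-\d{2})\)"
)


def summarize_bisect_log(log_text):
    """Collect the good/bad revisions and whether any build failed to start."""
    labels = {"Last good": "good", "First bad": "bad"}
    summary = {"launch_failed": "Unable to start the application" in log_text}
    for label, revision, date in REVISION_RE.findall(log_text):
        summary[labels[label]] = (revision, date)
    return summary


sample = """\
36:51.58 ERROR: Unable to start the application (error: Failed to start the process)
36:51.94 INFO: Last good revision: 7bff3dc37b071d5272e2d445a0bfcbe21aabd38d (2021-03-22)
36:51.94 INFO: First bad revision: e2fb29057e4cb7cec500f1047e110dfa1ec0f901 (2021-04-19)
"""
print(summarize_bisect_log(sample))
```

If `launch_failed` is true, the revisions should not be treated as a bisection result.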

Something else I can see: if I just start firefox, it flickers, as if it wants to refresh something. It does stop though from time to time.

I think I know what the problem might be -- while I am trying to download (it's only 60KB/s), the binary compiled is probably linked against glibc, which does not exist on Alpine Linux.

Ah, that might be it... Official builds probably use glibc, so I guess you can't use any official build, e.g. from https://nightly.mozilla.org? That definitely makes things slightly more complicated.
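
One way to check the glibc theory is to look at the dynamic loader that the downloaded binary requests in its ELF PT_INTERP header. A binary asking for a loader that does not exist would also explain the "No such file or directory" error even though the file itself is present. The Python sketch below only handles little-endian ELF64 (which covers x86_64), and the helper names `elf64_interpreter` and `libc_flavor` are hypothetical. The libc guess is a heuristic based on the usual loader file names (`ld-linux-*` for glibc, `ld-musl-*` for musl).

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def elf64_interpreter(data):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF64 header: e_phoff at offset 32, e_phentsize/e_phnum at offset 54.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            # ELF64 program header: p_offset at +8, p_filesz at +32.
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\x00").decode("ascii")
    return None  # no PT_INTERP: statically linked or not an executable


def libc_flavor(interpreter):
    """Guess the libc from the loader file name (heuristic only)."""
    if interpreter is None:
        return "static or unknown"
    name = interpreter.rsplit("/", 1)[-1]
    if name.startswith("ld-musl-"):
        return "musl"
    if name.startswith("ld-linux"):
        return "glibc"
    return "unknown"


print(libc_flavor("/lib/ld-musl-x86_64.so.1"))
print(libc_flavor("/lib64/ld-linux-x86-64.so.2"))
```

To use it on a real file, read the binary with `open(path, "rb").read()` and pass the bytes to `elf64_interpreter`.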

It might not be 100% accurate, but maybe there is a container build of firefox that I can use for testing instead of building all intermediate versions?

Hmm, other than bisecting yourself by building I don't think so... Anyhow, another thing to try... Does safe mode present the same issue?

Hi Nico, are you able to navigate to about:support and click the copy text to clipboard button? Or is that impossible due to the brokenness? If you can, could you please do that for both versions 88 and 89. Thanks.

Flags: needinfo?(nico.schottelius)
The output from 89:

The package for 88 seems to have been removed. I have installed Firefox 84, which works fine, and attached its about:support output. Does that help?

Flags: needinfo?(nico.schottelius)

I can see that in 89 you are running webrender, and in 84 you were not (which is expected). I was interested in whether that had changed from 88 to 89.

But never mind, we can test this another way: if you go to about:config, set gfx.webrender.force-disabled to true, and then restart Firefox, does that fix the issue?

Flags: needinfo?(nico.schottelius)

Wow, that was an adventure. Without refreshing or resizing the window, I could not see any update, but after 10 minutes of going back and forth and then restarting Firefox, I can use it again. So it is indeed WebRender.

How does WebRender work, and should I debug whether this is an Xorg or a driver issue?

Flags: needinfo?(nico.schottelius)

Thanks for persevering! There are a couple of interesting things about your setup: Alpine Linux (i.e. musl instead of glibc), and the Xe is a recent GPU, so there may be Mesa bugs.

Do other opengl applications work okay?

If it's not too much trouble would you be able to run a LiveUSB of a glibc based distro (hopefully one with a similar Mesa version), and check if Firefox works there? That might help narrow it down.

Severity: -- → S3
Flags: needinfo?(nico.schottelius)
Summary: Firefox 89 does not render anything correctly → Firefox 89 does not render anything correctly (Musl libc and Intel Xe GPU)

Sure I can do that - do you have any fitting distro in mind?

For reference, these are the installed mesa libraries on Alpine:

nb3:# apk list -I | grep mesa
mesa-gl-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-dri-intel-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-gbm-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-dri-gallium-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-egl-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-dri-classic-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-glapi-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
mesa-21.1.2-r0 x86_64 {mesa} (MIT SGI-B-2.0 BSL-1.0) [installed]
nb3:#

Quick check: glxgears also does not render correctly, but shows the regular framerate on the console:

308 frames in 5.0 seconds = 61.486 FPS
299 frames in 5.0 seconds = 59.745 FPS
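
As far as I understand, the FPS figure only reflects how often glxgears gets through its render loop, so a normal-looking rate does not confirm that frames actually reach the screen. For comparing runs (for example with and without a driver override), here is a small Python sketch that parses these lines. `parse_glxgears` is a hypothetical helper name and the regex assumes the output format shown above.

```python
import re

# Matches lines such as: "308 frames in 5.0 seconds = 61.486 FPS"
FPS_RE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_glxgears(output):
    """Return a list of (frames, seconds, fps) tuples from glxgears output."""
    return [
        (int(frames), float(seconds), float(fps))
        for frames, seconds, fps in FPS_RE.findall(output)
    ]


sample = """\
308 frames in 5.0 seconds = 61.486 FPS
299 frames in 5.0 seconds = 59.745 FPS
"""
for frames, seconds, fps in parse_glxgears(sample):
    print(frames, seconds, fps)
```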

Flags: needinfo?(nico.schottelius)

Can you try launching glxgears and/or firefox with MESA_LOADER_DRIVER_OVERRIDE=i965?
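
From a shell this is just a matter of prefixing the command with the variable. If it is easier to script, the same thing can be sketched in Python as below. The helper names `env_with_driver_override` and `launch_with_driver` are made up for illustration; the only thing taken from this bug is the variable name and the value i965.

```python
import os
import subprocess


def env_with_driver_override(driver, base_env=None):
    """Return a copy of the environment with the Mesa driver override set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_LOADER_DRIVER_OVERRIDE"] = driver
    return env


def launch_with_driver(command, driver):
    """Start `command` (a list such as ["glxgears"]) with the override applied."""
    return subprocess.Popen(command, env=env_with_driver_override(driver))


# Show the resulting variable without launching anything.
example_env = env_with_driver_override("i965", base_env={"DISPLAY": ":0"})
print(example_env["MESA_LOADER_DRIVER_OVERRIDE"])
```

Calling `launch_with_driver(["glxgears"], "i965")` or `launch_with_driver(["firefox"], "i965")` would then start the program with the override, leaving the parent environment untouched.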

(In reply to Nico Schottelius from comment #22)

Quick check: glxgears also does not render correctly, but shows the regular framerate on the console:

It seems like you have a broken OpenGL setup, so this is more likely to be a distro/Mesa bug than a Firefox bug.

Fedora 34 is probably the best option, it's usually a good combination of up to date and well tested. I'd suggest filing an issue with either Alpine Linux or Mesa. It might be a Mesa bug on Xe GPUs or specific to musl, or it might be Alpine's configuration.

One more thing that would be really helpful if you could test it for us: in about:config, set gfx.webrender.software to true and reset gfx.webrender.force-disabled to false. That will let us know whether our new software fallback works correctly for systems which can't get hardware-accelerated WebRender. Thanks again!
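
If about:config is painful to use because of the rendering problem, the same prefs can, as far as I know, also be set through a user.js file in the profile directory, which Firefox reads at startup. A small Python sketch that generates the lines follows. `user_pref_line` is a hypothetical helper, and the pref names are the two from this comment.

```python
import json


def user_pref_line(name, value):
    """Format one user.js entry; json.dumps turns True/False into true/false."""
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


prefs = {
    "gfx.webrender.software": True,
    "gfx.webrender.force-disabled": False,
}
lines = [user_pref_line(name, value) for name, value in prefs.items()]
print("\n".join(lines))
```

The printed lines would go into user.js in the profile folder. Remove them again afterwards, since user.js is reapplied on every start.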

First feedback: glxgears with MESA_LOADER_DRIVER_OVERRIDE=i965 works fine. I did not dare to change the setting back for firefox yet, as it is a pain to restore it, but will do so after lunch -- same for the two config options.

Update: with gfx.webrender.force-disabled=false and gfx.webrender.software=true rendering works fine.

Update 2: firefox with MESA_LOADER_DRIVER_OVERRIDE=i965 and without any about:config tunings also works fine. The fedora image or the usb stick refused to boot so far, will dig into that more.

Thanks for testing Nico.

I guess from Firefox's perspective we should detect the driver being used when you haven't set MESA_LOADER_DRIVER_OVERRIDE=i965, and block WebRender in that case. And I imagine the fact that the override fixes it will be useful information for Mesa.

Nico, could you attach a copy of your about:support with MESA_LOADER_DRIVER_OVERRIDE=i965 set, so that we can determine how to detect the difference between the two configurations?

Compositing: WebRender (Software)
[snip]
Description: llvmpipe (LLVM 11.1.0, 256 bits)
Vendor ID: 0x8086
Device ID: 0x9a40
Driver Vendor: mesa/llvmpipe

Looks like the driver actually failed to load and we're using software rendering.
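
To make that reading of the excerpt explicit, here is a rough Python sketch that parses the "Key: value" lines and checks whether the reported renderer is one of Mesa's software rasterizers. This is only an illustration and not how Firefox's own detection works; the helper names `parse_fields` and `uses_software_gl` are made up, and the list of renderer names is my assumption.

```python
# Names of Mesa software rasterizers to look for (assumed list).
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast")


def parse_fields(text):
    """Turn 'Key: value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


def uses_software_gl(fields):
    """True if the description or driver vendor names a software rasterizer."""
    haystack = " ".join(
        [fields.get("Description", ""), fields.get("Driver Vendor", "")]
    ).lower()
    return any(name in haystack for name in SOFTWARE_RENDERERS)


excerpt = """\
Compositing: WebRender (Software)
[snip]
Description: llvmpipe (LLVM 11.1.0, 256 bits)
Vendor ID: 0x8086
Device ID: 0x9a40
Driver Vendor: mesa/llvmpipe
"""
fields = parse_fields(excerpt)
print(fields["Device ID"], uses_software_gl(fields))
```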

(In reply to Jan Alexander Steffens [:heftig] from comment #31)

Looks like the driver actually failed to load and we're using software rendering.

Yeah, IIUC this is super new hardware that's only supported by the iris driver.

I'd personally be against adding a blocklist entry here. IIUC, this is new hardware on an up-to-date distro with a very actively developed driver (iris), and even glxgears is buggy. So IMO adding workarounds in each project is totally over the top here: this hardware is simply not yet expected to work (at least not with the Mesa/kernel versions in use).

See Also: → 1708195
No longer blocks: gfx-triage