Closed Bug 1707316 Opened 4 years ago Closed 3 years ago

webrender broken on musl at least on ppc64le (88.x)

Categories

(Core :: Graphics: WebRender, defect)

Firefox 88

RESOLVED INVALID

People

(Reporter: daniel, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux ppc64le; rv:88.0) Gecko/20100101 Firefox/88.0

Steps to reproduce:

upgrade to firefox 88 in a musl-libc environment running xfce or KDE (seen on ppc64le; possibly specific to that architecture)

Actual results:

firefox segfaults

Expected results:

firefox starts without crashing. firefox 88 enabled webrender by default for xfce/kde environments; webrender was likely broken before too if forced on

i get this backtrace: https://gist.github.com/q66/a7f7f530518cb2d74de66cd054b38358

i suspect there is something wrong with the way TLS is used, maybe?

I'm not sure if this also happens in x86_64 or other architecture musl environments

starting the browser with LIBGL_ALWAYS_SOFTWARE=1 (which disables webrender) and setting gfx.webrender.force-disabled to true restores function, but obviously only because this stuff is never called
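For anyone who wants to apply the environment-variable half of that workaround from a wrapper rather than a shell, a minimal sketch in C might look like the following. The function names `force_software_gl` and `launch_with_software_gl` are hypothetical; only `LIBGL_ALWAYS_SOFTWARE` comes from the report, and the `gfx.webrender.force-disabled` pref still has to be set in about:config separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: set the variable named in the report for this
 * process and its children. Returns 0 on success, -1 on failure. */
int force_software_gl(void) {
    return setenv("LIBGL_ALWAYS_SOFTWARE", "1", 1);
}

/* Hypothetical helper: replace the current process with argv[0],
 * inheriting the modified environment. Only returns on failure. */
int launch_with_software_gl(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    if (force_software_gl() != 0) {
        perror("setenv");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp"); /* reached only if exec failed */
    return -1;
}
```

Calling `launch_with_software_gl` with an argv such as `{"firefox", NULL}` would start the browser with software GL; as noted above, this only sidesteps the crashing code path rather than fixing it.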

The Bugbug bot thinks this bug should belong to the 'Core::Graphics: WebRender' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Graphics: WebRender
Product: Firefox → Core

oh, also, my environment in case it's useful:

void linux ppc64le-musl (kernel 5.11, 4k pages)
ibm power9
amd rx 5700 xt, on mesa 21.0.3

https://hg.mozilla.org/mozilla-central/file/FIREFOX_BETA_88_BASE/gfx/gl/GLContext.cpp#l559 doesn't seem like a likely place to crash. Can you build locally and see if this is null or otherwise garbage?

Flags: needinfo?(daniel)

i already did, it's a valid pointer and i can read its fields

Flags: needinfo?(daniel)

Can you tell what piece of code/action is actually causing the crash?

Flags: needinfo?(daniel)

the one the backtrace mentions - that's where it crashes (verified in a debugger, and also verified by placing logging around the statement)

the function runs twice, once successfully, the second time it crashes; but at the time of crash, 'this' is still a valid pointer with readable fields, so it's strange

Flags: needinfo?(daniel)

Can you attach an about:support?

Marking S3 for tier-3 platform.
Adding NI to provide about:support

Severity: -- → S3
Flags: needinfo?(daniel)
Attached file about_support.txt

attached about:support; this is run with LIBGL_ALWAYS_SOFTWARE=1 and unaltered about:config (i.e. in the same state it would crash in if it was run with accelerated opengl)

Flags: needinfo?(daniel)

Interesting. So this isn't WebRender itself, but rather the GL context initialization (even before WebRender kicks in).

this is not a firefox bug; i've tracked this down to some kind of strange corruption in libglvnd. since this is musl, which does not support initial-exec TLS, libglvnd uses its TSB code path; the ppc64 assembly for that path (as well as the pure C fallback when compiled with optimizations, but not without) causes crashes in opengl applications that use opengl in threads

working around the issue in libglvnd has fixed webrender as well as other threaded things that were otherwise crashing (webkit's accelerated compositor, kdenlive and possibly other things using qtquick, among others) so the issue needs to be reported and fixed there

Status: UNCONFIRMED → RESOLVED
Closed: 3 years ago
Resolution: --- → INVALID

er, s/TSB/TSD/
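To illustrate the TSD-versus-TLS distinction behind that diagnosis, here is a minimal sketch in plain C. It is not libglvnd's code and does not reproduce the crash; `struct dispatch_table`, `set_current_tsd`, `get_current_tsd` and `count_consistent_threads` are hypothetical names. It only shows the two ways a library can keep a per-thread "current" pointer: a pthread key (thread-specific data, the path used when initial-exec TLS is unavailable) and a compiler-managed thread-local variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-thread GL dispatch table. */
struct dispatch_table {
    int id;
};

/* TSD style: one process-wide key, one pointer slot per thread. */
static pthread_key_t dispatch_key;
static pthread_once_t dispatch_once = PTHREAD_ONCE_INIT;

static void make_key(void) {
    int rc = pthread_key_create(&dispatch_key, NULL);
    assert(rc == 0);
    (void)rc;
}

void set_current_tsd(struct dispatch_table *table) {
    pthread_once(&dispatch_once, make_key);
    pthread_setspecific(dispatch_key, table);
}

struct dispatch_table *get_current_tsd(void) {
    pthread_once(&dispatch_once, make_key);
    return pthread_getspecific(dispatch_key);
}

/* TLS style: the compiler and libc resolve the per-thread address. */
static _Thread_local struct dispatch_table *current_tls;

static void *worker(void *arg) {
    struct dispatch_table *mine = arg;
    set_current_tsd(mine);
    current_tls = mine;
    /* Both lookups should return this thread's own table. */
    int ok = get_current_tsd() == mine && current_tls == mine;
    return ok ? mine : NULL;
}

#define MAX_THREADS 8

/* Starts up to MAX_THREADS workers and returns how many of them
 * read back their own table through both mechanisms. */
int count_consistent_threads(int n) {
    pthread_t threads[MAX_THREADS];
    struct dispatch_table tables[MAX_THREADS];
    int started = 0, ok = 0;
    if (n > MAX_THREADS) n = MAX_THREADS;
    for (int i = 0; i < n; i++) {
        tables[i].id = i;
        if (pthread_create(&threads[i], NULL, worker, &tables[i]) != 0) break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        void *result = NULL;
        pthread_join(threads[i], &result);
        if (result == &tables[i]) ok++;
    }
    return ok;
}
```

In a correct implementation every thread sees only its own pointer; the corruption described above would show up as a thread reading back garbage or another thread's state from the TSD lookup.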
