Closed Bug 1820055 Opened 1 year ago Closed 1 year ago

Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0

Categories

(Core :: Graphics: WebRender, defect, P2)

Firefox 106
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
112 Branch
Tracking Status
firefox-esr102 --- disabled
firefox110 --- disabled
firefox111 --- fixed
firefox112 --- fixed

People

(Reporter: aosmond, Assigned: aosmond)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: crash, topcrash, topcrash-startup)

Attachments

(1 file)

+++ This bug was initially created as a clone of Bug #1788573 +++

Crash report: https://crash-stats.mozilla.org/report/index/d9dfc4cf-fa65-4e51-83ca-5a8e60220828

Reason: SIGSEGV / SI_KERNEL

Top 10 frames of crashing thread:

0 libnvidia-eglcore.so.515.65.01 NvGlEglGetFunctions 
1 libnvidia-eglcore.so.515.65.01 NvGlEglApiInit 
2 libnvidia-eglcore.so.515.65.01 NvGlEglApiInit 
3 libEGL_nvidia.so.0 NvEglwlaf47906in 
4 libEGL_nvidia.so.0 NvEglwlaf47906in 
5 libEGL_nvidia.so.0 <.text ELF section in libEGL_nvidia.so.515.65.01> 
6 libEGL_nvidia.so.0 NvEglwlaf47906in 
7 libEGL_nvidia.so.0 NvEglwlaf47906in 
8 libxul.so DMABufSurfaceRGBA::ReleaseTextures widget/gtk/DMABufSurface.cpp:679
9 libxul.so mozilla::wr::RenderDMABUFTextureHost::ClearCachedResources gfx/webrender_bindings/RenderDMABUFTextureHost.cpp:72
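
For triage, the driver version can be read from the module names in the frames above (for example libnvidia-eglcore.so.515.65.01). A minimal Python sketch, assuming the version is always the dotted numeric suffix after ".so." with at least two components; the helper name driver_version_from_module is hypothetical and not part of any crash-stats tooling:

```python
import re

# Assumption: a driver version is a dotted numeric suffix with at least two
# components after ".so." (e.g. "515.65.01"). A bare soname such as ".so.0"
# is deliberately not treated as a driver version.
_SO_VERSION = re.compile(r"\.so\.(\d+(?:\.\d+)+)$")


def driver_version_from_module(module_name):
    """Return the version as a tuple of ints, or None if no version suffix is found."""
    match = _SO_VERSION.search(module_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))
```

With the frames above, libnvidia-eglcore.so.515.65.01 yields (515, 65, 1), while libEGL_nvidia.so.0 and libxul.so yield None.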

The crash appears to be triggered if an external monitor is connected/disconnected while the computer is sleeping. I can confirm this if desired. I can also supply about 10 other crash reports with the same problem, more system information, or do other debugging steps.

This might be similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1737834, but that seems to be caused by a memory leak and shows up after firefox is left running for a while.

I'm using the 515 version of the NVIDIA drivers, but I believe that this bug was present with the 510 version as well.

Let's try blocking DMABUF in 510 (inclusive) to 530 (exclusive) based on discussion in bug 1788573. We get daily crashes in nightly so we can easily verify this over the next few days.
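
The proposed rule is a half-open range: 510 is blocked, 530 is not. A sketch of that comparison in Python, assuming driver versions are dotted integers compared component by component; the function names are illustrative, and this is not the actual blocklist code from the patch:

```python
def parse_version(text, width=3):
    """Turn "515.65.01" into (515, 65, 1), padding short versions with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (width - len(parts))  # no-op when already long enough
    return tuple(parts)


def in_initial_block_range(version_text, low="510.0", high="530.0"):
    """True when low <= version < high (inclusive lower bound, exclusive upper)."""
    version = parse_version(version_text)
    return parse_version(low) <= version < parse_version(high)
```

Under these assumptions, 515.65.01 falls inside the range, while 530.30.02 and anything in a pre-510 series such as 470.x fall outside it.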

We are seeing a large uptick in crashes across all channels with the
NVIDIA binary drivers, including 510, 515, 520, and 525. This crash is
tracked in bug 1788573.

Attachment #9320898 - Attachment description: Bug 1820055 - Block DMABUF on NVIDIA binary drivers between 510.0 and 530.0. → Bug 1820055 - Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0.
Pushed by aosmond@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/87549300e165
Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0. r=gfx-reviewers,gw
Summary: Block DMABUF for NVIDIA 510.0 to 530.0 → Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0
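
The retitled rule drops the lower bound and covers two features instead of one. A Python sketch of the revised check, assuming dotted integer versions compared component by component; the feature labels and the function name are illustrative only and do not mirror the identifiers used in the patch:

```python
# Assumption: every driver strictly below 530.0 gets both features blocked.
BLOCKED_BELOW = (530, 0, 0)
BLOCKED_FEATURES = frozenset({"threadsafe_gl", "dmabuf"})  # illustrative labels


def blocked_features(version_text):
    """Return the set of features to block for the given driver version string."""
    parts = [int(p) for p in version_text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad "530" to (530, 0, 0)
    if tuple(parts) < BLOCKED_BELOW:
        return BLOCKED_FEATURES
    return frozenset()
```

Compared with a 510.0 to 530.0 range, this also blocks older series such as 470.x, while 530.30.02 is left unblocked.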

The next release (bug 1788573 comment 26) after this one should have the fix: https://www.phoronix.com/news/NVIDIA-530.30.02-Linux-Beta

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 112 Branch
Blocks: 1820181

We are doing a downloadable blocklist entry for 110 release and earlier, but we are going to want to uplift this to 111 beta once we confirm the crash drop off.

:aosmond, in terms of timing, the 111 beta cycle is over and next week is RC week.
Given that the crash volume on nightly is so low, how long would it take to be confident in the fix, versus the risk of uplifting it before the 111 RC1 build on Monday?

Flags: needinfo?(aosmond)

The feature will need to be blocked either via uplift or via a downloadable blocklist request. I consider the uplift low risk. Should I request uplift now, or amend the downloadable blocklist rules to include the beta?

Requesting an uplift now is probably the better option.
I'll be taking some uplift requests over the weekend before RC and can include this one based on the risk.

Comment on attachment 9320898 [details]
Bug 1820055 - Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0.

Beta/Release Uplift Approval Request

  • User impact if declined: Crashes on suspend/resume/in general with NVIDIA binary driver on Linux
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: Bug 1820181
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Just a blocklist update. The features being blocked are optional and the fallback paths should be well tested already.
  • String changes made/needed:
  • Is Android affected?: No
Flags: needinfo?(aosmond)
Attachment #9320898 - Flags: approval-mozilla-beta?

Comment on attachment 9320898 [details]
Bug 1820055 - Block threadsafe GL and DMABUF on NVIDIA binary drivers older than 530.0.

Approved for 111.0 RC1

Attachment #9320898 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1824778