Closed Bug 1680138 Opened 4 years ago Closed 3 years ago

Fission/Linux/proprietary Nvidia: Crash in [@ __memmove_avx_unaligned_erms | webrender::renderer::Renderer::update_texture_cache] when waking up from suspend

Categories

(Core :: Graphics: WebRender, defect, P2)

x86_64
Linux
defect

Tracking

RESOLVED DUPLICATE of bug 1682876
Tracking Status
firefox85 --- disabled

People

(Reporter: pbone, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Hi,

I've been noticing this crash in the last few days. I have 3 Linux PCs with similar configurations (OS & architecture), but different hardware. Only this one shows the crash. Firefox crashes when my computer wakes from suspend. Requires Fission to reproduce.

Crash report: https://crash-stats.mozilla.org/report/index/cbd22a6d-5a0c-49bf-82ec-ed9130201130

Reason: SIGSEGV /SEGV_MAPERR

Top 10 frames of crashing thread:

0 libc.so.6 __memmove_avx_unaligned_erms 
1 libxul.so webrender::renderer::Renderer::update_texture_cache gfx/wr/webrender/src/renderer.rs:4197
2 libxul.so webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer.rs:3618
3 libxul.so webrender::renderer::Renderer::render gfx/wr/webrender/src/renderer.rs:3414
4 libxul.so wr_renderer_render gfx/webrender_bindings/src/bindings.rs:614
5 libxul.so mozilla::wr::RendererOGL::UpdateAndRender gfx/webrender_bindings/RendererOGL.cpp:193
6 libxul.so mozilla::wr::RenderThread::UpdateAndRender gfx/webrender_bindings/RenderThread.cpp:488
7 libxul.so mozilla::wr::RenderThread::HandleFrameOneDoc gfx/webrender_bindings/RenderThread.cpp:325
8 libxul.so mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void  xpcom/threads/nsThreadUtils.h:1148
9 libxul.so base::MessagePumpDefault::Run ipc/chromium/src/base/message_pump_default.cc:35

What GPU/driver does the crashing machine have?

Flags: needinfo?(pbone)

NVidia Geforce 750 Ti, Driver version: 390.138-0ubuntu0.18.04.1

Flags: needinfo?(pbone)

I've been unsuccessful reproducing it in a blank profile for mozregression, and mozregression is refusing to copy my normal profile to help me find the cause (Bug 1680160).

Depends on: 1680160
Keywords: regression

Is WebRender on for you in the blank profile? Are you using nouveau or the nvidia binary drivers?

Flags: needinfo?(pbone)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

Is WebRender on for you in the blank profile? Are you using nouveau or the nvidia binary drivers?

It should be on. I gave mozregression the options --pref gfx.webrender.all:true --pref gfx.webrender.enabled:true. I'll double check. Yeah, that was my bad: I had the syntax wrong. It still doesn't reproduce in mozregression's blank profile though. It could be that I need to leave the PC suspended for longer, in case it's a timing thing.

The NVidia binary drivers.

Flags: needinfo?(pbone)
Severity: -- → S2
Fission Milestone: ? → M6c
Priority: -- → P2

Moving this out to MVP, as aosmond told me that this wouldn't happen by default with Fission, only if WR is force-enabled for NVIDIA binary drivers. Software WR will be shipped to all those that don't get hardware WR, so this should not block Fission.

Fission Milestone: M6c → MVP
Summary: Crash in [@ __memmove_avx_unaligned_erms | webrender::renderer::Renderer::update_texture_cache] → Fission/Linux/proprietary Nvidia: Crash in [@ __memmove_avx_unaligned_erms | webrender::renderer::Renderer::update_texture_cache] when waking up from suspend
See Also: → 1683266

In bug 1683266, :dholbert was able to run mozregression and found that it was regressed by bug 1661528. It seems likely that the NVIDIA driver does not correctly handle persistently mapped buffers after suspend/resume. This bug almost certainly has the same cause: this signature is during update_texture_cache(), whereas the other one is when uploading to vertex data textures.
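To illustrate the suspected failure mode, here is a minimal, self-contained Rust sketch of a write into a persistently mapped buffer that has been invalidated by a reset. It is not WebRender code and not the fix that landed: `MockDevice`, `MappedBuffer`, `UploadError` and the generation counter are all hypothetical stand-ins, and a real implementation would ask the driver for its reset status (for example via `glGetGraphicsResetStatus`) instead of reading a flag.

```rust
/// Hypothetical stand-in for a GPU device. The generation counter is bumped
/// whenever a (simulated) reset invalidates existing mappings.
pub struct MockDevice {
    pub reset_generation: u64,
}

/// Hypothetical stand-in for a persistently mapped upload buffer.
/// `generation` records the device generation the mapping was created under.
pub struct MappedBuffer {
    storage: Vec<u8>,
    generation: u64,
}

#[derive(Debug, PartialEq)]
pub enum UploadError {
    /// The mapping predates a device reset and must not be written to.
    StaleMapping,
    /// The data does not fit into the mapped region.
    TooLarge,
}

impl MockDevice {
    pub fn new() -> Self {
        MockDevice { reset_generation: 0 }
    }

    /// Creates a mapping tagged with the current device generation.
    pub fn map_buffer(&self, size: usize) -> MappedBuffer {
        MappedBuffer {
            storage: vec![0; size],
            generation: self.reset_generation,
        }
    }

    /// Simulates a suspend/resume cycle that invalidates all mappings.
    pub fn simulate_reset(&mut self) {
        self.reset_generation += 1;
    }
}

impl MappedBuffer {
    /// Copies `data` into the mapping, refusing to touch a stale mapping.
    /// Without the generation check, the copy would be the analogue of the
    /// memmove into an invalid address seen in the crash signature.
    pub fn upload(&mut self, device: &MockDevice, data: &[u8]) -> Result<(), UploadError> {
        if self.generation != device.reset_generation {
            return Err(UploadError::StaleMapping);
        }
        if data.len() > self.storage.len() {
            return Err(UploadError::TooLarge);
        }
        self.storage[..data.len()].copy_from_slice(data);
        Ok(())
    }
}
```

In this sketch, a caller that receives `StaleMapping` would drop the old mapping and call `map_buffer` again before retrying the upload.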

Depends on: 1682876

This happens to me on Firefox 84 without Fission enabled. See https://crash-stats.mozilla.org/report/index/46f0c8b5-9d6e-4c68-9d71-beebf0201219.

After upgrading to Firefox 84, it crashes consistently after wake from suspend, but it happens with two different signatures, this one and bug 1683266.

It seems that after I upgraded to nvidia-driver-460, the situation has improved significantly. I have not observed any crash since the switch, and after restoring from suspend, there are almost never any artifacts in Firefox anymore.

Maybe people having this problem can check the driver they are using, and switch to 460 if they haven't already. It seems that the driver doesn't upgrade automatically when you are upgrading from an older version of the system.

What is the exact driver version? I imagine we won't ship WebRender to earlier versions than that as a result of this issue.

We could also make using persistently mapped buffers a feature, and just block that on earlier versions. Although maybe the newer driver fixes resume in more cases than just persistently mapped buffer access.
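The version-based blocking suggested above could look roughly like the following Rust sketch. It is an illustration only, not Firefox's actual blocklist code: `DriverVersion`, `allow_persistent_mapping` and the threshold constant are hypothetical names, and the 460.32.03 cutoff is an assumption taken from the single working version reported in this bug, not a verified minimum.

```rust
/// Numeric components of an NVIDIA driver version, e.g. [460, 32, 3].
/// Derived ordering on the inner Vec compares component by component.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct DriverVersion(pub Vec<u32>);

impl DriverVersion {
    /// Parses strings such as "460.32.03-0ubuntu0.20.10.1".
    /// Everything from the first '-' onward (the distro packaging suffix)
    /// is ignored. Returns None if any component is not a number.
    pub fn parse(s: &str) -> Option<DriverVersion> {
        let upstream = s.split('-').next()?;
        let parts: Option<Vec<u32>> = upstream
            .split('.')
            .map(|p| p.parse::<u32>().ok())
            .collect();
        parts.map(DriverVersion)
    }
}

/// ASSUMPTION: hypothetical cutoff based on the one driver version reported
/// as working in this bug; the real minimum is not established here.
const MIN_PERSISTENT_MAPPING_DRIVER: [u32; 3] = [460, 32, 3];

/// Decides whether persistently mapped buffers may be used with this driver.
/// Unparseable version strings take the conservative path (feature blocked).
pub fn allow_persistent_mapping(driver: &str) -> bool {
    match DriverVersion::parse(driver) {
        Some(v) => v >= DriverVersion(MIN_PERSISTENT_MAPPING_DRIVER.to_vec()),
        None => false,
    }
}
```

With these assumptions, the reporter's 390.138-0ubuntu0.18.04.1 driver would have the feature blocked, while 460.32.03-0ubuntu0.20.10.1 would be allowed to use it.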

(In reply to Andrew Osmond [:aosmond] from comment #10)

What is the exact driver version? I imagine we won't ship WebRender to earlier versions than that as a result of this issue.

I'm currently using 460.32.03-0ubuntu0.20.10.1.

Note that while I don't see artifacts anymore, it still sometimes happens that some of the text in the browser chrome disappears after restoring from suspend. I think that's what happened previously (before the versions with this crash).

(In reply to Xidorn Quan [:xidorn] UTC+11 from comment #12)

(In reply to Andrew Osmond [:aosmond] from comment #10)

What is the exact driver version? I imagine we won't ship WebRender to earlier versions than that as a result of this issue.

I'm currently using 460.32.03-0ubuntu0.20.10.1.

Note that while I don't see artifacts anymore, it still sometimes happens that some of the text in the browser chrome disappears after restoring from suspend. I think that's what happened previously (before the versions with this crash).

This is on release? There is a change in 86 to better handle the resets I would observe on suspend/resume. Hopefully that resolves it.

(In reply to Andrew Osmond [:aosmond] from comment #13)

This is on release? There is a change in 86 to better handle the resets I would observe on suspend/resume. Hopefully that resolves it.

Yeah, that is about release. I don't usually keep my nightly for very long as I only use it for specific websites.

This is a crash related to older nvidia binary drivers and accelerated webrender, which will not be a target for release. These users will get software fallback instead.

Comment 8 suggests this happens even without Fission. Also, WR will be shipped everywhere before Fission, with software fallback wherever hardware acceleration is not yet available. Confirmed this with :jimm, and therefore moving this out of the Fission queue.

Fission Milestone: MVP → ---

This was fixed by bug 1682876 in 86. We weren't handling device resets properly with NVIDIA binary drivers.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE