Closed Bug 1615478 Opened 5 years ago Closed 5 years ago

Crash in [@ IPCError-browser | ShutDownKill | kevent_id] called from mozilla::gl::GLContext::fFinish()

Categories

(Core :: Graphics: CanvasWebGL, defect, P3)

Desktop
macOS
defect

RESOLVED WORKSFORME

People

(Reporter: mccr8, Unassigned)

Details

(Keywords: crash)

Crash Data

This bug is for crash report bp-065dc3c7-0140-412e-bc69-65d490200213.

Top 10 frames of crashing thread:

0 libsystem_kernel.dylib kevent_id 
1 libdispatch.dylib _dispatch_event_loop_wait_for_ownership 
2 libdispatch.dylib __DISPATCH_WAIT_FOR_QUEUE__ 
3 libdispatch.dylib _dispatch_sync_f_slow 
4 GLEngine glFinish_ExecThread 
5 XUL mozilla::gl::GLContext::fFinish gfx/gl/GLContext.h:1121
6 XUL mozilla::WebGLContext::Finish dom/canvas/WebGLContextGL.cpp:1486
7 XUL void mozilla::RunOn<void  dom/canvas/ClientWebGLContext.cpp:371
8 XUL mozilla::dom::WebGLRenderingContext_Binding::finish dom/bindings/WebGLRenderingContextBinding.cpp:17401
9 XUL bool mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions> dom/bindings/BindingUtils.cpp:3170

There's a decent number of these crashes in Nightlies on OSX. The signature looks pretty bad, but they all seem to involve shutting down WebGL. I don't know if this is actionable or not.

I wonder if this is an OSX 10.15 regression. All of the crash reports are for that version and a Haswell-gt3 GPU (0x0a2e), so perhaps the same generation of Mac? (Or the same user.)

Flags: needinfo?(jgilbert)
OS: Unspecified → macOS
Priority: -- → P3
Hardware: Unspecified → Desktop

WebRender is forced on as well for all reports, in case it is relevant.

Weird, I wonder what's going on. glFinish shouldn't be having any problems.

Flags: needinfo?(jgilbert)

Note that this isn't a crash in glFinish(). The content process was hung shutting down when we took this minidump, so this is glFinish() being so slow - or stuck - that we had to kill the content process. Two more important things: all these crashes seem to be coming from a single Nightly user, and they have Fission enabled too.

I looked at one of the crash reports, and the telemetry environment includes this: {"compositor":"webrender","gpuProcess":{"status":"unused"},"wrQualified":{"status":"blacklisted"},"webrender":{"status":"available"}}}

Does the "blacklisted" in there mean anything important? IIRC Fission requires WebRender for some reason, so maybe that could be causing an issue?

This is on OSX, and we've seen some weird crashes with layers force enabled, but I don't see anything related to that here.

Oh, ok. Still weird. How long do we wait before axing a 'stuck' content process?
glFinish can take up to four seconds in the worst case, before context-loss TDR timeout should hit.

(In reply to Jeff Gilbert [:jgilbert] from comment #6)
> Oh, ok. Still weird. How long do we wait before axing a 'stuck' content process?
> glFinish can take up to four seconds in the worst case, before context-loss TDR timeout should hit.

We wait five seconds, so that's pretty close.

We could probably use FenceSync/ClientWaitSync to build a shorter glFinish timeout.
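
A rough sketch of what that could look like, not actual Gecko code: insert a fence after the pending commands and wait on it with an explicit timeout instead of calling glFinish. The `SyncFns` table, `FinishWithTimeout`, and `FinishResult` are hypothetical names made up for illustration, and the GL types are redeclared locally (with `GLsync` simplified to `void*`) so the snippet stands alone. The enum values are the ones the OpenGL sync-object API defines. Whether ClientWaitSync would actually return on time in the stuck case from this bug is an open assumption.

```cpp
#include <cstdint>
#include <functional>

// Local stand-ins for the GL typedefs so the sketch compiles without GL headers.
using GLenum = unsigned int;
using GLbitfield = unsigned int;
using GLuint64 = std::uint64_t;
using GLsync = void*;  // Simplification: the real type is an opaque struct pointer.

// Constants from the OpenGL sync-object API (GL 3.2 / ES 3.0).
constexpr GLenum kSyncGpuCommandsComplete = 0x9117;  // GL_SYNC_GPU_COMMANDS_COMPLETE
constexpr GLbitfield kSyncFlushCommandsBit = 0x1;    // GL_SYNC_FLUSH_COMMANDS_BIT
constexpr GLenum kAlreadySignaled = 0x911A;          // GL_ALREADY_SIGNALED
constexpr GLenum kTimeoutExpired = 0x911B;           // GL_TIMEOUT_EXPIRED
constexpr GLenum kConditionSatisfied = 0x911C;       // GL_CONDITION_SATISFIED

// Hypothetical table of driver entry points (glFenceSync, glClientWaitSync,
// glDeleteSync), so a fake driver can be swapped in.
struct SyncFns {
  std::function<GLsync(GLenum, GLbitfield)> fenceSync;
  std::function<GLenum(GLsync, GLbitfield, GLuint64)> clientWaitSync;
  std::function<void(GLsync)> deleteSync;
};

enum class FinishResult { Completed, TimedOut, Failed };

// Waits for all previously issued GL commands, giving up after timeoutNs
// nanoseconds rather than blocking indefinitely the way glFinish can.
FinishResult FinishWithTimeout(const SyncFns& gl, GLuint64 timeoutNs) {
  // The fence signals once every command issued before it has completed.
  GLsync fence = gl.fenceSync(kSyncGpuCommandsComplete, 0);
  if (!fence) {
    return FinishResult::Failed;
  }

  // The flush bit makes sure the fence itself gets submitted to the GPU;
  // without it the wait could time out on a command that was never sent.
  const GLenum status =
      gl.clientWaitSync(fence, kSyncFlushCommandsBit, timeoutNs);
  gl.deleteSync(fence);

  switch (status) {
    case kAlreadySignaled:
    case kConditionSatisfied:
      return FinishResult::Completed;
    case kTimeoutExpired:
      return FinishResult::TimedOut;
    default:  // GL_WAIT_FAILED or anything unexpected.
      return FinishResult::Failed;
  }
}
```

A caller would pick a timeout comfortably below the five-second shutdown kill, for example one or two seconds, and on `TimedOut` decide whether to treat the context as lost instead of continuing to block.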

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME