Closed Bug 1652972 Opened 1 year ago Closed 1 year ago

Crash in [@ @0x0 | mozilla::gl::GLContextEGL::IsCurrentImpl]

Categories

(Core :: Graphics: WebRender, defect, P3)

Unspecified
Windows 10
defect

RESOLVED FIXED
81 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox79 --- wontfix
firefox80 --- wontfix
firefox81 --- fixed

People

(Reporter: sg, Assigned: kvark)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-027413fe-e74e-4e50-a58b-711410200715.

Top 10 frames of crashing thread:

0  @0x0 
1 xul.dll mozilla::gl::GLContextEGL::IsCurrentImpl const gfx/gl/GLContextProviderEGL.cpp:484
2 xul.dll mozilla::gl::GLContext::MakeCurrent const gfx/gl/GLContext.cpp:2334
3 xul.dll mozilla::wr::RenderDXGITextureHostOGL::DeleteTextureHandle gfx/webrender_bindings/RenderD3D11TextureHostOGL.cpp:254
4 xul.dll mozilla::wr::RenderDXGITextureHostOGL::~RenderDXGITextureHostOGL gfx/webrender_bindings/RenderD3D11TextureHostOGL.cpp:40
5 xul.dll mozilla::wr::RenderDXGITextureHostOGL::~RenderDXGITextureHostOGL gfx/webrender_bindings/RenderD3D11TextureHostOGL.cpp:38
6 xul.dll std::list<RefPtr<mozilla::wr::RenderTextureHost>, std::allocator<RefPtr<mozilla::wr::RenderTextureHost> > >::clear vs2017_15.8.4/VC/include/list:1393
7 xul.dll mozilla::wr::RenderThread::DeferredRenderTextureHostDestroy gfx/webrender_bindings/RenderThread.cpp:760
8 xul.dll mozilla::detail::RunnableMethodImpl< xpcom/threads/nsThreadUtils.h:1240
9 xul.dll MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:548

Happening primarily on Nightly for quite some time now, since build id 20200415215103.

Hm, not sure if there is anything in that build that particularly stands out. Kvark, do you have any suggestions?

Flags: needinfo?(dmalyshau)
Blocks: gfx-triage
Severity: -- → S3
Priority: -- → P3

It's not clear to me what's going wrong there, although I'm not familiar with this code. Somebody with access to the crash dump could look and tell more. For example, they could look at the state and see whether the fGetCurrentContext pointer is NULL, or whether it's something else.
If we don't have anyone available (given that Jeff is going on leave), I can request access and look into it more.

Flags: needinfo?(dmalyshau)

This crash is caused by jumping to a NULL pointer address: the crash reason is EXCEPTION_ACCESS_VIOLATION_EXEC (an attempt to execute a non-executable address), and the first stack frame is @0x0, so that's where the instruction pointer was at the time of the crash. The GLLibraryEGL object seems to have a bunch of function pointers that get filled in with the actual implementations, so my guess is that the pointer for the fGetCurrentContext() function was NULL. I hope this makes the bug more actionable.
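To illustrate the failure mode described above, here is a minimal sketch of a table of dynamically loaded entry points. It is not the Firefox code: `SymbolTable`, `FakeGetCurrentContext`, and `IsCurrentChecked` are hypothetical names invented for this example. Only the field name `fGetCurrentContext` is borrowed from the discussion. Calling the pointer while it is still null would jump to address 0, which is what the crash signature suggests. The guarded helper reports "not current" instead.

```cpp
#include <cassert>

// Hypothetical stand-in for a table of dynamically loaded EGL entry points.
using GetCurrentContextFn = void* (*)();

struct SymbolTable {
  // Stays null until the symbols are loaded, and becomes null again if they
  // are erased. Calling it in that state would execute address 0.
  GetCurrentContextFn fGetCurrentContext = nullptr;
};

// Fake entry point used only by this sketch; returns a stable dummy handle.
static void* FakeGetCurrentContext() {
  static int sDummyContext = 0;
  return &sDummyContext;
}

// Returns true if `expected` is the current context. A missing symbol is
// treated as "not current" rather than being called.
bool IsCurrentChecked(const SymbolTable& table, void* expected) {
  if (!table.fGetCurrentContext) {
    return false;
  }
  return table.fGetCurrentContext() == expected;
}
```

With a default-constructed `SymbolTable`, `IsCurrentChecked` returns false without calling anything. Once `fGetCurrentContext` is assigned, it compares the returned handle as usual.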

Also, the stack starts from a runnable on the Renderer thread, so this might be a race of some sort with whichever thread should initialize the GLLibraryEGL object?

The crash above is on Intel/Win10, but this may be useful to mention:
On EGL/proprietary Nvidia/Linux (MOZ_X11_EGL=1) there is a reliability problem regarding Firefox's use of EGL: bug 1650583
I would assume it's the same problem Surfman ran into: https://github.com/servo/surfman/pull/178

Avoid calling eglMakeCurrent prior to creating a window surface

I looked at a couple of other libraries that use EGL on various hardware, and another common approach to resolving this issue is to create a 1x1 pbuffer "dummy surface", and make current against that when you want to unlock your main surface.

On Wayland, I got an EGL-related startup crash: bug 1653960

Kvark, if you could help investigate, that would be good.

Blocks: wr-80
No longer blocks: gfx-triage
Flags: needinfo?(dmalyshau)
Assignee: nobody → dmalyshau
Flags: needinfo?(dmalyshau)

This is meant to save us in cases where the message loop in the GPU
process receives commands related to resources that point to the old EGL
context that was just shut down. Since the symbols are erased, we'd end
up trying to execute a nullptr on MakeCurrent(). With the context marked
as lost, however, no symbols will be accessed.
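The idea in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `FakeEglLibrary`, `FakeContext`, `MarkLost`, and `ShutdownAll` are hypothetical names, and the real GLContext/GLLibraryEGL classes are considerably more involved. The sketch only shows why checking a "lost" flag first keeps MakeCurrent() from touching erased symbols.

```cpp
#include <cassert>

using GetCurrentContextFn = void* (*)();

// Dummy native context handle and fake entry point, for this sketch only.
static int gFakeNativeContext = 0;
static void* FakeGetCurrentContext() { return &gFakeNativeContext; }

// Hypothetical stand-in for the shared EGL library object.
struct FakeEglLibrary {
  GetCurrentContextFn fGetCurrentContext = FakeGetCurrentContext;

  // Erases the loaded symbols, as the commit message says happens on shutdown.
  void Shutdown() { fGetCurrentContext = nullptr; }
};

// Hypothetical stand-in for a GL context that outlives the library shutdown.
class FakeContext {
 public:
  explicit FakeContext(FakeEglLibrary* lib) : mLib(lib) { assert(mLib); }

  void MarkLost() { mLost = true; }
  bool IsLost() const { return mLost; }

  // Returns false for a lost context before any library symbol is accessed.
  bool MakeCurrent() const {
    if (mLost) {
      return false;
    }
    return mLib->fGetCurrentContext() == &gFakeNativeContext;
  }

 private:
  FakeEglLibrary* mLib;
  bool mLost = false;
};

// Assumed shutdown order for the sketch: mark the context lost first, then
// erase the symbols, so late resource cleanup never calls a null pointer.
void ShutdownAll(FakeEglLibrary& lib, FakeContext& ctx) {
  ctx.MarkLost();
  lib.Shutdown();
}
```

A deferred texture destroy that arrives after `ShutdownAll` would then see `MakeCurrent()` return false and could skip its GL cleanup, instead of jumping through a null `fGetCurrentContext`.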

Pushed by dmalyshau@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0067bb1fb8e4
Mark EGL context as lost on Shutdown() r=aosmond
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
Regressions: 1656853