Closed Bug 736436 Opened 13 years ago Closed 10 years ago

crash in glGetBooleanv on Samsung Galaxy Nexus while playing crazybug game

Categories

(Core :: Graphics: CanvasWebGL, defect)

ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
blocking-fennec1.0 --- -

People

(Reporter: martijn.martijn, Assigned: bjacob)

Details

(Keywords: crash, reproducible, Whiteboard: [native-crash][gfx] webgl-driver)

Attachments

(3 files)

Tested on the Samsung Galaxy Nexus, Android 4.0.2 with trunk Fennec 14.0a1 build from 2012-03-15 (so with Maple merged in).

Steps to reproduce:
- Go to http://crazybugs.ivank.net/
- Tap on "PLAY THE GAME" on the bottom right
- Choose level 1
- When the game starts, tap on "SOLVE" at the bottom right

Result: crash report bp-77ce267f-7032-4690-8376-afdff2120316.
=============================================================
0 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x75f8
1 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x76ea
2 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x282a6
3 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x5a38e
4 libskia.so libskia.so@0xfdffe
5 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x2a30e
6 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x17132
7 libxul.so DEBUG_CheckWrapperThreadSafety js/xpconnect/src/XPCWrappedNative.cpp:3649
8 libxul.so XPCCallContext::Init js/xpconnect/src/XPCCallContext.cpp:188
9 libxul.so XPCCallContext::XPCCallContext js/xpconnect/src/XPCCallContext.cpp:67
10 libxul.so XPC_WN_Helper_NewResolve js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1163
11 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x11d86
12 libxul.so mozilla::WebGLContext::GenerateMipmap GLContext.h:2189
13 libxul.so nsIDOMWebGLRenderingContext_GenerateMipmap obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:23605
14 libxul.so js::InvokeKernel js/src/jscntxtinlines.h:314
15 libxul.so js::Interpret js/src/jsinterp.cpp:2710
16 libxul.so UncachedInlineCall js/src/methodjit/InvokeHelpers.cpp:370
17 libxul.so js::mjit::stubs::UncachedCallHelper js/src/methodjit/InvokeHelpers.cpp:453
18 libxul.so CallCompiler::update js/src/methodjit/MonoIC.cpp:960
19 libxul.so js::mjit::ic::Call js/src/methodjit/MonoIC.cpp:1018
20 libxul.so libxul.so@0xd08795
etc...
blocking-fennec1.0: --- → ?
Keywords: reproducible
Hardware: All → ARM
Whiteboard: [native-crash]
libGLESv2_POWERVR_SGX540_120.so@0x17132 is glGetBooleanv.
Component: General → Canvas: WebGL
Product: Fennec Native → Core
QA Contact: general → canvas.webgl
Assignee: nobody → bjacob
blocking-fennec1.0: ? → +
Whiteboard: [native-crash] → [native-crash][gfx]
Summary: crash in libGLESv2_POWERVR_SGX540_120 on Samsung Galaxy Nexus while playing crazybug game → crash in glGetBooleanv on Samsung Galaxy Nexus while playing crazybug game
Given the stack, it seems likely that this will be fixed by the patch in Bug 738126, no?
Depends on: 738126
Ignore comment 2: comment 1 is probably smarter.

When I go to this demo in a debug build on a Nexus S, it crashes with a SIGKILL so I can't get a stack. When I run with MOZ_GL_DEBUG_VERBOSE to see what are the most recent GL calls before the crash, it gives me an endless repetition of

adb| [egl] > void* mozilla::gl::GLLibraryEGL::fGetCurrentContext()
adb| [egl] < void* mozilla::gl::GLLibraryEGL::fGetCurrentContext()
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fActiveTexture(GLenum)
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fActiveTexture(GLenum) [0...
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fBindTexture(GLenum, GLuint)
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fBindTexture(GLenum, GLui...
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fTexImage2D(GLenum, GLint...
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fTexImage2D(GLenum, GLint...
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fPixelStorei(GLenum, GLint)
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fPixelStorei(GLenum, GLin...
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GL...
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GL...
adb| [gl:0x498ad000] > void mozilla::gl::GLContext::fPixelStorei(GLenum, GLint)
adb| [gl:0x498ad000] < void mozilla::gl::GLContext::fPixelStorei(GLenum, GLin...

and the demo stalls repeating this so I can't get to the crash.
If I set a breakpoint in TexSubImage2D, I get this call stack:

Breakpoint 1, mozilla::gl::GLContext::fTexSubImage2D (this=0x498ad000, target=3553, level=0, xoffset=0, yoffset=0, width=224, height=179, format=6408, type=5121, pixels=0x4f28f000) at ../../../dist/include/GLContext.h:2425
2425 BEFORE_GL_CALL;
(gdb) bt
#0 mozilla::gl::GLContext::fTexSubImage2D (this=0x498ad000, target=3553, level=0, xoffset=0, yoffset=0, width=224, height=179, format=6408, type=5121, pixels=0x4f28f000) at ../../../dist/include/GLContext.h:2425
#1 0x4bff09e2 in mozilla::gl::GLContext::TexSubImage2DWithoutUnpackSubimage(unsigned int, int, int, int, int, int, int, int, unsigned int, unsigned int, void const*) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#2 0x4bff0818 in mozilla::gl::GLContext::TexSubImage2D(unsigned int, int, int, int, int, int, int, int, unsigned int, unsigned int, void const*) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#3 0x4bff0710 in mozilla::gl::GLContext::TexImage2D(unsigned int, int, int, int, int, int, int, int, unsigned int, unsigned int, void const*) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#4 0x4bff0456 in mozilla::gl::GLContext::UploadSurfaceToTexture(gfxASurface*, nsIntRegion const&, unsigned int&, bool, nsIntPoint const&, bool) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#5 0x4bff61d0 in mozilla::gl::TextureImageEGL::DirectUpdate(gfxASurface*, nsIntRegion const&, nsIntPoint const&) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#6 0x4bfecc88 in mozilla::gl::TiledTextureImage::DirectUpdate(gfxASurface*, nsIntRegion const&, nsIntPoint const&) () from /home/bjacob/mozilla-central/obj-mobile-debug/dist/bin/libxul.so
#7 0x4bfda472 in mozilla::layers::ShadowBufferOGL::DirectUpdate (this=0x4e96a880, aUpdate=0x49484480, aRegion=...) at /home/bjacob/mozilla-central/gfx/layers/opengl/ThebesLayerOGL.cpp:942
#8 0x4bfdaef8 in mozilla::layers::ShadowThebesLayerOGL::ProgressiveUpload (this=0x4e973c00) at /home/bjacob/mozilla-central/gfx/layers/opengl/ThebesLayerOGL.cpp:1141
#9 0x4bfdc136 in DispatchToMethod<mozilla::layers::ShadowThebesLayerOGL, void (mozilla::layers::ShadowThebesLayerOGL::*)()> (obj=0x4e973c00, method=&virtual table offset 52, arg=...) at /home/bjacob/mozilla-central/ipc/chromium/src/base/tuple.h:383
#10 0x4bfdc0a8 in RunnableMethod<mozilla::layers::ShadowThebesLayerOGL, void (mozilla::layers::ShadowThebesLayerOGL::*)(), Tuple0>::Run (this=0x4e8e2600) at /home/bjacob/mozilla-central/ipc/chromium/src/base/task.h:307
#11 0x4beff158 in MessageLoop::RunTask (this=0x49affdf0, task=0x4e8e2600) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:318
#12 0x4beff1ae in MessageLoop::DeferOrRunPendingTask (this=0x49affdf0, pending_task=...) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:326
#13 0x4beff652 in MessageLoop::DoDelayedWork (this=0x49affdf0, next_delayed_work_time=0x49896d70) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:453
#14 0x4bf044f0 in base::MessagePumpDefault::Run (this=0x49896d60, delegate=0x49affdf0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_pump_default.cc:27
#15 0x4befed78 in MessageLoop::RunInternal (this=0x49affdf0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:208
#16 0x4befed12 in MessageLoop::RunHandler (this=0x49affdf0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:201
#17 0x4befecba in MessageLoop::Run (this=0x49affdf0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/message_loop.cc:175
#18 0x4bf1e1fe in base::Thread::ThreadMain (this=0x4988f5b0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/thread.cc:156
#19 0x4bf4a8fe in ThreadFunc (closure=0x4988f5b0) at /home/bjacob/mozilla-central/ipc/chromium/src/base/platform_thread_posix.cc:26
#20 0xafd118e8 in __thread_entry () from /home/bjacob/moz-gdb/lib/3834B8993E6B00EC/system/lib/libc.so
#21 0xafd114b4 in pthread_create () from /home/bjacob/moz-gdb/lib/3834B8993E6B00EC/system/lib/libc.so
#22 0x00414ca0 in ?? ()
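The numeric arguments to fTexSubImage2D in frame #0 are raw GLenum values. As a reading aid, here is a minimal sketch that decodes them using the standard OpenGL ES 2.0 constant values and estimates the size of that one upload. The helper names (GLEnumName, RgbaUploadBytes) are hypothetical and not part of Gecko, and the size estimate assumes tightly packed 4-byte pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Standard OpenGL ES 2.0 enum values, as defined in GLES2/gl2.h.
constexpr std::uint32_t kGL_TEXTURE_2D = 0x0DE1;     // 3553
constexpr std::uint32_t kGL_UNSIGNED_BYTE = 0x1401;  // 5121
constexpr std::uint32_t kGL_RGBA = 0x1908;           // 6408

// Hypothetical helper: names only the enums that appear in the backtrace.
std::string GLEnumName(std::uint32_t value) {
  switch (value) {
    case kGL_TEXTURE_2D:    return "GL_TEXTURE_2D";
    case kGL_UNSIGNED_BYTE: return "GL_UNSIGNED_BYTE";
    case kGL_RGBA:          return "GL_RGBA";
    default:                return "unknown";
  }
}

// Hypothetical helper: bytes needed for a tightly packed
// RGBA / UNSIGNED_BYTE image (4 bytes per pixel).
std::size_t RgbaUploadBytes(std::size_t width, std::size_t height) {
  return width * height * 4;
}
```

For the call in the backtrace (target=3553, format=6408, type=5121, 224 x 179), this gives GL_TEXTURE_2D, GL_RGBA, GL_UNSIGNED_BYTE and 160384 bytes, so that individual upload is small.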
I instrumented some functions to say when we are entering and leaving them. I get very different results across runs. One run gives me:

adb| enter virtual void mozilla::layers::ShadowThebesLayerOGL::ProgressiveUpl...
adb| enter void mozilla::layers::ShadowBufferOGL::EnsureTexture(gfxIntSize, g...
adb| leave void mozilla::layers::ShadowBufferOGL::EnsureTexture(gfxIntSize, g...
adb| enter void mozilla::layers::ShadowBufferOGL::DirectUpdate(gfxASurface*, ...
Program terminated with signal SIGKILL, Killed.

This seems to prove that the SIGKILL comes from another thread than the thread running these functions. Indeed, mozilla::layers::ShadowBufferOGL::DirectUpdate immediately calls mozilla::layers::ShadowBufferOGL::EnsureTexture. That we got SIGKILL before it calls EnsureTexture can only be explained, AFAICS, if the SIGKILL comes from another thread.
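For readers who want to reproduce this kind of enter/leave tracing, one common way to do it is an RAII guard placed at the top of each function. The following is only a sketch of that idea, not the local patch used here: ScopeTrace, TraceLog, and the two example functions are hypothetical names, and the log goes to an in-memory vector rather than to adb.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log sink; the instrumentation described above printed to adb.
inline std::vector<std::string>& TraceLog() {
  static std::vector<std::string> log;
  return log;
}

// RAII guard: records "enter <name>" when constructed and
// "leave <name>" when the enclosing scope exits.
class ScopeTrace {
 public:
  explicit ScopeTrace(std::string name) : mName(std::move(name)) {
    TraceLog().push_back("enter " + mName);
  }
  ~ScopeTrace() { TraceLog().push_back("leave " + mName); }
  ScopeTrace(const ScopeTrace&) = delete;
  ScopeTrace& operator=(const ScopeTrace&) = delete;

 private:
  std::string mName;
};

// Stand-ins that mirror the call relationship described above:
// DirectUpdate calls EnsureTexture right away.
void EnsureTextureExample() {
  ScopeTrace trace("EnsureTexture");
}

void DirectUpdateExample() {
  ScopeTrace trace("DirectUpdate");
  EnsureTextureExample();
}
```

With guards like this, a log that ends in "enter DirectUpdate" with no following "enter EnsureTexture" means the thread was stopped before it reached its first nested call.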
The continuous ProgressiveUploads are not specific to this testcase, nor to this crashy situation: I filed bug 741984 about such continuous ProgressiveUploads happening on the home tab.
Depends on: 741984
No longer depends on: 738126
Steps to reproduce this log:
- apply patch from bug 741984, otherwise there's too much noise.
- set media.* format preferences to false, otherwise there's noise from SYDNEY_AUDIO.
- use MOZ_GL_DEBUG_VERBOSE=1
- I also have a local patch adding more logging but that's not essential here.
Here the first acore process crash occurs during a WebGL texImage2D call but before the glTexImage2D call. Almost certainly, that means it occurred during the texture conversion.
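To illustrate why a conversion step ahead of glTexImage2D is a plausible place to run out of memory: such a step typically allocates a temporary buffer the size of the whole image before anything is handed to GL. The sketch below shows that pattern with a BGRA-to-RGBA swizzle. It is an assumption for illustration only; the actual conversion code in Gecko is not shown in this bug, and ConvertBGRAToRGBA is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical conversion: copies a tightly packed BGRA8 image into a newly
// allocated RGBA8 buffer. Returns nullptr if the size overflows or the
// temporary buffer cannot be allocated.
std::unique_ptr<std::uint8_t[]> ConvertBGRAToRGBA(const std::uint8_t* src,
                                                  std::size_t width,
                                                  std::size_t height) {
  // Reject sizes whose byte count would overflow size_t.
  if (width != 0 && height > (static_cast<std::size_t>(-1) / 4) / width) {
    return nullptr;
  }
  const std::size_t pixels = width * height;

  // This allocation is as large as the whole image; it happens before
  // any GL call is made.
  std::unique_ptr<std::uint8_t[]> dst(new (std::nothrow) std::uint8_t[pixels * 4]);
  if (!dst) {
    return nullptr;
  }

  for (std::size_t i = 0; i < pixels; ++i) {
    dst[4 * i + 0] = src[4 * i + 2];  // R <- third byte of BGRA
    dst[4 * i + 1] = src[4 * i + 1];  // G unchanged
    dst[4 * i + 2] = src[4 * i + 0];  // B <- first byte of BGRA
    dst[4 * i + 3] = src[4 * i + 3];  // A unchanged
  }
  return dst;
}
```

Whether a failed allocation shows up as a null return or as the process being killed depends on the platform.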
Here is my best attempt at a testcase trying to reproduce the WebGL stuff done by this app: http://people.mozilla.org/~bjacob/webgl-repeated-texImage2D-calls.html Unfortunately it doesn't reproduce. My best theory to explain this is that the app that's crashing here, isn't allocating huge WebGL resources: there is typically only 10M to 30M of WebGL textures, and much less for buffers. So it's something else in this demo that's causing the OOM condition, and it just happens to be OOMing in WebGL stuff because that's what's doing the bigger allocations.
The memory-pressure observer idea from bug 736481 might, or might not, help here. The idea would be to lose the WebGL context on memory pressure. Depending on what the application does, that may or may not avert the OOM.
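As a rough sketch of what "lose the context on memory pressure" could look like, the following models a notifier and a context that drops its resources when it sees a memory-pressure notification. It is not the patch from bug 736481: PressureNotifier and FakeWebGLContext are hypothetical stand-ins, and the "memory-pressure" topic string is assumed here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an observer service that broadcasts topics.
class PressureNotifier {
 public:
  using Callback = std::function<void(const std::string&)>;

  void AddObserver(Callback cb) { mObservers.push_back(std::move(cb)); }

  void Notify(const std::string& topic) {
    for (auto& cb : mObservers) {
      cb(topic);
    }
  }

 private:
  std::vector<Callback> mObservers;
};

// Hypothetical stand-in for a WebGL context that owns texture memory.
class FakeWebGLContext {
 public:
  explicit FakeWebGLContext(std::size_t textureBytes)
      : mTextureBytes(textureBytes) {}

  // Reacts only to the (assumed) "memory-pressure" topic.
  void OnTopic(const std::string& topic) {
    if (topic == "memory-pressure") {
      LoseContext();
    }
  }

  // Losing the context releases everything the context was holding.
  void LoseContext() {
    mTextureBytes = 0;
    mLost = true;
  }

  bool IsLost() const { return mLost; }
  std::size_t TextureBytes() const { return mTextureBytes; }

 private:
  std::size_t mTextureBytes;
  bool mLost = false;
};
```

This only frees what the context itself holds, which is why it may not avert the OOM if most of the memory is used elsewhere in the page.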
Depends on: 619670
Depends on: 744515
Next steps for this bug: try to solve bug 744515. Once it's resolved, and once the patches in bug 736481 have landed, WebGL won't cause OOMs anymore.
Since many/most WebGL OOMs are fixed by bug 736481, and the remaining ones, like this one, are not fixed by it only because of a bug in Android, I propose no longer blocking on this.
blocking-fennec1.0: + → ?
blocking-fennec1.0: ? → -
Just a fyi, this is still crashing in current trunk build on the Galaxy Nexus.
I'm still crashing in current trunk build, but not directly, it takes a while. This is happening while the game is playing and I'm closing the browser and/or opening/closing about:apps in a new tab from the Java menu.

https://crash-stats.mozilla.com/report/index/bp-2610377d-e42e-4453-8b75-348b72130519
0 libxul.so mozilla::dom::TimeRanges::TimeRange* nsTArray_Impl<mozilla::dom::TimeRanges::Tim obj-firefox/dist/include/nsTArray.h:363
1 libxul.so mozilla::dom::TimeRanges::Add content/html/content/src/TimeRanges.cpp:81
2 libGLESv2.so glGetString
3 libxul.so mozilla::dom::HTMLMediaElement::SetCurrentTime content/html/content/src/HTMLMediaElement.cpp:1322
4 libGLESv2.so glGetString
5 libxul.so mozilla::dom::HTMLMediaElement::SetCurrentTime content/html/content/src/HTMLMediaElement.cpp:1366
6 libxul.so nsAttrAndChildArray::IndexOfAttr const content/base/src/nsAttrAndChildArray.cpp:549
7 libxul.so mozilla::dom::HTMLMediaElement::PlaybackEnded content/html/content/src/HTMLMediaElement.cpp:2882
8 libxul.so mozilla::MediaDecoderStateMachine::GetNextFrameStatus content/media/MediaDecoderStateMachine.cpp:1367
9 libxul.so mozilla::MediaDecoder::PlaybackEnded content/media/MediaDecoder.cpp:859
etc...

https://crash-stats.mozilla.com/report/index/bp-3894a9d8-a8d3-4634-8267-3265e2130519
0 libpvrANDROID_WSEGL.so libpvrANDROID_WSEGL.so@0xeb4
1 binder binder@0x29e
2 libIMGegl.so libIMGegl.so@0x666a
3 libIMGegl.so libIMGegl.so@0x9986
4 libEGL.so _ZN7android9SingletonINS_6LoaderEE11hasInstanceEv
5 libEGL.so _ZN7android9SingletonINS_6LoaderEE11hasInstanceEv
6 libEGL.so _ZN7android9SingletonINS_6LoaderEED2Ev
7 libEGL.so _ZNK7android13egl_display_t9getObjectEPNS_12egl_object_tE
8 libEGL_POWERVR_SGX540_120.so libEGL_POWERVR_SGX540_120.so@0xfce
9 libEGL.so eglQuerySurface

https://crash-stats.mozilla.com/report/index/bp-b32a3644-ad92-4678-a6d1-f82c02130519
0 libxul.so mozilla::dom::TimeRanges::TimeRange* nsTArray_Impl<mozilla::dom::TimeRanges::Tim obj-firefox/dist/include/nsTArray.h:363
1 libxul.so mozilla::dom::TimeRanges::Add content/html/content/src/TimeRanges.cpp:81
2 libm.so tgamma
3 @0xfffffffd
4 libxul.so mozilla::MediaDecoder::PlaybackEnded content/media/MediaDecoder.cpp:859
5 libxul.so nsRunnableMethodImpl<tag_nsresult obj-firefox/dist/include/nsThreadUtils.h:350
6 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:627
7 dalvik-mark-stack (deleted) dalvik-mark-stack @0x1d78891
8 libxul.so NS_ProcessPendingEvents obj-firefox/xpcom/build/nsThreadUtils.cpp:188
9 libxul.so nsBaseAppShell::NativeEventCallback nsBaseAppShell.cpp:97
10 libxul.so nsAppShell::ProcessNextNativeEvent
I think libpvrANDROID_WSEGL.so@0xeb4 is still related to this bug, but not the mozilla::dom::TimeRanges crashes.
Crash Signature: [@ libGLESv2_POWERVR_SGX540_120.so@0x75f8] → [@ libGLESv2_POWERVR_SGX540_120.so@0x75f8] [@ libpvrANDROID_WSEGL.so@0xeb4 ]
Whiteboard: [native-crash][gfx] → [native-crash][gfx] webgl-driver
Doesn't seem to crash anymore on Fennec beta on the Galaxy Nexus.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME