crash in mozilla::gl::GLContextEGL::ReleaseSurface

RESOLVED FIXED in mozilla29

Status

()

--
critical
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: scoobidiver, Assigned: kamidphish)

Tracking

({crash, regression})

23 Branch
mozilla29
ARM
Android
crash, regression
Points:
---

Firefox Tracking Flags

(firefox22 unaffected, firefox23- affected, firefox24 unaffected, fennec-)

Details

(Whiteboard: [native-crash], crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

5 years ago
It's #30 crasher in 23.0a2 and #39 in 24.0a1.
It started spiking in 23.0a1/20130503. The regression range for the spike might be (low volume):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=70e0955ccc87&tochange=b35170667a2f

Signature 	mozilla::gl::GLContextEGL::ReleaseSurface() More Reports Search
UUID	c86d1a59-83f6-4ae4-abd3-dd3b42130601
Date Processed	2013-06-01 01:55:50
Uptime	63
Install Age	1.2 days since version was first installed.
Install Time	2013-05-30 20:29:09
Product	FennecAndroid
Version	24.0a1
Build ID	20130529031131
Release Channel	nightly
OS	Android
OS Version	0.0.0 Linux 3.0.31-ge491d6d #1 SMP PREEMPT Thu Feb 14 08:51:55 CST 2013 armv7l motorola/XT926_verizon/vanquish:4.1.2/9.8.1Q-62_VQW_MR-2/6:user/release-keys
Build Architecture	arm
Build Architecture Info	ARMv0
Crash Reason	SIGSEGV
Crash Address	0x1c
App Notes 	
AdapterDescription: 'Qualcomm -- Adreno (TM) 225 -- OpenGL ES 2.0 2184622 -- Model: DROID RAZR HD, Product: XT926_verizon, Manufacturer: motorola, Hardware: qcom'
GL Layers! EGL? EGL+ GL Context? GL Context+ GL Layers+ 
motorola DROID RAZR HD
motorola/XT926_verizon/vanquish:4.1.2/9.8.1Q-62_VQW_MR-2/6:user/release-keys
Processor Notes 	sp-processor04_phx1_mozilla_com_10969:2012; non-integer value of "SecondsSinceLastCrash"; exploitability tool: ERROR: unable to analyze dump
EMCheckCompatibility	True
Adapter Vendor ID	Qualcomm
Adapter Device ID	Adreno (TM) 225
Device	motorola DROID RAZR HD
Android API Version	16 (REL)
Android CPU ABI	armeabi-v7a

Frame 	Module 	Signature 	Source
0 	libEGL.so 	libEGL.so@0xabb8 	
1 	libEGL.so 	libEGL.so@0xa9a7 	
2 	libEGL.so 	libEGL.so@0x2212a 	
3 	libEGL.so 	libEGL.so@0x22106 	
4 	libEGL.so 	libEGL.so@0xc959 	
5 	libEGL.so 	libEGL.so@0x22106 	
6 	libEGL.so 	libEGL.so@0xc86b 	
7 	libxul.so 	mozilla::gl::GLContextEGL::ReleaseSurface 	gfx/gl/GLLibraryEGL.h:164
8 	libxul.so 	mozilla::layers::CompositorOGL::Pause 	gfx/layers/opengl/CompositorOGL.cpp:1363
9 	libxul.so 	mozilla::layers::CompositorParent::PauseComposition 	gfx/layers/ipc/CompositorParent.cpp:284
10 	libxul.so 	mozilla::layers::CompositorParent::PauseComposition 	obj-firefox/dist/include/mozilla/Mutex.h:79
11 	libxul.so 	mozilla::layers::CompositorParent::RecvPause 	gfx/layers/ipc/CompositorParent.cpp:230
12 	libxul.so 	mozilla::layers::PCompositorParent::OnMessageReceived 	obj-firefox/ipc/ipdl/PCompositorParent.cpp:434
13 	libmozglue.so 	arena_malloc 	memory/mozjemalloc/jemalloc.c:4167
14 	libmozglue.so 	__wrap_realloc 	memory/mozjemalloc/jemalloc.c:4504
15 	libmozglue.so 	__wrap_realloc 	memory/mozjemalloc/jemalloc.c:4247
16 	libmozglue.so 	arena_dalloc 	memory/mozjemalloc/jemalloc.c:4675
17 	libxul.so 	mozilla::ipc::SyncChannel::OnDispatchMessage 	ipc/glue/SyncChannel.cpp:145
18 	libxul.so 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:400
19 	libxul.so 	RunnableMethod<IPC::ChannelProxy::Context, void 	ipc/chromium/src/base/tuple.h:383
20 	libxul.so 	libxul.so@0xa813b3 	
21 	libxul.so 	mozilla::ipc::RPCChannel::DequeueTask::Run 	obj-firefox/dist/include/mozilla/ipc/RPCChannel.h:425
22 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:337
23 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:345
24 	libxul.so 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:445
25 	libxul.so 	base::MessagePumpDefault::Run 	ipc/chromium/src/base/message_pump_default.cc:23
26 	libxul.so 	base::CreatePlatformFile 	ipc/chromium/src/base/platform_file_posix.cc:66
27 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:219
28 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:212
29 	libxul.so 	base::Thread::ThreadMain 	ipc/chromium/src/base/thread.cc:160

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Agl%3A%3AGLContextEGL%3A%3AReleaseSurface%28%29
James, Nicolas, any thoughts?
(Reporter)

Comment 2

5 years ago
There are two comments:
"tried to give a rating and it crashed after there was no aurora on play"
"all I did was hit back"

It's #5 crasher in 23.0a2.
It has stopped in 24.0a1 since 24.0a1/20130530. The working range might be (discontinuous across builds):
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e58336e81395&tochange=f66d956d188e
tracking-fennec: --- → ?
status-firefox24: affected → unaffected
tracking-firefox23: --- → ?
Keywords: topcrash
(In reply to Milan Sreckovic [:milan] from comment #1)
> James, Nicolas, any thoughts?
Could this have been fixed by something we've done?
Flags: needinfo?(snorp)
Flags: needinfo?(nical.bugzilla)
(In reply to Milan Sreckovic [:milan] from comment #3)
> Could this have been fixed by something we've done?

nothing that I'm aware of
Flags: needinfo?(nical.bugzilla)
Would be great to find the window of when this was fixed on 24. Can someone look at the range in comment 2?
tracking-firefox23: ? → +
Flags: needinfo?(milan)
Keywords: regressionwindow-wanted
Maybe 779a5cfb5395 for bug 874929.
Flags: needinfo?(milan)
(In reply to Milan Sreckovic [:milan] from comment #6)
> Maybe 779a5cfb5395 for bug 874929.

Don't think this is the right bug number, I just copied from the comments.  Jeff, do you have the bug number for the above change?
Flags: needinfo?(jgilbert)
tracking-fennec: ? → 23+
That was bug 876929, which only WebGL code, and so won't have fixed this bug.
Flags: needinfo?(jgilbert)
Assigning to Milan until the regressing bug is discovered and another owner can be tagged.
Assignee: nobody → milan
Flags: needinfo?(snorp)
(Reporter)

Comment 11

5 years ago
There are only one crash in 23.0b1 and no crashes in 24.0a2.
tracking-fennec: 23+ → ?
tracking-firefox23: + → ?
Keywords: topcrash

Updated

5 years ago
tracking-firefox23: ? → -
tracking-fennec: ? → -
Not a lot of crashes in the later releases, but there appear to be some in 26 and 28...
Assignee: milan → dglastonbury
No STR. I took a quick code review. The crash address is all over the place, not what I'd expect if we tried to access a null mGLContext pointer.

Is it possible that we try to pause after the compositor is destroyed?
Flags: needinfo?(snorp)
(In reply to Dan Glastonbury :djg :kamidphish from comment #13)
> No STR. I took a quick code review. The crash address is all over the place,
> not what I'd expect if we tried to access a null mGLContext pointer.
> 
> Is it possible that we try to pause after the compositor is destroyed?

It could be possible. There are a lot of bugs surrounding this right now. Benoit is, of course, the expert.
Flags: needinfo?(snorp) → needinfo?(bjacob)
Indeed, that is possible. CompositorParent::PauseComposition() and ResumeComposition() are called from Java code that doesn't know anything about the compositor's shutdown sequence, so this could race in arbitrary ways.

CompositorParent::PauseComposition() and ResumeComposition() should check two things that they currently don't:

1. that gl() returns non-null
2. that gl()->IsDestroyed() returns false.
Flags: needinfo?(bjacob)
Sorry, it's not CompositorParent::PauseComposition() and ResumeComposition() that should check this, but CompositorOGL::Pause() and Resume().
Benoit,
  If I add those checks, will you review? Then, I guess, we wait to see if the occurrence of crashes goes down.
Yup, looking forward to your patch! :-)
Created attachment 8362369 [details] [diff] [review]
Check for GL Context validity in Pause()/Resume().

Add the checks that Benoit suggested.
Attachment #8362369 - Flags: review?(bjacob)
Comment on attachment 8362369 [details] [diff] [review]
Check for GL Context validity in Pause()/Resume().

Review of attachment 8362369 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/opengl/CompositorOGL.cpp
@@ +1539,5 @@
>  void
>  CompositorOGL::Pause()
>  {
>  #ifdef MOZ_WIDGET_ANDROID
> +  GLContext* gl = mGLContext;

I don't see the need for these local 'gl' variables: gl() should be inline i.e. cost zero to call.
Attachment #8362369 - Flags: review?(bjacob) → review+
(In reply to Benoit Jacob [:bjacob] from comment #20)
> I don't see the need for these local 'gl' variables: gl() should be inline
> i.e. cost zero to call.

Except that it looks really ugly. I'll change it, if that's the desire.
Created attachment 8363268 [details] [diff] [review]
Check for GL Context validity in Pause()/Resume().

Change local gl variable to gl(). Carry r=bjacob.
Attachment #8362369 - Attachment is obsolete: true
Keywords: checkin-needed
Attachment #8363268 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/c4a84171a0e2
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Keywords: regressionwindow-wanted
You need to log in before you can comment on or make changes to this bug.