Bug 741166 - crash in AndroidGLController::ProvideEGLSurface
Summary: crash in AndroidGLController::ProvideEGLSurface
Keywords: crash, regression, reproducible, topcrash
Product: Core
Classification: Components
Component: Widget: Android
Version: 14 Branch
Platform: ARM Android
Importance: -- critical
Target Milestone: mozilla14
Assigned To: Joe Drew (not getting mail)
: Jim Chen [:jchen] [:darchons]
Duplicates: 686457
Depends on:
Blocks: 737949
Reported: 2012-04-01 00:13 PDT by Scoobidiver (away)
Modified: 2012-06-07 01:37 PDT

restore waitForValidSurface (6.67 KB, patch)
2012-04-02 09:44 PDT, Joe Drew (not getting mail)
ajuma.bugzilla: review+

Description Scoobidiver (away) 2012-04-01 00:13:30 PDT
It first appeared in 14.0a1/20120315.

Signature 	AndroidGLController::ProvideEGLSurface
UUID	00c6f9c3-7173-467e-abd0-269962120401
Date Processed	2012-04-01 03:33:37
Uptime	4
Last Crash	26 seconds before submission
Install Age	38 seconds since version was first installed.
Install Time	2012-04-01 03:32:46
Product	FennecAndroid
Version	14.0a1
Build ID	20120331161857
Release Channel	nightly
OS	Linux
OS Version	0.0.0 Linux #1 PREEMPT Tue Aug 9 21:02:37 2011 armv7l
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGSEGV
Crash Address	0x8
App Notes 	
EGL? EGL+ AdapterVendorID: semc, AdapterDeviceID: R800at.
AdapterDescription: 'Android, Model: 'R800at', Product: 'R800at_1248-6414', Manufacturer: 'Sony Ericsson', Hardware: 'semc''.
Sony Ericsson R800at
EMCheckCompatibility	True

Frame 	Module 	Signature 	Source
1 	AndroidGLController::ProvideEGLSurface 	jni.h:706
2 	mozilla::AndroidBridge::ProvideEGLSurface 	widget/android/AndroidBridge.cpp:1115
3 	mozilla::gl::GLContextProviderEGL::CreateForWindow 	gfx/gl/GLContextProviderEGL.cpp:1507
4 	mozilla::layers::LayerManagerOGL::CreateContext 	gfx/layers/opengl/LayerManagerOGL.cpp:172
5 	mozilla::layers::CompositorParent::AllocPLayers 	LayerManagerOGL.h:110
6 	mozilla::layers::PCompositorParent::OnMessageReceived 	obj-firefox/ipc/ipdl/PCompositorParent.cpp:470
7 	mozilla::ipc::SyncChannel::OnDispatchMessage 	ipc/glue/SyncChannel.cpp:175
8 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:432
9 	RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run 	ipc/chromium/src/base/tuple.h:383
10 	mozilla::ipc::RPCChannel::DequeueTask::Run 	RPCChannel.h:462
11 	MessageLoop::RunTask 	ipc/chromium/src/base/
12 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/
13 	MessageLoop::DoWork 	ipc/chromium/src/base/
14 	base::MessagePumpDefault::Run 	ipc/chromium/src/base/
15 	MessageLoop::RunInternal 	ipc/chromium/src/base/
16 	MessageLoop::Run 	ipc/chromium/src/base/
17 	base::Thread::ThreadMain 	ipc/chromium/src/base/
18 	ThreadFunc 	ipc/chromium/src/base/

More reports at:
Comment 1 Scoobidiver (away) 2012-04-01 00:34:49 PDT
There is a spike in startup crashes from 14.0a1/20120331031108. The regression range for the spike is:
It's likely a regression from bug 740244.

Crashes after the spike occur on:
* LGE LG-P925, LG-P990
* NEC N-01D
* Samsung Galaxy Nexus, Nexus S, SCH-I500
* SMDKV210
* Sony Ericsson R800at
Comment 2 Oleg Romashin (:romaxa) 2012-04-01 04:43:34 PDT
Err... how could a GLX test for GLX drivers cause a problem with the Android EGL stuff?
Comment 3 Scoobidiver (away) 2012-04-01 05:34:55 PDT
(In reply to Oleg Romashin (:romaxa) from comment #2)
> Err... how could a GLX test for GLX drivers cause a problem with the Android EGL stuff?
That was the wrong bug; I picked it because of EGL in its title.
It might be a regression from bug 737437.
Comment 4 Scoobidiver (away) 2012-04-01 05:36:09 PDT
Other possible culprits are bug 739488 and bug 740190.
Comment 5 Ali Juma [:ajuma] 2012-04-01 07:10:49 PDT
(In reply to Scoobidiver from comment #3)
> It might be a regression from bug 737437.

The patch for Bug 737437 that's in the regression range was also backed out within the regression range. The latest patch for Bug 737437 only made it to mozilla-central last night, so it's not in 14.0a1/20120331031108.

Since this crash is happening during the call to AndroidBridge::ProvideEGLSurface that happens right after AndroidBridge::RegisterCompositor, one plausible theory is that the call to GetJNIForThread in RegisterCompositor is failing, causing RegisterCompositor to return early without calling sController.Acquire, leading to a crash during the call to sController.ProvideEGLSurface.
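
The theory above can be sketched as a small, self-contained model. This is only an illustration under stated assumptions, not the Gecko code: `Controller`, `Env`, and `GetEnvForThread` are hypothetical stand-ins for `sController`, `JNIEnv`, and `GetJNIForThread`, and the failure is simulated with a flag. The sketch shows how an early return skips `Acquire`, and how a guard in the later call turns the would-be crash into a reportable failure.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-thread JNI environment.
struct Env {};

// Hypothetical stand-in for the controller that must be acquired before use.
struct Controller {
    bool acquired = false;
    void Acquire() { acquired = true; }
};

// Stand-in for GetJNIForThread: returns nullptr to model a lookup failure.
Env* GetEnvForThread(bool simulateFailure) {
    static Env env;
    return simulateFailure ? nullptr : &env;
}

// Models the suspected path: if no environment is available, return early.
// The controller is then never acquired.
bool RegisterCompositor(Controller& controller, bool simulateFailure) {
    Env* env = GetEnvForThread(simulateFailure);
    if (!env) {
        return false;  // early return: Acquire() is skipped
    }
    controller.Acquire();
    return true;
}

// Guarded version of the follow-up call: report failure instead of using
// a controller that was never acquired.
bool ProvideEGLSurface(const Controller& controller) {
    if (!controller.acquired) {
        return false;  // unguarded code would touch invalid state here
    }
    return true;
}
```

In this model, calling `RegisterCompositor(c, true)` followed by `ProvideEGLSurface(c)` returns false from both calls. Whether the real crash follows this path is exactly the open question in this comment.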

kats, is it likely/possible that the call to GetJNIForThread is failing?
Comment 6 Kartikaya Gupta 2012-04-01 09:41:35 PDT
It's possible that GetJNIForThread is failing, but I wouldn't consider it likely. I don't think I've ever seen that fail before, but I don't know the specifics of how dalvik finds a JNIEnv for a random pthread.
Comment 7 Cristian Nicolae (:xti) 2012-04-02 04:34:02 PDT
This crash still occurs on the latest Nightly build:

Steps to reproduce:
1. Open Fennec
2. Right after performing step 1, tap on the URL Bar
3. Tap on the Bookmarks tab and then on the History tab (keep switching between them quickly until the crash occurs)

Expected result:
No crash should occur after step 3

Actual result:

Firefox 14.0a1 (2012-04-01)
Devices: HTC Desire (2.2), Motorola Droid 2 (2.3.3), Samsung Nexus (4.0.2)
Comment 8 Cristian Nicolae (:xti) 2012-04-02 04:43:34 PDT
Here is a video about this crash:
Comment 9 Joe Drew (not getting mail) 2012-04-02 09:44:52 PDT
Created attachment 611489 [details] [diff] [review]
restore waitForValidSurface

Ali tells me that, before we call egl functions on our window, we have to wait for a valid EGL surface. I removed this code in bug 737949 because I didn't think it was necessary any more, but this is apparently an Android requirement.

I can't reproduce the crash mentioned in comment 7 with this fix.
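
The patch itself is not reproduced in this thread, so the following is only a rough sketch of the general idea: block the thread that wants to make EGL calls until another thread reports that the surface is valid. `SurfaceGate` and all of its members are hypothetical names chosen for illustration, and the timeout is an assumption added so the sketch cannot hang forever.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical gate between the thread that owns the window surface and
// the thread that wants to make EGL calls on it.
class SurfaceGate {
public:
    // Called when the platform reports that the surface is ready.
    void SurfaceChanged() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mValid = true;
        }
        mCond.notify_all();
    }

    // Called when the platform tears the surface down.
    void SurfaceDestroyed() {
        std::lock_guard<std::mutex> lock(mMutex);
        mValid = false;
    }

    // Blocks until the surface is valid or the timeout expires.
    // Returns true only if the surface is valid on return.
    bool WaitForValidSurface(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mMutex);
        return mCond.wait_for(lock, timeout, [this] { return mValid; });
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    bool mValid = false;
};
```

A caller would check the result of `WaitForValidSurface` before creating an EGL surface for the window and bail out if it returns false.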
Comment 10 Robert Kaiser 2012-04-02 12:27:33 PDT
Comment on attachment 611489 [details] [diff] [review]
restore waitForValidSurface

I'm not a reviewer or such, but shouldn't some comment(s) be added to make clear why this is needed? I mean, if you thought it could be removed yourself before being bitten by this, anyone else could run into the same wrong thought and should be warned when looking into the code, right? ;-)
Comment 11 Joe Drew (not getting mail) 2012-04-02 12:49:23 PDT
Comment 12 Joe Drew (not getting mail) 2012-04-02 13:34:19 PDT
And in response to KaiRo's suggestion:
Comment 14 Cristian Nicolae (:xti) 2012-04-03 04:44:12 PDT
I am still able to reproduce this issue by performing the steps from comment #7 or by tapping continuously on the URL Bar just after Fennec is opened. Reopening bug.

Firefox 14.0a1 (2012-04-03)
Device: HTC Desire Z
OS: Android 2.3.3
Comment 15 Scoobidiver (away) 2012-04-03 05:25:03 PDT
The build from April 3rd doesn't contain the fix.
Comment 16 Cristian Nicolae (:xti) 2012-04-04 08:21:23 PDT
Verified fixed on:

Firefox 14.0a1 (2012-04-04)
Device: Samsung Galaxy S
OS: Android 2.2
Comment 17 JP Rosevear [:jpr] 2012-04-05 10:28:22 PDT
*** Bug 686457 has been marked as a duplicate of this bug. ***
