Bug 741166 - crash in AndroidGLController::ProvideEGLSurface
Status: VERIFIED FIXED
Whiteboard: [native-crash][startupcrash]
Keywords: crash, regression, reproducible, topcrash
Product: Core
Classification: Components
Component: Widget: Android
Version: 14 Branch
Platform: ARM Android
Importance: -- critical
Target Milestone: mozilla14
Assigned To: Joe Drew (not getting mail)
Duplicates: 686457
Blocks: 737949

Reported: 2012-04-01 00:13 PDT by Scoobidiver (away)
Modified: 2012-06-07 01:37 PDT


Attachments
restore waitForValidSurface (6.67 KB, patch)
2012-04-02 09:44 PDT, Joe Drew (not getting mail)
ajuma.bugzilla: review+

Description Scoobidiver (away) 2012-04-01 00:13:30 PDT
It first appeared in 14.0a1/20120315.

Signature 	AndroidGLController::ProvideEGLSurface
UUID	00c6f9c3-7173-467e-abd0-269962120401
Date Processed	2012-04-01 03:33:37
Uptime	4
Last Crash	26 seconds before submission
Install Age	38 seconds since version was first installed.
Install Time	2012-04-01 03:32:46
Product	FennecAndroid
Version	14.0a1
Build ID	20120331161857
Release Channel	nightly
OS	Linux
OS Version	0.0.0 Linux 2.6.32.9-perf #1 PREEMPT Tue Aug 9 21:02:37 2011 armv7l
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGSEGV
Crash Address	0x8
App Notes 	
EGL? EGL+ AdapterVendorID: semc, AdapterDeviceID: R800at.
AdapterDescription: 'Android, Model: 'R800at', Product: 'R800at_1248-6414', Manufacturer: 'Sony Ericsson', Hardware: 'semc''.
Sony Ericsson R800at
AT&T/R800at_1248-6414/R800at:2.3.3/3.0.1.B.0.270/8W7P:user/release-keys
EMCheckCompatibility	True

Frame 	Module 	Signature 	Source
0 	libdvm.so 	libdvm.so@0x43262 	
1 	libxul.so 	AndroidGLController::ProvideEGLSurface 	jni.h:706
2 	libxul.so 	mozilla::AndroidBridge::ProvideEGLSurface 	widget/android/AndroidBridge.cpp:1115
3 	libxul.so 	mozilla::gl::GLContextProviderEGL::CreateForWindow 	gfx/gl/GLContextProviderEGL.cpp:1507
4 	libxul.so 	mozilla::layers::LayerManagerOGL::CreateContext 	gfx/layers/opengl/LayerManagerOGL.cpp:172
5 	libxul.so 	mozilla::layers::CompositorParent::AllocPLayers 	LayerManagerOGL.h:110
6 	libxul.so 	mozilla::layers::PCompositorParent::OnMessageReceived 	obj-firefox/ipc/ipdl/PCompositorParent.cpp:470
7 	libxul.so 	mozilla::ipc::SyncChannel::OnDispatchMessage 	ipc/glue/SyncChannel.cpp:175
8 	libxul.so 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:432
9 	libxul.so 	RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run 	ipc/chromium/src/base/tuple.h:383
10 	libxul.so 	mozilla::ipc::RPCChannel::DequeueTask::Run 	RPCChannel.h:462
11 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:318
12 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:326
13 	libxul.so 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:426
14 	libxul.so 	base::MessagePumpDefault::Run 	ipc/chromium/src/base/message_pump_default.cc:23
15 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:208
16 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:201
17 	libxul.so 	base::Thread::ThreadMain 	ipc/chromium/src/base/thread.cc:156
18 	libxul.so 	ThreadFunc 	ipc/chromium/src/base/platform_thread_posix.cc:26
19 	libc.so 	libc.so@0x119a6 	
20 	libc.so 	libc.so@0x11572 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=AndroidGLController%3A%3AProvideEGLSurface
Comment 1 Scoobidiver (away) 2012-04-01 00:34:49 PDT
There is a spike in startup crashes from 14.0a1/20120331031108. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=92fe907ddac8&tochange=4c43cfe73516
It's likely a regression from bug 740244.

Crashes after the spike occur on:
* LGE LG-P925, LG-P990
* NEC N-01D
* Samsung Galaxy Nexus, Nexus S, SCH-I500
* SMDKV210
* Sony Ericsson R800at
Comment 2 Oleg Romashin (:romaxa) 2012-04-01 04:43:34 PDT
Err... how could a glx test for GLX drivers cause a problem with the Android EGL stuff?
Comment 3 Scoobidiver (away) 2012-04-01 05:34:55 PDT
(In reply to Oleg Romashin (:romaxa) from comment #2)
> Err... how glx test for GLX drivers caused problem with android EGL stuff?
Wrong bug; I picked it because of EGL in its title.
It might be a regression from bug 737437.
Comment 4 Scoobidiver (away) 2012-04-01 05:36:09 PDT
Other possible culprits are bug 739488 and bug 740190.
Comment 5 Ali Juma [:ajuma] 2012-04-01 07:10:49 PDT
(In reply to Scoobidiver from comment #3)
> It might a regression from bug 737437.

The patch for Bug 737437 that's in the regression range was also backed out within the regression range. The latest patch for Bug 737437 only made it to mozilla-central last night, so it's not in 14.0a1/20120331031108.

Since this crash is happening during the call to AndroidBridge::ProvideEGLSurface that happens right after AndroidBridge::RegisterCompositor, one plausible theory is that the call to GetJNIForThread in RegisterCompositor is failing, causing RegisterCompositor to return early without calling sController.Acquire, leading to a crash during the call to sController.ProvideEGLSurface.
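
A minimal sketch of that theory, not the actual Gecko code. FakeJNIEnv, FakeSurface and FakeGLController are hypothetical stand-ins for the real JNI and widget types, and the envForThread parameter stands in for whatever GetJNIForThread would return. It only illustrates how an early return from RegisterCompositor leaves the controller unacquired, and how a guard in ProvideEGLSurface would turn the resulting null dereference into a reportable failure:

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the real Gecko or JNI types.
struct FakeJNIEnv {};
struct FakeSurface { int id = 0; };

class FakeGLController {
public:
    // Models the idea behind sController.Acquire: bind the controller to a JNI env.
    void Acquire(FakeJNIEnv* env) {
        assert(env != nullptr);
        mEnv = env;
    }

    bool IsAcquired() const { return mEnv != nullptr; }

    // Guarded variant: reports failure instead of calling through a null env.
    bool ProvideEGLSurface(FakeSurface* out) const {
        if (!mEnv || !out) {
            return false;
        }
        out->id = 1;  // placeholder for the surface obtained through JNI
        return true;
    }

private:
    FakeJNIEnv* mEnv = nullptr;
};

// Models the suspected early return: if no env is available for this thread,
// Acquire is never called and the controller stays unbound.
bool RegisterCompositor(FakeGLController& controller, FakeJNIEnv* envForThread) {
    if (!envForThread) {
        return false;
    }
    controller.Acquire(envForThread);
    return true;
}
```

In this sketch, calling RegisterCompositor with a null env leaves IsAcquired() false, so a later ProvideEGLSurface call returns false rather than crashing.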

kats, is it likely/possible that the call to GetJNIForThread is failing?
Comment 6 Kartikaya Gupta (email:kats@mozilla.com) 2012-04-01 09:41:35 PDT
It's possible that GetJNIForThread is failing, but I wouldn't consider it likely. I don't think I've ever seen that fail before, but I don't know the specifics of how dalvik finds a JNIEnv for a random pthread.
Comment 7 Cristian Nicolae (:xti) 2012-04-02 04:34:02 PDT
This crash still occurs on the latest Nightly build:

Steps to reproduce:
1. Open Fennec
2. Right after performing step 1, tap on URL Bar
3. Tap on Bookmarks tab and then on History tab (keep switching between them quickly until the crash occurs)

Expected result:
No crash should occur after step 3

Actual result:
https://crash-stats.mozilla.com/report/index/b5786873-7b9b-4948-bcdd-55cb12120402

--
Firefox 14.0a1 (2012-04-01)
Devices: HTC Desire (2.2), Motorola Droid 2 (2.3.3), Samsung Nexus (4.0.2)
Comment 8 Cristian Nicolae (:xti) 2012-04-02 04:43:34 PDT
Here is a video of this crash: http://youtu.be/lMMry-SA_S4
Comment 9 Joe Drew (not getting mail) 2012-04-02 09:44:52 PDT
Created attachment 611489
restore waitForValidSurface

Ali tells me that, before we call egl functions on our window, we have to wait for a valid EGL surface. I removed this code in bug 737949 because I didn't think it was necessary any more, but this is apparently an Android requirement.
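
For illustration only, the general "wait until the surface is valid" pattern could look roughly like the sketch below. This is not the attached patch: SurfaceGate and its method names are hypothetical, and the real waitForValidSurface lives in the Fennec code, which is not reproduced in this bug. The idea is that the thread about to make EGL calls blocks on a condition variable until the UI side reports that the surface exists:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical gate between the UI thread (which learns when the Android
// surface is created or destroyed) and the thread that makes EGL calls.
class SurfaceGate {
public:
    // Called from the UI side once the surface is ready for EGL use.
    void NotifySurfaceValid() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mValid = true;
        }
        mCond.notify_all();
    }

    // Called from the UI side when the surface goes away.
    void NotifySurfaceDestroyed() {
        std::lock_guard<std::mutex> lock(mMutex);
        mValid = false;
    }

    // Called before any EGL function touches the window. Blocks until the
    // surface is valid or the timeout expires; returns whether it is valid.
    bool WaitForValidSurface(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mMutex);
        return mCond.wait_for(lock, timeout, [this] { return mValid; });
    }

private:
    std::mutex mMutex;
    std::condition_variable mCond;
    bool mValid = false;
};
```

The timeout is an assumption of this sketch so that a caller can bail out instead of hanging forever; a caller would skip its EGL setup whenever WaitForValidSurface returns false.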

I can't reproduce the crash mentioned in comment 7 with this fix.
Comment 10 Robert Kaiser 2012-04-02 12:27:33 PDT
Comment on attachment 611489
restore waitForValidSurface

I'm not a reviewer or such, but shouldn't some comment(s) be added to make clear why this is needed? I mean, if you thought it could be removed yourself before being bitten by this, anyone else could run into the same wrong thought and should be warned when looking into the code, right? ;-)
Comment 11 Joe Drew (not getting mail) 2012-04-02 12:49:23 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c1d6080a98f
Comment 12 Joe Drew (not getting mail) 2012-04-02 13:34:19 PDT
And in response to KaiRo's suggestion: https://hg.mozilla.org/integration/mozilla-inbound/rev/7da9ecd5424f
Comment 14 Cristian Nicolae (:xti) 2012-04-03 04:44:12 PDT
I am still able to reproduce this issue by performing the steps from comment #7 or by tapping continuously on the URL Bar just after Fennec is opened. Reopening the bug.

--
Firefox 14.0a1 (2012-04-03)
Device: HTC Desire Z
OS: Android 2.3.3
Comment 15 Scoobidiver (away) 2012-04-03 05:25:03 PDT
The build from April 3rd (http://hg.mozilla.org/mozilla-central/rev/95df15895e02) doesn't contain the fix.
Comment 16 Cristian Nicolae (:xti) 2012-04-04 08:21:23 PDT
Verified fixed on:

Firefox 14.0a1 (2012-04-04)
Device: Samsung Galaxy S
OS: Android 2.2
Comment 17 JP Rosevear [:jpr] 2012-04-05 10:28:22 PDT
*** Bug 686457 has been marked as a duplicate of this bug. ***
