Closed Bug 737949 Opened 13 years ago Closed 13 years ago

Remove GLThread, don't initialize EGL from Java code

Categories

(Firefox for Android Graveyard :: General, defect)

14 Branch
ARM
Android
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 14

People

(Reporter: scoobidiver, Assigned: joe)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [native-crash][gfx][startupcrash])

Crash Data

Attachments

(1 file, 2 obsolete files)

It first appeared in 14.0a1/20120319 and occurs on Galaxy Nexus at startup. The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e94edfdb1f5b&tochange=58a2cd0203ee Signature libGLESv2_POWERVR_SGX540_120.so@0xb8ae More Reports Search UUID b2640624-477f-494e-9317-122952120321 Date Processed 2012-03-21 13:26:39 Uptime 2 Last Crash 2.5 hours before submission Install Age 1.9 hours since version was first installed. Install Time 2012-03-21 11:34:33 Product FennecAndroid Version 14.0a1 Build ID 20120321031151 Release Channel nightly OS Linux OS Version 0.0.0 Linux 3.0.8-g8e0f3bc #1 SMP PREEMPT Wed Dec 14 18:52:51 PST 2011 armv7l Build Architecture arm Build Architecture Info Crash Reason SIGSEGV Crash Address 0x0 App Notes EGL? EGL+ AdapterVendorID: tuna, AdapterDeviceID: Galaxy Nexus. AdapterDescription: 'Android, Model: 'Galaxy Nexus', Product: 'yakju', Manufacturer: 'samsung', Hardware: 'tuna''. samsung Galaxy Nexus google/yakju/maguro:4.0.1/ITL41F/228551:user/release-keys EMCheckCompatibility True Frame Module Signature [Expand] Source 0 libc.so libc.so@0xdbac 1 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0xb8ae 2 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0xda66 3 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0x5a38e 4 libGLESv2_POWERVR_SGX540_120.so libGLESv2_POWERVR_SGX540_120.so@0xebc6 ... More reports at: https://crash-stats.mozilla.com/report/list?signature=libGLESv2_POWERVR_SGX540_120.so%400xb8ae
Bug 735353 is within libGLESv1_POWERVR_SGX540_120.so; this bug is wihin libGLESv2_POWERVR_SGX540_120.so
Crash Signature: [@ libGLESv2_POWERVR_SGX540_120.so@0xb8ae] → [@ libGLESv2_POWERVR_SGX540_120.so@0xb8ae] [@ memcpy | libGLESv2_POWERVR_SGX540_120.so@0x6df6 ] [@ libGLESv2_POWERVR_SGX540_120.so@0xb042] [@ memcpy | libGLESv2_POWERVR_SGX540_120.so@0x6e7e] [@ libGLESv2_POWERVR_SGX540_120.so@0xbefe] [@ libGLESv2_POW…
With all the dupes, it's no longer specific to Nexus.
Summary: crash in libGLESv2_POWERVR_SGX540_120 on Galaxy Nexus → crash in libGLESv2_POWERVR_SGX540_120
There are different versions of this DLL, but I don't think crashes in 0x1..., 0x6..., 0x7..., 0xa..., 0xb... are correlated. If you think so, bug 736436 and bug 736666 are a dupe of this bug.
Depends on: 738188
libGLESv2_POWERVR_SGX540_120.so@0xebc6 is glDrawArrays, so this does indeed seem to be a dupe of Bug 736666 and Bug 737957.
(In reply to Ali Juma [:ajuma] from comment #7) > libGLESv2_POWERVR_SGX540_120.so@0xebc6 is glDrawArrays How do you know this so that we don't group unrelated signatures in a single bug or file duplicated bugs?
Crash Signature: [@ libGLESv2_POWERVR_SGX540_120.so@0xb8ae] [@ memcpy | libGLESv2_POWERVR_SGX540_120.so@0x6df6 ] [@ libGLESv2_POWERVR_SGX540_120.so@0xb042] [@ memcpy | libGLESv2_POWERVR_SGX540_120.so@0x6e7e] [@ libGLESv2_POWERVR_SGX540_120.so@0xbefe] [@ libGLESv2_POW… → [@ libGLESv2_POWERVR_SGX540_120.so@0xb8ae]
Summary: crash in libGLESv2_POWERVR_SGX540_120 → crash in libGLESv2_POWERVR_SGX540_120@0xb8ae
No longer depends on: 738188
(In reply to Scoobidiver from comment #8) > (In reply to Ali Juma [:ajuma] from comment #7) > > libGLESv2_POWERVR_SGX540_120.so@0xebc6 is glDrawArrays > How do you know this so that we don't group unrelated signatures in a single > bug or file duplicated bugs? I have a local copy of drivers from some devices, and I'm using addr2line to map addresses to function names.
I get a similar stacktrace with this, I filed bug 736666 for it.
Keywords: topcrash
Assignee: nobody → joe
This is in glClear, apparently. joe@ubuntu:~/moz-gdb/lib/0149A4550C01701E/system/lib$ ~/android-ndk-r5c/toolchains/arm-linux-androideabi-4.4.3/prebuilt/linux-x86/bin/arm-linux-androideabi-addr2line -C -f -e libGLESv2_POWERVR_SGX540_120.so 0xb8ae glClear ??:0
Summary: crash in libGLESv2_POWERVR_SGX540_120@0xb8ae → crash in glClear
Summary: crash in glClear → startup crash in glClear called from java code
Ah, it's probably at some random address that happens to be near glClear - we have no such calls in our codebase. However, up on the callstack there's glDrawElements, which seems much more likely. However, we still don't have any calls to glDrawElements in our code. Maybe it's being called by general Android code? STR desperately needed.
Summary: startup crash in glClear called from java code → startup crash under glDrawElements called from java code
Keywords: qawanted
In an effort to reduce the number of moving parts (and event queues to manage), I've tried removing GLThread, instead running all the GLController methods directly from within FlexibleGLSurfaceView. This has lead to a crash, unfortunately. The compositor thread, when spinning up, asks the Java code through JNI for an EGLSurface (GLController::provideEGLSurface). From within this, eglCreateWindowSurface is throwing an error, with the following in adb lolcat: E/GeckoGLController( 2190): provideEGLSurface E/SurfaceTexture( 117): [SurfaceView] connect: already connected (cur=1, req=1) E/libEGL ( 2190): EGLNativeWindowType 0x1bc658 already connected to another API E/libEGL ( 2190): eglCreateWindowSurface:374 error 300b (EGL_BAD_NATIVE_WINDOW) It seems that eglCreateWindowSurface doesn't want to create a surface for a native window that is already bound to a surface. However, I've verified that the call to provideEGLSurface happens after GLController::disposeGLContext is called, so I'm not sure what else needs to happen for the native window to be disconnected.
Does the fix I landed for bug 738188 fix this?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14) > Does the fix I landed for bug 738188 fix this? There have been no crashes with this signature since 14.0a1/20120323031214, long before the patch in bug 738188 landed. So either it's fixed by another bug or Galaxy Nexus users gave up because of those startup crashes. That may be also under other close libGLESv2_POWERVR_SGX540_120.so signatures for devices different from Galaxy Nexus.
Summary: startup crash under glDrawElements called from java code → Remove GLThread, don't initialize EGL from Java code
Attached patch initial stab at this (obsolete) — Splinter Review
This is an initial stab at this. It removes GLThread and stops initializing OpenGL from Java before Gecko starts up.
Attachment #610701 - Flags: feedback?(jmuizelaar)
Attachment #610701 - Flags: feedback?(bugmail.mozilla)
Attachment #610701 - Flags: feedback?(bgirard)
Attachment #610701 - Flags: feedback?(ajuma)
Comment on attachment 610701 [details] [diff] [review] initial stab at this Provided this doesn't regress much, I like it.
Attachment #610701 - Flags: feedback?(jmuizelaar) → feedback+
Comment on attachment 610701 [details] [diff] [review] initial stab at this Review of attachment 610701 [details] [diff] [review]: ----------------------------------------------------------------- Hm, looks ok to me but I have no confidence in my ability to review this code. pcwalton would be a better reviewer, since he wrote the FlexibleGLSurfaceView.
Attachment #610701 - Flags: feedback?(bugmail.mozilla) → feedback+
Comment on attachment 610701 [details] [diff] [review] initial stab at this This makes our GL context handling much saner.
Attachment #610701 - Flags: feedback?(ajuma) → feedback+
Attachment #610701 - Flags: feedback?(bgirard) → feedback?(pwalton)
The reason that FlexibleGLSurfaceView works this way is so that it can render a pannable/zoomable screenshot maintained entirely in Java before Gecko comes up. This was a product directive from the beginning. If Gecko itself is responsible for bringing up GL in all cases, then this cannot be done. I'm ok with removing that possibility personally (I don't think it helps much at this point), but I can't unilaterally approve it.
OK. Let me deal with product; if you can do feedback on the patch, or possibly an initial review, I'd appreciate it.
We do have Fennec drivers and Product approval for removing the code. We are communicating the decision outward. An initial feedback or review on the patch would be great.
Comment on attachment 610701 [details] [diff] [review] initial stab at this Review of attachment 610701 [details] [diff] [review]: ----------------------------------------------------------------- f=me then.
Attachment #610701 - Flags: feedback?(pwalton) → feedback+
This also seems to fix the drawing of scrollbars and backgrounds.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #24) > This also seems to fix the drawing of scrollbars and backgrounds. This really shouldn't affect that I don't think... Are you sure about this? The scrollbars/backgrounds disappearing is intermittent - I sometimes see them, sometimes I don't, and it can depend on the page, scroll position, etc.
(In reply to Chris Lord [:cwiiis] from comment #25) > (In reply to Jeff Muizelaar [:jrmuizel] from comment #24) > > This also seems to fix the drawing of scrollbars and backgrounds. > > This really shouldn't affect that I don't think... Are you sure about this? > The scrollbars/backgrounds disappearing is intermittent - I sometimes see > them, sometimes I don't, and it can depend on the page, scroll position, etc. I am not sure.
Attached patch v1 (obsolete) — Splinter Review
Almost the same as the previous version of this patch, but with more unused code removed.
Attachment #610701 - Attachment is obsolete: true
Attachment #610935 - Flags: review?(pwalton)
Whiteboard: [native-crash][gfx][startupcrash] → [native-crash][gfx][startupcrash][autoland-try]
Whiteboard: [native-crash][gfx][startupcrash][autoland-try] → [native-crash][gfx][startupcrash][autoland-in-queue]
Comment on attachment 610935 [details] [diff] [review] v1 Review of attachment 610935 [details] [diff] [review]: ----------------------------------------------------------------- r=me
Attachment #610935 - Flags: review?(pwalton) → review+
Autoland Patchset: Patches: 610935 Branch: mozilla-central => try Patch 610935 could not be applied to mozilla-central. patching file mobile/android/base/Makefile.in Hunk #1 FAILED at 118 1 out of 1 hunks FAILED -- saving rejects to file mobile/android/base/Makefile.in.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir Patch 610935 could not be applied to mozilla-central. patching file mobile/android/base/Makefile.in Hunk #1 FAILED at 118 1 out of 1 hunks FAILED -- saving rejects to file mobile/android/base/Makefile.in.rej patching file mobile/android/base/gfx/FlexibleGLSurfaceView.java Hunk #1 FAILED at 39 Hunk #2 FAILED at 56 Hunk #3 FAILED at 91 3 out of 3 hunks FAILED -- saving rejects to file mobile/android/base/gfx/FlexibleGLSurfaceView.java.rej patching file mobile/android/base/gfx/GLController.java Hunk #1 FAILED at 49 Hunk #2 FAILED at 74 Hunk #3 FAILED at 146 Hunk #4 FAILED at 236 4 out of 4 hunks FAILED -- saving rejects to file mobile/android/base/gfx/GLController.java.rej unable to find 'mobile/android/base/gfx/GLThread.java' for patching 1 out of 1 hunks FAILED -- saving rejects to file mobile/android/base/gfx/GLThread.java.rej patching file mobile/android/base/gfx/LayerView.java Hunk #1 FAILED at 97 1 out of 1 hunks FAILED -- saving rejects to file mobile/android/base/gfx/LayerView.java.rej patching file widget/android/AndroidBridge.cpp Hunk #1 FAILED at 1104 1 out of 1 hunks FAILED -- saving rejects to file widget/android/AndroidBridge.cpp.rej patching file widget/android/AndroidFlexViewWrapper.cpp Hunk #1 FAILED at 45 Hunk #2 FAILED at 81 2 out of 2 hunks FAILED -- saving rejects to file widget/android/AndroidFlexViewWrapper.cpp.rej patching file widget/android/AndroidFlexViewWrapper.h Hunk #1 FAILED at 52 1 out of 1 hunks FAILED -- saving rejects to file widget/android/AndroidFlexViewWrapper.h.rej patch failed, unable to continue (try -v) patch failed, rejects left in working dir Patchset could not be applied and pushed.
Whiteboard: [native-crash][gfx][startupcrash][autoland-in-queue] → [native-crash][gfx][startupcrash]
Blocks: 740898
Comment on attachment 610935 [details] [diff] [review] v1 This makes it so the code from GLThread.renderFrame() is called unconditionally, as though mGLThread != null. We don't want to call it at all.
Attachment #610935 - Flags: review-
Attached patch v2Splinter Review
Jeff pointed out that I had moved the contents of GLThread.requestRender into FlexibleGLSurfaceView.requestRender, but it was normally used only before the C++ compositor started up (and mGLThread was nulled out). The refactored code would, however, always be run, regardless of whether the C++ compositor was started up. Since we've made the decision to stop using OpenGL from Java on startup, we don't need the GLThread.requestRender code at all, so I've removed it. This is also rebased on current m-c trunk. Carrying forward r=pcwalton, and adding r=jrmuizel.
Attachment #610935 - Attachment is obsolete: true
Attachment #611037 - Flags: review+
Try run for 320256571f7a is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=320256571f7a Results (out of 237 total builds): exception: 1 success: 214 warnings: 22 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-320256571f7a
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 14
Keywords: qawanted
Depends on: 741166
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: