Closed Bug 785597 Opened 12 years ago Closed 12 years ago

crash in AndroidGLController::SetGLVersion

Tracking

(firefox17 wontfix, firefox18 wontfix, firefox19 affected, firefox20 affected, firefox21 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla21

Tracking Flags:

Tracking

Status

firefox17

---

wontfix

firefox18

---

wontfix

firefox19

---

affected

firefox20

---

affected

firefox21

---

fixed

People

(Reporter: scoobidiver, Assigned: kats)

References

Details

(Keywords: crash, Whiteboard: [native-crash][startupcrash])

Crash Data

Attachments

(6 files, 3 obsolete files)

Patch to reliably trigger crashes during Android mochitests 12 years ago Vladan Djeric (:vladan) 1.64 KB, patch		Details \| Diff \| Splinter Review
nsWindow creation backtraces 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 35.64 KB, text/plain		Details
(1/3) - Cleanup - delete unused code 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 4.10 KB, patch	snorp : review+ kats : checkin+	Details \| Diff \| Splinter Review
(2/3) - Delete even more useless code 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 5.32 KB, patch	snorp : review+ kats : checkin+	Details \| Diff \| Splinter Review
(3/4) - Don't create a layer manager for a hidden window 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 1.86 KB, patch		Details \| Diff \| Splinter Review
(3/3) Don't create a layer manager for a hidden window 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 2.02 KB, patch	snorp : review+ kats : checkin+	Details \| Diff \| Splinter Review
(4/3) Allow the compositor to be created earlier 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 1.96 KB, patch		Details \| Diff \| Splinter Review
(5/3) Kill the deadlock 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 7.83 KB, patch		Details \| Diff \| Splinter Review
(4/3) Allow the compositor to be created earlier 12 years ago Kartikaya Gupta (email:kats@mozilla.staktrace.com) 6.02 KB, patch	snorp : review+	Details \| Diff \| Splinter Review

Scoobidiver (away)

Reporter

Description

•

12 years ago

There are seven crashes over the last four weeks.

Signature 	_JNIEnv::CallVoidMethod | AndroidGLController::SetGLVersion More Reports Search
UUID	d9810d76-f5ea-4398-b387-318132120825
Date Processed	2012-08-25 02:32:38
Uptime	8
Last Crash	13 seconds before submission
Install Age	1.4 minutes since version was first installed.
Install Time	2012-08-25 02:30:21
Product	FennecAndroid
Version	17.0a1
Build ID	20120824101755
Release Channel	nightly
OS	Linux
OS Version	0.0.0 Linux 3.1.10-digetx-thor-4.6 #411 SMP PREEMPT Mon Jun 11 23:56:23 EEST 2012 armv7l
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGSEGV
Crash Address	0x0
App Notes 	
AdapterDescription: 'NVIDIA Corporation -- NVIDIA Tegra -- OpenGL ES 2.0 14.01002 -- Model: Iconia A500, Product: full_a500, Manufacturer: Acer, Hardware: picasso'
Acer Iconia A500
acer/a500_ww_gen1/picasso:4.0.4/IMM76L/1333032611:user/release-keys
EMCheckCompatibility	True
Adapter Vendor ID	NVIDIA Corporation
Adapter Device ID	NVIDIA Tegra

Frame 	Module 	Signature 	Source
0 	libdvm.so 	libdvm.so@0x58134 	
1 	libdvm.so 	libdvm.so@0x58113 	
2 	libxul.so 	_JNIEnv::CallVoidMethod 	jni.h:631
3 	libxul.so 	AndroidGLController::SetGLVersion 	AndroidLayerViewWrapper.cpp:54
4 	libxul.so 	mozilla::AndroidBridge::RegisterCompositor 	AndroidBridge.cpp:1219
5 	libxul.so 	mozilla::gl::GLContextProviderEGL::CreateForWindow 	GLContextProviderEGL.cpp:2201
6 	libxul.so 	mozilla::layers::LayerManagerOGL::CreateContext 	LayerManagerOGL.cpp:144
7 	libxul.so 	mozilla::layers::CompositorParent::AllocPLayers 	LayerManagerOGL.h:75
8 	libxul.so 	mozilla::layers::PCompositorParent::OnMessageReceived 	PCompositorParent.cpp:485
9 	libxul.so 	mozilla::ipc::SyncChannel::OnDispatchMessage 	SyncChannel.cpp:143
10 	libxul.so 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	RPCChannel.cpp:400
11 	libxul.so 	RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run 	tuple.h:383
12 	libxul.so 	mozilla::ipc::RPCChannel::DequeueTask::Run 	RPCChannel.h:425
13 	libxul.so 	MessageLoop::RunTask 	message_loop.cc:326
14 	libxul.so 	MessageLoop::DeferOrRunPendingTask 	message_loop.cc:334
15 	libxul.so 	MessageLoop::DoWork 	message_loop.cc:434
16 	libxul.so 	base::MessagePumpDefault::Run 	message_pump_default.cc:23
17 	libxul.so 	MessageLoop::RunInternal 	message_loop.cc:208
18 	libxul.so 	MessageLoop::Run 	message_loop.cc:201
19 	libxul.so 	base::Thread::ThreadMain 	thread.cc:156
20 	libxul.so 	ThreadFunc 	platform_thread_posix.cc:31
21 	libc.so 	libc.so@0x12f3a 	
22 	libc.so 	libc.so@0x12a66

More reports at:
https://crash-stats.mozilla.com/report/list?signature=_JNIEnv%3A%3ACallVoidMethod+|+AndroidGLController%3A%3ASetGLVersion

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 1

•

12 years ago

The only possibility I can think of here is that the call to NewGlobalRef in AndroidGLController::Acquire returns null, so mJObj is null on the call to SetGLVersion. There isn't really any sane way to recover from that situation; putting in a guard here will will just move the crash elsewhere, if that's the cause.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 2

•

12 years ago

Also on the 17 branch this hasn't happened since the July 18 build, so there's a good chance it was caused by something else and has since been fixed. We should wait and see if it there are any crashes from this in newer code before doing anything.

Scoobidiver (away)

Reporter

Comment 3

•

12 years ago

(In reply to Kartikaya Gupta (:kats) from comment #2)
> Also on the 17 branch this hasn't happened since the July 18 build
No, the latest crashes happened in 17.0a1/20120824, which is recent.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 4

•

12 years ago

Oh, my mistake. Those were marked as "dup" so I ignored them. But they don't appear to be dups of anything sane.

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Whiteboard: [native-crash] → [native-crash][startupcrash]

Scoobidiver (away)

Reporter

Comment 5

•

12 years ago

It still happens in the trunk: bp-92d3b851-5d72-496f-bf03-b72d12121105.

Crash Signature: [@ _JNIEnv::CallVoidMethod | AndroidGLController::SetGLVersion] → [@ _JNIEnv::CallVoidMethod | AndroidGLController::SetGLVersion] [@ JNI_CreateJavaVM | _JNIEnv::CallVoidMethod | AndroidGLController::SetGLVersion] [@ _JNIEnv::CallVoidMethod(_jobject*, _jmethodID*, ...) | AndroidGLController::SetGLVersion(int)]

Scoobidiver (away)

Reporter

Updated

•

12 years ago

status-firefox17: --- → affected

status-firefox18: --- → affected

Vladan Djeric (:vladan)

Updated

•

12 years ago

Assignee: nobody → bugmail.mozilla

Vladan Djeric (:vladan)

Updated

•

12 years ago

Blocks: 820167

Vladan Djeric (:vladan)

Comment 6

•

12 years ago

A small subset of Avi's patch in bug 820167 reliably triggers this crash:

https://tbpl.mozilla.org/?tree=Try&rev=82d078ed3eca

Vladan Djeric (:vladan)

Comment 7

•

12 years ago

Attached patch Patch to reliably trigger crashes during Android mochitests — Details — Splinter Review

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 8

•

12 years ago

This appears to be caused by the compositor registration (which is triggered by the GetLayerManager call) happening before LayerView creation:

12-19 17:06:20.777 I/Gecko   ( 3903): AndroidBridge::RegisterCompositor
12-19 17:06:20.777 E/GeckoLayerView( 3903): Error registering compositor!
12-19 17:06:20.777 E/GeckoLayerView( 3903): java.lang.NullPointerException
12-19 17:06:20.777 E/GeckoLayerView( 3903): 	at org.mozilla.gecko.gfx.LayerView.registerCxxCompositor(LayerView.java:349)
12-19 17:06:20.777 E/GeckoLayerView( 3903): 	at dalvik.system.NativeStart.run(Native Method)
12-19 17:06:20.777 E/GeckoLayerView( 3903): 	at dalvik.system.NativeStart.run(Native Method)
12-19 17:06:20.777 F/libc    ( 3903): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 3944 (Compositor)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 9

•

12 years ago

Actually it's the mLayerView.mListener that's null. This should be easy enough to fix.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 10

•

12 years ago

The simple fix didn't work :(

12-19 17:28:40.808 E/libEGL  ( 4083): EGLNativeWindowType 0x5d33c028 already connected to another API
12-19 17:28:40.808 E/libEGL  ( 4083): eglCreateWindowSurface:298 error 300b (EGL_BAD_NATIVE_WINDOW)
12-19 17:28:40.808 W/System.err( 4083): org.mozilla.gecko.gfx.GLController$GLControllerException: EGL window surface could not be created! Error 12299
12-19 17:28:40.808 W/System.err( 4083): 	at org.mozilla.gecko.gfx.GLController.provideEGLSurface(GLController.java:176)
12-19 17:28:40.808 W/System.err( 4083): 	at dalvik.system.NativeStart.run(Native Method)
12-19 17:28:40.808 W/System.err( 4083): 	at dalvik.system.NativeStart.run(Native Method)
12-19 17:28:40.816 I/Gecko   ( 4083): WARNING: Failed to create LayerManagerOGL context: file /Users/kats/zspace/mozilla-git/gfx/layers/opengl/LayerManagerOGL.cpp, line 469
12-19 17:28:40.816 I/Gecko   ( 4083): ###!!! ABORT: We need a context on Android: file /Users/kats/zspace/mozilla-git/gfx/layers/opengl/LayerManagerOGL.cpp, line 497
12-19 17:28:40.816 F/libc    ( 4083): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 4130 (Compositor)

I'll need to take a closer look at how all the compositor startup binding code works. CC'ing snorp as well for any ideas here.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 11

•

12 years ago

After some more investigation the problem appears to be that when GetLayerManager is called really early on, it is called on an nsWindow instance that is different from the nsWindow instance in the "steady state" later. I don't know why we create a bunch of different nsWindow instances on startup but I will look into that. (The different nsWindow instances have different layer manager variables and so we end up creating two different compositors, resulting in the crash).

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 12

•

12 years ago

Attached file nsWindow creation backtraces — Details

So the nsWindow creation actually seems relatively sane after more investigation. I'm documenting what I found here so that it's recorded somewhere. 6 nsWindow instances are created on Fennec startup; see the attached backtraces to nsWindow::nsWindow(). The first one is the hidden window (I don't know exactly what this is used for, but apparently it has some useful purpose); the second is the content window inside that hidden window to load about:blank. The third is the main XUL window and the fourth is again the corresponding about:blank content window inside. The fifth is the actual hiddenWindow.html resource file that is loaded and replaces the about:blank that was the 2nd window created. And finally the sixth window created is for browser.xul and it replaces the about:blank that was the 4th window created.

With the failure patch (attachment #694032 [details] [diff] [review]) is applied, the first call to nsWindow::GetLayerManager happens on the hiddenWindow.html window, and ends up initializing the graphics stack there even though it's a hidden window. The next call to GetLayerManager (which is what happens normally) is called on the browser.xul window, creates a second compositor, and crashes.

I have patches that seem to fix this in a reasonable way; coming shortly.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 13

•

12 years ago

Attached patch (1/3) - Cleanup - delete unused code — Details — Splinter Review

Also scope some things down and fix up documentation.

Attachment #694490 - Flags: review?(snorp)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 14

•

12 years ago

Attached patch (2/3) - Delete even more useless code — Details — Splinter Review

I don't know why we were doing this and I don't care.

Attachment #694491 - Flags: review?(snorp)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 15

•

12 years ago

Attached patch (3/4) - Don't create a layer manager for a hidden window (obsolete) — Details — Splinter Review

This makes GetLayerManager when called on the hiddenWindow.html window to just return null and not initialize a bunch of stuff improperly.

Attachment #694492 - Flags: review?(snorp)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 16

•

12 years ago

Comment on attachment 694492 [details] [diff] [review]
(3/4) - Don't create a layer manager for a hidden window

Actually this might be wrong. TopWindow() doesn't return what I thought it did. Let me test more and fix.

Attachment #694492 - Flags: review?(snorp)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 17

•

12 years ago

Attached patch (3/3) Don't create a layer manager for a hidden window — Details — Splinter Review

Ok, this seems to do the job. I don't need part 4 after all.

Attachment #694492 - Attachment is obsolete: true

Attachment #694514 - Flags: review?(snorp)

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Attachment #694490 - Attachment description: (1/4) Cleanup - delete unused code → (1/3) - Cleanup - delete unused code

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Attachment #694491 - Attachment description: (2/4) - Delete even more useless code → (2/3) - Delete even more useless code

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Updated

•

12 years ago

Attachment #694490 - Flags: review?(snorp) → review+

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Updated

•

12 years ago

Attachment #694491 - Flags: review?(snorp) → review+

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Comment 18

•

12 years ago

Comment on attachment 694514 [details] [diff] [review]
(3/3) Don't create a layer manager for a hidden window

Review of attachment 694514 [details] [diff] [review]:
-----------------------------------------------------------------

not sure I fully understand everything going on here, but I guess it makes sense enough to r+

Attachment #694514 - Flags: review?(snorp) → review+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 19

•

12 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> The first one is the hidden
> window (I don't know exactly what this is used for, but apparently it has
> some useful purpose)

mfinkle pointed me to bug 71895 which is trying to rip this out.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 20

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/7315385d2b38
https://hg.mozilla.org/integration/mozilla-inbound/rev/3696e9df6e14
https://hg.mozilla.org/integration/mozilla-inbound/rev/d9611ad3d8bf

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 21

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/7315385d2b38
https://hg.mozilla.org/mozilla-central/rev/3696e9df6e14
https://hg.mozilla.org/mozilla-central/rev/d9611ad3d8bf

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla20

Vladan Djeric (:vladan)

Comment 22

•

12 years ago

This is still crashing with Avi's patch in bug 820167:

https://tbpl.mozilla.org/?tree=Try&rev=ba3232b40309

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Vladan Djeric (:vladan)

Comment 23

•

12 years ago

Also crashes with the "reliable trigger" patch in this bug:

https://tbpl.mozilla.org/?tree=Try&rev=fd52a72f6f36

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 24

•

12 years ago

Interesting, I wasn't seeing this crash when I ran it locally. It looks like this one is even more timing-dependent since not all of the try runs are failing. I'll put up a patch to guard against the NPE that is causing this.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 25

•

12 years ago

An update: I have a patch [1] that fixes the crash and that works perfectly when I build it locally. However when I make a try build out of it (e.g. [2]) and try to run on that on my device I get strange behaviour that looks like the compositor is always paused. Basically nothing gets painted to the screen, and I can't reproduce it on a local build so I don't have any good way of debugging it. I'm not sure how to move forward with this.

[1] https://hg.mozilla.org/try/rev/d20181d0dc75
[2] https://tbpl.mozilla.org/?tree=Try&rev=aed61693c996

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 26

•

12 years ago

I'm able to artificially reproduce the crash in comment #23 locally if I bump some stuff off to another thread and make it sleep() first. However my patch in comment #25 fixes the crash with no other apparent side-effects. I'm not sure why the try build corresponding to that patch had the weird blank screen issues. I'll try a few more things.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 27

•

12 years ago

Ugh. Turns out this is being caused by the SetCompositor(NULL, NULL) in the nsWindow destructor. The early call to GetLayerManager() means that the compositor gets created early, and is then nulled out when the nsWindow destructor runs on windows #5 and #6 (see comment 12). Without the early call, the compositor is only created after those windows are destroyed, and so the compositor is never nulled out and the problem doesn't happen.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Attachment #694490 - Flags: checkin+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Attachment #694491 - Flags: checkin+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Attachment #694514 - Flags: checkin+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 28

•

12 years ago

Attached patch (4/3) Allow the compositor to be created earlier (obsolete) — Details — Splinter Review

Pushed this to try at https://tbpl.mozilla.org/?tree=Try&rev=7e70d0658d03, will request review once that's green. It worked locally. I don't know why we null out the compositor in the destructor but given that we store the compositor as a static it seems like a bad idea to be nulling it out from an nsWindow instance (of which we have a bunch).

As for the listener hunk, I verified that none of the listener methods actually gets called before gecko is ready during normal startup flow, and if they do, then we want to process them normally. I also considered taking out the null checks for mListener in LayerView but it looks like there's still a period between LayerView construction and LayerView.initializeView() getting called where mListener will be null and I didn't want to risk hitting that NPE.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 29

•

12 years ago

That weird INJECT_EVENTS permission error still seems to be happening, not sure what's going on there. Looks like it happened before, back in bug 766445, but went away by itself.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 30

•

12 years ago

And it started happening on inbound as well, bug 825426. I can reproduce it (slightly different error message) when I run the tests locally on a GN so I'll try to debug it.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Depends on: 825426

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 31

•

12 years ago

So what I think is happening here is ScheduleResumeComposition, which was intended to only get called from the java UI thread, is now getting called from the gecko thread (as of bug 797942 / cset ef02503c4387). The implementation of this function eventually calls back into java to get a GL surface, but since it's not on the java UI thread, the surface could have disappeared by then. This happens if you open the awesomescreen during startup. This locks up basically all of the threads until the awesome screen is dismissed, which may not even be possible because of deadlock.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 32

•

12 years ago

Attached patch (5/3) Kill the deadlock (obsolete) — Details — Splinter Review

This waitForValidSurface() code has irked me since it came into existence because it causes the compositor thread to block on java. And since the gecko thread blocks on the compositor thread in CompositorParent::ScheduleResumeOnCompositorThread, the gecko thread gets blocked too. Getting rid of it and letting the compositor remain in a paused state if a surface is not available seems to fix the problem. I have a small try run at https://tbpl.mozilla.org/?tree=Try&rev=200a4f0ded28 which shows the testAllPagesTab is fixed; pushed a more comprehensive try run to https://tbpl.mozilla.org/?tree=Try&rev=e972c0fb8292. Will request review if that's good.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Depends on: 826300

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 33

•

12 years ago

Comment on attachment 697437 [details] [diff] [review]
(5/3) Kill the deadlock

This patch almost works, but not quite. I am moving this fix into bug 826300 since it's a pretty large bug in its own right.

Attachment #697437 - Attachment is obsolete: true

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

No longer depends on: 825426

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 34

•

12 years ago

Attached patch (4/3) Allow the compositor to be created earlier — Details — Splinter Review

In addition to moving the setListener call earlier in java-land, I'm also making some changes to the c++ code to prevent creation of multiple layer managers. The previous patches were insufficient for fixing pageloader-based talos tests, because those tests create yet another nsWindow instance and the check that I added for hidden windows doesn't guard against that. With this patch I keep a static layer manager that corresponds to the static compositor and is shared across windows that are using OMTC. This has to be nulled out on shutdown because otherwise there's some shutdown crash because of leaked objects.

Try run with this patch and the one from bug 826300 is at https://tbpl.mozilla.org/?tree=Try&rev=f846a02b7c2e

Attachment #696358 - Attachment is obsolete: true

Attachment #698757 - Flags: review?(snorp)

Avi Halachmi (:avih)

Updated

•

12 years ago

Blocks: 828126

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Blocks: 763813

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Updated

•

12 years ago

Blocks: 801850

James Willcox (:snorp) (jwillcox@mozilla.com) (he/him)

Comment 35

•

12 years ago

Comment on attachment 698757 [details] [diff] [review]
(4/3) Allow the compositor to be created earlier

Review of attachment 698757 [details] [diff] [review]:
-----------------------------------------------------------------

looks good

Attachment #698757 - Flags: review?(snorp) → review+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 36

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/dcaee859dcf5

Ed Morley [:emorley]

Updated

•

12 years ago

Target Milestone: mozilla20 → mozilla21

Ed Morley [:emorley]

Comment 37

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/dcaee859dcf5

Status: REOPENED → RESOLVED

Closed: 12 years ago → 12 years ago

Resolution: --- → FIXED

Ed Morley [:emorley]

Comment 38

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/dcaee859dcf5

Scoobidiver (away)

Reporter

Updated

•

12 years ago

Blocks: 807606

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 39

•

11 years ago

Comment on attachment 698757 [details] [diff] [review]
(4/3) Allow the compositor to be created earlier

[Approval Request Comment]
Bug caused by (feature/regressing bug #): none
User impact if declined: crashes with the signature in bug 834243
Testing completed (on m-c, etc.): on m-c
Risk to taking this patch (and alternatives if risky): low risk, android only
String or UUID changes made by this patch: none

Note that the first three patches on this bug are already on aurora because they landed when m-c was FF20.

Attachment #698757 - Flags: approval-mozilla-aurora?

Scoobidiver (away)

Reporter

Updated

•

11 years ago

Blocks: 834243

bhavana bajaj [:bajaj]

Updated

•

11 years ago

Attachment #698757 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 40

•

11 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/d92c363a8217

status-firefox19: --- → affected

status-firefox20: --- → fixed

Ryan VanderMeulen [:RyanVM]

Comment 41

•

11 years ago

Backed out for turning Android tests orange.
https://hg.mozilla.org/releases/mozilla-aurora/rev/019452f32c21

status-firefox17: affected → wontfix

status-firefox18: affected → wontfix

status-firefox20: fixed → affected

status-firefox21: --- → fixed

status-firefox-esr17: --- → affected

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 42

•

11 years ago

I haven't been able to figure out why the above push made everything orange. My initial thought was that it needed bug 826300 as well but a try push with both patches was also just as orange. I can run the build from aurora on my device directly without issue, and a debug build I made using the same cset also works fine locally. Without a stack for the segfault I'm not sure I can figure out why this is crashing.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 43

•

11 years ago

After some more investigation it looks like everything goes orange because the first call to renderRequested() crashes. Based on the crash info in e.g. https://tbpl.mozilla.org/php/getParsedLog.php?id=19424393&tree=Try&full=1#error0 and an disassembly of libmozglue.so I think the problem is that GeckoAppShell.scheduleComposite's native implemention in AndroidJNI.cpp calls nsWindow::ScheduleComposite(), and if libxul isn't loaded yet (which it doesn't appear to be) then this call resolves to 0x0 and it crashes.

It's odd that this is getting tickled on aurora tests but nowhere else though.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 44

•

11 years ago

Thinking about this more, I think this might be a legitimate startup crash that could occur in the wild and not get reported since it happens so early (before libxul is fully loaded). I will file a separate bug for this.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 45

•

11 years ago

Filed bug 837821.

Depends on: 837821

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Assignee

Comment 46

•

11 years ago

Comment on attachment 698757 [details] [diff] [review]
(4/3) Allow the compositor to be created earlier

Based on the number of problems this patch has turned up and the depth of the rabbit hole I've been pulled in to, I no longer think this is safe for uplift. Dropping the aurora approval flag.

Attachment #698757 - Flags: approval-mozilla-aurora+

Lukas Blakk [:lsblakk] use ?needinfo

Updated

•

11 years ago

status-firefox21: fixed → wontfix

status-firefox-esr17: affected → ---

Kevin Brosnan [Ex-Mozilla]

Updated

•

11 years ago

status-firefox21: wontfix → fixed

BMO Automation

Updated

•

3 years ago

Product: Core → Core Graveyard

You need to log in before you can comment on or make changes to this bug.