Firefox hangs with "Don't keep activities" developer option enabled

VERIFIED FIXED in Firefox 15

Status

()

Firefox for Android
General
VERIFIED FIXED
5 years ago
9 months ago

People

(Reporter: mcomella, Assigned: kats)

Tracking

({regression})

17 Branch
Firefox 17
ARM
Android
regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox15 verified, firefox16 verified, firefox17 verified)

Details

Attachments

(1 attachment)

0) Turn on "Don't keep activities" option (Settings -> Developer Options")
1) Open Nightly

Expected: Nightly is usable and loads pages.
Actual: Nightly hangs and nothing loads. Occasionally, after some wiggling, a page may load (most notably on a clean startup) but clicking the awesomebar and selecting something will cause Firefox to hang again.

Tested on Galaxy Nexus, Android 4.0.4, and Asus Transformer Prime TF201, Android 4.0.3. Nightly 8/6.

Can anyone else confirm?
Confirmed (TF201/4.0.4, Nightly 08/06).

Comment 2

5 years ago
Is this actually a regression? Does it work in 14/15/16?
Keywords: regressionwindow-wanted
tracking-fennec: --- → ?
This might be a regression from bug 770047, in which case we have a problem since they're both beta-blocker bugs.
err I mean tracking-15+ bugs. or whatever those bugs are called that need to go into 15...
This is happening because of a deadlock condition in the code. The UI thread is here (part of the stack I filled in by hand):

at MonitorAutoLock(mResumeCompositorMonitor)::Wait
at CompositorParent::ScheduleResumeOnCompositorThread
at nsWindow::ScheduleResumeComposition
at org.mozilla.gecko.GeckoAppShell.scheduleResumeComposition
at org.mozilla.gecko.gfx.GeckoLayerClient.compositionResumeRequested(GeckoLayerClient.java:472)
at org.mozilla.gecko.gfx.GLController.resumeCompositorIfValid(GLController.java:78)
at org.mozilla.gecko.GeckoAppShell.setLayerClient(Native Method)
at org.mozilla.gecko.GeckoAppShell.setLayerClient(Native Method)
at org.mozilla.gecko.GeckoAppShell.setLayerClient(Native Method)
at org.mozilla.gecko.GeckoApp.initialize(GeckoApp.java:1788)
at org.mozilla.gecko.GeckoApp.onWindowFocusChanged(GeckoApp.java:2011)
at com.android.internal.policy.impl.PhoneWindow$DecorView.onWindowFocusChanged(PhoneWindow.java:2423)
at android.view.View.dispatchWindowFocusChanged(View.java:7321)
at android.view.ViewGroup.dispatchWindowFocusChanged(ViewGroup.java:933)
at android.view.ViewRootImpl$ViewRootHandler.handleMessage(ViewRootImpl.java:2881)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:137) 
at android.app.ActivityThread.main(ActivityThread.java:4745)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:786)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
at dalvik.system.NativeStart.main(Native Method)

This thread is waiting for the compositor-resume on the compositor thread to finish executing. However it is a Java thread that is in the middle of executing GLController.resumeCompositorIfValid, which is a function synchronized on the GLController instance.

The compositor thread is trying to reacquire a surface via a call to AndroidBridge::ProvideEGLSurface, which tries to call GLController.waitForValidSurface. However that function is also synchronized on the GLController instance, and so it cannot even enter the waitForValidSurface function, resulting in a deadlock.
Created attachment 650120 [details] [diff] [review]
Patch

This fixes the problem. I don't think the GLController needs to hold on to the lock when calling the listener function. There might be better fixes though, it seems kind of wrong to me that the java UI thread gets blocked on the compositor thread, so maybe shunting some of that code onto a different thread would work better?
Attachment #650120 - Flags: review?(ajuma)
Assignee: nobody → bugmail.mozilla
This is indeed a regression from bug 770047, and is on beta as of
http://hg.mozilla.org/releases/mozilla-beta/rev/b40643c774f6
Blocks: 770047
status-firefox15: --- → affected
status-firefox16: --- → affected
Keywords: regressionwindow-wanted

Comment 8

5 years ago
Comment on attachment 650120 [details] [diff] [review]
Patch

(In reply to Kartikaya Gupta (:kats) from comment #6)
> Created attachment 650120 [details] [diff] [review]
> Patch
> 
> This fixes the problem. I don't think the GLController needs to hold on to
> the lock when calling the listener function. There might be better fixes
> though, it seems kind of wrong to me that the java UI thread gets blocked on
> the compositor thread, so maybe shunting some of that code onto a different
> thread would work better?

In this particular instance, we don't need the java UI thread blocked until the compositor resumes, so we could indeed move this code onto a different thread. However, the simple fix here seems to be the best approach for something we're going to want to uplift.

Also, note that there are other situations where the java UI thread *needs* to block on the compositor thread (e.g. during a surfaceDestroyed event), so we cannot eliminate all instances of the java thread blocking on the compositor thread.
Attachment #650120 - Flags: review?(ajuma) → review+
Comment on attachment 650120 [details] [diff] [review]
Patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 770047
User impact if declined: when the "don't keep activities" developer option is checked, fennec won't load pages properly from the awesomebar
Testing completed (on m-c, etc.): locally so far
Risk to taking this patch (and alternatives if risky): mobile only. should be fairly low risk. another option is to back out bug 770047 but then we get the black flash back.
String or UUID changes made by this patch: none
Attachment #650120 - Flags: approval-mozilla-beta?
Attachment #650120 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/integration/mozilla-inbound/rev/add211f29374
status-firefox17: affected → fixed
https://hg.mozilla.org/mozilla-central/rev/add211f29374
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 17
Comment on attachment 650120 [details] [diff] [review]
Patch

mobile only, so willing to take this low risk patch for branches.
Attachment #650120 - Flags: approval-mozilla-beta?
Attachment #650120 - Flags: approval-mozilla-beta+
Attachment #650120 - Flags: approval-mozilla-aurora?
Attachment #650120 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/1b4e93422e58
https://hg.mozilla.org/releases/mozilla-beta/rev/8c91d2848d80
status-firefox15: affected → fixed
status-firefox16: affected → fixed

Updated

5 years ago
Status: RESOLVED → VERIFIED
status-firefox17: fixed → verified
Unable to reproduce the issue on:

Firefox Mobile 16.0b5 / Firefox Mobile 15
Samsung Galaxy Nexus (Android 4.1.1)

Marking as verified on Firefox Mobile 16 and 15
status-firefox15: fixed → verified
status-firefox16: fixed → verified
tracking-fennec: ? → ---
You need to log in before you can comment on or make changes to this bug.