Unresponsive browser on resume from sleep on tablet device

VERIFIED FIXED in Firefox 9

Status

Product: Fennec Graveyard
Component: General
Priority: --
Severity: major
Status: VERIFIED FIXED
Reported: 6 years ago
Modified: 6 years ago

People

(Reporter: aaronmt, Assigned: snorp)

Tracking

Version: Firefox 9
Target Milestone: Firefox 9
Platform: ARM, Android
Keywords: regression

Details

Attachments

(2 attachments)

(Reporter)

Description

6 years ago
Mozilla/5.0 (Android; Linux armv7l; rv:9.0a1) Gecko/20110902 Firefox/9.0a1 Fennec/9.0a1
Device: Samsung Galaxy Tab 10.1
Android OS: 3.1

After the device wakes up (following a display timeout and device sleep), Fennec is unresponsive to all input.

STR: 
1. Load http://www.neowin.net
2. Let your device screen time out and sleep
3. A few minutes later, wake the device up and attempt to use Fennec
(Reporter)

Comment 1

6 years ago
Device SKU: Model GT-P7510MA
(Reporter)

Comment 2

6 years ago
Happens on about:home
I'm seeing this on my Motorola Xoom too.  It also happens when I use the power button to put the device to sleep and then wake it up.  Assigning to myself as a reminder, but please feel free to steal this.
Assignee: nobody → mbrubeck
Severity: normal → major
tracking-fennec: --- → ?
Keywords: regression, regressionwindow-wanted
Version: Trunk → Firefox 9

Comment 4

6 years ago
logcat?
(Reporter)

Comment 5

6 years ago
Created attachment 558149 [details]
Nightly Debug (09/04) log

Everything from ACTION_SCREEN_OFF onwards
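
For anyone trimming their own capture the same way, a minimal sketch follows. It assumes the log is already available as a list of text lines and that the marker string appears literally in one of them. The helper name `from_marker` and the sample lines are made up for illustration.

```python
from itertools import dropwhile


def from_marker(lines, marker="ACTION_SCREEN_OFF"):
    """Return the lines starting at the first one that contains `marker`.

    Returns an empty list if the marker never appears.
    """
    return list(dropwhile(lambda line: marker not in line, lines))


# Hypothetical sample lines, not taken from the attached log.
sample = [
    "startup noise",
    "received ACTION_SCREEN_OFF",
    "later line",
]
trimmed = from_marker(sample)
print(len(trimmed), "lines kept")
```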

Comment 6

6 years ago
Looks like it's starting up from scratch when the screen turns back on. The bit about

I/Gecko   ( 1273): ###!!! ASSERTION: Potential deadlock detected:
I/Gecko   ( 1273): Cyclical dependency starts at
I/Gecko   ( 1273): Mutex : nsRecyclingAllocator.mLock
I/Gecko   ( 1273): Next dependency:
I/Gecko   ( 1273): ReentrantMonitor : nsComponentManagerImpl.mMon (currently acquired)
I/Gecko   ( 1273): Cycle completed at
I/Gecko   ( 1273): Mutex : nsRecyclingAllocator.mLock
I/Gecko   ( 1273): Deadlock may happen for some other execution

is a bit scary, but it seems to keep doing things after that.

Comment 7

6 years ago
Actually, it looks like there are two processes running - 1236 and 1273. That looks suspicious to me.
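
A quick way to check how many processes are writing under a given tag is to pull the PIDs out of the log lines. The sketch below assumes the lines follow the `I/Gecko   ( 1273): message` shape quoted in comment 6. The regular expression and the helper name `pids_by_tag` are illustrative, and the sample line for PID 1236 is invented (the real one is in the attached log).

```python
import re
from collections import defaultdict

# Matches lines shaped like: I/Gecko   ( 1273): Next dependency:
# Groups: priority letter, tag, PID, message.
LINE_RE = re.compile(r"^([VDIWEF])/(\S+)\s*\(\s*(\d+)\):\s?(.*)$")


def pids_by_tag(lines):
    """Map each log tag to the set of PIDs that logged under it."""
    pids = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            _priority, tag, pid, _message = match.groups()
            pids[tag].add(int(pid))
    return dict(pids)


SAMPLE = [
    "I/Gecko   ( 1273): Next dependency:",
    "I/Gecko   ( 1273): Cycle completed at",
    "I/Gecko   ( 1236): (hypothetical line from the other process)",
]

for tag, pids in sorted(pids_by_tag(SAMPLE).items()):
    print(tag, sorted(pids))
```

More than one PID under the same tag only shows that several processes logged with that tag. It does not show by itself that any of them is a zombie.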
After I press "home" (to send Fennec to the background) and then resume it by tapping its icon in the launcher, it starts working again.

Comment 9

6 years ago
bug 684443 changes the behavior of how we kill off zombies.  Not sure if it is the cause.
(Reporter)

Comment 10

6 years ago
(In reply to Matt Brubeck (:mbrubeck) from comment #8)
> After I press "home" (to send Fennec to the background) and then resume it
> by tapping its icon in the launcher, it starts working again.

Confirming that this works for me on the Galaxy Tab 10.1 too.
(In reply to Doug Turner (:dougt) from comment #9)
> bug 684443 changes the behavior of how we kill off zombies.  Not sure if it
> is the cause.

bug 684443 has not landed yet.
bug 684152 has also changed code near where we kill zombies, by never calling unpackFile after a first run. unpackFile is where we attempt to kill zombies. However, unpackFile would have returned early on any subsequent startup, so we would not have killed any zombies anyway.

We could change the patch in bug 684443 to _always_ attempt to kill zombies.
I can reproduce this, taking
Assignee: mbrubeck → snorp
I don't think this relates to zombie processes at all.  I have the same fennec processes before/after resume and also after 'relaunching'.
From bisecting with both mozilla-central and mozilla-inbound nightlies, the regression range is within this push:

https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=94a4a478d774
This appears to be a regression from this patch to target API level 11 (bug 681980):
https://hg.mozilla.org/mozilla-central/rev/b532e0d93bc5

It looks like the app is still receiving input but is no longer painting anything to the screen.  For example, tapping in the URL field still makes the on-screen keyboard appear.  Likewise, if I scroll while the app is "frozen" and then un-freeze it with the steps in comment 8, I can see that the content has indeed scrolled.
Blocks: 681980
Keywords: regressionwindow-wanted
Backed out the offending patch, but leaving this bug open for investigation so we can re-land it: https://hg.mozilla.org/mozilla-central/rev/6a4e5dbe0d64
Blocks: 684963
This bug occurs because the Android surface is destroyed on suspend but isn't recreated on resume.  Normally Android does this and we just get notified, so I'm not sure what's going on here.  Investigating further...
Created attachment 559166 [details] [diff] [review]
Bug 684242 - don't send synthetic SURFACE_DESTROY event when stopping on Android
Comment on attachment 559166 [details] [diff] [review]
Bug 684242 - don't send synthetic SURFACE_DESTROY event when stopping on Android

I'm not sure why this was added to begin with, as normally Android will inform us when the surface is created/destroyed.  I think it's wrong, and removing it does fix this bug.  I tested with both software and GL rendering on honeycomb and gingerbread devices.
Attachment #559166 - Flags: review?(doug.turner)
Attachment #559166 - Flags: review?(doug.turner) → review?(ajuma)
Original code was added as part of fixing bug 677920, but I don't really understand why the part I removed was necessary.  Ali, can you explain?

Comment 22

6 years ago
This code was added because:
- There's OpenGL cleanup we need to do when the Android surface is destroyed.
- The ACTIVITY_STOPPING event is always accompanied by a SURFACE_DESTROYED, at least on the Nexus S when targeting API level 5. When quickly changing device orientation repeatedly, we were getting the SURFACE_DESTROYED too late to do the OpenGL cleanup (in that situation the activity stops, the surface is destroyed, the activity restarts, and a new surface is created, all in rapid-fire sequence), so it made sense to start that cleanup as soon as we received ACTIVITY_STOPPING.

It sounds like the assumption that ACTIVITY_STOPPING is always accompanied by SURFACE_DESTROYED no longer holds with API level 11.

I re-tested on a Nexus S using your patch and targeting API level 11, and I'm not seeing any problems during orientation changes.
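
The interaction described in this comment can be sketched as a toy event model. This is one reading of the thread, not Fennec's actual code. ACTIVITY_STOPPING and SURFACE_DESTROYED are the event names used above. SURFACE_CREATED, ACTIVITY_RESUMED and DRAW, the `Renderer` class, the `deliver` function and both event sequences are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Renderer:
    """Toy stand-in for the drawing side; not a real Fennec class."""
    has_surface: bool = True
    frames_painted: int = 0

    def handle(self, event: str) -> None:
        if event == "SURFACE_DESTROYED":
            self.has_surface = False  # surface-bound cleanup would happen here
        elif event == "SURFACE_CREATED":
            self.has_surface = True
        elif event == "DRAW" and self.has_surface:
            self.frames_painted += 1


def deliver(platform_events, synthetic_destroy):
    """Feed platform events to a fresh Renderer.

    When `synthetic_destroy` is true, a SURFACE_DESTROYED is injected
    whenever ACTIVITY_STOPPING is seen, modelling the removed behaviour.
    """
    renderer = Renderer()
    for event in platform_events:
        if event == "ACTIVITY_STOPPING" and synthetic_destroy:
            renderer.handle("SURFACE_DESTROYED")
        renderer.handle(event)
    return renderer


# Assumed sequence: the platform pairs the stop with real surface events.
SURFACE_RECREATED = ["ACTIVITY_STOPPING", "SURFACE_DESTROYED",
                     "SURFACE_CREATED", "ACTIVITY_RESUMED", "DRAW"]
# Assumed sequence: the platform stops the activity but sends no surface events.
SURFACE_KEPT = ["ACTIVITY_STOPPING", "ACTIVITY_RESUMED", "DRAW"]

for name, events in [("recreated", SURFACE_RECREATED), ("kept", SURFACE_KEPT)]:
    for synthetic in (True, False):
        painted = deliver(events, synthetic).frames_painted
        print(name, "synthetic" if synthetic else "no synthetic", painted)
```

In this model the synthetic event is harmless when the platform also sends its own destroy and create notifications, but it leaves the renderer without a surface when the platform sends neither. That matches the symptom of input still being handled while nothing is painted.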

Updated

6 years ago
Attachment #559166 - Flags: review?(ajuma) → review+
Keywords: checkin-needed
If this requires API level 11, we should make sure it lands at the same time as bug 684963.
Whiteboard: [must land with patch from bug 684963]
https://hg.mozilla.org/integration/mozilla-inbound/rev/36282a6788df
Status: NEW → ASSIGNED
Keywords: checkin-needed
Whiteboard: [must land with patch from bug 684963] → [inbound]
http://hg.mozilla.org/mozilla-central/rev/36282a6788df
Whiteboard: [inbound]
Target Milestone: --- → Firefox 9
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Reporter)

Comment 26

6 years ago
Samsung Galaxy Tab 10.1 (Android v3.1)
Mozilla/5.0 (Android; Linux armv7l; rv:9.0a1) Gecko/20110909 Firefox/9.0a1 Fennec/9.0a1
tracking-fennec: ? → ---
status-firefox9: --- → fixed
(Reporter)

Updated

6 years ago
Status: RESOLVED → VERIFIED