Bug 684242 - Unresponsive browser on resume from sleep on tablet device
Status: VERIFIED FIXED
Keywords: regression
Product: Fennec Graveyard
Classification: Graveyard
Component: General
Version: Firefox 9
Platform: ARM Android
Importance: -- major
Target Milestone: Firefox 9
Assigned To: James Willcox (:snorp) (jwillcox@mozilla.com)
Mentors:
Depends on:
Blocks: 681980 684963
Reported: 2011-09-02 08:22 PDT by Aaron Train [:aaronmt]
Modified: 2011-09-09 06:47 PDT
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Nightly Debug (09/04) log (196.61 KB, text/plain)
2011-09-04 08:12 PDT, Aaron Train [:aaronmt]
no flags
Bug 684242 - don't send synthetic SURFACE_DESTROY event when stopping on Android (1.08 KB, patch)
2011-09-08 08:17 PDT, James Willcox (:snorp) (jwillcox@mozilla.com)
ajuma.bugzilla: review+

Description Aaron Train [:aaronmt] 2011-09-02 08:22:09 PDT
Mozilla/5.0 (Android; Linux armv7l; rv:9.0a1) Gecko/20110902 Firefox/9.0a1 Fennec/9.0a1
Device: Samsung Galaxy Tab 10.1
Android OS: 3.1

After the device wakes up (following a display timeout and device sleep), Fennec is unresponsive to all input.

STR: 
1. Load http://www.neowin.net
2. Let the device screen time out and sleep
3. A few minutes later, wake the device and attempt to use Fennec
Comment 1 Aaron Train [:aaronmt] 2011-09-02 08:24:31 PDT
Device SKU: Model GT-P7510MA
Comment 2 Aaron Train [:aaronmt] 2011-09-02 08:40:38 PDT
Happens on about:home
Comment 3 Matt Brubeck (:mbrubeck) 2011-09-03 21:24:23 PDT
I'm seeing this on my Motorola Xoom too.  It also happens when I use the power button to put the device to sleep and then wake it up.  Assigning to myself as a reminder, but please feel free to steal this.
Comment 4 Doug Turner (:dougt) 2011-09-03 22:39:10 PDT
logcat?
Comment 5 Aaron Train [:aaronmt] 2011-09-04 08:12:16 PDT
Created attachment 558149
Nightly Debug (09/04) log

Everything from ACTION_SCREEN_OFF onwards
Comment 6 Josh Matthews [:jdm] 2011-09-04 08:27:21 PDT
Looks like it's starting up from scratch when the screen turns back on. The bit about

I/Gecko   ( 1273): ###!!! ASSERTION: Potential deadlock detected:
I/Gecko   ( 1273): Cyclical dependency starts at
I/Gecko   ( 1273): Mutex : nsRecyclingAllocator.mLock
I/Gecko   ( 1273): Next dependency:
I/Gecko   ( 1273): ReentrantMonitor : nsComponentManagerImpl.mMon (currently acquired)
I/Gecko   ( 1273): Cycle completed at
I/Gecko   ( 1273): Mutex : nsRecyclingAllocator.mLock
I/Gecko   ( 1273): Deadlock may happen for some other execution

is a bit scary, but it seems to keep doing things after that.
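
The assertion quoted above is the kind of warning a lock-order checker emits: it records which lock was already held when another one was acquired, and complains when those orderings form a cycle, even if no deadlock happened in this run. A minimal sketch of that idea in Python, assuming a simple "held-before" graph; the function name and data layout are illustrative and are not Gecko's actual deadlock detector:

```python
from collections import defaultdict


def find_lock_cycle(acquisitions):
    """Return a list of lock names forming a cycle, or None.

    `acquisitions` is an iterable of (held, acquired) pairs, meaning
    `acquired` was taken while `held` was already held.
    """
    graph = defaultdict(set)
    for held, acquired in acquisitions:
        graph[held].add(acquired)

    def visit(node, path, on_path, done):
        if node in on_path:
            # Back edge: slice out the cycle and close it on itself.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            cycle = visit(nxt, path, on_path, done)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    done = set()
    for start in sorted(graph):
        cycle = visit(start, [], set(), done)
        if cycle:
            return cycle
    return None


# Orderings as read from the log excerpt (an interpretation, not a trace).
observed = [
    ("nsRecyclingAllocator.mLock", "nsComponentManagerImpl.mMon"),
    ("nsComponentManagerImpl.mMon", "nsRecyclingAllocator.mLock"),
]
cycle = find_lock_cycle(observed)
```

A cycle here only means that two code paths take the same locks in opposite orders; as the log itself says, the deadlock "may happen for some other execution".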
Comment 7 Josh Matthews [:jdm] 2011-09-04 08:28:46 PDT
Actually, it looks like there are two processes running - 1236 and 1273. That looks suspicious to me.
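
One way to check how many processes are writing under a given tag is to count log lines per PID. A small sketch in Python, assuming the log is in logcat's "brief" format as in the excerpt in comment 6; the helper name is hypothetical:

```python
import re
from collections import Counter

# Matches logcat's "brief" format, e.g. "I/Gecko   ( 1273): message".
LOGCAT_BRIEF = re.compile(r"^([VDIWEF])/(.+?)\s*\(\s*(\d+)\): ?(.*)$")


def pids_for_tag(lines, tag):
    """Count log lines per process ID for the given logcat tag."""
    counts = Counter()
    for line in lines:
        match = LOGCAT_BRIEF.match(line)
        if match and match.group(2) == tag:
            counts[int(match.group(3))] += 1
    return counts
```

Calling `pids_for_tag(open(path), "Gecko")` on a saved log (with `path` pointing at the attachment) would list every PID that logged under the Gecko tag. More than one PID is not necessarily a problem by itself, since it may simply reflect a multi-process setup.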
Comment 8 Matt Brubeck (:mbrubeck) 2011-09-04 09:21:10 PDT
After I press "home" (to send Fennec to the background) and then resume it by tapping its icon in the launcher, it starts working again.
Comment 9 Doug Turner (:dougt) 2011-09-04 09:43:33 PDT
bug 684443 changes the behavior of how we kill off zombies.  Not sure if it is the cause.
Comment 10 Aaron Train [:aaronmt] 2011-09-04 09:56:17 PDT
(In reply to Matt Brubeck (:mbrubeck) from comment #8)
> After I press "home" (to send Fennec to the background) and then resume it
> by tapping its icon in the launcher, it starts working again.

Confirming that this works for me on the Galaxy Tab 10.1 too.
Comment 11 Mark Finkle (:mfinkle) (use needinfo?) 2011-09-04 20:24:57 PDT
(In reply to Doug Turner (:dougt) from comment #9)
> bug 684443 changes the behavior of how we kill off zombies.  Not sure if it
> is the cause.

bug 684443 has not landed yet.
Comment 12 Mark Finkle (:mfinkle) (use needinfo?) 2011-09-04 20:30:19 PDT
bug 684152 has also changed code near where we kill zombies by never calling unpackFile after a first run. unpackFile is where we attempt to kill zombies. However, unpackFile would have returned early on any subsequent startup, so we would not have killed any zombies anyway.

We could change the patch in bug 684443 to _always_ attempt to kill zombies.
Comment 13 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-06 11:40:56 PDT
I can reproduce this, taking
Comment 14 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-06 11:45:08 PDT
I don't think this relates to zombie processes at all.  I have the same fennec processes before/after resume and also after 'relaunching'.
Comment 15 Matt Brubeck (:mbrubeck) 2011-09-06 12:45:09 PDT
From bisecting with both mozilla-central and mozilla-inbound nightlies, the regression range is within this push:

https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=94a4a478d774
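
Bisecting nightlies amounts to a binary search over an ordered list of builds for the first one that shows the bug. A minimal sketch in Python, assuming the oldest build is known good and the newest is known bad; `is_bad` is a hypothetical callback standing in for installing a build and running the STR by hand:

```python
def first_bad_build(builds, is_bad):
    """Return the earliest build for which is_bad(build) is true.

    Assumes builds are ordered oldest to newest, builds[0] is good,
    builds[-1] is bad, and the bug never disappears once introduced.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid
        else:
            good = mid
    return builds[bad]
```

Each round halves the remaining range, so the number of builds to test grows only logarithmically with the size of the range.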
Comment 16 Matt Brubeck (:mbrubeck) 2011-09-06 12:55:26 PDT
This appears to be a regression from this patch to target API level 11 (bug 681980):
https://hg.mozilla.org/mozilla-central/rev/b532e0d93bc5

It looks like the app is still receiving input but is no longer painting anything to the screen.  For example, tapping in the url field still makes the on-screen keyboard appear.  Or if I try to scroll while the app is "frozen", then un-freeze it with the steps in comment 8, then I can see that the content has indeed scrolled.
Comment 17 Matt Brubeck (:mbrubeck) 2011-09-06 13:07:12 PDT
Backed out the offending patch, but leaving this bug open for investigation so we can re-land it: https://hg.mozilla.org/mozilla-central/rev/6a4e5dbe0d64
Comment 18 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-08 07:42:38 PDT
This bug occurs because the Android surface is destroyed on suspend but is not recreated on resume.  Android normally does this and just notifies us, so I'm not sure what's going on here.  Investigating further...
Comment 19 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-08 08:17:13 PDT
Created attachment 559166
Bug 684242 - don't send synthetic SURFACE_DESTROY event when stopping on Android
Comment 20 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-08 08:37:52 PDT
Comment on attachment 559166
Bug 684242 - don't send synthetic SURFACE_DESTROY event when stopping on Android

I'm not sure why this was added to begin with, as normally Android will inform us when the surface is created/destroyed.  I think it's wrong, and removing it does fix this bug.  I tested with both software and GL rendering on honeycomb and gingerbread devices.
Comment 21 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-09-08 08:49:08 PDT
Original code was added as part of fixing bug 677920, but I don't really understand why the part I removed was necessary.  Ali, can you explain?
Comment 22 Ali Juma [:ajuma] 2011-09-08 10:37:22 PDT
This code was added because:
- there's OpenGL cleanup we need to do when the Android surface is destroyed
- the ACTIVITY_STOPPING event is always accompanied by a SURFACE_DESTROYED, at least on the Nexus S when targeting API level 5. When quickly changing device orientation repeatedly, we were getting the SURFACE_DESTROYED too late to do the OpenGL cleanup (in that situation the activity stops, the surface is destroyed, the activity restarts, and a new surface is created, all in rapid-fire sequence), so it made sense to start that cleanup as soon as we received ACTIVITY_STOPPING

It sounds like the assumption that ACTIVITY_STOPPING is always accompanied by SURFACE_DESTROYED no longer holds with API level 11.

I re-tested on a Nexus S using your patch and targeting API level 11, and I'm not seeing any problems during orientation changes.
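
The interaction described in comments 18 to 22 can be illustrated with a toy state machine. This is a sketch in Python under stated assumptions, not the Fennec code: the class, the `ACTIVITY_START` event name, and both event sequences are hypothetical and are inferred from the comments rather than taken from a trace.

```python
class SurfaceTracker:
    """Toy model of the app's view of the Android drawing surface."""

    def __init__(self, synthesize_destroy_on_stop):
        self.synthesize_destroy_on_stop = synthesize_destroy_on_stop
        self.has_surface = False

    def handle(self, event):
        if event == "SURFACE_CREATED":
            self.has_surface = True
        elif event == "SURFACE_DESTROYED":
            self.has_surface = False
        elif event == "ACTIVITY_STOPPING" and self.synthesize_destroy_on_stop:
            # Pre-patch behavior: start cleanup early by acting as if
            # the surface had already been destroyed.
            self.handle("SURFACE_DESTROYED")

    def can_paint(self):
        return self.has_surface


def replay(events, synthesize_destroy_on_stop):
    """Feed events to a fresh tracker and report whether it can paint."""
    tracker = SurfaceTracker(synthesize_destroy_on_stop)
    for event in events:
        tracker.handle(event)
    return tracker.can_paint()


# Assumed API level 5 behavior: the platform destroys and recreates
# the surface around a stop.
OLD_TARGET = ["SURFACE_CREATED", "ACTIVITY_STOPPING", "SURFACE_DESTROYED",
              "ACTIVITY_START", "SURFACE_CREATED"]
# Assumed API level 11 sleep/wake: the surface survives the stop, so
# no SURFACE_CREATED arrives on resume.
NEW_TARGET = ["SURFACE_CREATED", "ACTIVITY_STOPPING", "ACTIVITY_START"]
```

In this model the synthetic event is harmless as long as a real SURFACE_CREATED follows every stop. With the second sequence the tracker ends up believing it has no surface, which matches the symptom in comment 16 (input is handled but nothing is painted); dropping the synthetic event leaves the surface state to the platform's own notifications.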
Comment 23 Matt Brubeck (:mbrubeck) 2011-09-08 11:24:15 PDT
If this requires API level 11, we should make sure it lands at the same time as bug 684963.
Comment 25 Matt Brubeck (:mbrubeck) 2011-09-08 16:19:25 PDT
http://hg.mozilla.org/mozilla-central/rev/36282a6788df
Comment 26 Aaron Train [:aaronmt] 2011-09-09 06:46:31 PDT
Samsung Galaxy Tab 10.1 (Android v3.1)
Mozilla/5.0 (Android; Linux armv7l; rv:9.0a1) Gecko/20110909 Firefox/9.0a1 Fennec/9.0a1
