[Browser] Closes upon return from display timeout (to lockscreen)

RESOLVED FIXED

Status

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: onelson, Assigned: gsvelto)

Tracking

({regression})

unspecified
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:2.5?, firefox41 fixed, b2g-v2.2 unaffected, b2g-master verified)

Details

(Whiteboard: [3.0-Daily-Testing], )

Attachments

(7 attachments, 1 obsolete attachment)

Description:
When the user is utilizing the browser and opens a webpage, the user may observe their browser force close upon returning from a display timeout. If a timeout occurs when sitting in browser, when the user unlocks and reenters their phone, they will observe the browser open for a brief second before force closing upon itself.

PreReq:
* display timeout set to 1 minute
Repro Steps:
1) Update a Flame to 20150421010201
2) Open the browser
3) Navigate to a data intensive website (blog/newsite [ex: www.kotaku.com]
4) Let phone sit for 1 minute until display times out
5) After phone timesout, tap power to bring display back on and unlock phone
6) Observe browser after unlock

Actual:
Browser will force close upon reentering phone from timeout if page was still loading

Expected:
Browser will remain open and continue loading, potentially while the display is timed out if using a Wi-Fi connection


Environmental Variables:
Device: Flame 3.0
Build ID: 20150421010201
Gaia: a8e4f95dce9db727dda5d408b038f97fb4296557
Gecko: 7b823253d9f2
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

** observed this issue at the following websites:
* kotaku: 4/4
* buzzfeed: 2/2
* cnn.com: 1/1
* mozilla.com: 0/1
* goodreads.com (login): 0/1
* google search results: 0/1

Repro frequency: 7/10
See attached: 
logcat
video- https://youtu.be/36gxONxJPXY
Flags: needinfo?(pbylenga)
QA Whiteboard: [QAnalyst-Triage?]
Summary: [Browser] Closes upon return from display timeout if website → [Browser] Closes upon return from display timeout (to lockscreen)
qawanted for branch checks.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: qawanted
QA Contact: jmercado
This issue does NOT occur on Flame 2.2.

Actual Results:  The browser does not close when coming back from timeout.

Environmental Variables:
Device: Flame 2.2
BuildID: 20150421125610
Gaia: 41a85c5f9db291d4f7c0e94c8416b5115b4ee407
Gecko: e40544535399
Version: 37.0 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: qawantedregression
Regression that could affect many end users so nominating this 3.0? Let's get a regression window here.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
[Blocking Requested - why for this release]:
blocking-b2g: --- → 3.0?
QA Contact: jmercado → ychung
QA Contact: ychung
B2g-inbound Regression Window

Last Working 
Environmental Variables:
Device: Flame 3.0
BuildID: 20150413010438
Gaia: 6414097adfb054260e47cf72a9d1b80eaecef408
Gecko: 81c108144278
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

First Broken 
Environmental Variables:
Device: Flame 3.0
BuildID: 20150413012638
Gaia: 6414097adfb054260e47cf72a9d1b80eaecef408
Gecko: 246819a7ee89
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Last Working gaia / First Broken gecko - Issue DOES occur
Gaia: 6414097adfb054260e47cf72a9d1b80eaecef408
Gecko: 246819a7ee89

First Broken gaia / Last Working gecko - Issue does NOT occur
Gaia: 6414097adfb054260e47cf72a9d1b80eaecef408
Gecko: 81c108144278

Gecko Pushlog:  http://hg.mozilla.org/integration/b2g-inbound/pushloghtml?fromchange=81c108144278&tochange=246819a7ee89
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Kan-Ru, can you take a look at this please? This might have been caused by the work done on bug 1142806.
Blocks: 1142806
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker) → needinfo?(kchen)
Does it reproduce if you replace step 4 with manually turning the screen off?

Could you attach the output of |adb shell b2g-info| before and after the phone times out and preferably after you unlock the phone?
Flags: needinfo?(kchen) → needinfo?(onelson)
> Does it reproduce if you replace step 4 with manually turning the screen off?
This issue does reproduce with manually turning the screen off and returning.
Repro Rate: 4/6 @ www.kotaku.com

********************************
BEFORE TIMEOUT [ OPEN www.kotaku.com ]:
                         |     megabytes     |
          NAME  PID PPID CPU(s) NICE  USS  PSS  RSS SWAP VSIZE OOM_ADJ USER    
           b2g  207    1  227.2    0 41.0 44.2 52.9 25.3 263.7       0 root    
        (Nuwa)  462  207    1.3    0  0.1  1.6  6.8 11.5  76.9     -16 root    
    Homescreen 1208  207   13.0    0 11.6 13.8 20.8  6.3  93.6       8 u0_a1208
Find My Device 3074  207    2.7    0  1.6  5.0 13.6 11.2  77.9      10 u0_a3074
       Browser 3238  207    5.7    0 44.5 48.1 57.2  0.0 120.3       2 u0_a3238

System memory info:

            Total 214.6 MB
        SwapTotal 192.0 MB
     Used - cache 170.2 MB
  B2G procs (PSS) 112.7 MB
    Non-B2G procs  57.5 MB
     Free + cache  44.4 MB
             Free   4.5 MB
            Cache  39.9 MB
         SwapFree 115.7 MB

Low-memory killer parameters:

  notify_trigger 14336 KB

  oom_adj min_free
        0  4096 KB
       58  5120 KB
      117  6144 KB
      352  7168 KB
      470  8192 KB
      588 20480 KB

********************************
AFTER TURNING TOGGLING DISPLAY [ tap power to lock, unlock and attempt to browse ]:
                          |     megabytes     |
           NAME  PID PPID CPU(s) NICE  USS  PSS  RSS SWAP VSIZE OOM_ADJ USER    
            b2g  207    1  235.9    0 44.3 48.0 55.9 25.4 262.3       0 root    
         (Nuwa)  462  207    1.3    0  0.6  2.3  7.3 11.0  76.9     -16 root    
     Homescreen 1208  207   13.4    0  9.1 11.4 18.0  9.1  93.5       8 u0_a1208
        Browser 3238  207   15.8    0 26.5 29.8 37.4 29.4 125.0      10 u0_a3238
(Preallocated a 3342  207    0.5    0 12.9 13.7 16.0  0.0  64.4       2 u0_a3342

System memory info:

            Total 214.6 MB
        SwapTotal 192.0 MB
     Used - cache 172.3 MB
  B2G procs (PSS) 105.2 MB
    Non-B2G procs  67.1 MB
     Free + cache  42.3 MB
             Free   3.9 MB
            Cache  38.5 MB
         SwapFree  92.8 MB

Low-memory killer parameters:

  notify_trigger 14336 KB

  oom_adj min_free
        0  4096 KB
       58  5120 KB
      117  6144 KB
      352  7168 KB
      470  8192 KB
      588 20480 KB
********************************
Flags: needinfo?(onelson) → needinfo?(kchen)
The problematic sequence is:

Before screen off:

 Processes      OOM_ADJ
 ----------------------
 Browser              2
 Find My Device      10

After screen off:

 Processes      OOM_ADJ
 ----------------------
 Browser              2
 Find My Device         (killed because Browser continues to load)

After unlock lockscreen:

 Processes      OOM_ADJ
 ----------------------
 Browser              2
 Find My Device       2 (launched again, foreground priority)

 Processes      OOM_ADJ
 ----------------------
 Browser             10 (delayed priority change applied)
 Find My Device       2

 Processes      OOM_ADJ
 ----------------------
 Browser                (killed because Find My Device uses a lot of memory)
 Find My Device       2 (later will because background)

A bonus question is, I thought that the patch in bug 852925 was intended to freeze priority change in this case, but it doesn't? Currently in gaia the system app will put all foreground app to background before we freeze the priority manager.
Flags: needinfo?(kchen)
Attachment #8598566 - Flags: review?(gsvelto)
(In reply to Kan-Ru Chen [:kanru] from comment #10)
> A bonus question is, I thought that the patch in bug t was intended to
> freeze priority change in this case, but it doesn't? Currently in gaia the
> system app will put all foreground app to background before we freeze the
> priority manager.

We do freeze the priority change when turning the screen off, note that we also apply all pending priority changes when it is turned on again so if gaia did something in the meantime we might get those changes. Could this have been a side-effect of bug 1142806?

BTW I intend to remove all specialization for the homescreen soon and hopefully all these bugs will just go away.
(In reply to Gabriele Svelto [:gsvelto] from comment #11)
> (In reply to Kan-Ru Chen [:kanru] from comment #10)
> > A bonus question is, I thought that the patch in bug t was intended to
> > freeze priority change in this case, but it doesn't? Currently in gaia the
> > system app will put all foreground app to background before we freeze the
> > priority manager.
> 
> We do freeze the priority change when turning the screen off, note that we
> also apply all pending priority changes when it is turned on again so if
> gaia did something in the meantime we might get those changes. Could this
> have been a side-effect of bug 1142806?

It doesn't look like a side-effect. The behavior before bug 1142806 is all processes will be put into background before ProcessPriorityManager has a chance to freeze them. I'm afraid this could defeat the purpose of bug 852925.
 
> BTW I intend to remove all specialization for the homescreen soon and
> hopefully all these bugs will just go away.
(In reply to Kan-Ru Chen [:kanru] from comment #12)
> It doesn't look like a side-effect. The behavior before bug 1142806 is all
> processes will be put into background before ProcessPriorityManager has a
> chance to freeze them. I'm afraid this could defeat the purpose of bug
> 852925.

Yes, you're right. I've hacked together an alternative patch-set that addresses both this issue and the previous ones (screen off and card view) without the added complexity of delaying changes. Let's give it a try; if it doesn't work we'll take your patch.
This only reverts the changes made in bug 1142806 and doesn't delay re-prioritizing the only foreground process anymore.
This contains the most important change. It does a few different things:

- It moves the screen-state-change event so that it's fired before the windows are informed that their mode has changed. This allows listeners to always know the state of the screen when dealing with visibility changes. Since there's only one other listener (the RIL) and it doesn't care about visibility changes I think this is a safe change.

- I've removed the public Freeze() and Unfreeze() methods, they're not required anymore now that the events are in a more suitable order.

- When in frozen state the particular priority managers will keep reacting to priority changes *expect* those that are caused by visibility. This prevents the quick foreground->background->foreground transition that happened before when the screen was turned on again. If the visibility of some apps changed while the screen was off they will be re-prioritized correctly anyway since we're sending new visibility states to all windows anyway.

This should solve all problems associated with turning off the screen but doesn't deal with the card-view and other problems caused by all processes being sent into the background.
This is the final patch and it only removes the special homescreen priority level. I feel that this level is unnecessary and is causing all kind of problems (such as the card view issue in bug 1142806).

The upsides of this change are the following:

- If the user prefers the homescreen when switching apps then the behavior is unaffected. The LRU arrangement of the background apps WRT the LMK means that the homescreen will be killed in the same order as before.

- If the user switches between two or more apps using edge swipes then those apps will be favored over the homescreen. This is beneficial because it favors the apps the user is actually using above the homescreen which didn't happen before.

- In the card-view case when there are no more foreground apps the previous foreground app is sent to the background. However since the homescreen doesn't have a special priority anymore it won't be killed to keep it alive.

Kan-Ru, what do you think of this approach? I tend to prefer it over delays because it greatly reduces the priority manager's complexity while also improving other use-cases.
Flags: needinfo?(kchen)
Yep, this looks good to me.
Flags: needinfo?(kchen)
OK thanks. Sorry for having to revert your work from bug 1142806; I wish I'd thought about this back in the day.
Michael this patch-set should solve the problems we encountered in bug 852925 and bug 1142806 in a more robust way as well as playing nicer to activities when the screen is turned on and off. It does make the homescreen more prone to dying if the user uses edge-swiping for navigation but in general I'd expect a lot of corner cases we had before where the LMK killed the wrong app to go away. Can you give it a spin and see if it works on your side? We might consider it for inclusion in 2.2 and I feel kind of silly for not having thought about it before as it's a really simple change.
Flags: needinfo?(mvines)
Comment on attachment 8599830 [details] [diff] [review]
[PATCH 1/3] Revert bug 1142806

This is a simple revert of bug 1142806 which we don't need anymore.
Attachment #8599830 - Flags: review?(khuey)
Comment on attachment 8599833 [details] [diff] [review]
[PATCH 2/3] Prevent visibility changes from affecting the process priority when the screen is off

This does two major changes:

- Changing the order in which the screen-state-changed event is sent WRT the sizemodechange event sent to the windows when turning the screen off. screen-state-changed had only one consumer (the RIL) which doesn't care about sizemodechange so this should be safe.

- Changing the process priority manager to only freeze visibility-related changes when the screen is off. When the screen is turned back on we don't immediately reset the priorities because we don't need to: all windows will get a sizemodechange event after this point so if visibility changed while the screen was off the appropriate priority adjustments will happen nevertheless.
Attachment #8599833 - Flags: review?(khuey)
Comment on attachment 8599836 [details] [diff] [review]
[PATCH 3/3] Stop special-casing the homescreen in the process priority manager

Last but not least this stops special-casing the homescreen as far as LMK adjustments go and lumps it with the other background applications. The reason we can do so safely is that we handle background applications in LRU order (which we didn't when we introduced this, hence this peculiar choice) so if the user keeps going to the homescreen for switching apps it will be killed in the same order as before. On the other hand if the user uses edge-swiping  the homescreen will be demoted behind the application the user swept away from. This is an improvement as it doesn't prioritize the homescreen anymore over an app the user might want to switch again to.

Flagging Dave too for the HAL changes.
Attachment #8599836 - Flags: review?(khuey)
Attachment #8599836 - Flags: review?(dhylands)
Comment on attachment 8599836 [details] [diff] [review]
[PATCH 3/3] Stop special-casing the homescreen in the process priority manager

Review of attachment 8599836 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.
Attachment #8599836 - Flags: review?(dhylands) → review+
Comment on attachment 8599836 [details] [diff] [review]
[PATCH 3/3] Stop special-casing the homescreen in the process priority manager

Review of attachment 8599836 [details] [diff] [review]:
-----------------------------------------------------------------

\o/
Attachment #8599836 - Flags: review?(khuey) → review+
Thanks to everybody who helped here, let's hope this puts our pains with this topic behind us. The try run is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c9f7215736c0

I had to refresh the second patch as it did not apply cleanly anymore, I'll attach the refreshed patch ASAP.
Refreshed patch, this is r=khuey, carrying over the r+ flag.
Assignee: nobody → gsvelto
Attachment #8599833 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8604314 - Flags: review+
I had to re-trigger a few intermittent issues but everything looks green now, time to land. Fingers crossed as from a conceptual perspective this is one of the biggest changes to how we assign priorities since we established the system (even though it's very simple).
Keywords: checkin-needed
Flags: needinfo?(mvines)
This bug has been verified as pass on latest Nightly build of Flame 3.0 & Nexus 5_3.0 by the STR in Comment 0.

Actual result:
The Browser will keep loading the previous page after you unlock device.

See attachment:Verify_30.mp4
** Observed this issue at the following websites:
Reproduce rate:
* kotaku: 0/8
* buzzfeed: 0/8
* cnn.com: 0/8

Device: Flame 3.0 build(Pass):
Build ID               20150602160205
Gaia Revision          6d477a7884273886605049b20f60af5c1583a150
Gaia Date              2015-06-01 16:41:42
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/9eae3880b132
Gecko Version          41.0a1
Device Name            flame
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.cltbld.20150602.192511
Firmware Date          Tue Jun  2 19:25:20 EDT 2015
Bootloader             L1TC000118D0

Device:Nexus5_3.0 build(Pass):
Build ID               20150602160205
Gaia Revision          6d477a7884273886605049b20f60af5c1583a150
Gaia Date              2015-06-01 16:41:42
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/9eae3880b132
Gecko Version          41.0a1
Device Name            hammerhead
Firmware(Release)      5.1
Firmware(Incremental)  eng.cltbld.20150602.192258
Firmware Date          Tue Jun  2 19:23:13 EDT 2015
Bootloader             HHZ12f
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+], [MGSEI-Triage+]
Posted video Verify_30.MP4
(In reply to Coler from comment #30)
> This bug has been verified as pass on latest Nightly build of Flame 3.0 &
> Nexus 5_3.0 by the STR in Comment 0.

Excellent, thanks for verifying this.
You need to log in before you can comment on or make changes to this bug.