[Settings] The chrome and content become visible before they are ready

VERIFIED FIXED in 2.1 S5 (26sep)

Status

defect
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: timdream, Assigned: arthurcc)

Tracking

({regression})

unspecified
2.1 S5 (26sep)
x86
macOS
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:2.1+, b2g-v2.0 unaffected, b2g-v2.1 verified, b2g-v2.2 verified)

Details

(Whiteboard: [p=1])

Attachments

(1 attachment)

On a Flame which has default localization to English.

STR:

1. Switch the phone to a language other than English
2. Use card view to kill the Settings
3. Relaunch Settings app
4. Before the UI changed from English to the language set, tap Wi-Fi and try to access Wi-Fi panel

Expected:

1. Seeing the Wi-Fi panel

Actual:

1. Seeing a blank panel

Note:

I suspect this has something to do with recent l10n.js refactor or Settings panel refactor, but I am not sure which. Needinfo :gandalf for feedback. Step 4 (seeing English before seeing the changed language) is also something previously we are told to avoid -- not sure if that has changed.

[Blocking Requested - why for this release]: 

1. Not recoverable unless killing the Settings app again.
2. Extremely annoy if one tries to select a Wi-Fi as quickly as one could.
Flags: needinfo?(gandalf)
blocking-b2g: 2.1? → 2.1+
may QA check if this is regression? thanks.
Keywords: qawanted
I'm working on the FOUC in bug 1052861. I'll tackle this as well.
Assignee: nobody → gandalf
Status: NEW → ASSIGNED
Flags: needinfo?(gandalf)
This bug (clicking on wifi panel before JS init is complete) can be reproduced in default locale as well. It's not related to bug 1052861.
I was able to identify the reason for this behavior, and since it is part of a broader problem, I took liberty to rename the bug to cover the whole class of issues.

Overall it is a regression from bug 1024893.

The change in the bug introduced a class of race conditions between Gecko and App code. Root panel is visible and displayed independently of JS code initialization being complete which results in three classes of problems:

1) links in root panel are clickable before they are bind to Settings Panel actions [0]
2) Top bar can be not loaded when the app is displayed which causes flicker/reflow
3) Runtime localization can be completed after firstPaint which causes flicker

The first problem is the original one in this bug reported by Tim. The third problem is largely mitigated by the fix from bug 1052861, but together with problem 2 can be reproduced every once in a while.

I'm unassigning myself from the bug for now, because the solution largely depends on the overall strategy we want to pick for this (and possibly other apps).

Overall, my recommendation would be to block displaying of chrome/content on JS init being ready (we have events for that [1]), but the most visible problem 1 can be also fixed by simply removing href="" from the links or disabling them until bindings are completed.

I'll be happy to help implement the code, once we choose the strategy here.

[0] https://github.com/mozilla-b2g/gaia/blob/31cd87328f5fc76c60097d25354fe84bd6456542/apps/settings/js/modules/settings_panel.js#L49-L58
[1] https://developer.mozilla.org/en-US/Apps/Build/Performance/Firefox_OS_app_responsiveness_guidelines
Assignee: gandalf → nobody
Status: ASSIGNED → NEW
Keywords: qawanted
Summary: Transition to blank panel if tapped on an item before the UI is localized → [Settings] The chrome and content become visible before they are ready
(In reply to Zibi Braniecki [:gandalf] from comment #4)
> I was able to identify the reason for this behavior, and since it is part of
> a broader problem, I took liberty to rename the bug to cover the whole class
> of issues.
> 
> Overall it is a regression from bug 1024893.
> 
> The change in the bug introduced a class of race conditions between Gecko
> and App code. Root panel is visible and displayed independently of JS code
> initialization being complete which results in three classes of problems:
> 
> 1) links in root panel are clickable before they are bind to Settings Panel
> actions [0]
> 2) Top bar can be not loaded when the app is displayed which causes
> flicker/reflow
> 3) Runtime localization can be completed after firstPaint which causes
> flicker
> 
> The first problem is the original one in this bug reported by Tim. The third
> problem is largely mitigated by the fix from bug 1052861, but together with
> problem 2 can be reproduced every once in a while.
> 
> I'm unassigning myself from the bug for now, because the solution largely
> depends on the overall strategy we want to pick for this (and possibly other
> apps).
> 
> Overall, my recommendation would be to block displaying of chrome/content on
> JS init being ready (we have events for that [1]), 

Isn't this means a flash of white during start-up? Do these events actually makes the launching splash keep displaying until we dispatch such event? Flash of white is worse than unlocalized or unresponsive UI IMHO.

> but the most visible
> problem 1 can be also fixed by simply removing href="" from the links or
> disabling them until bindings are completed.
> 

I would suggest using |pointer-events: none| in this case (it's easier to flip style in JS than assigning href in each of the links).


> I'll be happy to help implement the code, once we choose the strategy here.
> 
> [0]
> https://github.com/mozilla-b2g/gaia/blob/
> 31cd87328f5fc76c60097d25354fe84bd6456542/apps/settings/js/modules/
> settings_panel.js#L49-L58
> [1]
> https://developer.mozilla.org/en-US/Apps/Build/Performance/
> Firefox_OS_app_responsiveness_guidelines
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #5)
> Isn't this means a flash of white during start-up? Do these events actually
> makes the launching splash keep displaying until we dispatch such event?

Not yet. I would argue that we should start bringing the app to the foreground only when the app-chrome-loaded event is fired.
It will enable us to remove the race condition (and thus potential FOUC/flicker) from the equation all together.

> Flash of white is worse than unlocalized or unresponsive UI IMHO.

I agree.

> I would suggest using |pointer-events: none| in this case (it's easier to
> flip style in JS than assigning href in each of the links).

We can do that. This will not solve the problem of chrome modification after firstPaint.

There are two visible chrome modifications that sometimes happen after firstPaint:
 - non-default locale paint
 - haia bar paint.

I found three ways to solve the l10n part of the problem:
 - restore GAIA_INLINE_LOCALES (revert bug 994370) - I'd prefer not to do that, but if we will have to, I'd like to do this for Settings app only
 - remove defer="" from l10n.js script - I'd prefer not to do this, but it solves the problem
 - introduce a cleanDocument step in non-default locale bootstrap

The last solution is my preferred one. In it, we actually remove the text content before firstPaint and then we translate it to the proper locale once the resources are loaded. It does not seem to impact performance, keeps us winning the race condition in almost all cases as we are now, but in the ones where we "loose" it, instead of flash of default locale, we will have text appearing a split of a second later. I think it gives us the best UX unless we decide to tackle the whole race condition problem.

What do you think Tim?
Flags: needinfo?(timdream)
(In reply to Zibi Braniecki [:gandalf] from comment #6)
> (In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from
> comment #5)
> > Isn't this means a flash of white during start-up? Do these events actually
> > makes the launching splash keep displaying until we dispatch such event?
> 
> Not yet. I would argue that we should start bringing the app to the
> foreground only when the app-chrome-loaded event is fired.
> It will enable us to remove the race condition (and thus potential
> FOUC/flicker) from the equation all together.
> 

The solution is desirable but the proposal is a bold one. These events was created to measure Gaia app start-up time (a signal from app script to launch time measuring script), and it was not intended to affect the app launch itself in any way. The proposal also means all packaged app are obligated to send out this event in order to be reviled and seemed.

> > Flash of white is worse than unlocalized or unresponsive UI IMHO.
> 
> I agree.
> 
> > I would suggest using |pointer-events: none| in this case (it's easier to
> > flip style in JS than assigning href in each of the links).
> 
> We can do that. This will not solve the problem of chrome modification after
> firstPaint.
> 
> There are two visible chrome modifications that sometimes happen after
> firstPaint:
>  - non-default locale paint
>  - haia bar paint.
> 
> I found three ways to solve the l10n part of the problem:
>  - restore GAIA_INLINE_LOCALES (revert bug 994370) - I'd prefer not to do
> that, but if we will have to, I'd like to do this for Settings app only
>  - remove defer="" from l10n.js script - I'd prefer not to do this, but it
> solves the problem
>  - introduce a cleanDocument step in non-default locale bootstrap
> 
> The last solution is my preferred one. In it, we actually remove the text
> content before firstPaint and then we translate it to the proper locale once
> the resources are loaded. It does not seem to impact performance, keeps us
> winning the race condition in almost all cases as we are now, but in the
> ones where we "loose" it, instead of flash of default locale, we will have
> text appearing a split of a second later. I think it gives us the best UX
> unless we decide to tackle the whole race condition problem.
> 
> What do you think Tim?

3rd solution sounds technically saner than the previous two, if it's doable without the event mentioned. If I read you correctly, it means the user will see Settings app with no labels but only icons for a short time right? I'll accept that as a solution, but it's better to loop other stakeholders (product/QA/etc) to see if that's acceptable.
Flags: needinfo?(timdream)
Needinfo? Arthur, owner of Settings.
Flags: needinfo?(arthur.chen)
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #7)
> The solution is desirable but the proposal is a bold one. These events was
> created to measure Gaia app start-up time (a signal from app script to
> launch time measuring script), and it was not intended to affect the app
> launch itself in any way. The proposal also means all packaged app are
> obligated to send out this event in order to be reviled and seemed.

The way we designed the bootstrap events was from the very beginning supposed to serve both purposes - performance measurements and system instrumentation.

There are multiple ways we can use the latter. We can for example suspend background activities during bootstrap, we can not enable touch events to be directed to the app until moz-app-chrome-interactive, we can analyze the boostrap process and snapshot the app's memory to warm boot faster, knowing that moz-app-loaded indicates when the app is booted etc. One of the first ones I had on mind is exactly this - moz-app-chrome-loaded is app driven indication to the system that the app's chrome is ready to be displayed.
It may help us to optimize Gecko operations better for such cases (no need to layout until this event to avoid reflows?)

I'd argue that we don't have to do this for all apps, we can start one by one. The idea I had is that we could enable that with a manifest.webapp flag that states that the app is going to instrument the system when the app is ready to be brought to foreground.

> 3rd solution sounds technically saner than the previous two, if it's doable
> without the event mentioned. If I read you correctly, it means the user will
> see Settings app with no labels but only icons for a short time right?

No, it's event better.

Today I landed bug 1052861 which in my tests reduced the frequency of race condition FOUC for non-default locale from 10/10 to below 1/10.

With the patch that I experiment with, the ratio of race condition states the same - below 1/10th, but the experience in this rare condition is different. Instead of seeing default locale flicker into the final locale, we see app's chrome without text and text appears in the final locale split of a second later.

See patch from bug 1053623, apply it yourself and see the impact on how Settings launch works for non-default in the rare case when JS init is not finished before firstPaint.
I applied the patch from 1053623 and it looks a little bit weird IMHO. Sometimes the english text even appeared first, then blank, then the localized text. I believe the time taking from clicking on the icon to the UI is localized is nearly the same in option 2 and 3. To me it is a choice between displaying the chrome as fast as we can but with flickering text or staying in the splash screen a little bit longer until the UI is completely ready.

What is the impact of removing defer="" from l10n.js script (option 2)? I would prefer the option if there is no unintended side effect.

As the original description is about seeing a blank screen, I'll submit a patch here. Regarding the localization problem let's use bug 1053623 to track it.
Assignee: nobody → arthur.chen
Flags: needinfo?(arthur.chen)
(In reply to Arthur Chen [:arthurcc] from comment #10)
> I applied the patch from 1053623 and it looks a little bit weird IMHO.

What device?

> What is the impact of removing defer="" from l10n.js script (option 2)? I
> would prefer the option if there is no unintended side effect.

There is no negative impact that I know of, we use non-deferred l10n.js in the system/homescreen.
I tested the patch on a Flame.
(In reply to Arthur Chen [:arthurcc] from comment #12)
> I tested the patch on a Flame.

That's surprising. You described three classes of bootstrap behavior:

1) the proper one where app chrome/content appears in the chosen locale
2) the one where the chrome appears without content text first and then text is added
3) the one where the text first shows in default locale, then disappears and then reappears in the result locale

I can't reproduce the third state. Are you sure that you're on master gecko and 123base image?

I'll try to reproduce it again tomorrow and play more with removing defer from l10n.js.
I flashed my flame with the latest build and applied the patch and it works much better now. For the three kinds of behaviors, I saw 90% of 1), 9% of 2), and 1% of 3), which makes me wavering between the proposed patch and removing the defer of l10n.js. Personally I prefer the later option but I would like the UX designer make the decision.

Hi Jenny, I'll show you the results of these two options later.
Flags: needinfo?(jelee)
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #5)
> I would suggest using |pointer-events: none| in this case (it's easier to
> flip style in JS than assigning href in each of the links).

Does Chrome's decision not to support pointer-events mean anything for the long term support in Gecko?

Wrt. l10n. I'll test more, I still can't reproduce the 1% scenario that Arthur described. Will do more tests including the impact of removing defer from l10n.js.
(In reply to Arthur Chen [:arthurcc] from comment #14)
> I flashed my flame with the latest build and applied the patch and it works
> much better now. For the three kinds of behaviors, I saw 90% of 1), 9% of
> 2), and 1% of 3), which makes me wavering between the proposed patch and
> removing the defer of l10n.js. Personally I prefer the later option but I
> would like the UX designer make the decision.
> 
> Hi Jenny, I'll show you the results of these two options later.

Hello Arthur,
I prefer the one that provides most stability (method 2 remove defer="" from l10n.js script). Thanks!
Flags: needinfo?(jelee)
I filed bug 1057529 to remove defer. It seems to fix the problem and help with performance for non-default locale.
With bug 1057529 fixed, we have the l10n part of the bug covered. Now we have to deal with the fact that the UI is clickable before the bindings are set.
Requesting QAwanted for branch checks per jsmith - need to know if this should get fixed on other branches as well.
Keywords: qawanted
QA Contact: pcheng
Issue is reproducible on Flame 2.2, Flame 2.1, and Open C 2.1.

Observed behavior: Within Settings app, tapping on Wifi option before Settings fully loads opens a blank page. Changing language is NOT needed (step 1 of original STR) in order to reproduce the issue. Issue is 100% reproducible if done fast enough.

Device: Flame 2.2 Master
BuildID: 20140903115424
Gaia: af04d8bc2111d4ea146239a89ff602206b85eaf5
Gecko: 117271830c4d
Version: 35.0a1 (2.2 Master)
Firmware: V123
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Device: Flame 2.1
BuildID: 20140903085050
Gaia: fbb297c39aab5f17b179533d2a9a6c5166b2c197
Gecko: 31dad821234e
Version: 34.0a2 (2.1)
Firmware: V123
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

Device: Open_C 2.1
BuildID: 20140903085050
Gaia: fbb297c39aab5f17b179533d2a9a6c5166b2c197
Gecko: 31dad821234e
Version: 34.0a2 (2.1)
Firmware: P821A10v1.0.0B06_LOG_DL
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

----------

This bug does NOT occur on Flame 2.0.

Tapping on Wifi option before Settings fully loads, the Wifi option stays highlighted for about a second, then enters Wifi option successfully.

Device: Flame 2.0
BuildID: 20140903075252
Gaia: d056733f8a7a1a152f5458b323f092c47dbffa48
Gecko: 742cb642750f
Version: 32.0 (2.0)
Firmware: V123
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: qawanted
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
Status: NEW → ASSIGNED
EJ, could you help check this simple patch? Thanks!

As the UI is being displayed first, we should should block the click events until the first panel gets ready (which implies that the necessary AMD modules are loaded).
Attachment #8484075 - Flags: review?(ejchen)
Whiteboard: [p=1][ETA: 9/15]
Target Milestone: --- → 2.1 S5 (26sep)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][lead-review+]
Comment on attachment 8484075 [details]
link to https://github.com/mozilla-b2g/gaia/pull/23711

Thanks Arthur, we should block the click events before panels are ready.

r+
Attachment #8484075 - Flags: review?(ejchen) → review+
Comment on attachment 8484075 [details]
link to https://github.com/mozilla-b2g/gaia/pull/23711

[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): Bug 964180
[User impact] if declined: Users have to restart the app to recover.
[Testing completed]: Manual tests completed.
[Risk to taking this patch] (and alternatives if risky): Low
[String changes made]: N/A
Attachment #8484075 - Flags: approval-gaia-v2.1?(bbajaj)
master: 4995f9c6a62734d2e21a1be120d69641a3c9a551
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Attachment #8484075 - Flags: approval-gaia-v2.1?(bbajaj) → approval-gaia-v2.1+
This issue is verified fixed for the Flame 2.2 Master (319mb) and the Flame 2.1 KK (319mb)

Flame 2.2 Master KK (319mb) (Full Flash)

Device: Flame 2.2 Master
BuildID: 20141011040204
Gaia: 95f580a1522ffd0f09302372b78200dab9b6f322
Gecko: 3f6a51950eb5
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 35.0a1 (2.2 Master)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.1 KK (319mb) (Full Flash)

Device: Flame 2.1
BuildID: 20141011000201
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: d813d79d3eae
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 34.0a2 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Result: This issue is verfied fixed for the Flame 2.2 Master (319mb) and the Flame 2.1 KK (319mb)

Flame 2.2 Master KK (319mb) (Full Flash)

Device: Flame 2.2 Master
BuildID: 20141011040204
Gaia: 95f580a1522ffd0f09302372b78200dab9b6f322
Gecko: 3f6a51950eb5
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 35.0a1 (2.2 Master)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Flame 2.1 KK (319mb) (Full Flash)

Device: Flame 2.1
BuildID: 20141011000201
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: d813d79d3eae
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 34.0a2 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Result: Wi-Fi panel is no longer blank.
Status: RESOLVED → VERIFIED
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage+][lead-review+] → [QAnalyst-Triage?][lead-review+]
QA Whiteboard: [QAnalyst-Triage?][lead-review+] → [QAnalyst-Triage+][lead-review+]
Flags: needinfo?(ktucker)
You need to log in before you can comment on or make changes to this bug.