Settings' startup time is over the 1000 ms acceptance threshold for 2.1

Status

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: gmealer, Assigned: arthurcc)

Tracking

({perf})

unspecified
x86
All

Firefox Tracking Flags

(blocking-b2g:-, tracking-b2g:+)

Details

(Whiteboard: [priority])

Attachments

(1 attachment)

[Blocking Requested - why for this release]:

See test results at:

https://wiki.mozilla.org/B2G/QA/2014-10-02_Performance_Acceptance#Settings

Median startup time was 2577 ms for 450 launches. This is, however, a huge improvement from the 2.0 median startup time of 3397 ms.

Graphs, raw data and other testing details are on the wiki.

Realistically, I don't think there is any way this is hitting 1000 ms for release, but I'm noming for an explicit decision and perhaps guidance on a more realistic threshold for us to target.

If the problem is limited to a small set of widgets dependent on hardware status checks to finalize (which is what I suspect might be the case) it might be worth judging whether a user would really find that to be a bad experience and adjusting the definition of visually-complete or acceptance accordingly.
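For reference, the gap between the quoted medians and the threshold can be worked out directly. This is only an arithmetic sketch over the figures in this description; the variable names are illustrative and are not part of any test harness.

```python
# Medians quoted in this report, in milliseconds. Names are illustrative only.
MEDIAN_2_0_MS = 3397   # 2.0 median startup time
MEDIAN_2_1_MS = 2577   # 2.1 median over 450 launches
THRESHOLD_MS = 1000    # acceptance threshold for 2.1

improvement_ms = MEDIAN_2_0_MS - MEDIAN_2_1_MS
improvement_pct = 100 * improvement_ms / MEDIAN_2_0_MS
over_threshold_ms = MEDIAN_2_1_MS - THRESHOLD_MS

print(f"Improvement over 2.0: {improvement_ms} ms ({improvement_pct:.1f}%)")
print(f"Still over threshold by: {over_threshold_ms} ms")
```

So 2.1 is roughly a quarter faster than 2.0 at the median, but still more than 1.5 seconds above the 1000 ms threshold.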

Comment 1

4 years ago
Triage group: this does not block, as there is no realistic threshold for us to target. Putting it in the backlog to keep track of it.
blocking-b2g: 2.1? → backlog
Tim, we're going to be retesting Settings (and the other apps) periodically between now and release for performance acceptance and it would be useful to have some bright-line performance requirement for 2.1. 

I think it's plain that 1000 ms isn't realistic, but we were hoping this bug might actually produce an achievable number to target with the intention that we keep bringing that number down incrementally for subsequent releases. Even if we peg that number at 2600 ms for 2.1, per current results, that's better than no target at all.

Any comments here? In the 2.1 timeframe, can we identify a number where QA and release can effectively say "above this, no; below this, yes"?
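A bright-line check of the kind asked for above could be as simple as the sketch below. It assumes the verdict is based on the median of the launch times, as in the wiki results; the function name and the sample values are hypothetical, and the 2600 ms target is just the figure floated in this comment.

```python
from statistics import median


def acceptance_result(launch_times_ms, target_ms):
    """Return (median, verdict) for one app's startup samples."""
    if not launch_times_ms:
        raise ValueError("need at least one launch time")
    med = median(launch_times_ms)
    # "above this, no; below this, yes": treating a median exactly on the
    # target as passing is an assumption made for this sketch.
    verdict = "PASS" if med <= target_ms else "FAIL"
    return med, verdict


# Hypothetical samples, not the real 450-launch data set.
samples_ms = [2490, 2577, 2610, 2550, 2702]
print(acceptance_result(samples_ms, target_ms=2600))  # target floated above
print(acceptance_result(samples_ms, target_ms=1000))  # original threshold
```

The same samples pass against 2600 ms and fail against 1000 ms, which is the whole point of agreeing on a number first.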
Flags: needinfo?(timdream)
Whiteboard: [priority]
I have no personal opinion other than agreeing with what you said (having a measurement to ensure we don't regress).

Redirecting to the Settings app owner.
Flags: needinfo?(timdream) → needinfo?(arthur.chen)
Created attachment 8503948 [details]
WIP

Because the initial panel to be displayed already exists in the DOM, this patch fires |moz-app-visually-complete| right after localization completes, at the same time as |moz-chrome-dom-loaded|.

I was trying to get the performance number using |make test-perf| but with no luck. It kept throwing assertion error "could not collect memory usage". Assuming that the number is close to |moz-chrome-dom-loaded|, the startup time after applying the patch should be around 1200ms based on the numbers on datazilla[1].

[1]: https://datazilla.mozilla.org/b2g/?branch=master&device=flame-319MB&range=7&test=startup_%3E_moz-chrome-dom-loaded&app_list=calendar,camera,clock,communications/contacts,communications/dialer,costcontrol,email%20FTU,fm,gallery,settings,sms,video&app=communications/dialer&gaia_rev=d0969a2263120c2a&gecko_rev=b10f859881e5&plot=median
Flags: needinfo?(arthur.chen)
Assignee: nobody → arthur.chen
I am inclined to set a target that ensures we don't regress from what we have for v2.1. For future releases, we will keep improving performance to meet the 1000 ms target.
[Blocking Requested - why for this release]:

As of 10-31, Settings is still well over guidelines and has regressed from the 10-02 number that was advanced as the informal target in Comment 5 (from 2577 ms to 2823 ms).

I'm renominating this for blocking, pending at least seeing it come back down to the 10-02 numbers, a decision made regarding placement of the timing event, or some other concrete way of addressing the results. I don't think we can just backlog this one without some attention.

https://wiki.mozilla.org/B2G/QA/2014-10-31_Performance_Acceptance#Settings
blocking-b2g: backlog → 2.1?
(In reply to Arthur Chen [:arthurcc] from comment #4)
> Created attachment 8503948 [details]
> WIP
> 
> Because the initial panel to be displayed already exists in the DOM, this
> patch fires |moz-app-visually-complete| right after localization completes,
> at the same time as |moz-chrome-dom-loaded|.

I think that this is a mistake. 

According to https://developer.mozilla.org/en-US/Apps/Build/Performance/Firefox_OS_app_responsiveness_guidelines

 - moz-chrome-dom-loaded should be fired when chrome is displayed.
 - moz-chrome-dom-interactive should be fired when menu options are interactive
 - moz-app-visually-complete should be fired when updates to the screen stop (so, when Wifi/Antenna states are loaded)

From what I understand about the Settings app, those three events should fire at different times.

NI on Eli to confirm that
Flags: needinfo?(eperelman)

Comment 8

4 years ago
Gandalf is, for the most part, correct. The Settings app has no functional chrome, so moz-chrome-dom-loaded and moz-chrome-dom-interactive can be fired pretty much immediately.

As far as visually complete meaning "updates to screen have stopped", that is mostly correct, but more correct would be to say "when the app appears ready for the user to interact with". If Settings appears ready to use but there are still updates to the screen, I think that is acceptable, as long as that is genuinely what is happening. Otherwise what Gandalf says holds true: don't fire visually-complete until it appears to have finished loading.

(In reply to Arthur Chen [:arthurcc] from comment #4)
> I was trying to get the performance number using |make test-perf| but with
> no luck. It kept throwing assertion error "could not collect memory usage".

Your gonk is out of date; you need to do a full flash with the latest and that should resolve your issue.
Flags: needinfo?(eperelman)
(In reply to :Eli Perelman from comment #8)
> Gandalf is, for the most part, correct. The Settings app has no functional
> chrome, so moz-chrome-dom-loaded and moz-chrome-dom-interactive can be fired
> pretty much immediately.
> 
> As far as visually complete meaning "updates to screen have stopped", that
> is mostly correct, but more correct would be to say "when the app appears
> ready for the user to interact with". If Settings appears ready to use but
> there are still updates to the screen, I think that is acceptable, as long
> as that is genuinely what is happening. Otherwise what Gandalf says holds
> true: don't fire visually-complete until it appears to have finished loading.

Oh, ok, so chrome-dom-loaded & chrome-dom-interactive are fairly meaningless for Settings.

moz-content-interactive and moz-app-visually-complete are meaningful.

 - content-interactive should be fired when menus are interactive (toggles can be toggled, menu options can be clicked)
 - app-visually-complete is trickier, as it represents the point in the UX at which the user will consider the app to be fully loaded.

I'm not sure how to tackle the Wifi/Antenna states here. On one hand I believe that if content is interactive before the Airplane mode is enabled, the app is usable. On the other, it still "feels" like the app is loading until it's done. The visual changes on the screen are pretty substantial when the hardware finishes loading, so I have a hard time saying that we're "visually complete" before that.

I think we need to make an arbitrary call here and I don't feel qualified to make it, so below is just my personal opinion:

I'd lean toward marking app-visually-complete only when everything shows up on the screen, firing content-interactive when menus are interactive, and using content-interactive as the main performance metric for this app.
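To make the proposed split concrete, here is a small model of how the two events would relate under this suggestion. It is plain Python over made-up timings, not Gaia code; the function name, the widget names and the numbers are all hypothetical.

```python
def launch_metrics(menus_interactive_ms, hardware_ready_ms):
    """Map hypothetical launch timings onto the two meaningful events.

    menus_interactive_ms: when toggles and menu options respond to input.
    hardware_ready_ms: dict of widget name -> time at which that
        hardware-backed widget shows its final state.
    """
    # The screen is not settled until the slowest widget has finalized,
    # and never before the menus themselves are usable.
    visually_complete_ms = max(
        [menus_interactive_ms, *hardware_ready_ms.values()]
    )
    return {
        "moz-content-interactive": menus_interactive_ms,
        "moz-app-visually-complete": visually_complete_ms,
    }


# Made-up timings, for illustration only.
metrics = launch_metrics(1200, {"wifi": 2100, "antenna": 2550})
print(metrics)
```

With timings like these, content-interactive reflects when the app is usable, while app-visually-complete is dominated by the slowest hardware status check.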
(In reply to Zibi Braniecki [:gandalf] from comment #9)
> I think we need to make an arbitrary call here and I don't feel qualified to
> make it, so below is just my personal opinion:
> 
> I'd lean toward marking app-visually-complete only when everything shows up
> on the screen, firing content-interactive when menus are interactive, and
> using content-interactive as the main performance metric for this app.

Agree with Zibi. We've filed bug 1090843 and bug 1079576 to optimize the time required to reach an interactive and stable screen. They also ensure that there is no major reflow in the above-the-fold screen during launch. Note that this is achieved by delaying the loading of hardware information, so the app-visually-complete time is affected.
(In reply to Arthur Chen [:arthurcc] from comment #10)
> After bug 1089459 landed the number went back to what we had on 10/02[1].
> 
> [1]:
> https://datazilla.mozilla.org/b2g/?branch=v2.1&device=flame-319MB&range=60&test=startup_%3E_moz-app-visually-complete&app_list=settings&app=settings&gaia_rev=80479e8af783f6f0&gecko_rev=c20912812877&plot=avg

That's great news! We'll be retesting at the EOW, so this improvement should show up then. If we get back to those initial numbers, I think backlogging this makes more sense.

Re: Gandalf's suggestion, sounds reasonable to me to go off the interactive metric. 

However, do understand that partner acceptance testing and Android comparison testing are both camera-based, and you can't tell on camera when the app is interactive. That's why we adopted visually-complete.

My suggestion is that we want both: interactive tells a truer story from our internal timings, but we still need to lock down a visually-complete target to be able to predict partner tests.

Comment 13

4 years ago
Triage: 1000 ms is not a practical goal for the Settings app, so removing the nomination. Flagging tracking-b2g to keep working on this so we do not lose track of it.
blocking-b2g: 2.1? → -
tracking-b2g: --- → +
(In reply to howie [:howie] from comment #13)
> Triage: 1000 ms is not a practical goal for the Settings app, so removing the
> nomination. Flagging tracking-b2g to keep working on this so we do not lose
> track of it.

Howie, Tim,
QA has been asking every team to define an "acceptable" startup time for each app. If 1000 ms is not practical, please provide a realistic number, followed by the recommended threshold that a regression cannot go beyond.

For example, the Calendar team has concluded that for 2.1, 1150 ms is the acceptable startup number, and any increase > 100 ms should be tracked as a regression.

I'd like to see the Settings team make the same statement, and then update the performance acceptance page for 2.1 at: https://wiki.mozilla.org/FirefoxOS/Performance/Release_Acceptance#2.1

This way, QA and Dev can track bugs as "PASS" or "FAIL" based on the defined criteria.
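A criterion of that shape (an acceptable number plus a regression margin) could be checked roughly as sketched below. The function and parameter names are hypothetical, and reading the 100 ms margin as "slower than the previous run by more than 100 ms" is an assumption; the thread does not pin that down. Applying Calendar's 100 ms margin to Settings is also only for illustration.

```python
def classify(median_ms, acceptable_ms, previous_ms, regression_margin_ms=100):
    """Return (verdict, is_regression) for one test run."""
    verdict = "PASS" if median_ms <= acceptable_ms else "FAIL"
    # Assumed reading: a run more than the margin slower than the previous
    # run is tracked as a regression, even if it still passes.
    is_regression = (median_ms - previous_ms) > regression_margin_ms
    return verdict, is_regression


# Settings figures from this bug: 2577 ms on 10-02 and 2823 ms on 10-31,
# checked against the 2600 ms figure floated earlier in the thread.
print(classify(2823, acceptable_ms=2600, previous_ms=2577))
# Calendar's 1150 ms target with hypothetical run times.
print(classify(1140, acceptable_ms=1150, previous_ms=1020))
```

Keeping the two answers separate lets a run be reported as passing while still being flagged for a regression bug.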

Thanks.
Flags: needinfo?(timdream)
Flags: needinfo?(hochang)
Per the description and comment 5, let's use 2600ms for the median time of startup. I've already updated the wiki page.
Flags: needinfo?(timdream)
Flags: needinfo?(hochang)
Resolving the bug per the latest (11/7) performance report[1]. The current result is within the acceptance criteria. Please reopen the bug if it fails the test.

[1]: https://wiki.mozilla.org/B2G/QA/2014-11-07_Performance_Acceptance
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED