Closed Bug 1259479 Opened 8 years ago Closed 8 years ago

Autophone - Throbber start regression 2016-03-22 on fx-team

Categories

(Firefox for Android Graveyard :: General, defect)

defect
Not set
normal

Tracking

(fennec48+)

RESOLVED WONTFIX
Tracking Status
fennec 48+ ---

People

(Reporter: bc, Assigned: mcomella)

References

(Blocks 1 open bug)

Details

Change in Throbber start appears to affect everyone except Nexus 9 (5.0.2)
Does not appear to affect Throbber stop.

http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstart/local-blank/norejected/2016-03-22/2016-03-22/notcached/noerrorbars/standarderror/notry/mean

http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=43a70471d1acbc7dc39b645fe25645834fafbf6b&tochange=34227dc37f69cb1d156d401d5cdf5cfb4a4bc91e

-> Bug 1220309, Bug 1256427

Turn off the non fx-team data points to see the change clearly.

:mcomella, do you think your patches could have caused this?
Summary: Autophone - Throbber start regression 2016-03-02 → Autophone - Throbber start regression 2016-03-22
I doubt bug 1256427 had an effect, though bug 1220309 could have affected it. In that bug, we start to extend the support library Activity class, as opposed to using the framework methods. It's possible this could affect startup time because we're loading a separate activity.

It's possible this didn't affect the N9 because it's a new enough version that the support library might call out to the platform for this Activity, rather than running the shipped version.

Would it be possible to back out this change locally and test if it improves perf or would we have to back it out from fx-team?
Flags: needinfo?(bob)
I could do it on try I think. I'll do that this evening.
try with dabb7828c1ee reverted
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b01cea87b2a3&exclusion_profile=false&filter-tier=1&filter-tier=2&filter-tier=3

try with dabb7828c1ee
https://treeherder.mozilla.org/#/jobs?repo=try&revision=32f516b5c80e&exclusion_profile=false&filter-tier=1&filter-tier=2&filter-tier=3

phonedash graph

http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstart/local-blank/norejected/2016-03-25/2016-03-25/notcached/errorbars/standarderror/try/mean

A bit of a mixed bag. There does appears to be throbber start regression across the board though the nexus 6p's show it the least. The nexus 9s are a bit behind and will fill in later. It is sometimes difficult to tell from so few datapoints. I'll see about resubmitting these two builds again to double the data points used to collect the mean/error.
Flags: needinfo?(bob)
tracking-fennec: --- → ?
(In reply to Bob Clary [:bc:] from comment #3)

> A bit of a mixed bag. There does appears to be throbber start regression
> across the board though the nexus 6p's show it the least. The nexus 9s are a
> bit behind and will fill in later. It is sometimes difficult to tell from so
> few datapoints. I'll see about resubmitting these two builds again to double
> the data points used to collect the mean/error.

Have you been able to gather more data?
Flags: needinfo?(bob)
I did an additional run which upped the total iterations from 8 to 16. The patterns persisted.
Flags: needinfo?(bob)
Mike, let's figure out what's going wrong here to decide if this is something we can fix, or if this is a regression we're willing to accept.
Assignee: nobody → michael.l.comella
Blocks: 1220309
tracking-fennec: ? → 48+
Bob, a few questions about phonedash:
  * What units are the measurements in? It looks like we use SystemClock.uptimeMillis for the throbber messages in the code.
  * What are the difference between devices – e.g. "Nexus 6p-3" vs. "Nexus 6p-4"? Are they the same device model & version, but different physical devices?
Flags: needinfo?(bob)
My thoughts on what might be causing this:
 * Switching from FragmentActivity (which is also a support library construct) to AppCompatActivity is just slower and there's nothing we can do. This seems feasible – AppCompatActivity probably has a lot of custom XML attrs to attempt to load, among other things
 * (unlikely) The change to RemoteTabsPanelItem to set the text to not all caps in the home panel slows down startup. This is unlikely since this panel is not shown, and thus probably not inflated (and this style never used), but it's possible the style is used elsewhere or I'm wrong about this behavior.
 * (also unlikely) For the other bug listed, bug 1256427, we use Pair in HomeConfig – perhaps it is used in the critical path and is much slower than the framework version (that being said, I looked into it and it only appears to be used during home panel installation/removal, which I assume only happens for add-on home panels. It gets called during locale change, which maybe we call for the initial locale)
(In reply to Michael Comella (:mcomella) from comment #7)
> Bob, a few questions about phonedash:
>   * What units are the measurements in? It looks like we use
> SystemClock.uptimeMillis for the throbber messages in the code.

milliseconds

>   * What are the difference between devices – e.g. "Nexus 6p-3" vs. "Nexus
> 6p-4"? Are they the same device model & version, but different physical
> devices?

same model and android version, just different physical devices. The scheme is <devicetype>-<devicenumber>.
Flags: needinfo?(bob)
We can see the regression visually looking at http://phonedash.mozilla.org/#/2016-03-21/2016-03-25/binning=repo-phonetype-phoneid-test_name-cached_label-metric&rejected=norejected&errorbars=noerrorbars&errorbartype=standarderror&valuetype=median&remote-nytimes=on&throbberstart=on&first=on&fx-team=on&nexus-4-6=on&nexus-5-4=on&nexus-6-2=on&nexus-9-1=on

You can "zoom" in by drag selecting across the x-axis to see that Bug 1220309 is identified.

It appears across tests for throbber start for nexus 5 (android 4.4.2) and nexus 6 (android 5.1.1). nexus 9 appears to actually improve a bit while nexus 4 is either too noisy to see the regression or has no change.

However, using an adaptation of perfherder's alert calculation shows that none of the tests for the revision http://hg.mozilla.org/integration/fx-team/rev/34227dc37f69cb1d156d401d5cdf5cfb4a4bc91e meet the criteria for a perfherder alert (2% change with t-score >= 7) though a couple come close.

Given all of this and mcomella's comment 8, I would say wontfix for this bug. margaret?
Flags: needinfo?(margaret.leibovic)
Funny, I was just looking at this today!

Bob, do we have any way to prevent a gradual regression (e.g. regressing over the course of several months)?

If not, I wonder if we should dig in more. That being said, I don't think we'd choose not to ship these changes – it'd be a matter of trying to rejigger stuff to reduce their impact which is just taking shots in the dark [1], which is probably not the best use of time for a seemingly minor regression.

[1]: Some speculative possibilities:
 * AppCompat sets some of the text to all caps. If I'm not mistaken, toUpperCase is known for causing a lot of new String allocations and GC churn, which could slow us down
 * We're loading the basic libraries from our APK rather than from the device. This could mean 1) they're fragmented and slower to load (they're unlikely to be fragmented if they're shipped with the framework), 2) they're not cached (like the framework might be from other apps)
 * There's just a lot of additional styling going on with AppCompat that wasn't going on before these changes (e.g. many text views to upper case)
Flags: needinfo?(bob)
(In reply to Michael Comella (:mcomella) from comment #11)
> Funny, I was just looking at this today!
> 
> Bob, do we have any way to prevent a gradual regression (e.g. regressing
> over the course of several months)?
> 

We've just started reporting the data to Perfherder in addition to Phonedash. Perfherder will give us automatic alerts when the tests regress by more than 2% with a t score of 7 or more when comparing historical values (for n = 12 to 24) to future values (n = 12). It would have missed this one unfortunately, but I wrote a tool that does the same thing using the raw data from Phonedash and have a number of unfiled regressions I need to file.

So,... I think we will still have paper cuts like this one but they are pretty easy to spot using the new Phonedash UI. The combination of Perfherder + improved Phonedash should help us notice these regressions in a more timely fashion.

> If not, I wonder if we should dig in more. That being said, I don't think
> we'd choose not to ship these changes – it'd be a matter of trying to
> rejigger stuff to reduce their impact which is just taking shots in the dark
> [1], which is probably not the best use of time for a seemingly minor
> regression.
> 

I think we have bigger losses that we can look at and see if we can recover from rather than investing a lot of time and effort into this one at least for the moment.

> [1]: Some speculative possibilities:
>  * AppCompat sets some of the text to all caps. If I'm not mistaken,
> toUpperCase is known for causing a lot of new String allocations and GC
> churn, which could slow us down
>  * We're loading the basic libraries from our APK rather than from the
> device. This could mean 1) they're fragmented and slower to load (they're
> unlikely to be fragmented if they're shipped with the framework), 2) they're
> not cached (like the framework might be from other apps)
>  * There's just a lot of additional styling going on with AppCompat that
> wasn't going on before these changes (e.g. many text views to upper case)

These certainly sound promising. Let me file what I have found so far and we can prioritize the ones to work on first.
Flags: needinfo?(bob)
Summary: Autophone - Throbber start regression 2016-03-22 → Autophone - Throbber start regression 2016-03-22 on fx-team
Spoke to Margaret IRL – we're okay with this regression.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(margaret.leibovic)
Resolution: --- → WONTFIX
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.