Closed Bug 981062 Opened 6 years ago Closed 6 years ago

~6% cart regression on Win 8 found on fx-team from march 7th.

Categories

(Firefox :: Toolbars and Customization, defect)

x86
Windows 8
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 30
Tracking Status
firefox29 --- fixed
firefox30 --- fixed

People

(Reporter: mconley, Assigned: mconley)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression, Whiteboard: [talos_regression][Australis:P2])

Attachments

(1 file)

Regression: Fx-Team - Customization Animation Tests - WINNT 6.2 x64 - 6.61% increase
------------------------------------------------------------------------------------
    Previous: avg 37.993 stddev 0.770 of 12 runs up to revision 5e6934522047
    New     : avg 40.503 stddev 0.897 of 12 runs since revision d8018c23945a
    Change  : +2.510 (6.61% / z=3.261)
    Graph   : http://mzl.la/P9b8db

Changeset range: http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=5e6934522047&tochange=d8018c23945a

Changesets:
  * http://hg.mozilla.org/integration/fx-team/rev/c368136afb26
    : D?o Gottwald <dao@mozilla.com> - Bug 978003 - Update inContentUI.css for Windows 8 and up. r=mdeboer
    : http://bugzilla.mozilla.org/show_bug.cgi?id=978003

  * http://hg.mozilla.org/integration/fx-team/rev/4ece8b3d89cd
    : fx4waldi@yahoo.com - Bug 975794 - Toolbarbuttons in the menu panel shouldn't get an inverted dropmarker when using a dark LWT. r=dao
    : http://bugzilla.mozilla.org/show_bug.cgi?id=975794

  * http://hg.mozilla.org/integration/fx-team/rev/2c0d7218a005
    : Margaret Leibovic <margaret.leibovic@gmail.com> - backout 9ee8634f70dc because it didn't work

  * http://hg.mozilla.org/integration/fx-team/rev/8b7f3531e937
    : Margaret Leibovic <margaret.leibovic@gmail.com> - backout 4bc6027f5638 for causing bug 978741
    : http://bugzilla.mozilla.org/show_bug.cgi?id=978741

  * http://hg.mozilla.org/integration/fx-team/rev/b5f281d7610c
    : Blake Winton <bwinton@latte.ca> - Bug 963999 - Make the background a more similar shade of grey. r=dao, ui-r=phlsa.
    : http://bugzilla.mozilla.org/show_bug.cgi?id=963999

  * http://hg.mozilla.org/integration/fx-team/rev/f02aa38ed27e
    : Gijs Kruitbosch <gijskruitbosch@gmail.com> - Bug 979456 - fix extra padding in bookmarks toolbar on Australis, r=mconley
    : http://bugzilla.mozilla.org/show_bug.cgi?id=979456

  * http://hg.mozilla.org/integration/fx-team/rev/6dd7b0359a89
    : Mike Conley <mconley@mozilla.com> - Bug 978767 - "Hovering over the zoom controls and cut/copy/paste in the menu on Linux makes the panel layout jump" [r=mdeboer]
    : http://bugzilla.mozilla.org/show_bug.cgi?id=978767

  * http://hg.mozilla.org/integration/fx-team/rev/c6123402c5f6
    : Robert Strong <robert.bugzilla@gmail.com> - bug 896223 - Remove the version 1 manifest from the mar creation scripts. r=nthomas
    : http://bugzilla.mozilla.org/show_bug.cgi?id=896223

  * http://hg.mozilla.org/integration/fx-team/rev/3b71d879000c
    : Robert Strong <robert.bugzilla@gmail.com> - Bug 900251 - Add support for add-if-not instruction added by bug 759469 to the mar generation scripts. r=nthomas
    : http://bugzilla.mozilla.org/show_bug.cgi?id=900251

  * http://hg.mozilla.org/integration/fx-team/rev/6230354e8d05
    : Robert Strong <robert.bugzilla@gmail.com> - Bug 896224 - Remove support for the version 1 manifest from the updater. r=bbondy
    : http://bugzilla.mozilla.org/show_bug.cgi?id=896224

  * http://hg.mozilla.org/integration/fx-team/rev/75aef72bf1be
    : Robert Strong <robert.bugzilla@gmail.com> - Main patch for Bug 759469 - Add new updater instruction to add a file if it doesn't exist in the destination. r=bbondy
    : http://bugzilla.mozilla.org/show_bug.cgi?id=759469

  * http://hg.mozilla.org/integration/fx-team/rev/5265254ba734
    : Robert Strong <robert.bugzilla@gmail.com> - Updated test binaries for Bug 759469 - Add new updater instruction to add a file if it doesn't exist in the destination. r=bbondy
    : http://bugzilla.mozilla.org/show_bug.cgi?id=759469

  * http://hg.mozilla.org/integration/fx-team/rev/c1c3aa2b09d8
    : Robert Strong <robert.bugzilla@gmail.com> - Test cleanup and tests for Bug 759469 - Add new updater instruction to add a file if it doesn't exist in the destination. r=bbondy
    : http://bugzilla.mozilla.org/show_bug.cgi?id=759469

  * http://hg.mozilla.org/integration/fx-team/rev/260334e5c5aa
    : Ehsan Akhgari <ehsan@mozilla.com> - Bug 979067 - Stop exporting the guard object classes; r=froydnj
    : http://bugzilla.mozilla.org/show_bug.cgi?id=979067

  * http://hg.mozilla.org/integration/fx-team/rev/2fcb315f6461
    : Honza Bambas <honzab.moz@firemni.cz> - Bug 977339 - Do GECKO_SEPARATE_NSPR_LOGS=1 by default, r=jduell
    : http://bugzilla.mozilla.org/show_bug.cgi?id=977339

  * and 66 more

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=979456 - Australis: Extra padding exists at the left edge of the Bookmarks toolbar
  * http://bugzilla.mozilla.org/show_bug.cgi?id=977920 - MmsService should not check for nsIDOMMozGsmIccInfo when retrieving iccId
  * http://bugzilla.mozilla.org/show_bug.cgi?id=2
  * http://bugzilla.mozilla.org/show_bug.cgi?id=979843 - Remove redundant switch_to_frame code
  * http://bugzilla.mozilla.org/show_bug.cgi?id=935045 - Allow the user to enter birth date in contact add and contact edit
  * http://bugzilla.mozilla.org/show_bug.cgi?id=979341 - Error page displayed in Cost Control FTU, after setting Data usage limit
  * http://bugzilla.mozilla.org/show_bug.cgi?id=975637 - [B2G][l10n][FTE] Continent names display in English in the timezone title text
  * http://bugzilla.mozilla.org/show_bug.cgi?id=972645 - Implement Reset Password Link or Screen for FxA
  * http://bugzilla.mozilla.org/show_bug.cgi?id=970138 - [Tooling] Add a commit hook to unify the file permision
  * http://bugzilla.mozilla.org/show_bug.cgi?id=900251 - Add support for add-if-not instruction added by bug 759469 to the mar generation scripts
  * http://bugzilla.mozilla.org/show_bug.cgi?id=963999 - Difference between customize-entering and customize-entered states is too drastic
  * http://bugzilla.mozilla.org/show_bug.cgi?id=977435 - Better UI for TestAgent app: more powerful top panel
  * http://bugzilla.mozilla.org/show_bug.cgi?id=979205 - [tarako] Change the LMK and low-mem notification trigger preferences to be specified in KiB
  * http://bugzilla.mozilla.org/show_bug.cgi?id=759469 - Add new updater instruction to add a file if it doesn't exist in the destination.
  * http://bugzilla.mozilla.org/show_bug.cgi?id=978862
  * and 26 more

Analysis on Datazilla suggests this to be from bug 963999.

I'm doing some try pushes to try to confirm. Keeping this P- until we get a sense of how important this is to fix.

Baseline: https://tbpl.mozilla.org/?tree=Try&rev=43dabecb79e0
Backout 963999: https://tbpl.mozilla.org/?tree=Try&rev=e68d8bc4d94c
Backout not disabling urlbar: https://tbpl.mozilla.org/?tree=Try&rev=341d9e733be8
So I did some exploring on this one over the weekend. Here's what I did, and here's what I found out:

First, I broke the original patch for bug 963999 into 3 chunks (A, B and C). Then, I applied each patch individually against the baseline to see how each of them contributed to the regression. I also pushed pair combinations (so A+B, B+C, C+A), to check for interactions between patches.

Here are the pushes:

Baseline: https://tbpl.mozilla.org/?tree=Try&rev=e68d8bc4d94c
Chunk A: https://tbpl.mozilla.org/?tree=Try&rev=ae6d2ce8d76c
Chunk B: https://tbpl.mozilla.org/?tree=Try&rev=38ec3bbd16af
Chunk C: https://tbpl.mozilla.org/?tree=Try&rev=53355eb5a389
Chunk A + B: https://tbpl.mozilla.org/?tree=Try&rev=34489b3168e8
Chunk B + C: https://tbpl.mozilla.org/?tree=Try&rev=3fa14fe10225
Chunk C + A: https://tbpl.mozilla.org/?tree=Try&rev=b447dc849edc

Comparisons:

Baseline vs A: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=ae6d2ce8d76c&server=graphs.mozilla.org&submit=true
Baseline vs B: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=38ec3bbd16af&server=graphs.mozilla.org&submit=true
Baseline vs C: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=53355eb5a389&server=graphs.mozilla.org&submit=true
Baseline vs A + B: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=34489b3168e8&server=graphs.mozilla.org&submit=true
Baseline vs B + C: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=3fa14fe10225&server=graphs.mozilla.org&submit=true
Baseline vs C + A: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=b447dc849edc&server=graphs.mozilla.org&submit=true


Analysis:

Looking at the comparisons, I think we can rule out Chunk C as having anything to do with the regression. Chunk A and B, however, each seem to play a part.

Let's talk about Chunk A first.

Chunk A does 2 things: after wrapping the URL bar, it no longer sets the grab cursor or disables the URL bar text input. Instead, it hides the text inputs for both the search container and the URL bar at the start of the customize transition (when the customizing attribute is set on the main-window).

Here are the details of the Baselines vs Chunk A comparison: http://compare-talos.mattn.ca/breakdown.html?oldTestIds=34761881,34762051,34762075,34762111,34762121&newTestIds=34766479,34770617,34770653,34770717,34770727&testName=cart&osName=Windows%208&server=graphs.mozilla.org

Looking at Datazilla, it looks like that error regression for 1-customize-enter.error is likely just noise. The other regression on 3-customize-enter-css.error appears far more significant.

The fact that 1-customize-enter.error doesn't move significantly, while 3-customize-enter-css.error does is telling: 1-customize-enter.error looks for the amount of time we go outside of the expected duration for the customize mode transition from the start of the transition to the very end (where the user can begin interacting). 3-customize-enter-css.error is a little different - it measures the amount of time we go outside of the expected duration of the customize mode transition from the start of the transition to the end of the "squish" part of the transition (the margin-left and margin-right increase on the content-deck and navigation-toolbox).

What this means is that we increased the amount of time it takes to get to the end of the squish-transition, but did not significantly alter the length of the entire transition.

Thinking about it, this makes some sense - we moved the job of hiding the search-container and URL bar text inputs from the point of wrapping the toolbar items (which used to occur _after_ the squish transition) to the _start_ of the transition.

Also note that the average time to paint each of the frames of the transition actually go _down_ significantly. My hypothesis is that by hiding the search-container and URL bar text inputs at the start of the transition, we're adding something like 10ms of overhead to the first frame of the CSS transition test, but the average time to paint each frame is actually going down because each subsequent frame tends to be cheaper.

Avi, how does that analysis sound?
Flags: needinfo?(avihpit)
Ok, now let's talk about Chunk B.

Chunk B moves the disabling of the quit and help buttons in the panel from the end of the transition to the start, which places it within the realm of the customize-enter-css measurements.

I would have thought that setting opacity on things should be almost free since that sort of work is offloaded to the GPU, but I guess not - the comparison of Baseline VS Chunk B is pretty clear:

http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=38ec3bbd16af&server=graphs.mozilla.org&submit=true

I tried some experiments with this over the weekend. In particular, I attempted to only modify the panel elements when the panel was not in the DOM (at the point where it's been removed from the PanelUI-popup, and is in mid-transit to the panel customization holder). I also made sure that the borders of the quit and help buttons were transparent, and that we were only setting opacity on the icons.

Baseline: https://tbpl.mozilla.org/?tree=Try&rev=e68d8bc4d94c
Baseline + B: https://tbpl.mozilla.org/?tree=Try&rev=38ec3bbd16af
Baseline + B + PanelMod / Transparent Borders: https://tbpl.mozilla.org/?tree=Try&rev=bd652bb61ff0


Comparisons:

Baseline vs B: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=38ec3bbd16af&server=graphs.mozilla.org&submit=true
Baseline vs B + PanelMod / Transparent Borders: http://compare-talos.mattn.ca/?oldRevs=e68d8bc4d94c&newRev=bd652bb61ff0&server=graphs.mozilla.org&submit=true
Baseline + B vs Baseline + B + PanelMod / Transparent Borders: http://compare-talos.mattn.ca/?oldRevs=38ec3bbd16af&newRev=bd652bb61ff0&server=graphs.mozilla.org&submit=true

The last link there seems to suggest my experimenting didn't really give us much. I'm going to spend today seeing if I can reproduce this on my local Windows 8 machine. If so, I'll do some profiling to see what that tells us.

If I can't get to the bottom of this today, we might want to back out chunks A and B.
Alright, I've talked to avih and phlsa, and here's our plan:

We're going to keep chunk A landed - avih says it seems like a fine tradeoff to add 10ms of lag for smoother framerate.

Chunk B however, is a little more troubling - it just seems to add cost all around, so we're going to back that out. I'm going to try to see what I can do to make it cheaper, but if I can't, phlsa has said we might try to transition the opacity of those two buttons _after_ the squish transition occurs.

He says it's the second best option, but far better than just flat out disabling the buttons after the transition in a single frame.
Chunk A: IMO extra 10ms pre-lag is a fair price (though not negligible) to pay for smoother animation. I'm for keeping it.

Chunk B: it increases the lag by 20ms (70%) from 30 to 50ms (win8). The main problem is that we know that this value "scales" with HW perf, so this regression probably translates to more than 100ms on a slow system. If it was 70% regression off 1ms, then it wouldn't have mattered, but it's not. We should keep evaluating alternatives here IMO.

Great analysis mike!
Flags: needinfo?(avihpit)
Thanks - will upload a patch for review once Bugzilla realizes that attachments are cool to accept.
Requested a Win 8 slave for further analysis, since I can't seem to reproduce this regression on my VM.
Depends on: 981719
Assignee: nobody → mconley
Status: NEW → ASSIGNED
Comment on attachment 8388715 [details] [diff] [review]
Revert 963999's change to CustomizeMode.jsm

This should knock out a bunch of the regression.
Attachment #8388715 - Flags: review?(jaws)
Comment on attachment 8388715 [details] [diff] [review]
Revert 963999's change to CustomizeMode.jsm

Review of attachment 8388715 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the detailed write-up.
Attachment #8388715 - Flags: review?(jaws) → review+
remote:   https://hg.mozilla.org/integration/fx-team/rev/c0d6b0273d10

I never prioritized this, but I probably should have. I'm putting it at a P2 for now.
Whiteboard: [talos_regression][Australis:P-] → [talos_regression][Australis:P2][fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/c0d6b0273d10
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [talos_regression][Australis:P2][fixed-in-fx-team] → [talos_regression][Australis:P2]
Target Milestone: --- → Firefox 30
No longer depends on: 981719
Comment on attachment 8388715 [details] [diff] [review]
Revert 963999's change to CustomizeMode.jsm

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 

Bug 963999. That's got aurora approval pending on it as well, and these two should land at the same time to avoid the regression.


User impact if declined: 

Potentially worse customize mode transition performance on Windows 8.


Testing completed (on m-c, etc.): 

Lots of local testing, and this reverts us back to the way we've been for months.


Risk to taking this patch (and alternatives if risky): 

Extremely low.


String or IDL/UUID changes made by this patch:

None.
Attachment #8388715 - Flags: approval-mozilla-aurora?
Attachment #8388715 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
QA Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.