Closed Bug 1415196 Opened 3 years ago Closed 3 years ago
4.21 - 21.46% tp6_amazon_heavy / tp6_facebook / tp6_facebook_heavy (linux64, windows7-32) regression on push 929b9943b4fc8af6af3d7553249995f51e02ac1d (Thu Nov 2 2017)
Talos has detected a Firefox performance regression from push: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=929b9943b4fc8af6af3d7553249995f51e02ac1d

As the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 21% tp6_facebook summary linux64 pgo 1_thread e10s 293.02 -> 355.92
 17% tp6_facebook_heavy summary windows7-32 opt e10s 297.96 -> 349.08
 17% tp6_facebook_heavy summary linux64 pgo e10s 293.02 -> 342.79
 13% tp6_facebook summary linux64 pgo e10s 294.15 -> 332.00
 13% tp6_facebook summary windows7-32 opt e10s 306.74 -> 345.27
 12% tp6_facebook summary windows7-32 pgo 1_thread e10s 252.85 -> 282.12
  9% tp6_facebook summary windows7-32 opt 1_thread e10s 312.27 -> 340.58
  5% tp6_facebook_heavy summary linux64 opt e10s 327.42 -> 343.83
  4% tp6_amazon_heavy summary windows7-32 pgo e10s 553.50 -> 576.79

Improvements:

 16% tp6_google summary windows10-64 opt 1_thread e10s 482.58 -> 406.04
 15% tp6_google_heavy summary windows10-64 opt e10s 474.75 -> 401.83
 15% tp6_google summary linux64 opt 1_thread e10s 527.00 -> 447.12
 15% tp6_google summary windows10-64 opt e10s 475.12 -> 403.46
 15% tp6_google summary windows10-64 pgo e10s 401.79 -> 341.88
 15% tp6_google_heavy summary linux64 opt e10s 546.08 -> 464.92
 15% tp6_google summary linux64 opt e10s 548.42 -> 467.58
 15% tp6_google_heavy summary windows7-32 opt e10s 480.58 -> 409.92
 14% tp6_google summary windows10-64 pgo 1_thread e10s 404.21 -> 345.83
 14% tp6_youtube_heavy summary windows7-32 pgo e10s 243.75 -> 208.83
 14% tp6_google_heavy summary windows10-64 pgo e10s 399.25 -> 342.17
 14% tp6_google summary windows7-32 pgo e10s 395.25 -> 339.33
 14% tp6_google summary windows7-32 opt e10s 477.90 -> 410.67
 14% tp6_google_heavy summary windows7-32 pgo e10s 393.88 -> 338.96
 14% tp6_google summary windows7-32 opt 1_thread e10s 488.38 -> 420.42
 13% tp6_google summary linux64 pgo 1_thread e10s 466.83 -> 404.25
 13% tp6_google summary linux64 pgo e10s 483.50 -> 419.12
 13% tp6_google_heavy summary linux64 pgo e10s 481.67 -> 418.21
 13% tp6_google_heavy summary osx-10-10 opt e10s 492.96 -> 429.79
 11% tp6_google summary windows7-32 pgo 1_thread e10s 388.93 -> 345.79
  2% tp5o summary windows10-64 opt e10s 258.39 -> 252.47

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=10314

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Product: Firefox → Core
:nbp Looks like there are some big regressions on the tp6_facebook tests. Can you please take a look over these?
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #1)
> :nbp Looks like there are some big regressions on the tp6_facebook tests.
> Can you please take a look over these?

I already tested this configuration in the previous report, the last time this got enabled, and concluded that it was a false positive (Bug 1392969).

Looking at some of the graphs reporting the regressions, it seems that the benchmark became bi-modal, while in some cases it improves the best case compared to the previous baseline. A bi-modal behaviour is expected with the JSBC, as it attempts to encode the bytecode and decode it on upcoming page loads. If a page is left before we had time to save the bytecode in the necko cache, then we might observe a slow-down as we keep trying to encode the page over and over. This should be fixed in a separate bug, and is unlikely to be seen by users on facebook.com and amazon.com pages: I would expect a normal user of facebook and amazon to spend time on each page and consume the content provided by the page.

(jmaher) This behaviour got fixed on tp5 by Bug 1372942; do we have something similar on tp6?

I will double-check once more with the Gecko Profiler, to see if I can reproduce a similar slow-down while emulating a user behaviour which implies staying on the page longer.
tp6 should have the wait for idle-callback as we do for tp5; this applies to all pageloader tests. I know we have tested locally, but I find it odd that this change corresponds 100% to a clear signal that facebook.com regressed, improved (on backout), and regressed again (on relanding). This is showing regressions on both opt and pgo; possibly there is something we are overlooking here?
OK, here is the procedure I followed to test on my own facebook profile:

0. Install the Gecko Profiler add-on: https://perf-html.io/
1. Open facebook.com and log in.
2. Repeat 6 times (to fill the cache with the bytecode):
   - Refresh the page.
   - Wait until it is loaded (no new element appearing on the page).
   - Wait 3 seconds.
3. Capture the JSBC-enabled profile:
   - Start the profiler.
   - Refresh the facebook page.
   - Wait 3 seconds.
   - Capture the profile: Ctrl+Shift+2.
   - Stop the profiler.
4. Go to about:config and set "dom.script_loader.bytecode_cache.enabled" to false.
5. Capture the JSBC-disabled profile:
   - Start the profiler.
   - Refresh the facebook page.
   - Wait 3 seconds.
   - Capture the profile: Ctrl+Shift+2.
   - Stop the profiler.

For both profiles:

0. Isolate & select the content process which has the most activity.
1. Go to the "Marker Chart" tab.
2. Double-click on the first "Navigation::Start" event.
3. Select the time slice (dark-grey vertical bar, at the top).
4. Drag the end of the time window until the last "Reflow" event appears cut in two at the end of the Marker Chart window.
5. Click on the loop in the time slice.
6. Read the time which appears at the top: "Full Range > x.x sec".

While testing on my laptop, on my facebook profile, I get:
- JSBC-enabled: 3.8 s
- JSBC-disabled: 4.5 s

which corresponds to a 15.5% improvement (Linux x64). This corresponds to the initial improvements reported on the opt build (Bug 1392969 comment 0), and to the last result I got using the same method (Bug 1392969 comment 4).

If you think the protocol I used above is not representative of a usual page load, please tell me what I should change; otherwise, I think this is still a harness issue, and we should fix the harness.
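As a side note, step 4 above can also be scripted by appending the pref to the profile's user.js, which Firefox applies at the next startup. A minimal sketch; the profile path is hypothetical (check about:profiles for yours), and a scratch directory is used here so the snippet is safe to run as-is:

```shell
# Hypothetical profile directory; point this at your real profile
# (see about:profiles) when actually testing. A scratch directory is
# used by default so this sketch does not touch a live profile.
PROFILE="${PROFILE:-$(mktemp -d)}"

# Disable the JS bytecode cache on the next browser start; equivalent
# to flipping the pref in about:config (step 4 above).
echo 'user_pref("dom.script_loader.bytecode_cache.enabled", false);' \
  >> "$PROFILE/user.js"

cat "$PROFILE/user.js"
```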
Flags: needinfo?(nicolas.b.pierron) → needinfo?(igoldan)
A few things:

1) The other pages have increases, so if there is a harness issue, possibly we are missing other regressions/gains; likewise, your patch could negatively affect other pages without our knowledge.
2) Your facebook profile could be much different than what we load in automation: https://github.com/rwood-moz/talos-pagesets/. Yours is probably more realistic, but we need someone to give us a realistic profile :)
3) You are measuring a hand-curated metric, arguably better than what we measure; as it stands, we have been asking for a HERO element or something more realistic for page measurements, and to date we haven't received that.
4) We measure time to first non-blank paint for calculating the score for each load of the page. This is measured with this code: https://searchfox.org/mozilla-central/source/testing/talos/talos/pageloader/chrome/pageloader.js#276

Here are the raw values of a tp6 stylo threads=1 run for the facebook.com page load:

before: [#0] / Cycles:25 Average:347.80 Median:302.00 stddev:95.26 (31.5%) stddev-sans-first:70.08
Values: 665.0 304.0 326.0 426.0 321.0 276.0 435.0 297.0 282.0 400.0 273.0 302.0 282.0 463.0 288.0 321.0 384.0 289.0 301.0 460.0 288.0 280.0 481.0 271.0 280.0

after: [#0] / Cycles:25 Average:374.92 Median:387.00 stddev:82.83 (21.4%) stddev-sans-first:57.17
Values: 668.0 258.0 271.0 391.0 336.0 275.0 307.0 284.0 348.0 373.0 376.0 324.0 301.0 403.0 418.0 421.0 329.0 387.0 413.0 449.0 391.0 398.0 397.0 421.0 434.0

Since this is sustained, we should look at how we summarize our findings. The averages are almost the same, but we have been using the median in all other cases. I suspect using the custom metric via profiling to find the time for page load would either fall into the same pattern over time, or prove to be more stable on both before/after. As you can see, any given data point can show an improvement or a regression.
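The median-versus-average point can be checked directly from the raw values quoted above; a quick plain-Python sketch (values copied from the tp6 facebook.com runs):

```python
import statistics

# Raw tp6 facebook.com page-load times (ms) from the run above:
# 25 cycles before and after the push.
before = [665, 304, 326, 426, 321, 276, 435, 297, 282, 400, 273, 302, 282,
          463, 288, 321, 384, 289, 301, 460, 288, 280, 481, 271, 280]
after = [668, 258, 271, 391, 336, 275, 307, 284, 348, 373, 376, 324, 301,
         403, 418, 421, 329, 387, 413, 449, 391, 398, 397, 421, 434]

for label, values in (("before", before), ("after", after)):
    print(label,
          "avg=%.2f" % statistics.mean(values),
          "median=%.2f" % statistics.median(values))
```

The average moves by only ~27 ms (347.80 -> 374.92) while the median moves by 85 ms (302 -> 387), which is why a median-based summary flags this as a sustained regression even though the means are close.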
Looks like a serious perf regression in 58 which we're shipping to beta. It would be good to have an owner here.
Priority: P2 → P1
(In reply to Jim Mathies [:jimm] from comment #6)
> Looks like a serious perf regression in 58 which we're shipping to beta. It
> would be good to have an owner here.

First, we need to know if this is a harness issue or a browser issue, right?

From what I tested and observed on new profiles instrumented with the Gecko Profiler, this is a ~15% improvement (comment 4) on the full page load, i.e. from navigation start to the last reflow.

The first-non-blank-paint metric (comment 5) reports a regression, but it is not clear to me that we even benefit from the JSBC there, as this might happen if we reload the page too fast. This would also explain the bi-modal behaviour of the benchmark (comment 2).

A simple way of testing that would be to re-run the talos benchmarks, but always waiting 10 seconds before refreshing the page, and also measuring the last reflow, to double-check that what I observe manually corresponds to what talos reports.
(In reply to Nicolas B. Pierron [:nbp] from comment #7)
> First, we need to know if this is a harness issue or a browser issue, right?
>
> From what I tested and observed on new profiles instrumented with the Gecko
> Profiler, this is a ~15% improvement (comment 4) on the full page load, i.e.
> from navigation start to the last reflow.
>
> The first-non-blank-paint metric (comment 5) reports a regression, but it
> is not clear to me that we even benefit from the JSBC there, as this might
> happen if we reload the page too fast. This would also explain the bi-modal
> behaviour of the benchmark (comment 2).
>
> A simple way of testing that would be to re-run the talos benchmarks, but
> always waiting 10 seconds before refreshing the page, and also measuring the
> last reflow, to double-check that what I observe manually corresponds to
> what talos reports.

Have you managed to investigate this?
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #8)
> (In reply to Nicolas B. Pierron [:nbp] from comment #7)
> > A simple way of testing that would be to re-run the talos benchmarks, but
> > always waiting 10 seconds before refreshing the page, and also measuring
> > the last reflow, to double-check that what I observe manually corresponds
> > to what talos reports.
>
> Have you managed to investigate this?

Who are you asking? I have not spent a single second on this bug, since it does not seem to be a JSBC issue from my point of view.
We have proven that bug 1405738 caused a perf regression. If we feel, based on local testing, that this is not a real regression, then we should wontfix this bug. Keep in mind there are variables beyond machine differences, such as which page you are using, Firefox preferences, graphics cards/drivers, etc.
I tried backing out the patch and we see a regression; this is on today's code, not the codebase from 3 weeks ago. I suspect that any work here will not uncover anything, and recommend closing this as wontfix.
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #11)
> I tried backing out the patch and we see a regression; this is on today's
> code, not the codebase from 3 weeks ago. I suspect that any work here will
> not uncover anything, and recommend closing this as wontfix.

Thanks, let's do that.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Clearing ni on Naveed as he no longer works at Mozilla.