1004429 - 5% tart regression for linux 64 on fx-team (v32) April 24

Reporter

Description

•

11 years ago

when http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=bd55f7f8b48c&tochange=532886a149ab landed on fx-team April 24th we ended up with a 5% regression for tart on linux64. here is a graph to prove it: http://graphs.mozilla.org/graph.html#tests=[[293,64,35]]&sel=none&displayrange=30&datatype=running I had done a few retriggers in case there was some noise: https://tbpl.mozilla.org/?tree=Fx-Team&fromchange=96eefe207020&tochange=1f5f1fe135b9&jobname=Ubuntu%20HW%2012.04%20x64%20fx-team%20talos%20svgr here is more information about tart: https://wiki.mozilla.org/Buildbot/Talos/Tests#TART

Drew Willcoxon :adw

Comment 1

•

11 years ago

A few thoughts: There's now more visible content in the page's source -- the search bar. I don't know whether the time to build that content and the time to draw it contribute to the regression. In the preloaded case, the time to build it shouldn't contribute. When the page moves out of the preloader and is shown, the search panel asynchronously builds itself: it creates and adds to itself a small number of nodes per search engine. "Asynchronously" because it sends an async message to chrome, waits for a response, and then builds itself based on the response. The response is large because it contains some base 64-encoded small images. I don't know if that matters. When the page loads, the search bar's width is changed along with the grid's. In the preloaded case, that ought to happen before the tab is shown. In the non-preloaded case, resizing the search bar shouldn't be any more expensive than resizing the grid, which was already happening before these changesets.

Joel Maher ( :jmaher ) (UTC -8)

•

11 years ago

hey, we are all in this together. Maybe backing out a small part on try to see if we can reduce the scope of code that causes the problem could be a next step.

Drew Willcoxon :adw

Comment 6

•

•

11 years ago

Assignee: nobody → adw

Status: NEW → ASSIGNED

Drew Willcoxon :adw

Comment 14

•

11 years ago

Both of those were 3.7 ms, which is not 3. There was one small subsequent fix that Ed landed in http://hg.mozilla.org/mozilla-central/rev/5a01a53e80e8, so I tried reverting that on top of the other two reverts: https://tbpl.mozilla.org/?tree=Try&rev=ad5deb12df72 But that was 3.7, too. Observations so far: Merely the presence of the XUL, CSS, and one related querySelector() call (https://hg.mozilla.org/try/rev/13f45a0f910f) seems to account for 1.4 ms of the 2 ms regression. But even after reverting the changesets there's ~0.6 ms that I can't account for.

Avi Halachmi (:avih)

Comment 15

•

•

11 years ago

Whiteboard: [talos_regression] s=IT-32C-31A-30B.1 [qa-] → [talos_regression] p=13 s=it-32c-31a-30b.1 [qa-]

Marco Mucci [:MarcoM]

Updated

•

11 years ago

Whiteboard: [talos_regression] p=13 s=it-32c-31a-30b.1 [qa-] → [talos_regression] p=13 s=it-32c-31a-30b.2 [qa-]

Benjamin Kerensa [:bkerensa]

•

11 years ago

I call startProfiling at the end of Tart.prototype._startTest, right before the _doSequence call. I call stopProfiling in _doneInternal. In tart.html, the only box I check under "Configure TART" is "newtabYesPreload". I set "Repeat" to 3. I uncheck "[Uncheck when profiling]". I hit the Start button. When I load the resulting profiles in Instruments, profiles without search are ~30s, profiles with search are ~40s, both of which times match my subjective experience of waiting for the test to finish. (In reply to Avi Halachmi (:avih) from comment #28) > And even if yes, I also don't understand how you conclude that longer or > shorter run time correlates to tart regression reports. I don't think I said that? But I don't understand what you mean. In the profiles' call trees, I'm looking for calls in the with-search profiles that are absent from the without-search profiles, and calls in the with-search profiles that take longer than their counterparts in the without-search profiles. Is that not right? Some of those calls are in the call tree diff above. > Once you're able to do that - see on your own system that TART reports > different results for different builds - then you should profile and see if > it shows interesting differences at the profile. I am able to see that and that's what I've been profiling, reporting, and trying to make sense of. Without profiling, I've been setting "Repeat" to 10 and running "newtabYesPreload": with search, I'm getting anywhere from 90-180ms for newtab-open-preload-yes.all, again on a debug build if that matters; without search, 60-70ms. When I think I've found a lead and try a patch locally, and the patch reduces the test time significantly, but usually not consistently, I push to try and check datazilla, but the results aren't repeated there. I don't know whether these patches actually improve things but only a very small amount that's lost in the noise, or they don't improve anything at all.

Avi Halachmi (:avih)

Comment 30

•

11 years ago

(In reply to Drew Willcoxon :adw from comment #29) > I call startProfiling at the end of Tart.prototype._startTest, right before > the _doSequence call. I call stopProfiling in _doneInternal. In tart.html, > the only box I check under "Configure TART" is "newtabYesPreload". I set > "Repeat" to 3. I uncheck "[Uncheck when profiling]". I hit the Start > button. I see. Do I understand correctly that you modified TART? If yes, what is your base build? and please post the patch you used to modify it. Without seeing the patch and after reading the above, I'd say that it's probably incorrect and captures more than it should, but post the base TART build and the patch, and let's take it from there. > When I load the resulting profiles in Instruments, profiles without search > are ~30s, profiles with search are ~40s, both of which times match my > subjective experience of waiting for the test to finish. I'm not familiar with "Instruments", but I still can't understand where could the difference come from. As far as I can guess what your TART patch does, it still shouldn't result in this. I do know that TART is not expected (or programmed) to take different duration to complete based on different performance of the build. So something here is definitely unexpected, at least to me. We should first make sure we don't have unexpected stuff - before going on to interpret the results it produced IMO. > (In reply to Avi Halachmi (:avih) from comment #28) > > And even if yes, I also don't understand how you conclude that longer or > > shorter run time correlates to tart regression reports. > > I don't think I said that? But I don't understand what you mean. Since we're trying to address an issue which was noticed on TART results, this makes TART the reference, but you quoted numbers which not taken with TART, so I assume you believe your numbers somehow correlate to the TART regressions. Is this not the case? > In the profiles' call trees, I'm looking for calls in the with-search > profiles that are absent from the without-search profiles, and calls in the > with-search profiles that take longer than their counterparts in the > without-search profiles. Is that not right? Some of those calls are in the > call tree diff above. Yes, this sounds very plausible, as long as your profiles are collected correctly. > Without profiling, I've been setting "Repeat" to 10 and running > "newtabYesPreload": with search, I'm getting anywhere from 90-180ms for > newtab-open-preload-yes.all, again on a debug build if that matters; without > search, 60-70ms. Right, so this means you can reproduce the talos results locally, and see even bigger regressions. Excellent. We couldn't continue without it. > When I think I've found a lead and try a patch locally, > and the patch reduces the test time significantly, but usually not > consistently Yup, that's the hard part. Hopefully we can make it easier by grabbing the profiles more accurately.

Drew Willcoxon :adw

Comment 31

•

Comment 36

•

11 years ago

From IRC: <ttaubert> the problem is that the event listener only exists in browser windows, not the hidden window <ttaubert> so initializing the about:newtab search wouldn't work in the preloaded tab

Avi Halachmi (:avih)

Comment 37

•

11 years ago

Is it possible to change some of the async stuff to sync? Would it work better? any downsides for such approach?

Tim Taubert [:ttaubert] (inactive)

Comment 38

•

11 years ago

Yeah, we can't do that synchronously as we want to support e10s in the future. There are a lot of bugs waiting for about:newtab to become unprivileged.

Drew Willcoxon :adw

Comment 39

•

•

11 years ago

•

11 years ago

I thought the proximate cause here was the search field - are you saying it's actually tiles (or both)?

Avi Halachmi (:avih)

Comment 55

•

11 years ago

This bug was not filed in order to back out the newtab search bar. It was filed in order to track and hopefully fix the regression caused by the search bar. This bug's regression of ~40% on newtab tab animation was strictly from the search field, and some of it got mitigated by bug 1019990. I can't tell how much exactly because DataZilla is the only tool which can show it and it's kinda mostly not working right now. My hunch however, is that what's left is still meaningful. The tiles reference was with regards to the newtab page in general. I took your suggestion "tell exactly what regression is remaining" as a suggestion to looking at the big picture of the newtab page, and which the search and tiles are part of. Trying to "fix" every small 3-10% regression is both probably impossible and might also paint an incorrect or partial picture of "this only ended up as 3%, and that only regressed overall at 4%", while in practice, when you look at the newtab tab animation, it's considerably worse than just plain tab animation without the newtab page.

Avi Halachmi (:avih)

•

11 years ago

The original issue was somewhat mitigated at bug 1019990. The real problem here is that we can't track how much it improved since then, because this 5% overall regression is actually about 40% newtab related regressions, but we can't view newtab-only results because datazilla is non functional, and graphserver's averaging just doesn't give us the view we need. OTOH, it's been so long since this regression appeared, that it's impractical to fix it specifically. However, there's some major fx-team newtab perf work which now hangs off bug 1059558. So I'm closing this as WONTFIX, and the overall newtab perf discussion moves to bug 1059558.

Status: NEW → RESOLVED

Closed: 11 years ago

Flags: needinfo?(avihpit)

Resolution: --- → WONTFIX

datazilla-comment-3.png - taken 2014-0516 11 years ago Avi Halachmi (:avih) 339.85 KB, image/png		Details
TART native profiling patch 11 years ago Drew Willcoxon :adw 4.11 KB, patch		Details \| Diff \| Splinter Review