Make tps Talos test less noisy

RESOLVED FIXED in Firefox 55

Status

()

Firefox
Tabbed Browser
P1
normal
RESOLVED FIXED
2 months ago
a month ago

People

(Reporter: mconley, Assigned: mconley)

Tracking

(Blocks: 2 bugs, {talos-regression})

unspecified
Firefox 56
talos-regression
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55 fixed, firefox56 fixed)

Details

(Whiteboard: [photon-performance])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(3 attachments)

(Assignee)

Description

2 months ago
Here's a graph showing the tps Talos test on automation over the past month or so:

https://screenshots.firefox.com/jeslGsuSRJAMV0Ta/health.graphics

There's a pretty steep regression on win7-32 since early May. That test is also really noisy on that platform, it seems.

We should find out what's going on. I'd be surprised if there weren't Talos alerts already associated with these regressions - we just need to find them and hook them up.
(Assignee)

Comment 1

2 months ago
FWIW, I haven't noticed any unexpected creeping regressions from our Telemetry data:

https://mikeconley.github.io/bug1310250/

So this regression doesn't _appear_ to exist out in the wild. Still, we should try to figure out what happened here.
Blocks: 1363755
Whiteboard: [photon-performance]

Updated

2 months ago
Flags: qe-verify?
Whiteboard: [photon-performance] → [photon-performance] [triage]

Updated

2 months ago
Flags: qe-verify? → qe-verify-
Priority: -- → P2
Whiteboard: [photon-performance] [triage] → [photon-performance]

Updated

2 months ago
Assignee: nobody → mconley
Status: NEW → ASSIGNED
Iteration: --- → 56.1 - Jun 26
Priority: P2 → P1
The "Tab Opening (tabpaint)" Metric also looks odd.  https://screenshots.firefox.com/Fh5IdxnDvl3Q86D2/health.graphics
(Assignee)

Comment 3

2 months ago
One of the challenges with tps is that the score is actually an aggregation of a bunch of subtest scores, so once you detect a regression, you need to untangle the subtests to see what's actually being impacted.

So I've started to do that. At least for the steeper regression that shows up in early June in the screenshot of comment 0, it looks like only a few subtests are affected:

https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bautoland,adc4a51a330fe97726a1b26b9c7fd8abfcb4e4db,1,1%5D&series=%5Bautoland,66b7148e5ff8413e13d3b5bf3113fb8abf1853f2,1,1%5D&series=%5Bautoland,400205fcea3cb47bccf7c06a5884986353e0c00c,1,1%5D&zoom=1496183433197.218,1497371642000,8.782767535148935,166.74156903327628
(Assignee)

Comment 4

2 months ago
That graph and the revisions it highlights seems to point at bug 1368872 as the potential regressor here.

In fact, bug 1368872 comment 0 actually points out that the pref flip in that bug causes a tps regression. Hey seanlee, was it expected that this regression should slip to beta? Were we going to address this regression somehow?
Flags: needinfo?(selee)
(Assignee)

Comment 5

2 months ago
The tps tests are pretty noisy. There _does_ appear to be a slight regression before bug 1368872 lands, but I'm having trouble nailing down exactly where.

Suspects by inspection are bug 1369141 and bug 1368103. I've kicked off a bunch of retriggers for those landings to see if the regression shows up more clearly with those landings.

Here's the graph to watch: https://treeherder.mozilla.org/perf.html#/graphs?timerange=7776000&series=%5Bmozilla-central,a86a2a069ed634663dbdef7193f2dee69b50dbc9,0,1%5D&series=%5Bautoland,890f291f15fa3591eb1694ceb3476e94a69a096a,1,1%5D&series=%5Bmozilla-inbound,890f291f15fa3591eb1694ceb3476e94a69a096a,1,1%5D&highlightedRevisions=9d55c2ccf868&highlightedRevisions=e2cf349598c8&zoom=1496260854424.963,1497376406000,10,43.67163814715485
Hey Mike,

For m-c, bug 1368872 comment 4 mentioned below:
(In reply to Sean Lee [:seanlee][:weilonge] from bug 1368872 comment 4)
> Since bug 1364818 refines `autofocus` case, the talos issue of this bug is
> resloved [1].
> Then we can turn on heuristics by default now.
> 
> [1]
> https://treeherder.mozilla.org/perf.html#/
> compare?originalProject=try&originalRevision=74da47a9d257c3e7f0467c0c68473d1f
> 5ad3cd16&newProject=try&newRevision=6be621841492d5464c126b12c2bee4cb323e4acb&
> framework=1&showOnlyImportant=0

I suppose enabling heuristics won't affect much talos regression based on the talos report.

FormAutofill won't be built in beta [2]. Any regression caused by FormAutofill is not possible to slip to beta until we enable it in beta.

[2] https://dxr.mozilla.org/mozilla-central/rev/91134c95d68cbcfe984211fa3cbd28d610361ef1/browser/extensions/moz.build#20,23
Flags: needinfo?(selee)
(Assignee)

Comment 7

2 months ago
(In reply to Sean Lee [:seanlee][:weilonge] from comment #6)
Hey seanlee,

I don't think I'm fully understanding here. I understand that the Form Autofill stuff won't be enabled on Beta (and I can confirm that no regression appears on beta: https://treeherder.mozilla.org/perf.html#/graphs?timerange=7776000&series=%5Bmozilla-beta,a86a2a069ed634663dbdef7193f2dee69b50dbc9,1,1%5D).

But what are we going to do about the actual regression on Nightly? Was bug 1364818 supposed to address it?
Flags: needinfo?(selee)
Hi Mike,

I should explain more for Nightly status.
At the moment of bug 1364818 landed, I did not see any regression for heuristics. 

However, I retriggered more jobs for TPS on windows[1]. Comparing to OSX and Linux, the regression does happen on rakuten.co.jp and naver.com for Windows platform only[2][3].

Ay the meantime, bug 1372843 is just fixed for improving the performance of heuristics. I found windows7-32[4] is improved for rakuten.co.jp and naver.com. I wish this solution can reduce the regression.

I still don't get why the significant result is only on Windows platform. May I have your thoughts here?

Thank you.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=74da47a9d257c3e7f0467c0c68473d1f5ad3cd16&newProject=try&newRevision=6be621841492d5464c126b12c2bee4cb323e4acb&framework=1&filter=tps&showOnlyImportant=0
[2] https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=74da47a9d257c3e7f0467c0c68473d1f5ad3cd16&newProject=try&newRevision=6be621841492d5464c126b12c2bee4cb323e4acb&originalSignature=ac0ceb5e254d455b35a76117cef9225ed4768340&newSignature=ac0ceb5e254d455b35a76117cef9225ed4768340&framework=1
[3] https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=74da47a9d257c3e7f0467c0c68473d1f5ad3cd16&newProject=try&newRevision=6be621841492d5464c126b12c2bee4cb323e4acb&originalSignature=890f291f15fa3591eb1694ceb3476e94a69a096a&newSignature=890f291f15fa3591eb1694ceb3476e94a69a096a&framework=1
[4] https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=612b20c5664ace6a303be53b1b765be3996db3e9&newProject=try&newRevision=94715b927cf99d934f1eb5fca01d346e5d61a66b&originalSignature=890f291f15fa3591eb1694ceb3476e94a69a096a&newSignature=890f291f15fa3591eb1694ceb3476e94a69a096a&framework=1
Flags: needinfo?(selee) → needinfo?(mconley)
(In reply to Sean Lee [:seanlee][:weilonge] from comment #8)
> Hi Mike,
> 
> At the moment of bug 1364818 landed,
> 
This statement should be corrected as `At the moment of bug 1364818 and bug 1368872 landed,`.
(Assignee)

Comment 10

2 months ago
Ah great, yes - I see the subtest improvements on autoland:

https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bautoland,400205fcea3cb47bccf7c06a5884986353e0c00c,1,1%5D&series=%5Bautoland,66b7148e5ff8413e13d3b5bf3113fb8abf1853f2,1,1%5D&series=%5Bautoland,adc4a51a330fe97726a1b26b9c7fd8abfcb4e4db,1,1%5D
Flags: needinfo?(mconley)
(Assignee)

Comment 11

2 months ago
Thanks for your work there!
(Assignee)

Comment 12

2 months ago
A place where we appeared to "regress", but didn't actually, is via bug 1369662, where I changed the definition of the test to avoid responding to MozAfterPaint events with empty rects (which indicates a composite, but nothing in the content area being painted).

You can see that impacting the graph here:

https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-central,117a4f9a29d95c4b5f2de267475fffacc53f5944,1,1%5D&highlightedRevisions=7c9d96bbc400

If it's possible, we should perhaps do try pushes for both that build and the beginning of the regression window, as we essentially corrected goalposts in the test, and we should "re-baseline".
Flags: needinfo?(mconley)
(Assignee)

Updated

2 months ago
See Also: → bug 1362920
(Assignee)

Comment 13

2 months ago
So it seems as if the erratic behaviour of the test (which was originally filed in bug 1362920) _does_ have something to do with e10s-multi.

I did a try build with dom.ipc.processCount set to 1:

https://treeherder.mozilla.org/perf.html#/graphs?series=%5Btry,890f291f15fa3591eb1694ceb3476e94a69a096a,1,1%5D&highlightedRevisions=26814a834efc

And the datapoints really tighten up.

I'm going to do some try builds with dom.ipc.processCount set to 1 for the May 1st build as well, and then do a comparison to see if by subtracting the noise, we see any actual regression here - or if the noise is somehow a factor.
(Assignee)

Comment 14

2 months ago
Comparison between m-c from May 1st, and m-c from today with dom.ipc.processCount set to 1:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=26814a834efc&newProject=try&newRevision=dbc19e5e1ab602d1f606a907519738f11c93f6d1&framework=1&showOnlyImportant=0

Updated

2 months ago
Iteration: 56.1 - Jun 26 → 56.2 - Jul 10
(Assignee)

Comment 15

2 months ago
Ah hah - I think I may have figured out the noise issue.

I _think_ this has something to do with clock drift. The tps (tabswitch) test uses Date.now() in both the parent and the content processes in order to figure the amount of time that has passes between:

a) The initialization of a tab switch

and

b) The content area of the selected tab getting a MozAfterPaint

I have a hunch that an increase in the number of content processes (and therefore, the number of threads vying for attention from the CPU) results in an increase of clock drift between content processes.

I think a better way of measuring would be to use the window.performance.timing.navigationStart + window.performance.now() calculation which (at least according to MDN[1]) increases monotonically, and should (I believe) be in general agreement across processes.

To support my hypothesis of clock skew, here's a tps try push log:

https://archive.mozilla.org/pub/firefox/try-builds/mconley@mozilla.com-8a092b178995a6b5c6cc02ffb7010887e4719ad4/try-win32/try_win7_ix_test-g2-e10s-bm112-tests1-windows-build540.txt.gz

Here are the scores:

18:00:46     INFO -  PID 3564 | ------- Summary: start -------
18:00:46     INFO -  PID 3564 | Number of tests: 31
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#0] cgi.ebay.com/cgi.ebay.com/ALL-NEW-KINDLE-3-eBOOK-WIRELESS-READING-DEVICE-W-WIFI-/130496077314@pt=LH_DefaultDomain_0&hash=item1e622c1e02.html  Cycles:4  Average:31.00  Median:37.50  stddev:9.76 (26.0%)
18:00:46     INFO -  PID 3564 | Values: 45.0  30.0  26.0  23.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#1] 163.com/www.163.com/index.html  Cycles:4  Average:31.75  Median:36.00  stddev:6.50 (18.1%)
18:00:46     INFO -  PID 3564 | Values: 41.0  29.0  26.0  31.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#2] mail.ru/mail.ru/index.html  Cycles:4  Average:34.50  Median:42.50  stddev:11.15 (26.2%)
18:00:46     INFO -  PID 3564 | Values: 35.0  25.0  50.0  28.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#3] bbc.co.uk/www.bbc.co.uk/news/index.html  Cycles:4  Average:29.50  Median:33.00  stddev:5.74 (17.4%)
18:00:46     INFO -  PID 3564 | Values: 31.0  33.0  33.0  21.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#4] store.apple.com/store.apple.com/us@mco=Nzc1MjMwNA.html  Cycles:4  Average:26.75  Median:38.50  stddev:15.63 (40.6%)
18:00:46     INFO -  PID 3564 | Values: 48.0  29.0  15.0  15.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#5] imdb.com/www.imdb.com/title/tt1099212/index.html  Cycles:4  Average:33.25  Median:41.50  stddev:12.34 (29.7%)
18:00:46     INFO -  PID 3564 | Values: 49.0  31.0  19.0  34.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#6] cnn.com/www.cnn.com/index.html  Cycles:4  Average:22.25  Median:28.50  stddev:8.10 (28.4%)
18:00:46     INFO -  PID 3564 | Values: 16.0  16.0  24.0  33.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#7] sohu.com/www.sohu.com/index.html  Cycles:4  Average:26.75  Median:31.00  stddev:6.24 (20.1%)
18:00:46     INFO -  PID 3564 | Values: 25.0  35.0  20.0  27.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#8] youku.com/www.youku.com/index.html  Cycles:4  Average:31.25  Median:36.50  stddev:8.06 (22.1%)
18:00:46     INFO -  PID 3564 | Values: 43.0  30.0  26.0  26.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#9] ifeng.com/ifeng.com/index.html  Cycles:4  Average:42.25  Median:52.50  stddev:11.87 (22.6%)
18:00:46     INFO -  PID 3564 | Values: 52.0  53.0  31.0  33.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#10] tudou.com/www.tudou.com/index.html  Cycles:4  Average:24.75  Median:27.50  stddev:5.19 (18.9%)
18:00:46     INFO -  PID 3564 | Values: 17.0  27.0  28.0  27.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#11] chemistry.about.com/chemistry.about.com/index.html  Cycles:4  Average:21.75  Median:26.00  stddev:6.60 (25.4%)
18:00:46     INFO -  PID 3564 | Values: 13.0  22.0  23.0  29.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#12] beatonna.livejournal.com/beatonna.livejournal.com/index.html  Cycles:4  Average:30.00  Median:40.00  stddev:12.91 (32.3%)
18:00:46     INFO -  PID 3564 | Values: 47.0  33.0  19.0  21.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#13] rakuten.co.jp/www.rakuten.co.jp/index.html  Cycles:4  Average:37.00  Median:41.00  stddev:4.69 (11.4%)
18:00:46     INFO -  PID 3564 | Values: 42.0  40.0  33.0  33.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#14] uol.com.br/www.uol.com.br/index.html  Cycles:4  Average:27.25  Median:31.00  stddev:4.43 (14.3%)
18:00:46     INFO -  PID 3564 | Values: 23.0  24.0  30.0  32.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#15] thepiratebay.org/thepiratebay.org/top/201.html  Cycles:4  Average:25.00  Median:29.50  stddev:6.98 (23.6%)
18:00:46     INFO -  PID 3564 | Values: 16.0  25.0  33.0  26.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#16] page.renren.com/page.renren.com/index.html  Cycles:4  Average:23.75  Median:29.00  stddev:6.85 (23.6%)
18:00:46     INFO -  PID 3564 | Values: 16.0  32.0  26.0  21.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#17] chinaz.com/chinaz.com/index.html  Cycles:4  Average:27.50  Median:31.50  stddev:4.93 (15.7%)
18:00:46     INFO -  PID 3564 | Values: 22.0  33.0  25.0  30.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#18] globo.com/www.globo.com/index.html  Cycles:4  Average:28.75  Median:31.50  stddev:3.40 (10.8%)
18:00:46     INFO -  PID 3564 | Values: 30.0  33.0  26.0  26.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#19] spiegel.de/www.spiegel.de/index.html  Cycles:4  Average:22.75  Median:27.50  stddev:7.76 (28.2%)
18:00:46     INFO -  PID 3564 | Values: 13.0  32.0  23.0  23.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#20] dailymotion.com/www.dailymotion.com/us.html  Cycles:4  Average:24.75  Median:20.50  stddev:13.23 (64.5%)
18:00:46     INFO -  PID 3564 | Values: 6.0  35.0  25.0  33.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#21] goo.ne.jp/goo.ne.jp/index.html  Cycles:4  Average:38.00  Median:59.00  stddev:30.67 (52.0%)
18:00:46     INFO -  PID 3564 | Values: 17.0  36.0  17.0  82.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#22] stackoverflow.com/stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered.html  Cycles:4  Average:16.75  Median:18.50  stddev:2.36 (12.8%)
18:00:46     INFO -  PID 3564 | Values: 15.0  20.0  17.0  15.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#23] ezinearticles.com/ezinearticles.com/index.html@Migraine-Ocular---The-Eye-Migraines&id=4684133.html  Cycles:4  Average:18.00  Median:27.00  stddev:15.64 (57.9%)
18:00:46     INFO -  PID 3564 | Values: 24.0  -5.0  30.0  23.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#24] huffingtonpost.com/www.huffingtonpost.com/index.html  Cycles:4  Average:13.75  Median:18.50  stddev:13.05 (70.5%)
18:00:46     INFO -  PID 3564 | Values: 8.0  -1.0  19.0  29.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#25] media.photobucket.com/media.photobucket.com/image/funny%20gif/findstuff22/Best%20Images/Funny/funny-gif1.jpg@o=1.html  Cycles:4  Average:10.75  Median:16.50  stddev:13.60 (82.4%)
18:00:46     INFO -  PID 3564 | Values: 7.0  -6.0  16.0  26.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#26] imgur.com/imgur.com/gallery/index.html  Cycles:4  Average:23.50  Median:26.00  stddev:3.70 (14.2%)
18:00:46     INFO -  PID 3564 | Values: 23.0  24.0  19.0  28.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#27] reddit.com/www.reddit.com/index.html  Cycles:4  Average:16.25  Median:24.50  stddev:11.90 (48.6%)
18:00:46     INFO -  PID 3564 | Values: 16.0  0.0  28.0  21.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#28] noimpactman.typepad.com/noimpactman.typepad.com/index.html  Cycles:4  Average:15.50  Median:27.00  stddev:14.66 (54.3%)
18:00:46     INFO -  PID 3564 | Values: 11.0  -3.0  24.0  30.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#29] myspace.com/www.myspace.com/albumart.html  Cycles:4  Average:22.75  Median:30.50  stddev:9.84 (32.3%)
18:00:46     INFO -  PID 3564 | Values: 20.0  10.0  30.0  31.0
18:00:46     INFO -  PID 3564 |
18:00:46     INFO -  PID 3564 | [#30] mashable.com/mashable.com/index.html  Cycles:4  Average:32.00  Median:41.00  stddev:14.38 (35.1%)
18:00:46     INFO -  PID 3564 | Values: 21.0  25.0  53.0  29.0
18:00:46     INFO -  PID 3564 | -------- Summary: end --------


First of all, there's a general trend that the amount of time to do switches is _increasing_ from cycle to cycle. Secondly, there are several _negative_ values! There's no way that the Date.now() timestamps were sync'd if we're getting negative deltas like this.

I have a patch that changes the tabswitch test to use this more reliable time measurement mechanism. Here's the try push:

https://treeherder.mozilla.org/perf.html#/graphs?timerange=172800&series=%5Btry,890f291f15fa3591eb1694ceb3476e94a69a096a,1,1%5D&highlightedRevisions=21a9ab3b01ea

Holy smokes, that's a lot less noisier, and doesn't require us to change the number of content processes.

I'm going to apply the same changes to an early May 1st build, and then do a comparison. If there's no significant regression, we should do the same thing as in bug 1377226 and re-baseline the test (and backfill the results).


[1]: https://developer.mozilla.org/en-US/docs/Web/API/Performance/now - see in particular this quote:

"Unlike other timing data available to JavaScript (for example Date.now), the timestamps returned by Performance.now() are not limited to one-millisecond resolution. Instead, they represent times as floating-point numbers with up to microsecond precision.

Also unlike Date.now(), the values returned by Performance.now() always increase at a constant rate, independent of the system clock (which might be adjusted manually or skewed by software like NTP). Otherwise, performance.timing.navigationStart + performance.now() will be approximately equal to Date.now()."
Flags: needinfo?(mconley)
Comment hidden (mozreview-request)
(Assignee)

Comment 17

2 months ago
Clock skew fix comparing a recent mozilla-central with a May 1st mozilla-central:

Comparison: https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a52371addf31da079&newProject=try&newRevision=b63d3f958c45&framework=1&showOnlyImportant=0
Blocks: 1370336
status-firefox54: --- → unaffected
status-firefox55: --- → ?
status-firefox56: --- → affected
Keywords: talos-regression
(Assignee)

Updated

2 months ago
Attachment #8882507 - Flags: review?(gkrizsanits)
(Assignee)

Comment 18

2 months ago
Hey gabor, jmaher isn't accepting review requests right now. Are you willing to take this one?
(Assignee)

Comment 19

2 months ago
Alright, I think I might have figured out the remaining regression.

I highly suspect this is due to the fact that the tps test is accidentally capturing paint activity that's occurring in the tab strip when switching between tabs. This paint activity gets more and more frequent the further you go along the tab strip, since the strip smooth-scrolls further and paints more frames.

There's actually code in the tps test[1] that tries to work around this by hiding the tab strip, but according to paint flashing, we're still doing the invalidation in there. So that's the first bug - this scrollbox is still causing invalidation and paint even when it has its visibility set to hidden.

I'll note that the above bug with the painting still exists in the May 1st build. Since that time, however, we've changed the length of the tab strip - bug 1355764 got rid of the space above the tabs and put a draggable area in front of the tab strip. This causes the tab strip to be somewhat shorter.

Inintuitively, I think that means we end up painting more, because for long enough scrolls (like, for example, from the first tab to the last tab), we end up having to paint more since we have to scroll a greater distance, since the tab strip is slightly smaller.

I have a patch that, before each tab switch, moves each tab to just after the blank tab. This should avoid tab strip scrolling. I'll do a comparison between May 1st and mozilla-central with that change. If there's no difference, I think we can put this one to bed.

[1]: http://searchfox.org/mozilla-central/rev/e8f4f51cd543f203e9cb861cecb7545ac43c836c/testing/talos/talos/tests/tabswitch/bootstrap.js#393-394
(Assignee)

Comment 20

2 months ago
Try pushes up, and here's the comparison:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=8a1dfd29d9b5&newProject=try&newRevision=a08cc8a46889&framework=1&showOnlyImportant=0

If we can't detect a significant change on tps between those two builds, then I think it's safe to say we're done here (although we should probably land the patches to de-noise the test).
(Assignee)

Updated

2 months ago
Flags: needinfo?(mconley)
(In reply to Mike Conley (:mconley) - Digging out of needinfo / review backlog. from comment #18)
> Hey gabor, jmaher isn't accepting review requests right now. Are you willing
> to take this one?

I'm on PTO this week but I can take it early next week if that's not too late.
(Assignee)

Comment 22

2 months ago
Great success - comparison in comment 20 shows no difference between May 1st build and today's mozilla-central. This supports the theory that the decrease in tabstrip size caused noise to be introduced.

I still need to file a bug for the painting / invalidation we seem to be doing on a hidden scrollable, but in the meantime, I think the moving of the tab we're switching to is a fine workaround.
Flags: needinfo?(mconley)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Updated

2 months ago
Summary: Find out why tps has regressed on Win32 (Windows 7) since early May, 2017 → Make tps Talos test less noisy
(Assignee)

Comment 26

2 months ago
Filed bug 1378779 for the issue with invalidating / painting in the tab strip even when the container is hidden.
(Assignee)

Updated

2 months ago
Attachment #8882507 - Flags: review?(gkrizsanits) → review?(rwood)
Attachment #8883804 - Flags: review?(gkrizsanits) → review?(rwood)
Attachment #8883805 - Flags: review?(gkrizsanits) → review?(rwood)

Comment 27

2 months ago
Comment on attachment 8882507 [details]
Bug 1372261 - Make tps Talos test use the performance timing API instead of Date.now() to avoid clock skew.

LGTM, thanks for this!
Attachment #8882507 - Flags: review?(rwood) → review+

Comment 28

2 months ago
Comment on attachment 8883804 [details]
Bug 1372261 - Make tps Talos test move each tab next to the blank 'palette-clenser' tab before switching, to avoid noise from the tab strip scrolling.

Very cool!
Attachment #8883804 - Flags: review?(rwood) → review+

Comment 29

2 months ago
mozreview-review
Comment on attachment 8883805 [details]
Bug 1372261 - Bump tps Talos test version number.

https://reviewboard.mozilla.org/r/154756/#review160734

::: testing/talos/talos/tests/tabswitch/install.rdf:7
(Diff revision 1)
>  <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:em="http://www.mozilla.org/2004/em-rdf#">
>    <Description about="urn:mozilla:install-manifest">
>      <em:id>tab-switch-test@lassey.us</em:id>
>      <em:type>2</em:type>
>      <em:name>Tab Switch Test</em:name>
> -    <em:version>1.0.3</em:version>
> +    <em:version>1.0.4</em:version>

Does your change in /tabswitch/bootstrap.js (and hence this version increase) not require the add-on to be resigned? Or because there's no change in /tabswitch/content does that mean resigning is not required?
(Assignee)

Comment 30

2 months ago
mozreview-review-reply
Comment on attachment 8883805 [details]
Bug 1372261 - Bump tps Talos test version number.

https://reviewboard.mozilla.org/r/154756/#review160734

> Does your change in /tabswitch/bootstrap.js (and hence this version increase) not require the add-on to be resigned? Or because there's no change in /tabswitch/content does that mean resigning is not required?

Yeah, I'll need to package and sign this again. For these Talos add-ons, I tend not to do the packaging / signing until I receive r+'s, since I don't want to do multiple rounds of packaging and signing if there's review feedback. :)

Comment 31

2 months ago
mozreview-review
Comment on attachment 8883805 [details]
Bug 1372261 - Bump tps Talos test version number.

https://reviewboard.mozilla.org/r/154756/#review160740

::: testing/talos/talos/tests/tabswitch/install.rdf:7
(Diff revision 1)
>  <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:em="http://www.mozilla.org/2004/em-rdf#">
>    <Description about="urn:mozilla:install-manifest">
>      <em:id>tab-switch-test@lassey.us</em:id>
>      <em:type>2</em:type>
>      <em:name>Tab Switch Test</em:name>
> -    <em:version>1.0.3</em:version>
> +    <em:version>1.0.4</em:version>

Fair enough! Thanks!
Attachment #8883805 - Flags: review?(rwood) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Updated

2 months ago
Attachment #8882507 - Flags: review?(rwood) → review+

Updated

2 months ago
Attachment #8883804 - Flags: review?(rwood) → review+

Comment 35

2 months ago
mozreview-review
Comment on attachment 8882507 [details]
Bug 1372261 - Make tps Talos test use the performance timing API instead of Date.now() to avoid clock skew.

https://reviewboard.mozilla.org/r/153620/#review160748

Comment 36

2 months ago
mozreview-review
Comment on attachment 8883804 [details]
Bug 1372261 - Make tps Talos test move each tab next to the blank 'palette-clenser' tab before switching, to avoid noise from the tab strip scrolling.

https://reviewboard.mozilla.org/r/154754/#review160750

Comment 37

2 months ago
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2c655b5cfb55
Make tps Talos test use the performance timing API instead of Date.now() to avoid clock skew. r=rwood
https://hg.mozilla.org/integration/autoland/rev/2591e23b22ce
Make tps Talos test move each tab next to the blank 'palette-clenser' tab before switching, to avoid noise from the tab strip scrolling. r=rwood
https://hg.mozilla.org/integration/autoland/rev/9c31c803b3b4
Bump tps Talos test version number. r=rwood

Comment 38

a month ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/2c655b5cfb55
https://hg.mozilla.org/mozilla-central/rev/2591e23b22ce
https://hg.mozilla.org/mozilla-central/rev/9c31c803b3b4
Status: ASSIGNED → RESOLVED
Last Resolved: a month ago
status-firefox56: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 56
Blocks: 1380003
status-firefox55: ? → affected
status-firefox-esr52: --- → unaffected
Whiteboard: [photon-performance] → [photon-performance][checkin-needed-beta]

Comment 39

a month ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/d5d9a5a835e6
https://hg.mozilla.org/releases/mozilla-beta/rev/aa356698281c
https://hg.mozilla.org/releases/mozilla-beta/rev/78df8725c1cc
status-firefox55: affected → fixed
Whiteboard: [photon-performance][checkin-needed-beta] → [photon-performance]
Depends on: 1380250
Blocks: 1362920
You need to log in before you can comment on or make changes to this bug.