Port base vs reference style-perf-tests into talos as subtests

RESOLVED FIXED in Firefox 58

Status

RESOLVED FIXED
a year ago
a year ago

People

(Reporter: rwood, Assigned: rwood)

Tracking

Version 3
mozilla58
Points:
---

Firefox Tracking Flags

(firefox58 fixed)

Details

(Whiteboard: [PI:September])

Attachments

(3 attachments)

(Assignee)

Description

a year ago
Now that stylo is merged, port over the style-perf-tests [1] into talos, and have them run as base vs reference i.e. bloom-basic vs bloom-basic-ref. This list [2] indicates which test page is compared with which reference page.

Currently talos can run base vs ref tests one at a time (i.e. the existing perf-reftest suite), but we want to land these as subtests - where each subtest is actually a base vs ref test. Support for that kind of subtest is being added in Bug 1397438.

Note: The style-perf-tests running stand-alone (i.e. not being compared with reference runs) are landing in the talos 'perf-reftest-singletons' test in Bug 1374750.

[1]https://github.com/heycam/style-perf-tests

[2]https://github.com/heycam/style-perf-tests/blob/master/perf-reftest/perf-reftest.list
(Assignee)

Updated

a year ago
Whiteboard: [PI:September]
(Assignee)

Comment 11

a year ago
The tests need to be split into multiple jobs as they run too long altogether. Here's the first set (12) on try [1]:

& http://localhost/tests/perf-reftest/bloom-basic.html, http://localhost/tests/perf-reftest/bloom-basic-ref.html
& http://localhost/tests/perf-reftest/bloom-basic-2.html, http://localhost/tests/perf-reftest/bloom-basic-ref.html
& http://localhost/tests/perf-reftest/coalesce-1.html, http://localhost/tests/perf-reftest/coalesce-ref.html
& http://localhost/tests/perf-reftest/coalesce-2.html, http://localhost/tests/perf-reftest/coalesce-ref.html
& http://localhost/tests/perf-reftest/style-sharing.html, http://localhost/tests/perf-reftest/style-sharing-ref.html
& http://localhost/tests/perf-reftest/style-sharing-style-attr.html, http://localhost/tests/perf-reftest/style-sharing-ref.html
& http://localhost/tests/perf-reftest/display-none-1.html, http://localhost/tests/perf-reftest/display-none-1-ref.html
& http://localhost/tests/perf-reftest/only-children-1.html, http://localhost/tests/perf-reftest/only-children-ref.html
& http://localhost/tests/perf-reftest/dep-check-1.html, http://localhost/tests/perf-reftest/dep-check-1-ref.html
& http://localhost/tests/perf-reftest/slow-selector-1.html, http://localhost/tests/perf-reftest/slow-selector-1-ref.html
& http://localhost/tests/perf-reftest/slow-selector-2.html, http://localhost/tests/perf-reftest/slow-selector-2-ref.html
& http://localhost/tests/perf-reftest/style-attr-1.html, http://localhost/tests/perf-reftest/style-attr-1-ref.html

Approx. duration to run this set/job:

Linux x64: 7 min
OSX 10.10: 7 min
Win 10: 8 min
Win 7: 12 min

:bholley, unless any objections I'm going to land this set as the 'perf-reftest' job 'p'. I'll need to add at least one more job (i.e.'p2') after for the remaining tests.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=73c83bd55ba4f8ed178e888141f6424bbd085748
Flags: needinfo?(bobbyholley)
(Assignee)

Comment 13

a year ago
FYI :bholley, looks like the stop-cascade-* tests crash on Win 7:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a78e82648cf76b6bb09c229d822a5bd233c49519&selectedJob=133848772

Trying to run as many tests as possible just to have more data re: individual test durations (re: comment 11)
It looks like the coalesce tests dominate the total time. I think it should be fine to drop the iterations there from 50000 to 5000, which should greatly speed up total test time. Can you PR that against cameron's repo, which will give him the opportunity to speak up if there was any particular reason he made it large?

Similarly, I think the stop-cascade tests could be tweaked. Right now setup time dominates, because we create an array of 1000 subtrees in order to make the effect noticeable. Instead, we could make a smaller array (say, 200), and then do more iterations within the measured section to toggle the class on/off, which should have the same effect of making the numbers large enough, while vastly saving on machine time.
Flags: needinfo?(bobbyholley)
Basically, if any of these tests are taking more than a hundred milliseconds or so per iteration, we can probably tweak the test to be more efficient, which will save us a lot of machine time over the long run.
(Assignee)

Comment 17

a year ago
Created attachment 8913371 [details] [review]
https://github.com/heycam/style-perf-tests/pull/8

Reduce coalesce iterations (as :bholley suggested in comment 15, thanks!).
Attachment #8913371 - Flags: review?(cam)
(Assignee)

Comment 18

a year ago
Created attachment 8913391 [details] [review]
https://github.com/heycam/style-perf-tests/pull/9

Reduce the duration of cascade 1 and 2
Attachment #8913391 - Flags: review?(cam)
(Assignee)

Comment 20

a year ago
(In reply to Robert Wood [:rwood] from comment #19)

> https://treeherder.mozilla.org/#/jobs?repo=try&revision=ffeff79f676cd44ca8767cb3cad85dde99cfb472

This is the full test suite on try with the modifications I made to the coalesce and cascade tests (comment 17 and comment 18). This full suite is now running at:

Linux x64: 16 min
OS 10.10: 15 min
Win 10: 17 min
Win 7: doesn't finish, looks like a crash with stop-cascade-3

Still not an ideal duration for this job. Here's how long the tests take on Win 10 (tppagecycles 10):

bloom-basic.html 06:37:52, 06:38:00 => 0:08
bloom-basic-ref.html 06:38:00, 06:38:07 => 0:07
bloom-basic-2.html 06:38:08, 06:38:16 => 0:08
coalesce-1.html 06:38:25, 06:38:46 => 0:21
coalesce-ref.html 06:38:47, 06:39:07 => 0:20
coalesce-2.html 06:39:08, 06:39:29 => 0:21
style-sharing.html 06:39:51, 06:39:58 => 0:07
style-sharing-ref.html 06:39:58, 06:40:04 => 0:06
style-sharing-style-attr.html 06:40:04, 06:40:10 => 0:06
style-sharing-ref.html 06:40:11, 06:40:17 => 0:06
display-none-1.html 06:40:17, 06:40:26 => 0:09
display-none-1-ref.html 06:40:28, 06:40:33 => 0:05
nth-index-1.html 06:40:34, 06:40:58 => 0:24
nth-index-ref.html 06:40:59, 06:41:23 => 0:24
nth-index-2.html 06:41:24, 06:41:49 => 0:25
some-descendants-1.html 06:42:15, 06:42:34 => 0:19
some-descendants-1-ref.html 06:42:35, 06:42:55 => 0:20
only-children-1.html 06:42:56, 06:43:27 => 0:31
only-children-ref.html 06:43:27, 06:43:33 => 0:06
dep-check-1.html 06:43:36, 06:44:01 => 0:25
dep-check-1-ref.html 06:44:05, 06:44:13 => 0:08
slow-selector-1.html 06:44:14, 06:44:33 => 0:19
slow-selector-1-ref.html 06:44:33, 06:44:39 => 0:06
slow-selector-2.html 06:44:39, 06:44:58 => 0:19
slow-selector-2-ref.html 06:44:58, 06:45:04 => 0:06
style-attr-1.html 06:45:06, 06:45:28 => 0:22
style-attr-1-ref.html 06:45:28, 06:45:34 => 0:06
stop-cascade-1.html 06:45:36, 06:46:26 => 0:50
stop-cascade-ref.html 06:46:27, 06:46:33 => 0:06
stop-cascade-2.html 06:46:35, 06:47:29 => 0:54
stop-cascade-3.html 06:47:46, 06:51:54 => 4:08
stop-cascade-3-ref.html 06:51:54, 06:52:02 => 0:08

:bholley, I'm going to remove stop-cascade-3 since it crashes on win7, I'll file a bug for that issue. That will take about 4 minutes off the test jobs, that *might* make it doable as a single test job for this suite.
Flags: needinfo?(bobbyholley)
(In reply to Robert Wood [:rwood] from comment #20)
> :bholley, I'm going to remove stop-cascade-3 since it crashes on win7, I'll
> file a bug for that issue. That will take about 4 minutes off the test jobs,
> that *might* make it doable as a single test job for this suite.

Did you try making the modifications I proposed in comment 15 to the stop-cascade tests?
Flags: needinfo?(bobbyholley)
(Assignee)

Comment 24

a year ago
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #23)
> (In reply to Robert Wood [:rwood] from comment #20)
> > :bholley, I'm going to remove stop-cascade-3 since it crashes on win7, I'll
> > file a bug for that issue. That will take about 4 minutes off the test jobs,
> > that *might* make it doable as a single test job for this suite.
> 
> Did you try making the modifications I proposed in comment 15 to the
> stop-cascade tests?

Yessir, my pull request is in comment 18. I didn't change stop-cascade-3 though because it's different than the others and just has one 'root' and no loop to cycle through in the measurement part. Also it crashes on win7 anyway. Without it, the test suite is running at about 16 min on win7 so it looks good.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb6edd48a5962ec6a681f157edb1dd69bfe7453b
(Assignee)

Comment 25

a year ago
I further reduced cascade 1 and 2, removed stop-cascade-3* since it fails on win7, delinted and updated the PR above. Looks good on try.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb6edd48a5962ec6a681f157edb1dd69bfe7453b

Linux x64: 10 min
OSX 10.10: 11 min
Win 10: 13 min
Win 7: 16 min

I'd like to land this (if my perf-style-reftest updates in comment 17 and comment 18 are approved)... and leave the addition of stop-cascades-3* and any further potential optimizations/additions to possible future work.
(In reply to Robert Wood [:rwood] from comment #25)
> I'd like to land this (if my perf-style-reftest updates in comment 17 and
> comment 18 are approved)... and leave the addition of stop-cascades-3* and
> any further potential optimizations/additions to possible future work.

sgtm.
Comment on attachment 8913391 [details] [review]
https://github.com/heycam/style-perf-tests/pull/9

Comments to address on this one, in the PR.  Please cancel NI when it's been updated (since I don't always seem to get GitHub notifications).
Flags: needinfo?(rwood)
(Assignee)

Comment 29

a year ago
(In reply to Cameron McCormack (:heycam) (away 4 Oct) from comment #28)
> Comment on attachment 8913391 [details] [review]
> https://github.com/heycam/style-perf-tests/pull/9
> 
> Comments to address on this one, in the PR.  Please cancel NI when it's been
> updated (since I don't always seem to get GitHub notifications).

Thanks for the comments, I'll update the PR accordingly. I really don't understand these tests and prefer to leave future optimizations up to the test author, after this PR, thanks!
Flags: needinfo?(rwood)
Comment hidden (mozreview-request)
(Assignee)

Comment 32

a year ago
(In reply to Robert Wood [:rwood] from comment #31)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=eba7a238f8a48e727dacbbc90992ff65937bfdce

Try run with latest test updates. Green, and job durations not bad:

Linux x64: 10 min, stylo-disabled 14 min
OS 10.10: 11 min, stylo-disabled 13 min
Win 7: 18 min, stylo-disabled 22 min
Win 10: 13 min, stylo-disabled 17 min

Comment 33

a year ago
mozreview-review
Comment on attachment 8914303 [details]
Bug 1398193 - Port base vs reference style-perf-tests into talos as subtests;

https://reviewboard.mozilla.org/r/185594/#review190782

awesome to see more tests
Attachment #8914303 - Flags: review?(jmaher) → review+
Attachment #8913391 - Flags: review?(cam) → review+
(Assignee)

Comment 34

a year ago
Thanks guys, landing this now

Comment 35

a year ago
Pushed by rwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/943ad1c9a59a
Port base vs reference style-perf-tests into talos as subtests; r=jmaher
I see an improvement in the summary number:
== Change summary for alert #9796 (as of October 03 2017 12:37 UTC) ==

Improvements:

 63%  perf_reftest summary windows7-32 opt e10s     10.57 -> 3.89

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9796

Comment 37

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/943ad1c9a59a
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-firefox58: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in before you can comment on or make changes to this bug.