Closed Bug 1425330 Opened 6 years ago Closed 6 years ago

Create Talos configuration for "emulate XUL with flex"

Categories

(Testing :: Talos, enhancement)

Version 3
enhancement
Not set
normal

Tracking

(firefox60 fixed)

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: dholbert, Assigned: bgrins)

References

Details

(Whiteboard: [PI:January])

Attachments

(2 files)

In bug 1398963 and bug 1424095, we'll be creating a (preffed-off by default) opt-in feature that makes us use modern CSS Flexbox to emulate XUL, under the covers. (specifically, emulating -moz-box and -moz-inline-box)

This will remain preffed off for a while, but we'd like to keep track of performance data over time to track improvements and notice regressions.  So, we'd like to add a talos configuration with the pref enabled, to track this.  

Desires:
 - Tier 3 (just need to gather data; sheriffs don't need to care about it)
 - OK to run in parallel.
 - Platforms: Linux64 (what I'm using locally) and Win10 (if it's easy)
 - Tree: mozilla-central
 - Tests: TART and ts_paint

No action needed until we've landed bug 1424095.
...and the pref that should be set in this configuration (the only difference vs. normal Firefox) is one pref-tweak.

The pref is tentatively named "layout.css.emulate-moz-box-with-flex" (from final patch in bug 1398963), and it'll need to be set to "true" in this Talos configuration.
:dholbert, if you need help doing this, needinfo myself or :rwood and we can help or do this- ideally the more people that know how to hack on task creation the better, so I will not taking this by default :)
Whiteboard: [PI:January]
Robert, I'd like to get this configuation set up now that we've made some more progress on the feature. Could you outline what needs to be done to get TART and ts_paint running on m-c as tier 3 with a custom pref value?
Flags: needinfo?(rwood)
(In reply to Brian Grinstead [:bgrins] from comment #3)
> Robert, I'd like to get this configuation set up now that we've made some
> more progress on the feature. Could you outline what needs to be done to get
> TART and ts_paint running on m-c as tier 3 with a custom pref value?

Hi Brian,

Yes glad to help. There are several pieces to this. It requires a few different patches, in multiple repos, but I can help you - feel free to request info/feedback/reviews by me for any of these steps as you go.

1) Changes to talos to configure the new tests and their suite,
2) Additions to buildbot configs to schedule the new suite,
3) Additions to taskcluster configs for the new suite,
4) Additions to treeherder etl data ingestion,
5) Bunch of try runs to see how stable/noisy the tests are


More details:

1) Talos changes. The changes in talos itself requires creating configs for the new tests and the suite. Since you want to run existing tests ts_paint and tart, but just with a pref set, it's fairly easy.

Basically copy the existing test class definitions in test.py for ts_paint and tart, i.e. [a] but rename the new classes something like 'ts_paint_flex'; leave all the options the same except add a 'preferences' option with the preference that needs to be set for flex xul, similar to what is done here [b].

Then there needs to be a new suite configuration added to talos.json. Basically just copy an existing entry like this one [c] but call it something like 'talos-flex-e10s' and for the 'tests' have tart and ts_paint. The suite will be called by this name (i.e. './mach talos-test --suite talos-flex-e10s' for example when running locally).


2) Buildbot configs. Since this new suite will be running on linux and win 10, buildbot configs are also required (right now only OSX is purely on TC and can avoid BB configs). In the configs basically the new job names are identified, what platforms they run on, and what version of Firefox to enable them on (i.e. nightly is currently at 60 so we'd want these jobs to only run on Ffox 60+). BB configs are in the mozilla build repo [d].

Here is an example patch in the buildbot configs repo, that adds new talos buildbot jobs [e]. These need to land before the taskcluster configs.


3) Taskcluster configs. First the new talos suite needs to be defined in the taskcluster ci talos test definitions [f]. Basically copy an entry (like talos-dromaeojs) and modify it accordingly, call it the same name as the suite in talos i.e. 'talos-flex'. Update the description, treeherder symbol, run-on-projects, and extra-options suite etc. Also you'll want to add 'tier: 3' for each test job.

Then those new test definitions get added in the taskcluster test sets [g]. You want these to run on linux and windows so you'll need to add new sets i.e. 'linux-talos-flex', and 'windows-talos-flex', under each you'll just have the 'talos-flex' suite as defined above.

The last tc config step is telling tc what platforms to run the new test sets on, this is done in [h]. Add your test set defined above (i.e. 'linux-talos-flex') under 'linux64 opt' [i] and the 'windows-talos-flex' under 'windows10-64/opt'. If you want them on the pgo builds too then also add them under the corresponding '*-pgo/opt' sections.


4) Treeherder. In order for the jobs to actually show up and be displayed in treeherder, the treeherder buildbot ingestion code [j] needs to be updated for the new suite. This is because we want these jobs to show up in their own new treeherder symbol group.

If you let me know the group name where you want the job symbols to appear on treeherder, I can take care of this patch for you if you like as the treeherder dev env takes awhile to set up. I would suggest a group name of "Tf-e10s" for Talos flex maybe? Open to suggestions! That needs to land in order to have the jobs actually show up on treeherder itself.


5) Once it's all ready to go you can do a bunch of retriggers on try to see how stable those two tests are with that pref turned on. It's only tier 3 anyway so not too concerned but eventually if you want this to move up a tier and have performance sheriffing, the tests should have minimal noise so that perf regressions will be automatically detected and flagged etc.


[a] https://searchfox.org/mozilla-central/rev/062e1cf551f5bf3f0af33671b818f75a55ac497b/testing/talos/talos/test.py#130

[b] https://searchfox.org/mozilla-central/rev/062e1cf551f5bf3f0af33671b818f75a55ac497b/testing/talos/talos/test.py#153

[c] https://searchfox.org/mozilla-central/rev/062e1cf551f5bf3f0af33671b818f75a55ac497b/testing/talos/talos.json#10

[d] https://hg.mozilla.org/build/buildbot-configs/

[e] https://hg.mozilla.org/build/buildbot-configs/rev/aa469888ed3b04084ba4d8ddda0a369573394df6

[f] https://searchfox.org/mozilla-central/source/taskcluster/ci/test/talos.yml

[g] https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-sets.yml

[h] https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml

[i] https://searchfox.org/mozilla-central/rev/062e1cf551f5bf3f0af33671b818f75a55ac497b/taskcluster/ci/test/test-platforms.yml#45

[j] https://github.com/mozilla/treeherder/blob/master/treeherder/etl/buildbot.py
Flags: needinfo?(rwood)
Hey Brian - Good news earlier today the switch-over to taskcluster for Linux x64 was landed inbound - so for Linux x64, steps 2 and 4 in comment 4 above are no longer required (no buildbot changes needed to land on Linux x64 or OSX). If you want we can land this new talos flex/XUL suite on Linux x64 first.

Hopefully the taskcluster migration for Windows will be completed approx. end of Feb - if you can postpone this new flex/XUL suite for Windows until then, you can avoid the buildbot changes for Windows as well.
Flags: needinfo?(bgrinstead)
(In reply to Robert Wood [:rwood] from comment #5)
> Hey Brian - Good news earlier today the switch-over to taskcluster for Linux
> x64 was landed inbound - so for Linux x64, steps 2 and 4 in comment 4 above
> are no longer required (no buildbot changes needed to land on Linux x64 or
> OSX). If you want we can land this new talos flex/XUL suite on Linux x64
> first.
> 
> Hopefully the taskcluster migration for Windows will be completed approx.
> end of Feb - if you can postpone this new flex/XUL suite for Windows until
> then, you can avoid the buildbot changes for Windows as well.

That's great news, thank you! I do think it makes sense to push forward with just Linux x64 for now and hold off on Windows until that taskcluster migration is done.
Flags: needinfo?(bgrinstead)
Blocks: 1033225
Assignee: nobody → bgrinstead
Status: NEW → ASSIGNED
Alright, thanks again for the detailed instructions :rwood! I've pushed up a set that does steps 1 and 3 (part 1 and part 2 respectively). I've confirmed both that I can get the job to run on try as T-f with `./mach try fuzzy -q "flex"` [0] and that `-t all` does *not* schedule the flex job as expected [1].

[0]: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d20d78462c8cbac2cf0659163666003cf5b05128&filter-tier=1&filter-tier=2&filter-tier=3
[1]: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ee27851d6fbaacb0dfd5ced26052a69a9cdd86b1&filter-tier=1&filter-tier=2&filter-tier=3
Still awaiting more results but for the first run I see numbers like:

    tart_flex opt e10s: 21.56
    ts_paint_flex opt e10s: 465.00

And for non-flex:
    tart opt e10s: 2.29
    ts_paint opt e10s: 454.00

To summarize a comparison with the initial try push back in https://bugzilla.mozilla.org/show_bug.cgi?id=1398963#c16:

- ts_paint is much better than it was. We are within a couple percent compared with 39.28% regression on the initial push. I'd like to see the numbers over time - but it looks really close already.
- tart is much worse than it was. Before, we were seeing a 150.27% regression ("12.51 ± 0.81%" on linux64 with the patch applied vs "5.00 ± 3.62%" without it). So the # for without emulation went down by over 50% and the number with it went up over 50%. This could be explained by the fact that the initial push was before we landed the fixes in Bug 1424095 to make the tabs display properly. But getting this tracked in automation is the first step - and hopefully a combination of CSS changes and platform perf work will drive the flex numbers down over time.
Comment on attachment 8946940 [details]
Bug 1425330 - Part 1 - set up talos configurations for tart and ts_paint with XUL flexbox emulation;

https://reviewboard.mozilla.org/r/216788/#review223420

Apologies for the review delay!
Attachment #8946940 - Flags: review?(rwood) → review+
(In reply to Brian Grinstead [:bgrins] from comment #9)
> Alright, thanks again for the detailed instructions :rwood! I've pushed up a
> set that does steps 1 and 3 (part 1 and part 2 respectively). I've confirmed
> both that I can get the job to run on try as T-f with `./mach try fuzzy -q
> "flex"` [0] and that `-t all` does *not* schedule the flex job as expected
> [1].
> 
> [0]:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=d20d78462c8cbac2cf0659163666003cf5b05128&filter-
> tier=1&filter-tier=2&filter-tier=3
> [1]:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=ee27851d6fbaacb0dfd5ced26052a69a9cdd86b1&filter-
> tier=1&filter-tier=2&filter-tier=3

Thanks Brian! The job does show up when you run '-t all' on try but that's fine, the other similar jobs i.e. stylo-disabled have the same behaviour. There's an issue with the group name - I'll comment in the review.

The stability of the flex suite looks good in your 6 runs, a low std dev:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=d20d78462c8cbac2cf0659163666003cf5b05128&newProject=try&newRevision=d20d78462c8cbac2cf0659163666003cf5b05128&framework=1
Comment on attachment 8946941 [details]
Bug 1425330 - Part 2 - Taskcluster config for XUL flexbox emulation talos tests;

https://reviewboard.mozilla.org/r/216790/#review223422

Actually there's no issue with the group name (see review comment). Thanks for adding this!

::: taskcluster/ci/test/talos.yml:112
(Diff revision 1)
>              - --webServer,localhost
>  
> +talos-flex:
> +    description: "Talos XUL flexbox emulation enabled"
> +    try-name: flex
> +    treeherder-symbol: T(f)

So this forces the new flex job to appear in the treeherder group:

T-e10s(f)

I was originally thinking we'd have a separate group for tests with this flex pref ie.

T-F(f)

However since we're not running all the regular talos tests with the flex pref turned on then it should be fine. It does show up in it's own T-e10s section with a little "tier3" beside it, so this is fine. I also ran it by :jmaher on IRC and this setup is fine.

Thanks for your work on this! :)
Attachment #8946941 - Flags: review?(rwood) → review+
Pushed by bgrinstead@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1ed969404e0d
Part 1 - set up talos configurations for tart and ts_paint with XUL flexbox emulation;r=rwood
https://hg.mozilla.org/integration/autoland/rev/9b5414f851f5
Part 2 - Taskcluster config for XUL flexbox emulation talos tests;r=rwood
https://hg.mozilla.org/mozilla-central/rev/1ed969404e0d
https://hg.mozilla.org/mozilla-central/rev/9b5414f851f5
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Here are links to compare flex and non-flex results on linux:

  tart - https://mzl.la/2Clo7R6
  ts_paint - https://mzl.la/2o1NnaL

Today, Talos(flex) is still running as tier 3 on mozilla-central. All recent runs are green, so these could probably run as tier 2 or tier 1, where they would not be hidden and would get the benefits of sheriffing. Could/should they be promoted to tier 1 or 2? (Or, if results are not needed, should they be disabled on mozilla-central, or entirely?) Just let me know -- I'm happy to land patches if needed.

Flags: needinfo?(bgrinstead)

(In reply to Geoff Brown [:gbrown] from comment #17)

Today, Talos(flex) is still running as tier 3 on mozilla-central. All recent runs are green, so these could probably run as tier 2 or tier 1, where they would not be hidden and would get the benefits of sheriffing. Could/should they be promoted to tier 1 or 2? (Or, if results are not needed, should they be disabled on mozilla-central, or entirely?) Just let me know -- I'm happy to land patches if needed.

That's a good question. We are hoping to improve the numbers during this quarter as a result of ongoing flexbox perf work, so definitely don't want them disabled. I'm not sure if we want them to be sheriffed since I don't know that regressions would be actionable at least until stuff like https://bugzilla.mozilla.org/show_bug.cgi?id=1493962 lands and we are a bit greener on mochitests. My feeling is maybe start sheriffing it when we see a path forward for shipping CSS flex emulation? Although if we don't have other talos tests tracking CSS flexbox perf perhaps it would be a useful proxy for that. Forward to Daniel for another opinion.

Flags: needinfo?(bgrinstead) → needinfo?(dholbert)
See Also: → 1544113

Sorry for the delay -- the needinfo slipped off my radar.

I agree with bgrins.

  1. sheriffing doesn't seem necessary at the moment.
  2. We should keep this enabled as tier3 on mozilla-central (not on any other trees though).
  3. If it's occupying a lot of CPU time, we could bump it to once per day or something in the near term -- but probably not less frequent than that. I'd like to have ~continuous data to catch regressions & demonstrate improvements here, as we get closer to having flex/XUL emulation being a viable option.
Flags: needinfo?(dholbert)

OK, thanks. I don't have concerns regarding CPU time / won't make any changes at this time.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: