Closed Bug 1445952 Opened 2 years ago Closed Last month

[meta] - tracking bug for replacing AWFY with Talos

Categories

(Testing :: Talos, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: meta)

This bug will track the work required to turn off/stop supporting AWFY as a toolchain.

1) port browser tests to Talos running on Firefox on mozilla-central
2) add support for Google Chrome to run in CI
2) run browser tests in Firefox/Chrome regularly
3) build dashboard to show subtests of benchmarks and comparison
4) run primary benchmarks on Win10 against reference laptops
5) run shell tests against various SpiderMonkey builds
6) run shell tests against Google Chrome
7) update dashboards to include shell tests
8) provide alerting on mozilla-central to help notify of regressions
9) update arewefastyet.com to use dashboards from #3 and #7
10) resolve hidden tests (possibly document how to add/run/view)
11) resolve any special builds (possibly document how to add/run/view)
:bc, can you help think of anything else we need (or add specifics to any of the above where something is not obvious)?
Flags: needinfo?(bob)
the AWFY shell benchmarks are documented here:
https://github.com/mozilla/arewefastyet/blob/master/slave/benchmarks_shell.py#L338
    Octane,
    SunSpider,
    Kraken,
    Assorted,
    AsmJSApps,
    AsmJSMicro,
    Dart,
    SixSpeed,
    Ares6,
    WebToolingBenchmark


we have a series of browser benchmarks that use the local runner:
https://github.com/mozilla/arewefastyet/blob/master/slave/benchmarks_local.py#L131
    AssortedDOM,
    WebAudio,
    UnityWebGL,


and many browser benchmarks that use the remote runner:
https://github.com/mozilla/arewefastyet/blob/master/slave/benchmarks_remote.py#L581
    Octane,
    Dromaeo,
    Massive,
    JetStream,
    Speedometer,
    Speedometer2,
    Kraken,
    SunSpider,
    Browsermark,
    WasmMisc,
    EmberPerf,
    MotionMark,


If we summarized this differently we would have a list of tests:

Already on Talos:
Dromaeo (browser)
JetStream (browser)
Speedometer (browser)
MotionMark (browser)

To move to Talos:
Speedometer2 (browser - do we need both Speedometer and Speedometer2?)
Octane (browser, shell)
Kraken (browser, shell)
SunSpider (browser, shell)
BrowserMark (browser)
WasmMisc (browser)
ARES-6 (shell - currently running in Talos as a browser test, convert to a shell test)
WebAudio (browser)
WebToolingBenchmark (shell)
SixSpeed (shell)
Assorted (shell)


To consider moving to talos as these are nice to have but not critical:
AssortedDOM (browser - confirm with :bz vs dromaeo, etc.)
Massive (browser)
EmberPerf (browser)
UnityWebGL (browser)
asmJSApps (shell)
AsmJSMicro (shell)


When referring to shell, currently this appears to be OS X only. It would be nice to confirm that this is still desired, and whether we need both 32-bit and 64-bit or whether it doesn't matter.
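As a cross-check of the summary above, here is a minimal Python sketch that rebuilds the browser/shell split from the three AWFY lists quoted in this comment. The benchmark names are copied from those lists as written; the variable and function names (`SHELL`, `BROWSER_LOCAL`, `BROWSER_REMOTE`, `environments`) are made up for illustration and are not taken from the AWFY or Talos code.

```python
# Benchmark names copied from the three AWFY lists quoted in this bug.
# The grouping code below is an illustrative sketch, not AWFY or Talos code.
SHELL = [
    "Octane", "SunSpider", "Kraken", "Assorted", "AsmJSApps",
    "AsmJSMicro", "Dart", "SixSpeed", "Ares6", "WebToolingBenchmark",
]
BROWSER_LOCAL = ["AssortedDOM", "WebAudio", "UnityWebGL"]
BROWSER_REMOTE = [
    "Octane", "Dromaeo", "Massive", "JetStream", "Speedometer",
    "Speedometer2", "Kraken", "SunSpider", "Browsermark", "WasmMisc",
    "EmberPerf", "MotionMark",
]


def environments():
    """Map each benchmark name to the set of environments it runs in."""
    envs = {}
    for name in BROWSER_LOCAL + BROWSER_REMOTE:
        envs.setdefault(name, set()).add("browser")
    for name in SHELL:
        envs.setdefault(name, set()).add("shell")
    return envs


envs = environments()

# Benchmarks that need both a browser port and a shell port.
both = sorted(n for n, e in envs.items() if e == {"browser", "shell"})

if __name__ == "__main__":
    for name in sorted(envs):
        print(f"{name}: {', '.join(sorted(envs[name]))}")
    print("browser and shell:", ", ".join(both))
```

Running it lists Octane, Kraken and SunSpider as the benchmarks that appear in both a browser list and the shell list, which matches the "(browser, shell)" entries in the summary. It also shows Dart as shell only; Dart is in the AWFY shell list but not in any of the three summary groups.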
Depends on: 1436819
:bz, can you help me determine which DOM benchmarks make the most sense to run (all or some)? We have AssortedDOM and Dromaeo from AWFY. In Talos we have Dromaeo DOM (Linux only) and Dromaeo CSS.
Flags: needinfo?(bzbarsky)
Depends on: 1445975
You mention Windows 10, but not Linux or macOS. I think we will need all three, and preferably not ancient versions but the latest versions, for some interpretation of latest. At a minimum we need "current versions" to pick up OS mitigations for Meltdown/Spectre. I would like a plan for updating the operating systems, preferably every Patch Tuesday.

All current Intel CPUs are affected by Meltdown/Spectre, so our initial rollout will include those exclusively, but we should plan for the introduction of fixed CPUs and support testing both the older CPUs and the later fixed ones, with the ability to distinguish and compare them.

I would like to see us also test AMD CPUs, which would also imply pre-Spectre and post-Spectre fixed versions.

Plan for ARM based machines?

No Edge or Safari?

re 8: Ideally we would alert for autoland and inbound as well.

re 9: I'd rather nuke it from orbit. I'm assuming your hardware will be in a DC and Ops will manage it. In that case, arewefastyet.com will not be allowed to connect to them and we'll need a different situation. In addition, arewefastyet.com is terminal and will be gone September 1. I think the appropriate item here is: decommission arewefastyet.com and host new dashboards on IT-managed infrastructure in a DC.

re 10: I don't like the idea of hidden tests, but I guess we have no choice.

* private repos accessible by employees or NDA contributors?
* visibility controls in Treeherder? I don't think they have the time if this isn't already available.

re 11: I don't have any experience on Windows or macOS, but Docker containers on Linux are a first class solution for documenting and setting up special build environments. I bet the smart kids could make that work on Windows and macOS as well.

We'll need to bring the Treeherder, Participation Systems and Taskcluster folks into this at the beginning to get their view points as well.
* visibility controls in Perfherder and the Dashboards?

I think this is a can of worms, but if we really intend to support it, it should be a part of any design from the very beginning rather than trying to shoehorn it in after the other parts are designed/implemented.
Flags: needinfo?(bob)
What do the awfy benchmarks actually run?  Links to source?

My impression in the past was that people could add new things to awfy easily, to talos ... less so.  Are we fixing that problem as part of this migration?

Also, last I checked AWFY ran things in Safari and sometimes Edge too.  I see no mention of those in comment 0.
Flags: needinfo?(bzbarsky) → needinfo?(jmaher)
Regarding Edge and Safari:

Safari: OS X 64-bit only - Kraken, SunSpider, Octane
Edge: Win 10 64-bit reference hardware - Kraken, SunSpider, Octane (no data since Feb 28th, it is spotty in general)

I see many runs and data points of google chrome vs firefox in different modes and on many tests and platforms.



the assorted dom tests appear to be here:
https://github.com/mozilla/arewefastyet/tree/master/benchmarks/misc-desktop/tests/assorted

I am unable to find Dromaeo in the GitHub repo. There are tests which are set up on the arewefastyet server with copies of benchmarks, and this appears to be one of those.

talos dromaeo is here:
https://searchfox.org/mozilla-central/source/testing/talos/talos/tests/dromaeo



Lastly, regarding ease of adding new tests: it should be easy to add tests to Talos now that all scheduling and test definitions are in-tree. In fact it was easier to get MotionMark added to Talos than to AWFY back in January. If there are other things we should do to reduce the friction of adding new tests as compared to AWFY, we should get bugs on file.
Flags: needinfo?(jmaher)
It seems like this bug is perhaps going to be the more likely alternative to bug 1436621? :-)
Blocks: 1349182
Yes, I would say so. My only interest in AWFY is in killing off the current implementation, though work for that is gated by getting Autophone put to rest as well. If this is a pain point for you, please let me know and I'll do something about it for each of them, but if it is not a pain point, I'd rather let them die without investing more time in them than necessary.
It's not an imminent pain point - just wanting to make sure I have the bug dependencies set to the correct bugs :-)
Depends on: 1465336
Depends on: 1465360
Depends on: 1472782
Depends on: 1472800
Depends on: 1472803
Depends on: 1472804
Depends on: 1472979
Depends on: 1472987
Depends on: 1472992
Depends on: 1473334
Depends on: 1473365
Depends on: 1468812
Depends on: 1476568
No longer depends on: 1436819
No longer depends on: 1472782
No longer depends on: 1472987
Depends on: 1481245
Depends on: 1481707
Depends on: 1481708
Depends on: 1482151
Hi Joel, hope you are well :-)

If I understand this bug correctly, once it's fixed bug 1436621 will be redundant? If so, I'll close that bug out (likely as wontfix, as it's about AWFY which won't exist) and remove it from the dependency tree of bug 1349182.

Do you have a rough idea of when this bug might be finished and AWFY switched off? (No immediate hurry, just helps me figure out whether us resolving bug 1349182 is months/quarters/years away, since we're soon separately going to be removing buildbot support which opens more doors for refactoring Treeherder generally).

Many thanks!
Flags: needinfo?(jmaher)
this will be done in the next 4 weeks.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher ) (UTC+2) from comment #11)
> this will be done in the next 4 weeks.

The feature parity or the switch from AWFY to Talos?
Thanks for asking for clarification. We will have feature parity this week and hopefully turn off AWFY next week.
Depends on: 1486788
Depends on: 1486789
Depends on: 1488533
Depends on: 1488534
Status: NEW → RESOLVED
Closed: Last month
Resolution: --- → FIXED