Closed Bug 1403600 Opened 8 years ago Closed 8 years ago

Adjust the platforms where "stylo-disabled" talos tests run and are being displayed in TreeHerder

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P1)

Tracking

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 unaffected, firefox57 affected, firefox58 affected)

RESOLVED WONTFIX
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- affected
firefox58 --- affected

People

(Reporter: aselagea, Assigned: aselagea)

References

(Blocks 1 open bug)

Details

(Whiteboard: [stylo])

Attachments

(4 files)

Two things needed here: - we shouldn't run regular talos tests for "stylo-disabled" platforms - "stylo-disabled" talos tests should be displayed under the corresponding platforms
Priority: -- → P1
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #0) > Two things needed here: > - we shouldn't run regular talos tests for "stylo-disabled" platforms Alin, what do you you mean by "regular" Talos tests? IIUC, we want to continue running Talos tests for stylo-disabled platforms so they don't regress performance.
Blocks: 1397829
No longer depends on: 1397829
(In reply to Chris Peterson [:cpeterson] from comment #1) > (In reply to Alin Selagea [:aselagea][:buildduty] from comment #0) > > Two things needed here: > > - we shouldn't run regular talos tests for "stylo-disabled" platforms > > Alin, what do you you mean by "regular" Talos tests? IIUC, we want to > continue running Talos tests for stylo-disabled platforms so they don't > regress performance. e.g. https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=linux64 On "Linux x64" we have talos tests like the following: - test-linux64/opt-talos-g1-e10s - test-linux64/opt-talos-g1-stylo-disabled-e10s - test-linux64/opt-talos-g2-e10s - test-linux64/opt-talos-g2-stylo-disabled-e10s - test-linux64/opt-talos-g3-e10s - test-linux64/opt-talos-g3-stylo-disabled-e10s - test-linux64/opt-talos-g4-e10s - test-linux64/opt-talos-g4-stylo-disabled-e10s - test-linux64/opt-talos-g5-e10s - test-linux64/opt-talos-g5-stylo-disabled-e10s ...and so on. As you can see, we run both the "regular" variant and the "stylo-disabled" variant on the same platform. At the same time, on "linux64-stylo-disabled", we run no "stylo-disabled" talos tests. What I *think* we should do is move these "stylo-disabled" talos tests on the corresponding platforms (e.g. "linux64-stylo-disabled") and leave the regular ones running on the default platforms (e.g. "Linux x64"). Note: talos tests run on hardware machines and their number is limited. That being said, I don't think it helps running the regular talos tests on the "stylo-disabled" platforms since we'll get duplicate runs and that'll put significant strain on our talos machine pool. @Chris: does the approach described above work for you?
Flags: needinfo?(cpeterson)
Since I have several patches prepared, I will attach them on the bug. Depending on the answer we get from #comment 2, we could adjust the list of tests to run a bit later.
Attachment #8913631 - Flags: review?(jmaher)
Assignee: nobody → aselagea
Attached file builder_diff.txt
Few considerations here: 1. OS X 10.10.5 "stylo-disabled" talos tests are not needed in BB, as they run in TC (cross-compiled) - so I removed them 2. moved "stylo-disabled" talos tests to run on the corresponding "stylo-disabled" platforms 3. in bug 1397829 we also added entries for the regular talos tests to run on the "stylo-disabled" platforms. I removed them for two reasons: - avoid running duplicates (see #comment 2 above) - we are dealing with some builder limits, so we need to drop some builds
Comment on attachment 8913631 [details] [diff] [review] bug1403600_bbconfigs.patch Review of attachment 8913631 [details] [diff] [review]: ----------------------------------------------------------------- at first I was confused, then I saw the logic here- great stuff!
Attachment #8913631 - Flags: review?(jmaher) → review+
Attachment #8913634 - Flags: review?(aobreja)
Attachment #8913635 - Flags: review?(aobreja)
Note: the patches above will need to land only *after* the last two patches in bug 1397829 do so.
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #2) > (In reply to Chris Peterson [:cpeterson] from comment #1) > > (In reply to Alin Selagea [:aselagea][:buildduty] from comment #0) > > > Two things needed here: > > > - we shouldn't run regular talos tests for "stylo-disabled" platforms > > > > Alin, what do you you mean by "regular" Talos tests? IIUC, we want to > > continue running Talos tests for stylo-disabled platforms so they don't > > regress performance. > > e.g. > https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter- > searchStr=linux64 > > On "Linux x64" we have talos tests like the following: > - test-linux64/opt-talos-g1-e10s > - test-linux64/opt-talos-g1-stylo-disabled-e10s > - test-linux64/opt-talos-g2-e10s > - test-linux64/opt-talos-g2-stylo-disabled-e10s > - test-linux64/opt-talos-g3-e10s > - test-linux64/opt-talos-g3-stylo-disabled-e10s > - test-linux64/opt-talos-g4-e10s > - test-linux64/opt-talos-g4-stylo-disabled-e10s > - test-linux64/opt-talos-g5-e10s > - test-linux64/opt-talos-g5-stylo-disabled-e10s > ...and so on. As you can see, we run both the "regular" variant and the > "stylo-disabled" variant on the same platform. At the same time, on > "linux64-stylo-disabled", we run no "stylo-disabled" talos tests. Yes, I believe :jmaher initially added the "stylo-disabled" Talos tests as separate _test names_ because it was simpler at the time? However, it differs from what we do for Stylo disabled unit tests, where we re-use the existing tests, but with a stylo-disabled _test platform_. Currently, I believe we are doing this with Talos for all platforms: Linux, Windows, and macOS all run `stylo-disabled` _tests_ on the "regular" test platform and _no_ Talos runs take place on the "stylo-disabled" test platform. > What I *think* we should do is move these "stylo-disabled" talos tests on > the corresponding platforms (e.g. "linux64-stylo-disabled") and leave the > regular ones running on the default platforms (e.g. "Linux x64"). > Note: talos tests run on hardware machines and their number is limited. That > being said, I don't think it helps running the regular talos tests on the > "stylo-disabled" platforms since we'll get duplicate runs and that'll put > significant strain on our talos machine pool. Certainly we don't want to run them twice! (Do you see evidence that were are doing so today? Or do you mean it would start happening with other changes?) If you want to make Talos more like unit tests, it might make sense to remove the special stylo-disabled tests and instead move them to the stylo-disabled test platform. But if we do that, it would seem more logical to do it across all 3 desktop platforms. An alternative would be to leave things as they are. You can avoid duplication by just not running _any_ Talos on the "stylo-disabled" platforms. "stylo-disabled" things will eventually be removed after a few more cycles, so it might make sense to take the shortest path to avoiding duplication, rather than moving all the jobs around to hang off the test platform (even though that would feel more correct).
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #9) > Certainly we don't want to run them twice! (Do you see evidence that were > are doing so today? Or do you mean it would start happening with other > changes?) No, they're not running twice at this point, but we already have builder names added for them in bug 1397829 - which were removed in this bug for the reasons mentioned in #comment 4, point 3 :-) > If you want to make Talos more like unit tests, it might make sense to > remove the special stylo-disabled tests and instead move them to the > stylo-disabled test platform. But if we do that, it would seem more logical > to do it across all 3 desktop platforms. That is my plan too. Besides these "stylo-disabled" talos tests, we also want to enable "stylo-disabled" unit tests (for Windows 7 and Windows 10), so it would make sense to have them each under the appropriate platform. Even though Linux and OS X won't have "stylo-disabled" unit tests, I think it would be a good idea to have the above mentioned talos tests moved to a separate platform. That will avoid confusion and make the results more readable.
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #10) > That is my plan too. Besides these "stylo-disabled" talos tests, we also > want to enable "stylo-disabled" unit tests (for Windows 7 and Windows 10), > so it would make sense to have them each under the appropriate platform. I am not sure what you mean here. It looks like Windows unit tests already run on "stylo-disabled" test platforms: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=stylo-disabled%20windows > Even though Linux and OS X won't have "stylo-disabled" unit tests, I think > it would be a good idea to have the above mentioned talos tests moved to a > separate platform. That will avoid confusion and make the results more > readable. All platforms currently run some amount of unit tests on "stylo-disabled" test platforms, so again I am a bit unsure what you mean here. Are you referring to some subset of tests that use BB instead of TC? As for moving the Talos tests, I am happy to help via additional review to make sure things are correct if you like! (:jmaher or whoever should also review as well... Mostly I just want to double check that the tests still report to Perfherder correctly and that Stylo is truly disabled as expected.)
Flags: needinfo?(aselagea)
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #2) > What I *think* we should do is move these "stylo-disabled" talos tests on > the corresponding platforms (e.g. "linux64-stylo-disabled") and leave the > regular ones running on the default platforms (e.g. "Linux x64"). > Note: talos tests run on hardware machines and their number is limited. That > being said, I don't think it helps running the regular talos tests on the > "stylo-disabled" platforms since we'll get duplicate runs and that'll put > significant strain on our talos machine pool. > > @Chris: does the approach described above work for you? Sorry, I don't know. I defer to jryans for questions about these stylo-disabled tests. :)
Flags: needinfo?(cpeterson)
Attachment #8913634 - Flags: review?(aobreja) → review+
Attachment #8913635 - Flags: review?(aobreja) → review+
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #11) > I am not sure what you mean here. It looks like Windows unit tests already > run on "stylo-disabled" test platforms: > > https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter- > searchStr=stylo-disabled%20windows > > All platforms currently run some amount of unit tests on "stylo-disabled" > test platforms, so again I am a bit unsure what you mean here. Are you > referring to some subset of tests that use BB instead of TC? Yes, they *do* run, but they're TC-only. We'd also need to run some unit tests via buildbot bridge (e.g. schedule them in TC, but run them in BB). For that, we'd need to tweak some configs - that work is being handled in bug 1397829, please see [1] for a brief summary of what's needed. > As for moving the Talos tests, I am happy to help via additional review to > make sure things are correct if you like! (:jmaher or whoever should also > review as well... Mostly I just want to double check that the tests still > report to Perfherder correctly and that Stylo is truly disabled as expected.) [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1397829#c57
Flags: needinfo?(aselagea)
Okay, so I see one issue here: at this point, the existing 'stylo-disabled' talos tests run for the 'opt' builds [1] and for the 'pgo' ones [2]. If we were to move them to dedicated 'stylo-disabled' platforms, we could do it easily for 'opt' since we have the corresponding platform defined [3], but it seems more difficult for 'pgo' since we don't have the corresponding build platform. Would it make sense running them for 'opt' and 'debug'? @Joel, @Chris what do you suggest here? [1] https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#182 [2] https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#192 [3] https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/test-platforms.yml#119
Flags: needinfo?(jmaher)
Flags: needinfo?(cpeterson)
we only need talos for opt/pgo, not debug. The unittests will need to run on opt/debug though. In my mind if we had the problem solved except for PGO, that would be a great spot to be in. We do not have other references to pgo in test-platforms.yml, so I don't know if adding pgo there is a good idea. There might be a transform in taskcluster to account for pgo, that has always been a mystery to me.
Flags: needinfo?(jmaher)
That would mean keeping the pgo 'stylo-disabled' talos tests run where they do now and only move the opt talos ones..which is easy to do in TC, but much more complicated in BB since we'd need to separate pgo from opt :| What I think we can do here is keep 'stylo-disabled' talos tests run as they do now. e.g. https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=stylo-disabled%20talos As for the 'stylo-disabled' unittests, we can move them to separate platforms (bug 1397829) but I don't think we should also run the talos tests there since they would be duplicates. Does that work for you?
Flags: needinfo?(jmaher)
yes, that works for me, thanks!
Flags: needinfo?(jmaher)
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #14) > Would it make sense running them for 'opt' and 'debug'? > @Joel, @Chris what do you suggest here? Like Joel said, we don't need to run Talos on debug builds. Since the 'stylo-disabled' tests will only be around for a couple release cycles while we wait to make sure Stylo ships successfully, you probably don't need to invest a lot more time moving the stylo-disabled tests to a more complicated setup.
Flags: needinfo?(cpeterson)
Thanks for your input here. Since moving these talos tests is no longer required, I'll just 'wontfix' this bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: