Open Bug 1261953 Opened 8 years ago Updated 2 years ago

Formalize a new job priority system for getting early feedback on test results

Categories

(Testing :: General, defect)

Tracking

(Not tracked)

People

(Reporter: ahal, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

We've often talked about having a tiered system, where fast running tests and frequently failing tests get run first, followed by the rest only if the first ones pass. The benefit is that the lower tiered jobs (the ones that run first) will be fast running and contain the riskiest tests, so failures will surface quickly. This helps developers waiting for try results, helps sheriffs close the tree and back out bustage faster, and saves machine resources in cases where all jobs would have failed.

Previously this was always a bit of a pipe dream, but since then a lot of pieces have fallen into place that will make this considerably easier.

* taskcluster graphs provide an easy way to implement the scheduling dependencies
* manifestparser tags/subsuites provide an easy way to move a test into/out of a tier
* autostar can be used to detect and ignore known intermittents (this hasn't landed yet)
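
To make the scheduling idea concrete, here is a rough sketch of how tier dependencies could be derived. It uses plain Python dicts, not the real taskcluster graph format; the task labels and the `add_tier_dependencies` helper are hypothetical and only illustrate "each tier waits on the tier before it".

```python
from collections import defaultdict


def add_tier_dependencies(tasks):
    """Map each task label to the labels it must wait for.

    `tasks` maps a (hypothetical) task label to its tier, where tier 1
    runs first. Every task depends on all tasks in the nearest lower
    tier that is present, so a later tier only starts once the earlier
    one has finished.
    """
    by_tier = defaultdict(list)
    for label, tier in tasks.items():
        by_tier[tier].append(label)

    deps = {}
    previous = []  # labels in the tier scheduled just before this one
    for tier in sorted(by_tier):
        for label in by_tier[tier]:
            deps[label] = sorted(previous)
        previous = by_tier[tier]
    return deps


# Made-up labels, purely for illustration.
tasks = {
    "mochitest-fast": 1,
    "xpcshell-risky": 1,
    "mochitest-rest": 2,
    "reftest-rest": 2,
}
graph = add_tier_dependencies(tasks)
```

In this sketch the tier 1 tasks have no dependencies and both tier 2 tasks depend on both tier 1 tasks. Whether a failed dependency should actually cancel the later tier is a scheduler policy question that this does not address.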

Aside from landing autostar, the hard part is figuring out which tests qualify as "fast" (can use ActiveData for this), and which tests fail frequently (can use OrangeFactor and/or SETA for this).
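
As a starting point for discussion, the classification could look something like the sketch below. The `TestStats` record, the `assign_tier` helper and both thresholds are assumptions made up for illustration; the real numbers would have to come from ActiveData and OrangeFactor/SETA, and nothing here reflects their actual APIs.

```python
from dataclasses import dataclass


@dataclass
class TestStats:
    """Per-test history (hypothetical shape, not an ActiveData schema)."""
    name: str
    mean_runtime_s: float
    runs: int
    failures: int


def assign_tier(stats, max_runtime_s=5.0, min_failure_rate=0.02):
    """Return 1 for tests that are fast or fail often, otherwise 2.

    Both thresholds are placeholder values and would need tuning
    against real data.
    """
    failure_rate = stats.failures / stats.runs if stats.runs else 0.0
    is_fast = stats.mean_runtime_s <= max_runtime_s
    is_risky = failure_rate >= min_failure_rate
    return 1 if (is_fast or is_risky) else 2
```

A test with no recorded runs is treated as having a zero failure rate here, so it only lands in tier 1 if it is fast. New tests might deserve the opposite default, since they are arguably the riskiest.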

This will also take a little massaging of the taskcluster configs. There's currently a project to refactor the taskcluster config setup which may be worth blocking on as well.
Summary: Formalize a new level system for getting early feedback on test results → Formalize a new job priority system for getting early feedback on test results
Depends on: 1243759
You can probably also use ActiveData for test failures.
We're going to make SETA use ActiveData for this info.

We will probably need to shut off Buildbot scheduling for try and instead use BBB/TC scheduling. It makes it easier to create tiered graphs for Buildbot.
Depends on: 1263185
No longer depends on: 1243759
Severity: normal → S3