Open Bug 1298910 Opened 8 years ago Updated 2 years ago

Cancel jobs for obsoleted changesets

Categories

(Firefox Build System :: Task Configuration, task)

task

Tracking

(Not tracked)

People

(Reporter: gps, Unassigned)

References

(Blocks 1 open bug)

Details

Over in bug 1288845, we're rolling out support for "dropping" "bad" changesets from the autoland repo (as opposed to using backout commits).

As described in bug 1298563, any commit after a "dropped" changeset has a new SHA-1 and has its pushlog entry copied. So it appears to automation like there was a new push.

This has capacity implications for automation. When a "drop" is performed, there will effectively be two instances of automation running for every descendant of the dropped changesets: one against the original changesets and one against their rewritten copies. Automation results for the original changesets provide no long-term value, since those changesets won't be in the final history of mozilla-central. So our thinking was that we should cancel automation for the original (now obsoleted) changesets so we don't waste resources generating results.

Bug 1298563 describes a new Pulse exchange that can be used to listen for dropped changesets. We should be able to hook up a consumer that looks for "obsoleted" changesets and then cancels any buildbot and taskcluster automation running or pending against the old changeset.
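A rough sketch of what the consumer's core logic could look like, for illustration only. The payload key `obsoleted_changesets` and both callbacks are hypothetical; the actual message format is whatever bug 1298563 defines, and the cancel callback would need one implementation per backend (buildbot and taskcluster).

```python
from typing import Callable, Iterable, List


def handle_obsolete_message(
    payload: dict,
    find_task_groups: Callable[[str], Iterable[str]],
    cancel_task_group: Callable[[str], None],
) -> List[str]:
    """Cancel automation for every changeset the message marks obsolete.

    `payload["obsoleted_changesets"]` is an assumed field name, not the
    real Pulse schema. Returns the task group ids that were cancelled,
    in the order they were cancelled.
    """
    cancelled: List[str] = []
    for node in payload.get("obsoleted_changesets", []):
        for group_id in find_task_groups(node):
            # Several obsoleted changesets can belong to the same push,
            # so make sure each group is only cancelled once.
            if group_id in cancelled:
                continue
            cancel_task_group(group_id)
            cancelled.append(group_id)
    return cancelled
```

Passing the lookup and cancel steps in as callables keeps the message handling independent of where the changeset-to-job mapping ends up living.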
Assignee: nobody → armenzg
Summary: Cancel jobs for obsoleted changesets → pulse_actions - Cancel jobs for obsoleted changesets
Assignee: armenzg → nobody
Blocks: 1365318
No longer blocks: 1288845
Shutting pulse_actions off (see bug 1379172).
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Moving bug to TaskCluster component then, as that seems like the next logical place for it.

jonasfj: do you think it is appropriate for this to live as part of the existing Pulse -> Decision scheduling code? Where is that code, anyway? Essentially we want certain notification messages to result in cancelling a task group. I reckon this can live anywhere; I'm just not sure what the most appropriate place for it is.
Status: RESOLVED → REOPENED
Component: General Automation → General
Flags: needinfo?(jopsen)
Product: Release Engineering → Taskcluster
QA Contact: catlee
Resolution: WONTFIX → ---
Summary: pulse_actions - Cancel jobs for obsoleted changesets → Cancel jobs for obsoleted changesets
Status: REOPENED → NEW
This could live in mozilla-taskcluster.

Long term, we dream of killing said service, and using tc-hooks triggered by pulse-messages, and when that day comes this is just another case for that hook to handle.

@garndt, (the unlucky one listed as owner in heroku)
I think we would need to store a mapping: push_log_id -> taskGroupId, in mozilla-taskcluster.
Does it already have that in one of its many data stores? :)
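To make the mapping concrete, here is a minimal in-memory sketch. The class and method names are made up for illustration, and a real service would need persistent storage rather than a dict.

```python
from collections import defaultdict
from typing import DefaultDict, List


class PushTaskGroupIndex:
    """Hypothetical push_log_id -> taskGroupId mapping (in-memory only)."""

    def __init__(self) -> None:
        self._groups: DefaultDict[int, List[str]] = defaultdict(list)

    def record(self, push_log_id: int, task_group_id: str) -> None:
        # A push can end up with more than one task group (for example
        # after a retrigger), so keep a list and skip duplicates.
        if task_group_id not in self._groups[push_log_id]:
            self._groups[push_log_id].append(task_group_id)

    def lookup(self, push_log_id: int) -> List[str]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._groups.get(push_log_id, []))
```

`record` would be called when a decision task is scheduled for a push, and `lookup` when an "obsoleted" notification arrives for that push.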
Flags: needinfo?(jopsen) → needinfo?(garndt)
It does not contain much data other than repos it cares about and the last push ID it saw.
Flags: needinfo?(garndt)
I also think at this point we're getting close to mozilla-taskcluster really not doing much (ideally nothing once hooks is updated), so I'm a -1 on wanting to add a responsibility to mozilla-taskcluster.
Same.  I think this should be implemented as a hook listening to pulse messages and running some in-tree code that then figures out what it should cancel.
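For the "figures out what it should cancel" part, something along these lines might work once a task group id is known. This is a sketch under assumptions: `queue` is any object exposing `listTaskGroup` and `cancelTask` methods modelled on the Taskcluster Queue API, and the exact client call signature (in particular the `query=` keyword) should be checked against the client library actually in use.

```python
from typing import List

# Task states that still consume, or are about to consume, capacity.
ACTIVE_STATES = {"unscheduled", "pending", "running"}


def cancel_active_tasks(queue, task_group_id: str) -> List[str]:
    """Cancel every unfinished task in a task group.

    Walks the paginated task listing and cancels only tasks that have
    not finished yet. Returns the ids of the tasks it asked to cancel.
    """
    cancelled: List[str] = []
    token = None
    while True:
        # Assumed call shape: continuation token passed as a query option.
        query = {"continuationToken": token} if token else {}
        page = queue.listTaskGroup(task_group_id, query=query)
        for entry in page.get("tasks", []):
            status = entry["status"]
            if status["state"] in ACTIVE_STATES:
                queue.cancelTask(status["taskId"])
                cancelled.append(status["taskId"])
        token = page.get("continuationToken")
        if not token:
            return cancelled
```

Skipping tasks that have already completed or failed avoids pointless API calls and leaves finished results untouched.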
Depends on: 1286989
Ideally autoland will send some kind of notification when this occurs, and we can then build a hook that listens for that notification and runs an in-tree action. I don't know whether this obsoleting was ever implemented.
Component: General → Task Configuration
Product: Taskcluster → Firefox Build System
Severity: normal → S3