Closed Bug 1163802 Opened 10 years ago Closed 9 years ago

Running jobs should not have their state manually changed to usercancel when the cancel button is pressed

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file)

In bug 1163659 it was discovered that a number of the stuck "loading" jobs in the objectstore had an equivalent completed job (with result "usercancel") in the jobs table. This is because we pre-emptively mark jobs as cancelled when people press the cancel button.

This behaviour is necessary for _pending_ jobs (since when cancelled they don't end up in builds-4hr), but for _running_ jobs it results in a dupe.

IMO we should just stop pre-emptively marking running jobs (rather than allowing the dupe), since if the buildapi cancel request never makes it through, we show the job as cancelled even though it's still running in builds-running.js.
James, when taskcluster jobs are cancelled using Treeherder, does taskcluster later submit the stopped/cancelled "completed" job to Treeherder? i.e. can Treeherder just wait until it sees the job was definitely cancelled, or does Treeherder need to mark the job as cancelled in the jobs table itself?
Flags: needinfo?(jlal)
We submit cancelled/completed later regardless
Flags: needinfo?(jlal)
Attachment #8757959 - Flags: review?(cdawson)
Comment on attachment 8757959
[treeherder] mozilla:only-force-usercancel-pending > mozilla:master

Cool, yeah. Good call.
Attachment #8757959 - Flags: review?(cdawson) → review+
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/443c1d56b66f06e837d01b4927c144dc1d2797ad
Bug 1163802 - For job cancellation only manually update pending jobs

If a buildbot job is cancelled whilst in the pending state, the resultant "no longer running" job does not appear in builds-4hr (due to buildbot limitations), which causes orphaned jobs in Treeherder. To work around this, Treeherder pre-emptively updates the DB for jobs when a user cancels them using the Treeherder interface.

This is not ideal, since the manual DB update can race with the job completing (if the job finished before buildapi had time to cancel it), or worse can show the job as cancelled even if the buildapi request failed to complete, leaving the job as running even though Treeherder's UI will show it as stopped.

We have to keep on performing this suboptimal workaround for pending jobs (until we stop using buildbot), but we can at least limit it to just the pending jobs case, since running jobs never needed this workaround in the first place.
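For readers who want the gist without opening the commit, the rule described in the commit message can be sketched roughly as below. This is not the actual Treeherder code: the `Job` class, the `cancel_job` function, the `send_cancel_request` callback and the order of the two steps are all hypothetical names and assumptions made for illustration. Only the states ("pending", "running", "completed") and the "usercancel" result come from this bug.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    """Hypothetical stand-in for a row in the jobs table."""
    guid: str
    state: str  # "pending", "running" or "completed"
    result: str = "unknown"


def cancel_job(job: Job, send_cancel_request: Callable[[str], None]) -> bool:
    """Request cancellation; return True if the job row was force-updated.

    `send_cancel_request` is an assumed callback standing in for whatever
    forwards the cancel request to the build system (e.g. buildapi).
    """
    # Always forward the cancel request, whatever state the job is in.
    send_cancel_request(job.guid)

    if job.state != "pending":
        # Running jobs are left untouched: the real completed job is
        # expected to be ingested later, so a manual update would only
        # create a dupe or show a wrong state if the request failed.
        return False

    # Pending jobs never show up in builds-4hr once cancelled, so the
    # row is updated manually to avoid leaving an orphaned job.
    job.state = "completed"
    job.result = "usercancel"
    return True
```

With this shape, a cancelled pending job is marked as completed/usercancel straight away, while a running job keeps its "running" state until the completed job arrives through normal ingestion.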
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED