Bug 1163802
Opened 10 years ago
Closed 9 years ago
Running jobs should not have their state manually changed to usercancel when the cancel button is pressed
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
Attachments
(1 file)
In bug 1163659 it was discovered that a number of the stuck "loading" jobs in the objectstore had an equivalent completed (with result "usercancel") job in the jobs table.
This is because we pre-emptively mark jobs as cancelled when people press the cancel button.
This behaviour is necessary for _pending_ jobs (once cancelled, they never appear in builds-4hr), but for _running_ jobs it produces a duplicate.
IMO we should just stop pre-emptively marking running jobs (rather than tolerating the dupe): if the buildapi cancel request never makes it through, we show the job as cancelled even though it is still running in builds-running.js.
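A minimal sketch of the proposed behaviour, assuming a simplified job model. The `Job` class, `cancel_job` and `request_buildapi_cancel` are hypothetical names used for illustration only; they are not Treeherder's actual code or the buildapi client.

```python
from dataclasses import dataclass


@dataclass
class Job:
    guid: str
    state: str              # "pending", "running" or "completed"
    result: str = "unknown"


def request_buildapi_cancel(job: Job) -> bool:
    """Hypothetical stand-in for the buildapi cancel request."""
    return True


def cancel_job(job: Job, send_cancel=request_buildapi_cancel) -> Job:
    # Always ask buildapi to cancel the job.
    send_cancel(job)
    # Only pending jobs are marked by hand, since a cancelled pending
    # job never shows up in builds-4hr and would otherwise be orphaned.
    if job.state == "pending":
        job.state = "completed"
        job.result = "usercancel"
    # Running jobs are left untouched; their final state is expected to
    # arrive through normal ingestion.
    return job


pending = cancel_job(Job("abc", "pending"))
running = cancel_job(Job("def", "running"))
print(pending.state, pending.result)
print(running.state, running.result)
```

With this shape, a failed cancel request for a running job leaves the recorded state matching what buildbot reports, and no duplicate row is created.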
Comment 1•10 years ago
James, when taskcluster jobs are cancelled using Treeherder, does taskcluster later submit the stopped/cancelled "completed" job to Treeherder? I.e., can Treeherder just wait until it sees that the job was definitely cancelled, or does Treeherder need to mark the job as cancelled in the jobs table itself?
Flags: needinfo?(jlal)
Comment 3•9 years ago
Updated•9 years ago
Attachment #8757959 - Flags: review?(cdawson)
Comment 4•9 years ago
Comment on attachment 8757959
[treeherder] mozilla:only-force-usercancel-pending > mozilla:master
Cool, yeah. Good call.
Attachment #8757959 - Flags: review?(cdawson) → review+
Comment 5•9 years ago
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/443c1d56b66f06e837d01b4927c144dc1d2797ad
Bug 1163802 - For job cancellation only manually update pending jobs
If a buildbot job is cancelled whilst in the pending state, the
resultant "no longer running" job does not appear in builds-4hr (due to
buildbot limitations), which causes orphaned jobs in Treeherder.
To work around this, Treeherder pre-emptively updates the DB for jobs
when a user cancels them using the Treeherder interface. This is not
ideal, since the manual DB update can race with the job completing (if
the job finished before buildapi had time to cancel it), or worse can
show the job as cancelled even if the buildapi request failed to
complete, leaving the job as running even though Treeherder's UI will
show it as stopped.
We have to keep on performing this suboptimal workaround for pending
jobs (until we stop using buildbot), but we can at least limit it to
just the pending jobs case, since running jobs never needed this
workaround in the first place.
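
To illustrate the mismatch described in the commit message, here is a toy model of what Treeherder would record when the buildapi request fails and a running job carries on. The function name and the `mark_running_jobs` flag are assumptions made for this sketch, not part of the actual commit.

```python
def recorded_after_cancel(state: str, mark_running_jobs: bool) -> tuple:
    """Return the (state, result) recorded right after the cancel click."""
    if state == "pending" or (state == "running" and mark_running_jobs):
        return ("completed", "usercancel")
    return (state, "unknown")


# Assume the buildapi request failed, so the job is really still running.
actual_state = "running"

old_view = recorded_after_cancel("running", mark_running_jobs=True)
new_view = recorded_after_cancel("running", mark_running_jobs=False)

print("old behaviour matches reality:", old_view[0] == actual_state)
print("new behaviour matches reality:", new_view[0] == actual_state)
```

Pending jobs are recorded as cancelled in both cases, which is the workaround that has to stay.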
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED