Treeherder should send pulse messages for "cancel all"

RESOLVED FIXED

Status

defect
P3
normal
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: emorley, Assigned: vaibhav1994)

Tracking

(Blocks 1 bug)

Details

Attachments

(2 attachments)

46 bytes, text/x-github-pull-request
camd
: review+
Details | Review
46 bytes, text/x-github-pull-request
camd
: review+
Details | Review
Reporter

Description

4 years ago
At the moment, we only send pulse messages for retrigger and cancel, not cancel all.

This causes several issues:
1) Cancel all doesn't cancel taskcluster jobs at the moment
2) Once we switch to using mozci cancellation for buildbot jobs, it won't work for those either

There are two solutions for this:
(a) Make cancelAll() just do a cancel() for every job in that resultset
(b) Have a specific cancelAll() pulse message format, that both taskcluster and mozci know to listen to.
Reporter

Updated

4 years ago
Blocks: 1077053
adusca: for context, self-serve/buildapi allows cancelling all jobs at once (different from cancelling one at a time):
> "Cancel all builds on this revision."
https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/98e9b20f9d4d

> DELETE	/self-serve/{branch}/rev/{revision}	Cancels all running or pending builds on this revision
emorley: I was not able to hear clearly on the call so I will rephrase what I understood.

I believe you mentioned that we can have "retrigger/cancel" do not prompt the user while we can let "cancel all" still prompt users with the LDAP/buildapi prompt. Did I understand this well?

If this is the case, can I assume that we could move forward with the single trigger/cancel actions independently from this?

If the "cancel all" sent a pulse action message telling us the _branch_ and the _revision_, that is all that we would need [1].
What code changes would be required for this? I'm happy to give it a try.
Flags: needinfo?(emorley)
Thank you!

For now, we will also have to wait on bug 1032163 to deploy this but I guess the leg work for the UI + pulse messages can be accomplished.
Assignee

Comment 6

4 years ago
(In reply to Ed Morley [:emorley] from comment #0)
> There are two solutions for this:
> (a) Make cancelAll() just do a cancel() for every job in that resultset

Cancel_All already sends a 'cancel' action for every job: https://github.com/mozilla/treeherder/blame/f92ba96cb29230a5cd8638142abb8f7441dddc90/treeherder/model/derived/jobs.py#L303-L304


In fact, cancelAll() function ( https://github.com/mozilla/treeherder/blob/c2103b66ad393f82677439957597adf7754850d0/ui/plugins/controller.js#L310-L313 ) is not called from any part of html code. AFAIK, only cancelAllJobs() function is called from the front-end html code ( https://github.com/mozilla/treeherder/blob/master/ui/partials/main/jobs.html#L32 )
Assignee

Comment 7

4 years ago
Posted file pr #771
Ed, I checked before and after removing cancelAll function from controller.js, and the request is going ok. Can you please review this patch?
Attachment #8634563 - Flags: review?(emorley)
Reporter

Comment 8

4 years ago
Comment on attachment 8634563 [details] [review]
pr #771

Deferring to Cameron - see https://github.com/mozilla/treeherder/pull/771#issuecomment-121926932 :-)
Attachment #8634563 - Flags: review?(emorley) → review?(cdawson)
Comment on attachment 8634563 [details] [review]
pr #771

To be clear: this patch doesn't fix this bug.  But it removes dead code that could be confusing while implementing a fix.  So it's definitely worth while.  :)
Attachment #8634563 - Flags: review?(cdawson) → review+
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/856efe658d93baf15871e8f5526fad686ac5e1b5
Bug 1171575 - Remove unused cancel_all function from controller.js

https://github.com/mozilla/treeherder/commit/972e889374a3651d8fc14c303505b3bd5013d531
Merge pull request #771 from vaibhavmagarwal/cancel-all-remove

Bug 1171575 - Remove unused cancel_all function from controller.js
What is the plan forward after the finding?
Assignee

Comment 12

4 years ago
I think we already cancel all jobs at once via buildapi. See this: https://github.com/mozilla/treeherder/blob/1900b665e4f102490ff2b8cd6d49ef6c3074f6ad/ui/js/services/buildapi.js#L62
It gives a direct call to:
> DELETE	/self-serve/{branch}/rev/{revision}   Cancels all running or pending builds on this revision
So, I think we are good, and do not need to change anything in the case of buildapi jobs.

On publishing to pulse side, we give notification to cancel jobs one by one:
https://github.com/mozilla/treeherder/blob/f92ba96cb29230a5cd8638142abb8f7441dddc90/treeherder/model/derived/jobs.py#L303-L304
With the resultset-action exchange now in place, I can easily change this to give a cumulative "cancel_all" action, but that will change the existing behavior, and I am not sure if there is a pulse_listener for "cancelling" jobs in non-buildbot land.

Ed, Armen, can you let me know if we should change the existing behavior for publishing to pulse for "cancel_all", taking into account the above circumstances?
Flags: needinfo?(emorley)
Flags: needinfo?(armenzg)
Reporter

Comment 13

4 years ago
sgtm :-)
Flags: needinfo?(emorley)
I think I misunderstood your comment on IRC.
If you change the current behaviour we won't be able to cancel jobs for TaskCluster.
If you can do both it would be ideal.

If you keep the current behaviour, you will have to teach mozci/pulse_actions to cancel one job at a time.
Flags: needinfo?(armenzg)
Assignee

Comment 15

4 years ago
Posted file pr #793
:camd, could you review this small patch? Currently, it just sends an extra pulse action for 'cancel_all' and there is no listener for it, so I cannot remove the 'cancel' action for each job since taskcluster listens to it. Once we have a properly functional mozci backend for this, I will remove the 'cancel' action for each job.
Attachment #8636781 - Flags: review?(cdawson)
Comment on attachment 8636781 [details] [review]
pr #793

Chatted on IRC.  This looks right to me.
Attachment #8636781 - Flags: review?(cdawson) → review+
Assignee

Comment 17

4 years ago
camd, could you merge the pr in (don't want it to bitrot)?
Keywords: checkin-needed
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/8ac50505daf020b6185ed9fee80928fdb1bff996
Bug 1171575 - Treeherder should send pulse messages for cancel all

https://github.com/mozilla/treeherder/commit/437236f80b4a868a9c5dc3cce2ed2e9fe9db7910
Merge pull request #793 from vaibhavmagarwal/cancelall

Bug 1171575 - Treeherder should send pulse messages for cancel all
Reporter

Updated

4 years ago
Assignee: nobody → vaibhavmagarwal
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Assignee

Comment 19

4 years ago
Armen, with the current patch, we send both "cancel_all" message which mozci can work on in the future, and "cancel" action for time being (so that b2g jobs get cancelled). Ideally once we have shifted fully over to mozci, we will be able to remove the "cancel" action for individual jobs.
Coolio! Thanks Vaibhav!
I can't cancel all jobs for my pushes.
Perhaps we should revert this change and put it on hold until bug 1032163 is fixed.
Otherwise, we're going to have a lot of upset developers (knowing that they tend to cancel their try pushes if they mess up).
We also used to show a notification from Treeherder when we would request to cancel an individual job ("We have requested a cancellation").

I canceled individually a bunch of Taskcluster builds but saw no messages. Is this a regression from this bug?
Assignee

Comment 23

4 years ago
Discussed with Armen on IRC, he could not see "cancel_all" button on mozilla-inbound, because he does not have sheriff permissions (behavior is as expected). So, there is no regression as a result of this bug. The cancel message is shown for buildapi jobs, but not for certain b2g jobs (may be a regression there, not sure).
You need to log in before you can comment on or make changes to this bug.