Add the ability to fill in any skipped/coalesced jobs for a revision on demand by a privileged user (Treeherder part)

RESOLVED FIXED

Status

Tree Management
Treeherder
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: chmanchester, Assigned: vaibhav1994)

Tracking

(Blocks: 1 bug)

Details

Attachments

(3 attachments, 3 obsolete attachments)

(Reporter)

Description

3 years ago
This will involve multiple pieces -- a simple treeherder ui and a pulse backend, as well as some way to limit treeherder actions to privileged users (sheriffs, for now).

Most generally we want to go from whatever tests we have for a revision to the full set of tests (those that would be run without the influence of SETA or coalesced/periodic builds) in the click of a button.
(Reporter)

Updated

3 years ago
Depends on: 1179016
(Reporter)

Updated

3 years ago
Depends on: 1032163

Comment 1

3 years ago
chmanchester: IIUC we don't need to block on bug 1032163.
We can show certain elements on TH for "is_staff" (aka sheriffs group).

I think we might be able to pull this project off even though I was about to say that we should give up until we move the scheduling to TC.

The way that this could work is:
1 - We hear that a revision should be filled in
 * We request for all *possible* jobs (some build jobs might have not finished)
--------> If we stop here, then we would at least fixed any coalescing that might have happened OR non-scheduled jobs (SETA's doing).

2 - We start listening from treeherder for any *coalesced* jobs
 * I'm just using this event as a way to try to act again + we know that we have to act again
 * Perhaps we were already listening *but* not acting on it
 * We request that job again OR try again to fill in

Some information about coalescing
#################################
11:13 armenzg: catlee: is there an easy way to add pulse publishing for job coalescing?
11:14 catlee: armenzg: at what point?
11:14 armenzg: catlee: we're trying to implement a feature that watches a revision and tries to make sure that *all* possible jobs that could have run is scheduled and run
11:15 armenzg: there are various things that make this complicated with BuildAPI
11:15 armenzg: I was thinking of listening to builds being finished
11:15 catlee: the build finish events should have all the info you need
11:15 armenzg: and then try to make sure that every test job is actually run
11:16 armenzg: I'm worried that some test jobs get coalesced
11:16 armenzg: so listening to those would be great
11:16 armenzg: I guess the buildbot schedulers don't send any pulse messages
11:17 catlee: no, and coalescing only happens when a job starts
11:18 catlee: all the log_uploaded events should include the request_ids of the build
11:18 armenzg: I wonder if "filling in" a revision should be a project to stop and wait until we move to TaskCluster
11:18 catlee: that will tell you if it's been coalesced
11:18 catlee: certainly much easier with TC
11:18 catlee: and we're not going to make many changes in BB to support new workflows
11:19 armenzg: catlee: "log_uploaded" sounds like an event that happens on a job finishing rather than starting?
11:19 armenzg: I hear ya
11:20 catlee: yeah
11:20 catlee: when it's done
11:20 armenzg: that's a bit late for me to know

Comment 2

3 years ago
This depends on a bit of a clean up in here:
https://github.com/armenzg/mozilla_ci_tools/pull/270

Comment 3

3 years ago
Removing the dependency on bug 1032163 since we can deliver this feature to sheriffs to begin with.
No longer depends on: 1032163
(Assignee)

Comment 4

3 years ago
Created attachment 8630584 [details]
fill-in.png

For the front-end on TH, will the button in a dropdown as shown in the image ( http://i.imgur.com/Mi3nYr7.png ) be fine?
(Reporter)

Updated

3 years ago
Depends on: 1180897
No longer depends on: 1179016
(Assignee)

Comment 5

3 years ago
Created attachment 8632311 [details] [review]
github-pr

:mdoglio, :jmaher I would like to have your feedback on this patch. This patch basically adds a button to the TH ui, and when it is pressed, it sends an http post request to treeherder.mozilla.org/api/project/try/resultset/58651/fill_in/

It then subsequently is picked up by the backend, which publishes a message to pulse.
Attachment #8632311 - Flags: feedback?(mdoglio)
Attachment #8632311 - Flags: feedback?(jmaher)
(Assignee)

Comment 6

3 years ago
Created attachment 8632314 [details]
Fill_in UI Button

The Fill-In button is added next to the "cancel_all" button in the TH UI. This button will be visible to is_staff (sheriffs) only. As seen in the picture, there are two button with the same icon, we need to decide an icon for the fill-in one.
Attachment #8630584 - Attachment is obsolete: true
(Assignee)

Comment 7

3 years ago
Created attachment 8632396 [details]
Fill_in UI Button

Updated the UI. Now there is a "Fill Revision" text written in the button on the right hand side beside "cancel_all" option.
Attachment #8632314 - Attachment is obsolete: true
Comment on attachment 8632311 [details] [review]
github-pr

I like the 'fill revision' text, that is nice.

the flow of a message -> pulse seems good as well.  What I don't understand is where do we get the list of jobs from?  there are a lot of hidden jobs that might be run.  Would this run all pgo/opt jobs as well?
Attachment #8632311 - Flags: feedback?(jmaher) → feedback+
(Assignee)

Comment 9

3 years ago
Joel you have asked a good and valid question, I am happy to explain the process here:

1) The above patch publishes to the (revision, branch name) to pulse service.
2) We will then create a pulse listener in pulse_actions[1] to listen to the "fill_in" request
3) Then we will call a function in mozci[2] to trigger the remaining jobs of a revision

[1] - https://github.com/adusca/pulse_actions/blob/master/pulse_actions/handlers/treeherder_buildbot.py#L30
[2] - https://github.com/armenzg/mozilla_ci_tools/blob/master/mozci/mozci.py#L446

Regarding the jobs that will be triggered for filing, all jobs (except 'hg bundle' and 'b2g') will be triggered[3]
[3] - https://github.com/armenzg/mozilla_ci_tools/blob/master/mozci/mozci.py#L451
So this means that pgo jobs will be triggered as well, though we can remove them if we want to. So, should we also trigger pgo jobs while filling in a revision or not?
in the patch there are mentions of getting the job list, I was a bit confused.  Calling mozci for it makes sense.

I don't think we should trigger pgo- that is a periodic builder and we will need another button for periodic builders.
(Assignee)

Comment 11

3 years ago
Ok. I'll add pgo to the list of jobs (['hg bundle', 'b2g']) to be removed.
:vaibhav1994 your patch looks good to me. My only concern is the use of the term 'fill jobs' in the ui. I think something like 'trigger missing jobs' would be more understandable for our users.
Attachment #8632311 - Flags: feedback?(mdoglio) → feedback+
could we fit this as a line item in the 'action items' drop down menu?  or create a secondary drop down menu for strictly sheriff tools.

Either way with 'fill missing jobs', 'bisection failure', 'schedule periodic builders', etc. we will have more specific sheriff tools in the future.
(Assignee)

Comment 14

3 years ago
Created attachment 8632877 [details]
Fill_in UI Type 1
Attachment #8632396 - Attachment is obsolete: true
(Assignee)

Comment 15

3 years ago
Created attachment 8632881 [details]
Fill_in UI Type 2

Hi sheriffs,
I am not sure which UI for fill_in button is suitable, and would like to know whether you like "Fill_in UI Type 1" button or "Fill_in UI Type 2" button. Your feedback is appreciated. Thanks!
Attachment #8632881 - Flags: feedback?(wkocher)
Attachment #8632881 - Flags: feedback?(ryanvm)
Attachment #8632881 - Flags: feedback?(cbook)
I prefer Type 2.
(Assignee)

Updated

3 years ago
Attachment #8632311 - Flags: review?(mdoglio)
(Assignee)

Comment 17

3 years ago
:mdoglio, I have made the changes, and currently have a button "Trigger Missing Jobs" in the Action Panel view ( "Fill_in UI Type 2" pic). Could you please review this patch?
Yeah, 2's nice.

Updated

3 years ago
Attachment #8632881 - Flags: feedback?(wkocher) → feedback+
(Assignee)

Comment 19

3 years ago
Comment on attachment 8632881 [details]
Fill_in UI Type 2

Sticking to "Fill_in UI Type 2" Button.
Attachment #8632881 - Flags: feedback?(ryanvm)
Attachment #8632881 - Flags: feedback?(cbook)
(Assignee)

Updated

3 years ago
Attachment #8632877 - Flags: feedback-

Updated

3 years ago
Assignee: nobody → vaibhavmagarwal
Status: NEW → ASSIGNED
Component: General → Treeherder
Product: Testing → Tree Management
Version: unspecified → ---

Updated

3 years ago
Summary: Add the ability to fill in any skipped/coalesced jobs for a revision on demand by a privileged user → Add the ability to fill in any skipped/coalesced jobs for a revision on demand by a privileged user (Treeherder part)
Comment on attachment 8632311 [details] [review]
github-pr

I added a few comments to the pull request, r? me when it's ready for a second review
Attachment #8632311 - Flags: review?(mdoglio) → review-
(Assignee)

Comment 21

3 years ago
Comment on attachment 8632311 [details] [review]
github-pr

Fixed the comments, and rebased on latest master. :mdoglio, could you please review this?
Attachment #8632311 - Flags: review- → review?(mdoglio)
Attachment #8632311 - Flags: review?(mdoglio) → review+

Comment 22

3 years ago
Commits pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/2924564820cc721b98e745df1f35ba072e95e959
Bug 1178524-Publish trigger_missing_jobs action for revision to pulse

https://github.com/mozilla/treeherder/commit/09a5a1dc2da6e59d39e34d24b4db97306a56044d
Merge pull request #747 from vaibhavmagarwal/th-buttons

Bug 1178524 - Publish trigger_missing_jobs action for revision to pulse
(Assignee)

Comment 23

3 years ago
The treeherder part is checked-in and in production. We can close this bug.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.