Closed Bug 1141262 (opened 9 years ago, closed 9 years ago)

Add support for controlling mozci job backfilling via Treeherder's UI

Categories

(Tree Management :: Treeherder, defect, P4)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1183923

People

(Reporter: armenzg, Unassigned)

Recently I've added the ability for mozci to backfill jobs; that is, being able to start from a bad job and trigger everything needed up to the last good job [1].

I would like to know what would be required to make this happen or at least get our feet wet.

Would you suggest we wait for bug 1077053?

[1] http://armenzg.blogspot.ca/2015/03/mozci-030-support-for-backfilling-jobs.html
Adding a bit more context:

> From armenzg:
> On another note, how does the retrigger job system work? Do we make JavaScript calls? Could we use Python?
> Does Treeherder use the credentials of the developer?
> mozci uses credentials provided by the developer or reads them from a file on-disk.

From edmorley:
Bit of a mixture - the buildapi retrigger/cancels are done client side using LDAP auth, but we also hit the API for those calls, since we have to clean up pending jobs from the DB. In addition, non-buildbot jobs are now using the API to retrigger, since Treeherder will publish a pulse message to initiate them. The intention is to move the buildapi stuff server side too, if possible, soon (bug 1077053). For the retriggers/cancels sent via Treeherder's API, we'll be using the Persona auth that Treeherder uses for other things (eg sheriff features).

Matching issue: https://github.com/armenzg/mozilla_ci_tools/issues/111
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #0)
> Would you suggest we wait for bug 1077053?

Yeah I think that makes sense :-)
Depends on: 1077053
OS: Mac OS X → All
Priority: -- → P4
Hardware: x86 → All
> The intention is for the trigger button to hit the TH API (not implemented yet,
> bug 1077053), which would in turn hit self-serve/buildapi. Is this right?
Yes, that's what bug 1077053 is about.
Depends on: 1120997
No longer depends on: 1120997
This bug is on the "sheriff top issues" list that the sheriffs provide to the Treeherder meetings. However, before this can be tackled, bug 1077053 / bug 1168148 need to be fixed first.

After that, we need to have a chat about workflow and spec of pulse messages, plus how Treeherder will know what jobs are available to be retriggered (eg will we need to add a mozci API somewhere that Treeherder can query for the list of "missing jobs" that are available for retriggering?).
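A minimal sketch of what such a "missing jobs" query could compute, assuming the service knows every builder that could run on a push and which builders already have a job there. The function name and the builder names are hypothetical and are not part of mozci or Treeherder:

```python
from typing import Iterable, List


def missing_jobs(available_builders: Iterable[str],
                 scheduled_builders: Iterable[str]) -> List[str]:
    """Return the builders that could run on a push but have no job yet.

    Hypothetical helper: both arguments are plain builder names, and the
    result is sorted so the output is stable for a UI or an API response.
    """
    return sorted(set(available_builders) - set(scheduled_builders))


# Illustrative builder names only.
available = [
    "Linux mozilla-inbound build",
    "Linux mozilla-inbound opt test mochitest-1",
    "Linux mozilla-inbound opt test mochitest-2",
]
scheduled = ["Linux mozilla-inbound build"]

for name in missing_jobs(available, scheduled):
    print(name)
```

Whether this lives in Treeherder or in a separate service is exactly the open question; the sketch only shows the set difference that either would need.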

As such, I'm going to remove it from the "top issues" list, since it's not something we can fix soon. If backfilling is a major problem for sheriffs, perhaps we need to dial down the SETA dials for the short term?
Depends on: 1168148
Summary: Allow treeherder to backfill jobs → Add support for controlling mozci job backfilling via Treeherder's UI
We can have a local script that sheriffs can run on demand for backfilling until we have a more elegant solution. If we need to dial back SETA, that is fine as well.
mozci currently can almost give you this information:
> builders = _filter_builders_matching(fetch_allthethings_data()['builders'], " try ")
It might need some touches to make sure we're not missing some builders (I'm thinking of builders whose names start with "b2g_").

Another way to solve this (without the above) is to click on the existing job(s) we want to backfill (it can be a failing job) and prompt the user to select the starting revision.

We can send a treeherder action to 'backfill' indicating the buildername and starting revision.

We would then determine what needs backfilling and how many revisions back.
We have an inherent limit so that we do not walk back through revisions forever.
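The flow above could be sketched roughly as follows. This is an illustration under stated assumptions, not mozci's actual implementation: `revisions_to_backfill`, `MAX_REVISIONS`, and the shape of `results` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical cap on how far back we are willing to look.
MAX_REVISIONS = 20


def revisions_to_backfill(push_log: List[str],
                          results: Dict[str, str],
                          bad_revision: str,
                          max_revisions: int = MAX_REVISIONS) -> List[str]:
    """Pick the revisions that need a job for one buildername.

    push_log is ordered oldest to newest. results maps a revision to
    "success" or "failure" for that buildername; a revision with no
    entry never ran the job. We walk back from the bad revision, stop
    at the last good job, and look at most max_revisions back.
    """
    start = push_log.index(bad_revision)
    window = push_log[max(0, start - max_revisions):start]
    to_trigger = []
    for revision in reversed(window):
        status = results.get(revision)
        if status == "success":
            break  # reached the last good job
        if status is None:
            to_trigger.append(revision)  # job never ran here
    return to_trigger


push_log = ["r1", "r2", "r3", "r4", "r5"]
results = {"r1": "success", "r2": "success", "r5": "failure"}
print(revisions_to_backfill(push_log, results, "r5"))
```

In this example the job failed on r5 and last passed on r2, so the sketch selects r4 and r3, the revisions in between that never ran it.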

I believe the Treeherder UI and sending the actions over the pulse stream can be tinkered with in advance.

About the backfill support:
###########################
Set the scope on where to find the last good job:
https://github.com/armenzg/mozilla_ci_tools/blob/master/mozci/scripts/trigger.py#L163

Determine how far back the last good job was run:
https://github.com/armenzg/mozilla_ci_tools/blob/master/mozci/mozci.py#L476
(In reply to Armen Zambrano G. (:armenzg - Toronto) from comment #9)
> mozci currently can almost give you this information:

I mean a web API. Though I guess we need to decide how this is all structured: should we build that piece into Treeherder, a separate service, or ...?

Perhaps we can all discuss some more at Whistler :-)
Armen, the backfilling option for sheriffs is already in production now that bug 1183923 is resolved. Can we close this bug, or is there anything left?
Flags: needinfo?(armenzg)
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(armenzg)
Resolution: --- → DUPLICATE