Open Bug 1648694 Opened 2 months ago Updated 1 day ago

Backfill 20 jobs by default

Categories

(Tree Management :: Treeherder: Job Triggering & Cancellation, enhancement)

enhancement

Tracking

(Not tracked)

People

(Reporter: igoldan, Unassigned)

Details

Attachments

(1 file)

Attached image backfill-20-jobs.PNG

Perf jobs now require 20 patches to be backfilled, but the Backfill button only backfills 10 at a time.

We should update the button so it backfills 20 patches, otherwise Perf sheriffs' resolution time is directly & negatively impacted.

Severity: S1 → S3

I think we need 25. The code is here:
Treeherder view:
https://github.com/mozilla/treeherder/blob/d6ba99312cb954bf07db3c0f6f769f72ca7cdfe0/ui/job-view/details/summary/ActionBar.jsx

calls:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/actions/backfill.py#45

I could be wrong and that could be how the custom action works.

Also we should be careful as changing this to just go from 10->25 could affect code sheriffs and their work on other jobs.

As we move to running more jobs on 10 or 25 intervals, backfill should figure out how to backfill until the last job was seen. Possibly this would mean that treeherder itself would find the last time the desired job was seen (in the future the specific manifest) and use that as a parameter for how many revisions need backfill data.
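The "backfill until the last job was seen" idea could be sketched roughly as below. This is only an illustration, not Treeherder or taskgraph code: `backfill_depth`, the `pushes` structure, and the `max_depth` cap of 25 are all hypothetical names and values chosen for the example, and real push data would have to come from wherever Treeherder stores it.

```python
from typing import Sequence, Set


def backfill_depth(
    pushes: Sequence[Set[str]],
    job_name: str,
    max_depth: int = 25,  # hypothetical safety cap, taken from the "25" suggested above
) -> int:
    """Return how many older pushes need `job_name` backfilled.

    `pushes` holds the set of job names that ran on each push, newest first.
    Index 0 is the push the sheriff selected; older pushes follow it.
    """
    older = pushes[1:]
    for gap, jobs in enumerate(older):
        if gap >= max_depth:
            # Never backfill more than the cap, even if the job is not found.
            break
        if job_name in jobs:
            # `gap` pushes sit between the selected push and this one.
            return gap
    return min(len(older), max_depth)


# Example: the job last ran three pushes back, so two pushes need backfill.
history = [{"perf"}, {"build"}, {"build"}, {"perf", "build"}, {"build"}]
depth = backfill_depth(history, "perf")
```

With something like this, the button would not need a hard-coded 10, 20, or 25; the cap only guards against a job that has not run for a very long time.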

Assignee: aesanu → nobody

Unfortunately, I have other priorities this sprint and won't be able to focus on this task.

:jmaher does the Treeherder team have bandwidth to pick this up? It's a consequence of running the tests less frequently and will impact the perf sheriffing investigation time.

Flags: needinfo?(jmaher)

I am not sure if this is a change to m-c or to treeherder. Which specific action do the sheriffs take? In general this should just mean changing a value, maybe with a few extra things as well. I don't see why the perf sheriffs couldn't pick this up, but anyone could once we know what to change. I have some hints about the location in comment 1.

Flags: needinfo?(jmaher)

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #1)

[...]
Also we should be careful as changing this to just go from 10->25 could affect code sheriffs and their work on other jobs.

As we move to running more jobs on 10 or 25 intervals, backfill should figure out how to backfill until the last job was seen. Possibly this would mean that treeherder itself would find the last time the desired job was seen (in the future the specific manifest) and use that as a parameter for how many revisions need backfill data.

These highlights show this isn't a trivial change. It implies back & forth with the code sheriffs & an assessment of the impact it may have (based on their current procedures). Also, it likely implies some more advanced, Treeherder-specific logic for dynamically figuring out the backfill range, which is outside the perf team's scope & skillset.

Perf sheriffs agreed to sheriff sparser data (given the new cost-effective measures). However, we expected that any changes related to reducing jobs per push would be taken care of so that they didn't affect us.
That is, they would be accompanied by an appropriate adjustment to the backfill ability.

In the past, perf sheriffs were used to one job every 5 pushes. The backfill ability could backfill 5 jobs.
At some point, we moved to having one job every 10 pushes. The backfill ability was adjusted to backfill 10 jobs.
Now, we have one job every 25 pushes. Yet, the backfill ability wasn't adjusted.

The task of reducing the job frequency should be an atomic one, involving reducing the job frequency and adjusting the backfill ability accordingly (at the same time). Otherwise, this task isn't 100% done & Perf sheriffs' resolution time is directly & negatively impacted.
