Open Bug 2007112 Opened 3 months ago Updated 13 hours ago

Perform a binary search backfill with the sheriffing bot with change detection technique integration

Categories

(Tree Management :: Perfherder, task, P1)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: sparky, Assigned: myeongjun.ko)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [fxp])

This bug covers performing the binary search backfill with the Sherlock bot. We need to discuss exactly how to implement the logic. One possibility is a "parallelized" search: instead of triggering the backfill on one push at a time, we could trigger it on 3 pushes at once. The advantage is a quicker answer for the culprit commit. The disadvantage is that it reduces the CI resource savings we would otherwise get. We could also parallelize one part of the search and run another part sequentially.
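To make the trade-off concrete, here is a rough sketch that counts rounds and triggered jobs in the worst case. It assumes an idealized model that is not taken from Sherlock or the PR: every probe returns a usable result, and the probes in a round are evenly spaced so they split the candidate range into `probes_per_round + 1` segments. The name `search_cost` is hypothetical.

```python
def search_cost(n_pushes, probes_per_round):
    """Worst-case (rounds, triggered_jobs) to isolate one culprit push.

    Idealized model (an assumption, not Sherlock's actual behaviour):
    each round triggers evenly spaced probes, and the search continues
    in the largest remaining segment.
    """
    rounds = 0
    jobs = 0
    remaining = n_pushes
    while remaining > 1:
        # Never trigger more probes than there are boundaries left to test.
        probes = min(probes_per_round, remaining - 1)
        # Ceiling division: size of the largest of the (probes + 1) segments.
        remaining = -(-remaining // (probes + 1))
        rounds += 1
        jobs += probes
    return rounds, jobs


sequential = search_cost(64, 1)  # one push at a time
parallel = search_cost(64, 3)    # three pushes per round
```

Under this model, a range of 64 pushes takes 6 rounds and 6 jobs sequentially, versus 3 rounds and 9 jobs with 3 probes per round: half the waiting, at 50% more CI usage.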

We may also need to change the backfill reporting in Sherlock so that it can handle the new checks that come with integrating the change detection techniques. There are open questions to settle there as well.

Here is the PR: https://github.com/mozilla/treeherder/pull/9237
This PR works as follows:

  • Sherlock triggers backfill on a target push via Taskcluster
  • Sherlock re-runs detection to find the culprit
  • If the detected push moves (left/right), Sherlock shifts the anchor accordingly
  • If the detected push stays the same, it is considered stabilized
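The loop described by these steps can be sketched roughly as follows. This is only an illustration of the control flow, not the code in the PR: `trigger_backfill` and `detect_culprit` are hypothetical callables standing in for the Taskcluster trigger and the detection re-run, and the `max_rounds` cap is an added assumption to keep the loop bounded.

```python
def find_stable_push(anchor, trigger_backfill, detect_culprit, max_rounds=10):
    """Shift the anchor until detection stops moving.

    `trigger_backfill(push)` and `detect_culprit()` are hypothetical
    callables, not Treeherder APIs. Returns the stabilized push, or
    None if `max_rounds` is exhausted first.
    """
    for _ in range(max_rounds):
        trigger_backfill(anchor)
        detected = detect_culprit()
        if detected == anchor:
            return anchor   # detection did not move: considered stabilized
        anchor = detected   # detection moved left or right: follow it
    return None


# Toy run: detection moves twice, then stays put.
triggered = []
detections = iter([5, 3, 3])
result = find_stable_push(8, triggered.append, lambda: next(detections))
```

In the toy run, backfills are triggered on pushes 8, 5, and 3 in turn, and push 3 is returned once detection reports it a second time.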

One issue is a false-stabilization scenario:
Some pivot pushes never had the task scheduled, so Taskcluster cannot trigger anything. As a result, no new PerformanceDatum is generated, and Sherlock sees no change in data, which may incorrectly look like stabilization.
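A toy reproduction of this scenario, under simplifying assumptions: pushes are plain integers, `detect` is a stand-in that returns the earliest push with data at or after the true culprit, and `backfill` produces a datum only when the task was scheduled on that push. None of these names correspond to real Sherlock or Treeherder functions.

```python
def backfill(push, task_scheduled, datums):
    """Toy backfill: a push yields a new datum only if its task was scheduled."""
    if push in task_scheduled:
        datums.add(push)
        return True
    return False


def detect(datums, true_culprit):
    """Toy detection: earliest push with data at or after the true culprit."""
    return min(p for p in datums if p >= true_culprit)


datums = {0, 8}          # pushes that already have perf data
task_scheduled = {0, 8}  # pivot push 4 never had the task scheduled
before = detect(datums, true_culprit=3)
produced = backfill(4, task_scheduled, datums)  # nothing can be triggered
after = detect(datums, true_culprit=3)
looks_stable = before == after  # True, although push 8 is not the culprit
```

Detection reports push 8 both before and after the attempted backfill, so an "unchanged means stabilized" check passes even though no new data arrived.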

Since finding the exact culprit without backfilling all pushes is inherently difficult, I had an idea (not a proposal).
Assuming the culprit is within [prev_push, push], we could:

  • First try slicing=3
  • If the detected push doesn't move, retry with slicing=6 at the same anchor to confirm stability
  • If it moves, shift the anchor and continue with slicing=3
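As a sketch, the decision after each detection round could look like this. The function name, the tuple it returns, and the reading that an unmoved detection at slicing=6 counts as confirmed are illustrative assumptions, not the PR's implementation.

```python
BASE_SLICING = 3     # first, cheaper probe
CONFIRM_SLICING = 6  # denser probe used only to confirm stability


def next_step(anchor, detected, slicing):
    """Return (next_anchor, next_slicing, stabilized) after one round."""
    if detected != anchor:
        # Detection moved: follow it and drop back to the cheap probe.
        return detected, BASE_SLICING, False
    if slicing < CONFIRM_SLICING:
        # Unmoved on the cheap probe: retry denser at the same anchor.
        return anchor, CONFIRM_SLICING, False
    # Unmoved even with the denser probe: accept as stabilized.
    return anchor, slicing, True


step_moved = next_step(anchor=10, detected=8, slicing=3)
step_retry = next_step(anchor=10, detected=10, slicing=3)
step_done = next_step(anchor=10, detected=10, slicing=6)
```

Only the retry branch pays for the denser sampling, so the extra CI cost is limited to rounds where the detected push did not move.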

The goal is to reduce false stabilization by increasing sampling density when needed, instead of relying on a single limited probe.
We still can't reliably distinguish this from true stabilization. But that may be an acceptable limitation to live with for now.
If the PR looks reasonable, I'm happy to proceed, but I'm open to any feedback or alternative ideas.

Assignee: nobody → myeongjun.ko
Status: NEW → ASSIGNED