Closed Bug 1173822 Opened 9 years ago Closed 7 years ago

Do not allow SETA's adjustments on pushes that are back outs

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: kmoir, Assigned: kmoir)

References

Details

top feature requests would be to schedule them on any push that gets periodics scheduled on it and having a way to force them on any other specified push (as I put it in TOR, a "Just run the damn tests" button)
from irc
RyanVM catlee: how easily could we tweak SETA logic to run the full set of tests on any job that gets periodic builds scheduled on it?
catlee RyanVM: hm, not sure. kmoir|buildduty may know more
RyanVM kmoir|buildduty: I think doing ^ would cut down on the rate of merged-around bustage we've encountered w/ SETA
RyanVM of course, just having an easy way to trigger periodic builds on a push would be nice
RyanVM instead of resorting to something like http://people.mozilla.org/~rvandermeulen/trigger_periodic
kmoir|buildduty RyanVM: have you encountered a lot of this recently? I'd have to think about how to implement this, don't know right now
RyanVM merged-around bustage w/ SETA has been well-documented w/ Joel and crew
RyanVM it's probably the #1 problem we've had
RyanVM running every 10th pushes just makes it way too easy to miss a new permafail
RyanVM at least, if we intend to do more than one merge a day
kmoir|buildduty okay I didn't realize this, I hadn't heard that from Joel
kmoir|buildduty thanks for letting me know
RyanVM np
RyanVM so yeah, I think top feature requests would be to schedule them on any push that gets periodics scheduled on it and having a way to force them on any other specified push (as I put it in TOR, a "Just run the damn tests" button)
RyanVM|sheriffduty because SETA is definitely brutal coming out of a long tree closure
RyanVM|sheriffduty bhearsum: did you make the change today we'd discussed yesterday?
catlee RyanVM|sheriffduty: also a way to force all skipped tests to run?
RyanVM|sheriffduty catlee: yeah, the "run the tests" button
RyanVM|sheriffduty ideally it would just be needing to specify a tree and rev and let the infra handle it from there
RyanVM|sheriffduty which we could then hook into the TH UI if all goes well
Assignee: nobody → kmoir
Would a simple mozci script work for this? I would be happy to make such a thing.
See the end of comment 0. The goal is specifically to not have to rely on scripts for this.
I understand. For the non-periodic case, we would need a script, either run locally or triggered via a button on treeherder. Am I understanding this correctly?
I was hoping this would be something that buildbot could Just Do.
I think buildbot could do it for periodic builds, but what about other cases? Buildbot doesn't magically know when we want to trigger all coalesced jobs. Another option is to use the work chmanchester has done to auto-retrigger failures (currently for the autoland user on try, and by invite only) to retrigger failures on inbound/fx-team, reducing the unknowns when SETA coalesces and we miss failures.
I think our primary objection is to the phrase "either run locally or". Other than the speed of getting past arguing about taking a dependency, we don't have any reason to care whether treeherder or buildapi/self-serve takes a dependency on mozci, or someone stands up a production-quality webservice in front of mozci, or self-serve itself implements an API that works just like "Create new nightly builds on {tree} revision _____" does by digging through finished jobs to determine what amounts to a full set of tests, or self-serve implements that same API by instead querying either an existing db or a new purpose-built db which can return SELECT * from Jobs WHERE revision="4edd6b30d540" and status="SETASKIPPED". But when a sheriff borrows his mom's computer before dinner and sees the need to fill in, we want his first step not to be to install Mercurial on it so he can clone mozci on it, but instead to have his first and last step be to click a button which posts to buildapi/self-serve/mozilla-inbound/rev/4edd6b30d540/runallthedamntests or to mozci.mozilla.org/mozilla-inbound/rev/4edd6b30d540/runallthedamntests. Of course, since I'm in the middle of my third need of the day to have them filled in already, I would take the interim mozci script for those parts of the day when I'm at a computer that does have mozci on it. (Well, in truth we do have a small reason to care, because anything other than a db which lists jobs that were SETASKIPPED is going to do the wrong thing when we want to fill in a push after a reconfig has removed some job from existence, and mozci or self-serve looking at previous jobs would think that we were missing Android 4.0 debug Cpp because it had been skipped, rather than missing it because it wasn't scheduled because the job no longer exists, but... whatever. We're used to having things which try to schedule jobs which no longer exist long after they no longer exist.)
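For illustration only, the "purpose-built db" option from the comment above could look roughly like the sketch below. The `jobs` table, its columns, and the job names are hypothetical; nothing here reflects an actual buildapi, self-serve, or mozci schema. It only shows why a table that records SETASKIPPED explicitly answers the question directly, without guessing from previously finished jobs.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (revision TEXT, name TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [
            # Job names and the COMPLETED status are made up for the demo.
            ("4edd6b30d540", "mochitest-1", "SETASKIPPED"),
            ("4edd6b30d540", "xpcshell", "COMPLETED"),
            ("4edd6b30d540", "crashtest", "SETASKIPPED"),
            ("aaaaaaaaaaaa", "reftest", "SETASKIPPED"),
        ],
    )
    return conn


def skipped_jobs(conn, revision):
    """Return the names of jobs recorded as SETASKIPPED on one revision."""
    rows = conn.execute(
        "SELECT name FROM jobs "
        "WHERE revision = ? AND status = 'SETASKIPPED' "
        "ORDER BY name",
        (revision,),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(skipped_jobs(build_demo_db(), "4edd6b30d540"))
```

Because the skipped status is stored per job, a job removed by a later reconfig simply has no row to return, which is the case the comment worries about.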
SETASKIPPED == COALESCED, so this would solve all cases: effectively we need to query all coalesced jobs and force them to run. How does this work for future jobs? Let's say a build is building and then new jobs are coalesced; those wouldn't be run until you run this script a second time. Likewise, if we do periodic builders and coalesce the test jobs, we will have to run this again. Is it fine to run this a couple of times? If not, then we need to implement something in a mozci server which waits for pulse messages to kill this.
I believe we should schedule everything on periodic jobs on buildbot. Otherwise, if we can have a way to determine which revisions are running periodic tasks we can also handle it with mozci+pulse_actions [1]. With pulse_actions, we can have a button in TH that will send a pulse action and we can make mozci fulfill your wishes. We can make pulse_actions monitor a revision until all coalesced jobs are indeed running (rather than schedule them and hope that they don't get coalesced). Is there a way to schedule a job on buildbot and set a property to "do not coalesce this one"? [1] bug 1168148
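The "monitor a revision until all coalesced jobs are indeed running" idea can be sketched as a simple retry loop. This is an assumption-laden illustration, not the pulse_actions or mozci implementation: `list_coalesced` and `trigger` are hypothetical callbacks standing in for whatever service reports coalesced jobs and schedules them, and a real version would react to pulse messages rather than poll.

```python
import time


def fill_in_revision(list_coalesced, trigger, max_rounds=5, delay_seconds=0.0):
    """Trigger coalesced jobs repeatedly until a pass finds none left.

    list_coalesced: hypothetical callable returning the jobs currently
        coalesced on the revision being watched.
    trigger: hypothetical callable that schedules one job.
    Returns every job that was triggered, in order.
    """
    triggered = []
    for _ in range(max_rounds):
        pending = list_coalesced()
        if not pending:
            # Nothing is coalesced any more, so the revision is filled in.
            return triggered
        for job in pending:
            trigger(job)
            triggered.append(job)
        # Give the scheduler time to start (or re-coalesce) the jobs.
        time.sleep(delay_seconds)
    # Gave up after max_rounds; some jobs may still be coalesced.
    return triggered


if __name__ == "__main__":
    # Simulated state: two jobs coalesced at first, one more appears later.
    rounds = [["job-a", "job-b"], ["job-c"], []]
    print(fill_in_revision(lambda: rounds.pop(0), print))
```

The `max_rounds` cap is a design choice for the sketch so that a job which keeps getting coalesced cannot make the loop run forever.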
Depends on: 1174722
I have filed bug 1174722 to determine coalesced jobs on a revision so we can trigger them.
adding bug 1121998 so the sheriffs could pin coalesced jobs and then retrigger them all.
Depends on: 1121998
Sheriffs can now use `mozci-trigger --coalesced --repo-name repo -r rev` to trigger all coalesced jobs. See http://armenzg.blogspot.ca/2015/06/mozci-080-new-feature-trigger-coalesced.html. Should we keep this open to make periodic jobs have the full set of jobs?
We should keep it open to have a non-scripted way to trigger them which can be called by a button on treeherder, see comment 0, comment 2, comment 4, comment 6. I know you love mozci like a child, but that doesn't mean that every sheriff will install it on every computing device they ever touch, or will refuse to own or use any computing device on which they cannot install mozci.
My apologies. I forgot to re-read the bug. I will file the appropriate bugs for that.
Filed bug 1175213. We can also fix bug 1121998 as a general multiple jobs re-trigger system.
Armen's summary of the discussion at the Mozilla all hands with RyanVM and myself: We believe that we can solve the main issue by checking the commit message for a backout keyword to ensure that all jobs are run for it (aka skip SETA for this revision). We believe that going from 10 pushes to 5 pushes should reduce some of the pain here until kmoir fixes this.
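A minimal sketch of the two ideas in the summary above, under stated assumptions: the keyword pattern is a guess at what "a backout keyword" might match, and `pushes_since_full_run` is a hypothetical counter for the "every Nth push" rule. Neither is taken from the actual SETA or buildbot scheduling code.

```python
import re

# Assumed pattern: matches "backout", "back out", "backed out",
# "backing out" and hyphenated variants, case-insensitively.
BACKOUT_RE = re.compile(r"\bback(?:ed|ing|s)?[ -]?out\b", re.IGNORECASE)


def is_backout(commit_message):
    """Return True if the commit message looks like a backout."""
    return BACKOUT_RE.search(commit_message) is not None


def should_run_all_jobs(commit_message, pushes_since_full_run, full_run_every=5):
    """Decide whether SETA's skipping should be bypassed for a push.

    Runs everything on backouts, and otherwise once the hypothetical
    counter reaches full_run_every (5 here, down from 10).
    """
    if is_backout(commit_message):
        return True
    return pushes_since_full_run >= full_run_every


if __name__ == "__main__":
    print(should_run_all_jobs("Backed out changeset 4edd6b30d540 for bustage", 1))
    print(should_run_all_jobs("Bug 1173822 - tweak scheduling", 2))
```

Matching on word boundaries keeps unrelated words that merely contain "back" or "out" from forcing a full run.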
From what kmoir mentioned in the previous comment, we don't depend on bug 1121998. On a different angle, we will be working on mozci/pulse_actions to add to TH a button to fill in any revision (not only coalesced jobs but also jobs that were not scheduled) - bug 1178524.
Summary: mechanism to force skipped SETA tests to run → Do not allow SETA's adjustments on pushes that are back outs
Armen, should this bug remain open? Not sure of the status of this request now that we have migrated trunk builds to buildbot and seta is working in tc.
Flags: needinfo?(armenzg)
We can close it now. We have the ability to schedule all missing test jobs on a push (bug 1379163).
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(armenzg)
Resolution: --- → INVALID
Thanks Armen!
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard