unable to backfill or add new jobs for buildbot bridge job (linux64 talos, OSX *)

RESOLVED FIXED

Status

--
blocker
RESOLVED FIXED
a year ago
7 months ago

People

(Reporter: jmaher, Assigned: bstack)

Attachments

(5 attachments)

(Reporter)

Description

a year ago
I have been trying for the last hour to 'add new jobs' and 'backfill' for some missing talos data on:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&filter-searchStr=linux%20talos%20e10s&tochange=0515ebda07af3263ab124ba1a6eabe212e9e1b89&fromchange=a5e5a6e086f8689b1a481af2393a52deeca25e27

This is frustrating, as performance is a priority item. I would like to request that we close the trees until this is fixed.
(Reporter)

Comment 1

a year ago
:garndt, this looks to be a taskcluster issue. Can you get someone on the taskcluster team to look into this?
Flags: needinfo?(garndt)
Severity: normal → blocker
(Reporter)

Comment 2

a year ago
As a note, I can 'backfill' and 'add new jobs' for Windows talos tests, so this looks to be exclusively related to taskcluster.
(Reporter)

Updated

a year ago
Summary: unable to backfill or add new jobs for linux64 talos → unable to backfill or add new jobs for buildbot bridge job (linux64 talos, OSX *)
So far this seems to be limited to buildbot-bridge jobs.

For the OS X and Linux jobs that were requested to be backfilled (both of which are BBB jobs), these errors appear within pulse_actions:
https://tc-gp-public-31d.s3-us-west-2.amazonaws.com/ateam/pulse-action-dev/9ea6fed3-63ab-402e-9e8b-1e9679e7d73d

There is a chance that backfilling is having trouble tracing a job back to the builder.

Investigation is ongoing and involves looking into: https://github.com/mozilla/mozilla_ci_tools/blob/master/mozci/platforms.py#L172
So far the investigation points to builder schedulers no longer being defined for OS X and Linux, because those jobs either run entirely in TC (Linux) or are scheduled via BBB (OS X).

Linux has been this way for quite some time, and OS X was changed on May 4th.

There are some possible changes to make in mozci and pulse_actions, but nothing is 100% certain yet.
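
For illustration only, here is a minimal sketch of the suspected failure mode: a backfill request has to trace a test job back to the build that triggers it, and that lookup fails when no scheduler is defined for the job. Every name below (BUILDBOT_SCHEDULERS, find_upstream_builder, can_backfill, and the builder names) is hypothetical and is not taken from mozci or pulse_actions.

```python
# Hypothetical mapping from a test builder to the build that triggers it.
# Real data would come from the buildbot scheduler configuration.
BUILDBOT_SCHEDULERS = {
    "Windows 7 talos g1": "Windows 7 build",
}


class NoUpstreamBuilder(Exception):
    """Raised when a test job cannot be traced back to a build."""


def find_upstream_builder(test_builder, schedulers=BUILDBOT_SCHEDULERS):
    """Return the build job that triggers test_builder, or raise."""
    try:
        return schedulers[test_builder]
    except KeyError:
        raise NoUpstreamBuilder(
            "no scheduler defined for %r; it may run in TC "
            "or be scheduled via BBB" % test_builder
        )


def can_backfill(test_builder, schedulers=BUILDBOT_SCHEDULERS):
    """A job can only be backfilled if its upstream build is known."""
    try:
        find_upstream_builder(test_builder, schedulers)
        return True
    except NoUpstreamBuilder:
        return False
```

In this model a Windows talos job still resolves to its build, while a Linux or OS X job with no scheduler entry does not, which would match the symptoms reported above.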
Flags: needinfo?(garndt)
That would seem to suggest that this is not a new issue, so can we get trees reopened while it's being fixed?
(Reporter)

Comment 6

a year ago
I would suggest turning SETA off and opening the trees until this is fixed.  It will greatly increase our load, but allow us to not depend on backfilling or adding arbitrary jobs.
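
To make the trade-off concrete, here is a simplified model, assuming SETA runs jobs it considers low value only on every Nth push and skips them otherwise. The function names, the priority labels, and the skip interval of 5 are assumptions for illustration, not the actual SETA configuration.

```python
def jobs_to_run(push_index, jobs, seta_enabled=True, skip_interval=5):
    """Return the sorted job names scheduled for one push.

    jobs maps a job name to "high" or "low" priority (hypothetical labels).
    With SETA disabled, every job runs on every push.
    """
    if not seta_enabled:
        return sorted(jobs)
    # Low-priority jobs only run on every skip_interval-th push.
    run_low = push_index % skip_interval == 0
    return sorted(
        name for name, priority in jobs.items()
        if priority == "high" or run_low
    )


def total_jobs(num_pushes, jobs, seta_enabled):
    """Count the jobs scheduled across a range of pushes."""
    return sum(
        len(jobs_to_run(i, jobs, seta_enabled)) for i in range(num_pushes)
    )
```

With one high-priority job and two low-priority talos jobs over 10 pushes, this model schedules 14 jobs with SETA on and 30 with it off. The skipped pushes are exactly the gaps that backfilling would otherwise have to fill.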
(Reporter)

Comment 7

a year ago
Created attachment 8867306 [details] [diff] [review]
temporarily disable seta
Attachment #8867306 - Flags: review?(bstack)
(Reporter)

Updated

a year ago
Keywords: leave-open
(Assignee)

Updated

a year ago
Attachment #8867306 - Flags: review?(bstack) → review+
(Assignee)

Updated

a year ago
Assignee: nobody → bstack
Status: NEW → ASSIGNED

Comment 8

a year ago
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f8c033cd3e9
temporarily disable SETA. r=bstack, a=CLOSED TREE

Comment 9

a year ago
Pushed by jmaher@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fec1331f50b8
temporarily disable SETA for BBB only. r=bstack, a=CLOSED TREE

Comment 14

a year ago
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/mozilla-central/rev/73b3fc64525b
actually disable SETA, instead of never running talos; r=bstack a=infra-fix
Created attachment 8869565 [details] [review]
[treeherder] imbstack:bug-1364421 > mozilla:master

Comment 18

a year ago
mozreview-review
Comment on attachment 8869544 [details]
Bug 1364421 - Allow BBB jobs to be backfilled

https://reviewboard.mozilla.org/r/141128/#review144710
Attachment #8869544 - Flags: review?(garndt) → review+
(Assignee)

Updated

a year ago
Attachment #8869565 - Flags: review?(cdawson)
Attachment #8869565 - Flags: review?(cdawson) → review+

Comment 20

a year ago
Pushed by kwierso@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/1cd72f93f155
Allow BBB jobs to be backfilled r=garndt
So, once the gecko patch merges to mozilla-central (sometime over the weekend or Monday morning, most likely) and the Treeherder patch gets deployed to production (probably sometime on Monday), I think something can land to re-enable SETA for the affected jobs.

Another patch to allow BBB jobs to be triggered via the "Add New Jobs" feature in Treeherder would probably be good at some point, too. Jmaher can probably speak to whether backfilling alone is sufficient for reenabling SETA.
Flags: needinfo?(jmaher)
(Reporter)

Comment 22

a year ago
We can work around 'add new jobs' by using backfilling, so on Monday I will get this enabled!
Flags: needinfo?(jmaher)
Can you verify that backfilling works from Treeherder stage? You'll need to be on a branch/push with that gecko commit on it (and probably need to have had it landed a few pushes earlier).
(Assignee)

Comment 25

a year ago
I don't think we ever made backfilling work on stage. There's another patch for it floating around that I can try to land again. I broke everything the last time I tried to land it, though. It might be bitrotted, so I can try to make it work on Monday.
Created attachment 8870127 [details] [review]
[treeherder] imbstack:bug-1364421-pt-2 > mozilla:master
(Assignee)

Updated

a year ago
Attachment #8870127 - Flags: review?(cdawson)
Attachment #8870127 - Flags: review?(cdawson) → review+
(Reporter)

Comment 28

a year ago
should we go forward and enable SETA again?  I am not sure if all the pieces we know about are landed and fully deployed.
(Assignee)

Comment 29

a year ago
AFAICT we should be good to re-enable SETA. I believe the backfill patch for Treeherder is in production and the add-new-jobs one has landed in master. As far as the in-tree patch is concerned, it is done once it has been merged around to all of the branches.

* This all assumes that there are no new bugs introduced by the changes of course.

I think the next step would be to have somebody with permission to backfill these jobs try one out in the real world, and then turn SETA back on if it works!
(In reply to Joel Maher ( :jmaher) from comment #28)
> should we go forward and enable SETA again?  I am not sure if all the pieces
> we know about are landed and fully deployed.

SETA could be enabled again I think.  We've done about as much testing as I think we could at this point.  SETA/backfilling is pretty hard to test in a non-live environment.
Flags: needinfo?(jmaher)
(Reporter)

Comment 31

a year ago
Created attachment 8871792 [details] [diff] [review]
re enable seta now that we have BBB backfill capabilities
Flags: needinfo?(jmaher)
Attachment #8871792 - Flags: review?(dustin)
Attachment #8871792 - Flags: review?(dustin) → review+
(Assignee)

Updated

a year ago
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open