1386405 - macosx64-stylo jobs are always running

Assignee

Description

•

8 years ago

A lot of stylo jobs got enabled on more trees than just central, and it seems SETA allows these to always run. This could be because the 2-week grace period is kicking in. We should probably remove it or reduce it to a couple of days.

Stylo jobs don't show up in the job priorities and are therefore always scheduled:
https://treeherder.mozilla.org/api/project/mozilla-inbound/seta/job-priorities/?build_system_type=buildbot
https://treeherder.mozilla.org/api/project/mozilla-inbound/seta/job-priorities/?build_system_type=taskcluster

However, the jobs do show up as valid job types, for example [ "reftest-stylo-e10s-14", "opt", "macosx64-stylo" ], in https://treeherder.mozilla.org/api/project/mozilla-inbound/seta/job-types/

There's an expiration column somewhere that we can change to some value (going off memory here). Could someone remind me where I need to connect?

If I'm right about this, we should add this information to the documentation, so that the next time something like this happens we know how to remedy it.
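One way to confirm the gap described above is to compare the two listings. The following is a minimal sketch, assuming each job type is a (test name, build type, platform) triple as in the snippet quoted above, and that the job-priorities entries have already been flattened into the same triple form. The function name `unprioritized_job_types` and the sample data (other than the stylo triple) are hypothetical, and nothing is assumed here about the exact JSON layout the endpoints return.

```python
from typing import Iterable, List, Sequence, Tuple

JobType = Tuple[str, ...]  # expected shape: (test name, build type, platform)


def unprioritized_job_types(
    job_types: Iterable[Sequence[str]],
    prioritized: Iterable[Sequence[str]],
) -> List[JobType]:
    """Return job types with no matching job-priority entry, in input order."""
    known = {tuple(entry) for entry in prioritized}
    missing = []
    for entry in job_types:
        key = tuple(entry)
        if key not in known:
            missing.append(key)
    return missing


# Hypothetical sample data; only the stylo triple comes from this bug.
job_types = [
    ["reftest-stylo-e10s-14", "opt", "macosx64-stylo"],
    ["mochitest-e10s-1", "opt", "linux64"],
]
prioritized = [["mochitest-e10s-1", "opt", "linux64"]]

missing = unprioritized_job_types(job_types, prioritized)
print(missing)
```

Any triple reported here is a valid job type that has no priority entry, which matches the symptom reported in this bug.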

Kim Moir [:kmoir] ET

Updated

•

8 years ago

Blocks: 1386264

Kim Moir [:kmoir] ET

Updated

•

8 years ago

Blocks: 1386625

Kim Moir [:kmoir] ET

Updated

•

8 years ago

No longer blocks: 1386264

Armen [:armenzg]

Assignee

Updated

•

8 years ago

Depends on: 1386668

Armen [:armenzg]

Assignee

Comment 1

•

8 years ago

All Mac stylo jobs are marked as "high value" (priority=1) and their expiration date is set to the 13th of August. I've requested that they get updated. You can edit a query against Treeherder's job priorities to see for yourself: https://sql.telemetry.mozilla.org/queries/10649/source#table
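To illustrate why these jobs run on every push, here is a simplified model of the behaviour discussed in this bug. It is not SETA's actual code. It assumes that a job with priority=1 is always scheduled while its expiration date has not passed (or when it has no expiration date), and that once the date passes the job falls back to running on every 5th push, the default mentioned later in this thread. The function `should_schedule`, its parameters, and the year used in the example dates are hypothetical.

```python
from datetime import date
from typing import Optional

HIGH_VALUE = 1  # priority value quoted in this bug for "high value" jobs


def should_schedule(
    priority: int,
    expiration: Optional[date],
    today: date,
    push_number: int,
    low_value_interval: int = 5,  # assumption: "every 5th push" default
) -> bool:
    """Simplified model: high-value jobs always run until they expire."""
    not_expired = expiration is None or today <= expiration
    if priority == HIGH_VALUE and not_expired:
        return True
    # Assumption: an expired or low-value job runs only on every Nth push.
    return push_number % low_value_interval == 0


# The 13th of August comes from this bug; the year is an assumption.
expiration = date(2017, 8, 13)

during_grace = [
    should_schedule(HIGH_VALUE, expiration, date(2017, 8, 4), n)
    for n in range(1, 11)
]
after_grace = [
    should_schedule(HIGH_VALUE, expiration, date(2017, 8, 14), n)
    for n in range(1, 11)
]
print(during_grace)
print(after_grace)
```

Under these assumptions the job runs on all ten pushes before the expiration date and on only two of ten afterwards, which is the difference in load that the grace period debate is about.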

Assignee: nobody → armenzg

Armen [:armenzg]

Assignee

Comment 2

•

8 years ago

Hi Joel, I would like to remove the 2-week grace period from SETA's code. We had to disable Mac stylo jobs in most places because they tipped over our Mac test capacity. My concern with the 2-week grace period is that it will bite us again. Getting into the same situation is more troublesome than not having the grace period. OK with removing it?

Joel Maher ( :jmaher ) (UTC -8)

Comment 3

•

8 years ago

I really don't like the idea of removing this. Without it, a brand new job that we enable will only be run periodically. Can we reduce it to 1 week? Can we special-case OS X stylo?

Armen [:armenzg]

Assignee

Comment 4

•

8 years ago

What is the worst that could happen if we did not have such a grace period? We find a regression on a change being considered for merge and need to wait for backfill results? Too many hours were wasted this week trying to understand what was going on and getting us out of the hole. We're still not out of it.

Another solution would be to run new jobs *first* on a single repository for a few days (reducing the grace period to N days) and use that as the reference. We currently use 'mozilla-inbound' as our reference repository and a 2-week grace period. This would be a procedure change and require human enforcement (maybe some code could be placed in-tree to enforce it).

Kim Moir [:kmoir] ET

Comment 5

•

8 years ago

IIRC, when we switched Android to running on emulators on AWS, we pre-seeded the SETA data with try runs on specific revisions, so that our AWS bill didn't spike dramatically when we made the switch.

Joel Maher ( :jmaher ) (UTC -8)

Comment 6

•

8 years ago

I don't want to make a change because of one fire drill. This 2-week period has been in place for more than a year, and I would like to think carefully before getting rid of it. If we did get rid of it, any new job could be a perma-fail and we would have little to no data points to determine what is going on.

Right now, in the self-serve model, any developer can add new jobs (and they do). Sheriffs don't have a clear picture of all the jobs to expect, and if a job is perma-failing or intermittently passing but runs only on every 5th push (which in practice is often skipped), it is easy to miss the pattern and assume each failure is unique. It would take a few days to get a signal that things are bad. Yes, we could back out the patch then, so that would be the worst case.

Usually what will happen is that we turn on a job and there is much confusion and randomization because people don't see the new job running. By default it will run on every 5th push. I would prefer to pre-seed the tests in SETA rather than turn off the 2-week period.

Armen [:armenzg]

Assignee

Comment 7

•

8 years ago

I see your point there. I don't have a suggestion on how to make devs preseed SETA since it is so hands-off these days. In any case, this is done.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

BMO Automation

Updated

•

5 years ago

Product: Tree Management → Tree Management Graveyard

Categories

(Tree Management Graveyard :: Treeherder: SETA, enhancement)

Tracking

(Not tracked)

People

(Reporter: armenzg, Assigned: armenzg)

Security

(public)
