Closed Bug 1128198 Opened 9 years ago Closed 9 years ago

AWS build slaves in jacuzzis not being started

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86_64
Linux
task
Not set
critical

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: philor, Unassigned)

Details

Looks like the first jacuzzi to dry out was "Android armv7 API 11+ fx-team build", with four slaves, all stopped, and pending jobs going back 13.5 hours, but we also have several hours of pending jobs and no live slaves for:

Android armv7 API 11+ b2g-inbound build
Android armv7 API 11+ b2g-inbound debug build
Linux fx-team build
b2g_fx-team_linux32_gecko build

and lots of other jacuzzis where there's only one live slave, so we currently have pending jobs there too, and if we have no load for long enough for that one slave to go idle, I expect we'll be hosed.
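
To make the two failure states above concrete, here is a minimal sketch of how one might flag them, assuming a jacuzzi is just a builder name, a set of slaves with a running/stopped flag, and a pending-job count. The `Jacuzzi` class, the `classify` function, the slave names, and the pending counts are all hypothetical and for illustration only; none of this is taken from the actual allocator or slave_health code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Jacuzzi:
    """Hypothetical model of one jacuzzi: a builder and its dedicated slaves."""
    builder: str
    slaves: Dict[str, bool]  # slave name -> True if the instance is running
    pending: int             # number of pending jobs for this builder


def classify(j: Jacuzzi) -> str:
    """Return 'dry', 'at-risk', or 'ok' for a jacuzzi."""
    live = sum(1 for running in j.slaves.values() if running)
    if live == 0 and j.pending > 0:
        # Pending work and nothing running to take it.
        return "dry"
    if live == 1:
        # One idle-out away from having nothing running.
        return "at-risk"
    return "ok"


# Illustrative data only: builder names are from this bug, the rest is made up.
jacuzzis = [
    Jacuzzi("Android armv7 API 11+ fx-team build",
            {"slave-a": False, "slave-b": False,
             "slave-c": False, "slave-d": False}, pending=12),
    Jacuzzi("Linux fx-team build",
            {"slave-e": True, "slave-f": False}, pending=3),
]

report = {j.builder: classify(j) for j in jacuzzis}
for builder, state in report.items():
    print(f"{state:8} {builder}")
```

A check like this only describes the symptom; it says nothing about why the stopped instances are not being started.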

fx-team is closed; b2g-inbound is approval-only, so that gaia commits (which don't trigger Android builds anyway) can still land but Gecko pushes can't; mozilla-central is de facto closed, since it has jacuzzis that will be dried out by the next time I have something to merge there; and mozilla-inbound is hanging on as long as it neither has a long period without any pushes nor has too many pushes at once.
A curious state of affairs: I thought the jacuzzi allocator "fixed" it by adding one more slave to each of the affected jacuzzis, making them barely workable with one live slave. But at least in the case of Android armv7 API 11+ fx-team build (https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slavetype.html?include=bld-linux64-spot-096,bld-linux64-spot-420,bld-linux64-spot-490,bld-linux64-spot-491,bld-linux64-spot-498), it added 096, which did not pick up any of the pending Android builds, while the other four, which had been sitting idle, all suddenly woke up and took jobs at the time of the allocator commit.

Still critical, since it could leave us broken again at any time, but nothing's currently closed over it.
Severity: blocker → critical
Summary: Trees closed, AWS build slaves in jacuzzis not being started → AWS build slaves in jacuzzis not being started
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard