Clean up / handle case where spot request is active but instance has gone away

RESOLVED FIXED

Status

Release Engineering
General Automation
--
major
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: catlee, Assigned: rail)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

3 years ago
Bug 1050264 was caused by having active spot requests in AWS without associated instances. We've hit this bug before, and I thought it had been fixed. Perhaps the recent refactoring broke it.

In any case, we should:
a) handle this case in cloudtools.aws.spot.get_available_spot_slave_names by looking at active spot requests and throwing them out if their instances don't exist
b) cancel these requests automatically
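
A minimal sketch of (a) and (b) together, not the actual cloud-tools code. The `SpotRequest` record and both helper functions are hypothetical names made up for illustration, and the `cancel` callback stands in for whatever AWS call does the cancellation. The idea is to keep only active requests whose instance still exists and hand the rest to the canceller:

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class SpotRequest:
    # Hypothetical, simplified stand-in for an AWS spot request record.
    request_id: str
    state: str                  # e.g. "open", "active", "cancelled"
    instance_id: Optional[str]  # set once the request has been fulfilled


def split_orphaned_requests(
    requests: Iterable[SpotRequest],
    existing_instance_ids: Set[str],
) -> Tuple[List[SpotRequest], List[SpotRequest]]:
    """Split active requests into (healthy, orphaned).

    A request counts as orphaned when it is active but its instance is
    missing from existing_instance_ids (or it has no instance id at all).
    Requests in any other state are ignored.
    """
    healthy: List[SpotRequest] = []
    orphaned: List[SpotRequest] = []
    for req in requests:
        if req.state != "active":
            continue
        if req.instance_id in existing_instance_ids:
            healthy.append(req)
        else:
            orphaned.append(req)
    return healthy, orphaned


def cancel_orphaned_requests(
    requests: Iterable[SpotRequest],
    existing_instance_ids: Set[str],
    cancel: Callable[[List[str]], object],
) -> List[SpotRequest]:
    """Cancel orphaned active requests and return the healthy ones."""
    healthy, orphaned = split_orphaned_requests(requests, existing_instance_ids)
    if orphaned:
        # `cancel` is injected so the sketch makes no assumption about
        # how the caller talks to AWS.
        cancel([req.request_id for req in orphaned])
    return healthy
```

In practice `cancel` would presumably wrap the EC2 API call that cancels spot requests by id, and `existing_instance_ids` would come from listing the instances in the region. Both are assumptions about how this would be wired up.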

Updated

3 years ago
Blocks: 1051166
This is causing us some pain in us-west-2. rail, welcome back! Could you take a look soonish?
Severity: normal → major
Flags: needinfo?(rail)
I can look at this this week, probably by adding another sanity check in http://hg.mozilla.org/build/cloud-tools/file/9dcb80cffe6c/scripts/spot_sanity_check.py.
Assignee: nobody → rail
Flags: needinfo?(rail)
https://hg.mozilla.org/build/cloud-tools/rev/3607eb84d963
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED