Closed Bug 1052886 Opened 10 years ago Closed 10 years ago

When requesting spot instances, give up on an az the first time there is no slave name

Categories

(Release Engineering :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Unassigned)

Details

Attachments

(1 file)

When we're running close to capacity, aws_watch_pending.log often has a lot of this:

2014-08-12 15:46:07,107 - Need 272 of tst-linux64 in us-east-1d
2014-08-12 15:46:07,107 - Using m1.medium (us-east-1, us-east-1d) 0.0081 (value: 0.0081) < 0.07
2014-08-12 15:46:07,108 - No slave name available for us-east-1, tst-linux64, None
2014-08-12 15:46:07,108 - No slave name available for us-east-1, tst-linux64, None
2014-08-12 15:46:07,108 - No slave name available for us-east-1, tst-linux64, None

with the last line repeated many more times, once per needed instance. And this:

2014-08-12 15:45:59,896 - No free IP available in us-east-1c for subnets ['subnet-ae35ccc4', 'subnet-8f32cbe5', 'subnet-ff3542d7', 'subnet-b8643190', 'subnet-fb97bc8f', 'subnet-844b7ec2', 'subnet-ed35cc87', 'subnet-5cd0d828', 'subnet-7ca5f03a']
2014-08-12 15:45:59,896 - No free IP available for tst-linux64 in us-east-1c

We could short-circuit that by checking whether r comes back as False in do_request_spot_instances().
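To illustrate the idea, here is a minimal sketch of the early return. It is not the actual cloud-tools code: request_in_az, FakePool and request_one are hypothetical names made up for this example, and the only thing taken from the bug is that the per-instance request yields a false value (r) when no slave name is available.

```python
def request_in_az(needed, request_one):
    """Try to start `needed` instances in a single az.

    `request_one` is a hypothetical stand-in for the per-instance request.
    It is assumed to return a false value when no slave name (or free IP)
    is available. We stop at the first such failure instead of retrying
    once per needed instance.
    """
    started = 0
    for _ in range(needed):
        r = request_one()
        if not r:
            # Nothing left in this az: give up now to avoid the log spew.
            break
        started += 1
    return started


class FakePool:
    """Hypothetical pool of free slave names, used only to drive the sketch."""

    def __init__(self, free_names):
        self.free_names = free_names
        self.calls = 0

    def request_one(self):
        self.calls += 1
        if self.free_names == 0:
            return False
        self.free_names -= 1
        return True


pool = FakePool(free_names=34)
started = request_in_az(38, pool.request_one)
print(started, pool.calls)
```

With 34 free names and 38 needed, the loop makes 35 requests (34 successes plus the one failure that ends it) rather than 38, so the "No slave name available" line would be logged once per az instead of once per missing instance.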
Summary: When request spot instances, give up on az first time there is no slave name → When requesting spot instances, give up on an az the first time there is no slave name
Might be as simple as this? If not, I'll leave it to you!
Attachment #8471950 - Flags: feedback?(rail)
Comment on attachment 8471950 [details] [diff] [review]
[cloud-tools] Return early

LGTM!
Attachment #8471950 - Flags: review+
Attachment #8471950 - Flags: feedback?(rail)
Attachment #8471950 - Flags: feedback+
Working. Saw a case where we needed 38 bld-linux64, started 34, then failed to get the rest without lots of log spew. We still try all the other azs and instance types when we can't get a name, but I'm not going to attempt fixing that here.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General