Closed
Bug 970552
Opened 11 years ago
Closed 11 years ago
Do not use spot instances for some builders
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rail, Assigned: rail)
References
Details
Attachments
(5 files, 1 obsolete file)
11.99 KB, patch | catlee: review+ | rail: checked-in+
4.07 KB, patch | catlee: review+ | rail: checked-in+
20.74 KB, patch | catlee: review+ | rail: checked-in+
1.16 KB, patch | catlee: review+ | rail: checked-in+
2.67 KB, patch | catlee: review+ | rail: checked-in+
We shouldn't use spot instances for some builders (PGO/release?) and/or some branches (beta/release?).
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → rail
Updated•11 years ago
|
Attachment #8374131 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 2•11 years ago
|
||
Comment on attachment 8374131 [details] [diff] [review]
no_pgo_on_spots-buildbotcustom-2.diff
https://hg.mozilla.org/build/buildbotcustom/rev/530a492013c9
Attachment #8374131 -
Flags: checked-in+
Assignee | ||
Comment 3•11 years ago
|
||
In production.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 4•11 years ago
|
||
Attachment #8374363 -
Flags: review?(catlee)
Updated•11 years ago
|
Attachment #8374363 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 5•11 years ago
|
||
Comment on attachment 8374363 [details] [diff] [review]
tests.diff
https://hg.mozilla.org/build/buildbotcustom/rev/b7530258fc4d
Attachment #8374363 -
Flags: checked-in+
Assignee | ||
Comment 6•11 years ago
|
||
2014-02-12 16:05:26-0800 [-] Error choosing next slave for builder 'release-mozilla-release-linux_repack_9/10', choosing randomly instead
2014-02-12 16:05:26-0800 [-] Unhandled Error
Traceback (most recent call last):
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/twisted/enterprise/adbapi.py", line 429, in _runInteraction
    result = interaction(trans, *args, **kw)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_f23f5672becd_production_0.8-py2.7.egg/buildbot/process/builder.py", line 517, in _claim_buildreqs
    sb = self._choose_slave(available_slaves)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbot-0.8.2_hg_f23f5672becd_production_0.8-py2.7.egg/buildbot/process/builder.py", line 548, in _choose_slave
    return self.nextSlave(self, available_slaves)
--- <exception caught here> ---
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbotcustom/misc.py", line 267, in _nextSlave
    return func(builder, available_slaves)
  File "/builds/buildbot/build1/lib/python2.7/site-packages/buildbotcustom/misc.py", line 463, in _nextSlave_skip_spot
    valid.append(s)
exceptions.IndexError: list index out of range
Additionally, it would be great to avoid running any release builds on spot instances, because there may be no chance to get to the slave to debug a failure.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 7•11 years ago
|
||
Assignee | ||
Comment 8•11 years ago
|
||
I think I found the issue.
sorted(no_spot_slaves, _recentSort(builder))[-1] raises IndexError when no_spot_slaves is [], so it is better to return None earlier.
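The guard described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from nextSlave.diff: the function name, the is_spot predicate, and the last_used key function are hypothetical, and it uses a key function where the original passes the comparator returned by _recentSort(builder).

```python
def choose_non_spot_slave(available_slaves, is_spot, last_used):
    """Return the most recently used non-spot slave, or None if there is none.

    is_spot and last_used are hypothetical callables supplied by the caller.
    """
    no_spot_slaves = [s for s in available_slaves if not is_spot(s)]
    if not no_spot_slaves:
        # Return early: indexing sorted([])[-1] would raise IndexError.
        return None
    # The last element after an ascending sort is the most recently used slave.
    return sorted(no_spot_slaves, key=last_used)[-1]
```

With this shape, an empty candidate list yields None instead of an unhandled exception, and the caller decides what to do when no non-spot slave is available.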
Attachment #8375262 -
Attachment is obsolete: true
Attachment #8375265 -
Flags: review?(catlee)
Updated•11 years ago
|
Attachment #8375265 -
Flags: review?(catlee) → review+
Comment 9•11 years ago
|
||
Live in production.
Assignee | ||
Comment 10•11 years ago
|
||
Comment on attachment 8375265 [details] [diff] [review]
nextSlave.diff
https://hg.mozilla.org/build/buildbotcustom/rev/a03618388e2b
Attachment #8375265 -
Flags: checked-in+
Updated•11 years ago
|
Attachment #8375658 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 12•11 years ago
|
||
Comment on attachment 8375658 [details] [diff] [review]
non-unified.diff
https://hg.mozilla.org/build/buildbotcustom/rev/5d08b50a1531
Attachment #8375658 -
Flags: checked-in+
Assignee | ||
Comment 13•11 years ago
|
||
In production
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 14•11 years ago
|
||
Had to back this out. See https://bugzilla.mozilla.org/show_bug.cgi?id=980890#c11
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 15•11 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #14)
> Had to back this out. See
> https://bugzilla.mozilla.org/show_bug.cgi?id=980890#c11
http://hg.mozilla.org/build/buildbotcustom/rev/a55559e39a59
Assignee | ||
Comment 16•11 years ago
|
||
The kill ratio for spot instances has been almost 0% since we landed the bidding improvements (see below), so let's use spot instances everywhere except releases.
^bld-linux64
date, total jobs, jobs on spots, spot retries, on-demand (o-d) retries
2014-03-01, 1725, 1356 (78%), 2 (0%), 2 (0%)
2014-03-02, 1036, 762 (73%), 1 (0%), 0 (0%)
2014-03-03, 2564, 2046 (79%), 68 (3%), 0 (0%)
2014-03-04, 3263, 2636 (80%), 27 (1%), 1 (0%)
2014-03-05, 2987, 2306 (77%), 38 (1%), 2 (0%)
2014-03-06, 3456, 2688 (77%), 29 (1%), 1 (0%)
2014-03-07, 3003, 2425 (80%), 10 (0%), 1 (0%)
2014-03-08, 1303, 951 (72%), 0 (0%), 0 (0%)
2014-03-09, 998, 685 (68%), 0 (0%), 0 (0%)
2014-03-10, 2282, 1966 (86%), 15 (0%), 0 (0%)
2014-03-11, 2730, 2385 (87%), 2 (0%), 0 (0%)
2014-03-12, 2883, 2616 (90%), 9 (0%), 0 (0%)
2014-03-13, 3109, 2728 (87%), 3 (0%), 0 (0%)
It may sound blasphemous, but we could even reconsider our logic that avoids running retried jobs on spot instances! :)
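For readers checking the table, here is a small sketch of how the percentages might be derived. The denominators and the truncating integer division are assumptions (spot share over total jobs, spot retries over jobs on spots, on-demand retries over the remaining jobs); they reproduce the 2014-03-03 and 2014-03-05 rows, but the comment does not say how the report was actually generated.

```python
def pct(part, whole):
    """Integer percentage, truncated; 0 when the denominator is 0."""
    return part * 100 // whole if whole else 0


def summarize(total, on_spot, spot_retries, od_retries):
    """Derive the three percentages for one row (assumed definitions)."""
    on_demand = total - on_spot  # jobs that did not run on spot instances
    return {
        "spot_share": pct(on_spot, total),
        "spot_retry_rate": pct(spot_retries, on_spot),
        "od_retry_rate": pct(od_retries, on_demand),
    }


# Row for 2014-03-03: 2564 total jobs, 2046 on spots, 68 spot retries, 0 o-d retries.
row = summarize(2564, 2046, 68, 0)
```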
Attachment #8391006 -
Flags: review?(catlee)
Assignee | ||
Comment 17•11 years ago
|
||
Comment 18•11 years ago
|
||
Comment on attachment 8391006 [details] [diff] [review]
kill-skip-spot.diff
Review of attachment 8391006 [details] [diff] [review]:
-----------------------------------------------------------------
Yeah, we could perhaps change it to run on spot if num_retries <= 1 instead of num_retries == 0.
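The suggested change amounts to relaxing a threshold. A minimal sketch, assuming a hypothetical helper (may_use_spot and max_retries_on_spot are illustrative names, not identifiers from buildbotcustom):

```python
def may_use_spot(num_retries, max_retries_on_spot=0):
    """Decide whether a job with num_retries prior retries may run on a spot instance.

    max_retries_on_spot=0 mirrors the existing rule (num_retries == 0);
    max_retries_on_spot=1 mirrors the proposal (num_retries <= 1).
    """
    return num_retries <= max_retries_on_spot
```

Making the threshold a parameter keeps the current behaviour as the default while letting the once-retried case be enabled without touching the comparison itself.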
Attachment #8391006 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 19•11 years ago
|
||
Comment on attachment 8391006 [details] [diff] [review]
kill-skip-spot.diff
https://hg.mozilla.org/build/buildbotcustom/rev/1f3ed587680c
Attachment #8391006 -
Flags: checked-in+
Assignee | ||
Comment 20•11 years ago
|
||
In production
Assignee | ||
Updated•11 years ago
|
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•7 years ago
|
Component: General Automation → General