Closed
Bug 1027308
Opened 10 years ago
Closed 10 years ago
Backlog of linux compile jobs (Amazon AWS instances not being launched)
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: KWierso, Unassigned)
Attachments
(2 files)
723 bytes, patch | jlund: review-
886 bytes, patch | nthomas: review+
AWS seems to be pretty backlogged. https://tbpl.mozilla.org/?tree=Mozilla-B2g30-v1.4&rev=728fb350f32e was pushed more than an hour ago and still has lots of pending builds.
Trunk trees are closed until the situation improves.
Comment 1•10 years ago
Attachment #8442400 - Flags: review?(catlee)
Comment 2•10 years ago
Spot instance prices are too high for our bids on many instance types in us-west and us-east. There are other types we can still avail of, so this increases our overall instance limit to allow more breathing room for those other types.
Comment 3•10 years ago
Comment on attachment 8442400
increase_west_build_limit.patch
Let's increase us-east-1 too while we are at it.
Attachment #8442400 - Flags: review?(catlee) → review-
Comment 4•10 years ago
https://hg.mozilla.org/build/cloud-tools/rev/eae3f6598284 covers both, r+ from catlee on IRC.
Comment 5•10 years ago
Sorry, for those of us playing along at home: please don't use acronyms in bug summaries.
Comment 6•10 years ago
Current theory - watch_pending really really wants to use us-east-1c for pricing reasons, but subnet-7091d358 has no free IPs so it gives up. Adds in the other subnet for another 123 slots.
Attachment #8442470 - Flags: review?(catlee)
Comment 7•10 years ago
Comment on attachment 8442470
Add subnet-7091d358 from us-west-1c
https://hg.mozilla.org/build/cloud-tools/rev/ec66195a91fd
Attachment #8442470 - Flags: review?(catlee) → review+
Comment 8•10 years ago
> Current theory - watch_pending really really wants to use us-east-1c for
> pricing reasons, but subnet-7091d358 has no free IPs so it gives up. Adds in
> the other subnet for another 123 slots.
Make that '... but subnet-2da98346 has no free IPs ...'
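The theory above (with this correction) can be sketched roughly as follows. This is not watch_pending's actual code: the function name `pick_subnet`, the preference lists, and the free-IP counts are hypothetical, chosen only to illustrate why a full subnet with no fallback leads to no instance being launched, and why adding a second subnet helps.

```python
from typing import Dict, List, Optional


def pick_subnet(free_ips: Dict[str, int], preferred: List[str]) -> Optional[str]:
    """Return the first subnet in preference order that has a free IP, else None."""
    for subnet_id in preferred:
        if free_ips.get(subnet_id, 0) > 0:
            return subnet_id
    return None


# Illustrative numbers only: the full subnet has 0 free IPs, and the
# newly added one offers the 123 slots mentioned in the patch comment.
free_ips = {"subnet-2da98346": 0, "subnet-7091d358": 123}

# Before the patch (assumed): only the full subnet is configured, so nothing launches.
before = pick_subnet(free_ips, ["subnet-2da98346"])

# After the patch (assumed): the second subnet is available as a fallback.
after = pick_subnet(free_ips, ["subnet-2da98346", "subnet-7091d358"])

print(before, after)
```

With a single configured subnet the lookup returns nothing, which matches the "it gives up" behaviour described in the theory.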
Comment 9•10 years ago
I tried once; I will try again. You have many contributors here who only see that the tree is closed, and the only explanation is "AWS backlog", with no explanation of what AWS is. Clicking on the link to the bug gives no more information. Just saying, for an open source, open project we should explain these things instead of using acronyms whose meaning only a select few know. Just my opinion, I could be wrong.
Comment 10•10 years ago
Oh, and to be fair, it is not just this bug. I have the same issue with many others, and it is usually ignored.
Comment 11•10 years ago
Using code words in bugs is anti-open
Comment 12•10 years ago
To make this more abundantly clear: if I am trying to land a patch on inbound, the fact that it is closed because of an "AWS backlog", with no explanation of what AWS is, is just like saying we closed it because we felt like it.
Comment 13•10 years ago
Bill, we are abundantly busy trying to fix the issue. Please ask the sheriff on IRC for any clarification you need.
We have set the ondemand limit back to 100
http://hg.mozilla.org/build/cloud-tools/rev/64155873112f
which reverses part of
http://hg.mozilla.org/build/cloud-tools/rev/9f5ee86055e1
from a week ago.
Comment 14•10 years ago
And sorry to pick on this bug for a generic issue. Bug summaries should be understandable without jargon or acronyms that are not generally understood across the entire Mozilla project.
Updated•10 years ago
Summary: AWS backlog → Backlog of linux compile jobs (Amazon AWS instances not being launched)
Comment 15•10 years ago (Reporter)
Reopened trunk trees at 2014-06-18T17:03:14 since it seems like things are better.
Comment 16•10 years ago
The crisis has subsided, so we are free to discuss some of the theories for what may have caused the infra issue.
We have increased our AWS (Amazon Web Services) spot instance and on-demand instance limits.
Most likely this is just a band-aid over an underlying problem. In addition to the theory described and patched here[1], here is another:
Within the last week we have added more jacuzzis[2] and may have locked too many slave names to them, thus starving our non-jacuzzi'd builders. The solution here is to add more slave names.
From discussion in #releng:
[16:46:02] <catlee-away> | so we have 497 bld-linux64 spot slaves in slavealloc, and 416 of those are allocated to jacuzzis
[16:46:09] <catlee-away> | that doesn't leave many to handle non-jacuzzi's builders
[17:03:54] <jlund|build> | catlee-away: we were having large lists of these initially: https://pastebin.mozilla.org/5432246 and this is non-jacuzzi. There didn't seem to be many many 'no slave names' for the jacuzzis in the log
[17:04:46] <catlee-away> | yeah, ok
[17:04:51] <catlee-away> | so we need more names
[17:05:00] <catlee-away> | we added a bunch more jacuzzis late last week
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1027308#c6
[2] http://atlee.ca/blog/posts/initial-jacuzzi-results.html
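For readers following the numbers in the #releng discussion, a quick back-of-the-envelope check is below. The two counts come from the chat log above; the variable names are made up for illustration and do not correspond to anything in slavealloc.

```python
# Counts quoted in the #releng discussion above.
total_spot_slaves = 497   # bld-linux64 spot slave names in slavealloc
jacuzzi_allocated = 416   # of those, names locked to jacuzzis

# Names left over for builders that are not in a jacuzzi.
remaining = total_spot_slaves - jacuzzi_allocated
share = remaining / total_spot_slaves

print(f"{remaining} names left for non-jacuzzi builders ({share:.0%} of the pool)")
```

So only about one name in six was available to non-jacuzzi builders, which is consistent with the "need more names" conclusion.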
Comment 17•10 years ago
Thank you for explaining to those who did not know what AWS is. I realize most of us knew that, but I hate it when a bug, especially a tree closure bug, uses acronyms that are not universally known.
Comment 18•10 years ago
With trees open and stable again, closing this for now.
For follow-up on the fix for this, please see: Bug 1027437 - add more slave names to non jacuzzi builders
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard