Closed
Bug 985556
Opened 10 years ago
Closed 10 years ago
Bump MAX_BROKER_REFS to 4096
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Assigned: armenzg)
Attachments
(4 files)
594 bytes, patch | mozilla: review+, armenzg: checked-in+
660 bytes, patch | mozilla: review+, armenzg: checked-in+
2.84 KB, patch | mozilla: review+, armenzg: checked-in+
632 bytes, patch | mozilla: review+, armenzg: checked-in+
This is going to suck quite a bit.
Options:
* Bump the limit
* Remove one/two project branches + disable b2g18 branches
* Split linux64 test masters by product
* Run b2g reftests on linux32 VMs
Any preference?
Assignee
Comment 1•10 years ago
Attachment #8393612 -
Flags: review?(aki)
Assignee
Comment 2•10 years ago
Attachment #8393613 -
Flags: review?(aki)
Assignee
Comment 3•10 years ago
Attachment #8393614 -
Flags: review?(aki)
Updated•10 years ago
Attachment #8393612 -
Flags: review?(aki) → review+
Comment 4•10 years ago
Comment on attachment 8393613: raise_limit.buildbot.diff
I think you'll need to touch slavealloc's buildbot.tac template as well.
Attachment #8393613 -
Flags: review?(aki) → review+
Updated•10 years ago
Attachment #8393614 -
Flags: review?(aki) → review+
Assignee
Comment 5•10 years ago
I assume that I will have to land first on puppet, then on slavealloc, and then on buildbot-configs. At that point, should I start rebooting the masters, or is it better to use the manhole?
Assignee
Comment 6•10 years ago
Attachment #8393623 -
Flags: review?(aki)
Updated•10 years ago
Attachment #8393623 -
Flags: review?(aki) → review+
Assignee
Updated•10 years ago
Summary: Enabling EC2 B2g reftests across the systems causes us to hit the maximum number of builders for tst-linux64 machines → Bump MAX_BROKER_REFS to 4096
Assignee
Comment 7•10 years ago
https://hg.mozilla.org/build/puppet/rev/de097c0090ea
http://hg.mozilla.org/build/buildbot/rev/7ce79514a42d
http://hg.mozilla.org/build/buildbot-configs/rev/410007f6c5e6
http://hg.mozilla.org/build/tools/rev/976ac92d2f78
I need to request slavealloc to update. Tomorrow I will go around doing graceful restarts.
Assignee
Comment 8•10 years ago
Fix typo: https://hg.mozilla.org/build/buildbot-configs/rev/f187ef094c33
Assignee
Comment 9•10 years ago
Should I back out the buildbot-configs patch in case someone adds builders past the old limit before the masters are restarted tomorrow?
Assignee
Comment 10•10 years ago
FYI, I can see slavealloc giving the right value on buildbot.tac:
> twisted.spread.pb.MAX_BROKER_REFS = 4096
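The same check could be done without eyeballing the file by parsing the generated buildbot.tac for the assignment. This is only a minimal sketch: the helper name `broker_refs_limit` and the sample file contents are made up for illustration, and only the `twisted.spread.pb.MAX_BROKER_REFS = 4096` line comes from the output quoted above.

```python
import re

# Matches an assignment such as "twisted.spread.pb.MAX_BROKER_REFS = 4096".
_LIMIT_RE = re.compile(
    r"^\s*(?:[\w.]+\.)?MAX_BROKER_REFS\s*=\s*(\d+)\s*$", re.MULTILINE
)


def broker_refs_limit(tac_text):
    """Return the MAX_BROKER_REFS value set in a buildbot.tac, or None."""
    match = _LIMIT_RE.search(tac_text)
    return int(match.group(1)) if match else None


# Hypothetical excerpt of a generated buildbot.tac; only the assignment
# line is taken from the comment above.
SAMPLE_TAC = """\
import twisted.spread.pb
twisted.spread.pb.MAX_BROKER_REFS = 4096
"""

if __name__ == "__main__":
    print(broker_refs_limit(SAMPLE_TAC))
```

Returning `None` when the line is missing makes it easy to flag a slave whose buildbot.tac was generated from an old template.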
Assignee
Comment 11•10 years ago
I will be batching the masters like this: https://etherpad.mozilla.org/X20KMNQsXP
Steps:
1) disable the masters on slavealloc
2) use manage_masters.py to gracefull_stop update_buildbot start (3 actions)
3) enable the masters on slavealloc
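The per-batch sequence above can be written down as a dry-run plan. This is a hedged illustration only: `plan_batch` and the master names are hypothetical, the slavealloc entries are descriptive labels rather than real calls, and the action names are copied exactly as written in the steps. Nothing here invokes manage_masters.py.

```python
# Action names copied verbatim from the steps above.
ACTIONS = ["gracefull_stop", "update_buildbot", "start"]


def plan_batch(masters):
    """Return the ordered steps for one batch of masters (dry run only)."""
    # Step 1: take every master in the batch out of rotation first.
    steps = [("slavealloc", "disable", m) for m in masters]
    # Step 2: run the three actions, one action at a time across the batch.
    for action in ACTIONS:
        steps.extend(("manage_masters.py", action, m) for m in masters)
    # Step 3: put the masters back into rotation.
    steps.extend(("slavealloc", "enable", m) for m in masters)
    return steps


if __name__ == "__main__":
    for tool, action, master in plan_batch(["bm-a", "bm-b"]):  # hypothetical names
        print(f"{tool}: {action} {master}")
```

Whether the actions run per master or per action across the batch is an assumption of this sketch; the comment does not say.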
Comment 12•10 years ago
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #11)
> I will be batching the masters like this:
> https://etherpad.mozilla.org/X20KMNQsXP
>
> Steps:
> 1) disable the masters on slavealloc
> 2) use manage_masters.py to gracefull_stop update_buildbot start (3 actions)
> 3) enable the masters on slavealloc
Two comments:
1) You should leave out bm01-06 -- they're not in production yet. I'll make sure they come up with the right stuff.
2) If you disable bm51 and 52 at the same time, bm67 will end up as the only master for that pool (use1 linux tests). I recommend against doing this - the master will probably grind to a halt or die.
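One way to respect both points when building the batches is to skip the masters that are not in production and to put at most one master from any pool into a batch. The following is a rough sketch under those assumptions: `batch_by_pool` is a hypothetical helper, the pool label is shorthand, and only the bm51/bm52/bm67 grouping comes from the comment above.

```python
from collections import defaultdict


def batch_by_pool(pool_of, skip=()):
    """Group masters into batches with at most one master per pool each.

    pool_of maps master name -> pool name; masters in `skip` are left out.
    """
    by_pool = defaultdict(list)
    for master in sorted(pool_of):
        if master not in skip:
            by_pool[pool_of[master]].append(master)

    batches = []
    for masters in by_pool.values():
        # The i-th master of every pool goes into the i-th batch.
        for i, master in enumerate(masters):
            if i == len(batches):
                batches.append([])
            batches[i].append(master)
    return batches


# Pool membership taken from the comment above; the pool label is shorthand.
POOLS = {
    "bm51": "use1-linux-tests",
    "bm52": "use1-linux-tests",
    "bm67": "use1-linux-tests",
}

if __name__ == "__main__":
    print(batch_by_pool(POOLS))
```

With three masters in a pool, each batch leaves two of them running. A pool with only two masters would still drop to one during its batch, so such pools would need separate handling.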
Assignee
Comment 13•10 years ago
I updated the etherpad taking collocation into consideration. If we had a way to disable/enable a master through slavealloc we could totally script this.
Comment 14•10 years ago
Do we know if this works via manhole? If so, then we can use fabric to make this change via manhole to all the masters.
Assignee
Updated•10 years ago
Attachment #8393612 -
Flags: checked-in+
Assignee
Comment 15•10 years ago
Comment on attachment 8393613: raise_limit.buildbot.diff
Landed but not deployed.
Attachment #8393613 -
Flags: checked-in+
Assignee
Comment 16•10 years ago
Comment on attachment 8393614: raise_limit.bc.diff
Backed out until the masters are ready.
Attachment #8393614 -
Flags: checked-in-
Assignee
Updated•10 years ago
Attachment #8393623 -
Flags: checked-in+
Assignee
Comment 17•10 years ago
catlee has updated the test masters without any issues so far. He used fabric to do it. I don't see anything out of the ordinary: http://builddata.pub.build.mozilla.org/reports/pending/pending.html
Assignee
Comment 18•10 years ago
catlee deployed the rest.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Assignee
Updated•10 years ago
Attachment #8393614 -
Flags: checked-in- → checked-in+
Assignee
Comment 19•10 years ago
This is the code that catlee used: https://github.com/catlee/tools/commit/0cb4977d7b081c2ef882a6676d947bef489a4b53
Assignee
Comment 20•10 years ago
Live in production.
Comment 21•10 years ago
(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #7)
> http://hg.mozilla.org/build/buildbot/rev/7ce79514a42d
The buildbot commit only landed on default & hasn't been merged to the production branch. Does the production branch do anything in this repo? (I'm just trying to figure out what I need to do in bug 961075).
Flags: needinfo?(armenzg)
Assignee
Comment 22•10 years ago
Interesting situation. It should have landed on production, and the masters use production; however, we modified the masters live by using the manhole. That is why everything still worked as we intended. I will land it in the right place.
Flags: needinfo?(armenzg)
Comment 23•10 years ago
(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #22)
> Interesting situation.
> It should have landed on production, and the masters use production; however,
> we modified the masters live by using the manhole. That is why everything
> still worked as we intended.
> I will land it in the right place.
Please make sure you run "update_buildbot" on the masters if you're landing to the production branch. Otherwise the code on disk won't match what's running in memory.
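The mismatch described here, a value patched into a running process while the files on disk still hold the old one, can be shown with a stand-in module. This is a hedged sketch, not how Buildbot or the manhole are implemented: `pb_standin`, `load`, and the 1024 "old" value are all placeholders chosen for illustration.

```python
import types

# Stand-in for the module source on disk; 1024 is an illustrative "old" value.
SOURCE_ON_DISK = "MAX_BROKER_REFS = 1024\n"


def load(source):
    """Simulate a process start: execute the on-disk source into a fresh module."""
    module = types.ModuleType("pb_standin")
    exec(source, module.__dict__)
    return module


running = load(SOURCE_ON_DISK)
running.MAX_BROKER_REFS = 4096        # live change, as one might make through a manhole
after_restart = load(SOURCE_ON_DISK)  # a restart re-reads whatever is on disk

if __name__ == "__main__":
    print(running.MAX_BROKER_REFS, after_restart.MAX_BROKER_REFS)
```

The live process sees the new limit, but a restart would fall back to whatever the on-disk code says, which is why the code on disk has to be updated as well.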
Comment 24•10 years ago
(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #22)
> Interesting situation.
> It should have landed on production, and the masters use production; however,
> we modified the masters live by using the manhole. That is why everything
> still worked as we intended.
Ah! :-)
Assignee
Comment 25•10 years ago
I went into a meeting and did not have a chance to do it. I've now updated all masters with update_buildbot.
Updated•6 years ago
Component: General Automation → General