Closed Bug 1383266 Opened 2 years ago Closed 2 years ago

decom a couple scl3 windows tests masters

Categories

(Infrastructure & Operations :: CIDuty, task, P4)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Unassigned)

Attachments

(3 files)

We should evaluate load first, but we may be able to decommission some scl3 Windows test masters after we have migrated some win32/win64 tests to TC as tier 1 on trunk next week.

Here is the current list:
buildbot-master69.bb.releng.use1.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master109.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master110.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master111.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master112.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master119.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master126.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master127.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
I will hold off on taking this bug until the end of the week, when we can evaluate how the TC migration went. That's due on Wednesday.
So for buildbot Windows we have:
- 150 t-w732-ix
- 345 t-w864-ix
- 40 t-xp32-ix (only the first 10 are enabled)

Of the masters Kim listed above, buildbot-master69 is used for testing purposes; it was not used in production anyway.
At the moment we have 7 Buildbot masters. We can reduce this number to 3 or fewer, perhaps keeping only:

buildbot-master109.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master110.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
buildbot-master111.bb.releng.scl3.mozilla.com:/builds/buildbot/tests1-windows/master
Flags: needinfo?(kmoir)
We also have t-1064-ix in scl3, so we should make sure we have enough capacity for those.
Yes, sorry, there are also 75 t-1064-ix machines. I forgot to mention them. Thank you, Amy.
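As a rough sanity check on the capacity question, the pool sizes quoted in the comments above can be turned into an average slaves-per-master figure. This is only a sketch: it assumes slaves spread evenly across masters (which Buildbot does not guarantee) and counts only the 10 enabled t-xp32-ix machines. The names `POOLS` and `slaves_per_master` are made up for illustration.

```python
import math

# Pool sizes as quoted in this bug (machines per Windows pool).
POOLS = {
    "t-w732-ix": 150,
    "t-w864-ix": 345,
    "t-xp32-ix": 10,   # 40 exist, but only the first 10 are enabled
    "t-1064-ix": 75,
}


def slaves_per_master(pools, master_count):
    """Average slaves per master, assuming an even spread (an assumption)."""
    if master_count <= 0:
        raise ValueError("master_count must be positive")
    return sum(pools.values()) / master_count


if __name__ == "__main__":
    for count in (7, 5, 3):
        average = math.ceil(slaves_per_master(POOLS, count))
        print(f"{count} masters: about {average} slaves each")
```

Under these assumptions, going from 7 masters to 3 a bit more than doubles the number of slaves attached to each remaining master, which is why removing masters in steps and re-checking load is the cautious route.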
That would probably be fine. If you want to be super cautious, you could remove two, re-evaluate the load, and then see if you can remove another. As a side note, Joel wants to enable some more buildbot tests in bug 1393198.
Flags: needinfo?(kmoir)
For the moment I will keep bm109, bm110 and bm111.

mysql> select masters.nickname, count(*), masters.enabled from slaves, masters where masters.nickname like '%windows%' and slaves.current_masterid = masters.masterid group by masters.nickname;
+----------------------+----------+---------+
| nickname             | count(*) | enabled |
+----------------------+----------+---------+
| bm109-tests1-windows |       86 |       1 |
| bm110-tests1-windows |       84 |       1 |
| bm111-tests1-windows |       82 |       1 |
| bm112-tests1-windows |       81 |       1 |
| bm119-tests1-windows |       81 |       1 |
| bm126-tests1-windows |       81 |       1 |
| bm127-tests1-windows |       80 |       1 |
+----------------------+----------+---------+
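The query above counts slaves per master by joining on `current_masterid`. A minimal sketch of the same idea, runnable against an in-memory SQLite database, is below. The column names `nickname`, `enabled`, `masterid` and `current_masterid` come from the query in this bug; the `slaveid` column, the demo rows and the function names are assumptions for illustration, not the real slavealloc schema. It uses an explicit JOIN and groups by both selected columns, which is more portable than selecting `enabled` outside the GROUP BY.

```python
import sqlite3


def count_slaves_per_master(conn, pattern="%windows%"):
    """Return (nickname, slave_count, enabled) rows for matching masters."""
    query = """
        SELECT m.nickname, COUNT(*) AS slave_count, m.enabled
        FROM slaves AS s
        JOIN masters AS m ON s.current_masterid = m.masterid
        WHERE m.nickname LIKE ?
        GROUP BY m.nickname, m.enabled
        ORDER BY m.nickname
    """
    return conn.execute(query, (pattern,)).fetchall()


def build_demo_db():
    """Create a tiny in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE masters (
            masterid INTEGER PRIMARY KEY, nickname TEXT, enabled INTEGER
        );
        CREATE TABLE slaves (
            slaveid INTEGER PRIMARY KEY, current_masterid INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO masters VALUES (?, ?, ?)",
        [
            (1, "bm109-tests1-windows", 1),
            (2, "bm110-tests1-windows", 1),
            (3, "bm01-tests1-linux", 1),  # made-up non-Windows master
        ],
    )
    # Made-up attachments: 3 slaves on master 1, 2 on master 2, 1 on master 3.
    conn.executemany(
        "INSERT INTO slaves (current_masterid) VALUES (?)",
        [(1,), (1,), (1,), (2,), (2,), (3,)],
    )
    return conn


if __name__ == "__main__":
    for row in count_slaves_per_master(build_demo_db()):
        print(row)
```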
Puppet patch to remove bm112, bm119, bm126 and bm127
Attachment #8901136 - Flags: review?(spacurar)
Remove from tools repository
Attachment #8901153 - Flags: review?(spacurar)
Attachment #8901136 - Flags: review?(spacurar) → review+
Attachment #8901153 - Flags: review?(spacurar) → review+
Disable nagios alerts for BB masters
Attachment #8901161 - Flags: review?(spacurar)
Attachment #8901161 - Flags: review?(spacurar) → review+
I disabled these 4 masters yesterday, and the remaining masters seem to handle the remaining Windows machines:

+----------------------+----------+---------+
| nickname             | count(*) | enabled |
+----------------------+----------+---------+
| bm109-tests1-windows |      191 |       1 |
| bm110-tests1-windows |      189 |       1 |
| bm111-tests1-windows |      187 |       1 |
| bm128-tests1-windows |      134 |       1 |
| bm129-tests1-windows |      134 |       1 |
| bm137-tests1-windows |      133 |       1 |
| bm138-tests1-windows |      133 |       1 |
| bm139-tests1-windows |      133 |       1 |
| bm140-tests1-windows |      133 |       1 |
+----------------------+----------+---------+
We had a spike of 2k pending jobs on w732, but it lasted only 15 minutes, so I will continue with the decom.
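To put numbers on the before/after comparison, here is a small sketch using the counts from the two query outputs in this bug. It assumes the bm109, bm110 and bm111 rows in both outputs refer to the same scl3 masters; the dictionary and function names are made up for illustration.

```python
# Slave counts per scl3 Windows master before the decom (first query output).
BEFORE = {
    "bm109": 86, "bm110": 84, "bm111": 82, "bm112": 81,
    "bm119": 81, "bm126": 81, "bm127": 80,
}

# Slave counts on the three masters that were kept (second query output).
AFTER = {"bm109": 191, "bm110": 189, "bm111": 187}


def summarize(before, after):
    """Per-master before/after counts and growth factor for the kept masters."""
    report = {}
    for name, new_count in after.items():
        old_count = before[name]
        report[name] = {
            "before": old_count,
            "after": new_count,
            "growth": round(new_count / old_count, 2),
        }
    return report


if __name__ == "__main__":
    print("attached before:", sum(BEFORE.values()))
    print("attached after: ", sum(AFTER.values()))
    for name, stats in summarize(BEFORE, AFTER).items():
        print(name, stats)
```

The totals are close (575 before, 567 after), which fits the observation that the three remaining masters picked up nearly all of the machines, each carrying a bit more than twice its earlier load.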
Also deleted them from slavealloc.
Note explaining the priority level: P4 does not mean we have lowered the priority; quite the contrary. We are aligning these levels with the buildduty quarterly deliverables, where P1-P3 are taken by our daily waterline KTLO operational tasks.
Priority: -- → P4
All done here.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations