Closed
Bug 1317723
Opened 8 years ago
Closed 8 years ago
Rebalance the Win8 machine pool
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: RyanVM, Assigned: aselagea)
Attachments
(4 files)
99.38 KB, image/png
27.23 KB, image/png
1018 bytes, patch (kmoir: review+, aselagea: checked-in+)
12.26 KB, text/csv (kmoir: review+, aselagea: checked-in+)
Over in bug 1317434, I'd like to finally enable Win8 e10s tests in production. Now that WinXP tests are no longer running on 53+, we should be able to move a large fraction of those machines over to Win8.
If my reading of Slave Health is correct, we currently have 211 WinXP testers. I'd like to propose moving 151 over to Win8. That would leave us with a pool of 60 WinXP test machines to cover Aurora/Beta/Release/ESR45 and ~370 Win8 testers, which is pretty close to what we've got for 10.10, where backlog seems pretty reasonable these days. 60 machines for WinXP is a bit light, but release branches aren't as high-volume or as risk-prone with respect to coalescing, so I don't think a bit of backlog there is a big deal. And buildbot branch prioritization will ensure that mozilla-release still gets first dibs should we find ourselves in a chemspill situation where turnaround time is paramount.
Amy, do those numbers sound reasonable? If so, any reason this can't proceed whenever?
Flags: needinfo?(arich)
Updated•8 years ago
Component: General Automation → Buildduty
QA Contact: catlee → bugspam.Callek
Assignee
Updated•8 years ago
Assignee: nobody → aselagea
Assignee
Comment 1•8 years ago
There are actually 222 XP machines in the pool, but I noticed that some of them are having issues and will likely need a re-image (see the attachments). Those are:
t-xp32-ix-143
t-xp32-ix-144
t-xp32-ix-145
t-xp32-ix-146
t-xp32-ix-147
t-xp32-ix-148
t-xp32-ix-149
t-xp32-ix-150
t-xp32-ix-151
t-xp32-ix-152
t-xp32-ix-154
Assignee
Comment 2•8 years ago
I disabled those 11 machines in slavealloc. If we stick to keeping 60 XP machines, that would mean moving 162 to the Windows 8 pool.
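The pool arithmetic in the description and in comment 2 can be checked with a minimal sketch. The function name `machines_to_move` is hypothetical and only illustrates the subtraction; it is not part of slavealloc or any releng tooling.

```python
def machines_to_move(total_xp: int, keep_xp: int) -> int:
    """Return how many XP testers can move to Win8 if keep_xp stay behind."""
    if keep_xp > total_xp:
        raise ValueError("cannot keep more machines than exist in the pool")
    return total_xp - keep_xp


# Original estimate from Slave Health in the description: 211 testers.
print(machines_to_move(211, 60))  # 151

# Corrected pool size from comment 1: 222 testers.
print(machines_to_move(222, 60))  # 162
```

The difference between the two results (11) is the gap between the 211 estimate and the 222 machines actually in the pool.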
Assignee
Comment 3•8 years ago
Assignee
Comment 4•8 years ago
Comment 5•8 years ago
I'm going to defer load calculations to the releng folks since they have a better idea of their application load and wait on them for a request to move specific machines around.
Flags: needinfo?(arich)
Comment 6•8 years ago
Alin and I talked about that this morning in our standup. He looked at the load and I looked at the load and we think that Ryan's proposal is a sensible way forward.
Assignee
Comment 8•8 years ago
slavealloc new entries for win8
Attachment #8811339 -
Flags: review?(kmoir)
Assignee
Comment 9•8 years ago
I disabled the t-xp32-ix machines in range [061-222].
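As a sanity check on the range above, here is a small sketch that expands the hostnames and counts them. It assumes the three-digit zero-padded naming seen in comment 1; `expand_range` is a hypothetical helper, not an existing tool.

```python
def expand_range(prefix: str, first: int, last: int, width: int = 3) -> list[str]:
    """Expand an inclusive numeric range into zero-padded hostnames."""
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


# Machines disabled for the move to Win8: t-xp32-ix-[061-222].
xp_to_move = expand_range("t-xp32-ix-", 61, 222)
print(len(xp_to_move))                # 162
print(xp_to_move[0], xp_to_move[-1])  # t-xp32-ix-061 t-xp32-ix-222

# The 11 machines flagged in comment 1 as likely needing a re-image.
flagged = expand_range("t-xp32-ix-", 143, 152) + ["t-xp32-ix-154"]
print(len(flagged))                    # 11
print(set(flagged) <= set(xp_to_move)) # True: all fall inside the disabled range
```

The count of 162 matches the figure given in comment 2, and the flagged machines are all part of the range being moved.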
Comment 10•8 years ago
Comment on attachment 8811338 [details] [diff] [review]
bug_1317723.patch
Alin: Is there a bug for relops to reimage the machines as win8? I didn't see a dependent bug referenced.
Attachment #8811338 -
Flags: review?(kmoir) → review+
Comment 11•8 years ago
Comment on attachment 8811339 [details]
bug_1317723_slavealloc.csv
There's an extra space in the CSV file after the second column that should be removed:
substitute "win8," for "win8 ,"
needinfo for my question in the previous review
Flags: needinfo?(aselagea)
Attachment #8811339 -
Flags: review?(kmoir) → review+
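The whitespace fix requested in the review can be sketched with the standard `csv` module. The sample row is hypothetical: only the "win8 ," second column comes from the review, and the other values are placeholders rather than the real contents of bug_1317723_slavealloc.csv.

```python
import csv
import io


def strip_fields(csv_text: str) -> str:
    """Return csv_text with leading/trailing whitespace removed from every field."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([field.strip() for field in row])
    return out.getvalue()


# Hypothetical row: the third column is a placeholder, not a real slavealloc value.
sample = "t-w864-ix-236,win8 ,placeholder\n"
print(strip_fields(sample), end="")  # t-w864-ix-236,win8,placeholder
```

Stripping every field rather than only the second column is a design choice made here for simplicity; it would also catch stray spaces elsewhere in the file.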
Assignee
Comment 12•8 years ago
Comment on attachment 8811339 [details]
bug_1317723_slavealloc.csv
Fixed and added the entries to slavealloc.
Attachment #8811339 -
Flags: checked-in+
Assignee
Updated•8 years ago
Attachment #8811338 -
Flags: checked-in+
Assignee
Comment 13•8 years ago
(In reply to Kim Moir [:kmoir] from comment #10)
> Alin: Is there a bug for relops to reimage the machines as win8? I didn't
> see a dependent bug referenced.
I filed bug 1318275 for that.
Flags: needinfo?(aselagea)
Comment 14•8 years ago
64 machines were enabled as a first step:
t-w864-ix-236
t-w864-ix-237
..............
t-w864-ix-299
Query OK, 64 rows affected (0.01 sec)
Rows matched: 64 Changed: 64 Warnings: 0
We will monitor them, and if everything is fine and the jobs are green, we will enable the remaining ones.
Reporter
Comment 15•8 years ago
The remaining ones were enabled by Andrei before leaving for the day with the understanding that we'd be watching the results. Things were quite rocky for a while: a lot of the machines didn't start taking jobs until being force-rebooted, leading to a significant backlog in the meantime.
Once they started taking jobs, a not-insignificant number (~20%) had resolution issues that were causing widespread test failures. Many screenshots showed GeForce Experience-related notifications on the screen in addition to being at the wrong resolution. Interestingly enough, rebooting those misbehaving machines was all it took to get the majority acting nicely.
At this point, ix-308 is disabled for ongoing problems that rebooting wasn't fixing. Additionally, there were a few machines that refused to connect to a master even after multiple reboot attempts. Tracking bugs for those machines have been filed. We now have 383 working machines and I'm calling this fixed.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 16•8 years ago
Fixed ix-308 and the remaining machines that refused to connect to a master after being re-imaged.
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard