Closed Bug 1397879 Opened 4 years ago Closed 4 years ago

reimage 30 win8 machines as windows 10

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Assigned: arich)

Attachments

(3 files)

+++ This bug was initially created as a clone of Bug #1397225 +++

I would like to move 30 machines from win8->win10 to accommodate the load from moving mochitest-chrome/clipboard/gpu jobs to win10.

My thought is that moving 30 machines will not starve win8 (the backlog there is zero most of the time), and if we find after the move that the backlog remains at or near zero, then we could move another 10 machines.
This will be the next round now that some of the load has moved off win8; we need to prepare for moving more tests soon.

We do not need this on Friday, but happy to see it early next week!

:ahal, when this is resolved, we should have capacity to make the browser-chrome tests run by default on win10
Blocks: 1397229
I've disabled the following machines in preparation for this move:

t-w864-ix-307.wintest.releng.scl3.mozilla.com
t-w864-ix-308.wintest.releng.scl3.mozilla.com
t-w864-ix-309.wintest.releng.scl3.mozilla.com
t-w864-ix-310.wintest.releng.scl3.mozilla.com
t-w864-ix-311.wintest.releng.scl3.mozilla.com
t-w864-ix-312.wintest.releng.scl3.mozilla.com
t-w864-ix-313.wintest.releng.scl3.mozilla.com
t-w864-ix-314.wintest.releng.scl3.mozilla.com
t-w864-ix-315.wintest.releng.scl3.mozilla.com
t-w864-ix-316.wintest.releng.scl3.mozilla.com
t-w864-ix-317.wintest.releng.scl3.mozilla.com
t-w864-ix-318.wintest.releng.scl3.mozilla.com
t-w864-ix-319.wintest.releng.scl3.mozilla.com
t-w864-ix-320.wintest.releng.scl3.mozilla.com
t-w864-ix-321.wintest.releng.scl3.mozilla.com
t-w864-ix-322.wintest.releng.scl3.mozilla.com
t-w864-ix-323.wintest.releng.scl3.mozilla.com
t-w864-ix-324.wintest.releng.scl3.mozilla.com
t-w864-ix-325.wintest.releng.scl3.mozilla.com
t-w864-ix-326.wintest.releng.scl3.mozilla.com
t-w864-ix-327.wintest.releng.scl3.mozilla.com
t-w864-ix-328.wintest.releng.scl3.mozilla.com
t-w864-ix-329.wintest.releng.scl3.mozilla.com
t-w864-ix-330.wintest.releng.scl3.mozilla.com
t-w864-ix-331.wintest.releng.scl3.mozilla.com
t-w864-ix-332.wintest.releng.scl3.mozilla.com
t-w864-ix-333.wintest.releng.scl3.mozilla.com
t-w864-ix-334.wintest.releng.scl3.mozilla.com
t-w864-ix-335.wintest.releng.scl3.mozilla.com
t-w864-ix-336.wintest.releng.scl3.mozilla.com

They will become:

t-w1064-ix-096.wintest.releng.scl3.mozilla.com
t-w1064-ix-097.wintest.releng.scl3.mozilla.com
t-w1064-ix-098.wintest.releng.scl3.mozilla.com
t-w1064-ix-099.wintest.releng.scl3.mozilla.com
t-w1064-ix-100.wintest.releng.scl3.mozilla.com
t-w1064-ix-101.wintest.releng.scl3.mozilla.com
t-w1064-ix-102.wintest.releng.scl3.mozilla.com
t-w1064-ix-103.wintest.releng.scl3.mozilla.com
t-w1064-ix-104.wintest.releng.scl3.mozilla.com
t-w1064-ix-105.wintest.releng.scl3.mozilla.com
t-w1064-ix-106.wintest.releng.scl3.mozilla.com
t-w1064-ix-107.wintest.releng.scl3.mozilla.com
t-w1064-ix-108.wintest.releng.scl3.mozilla.com
t-w1064-ix-109.wintest.releng.scl3.mozilla.com
t-w1064-ix-110.wintest.releng.scl3.mozilla.com
t-w1064-ix-111.wintest.releng.scl3.mozilla.com
t-w1064-ix-112.wintest.releng.scl3.mozilla.com
t-w1064-ix-113.wintest.releng.scl3.mozilla.com
t-w1064-ix-114.wintest.releng.scl3.mozilla.com
t-w1064-ix-115.wintest.releng.scl3.mozilla.com
t-w1064-ix-116.wintest.releng.scl3.mozilla.com
t-w1064-ix-117.wintest.releng.scl3.mozilla.com
t-w1064-ix-118.wintest.releng.scl3.mozilla.com
t-w1064-ix-119.wintest.releng.scl3.mozilla.com
t-w1064-ix-120.wintest.releng.scl3.mozilla.com
t-w1064-ix-121.wintest.releng.scl3.mozilla.com
t-w1064-ix-122.wintest.releng.scl3.mozilla.com
t-w1064-ix-123.wintest.releng.scl3.mozilla.com
t-w1064-ix-124.wintest.releng.scl3.mozilla.com
t-w1064-ix-125.wintest.releng.scl3.mozilla.com
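
The two lists above line up one to one: win8 hosts 307 through 336 become win10 hosts 096 through 125. A minimal Python sketch of that mapping follows. The in-order pairing is an assumption inferred from the two lists (the bug does not say which old host received which new name), and `rename_map` is a hypothetical helper, not part of any releng tooling.

```python
DOMAIN = "wintest.releng.scl3.mozilla.com"


def rename_map(old_start=307, new_start=96, count=30):
    """Pair each old win8 FQDN with a new win10 FQDN, in numeric order.

    Assumption: host N in the old range maps to the host at the same
    offset in the new range.
    """
    mapping = {}
    for offset in range(count):
        old = f"t-w864-ix-{old_start + offset:03d}.{DOMAIN}"
        new = f"t-w1064-ix-{new_start + offset:03d}.{DOMAIN}"
        mapping[old] = new
    return mapping


if __name__ == "__main__":
    for old, new in rename_map().items():
        print(f"{old} -> {new}")
```

A generated mapping like this could be diffed against the inventory before renaming, to catch off-by-one errors in either range.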
Assignee: relops → arich
aobreja: can you please work on the buildbot patches for this on Friday morning?
Flags: needinfo?(aobreja)
Hosts renamed.
Nagios updated in commit: 9802d34091da644fc25234f6b7db9126532a0612

Once they're added to the buildbot configs, please reimage, verify, enable so that they're picking up jobs, and resolve this bug.
Patch for buildbot-config.
Flags: needinfo?(aobreja)
Attachment #8905813 - Flags: review?(mtabara)
Adding the slaves to slavealloc.
Attachment #8905821 - Flags: review?(mtabara)
Attachment #8905813 - Flags: review?(mtabara) → review+
Attachment #8905821 - Flags: review?(mtabara) → review+
Comment on attachment 8905821 [details]
bug1397879_slavealloc.csv

Added the new slaves to slavealloc and also removed the entries for the win8 slaves that were migrated to the win10 pool.

mysql> select slaveid, name from slaves where name like 't-w1064-ix%' order by cast(substring(name, -3) as unsigned) desc limit 30;
+---------+----------------+
| slaveid | name           |
+---------+----------------+
|   34425 | t-w1064-ix-125 |
|   34423 | t-w1064-ix-124 |
|   34421 | t-w1064-ix-123 |
|   34419 | t-w1064-ix-122 |
|   34417 | t-w1064-ix-121 |
|   34415 | t-w1064-ix-120 |
|   34413 | t-w1064-ix-119 |
|   34411 | t-w1064-ix-118 |
|   34409 | t-w1064-ix-117 |
|   34407 | t-w1064-ix-116 |
|   34405 | t-w1064-ix-115 |
|   34403 | t-w1064-ix-114 |
|   34401 | t-w1064-ix-113 |
|   34399 | t-w1064-ix-112 |
|   34397 | t-w1064-ix-111 |
|   34395 | t-w1064-ix-110 |
|   34393 | t-w1064-ix-109 |
|   34391 | t-w1064-ix-108 |
|   34389 | t-w1064-ix-107 |
|   34387 | t-w1064-ix-106 |
|   34385 | t-w1064-ix-105 |
|   34383 | t-w1064-ix-104 |
|   34381 | t-w1064-ix-103 |
|   34379 | t-w1064-ix-102 |
|   34377 | t-w1064-ix-101 |
|   34375 | t-w1064-ix-100 |
|   34373 | T-w1064-ix-099 |
|   34371 | t-w1064-ix-098 |
|   34369 | t-w1064-ix-097 |
|   34367 | t-w1064-ix-096 |
+---------+----------------+
30 rows in set (0.01 sec)

mysql> delete from slaves where name like 't-w864-ix%' order by cast(substring(name, -3) as unsigned) desc limit 30;
Query OK, 30 rows affected (0.01 sec)

mysql> select slaveid, name from slaves where name like 't-w864-ix%' order by cast(substring(name, -3) as unsigned) desc limit 5;
+---------+---------------+
| slaveid | name          |
+---------+---------------+
|   33952 | t-w864-ix-306 |
|   33950 | t-w864-ix-305 |
|   33948 | t-w864-ix-304 |
|   33946 | t-w864-ix-303 |
|   33944 | t-w864-ix-302 |
+---------+---------------+
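
The DELETE above relies on `order by cast(substring(name, -3) as unsigned) desc limit 30` selecting exactly the 30 highest-numbered win8 slaves. The sketch below mimics that ordering in Python so the selection can be previewed before anything is deleted. It is only an illustration: the sample table contents (hosts 001 through 336) are an assumption, and `top_by_suffix` is a hypothetical helper.

```python
def top_by_suffix(names, limit):
    """Return the `limit` names with the largest trailing 3-digit number.

    Mirrors: ORDER BY CAST(SUBSTRING(name, -3) AS UNSIGNED) DESC LIMIT n
    """
    return sorted(names, key=lambda name: int(name[-3:]), reverse=True)[:limit]


if __name__ == "__main__":
    # Hypothetical table contents; the real slaves table may differ.
    names = [f"t-w864-ix-{n:03d}" for n in range(1, 337)]
    doomed = top_by_suffix(names, 30)
    print(doomed[0], "...", doomed[-1])
```

With those assumed contents the preview covers 336 down to 307, which is consistent with the follow-up SELECT showing 306 as the highest remaining win8 slave.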
Attachment #8905821 - Flags: checked-in+
However, for the list below I was unable to re-image, since I got errors like:

<Activate Session error: Command response could not be provided
<Error: Unable to establish LAN session

<t-w1064-ix-121.wintest.releng.scl3.mozilla.com
<t-w1064-ix-122.wintest.releng.scl3.mozilla.com
<t-w1064-ix-123.wintest.releng.scl3.mozilla.com
<t-w1064-ix-124.wintest.releng.scl3.mozilla.com
<t-w1064-ix-125.wintest.releng.scl3.mozilla.com
Status: all machines t-w1064-ix-[096-120] were successfully re-imaged, with a few exceptions which are now being re-imaged again:

t-w1064-ix-106.wintest.releng.scl3.mozilla.com
t-w1064-ix-107.wintest.releng.scl3.mozilla.com
t-w1064-ix-110.wintest.releng.scl3.mozilla.com
>t-w1064-ix-121.wintest.releng.scl3.mozilla.com
>t-w1064-ix-122.wintest.releng.scl3.mozilla.com
>t-w1064-ix-123.wintest.releng.scl3.mozilla.com
>t-w1064-ix-124.wintest.releng.scl3.mozilla.com
>t-w1064-ix-125.wintest.releng.scl3.mozilla.com

re-image in progress
(In reply to Andrei Obreja [:aobreja][:buildduty] from comment #10)

Those had the wrong ipmi password. I've fixed that as well as kicking off the reimage.
There is a problem with the machines from comment 12 -> https://bugzilla.mozilla.org/show_bug.cgi?id=1397879#c12
They seem unreachable. I also tried to reboot them with ipmitool, but it had no effect.

sebastian.pacurar@P5105:~$ fping t-w1064-ix-121.wintest.releng.scl3.mozilla.com
t-w1064-ix-121.wintest.releng.scl3.mozilla.com is unreachable

sebastian.pacurar@P5105:~$ fping t-w1064-ix-122.wintest.releng.scl3.mozilla.com
t-w1064-ix-122.wintest.releng.scl3.mozilla.com is unreachable

sebastian.pacurar@P5105:~$ fping t-w1064-ix-123.wintest.releng.scl3.mozilla.com
t-w1064-ix-123.wintest.releng.scl3.mozilla.com is unreachable

sebastian.pacurar@P5105:~$ fping t-w1064-ix-124.wintest.releng.scl3.mozilla.com
t-w1064-ix-124.wintest.releng.scl3.mozilla.com is unreachable

sebastian.pacurar@P5105:~$ fping t-w1064-ix-125.wintest.releng.scl3.mozilla.com
t-w1064-ix-125.wintest.releng.scl3.mozilla.com is unreachable
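
The five checks above could be run in one pass. The sketch below parses fping's per-host lines ("... is alive" / "... is unreachable") into a dictionary. It assumes fping is installed and prints one such line per host; `parse_fping` and `check_hosts` are hypothetical helper names.

```python
import subprocess

ALIVE = " is alive"
UNREACHABLE = " is unreachable"


def parse_fping(output):
    """Map each host in fping's per-host output to True (alive) or False."""
    status = {}
    for line in output.splitlines():
        line = line.strip()
        if line.endswith(ALIVE):
            status[line[: -len(ALIVE)]] = True
        elif line.endswith(UNREACHABLE):
            status[line[: -len(UNREACHABLE)]] = False
    return status


def check_hosts(hosts):
    """Run fping once for all hosts and parse the result.

    fping exits non-zero when any host is unreachable, so the return
    code is deliberately not treated as an error here.
    """
    proc = subprocess.run(["fping", *hosts], capture_output=True, text=True)
    return parse_fping(proc.stdout + proc.stderr)


if __name__ == "__main__":
    domain = "wintest.releng.scl3.mozilla.com"
    hosts = [f"t-w1064-ix-{n}.{domain}" for n in range(121, 126)]
    for host, alive in check_hosts(hosts).items():
        print(host, "alive" if alive else "unreachable")
```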

What should I do in this case?
Flags: needinfo?(arich)
I've tried another reimage. If that also fails, you'll need to talk to markco or Q for further debugging.
Flags: needinfo?(arich)
(In reply to Andrei Obreja [:aobreja][:buildduty] from comment #11)

I enabled the ones you indicated finished successfully since jmaher said we were experiencing backlog (and he hadn't even moved all the tests yet).
(In reply to Andrei Obreja [:aobreja][:buildduty] from comment #12)
>t-w1064-ix-121.wintest.releng.scl3.mozilla.com
>t-w1064-ix-122.wintest.releng.scl3.mozilla.com
>t-w1064-ix-123.wintest.releng.scl3.mozilla.com
>t-w1064-ix-124.wintest.releng.scl3.mozilla.com
>t-w1064-ix-125.wintest.releng.scl3.mozilla.com


All 5 of these machines are unreachable. Mark, can you do some further debugging here?
Flags: needinfo?(mcornmesser)
Depends on: 1398888
Attached image tw1064ix121_bios.jpg
For some reason the network boot option is no longer in the BIOS.
My bad, it is actually there as the second option, but when it goes to PXE boot it comes back with "An operating system is not found".
See Also: → 1398954
I've opened 1398954 to track the issues with those specific systems.
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mcornmesser)
Resolution: --- → FIXED
The machines that were re-imaged did not connect to a master and have not taken jobs yet, so for now only 75 machines are enabled. Until this issue is solved, please try to keep the backlog lower.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The problem was fixed; SOL was also enabled and onboard graphics for w8 was disabled. These machines are now in production, with a few exceptions which are under investigation in Bug 1398954:

t-w1064-ix-121.wintest.releng.scl3.mozilla.com
t-w1064-ix-122.wintest.releng.scl3.mozilla.com
t-w1064-ix-123.wintest.releng.scl3.mozilla.com
t-w1064-ix-124.wintest.releng.scl3.mozilla.com
t-w1064-ix-125.wintest.releng.scl3.mozilla.com
Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED