like we did for Bug 1063018, this requires:
- a re-image instead of an applied gpo
- vs2013 update 3 full release version (18.00.30723)
- the post vs2013 install steps of copying cvtres.exe

we are ready for roll out based on staging findings in:
https://bugzilla.mozilla.org/show_bug.cgi?id=1063018#c17

and the overall health of machines since being put into production:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=b-2008-ix-0001
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=b-2008-ix-0002
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=b-2008-ix-0003
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=b-2008-ix-0004
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=b-2008-ix-0001

talking to markco, imaging servers should be able to handle ~10 machines at once and each machine should take ~30m. relops will look into a more automated process of disabling and re-imaging them.

we are good to start this as soon as time permits. starting with try pool first again like we did for 1019165 makes sense.
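As a rough illustration of the capacity estimate above (~10 machines at once, ~30m each), here is a minimal sketch that splits a pool into batches and estimates wall-clock time. It assumes batches run back to back and every machine in a batch images in parallel; `plan_batches` and the example host list are hypothetical and not part of any relops tooling.

```python
def plan_batches(hosts, batch_size=10, minutes_per_batch=30):
    """Split hosts into imaging batches and estimate total minutes.

    Assumption: each batch takes ~minutes_per_batch regardless of size,
    and batches are run one after another with no waiting on builds.
    """
    batches = [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]
    total_minutes = len(batches) * minutes_per_batch
    return batches, total_minutes


if __name__ == "__main__":
    # Hypothetical example pool: 31 machines named like the slave health URLs.
    hosts = [f"b-2008-ix-{n:04d}" for n in range(6, 37)]
    batches, total = plan_batches(hosts)
    print(f"{len(batches)} batches, ~{total} minutes")
```

In practice the estimate is a lower bound, since machines have to finish their current builds before they can be imaged.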
this work is being stalled for today due to higher priority work. will be resumed either tomorrow or thurs
I am going to disable 0006 through 0036, and begin the install once the first 15 are clear.
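The ranges used in these updates (for example 0006 through 0036) can be expanded into full hostnames with a small helper. This is only a sketch, assuming the `b-2008-ix-NNNN` naming seen in the slave health URLs; `expand_range` is a hypothetical helper name, and the exclusion list is for machines that are unreachable or otherwise skipped.

```python
def expand_range(first, last, exclude=(), prefix="b-2008-ix-"):
    """Return hostnames for first..last inclusive, skipping excluded numbers.

    Assumption: machine numbers are zero-padded to four digits.
    """
    skipped = set(exclude)
    return [f"{prefix}{n:04d}" for n in range(first, last + 1) if n not in skipped]


if __name__ == "__main__":
    # Example: the first batch to disable, leaving out two hypothetical problem machines.
    for host in expand_range(6, 36, exclude=(11, 14)):
        print(host)
```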
I am going to re-enable 0009, 0010, and 0017. They are the first ones to complete. In 30 minutes I am taking off for a handful of hours, but I will resume once I am back.
0006 through 0022 are completed and re-enabled, with the exception of 0011 and 0014. 0011 seems to be repeatedly failing the re-image, and 0014 is unreachable. I will dive deeper into those two, as well as the rest of the installs, on Monday. On Monday things should roll quicker since the second batch is already disabled. Once those installs are going I will disable the next batch and roll through as many as I can Monday, pausing Monday night for the release builds on Tuesday.
0011 has installed through and is re-enabled. 0023 through 0036 are now being installed, with the exception of 0033, which is currently unreachable. 0037 through 0051 have been disabled.
0023 through 0036 have completed and been re-enabled, except for 0031. 0031 is still waiting to complete. 0052 through 0067 have been disabled. 0054, 0059, 0060, and 0064 were already disabled and commented with https://bugzilla.mozilla.org/show_bug.cgi?id=1060255. Should these machines be re-imaged, or do we want to leave them in their current state?
And currently still waiting on 0037 through 0051 to finish.
0037 through 0050 are now being installed, except for 0038, which I am having connection issues with. 0051 is still waiting for the current build to complete. 0014 and 0033 are now back in and are being imaged.
> 0054, 0059, 0060, and 0064 was already
> disabled and commented with
> https://bugzilla.mozilla.org/show_bug.cgi?id=1060255. Should these machines
> be re-imaged, or do we want to leave them in their current state?

my understanding was that we fixed the NSIS issue via GPO. I would think these machines just need latest gpo and then they can be re-enabled. Unless I'm shown to be wrong, we can re-image these and they should just work.

https://bugzilla.mozilla.org/show_bug.cgi?id=1060255#c22
0014, 0031, 0033, 0037, 0039, 0041, and 0043 through 0051 have completed and are now re-enabled. 0040 and 0042 need to have their domain accounts deleted and be re-imaged. Starting in on 0052+ shortly. 0068 through 0082 have been disabled.
0040, 0042, and 0051 through 0067 are completed and re-enabled. 0068 is currently unreachable. Waiting for builds on 0068 through 0082 to finish. 0083 through 0098 have been disabled.
0069 through 0082 are completed and re-enabled, except 0071, which still has a build running on it. 0068 is now reachable and has an install running. 0099 through 0113 have been disabled. Will begin 0083+ shortly, except for 0089: there is no inventory entry for 0089.
0068 and 0083 through 0098 are complete and re-enabled except for 0086. 0084 and 0086 still currently have builds going. Disabled 0114 through 0128. Will begin on 0099+ shortly.
0071, 0086, 0099 through 0110, and 0112 are completed and re-enabled. 0113 had to have its domain account deleted and now is being re-imaged. 0084 and 0111 still had builds to complete. Disabled 0129 through 0143. Moving on to 0114 +
0111, 0113 through 0117, and 0119 through 0127 have completed and have been re-enabled. 0118 and 0084 are still finishing builds. 0128 did not complete the install and needs to be rekicked. Disabled 0144 through 0158. Moving on to 0128+
Re-enabled 0144 through 0158. I won't get to those tonight. 0128+ are still waiting for some builds to finish. After they are finished I will kick off an install. Once those are completed I will hold off until the release builds are finished before continuing on 0144+.
0118, 0136, 0135, and 0139 still had builds going and have been re-enabled. I will catch those on the next go around. 0129, 0141, 0138, and 0143 are currently installing. I will check on those tomorrow morning. 0128, 0130 through 0134, 0137, 0140, and 0142 have completed and have been re-enabled.
0084, 0129, 0138, 0141, and 0143 have completed and have been re-enabled.
Do we know when we would be clear to continue re-imaging?
(In reply to Mark Cornmesser [:markco] from comment #19)
> Do we know when we would be clear to continue re-imaging?

we are all clear to continue today. we have our usual beta but we do not require all slaves on deck.
hmm I had wrongly posted comments for this bug on 1063018. With the exception of 0038 and 0180 all of the machines have been re-imaged and re-enabled.
thanks markco for rolling this out so fast. we can start testing this out on try asap. markco, should we track 0180 and 0038 in their respective bugs and close this bug?
I would prefer not to close this one until those 2 are re-imaged.
0038 has completed and has been re-enabled. 0180 was previously decommissioned. All machines should now be re-imaged and enabled.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
\o/ Thanks very much Mark!
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations