Closed
Bug 1063018
Opened 10 years ago
Closed 10 years ago
Install vs2013 Update 3, NSIS 3.0a2, and Mercurial 2.9.2 on 5 machines for testing
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Infrastructure & Operations Graveyard
CIDuty
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jlund, Assigned: markco)
References
Details
No description provided.
Reporter
Updated•10 years ago
Blocks: b-2008-ix-0001
Reporter
Updated•10 years ago
Blocks: b-2008-ix-0002
Reporter
Updated•10 years ago
Blocks: b-2008-ix-0003
Reporter
Updated•10 years ago
Blocks: b-2008-ix-0004
Reporter
Updated•10 years ago
Blocks: b-2008-ix-0005
Reporter
Comment 1•10 years ago
I've grabbed b-2008-ix-000[1-5]. b-2008-ix-0001 and b-2008-ix-0004 were not running a job, so they are fully disabled and ready to be worked on. Once 0002, 0003, and 0005 finish their current jobs and reboot, they will be ready too.
I also have a dev staging machine, ix-mn-w0864-001, that we can use to create the image. It is ready to be worked on at any time too.
Blocks: ix-mn-w0864-001
Reporter
Comment 2•10 years ago
FTR - b-2008-ix-0002, b-2008-ix-0003, and b-2008-ix-0005 are fully disabled and ready to be worked on.
Assignee
Updated•10 years ago
Assignee: nobody → mcornmesser
Assignee
Comment 3•10 years ago
The 5 machines are now being installed with VS 2013 update 3.
The git, nsis, and VS 2013 GPOs will not apply to these machines.
I will check back in the morning to make sure the machines have completed the install.
Reporter
Comment 4•10 years ago
(In reply to Mark Cornmesser [:markco] from comment #3)
> The 5 machines are now being installed with VS 2013 update 3.
\o/
Assignee
Comment 5•10 years ago
0001 and 0003 did not seem to complete correctly. I am looking into those 2 machines now.
Assignee
Comment 6•10 years ago
These machines are good to go now.
They got into an odd state where the deployment did not complete the install. After the drives were wiped, the installation completed.
Reporter
Comment 7•10 years ago
Sweet. I can't connect to 0003; ping shows 100% packet loss.
I connected the other 4 to my dev master to test before enabling in prod. Unfortunately I cannot back up prod keys and copy staging keys to these slaves (sftp and scp do not work[1]), so steps like upload will fail. We can still monitor most of the other steps like hg_update, symbols, compile, etc. for now.
[1] https://wiki.mozilla.org/ReleaseEngineering/How_To/Set_Up_a_Freshly_Imaged_Slave
Reporter
Comment 8•10 years ago
Triggered a dozen builds from m-a, m-b, and m-c. Will check back in the morning.
Assignee
Comment 9•10 years ago
For 0003 I have just wiped the drive and am attempting another install, though it is behaving kind of oddly. I can't tell if it is a drive issue or an overall slowness between it and WDS1. I will check back on it in the morning.
Perhaps we should pick another machine for this, since 0003 is seeming odd at the moment.
Sftp should work. What is the message when it fails?
Assignee
Comment 10•10 years ago
0003 is still out. For some reason the installation will not complete on this machine. I am looking into it this morning.
Reporter
Comment 11•10 years ago
Hmm, ok. I'll ignore 0003 for the moment and unlock it from my master.
Here are the commands I tried for copying keys:
cltbld@B-2008-IX-0001 ~
$ "C:\mozilla-build\msys\bin\scp" -o 'StrictHostKeyChecking no' -o 'BatchMode=no' -r cltbld@b-linux64-hp-0020.build.mozilla.org:~/.ssh .ssh
ssh: connect to host b-linux64-hp-0020.build.mozilla.org port 22: Bad file number
cltbld@B-2008-IX-0001 ~
$ sftp b-2008-ix-0183.build.mozilla.org:.ssh/* .ssh/
Connecting to b-2008-ix-0183.build.mozilla.org...
ssh: connect to host b-2008-ix-0183.build.mozilla.org port 22: Bad file number
Connection closed
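For what it's worth, "Bad file number" from the msys ssh tools generally means the TCP connection itself never opened (e.g. a firewall or VLAN ACL dropping port 22), not an authentication failure. A minimal sketch of a probe that separates "port blocked" from "port reachable"; the local throwaway listener exists only to make the sketch self-contained, and in practice you would point it at e.g. b-linux64-hp-0020.build.mozilla.org:

```python
import socket
import socketserver
import threading

def port_open(host, port=22, timeout=3):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Self-contained demo against a throwaway local listener; substitute the
# real build-pool hostname and port 22 when actually diagnosing.
srv = socketserver.TCPServer(("127.0.0.1", 0), socketserver.BaseRequestHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
print(port_open("127.0.0.1", srv.server_address[1]))  # True: listener is up
srv.shutdown()
srv.server_close()
```

If this returns False against the target host while ping succeeds, the VLAN firewall theory from comment 13 is the likely culprit.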
Reporter
Comment 12•10 years ago
Looks like all my builds failed overnight. I think this was my master's fault; they failed at the tooltool step, e.g.:
sh: c:/builds/moz2_slave/m-beta-w32-d-00000000000000000/tools/scripts/tooltool/tooltool_wrapper.sh: No such file or directory
The reason it doesn't exist is that the tools repo being cloned is very old:
'hg' 'clone' 'https://hg.mozilla.org/users/stage-ffxbld/tools' 'tools'
I will figure out why and point these builds to the correct repo.
Assignee
Comment 13•10 years ago
There seems to be an issue with ssh from within the try and build vlans to machines in the try vlan, which is causing the sftp failures. Do we know if this is new behavior?
Assignee
Comment 14•10 years ago
0003 is back in play. Following removal of its account from the domain, a new image deployment completed successfully.
Reporter
Comment 15•10 years ago
sftp will work if in the same vlan; however, there were no win staging keys available in the win staging build pool.
I hand-made keys and copied them across, connected all 5 to my master, fixed the tools error, and triggered a dozen builds.
Reporter
Comment 16•10 years ago
I *think* jobs against these slaves using vs2010 are looking good.
I've just triggered a bunch of vs2013 jobs against win32 and win64 to see how Update 3 fares. Will post findings later today for both vs2010 and vs2013.
Reporter
Comment 17•10 years ago
some results:
WINNT 5.2 mozilla-beta build:
'hg_update': ['37 mins, 46 secs', '18 mins, 31 secs', '19 mins, 4 secs', '37 mins, 39 secs', '18 mins, 14 secs ']
'compile': ['3 hrs, 52 mins, 36 secs', '3 hrs, 51 mins, 8 secs', '3 hrs, 51 mins, 8 secs', '3 hrs, 52 mins, 6 secs', '3 hrs, 53 mins, 36 secs', '3 hrs, 48 mins, 39 secs']
'buildsymbols': ['6 mins, 54 secs', ' 7 mins, 1 secs', '6 mins, 55 secs', '6 mins, 57 secs', '6 mins, 56 secs']
'package-tests': ['9 mins, 48 secs', '9 mins, 43 secs', '9 mins, 46 secs', '9 mins, 49 secs', '9 mins, 40 secs']
WINNT 5.2 mozilla-beta leak test build:
'hg_update': ['37 mins, 29 secs']
'compile': ['1 hrs, 11 mins, 25 secs']
'buildsymbols': ['9 mins, 13 secs']
'package-tests': ['9 mins, 28 secs']
WINNT 5.2 mozilla-aurora nightly:
'hg_update': ['19 mins, 57 secs', '18 mins, 31 secs']
'compile': ['4 hrs, 45 secs', '7 mins, 25 secs', '4 hrs, 44 secs']
'buildsymbols': ['7 mins, 25 secs', '7 mins, 25 secs']
WINNT 5.2 mozilla-central nightly:
'hg_update': ['18 mins, 46 secs', '20 mins, 44 secs', '33 mins, 7 secs', '18 mins, 48 secs']
'compile': ['3 hrs, 12 secs', ' 3 hrs, 34 secs', '3 hrs, 2 secs', '3 hrs, 22 secs']
'buildsymbols': ['6 mins, 48 secs', ' 7 mins, 26 secs', '6 mins, 55 secs', '6 mins, 55 secs']
I think we are good to put these in production. All the steps bar hg_update were very consistent across the branches I tested. I think with hg and the network things may vary, which might explain the two jobs that took ~35 min to clone instead of the usual 18. Either way, it doesn't appear that we have bg procs or more than one step taking longer than usual.
markco, when I put staging keys on these machines they seem to have been overwritten with prod keys again. Is there GPO logic to do so?
dmajor, I also ran the following builds with vs2013 Update 3. They all seemed to pass. Do you want logs for any of them? If so, I can upload them somewhere public:
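The step-time consistency above can also be checked mechanically. A small sketch of a parser for the "H hrs, M mins, S secs" strings; the sample values are taken from the mozilla-beta compile row above:

```python
import re

def to_seconds(s):
    """Parse a duration like '3 hrs, 52 mins, 36 secs' into total seconds."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(hrs?|mins?|secs?)", s):
        total += int(amount) * {"hr": 3600, "min": 60, "sec": 1}[unit.rstrip("s")]
    return total

# Sample compile times from the WINNT 5.2 mozilla-beta build results above.
compile_times = ["3 hrs, 52 mins, 36 secs", "3 hrs, 51 mins, 8 secs",
                 "3 hrs, 53 mins, 36 secs", "3 hrs, 48 mins, 39 secs"]
secs = [to_seconds(t) for t in compile_times]
print(f"spread: {max(secs) - min(secs)} secs")  # a few minutes across machines
```

A spread of a few minutes on a nearly four-hour compile supports the "consistent across machines" read; only hg_update shows the ~2x outliers.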
WINNT 5.2 mozilla-central leak test build
WINNT 6.1 x86-64 mozilla-central build
WINNT 6.1 x86-64 mozilla-central nightly (pgo)
WINNT 5.2 mozilla-central nightly (pgo)
Flags: needinfo?(mcornmesser)
Flags: needinfo?(dmajor)
Comment 18•10 years ago
(In reply to Jordan Lund (:jlund) from comment #17)
> dmajor I also ran the following builds with vs2013 update 3. they all seemed
> to pass. do u want logs for any of them? If so, I can upload them somewhere
If you could crack open one of each flavor and make sure it contains "18.00.30723", that's all I need to feel comfortable going ahead.
Flags: needinfo?(dmajor)
Assignee
Comment 19•10 years ago
I'll look into it. Most likely I will apply item-level targeting based on the machine name to prevent any keys from being copied over.
Flags: needinfo?(mcornmesser)
Assignee
Comment 20•10 years ago
Item level targeting is in place.
Jlund: Could you copy over the keys to a machine and see if they stick this time?
Reporter
Comment 21•10 years ago
(In reply to David Major [:dmajor] from comment #18)
> (In reply to Jordan Lund (:jlund) from comment #17)
> > dmajor I also ran the following builds with vs2013 update 3. they all seemed
> > to pass. do u want logs for any of them? If so, I can upload them somewhere
> If you could crack open one of each flavor and make sure it contains
> "18.00.30723", that's all I need to feel comfortable going ahead.
I see the following in each log:
"Microsoft (R) C/C++ Optimizing Compiler Version 18.00.30723 for x86"
Comment 22•10 years ago
Is x86 in all the logs? The x86-64 ones ought to say "18.00.30723 for x64"
Reporter
Comment 23•10 years ago
(In reply to David Major [:dmajor] from comment #22)
> Is x86 in all the logs? The x86-64 ones ought to say "18.00.30723 for x64"
My mistake; it says x64 for:
WINNT 6.1 x86-64 mozilla-central build
WINNT 6.1 x86-64 mozilla-central nightly (pgo)
The B-2008-IX-000{1-5} machines have been clobbered of vs2013 builds, confirmed to have prod keys, and enabled back into production based on the positive results from comment 17.
I left notes on slavealloc and am keeping this bug open while they are in the wild.
Reporter
Comment 24•10 years ago
These 5 machines are also going to test new versions of hg and NSIS as part of the image.
markco will update with the times when these changes were applied.
Assignee
Comment 25•10 years ago
The Mercurial 2.9.2 GPO is now going out to these machines. Due to failures, the NSIS GPO is no longer being pushed and is being removed.
Assignee
Updated•10 years ago
Assignee
Comment 26•10 years ago
0118, 0135, 0136, and 0139 have been re-imaged and re-enabled.
0144 through 0158 have been disabled.
Note that 0159 and 0160 seem not to exist.
Assignee
Comment 27•10 years ago
0144 through 0158 have been re-imaged and re-enabled, with the exception of 0150, which still has a build going.
0160 through 0184 have been disabled.
Assignee
Comment 28•10 years ago
Correction: not 0160 but 0161.
Assignee
Comment 29•10 years ago
0144 through 0184 have been completed and re-enabled, except for 0180, which is currently unreachable.
0180 and 0038 are the only two machines left. See bug 1078350.
Depends on: 1078350
Comment 30•10 years ago
Looks like 0103 and 0002 are either still waiting for a reimage, or still waiting for a reenable.
Assignee
Comment 31•10 years ago
0103 and 0002 are now re-enabled.
Comment 32•10 years ago
Are we done here? Can this be closed?
Assignee
Comment 33•10 years ago
This bug is finished. However, the NSIS issue is still open and is covered in a different bug.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard