
Install vs2013 Update 3, NSIS 3.0a2, and Mercurial 2.9.2 on 5 machines for testing

RESOLVED FIXED

Status

Release Engineering
Platform Support
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: jlund, Assigned: markco)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Updated

3 years ago
Blocks: 963197
(Reporter)

Updated

3 years ago
Blocks: 1063022
(Reporter)

Updated

3 years ago
Blocks: 1063024
(Reporter)

Updated

3 years ago
Blocks: 1063025
(Reporter)

Updated

3 years ago
Blocks: 1063027
(Reporter)

Comment 1

3 years ago
I've grabbed b-2008-ix-000[1-5]. b-2008-ix-0001 and b-2008-ix-0004 were not running a job, so they are fully disabled and ready to be worked with. Once 2, 3, and 5 finish their current job and reboot, they will be ready too.

I also have a dev staging machine, ix-mn-w0864-001, that we can use to create the image. It is ready to be worked with at any time too.
Blocks: 1014086
(Reporter)

Comment 2

3 years ago
FTR - b-2008-ix-0002, b-2008-ix-0003, and b-2008-ix-0005 are fully disabled and ready to be worked on
(Assignee)

Updated

3 years ago
Assignee: nobody → mcornmesser
(Assignee)

Updated

3 years ago
Depends on: 1063372
(Assignee)

Comment 3

3 years ago
The 5 machines are now being installed with VS 2013 update 3. 

The git, nsis, and VS 2013 GPOs will not apply to these machines. 

I will check back in the morning to make sure the machines have completed the install.
(Reporter)

Comment 4

3 years ago
(In reply to Mark Cornmesser [:markco] from comment #3)
> The 5 machines are now being installed with VS 2013 update 3. 

\o/
(Assignee)

Comment 5

3 years ago
0001 and 0003 did not seem to complete correctly. I am looking into those 2 machines now.
(Assignee)

Comment 6

3 years ago
These machines are good to go now. 

They got into an odd state where the deployment did not complete the install. After the drives were wiped, the installation completed.
(Reporter)

Comment 7

3 years ago
sweet. I can't connect to 0003; ping shows 100% packet loss.

I connected the other 4 to my dev master to test before enabling in prod. unfortunately I cannot back up the prod keys and copy staging keys to these slaves (sftp and scp do not work[1]), so steps like upload will fail. we can still monitor most of the other steps like hg_update, symbols, compile, etc. for now.

[1] https://wiki.mozilla.org/ReleaseEngineering/How_To/Set_Up_a_Freshly_Imaged_Slave
(Reporter)

Comment 8

3 years ago
triggered a dozen builds from m-a, m-b, and m-c. will check back in the morning
(Assignee)

Comment 9

3 years ago
For 0003 I have just wiped the drive and am attempting another install, though it is behaving kind of oddly. I can't tell if it is a drive issue or overall slowness between it and WDS1. I will check back on it in the morning.

Perhaps we should pick another machine for this, since 0003 is acting odd at the moment.

Sftp should work. What is the message when it fails?
(Assignee)

Comment 10

3 years ago
0003 is still out. For some reason the installation will not complete on this machine. I am looking into it this morning.
(Reporter)

Comment 11

3 years ago
hmm, ok I'll ignore 003 for the moment and unlock it from my master.

here are the commands I tried for copying keys:

cltbld@B-2008-IX-0001 ~
$ "C:\mozilla-build\msys\bin\scp" -o 'StrictHostKeyChecking no' -o 'BatchMode=no' -r  cltbld@b-linux64-hp-0020.build.mozilla.org:~/.ssh .ssh
ssh: connect to host b-linux64-hp-0020.build.mozilla.org port 22: Bad file number
 
cltbld@B-2008-IX-0001 ~
$ sftp b-2008-ix-0183.build.mozilla.org:.ssh/* .ssh/
Connecting to b-2008-ix-0183.build.mozilla.org...
ssh: connect to host b-2008-ix-0183.build.mozilla.org port 22: Bad file number
Connection closed
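
For reference, "Bad file number" from the msys ssh client usually just means the TCP connection to port 22 never opened (dropped rather than refused). A rough check, assuming python is available on the slave (e.g. from mozilla-build), would be something like:

$ python -c "import socket; s = socket.create_connection(('b-linux64-hp-0020.build.mozilla.org', 22), 5); print(s.recv(64)); s.close()"

If that prints an SSH banner the port is reachable and the problem is higher up; if it times out, the traffic is being dropped somewhere between the vlans.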
(Reporter)

Comment 12

3 years ago
looks like all my builds failed overnight. I think this was my master's fault. they failed at the tooltool step, e.g.:
sh: c:/builds/moz2_slave/m-beta-w32-d-00000000000000000/tools/scripts/tooltool/tooltool_wrapper.sh: No such file or directory

the reason it doesn't exist is that the tools repo being cloned is very old:
'hg' 'clone' 'https://hg.mozilla.org/users/stage-ffxbld/tools' 'tools'

will figure out why and point these builds to the correct repo
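
As a sanity check, assuming the canonical location is https://hg.mozilla.org/build/tools (rather than the stale user clone above), a fresh clone on a slave should contain the wrapper:

$ hg clone https://hg.mozilla.org/build/tools tools
$ ls tools/scripts/tooltool/tooltool_wrapper.sh

If the wrapper shows up there, repointing the builds at that repo should clear the "No such file or directory" failure.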
(Assignee)

Comment 13

3 years ago
There seems to be an issue with ssh from within the try and build vlans to machines in the try vlan, which is causing the sftp failures. Do we know if this is new behavior?
(Assignee)

Comment 14

3 years ago
0003 is back in play. Following removal of its account from the domain, a new image deployment completed successfully.
(Reporter)

Comment 15

3 years ago
sftp will work if within the same vlan; however, there were no win staging keys available in the win staging build pool.

hand-made keys and copied them across. connected all 5 to my master. fixed the tools error. triggered a dozen builds
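
for the record, a minimal sketch of what hand-making and copying keys looks like from the msys shell; the key type, file name, and target slave below are placeholders, not the exact staging setup:

$ ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ scp -o 'StrictHostKeyChecking no' ~/.ssh/id_rsa ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys cltbld@b-2008-ix-0002.build.mozilla.org:.ssh/

(this only works from within the same vlan, per the above)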
(Reporter)

Comment 16

3 years ago
I *think* jobs against these slaves using vs2010 are looking good.

I've just triggered a bunch of vs2013 jobs against win32 and win64 to see how update 3 fares. will post findings later today for both vs2010 and vs2013
(Reporter)

Comment 17

3 years ago
some results:

WINNT 5.2 mozilla-beta build:
    'hg_update': ['37 mins, 46 secs', '18 mins, 31 secs', '19 mins, 4 secs', '37 mins, 39 secs', '18 mins, 14 secs ']
    'compile': ['3 hrs, 52 mins, 36 secs', '3 hrs, 51 mins, 8 secs', '3 hrs, 51 mins, 8 secs', '3 hrs, 52 mins, 6 secs', '3 hrs, 53 mins, 36 secs', '3 hrs, 48 mins, 39 secs']
    'buildsymbols': ['6 mins, 54 secs', ' 7 mins, 1 secs', '6 mins, 55 secs', '6 mins, 57 secs', '6 mins, 56 secs']
    'package-tests': ['9 mins, 48 secs', '9 mins, 43 secs', '9 mins, 46 secs', '9 mins, 49 secs', '9 mins, 40 secs']

WINNT 5.2 mozilla-beta leak test build:
    'hg_update': ['37 mins, 29 secs']
    'compile': ['1 hrs, 11 mins, 25 secs']
    'buildsymbols': ['9 mins, 13 secs']
    'package-tests': ['9 mins, 28 secs']

WINNT 5.2 mozilla-aurora nightly:
    'hg_update': ['19 mins, 57 secs', '18 mins, 31 secs']
    'compile': ['4 hrs, 45 secs', '7 mins, 25 secs', '4 hrs, 44 secs']
    'buildsymbols': ['7 mins, 25 secs', '7 mins, 25 secs'] 

WINNT 5.2 mozilla-central nightly:
    'hg_update': ['18 mins, 46 secs', '20 mins, 44 secs', '33 mins, 7 secs', '18 mins, 48 secs']
    'compile': ['3 hrs, 12 secs', ' 3 hrs, 34 secs', '3 hrs, 2 secs', '3 hrs, 22 secs']
    'buildsymbols': ['6 mins, 48 secs', ' 7 mins, 26 secs', '6 mins, 55 secs', '6 mins, 55 secs'] 


I think we are good to put these in production. all the steps bar hg update were very consistent across the branches I tested. I think with hg and the network, things may vary, so that might explain the two jobs that took ~35 min to clone instead of the usual 18. either way, it doesn't appear that we have background procs or more than one step taking longer than usual.

markco, when I put staging keys on these machines they seem to have been overwritten with prod keys again. is there GPO logic to do so?

dmajor I also ran the following builds with vs2013 update 3. they all seemed to pass. do you want logs for any of them? If so, I can upload them somewhere public:
WINNT 5.2 mozilla-central leak test build
WINNT 6.1 x86-64 mozilla-central build
WINNT 6.1 x86-64 mozilla-central nightly (pgo)
WINNT 5.2 mozilla-central nightly (pgo)
Flags: needinfo?(mcornmesser)
Flags: needinfo?(dmajor)
Comment 18

3 years ago
(In reply to Jordan Lund (:jlund) from comment #17)
> dmajor I also ran the following builds with vs2013 update 3. they all seemed
> to pass. do you want logs for any of them? If so, I can upload them somewhere
If you could crack open one of each flavor and make sure it contains "18.00.30723", that's all I need to feel comfortable going ahead.
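
A quick way to do that spot check from a shell, assuming the logs from those builds are saved locally (the file names here are placeholders):

$ grep -m1 "18.00.30723" winnt52-central-nightly.log
$ grep -m1 "18.00.30723" winnt61-x64-central-build.log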
Flags: needinfo?(dmajor)
(Assignee)

Comment 19

3 years ago
I'll look into it. Most likely I will apply item-level targeting based on the machine name to prevent any keys from being copied over.
Flags: needinfo?(mcornmesser)
(Assignee)

Comment 20

3 years ago
Item level targeting is in place. 

Jlund: Could you copy over the keys to a machine and see if they stick this time?
(Reporter)

Comment 21

3 years ago
(In reply to David Major [:dmajor] from comment #18)
> (In reply to Jordan Lund (:jlund) from comment #17)
> > dmajor I also ran the following builds with vs2013 update 3. they all seemed
> > to pass. do you want logs for any of them? If so, I can upload them somewhere
> If you could crack open one of each flavor and make sure it contains
> "18.00.30723", that's all I need to feel comfortable going ahead.

I see the following in each log:
"Microsoft (R) C/C++ Optimizing Compiler Version 18.00.30723 for x86"
Comment 22

3 years ago
Is x86 in all the logs? The x86-64 ones ought to say "18.00.30723 for x64"
(Reporter)

Comment 23

3 years ago
(In reply to David Major [:dmajor] from comment #22)
> Is x86 in all the logs? The x86-64 ones ought to say "18.00.30723 for x64"

my mistake. it says x64 for:
WINNT 6.1 x86-64 mozilla-central build
WINNT 6.1 x86-64 mozilla-central nightly (pgo)

B-2008-IX-000{1-5} machines have been clobbered of vs2013 builds, asserted to have prod keys, and enabled back into production based on positive results from comment 17

left notes on slavealloc and keeping this bug open while they are in the wild
(Reporter)

Updated

3 years ago
Blocks: 1068922
(Reporter)

Comment 24

3 years ago
these 5 machines are also going to test new versions of hg and NSIS as part of the image.

markco will update with times of when these changes were applied.
Blocks: 989531, 1056981
Summary: Install vs2013 Update 3 on 5 machines for testing → Install vs2013 Update 3, NSIS 3.0a2, and Mercurial 2.9.2 on 5 machines for testing
(Assignee)

Comment 25

3 years ago
The Mercurial 2.9.2 GPO is now going out to these machines. Due to failures, the nsis GPO is no longer being pushed and is being removed.
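
A quick spot check on one of the slaves that the GPO delivered the expected version (the output line is from memory and may differ slightly depending on the installer):

$ hg --version
Mercurial Distributed SCM (version 2.9.2)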
(Assignee)

Updated

3 years ago
No longer blocks: 1056981
Depends on: 1056981
(Assignee)

Comment 26

3 years ago
0118, 0135, 0136, and 0139 have been re-imaged and re-enabled. 

0144 through 0158 have been disabled.

Note that 0159 and 0160 seem not to exist.
(Assignee)

Comment 27

3 years ago
0144 through 0158 have been re-imaged and re-enabled, with the exception of 0150, which still has a build going.

0160 through 0184 have been disabled.
(Assignee)

Comment 28

3 years ago
Correction: not 0160 but 0161.
(Assignee)

Comment 29

3 years ago
0144 through 0184 have been completed and re-enabled, except for 0180, which is currently unreachable.

0180 and 0038 are the only two machines left. See bug 1078350.
Depends on: 1078350
(Assignee)

Updated

3 years ago
No longer depends on: 1056981
Comment 30

3 years ago
Looks like 0103 and 0002 are either still waiting for a reimage or still waiting for a reenable.
(Assignee)

Comment 31

3 years ago
0103 and 0002 are now re-enabled.
No longer blocks: 1062877
Are we done here? Can this be closed?
(Assignee)

Comment 33

2 years ago
This bug is finished. However, nsis is still an open issue, but that is covered in a different bug.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED