Closed Bug 1358306 Opened 7 years ago Closed 7 years ago

create w10 wds/gpo/buildbot deployment method

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Assigned: q)

Attachments

(2 files)

Create a deployment method for w10 wds/gpo/buildbot according to the specifications at https://docs.google.com/spreadsheets/d/1BKoXk5K9zySti5CJlQ4blVHDIajfmBiRJQz6HcCS3M4/edit?ts=58f91112#gid=0
Blocks: 1358307
Have a new base capture with updated Windows 10 patches. Working on a deployable image now, with a roll-in for the replay drivers.
I will update tomorrow with data about web-page-replay, there might not be drivers needed, or we might not be able to get it working.  Either way, I would like to unblock you to finish off this image.
The inf will not block us, so it should be in the system in case you need it.

However, yesterday went down in flames. It appears I need to make 100% sure Windows 10 updates are turned off before trying a capture, because if an update is running it will screw up the install. I am working on that now; as soon as that hurdle is cleared I will try a new deploy today.
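One possible way to script the "updates off before capture" step is sketched below. It is an assumption, not the method used in this bug: it stops the Windows Update service (`wuauserv`) and disables its startup with `sc.exe`. The helper names are hypothetical, and Windows 10 may re-enable the service by other means, so treat this as a starting point only.

```python
import subprocess

# Assumption: the Windows Update service is the thing to switch off.
UPDATE_SERVICE = "wuauserv"


def update_disable_commands(service=UPDATE_SERVICE):
    """Return the sc.exe commands that stop a service and disable its startup."""
    return [
        ["sc", "stop", service],
        # sc.exe expects "start=" and its value as separate arguments.
        ["sc", "config", service, "start=", "disabled"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # "sc stop" returns non-zero if the service is already stopped,
            # so failures are not treated as fatal here.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    run_commands(update_disable_commands())
```

The default is a dry run, so the commands can be reviewed before being run on a reference machine from an elevated prompt.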
In the Home edition there's no way to disable updates 100%, only down to critical updates.

There's also the setting (win10 home edition) of serving downloaded updates to the internet (by default) and to other machines in the local network (only other option). Not sure if pro allows you to remove that as well. I hope so!
I recall setting my network as not "at home" and requiring updates only "at home".
Looking at web-page-replay and mitmproxy, we do not see a need to install a custom device driver or other software. The main thing we would like to see installed is python 3.5 (in addition to the python 2.7 that is installed). I think doing that would meet our core needs, and any further changes would be minimal if not none.
So quick summary of the last two days:

Build 1703 has been a pain.

Build 1703 on our original LTSB install worked.

After checking my notes I have found some tests that won't work with LTSB.

Moved the base to Enterprise w/ 1703 slip-streamed in and captured.

Deployments were stalled due to IPMI issues.

After getting a workaround I started testing deployments. The deployments got bitten by the "just a moment" bug after sysprep. Was able to work around that by making reg changes during install. Now hitting the OEM "defaultuser0" issue in which all logins are removed but the use user and the password is unset. I am testing fixes for that now. Will check results in the morning.
Looking for an update as to final state of this before Monday morning. If we were unable to finish the automation due to problems with the tools, please do as much of the deployment as you can in an automated fashion then finish up 3 of the machines manually so that we can unblock PI on greening tests.
Flags: needinfo?(q)
Update.

Started trying to implement Windows 10 1703. Found that reboots caused a system hang during deployment.

As a control, tried the original LTSB build and did not see the problem on deployment. Went back to the base build process and found that sysprep changed in build 1511. Tried a 1511 build and did not have the issue.


Additional changes in 1607 caused problems even on new out-of-the-box systems that had been sysprepped with oobe (google the "just a moment" hang for complaints from many, many users). Additional issues came from build 1703. Both editions show the hang issues during my testing.

These extra issues forced me to update the MDT tool-bench to 2013 U2. I also had to update the ADK to the newest version (which officially supports only 1607); this cleared up some minor issues. However, sysprep with oobe still causes the "just a moment" hang on deployment.

I have been attempting to do a reference capture image in audit mode, which skips oobe. However, this has slowed down the create-wim process that does the capture.
Flags: needinfo?(q)
Since sysprep causes the issue, I abandoned the build step. I have an install that uses the unaltered WIM and does a longer series of installs. This is a bit of a hack and requires two pieces of manual interaction to roll out machines.
Is your plan to install 3 of the machines with this method to unblock PI and then continue to hack at the automation, or is this the end state and we'll just have an extra couple steps when reimaging the 80 some machines intended for this test?
Flags: needinfo?(q)
I am hoping to get three machines at 100% functionality to unblock tests. I will then go back and reconstruct the share for windows 10. 

As of now I have the systems installing without the hangs. I have two install errors I am working on. If I can't do them in the steps I am working on now, I will write some batch files or GPO patches as a temp workaround.
Flags: needinfo?(q)
I found a few more steps that need refactoring (the apache server install and the pywin install). It looks like path appends are a bit different in this setup. I will update again in the morning with an ETA.
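The comment above does not say how path appends differ in this setup, so the following is only a general sketch of an idempotent PATH append that an install step could use. The function name is hypothetical; it compares entries case-insensitively and ignores trailing backslashes, which is how Windows treats them.

```python
def append_to_path(path_value, new_dir, sep=";"):
    """Append new_dir to a PATH-style string unless it is already present.

    Entries are compared case-insensitively and without trailing
    backslashes; existing entries are returned unchanged.
    """
    def normalize(entry):
        return entry.strip().rstrip("\\").lower()

    # Drop empty entries left behind by doubled or trailing separators.
    entries = [e for e in path_value.split(sep) if e.strip()]
    if normalize(new_dir) in {normalize(e) for e in entries}:
        return sep.join(entries)
    return sep.join(entries + [new_dir])


if __name__ == "__main__":
    # Example directories are illustrative only.
    print(append_to_path(r"C:\Windows;C:\Python27", r"C:\Python27\Scripts"))
```

Running the append twice with the same directory leaves the value unchanged, which avoids PATH growing on every re-run of an install step.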
Talos doesn't require apache, in the past we had required it, but moved away from it.  Possibly skip that and save one of your problems?  pywin is needed though.
Thanks Joel. I think I have it worked out on this pass, but if not I will ditch it.

Since we have manual steps, I currently need IPMI video access to these machines. I have filed Bug 1361128 for this.
Depends on: 1361128
We have a very hacked but working install. I need to test the python 3.5 installed on the OS and make sure it isn't causing problems. Is python 3.5 a hard need to have in the installer, or is this something that can be installed with tooltool?

Q
We have a very hacked but working install. I need to test the python 3.5 installed on the OS and make sure it isn't causing problems. Arr mentioned we had a need for python 3.5; is that a hard need to have in the installer, or is this something that can be installed with tooltool?

Q
Flags: needinfo?(jmaher)
We need to have python 3.5 installed. I am not sure if it is realistic to install with tooltool; that will need some testing. Ideally this is baked into the image, but that doesn't mean it is a hard requirement. I don't know if the installation method for python 3.5 is as easy as unpacking a .zip file; there might be a full .msi-style installer and maybe a reboot required.
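For reference, the python.org installer for 3.5 is an .exe that accepts unattended options such as `/quiet`, `InstallAllUsers`, `TargetDir` and `PrependPath`. A minimal sketch of building that command line is below. The installer filename and target directory are assumptions for illustration, and `PrependPath=0` is chosen on the assumption that python 2.7 should stay first on PATH.

```python
def python35_install_command(installer, target_dir, prepend_path=False):
    """Build an unattended command line for the python.org Windows installer."""
    return [
        installer,
        "/quiet",                      # no UI
        "InstallAllUsers=1",           # system-wide install
        "TargetDir={}".format(target_dir),
        "PrependPath={}".format(1 if prepend_path else 0),
    ]


if __name__ == "__main__":
    # Hypothetical installer name and target directory.
    cmd = python35_install_command("python-3.5.3-amd64.exe", r"C:\Python35")
    print(" ".join(cmd))
```

Whether this can run from tooltool or needs to be baked into the image is the open question in this thread; the sketch only shows that no interactive installer is needed.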
Flags: needinfo?(jmaher)
FYI: Armen was installing python 3.6 with tooltool in bug 1361462.
See Also: → 1361462
Q is estimating that the automation will be refactored by EOB 2017-05-15. That will include the updated graphics driver and automation to install VAC (see bug 1363460) and pywin32.
Depends on: 1363460
As a note, we can successfully run our tools on the loaners by installing python 3.6 from tooltool. I don't believe there are firewall exceptions.

I think the only remaining item is verifying the final image with VAC (or lack thereof). I suspect we might have added firewall rules to machines by manually clicking OK, so that is something I would like to check on a fresh final image.
We have VAC working on an auto install. I am getting the automatic gfx driver install ironed out. The new version creates a popup that will need to be killed as it will interrupt tests.

Looks like the new Windows build will require extra steps to kill the dial-home telemetry, but that should be doable in GPO and not hold up imaging.
I think just adding some default firewall rules in would be the icing on the cake (bug 1360338)
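A default firewall rule like the ones suggested above could be added during install with `netsh advfirewall firewall add rule`. The sketch below only builds the command; the rule name and program path are hypothetical, and the actual rules wanted are tracked in bug 1360338, not here.

```python
def firewall_allow_rule(name, program, direction="in"):
    """Build a netsh command that allows traffic for one program."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        "name={}".format(name),
        "dir={}".format(direction),
        "action=allow",
        "program={}".format(program),
        "enable=yes",
    ]


if __name__ == "__main__":
    # Hypothetical rule name and path, for illustration only.
    rule = firewall_allow_rule("talos-python", r"C:\mozilla-build\python\python.exe")
    print(" ".join(rule))
```

Pre-creating the rules this way would avoid the allow-access prompt that otherwise has to be clicked manually on a fresh image.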
added new rules.


New drivers are installing and not having popups.
VAC automation looks good after multiple images. 


Still have a few Cortana and "telemetry" bits to rip out.

With some quick changes to the installs, the install time is back to being closer to 15 to 20 minutes.
Re-imaging 003, 005, and 006 now.
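The thread does not record which settings were used to remove Cortana and telemetry. As an assumption, the sketch below uses the two machine policy values that normally back the corresponding group policies (`AllowTelemetry` and `AllowCortana`) and builds `reg add` commands for them. Note that `AllowTelemetry=0` is only honoured on Enterprise-class editions, which matches the Enterprise base described earlier.

```python
# Assumed policy values; not confirmed by this bug.
POLICIES = [
    (r"HKLM\SOFTWARE\Policies\Microsoft\Windows\DataCollection", "AllowTelemetry", 0),
    (r"HKLM\SOFTWARE\Policies\Microsoft\Windows\Windows Search", "AllowCortana", 0),
]


def reg_add_command(key, value_name, data):
    """Build a reg.exe command that writes one REG_DWORD policy value."""
    return [
        "reg", "add", key,
        "/v", value_name,
        "/t", "REG_DWORD",
        "/d", str(data),
        "/f",  # overwrite without prompting
    ]


if __name__ == "__main__":
    for key, name, data in POLICIES:
        print(" ".join(reg_add_command(key, name, data)))
```

Setting the same values through a GPO would be equivalent and is likely the more maintainable route once the deployment share is rebuilt.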
jmaher: did you have a chance to test these to verify that we have a finalized image we're ready to deploy to more hosts?
Flags: needinfo?(jmaher)
Attached file win10_aboutsupport.txt
I verified all of the tests yesterday.

Attached about:support from the device. :milan, can you double check this attachment to see if all looks well from the graphics point of view (remember these are the old machines we have running Windows 10, not the new machines).
Flags: needinfo?(jmaher) → needinfo?(milan)
It isn't accelerating graphics, so if that's OK for the "old" Windows 10, then we're fine.  Do we have graphics cards in the old configurations?
Flags: needinfo?(milan)
Sorry, let me answer this better.  I see there is an Nvidia 610 in the system, but it isn't getting picked up as the primary card, so we instead pick up "Microsoft Basic" (haven't seen 0x00ba vendor id before.)  I'd usually switch the preference in Nvidia control panel, for firefox.exe to use the discrete graphics, but I'm not sure how you do that on these systems.
Milan,

I am taking a look. We are getting the right resolution, which is what we test for. This may be a problem with the Microsoft-supplied cab version of the Nvidia driver instead of using the official Nvidia one.

Q
Are you using 005 to test? There seems to be something wrong with that machine; can you try 006?
Flags: needinfo?(milan)
let me get about:support for 006
ok, this should give us what we want.
(In reply to Joel Maher ( :jmaher) from comment #33)
> Created attachment 8870492 [details]
> updated about:support from normal graphics
> 
> ok, this should give us what we want.

That looks good.
Flags: needinfo?(milan)
jmaher: are we good to close this bug out?
Flags: needinfo?(jmaher)
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED