Closed Bug 1358307 Opened 7 years ago Closed 7 years ago

deploy 4 w10 buildbot workers on ix hardware

Categories

(Infrastructure & Operations :: RelOps: General, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Assigned: q)

References

Details

Reimage the following machines with the deployment method defined in bug 1358306. 

t-w1064-ix-0003.wintest.releng.scl3.mozilla.com
t-w1064-ix-0004.wintest.releng.scl3.mozilla.com
t-w1064-ix-0005.wintest.releng.scl3.mozilla.com

Please also ensure VPN access for, and give (newly generated) cltbld credentials to:
jmaher
rwood
armenzg
Unable to test deploys due to the IPMI console not attaching for t-w1064-ix-001-3. Trying to sort that out now.
Because of the issues Q was having with IPMI and debugging the deployment, I also appropriated t-w864-ix-033. https://inventory.mozilla.org/en-US/systems/show/8480/
that's now t-w1064-ix-0006.wintest.releng.scl3.mozilla.com.
Summary: deploy 2 w10 buildbot workers on ix hardware → deploy 4 w10 buildbot workers on ix hardware
The following users were already in the vpn_releng_loan group:
jmaher
rwood
armenzg

I added the IPs for the 4 systems so that they'll be accessible as loaners via the VPN when the deployment is done.
Updated puppet to account for t-w864-ix-033 no longer being in DNS, which was causing a template compilation error.
t-w1064-ix-003 is done and good to go.

t-w1064-ix-004 and t-w1064-ix-005 are almost done imaging and should be up by 5/2/2017 1800
0006 is also available and good to go
t-w1064-ix-005 has a reverse DNS issue blocking its deploy.
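For reference, a reverse-DNS deploy blocker usually means the in-addr.arpa zone is missing the PTR record for the host's IP. A minimal sketch of the PTR name the zone must define (the IP below is hypothetical; the real address is in inventory):

```python
import ipaddress

def ptr_name(ip: str) -> str:
    """Return the PTR record name the reverse zone must define for this IP."""
    return ipaddress.ip_address(ip).reverse_pointer

# Hypothetical address standing in for t-w1064-ix-005:
print(ptr_name("10.26.40.105"))  # -> 105.40.26.10.in-addr.arpa
```

If that name doesn't resolve back to the t-w1064-ix-005 hostname, the deploy tooling's reverse lookup fails.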

t-w1064-ix-003
t-w1064-ix-004
t-w1064-ix-006

Are usable as of tonight
Disabled t-w1064-ix-003 in slavealloc as it was banging on buildbot-master109's door.
The VNC password is set to the standard loaner creds, and cltbld is set to auto-login with admin rights, so folks should be able to log in via VNC and do everything needed.
IIUC, we're using nVidia driver 335.23 from March of 2014 for these systems? I'm wondering if we could maybe pick a newer version to use - maybe one that's not older than the OS it's being installed on? I'm impressed that one even installs! We could probably dig through Telemetry data to make an informed guess on a better version to use if need be.
As a note, all talos jobs run with this current setup. I did test the mochitest-clipboard job and it failed, although that could be related to VNC-ing into the box; clipboard is out of scope for this immediate project.

On Monday I will test the new mitmproxy/quantum test.
If we're just running talos tests, do we care about the nvidia driver? This is a stopgap measure for talos while taskcluster is being finished. Does mitmproxy require graphics?
This is a stopgap, but many of our talos tests are graphics tests. I believe this is a stopgap until September (new hardware)? If so, we should look into a more recent video driver for the card we have; possibly I am unaware of the timeline for getting the new hardware installed and running?
I doubt that production taskcluster jobs will be running for w10 before sept, so that's a good bet. If you'd like an upgrade, please add this to the requirements spreadsheet and specify which driver version you want us to attempt to install. Are there any other requirements that have not yet made it into the spreadsheet?
I can't access that spreadsheet, but according to our own Telemetry data, version 376.53 is the most common driver version installed for that adapter in our Win10 user base. It's actually #1 overall for all nVidia users, too. That's not surprising since it's supplied from Windows Update. Unfortunately, I'm not sure how to find an installer for that to use in our environment...
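Picking the modal driver version out of Telemetry-style data can be sketched roughly as below; the record shape is hypothetical (real Telemetry pings are structured differently), but the aggregation is the same idea:

```python
from collections import Counter

# Hypothetical (adapter, driver_version) pairs extracted from graphics pings.
pings = [
    ("NVIDIA GeForce GT 610", "376.53"),
    ("NVIDIA GeForce GT 610", "376.53"),
    ("NVIDIA GeForce GT 610", "335.23"),
]

# Count driver versions seen on NVIDIA adapters and take the most common one.
counts = Counter(version for adapter, version in pings
                 if adapter.startswith("NVIDIA"))
best_version, seen = counts.most_common(1)[0]
print(best_version)  # -> 376.53
```

The same tally over the real Win10 population is what suggests 376.53 as the candidate version above.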
Blocks: 1362407
These were deployed on 2017-05-02
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED