Closed
Bug 1416867
Opened 8 years ago
Closed 8 years ago
Stand up 30 VMs each of w7 and w10 on moonshots
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: fubar, Assigned: markco)
I don't think we filed a bug for w7/w10 VMs; dupe if we did!
We need 30 of each; w10 first, then w7.
Comment 1•8 years ago
while we are at it, can we also do:
5 machines native install (no xen)
2 machines setup as loaners that rwood and jmaher can access
I want to get ahead of the game!
| Reporter | ||
Comment 2•8 years ago
Mark, let me know which 2 VMs will be the loaners and I'll update VPN access for rwood. If you need a hand with the loaner setup, dhouse can help (:buildduty will likely be offline by then).
Please prioritize the 2 loaners and getting w7 up over trying any bare metal installs. We *do* want to see if they reduce noise, but they require network driver changes and possible vlan/fw changes, and I don't want that to delay w7. Q will have more details on what changes need to be made.
| Assignee | ||
Comment 3•8 years ago
t-w1064-xe-226 through t-w1064-xe-255 have been set up.
I will set up t-w1064-xe-254 and 255 as loaners.
What all is needed for the loaner?
VNC password change (No RDP access because it may have adverse effects on the graphic card setup)
Root password change
Disable the tasks involved in starting generic worker and picking up tests
Remove releng jumphost firewall restrictions.
Other items needed?
| Reporter | ||
Comment 4•8 years ago
(In reply to Mark Cornmesser [:markco] from comment #3)
> t-w1064-xe-226 through t-w1064-xe-255 have been set up.
>
> I will set up t-w1064-xe-254 and 255 as loaners.
Added to vpn_releng_loan for :rwood.
> What all is needed for the loaner?
>
> VNC password change (No RDP access because it may have adverse effects on
> the graphic card setup)
> Root password change
> Disable the tasks involved in starting generic worker and picking up tests
> Remove releng jumphost firewall restrictions.
> Other items needed?
Dave will work with you on these bits.
| Assignee | ||
Comment 5•8 years ago
254 and 255, 10.49.42.100 and 10.49.42.99, are set up for loaners with the releng loaner passwords.
| Assignee | ||
Comment 6•8 years ago
(In reply to Mark Cornmesser [:markco] from comment #5)
> 254 and 255, 10.49.42.100 and 10.49.42.99, are set up for loaners with the
> releng loaner passwords.
Just to note: instead of cltbld, the username is GenericWorker.
Comment 7•8 years ago
the win10 machines did not accept jobs when I pushed to try; are these not connected to taskcluster?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4ba20967f7529d5e08d4869c962a7b8452469721
Flags: needinfo?(mcornmesser)
| Assignee | ||
Comment 8•8 years ago
Looking at the task, it has "provisionerId: releng-hardware", which I was not able to get working in testing. I will jump on a machine, manually change that in the config file, and see what happens.
Flags: needinfo?(mcornmesser)
| Assignee | ||
Comment 9•8 years ago
I manually changed the file on 226 and it did not pick up a test. I am going to reinstall 227 with releng-hardware in the config file before the GenericWorker account is created. If that doesn't work I will NI pmoore.
| Assignee | ||
Comment 10•8 years ago
We end up with this error from the GenericWorker:
"Client with clientId 'project/releng/worker/releng-hardware/gecko-t-win10-64-hw' not found\n----\nmethod: claimWork\nerrorCode: AuthenticationFailed\nstatusCode: 401\ntime: 2017-11-15T19:01:06.495Z"
I am currently seeking help in #taskcluster.
If it is possible to force the scl3-puppet provisioner for tests, the other VMs will pick up and run tests.
| Assignee | ||
Comment 11•8 years ago
A new workerid and token were generated for gecko-t-win10-64-hw to work with the releng-hardware provisioner. I am kicking off another install on 227, and if it works I will reinstall the rest and redo the loaners.
| Assignee | ||
Comment 12•8 years ago
Just an update: with the changes in the generic worker config file, the JSON became malformed. I think I have sorted out the issues and am now attempting another install.
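A quick pre-install check along these lines could catch a malformed config before the worker starts. This is only a sketch: the function name `check_worker_config` is hypothetical, and the key names in `REQUIRED_KEYS` are assumptions about the generic worker config that should be checked against the version actually deployed.

```python
import json

# Assumed key names, for illustration only; verify against the deployed
# generic-worker version before relying on this list.
REQUIRED_KEYS = ("provisionerId", "workerType", "clientId", "accessToken")


def check_worker_config(text, expected_provisioner="releng-hardware"):
    """Return a list of problems found in a generic worker config string."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        # A hand edit that drops a comma or a quote ends up here.
        return [f"malformed JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(config, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    actual = config.get("provisionerId")
    if actual is not None and actual != expected_provisioner:
        problems.append(f"provisionerId is {actual!r}, expected {expected_provisioner!r}")
    return problems
```

An empty list means the file parsed and points at the expected provisioner; anything else is worth fixing before kicking off a reinstall.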
| Assignee | ||
Comment 13•8 years ago
226 through 255 are now reinstalling with a good GenericWorker config file pointed to the releng-hardware provisioner. They should be good to go within an hour or two.
I will set up 254 and 255 as loaners tomorrow.
Comment 14•8 years ago
one test is running but fails 2/2 times:
https://public-artifacts.taskcluster.net/JqLfWWSMSuyiO5h6kFDDhw/0/public/logs/live_backing.log
the failure is a 403 and I suspect we don't have firewall rules set up properly on these machines.
| Assignee | ||
Comment 15•8 years ago
I am able to jump on a machine and pull from the URL that is returning the 403 during the test. I will look through the logs and see if anything in the test environment may cause an issue.
| Assignee | ||
Comment 16•8 years ago
I am also reaching out in #releng for suggestions. The more I think about it, the less I think it is a firewall configuration issue: it is an HTTP response, so the machine is able to reach the site and the site is saying no.
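The distinction can be sketched in a few lines of Python. The helper names `classify_failure` and `probe` are made up for illustration, and the sketch assumes a blocked connection shows up as a connection error or timeout rather than as an HTTP reply (a filtering proxy that answers with its own 403 would not fit that assumption).

```python
import urllib.error
import urllib.request


def classify_failure(exc):
    """Say whether a failed fetch was refused by the server or never got there."""
    # HTTPError subclasses URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"server answered with HTTP {exc.code}"
    if isinstance(exc, (urllib.error.URLError, OSError)):
        return "no HTTP response (DNS, routing, firewall, or timeout)"
    return "unclassified error"


def probe(url, timeout=10):
    """Fetch url and describe the outcome in one line."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"server answered with HTTP {resp.status}"
    except Exception as exc:
        return classify_failure(exc)
```

Running `probe` against both a failing and a working package URL from the same host would show whether the refusal is per-file or per-machine.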
| Assignee | ||
Comment 17•8 years ago
It is a bit odd. The error is coming up here:
HTTP error 403 while getting http://pypi.pvt.build.mozilla.org/pub/mozsystemmonitor-0.3.tar.gz (from http://pypi.pvt.build.mozilla.org/pub/)
But other packages were downloaded successfully:
14:56:46 INFO - Downloading http://pypi.pvt.build.mozilla.org/pub/psutil-3.1.1-cp27-none-win32.whl (87kB)
14:56:47 INFO - Installing collected packages: psutil
14:56:47 INFO - Successfully installed psutil-3.1.1
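To see which downloads fail across a whole log rather than one excerpt, something like the following could be run over the saved log text. It is a rough sketch: the regular expression is modelled only on the single error line quoted above, and `failed_downloads` is a hypothetical helper name; other pip versions may word the message differently.

```python
import re

# Pattern based on the one error line quoted in this comment; treat it as an
# assumption about the log format, not a stable pip interface.
HTTP_ERROR = re.compile(r"HTTP error (\d{3}) while getting (\S+)")


def failed_downloads(log_text):
    """Return (status_code, url) pairs for each HTTP error line in a log."""
    return [(int(code), url) for code, url in HTTP_ERROR.findall(log_text)]
```

Comparing the failing URLs against the ones that downloaded fine may show whether the 403s follow a pattern such as file type or package name.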
| Assignee | ||
Comment 18•8 years ago
Outside of the test virtual environment I get the same response using wget:
C:\Users\GenericWorker>wget http://pypi.pvt.build.mozilla.org/pub/mozsystemmonitor-0.3.tar.gz
--2017-11-16 17:27:51-- http://pypi.pvt.build.mozilla.org/pub/mozsystemmonitor-0.3.tar.gz
Resolving pypi.pvt.build.mozilla.org (pypi.pvt.build.mozilla.org)... 10.22.74.160
Connecting to pypi.pvt.build.mozilla.org (pypi.pvt.build.mozilla.org)|10.22.74.160|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2017-11-16 17:27:51 ERROR 403: Forbidden.
C:\Users\GenericWorker>wget http://pypi.pvt.build.mozilla.org/pub/psutil-3.1.1-cp27-none-win32.whl
--2017-11-16 17:30:34-- http://pypi.pvt.build.mozilla.org/pub/psutil-3.1.1-cp27-none-win32.whl
Resolving pypi.pvt.build.mozilla.org (pypi.pvt.build.mozilla.org)... 10.22.74.160
Connecting to pypi.pvt.build.mozilla.org (pypi.pvt.build.mozilla.org)|10.22.74.160|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87554 (86K) [application/x-troff-man]
Saving to: 'psutil-3.1.1-cp27-none-win32.whl'
Comment 19•8 years ago
:catlee, can you help us figure out why we get 403 for certain packages in pypi.pvt.build.mozilla.org ?
Flags: needinfo?(catlee)
| Reporter | ||
Comment 20•8 years ago
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #19)
> :catlee, can you help us figure out why we get 403 for certain packages in
> pypi.pvt.build.mozilla.org ?
bug 1415703
Comment 21•8 years ago
that is a secure bug, I don't have access
| Reporter | ||
Comment 22•8 years ago
(In reply to Joel Maher ( :jmaher) (UTC-5) from comment #21)
> that is a secure bug, I don't have access
sorry, fixed! the fix to it is on its way out now, fwiw.
| Reporter | ||
Updated•8 years ago
Flags: needinfo?(catlee)
| Assignee | ||
Comment 23•8 years ago
254 is set up as a loaner. I will set up 255 when I can catch it not running a test.
| Assignee | ||
Comment 24•8 years ago
255 is now set up as a Win 10 loaner.
t-w732-xe-256 through t-w732-xe-262 are now set up in the Win 7 test pool. I will add to this pool once blades become available.
| Assignee | ||
Comment 25•8 years ago
Just to note, we are tracking which VMs are on which blade here: https://docs.google.com/spreadsheets/d/1ewFVpaFw60ljxCmPaW4Gu8rrmnuJ2nDqZejmer1dw38/edit#gid=0
I am snagging the 226 Windows VM to do some testing, starting with IO.
| Assignee | ||
Comment 26•8 years ago
Installing t-w732-xe-195 through 119 now. After the installations complete I will set up 118 and 119 as loaners.
I am also returning 226 to the win 10 test pool.
| Assignee | ||
Comment 27•8 years ago
I am working on getting the loaners set up this morning. It will end up being 217 and 218.
I am hitting an issue with VNC authentication not accepting the loaner password or the old password after a reboot.
| Assignee | ||
Comment 28•8 years ago
I had to go in and directly edit the ini file to update the password and to allow user input.
The loaners are now set up.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED