please loan jmaher a osx machine and a new linux XE machine

RESOLVED FIXED

Status

Release Engineering
Loan Requests
P1
normal
RESOLVED FIXED
a month ago
15 days ago

People

(Reporter: jmaher, Assigned: jmaher)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

a month ago
I need to run some tests locally to get settings for a performance test.  This test is specific to the timing of the hardware, so getting one of each hardware type is important (I am forward looking, not considering the IX hardware)

Updated

a month ago
Flags: needinfo?(jlund)
Priority: -- → P1

Updated

a month ago
Flags: needinfo?(jlund)
Summary: please load jmaher a osx machine and a new linux XE machine → please loan jmaher a osx machine and a new linux XE machine

Comment 1

a month ago
Hey Joel,

Sorry for the delay here. While we wait for LDAP access for the new buildduty team, I am, unfortunately, the bottleneck for buildduty bugs. It's on me, not them.

I'll have a look at this on Monday. Reach me offline if you need it before then.

Comment 2

a month ago
Due to release week, I missed this today. I'll have a look tomorrow.

Comment 3

a month ago
So bear with me here, we don't have documentation for the new moonshot hardware. I loosely followed what we would do for other talos linux hardware.

I grabbed: t-linux64-xe-300.test.releng.mdc1.mozilla.com

It didn't appear to be running any taskcluster workers:
$ ps aux|grep worker
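
One caveat with the check above: `ps aux|grep worker` also matches the grep process itself, so a single hit does not mean a worker is running. A minimal sketch of a stricter filter over captured `ps aux` output follows. The function name, the sample output, and the `/usr/local/bin/generic-worker` path are made up for illustration and are not taken from the host.

```python
def find_processes(ps_output: str, keyword: str = "worker") -> list[str]:
    """Return the COMMAND column of `ps aux` rows that mention `keyword`."""
    commands = []
    for line in ps_output.splitlines()[1:]:  # first line is the column header
        fields = line.split(None, 10)  # `ps aux` prints 11 columns; COMMAND is last
        if len(fields) < 11:
            continue
        command = fields[10]
        # Skip the grep process itself, which always matches its own pattern.
        if keyword in command and not command.startswith("grep"):
            commands.append(command)
    return commands


# Hypothetical sample output, not captured from t-linux64-xe-300.
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  19356  1544 ?        Ss   09:00   0:01 /sbin/init
cltbld    2210  1.2  0.8 412000 16000 ?        Sl   09:01   0:07 /usr/local/bin/generic-worker run
cltbld    2300  0.0  0.0   9000   900 pts/0    S+   09:05   0:00 grep worker
"""

print(find_processes(SAMPLE))
```

An empty list for the real output would support the "no workers running" reading.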

I purged it via a combination of:
* https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#t-yosemite-r7_taskcluster
* https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#talos-linux64-ix.2C_tst-linux32-ec2.2C_tst-linux64-ec2

Joel already has vpn access so I just added the host. I'll send an email with creds now.


dhouse ^ we will need to do a full reimage when we want to put this host back in. Also, it wasn't clear to me if it was okay to grab this machine. Do we have a slavealloc equivalent, or some other way to know which hw worker machines we can take? Apologies if I shouldn't have taken it.

I have some questions about how we want to do this in a post-buildbot world, before we have the slaveapi controller replacement, but that's beyond the scope of this bug.

Comment 4

a month ago
I'm assuming you want a t-yosemite-r7 machine here when you say osx. We can change that up if I have that wrong but I wanted to get something for you since you will be up before me tomorrow.

I grabbed: t-yosemite-r7-472.test.releng.mdc1.mozilla.com

access to this machine will also be bundled up in email.



Another out-of-scope open question: should we have host tracking bugs for these so we know to reimage them and throw them back in the queue?

Comment 5

a month ago
needinfo: dhouse. sanity checking you are okay with me taking t-linux64-xe-300 and other questions from comment 3

Updated

a month ago
Flags: needinfo?(dhouse)

Comment 6

a month ago
(In reply to Jordan Lund (:jlund) from comment #5)
> needinfo: dhouse. sanity checking you are okay with me taking
> t-linux64-xe-300 and other questions from comment 3

added to this, we now cannot connect to t-linux64-xe-300 using the temp pw given to cltbld + root. Perhaps it is still getting re-imaged or used in prod. Help loaning this out would be appreciated. We can follow up with docs for Buildduty once there is a process.

Comment 7

a month ago
one more issue, we can't connect to the macos box t-yosemite-r7-472 via vnc. Tried: https://wiki.mozilla.org/ReleaseEngineering/How_To/Access_Machines_via_VNC#Connecting_from_a_Mac_client

cc relops

Comment 8

a month ago
(In reply to Jordan Lund (:jlund) from comment #7)
> one more issue, we can't connect to the macos box t-yosemite-r7-472 via vnc.
> Tried:
> https://wiki.mozilla.org/ReleaseEngineering/How_To/
> Access_Machines_via_VNC#Connecting_from_a_Mac_client
> 
> cc relops

enabled legacy vnc and changed screen sharing prefs: https://wiki.mozilla.org/ReleaseEngineering/How_To/Access_Machines_via_VNC#Enable_legacy_VNC_and_set_the_password

<jlund> jmaher: can you try again with pw given in email?
13:32:50 (yosemite vnc)
13:34:31 <jmaher> yeah, doing it now
13:35:56 jlund: it connected, then said connection has been gracefully closed
13:35:58 so that is progress

Comment 9

a month ago
<jmaher> jlund: can you ensure that rwood has vpn access to the machine?  I should be able to do the linux machine when dhouse helps us out
13:45:50 <jlund> jmaher: done. I readded rwood to vpn loaner list. He should have the same access abilities as you to these machines
13:46:08 I also just kicked the yosemite box in case that helps
(Assignee)

Comment 10

a month ago
:rwood, can you help out with the osx machine?
Flags: needinfo?(rwood)
We can use t-linux-64-ms-003.test.releng.mdc1.mozilla.com as a new bare-metal loaner. This was one of the first 30 for QA from earlier this month and is set up the same as the others (and how the remaining 70 will be set up).

#300 is still kind of stuck (we cannot kickstart it as bare-metal until the network/pxe setup is changed).
Flags: needinfo?(dhouse)
(In reply to Dave House [:dhouse] from comment #11)
> We can use t-linux-64-ms-003.test.releng.mdc1.mozilla.com as a new
> bare-metal loaner. This was one of the first 30 for QA from earlier this
> month and is set up the same as the others (and how the remaining 70 will be
> set up).
> 
> #300 is still kind of stuck (we cannot kickstart it as bare-metal until the
> network/pxe setup is changed).

hostname correction: t-linux64-ms-003.test.releng.mdc1.mozilla.com (no dash between "linux" and "64")
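
Since a stray dash is easy to miss, a small check like the one below could catch it before a hostname goes into an email or the loaner list. This is only a sketch: the pattern is inferred from the three hostnames that appear in this bug and is an assumption, not an official naming rule, and the function name is made up.

```python
import re

# Assumed pattern, inferred from t-linux64-xe-300, t-linux64-ms-003 and
# t-yosemite-r7-472 in this bug. Other pools will not match.
LOANER_HOSTNAME = re.compile(
    r"t-(?:linux64-(?:xe|ms)|yosemite-r7)-\d{3}\.test\.releng\.mdc1\.mozilla\.com"
)


def looks_like_loaner(hostname: str) -> bool:
    """Return True if `hostname` fits the assumed loaner naming pattern."""
    return LOANER_HOSTNAME.fullmatch(hostname) is not None


print(looks_like_loaner("t-linux64-ms-003.test.releng.mdc1.mozilla.com"))
print(looks_like_loaner("t-linux-64-ms-003.test.releng.mdc1.mozilla.com"))
```

The first call accepts the corrected name and the second rejects the variant with the extra dash.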
jlund: t-linux64-ms-003.test.releng.mdc1.mozilla.com is rebuilt now (applied the new hostname and cleaned/cleared).
(In reply to Dave House [:dhouse] from comment #13)
> jlund: t-linux64-ms-003.test.releng.mdc1.mozilla.com is rebuilt now (applied
> the new hostname and cleaned/cleared).

Thanks Dave for your help on getting this set up. Neat that we have new machines to play with and loan out now.

I emailed Joel with more host and credential details.
(In reply to Jordan Lund (:jlund) from comment #7)
> one more issue, we can't connect to the macos box t-yosemite-r7-472 via vnc.
> Tried:
> https://wiki.mozilla.org/ReleaseEngineering/How_To/
> Access_Machines_via_VNC#Connecting_from_a_Mac_client
> 
> cc relops

With OSX's screen sharing utility, I was able to vnc to #472 directly on the vpn and through the jumphost to a user I created with the same group membership as cltbld, and using the user's password and not the vncpass. So the connection *should work, but that is only testing from my vpn account (and maybe try the user's password and not the general vnc password?).
(In reply to Dave House [:dhouse] from comment #15)
..
> 
> With OSX's screen sharing utility, I was able to vnc to #472 directly on the
> vpn and through the jumphost to a user I created with the same group
> membership as cltbld, and using the user's password and not the vncpass. So
> the connection *should work, but that is only testing from my vpn account
> (and maybe try the user's password and not the general vnc password?).

I was able to vnc to OSX 472 and got to the login screen. For the user I chose 'other' and entered cltbld and the credentials. A popup then appeared saying something about not being able to update the login keychain. I tried two of the options (continue or create new), but neither worked; it just disconnects at that point.
Flags: needinfo?(rwood) → needinfo?(dhouse)
Update: Dave worked some magic and I'm now able to vnc in successfully on OSX 472 using mac screenshare and the cltbld credentials, thanks!
Flags: needinfo?(dhouse)
(In reply to Robert Wood [:rwood] from comment #17)
> Update: Dave worked some magic and I'm now able to vnc in successfully on
> OSX 472 using mac screenshare and the cltbld credentials, thanks!

\o/ that's great news. @Dave, could you help get this documented with me so Buildduty knows what to do going forward? Motive being Buildduty self sufficiency and less interrupts for buildduty.

current vnc docs where we could add a section: https://wiki.mozilla.org/ReleaseEngineering/How_To/Access_Machines_via_VNC#Enable_legacy_VNC_and_set_the_password
Flags: needinfo?(dhouse)
I think the critical part was that the generic-worker kept trying to run each time we logged in as cltbld, and it would fail to find the generic worker and reboot the machine. There may have been a change to the run-generic-worker.sh, but I added a removal of the plist into the loaner doc: https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#bld-lion-r5.2C_t-yosemite-r7

The keychain warning is expected because we reset the user password from the command line. Apple has a doc about fixing the keychain (https://support.apple.com/en-us/HT202860). I removed the keychain files instead, but then I had to add cltbld as an Administrator for Firefox to install, so I don't think we want to remove the keychain files normally. We may want to do something like https://technology.siprep.org/terminal-command-to-change-a-user-password-on-a-mac/ to set the keychain password at the same time as resetting the user password.
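
As a rough sketch of that last idea, the helper below only builds the two command lines and does not run them. It assumes the stock macOS tools `dscl` (with `-passwd`) and `security set-keychain-password`, assumes the login keychain lives at `~/Library/Keychains/login.keychain` as on Yosemite, and assumes the old password is still known. The function name is made up, and this is not necessarily the same approach as the linked article.

```python
def password_reset_commands(user: str, old_password: str, new_password: str) -> list[list[str]]:
    """Build (but do not run) commands that reset a macOS user's password
    and then give the login keychain the same new password."""
    # Assumed keychain location for a Yosemite-era account.
    keychain = f"/Users/{user}/Library/Keychains/login.keychain"
    return [
        # Reset the account password in the local directory node.
        ["dscl", ".", "-passwd", f"/Users/{user}", new_password],
        # Change the keychain password to match; needs the old password.
        ["security", "set-keychain-password",
         "-o", old_password, "-p", new_password, keychain],
    ]


for argv in password_reset_commands("cltbld", "OLD_PW", "NEW_PW"):
    print(" ".join(argv))
```

Passwords passed on the command line are visible in `ps` output while the command runs, so on a shared host it may be better to let the tools prompt instead.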
Flags: needinfo?(dhouse)
I think the docs on VNC are correct. I re-ran the commands there a couple of times but the problem was fixed by the generic-worker thing.
(In reply to Dave House [:dhouse] from comment #13)
> jlund: t-linux64-ms-003.test.releng.mdc1.mozilla.com is rebuilt now (applied
> the new hostname and cleaned/cleared).

Just want to make sure this host is working too. Can anyone confirm?
(In reply to Dave House [:dhouse] from comment #20)
> I think the docs on VNC are correct. I re-ran the commands there a couple of
> times but the problem was fixed by the generic-worker thing.

Thanks for confirming this and for touching up the docs
(Assignee)

Comment 23

28 days ago
Yes, this works and I was able to get my info. I hit a lot of issues where tightVNC locks up and I need to reset the VPN. I suspect this has to do with the ports available to communicate on and sockets left in a hung or stale state; disconnecting and reconnecting the VPN forcefully resets all those connections.
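
One way to tell a locked-up VNC client apart from a dead VPN route might be a plain TCP probe of the VNC port before resetting anything. The sketch below assumes the loaner listens on the conventional VNC port 5900, which this bug does not confirm; the function name is made up.

```python
import socket


def port_reachable(host: str, port: int = 5900, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name, connects, and raises OSError
        # (including timeouts and refused connections) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_reachable("t-yosemite-r7-472.test.releng.mdc1.mozilla.com")` returning True while tightVNC is hung would point at the client rather than the VPN. A False result is less conclusive, since it could be the VPN, the host, or the port assumption.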
Great (mostly). We can leave this bug open to track clean up when you are done.
Hey Joel,
I'm going to assign this bug to you for easy tracking. Once you are done with the servers, please mark this bug as RESOLVED so we know when to reclaim them.
Assignee: nobody → jmaher
(Assignee)

Comment 26

15 days ago
We are done with these machines, thanks for helping get them set up.
Status: NEW → RESOLVED
Last Resolved: 15 days ago
Resolution: --- → FIXED
Relops will need to be involved to re-image t-linux64-ms-003.test.releng.mdc1.mozilla.com
We don't yet have a way for buildduty to do the reclaiming on the moonshots.
I kicked off the re-imaging of t-linux64-ms-003. I'll check later if it completes.