Closed Bug 1489268 Opened 6 years ago Closed 6 years ago

Loan kmoir an aws-provisioner-v1/gecko-3-b-win2012 machine

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Type: task; Priority: Not set; Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Unassigned)

I'm having trouble getting a machine in bug 1486016, and it would probably be good to have a reserved instance instead of a taskcluster loaner, which is a spot instance and will disappear.
Blocks: 1485228
Hi,
I looked into the available documentation and unfortunately could not find any way of loaning a gecko-3-b-win2012 machine without the taskcluster "Self Provision". We usually loan what can be found in https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave. I also looked into https://wiki.mozilla.org/ReleaseEngineering/How_To/Self_Provision_a_TaskCluster_Windows_Instance and OCC (https://github.com/mozilla-releng/OpenCloudConfig/), but I don't see any way of providing one without self provision. Perhaps resolving the scopes in bug 1486016 would give a better chance of working on one.
Rob, is it possible to self provision/loan a gecko-3-b-win2012 machine without taskcluster's help? Something like creating an AWS instance with the gecko-3-b-win2012 configuration?
Flags: needinfo?(rgarbas)
I think you meant to ni Rob instead of Rok.
Flags: needinfo?(rgarbas) → needinfo?(rthijssen)
yes, it can be done with a little manual effort. there might be some problems with using gecko-3-b-win2012 rather than gecko-1-b-win2012. the only differences between these worker types are:
- the presence of the cot-key on the l3 builder
- the absence of an ssh server on the l3 builder
there are some security checks that would need to be overridden or worked around if we really want to use an l3 builder in a loaner scenario. i think it would require some code changes to occ to allow for it. if we can get away with using gecko-1-b-win2012, it's simpler. the process would be:
- start a new on-demand instance in the tc account of the aws console using the latest gecko-1-b-win2012 ami (just use the ami id from a recent build log for that worker type)
- ssh to the instance and change the Administrator password
- disable the generic-worker service (nssm) to prevent the instance from trying to claim tasks
- disable the occ and haltOnIdle scheduled tasks to prevent the instance from shutting down or updating itself
- give the new admin password to Kim
Flags: needinfo?(rthijssen)
Thanks. I would like a gecko-1-b-win2012 machine if possible; I'm not having success loaning myself one. ciduty folks, could you try Rob's instructions?
I managed to loan myself a machine. That being said, it would be useful to update the documentation on this page https://wiki.mozilla.org/ReleaseEngineering/How_To/Self_Provision_a_TaskCluster_Windows_Instance to reflect the instructions in the comments above.
Actually, the machine I loaned myself is not persistent: it was a spot instance and was killed, so my afternoon's work went away. Is there a way to set up an instance type that is not killed after several hours?
Hello Kim,
According to the step below from Rob's instructions, it sounds like it is possible:
> - disable the occ and haltOnIdle scheduled tasks to prevent the instance from shutting down or updating itself
Am I right? I can try to loan an instance for you right now.
Flags: needinfo?(kmoir)
- the instance needs to be an on-demand instance rather than spot to keep aws from terminating it
- disable generic-worker, occ and the haltOnIdle scheduled tasks to keep our software from terminating it
That would be great. Thank you.
Flags: needinfo?(kmoir)
I wanted to give it a shot using gecko-1-b-win2012 version 562946ec8c6d: ami-06143479 (Gecko builder for Windows; TaskCluster worker type: gecko-1-b-win2012, OCC version 562946ec8c6d, https://github.com/mozilla-releng/OpenCloudConfig/tree/562946ec8c6d0c99b42a8669c3b413dd53f75957).
At the end, when it asks for a "key pair", what should we choose? Is there a default key set on the ami that we should all have access to, or should we create a new key pair? The second option would be a bit problematic because I, and I believe the rest of the CIDuty team, are not authorized to create/download a key pair.
Also, what hardware does gecko-1-b-win2012 use? I mean, what instance type is used in production/recommended for gecko-1-b-win2012?
Flags: needinfo?(rthijssen)
For the last question I suggest checking https://tools.taskcluster.net/aws-provisioner/gecko-1-b-win2012/view (I need to be logged in), which says c4.4xlarge or c5.4xlarge.
- yes, either c4.4xlarge or c5.4xlarge will work.
- try key pair: mozilla-taskcluster-worker-gecko-1-b-win2012. you can ignore the warnings about not having the private key to decrypt the password since you'll use ssh to connect anyway.
- make sure you add security groups: rdp-only & ssh-only so that rdp and ssh will work on the instance.
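The launch settings discussed so far can be collected in one place. The following is a minimal sketch, not the process anyone in this bug actually used (the instance is launched through the aws console here). It assumes boto3's EC2 client and its `run_instances` call; the helper names `build_loaner_launch_params` and `launch_loaner` are hypothetical, and passing security groups by name assumes the groups are resolvable by name in the target account (otherwise `SecurityGroupIds` would be needed). The ami id, instance types, key pair name, security group names and region are the ones mentioned in this bug.

```python
# Sketch only: parameter names follow boto3's EC2 client `run_instances`.
# The helper names are hypothetical and not part of OCC or any ciduty tooling.

ALLOWED_INSTANCE_TYPES = ("c4.4xlarge", "c5.4xlarge")


def build_loaner_launch_params(ami_id, instance_type="c4.4xlarge"):
    """Build keyword arguments for launching one on-demand loaner instance."""
    if instance_type not in ALLOWED_INSTANCE_TYPES:
        raise ValueError("unexpected instance type: " + instance_type)
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": "mozilla-taskcluster-worker-gecko-1-b-win2012",
        # Existing groups referenced by name (assumption: usable by name).
        "SecurityGroups": ["ssh-only", "rdp-only"],
        "MinCount": 1,
        "MaxCount": 1,
        # InstanceMarketOptions is deliberately omitted: without it the
        # request is for an on-demand instance rather than a spot instance.
    }


def launch_loaner(params, region="us-west-2"):
    """Launch the instance; needs boto3 and IAM permission to run instances."""
    import boto3  # third-party dependency, imported only when launching

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]
```

Calling `build_loaner_launch_params("ami-06143479")` only builds the request; nothing is launched until `launch_loaner` is called with credentials that are allowed to create instances.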
Flags: needinfo?(rthijssen)
:grenade Tried to create the instance but we're hitting the following: "You are not authorized to perform this operation: Creating security groups".
Flags: needinfo?(rthijssen)
there are existing security groups called ssh-only and rdp-only which should be used. no need to create new groups. eg: https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#SecurityGroups:groupName=ssh-only,rdp-only
Flags: needinfo?(rthijssen)
Assignee: nobody → ciduty
I realize the ciduty folks rotate shifts, so the original person who worked on this may not be on duty for a while. Is there a way for someone to look at this so I can have a permanent loaner that doesn't get rebooted at random times? I've been using temporary instances to test my patches, but I can't rely on them because they are spot instances. Thanks!
Yet again I'm hitting a blocker while trying to launch the instance:
> Launch Failed
> You are not authorized to perform this operation.
At the moment I believe we don't have access to create instances using our taskcluster IAM accounts. I understand all the information provided here and was able to follow every step up to the launch. I also want to start documenting all of this, starting with some basic information for the CIDuty team.
Until we get access to create instances, can someone create the instance for Kim, as this has been a blocker for her for more than 8 days? Rob, would you mind giving it a shot?
Flags: needinfo?(rthijssen)
kmoir: instance created here: https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#Instances:search=i-01f55f10dfea62327
if you can confirm that this is you on keybase: https://keybase.io/kmoir, i can use keybase secure messaging to send you the administrator password for rdp. alternatively, send me a pgp public key to encrypt an email to you with. if you also provide me with an ssh public key, i can give you ssh access to the instance.

ciduty: these are the commands i ran on the instance over ssh to disable occ & generic-worker (ssh currently connects to a powershell console):

# stop & disable occ & dsc
Stop-Process -Confirm:$false -Id (Get-CimInstance Win32_Process -Filter "name = 'powershell.exe'" | ? { $_.CommandLine -eq 'powershell.exe -File C:\dsc\rundsc.ps1' }).ProcessId
Unregister-ScheduledTask -TaskName @('PrepLoaner', 'RunDesiredStateConfigurationAtStartup', 'HaltOnIdle') -Confirm:$false
Remove-Item -Path ('{0}\System32\Configuration\*.mof' -f $env:SystemRoot) -confirm:$false -force

# stop & disable generic-worker
Set-Service -Name 'Generic Worker' -StartupType 'Disabled'
Remove-Item -Path 'C:\generic-worker\generic-worker.exe' -confirm:$false -force
Stop-Service -Name 'Generic Worker'

# set administrator password and grant ssh access
([ADSI]'WinNT://./Administrator').SetPassword('plain text password goes here')
Add-Content -Path 'C:\Users\Administrator\.ssh\authorized_keys' -Value 'users ssh public key goes here'
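The last two commands above take a password and a public key as single-quoted PowerShell strings, so a value containing a single quote would break the command unless it is escaped. A minimal sketch of filling in those two placeholders safely follows. The helper names `ps_quote` and `render_access_commands` are hypothetical; the only PowerShell rule relied on is that a single quote inside a single-quoted string is written as two single quotes.

```python
# Sketch only: renders the two access commands from this comment with
# caller-supplied values. Helper names are hypothetical.

def ps_quote(value):
    """Return value as a PowerShell single-quoted string literal."""
    # Inside single quotes, PowerShell treats '' as one literal quote.
    return "'" + value.replace("'", "''") + "'"


def render_access_commands(admin_password, ssh_public_key):
    """Render the password and authorized_keys commands as script text."""
    lines = [
        "# set administrator password and grant ssh access",
        "([ADSI]'WinNT://./Administrator').SetPassword(%s)"
        % ps_quote(admin_password),
        r"Add-Content -Path 'C:\Users\Administrator\.ssh\authorized_keys' -Value %s"
        % ps_quote(ssh_public_key),
    ]
    return "\n".join(lines)
```

The rendered text contains the plain text password, so it should be treated as a secret in the same way as the password itself.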
Flags: needinfo?(rthijssen)
for the record, the event logs for the loaner instance are here: https://papertrailapp.com/systems/2302237742/events
Thanks so much, Rob. Here is my gpg public key: https://gpg.mozilla.org/pks/lookup?search=kmoir&op=vindex
pub 4096R/E8E9E1A0 2016-07-14
I think the one on keybase is old.
Flags: needinfo?(rthijssen)
credentials emailed
Flags: needinfo?(rthijssen)
Thanks Rob!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard