Closed Bug 1485672 Opened 7 years ago Closed 6 years ago

ramp-up dedicated mobile beetmoverworker to handle the android-components

Categories

(Release Engineering :: Release Automation, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mtabara, Assigned: mtabara)

References

Details

Attachments

(1 file, 1 obsolete file)

The GeckoView library is being pushed to Maven S3 via the normal beetmoverworkers. However, AC (android-components [1]) will be triggering a GitHub-based beetmover task, so we need to separate that logic from the rest of the pool: whatever comes from tree goes to the normal beetmoverworkers pool, and whatever comes from GitHub goes to a dedicated workerGroup / workerId.

We currently have the following workerGroup / workerId pairs for existing dedicated mobile instances:

workerGroup: mobile-signing-v1
workerId: mobil-signing-linux-1

workerGroup: mobile-pushapk-v1
workerId: mobil-pushapkworker-1

Hence, for beetmover I propose:

workerGroup: mobile-beetmover-v1
workerId: mobile-beetmover-1

I personally preferred something more verbose for the workerId, "mobile-beetmoverworker-1", but the 22-character limit on that field prevents it. We can always amend things later if we really have to.

In order to bring this machine up, we need to:
* get DNS entries for its IP from IT
* ramp up an instance in EC2
* puppetize it

Initially I'll be using this for the staging environment. Once we're 100% sure things are working smoothly, I'll nuke the contents and switch to production mode for android-components.

[1]: https://github.com/mozilla-mobile/android-components/
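As a quick illustration of the workerId constraint mentioned above, here is a minimal sketch that checks candidate names against a 22-character limit. The limit value comes from this bug; treating it as a plain character count, and the helper name `fits_worker_id_limit`, are assumptions made for illustration only.

```python
# Limit quoted in this bug; assumed here to be a simple character count.
MAX_WORKER_ID_LENGTH = 22


def fits_worker_id_limit(worker_id: str, limit: int = MAX_WORKER_ID_LENGTH) -> bool:
    """Return True if the workerId is no longer than the limit (hypothetical helper)."""
    return len(worker_id) <= limit


if __name__ == "__main__":
    # The proposed name and the more verbose alternative from the description.
    for candidate in ("mobile-beetmover-1", "mobile-beetmoverworker-1"):
        print(candidate, len(candidate), fits_worker_id_limit(candidate))
```

Under that assumption, the proposed "mobile-beetmover-1" fits, while the more verbose "mobile-beetmoverworker-1" does not.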
Depends on: 1485690
See Also: → 1343575
Depends on: 1486059
So far:
a) I used part I from [1] to get an available IP in the same beetmoverworker pool.
b) Then I requested the DNS entries in bug 1485690.
c) Then I used part II from [1] to create an instance in AWS.
d) Puppetization failed, as the newly created instance had no matching node in the puppet moco-nodes catalog. If it can't find the catalog, it won't do anything at all, not even the basic auth stuff (keys, etc.).
e) I could've used the private repo key to log in there as root, but I bounced off aws-manager2 instead, since that's still around for a little longer (e.g. ssh -i ~/.ssh/aws-ssh-key root@HOST ...).
f) Pushed my puppet changes (including the moco-nodes match for the new HOSTNAME) to my puppet environment.
g) Manually ran the pinned puppet agent from there to bring in all the stuff.

TIL: there's a log in the folder we run from for each create_instance.py call we make.
TIL: you can pass the env var `PUPPET_EXTRA_OPTIONS='--environment=<puppet env username> --server releng-puppet2.srv.releng.mdc1.mozilla.com'` to the create-instance cli to make it run that directly. That's neat!

[1]: https://wiki.mozilla.org/ReleaseEngineering/How_To/Loan_a_Slave#Environment_Setup
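For anyone scripting the `PUPPET_EXTRA_OPTIONS` trick above, a minimal sketch of building that variable might look like the following. The function names are hypothetical, and the create-instance invocation itself is deliberately left out because its arguments are not shown in this bug; the resulting dict could be passed as `env=` to `subprocess.run` when calling the script.

```python
import os
from typing import Mapping, Optional


def puppet_extra_options(puppet_env: str, server: str) -> str:
    """Build the option string in the same shape as quoted in the comment above."""
    return f"--environment={puppet_env} --server {server}"


def env_for_create_instance(
    puppet_env: str,
    server: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Copy the base environment and add PUPPET_EXTRA_OPTIONS (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["PUPPET_EXTRA_OPTIONS"] = puppet_extra_options(puppet_env, server)
    return env


if __name__ == "__main__":
    # "someuser" is a placeholder for the puppet environment username.
    env = env_for_create_instance(
        "someuser", "releng-puppet2.srv.releng.mdc1.mozilla.com"
    )
    print(env["PUPPET_EXTRA_OPTIONS"])
```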
Attachment #9003862 - Attachment is patch: true
Attachment #9003862 - Attachment is patch: false
We now have mobile-beetmover-1.srv.releng.use1.mozilla.com up and running, currently pinned to my environment.

Leftovers:
* once bug 1486059 is resolved, I still need to:
a) add the TC client details in hiera (mine for now)
b) add them here: https://github.com/mozilla-releng/build-puppet/pull/183/files#diff-d5d8409645ace812a7a5c4e4668268ceR174
c) make sure that the worker is successfully querying for tasks in the TC Queue; it should show up here [1]

Making CoT happy is a different story. I'll file a separate bug for that as I need to touch multiple pieces.

[1]: https://tools.taskcluster.net/provisioners/scriptworker-prov-v1/worker-types
Blocks: 1486089
There are a bunch of alerts that keep coming about mobile-beetmover-1.srv.releng.use1.mozilla.com:

Sun Aug 26 08:28:05 -0700 2018 Puppet (err): Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find default node or by name with 'mobile-beetmover-1.srv.releng.use1.mozilla.com' on node mobile-beetmover-1.srv.releng.use1.mozilla.com
Sun Aug 26 08:28:05 -0700 2018 Puppet (err): Could not retrieve catalog; skipping run

Can you please take a look?
(In reply to Adrian Pop from comment #4)
> There are a bunch of alerts that keep coming about
> mobile-beetmover-1.srv.releng.use1.mozilla.com :
>
> Sun Aug 26 08:28:05 -0700 2018 Puppet (err): Could not retrieve catalog from
> remote server: Error 400 on SERVER: Could not find default node or by name
> with 'mobile-beetmover-1.srv.releng.use1.mozilla.com' on node
> mobile-beetmover-1.srv.releng.use1.mozilla.com
> Sun Aug 26 08:28:05 -0700 2018 Puppet (err): Could not retrieve catalog;
> skipping run
>
> Can you please take a look ?

Sure, will take a look, thanks for the ping.
* Fixed the errors. The instance is still pinned to my environment, but I think we can safely land that to production.
* I'll test this against the rest of the beetmoverworkers on Monday and trigger a staging release, just to make sure I'm not messing up the configs for all the beetmover workers.

We now have a mobile-beetmover successfully querying for work in TC [1]. The next step is to make CoT happy in scriptworker so that we can actually schedule the tasks in the android-components graph.

[1]: https://tools.taskcluster.net/provisioners/scriptworker-prov-v1/worker-types/mobile-beetmover-v1
* add configs for mobile-beetmover
* add mobile-beetmover in the nodes and configure it with the mobile-staging settings
* refactor and get rid of the three templates, rely on a sole one

Will pin dep beetmoverworkers and chain them to my environment and run a staging release before deploying this to production, just to make sure the script_config.json is not messed up by this refactoring.
Attachment #9007296 - Flags: review?(mozilla)
Attachment #9003862 - Attachment is obsolete: true
Comment on attachment 9007296 [details] [review] [puppet] add configs for android-components beetmover; refactoring 302-ing this to Johan since I'll talk to him in a few minutes about this anyway.
Attachment #9007296 - Flags: review?(mozilla) → review?(jlorenzo)
Comment on attachment 9007296 [details] [review] [puppet] add configs for android-components beetmover; refactoring See https://github.com/mozilla-releng/build-puppet/pull/183#pullrequestreview-154137085
Attachment #9007296 - Flags: review?(jlorenzo)
Comment on attachment 9007296 [details] [review] [puppet] add configs for android-components beetmover; refactoring Addressed the comments, thanks again for the review. Throwing this back into the review pool!
Attachment #9007296 - Flags: review?(jlorenzo)
Comment on attachment 9007296 [details] [review] [puppet] add configs for android-components beetmover; refactoring Johan approved in PR. I landed this in https://github.com/mozilla-releng/build-puppet/commit/e1edd5791365542821e2bc4e83d6a60d7ec7e894
Attachment #9007296 - Flags: review?(jlorenzo)
Attachment #9007296 - Flags: review+
Attachment #9007296 - Flags: checked-in+
Changes landed in production affecting dev/prod beetmoverworkers. I think we can close this, will reopen if things go south.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Blocks: 1503548
Component: Release Automation: Uploading → Release Automation