Closed Bug 1478215 Opened 6 years ago Closed 6 years ago

[tracking] complete migration of releng services from SCL3

Categories

(Release Engineering :: General, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlund, Unassigned)

References

Details

(Whiteboard: [releng:q32018])

Attachments

(1 file)

This tracker will be used to identify remaining work and clean-up bugs.
Depends on: 1475517
Whiteboard: [releng:q32018]
Depends on: 1479619
Depends on: 1479620
Depends on: 1479753
Morphing this to handle all SCL3 services we want to continue with, now that we've hung several other dependencies off it.
Summary: [tracking] complete migration of relengapi/buildapi services from SCL3 → [tracking] complete migration of releng services from SCL3
Bug 1479620, bug 1479619, and bug 1484874 were all found by Nick going through our puppet configuration (releng/relops puppetagain) and determining which services live on SCL3 hosts. Since SCL3 is going away, any service that runs in SCL3 and that Taskcluster still relies on will need to move to either AWS or MDC1.

@ciduty, along with the Nagios request in bug 1484880, could you please have a look through all our hosts defined here: https://hg.mozilla.org/build/puppet/file/9cb089f5a074/manifests/moco-nodes.pp

For each host with 'scl3' in its hostname, create a list of the services it currently runs. Example: https://hg.mozilla.org/build/puppet/file/9cb089f5a074/manifests/moco-nodes.pp#l631

Many of these hosts are buildbot masters, and while we don't need buildbot to keep running on them, the extra services like releaserunner are still needed by Taskcluster.

Don't worry about whether you know if something is still used by Taskcluster; an exhaustive list that releng can go through with you afterwards is good enough.
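For reference, here is a rough sketch of how such a list could be pulled out of the manifest automatically. This is my own illustration, not something attached to this bug: the local checkout path is hypothetical, and the brace-counting heuristic assumes the usual node "hostname" { ... include some::class ... } layout in moco-nodes.pp.

import re
from collections import defaultdict

MANIFEST = "build-puppet/manifests/moco-nodes.pp"  # hypothetical path to a local checkout

node_re = re.compile(r'\s*node\s+[\'"]([^\'"]+)[\'"]')
include_re = re.compile(r'\s*include\s+([\w:]+)')

services = defaultdict(list)   # hostname -> list of included puppet classes
current = None                 # scl3 node block we are currently inside, if any
depth = 0                      # brace nesting depth within that block

with open(MANIFEST) as fh:
    for line in fh:
        if current is None:
            m = node_re.match(line)
            if m and "scl3" in m.group(1):
                current = m.group(1)
                depth = line.count("{") - line.count("}")
            continue
        depth += line.count("{") - line.count("}")
        m = include_re.match(line)
        if m:
            services[current].append(m.group(1))
        if depth <= 0:   # closing brace of the node block
            current = None

for host, classes in sorted(services.items()):
    print(host)
    for cls in classes:
        print("    " + cls)

This only catches plain "include" lines, so anything declared via classes/resources would still need a manual pass, which is why an exhaustive hand-made list is still the goal.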
Flags: needinfo?(ciduty)
Attached file SCL3 services
I've gone through the SCL3 hosts, created a list, and attached it to the bug.
Please check and review it.
Thanks! I trimmed this further, removing common base puppet includes (e.g. 'mozilla', 'releng', etc.). I've left the hosts that are actually running things and do not have "buildbot" in their name :)
@nick - could you help go through what's left in the spreadsheet that ciduty created in comment 5, and then act as consultant/reviewer for moving the services over to hosts that are not in scl3?

@ciduty - we will want to do something similar to what we did for releaserunner [1], but for the other releng services that we've flagged in comment 5.


[1] https://github.com/mozilla-releng/build-puppet/pull/171/files
Flags: needinfo?(nthomas)
I missed that comment #2 referred to the old hg repo instead of the github one at https://github.com/mozilla-releng/build-puppet/blob/master/manifests/moco-nodes.pp. So the releaserunner3 info about bm83 and bm85 is out of date, but diffing doesn't find any other (scl3) differences. Waiting on edit access to annotate the sheet.

The only thing that jumps out from the list is aws-manager\d+. There will be lots we can turn off along with ESR52 (e.g. all the cron tasks for watch_pending, golden instance generation, etc.), but we'll still need somewhere to create scriptworkers, and possibly modify routing tables and so on. catlee, you were asking about this on IRC last week; do you have a plan? Spin up replacements in AWS, perhaps?
Flags: needinfo?(catlee)
I've added action and bug columns to the sheet with my recommendations.
Flags: needinfo?(nthomas)
Status
* host list reviewed on sheet (see comment #5)
* still to do:
 * l10n bumper - bug 1479620 - already in AWS; Nagios monitoring still needs to move
 * bouncer check - bug 1479619 - build-puppet patch is up to add it to bm01; Nagios monitoring needs to move after that
 * aws-manager2 - pending catlee's decision. In discussions we've noted it's not critical that we maintain access to this machine or an equivalent; there are workarounds if we need to run the various scripts
No longer depends on: 1484880
Our [ciduty] part is pretty much done here, so I will remove the NI for ciduty.
Flags: needinfo?(ciduty)
(In reply to Nick Thomas [:nthomas] (UTC+13) from comment #10)
>  * aws-manager2 - pending catlee's decision. In discussions we've noted it's
> not critical that we maintain access to this machine or equivalent, there
> are workarounds if we need to run the various scripts

We've decommissioned this, and so far are managing without a replacement.
Flags: needinfo?(catlee)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED