Closed Bug 1309395 Opened 8 years ago Closed 8 years ago

Treeherder Terraform cleanup post Heroku migration

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: fubar)

Attachments

(1 file)

No rush, but at some point we can:
* Adjust the treeherder-{stage,prod} RDS instances to use the default security group/subnets
* Remove the unused treeherder admin EC2 instance
* Remove the now unused parameter groups {th-import, th-replication}
* Remove the treeherder-dbgrp subnet group
* Remove the treeherder-vpc VPC
* Fix the various boolean values that were listed as strings (eg `multi_az = "True"`)
* Make the tags consistent across prototype/stage/prod
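
As a rough illustration of the boolean cleanup item above, here is a minimal Python sketch that rewrites quoted booleans in Terraform source text as bare lowercase values. The function name and sample input are hypothetical, and it assumes every `key = "True"` style assignment really is a boolean attribute; a string attribute whose value is legitimately "true" would be rewritten too, so the resulting diff should be reviewed by hand.

```python
import re

# Matches single-line assignments such as `multi_az = "True"` (any letter case).
QUOTED_BOOL = re.compile(
    r'^([ \t]*\w+[ \t]*=[ \t]*)"(true|false)"([ \t]*)$',
    re.IGNORECASE | re.MULTILINE,
)


def unquote_booleans(hcl_text: str) -> str:
    """Rewrite quoted boolean strings as bare lowercase booleans."""
    return QUOTED_BOOL.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3),
        hcl_text,
    )


# Hypothetical sample input, not taken from the real config.
sample = 'multi_az = "True"\nidentifier = "treeherder-prod"\n'
print(unquote_booleans(sample))
```

Only lines consisting of a single assignment are touched; other quoted strings are left alone.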
I'm actually inclined to leave the security groups, subnet group and VPC as they are. Despite the DBs being public (a pox on heroku), it keeps things that don't need to talk segregated. Leaving them doesn't hurt anything, and changing them creates work for, imo, no gain. Also pretty sure it'd cause re-IP'ing of the instances.

But the VPN can come down, and we should definitely clean things up. Even more so, in light of bug 1309874.
Assignee: nobody → klibby
(In reply to Kendall Libby [:fubar] from comment #1)
> Despite the DBs being public (a pox on heroku)

Or more a pox on Mozilla not paying for Heroku enterprise/private spaces :-)
But yeah happy to do whatever wrt VPC/subnet/SGs etc.
removed unneeded resources, cleaned up tags. will remove VPN bits when :dcurado has brought the SCL3 side down.

To github.com:mozilla-platform-ops/devservices-aws.git
   abd9ffe..27f1ffd  master -> master
Attachment #8805641 - Flags: review?(jwatkins)
Ed, the treeherder-heroku RDS instance is off in the default VPC instead of the treeherder VPC. Any objections to moving it? I don't think terraform can without nuking it, so it will probably be manual.
Flags: needinfo?(emorley)
Comment on attachment 8805641 [details]
devservices-aws PR #8: remove VPN to SCL3

lgtm
Attachment #8805641 - Flags: review?(jwatkins) → review+
(In reply to Kendall Libby [:fubar] from comment #6)
> Ed, the treeherder-heroku RDS instance is off in the default VPC instead of
> the treeherder VPC. Any objections to moving it? I don't think terraform can
> without nuking it, so it will probably be manual.

Go for it :-)
Flags: needinfo?(emorley)
Looks like it can't be moved manually either; the only option is to change security groups. Tried making a read-replica that we could promote and migrate to, but that also fails as you can't specify the db subnet group when creating the replica. Argh.
I'm happy for you to just nuke the treeherder-heroku RDS instance and re-create from the latest snapshot of prod, if it's easier :-)

(Given we probably want to change the name, I guess we could just spin up a new instance, then I switch the environment variable for the prototype instance, and then we kill the old RDS instance?)
I've sorted out the issue I was having yesterday and have spun up a new dev instance from the latest prod snapshot (rds:treeherder-prod-2016-11-10-07-05). Should be able to just replace treeherder-heroku with treeherder-dev everywhere. 

Let me know when prototype's changed over and it's safe to remove the treeherder-heroku instance.
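
For reference, a snapshot restore like the one described above could be sketched with boto3 roughly as follows. This is not the command that was actually run: the helper names and the region are hypothetical, and using `treeherder-dbgrp` as the target subnet group is an assumption. Unlike read-replica creation, `restore_db_instance_from_db_snapshot` does accept a `DBSubnetGroupName`.

```python
def restore_params(instance_id, snapshot_id, subnet_group):
    """Build keyword arguments for RDS restore_db_instance_from_db_snapshot."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBSnapshotIdentifier": snapshot_id,
        # Places the restored instance in the chosen subnet group (and so VPC).
        "DBSubnetGroupName": subnet_group,
    }


def restore(params, region="us-east-1"):
    """Perform the restore. The default region is an assumption."""
    import boto3  # imported lazily so building params needs no AWS SDK

    rds = boto3.client("rds", region_name=region)
    return rds.restore_db_instance_from_db_snapshot(**params)


params = restore_params(
    "treeherder-dev",
    "rds:treeherder-prod-2016-11-10-07-05",
    "treeherder-dbgrp",  # assumed target subnet group
)
print(params)
```

`restore(params)` is deliberately not called here, since it would create a real instance.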
Is the new treeherder-dev instance using the credentials of prod, or have they been updated after the re-snapshot? (Per https://github.com/hashicorp/terraform/issues/8604, it seems this would have to be done manually for now.)
Still prod credentials; I'm not sure that I have the dev creds.
I have:
* paused celerybeat on treeherder-prototype
* generated new dev credentials
* logged into the treeherder-dev as th_admin using the prod password
* changed the th_admin password using `SET PASSWORD = PASSWORD('...');`
* updated DATABASE_URL on the treeherder-prototype Heroku app to use the new instance's domain name, and new password

This triggered a deploy, incl migration run, and a recent master branch migration (that isn't yet present on prod) is now running.

Once that is complete I'll unpause celerybeat and hand back over to James for him to deploy his feature branch (it will need to be rebased on master before then too).
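
For anyone repeating the DATABASE_URL step above, a minimal Python sketch of swapping the host and password in such a URL while keeping the user, port and database name. The function name and example values are hypothetical; the real URL and hostnames are not shown in this bug. The password is percent-encoded so that characters such as `@` or `/` do not break URL parsing.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def repoint_database_url(url: str, new_host: str, new_password: str) -> str:
    """Return url with its host and password replaced, everything else kept."""
    parts = urlsplit(url)
    user = parts.username or ""
    # Percent-encode the password so reserved characters survive parsing.
    netloc = f"{user}:{quote(new_password, safe='')}@{new_host}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


# Hypothetical values for illustration only.
old_url = "mysql://th_admin:old-secret@old-db.example.com:3306/treeherder"
print(repoint_database_url(old_url, "new-db.example.com", "p@ss/word"))
```

The result would then be set as the app's `DATABASE_URL` config var, which on Heroku triggers a new release.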
The migrations failed; I've left details in bug 1311185 comment 24.
Given bug 1311185 comment 25, we'll need to recreate it once the prod DB has had the last few manual changes made (next few days).

prototype is semi-usable at the moment, since I set IGNORE_PREDEPLOY_ERRORS=1 so it at least deployed and pointed at the new DB, and then re-enabled celerybeat.

In the meantime the old treeherder-heroku RDS instance is now unused and can be destroyed (ingestion was failing on it for the last few days due to it being even further behind with schema changes, so we're no worse off).
(In reply to Ed Morley [:emorley] from comment #16)
>
> In the meantime the old treeherder-heroku RDS instance is now unused and can
> be destroyed 

Thanks, nuking!  https://github.com/mozilla-platform-ops/devservices-aws/pull/28
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Blocks: 1321294