Closed Bug 865398 Opened 11 years ago Closed 11 years ago

deploy new snippets prod

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hoosteeno, Assigned: dmaher)

References

Details

(Whiteboard: [triaged 20130424])

Currently snippets staging and dev are much more modern than snippets prod. This creates some deploy risks -- testing on staging may or may not catch issues that will appear in prod. We experienced this during a recent deploy.

There are more deploys coming this quarter (:mkelly is working on a major refactor in bug 744613). We could make them less risky by making prod and stage look more similar.
Assignee: server-ops-webops → dmaher
Priority: -- → P4
Whiteboard: [triaged 20130424]
For clarification, we're referring to snippets-dev.allizom.org and snippets.allizom.org as dev and stage, not the old-but-still-kicking-for-no-good-reason snippets.stage.mozilla.com.
We are lucky in that the extant hardware is beefy enough that we don't need to spin up any new machines. Furthermore, the cluster itself is mildly over-provisioned. In light of these facts, here is what I propose:
* Set up the production deployment bits and pieces on snippetsadm1 (the "new" snippetsadm).
* Remove one production server from the load balanced pool and scrub the "old" snippets from it.
* Deploy the "new" snippets prod to the newly-cleaned server.
* Perform tests as necessary to confirm satisfaction.
* Re-insert said server into the load balancer pool.
* Rinse and repeat for the other two.

Once completed, we can (finally) deal with bug 808010 and get rid of snippetsadm (sans "1").
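
The rolling procedure proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual deployment tooling: the server names, the `deploy` and `healthy` callbacks, and the `min_in_service` parameter are hypothetical placeholders, and the load balancer pool is modelled as a plain list.

```python
from typing import Callable, List


def rolling_upgrade(
    pool: List[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    min_in_service: int = 1,
) -> List[str]:
    """Upgrade servers one at a time and return the final in-service pool.

    All names here are hypothetical; `deploy` and `healthy` stand in for
    whatever deployment and test steps are actually used.
    """
    in_service = list(pool)
    for server in pool:
        # Refuse to pull a server if too few would remain to serve traffic.
        if len(in_service) - 1 < min_in_service:
            raise RuntimeError(f"removing {server} would leave too few servers")
        in_service.remove(server)   # take it out of the load balanced pool
        deploy(server)              # scrub the old app and deploy the new one
        if not healthy(server):
            # Stop here; the failed server stays out of the pool.
            raise RuntimeError(f"{server} failed its checks; left out of the pool")
        in_service.append(server)   # re-insert it into the pool
    return in_service
```

Stopping at the first failed check means that at most one server is out of the pool at any time, which matches the intent of testing the first webhead carefully before touching the other two.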
Flags: needinfo?(hoosteeno)
It sounds like a plan. Questions:

* Is there any reason to think that 2 servers can't handle normal load?
* About how long should this entire operation take under the most ideal conditions?
* Does mkelly need to be present for some/all of the time? Testing, maybe?
* Is there any engineering prerequisite for doing this work?
* When?
Flags: needinfo?(hoosteeno) → needinfo?(dmaher)
Answers:

1. Honestly, one server could probably handle the load (but that's not very safe). :)
2. Setting up the deployment environment (snippetsadm) might take an hour or so, but that has no effect on service delivery, so it can take however long it takes.  The first webhead we touch might take an hour (just to make sure that it's been cleanly deployed) - the remaining two nodes would be faster.
3. Somebody with intimate knowledge of snippets should be on hand for the actual deployment.
4. I'm not sure what you mean by "engineering prerequisite".
5. Next week ?  Depends on the availability of the person noted in answer #3. :)  Note that I'm in UTC+2, so we'd need to do this during the morning in the Americas.
Flags: needinfo?(dmaher) → needinfo?(hoosteeno)
Oh! Oh! Pick me! Pick me!

I'm available any day before 12 Noon Eastern, and the only day I'm not available past that is Tuesday due to meetings.
06:43:14 <mkelly> Proceed at your own convenience. :D

WIP.
Status: NEW → ASSIGNED
Flags: needinfo?(hoosteeno)
07:04:42 <mkelly> Use master
07:05:01 <phrawzty> so "master" for dev, stage, and prod ?
07:05:14 <mkelly> Yeah

Updating stage and implementing for prod.
And we've hit a firewall problem : snippetsadm1 can't communicate with the database VIP. I'm going to back out of the upgrade (only snippets1 was touched, so it's not a big deal).

This is on hold until we get the flow opened (bug 866735).
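
For reference, a quick way to confirm whether a flow like this is open is a plain TCP connection attempt from the admin host. The sketch below is an assumption-laden example, not what was actually run: the function name is made up, and the host and port of the database VIP are deliberately left as parameters since they are not stated in this bug.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable...)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A blocked flow typically shows up as a timeout rather than an immediate refusal, so a short timeout keeps the check from hanging.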
New plan:

* Get two VMs set up to serve as the new Prod webheads.
* Configure the back-end and deploy.
* Point a (temporary) LB VIP at the new webheads.
* Run load simulations and other assorted tests.

On deployment day, we can actually just insert the new webheads into the existing VIP pool and let them take some load organically.  We'll then drain and disable the old webheads from the pool and that'll be that.
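
The deployment-day sequence can be sketched as a series of pool states. This is a simplified model under assumptions: the webhead names are placeholders, the function name is hypothetical, and draining is reduced to removing a name from a list rather than waiting for connections to finish.

```python
from typing import List


def cutover_steps(old: List[str], new: List[str]) -> List[List[str]]:
    """Return each successive state of the VIP pool during the cutover.

    New webheads are added first so they pick up load alongside the old
    ones; only then are the old webheads removed, one at a time.
    """
    pool = list(old)
    steps = [list(pool)]            # starting state: old webheads only
    for head in new:
        pool.append(head)           # insert a new webhead into the pool
        steps.append(list(pool))
    for head in old:
        pool.remove(head)           # drain and disable an old webhead
        steps.append(list(pool))
    return steps
```

Because every new webhead joins before any old one leaves, the pool never shrinks below the smaller of the two groups during the switch.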
Priority: P4 → P3
Summary: Snippets service: Let's make prod environments better → deploy new snippets prod
The VMs have been set up and configured, and the new prod app is available at snippets-prodtest.webapp.phx1.mozilla.com (VPN required for access).  We're having some problems with Chief that need to be resolved; however, from a functional standpoint, everything is ready to go for the load simulations and other tests.
Depends on: 891793
Depends on: 892292
No longer depends on: 891793
The new snippets service has been live in prod for some time now.  Closing this bug.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard