Closed Bug 1201659 Opened 9 years ago Closed 9 years ago

[mig] 2015-09-10 backward incompatible release

Categories

(Enterprise Information Security Graveyard :: MIG, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: jvehent, Assigned: jvehent)

References

Details

Recent changes in the MIG backend infrastructure required us to break backward compatibility. When we release the latest version of the MIG scheduler, old agents will no longer be reachable. For this reason, we want to synchronize the backend release with a deployment of new agents across IT, releng, services and opsec.

An important note: while agents running the old version will no longer be reachable, the agents themselves will keep functioning correctly and will not notice they are talking to the void. There will be no impact on their hosts, and nothing will break. Nothing will even show in the logs until we delete the old rabbitmq exchanges, days after the release. By that time, hopefully, all agents will have been upgraded.

Below is a list of tasks that need to be performed before, during and after the release. Most actions will be taken by Aaron and myself. There is one action for Dustin: merging the patch in build-puppet on release day.

Preparation
-----------
1. Perform sanity check on scribe, pkg and memory modules {ulfr+alm}
2. Update GPG keys {alm}
    - ops -> access to file, pkg and netstat
    - sec -> access to *
3. Create an automated investigator key, give it ACL access to scribe, pkg, file and netstat {alm}
4. Update build script to copy available modules to new location {alm}
5. Enable scribe, pkg, memory modules everywhere {alm}
6. Build mig-agent packages for all organizations {alm}
7. Prepare patch for releng deployment {ulfr}
    - publish new mig-agent packages to releng repositories
    - create deploy bug for releng, wait for r+
8. Prepare patch for opsec puppet {ulfr}
    - create new exchanges with new ACLs in rabbitmq
    - deploy new scheduler/api/workers {ulfr}

Release Day (starts 1100EDT, 0800PDT)
-------------------------------------
9. Deploy new servers in moz-opsec {ulfr}
10. Merge patch into releng build-puppet {dustin}
11. Publish new packages in mrepo {ulfr}
12. Publish new packages in SVC mrepo {ulfr}
13. Publish new packages in opsec puppet & ansible {alm}

Cleanup
-------
14. Write a script that pulls heartbeats from the old exchange to detect non-updated agents {ulfr}
15. Delete old exchange and queues in rabbitmq admin UI {ulfr}
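
A minimal sketch of the detection logic behind step 14, not the actual script. It assumes heartbeats have already been consumed from the old exchange as raw JSON message bodies (the rabbitmq consumer itself is left out), and that each heartbeat carries an agent name and an ISO 8601 UTC timestamp. The field names `name` and `heartbeatts` and the function `stale_agents` are hypothetical and only serve the illustration. Any agent still seen on the old exchange has not been upgraded.

```python
import json
from typing import Dict, Iterable


def stale_agents(heartbeats: Iterable[bytes]) -> Dict[str, str]:
    """Map each agent seen on the old exchange to its latest heartbeat time.

    Assumption: every message body is a JSON object with a "name" field and
    a "heartbeatts" field (hypothetical names) holding an ISO 8601 UTC
    timestamp. Timestamps in one consistent format sort lexicographically.
    """
    latest: Dict[str, str] = {}
    for body in heartbeats:
        try:
            hb = json.loads(body)
            name = str(hb["name"])
            seen = str(hb["heartbeatts"])
        except (ValueError, KeyError, TypeError):
            # Skip anything that is not a well-formed heartbeat.
            continue
        if name not in latest or seen > latest[name]:
            latest[name] = seen
    return latest


if __name__ == "__main__":
    sample = [
        b'{"name": "host1", "heartbeatts": "2015-09-10T15:00:00Z"}',
        b'{"name": "host2", "heartbeatts": "2015-09-10T15:01:00Z"}',
    ]
    for agent, seen in sorted(stale_agents(sample).items()):
        print(agent, seen)
```

Once this list stays empty for long enough, the old exchange and queues can be deleted as in step 15.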
Sounds good.  Keep in mind that because of the way we bake AWS instances nightly, there's a 24-hour lag on changes to all of our AWS buildslaves.  It doesn't sound like that would be a problem.  We can accelerate that to "a few hours" by manually kicking off a baking process, but it's still not instantaneous.
That should be alright; we can live without MIG for half a day.
QA took longer than expected, so we're going to release tomorrow instead.
Summary: [mig] 2015-09-09 backward incompatible release → [mig] 2015-09-10 backward incompatible release
Depends on: 1203559
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
Product: Enterprise Information Security → Enterprise Information Security Graveyard