Closed
Bug 1205867
Opened 9 years ago
Closed 9 years ago
Migrate Pulse/PulseGuardian from phx1 to CloudAMQP/Heroku
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mcote, Unassigned)
References
Details
(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/1793] )
The time has come to move Pulse (the RabbitMQ cluster) and PulseGuardian (the Pulse management web app) out of phx1 and into CloudAMQP and Heroku, respectively. The migration plan is at https://docs.google.com/document/d/1F207nMJUXXxyDNuJuoPDfFzK39RSy0gOqrYMR-21AcQ/edit#
On the day of the migration, I will need someone on hand who can:
- close ports on the Pulse zlb,
- dump the PulseGuardian database and deliver it to me,
- dump the RabbitMQ definitions and deliver them to me, and
- update the pulse.mozilla.org DNS entry.
I've been testing everything I can ahead of time to make sure it goes smoothly. Jgriffin has agreed to be around to help with testing and such.
Comment 1•9 years ago
:mcote - this will also require new network flows to be added for the new IP addresses to be accessible by buildbot machines. That work doesn't need a TCW afaik, but will block this work. Can you open a bug against releng to add the flows, and include the IP addresses there? (And block this bug, of course) h/t :arr for thinking of this!
Flags: needinfo?(mcote)
Comment 2•9 years ago
:mcote also, could you add the information requested in https://wiki.mozilla.org/IT/ChangeControl#Submitting_a_Change_Request then set the "cab-review" flag please? Holler if you need a hand.
Assignee
Comment 3•9 years ago
Thanks! Filed request for netflows and CAB, blocking this bug.
Flags: needinfo?(mcote)
Assignee
Comment 4•9 years ago
As discussed we'll be doing this outside of a TCW. Current plan is to migrate at least Pulse itself next Wednesday, October 7th, around 5 pm PDT. PulseGuardian may be moved earlier.
Summary: Migration Pulse/PulseGuardian from phx1 to CloudAMQP/Heroku during next TCW → Migration Pulse/PulseGuardian from phx1 to CloudAMQP/Heroku
Assignee
Updated•9 years ago
Summary: Migration Pulse/PulseGuardian from phx1 to CloudAMQP/Heroku → Migrate Pulse/PulseGuardian from phx1 to CloudAMQP/Heroku
Assignee
Comment 5•9 years ago
PulseGuardian was successfully migrated to Heroku on 2015/10/06. We attempted to migrate Pulse to CloudAMQP today, but we neglected to get a proper SSL certificate for pulse.mozilla.org onto the CloudAMQP cluster, so clients with strict hostname checking were failing to connect. We've contacted CloudAMQP support to figure out the best way to handle this, whether that be providing them with our cert or some other mechanism. We should be good to retry next week.
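The failure mode described above (clients with strict hostname checking rejecting a certificate that does not cover pulse.mozilla.org) can be checked ahead of a cutover with a small probe. The following is only a sketch, not the procedure used in this migration: the helper names `strict_context` and `cert_matches` are hypothetical, and the default port 5671 (the conventional AMQP-over-TLS port) is an assumption.

```python
import socket
import ssl


def strict_context() -> ssl.SSLContext:
    """Return a client context that verifies the chain and the hostname."""
    ctx = ssl.create_default_context()
    # create_default_context() already enables both checks; they are set
    # explicitly here to make the "strict client" behaviour obvious.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def cert_matches(hostname: str, port: int = 5671, timeout: float = 5.0) -> bool:
    """Return True if a strict TLS handshake with hostname:port succeeds.

    Returns False when certificate verification fails (for example, a
    hostname mismatch). Other network errors are left to propagate.
    """
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with strict_context().wrap_socket(sock, server_hostname=hostname):
                return True
    except ssl.SSLCertVerificationError:
        return False
```

Calling `cert_matches("pulse.mozilla.org")` from a client network before switching DNS would, under these assumptions, show whether strict clients can complete the handshake.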
Assignee
Updated•9 years ago
Status: NEW → ASSIGNED
Assignee
Comment 6•9 years ago
The migration today succeeded. The VIPs for the old cluster have been turned off and all traffic is flowing through the CloudAMQP instance. The old cluster is still running but will be decommissioned in a day or two; see bug 1214636.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 7•9 years ago
The move of the RabbitMQ server totally broke our Mozmill CI because of the IP change (and maybe other changes). As a result, our machines can no longer connect to the broker. We should really announce such changes to all customers ahead of time rather than break them unexpectedly. I will handle our regression via bug 1215464.
Comment 8•9 years ago
(In reply to Hal Wine [:hwine] (use NI) from comment #1)
> :mcote - this will also require new network flows to be added for the new IP
> addresses to be accessible by buildbot machines. That work doesn't need a
> TCW afaik, but will block this work.
Exactly this was not done for our machines, so we are stranded now. Sadly I cannot open bug 1205889 at all due to permission issues, so I will most likely reference this bug in the one I will file now. I assume we need an identical setup.
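A missing network flow like the one described here shows up as a failed TCP connection from the affected machine to the broker. As a rough sketch (the helper name `can_reach` is hypothetical, and the host and port to probe are whatever the new cluster actually uses), a reachability check could look like this:

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

Running this from each client network before the cutover would flag machines that still need flows opened; it only tests TCP reachability, not TLS or AMQP authentication.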
Comment hidden (obsolete)
Comment hidden (obsolete)
Comment 11•9 years ago
The last two comments are not related to this topic. I suggest filing a new bug for that.
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard