Open Bug 1603633 Opened 1 year ago Updated 8 months ago

Upgrade RabbitMQ

Categories

(Webtools :: Pulse, enhancement, P5)

Tracking

(Not tracked)

People

(Reporter: dustin, Assigned: bpitts)

References

(Blocks 1 open bug)

Attachments

(2 files)

We are running a 3-year-old version of RabbitMQ. Let's upgrade.

My concern with doing this is less that Taskcluster will malfunction in some way (although that's possible) than that other uses of Pulse will malfunction.

Kim, I suspect you at least considered an upgrade at some point when your team managed this service. If you could add any context you can remember or find, that'd be great!

Flags: needinfo?(kmoir)

I have not personally considered an upgrade while on the team, but I recall discussing this with the support folks from CloudAMQP. I can't find any notes from the conversation, which is unusual for me. I did find this:
https://www.cloudamqp.com/blog/2016-05-03-upgrade-clusters.html

I would suggest reaching out to them for a meeting to discuss, especially since our version is so far behind and the upgrade process may be a bit different. Glob may have more information, but he is on PTO right now.

Flags: needinfo?(kmoir)

Thanks!

One helpful point here: we've been running staging and our dev environments off RabbitMQ 3.7.5 / Erlang 20.1 for some time with no issues.

Based on our meeting today, edunham will be taking the lead on this one.

Assignee: dustin → edunham

A few thoughts for when edunham is ready to pick this up:

Based on https://www.cloudamqp.com/blog/2020-01-30-rabbitmq-erlang-upgrades.html, it appears that the upgrade process is to first upgrade to the latest version of Erlang compatible with the current RabbitMQ version, then upgrade to the latest version of RabbitMQ compatible with that new Erlang version, then repeat the previous two steps until there are no more upgrades available. Each of those steps requires downtime.
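To make the shape of that process concrete, here is a minimal Python sketch of the alternating loop. The compatibility table is entirely made up (placeholder version numbers, not real RabbitMQ or Erlang releases), and `plan_upgrades` is a hypothetical helper name; the real table would have to come from CloudAMQP or the RabbitMQ documentation.

```python
from typing import Dict, List, Set, Tuple

RabbitVersion = Tuple[int, int]

# PLACEHOLDER data: maps each RabbitMQ version to the Erlang versions it runs on.
# These numbers are invented for illustration and are NOT real compatibility facts.
COMPAT: Dict[RabbitVersion, Set[int]] = {
    (1, 0): {10, 11},
    (2, 0): {11, 12},
    (3, 0): {12, 13},
}


def plan_upgrades(
    rabbit: RabbitVersion, erlang: int, compat: Dict[RabbitVersion, Set[int]]
) -> List[Tuple[str, object]]:
    """Alternate Erlang and RabbitMQ upgrades until neither can move forward."""
    steps: List[Tuple[str, object]] = []
    while True:
        progressed = False

        # Step 1: newest Erlang that the current RabbitMQ version supports.
        newest_erlang = max(compat[rabbit])
        if newest_erlang > erlang:
            erlang = newest_erlang
            steps.append(("erlang", erlang))
            progressed = True

        # Step 2: newest RabbitMQ that supports the Erlang we are now on.
        candidates = [r for r, es in compat.items() if erlang in es and r > rabbit]
        if candidates:
            rabbit = max(candidates)
            steps.append(("rabbitmq", rabbit))
            progressed = True

        if not progressed:
            return steps


plan = plan_upgrades((1, 0), 10, COMPAT)
for component, version in plan:
    print(component, version)
```

With this placeholder table the plan has five steps, so by the description above that would be five separate downtime windows. The length of the real sequence is one reason to get the exact list from support before committing to an in-place upgrade.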

It seems worth talking to CloudAMQP support to determine

  1. Exactly what sequence of Erlang and RabbitMQ updates we would need to take in order to get Pulse to the latest version of RabbitMQ 3.7.x, and an estimate of how long that would take
  2. What client changes may be required to go from 3.5 to 3.7

Depending on how complex the upgrade is and how risk-averse we are, it may be better to devise a plan to spin up and switch to a new 3.7 cluster rather than plan to upgrade the current 3.5 cluster.

Attached image pulse_versions.png
Attached image stage_versions.png
Duplicate of this bug: 1545506

Emily, can you share with me notes from any conversations you've had with the Taskcluster team and with CloudAMQP support on this topic?

Brian, I forwarded you the email thread "Re: Some questions about custom SSL certificate on orange-antelope/pulse.mozilla.org" where we discussed upgrading with support.

Notes from discussions with Dustin and Coop are in https://docs.google.com/document/d/18w8GdimPBnuPb-qzr0MYP3X_HpVbdQzoKJYCuXS106g/edit#heading=h.iueazmxrxbri

Assignee: edunham → bpitts

Based on the correspondence Emily forwarded me, Pulse is too far behind to upgrade in place. CloudAMQP recommends we create a new cluster and follow https://www.cloudamqp.com/docs/cluster_migration.html#seamless-migration-with-rabbitmq-queue-federation to switch to it.
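For anyone unfamiliar with queue federation, here is a rough Python sketch of the two RabbitMQ management API requests that would typically set it up on the new cluster: one defining the old cluster as a federation upstream, and one policy applying federation to queues. It only builds the requests and does not send them. The hostname, credentials, upstream name, policy name, and queue pattern are all placeholders, and the CloudAMQP guide linked above remains the authoritative procedure.

```python
import json
from typing import Dict, List
from urllib.parse import quote


def federation_requests(
    old_cluster_uri: str,
    vhost: str = "/",
    upstream_name: str = "old-cluster",   # hypothetical name
    policy_name: str = "federate-queues",  # hypothetical name
    pattern: str = ".*",                   # assumption: federate every queue
) -> List[Dict[str, object]]:
    """Build (but do not send) the management API calls for queue federation."""
    vhost_enc = quote(vhost, safe="")  # the default vhost "/" becomes "%2F"
    upstream = {
        "method": "PUT",
        "path": f"/api/parameters/federation-upstream/{vhost_enc}/{upstream_name}",
        "body": {"value": {"uri": old_cluster_uri}},
    }
    policy = {
        "method": "PUT",
        "path": f"/api/policies/{vhost_enc}/{policy_name}",
        "body": {
            "pattern": pattern,
            "definition": {"federation-upstream-set": "all"},
            "apply-to": "queues",
        },
    }
    return [upstream, policy]


# Placeholder URI; the real one would point at the existing 3.5 cluster.
requests_to_send = federation_requests("amqps://user:password@old-cluster.example.com/")
for req in requests_to_send:
    print(req["method"], req["path"])
    print(json.dumps(req["body"], indent=2))
```

With federated queues, consumers that have moved to the new cluster can still receive messages published to the old one, which is what would let publishers and consumers switch over at different times.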

Priority: -- → P5