Closed Bug 1498700 Opened 7 years ago Closed 7 years ago

Support a RabbitMQ deployment per TC deployment

Categories

(Taskcluster :: Services, enhancement)


RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)


I think we want to run RabbitMQ internally to the cluster. For production, obviously we'll want to talk to the "real" pulse server, so we'll need to keep that option in place in the configs.

Option: make taskcluster-mozilla-terraform support either RabbitMQ admin credentials from secrets or building its own RabbitMQ service and using its credentials, while taskcluster-terraform expects admin credentials.

Or: buy a "staging" CloudAMQP broker and use that; the docs (https://www.cloudamqp.com/docs/faq.html#separate_applications) suggest that it's possible to use vhosts on such a system. But pulse users are global across vhosts, so we would have to modify tc-pulse to add some unique prefix to usernames.

Brian, I know you had some thoughts about this and we even talked about it for a bit.
Flags: needinfo?(bstack)
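Since pulse users are global across vhosts on a shared broker, tc-pulse would need to make usernames unique per deployment. A minimal sketch of that idea (the function name and `deployment-user` scheme are hypothetical, not tc-pulse's actual behavior):

```javascript
// Hypothetical sketch: prefix pulse usernames with a deployment identifier
// so users created by different TC deployments cannot collide on a broker
// where usernames are global across vhosts.
function namespacedUsername(deployment, username) {
  // e.g. deployment "staging1" + user "queue" -> "staging1-queue"
  return `${deployment}-${username}`;
}

console.log(namespacedUsername('staging1', 'queue')); // "staging1-queue"
```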
Probably the easiest is to get a big CloudAMQP instance that allows management at the vhost level, like you say. It would be nice to be completely "airgapped" from pulse. We could also attempt to run a RabbitMQ host directly in either Kubernetes or GCP?
Flags: needinfo?(bstack)
See Also: → 1467880
I think we need to solve this soon. Right now all of the non-production instances are sharing the same RabbitMQ instance, and every installation after the first fails to set the user passwords correctly. A big mess! I suspect that just namespacing the usernames with ${var.dpl} will help, and I don't see anything in the plans documentation that suggests we can't use vhosts. But using tc-pulse is the better long-term solution for this.
Ah, we are indeed on a shared plan, as the admin user only has access to the / vhost and the vhost named after itself. Mark, would the price difference be a showstopper to upgrade to a multi-vhost plan? Given multiple vhosts, we can share a single instance across all staging and dev deployments of Taskcluster.
Flags: needinfo?(mcote)
Discussed on IRC; it is indeed a dedicated instance to which we can add new vhosts.
Flags: needinfo?(mcote)
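With a dedicated instance, each deployment can get its own vhost, which is selected via the path component of the connection URL. A sketch of how such a URL might be composed (the hostname, credentials, and helper name here are placeholders, not real values):

```javascript
// Hypothetical sketch: compose an AMQP connection URL with a per-deployment
// vhost. The vhost appears as the URL path and must be URL-encoded.
function amqpUrl({user, password, host, vhost}) {
  const u = encodeURIComponent(user);
  const p = encodeURIComponent(password);
  const v = encodeURIComponent(vhost);
  return `amqps://${u}:${p}@${host}/${v}`;
}

console.log(amqpUrl({
  user: 'taskcluster-auth',
  password: 's3cret',
  host: 'example.rmq.cloudamqp.com',
  vhost: 'dustindev',
}));
// amqps://taskcluster-auth:s3cret@example.rmq.cloudamqp.com/dustindev
```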
I had misunderstood how vhosts were handled, and forgotten that the "administrator" tag means that all of the other permissions RabbitMQ maintains are irrelevant :) https://github.com/taskcluster/taskcluster-mozilla-terraform/pull/23
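For reference, creating a per-deployment vhost and granting a user rights on it takes two calls to RabbitMQ's management HTTP API (`PUT /api/vhosts/{name}` and `PUT /api/permissions/{vhost}/{user}`). A sketch that only builds the request descriptors, without sending them (the helper names are hypothetical):

```javascript
// Hypothetical sketch of the two RabbitMQ management-API requests needed to
// add a vhost and grant a user full configure/write/read rights on it.
function createVhostRequest(vhost) {
  return {method: 'PUT', path: `/api/vhosts/${encodeURIComponent(vhost)}`};
}

function grantAllRequest(vhost, user) {
  return {
    method: 'PUT',
    path: `/api/permissions/${encodeURIComponent(vhost)}/${encodeURIComponent(user)}`,
    // ".*" regexps match every resource name in the vhost
    body: {configure: '.*', write: '.*', read: '.*'},
  };
}

console.log(createVhostRequest('dustindev').path); // "/api/vhosts/dustindev"
```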
I'm working on a patch for tc-pulse to support this, too.
That still doesn't quite do it, because we don't pass a namespace to pulse-publisher in most services. I'll fix up auth, since it's handy for testing dev deployments, but we'll leave the others to use tc-pulse.
Nah, we're going to need to make everything use tc-pulse to get this landed.
Depends on: 1488789, 1436456
> Nah, we're going to need to make everything use tc-pulse to get this landed.

This is because tc-client doesn't support a `namespace` argument, so it's trying to create a queue named `queue/dustindev-taskcluster-auth/exclusive/SWSR0engTV-hQAgyNe16yQ`. The fix is, I think, to use tc-lib-pulse everywhere, but to turn off and never use tc-pulse. I'll do that in bug 1499856.
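The failing queue name above follows a `queue/<namespace>/exclusive/<id>` pattern; without a distinct namespace per deployment, every deployment's service would claim the same queues on a shared broker. A sketch of the naming pattern only (the helper name is hypothetical):

```javascript
// Hypothetical sketch of the exclusive-queue naming seen in the error above.
function exclusiveQueueName(namespace, id) {
  return `queue/${namespace}/exclusive/${id}`;
}

console.log(exclusiveQueueName('dustindev-taskcluster-auth', 'SWSR0engTV-hQAgyNe16yQ'));
// queue/dustindev-taskcluster-auth/exclusive/SWSR0engTV-hQAgyNe16yQ
```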
Depends on: 1499856
Oops, I need to update that to pass a pulse username/password to pulse (at least in my dev env it's not connecting).
Looks like the issue was an incorrect password -- deleting the user and re-running `terraform apply` worked. I wish I knew why that was happening, but I'm hoping it's just due to all the pulse work lately.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Redeployability → Services