pulseguardian cannot delete exclusive queues, doesn't log about it
Categories
(Webtools :: Pulse, defect)
Tracking
(Not tracked)
People
(Reporter: dustin, Assigned: asmaknikar, Mentored)
Details
(Keywords: good-first-bug, Whiteboard: [lang=python])
Jan 17 12:59:53 pulseguardian app/worker.1: [Other] New queue. {"queuename": "queue/mrgigglesdev/tc-builds-dev", "queuesize": 113774, "queuedurable": false, "valid": true, "ownername": "mrgigglesdev", "newowner": false} ["queue"]
Jan 17 12:59:53 pulseguardian app/worker.1: [Other] Deleting queue. {"queuename": "queue/mrgigglesdev/tc-builds-dev", "queuesize": 113774, "warningthreshold": 4000, "deletionthreshold": 20000} ["queue"]
This was an exclusive queue, and pulseguardian kept trying to delete it but failed, and didn't log anything. Trying to delete it in the management console gave:
405 RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'queue/mrgigglesdev/tc-builds-dev' in vhost '/'
I'd like to have at least seen that in the logs.
CloudAMQP's alerting did let us know about this queue, which is how we knew to look.
The fix was to manually terminate the connection, which automatically deleted the queue. If this happens again we should probably try to automate that.
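A rough sketch of what that could look like, not taken from the pulseguardian code: try the delete, log the broker's reply if it is refused, and fall back to closing the connection that owns the queue. The `request` callable and the `owner_connection` argument are hypothetical stand-ins for whatever wrapper pulseguardian uses around the RabbitMQ management HTTP API; only the `DELETE /api/queues/<vhost>/<name>` and `DELETE /api/connections/<name>` endpoints are assumed to exist.

```python
import logging
from urllib.parse import quote

log = logging.getLogger("pulseguardian.sketch")


def delete_queue(request, vhost, queue_name):
    """Try to delete a queue and log the broker's reply on failure.

    `request` is a hypothetical callable(method, path) -> (status, body)
    wrapping the RabbitMQ management HTTP API. Returns True on success.
    """
    path = "/api/queues/{}/{}".format(
        quote(vhost, safe=""), quote(queue_name, safe="")
    )
    status, body = request("DELETE", path)
    if 200 <= status < 300:
        return True
    # This is the message that was missing from the logs in this bug.
    log.warning("Could not delete queue %s: HTTP %s %s",
                queue_name, status, body)
    return False


def delete_queue_or_close_owner(request, vhost, queue_name, owner_connection):
    """Delete a queue; if that fails, close the connection holding it.

    `owner_connection` is the management-API name of the connection that
    owns the exclusive queue (how to look it up is left to the caller).
    An exclusive queue is removed by the broker when its connection closes.
    """
    if delete_queue(request, vhost, queue_name):
        return True
    if owner_connection is None:
        return False
    path = "/api/connections/" + quote(owner_connection, safe="")
    status, body = request("DELETE", path)
    if 200 <= status < 300:
        log.info("Closed connection %s to release queue %s",
                 owner_connection, queue_name)
        return True
    log.error("Could not close connection %s: HTTP %s %s",
              owner_connection, status, body)
    return False
```

Passing the HTTP call in as a parameter keeps the sketch independent of any particular client library and makes the failure path easy to exercise in a unit test with a fake broker.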
Comment 1•5 years ago
Yeah, I saw a huge pile of email from this when I started working today, and immediately went in to shut the client down. (It's a development instance of the mrgiggles IRC bot. The main bot's pulse connection still seems happy.) I haven't investigated to see why it was behind. I assumed that the queue was being deleted, but the bot retried and re-created it, but it sounds like that's wrong?
I did get multiple emails per minute for a few hours out of it!
Reporter
Comment 2•5 years ago
It looks like the connection was stuck, not consuming any messages but holding the queue open. It's something pulseguardian should be resilient to.
Assignee
Comment 3•5 years ago
Is someone assigned or can I attempt to work on it?
Assignee
Comment 4•5 years ago
(In reply to Ashish Maknikar from comment #3)
> Is someone assigned or can I attempt to work on it?
I.e., can I be assigned? It is my first bug.
Reporter
Comment 5•5 years ago
Sure! Have you gotten pulseguardian development set up?
Assignee
Comment 6•5 years ago
Can you guide me? Is there a git repo, or is it included somewhere in the Mozilla source repo?
Reporter
Comment 7•5 years ago
The repository is here and has getting-started directions in the README. I'd recommend getting that cloned and getting the existing tests running, then looking into how you might reproduce the situation described above, then thinking about how to fix it.
Assignee
Comment 8•5 years ago
Did you have any issues with the Python absolute path while running tests? The test/runtests file does not seem able to find the pulseguardian module (it says the module does not exist). I have temporarily added it to the PYTHONPATH environment variable.
Reporter
Comment 9•5 years ago
I think you need to do these steps as well, before running the tests (this isn't clear from the README -- feel free to make a PR to clarify!)
Within the chosen environment, install and configure PulseGuardian:
Install the requirements:
    pip install -r requirements.txt
Install the package. This will ensure you have access to the pulseguardian package from anywhere in your virtualenv.
    python setup.py develop