When a commit goes to GitHub, GitHub sends a POST to a few services at Mozilla: one to gitmirror.mozilla.org (which is working fine) and one to updater.allizom.org (which is broken). The POST to updater.allizom.org is the one that tells the project to update its code.

A few months ago we had a problem where the updater.allizom.org script would freeze up on Fridays. I don't know that we ever found a solution, but it seemed to stop happening until last Friday. flightdeck -dev has been broken since then (https://builder-addons.allizom.org/media/updater.output.txt). Rumor has it that restarting the celery that updater.allizom.org talks to will fix this. I don't know which box that is (it'll be in the config).

Filing as critical because this is blocking testing and we have a push today (and it's the end of the quarter and this stuff needs to go out). We're on IRC if we can help.
Component: Server Operations → Server Operations: Web Operations
This is fixed. Troubleshooting took a while because the documentation on how this whole process works was severely lacking, but I've updated the mana page for builder, so it's better now.

The TL;DR is that on Friday, mradm02 was rebooted. Supervisord was not set to start on boot, so the celeryd-updater job did not start. I started it and set supervisord to start on boot, so we shouldn't have this problem again.

Here's a link to the updater output. Note that this is Zeus-cached, so there is sometimes a delay between when a job runs and when it's visible here: https://builder-addons.allizom.org/media/updater.output.txt

Here's a link to the Mana documentation for how this works and how to trigger a manual update: https://mana.mozilla.org/wiki/display/websites/builder.addons.mozilla.org#builderaddonsmozillaorg-UpdatePushprocedure
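For anyone who wants to catch this failure mode sooner, here is a minimal sketch of a check that the updater job is up after a reboot. It assumes the supervisord program is named `celeryd-updater` (the comment above names the job, but the exact program name in the supervisord config is an assumption) and that `supervisorctl status` prints one program per line as `name STATE details`. The function names are hypothetical and this is not the check used on mradm02.

```python
import subprocess


def parse_supervisor_status(output):
    """Map program name -> state from `supervisorctl status` text."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        # Expected shape: "<name> <STATE> <details...>"
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def is_running(output, program="celeryd-updater"):
    """True only if the named program is reported as RUNNING."""
    # "celeryd-updater" is an assumed program name; adjust to the real config.
    return parse_supervisor_status(output).get(program) == "RUNNING"


def check_live(program="celeryd-updater"):
    """Run the real CLI; returns False if supervisorctl is missing or down."""
    try:
        result = subprocess.run(
            ["supervisorctl", "status"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return False
    return is_running(result.stdout, program)
```

Run from cron or a monitoring check, `check_live()` returning False after a reboot would point at the situation described above: supervisord itself not started, or the job not running under it.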
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Thanks for the help. If mradm02 reboots on Fridays, that would also explain the mysterious failures in the past.
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations