Closed Bug 644030 Opened 13 years ago Closed 13 years ago

Spark stage: restart Celery on each cron update

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: franck.bugzilla, Assigned: fox2mike)

Details

This follows bug 641004, which enabled Celery on stage. I figured I'd file a new bug instead of reopening that one since it's a different issue.

Currently I'm getting very different background task behavior on stage vs. dev. I suspect this is because Celery is not being restarted.

As a result, we need celeryd to be restarted each time tasks are modified in the source code.

Unless I'm mistaken, the only way to do this is to restart it every time update_site.py is run. This means on every update cron job.
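A minimal sketch of that idea, assuming a small wrapper is what the cron job calls: it runs the update and restarts celeryd only if the update succeeded. The command paths, the init script location, and the function name `update_and_restart` are all assumptions for illustration; only the `update_site.py` script name and the `celery-spark-stage` service name appear in this bug.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical commands: the real paths are not given in this bug.
UPDATE_CMD = ["python", "update_site.py"]
RESTART_CMD = ["/etc/init.d/celery-spark-stage", "restart"]


def update_and_restart(run: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Run the site update, then restart celeryd so it reloads task code.

    `run` takes a command and returns its exit status; it is injectable
    so the sequence can be exercised without touching a real service.
    Returns True only if both steps exit with status 0.
    """
    if run(UPDATE_CMD) != 0:
        # Update failed: leave the running workers alone.
        return False
    return run(RESTART_CMD) == 0
```

Restarting only after a successful update keeps a failed update from bouncing workers that are still running the old, working task code.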

---

I'm setting this bug as blocker because it's affecting our ability to properly test challenge functionality.
No offence, but a stage site after business hours isn't a blocker. Especially on release day.
Assignee: server-ops → shyam
(In reply to comment #1)
> No offence, but a stage site after business hours isn't a blocker. Especially
> on release day.

I'm aware that today is a special day. Perhaps I'm confused as to what "blocker" means in a Server Operations context. It was not my intention to imply that people should get this resolved after business hours.

I simply wanted to emphasize the fact that QA won't be able to test the site properly and since we have a silent launch scheduled in two days, it's blocking us from moving forward.
I've updated the cron to restart Celery every time there's an update to the site.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Thanks Shyam for the quick fix.
(In reply to comment #2)
> (In reply to comment #1)
> > No offence, but a stage site after business hours isn't a blocker. Especially
> > on release day.
> 
> I'm aware that today is a special day. Perhaps I'm confused as to what
> "blocker" means in a Server Operations context. It was not my intention to
> imply that people should get this resolved after business hours.

So here's what happens:

Blocker  - pages the oncall sysadmin immediately, 24/7/365.
Critical - pages the oncall sysadmin after 8 hours (of being in the queue).
Major    - same as above, after 24 hours (instead of 8).

Blockers are for production websites being down (usually). Stage sites have IT support, 0900 to 1700 PDT, 5 days a week.
(In reply to comment #5)
 
> Blockers are for production websites being down (usually). Stage sites have IT
> support, 0900 to 1700 PDT, 5 days a week.

That's not really true, or at least it shouldn't be: we have at least two WebQA engineers in the UK who depend on staging sites being up, and who should be treated the same as people working 9 AM to 5 PM PDT in the US (I spoke to Corey about this, too).
(In reply to comment #5)

Sorry, I was not aware that setting a bug as blocker would page a sysadmin automatically. Besides, I'm located in Paris, France, so I don't always have a sense of when/where everyone is working.

---

About the Celery restart,

I'm getting emails from the Cron daemon every 5 minutes with the following message:
Stopping celery-spark-stage: celery-spark-stage: stopped
[  OK  ]

Typically I receive messages from Cron when there is an error. I figured I would ask you if there's any way not to receive those Celery-specific messages or if this is normal and I should just ignore them.
(In reply to comment #4)
> Thanks Shyam for the quick fix.

You're welcome!
(In reply to comment #7)
> (In reply to comment #5)
> 
> Sorry, I was not aware that setting as blocker would page a sysadmin
> automatically. Besides, I'm located in Paris, France so I don't always have a
> sense of when/where everyone is working.

Sure, hence the information.

> Typically I receive messages from Cron when there is an error. I figured I
> would ask you if there's any way not to receive those Celery-specific messages
> or if this is normal and I should just ignore them.

Fixed: I put the restart into a separate cron job, so it won't email you for the restart. It now runs a minute after the update script.
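For context, cron mails whatever a job writes to stdout or stderr, which is why a successful "stopped [ OK ]" message still produced an email. The sketch below shows one alternative way to keep the mail for failures only; it is not necessarily what was done on the stage server. The helper name `run_quietly` is hypothetical.

```python
import subprocess
import sys
from typing import Sequence


def run_quietly(cmd: Sequence[str]) -> int:
    """Run cmd and stay silent on success.

    Output is captured instead of printed, so a cron job calling this
    produces no mail when the command exits with status 0. On failure
    the captured output is written to stderr, so cron still reports it.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured output
        text=True,
    )
    if result.returncode != 0:
        sys.stderr.write(result.stdout)
    return result.returncode
```

A cron entry would call this with the restart command for the service; the exit status is passed through so a caller can still act on failures.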
Product: mozilla.org → mozilla.org Graveyard