Closed
Bug 690360
Opened 14 years ago
Closed 14 years ago
Stuck cron jobs on AMO?
Categories: mozilla.org Graveyard :: Server Operations, task
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: clouserw, Assigned: nmaul
We've been getting mail every few minutes for the past few days about some failing cron jobs. I was hoping they would recover on their own, but that doesn't appear to be happening. This is production AMO, and these are the cron jobs:
/usr/bin/python2.6 manage.py cron update_collections_subscribers
/usr/bin/python2.6 manage.py cron update_addons_current_version
/usr/bin/python2.6 manage.py cron update_collections_votes
Before a cron job runs, it checks whether a previous copy is still running (in which case it won't run). These three are claiming that a previous copy is running every time. The lock files are these, respectively:
`/tmp/django_cron.lock.update_collections_subscribers`
`/tmp/django_cron.lock.update_addons_current_version`
`/tmp/django_cron.lock.update_collections_votes`
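A lock-file guard of the kind described above can be sketched roughly as follows. This is not AMO's actual cron code: `lock_path`, `acquire_lock`, and `run_exclusive` are hypothetical names, and the sketch is written for Python 3 rather than the python2.6 the jobs run under. It only illustrates why a job that hangs keeps every later run from starting: the lock is released only when the job returns.

```python
import os

# Prefix taken from the lock file names listed in this bug.
LOCK_PREFIX = "django_cron.lock."


def lock_path(job_name, lock_dir="/tmp"):
    """Build the lock file path for a job (hypothetical helper)."""
    return os.path.join(lock_dir, LOCK_PREFIX + job_name)


def acquire_lock(job_name, lock_dir="/tmp"):
    """Return True if the lock file was created, False if it already exists."""
    try:
        # O_CREAT | O_EXCL fails if the file is already there, so the
        # existence check and the creation happen in one atomic step.
        fd = os.open(lock_path(job_name, lock_dir),
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_exclusive(job_name, job, lock_dir="/tmp"):
    """Run job() unless another copy holds the lock. Returns True if it ran."""
    if not acquire_lock(job_name, lock_dir):
        return False  # a previous copy is (or claims to be) still running
    try:
        job()
    finally:
        # Only reached when job() returns or raises. A hung job never gets
        # here, so its lock file stays in place indefinitely.
        os.remove(lock_path(job_name, lock_dir))
    return True
```

With a guard like this, a lock file that is several days old points to either a process that is genuinely still alive or one that died without cleaning up, which is why the file timestamps and the process list are the first things to check.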
I don't have clear steps here so I'm filing this bug to feel out the situation.
Can you give us the timestamps on those files?
Can you confirm that there aren't more copies of AMO cron jobs running on addonsadm.private.phx1? In the past we've had stage and prod on the same box and the cron jobs collided.
Anything else fishy you can think of?
Comment 1•14 years ago (Assignee)
apache apache 0 Sep 23 12:00 django_cron.lock.update_addons_current_version
apache apache 0 Sep 23 12:05 django_cron.lock.update_collections_subscribers
apache apache 0 Sep 23 12:25 django_cron.lock.update_collections_votes
And it looks like they really are still running:
[root@addonsadm.private.phx1 tmp]# ps axo pid,etime,command
<snip>
27195 6-00:26:47 /usr/bin/python2.6 manage.py cron update_collections_votes
25549 6-00:47:29 /usr/bin/python2.6 manage.py cron update_collections_subscribers
21470 6-00:52:44 /usr/bin/python2.6 manage.py cron update_addons_current_version
They're all for prod, by the way. None of them currently running for stage or dev.
Want me to kill them and rm those 3 lock files?
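The elapsed-time column in the `ps` output above uses the `[[dd-]hh:]mm:ss` format, so `6-00:26:47` means a little over six days. A minimal sketch of how such output could be scanned for overlong cron runs is below. `parse_etime` and `long_running` are hypothetical helper names, and the `"manage.py cron"` filter string is an assumption based on the command lines shown in this bug.

```python
def parse_etime(etime):
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields, e.g. "05:00" has no hours component.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def long_running(ps_lines, max_seconds, needle="manage.py cron"):
    """Return (pid, command) for matching processes older than max_seconds.

    ps_lines is the output of `ps axo pid,etime,command`, one line per item.
    """
    hits = []
    for line in ps_lines:
        fields = line.split(None, 2)
        # Skip headers, blank lines, and anything that is not a cron job.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, etime, command = fields
        if needle in command and parse_etime(etime) > max_seconds:
            hits.append((int(pid), command))
    return hits
```

For example, calling `long_running(lines, 3600)` on the three lines above would flag all three jobs, since each has been running for about six days.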
Assignee: server-ops → nmaul
Comment 2•14 years ago (Reporter)
Wow, yes please.
We can keep an eye on them when they rerun to make sure it's not something that is going to keep happening.
Thanks.
Comment 3•14 years ago (Assignee)
All 3 killed and their lockfiles removed. Also killed the job and lockfile for 'manage.py cron deliver_hotness', per IRC conversation.
So far, things look to be back to normal. I'm not seeing any long-running "manage.py cron" processes hanging around, and update_collections_votes, at least, just ran. I think this is fixed.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → mozilla.org Graveyard