Closed Bug 730388 Opened 12 years ago Closed 11 years ago

[basket] Switch basket to use Celery

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

All
Other
task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlong, Assigned: cturra)

+++ This bug was initially created as a clone of Bug #725781 +++

It looks like the Celery servers have been set up, so now we need to test them and get basket switched over to them.

I need to commit some code changes (mainly error handling) before we do this too.
Assignee: server-ops → mburns
Assignee: mburns → cturra
:jlongster - looks like this slipped through the cracks a little bit. i wanted to touch base with you to see if you have done the code changes mentioned in your bug description? if so, i can work with you to get the celery servers configured for you.

sorry for the delay with this!
Status: NEW → ASSIGNED
:jlongster - i am going to resolve this as "won't fix" since so much time has passed and i suspect it has already been resolved, either in another bug or possibly another way. please don't hesitate to re-open if i am incorrect. again, i apologize for the delay on this!
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Hey Chris, the celery servers were never configured. I didn't see comment 2 because I don't work on mozilla.org anymore, but we need to set them up. I've been told recently that the ExactTarget 3rd party requests are timing out sometimes.

I'm not sure who you will be working with, maybe me, but I'll point the engagement team to this bug.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
:jlongster - thnx for the update. i just had a look at the celery configuration for basket-{dev|stage} and they appear to be configured correctly. can you please provide a little more information about how i can see this not working and the criteria to in fact test that things are working as expected?
:jlongster i am going to r/won'tfix this bug until we can get further details around my questions in comment 4. please feel free to have the engagement team re-open if they can provide further details.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → WONTFIX
(In reply to Chris Turra [:cturra] from comment #4)
> :jlongster - thnx for the update. i just had a look at the celery
> configuration for basket-{dev|stage} and they appear to be configured
> correctly. can you please provide a little more information about how i can
> see this not working and the criteria to in fact test that things are
> working as expected?

I'll be working on getting this working. The basket code isn't quite ready yet, but if dev is ready with rabbit, I'll do the work and try this out in the next day or two. Sorry this has languished for so long.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Blocks: 806327
WOO HOO! The code is finally ready. I've got my finger on the merge button. If you can add the proper celery broker configs to dev, I'll then push the changes and we can test.
:pmac - i have pushed the following config changes to basket. can you confirm it's what you're expecting?

  CELERY_ALWAYS_EAGER = False
  BROKER_HOST = "generic-celery-dev"
  BROKER_PORT = 5672
  BROKER_USER = "basket_dev"
  BROKER_PASSWORD = "basket_dev"
  BROKER_VHOST = "basket_dev"
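
For reference, the separate BROKER_* values above describe a single AMQP endpoint. The sketch below shows how they would combine into one URL of the form amqp://user:password@host:port/vhost, which newer Celery releases accept as BROKER_URL. The helper name broker_url is made up for illustration and is not part of basket or Celery.

```python
from urllib.parse import quote


def broker_url(host, port, user, password, vhost):
    """Assemble an AMQP URL from the individual BROKER_* values.

    Hypothetical helper for illustration only.
    """
    # Percent-encode credentials and vhost so special characters stay valid.
    return "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        vhost=quote(vhost, safe=""),
    )


# Values copied from the dev config in this comment.
DEV_BROKER_URL = broker_url(
    "generic-celery-dev", 5672, "basket_dev", "basket_dev", "basket_dev"
)
print(DEV_BROKER_URL)
```

Keeping the values separate, as in the config above, works just as well; the URL form is only a more compact way to write the same thing.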
That does look good. I'll be pushing to master soon and will test from there. Thanks :cturra!
Follow up questions:

1. I don't see a worker running for this on the dev machine. Is this handled elsewhere, or do we still need to figure this part out?
2. Deployment scripts will need updating to restart the worker processes (celeryd) when new code is pushed. Is this done, or easy? Hopefully you have a standard way of doing this for other apps?
3. What command do you (web ops) normally use to start the workers? Do you start it with -B (to start the celery-beat scheduler) and at what log level?

Thanks for any further info you can give. Perhaps my searching of Mana is weak, but I haven't found much there.
Depends on: 852696
Blocks: 841894
I wanted to note here that the queue of bugs waiting for this to be resolved is getting bigger.

Please advise on when you think there would be time to work on this.

As always your help is much appreciated.
Flags: needinfo?(cturra)
1) The service runs on a dedicated host (not the dev web server)

2) You will want to add the section to your deploy script like in:
https://github.com/mozilla/firefox-flicks/blob/master/bin/update/deploy.py#L106
and then the setting like in:
https://github.com/mozilla/firefox-flicks/blob/master/bin/update/commander_settings.py-dist#L10
which of course we will fill out as necessary.
And FYI this is already set up in the non-Chief update push script we have for you.

3) We are currently using supervisord to run these as services. The config line says:
command=/usr/bin/python2.6 /data/www/basket-dev.allizom.org/basket/manage.py celeryd --loglevel=INFO  -f /var/log/celeryd-basket-dev.log -c 4 

Let me know if you need more info.
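
A rough sketch of what the worker-restart step from point 2 could look like when the workers run under supervisord as in point 3. This is not the code from the linked deploy.py. The program name "celeryd-basket-dev" is an assumption (guessed from the log file name) and would have to match the actual [program:...] section in the supervisord config.

```python
import subprocess


def restart_worker_command(program):
    """Return the argv that asks supervisord to restart one program."""
    return ["supervisorctl", "restart", program]


def restart_worker(program, run=subprocess.check_call):
    """Restart the worker after a code push.

    `run` is injectable so the step can be dry-run or tested without
    touching a real supervisord.
    """
    command = restart_worker_command(program)
    run(command)
    return command


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    # "celeryd-basket-dev" is a hypothetical program name.
    restart_worker("celeryd-basket-dev", run=print)
```

Restarting through supervisorctl rather than killing the process directly keeps supervisord in charge of the worker, so it still gets brought back up if it exits later.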
Flags: needinfo?(cturra)
(In reply to Jason Crowe [:jd] from comment #12)

Great info! Thanks so much. I'd love to have the deploy script in the basket repo. The project is kinda old however, so it's not there. 

Supervisord is great, and thanks for the command. That's a lot of what I wanted to know. Is there a way for me to access that log? How is the service monitored? Are there alerts on queue size? Can there be? There was a time on stage when the queue wasn't doing anything, but I couldn't see or do anything about it. I'm just nervous because previously if basket went down we'd know because of errors on bedrock and other sites, but if celeryd stops working we won't know for a while, and if rabbit goes down during that we could lose thousands of subscriptions.
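
To make the queue-size question concrete, here is a minimal sketch of the kind of check an alert could be built on. It assumes someone with access to the RabbitMQ host can run `rabbitmqctl list_queues -p basket_dev name messages` and pass the output along. The sample listing, the queue names and the threshold of 1000 are invented for illustration and are not real data from these servers.

```python
def parse_queue_depths(listing):
    """Parse tab-separated `name<TAB>messages` lines into a dict.

    Banner lines such as "Listing queues ..." are skipped because they
    do not have a numeric second column.
    """
    depths = {}
    for line in listing.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip().isdigit():
            depths[parts[0]] = int(parts[1])
    return depths


def queues_over(depths, threshold):
    """Return the names of queues holding more than `threshold` messages."""
    return sorted(name for name, count in depths.items() if count > threshold)


# Invented sample output; "celery" is Celery's default queue name.
SAMPLE = "Listing queues ...\ncelery\t1200\nother_queue\t0\n...done.\n"

backlog = queues_over(parse_queue_depths(SAMPLE), threshold=1000)
print(backlog)
```

A check like this only catches a growing backlog. Noticing that rabbit itself is down would need a separate check.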
closing this bug off now that we have completed the celery deployment for basket through dev, stage and prod.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard