The current implementation only works on a single-node deployment. The application installs a crontab job that checks every hour whether there is something that needs to run. In a scenario with multiple web nodes, multiple copies of the job will run, one per web node.
A possible solution is to have a node in the cluster dedicated to the scheduler.
On second thought, it may be sufficient to have a single Python command running every minute in a cron job. :robotblake, is that doable in the Dockerflow infrastructure?
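For illustration, the idea above could look like the crontab entry below. The command name `run_due_jobs` is a placeholder, not an actual management command in this codebase:

```
# Every minute, run a hypothetical command that checks for and runs due work.
* * * * * /usr/bin/python manage.py run_due_jobs
```

The catch remains the same as with the hourly job: if this crontab is installed on every web node, each node runs its own copy.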
That's doable, but I feel like the right way to do it would be to use Celery / Huey / RQ and run workers either as separate containers on each node or using supervisor (or something similar) inside the containers.
:mdoglio and I talked about this a bit and, based on our experience, found that Celery would be far too complex for our use case. Instead I pitched RQ as a replacement, since it strikes a nice balance between the quality of its developer API and operational complexity (e.g. it can use Redis via ElastiCache). We'd use RQ Scheduler (https://github.com/ui/rq-scheduler) for periodic scheduling on top of RQ.
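A minimal sketch of what the RQ + rq-scheduler setup could look like, assuming a reachable Redis instance at `redis://localhost:6379/0`. The task name `expire_jobs` and the 60-second interval are placeholders for illustration, not taken from the actual codebase:

```python
def expire_jobs():
    """Placeholder periodic task; the real maintenance work would go here."""
    return "checked"


if __name__ == "__main__":
    from datetime import datetime

    from redis import Redis
    from rq_scheduler import Scheduler

    # rq-scheduler persists the schedule in Redis, so a single
    # `rqscheduler` process enqueues the job on time; regular
    # `rq worker` processes on any node pick it up and execute it.
    scheduler = Scheduler(connection=Redis.from_url("redis://localhost:6379/0"))
    scheduler.schedule(
        scheduled_time=datetime.utcnow(),  # first run
        func=expire_jobs,
        interval=60,   # seconds between runs
        repeat=None,   # repeat indefinitely
    )
```

Because only one `rqscheduler` process owns the clock while any number of workers consume the queue, this sidesteps the one-cron-per-web-node duplication described above.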
:jezdez, can you please split the work required to configure RQ into a set of bugs that block this one?
Created attachment 8791539 [details] [review] [telemetry-analysis-service] mozilla:bug-1302777-add-redis-instance > mozilla:master
This is on master now
Comment on attachment 8791539 [details] [review] [telemetry-analysis-service] mozilla:bug-1302777-add-redis-instance > mozilla:master This was reviewed on GitHub