Closed Bug 1406019 Opened 7 years ago Closed 6 years ago

[ops infra socorro] working crontabber in ops stage

Categories

(Socorro :: Infra, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: miles, Assigned: miles)

References

Details

The crontabber will likely be deployed as a long-running node that is replaced via its own pipeline. However, if it makes sense to deploy the crontabber alongside another socorro app, that is also an option.

The webapp and processor in ops stage cannot be considered working until crontabber is working there.
Blocks: 1391034
Blocks: 1406020
Text borrowed from :willkg in bug 1407671:

Crontabber runs in a docker container in a script that's an infinite loop with 5 minute waits.
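For illustration, the loop described above could look roughly like the following. This is a minimal Python sketch, not the actual Socorro script (which may well be a shell script); `run_forever`, `run_jobs`, and `max_iterations` are hypothetical names, and `max_iterations` exists only so the loop can be exercised in a test.

```python
import time

WAIT_SECONDS = 5 * 60  # the 5 minute wait mentioned above


def run_forever(run_jobs, wait_seconds=WAIT_SECONDS, sleep=time.sleep,
                max_iterations=None):
    """Call run_jobs(), wait, and repeat.

    run_jobs is a hypothetical callable standing in for one crontabber run.
    max_iterations is an assumption added for testing; None means loop forever.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        run_jobs()
        iterations += 1
        sleep(wait_seconds)
    return iterations
```

Note that a loop like this has no way to learn that a deploy is in progress, which is the problem the rest of this comment describes.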

When we do a deploy, we want old-crontabber to finish what it's doing (which could take minutes) and then stop. However, the deploy might have run a migration that changed the database used by the job old-crontabber is currently running. In that case old-crontabber is hosed: whatever it does next is probably harmless, but it never updates the bookkeeping table to record that the job ran.

Do we change crontabber bookkeeping? Do we change the timeout values for "stale jobs"? Do we add poisoning to old-crontabber so that it stops?

This bug covers figuring out all that stuff and making it work.
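Two of the options raised above, "poisoning" old-crontabber and a timeout for stale jobs, can be sketched as follows. This is only a sketch under stated assumptions and not how crontabber works today: it assumes the old container receives SIGTERM when it is stopped (which is what `docker stop` sends by default), and `request_stop`, `run_until_stopped`, `is_stale`, and the one hour timeout are all hypothetical.

```python
import signal
import threading
from datetime import datetime, timedelta

# Set once the process has been "poisoned" and should stop starting new jobs.
stop_requested = threading.Event()


def request_stop(signum=None, frame=None):
    """Signal handler (hypothetical): ask the loop to stop after the current job."""
    stop_requested.set()


def install_handler():
    """Register the handler for SIGTERM; must be called from the main thread."""
    signal.signal(signal.SIGTERM, request_stop)


def run_until_stopped(jobs, wait_seconds=300):
    """Run (name, callable) jobs in a loop until a stop is requested.

    A job that is already running is allowed to finish; no new job starts
    after the stop request. Returns the names of the jobs that completed.
    """
    completed = []
    while not stop_requested.is_set():
        for name, job in jobs:
            if stop_requested.is_set():
                break
            job()
            completed.append(name)
        # Unlike time.sleep, Event.wait returns early when a stop is requested.
        stop_requested.wait(wait_seconds)
    return completed


def is_stale(started_at, now, timeout=timedelta(hours=1)):
    """Treat a job marked as ongoing for longer than timeout as stale.

    The one hour default is an assumption, not a crontabber setting.
    """
    return now - started_at > timeout
```

With something like this, old-crontabber would finish its current job and exit, and a new crontabber could reclaim any job whose bookkeeping row was left in an "ongoing" state for longer than the timeout.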
Regarding comment 2:

Right now we handle crontabber the way we handle most apps: we deploy the new stack, then scale down and clean up the old stacks.

We've asserted that this might be a bad policy for crontabber, because running multiple crontabbers at the same time is bad. That's just what we have right now. The question is still open.
Making this a P2. This needs to work in the new infra.
Priority: -- → P2
Miles and I fixed the database data and some configuration, and I think all the jobs we need are now running correctly. The only one that is busted is the MissingSymbolsCronApp, which we don't want to run anymore.

Given that, I'm going to mark this FIXED.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED