Closed Bug 781010 Opened 13 years ago Closed 13 years ago

[crontabber] last_success is preventing backfilling on daily jobs with a time

Categories

(Socorro :: General, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: peterbe, Assigned: peterbe)

References

Details

Suppose you have a job like `socorro.cron.jobs.foo.FooApp|1d|18:00`. Also, suppose the job takes 5-10 minute to run. Now, the first time it wakes up (once #780290 lands) is, say, Monday 18:02:00. What you then get is something like this:: "slow-three": { "next_run": "Tuesday 18:00:00.000000", "first_run": "Monday 18:02:00.376068", "last_error": {}, "last_run": "Monday 18:02:00.376068", "last_success": "Monday 18:07:00.573181", "error_count": 0 }, Note how last_success reflects that it took 5 minutes to complete the job. Also, not that it took 2 minute after 18:00:00 for it to actually run. (since crontab starts crontabber every 5 min plus it's synchronous so other jobs delay) The next day, it starts at Tuesday 18:01:00 (1 minute earlier). Let's also look at the current code for how it does the backfilling which is this:: when = last_success seconds = convert_frequency(self.config.frequency) interval = datetime.timedelta(seconds=seconds) while (when + interval) < now: when += interval function(when) yield when Then: when = Monday 18:07:00 seconds = 1 day interval = 1 day now = Monday 18:01:00 when + interval = Tuesday 18:07:00 (when + interval) < now Thus: the while loop never loops because the condition is immediately False. When you get in the logs is therefore something like this:: Tuesday 18:01:00,184 DEBUG - MainThread - about to run <class 'socorro.cron.jobs.foo.FooApp'> because it's not time to run And you DON'T get:: Tuesday 18:01:00,184 DEBUG - MainThread - sucessfully ran <class 'socorro.cron.jobs.foo.FooApp'> And this is WRONG!
The proposed solution is to do this right before going into the while loop:: when = last_success if self.config.time: h, m = [int(x) for x in self.config.time.split(':')] when = when,.replace(hour=h).replace(minute=m) seconds = convert_frequency(self.config.frequency) interval = datetime.timedelta(seconds=seconds) while (when + interval) < now: when += interval function(when) yield when Then, it will be True on the first loop of the while loop. Q.E.D.
Assignee: nobody → peterbe
https://github.com/mozilla/socorro/pull/778 correctly simulates this and solves it. In short, the explanation lies in appreciating that even if crontabber fires at the exact same time every day, there might be multiple jobs in the queue and one of the earlier ones some times take LESS time. Preventing any drift on the hour/minute part of last_success solves this problem.
Depends on: 781989
Blocks: 778242
Commits pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/bdcdead9f5c07cc26d46a21baaa9092c9e170ce5 bug781010 - backfill jobs with time shouldn't break if previous jobs are faster than normal https://github.com/mozilla/socorro/commit/91bf382042cf214fa476b49b927ee518ddfe576f Merge pull request #778 from peterbe/bug781010-crontabber-drifting-last_success bug781010 - backfill jobs with time shouldn't break if previous jobs are...
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.