Move crons that depend on ADU an hour later

RESOLVED WONTFIX

Status

Socorro
Backend
--
major
RESOLVED WONTFIX
5 years ago
5 years ago

People

(Reporter: laura, Assigned: lonnen)

Tracking

unspecified

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qa-])

Description

5 years ago
I talked with deinspanjer. EOD processing has been delayed a lot lately because they just moved to a new Vertica cluster, and there are some teething troubles such as bug 853585.

Until they get those ironed out, let's move the crons that depend on ADU an hour later; hopefully that will circumvent most of the issues.

(I know 40 is frozen but this is only a crontab change, so I think we can do it.)
As of landing this: https://github.com/mozilla/socorro/commit/57a1149c5e5c085d78b30f8aeab7552ed3ca90c3
we will become much better at handling errors. The patch makes crontabber re-attempt a failing job after 5 minutes, and keep re-attempting every 5 minutes until it succeeds. Before, a failed job was not re-attempted until the next cycle (i.e., a day later).

All the dependent jobs will just patiently wait, almost as if the parent job were just super slow to execute.
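
To make the retry behaviour concrete, here is a minimal sketch of the scheduling rule as described in this comment. It is not crontabber's actual code: the names `next_run`, `can_run`, `RETRY_DELAY`, the `state` layout, and the job name `daily-matviews` are all hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta

# Assumed values: a 5-minute retry interval (from the comment above)
# and a daily job frequency. Both names are hypothetical.
RETRY_DELAY = timedelta(minutes=5)
DAILY = timedelta(days=1)


def next_run(last_attempt, succeeded, frequency=DAILY, retry_delay=RETRY_DELAY):
    """Return when a job should next be attempted.

    A successful job waits a full cycle; a failed job is retried
    after the short retry delay instead.
    """
    return last_attempt + (frequency if succeeded else retry_delay)


def can_run(job, state):
    """A job may run only if every job it depends on last succeeded.

    `state` maps a job name to a dict with a boolean "succeeded" flag
    (a hypothetical layout, not crontabber's real state format).
    """
    return all(
        state.get(dep, {}).get("succeeded", False)
        for dep in job["depends_on"]
    )


if __name__ == "__main__":
    failed_at = datetime(2013, 3, 29, 10, 0)
    print(next_run(failed_at, succeeded=False))  # retried 5 minutes later
    print(next_run(failed_at, succeeded=True))   # waits until the next day

    state = {"adu": {"succeeded": False}}
    dependent = {"name": "daily-matviews", "depends_on": ["adu"]}
    print(can_run(dependent, state))  # dependent waits while the parent fails
```

Under this rule a dependent job never runs against missing data; it simply stays blocked until the parent's retry eventually succeeds.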

The patch is really trivial, but the concept/business logic is not. So we'll keep an eye on crontabber and watch for corner cases we haven't thought about.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → WONTFIX

Updated

5 years ago
Whiteboard: [qa-]
Hmm... As of today, 29th March 2013, the new crontabber impatiently re-attempts every 5 minutes, basically giving the ADU data another chance every 5 minutes until it works.

However, at the time of writing it has re-attempted 40 times, i.e. 3 hours and 20 minutes (right?), meaning the delay is much more severe than just 1 additional hour.
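
A quick check of that arithmetic, assuming each re-attempt is spaced exactly 5 minutes apart (the helper name `total_delay` is made up for this sketch):

```python
from datetime import timedelta

# Assumed retry spacing, taken from the 5-minute interval mentioned above.
RETRY_INTERVAL = timedelta(minutes=5)


def total_delay(attempts, interval=RETRY_INTERVAL):
    """Total time spent waiting across a number of evenly spaced re-attempts."""
    return attempts * interval


if __name__ == "__main__":
    print(total_delay(40))  # 3:20:00
```

So 40 re-attempts do come to 3 hours and 20 minutes.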

Thoughts from metrics? Who should we aim this bug at?
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
After some discussion, re-opening this was maybe a bad idea. Moving the crons later would basically prepare us for failure. The ADU count needs to be fixed so that it works at 10:00 AM. Hopefully the current 3.5h delay is exceptional while metrics irons out some bugs, and it should return to normal soon.
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → WONTFIX