Some ETL jobs take too long (over 12 hours). When that happens, another machine may start working on the same job in the meantime; each machine then overwrites the other's results, and both complain about the resulting inconsistency. These big jobs eventually error out and are left on the queue for another machine to work on. Over time, the queue becomes saturated with these long-running jobs, which consume the resources of all machines and prevent further ETL. Find one of these jobs (they are still on the queue) and fix the problem.
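The failure mode described above can be reproduced with a small simulation. This is only a sketch under assumptions: it supposes the queue hides a received message for a fixed period (a visibility timeout) and that this period is about 12 hours, neither of which the report states. The names `Message`, `Queue`, and `count_pickups` are hypothetical and do not come from the ETL code.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    job_id: str
    # Assumed behaviour: a received message is hidden from other
    # machines until this time (in hours) has passed.
    invisible_until: float = 0.0


@dataclass
class Queue:
    visibility_timeout: float  # hours; assumed, not taken from the report
    messages: list = field(default_factory=list)

    def receive(self, now: float):
        """Return the first visible message and hide it for the timeout."""
        for message in self.messages:
            if message.invisible_until <= now:
                message.invisible_until = now + self.visibility_timeout
                return message
        return None


def count_pickups(job_hours: float, timeout_hours: float) -> int:
    """Count how many machines receive one job before the first finishes.

    A machine polls the queue once per hour. The job is only removed
    when the first machine finishes, so it stays on the queue until then.
    """
    queue = Queue(visibility_timeout=timeout_hours,
                  messages=[Message("hypothetical-big-etl-job")])
    pickups = 0
    now = 0.0
    while now < job_hours:
        if queue.receive(now) is not None:
            pickups += 1
        now += 1.0
    return pickups


if __name__ == "__main__":
    print(count_pickups(job_hours=6, timeout_hours=12))   # one machine only
    print(count_pickups(job_hours=14, timeout_hours=12))  # a second machine joins
```

Under these assumptions, a job that outlives the timeout is handed to a second machine while the first is still working, which matches the overwriting described in the report. Splitting the job into smaller pieces or extending the hide period while work is in progress are two possible directions; the report does not say which applies.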
If this is still a problem, I have not noticed it for a while.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX