Closed Bug 1426144 Opened 8 years ago Closed 8 years ago

[ops infra socorro] adu jobs failing

Categories

(Socorro :: Infra, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Attachments

(1 file)

The ADUCronApp and BuildADUCronApp jobs are failing:

"""
2017-12-19 15:45:27,086 ERROR - crontabber - - MainThread - Exception raised during socorro.external.postgresql.connection_context transaction
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/crontabber/transaction_executor.py", line 46, in __call__
    result = function(connection, *args, **kwargs)
  File "/app/socorro/cron/jobs/matviews.py", line 52, in run
    self.run_proc(connection, [target_date])
  File "/app/socorro/cron/jobs/matviews.py", line 28, in run_proc
    cursor.callproc(self.get_proc_name(), signature)
InternalError: raw_adi not updated for 2017-12-18
2017-12-19 15:45:27,118 INFO - crontabber - - MainThread - Error captured in Sentry. Reference: 35d50b6765a14e31823572a34e212a27
2017-12-19 15:45:27,118 DEBUG - crontabber - - MainThread - error when running <class 'socorro.cron.jobs.matviews.ADUCronApp'> on None
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/crontabber/app.py", line 1053, in _run_one
    for last_success in self._run_job(job_class, config, info):
  File "/usr/local/lib/python2.7/site-packages/crontabber/base.py", line 250, in main
    function(when)
  File "/usr/local/lib/python2.7/site-packages/crontabber/mixins.py", line 165, in _run_proxy
    **kwargs
  File "/usr/local/lib/python2.7/site-packages/crontabber/transaction_executor.py", line 46, in __call__
    result = function(connection, *args, **kwargs)
  File "/app/socorro/cron/jobs/matviews.py", line 52, in run
    self.run_proc(connection, [target_date])
  File "/app/socorro/cron/jobs/matviews.py", line 28, in run_proc
    cursor.callproc(self.get_proc_name(), signature)
InternalError: raw_adi not updated for 2017-12-18
2017-12-19 15:45:27,218 DEBUG - crontabber - - MainThread - about to run <class 'socorro.cron.jobs.matviews.BuildADUCronApp'>
2017-12-19 15:45:27,262 ERROR - crontabber - - MainThread - Exception raised during socorro.external.postgresql.connection_context transaction
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/crontabber/transaction_executor.py", line 46, in __call__
    result = function(connection, *args, **kwargs)
  File "/app/socorro/cron/jobs/matviews.py", line 52, in run
    self.run_proc(connection, [target_date])
  File "/app/socorro/cron/jobs/matviews.py", line 28, in run_proc
    cursor.callproc(self.get_proc_name(), signature)
InternalError: raw_adi has not been updated for 2017-12-18
2017-12-19 15:45:27,267 INFO - crontabber - - MainThread - Error captured in Sentry. Reference: d234d36dbac74674988f6df8b30fb2b0
2017-12-19 15:45:27,267 DEBUG - crontabber - - MainThread - error when running <class 'socorro.cron.jobs.matviews.BuildADUCronApp'> on None
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/crontabber/app.py", line 1053, in _run_one
    for last_success in self._run_job(job_class, config, info):
  File "/usr/local/lib/python2.7/site-packages/crontabber/base.py", line 250, in main
    function(when)
  File "/usr/local/lib/python2.7/site-packages/crontabber/mixins.py", line 165, in _run_proxy
    **kwargs
  File "/usr/local/lib/python2.7/site-packages/crontabber/transaction_executor.py", line 46, in __call__
    result = function(connection, *args, **kwargs)
  File "/app/socorro/cron/jobs/matviews.py", line 52, in run
    self.run_proc(connection, [target_date])
  File "/app/socorro/cron/jobs/matviews.py", line 28, in run_proc
    cursor.callproc(self.get_proc_name(), signature)
InternalError: raw_adi has not been updated for 2017-12-18
"""

I glean from the errors in the logs that there's no data in raw_adi. The RawADIMover should be adding data, so I'm not sure what the issue is. This bug covers figuring out what's going on and fixing it.
Making this a P1 and grabbing it. It's one of the last data problems with -stage-new.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: -- → P1
-stage-new has data in the raw_adi table for days up to and including 2017-12-17. The date that the crontabber jobs are complaining about is 2017-12-18. I looked at -stage and it has data for days up to and including 2017-12-18.

I looked at when the fetch-adi-from-hive job is set to run on -stage, -prod, and -stage-new:

* -prod: 8:00
* -stage: 1:55
* -stage-new: 22:28

For -prod, that's the *real* job that runs on that server in SCL3. That job runs *before* the two ADU jobs that are failing out right now.

For -stage, that's a *fake* job that just updates the crontabber bookkeeping so that the two ADU jobs, which depend on a successful run of it, run correctly.

For -stage-new, that's the RawADIMoverCronApp job, which simulates the real job. Running at 22:28, way after the ADU jobs, is wrong. It needs to run at 8:00.

I'm going to fix the crontabber job specs. Then once that gets to -stage-new, I'll stomp on the crontabber state for those three jobs and see if that fixes things.

If that doesn't fix things, then the next thing to look at is how RawADIMoverApp figures out target_date.
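The ordering problem described above can be sketched as a small check: a job that feeds data to other jobs has to be scheduled earlier in the day than the jobs that depend on it. This is only an illustration, not crontabber's own job-spec format or API. The `SCHEDULE` structure and `find_ordering_problems` are hypothetical names, and the 08:30 run time for the two ADU jobs is an assumption (it is the time used later in this bug when resetting their state).

```python
from datetime import time

# Hypothetical schedule: job name -> (daily run time, jobs it depends on).
# 22:28 is the -stage-new time for fetch-adi-from-hive before the fix.
# 08:30 for the ADU jobs is an assumption, not taken from the job specs.
SCHEDULE = {
    "fetch-adi-from-hive": (time(22, 28), []),
    "adu-matview": (time(8, 30), ["fetch-adi-from-hive"]),
    "build-adu-matview": (time(8, 30), ["fetch-adi-from-hive"]),
}


def find_ordering_problems(schedule):
    """Return (job, dependency) pairs where the dependency does not run
    strictly earlier in the day than the job that needs its data."""
    problems = []
    for job, (run_at, depends_on) in schedule.items():
        for dep in depends_on:
            dep_run_at = schedule[dep][0]
            if dep_run_at >= run_at:
                problems.append((job, dep))
    return sorted(problems)


# Before the fix: both ADU jobs run before their data source.
broken = find_ordering_problems(SCHEDULE)

# After moving fetch-adi-from-hive to 8:00, the ordering is consistent.
fixed_schedule = dict(SCHEDULE)
fixed_schedule["fetch-adi-from-hive"] = (time(8, 0), [])
fixed = find_ordering_problems(fixed_schedule)
```

The check only compares times within a single day, which matches the situation here, where all three jobs run once daily.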
Commits pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/4edabdbff25ecd987a25495a2c9c0d7e2a01ce2c
fixes bug 1426144 - fix job spec for fetch-adi-from-hive jobs

The fake one and the rawadimover one were set to run at the wrong time in relation to the adu jobs that need that data. This fixes that.

https://github.com/mozilla-services/socorro/commit/150b9d3aae6f809c0064a1e234c2d57f8d9b7ead
Merge pull request #4266 from willkg/1426144-rawadimover

fixes bug 1426144 - fix job spec for fetch-adi-from-hive jobs
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Oops... I'm going to reopen this until this is actually fixed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
We landed those changes and they got deployed to -stage-new. I then ran the following SQL to change the crontabber state for the RawADIMoverCronApp:

=> update crontabber set next_run='2017-12-19 08:20' where app_name='fetch-adi-from-hive';
UPDATE 1
=> update crontabber set last_run='2017-12-18 08:20', last_success='2017-12-18 08:20' where app_name='fetch-adi-from-hive';
UPDATE 1

That sets it up so that it should run at 8:20 every day going forward, and it adjusts the bookkeeping so that it looks like it hasn't run since yesterday.

I watched the logs and RawADIMoverCronApp ran and pulled data for 2017-12-18. That's perfect.

Then I ran this SQL to adjust the bookkeeping so that the two adu jobs would run for every day since 2017-12-14:

=> update crontabber set next_run='2017-12-18 08:30' where app_name='adu-matview' or app_name='build-adu-matview';
UPDATE 2
=> update crontabber set last_success='2017-12-14 08:30' where app_name='adu-matview' or app_name='build-adu-matview';
UPDATE 2

After that, everything updated and is in a "proper state".

I'll check it tomorrow to make sure it stays in a proper state, but I think we're done here. Marking as FIXED.
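For anyone repeating this kind of state reset, the bookkeeping updates above could be wrapped in a small parameterized helper. This is a minimal sketch under stated assumptions, not how the change was actually made here (that was plain SQL in psql). It uses Python's `sqlite3` with an in-memory table as a stand-in for the PostgreSQL crontabber table so it can run anywhere, it models only the four columns named in the SQL above, and `reset_job_state` is a hypothetical helper name.

```python
import sqlite3


def reset_job_state(conn, app_names, next_run=None, last_run=None,
                    last_success=None):
    """Set only the bookkeeping columns that were passed in, for the named
    jobs. Returns the number of rows updated."""
    updates = {
        "next_run": next_run,
        "last_run": last_run,
        "last_success": last_success,
    }
    columns = [(col, val) for col, val in updates.items() if val is not None]
    if not columns or not app_names:
        return 0
    # Column names come from the fixed dict above, never from user input;
    # all values are passed as bound parameters.
    set_clause = ", ".join("{} = ?".format(col) for col, _ in columns)
    placeholders = ", ".join("?" for _ in app_names)
    sql = "UPDATE crontabber SET {} WHERE app_name IN ({})".format(
        set_clause, placeholders)
    cur = conn.execute(sql, [val for _, val in columns] + list(app_names))
    conn.commit()
    return cur.rowcount


# Stand-in table (assumption): only the columns used in this bug.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crontabber (app_name TEXT PRIMARY KEY, next_run TEXT, "
    "last_run TEXT, last_success TEXT)")
conn.executemany(
    "INSERT INTO crontabber (app_name) VALUES (?)",
    [("fetch-adi-from-hive",), ("adu-matview",), ("build-adu-matview",)])

# Same values as the SQL run against -stage-new.
n1 = reset_job_state(conn, ["fetch-adi-from-hive"],
                     next_run="2017-12-19 08:20",
                     last_run="2017-12-18 08:20",
                     last_success="2017-12-18 08:20")
n2 = reset_job_state(conn, ["adu-matview", "build-adu-matview"],
                     next_run="2017-12-18 08:30",
                     last_success="2017-12-14 08:30")
```

The returned counts play the same role as the `UPDATE 1` and `UPDATE 2` lines in the psql output: they confirm that exactly the intended jobs were touched.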
Status: REOPENED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED