Closed Bug 1103140 Opened 11 years ago Closed 11 years ago

LagLog crontabber app fails to get the xlog_transform

Categories

(Socorro :: Database, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: peterbe, Assigned: selenamarie)

Details

See https://errormill.mozilla.org/webtools/socorro-prod/group/174397/

Stacktrace (most recent call last):
  File "crontabber/app.py", line 974, in _run_one
    for last_success in self._run_job(job_class, config, info):
  File "crontabber/base.py", line 138, in main
    function()
  File "crontabber/base.py", line 208, in _run_proxy
    return self.run(*args, **kwargs)
  File "socorro/cron/jobs/laglog.py", line 52, in run
    replay_location = self.xlog_transform(replay_location)
  File "socorro/cron/jobs/laglog.py", line 36, in xlog_transform
    logid, offset = xlog.split('/')
It fails here: https://github.com/mozilla/socorro/blob/master/socorro/cron/jobs/laglog.py#L52 because the value returned by `select replay_location from pg_stat_replication` is NULL for some reason. Any idea, Selena?
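For illustration, here is a minimal sketch of how the transform could tolerate a NULL `replay_location` rather than calling `.split('/')` on `None` (which raises `AttributeError`). This is not the code in laglog.py: the function name `xlog_to_bytes` is hypothetical, and the arithmetic assumes the location is a pair of hex numbers where the first is the high 32 bits and the second the low 32 bits.

```python
def xlog_to_bytes(xlog):
    """Convert an 'X/Y' hex location string into an integer position.

    Hypothetical helper: returns None when the location is NULL (None)
    so the caller can skip the row instead of crashing.
    """
    if xlog is None:
        return None
    logid, offset = xlog.split('/')
    # Assumption: logid is the high 32 bits, offset the low 32 bits.
    return (int(logid, 16) << 32) + int(offset, 16)


if __name__ == "__main__":
    for location in ("0/10", "1/0", None):
        print(location, xlog_to_bytes(location))
```

Whether a NULL row should be skipped, logged, or reported as lag is a separate decision for the job; the sketch only shows the guard.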
Flags: needinfo?(sdeckelmann)
Selena, this is triggering Nagios alerts. Would you mind looking into why the select is coming back with a null?
Assignee: nobody → sdeckelmann
Hey Matt -- Have you been running any replica builds in the past week? We're seeing some weird behavior from a monitor and I'm wondering if maybe it's just a replica creation related issue. It's not a big deal, just trying to rule it out. Thanks!
Flags: needinfo?(sdeckelmann) → needinfo?(mpressman)
I ran it a couple of times on Monday and Tuesday afternoon (11/24-11/25). I've been trying to keep an eye on the load to make sure there aren't any issues. It's also running right now, and this will be the last one needed. Let me know if this is causing or contributing to the issue and I'll run it at another time.
Flags: needinfo?(mpressman)
I downtimed the alert and now the backup is completed. There shouldn't be any more backups run against the current primary master, as the rest of the new hosts that still need to be set up will use the host that just completed.
(In reply to Matt Pressman [:mpressman] from comment #5)
> I downtimed the alert and now the backup is completed. There shouldn't be
> any more backups being run against the current primary master as the rest of
> the new hosts that still need to be set up will be using the host that just
> completed

Thanks so much! I believe the cause of this monitor blip is that we are running backups. The monitor is correctly detecting lag; it's just lag that is acceptable for the duration of the manual backup. I asked Matt to ack the monitor and indicate that a backup is running if it goes off in the future.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED