Closed
Bug 1103140
Opened 11 years ago
Closed 11 years ago
LagLog crontabber app fails to get the xlog_transform
Categories
(Socorro :: Database, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: peterbe, Assigned: selenamarie)
Details
See https://errormill.mozilla.org/webtools/socorro-prod/group/174397/
Stacktrace (most recent call last):
  File "crontabber/app.py", line 974, in _run_one
    for last_success in self._run_job(job_class, config, info):
  File "crontabber/base.py", line 138, in main
    function()
  File "crontabber/base.py", line 208, in _run_proxy
    return self.run(*args, **kwargs)
  File "socorro/cron/jobs/laglog.py", line 52, in run
    replay_location = self.xlog_transform(replay_location)
  File "socorro/cron/jobs/laglog.py", line 36, in xlog_transform
    logid, offset = xlog.split('/')
Comment 1•11 years ago (Reporter)
It fails here: https://github.com/mozilla/socorro/blob/master/socorro/cron/jobs/laglog.py#L52 because `select replay_location from pg_stat_replication` returns NULL for some reason, so `xlog_transform` ends up calling `split` on a null value.
Any idea Selena?
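For illustration, here is a minimal sketch of how the transform could tolerate a NULL `replay_location`. It is not the code in `laglog.py`: only the `split('/')` line is taken from the stack trace above. The standalone function signature, the early return, and the hex conversion are assumptions, the last one based on PostgreSQL printing xlog locations as two hexadecimal numbers (high 32 bits, then low 32 bits).

```python
def xlog_transform(xlog):
    """Convert an xlog location such as '16/B374D848' to a byte position.

    Returns None when the location is missing. A SQL NULL arrives from the
    database driver as None, and calling None.split('/') raises
    AttributeError, so the missing value is handled before splitting.
    """
    if xlog is None:
        # Hypothetical handling: the caller decides whether to skip the row.
        return None
    logid, offset = xlog.split('/')
    # Assumption: logid is the high 32 bits and offset the low 32 bits,
    # both written in hexadecimal.
    return (int(logid, 16) << 32) + int(offset, 16)
```

With a guard like this, the caller could skip or log a replica that has no replay location yet instead of failing the whole job. Whether skipping is the right behavior for a monitor is a separate decision.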
Flags: needinfo?(sdeckelmann)
Comment 2•11 years ago (Reporter)
Selena, this is alerting in Nagios. Would you mind looking into why the select is coming back with a null?
Assignee: nobody → sdeckelmann
Comment 3•11 years ago (Assignee)
Hey Matt -- Have you been running any replica builds in the past week? We're seeing some weird behavior from a monitor and I'm wondering if maybe it's just a replica creation related issue. It's not a big deal, just trying to rule it out.
Thanks!
Flags: needinfo?(sdeckelmann) → needinfo?(mpressman)
Comment 4•11 years ago
I ran it a couple of times on Monday and Tuesday afternoon (11/24-11/25). I've been trying to keep an eye on the load to make sure there aren't any issues. It's also running right now, and this will be the last one needed. Let me know if this is causing or contributing to the issue and I'll run it at another time.
Flags: needinfo?(mpressman)
Comment 5•11 years ago
I downtimed the alert and now the backup is completed. There shouldn't be any more backups being run against the current primary master as the rest of the new hosts that still need to be set up will be using the host that just completed.
Comment 6•11 years ago (Assignee)
(In reply to Matt Pressman [:mpressman] from comment #5)
> I downtimed the alert and now the backup is completed. There shouldn't be
> any more backups being run against the current primary master as the rest of
> the new hosts that still need to be setup will be using the host that just
> completed
Thanks so much!
I believe the cause of this monitor blip is that we are running backups. The monitor is correctly detecting lag, but it is lag that is acceptable for the duration of a manual backup. I asked Matt to ack the monitor and note that a backup is running if it goes off in the future.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED