Closed
Bug 818989
Opened 12 years ago
Closed 11 years ago
Backups failed due to bug in 9.2.1 through 9.2.4 - coordinate debugging with a debugging patch
Categories
(Data & BI Services Team :: DB: MySQL, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: selenamarie, Assigned: selenamarie)
PostgreSQL had a security and bugfix release today. It includes fixes for routine maintenance commands such as CREATE INDEX CONCURRENTLY and for some replication bugs.
At the earliest convenient time, we should upgrade to 9.2.2.
This process is just:
* Install new packaged version of 9.2
* Stop socorro
* Stop/start databases
* Restart socorro
Updated•12 years ago
Assignee: server-ops-database → mpressman
Comment 1•12 years ago (Assignee)
9.2.3 will be out by the time it's possible to upgrade.
Summary: Schedule 9.2.2 upgrade for Socorro databases → Schedule 9.2.3 upgrade for Socorro databases
Comment 2•12 years ago (Assignee)
We hit a bug in 9.2.1 today:
# Make a backup!
Tue Feb 19 00:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013021900/data
## Making base backup
transaction log start point: 1097/BC7A0460
pg_basebackup: starting background WAL receiver
pg_basebackup: streaming header too small: 25
transaction log end point: 1098/14940E60
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: child process exited with error 1
This bug was fixed here:
http://www.postgresql.org/message-id/CABUevEyv+YCQ26eXFKgBetDekeSp_YCJs80m1N=bFy3a47-GOg@mail.gmail.com
And the commit is present in 9.2.3
We have 9.2.3 available on backup4, but the server must be upgraded to fix this problem.
Summary: Schedule 9.2.3 upgrade for Socorro databases → Backups failed due to bug in 9.2.1 - Schedule 9.2.3 upgrade for Socorro databases
Comment 3•12 years ago (Assignee)
Tue Feb 19 18:04:10 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013021918/data
## Making base backup
transaction log start point: 109D/3F26C6D0
pg_basebackup: starting background WAL receiver
pg_basebackup: streaming header too small: 25
Ugh. Still happening.
Comment 4•12 years ago (Assignee)
Matt --
Could you confirm that pg_basebackup is now working since the upgrade?
Thanks!
-selena
Flags: needinfo?(mpressman)
Comment 5•12 years ago
Since the update to 9.2.4 on master01 we have not hit this issue. Here is the output from /var/log/base_backup.log:
Fri Apr 5 17:58:18 UTC 2013
# Cleaning up old backup directories
Nothing to remove! Only 5 backups present.
# Done!
Fri Apr 5 17:58:18 UTC 2013
# Make a backup!
Sat Apr 6 16:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013040616/data
## Making base backup
transaction log start point: 1211/AF8076C0
pg_basebackup: starting background WAL receiver
transaction log end point: 1212/25DB3D20
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed
# Done with backup!
Sat Apr 6 17:58:13 UTC 2013
# Cleaning up old backup directories
2013032800
# Done!
Sat Apr 6 17:58:14 UTC 2013
# Make a backup!
Sun Apr 7 16:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013040716/data
## Making base backup
transaction log start point: 1218/B33B490
pg_basebackup: starting background WAL receiver
transaction log end point: 1218/768B8788
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed
# Done with backup!
Sun Apr 7 17:58:44 UTC 2013
# Cleaning up old backup directories
2013032900
# Done!
Sun Apr 7 17:58:45 UTC 2013
# Make a backup!
Mon Apr 8 16:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013040816/data
## Making base backup
transaction log start point: 121D/3022C9C0
pg_basebackup: starting background WAL receiver
transaction log end point: 121D/9ABB6538
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed
# Done with backup!
Mon Apr 8 17:58:52 UTC 2013
# Cleaning up old backup directories
2013040100
# Done!
Mon Apr 8 17:58:52 UTC 2013
# Make a backup!
Tue Apr 9 16:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013040916/data
## Making base backup
transaction log start point: 1222/58023B00
pg_basebackup: starting background WAL receiver
transaction log end point: 1222/D65F8F58
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: base backup completed
# Done with backup!
Tue Apr 9 17:59:01 UTC 2013
# Cleaning up old backup directories
2013040300
# Done!
Flags: needinfo?(mpressman)
Comment 6•12 years ago (Assignee)
Great! Thank you
Comment 7•12 years ago
Apparently this is still a problem. We had not had an issue since the upgrade until today's backup:
# Make a backup!
Wed Apr 10 16:02:01 UTC 2013
## Destructively clean up local /data/pgbasebackups/9.2/2013041016/data
## Making base backup
transaction log start point: 1226/CC2A4360
pg_basebackup: starting background WAL receiver
pg_basebackup: streaming header too small: 25
transaction log end point: 1228/48FFE918
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: child process exited with error 1
Comment 8•12 years ago (Assignee)
(In reply to Matt Pressman [:mpressman] from comment #7)
> apparently this is still a problem, since the upgrade we have not had an
> issue until today's backup:
Ugh. Alright. I'm going to have a look at compiling a debugging version of PostgreSQL to see if we can catch this in the act.
I have to address a couple other tickets before that unfortunately.
Updated•11 years ago
Flags: needinfo?(sdeckelmann)
Comment 9•11 years ago
Matt, Selena - is this still a problem? Backups failing seems like a non-trivial thing. Is there an update on this bug?
Comment 10•11 years ago (Assignee)
(In reply to Sheeri Cabral [:sheeri] from comment #9)
> Matt, Selena - is this still a problem? backups failing seems like a
> non-trivial thing, is there an update on this bug?
I'm not sure -- maybe Matt could check on the backups? Also, monitoring for the backups through Nagios would be great.
Flags: needinfo?(sdeckelmann)
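A minimal sketch of what a Nagios-style check for these backups could look like, assuming bash and the standard Nagios plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). The function name `check_last_backup` is hypothetical; the default log path and the matched messages are taken from the log excerpts quoted in this bug. It is not the existing check mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical Nagios-style check (not the check that already exists).
# It looks at the most recent pg_basebackup result line in the backup log.

check_last_backup() {
    # Default path comes from comment 5; pass another path as the first argument.
    local log="${1:-/var/log/base_backup.log}"
    local last

    if [ ! -r "$log" ]; then
        echo "UNKNOWN: cannot read $log"
        return 3
    fi

    # Last line that reports either success or one of the failure messages seen in this bug.
    last=$(grep -E 'pg_basebackup: (base backup completed|streaming header too small|child process exited with error)' "$log" | tail -n 1) || true

    case "$last" in
        *"base backup completed"*)
            echo "OK: last base backup completed"
            return 0 ;;
        "")
            echo "UNKNOWN: no pg_basebackup result found in $log"
            return 3 ;;
        *)
            echo "CRITICAL: last base backup failed: $last"
            return 2 ;;
    esac
}
```

Saved as a plugin script, it would end with `check_last_backup "$@"; exit $?`. This sketch only inspects the last recorded result; a real check would probably also want to alert when no backup has run recently.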
Comment 11•11 years ago (Assignee)
Still happening:
# Make a backup!
Wed Sep 4 16:02:01 UTC 2013
## Making base backup
transaction log start point: 14D5/E1440C28
pg_basebackup: starting background WAL receiver
pg_basebackup: streaming header too small: 25
transaction log end point: 14D6/53FBDBF8
pg_basebackup: waiting for background process to finish streaming...
pg_basebackup: child process exited with error 1
# Done with backup!
Wed Sep 4 18:37:03 UTC 2013
# Cleaning up old backup directories
2013090116
# Done!
Looks like we get a few backups succeeding, sometimes.
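To put a number on "sometimes", the log can be tallied. The following is a rough sketch, assuming bash and the message strings shown in the excerpts above; the function name `tally_backups` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count completed and failed base backups in a log file.

tally_backups() {
    local log="${1:-/var/log/base_backup.log}"
    local ok failed

    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi

    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    ok=$(grep -c 'pg_basebackup: base backup completed' "$log") || true
    failed=$(grep -c 'pg_basebackup: child process exited with error' "$log") || true

    echo "completed=$ok failed=$failed"
}
```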
Updated•11 years ago (Assignee)
Summary: Backups failed due to bug in 9.2.1 - Schedule 9.2.3 upgrade for Socorro databases → Backups failed due to bug in 9.2.1 through 9.2.4 - coordinate debugging with a debugging patch
Comment 12•11 years ago
I modified the base_backup.sh script so that if it catches an error, it will re-run until it completes successfully. There is a Nagios check for the postgres backups, but it doesn't look like it is applied. Additionally, that check could/should probably be more robust. I'll work on that as well.
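A minimal sketch of the retry behaviour described above, assuming bash. The function name `run_until_success`, the `MAX_ATTEMPTS` cap and the `RETRY_DELAY` pause are hypothetical and are not taken from the actual base_backup.sh, whose contents are not shown in this bug. The comment describes retrying until success; the cap here is an added safety assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the real base_backup.sh.

MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"   # assumption: cap retries so a permanent failure cannot loop forever
RETRY_DELAY="${RETRY_DELAY:-60}"    # assumption: seconds to wait between attempts

# Run the given command until it exits 0 or the attempt cap is reached.
run_until_success() {
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${RETRY_DELAY}s: $*" >&2
        attempt=$((attempt + 1))
        sleep "$RETRY_DELAY"
    done
    echo "succeeded on attempt $attempt"
}
```

It could wrap the existing backup command, that is `run_until_success pg_basebackup` followed by whatever arguments the script already passes. Because the failure is intermittent, a retry only hides it; the non-zero return after the last attempt is what a monitoring check would need to see.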
Updated•11 years ago (Assignee)
Assignee: mpressman → sdeckelmann
Comment 13•11 years ago (Assignee)
Progress today:
* installed the following on backup4 to support a build:
> sudo yum install -y bison-devel readline-devel zlib-devel openssl-devel wget
> sudo yum groupinstall -y 'Development Tools'
* Built a copy of Postgres from commit 061b88c732952c59741374806e1e41c1ec845d50, which includes a fix for pg_basebackup.
:mpressman is kicking off a backup now to test it.
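The build step above can be sketched roughly as follows, assuming bash. Only the commit hash comes from this comment; the helper names `run` and `build_postgres`, the clone URL, the default source directory, the install prefix and the `--enable-debug` configure option are assumptions, and the actual commands used on backup4 are not recorded here. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of building PostgreSQL from source at a given git ref.

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_postgres REF PREFIX [SRC_DIR]
build_postgres() {
    local ref="$1" prefix="$2" src="${3:-postgresql}"
    (
        # Subshell so the cd does not leak into the caller; && stops at the first failure.
        run git clone https://git.postgresql.org/git/postgresql.git "$src" &&
        run cd "$src" &&
        run git checkout "$ref" &&
        run ./configure --prefix="$prefix" --enable-debug &&
        run make &&
        run make install
    )
}
```

For example, `DRY_RUN=1 build_postgres 061b88c732952c59741374806e1e41c1ec845d50 /opt/pgsql-debug` lists the six steps without touching the machine. Installing into a private prefix keeps the test build separate from the packaged 9.2 binaries.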
Comment 14•11 years ago (Assignee)
Had to recompile from the 9.2 version, since 9.3 is not backward compatible with 9.2 for streaming replication.
Comment 15•11 years ago (Assignee)
Confirmed that this fix is in.
Version 9.2.6 of Postgres will include the fix.
Updated•11 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
Product: mozilla.org → Data & BI Services Team