replayDB needs a separate partition/share for archive logs

RESOLVED FIXED

Status

RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: jberkus, Assigned: jabba)

Details

(Whiteboard: pending stage site in PHX)

(Reporter)

Description

7 years ago
socorro1.db.sjc1.mozilla.com has run out of disk space a few times because it fell behind the database master while doing continuous backup.

Since both the archive logs and the database are on the same iSCSI share, this can leave the database unrecoverable and force a full resync from master. This would be less likely with the archive logs on a separate partition, share, or quota'd directory.

After resolving the iSCSI reliability issues per the other bug, please do the following:

1) Add a separate 50GB partition, LUN, or other size-limited volume called "wal_archive".

2) Add a nagios/ganglia monitor on wal_archive that raises an alert whenever usage on that volume exceeds 30GB.
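
For illustration only, the check in step 2 could be sketched as a small Nagios-style plugin. This is a minimal sketch under stated assumptions, not what was deployed: the mount point `/wal_archive` and the function names `classify` and `check_wal_archive` are hypothetical, the 30GB threshold is read as 30 GiB, and the alert is mapped to the WARNING state. The only convention relied on is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

GIB = 1024 ** 3


def classify(used_bytes, warn_bytes=30 * GIB):
    """Map a usage figure to a (exit_code, message) pair.

    Alerts only when usage is strictly above the threshold,
    matching "more than 30GB" in the request.
    """
    used_gib = used_bytes / GIB
    if used_bytes > warn_bytes:
        return WARNING, (
            f"WARNING - wal_archive uses {used_gib:.1f} GiB "
            f"(threshold {warn_bytes / GIB:.0f} GiB)"
        )
    return OK, f"OK - wal_archive uses {used_gib:.1f} GiB"


def check_wal_archive(path="/wal_archive"):  # hypothetical mount point
    """Print a one-line status and return the Nagios exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Volume missing or unreadable: report UNKNOWN rather than OK.
        print(f"UNKNOWN - cannot stat {path}: {exc}")
        return UNKNOWN
    code, message = classify(usage.used)
    print(message)
    return code
```

A wrapper script would end with `sys.exit(check_wal_archive())` so that Nagios sees the exit code. Because the volume itself is capped at 50GB, a 30GB alert leaves about 20GB of headroom before archiving starts to fail.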
(Reporter)

Comment 1

7 years ago
Note: the new wal_archive volume can be on local disk if space is available.

Updated

7 years ago
Assignee: server-ops → jdow
Whiteboard: pending stage site in PHX
(Assignee)

Comment 2

7 years ago
replayDB has been permanently shut down. In PHX we'll have a different architecture, so we'll implement this from the beginning as needed.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard