Closed Bug 677103 Opened 13 years ago Closed 13 years ago

replayDB needs a separate partition/share for archive logs

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jberkus, Assigned: jabba)

Details

(Whiteboard: pending stage site in PHX)

socorro1.db.sjc1.mozilla.com has run out of disk space a few times because it fell behind the database master during continuous backup.

Since both the archive logs and the database are on the same iSCSI share, this can leave the database unrecoverable and force a full resync from the master. This would be less likely with the archive logs on a separate partition, share, or quota'd directory.

After resolving the iSCSI reliability issues per the other bug, please do the following:

1) add a separate 50GB partition, LUN, or other size-limited volume called "wal_archive".

2) add a nagios/ganglia monitor on wal_archive that alerts whenever usage on that volume exceeds 30GB.
note: the new wal_archive volume can be on local disk, if space is available.
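
The monitor in step 2 could be sketched as a small Nagios-style plugin. This is only an illustration under assumptions: the mount point /wal_archive, the function names, and reading "30GB" as 30 GiB are not from this bug. The exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style check for used space on the wal_archive volume."""
import shutil
import sys

GIB = 1024 ** 3                      # assumption: "30GB" is read as 30 GiB
DEFAULT_PATH = "/wal_archive"        # hypothetical mount point
DEFAULT_LIMIT = 30 * GIB             # alert threshold from step 2

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def evaluate(used_bytes, limit_bytes=DEFAULT_LIMIT):
    """Map a usage figure to a (exit_code, message) pair."""
    used_gib = used_bytes / GIB
    limit_gib = limit_bytes / GIB
    if used_bytes > limit_bytes:
        return CRITICAL, (
            f"CRITICAL - wal_archive uses {used_gib:.1f} GiB "
            f"(limit {limit_gib:.0f} GiB)"
        )
    return OK, f"OK - wal_archive uses {used_gib:.1f} GiB (limit {limit_gib:.0f} GiB)"


def check(path=DEFAULT_PATH, limit_bytes=DEFAULT_LIMIT):
    """Measure used space on the filesystem holding `path` and evaluate it."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # A missing or unreadable volume is reported as UNKNOWN, not OK.
        return UNKNOWN, f"UNKNOWN - cannot read {path}: {exc}"
    return evaluate(usage.used, limit_bytes)


def main(argv):
    """Entry point: optional first argument overrides the volume path."""
    path = argv[1] if len(argv) > 1 else DEFAULT_PATH
    code, message = check(path)
    print(message)
    return code

# As an installed plugin, the script would end with: sys.exit(main(sys.argv))
```

Because the check measures the whole filesystem that holds the path, it is only meaningful if wal_archive is its own partition or LUN, as step 1 asks. For a quota'd directory on a shared filesystem, the used space would have to be measured differently.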
Assignee: server-ops → jdow
Whiteboard: pending stage site in PHX
replaydb has been permanently shut down. In phx we'll have a different architecture, so we'll implement this as needed from the beginning.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard