Closed Bug 965907 Opened 10 years ago Closed 10 years ago

Undo bind mounts in product delivery

Categories

(Infrastructure & Operations :: Change Requests, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gcox, Unassigned)

There are bind mounts in the FTP servers, where certain directories on NFS mounts are cross-linked to other mounts.  Besides being ugly, we believe they may be preventing us from moving volumes around the filer.

We are syncing that cross-linked data over to new volumes and, when ready, we want to replace those mounts with new filer-side mounts.  That will reduce complexity on the ftp servers as well as let us fragment a large volume, and may remove locks and let us move another volume around the filer.
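
For anyone wanting to list the cross-links on a client before the change, here is a rough sketch in Python. It assumes a Linux host and the documented /proc/self/mountinfo layout, where the fourth field is the root of the mount within its filesystem. A value other than "/" there is typical of a bind mount of a subdirectory, but other kinds of mounts can show it too, so treat the results as candidates only. The function name, sample addresses and sample paths are made up for illustration and are not taken from the ftp servers.

```python
# Heuristic sketch: flag mountinfo entries whose "root" field is not "/".
# The sample text below is hypothetical, not real output from the ftp servers.

SAMPLE_MOUNTINFO = (
    "36 25 0:40 / /mnt/vol_a rw,relatime shared:1 - nfs 192.0.2.10:/vol_a rw\n"
    "37 25 0:40 /old /mnt/vol_b/old rw,relatime shared:1 - nfs 192.0.2.10:/vol_a rw\n"
)


def find_bind_mount_candidates(mountinfo_text):
    """Return (mount_point, root, fstype, source) for entries with root != '/'."""
    candidates = []
    for line in mountinfo_text.splitlines():
        # Fields before " - " are per-mount; fields after it describe the filesystem.
        if " - " not in line:
            continue
        left, right = line.split(" - ", 1)
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or len(right_fields) < 2:
            continue
        root, mount_point = left_fields[3], left_fields[4]
        fstype, source = right_fields[0], right_fields[1]
        if root != "/":
            candidates.append((mount_point, root, fstype, source))
    return candidates


if __name__ == "__main__":
    for entry in find_bind_mount_candidates(SAMPLE_MOUNTINFO):
        print(entry)
```

On a real host the same function could be fed the contents of /proc/self/mountinfo instead of the sample string.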

* affects: product delivery
* impact: brief (3-5 mins) unavailability of older releases and candidates from the ftp servers
* gcox on point, nthomas to assist/advise/coordinate crons
* Weds 5 Feb, 1900-2000 PST
** After PST hours, before Romanian hours
** Timed for after release ships, before beta ships
** Can abort if we're facing a chemspill
Flags: cab-review?
(In reply to Greg Cox [:gcox] (plz don't needinfo me) from comment #0)
> * affects: product delivery
> * impact: brief (3-5 mins) unavailability of older releases and candidates
> from the ftp servers

tinderbox-builds/old contains builds created when developers push to hg (rather than 'older releases'), and only those builds that are 3 or more days old. Disruption there is not likely to cause any issues.
I've disabled the cron jobs in r81830. We may get one or more runs before puppet syncs that to upload-cron.private.scl3.
Flags: cab-review? → cab-review+
rsync has the breakout vols in sync.  A catchup/verification (since nothing is changing) is taking about 5-7 minutes.
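
As an independent cross-check of a catchup pass like this, here is a minimal sketch in Python that compares two trees by relative path, size and whole-second modification time, which is roughly the information rsync's default quick check looks at. It is not what was run here; the function names are hypothetical, and it only reports differences without copying anything.

```python
import os


def _index(root):
    """Map each file's path relative to root to (size, whole-second mtime)."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat so a broken symlink does not raise
            index[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return index


def tree_differences(src, dst):
    """Report files missing from dst, differing in size/mtime, or only in dst."""
    src_index = _index(src)
    dst_index = _index(dst)
    return {
        "missing": sorted(p for p in src_index if p not in dst_index),
        "changed": sorted(
            p for p in src_index
            if p in dst_index and src_index[p] != dst_index[p]
        ),
        "extra": sorted(p for p in dst_index if p not in src_index),
    }
```

Three empty lists would suggest the new volume matches the old one at the size-and-mtime level; it says nothing about file contents.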
Unfortunately, we're going to need to spin a second build for 28.0b1, which means we'll want to be putting new data in candidates/ at the scheduled time for this work. So I think we'll have to postpone.
Could still do the three tinderbox-builds ones though.
Changed the tinderbox-builds mounts from client-side mounts to filer junction mounts, puppet change 81923.
candidates is on hold due to release pressure.  We'll circle back.
Nice. Re-enabled the crons in 81925.
On the upload boxes, there was a crusty old chroot bind mount from a partner upload (infra bug 791745) that has since been turned off.  We can look into whether it's needed or not later, but, for now, I've turned it from a bind mount to a direct mount in r81966.
Worked with :nthomas and found a gap, got candidates moved over, puppet change r82014.

That leaves the old cm-ixstore01 mount unused.  On Monday I'll work on taking that out.
Depends on: 969423
I've double-checked the old mounts at upload1:/mnt/cm-ixstore01/ for any recently modified files, and didn't find any. We also haven't had any cron error mail for the new junction mounts, and serving files is also working fine.

--> All clear from RelEng to delete 10.22.75.117:/tinderbox_builds
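
A check like the one above for recently modified files could be sketched in Python as follows. The comment does not say which tool or time window was used, so the function name and the 7-day window in the usage note are assumptions for illustration.

```python
import os
import time


def recently_modified(root, max_age_days, now=None):
    """Return sorted paths under root whose mtime is within max_age_days of now."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat so a broken symlink is examined rather than raising
            if os.lstat(full).st_mtime >= cutoff:
                hits.append(full)
    return sorted(hits)
```

For example, recently_modified("/mnt/cm-ixstore01", 7) returning an empty list would mean nothing under the old mount was written in the last week (the 7 is an assumed window).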
10.22.75.117:/tinderbox_builds (mounted at /mnt/cm-ixstore01) has been unmounted, offlined, and removed from puppet and nagios in r82418.
Updated https://mana.mozilla.org/wiki/display/websites/Product+delivery

Volume is offline on the filer and I'll destroy it soon if no problems come up.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Depends on: 981879
No longer depends on: 981879
Product: mozilla.org → Infrastructure & Operations
Change Request: --- → approved
Flags: cab-review+