Closed Bug 618088 Opened 14 years ago Closed 14 years ago

Test and implement AUS syncing between SJC and PHX

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cshields, Assigned: aravind)

References

Details

re: bug 596391.. Currently Nick's scripts sync the data between dm-ausstage01 and dp-ausstage01. We need to decide on a method of getting this data on to NFS rather than local disk. ETA Mon/Tue (12/13 or 12/14) during all-hands, will have Aravind around. Nick is out on paternity leave 12/15.
Assignee: cshields → aravind
Went through 596391 and talked to bhearsum; here is what I plan on doing. This is mostly for nthomas to comment on.
1) comment out the cron job under the ffxbld account on dm-ausstage01:
   rsync -a --delete /opt/aus2/incoming/2/ dp-ausstage01:/opt/aus2/incoming/2/
2) sync /vol/aus2/ from netapp-b (in mpt) to the same volume on the phx netapp.
3) wipe out /opt/aus2 on dp-ausstage01 and mount the synced phx netapp volume there.
4) set up ssh keys on dp-ausstage01 (for cltbld and ffxbld).
Is there anything else you can think of that needs to be done? Once all that's done, I'd uncomment the cron job in step 1.
I had written this before comment #1 was made, but let's post it anyway for background. There are three ways data will be pushed onto that location:
1) cron job every 10 minutes updating /opt/aus2/incoming/2/Firefox/ (nightly snippets)
2) cron job every day checking content of /opt/aus2/incoming/3/ (release snippets); starts at midnight but can take up to 100 mins [it makes a tarball in MPT first and that's gated on NFS speed there. And that's with a local disk; impact of netapp move TBD]
3) pushing new snippets from releases - on demand from release drivers
Until we're serving out of PHX we can ignore 1) and 2) to make NFS changes. Just let RelEng know that we will get cron mail about failures, or we can disable them until you're done. To get status on 3) you'd have to ask RelEng. 4.0b8 is due to start any time now. You should be able to dive into the 'Build' links at https://wiki.mozilla.org/Releases to figure out who's on the hook for any releases in progress.
I'm a little short of time before heading out on paternity leave so verification on our side might be fun.
(In reply to comment #1)
> 1) comment out cron job under the ffxbld account on dm-ausstage01
> rsync -a --delete /opt/aus2/incoming/2/ dp-ausstage01:/opt/aus2/incoming/2/
I was going to deploy the other changes very shortly but could do that later in the day.
> 2) sync /vol/aus2/ from netapp-b (in mpt) the same volume on the phx netapp.
>
> 3) wipe out /opt/aus2 on dp-ausstage01 and mount the synced phx netapp volume
> there.
TBH, I'd just mount the PHX netapp on dp-ausstage01 and rsync from there, then swap them over. It'll be a bunch faster, and if I deploy my changes after you're done I can take care of making sure they're in sync.
> 4) setup ssh keys on dp-ausstage01 (for cltbld and ffxbld).
Don't need to do this, ffxbld is already set up and cltbld needs a stake through the heart.
(In reply to comment #3)
> TBH, I'd just mount the PHX netapp on dp-ausstage01 and rsync from there,
Sorry, this is unclear. I mean populate the PHX netapp from the snippets already in dp-ausstage01:/opt/aus2/incoming
As I am digging into this more, I noticed that there might be two cron jobs updating this nfs share in phx. We already have a cron job syncing all of aus2/incoming from mpt to the nfs volume in phx. I commented out this sync job and mounted this volume on dp-ausstage01 for you guys to verify (/opt/aus2_nfs). I plan on leaving this sync commented out. Since this is already being synced up (from our cron job).. I don't think we need to sync anything else here. Once you guys verify this, I will move that mount point from /opt/aus2_nfs to /opt/aus2. We might need to adjust uid numbers for the ffxbld user, but other than that it should be good.
I'm doing a dummy rsync between /opt/aus2 and /opt/aus2_nfs, and it's taking significantly longer than the equivalent operations on the local disk. The scripts in bug 596391 are based on the assertion 'nfs is fast in PHX', which doesn't seem to be the case.
(In reply to comment #6)
> I'm doing a dummy rsync between /opt/aus2 and /opt/aus2_nfs, and it's taking
> significantly longer than the equivalent operations on the local disk. The
> scripts in bug 596391 are based on the assertion 'nfs is fast in PHX', which
> doesn't seem to be the case.
It's *faster* than nfs in mpt. It will still be slower than local disk.
OK, that might have been mostly my bad. rsync can't find any differences with
   rsync -ni -av0 --no-o --no-g --delete /opt/{aus2/,aus2_nfs/aus2/}incoming/
(ie itemized dry run which doesn't touch owner or group). So go ahead and swap them over.
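The dry-run comparison above could be wrapped into a small reusable check. This is only a sketch, assuming rsync is installed; the function name trees_in_sync is hypothetical and not something from the scripts in bug 596391. It leaves out -v so that an in-sync pair of trees produces no output at all, which makes the result easy to test.

```shell
# trees_in_sync SRC DST
# Succeeds (exit 0) when an itemized dry run finds nothing to transfer
# or delete, ignoring owner and group as in the command above.
# Hypothetical helper; nothing on disk is modified because of -n.
trees_in_sync() {
    sync_src=$1
    sync_dst=$2
    # -a archive mode, -n dry run, -i itemize changes
    sync_changes=$(rsync -a -n -i --no-o --no-g --delete \
        "$sync_src"/ "$sync_dst"/) || return 2
    # An empty change list means the two trees match.
    [ -z "$sync_changes" ]
}
```

For the directories in this bug it might be called as: trees_in_sync /opt/aus2/incoming /opt/aus2_nfs/aus2/incoming && echo "in sync"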
Only cron job 1) from comment #2 is currently in place.
Okay, /opt/aus2 is now an nfs mount.
Could you organise it so /opt/aus2 contains the current contents of /opt/aus2/aus2 ?
I am copying the original directory into /opt/aus2/aus2_orig. I also fixed ffxbld uid/gid to match those on dm-ausstage01. Please let me know if there is anything else for me to do here.
Hmm, I probably wasn't very clear. Here is what I would like to see:
1) The NFS mount at dp-ausstage01:/opt/ instead of /opt/aus2, so we don't have dp-ausstage01:/opt/aus2/aus2/ and dm-ausstage01:/opt/aus2/
2) The uid and gid for ffxbld working properly, right now on dp-ausstage01 I see
$ whoami; pwd; ls -la
ffxbld
/opt/aus2/aus2
total 32
drwxr-xr-x 8 root root 4096 Jul 3 2009 .
drwxr-xr-x 7 root root 4096 Dec 14 18:00 ..
drwxr-xr-x 4 root root 4096 Jan 19 2010 app
drwxr-xr-x 3 593 2004 4096 Aug 27 2005 build
drwxr-xr-x 3 593 2004 4096 Aug 27 2005 build.old
drwxr-xr-x 3 593 2004 4096 Aug 27 2005 build-with-nightly-l10n
lrwxrwxrwx 1 root root 23 Mar 23 2010 config.php -> app/inc/config-dist.php
drwxr-xr-x 6 593 2004 4096 Sep 15 14:56 incoming
drwxr-xr-x 4 593 2004 4096 Sep 15 14:56 snippets
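One way to spot ownership mismatches like the numeric 593 entries in that listing is to ask find for everything not owned by the expected account. A minimal sketch; the function name list_not_owned_by is hypothetical, and the owner may be given as a user name or a numeric uid.

```shell
# list_not_owned_by DIR OWNER
# Prints every path under DIR whose owner is not OWNER.
# OWNER is a user name or a numeric uid, as accepted by find -user.
# No output means everything under DIR belongs to OWNER.
list_not_owned_by() {
    find "$1" ! -user "$2" -print
}
```

For example: list_not_owned_by /opt/aus2 ffxbld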
Please also install cvs on dp-ausstage01.
cvs is installed. The volume is now mounted on /opt. 593 uid is the cltbld user, which I didn't recreate because you said you want it dead. I added it to the system and the owners now look similar to what they do on dm-ausstage01. I added the keys under cltbld on dm-ausstage01 to cltbld/dp-ausstage01
Thanks for adding cvs and fixing the mount, but we're working against each other here. I had dp-ausstage01 and /opt/aus2/ set up just as I wanted it before we swapped to NFS. In particular
*) only a ffxbld account on dp-ausstage01, using a new key that hasn't leaked all over
*) unneeded crap from MPT deleted from /opt/aus2
*) remaining files in /opt/aus2 owned by ffxbld:ffxbld
At this point could you please
*) grant ffxbld sudo on dp-ausstage01
OR
*) remove cltbld user
*) move /opt/aus2 to /opt/aus2_mpt
*) move /opt/aus2_orig to /opt/aus2
*) 'chown -R ffxbld:ffxbld /opt/aus2'
I need to get this finished up ASAP.
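The move and chown steps requested above could be sketched as one function with a couple of safety checks. This is an illustration only, not what was actually run on dp-ausstage01; the function name swap_in_original is hypothetical, and removing the cltbld user is left out.

```shell
# swap_in_original BASE OWNER
# BASE holds the aus2 and aus2_orig directories (/opt in this bug).
# OWNER is a user:group pair as accepted by chown.
# Hypothetical helper: parks the MPT-synced tree as aus2_mpt, puts the
# original tree back in place, then fixes its ownership.
swap_in_original() {
    swap_base=$1
    swap_owner=$2
    # Refuse to run if a source is missing or the parking name is taken.
    [ -d "$swap_base/aus2" ] || return 1
    [ -d "$swap_base/aus2_orig" ] || return 1
    [ ! -e "$swap_base/aus2_mpt" ] || return 1
    mv "$swap_base/aus2" "$swap_base/aus2_mpt" &&
    mv "$swap_base/aus2_orig" "$swap_base/aus2" &&
    chown -R "$swap_owner" "$swap_base/aus2"
}
```

With sufficient privileges this would be invoked as: swap_in_original /opt ffxbld:ffxbld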
ffxbld now has full sudo, let me know once you are done with it.
nthomas: you have sudo now, but /opt/aus2 is now the nfs mount, so before you clear stuff from it, please think about whether we need to. If it's just extra data, clean-up can wait until you are not in a rush to get done. But I will leave that to you.
Too late! I removed aus2-dev/ & aus-prep/ from /opt, and belatedly realized aus-prep/ wasn't something synced from MPT. Sorry about that. Your sync from MPT is still there at /opt/aus2_mpt. I'll do some testing before we yank the sudo.
You can remove the sudo access again, I'm all done with tweaks.
sudo access is gone. Please re-open the bug if there is anything else remaining here.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard