Closed
Bug 596391
Opened 14 years ago
Closed 14 years ago
Changes to sync AUS data to PHX datacenter
Categories
(Release Engineering :: General, defect, P2)
Release Engineering
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: morgamic, Assigned: nthomas)
References
Details
(Whiteboard: [release][automation][q3goal])
Attachments
(2 files, 3 obsolete files)
1.90 KB, patch | catlee: review+ | nthomas: checked-in+
4.44 KB, patch | nthomas: review+ | nthomas: checked-in+
Based on our discussion, a stop-gap for getting sync done between our datacenters is to push snippets during the actual build process so we instantly have new changes in both places.
Options:
- pushsnip rsync
- upload in both places, do everything in parallel
Rob or Chris can articulate better. We'll track things here.
Comment 1•14 years ago
Do we have an AMO computer to receive the data generated by this change in the PHX colo?
Whiteboard: [release][automation]
Updated•14 years ago
Priority: -- → P3
Reporter | ||
Comment 2•14 years ago
Think you mean AUS. We need to get that data from IT. Derek - where would we write to?
Comment 3•14 years ago
Bringing in oremj to comment on our existing AMO infrastructure.
Assignee | ||
Updated•14 years ago
Assignee: nobody → nrthomas
Priority: P3 → P2
Whiteboard: [release][automation] → [release][automation][q3goal]
Reporter | ||
Comment 4•14 years ago
Derek this is for AUS not AMO.
Assignee | ||
Comment 5•14 years ago
I tested pushing the 3.5.13 release snippets from MPT -> PHX, which is a time-sensitive task we do with QA waiting on us. Did that with rsync as per usual, and it took 174 seconds for the first dry run, 11s for the second, and 6s for the third and fourth. Most of the first run is spent at MPT doing 'Building file list' on the NFS share; the later ones are faster because the files are in the NFS cache. From a cold cache the time to build the file list varies a bit, depending on how busy the filer behind NFS is.
If I actually push those snippets to PHX the file transfer takes about a minute.
So if the NFS cache is hot at MPT from pushing the snippets live there, then the time to push to PHX should be 60-90 seconds. That should be fine from a QA point of view so I'll work up a patch to pushsnip.
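A minimal sketch of how repeated timings like the ones above could be collected. The helper name `time_runs` is hypothetical, and the rsync paths in the usage note are placeholders rather than the real snippet locations.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times in a row and print the
# wall-clock seconds each run took. Consecutive runs show the effect of
# a warm cache on the second and later attempts.
time_runs() {
    runs=$1; shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start))s"
        i=$((i + 1))
    done
}
```

Usage would be something like `time_runs 4 rsync -a -n /path/to/snippets/ user@remote:/path/to/snippets/`, where `-n` is rsync's dry-run flag, so nothing is transferred.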
Assignee | ||
Comment 6•14 years ago
The MPT steps are unmodified except for echoing more information along the way. Wasn't sure if we need a command-line arg to not push to Phoenix; that's easily added if needed. I'll file against IT to get the production account and key set up at the PHX end.
Attachment #479967 -
Flags: review?(catlee)
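The patch itself is not reproduced in this thread, so the following is only a sketch of the general shape described above: push to the primary site, echo progress, then push to the second site unless told not to. The function name `push_both` and the `PUSH_CMD` and `SKIP_SECONDARY` variables are assumptions for illustration, not names taken from pushsnip.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual pushsnip change.
# PUSH_CMD defaults to "rsync -a"; SKIP_SECONDARY, if set to a non-empty
# value, stands in for a "do not push to the second site" option.
push_both() {
    src=$1; primary=$2; secondary=$3
    echo "Pushing $src to primary: $primary"
    # Left unquoted on purpose so "rsync -a" splits into command + flag.
    ${PUSH_CMD:-rsync -a} "$src" "$primary" || return 1
    if [ -n "${SKIP_SECONDARY:-}" ]; then
        echo "Skipping secondary push"
        return 0
    fi
    echo "Pushing $src to secondary: $secondary"
    ${PUSH_CMD:-rsync -a} "$src" "$secondary"
}
```

Stopping at the first failure keeps the second site from receiving snippets that never made it to the first one.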
Assignee | ||
Comment 7•14 years ago
Bah, we haven't even considered how nightly updates are going to get synced across.
Assignee | ||
Comment 8•14 years ago
Everything is going to be owned by the ffxbld account on dp-ausstage01, both nightly and release snippets, and we'll use one account to transfer.
Attachment #479967 -
Attachment is obsolete: true
Attachment #480007 -
Flags: review?(catlee)
Attachment #479967 -
Flags: review?(catlee)
Assignee | ||
Comment 9•14 years ago
(In reply to comment #7)
> Bah, we haven't even considered how nightly updates are going to get synced
> across.
First cut is going to be rsyncing /opt/aus2/incoming/2 across every 5 minutes. To accommodate that I've got a backup job going for all the existing content to /opt/aus2/snippets/backup/20100930-backup-of-nightly-updates.tar.bz. There is a lot of old crap in there so it's taking a long time. When it is done we can cull incoming/2 pretty hard and make the rsync complete much more quickly. We remove nightly mar files older than 30 days on ftp.m.o, so can remove snippets older than (say) 28 days. There should be a nightly cron to maintain this, and remove empty directories of a minimum depth (at least Firefox/<branch>).
We also talked about adding a nightly cron job to make sure PHX is in sync with MPT.
Assignee | ||
Comment 10•14 years ago
Actually, with a smart script you could do even better, since you only need the latest two or three buildIDs for any
app/branch/platform/buildID/locale/
path. Would be pretty trivial if locale was ahead of buildID, but of course it isn't.
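One way such a script could look, as a sketch only. It assumes the app/branch/platform/buildID layout given above and that buildIDs are fixed-width timestamps, so a plain lexical sort orders them by age. The function name `prune_candidates` is hypothetical, and `-mindepth`/`-maxdepth` are find extensions (GNU and BSD), not POSIX.

```shell
#!/bin/sh
# Hypothetical sketch: list the buildID directories that could be removed,
# keeping the newest $keep under every app/branch/platform directory.
# It only prints candidates; nothing is deleted.
prune_candidates() {
    root=$1; keep=$2
    # Depth 3 below the root is app/branch/platform.
    find "$root" -mindepth 3 -maxdepth 3 -type d | sort |
    while IFS= read -r platform; do
        # Newest buildIDs first, then skip the first $keep of them.
        find "$platform" -mindepth 1 -maxdepth 1 -type d |
            sort -r | tail -n +"$((keep + 1))"
    done
}
```

Because each buildID directory is removed whole, the locale directories beneath it go with it, which sidesteps locale sitting below buildID in the path. Once the printed list looks right, it could be piped to a delete step.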
Assignee | ||
Comment 11•14 years ago
/opt/aus2/snippets/backup/20100930-backup-of-nightly-updates.tar.bz is now done, if anyone wants to take up the baton and cull incoming/2.
Updated•14 years ago
Attachment #480007 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 12•14 years ago
I've cleaned out the old nightly snippets from /opt/aus2/incoming/2. The following have been removed
* Thunderbird/, Sunbird/: now on their own AUS server
* In Firefox/
* 2.0/, trunk/: these branches are now end of life
* firefox-lorentz/: development branch no longer used, AUS patch to repatriate
to mozilla-1.9.2 to follow
These two Firefox directories have been emptied out
* electrolysis: this project is idle and will be reset at some point
* mozilla-2.0: branch isn't live yet
The remaining Firefox directories (mozilla-1.9.1, mozilla-1.9.2, mozilla-central, tracemonkey) are still active and were trimmed to the last 6 days of snippets. That leaves about 7500 files and 4000 directories, and it will be interesting to see how long a sync takes. I'll try that next week.
Assignee | ||
Comment 13•14 years ago
Cron job to clean out nightly snippet store (ffxbld@dm-ausstage01)
0 23 * * * find /opt/aus2/incoming/2 -mindepth 4 -maxdepth 4 -mtime +6 -exec rm -rf {} \;
On stage we delete l10n mar files from dirs with -mtime +5. The snippets pointing to those files are placed in directories which were created two days before (by the time the snippets go into the dir for the previous day, which was created the day before that).
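A sketch of two helpers that could sit alongside the cron job above: a dry-run that prints what the clean-up would delete, and the empty-directory sweep suggested in comment 9. The function names are hypothetical. `-mindepth`, `-maxdepth`, `-empty` and `-delete` are find extensions available in GNU and BSD find, not POSIX.

```shell
#!/bin/sh
# Hypothetical dry-run of the nightly clean-up: print the buildID
# directories (depth 4 = app/branch/platform/buildID) last modified more
# than $days days ago, without removing anything.
stale_builds() {
    root=$1; days=$2
    find "$root" -mindepth 4 -maxdepth 4 -mtime +"$days" -print
}

# Remove directories left empty by the clean-up, but never anything as
# shallow as app/branch (depth 2). -delete processes children before
# parents, so nested empty directories collapse in a single pass.
remove_empty_dirs() {
    root=$1
    find "$root" -mindepth 3 -type d -empty -delete
}
```

Running `stale_builds /opt/aus2/incoming/2 6` first gives a list to eyeball before trusting the `rm -rf` variant in cron.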
Assignee | ||
Comment 14•14 years ago
Syncing the nightly snippets took 39, 23, 15, and 11 seconds when I tried it four times, with a 120 second sleep to let NFS caching expire.
Set up a cron job to sync nightly snippets over every ten minutes
*/10 * * * * rsync -a /opt/aus2/incoming/2/ dp-ausstage01:/opt/aus2/incoming/2/
The output of that job and the one in comment #13 is going to me verbosely for starters.
Assignee | ||
Comment 15•14 years ago
Status:
* nightly updates snippets are being synced from dm-ausstage01 to dp-ausstage01
* release snippets will be pushed to both locations once attachment 480007 [details] [diff] [review] lands (ready to go)
* we just need some sort of daily job which checks release snippets are in sync, I think we can script something using the nightly backups we already make
IT can go ahead with prepping AUS webheads in PHX IMO.
Summary: Modify pushsnip or backupsnip to copy changes to PHX data center → Changes to sync AUS data to PHX datacenter
Assignee | ||
Comment 17•14 years ago
To run this we'd move cltbld's cron job to ffxbld and amend it to:
@daily /home/cltbld/bin/backupsnip-nightly && ssh dp-ausstage01 ./check-sync
The mail would go to our group mailing list. Normally there will be no output and no mail.
If there are differences it produces output like this,
# eg for a new snippet
cd+++++++ Firefox/3.5.15/Darwin_Universal-gcc3/20101026200251/af/beta/
>f+++++++ Firefox/3.5.15/Darwin_Universal-gcc3/20101026200251/af/beta/complete.txt
# eg updated snippet (file size isn't changing, just content)
>f..t.... Firefox/3.5.7/Linux_x86-gcc3/20091221152502/es-AR/beta/partial.txt
The man page for rsync describes that syntax under '--itemize-changes', but that's directory and file creation for the first two. The last one is updated because newly generated snippets have different timestamps (not sure it ever looks at anything else once it finds one reason to update).
If we find differences then the unpacked snippets from the MPT backup are pushed into production, and we move the PHX snippets aside for later investigation. Run time is a few mins on the current PHX VM.
Attachment #496782 -
Flags: review?(catlee)
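For reading output like the sample in comment 17, here is a small sketch that turns `--itemize-changes` lines into a plain summary. It is not part of check-sync. The function name `summarise_changes` is hypothetical, and it assumes paths contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: summarise rsync --itemize-changes output from stdin.
# Field 1 is the change code, field 2 the path. The second character of
# the code is the item type (d = directory, f = file). If everything after
# it is '+', the item is newly created; otherwise it is an update.
summarise_changes() {
    awk '
        /^#/ || NF < 2 { next }   # skip comment lines and blanks
        {
            kind  = (substr($1, 2, 1) == "d") ? "dir" : "file"
            state = (substr($1, 3) ~ /^[+]+$/) ? "new" : "updated"
            print state, kind, $2
        }
    '
}
```

Fed the three sample lines from comment 17, this reports a new directory, a new file, and an updated file.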
Assignee | ||
Comment 18•14 years ago
Attachment 480007 [details] [diff] will break staging, this one won't.
Attachment #480007 -
Attachment is obsolete: true
Attachment #496784 -
Flags: review?(catlee)
Updated•14 years ago
Attachment #496782 -
Flags: review?(catlee) → review+
Comment 19•14 years ago
Comment on attachment 496784 [details] [diff] [review]
Teach pushsnip a new trick, v3 (doesn't break staging)
Is the ';' after PHX_KEY necessary?
Attachment #496784 -
Flags: review?(catlee) → review+
Assignee | ||
Comment 20•14 years ago
(In reply to comment #19)
> Is the ';' after PHX_KEY necessary?
No, I don't think so.
Assignee | ||
Comment 21•14 years ago
Makes backupsnip-nightly not mail normally, otherwise unchanged. Carrying review over.
Attachment #496782 -
Attachment is obsolete: true
Attachment #497752 -
Flags: review+
Assignee | ||
Comment 22•14 years ago
The changes IT has made to set up NFS (in bug 618088) have blocked me deploying this stuff today, so here's a list of steps if I don't get to it early Wednesday.
*) land attachment 497792 [details] to modify the nightly backup script to push new backups to PHX, and run check-sync. 'cvs up' in dm-ausstage01:~cltbld/bin
*) modify crontab for cltbld@dm-ausstage01 (aus2-staging.m.o) to read
MAILTO=release@mozilla.com
@daily /home/cltbld/bin/backupsnip-nightly && ssh dp-ausstage01 ./check-sync
*) as ffxbld@dp-ausstage01 (ssh from ffxbld@dm-ausstage01) do
cd ~
cvs -d :pserver:anonymous@cvs-mirror.mozilla.org:/cvsroot co -d bin mozilla/tools/release/bin
which check-sync
Should return ~/bin/check-sync. Run check-sync once to make sure PHX is up to date.
*) the pushsnip change (attachment 496784 [details] [diff] [review]) should land with the stray ';' removed.
*) Email release group with heads up:
"pushsnip will now push to MPT (like usual) and also to PHX. It shouldn't take too much longer but could be double of old push time. Until IT sets up AUS in PHX QA can continue doing release update testing as before - 2 mins after the MPT push finishes. Once IT has AUS in PHX live we should wait until PHX push is finished plus 2 mins to unlease the QA hounds.
Once PHX is live any manual interventions in the snippet store on dm-ausstage01 (aus2-staging.m.o) will have to be made in PHX too. An rsync from MPT to PHX takes a prohibitive amount of time unless you can stop it down to a small subset of the store. To reach dp-ausstage01, first ssh to aus2-staging as ffxbld, then 'ssh dp-ausstage01'."
*) Update https://intranet.mozilla.org/Build:Updates for 2nd para above. Add info about PHX, how to reach dp-ausstage01 etc.
Assignee | ||
Comment 23•14 years ago
Comment on attachment 497752 [details] [diff] [review]
Script to compare and replace Phoenix data (silencio!)
Checked in with a followup to not do the SCP on non-prod systems.
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=mozilla%2Ftools%2Frelease%2Fbin&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2010-12-15+15%3A28&maxdate=2010-12-15+15%3A32&cvsroot=%2Fcvsroot
Attachment #497752 -
Flags: checked-in+
Assignee | ||
Comment 24•14 years ago
Actually ended up moving the backupsnip-nightly cron on dm-ausstage from cltbld to ffxbld, since I revoked the right for cltbld@dm-ausstage01 to log into dp-ausstage01. This necessitated changing the ownership and permissions for dm-ausstage01:/opt/aus2/snippets/backup from cltbld:cltbld:775 to ffxbld:cltbld:775, and adding cltbld to the cltbld group (!!). Verified backupsnip still works as cltbld.
Testing backupsnip-nightly & check-sync is working OK ...
Assignee | ||
Comment 25•14 years ago
And they do. About two hours to create the backup in MPT and another hour in PHX to unpack it, compare to the existing files and swap out if necessary. crontab ended up slightly different (ffxbld@dm-ausstage01):
# backup prod snippets and keep PHX in sync
@daily /home/cltbld/bin/backupsnip-nightly && ssh dp-ausstage01 /home/ffxbld/bin/check-sync
Assignee | ||
Comment 26•14 years ago
Comment on attachment 496784 [details] [diff] [review]
Teach pushsnip a new trick, v3 (doesn't break staging)
Checked in, the trailing ; fix as a followup:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=mozilla%2Ftools%2Frelease%2Fbin&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2010-12-16+00%3A12&maxdate=2010-12-16+00%3A12&cvsroot=%2Fcvsroot
Deployed to aus2-staging (dm-ausstage01).
Attachment #496784 -
Flags: checked-in+
Assignee | ||
Comment 27•14 years ago
And on staging-stage too; preproduction-stage took care of itself via puppet. Some possibility of bustage on both of those but hopefully avoided.
Assignee | ||
Comment 28•14 years ago
All done. Over to IT.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•12 years ago
Product: mozilla.org → Release Engineering