Closed Bug 1165405 Opened 9 years ago Closed 8 years ago

[tracker] Adjust release automation for ftp.m.o --> S3 migration

Categories

(Release Engineering :: Release Automation: Other, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Unassigned)

Attachments

(1 file)

The plan is to have a new upload host that is a frontend to S3, which will have a modified copy of post-upload.py that handles PUTs to S3. This will enable existing buildbot jobs to continue uploading as they are now. Possibly just a domain change. This bug is for figuring out what to do with all the other ssh/rsync access the release automation uses:
Changes needed:
release-mozilla-beta-linux_partner_repack
release-mozilla-beta-linux64_partner_repack
release-mozilla-beta-win32_partner_repack
release-mozilla-beta-macosx64_partner_repack
release-mozilla-beta-win64_partner_repack
  Using rsync to upload
    convert to post-upload.py

release-mozilla-beta-linux_standalone_repack
release-mozilla-beta-linux_repack_1/10 & co
release-mozilla-beta-linux64_standalone_repack
release-mozilla-beta-linux64_repack_1/10 & co
release-mozilla-beta-win32_standalone_repack
release-mozilla-beta-win32_repack_1/10 & co
release-mozilla-beta-macosx64_standalone_repack
release-mozilla-beta-macosx64_repack_1/10 & co
release-mozilla-beta-win64_standalone_repack
release-mozilla-beta-win64_repack_1/10
  Disable nightly style partials in release builds, see
    http://hg.mozilla.org/mozilla-central/file/default/tools/update-packaging/Makefile.in#l81

release-mozilla-beta-firefox_antivirus
  Will need a new system to run av on, d/l from S3
    scan on new frontend before onward push ?

release-mozilla-beta-firefox_checksums
release-mozilla-beta-xulrunner_checksums
  Uses rsync to pull down files, combines, signs, rsync back up
  Creates contrib dirs - new scheme for that ?
    convert to walking S3 LIST, post-upload.py to finish

release-mozilla-beta-check_permissions
  ssh used to verify file and directory permissions
    convert to S3 API or deprecate ?

release-mozilla-beta-firefox_push_to_mirrors
release-mozilla-beta-xulrunner_push_to_mirrors
  ssh in to rsync to create hardlinks, uses exclude list
    convert to S3 API to do copies, want to preserve perf if we can (multithread)
  for Firefox also creates index.html to hide release until ready
    deprecate ?? Depends on how Cloud Tools are going to do index.html

release-mozilla-beta-firefox_postrelease
release-mozilla-beta-xr_postrelease
  Remove index.html for Firefox, update symlinks
  convert symlink to S3 copy/redirect ?
------------------------------------------------------------------------

Update domain if it changes:
release-mozilla-beta-linux_build
release-mozilla-beta-linux64_build
release-mozilla-beta-win32_build
release-mozilla-beta-macosx64_build
release-mozilla-beta-win64_build
release-mozilla-beta-linux_standalone_repack
release-mozilla-beta-linux_repack_1/10 & co
release-mozilla-beta-linux64_standalone_repack
release-mozilla-beta-linux64_repack_1/10 & co
release-mozilla-beta-win32_standalone_repack
release-mozilla-beta-win32_repack_1/10 & co
release-mozilla-beta-macosx64_standalone_repack
release-mozilla-beta-macosx64_repack_1/10 & co
release-mozilla-beta-win64_standalone_repack
release-mozilla-beta-win64_repack_1/10
  D/l of previous complete mars, otherwise post-upload.py

release-mozilla-beta-firefox_beta_updates ReleaseUpdatesFactory

release-mozilla-beta-linux_update_verify_beta_1/6
release-mozilla-beta-linux64_update_verify_beta_1/6
release-mozilla-beta-macosx64_update_verify_beta_1/6
release-mozilla-beta-win32_update_verify_beta_1/6
release-mozilla-beta-win64_update_verify_beta_1/6

release-mozilla-beta-beta_final_verification
------------------------------------------------------------------------

Uploading with post_upload.py, no change other than possibly domain:
release-mozilla-beta-firefox_source
release-mozilla-beta-xulrunner_source
release-mozilla-beta-xulrunner_linux_build
release-mozilla-beta-xulrunner_linux64_build
release-mozilla-beta-xulrunner_win32_build
release-mozilla-beta-xulrunner_macosx64_build
release-mozilla-beta-xulrunner_win64_build
------------------------------------------------------------------------

Doesn't interact with stage:
release-mozilla-beta-firefox_tag_source
release-mozilla-beta-firefox_tag_l10n
release-mozilla-beta-linux_repack_complete
release-mozilla-beta-linux64_repack_complete
release-mozilla-beta-win32_repack_complete
release-mozilla-beta-macosx64_repack_complete
release-mozilla-beta-win64_repack_complete
release-mozilla-beta-update_shipping_beta
release-mozilla-beta-firefox_beta_ready_for_beta-cdntest_testing
release-mozilla-beta-firefox_beta_ready_for_release
release-mozilla-beta-firefox_bouncer_submitter
release-mozilla-beta-firefox_reset_schedulers
release-mozilla-beta-firefox_beta_start_uptake_monitoring
------------------------------------------------------------------------
All your notes look accurate to me. Just one small additional comment:

(In reply to Nick Thomas [:nthomas] from comment #1)
> Changes needed:
> release-mozilla-beta-linux_partner_repack
> release-mozilla-beta-linux64_partner_repack
> release-mozilla-beta-win32_partner_repack
> release-mozilla-beta-macosx64_partner_repack
> release-mozilla-beta-win64_partner_repack
>   Using rsync to upload
>     convert to post-upload.py

As we discussed in person, this is a bit more work than it may seem, because everything else that uses post-upload.py does it through upload.py, which is in the Gecko tree.
Summary: Adjust release automation for new upload host → Adjust release automation for ftp.m.o --> S3 migration
I'd forgotten to assess Fennec builds earlier. 

* tag and source are same as desktop, no change needed except the upload host
* the builds and repacks are using post_upload.py, no change needed except the upload host
* the mirror push and postrelease need the same fixes as desktop

So there's not any additional work there.
It's SWAG time.

Desktop
-------
*_partner_repack:
Issue: Using rsync to upload, convert to post-upload.py
Code: http://hg.mozilla.org/build/buildbotcustom/file/fd6476cc8f48/process/factory.py#l4751
Fix: Borrow code from the en-US builds, http://hg.mozilla.org/build/buildbotcustom/file/fd6476cc8f48/process/factory.py#l2625. Download upload.py & deps from gecko repo so we can use it. Possibly upload each partner repack separately, to avoid several GB in one go
SWAG: 2 days

*_repack: 
Issue: Disable nightly style partials in release builds
Code: http://hg.mozilla.org/mozilla-central/file/default/tools/update-packaging/Makefile.in#l81
Fix: Set MOZ_AUTOMATION_UPDATE_PACKAGING=0 in beta & release mozconfigs, or backport work on date which adds MOZ_AUTOMATION_UPDATE_PACKAGING_PARTIAL and set it to 0
Test: mozconfig dumper for nightly/dep builds, http://hg.mozilla.org/build/braindump/file/default/mozconfig-related/dump-mozconfig.bash
SWAG: 2 days

*_antivirus  - still needs scoping
Issue: a/v is currently run on stage.mozilla.org, local script called after ssh connection
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l93
Option1: Cloud Services - Scan on new upload host before submission to S3 (provision extract-and-scan script, integrate with post_upload.py). Important to make sure virus definitions are being updated at least daily
SWAG: 1 week ?

Option2: Set up new RelEng host in S3, d/l files from S3 (excluding blacklist, maybe listen to pulse), a/v scan.
SWAG:

Option3: Build the AV service/bitmover we'll need for relpromo
SWAG: > 2 weeks

*_checksums (including xulrunner):
Issue: Uses rsync to pull down files, combines, signs, rsync back up. Creates contrib dirs
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/generate-sums.py
Fix: convert rsync d/l to walking S3 LIST, existing combine & sign, upload.py & post-upload.py to push back up
SWAG: 3 days
Unknowns: what we are doing with contrib/ directories; maybe create in post_upload instead when candidates dir first created
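
A minimal sketch of the "walking S3 LIST" part, assuming a boto3-style S3 client is passed in. The function name and the `suffix` filter are hypothetical and are not taken from generate-sums.py; the combine, sign and upload steps are left out.

```python
from typing import Iterator


def iter_keys(client, bucket: str, prefix: str, suffix: str = "") -> Iterator[str]:
    """Yield each key under `prefix` that ends with `suffix`, across all LIST pages.

    `client` is assumed to behave like a boto3 S3 client (an assumption,
    not something the existing scripts use today).
    """
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page that holds no keys carries no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(suffix):
                yield obj["Key"]
```

With a real client this would be called with the candidates directory as `prefix`; each yielded key could then be downloaded and fed to the existing combine & sign code.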

check_permissions
Issue: ssh to stage to verify file and directory permissions
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l68
Option 1: work with Cloud Services to ensure appropriate ACL on files
SWAG: 1 week
Option 2: Get LIST of files, and use http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGETacl.html. Requires parsing policy xml, too fragile ?
SWAG: 1 week
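
To gauge how fragile Option 2 is, here is a rough sketch of the parsing side only, assuming the ACL response body has already been fetched as text. The function name is made up; the namespace and the AllUsers group URI are the ones S3 uses in its ACL documents.

```python
import xml.etree.ElementTree as ET

S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(acl_xml: str) -> set:
    """Return the set of permissions granted to the AllUsers group."""
    root = ET.fromstring(acl_xml)
    granted = set()
    for grant in root.iter(S3_NS + "Grant"):
        uri = grant.findtext(S3_NS + "Grantee/" + S3_NS + "URI")
        permission = grant.findtext(S3_NS + "Permission")
        # Grants to individual users have an ID instead of a URI; skip them.
        if uri == ALL_USERS and permission:
            granted.add(permission)
    return granted
```

A permissions check could then assert that `public_permissions(...)` equals `{"READ"}` for every key that is meant to be world-readable.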

------------------------------------------------------------------------

All (desktop, mobile, xulrunner)
--------------------------------
*_push_to_mirrors
Issue: we run rsync after ssh into stage.m.o. Firefox has index.html added in subdirs of releases/<version>/ to hide files until we actually release
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l101
Fix: use http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html to push, parallel worker model for perf. Deprecate index.html, move them aside and put back later ?
SWAG: 2 days
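
A sketch of the parallel worker idea, assuming a boto3-style client whose `copy_object` does the server-side COPY. The function name, prefixes and exclude patterns are placeholders, not the real candidates/releases layout or the real exclude list.

```python
import fnmatch
from concurrent.futures import ThreadPoolExecutor


def push_to_releases(client, bucket, keys, src_prefix, dst_prefix,
                     excludes=(), workers=8):
    """Copy every non-excluded key from src_prefix to dst_prefix in parallel.

    Returns the sorted list of destination keys that were written.
    """
    def wanted(key):
        return key.startswith(src_prefix) and not any(
            fnmatch.fnmatch(key, pattern) for pattern in excludes)

    def copy_one(key):
        dest = dst_prefix + key[len(src_prefix):]
        # Server-side copy: the object data never passes through this host.
        client.copy_object(Bucket=bucket, Key=dest,
                           CopySource={"Bucket": bucket, "Key": key})
        return dest

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sorted(pool.map(copy_one, filter(wanted, keys)))
```

Threads rather than processes should be enough here, since each worker only waits on an HTTP request. Very large objects may need a multipart copy instead of a single COPY request.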

*_postrelease
Issue: Removes index.html for Firefox, updates symlinks, updates bouncer aliases
Code: http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l378
Fix: convert symlink to a redirect (x-amz-website-redirect-location) for top level dir ?
SWAG: Unknown until more technical investigation done
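
For the redirect idea, a small sketch assuming a boto3-style client, where the x-amz-website-redirect-location header is set through the `WebsiteRedirectLocation` argument of `put_object`. The function name and keys are hypothetical. As far as I know the header applies to a single object and is only honoured by the S3 website endpoint, so a directory symlink would need either one redirect object per file or a different mechanism.

```python
def publish_redirect(client, bucket, alias_key, target):
    """Create an empty object at alias_key that redirects to target.

    `target` must be a path in the same bucket (starting with "/") or an
    absolute http(s) URL.
    """
    if not target.startswith(("/", "http://", "https://")):
        raise ValueError("redirect target must start with /, http:// or https://")
    client.put_object(Bucket=bucket, Key=alias_key, Body=b"",
                      WebsiteRedirectLocation=target)
```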
> *_repack: 

Clarification - this is l10n repacks.

> Issue: Disable nightly style partials in release builds
> Code: http://hg.mozilla.org/mozilla-central/file/default/tools/update-packaging/Makefile.in#l81
> Fix: Set MOZ_AUTOMATION_UPDATE_PACKAGING=0 in beta & release mozconfigs, or
> backport work on date which adds MOZ_AUTOMATION_UPDATE_PACKAGING_PARTIAL and
> set it to 0

Slight miscue here. The default for MOZ_AUTOMATION_UPDATE_PACKAGING is 0, but this bit of the Makefile isn't correct:
  full-update:: complete-patch $(if $(MOZ_AUTOMATION_UPDATE_PACKAGING),automation-partial-patch)

I have a fix on the date branch to port over, see bug 1170913.
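
One plausible reading of why that line misbehaves: GNU make's $(if ...) treats any non-empty string as true, so a value of 0 still selects the partial target. The Python below only mimics that rule for illustration; both function names are made up and this is not the fix from bug 1170913.

```python
def make_if(condition: str, then: str, otherwise: str = "") -> str:
    """Mimic GNU make's $(if condition,then[,else]): non-empty means true."""
    return then if condition.strip() else otherwise


def full_update_prereqs(moz_automation_update_packaging: str) -> list:
    """Prerequisites of full-update:: for a given variable value."""
    extra = make_if(moz_automation_update_packaging, "automation-partial-patch")
    return ["complete-patch"] + ([extra] if extra else [])
```

Under this model `full_update_prereqs("0")` still includes automation-partial-patch, and only an empty value drops it.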
Depends on: 1170913
*_postrelease also is responsible for copying certain partner builds into the 'bundles' directory.
> *_postrelease
> Issue: Removes index.html for Firefox, updates symlinks, updates bouncer
> aliases

I missed some partner builds getting copied too. See http://hg.mozilla.org/build/tools/file/default/scripts/release/stage-tasks.py#l206 for the code for that. We may need to re-host those somewhere.
Depends on: 1148592
Depends on: 1175085
Depends on: 1175086
Summary of dependent bugs:

*_partner_repack --> bug 1173343

*_repack (l10n) --> bug 1170913

*_antivirus  --> bug 1145774, but we might choose to build another monolithic scanner in the short term. Using the new upload host is currently off the table.

*_checksums (including xulrunner) --> bug 1174145
> Unknowns: what we are doing with contrib/ directories; maybe create in
> post_upload instead when candidates dir first created  --> bug 1148592

check_permissions --> bug 1175085

*_push_to_mirrors  bug 1160380 probably, but don't forget index.html issue

*_postrelease --> bug 1175086
Assignee: nthomas → nobody
Status: ASSIGNED → NEW
Summary: Adjust release automation for ftp.m.o --> S3 migration → [tracker] Adjust release automation for ftp.m.o --> S3 migration
Depends on: 1181542
> *_push_to_mirrors  bug 1160380 probably, but don't forget index.html issue

See Bug 1181542 instead.
I'm slightly confused as to the latest plan here.

We've switched from ftp.m.o to archive.m.o (or similar), but yet ftp.m.o is still accessible, and still being given as the log URL for jobs in builds-4hr (bug 1192019). I also thought we were switching to S3? Or is archive.m.o on S3? Is this just for buildbot?

I'm just trying to figure out how Treeherder can support the "go to build directory" feature (or something similar that lets users discover/share links to binaries and/or build artifacts), in bug 1160410 and friends.

Thanks :-)
Flags: needinfo?(nthomas)
Depends on: 1192019
It's still early days in the transition. You're correct that archive.m.o will be the new domain, backed by S3, but to date we've only disabled ftp:// on ftp. No data has moved yet, and ftp & archive remain identical in DNS. IT will communicate the timetable when it's finalized.

I'm hopeful that there will be nothing for treeherder to do. New builds should have archive.m.o urls at the point of ingestion into TH, and older ftp.m.o urls should continue to work until those builds are expired off.
Flags: needinfo?(nthomas)
Depends on: 1196101
Depends on: 1198667
Depends on: 1198668
Depends on: 1199066
(In reply to Nick Thomas [:nthomas] from comment #12)
> I'm hopeful that there will be nothing for treeherder to do. New builds
> should have archive.m.o urls at the point of ingestion into TH, and older
> ftp.m.o urls should continue to work until those builds are expired off.

Ah great - thank you :-)
Depends on: 1211297
Depends on: 1203907
Blocks: 1213721
Depends on: 1213771
Depends on: 1213790
Depends on: 1214971
This wraps all the changes from the dependent bugs. The important parts are in the production configs:
* ftpServer changes to the new canonical domain archive.mozilla.org
* stagingServer changes to upload.{role}.productdelivery.{env}.mozaws.net for role ffxbld (includes mobile) or tbirdbld, and env stage or prod
* add S3Credentials for 'push to mirrors' (linux slaves only)
* add S3Bucket, net-mozaws-{env}-delivery-{suffix} for env stage or prod, and suffix is firefox for firefox, otherwise archive
* remove defunct ausHost, ausUser, ausSshKey, xulrunnerPlatforms, xulrunner_mozconfigs (will land before bug 1196278)
* remove disablePermissionCheck, as requested on bug 1175085
* for fennec, remove autoGenerateChecksums=False, as the new checksums code can handle it too

Also syncs a lot of stuff from production --> staging to minimize the diffs between them.
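
For reference, the naming rules above can be sketched as a helper. The function is illustrative only and is not part of the patch, which sets these keys directly in each releaseConfig; S3Credentials is left out because its value isn't described here.

```python
def s3_release_config(product: str, role: str, env: str) -> dict:
    """Build the S3-related releaseConfig values from the rules listed above."""
    if role not in ("ffxbld", "tbirdbld"):
        raise ValueError("role must be ffxbld or tbirdbld")
    if env not in ("stage", "prod"):
        raise ValueError("env must be stage or prod")
    # The bucket suffix is "firefox" for firefox, otherwise "archive".
    suffix = "firefox" if product == "firefox" else "archive"
    return {
        "ftpServer": "archive.mozilla.org",
        "stagingServer": f"upload.{role}.productdelivery.{env}.mozaws.net",
        "S3Bucket": f"net-mozaws-{env}-delivery-{suffix}",
    }
```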
Attachment #8676003 - Flags: review?(rail)
Comment on attachment 8676003 [details] [diff] [review]
[buildbot-configs] Combined config changes for S3

Review of attachment 8676003 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla/release-fennec-mozilla-beta.py
@@ -87,4 @@
>  releaseConfig['ausServerUrl']        = 'https://aus4.mozilla.org'
> -releaseConfig['ausHost']             = 'aus3-staging.mozilla.org'
> -releaseConfig['ausUser']             = 'ffxbld'
> -releaseConfig['ausSshKey']           = 'ffxbld_rsa'

DIAF!!!
Attachment #8676003 - Flags: review?(rail) → review+
Depends on: 1223019
Depends on: 1226105
No longer depends on: 1181710
I think we are done here
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED