Closed
Bug 1186297
Opened 9 years ago
Closed 8 years ago
Switch ash branch to upload via new S3 frontends
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: nthomas, Unassigned)
Attachments
(2 files)
30.64 KB, patch
2.43 KB, patch (nthomas: review+, rail: checked-in+)
... as an initial loadtest. Ash was suggested.
Reporter
Comment 1•9 years ago
I've merged m-c to ash again, and this patch should take care of all the desktop and android builds, plus the public parts of device builds. The exception is b2g manifests, which will be broken by a mismatched key and host. I am unsure if we still need these, but if we do, we can ask for a b2gbld upload endpoint or swap to using ffxbld. There appear to be no remaining buildbot factory builds on ash, and in fact on m-c it's only hg-bundle, xulrunner, and l10n dep jobs which still reference 'stage.mozilla.org' explicitly. None of those are enabled on ash.
This is waiting on updated known_hosts being deployed on enough slaves to avoid false upload failures.
Reporter
Comment 2•9 years ago
Pushed that to ash - https://hg.mozilla.org/projects/ash/rev/917678add467
Comment 3•9 years ago
Uploads look OK, but the new post_upload still isn't printing out the URLs for the uploaded resources.
Flags: needinfo?(oremj)
Comment 4•9 years ago
Can you give me the command you ran and the output that it generated?
Flags: needinfo?(oremj) → needinfo?(catlee)
Comment 5•9 years ago
The log is here:
http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/ash-linux-debug/1438229081/ash-linux-debug-bm71-build1-build0.txt.gz
The command we ran was:
post_upload.py --tinderbox-builds-dir ash-linux-debug -p firefox -i 20150729210441 --revision 917678add467 --release-to-tinderbox-dated-builds --release-to-latest-tinderbox-builds
Flags: needinfo?(catlee)
Comment 6•9 years ago
The --release-to-tinderbox-dated-builds option was broken. That should be working now. In the previous post_upload script, --release-to-latest-tinderbox-builds would symlink the dated builds directory to latest. Can we stop doing that? If not, how do we want to handle that operation, since we can't atomically flip a directory?
Comment 7•9 years ago
We agreed earlier that, instead of symlinking, you could copy the files into the latest directory.
Comment 8•9 years ago
I can copy them over, but what about files that are already there? Should I leave them or delete everything in the directory?
Flags: needinfo?(catlee)
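To make the "leave them or delete everything" question concrete, here is a minimal sketch of one possible answer: copy every file from the dated directory into latest, then delete whatever in latest was not just copied. It only computes the plan from two key listings and does not talk to S3. The function name, its parameters, and the example keys are hypothetical, and this is not what post_upload.py is known to do.

```python
def plan_latest_refresh(dated_keys, latest_keys, dated_prefix, latest_prefix):
    """Return (copies, deletes) that make latest_prefix mirror dated_prefix.

    Hypothetical helper: keys are plain strings such as S3 object keys.
    copies  -> list of (source_key, destination_key) pairs
    deletes -> sorted list of keys under latest_prefix with no fresh copy
    """
    copies = []
    wanted = set()
    for key in dated_keys:
        if not key.startswith(dated_prefix):
            continue  # ignore anything outside the dated directory
        dest = latest_prefix + key[len(dated_prefix):]
        copies.append((key, dest))
        wanted.add(dest)
    # Anything already in latest that is not overwritten is stale.
    deletes = sorted(
        k for k in latest_keys
        if k.startswith(latest_prefix) and k not in wanted
    )
    return copies, deletes


base = "pub/firefox/tinderbox-builds/ash-linux/"
copies, deletes = plan_latest_refresh(
    dated_keys=[base + "1438229081/a.tar.bz2", base + "1438229081/b.zip"],
    latest_keys=[base + "latest/a.tar.bz2", base + "latest/old.zip"],
    dated_prefix=base + "1438229081/",
    latest_prefix=base + "latest/",
)
```

Running the copies before the deletes keeps latest populated throughout, at the cost of a short window in which old and new files coexist, since the directory can't be flipped atomically.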
Comment 9•9 years ago
Now getting these errors:
13:53:00 INFO - /builds/slave/ash-lx-00000000000000000000000/build/src/obj-firefox/_virtualenv/bin/python /builds/slave/ash-lx-00000000000000000000000/build/src/build/upload.py --base-path ../../dist \
13:53:00 INFO - '../../dist/firefox-42.0a1.en-US.linux-i686.tar.bz2' '../../dist/linux-i686/xpi/firefox-42.0a1.en-US.langpack.xpi' '../../dist/firefox-42.0a1.en-US.linux-i686.common.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.cppunittest.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.xpcshell.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.mochitest.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.reftest.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.web-platform.tests.zip' '../../dist/firefox-42.0a1.en-US.linux-i686.crashreporter-symbols.zip' '../../dist//firefox-42.0a1.en-US.linux-i686.txt' '../../dist//firefox-42.0a1.en-US.linux-i686.json' '../../dist//firefox-42.0a1.en-US.linux-i686.mozinfo.json' '../../dist//test_packages.json' '../../dist/jsshell-linux-i686.zip' ../../dist/host/bin/mar ../../dist/host/bin/mbsdiff \
13:53:00 INFO - '../../dist//firefox-42.0a1.en-US.linux-i686.checksums' '../../dist//firefox-42.0a1.en-US.linux-i686.checksums'.asc
13:56:58 INFO - 2015/07/30 20:56:58 putting /tmp/tmp.a43SLswCoz//firefox-42.0a1.en-US.linux-i686.tar.bz2 to net-mozaws-prod-delivery-firefox/pub/firefox/tinderbox-builds/ash-linux/1438229081/firefox-42.0a1.en-US.linux-i686.tar.bz2 err: NoSuchBucket: The specified bucket does not exist
13:56:58 INFO - status code: 404, request id: []
13:56:58 INFO - 2015/07/30 20:56:58 putting /tmp/tmp.a43SLswCoz/linux-i686/xpi/firefox-42.0a1.en-US.langpack.xpi to net-mozaws-prod-delivery-firefox/pub/firefox/tinderbox-builds/ash-linux/1438229081/firefox-42.0a1.en-US.langpack.xpi err: NoSuchBucket: The specified bucket does not exist
13:56:58 INFO - status code: 404, request id: []
13:56:59 INFO - 2015/07/30 20:56:59 putting /tmp/tmp.a43SLswCoz//firefox-42.0a1.en-US.linux-i686.common.tests.zip to net-mozaws-prod-delivery-firefox/pub/firefox/tinderbox-builds/ash-linux/1438229081/firefox-42.0a1.en-US.linux-i686.common.tests.zip err: NoSuchBucket: The specified bucket does not exist
13:56:59 INFO - status code: 404, request id: []
Flags: needinfo?(catlee)
Comment 10•9 years ago
You'll need the bucket-prefix flag we talked about earlier. --bucket-prefix "net-mozaws-stage-delivery"
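Judging only from the log in comment 9, where the bucket was net-mozaws-prod-delivery-firefox, the bucket name appears to be the prefix joined to the product name. A minimal sketch of that assumed naming rule follows; the function name and the default value are hypothetical and are not taken from the real upload code.

```python
def bucket_for(product, bucket_prefix="net-mozaws-prod-delivery"):
    """Assumed rule: bucket name is '<bucket_prefix>-<product>'."""
    return "{}-{}".format(bucket_prefix, product)


# The default reproduces the bucket name seen in the failing log.
prod_bucket = bucket_for("firefox")
# With the suggested flag value the upload would target a stage bucket.
stage_bucket = bucket_for("firefox", "net-mozaws-stage-delivery")
```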
Reporter
Comment 11•9 years ago
Merged m-c to ash, and added the bucket prefix, at https://hg.mozilla.org/projects/ash/rev/b3432877c7fe.
Reporter
Comment 12•9 years ago
We should turn on other jobs to make ash like m-c, e.g. nightlies, l10n, spidermonkey and other projects, in order to flush out ssh/scp/rsync usage. Transition to the new host is slated for mid-October.
Comment 13•9 years ago
config diff can be viewed at https://gist.github.com/rail/2fc97bc7add5573af0ff
There are some unused variables, but cleaning them up would be another bug. If you prefer we can tweak some variables, like enable_nightly_everytime, enable_weekly_bundle (shouldn't we use bundleclone?).
Attachment #8659885 -
Flags: review?(nthomas)
Reporter
Comment 14•9 years ago
Comment on attachment 8659885 [details] [diff] [review]
ash-buildbot-configs.diff
Let's either back off on periodic_start_hours now or soon, unless those builds are checking for pushes as well as watching the clock. Strange that dep_signing_servers is being unset for debug builds.
Attachment #8659885 -
Flags: review?(nthomas) → review+
Comment 15•9 years ago
Comment on attachment 8659885 [details] [diff] [review]
ash-buildbot-configs.diff
https://hg.mozilla.org/build/buildbot-configs/rev/eb9fa2c84aee
pgo strategy set to per-checkin
Attachment #8659885 -
Flags: checked-in+
Comment 16•9 years ago
pushed to production
https://hg.mozilla.org/build/buildbot-configs/rev/ed59fecd49f1
Reporter
Comment 17•9 years ago
I merged inbound to ash to see how we go - https://treeherder.mozilla.org/#/jobs?repo=ash&revision=4089002b3fae
Summary: Switch a branch to upload via new S3 frontends → Switch ash branch to upload via new S3 frontends
Comment 18•9 years ago
I investigated multiple options to figure out what would be the optimal instance type for the upload host.
One idea was to simulate a load comparable to what we have now.
The first step was to figure out the current upload rates. I tried to use graphite, but the data is too coarse: the tx/rx rates are not what we need.
I took this approach to figure out the upload rates we experience:
* Find all files modified in a particular period of time. I applied this to the Firefox, Fennec and B2G files living on stage.m.o that were modified within 24h on a busy day with multiple releases in flight (around Sep 15).
* Generate a time series and analyze the rates. In our case the max is most important because we have to plan for peaks.
The results are below:
30s max: 874 Mbps
1m max: 658 Mbps
3m max: 477 Mbps
5m max: 439 Mbps
10m max: 378 Mbps
1h max: 320 Mbps
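A minimal sketch of how such windowed maxima could be computed, assuming each file is represented by its modification time and size and that its whole size is attributed to the second of its mtime. This is an illustration under those assumptions, not the script that produced the numbers above; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def max_rate_mbps(files, window_s):
    """Peak average rate in Mbps over any window of window_s seconds.

    files: iterable of (mtime_epoch_seconds, size_bytes) pairs.
    Each file's bytes are counted at the second of its mtime.
    """
    per_second = defaultdict(int)
    for mtime, size in files:
        per_second[int(mtime)] += size
    if not per_second:
        return 0.0

    times = sorted(per_second)
    best = 0
    running = 0
    start = 0
    for t in times:
        running += per_second[t]
        # Drop seconds that fall outside the window (t - window_s, t].
        while times[start] <= t - window_s:
            running -= per_second[times[start]]
            start += 1
        best = max(best, running)
    # bytes -> bits, averaged over the window, expressed in megabits.
    return best * 8 / window_s / 1e6


# Hypothetical sample: two 1 MB files 10 s apart, one 0.5 MB file later.
sample = [(0, 1_000_000), (10, 1_000_000), (100, 500_000)]
rate_30s = max_rate_mbps(sample, 30)
rate_5s = max_rate_mbps(sample, 5)
```

Shorter windows give higher peaks because a burst is averaged over less time, which matches the shape of the figures above (874 Mbps at 30s down to 320 Mbps at 1h).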
Load simulation is a somewhat tricky task which may take a lot of resources. We thought that we could use taskcluster to spin up a lot of clients and upload some files. This would require some extra work to prepare proper images with all the needed secrets baked in, and to write custom scripts to generate traffic.
From our past experience with proxxy, we will need quite a beefy instance (assuming we can't use multiple instances in parallel) to meet the needed network performance. Per http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-ec2-config.html m3.2xlarge might be what we need.
Reporter
Comment 19•9 years ago
rail, could you do what's necessary to turn on funsize for ash ?
Flags: needinfo?(rail)
Reporter
Comment 20•9 years ago
I've pushed to ash with fixes for
* desktop balrog submission (replace archive.m.o with CDN instead of ftp.m.o), would have probably hit this on android too
* android multi-locale's missing config
See https://treeherder.mozilla.org/#/jobs?repo=ash&revision=9595d85e38b8.
B2G builds will still be busted uploading to the new system, probably because upload.ffxbld.stage needs to be allowed to write to /pub/b2g/.
Comment 21•9 years ago
(In reply to Nick Thomas [:nthomas] from comment #19)
> rail, could you do what's necessary to turn on funsize for ash ?
You can add ash to https://github.com/mozilla/funsize/blob/master/funsize/worker.py#L22. Deployment is a bit tricky (need to document it..), I can help with that.
Flags: needinfo?(rail)
Reporter
Comment 22•9 years ago
Comment 23•9 years ago
Deployed new funsize:
remote: https://hg.mozilla.org/build/puppet/rev/28303fd6d4b7
remote: https://hg.mozilla.org/build/puppet/rev/61152257c0f7
Reporter
Comment 24•9 years ago
Current status: everything is green on ash except
* public b2g uploads, the upload host isn't allowed access to b2g/, bug 1211374
* b2g manifests will fail to upload after the switch, bug 1211371
* hazard builds do their own upload, bug 1211402
* the blocklist updates will break on esr38 when we lose tinderbox-builds/foo/latest, and have no nightlies, bug 1211770
(In reply to Rail Aliiev [:rail] from comment #23)
> Deployed new funsize:
Not sure if it's not working, or if it needs more nightlies to be able to make progress. In retrospect we only need to generate partials one build back.
Reporter
Comment 25•9 years ago
Still using ash to finish up the source manifests and socorro json files, and then possibly for redoing the pvtbuilds uploads.
Reporter
Comment 26•8 years ago
Finished using ash a long time ago. See https://github.com/mozilla-releng/funsize/pull/41/files to clean up the funsize config.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Assignee
Updated•7 years ago
Component: General Automation → General