Closed
Bug 1025842
Opened 10 years ago
Closed 10 years ago
mock 'archives' are fragile on spot instances
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: nthomas, Unassigned)
Bug 1024962 was try spot in us-west-2. Today we have bld spot in us-west-2 showing similar symptoms, e.g. bld-linux64-spot-443, 459, 477, 320, 473, 466 all failing to unpack the root archive mock caches in what look like newly launched spot instances. The failure is quick, and the next build is successful. I had assumed it was bug 1023477, but it seems we have backed out the fs changes there.
Comment 2•10 years ago
and closed inbound now due to a higher failure rate
Comment 3•10 years ago
(In reply to Carsten Book [:Tomcat] from comment #2)
> and closed inbound now due to a higher failure rate

Update 6:20 - all integration trees are closed since this problem is spreading.

Affected builders, as examples:
slave: bld-linux64-spot-1002 - https://tbpl.mozilla.org/php/getParsedLog.php?id=41788883&tree=Mozilla-Inbound&full=1
slave: bld-linux64-spot-431 - https://tbpl.mozilla.org/php/getParsedLog.php?id=41788917&tree=Mozilla-Inbound&full=1
slave: bld-linux64-spot-1001 - https://tbpl.mozilla.org/php/getParsedLog.php?id=41789434&tree=Fx-Team&full=1
slave: bld-linux64-spot-075 - https://tbpl.mozilla.org/php/getParsedLog.php?id=41789426&tree=Fx-Team
slave: bld-linux64-spot-175 - https://tbpl.mozilla.org/php/getParsedLog.php?id=41789338&tree=Fx-Team
Updated•10 years ago
Severity: critical → blocker
Comment 4•10 years ago
Checking...
Comment 5•10 years ago
I again suspect changes in bug 1023477...
Comment 6•10 years ago
I'm still seeing failures: https://tbpl.mozilla.org/php/getParsedLog.php?id=41792604&tree=Mozilla-Inbound
Are these just from old instances?
Comment 7•10 years ago
Yes, I just checked bld-linux64-spot-092; it was based on one of the "bad" AMIs.
Reporter
Comment 8•10 years ago
Have we purged all the instances based on the bad AMI(s)?
Comment 9•10 years ago
(In reply to Nick Thomas [:nthomas] from comment #8)
> Have we purged all the instances based on the bad AMI(s)?

http://hg.mozilla.org/build/cloud-tools/file/default/scripts/aws_terminate_by_ami_id.py
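For illustration only: the contents of the linked script are not shown in this bug, so the following is not that script. It is a minimal sketch of the selection step that purging by AMI implies, assuming instance data shaped like an EC2 DescribeInstances response (a list of reservations, each holding instances with `InstanceId` and `ImageId` keys). The function name, the AMI IDs, and the instance IDs are all hypothetical.

```python
def instances_on_bad_amis(reservations, bad_ami_ids):
    """Return the IDs of instances launched from any AMI in bad_ami_ids.

    `reservations` is assumed to look like the "Reservations" list of an
    EC2 DescribeInstances response; nothing here talks to AWS.
    """
    bad = set(bad_ami_ids)
    return [
        inst["InstanceId"]
        for res in reservations
        for inst in res.get("Instances", [])
        if inst.get("ImageId") in bad
    ]


# Hypothetical sample data, not taken from the bug.
sample = [
    {"Instances": [
        {"InstanceId": "i-0001", "ImageId": "ami-bad"},
        {"InstanceId": "i-0002", "ImageId": "ami-good"},
    ]},
    {"Instances": [
        {"InstanceId": "i-0003", "ImageId": "ami-bad"},
    ]},
]

to_terminate = instances_on_bad_amis(sample, ["ami-bad"])
```

Keeping the selection separate from the termination call makes it possible to review the list of matching instances before anything is shut down.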
Comment 10•10 years ago
bld-linux64-spot-494 looks to have failed 4 times in a similar manner to the ones from https://bugzilla.mozilla.org/show_bug.cgi?id=1025842#c3

Also, possibly related: bld-linux64-spot-136 had issues with the zip/objcopy step in 'make buildsymbols'.

How can I get the list of 'bad' AMIs so I can verify either of these?

rail - I'm guessing you have run that script against all known bad AMIs, so these spot hostnames might be good now?
Comment 11•10 years ago
I think this may be affecting more than just spot instances. Single-locale Android repacks - at least the ones that run on ix machines - are failing because they never run "mock install". E.g.: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-l10n/mozilla-aurora-android-l10n_3-unknown-bm71-build1-build1.txt.gz

mock install was never run (presumably because this condition failed: http://mxr.mozilla.org/build-central/source/mozharness/mozharness/mozilla/mock.py#200), and autoconf213 was never installed.
Comment 12•10 years ago
Seems to be working better now.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Assignee
Updated•6 years ago
Component: General Automation → General