Closed Bug 1304673 Opened 9 years ago Closed 9 years ago

All Trees closed for new windows build instances failing to clone

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cbook, Assigned: markco)

Attachments

(1 file)

10:34 < nthomas|away> aobreja|lunch: aselagea|lunch - pretty big deal - new windows instances failing to clone, eg http://buildbot-master70.bb.releng.use1.mozilla.com:8001/builders/fuzzer-win64-rev2/builds/8451/steps/clone_scripts/logs/stdio
10:34 < nthomas|away> Abort: could not find web.cacerts: /etc/mercurial/cacert/cacert.pem
10:34 < nthomas|away> builds retry and fail, queue fills up
10:35 < nthomas|away> Tomcat|lunch: ^^
10:38 < nthomas|away> fallout from https://bugzilla.mozilla.org/show_bug.cgi?id=1303766 I think all trees closed now
10:39 < nthomas|away> https://hg.mozilla.org/build/puppet/file/production/modules/mercurial/templates/hgrc.erb#l45 not so good on windows
10:40 < nthomas|away> given the OS limits in https://bug1303766.bmoattachments.org/attachment.cgi?id=8793369
This affects running builds on integration, central, etc., and Windows nightlies on trunk are also busted so far :(
<nthomas> ok, so spot-b-2008-2016-09-22-09-44 is going to get whacked, that’s ami-0918c469 in usw2, and ami-f6e991e1 in use1
<nthomas> done. new instances should be OK. going to terminate existing bad instances now
<nthomas> 29 in use1 only
We haven't seen any failures on try yet, but it seems likely that the y-2008 AMIs would be affected too.
markco, aselagea: the path to the cert is going to need to depend on the platform. What works for OS X and Linux won't work for windows. The cert bundle location is currently specified in the ini file for windows. We need to make sure we address this before any new windows AMIs are generated (which will happen automatically tonight).
Flags: needinfo?(mcornmesser)
Flags: needinfo?(aselagea)
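A minimal sketch of the platform-dependent lookup described above, written in Python for illustration only (the actual fix lives in the Puppet hgrc.erb template and is not reproduced here). The POSIX path is the one from the error message; the Windows path is a hypothetical placeholder, since the real location is not stated in this bug, and the function names are invented for this example.

```python
import sys

# Path taken from the error message in this bug (Linux / OS X).
POSIX_CACERTS = "/etc/mercurial/cacert/cacert.pem"
# Hypothetical placeholder: the real Windows bundle location is not given in this bug.
WINDOWS_CACERTS_PLACEHOLDER = r"C:\path\to\cacert.pem"


def cacerts_path(platform=None, windows_path=WINDOWS_CACERTS_PLACEHOLDER):
    """Return the CA bundle path to use for the given sys.platform value."""
    platform = sys.platform if platform is None else platform
    # sys.platform is "win32" on Windows; "darwin" and "linux" fall through.
    if platform.startswith("win"):
        return windows_path
    return POSIX_CACERTS


def render_web_section(platform=None):
    """Render the [web] section of an hgrc with the platform-specific path."""
    return "[web]\ncacerts = {}\n".format(cacerts_path(platform))


if __name__ == "__main__":
    print(render_web_section())
```

The point of the sketch is only that a single hard-coded POSIX path cannot be shared by all platforms; whichever mechanism is used, the Windows value has to come from somewhere that exists on that image.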
Because the axe is already sharpened, I also removed
* spot-y-2008-2016-09-22-09-45, usw2: ami-de1fc3be, use1: ami-2aea923d; and terminated 12 instances running in use1
* spot-t-w732-2016-09-22-08-57, usw2: ami-851dc1e5, use1: ami-91f48c86
Was going to terminate w732 instances but couldn't find any test failures, so may have jumped the gun there.
Summary: All Trees closed for new windows instances failing to clone → All Trees closed for new windows build instances failing to clone
Attached patch hgrc.diff
This should set the cert path correctly for all platforms. :aselagea: feel free to land this once you've reviewed it, since I'm on PTO. Mark: we should regenerate the AMIs to make sure they're functional.
Flags: needinfo?(aselagea)
Attachment #8793717 - Flags: review?(aselagea)
Attachment #8793717 - Flags: review?(aselagea) → review+
Trees reopened, since we think Nick cleaned up everything.
Reduced the importance of this bug to "major" to stop the alerts.
Severity: blocker → major
Assignee: nobody → mcornmesser
The CA bundle that we added in bug 1303766 (the most recent one, 2016) is different from the one currently stored on the Windows machines (which is 2013). I suppose the latter is integrated in the AMI, so I wonder if we need to update it.
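One way to check how the two bundles actually differ is to compare the fingerprints of the certificates they contain. The following is a rough sketch, assuming both bundles are plain PEM files; the function names are made up for this example and nothing here is taken from the patch.

```python
import hashlib
import re
import ssl

# Matches each PEM-encoded certificate block inside a bundle.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def bundle_fingerprints(pem_text):
    """Return the set of SHA-256 fingerprints of all certs in a PEM bundle."""
    return {
        hashlib.sha256(ssl.PEM_cert_to_DER_cert(block)).hexdigest()
        for block in PEM_BLOCK.findall(pem_text)
    }


def diff_bundles(old_text, new_text):
    """Report which certificate fingerprints were removed from or added to the bundle."""
    old = bundle_fingerprints(old_text)
    new = bundle_fingerprints(new_text)
    return {"removed": old - new, "added": new - old}
```

Reading the 2013 and 2016 files as text and passing them to `diff_bundles` would show whether any root needed for hg.mozilla.org is only present in the newer one.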
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(mcornmesser)
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard