Closed Bug 1159637 Opened 9 years ago Closed 7 years ago

[provisioner] Ensure that SecurityGroups exist in the region which each WorkerType is configured

Categories

(Taskcluster :: Services, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jhford, Assigned: jhford)

Sometime over the night, we switched from everything working well to all of a sudden every provisioning iteration failing with an error like this:

    [alert-operator] InvalidGroup.NotFound: The security group 'docker-worker' does not exist in default VPC 'vpc-<snip>'
    InvalidGroup.NotFound: The security group 'docker-worker' does not exist in default VPC 'vpc-<snip>'

These were happening with calls in the us-east-1 region, which is a new region for the provisioner. Someone must have added this region to some workerTypes overnight:

    b2g-desktop-debug.json: "region": "us-east-1",
    b2g-desktop-opt.json: "region": "us-east-1",
    b2gbuild-desktop.json: "region": "us-east-1",
    b2gbuild-emulator-kk.json: "region": "us-east-1",
    b2gbuild.json: "region": "us-east-1",
    b2gtest-emulator.json: "region": "us-east-1",
    b2gtest.json: "region": "us-east-1",
    emulator-ics-debug.json: "region": "us-east-1",
    emulator-ics.json: "region": "us-east-1",
    emulator-jb-debug.json: "region": "us-east-1",
    emulator-jb.json: "region": "us-east-1",

The problem is that these regions do not have the docker-worker security group in them... The result is the InvalidGroup.NotFound issue noted above.

There are a couple options I can see to avoid this in future:

1. Try to automagically copy over security groups... this feels like an awful idea
2. Verify that the required security groups and AMIs exist in the region when updating/creating a workerType

I think that the right option is 2.
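For illustration only, the check in option 2 could look roughly like the sketch below. This is not the provisioner's code: the workerType shape (a `launchSpecification` with `SecurityGroups`, plus a `regions` list), the function name `missing_security_groups`, and the `fake_list_groups` lookup are all assumptions made up for this example. A real lookup would have to query EC2 in each region (for instance via a DescribeSecurityGroups call); here it is replaced by an in-memory table so the sketch runs on its own.

```python
from typing import Callable, Dict, Iterable, List


def missing_security_groups(
    worker_type: dict,
    list_groups: Callable[[str], Iterable[str]],
) -> Dict[str, List[str]]:
    """Return {region: [missing group names]} for every configured region
    that lacks at least one security group the workerType requires."""
    # Hypothetical workerType shape; the real schema may differ.
    required = worker_type["launchSpecification"]["SecurityGroups"]
    missing: Dict[str, List[str]] = {}
    for entry in worker_type["regions"]:
        region = entry["region"]
        existing = set(list_groups(region))
        absent = [name for name in required if name not in existing]
        if absent:
            missing[region] = absent
    return missing


# Stand-in for a per-region EC2 query (assumption: data invented for the example).
FAKE_GROUPS = {
    "us-west-2": {"default", "docker-worker"},
    "us-east-1": {"default"},
}


def fake_list_groups(region: str) -> Iterable[str]:
    return FAKE_GROUPS.get(region, set())


worker_type = {
    "launchSpecification": {"SecurityGroups": ["docker-worker"]},
    "regions": [{"region": "us-west-2"}, {"region": "us-east-1"}],
}

problems = missing_security_groups(worker_type, fake_list_groups)
if problems:
    # On create/update, this is where the definition would be rejected.
    print("refusing workerType, missing security groups:", problems)
```

Running the check when a workerType is created or updated means a bad region entry is rejected up front instead of surfacing later as InvalidGroup.NotFound on every provisioning iteration.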
Sorry about that, I was the one that started adding those in preparation for moving away from us-west because of a lot of issues we were seeing yesterday. I didn't realize that the security group wasn't set up.
Summary: Ensure that AMIs and SecurityGroups exist in the region which each WorkerType is configured → [provisioner] Ensure that AMIs and SecurityGroups exist in the region which each WorkerType is configured
I agree option (2) seems like the way to go.
Component: TaskCluster → AWS-Provisioner
Product: Testing → Taskcluster
Looks like work for this might be done under: https://github.com/taskcluster/aws-provisioner/pull/34
That's only for AMIs, but I bet I could also do security groups. That patch is a little old, so I should update it.
AMI checks were already done.
Summary: [provisioner] Ensure that AMIs and SecurityGroups exist in the region which each WorkerType is configured → [provisioner] Ensure that SecurityGroups exist in the region which each WorkerType is configured
Now that we're collecting information on failures in provisioning and working on bubbling up error messages, we don't really need to do this. When an invalid entry is used, we'll see failures when requesting resources. Those errors will shortly be bubbled up into the provisioner UI, so debugging will be easy for anyone. The error messages aren't visible yet, but are in progress.
Assignee: nobody → jhford
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: AWS-Provisioner → Services