Require security group IDs instead of names in worker type definitions



a year ago
a year ago


(Reporter: dustin, Assigned: dustin)


When requesting instances in a VPC, you must specify the IDs of the security groups. When requesting instances in EC2-Classic, you can specify the names or the IDs of the security groups. 

We are not using "EC2-Classic" anymore, with the advent of bug 1387540.
John, thoughts on this?  We could try to translate names to IDs, but groups are scoped to VPCs, so that would require determining the VPC for the given subnet, then determining the security group of the given name in that VPC.
Flags: needinfo?(jhford)
Greg did a little digging on this, and it seems that this documentation is wrong -- it doesn't say anything about "default" or "non-default" VPCs, and we are not using EC2-Classic at all.  So apparently SG names are allowed.  Hopefully they're also allowed for nondefault VPCs, and this will be INVALID.  We'll find out when we test provisioning in staging.
I think the documentation is partially right -- if you specify a SubnetId + SecurityGroups, you get a 400 error.  Removing SecurityGroups lets it succeed.
Flags: needinfo?(jhford)
John, just re-setting the needinfo to get a bump on this -- what are your thoughts on the best way to solve this issue?  See also comment 1.
Flags: needinfo?(jhford)
indeed, from Sentry:

MissingParameter: When specifying a security group you must specify a group id for each item
Assignee: nobody → dustin
That's unfortunate that we can't use the name when we have a subnet specified, since aiui, security groups are shared between availability zones.  One thing which we were thinking of doing, which is outside the scope of this project for sure, is managing security groups with ec2-manager.  If we did that, we'd be able to store the security group id easily in the database to do mapping.  In the interim, we basically have to choose between just using Ids in the worker types (yuck!) or resolving them at launch specification generation time.  Let's do the latter.

In order to not totally break the API, we should cache the values.  Caching in memory is probably all we want.  I suspect that we can safely cache these values for ~1h and make it very clear to everyone that there could be up to that amount of lag for deleting a security group or if a new security group is created with an old name, that the old ID could be used for up to 1h.  I don't think we're changing security groups so often.

If we do this, I'd like to make sure that the only time we do this resolution is in the case where we have a subnet id specified.  In that case, we should throw an error if the workertype has a security group id already specified.  In this specific case, we should delete the name property and only specify the id one.
Flags: needinfo?(jhford)
SG's are shared between AZ's, but not between VPCs.  So the search will be subnet -> vpc -> sg.
falling back to another solution in bug 1392307.
Last Resolved: a year ago
Resolution: --- → WONTFIX
sorry, in bug 1387540.
That didn't work, so we're back to this work.
Resolution: WONTFIX → --- mostly wrapped this up, but needs some more work.  I might combine this with bug 1388481 anyway.
Last Resolved: a year agoa year ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1388436
You need to log in before you can comment on or make changes to this bug.