Closed
Bug 1152624
Opened 10 years ago
Closed 8 years ago
ami generation in use1 sometimes doesn't copy over to usw2
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: jlund, Unassigned)
References
Details
In "Bug 1149580 - disable AMI generation" I had to manually create and publish new AMIs.
There were some problems doing so: the newly created emulator64 and linux64 usw2 AMIs had an old AMI name (they were copied from an old use1 AMI):
2015-04-08 14:11:11,325 - INFO - AMI spot-tst-linux64-2015-04-08-21-00 (ami-be3e02d6) is ready
2015-04-08 14:11:11,326 - WARNING - Terminating Instance:i-03f8b8d4
2015-04-08 14:11:12,112 - INFO - Copying ami-6ebd9506 (spot-tst-linux64-2015-03-25-08-59) to us-west-2
2015-04-08 14:11:12,113 - INFO - Copying Image:ami-6ebd9506 to us-west-2
2015-04-08 14:11:13,979 - INFO - AMI created
2015-04-08 14:11:13,979 - INFO - ID: ami-f56c47c5, name: None
2015-04-08 14:11:13,979 - INFO - New AMI created. AMI ID: ami-f56c47c5
apparently this is also happening outside of manual runs (via cron):
17:05:44 <•nthomas> 2015-03-20 02:01:47,732 - INFO - AMI spot-tst-linux64-2015-03-20-08-52 (ami-887c53e0) is ready
17:05:46 <•nthomas> 2015-03-20 02:01:48,517 - INFO - Copying ami-021f336a (spot-tst-linux64-2015-03-19-16-11) to us-west-2
To fix this, I manually copied the AMIs and then re-published just usw2:
>>> linux64_ami = get_ami('us-east-1', 'tst-linux64')
>>> linux64_ami.id
u'ami-be3e02d6'
>>> new_ami_linux64 = copy_ami(linux64_ami, 'us-west-2')
>>> # repeated the same for emulator64
$ aws_publish_amis -r us-west-2
I'm not sure exactly what is happening here. I did notice that AMIs took over an hour to go from pending to available.
I think get_ami() sometimes doesn't see the newly created use1 AMI (and instead picks the previous one to copy from), then passes that one to copy_ami(). At the very least, we should verify that get_ami() returns the same AMI that was just generated.
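A minimal sketch of the verification suggested above: before copying, confirm the AMI handed back by the lookup is the one just created, rather than blindly taking the newest entry from a list. The function names and the `(ami_id, ami_name)` pair shape are hypothetical illustrations; the real cloud-tools code works with boto Image objects.

```python
from datetime import datetime

def parse_ami_timestamp(ami_name):
    """Extract the timestamp embedded in names like
    'spot-tst-linux64-2015-04-08-21-00' (last five fields: Y-M-D-H-M)."""
    parts = ami_name.rsplit("-", 5)
    return datetime(*(int(p) for p in parts[1:]))

def pick_ami_to_copy(created_ami_id, candidates):
    """candidates: list of (ami_id, ami_name) pairs, e.g. a
    describe-images result filtered by name prefix (hypothetical shape).

    Return the (id, name) of the AMI we just created; refuse to fall
    back silently to an older image, which is the failure seen in the
    logs above.
    """
    by_id = dict(candidates)
    if created_ami_id in by_id:
        return created_ami_id, by_id[created_ami_id]
    # The fresh AMI isn't visible yet: fail loudly so the caller can
    # retry, instead of copying a stale image to us-west-2.
    newest_id, newest_name = max(
        candidates, key=lambda c: parse_ami_timestamp(c[1]))
    raise RuntimeError(
        "AMI %s not visible yet (newest listed is %s %s); retry instead "
        "of copying a stale image" % (created_ami_id, newest_id, newest_name))
```

With the log data from this bug, passing `ami-be3e02d6` would succeed, while a not-yet-visible ID would raise rather than quietly returning `ami-6ebd9506`.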
Reporter
Comment 1•10 years ago
Also, it turns out you can't hack around this by running `aws_publish_amis -r us-west-2` alone, as that upsets use1. I got thousands of errors after doing that:
Apr 08 18:48:34 tst-emulator64-spot-195.test.releng.use1.mozilla.com running: post-task hook: /opt/runner/task_hook.py {"try_num": 5, "max_retries": 5, "task": "0-check_ami", "result": "RETRY"}
So I re-ran both regions (omitting -r means both are used):
aws_publish_amis
Updated•10 years ago
Summary: ami generation in use1 sometimes doesn't copy over to usw-2 → ami generation in use1 sometimes doesn't copy over to usw2
Comment 2•10 years ago
So I think the problem is that the AMI is created here:
https://github.com/mozilla/build-cloud-tools/blob/master/cloudtools/scripts/aws_create_instance.py#L187
but isn't returned from that function, so the code further down here:
https://github.com/mozilla/build-cloud-tools/blob/master/cloudtools/scripts/aws_create_instance.py#L294
has to query the list of AMIs to find the one that was just created. I wonder if most of the time this works fine, but sometimes the new AMI isn't ready yet and you end up finding an older one instead.
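A hedged sketch of the fix implied by this race: have the creation path hand the new AMI id straight to the copy step and block until it is actually visible and `available`, rather than re-querying the full AMI list. The `describe_state` callable is injected here purely so the sketch is self-contained; the real code would poll boto's image API.

```python
import time

def wait_for_ami(ami_id, describe_state, timeout=7200, interval=30,
                 sleep=time.sleep):
    """Poll until the freshly created AMI reaches 'available'.

    describe_state: callable returning the AMI's state string
    ('pending', 'available', or 'failed') for a given id. The generous
    timeout reflects the >1 hour pending-to-available times observed in
    this bug.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        state = describe_state(ami_id)
        if state == "available":
            return True
        if state == "failed":
            raise RuntimeError("AMI %s failed to create" % ami_id)
        sleep(interval)
    raise RuntimeError("timed out waiting for AMI %s" % ami_id)
```

Copying only after `wait_for_ami()` returns would remove the window in which the list query can still return the previous day's image.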
Comment 3•8 years ago
Hit this again today.
Comment 5•8 years ago
At the post-mortem for the hg.m.o ssl cert change today, I agreed to find an owner for this.
Assignee: nobody → coop
Updated•8 years ago
Assignee: coop → nobody
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Assignee
Updated•8 years ago
Component: Tools → General