Closed Bug 1313105 Opened 8 years ago Closed 8 years ago

Golden AMIs that use puppetagain.pub.build.mozilla.org fail to generate

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aselagea, Unassigned)

Attachments

(3 files)

From the logs:
Traceback (most recent call last):
  File "aws_create_instance.py", line 171, in create_instance
    deploypass=deploypass, reboot=reboot)
  File "/builds/aws_manager/cloud-tools/cloudtools/aws/instance.py", line 151, in assimilate_instance
    run_chroot('yum install -q -y puppet cloud-init wget')
  File "/builds/aws_manager/cloud-tools/cloudtools/aws/instance.py", line 103, in run_chroot
    run(cmd, *args, **kwargs)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/network.py", line 578, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/operations.py", line 1042, in run
    shell_escape=shell_escape)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/operations.py", line 932, in _run_command
    error(message=msg, stdout=out, stderr=err)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/utils.py", line 321, in error
    return func(message)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/utils.py", line 34, in abort
    sys.exit(1)
SystemExit: 1
AMIs affected atm:
- try-linux64-ec2-golden
- tst-linux32-ec2-golden
- bld-linux64-ec2-golden
- av-linux64-ec2-golden
- tst-emulator64-ec2-golden
From IRC:
"
<aselagea|buildduty> arr: I see the same error for the 5 golden AMIs after running the scripts again
18:25:22 I think we should file a bug for that
18:28:32 <arr> ah, they're using http://puppetagain.pub.build.mozilla.org
18:28:44 that's the problem
18:28:51 that's the public vip we shut down
"
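The traceback above ends in SystemExit: 1 because Fabric's abort() calls sys.exit(1), so yum's own error message is not part of it. As a rough illustration only (this is not how cloud-tools runs commands, and `run_checked` and `CommandFailed` are hypothetical names), a wrapper that keeps the failing command's stderr in the exception could look like this in Python 3.7+:

```python
import subprocess
import sys


class CommandFailed(RuntimeError):
    """Raised when a command exits non-zero; carries its captured output."""

    def __init__(self, cmd, returncode, stdout, stderr):
        super().__init__(
            "command %r exited with %d: %s" % (cmd, returncode, stderr.strip())
        )
        self.cmd = cmd
        self.returncode = returncode
        self.stdout = stdout
        self.stderr = stderr


def run_checked(cmd):
    """Run cmd (a list of arguments) and return its stdout.

    On a non-zero exit, raise CommandFailed with stderr in the message
    instead of exiting the whole process.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise CommandFailed(cmd, proc.returncode, proc.stdout, proc.stderr)
    return proc.stdout


if __name__ == "__main__":
    # Stand-in for a failing package install; the message is made up.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('repo unreachable'); sys.exit(1)"]
    try:
        run_checked(failing)
    except CommandFailed as exc:
        print(exc)
```

With something along these lines the log would name the unreachable repo directly, rather than leaving it to be found on IRC.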
Ideally we'd be using the bare hostname and letting DNS point things at the local mirror (this is going to wind up being use1 for everything since that's where we generate AMIs). Do we actually use yum beyond the golden AMI creation? If we can use an unqualified name, then I propose we fix both linux and windows AMI generation. :aselagea, do you have time to test?
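To make the bare-hostname idea concrete, here is a simplified model of how a resolver expands an unqualified name using its search domains (glibc-style behaviour with the default ndots of 1). The hostname `puppet` and the function name are assumptions for illustration; the search domain is borrowed from the use1 hostnames seen elsewhere in this bug.

```python
def candidate_names(name, search_domains):
    """Return the names a resolver would try, in order.

    Simplified model assuming ndots=1:
    - a name ending in "." is absolute and tried as-is only;
    - a name containing a dot is tried as-is first, then with each
      search domain appended;
    - a bare name is tried with each search domain first, then as-is.
    """
    if name.endswith("."):
        return [name]
    expanded = [name + "." + domain for domain in search_domains]
    if "." in name:
        return [name] + expanded
    return expanded + [name]


if __name__ == "__main__":
    # Hypothetical bare hostname, resolved from an instance in use1.
    for candidate in candidate_names("puppet", ["srv.releng.use1.mozilla.com"]):
        print(candidate)
```

The point of the design is that the same unqualified name in a config file resolves to whichever mirror is local to the instance, so no region-specific or public hostname has to be baked into the AMI scripts.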
Flags: needinfo?(aselagea)
Putting some notes here as I'll need to head out soon.
Steps to test this:
1. terminate the 5 Linux instances mentioned in #c1 (from the AWS console, use1)
2. land Amy's patch attached to this bug (make sure cloud-tools has been updated)
3. restart the scripts for the 5 AMIs
Note: I've already terminated the processes for the old instances, so no need to worry about that.
Thanks!
Flags: needinfo?(aselagea)
I had a couple minutes between meetings, so I terminated the instances in AWS.
Need to take a look at the incoming and outgoing security groups. It looks like a security group is blocking connections from the golden images to the puppet masters.
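One quick way to check a suspected security-group block from the instance side is a plain TCP connect test. This is only a sketch: `can_connect` is a hypothetical helper, the hostname in the usage example is a placeholder, and the ports are assumptions (80 for an HTTP yum repo, 8140 as Puppet's default master port).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder hostname; substitute the puppet master being tested.
    for port in (80, 8140):
        state = "open" if can_connect("puppet", port) else "blocked or unreachable"
        print("puppet:%d %s" % (port, state))
```

A timeout rather than an immediate refusal is the more typical symptom of a security group dropping traffic, though this helper does not distinguish the two.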
Attachment #8804785 - Flags: checked-in+
I'm getting the same error as in the description when trying to summon an instance for my beetmoverworker:
2016-10-26 12:49:00,904 - WARNING - problem assimilating beetmoverworker-1.srv.releng.use1.mozilla.com (10.134.48.36), retrying in 1200 sec ...
Traceback (most recent call last):
  File "cloud-tools/scripts/aws_create_instance.py", line 171, in create_instance
    deploypass=deploypass, reboot=reboot)
  File "/builds/aws_manager/cloud-tools/cloudtools/aws/instance.py", line 151, in assimilate_instance
    run_chroot('yum install -q -y puppet cloud-init wget')
  File "/builds/aws_manager/cloud-tools/cloudtools/aws/instance.py", line 103, in run_chroot
    run(cmd, *args, **kwargs)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/network.py", line 578, in host_prompting_wrapper
    return func(*args, **kwargs)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/operations.py", line 1042, in run
    shell_escape=shell_escape)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/operations.py", line 932, in _run_command
    error(message=msg, stdout=out, stderr=err)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/utils.py", line 321, in error
    return func(message)
  File "/builds/aws_manager/lib/python2.7/site-packages/fabric/utils.py", line 34, in abort
    sys.exit(1)
SystemExit: 1
Not sure how to move forward, so I'll just add my bug as a blocker here as well to keep track of things. Feel free to remove it if I'm wrong. Thanks!
Blocks: 1308042
Attached file no-data.patch
The puppetagain VIP had a /data in the path, which the puppetmasters do not. That was intended to make people stop and think, "oh, hm, maybe I shouldn't use this as a repo." I guess it didn't work. Anyway, this patch removes those /data where necessary. Already landed in cloud-tools.
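For anyone with other configs still pointing at the old VIP, the rewrite amounts to dropping the leading /data path segment and, optionally, swapping the host. The sketch below is not the contents of no-data.patch; the function name, the example repo path and the replacement host `puppet` are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_data_prefix(url, new_host=None):
    """Drop a leading /data path segment and optionally replace the host.

    Only an exact leading segment is removed, so a path such as
    /database/... is left untouched.
    """
    parts = urlsplit(url)
    path = parts.path
    if path == "/data" or path.startswith("/data/"):
        path = path[len("/data"):] or "/"
    netloc = new_host if new_host is not None else parts.netloc
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Hypothetical repo path under the old VIP.
    old = "http://puppetagain.pub.build.mozilla.org/data/repos/yum/"
    print(strip_data_prefix(old, new_host="puppet"))
```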
It turns out we do it yet a different way for Ubuntu. See bug 906785.
All the linux AMIs have now been regenerated.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard