Closed
Bug 1130548
Opened 10 years ago
Closed 10 years ago
Update base AMIs so builders and masters can use the same
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: dustin)
Details
Attachments
(4 files, 2 obsolete files)
6.93 KB, text/plain
1.02 KB, patch (dividehex: review+, dustin: checked-in+)
2.17 KB, patch (dividehex: review+, dustin: checked-in+)
52 bytes, text/x-github-pull-request (rail: review+)
Begun in https://github.com/mozilla/build-cloud-tools/pull/28, but that caused
10:53:44 INFO - + tar -Jxf gcc.tar.xz
10:53:44 INFO - tar (child): xz: Cannot exec: No such file or directory
10:53:44 INFO - tar (child): Error is not recoverable: exiting now
10:53:44 INFO - tar: Child returned status 2
10:53:44 INFO - tar: Error is not recoverable: exiting now
when run outside of a mock environment.
I need to figure out what's causing xz to be installed on a regular (non-AWS) builder, and why that's not happening here.
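The failure above comes from `tar -J` shelling out to an `xz` binary that is not on the host. A minimal sketch of a pre-flight check that would catch this before the extract step; the function name `missing_tools` and the injectable `which` parameter are hypothetical, not part of build-cloud-tools:

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed on the current host.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools(["tar", "xz"])` on the affected AMI would presumably return `["xz"]`, which is the condition the log above reports.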
Comment 1 (Assignee) • 10 years ago
On a throwaway Amazon Linux host, I ran `yum -d 1 -c /chroot/etc/yum-local.cfg -y --installroot=/chroot groupinstall Core` against a yum-local.cfg pointing to the puppetagain repos.
I also created a VM with VMware and puppetized it with the wrong puppet password.
The diffs are in this attachment.
Comment 2 (Assignee) • 10 years ago
This strips the leading /\d+:/ -- I don't know what that means, but presumably it's a difference in how Anaconda (in install.log) and Yum (in yum.log) record their activities.
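A rough sketch of that normalization, assuming the `\d+:` prefix is the RPM epoch (an assumption; the comment above does not confirm it). The helper names are hypothetical and the epoch values in the usage note are made up for illustration:

```python
import re

# Matches a leading "<digits>:" prefix such as "1:" or "32:".
_EPOCH_PREFIX = re.compile(r"^\d+:")


def strip_epoch(name):
    """Drop a leading '<digits>:' prefix from one package name."""
    return _EPOCH_PREFIX.sub("", name.strip())


def normalized(names):
    """Return a sorted, de-duplicated list of prefix-free package names."""
    return sorted({strip_epoch(n) for n in names if n.strip()})
```

With both logs passed through `normalized`, entries such as `1:make-3.81-20.el6.x86_64` and `make-3.81-20.el6.x86_64` compare equal, so a plain diff of the two lists shows only real package differences.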
Attachment #8562854 - Attachment is obsolete: true
Comment 3 (Assignee) • 10 years ago
That's pretty huge. Time to dive into anaconda and see what it's up to.
For what it's worth, this is the KS file:
http://hg.mozilla.org/build/puppet/file/292f582b1657/setup/centos6-kickstart.cfg.erb#l33
and it only specifies @core -rhgb
Comment 4 (Assignee) • 10 years ago
Oh, interesting: (h/t tmary)
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/s1-kickstart2-packageselection.html
---
Note that the Core and Base groups are always selected by default, so it is not necessary to specify them in the %packages section.
---
So KS *is* installing Base. I'll try comparing KS against a groupinstall of Core and Base, and see what the differences are.
Comment 5 (Assignee) • 10 years ago
And indeed, yum groupinstall Core Base installs EXACTLY the 397 packages that the kickstart script does. I'll make an update to the KS script so I don't forget what I've learned today, and revert to using Core and Base in the base AMIs too.
Comment 6 (Assignee) • 10 years ago
Attachment #8562899 - Flags: review?(jwatkins)
Comment 7 (Assignee) • 10 years ago
Comment on attachment 8562899 [details] [diff] [review]
bug1130548-puppetagain.patch
er, hang on, it looks a little different in the live PXE configs
Attachment #8562899 - Attachment is obsolete: true
Attachment #8562899 - Flags: review?(jwatkins)
Comment 8 (Assignee) • 10 years ago
(add -subscription-manager, since that's what we've been using to date)
Attachment #8562901 - Flags: review?(jwatkins)
Comment 9 (Assignee) • 10 years ago
The thing that triggered all of this was finding ssmtp installed on the base AMI, while buildmasters wanted postfix. Switching from one to the other must be done in a transaction, and puppet doesn't support transactions.
I suspect the difference is that the Core and Base groups are installed with all yum repos activated (base, updates, epel, and releng), while Anaconda only installs against base. I'm verifying now.
Comment 10 (Assignee) • 10 years ago
That didn't really make a difference:
--- just-core-base 2015-02-11 12:41:47.863150600 -0500
+++ just-core-base-all-repos 2015-02-11 13:09:01.594539097 -0500
@@ -1,3 +1,4 @@
+
abrt-2.0.8-21.el6.centos.x86_64
abrt-addon-ccpp-2.0.8-21.el6.centos.x86_64
abrt-addon-kerneloops-2.0.8-21.el6.centos.x86_64
@@ -101,7 +102,7 @@
fontconfig-2.8.0-3.el6.x86_64
fprintd-0.1-21.git04fd09cfa.el6.x86_64
fprintd-pam-0.1-21.git04fd09cfa.el6.x86_64
-freetype-2.3.11-14.el6_3.1.x86_64
+freetype-2.4.12-6.el6.1.x86_64
gamin-0.1.10-9.el6.x86_64
gawk-3.1.7-10.el6.x86_64
gdbm-1.8.0-36.el6.x86_64
@@ -234,7 +235,7 @@
lzo-2.03-3.1.el6.x86_64
m4-1.4.13-5.el6.x86_64
mailx-12.4-7.el6.x86_64
-make-3.81-20.el6.x86_64
+make-3.82-19.el6.x86_64
MAKEDEV-3.24-6.el6.x86_64
man-1.6f-32.el6.x86_64
man-pages-3.22-20.el6.noarch
@@ -376,7 +377,7 @@
vim-enhanced-7.2.411-1.8.el6.x86_64
vim-minimal-7.2.411-1.8.el6.x86_64
virt-what-1.11-1.2.el6.x86_64
-wget-1.12-1.8.el6.x86_64
+wget-1.15-2.el6.x86_64
which-2.19-6.el6.x86_64
wireless-tools-29-5.1.1.el6.x86_64
words-3.0-17.el6.noarch
And none of these lists include ssmtp. So I think this is a false lead.
Something else must be installing ssmtp as a dependency (via the virtual), and only when postfix isn't installed first. So, a puppet ordering problem. Yuck.
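For package-list diffs like the one above, a small helper can reduce the unified diff to just the packages whose versions changed. This is a sketch only: the function names are hypothetical, and it assumes every entry has the usual `name-version-release.arch` shape seen in these lists.

```python
def package_name(nvra):
    """Return the name part of 'name-version-release.arch'.

    The version and release never contain '-', so the name is
    everything before the last two '-' separated fields.
    """
    return nvra.rsplit("-", 2)[0]


def version_changes(diff_lines):
    """Summarize a unified diff of package lists.

    Returns sorted (name, old, new) tuples; old or new is None when a
    package appears on only one side of the diff.
    """
    removed, added = {}, {}
    for line in diff_lines:
        # Skip file headers and hunk markers.
        if line.startswith(("---", "+++", "@@")):
            continue
        body = line[1:].strip()
        if not body:
            continue  # blank added/removed line, e.g. a lone "+"
        if line.startswith("-"):
            removed[package_name(body)] = body
        elif line.startswith("+"):
            added[package_name(body)] = body
    names = sorted(set(removed) | set(added))
    return [(n, removed.get(n), added.get(n)) for n in names]
```

Run over the diff in this comment, it would list only freetype, make, and wget, each with an old and a new version, which matches the conclusion that enabling all repos changes versions but not the set of packages.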
Comment 11 (Assignee) • 10 years ago
I have also manually confirmed that all of the additional_packages are included in Core/Base:
dhclient
openssh-server
kernel
grub
lvm2
yum
Comment 12 (Assignee) • 10 years ago
The error for masters is
Wed Feb 11 10:55:29 -0800 2015 Puppet (err): Execution of '/bin/rpm -e ssmtp-2.61-15.el6.x86_64' returned 1: error: Failed dependencies:
/usr/bin/mailq is needed by (installed) nagios-plugins-mailq-1.4.15-2.el6.x86_64
/usr/sbin/sendmail is needed by (installed) cronie-1.4.4-12.el6.x86_64
Wed Feb 11 10:55:29 -0800 2015 /Stage[main]/Packages::Postfix/Package[ssmtp]/ensure (err): change from 2.61-15.el6 to absent failed: Execution of '/bin/rpm -e ssmtp-2.61-15.el6.x86_64' returned 1: error: Failed dependencies:
/usr/bin/mailq is needed by (installed) nagios-plugins-mailq-1.4.15-2.el6.x86_64
/usr/sbin/sendmail is needed by (installed) cronie-1.4.4-12.el6.x86_64
So either nagios-plugins or cronie is pulling in ssmtp as the default, preventing postfix from being installed. I'll need to do some hacking in puppet to manage to use a transaction.
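To see at a glance which installed packages block the removal, the "is needed by" lines from `rpm -e` can be grouped by package. A minimal sketch, with a hypothetical function name, that assumes the line format shown in the error above:

```python
import re

# One dependency line looks like:
#   /usr/bin/mailq is needed by (installed) nagios-plugins-mailq-1.4.15-2.el6.x86_64
_NEEDED = re.compile(
    r"^\s*(?P<cap>.+?) is needed by (?:\(installed\) )?(?P<pkg>\S+)\s*$"
)


def failed_dependencies(output):
    """Map each blocking package to the capabilities it requires."""
    needs = {}
    for line in output.splitlines():
        match = _NEEDED.match(line)
        if match:
            needs.setdefault(match.group("pkg"), []).append(match.group("cap"))
    return needs
```

For the error above this yields two blockers, nagios-plugins-mailq (needing `/usr/bin/mailq`) and cronie (needing `/usr/sbin/sendmail`), so whatever replaces ssmtp has to provide both paths in the same operation.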
Comment 13 (Assignee) • 10 years ago
The packages don't actually conflict, and `alternatives` prefers postfix in auto mode.
So I think we could just remove the reference to ssmtp, and make the `alternatives` run depend on postfix being installed. That's certainly a lot simpler than setting up a transaction. However, it will leave us in a mix of states, with ssmtp installed on some hosts and not on others, depending on the order puppet runs in.
So I'm going to use the transaction anyway.
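One way to get a single transaction is `yum shell`, which queues commands and applies them together on `run`. The sketch below only builds the command script; it is an illustration under that assumption, not necessarily what the attached postfix patch does, and the function name is hypothetical:

```python
def yum_shell_script(remove, install):
    """Build a `yum shell` script that swaps packages in one transaction.

    All removals and installs are queued first; the final `run` asks
    yum to resolve and apply them together, so dependents such as
    cronie are never left without a sendmail provider.
    """
    lines = [f"remove {pkg}" for pkg in remove]
    lines += [f"install {pkg}" for pkg in install]
    lines.append("run")
    return "\n".join(lines) + "\n"
```

The result of `yum_shell_script(["ssmtp"], ["postfix"])` could then be fed to `yum -y shell` on standard input, for example from a puppet `exec` guarded by a check that ssmtp is still installed.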
Updated (Assignee) • 10 years ago
Summary: Switch builders to use the new Core-only base AMI → Update base AMIs so builders and masters can use the same
Comment 14 (Assignee) • 10 years ago
Attachment #8562956 - Flags: review?(jwatkins)
Comment 15 (Assignee) • 10 years ago
Testing:
* in progress for masters (non-production)
* I'll build a golden AMI tomorrow and see how it flies
Attachment #8562999 - Flags: review?(rail)
Comment 16 (Assignee) • 10 years ago
Master worked fine.
Updated • 10 years ago
Attachment #8562999 - Flags: review?(rail) → review+
Comment 17 (Assignee) • 10 years ago
I'm testing the golden AMI now. The patch is merged, but can be backed out if the test fails.
Comment 18 (Assignee) • 10 years ago
2015-02-12 08:46:49,534 - INFO - AMI created
2015-02-12 08:46:49,534 - INFO - ID: ami-e0e8a388, name: spot-bld-linux64-2015-02-12-16-21
2015-02-12 08:46:49,662 - INFO - AMI spot-bld-linux64-2015-02-12-16-21 (ami-e0e8a388) is ready
Comment 19 (Assignee) • 10 years ago
I spot-checked one, and I see runner running. And xz is installed. So I'm cautiously optimistic. I'm also going to kill the AMI, just in case.
Comment 20 (Assignee) • 10 years ago
There are currently 23 instances in use1 running this image (all build, not try).
Updated • 10 years ago
Attachment #8562901 - Flags: review?(jwatkins) → review+
Updated • 10 years ago
Attachment #8562956 - Flags: review?(jwatkins) → review+
Comment 21 (Assignee) • 10 years ago
I see at least one green build (hard to tell if there are more from the times) in looking at one of the remaining hosts, so I'm calling this good.
Comment 22 (Assignee) • 10 years ago
Comment on attachment 8562901 [details] [diff] [review]
bug1130548-puppetagain-r2.patch
remote: https://hg.mozilla.org/build/puppet/rev/8efe3047c9c9
remote: https://hg.mozilla.org/build/puppet/rev/3604f9ebfe55
Attachment #8562901 - Flags: checked-in+
Comment 23 (Assignee) • 10 years ago
Comment on attachment 8562956 [details] [diff] [review]
bug1130548-postfix.patch
remote: https://hg.mozilla.org/build/puppet/rev/0562ae4687a7
remote: https://hg.mozilla.org/build/puppet/rev/3604f9ebfe55
Attachment #8562956 - Flags: checked-in+
Updated (Assignee) • 10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED