Closed Bug 588957 Opened 14 years ago Closed 13 years ago

create 64-bit linux ref image on ix hardware

Categories

(Release Engineering :: General, defect)

All
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: jhford)

References

Details

(Whiteboard: [q1 goal][carry over])

Attachments

(5 files, 1 obsolete file)

Puppet can do a lot of it. Assigning to me for now.
I won't be working on this until much closer to when the rest of the ix machines arrive, or afterwards.
Priority: -- → P3
Priority: P3 → P5
The ix machines are being delivered this week. 

After offline discussion with both bhearsum and armen, switching this bug to Armen, as he's blocked on win64 work anyway and bhearsum is busy with other Q3 work.
Assignee: bhearsum → armenzg
ACK

bhearsum, how much of this have you gotten done?
https://wiki.mozilla.org/ReferencePlatforms/Linux-CentOS-5.0_64-bit
Could you please email me the credentials?

I will resume from wherever you left it.
Priority: P5 → P2
This bug hasn't been started on. We don't even have an ix machine with 64-bit CentOS 5.0 on it yet.
Depends on: 597057
Priority: P2 → P3
No longer blocks: 588950
Priority: P3 → P2
As mentioned in:
https://wiki.mozilla.org/ReferencePlatforms/Linux-CentOS-5.0_64-bit#Puppet_Installation

I installed puppet on this machine with this:
> scp root@moz2-linux64-slave03:/var/cache/yum/epel/packages/*.rpm /tools/dist
> scp root@moz2-linux64-slave03:/var/cache/yum/updates/packages/*.rpm /tools/dist
> rpm -i ruby-libs-1.8.5-5.el5_4.8.x86_64.rpm
> rpm -i ruby-1.8.5-5.el5_4.8.x86_64.rpm 
> rpm -i augeas-libs-0.7.0-1.el5.x86_64.rpm
> rpm -i facter-1.5.7-1.el5.noarch.rpm
> rpm -i ruby-augeas-0.3.0-1.el5.x86_64.rpm 
> rpm -i ruby-shadow-1.4.1-7.el5.x86_64.rpm
> rpm -i puppet-0.24.8-4.el5.noarch.rpm

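For reference, the install order above can be wrapped in a small script. This is only a sketch: it assumes the RPMs were copied into /tools/dist (as the scp commands suggest) and that `rpm -i` is pointed at that directory. `RPM_DIR`, `DRY_RUN`, `PUPPET_RPMS` and `install_puppet_rpms` are hypothetical names, not something from the wiki. By default it only prints the commands it would run.

```shell
# Sketch only: install the puppet RPMs listed above, in the same order.
# RPM_DIR and DRY_RUN are hypothetical knobs for illustration.
RPM_DIR="${RPM_DIR:-/tools/dist}"
DRY_RUN="${DRY_RUN:-1}"

# Same order as the manual steps: ruby-libs before ruby, puppet last.
PUPPET_RPMS="ruby-libs-1.8.5-5.el5_4.8.x86_64.rpm
ruby-1.8.5-5.el5_4.8.x86_64.rpm
augeas-libs-0.7.0-1.el5.x86_64.rpm
facter-1.5.7-1.el5.noarch.rpm
ruby-augeas-0.3.0-1.el5.x86_64.rpm
ruby-shadow-1.4.1-7.el5.x86_64.rpm
puppet-0.24.8-4.el5.noarch.rpm"

install_puppet_rpms() {
    for pkg in $PUPPET_RPMS; do
        if [ "$DRY_RUN" = "1" ]; then
            # Dry run: show the command instead of running it.
            echo "rpm -i $RPM_DIR/$pkg"
        else
            # Stop at the first failure so later packages don't hit missing deps.
            rpm -i "$RPM_DIR/$pkg" || return 1
        fi
    done
}
```

Setting `DRY_RUN=0` before calling `install_puppet_rpms` (as root) would perform the real installs; stopping at the first failure keeps the later packages from failing on unmet dependencies.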
I am syncing with staging-puppet only once, using this node definition:
+ node "linux64-ix-ref.build.scl1.mozilla.com" inherits "centos5-x86_64-build" {
+     include buildslave, ix  
+ }

I got these errors at first which I will fix tomorrow:
[root@linux64-ix-ref ~]# puppetd --test --server staging-puppet.build.mozilla.org
> info: Caching catalog at /var/lib/puppet/localconfig.yaml
> notice: Starting catalog run
> notice: //Node[build]/base/centos5/Service[atd]/ensure: ensure changed 'running' to 'stopped'
> info: Filebucket[/var/lib/puppet/clientbucket]: Adding /etc/fstab(a9f3b1fe823df2b7d9b9cc8ae8c7e3d3)
> err: //Node[linux64-ix-ref.build.scl1.mozilla.com]/ix/Mount[builds]/ensure: change from present to mounted failed: Execution of '/bin/mount -o noatime /builds' returned 32: mount: mount point /builds does not exist
> 
> notice: //Node[linux64-ix-ref.build.scl1.mozilla.com]/ix/Mount[builds]: Refreshing self
> notice: //Node[build]/base/centos5/Service[acpid]/ensure: ensure changed 'running' to 'stopped'
> err: //Node[build]/base/centos5/Package[libnotify-devel]/ensure: change from absent to present failed: Execution of '/bin/rpm -i --oldpackage http://staging-puppet.build.mozilla.org/staging/centos5-x86_64/build/RPMs/libnotify-devel-0.4.2-6.el5.x86_64.rpm' returned 1: error: Failed dependencies:
>         dbus-devel >= 0.90 is needed by libnotify-devel-0.4.2-6.el5.x86_64
>        dbus-glib-devel >= 0.70 is needed by libnotify-devel-0.4.2-6.el5.x86_64
>         glib2-devel >= 2.2.2 is needed by libnotify-devel-0.4.2-6.el5.x86_64
> 
> notice: //Node[centos]/cltbld/Exec[/usr/bin/crontab -u cltbld /home/cltbld/crontab]/returns: executed successfully
> notice: //Node[linux64-ix-ref.build.scl1.mozilla.com]/buildslave/moz-rpms/File[/builds/ccache]: Dependency mount[/builds] has 1 failures
> warning: //Node[linux64-ix-ref.build.scl1.mozilla.com]/buildslave/moz-rpms/File[/builds/ccache]: Skipping because of failed dependencies
> notice: //Node[linux64-ix-ref.build.scl1.mozilla.com]/buildslave/moz-rpms/Exec[/usr/bin/ccache -M 2G]: Dependency mount[/builds] has 1 failures
> warning: //Node[linux64-ix-ref.build.scl1.mozilla.com]/buildslave/moz-rpms/Exec[/usr/bin/ccache -M 2G]: Skipping because of failed dependencies
> err: //Node[linux64-ix-ref.build.scl1.mozilla.com]/buildslave/buildbot/Service[buildbot-tac]/ensure: change from stopped to running failed: Could not start Service[buildbot-tac]: Execution of '/sbin/service buildbot-tac start' returned 1:  at /etc/puppet/manifests/packages/buildbot.pp:29
> notice: //Node[build]/base/centos5/Service[auditd]/ensure: ensure changed 'running' to 'stopped'
> notice: Finished catalog run in 77.07 seconds

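To pull just the failures out of a run like the one above, a filter along these lines may help. Sketch only: `puppet_errors` is a hypothetical helper name, and it assumes error lines start with `err:` (optionally behind a `> ` quote marker), as they do in this log.

```shell
# Sketch: print only the err: lines from a puppetd --test log read on stdin.
# Continuation lines (e.g. the rpm "Failed dependencies" details) are not included.
puppet_errors() {
    grep -E '^(> )?err:'
}
```

Usage might look like `puppetd --test --server staging-puppet.build.mozilla.org 2>&1 | puppet_errors`. On the run above it would show the three failures: the missing /builds mount point, the unmet libnotify-devel dependencies, and the buildbot-tac service that would not start.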

According to the wiki the following will get deployed:
* Yum packages:
** libnotify-devel-0.4.2-6.el5.x86_64.rpm
** wireless-tools-devel-28-2.el5.x86_64.rpm
** lcov-1.7-1.noarch.rpm
** debuginfo packages for: atk, cairo, dbus, dbus-glib, expat, fontconfig, freetype, gcc, GConf2, glib2, glibc, gnome-vfs2, gtk2, gtk2-engines, hal-cups-utils, hal, libbonobo, libgnome, libselinux, libX11, libXcursor, libXext, libXfixes, libXft, libXi, libXinerama, libXrender, ORBit2, pango (specific versions listed here: http://hg.mozilla.org/build/puppet-manifests/file/4f3b55768ade/packages/debuginfopackages.pp)
** nagios-plugins-1.4.9-1.el5.rf.x86_64.rpm
** nagios-nrpe-2.5.2-1.el5.rf.x86_64.rpm
** nagios-plugins-nrpe-2.5.2-1.el5.rf.x86_64.rpm 
* Other things:
** GCC 4.3.3
** python 2.5.1
** twisted 2.4.0
** twisted-core 2.4.0
** zope-interface 3.3.0
** jdk 1.5.0_15
** Mercurial 1.1.2
** autoconf 2.13
Depends on: 606214
Priority: P2 → P3
Priority: P3 → P2
I will get to this in Q1.
Priority: P2 → P3
Whiteboard: [q1 goal][carry over]
jhford will be giving a hand with this.
Assignee: armenzg → jhford
Priority: P3 → --
Blocks: 633275
It'd be great if you could do a once-over on https://wiki.mozilla.org/ReferencePlatforms/Linux-CentOS-5.0#Install_Puppet to make sure that it is up to date.
Attachment #514926 - Flags: review?(bhearsum)
Attachment #514926 - Flags: review?(bhearsum) → review+
Because this image is using LVM, the device that /builds resides on is different.  All other options are the same.

Without connecting the soon-to-be refimage to the staging puppet master, is there a good way to test this?
Attachment #516634 - Flags: review?(bhearsum)
Attachment #516634 - Flags: review?(bhearsum) → review+
It looks like this machine is syncing to puppet properly.

[root@linux64-ix-ref ~]# puppetd --test --server scl-production-puppet.build.scl1.mozilla.com
info: Caching catalog at /var/lib/puppet/localconfig.yaml
notice: Starting catalog run
notice: //Node[centos]/cltbld/Exec[/usr/bin/crontab -u cltbld /home/cltbld/crontab]/returns: executed successfully
info: //Node[build]/base/centos5/buildslave::cleanup/Tidy[/home/cltbld/.mozilla/firefox/console.log]/ensure: Tidy target does not exist; ignoring
notice: //Node[build]/base/centos5/Service[atd]/ensure: ensure changed 'running' to 'stopped'
notice: //Node[build]/base/centos5/Service[acpid]/ensure: ensure changed 'running' to 'stopped'
notice: //Node[build]/base/centos5/buildslave::cleanup/Exec[find /tmp/* -mmin +15 -print | xargs -n1 rm -rf]/returns: executed successfully
notice: //Node[build]/base/centos5/Service[auditd]/ensure: ensure changed 'running' to 'stopped'
notice: Finished catalog run in 14.19 seconds
Depends on: 640298
Turns out that LVM doesn't work with our imaging tool (boo deploystudio).  Thanks bkero for doing the LVM->MBR transfer!

This patch makes sure that our manifests deal with the change in device node that resulted.
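
A quick way to confirm which device node the image actually uses for /builds is to read it out of /etc/fstab and compare it against the manifest. A minimal sketch, assuming a standard fstab layout (device in field 1, mount point in field 2); `fstab_device` is a hypothetical helper, not part of the patch.

```shell
# Sketch: print the device that an fstab-format file maps to a mount point.
# $1: mount point (e.g. /builds); $2: fstab path, defaulting to /etc/fstab.
fstab_device() {
    # Skip comment lines, then match on the mount point field.
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp { print $1 }' "${2:-/etc/fstab}"
}
```

Running `fstab_device /builds` on the ref image and on a freshly imaged slave should print the same device if the manifests and the image agree.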
Attachment #516634 - Attachment is obsolete: true
Attachment #518777 - Flags: review?(coop)
Attachment #518777 - Flags: review?(coop) → review+
add a slave to test image with
Attachment #519269 - Flags: review?(aki)
Comment on attachment 519269 [details] [diff] [review]
make a slave01 entry that is just like the ref

STAMP
Attachment #519269 - Flags: review?(aki) → review+
Depends on: 641962
add all linux64-ix-slave machines to puppet
Attachment #520771 - Flags: review?(coop)
(In reply to comment #15)
> Created attachment 520771 [details] [diff] [review]
> add all linux64-ix-slave machines to puppet
> 
> add all linux64-ix-slave machines to puppet

Also, it looks like in puppet land, try machines are treated exactly as production ones are (for now).  Is this correct?
Comment on attachment 520771 [details] [diff] [review]
add all linux64-ix-slave machines to puppet

Are any staying in staging?
Attachment #520771 - Flags: review?(coop) → review+
(In reply to comment #17)
> Comment on attachment 520771 [details] [diff] [review]
> add all linux64-ix-slave machines to puppet
> 
> Are any staying in staging?

yes, 01 and 02, but for now I'd like to keep them on production puppet to keep things simple.
(In reply to comment #16)
> Also, it looks like in puppet land, try machines are treated exactly as
> production ones are (for now).  Is this correct?

Yes - the difference is the SSH keys, which puppet does not manage.
Depends on: 643614
Depends on: 643601
(In reply to comment #20)
> Yes - the difference is the SSH keys, which puppet does not manage.

thanks!
Depends on: 640990
Depends on: 643903
Depends on: 644316
Depends on: 644318
Depends on: 644319
Depends on: 645145
some of these slaves are out for repair
Attachment #522767 - Flags: review?(coop)
Attachment #522767 - Flags: review?(coop) → review+
This ref image has been created and is being deployed.  Further rollout work is being tracked in bug 577154.  Any issues found with the reference image should be filed as new bugs, since the image has already begun to be rolled out.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering