Closed Bug 1162670 Opened 10 years ago Closed 10 years ago

lion build machines not running puppet after reimage

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: arich)

The following systems appear not to have run puppet after a reimage, and I'm not sure why: bld-lion-r5-016 bld-lion-r5-031 bld-lion-r5-057 bld-lion-r5-078 bld-lion-r5-079 bld-lion-r5-092 Amy is working through these to fix them, as a couple had old installer passwords on them. She is investigating further. Thanks!
It looks like the DS workflow is still copying over the files, but the launchdaemon isn't being started. I'm not sure why this is happening yet (or when it started). I see that we're now pushing the plist file out with puppet and I'm wondering if that overwrote one that was working before (maybe 10.7 needed it enabled and 10.6 and 10.10 didn't?). Will dig deeper.
Summary: Some lion build machines not running puppet after reimage → lion build machines not running puppet after reimage
Yeah, it looks like we started deploying puppet/files/org.mozilla.puppetize.plist. 10.7, at least, needs the launchdaemon to be enabled, not disabled. I'm not sure if this is true for 10.6 and 10.10. For the time being, I've altered this file in my own environment and pinned install.build.releng.scl3.mozilla.com so that reimaging bld-lion-r5 machines will function again. We need a better long-term fix.
It looks like back in 2012, Jake checked in modules/puppet/files/org.mozilla.puppetize.plist but didn't reference it from anywhere -- probably just for safekeeping. In bug 1119421 (March 2015), the deploystudio module started installing that file under /Deploy/Files/ on the DS servers, probably without noticing that it's disabled. I'd say the correct fix here is to move the file into the deploystudio module, and edit it to be enabled.
I don't know if it needs to be disabled on 10.6 and 10.10 (I believe the file was taken from install.test, where it was and still is functional). I need to test that out and verify whether we can change it to Enabled across the board. Regardless, yes, I think it should be moved into the deploystudio module, possibly as a template.
It needs to be disabled on all hosts after they've run puppet. It needs to be enabled on all hosts when it's installed by deploystudio.
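The enabled/disabled distinction above can be sketched in Python. This is only an illustration, assuming the plist controls its state through the standard launchd `Disabled` key; the actual contents of org.mozilla.puppetize.plist are not shown in this bug, so the sample plist, its script path, and the helper names `is_disabled` and `set_disabled` are all hypothetical.

```python
import plistlib

# Hypothetical stand-in for org.mozilla.puppetize.plist. The real file's
# contents are not shown in this bug; the script path is a placeholder.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.mozilla.puppetize</string>
    <key>Disabled</key>
    <true/>
    <key>ProgramArguments</key>
    <array>
        <string>/path/to/puppetize.sh</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
"""


def is_disabled(plist_bytes: bytes) -> bool:
    """Return True if the job definition carries a truthy Disabled key."""
    job = plistlib.loads(plist_bytes)
    return bool(job.get("Disabled", False))


def set_disabled(plist_bytes: bytes, disabled: bool) -> bytes:
    """Return a copy of the plist with the Disabled key set as requested."""
    job = plistlib.loads(plist_bytes)
    job["Disabled"] = disabled
    return plistlib.dumps(job)


# Copy that deploystudio would install: enabled, so puppetize runs on first boot.
deploy_copy = set_disabled(SAMPLE_PLIST, False)

print("checked-in copy disabled:", is_disabled(SAMPLE_PLIST))
print("deploystudio copy disabled:", is_disabled(deploy_copy))
```

A check like this could run against the file before it is placed under /Deploy/Files/, so that a disabled copy is caught before machines are reimaged with it.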
I'm going to take talos-mtnlion-r5-ref and turn it into t-yosemite-r5-0095.test.releng.scl3.mozilla.com to test 10.10. I'm going to reimage t-snow-r4-0002.test.releng.scl3.mozilla.com (it's already disabled) to test 10.6.
Verified on 10.6 and 10.10. install servers unpinned.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED