Closed
Bug 891881
Opened 12 years ago
Closed 12 years ago
Support 10.6 Talos with PuppetAgain
Categories
(Infrastructure & Operations :: RelOps: Puppet, task, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: coop)
Attachments
(4 files)
12.25 KB, patch | coop: review+ | dustin: checked-in+
1.53 KB, patch | coop: review+ | dustin: checked-in+
4.94 KB, patch | coop: review+ | coop: checked-in+
12.56 KB, patch | coop: review+ | dustin: checked-in+
No description provided.
Updated • 12 years ago
Component: Server Operations: RelEng → RelOps: Puppet
Product: mozilla.org → Infrastructure & Operations
QA Contact: arich → dustin
Comment 1 (Reporter) • 12 years ago
This is currently blocked on getting a clean 10.6 install, which is blocked on some particularly typical Apple insanity -- in particular, there are no retail versions of OS X that will install on this hardware, so we need to find restore DVDs.
Comment 3 (Reporter) • 12 years ago
OK, I think this is about ready, except that we don't have a means in place to image 10.6 systems with DeployStudio - bug 894988.
The system log is full of
Jul 29 08:48:17 r4-mini-001 edu.mit.Kerberos.krb5kdc[30221]: Can't get profile to fetch realms
Jul 29 08:48:17 r4-mini-001 com.apple.launchd[1] (edu.mit.Kerberos.krb5kdc[30221]): Exited with exit code: 1
Jul 29 08:48:17 r4-mini-001 com.apple.launchd[1] (edu.mit.Kerberos.krb5kdc): Throttling respawn: Will start in 10 seconds
but I'm hoping, based on some Googling, that it's an artifact of the imaging process (Casper, in this case) and not a problem with Puppet.
Comment 4 (Reporter) • 12 years ago
Attachment #782608 -
Flags: review?(coop)
Comment 5 (Assignee) • 12 years ago
Comment on attachment 782608 [details] [diff] [review]
bug891881.patch
Review of attachment 782608 [details] [diff] [review]:
-----------------------------------------------------------------
::: modules/users/manifests/signer/account.pp
@@ +35,5 @@
> # relevant fixes.
> # NOTE: this user is *not* an Administrator. All admin-level access is granted via sudoers.
> case $::macosx_productversion_major {
> + '10.6': {
> + if (secret("signer_pw_paddedsha1") == '') {
Are we signing on 10.6 now?
Attachment #782608 -
Flags: review?(coop) → review+
Comment 6 • 12 years ago
Yeah, we've been signing on 10.6 since we moved onto the r4 minis about a year and a half ago (bug 729077).
Comment 7 (Reporter) • 12 years ago
Even if we weren't, it's the right thing to do to update that manifest to support 10.6 -- puppetagain should not artificially limit itself to only the (OS, purpose) tuples that we use.
Comment 8 (Reporter) • 12 years ago
Comment on attachment 782608 [details] [diff] [review]
bug891881.patch
https://hg.mozilla.org/build/puppet/rev/c93c03a7c848
(minus the manifests change, and the unused $service in vnc::init)
Attachment #782608 -
Flags: checked-in+
Comment 9 (Reporter) • 12 years ago
OK, just waiting on the capacity to image puppetagain hosts with 10.6.
Comment 10 (Reporter) • 12 years ago
Minor patch to avoid running screenresolution on every puppet run
Attachment #785968 -
Flags: review?(coop)
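The idea behind a patch like this can be sketched as a guard that compares the current state with the desired state and only runs the command when they differ. The Python below is a minimal illustration, not the attached patch: `needs_resolution_change`, `apply_resolution`, the mode strings, and the `run` callback are all hypothetical names introduced here.

```python
def needs_resolution_change(current: str, desired: str) -> bool:
    """Return True only when the display is not already in the desired mode."""
    # Normalise whitespace and case so ' 1600X1200X32' matches '1600x1200x32'.
    return current.strip().lower() != desired.strip().lower()


def apply_resolution(current: str, desired: str, run) -> bool:
    """Call run(desired) only when a change is needed; report whether it ran.

    `run` is a hypothetical callback standing in for whatever actually
    changes the resolution on the host.
    """
    if not needs_resolution_change(current, desired):
        return False
    run(desired)
    return True
```

In a Puppet manifest the same guard is usually expressed with the `unless` or `onlyif` parameter of an `exec` resource, so the command is skipped on runs where the check already passes.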
Updated (Assignee) • 12 years ago
Attachment #785968 -
Flags: review?(coop) → review+
Comment 11 (Reporter) • 12 years ago
I think we're ready to deploy this - 10.6 support in DeployStudio is set up IIRC.
Coop, do you want to do the deployment, or is it Armen's turn to have all the fun?
Updated • 12 years ago
Flags: needinfo?(coop)
Comment 12 (Assignee) • 12 years ago
I'm happy to continue working through this.
Dustin: just to confirm the ask here, you're ready for me to try netboot-ing a 10.6 slave and then run it through staging?
Flags: needinfo?(coop) → needinfo?(dustin)
Comment 13 (Reporter) • 12 years ago
I'll do the netbooting initially, but yes. Which should I reimage?
Flags: needinfo?(dustin)
Comment 14 (Assignee) • 12 years ago
(In reply to Dustin J. Mitchell [:dustin] from comment #13)
> I'll do the netbooting initially, but yes. Which should I reimage?
I've set aside talos-r4-snow-001.
It's enabled in slavealloc and pointed at my test master, so it *should* end up in the correct state at the end of a netboot.
Comment 15 (Reporter) • 12 years ago
This is reimaged, but I forgot to add the node definitions until just now, so it's looping in puppetize.sh waiting. It should go forward shortly. I'll keep an eye on it.
Comment 16 (Reporter) • 12 years ago
OK, it's up, but it looks like its basedir is wrong:
cltbld 748 0.0 0.1 2460596 6652 ?? S 7:33AM 0:00.75 /tools/buildbot-0.8.4-pre-moz2/bin/python2.7 /tools/buildbot/bin/twistd --no_save --logfile /Users/cltbld/talos-slave/twistd.log --python /Users/cltbld/talos-slave/buildbot.tac
Can you fix that and reboot and see how it goes?
I'm leaving this afternoon for the rest of the week, but Amy or other relops folks can certainly reimage more hosts for you. I still have my name on talos-r4-snow-079, so we can reimage that one too if you'd like more parallelism.
Comment 17 (Assignee) • 12 years ago
There seems to be some problem with sudo as well:
[cltbld@talos-r4-snow-001.build.scl1.mozilla.com talos-slave]$ sudo reboot
sudo: unknown defaults entry `umask_override'
...and then it prompts me for a password.
Comment 18 (Assignee) • 12 years ago
(In reply to Chris Cooper [:coop] from comment #17)
> ...and then it prompts me for a password.
...and then tells me:
cltbld is not in the sudoers file. This incident will be reported.
Comment 19 (Reporter) • 12 years ago
Looks like snow leopard's sudoers doesn't support the #include syntax :(
Comment 20 (Reporter) • 12 years ago
[cltbld@talos-r4-snow-001.build.scl1.mozilla.com ~]$ sudo reboot
sudo: can't stat /etc/sudoers.d/*: No such file or directory
Segmentation fault
sudo is not an app I like to see segfaulting!
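When a sudo build cannot pull in a directory of fragments, one possible workaround is to generate a single sudoers file from the base content plus the fragments. The sketch below illustrates that approach only; it is not necessarily what the patch in this bug does, and `merge_sudoers` and the fragment file names are hypothetical.

```python
from pathlib import Path


def merge_sudoers(base_text: str, fragment_dir: Path) -> str:
    """Build one sudoers body from base text plus every file in a directory.

    Assumption: each fragment is a plain file of sudoers rules that is
    valid when appended after the base content.
    """
    parts = [base_text.rstrip("\n")]
    # Sort by name so the generated file is stable between runs.
    for fragment in sorted(fragment_dir.glob("*")):
        if fragment.is_file():
            parts.append(fragment.read_text().rstrip("\n"))
    return "\n".join(parts) + "\n"
```

Any generated file should be checked with `visudo -c -f <file>` before it replaces the live sudoers, since a syntax error there can lock everyone out of sudo.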
Comment 21 (Reporter) • 12 years ago
Attachment #793600 -
Flags: review?(coop)
Updated (Assignee) • 12 years ago
Attachment #793600 -
Flags: review?(coop) → review+
Comment 22 (Reporter) • 12 years ago
tested OK on:
lion
mtnlion
ubuntu
centos
So I'll land this on Monday unless someone's willing to land it earlier.
Comment 23 (Assignee) • 12 years ago
Comment on attachment 793600 [details] [diff] [review]
bug891881.patch
https://hg.mozilla.org/build/puppet/rev/27a3c1e06875
Attachment #793600 -
Flags: checked-in+
Comment 24 (Assignee) • 12 years ago
We still get the "sudo: unknown defaults entry `umask_override'" message, but I can 'sudo reboot' now as cltbld. Tested on talos-r4-snow-001.
Comment 25 (Reporter) • 12 years ago
Oh, I completely forgot about that once I saw things working. I'll get a patch together.
Comment 26 (Assignee) • 12 years ago
talos-r4-snow-001 is connected to my dev-master and is running some tests now.
Comment 27 (Reporter) • 12 years ago
This converts the sudoers base files to a template, stripping the comments, and omits the umask_override on OS X 10.6.
tested OK on
mtnlion
centos
ubuntu
snow
Attachment #795539 -
Flags: review?(coop)
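The conditional described in this comment (emit `umask_override` everywhere except OS X 10.6) can be sketched as a small render function. This is an illustration in Python rather than the actual Puppet template; `render_sudoers_defaults` is a hypothetical name and the `env_reset` line is placeholder content, not the real base file.

```python
def render_sudoers_defaults(os_version: str) -> str:
    """Render Defaults lines, skipping options the platform's sudo rejects."""
    lines = ["Defaults env_reset"]  # placeholder, not the real base content
    # Per this bug, sudo on OS X 10.6 reports umask_override as an
    # unknown defaults entry, so it is left out on that version only.
    if os_version != "10.6":
        lines.append("Defaults umask_override")
    return "\n".join(lines) + "\n"
```

Keying the condition on the OS version mirrors the `case $::macosx_productversion_major` pattern quoted in the review above, which keeps all platform differences in one template instead of separate per-OS files.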
Comment 28 (Assignee) • 12 years ago
Comment on attachment 795539 [details] [diff] [review]
bug891881-sudoers-2.patch
Review of attachment 795539 [details] [diff] [review]:
-----------------------------------------------------------------
Yay for removing code.
Attachment #795539 -
Flags: review?(coop) → review+
Comment 29 (Assignee) • 12 years ago
(In reply to Chris Cooper [:coop] from comment #26)
> talos-r4-snow-001 is connected to my dev-master and is running some tests
> now.
All tests passed in staging.
Dustin: should I start setting up batches of 10.6 machines for netbooting?
Comment 30 (Reporter) • 12 years ago
Sure, that'd be great. I'll get them in the right deploystudio category, if they're not already.
Updated (Reporter) • 12 years ago
Attachment #795539 -
Flags: checked-in+
Updated (Reporter) • 12 years ago
Attachment #785968 -
Flags: checked-in+
Comment 31 (Reporter) • 12 years ago
All talos-r4-snow-* are now in the deploystudio group that will install them with puppetagain. They only need to be blessed and rebooted.
Comment 32 (Assignee) • 12 years ago
OK, I will get the first batch set up for netbooting shortly.
Comment 33 (Assignee) • 12 years ago
Batches:
1. 002-027
2. 028-056 (41 doesn't exist)
3. 057-083 (81 doesn't exist)
Setting up batch 1 to netboot now.
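For anyone scripting against these batches, the ranges above expand to hostnames as in the sketch below. The `talos-r4-snow-NNN` pattern comes from earlier comments; `BATCHES`, `MISSING`, and `batch_hosts` are hypothetical names used only for this illustration.

```python
MISSING = {41, 81}  # slave numbers noted above as not existing

# Inclusive (first, last) ranges for each batch, as listed in this comment.
BATCHES = {1: (2, 27), 2: (28, 56), 3: (57, 83)}


def batch_hosts(batch: int) -> list[str]:
    """Expand a batch number into its zero-padded slave hostnames."""
    first, last = BATCHES[batch]
    return [
        f"talos-r4-snow-{n:03d}"
        for n in range(first, last + 1)
        if n not in MISSING
    ]
```

With these ranges, batches 1, 2, and 3 contain 26, 28, and 26 hosts respectively; talos-r4-snow-001 is outside the batches because it was reimaged earlier as the test host.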
Comment 34 (Assignee) • 12 years ago
Batch 1 has had its basedirs updated in slavealloc and has been marked for netboot. The only idle slave was 026, so I rebooted that one manually.
Comment 35 (Assignee) • 12 years ago
Batch 2 has had basedirs updated in slavealloc and has been marked for netboot.
Updated (Reporter) • 12 years ago
Assignee: dustin → coop
Comment 36 (Assignee) • 12 years ago
Batch 3 has had basedirs updated in slavealloc and has been marked for netboot.
Status: NEW → ASSIGNED
Priority: -- → P2
Updated (Assignee) • 12 years ago
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED