Closed Bug 1339315 Opened 7 years ago Closed 5 years ago

[Tracker] Setup new deploystudio server

Categories

(Infrastructure & Operations :: RelOps: General, task)

RESOLVED FIXED

People

(Reporter: dividehex, Assigned: dividehex)

Attachments

(3 files, 1 obsolete file)

In moving out of scl3, we will need to build a new deploystudio server, which will end up being staged in the "Virtual" datacenter within scl3 before being moved to its final destination.

We will start by putting a mac mini in a sonnet chassis and placing it within the releng scl3 test vlan.  This will make it easy to provision with the current install.test deploystudio server.
Depends on: 1341485
Depends on: 1339491
Since Van has racked a mac mini (bug 1341485) and disabled the SIP that prevents netbooting (bug 1339491), I've successfully captured a Sierra image (just in case things don't go as planned and it needs to be restored).  After that, I re-blessed it, rebooted it, and ran the Restore yosemite-r7-ref workflow to install the yosemite base image.

Also,
Added dsadmin account
Installed latest version of Deploystudio (1.7.6)
Ran security updates
I attempted to install OSX Server from the Apple store, but it looks like the current version only supports running on OSX 10.11.6.  I see two ways to approach this: 1) I try to copy the Server app from the current install, or 2) I restore 10.12 Sierra and continue installing the OSX Server app from the Apple store.
It seems that copying over the Server.app from the current install.test host worked.  Opening the app prompted it to configure itself.  It did throw an error about setting up the wiki, but other than that it seemed to work fine.  I haven't copied over any netboot images yet or tested any other functionality past installing it, opening it, and clicking around.
Depends on: 1349737
Did the following on install2.test:

* Added proper sreg record so dns lookups would function properly.
* Deleted NTFS partition on ext ssd drive and added a single 1TB hfs+ partition under the name BackupExtSSD.
* Turned on Timemachine and set the BackupExtSSD as the target destination.  First backup completed successfully.
* Added /Deploy directory and enabled filesharing on it.  User dsadmin is given rw perms.
* Walked through the initial setup with the DS Assistant app just to make sure there were no errors during the DS configuration.  This will need to be rerun after the host has moved to its final destination, along with resetting the hostname/computername.
All the pre-flight puppet bits are set up; see everything below for details.
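
For reference, the external-disk and Time Machine steps above could be scripted roughly as follows. This is only a sketch: the thread does not say whether the GUI or the command line was used, the disk identifier `disk2` is a made-up placeholder (check `diskutil list` first), and the `run` helper and `prepare_backup_disk` function are names invented here. Commands are only printed unless DRY_RUN=0 is set, because `diskutil partitionDisk` erases the whole disk.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# prepare_backup_disk DISK NAME
# Repartition DISK as a single HFS+ (journaled) volume called NAME,
# then point Time Machine at it and turn backups on.
prepare_backup_disk() {
  local disk="$1" name="$2"
  # WARNING: partitionDisk destroys everything on the disk.
  run diskutil partitionDisk "$disk" 1 GPT JHFS+ "$name" 100% || return 1
  run tmutil setdestination "/Volumes/$name" || return 1
  run tmutil enable
}

# "disk2" is a hypothetical identifier, not taken from the bug.
prepare_backup_disk disk2 BackupExtSSD
```

With DRY_RUN=0 this needs to run as root on the target machine.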


Various hostnames have been set:

Mac-mini:~ dsadmin$ sudo scutil --set HostName install.test.releng.scl3.mozilla.com
Mac-mini:~ dsadmin$ sudo scutil --set ComputerName install
Mac-mini:~ dsadmin$ sudo scutil --set LocalHostName install
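
The three scutil calls above follow a pattern (FQDN for HostName, short name for the other two), so they could be wrapped in one small function. A minimal sketch, assuming that pattern is what is wanted; `run` and `set_names` are helper names invented here, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# set_names FQDN
# HostName gets the full FQDN; ComputerName and LocalHostName get the
# part before the first dot.
set_names() {
  local fqdn="$1"
  local short="${fqdn%%.*}"
  run scutil --set HostName "$fqdn" || return 1
  run scutil --set ComputerName "$short" || return 1
  run scutil --set LocalHostName "$short"
}

set_names install.test.releng.scl3.mozilla.com
```

With DRY_RUN=0 the script has to run as root (the session above used sudo).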


Grab puppetize.sh, puppet and facter:

install:~ root# curl -O http://puppet/repos/DMGs/facter-2.2.0.dmg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  108k  100  108k    0     0   185k      0 --:--:-- --:--:-- --:--:--  186k
install:~ root# curl -O http://puppet/repos/DMGs/puppet-3.7.0.dmg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1525k  100 1525k    0     0  15.2M      0 --:--:-- --:--:-- --:--:-- 15.3M
install:~ root# curl -O https://hg.mozilla.org/build/puppet/raw-file/tip/modules/puppet/files/puppetize.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7319  100  7319    0     0  11415      0 --:--:-- --:--:-- --:--:-- 11418
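
The same three downloads could be done in a loop that stops at the first failure. This is a sketch rather than what was actually run: the URLs are the ones from the session above, while `run`, `fetch_all` and the extra curl flags (-f, -sS) are additions made here. Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

urls=(
  http://puppet/repos/DMGs/facter-2.2.0.dmg
  http://puppet/repos/DMGs/puppet-3.7.0.dmg
  https://hg.mozilla.org/build/puppet/raw-file/tip/modules/puppet/files/puppetize.sh
)

# fetch_all URL...
# Download each URL into the current directory, keeping the remote file name.
fetch_all() {
  local url
  for url in "$@"; do
    # -f: treat HTTP errors as failures instead of saving the error page
    # -sS: hide the progress meter but still report errors
    run curl -fsS -O "$url" || return 1
  done
}

fetch_all "${urls[@]}"
```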


Mount puppet and facter dmgs:

install:~ root# hdiutil attach ./facter-2.2.0.dmg
Checksumming Driver Descriptor Map (DDM : 0)…
     Driver Descriptor Map (DDM : 0): verified   CRC32 $47F651AD
Checksumming Apple (Apple_partition_map : 1)…
     Apple (Apple_partition_map : 1): verified   CRC32 $08D25B30
Checksumming disk image (Apple_HFS : 2)…
........................................................................................................................................
          disk image (Apple_HFS : 2): verified   CRC32 $1F8118CB
Checksumming  (Apple_Free : 3)…
                    (Apple_Free : 3): verified   CRC32 $00000000
verified   CRC32 $7F619E5C
/dev/disk3          	Apple_partition_scheme
/dev/disk3s1        	Apple_partition_map
/dev/disk3s2        	Apple_HFS                      	/Volumes/facter-2.2.0
install:~ root# hdiutil attach ./puppet
puppet-3.7.0.dmg  puppetize.sh
install:~ root# hdiutil attach ./puppet-3.7.0.dmg
Checksumming Protective Master Boot Record (MBR : 0)…
Protective Master Boot Record (MBR :: verified   CRC32 $A3950610
Checksumming GPT Header (Primary GPT Header : 1)…
 GPT Header (Primary GPT Header : 1): verified   CRC32 $4C3F035C
Checksumming GPT Partition Data (Primary GPT Table : 2)…
GPT Partition Data (Primary GPT Tabl: verified   CRC32 $1D8ED866
Checksumming  (Apple_Free : 3)…
                    (Apple_Free : 3): verified   CRC32 $00000000
Checksumming disk image (Apple_HFS : 4)…
........................................................................................................................................
          disk image (Apple_HFS : 4): verified   CRC32 $D17EAD30
Checksumming  (Apple_Free : 5)…
                    (Apple_Free : 5): verified   CRC32 $00000000
Checksumming GPT Partition Data (Backup GPT Table : 6)…
GPT Partition Data (Backup GPT Table: verified   CRC32 $1D8ED866
Checksumming GPT Header (Backup GPT Header : 7)…
  GPT Header (Backup GPT Header : 7): verified   CRC32 $EC757827
verified   CRC32 $4FE72613
/dev/disk4          	GUID_partition_scheme
/dev/disk4s1        	Apple_HFS                      	/Volumes/puppet-3.7.0


Install puppet and facter:

install:~ root# installer -package /Volumes/facter-2.2.0/facter-2.2.0.pkg -target "/Volumes/Macintosh HD"
installer: Package name is facter-2.2.0
installer: Installing at base path /
installer: The install was successful.
install:~ root# installer -package /Volumes/puppet-3.7.0/puppet-3.7.0.pkg -target "/Volumes/Macintosh HD"
installer: Package name is puppet-3.7.0
installer: Installing at base path /
installer: The install was successful.

Unmount dmgs:

install:~ root# hdiutil detach /Volumes/facter-2.2.0/
"disk3" unmounted.
"disk3" ejected.
install:~ root# hdiutil detach /Volumes/puppet-3.7.0/
"disk4" unmounted.
"disk4" ejected.
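
The mount, install and unmount steps above are the same for both packages, so they could be folded into one function. A rough sketch, assuming (as in the session above) that the mounted volume and the .pkg inside it share the dmg's base name; `run` and `install_from_dmg` are names invented here, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# install_from_dmg NAME TARGET
# Attach ./NAME.dmg, install /Volumes/NAME/NAME.pkg onto TARGET, then detach.
install_from_dmg() {
  local name="$1" target="$2"
  run hdiutil attach "./${name}.dmg" || return 1
  run installer -package "/Volumes/${name}/${name}.pkg" -target "$target" || return 1
  run hdiutil detach "/Volumes/${name}"
}

install_from_dmg facter-2.2.0 "/Volumes/Macintosh HD"
install_from_dmg puppet-3.7.0 "/Volumes/Macintosh HD"
```

Note that if the installer step fails, this sketch returns early and leaves the image mounted.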
Depends on: 1361117
Attachment #8863464 - Attachment is patch: true
Install.test.releng.mdc1 has been puppetized, although I had to fix the /Users/dsadmin file ownership since puppet kindly changed the uid of the dsadmin user :-/
Since changing the UID, I also had to run keychain repair. This was needed before the SSL cert creation.
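
The ownership fix is probably nothing more than a recursive chown of the home directory. The exact command used is not recorded in this bug, so the following is an assumption; `run` and `fix_home_ownership` are names invented here, and the command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# fix_home_ownership USER
# Re-own /Users/USER to USER so files created under the old uid
# belong to the account again (group ownership is left untouched).
fix_home_ownership() {
  local user="$1"
  run chown -R "$user" "/Users/$user"
}

fix_home_ownership dsadmin
```

With DRY_RUN=0 this needs to run as root.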

* Because the server fqdn changed, a new SSL cert needed to be generated. Using the Server App, I created an SSL cert with the fqdn and a 10y expiration
* Ran through the Deploystudio setup with the new fqdn and ssl cert
* Generated a new netboot image (DSR-10105) and placed it on /Library/NetBoot/NetBootSP0
* Started the NetInstall service and configured "/Volumes/Macintosh HD" to serve Images Only under Storage Settings
* Set DSR-10105 as default image

Last, I started an rsync session to transfer /Deploy from install.test.releng.scl3 to mdc1.  I've opted to copy it to /Deploy.old on the mdc1 install so it can be cherry-picked for things instead of just carrying over all the cruft from deploystudio installs from days of yore.
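
An rsync invocation along these lines would do that kind of copy. This is a sketch only: the real flags and the direction (push from the old host or pull from the new one) are not recorded here, the destination `dsadmin@new-install.example` is a placeholder rather than the real host name, and `run` and `copy_deploy` are names invented here. The command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# copy_deploy DEST
# Push the contents of the local /Deploy into /Deploy.old on DEST (user@host).
copy_deploy() {
  local dest="$1"
  # -a: archive mode (recursive, keeps perms and times); -v: list files.
  # The trailing slashes copy the directory contents, not the directory itself.
  run rsync -av /Deploy/ "${dest}:/Deploy.old/"
}

# Placeholder destination; substitute the new server's real address.
copy_deploy dsadmin@new-install.example
```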
Rsync of /Deploy has completed
Deleting the dsadmin keychain is not the proper way of handling the issue of a password mismatch between the account and the keychain.  Just short of automating the keychain password change with the 'security' cli tool, let's just remove the function for now.
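
For context, the manual (or eventually automated) keychain update would look something like the sketch below, using the `security set-keychain-password` subcommand. The keychain path is the usual per-user default and is an assumption here, the passwords are dummies, and `update_keychain_password` is a name invented for this example. Passwords given with -o/-p are visible in the process list; leaving those options out makes `security` prompt for them instead. The command is only printed (with the passwords masked) unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# DRY_RUN defaults to 1: print the command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# update_keychain_password KEYCHAIN OLD NEW
# Change the password of KEYCHAIN from OLD to NEW so it matches the account.
update_keychain_password() {
  local keychain="$1" old="$2" new="$3"
  if [ "$DRY_RUN" = "1" ]; then
    # Never echo the real passwords, even in a dry run.
    printf '%s\n' "security set-keychain-password -o <old> -p <new> $keychain"
  else
    security set-keychain-password -o "$old" -p "$new" "$keychain"
  fi
}

# Assumed default keychain location for dsadmin; passwords are dummies.
update_keychain_password /Users/dsadmin/Library/Keychains/login.keychain 'old-secret' 'new-secret'
```

This has to run as the user who owns the keychain.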
Attachment #8863792 - Flags: review?(dhouse)
Fixed missing semicolon
Attachment #8863792 - Attachment is obsolete: true
Attachment #8863792 - Flags: review?(dhouse)
Attachment #8863795 - Flags: review?(dhouse)
Depends on: 1361453
I've run into a few issues so far.

1) Timemachine seems broken.  This might be an issue with yosemite itself.  From what I can tell, the backups are working just fine, but default permission issues on the backups folder prevent the dsadmin user from "entering" timemachine.  I set the perms on /Volumes/BackupExtSSD/Backups.backupdb to 777 and was able to "enter", but after turning timemachine on again, it politely corrected the perms back to 770.  I'm not going to go down the rabbit hole on this one since nothing obvious popped up on a google search and I've tried all the recommendations from what was found (e.g. diskutil verify/repair, entering as root/administrator, etc).  Google search also reveals this to be a common issue on yosemite.  I do think that if we needed to do a full restore from the recovery partition, this might not be an issue and it might just work.  We might want to test a full recovery with another mac mini just to see.  In addition to timemachine, it will also be running bacula on /Deploy as a last-resort backup.

2) There is a lot of strangeness with the login.keychain and the setting of the dsadmin user password.  We really shouldn't be deleting the keychain as a band-aid for puppet managing the dsadmin password, so for the time being the keychain will need to be updated manually after a password reset.  I've also noticed puppet has reset the dsadmin password several times, even after puppet seems to have achieved state and subsequent runs come back clean.  I don't know if there is some kind of housekeeping that changes this or if it is my continued fiddling with the login.keychain.  This might deserve a deeper dig at some later date.

As of now, neither of these issues is blocking deploystudio.

So with that said, I've finished cherry-picking the t-yosemite-r7 workflow (and all its dependencies).  I've set the workflow to default and filed https://bugzilla.mozilla.org/show_bug.cgi?id=1361453 to test the netboot/deploystudio setup as it is.  Hopefully, it 'just works'(TM).
Attachment #8863795 - Flags: review?(dhouse) → review+
Blocks: 1366828

This work is complete. IIRC, timemachine started working again. In addition to that, these are backed up with bacula.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED