Closed Bug 769780 Opened 8 years ago Closed 7 years ago

Setup puppet masters in AWS

Categories

(Infrastructure & Operations :: CIDuty, task, P2)

x86_64
Linux

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rail, Assigned: rail)

References

Details

Attachments

(3 files, 4 obsolete files)

Attached patch AWS puppet (obsolete) — Splinter Review
The idea is to have 2 types of masters in AWS:

puppetca: autosigns slaves (ca = true)

puppetmaster: puppet master with ca = false, using puppetca's cert and CRL to verify clients.

ca = true/false comes from extlookup/puppet{ca,master}-config.csv
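To illustrate the lookup (the file split is as described above; treat this as a sketch, not the exact patch contents), puppetca-config.csv would carry a row like:

```
ca,true
```

while puppetmaster-config.csv carries "ca,false", and extlookup resolves "ca" per master type.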

The following script sets up puppetca/masters:
https://github.com/rail/briar-patch/blob/6ca737c98f10ecad75051812cc23e56829690fce/releng/aws_create_puppetmaster.py

It uses "puppet apply" to "masterize" the instances (line 99). The puppetmasters sync the CRL from puppetca and reload Apache if needed (toplevel::server::puppetmaster).

I added a mod_passenger repo, rsynced from rsync://passenger.stealthymonkeys.com/rpms/

The [master] section of puppet.conf shouldn't affect its current usage, but we can use a different template if you want.

I touched the logic in modules/puppet/templates/puppetmasters.txt.erb to make the list order random.

TODO: add a cronjob for modules/toplevel/files/server/puppet/update.sh
TODO: sync /data

So far it's been working fine for 2-3 days in EC2! :)
Attachment #637979 - Flags: feedback?(dustin)
Attachment #637979 - Flags: feedback?(bugspam.Callek)
Comment on attachment 637979 [details] [diff] [review]
AWS puppet

Review of attachment 637979 [details] [diff] [review]:
-----------------------------------------------------------------

The CSV contents are intended to allow different puppetagain installations to distinguish themselves, rather than distinguishing different hosts in the same installation.  I think it would make more sense to determine whether a host is a puppetmaster or puppetca in the node definitions, and include e.g., toplevel::server::puppetmaster or toplevel::server::puppetca depending.  Then those toplevel classes can call the puppetmaster module:

  class {
    puppetmaster:
      is_puppetca => true; # or false for toplevel::server::puppetmaster
  }

I'm not sure how node naming works in the AWS setup - is that not possible?

Overall, I'm not sure what the intent of the two toplevel classes is, nor do I see a lot of detail on how the puppetmaster/puppetca stuff is handled -- are CA requests forwarded from puppetmasters to the puppetca?  How is the CRL synchronized back?  We should probably have a talk with opsec when this is ready to roll, just to make sure the cert handling is OK.  I suspect we can do so quickly, as an extension of the sign-off on the infra configuration.

::: manifests/nodes.pp
@@ +20,5 @@
>  
> +node /.*\.build\.aws-us-west-1\.mozilla\.com/ {
> +    # Make sure we get our /etc/hosts set up
> +    class {
> +        "network::aws": stage => packagesetup,

Let's make a new stage for this, say "network", and add the contents of network::aws to the network class, conditioned on the host being in AWS.
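A rough sketch of what that could look like (the stage wiring and the AWS-detection fact are assumptions, not tested code):

```puppet
# new run stage that happens before package setup
stage {
    "network":
        before => Stage["packagesetup"];
}

class network {
    # contents of network::aws folded in, applied only on AWS hosts;
    # Facter exposes EC2 metadata facts such as $ec2_instance_id
    if $ec2_instance_id {
        include network::aws
    }
}
```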

@@ +26,5 @@
> +    include toplevel::slave::build::mock
> +}
> +
> +node /^puppetmaster-\d+/ {
> +    include toplevel::server::puppetmaster

This includes toplevel::server::puppetmaster, but it looks like most of the code is in toplevel::server::puppet.. ::confused::

::: modules/packages/manifests/mozilla/py27_mercurial.pp
@@ +10,5 @@
> +            file {
> +                "/usr/local/bin/hg":
> +                    ensure => "/tools/python27-mercurial/bin/hg",
> +                    require => Package["mozilla-python27-virtualenv"];
> +            }

This should really be in the package, and should be consistent across all of the custom packages.  Currently none use symlinks like this, IIRC.

::: modules/packages/manifests/setup.pp
@@ +28,5 @@
>              url_path => "repos/yum/mirrors/puppetlabs/el/6/products/$hardwaremodel";
>          "releng-public-${operatingsystem}${majorver}-${hardwaremodel}":
>              url_path => "repos/yum/releng/public/$operatingsystem/$majorver/$hardwaremodel";
> +        "passenger-${operatingsystem}${majorver}-${hardwaremodel}":
> +            url_path => "repos/yum/mirrors/passenger.stealthymonkeys.com/rhel/6.2/$hardwaremodel";

This should probably just be named 'passenger'.  We can document the links on the wiki as has been done for puppetlabs.  Also, the repo title doesn't necessarily need all of those variables in it..

::: modules/puppet/templates/puppet.conf.erb
@@ +3,5 @@
>  [main]
>      logdir = /var/log/puppet
>      rundir = /var/run/puppet
>      ssldir = /var/lib/puppet/ssl
> +    confgdir = /etc/puppet/production

missing an 'i'?

::: modules/puppet/templates/puppetmasters.txt.erb
@@ +4,5 @@
>  srand(ipaddress.sub('.', '').to_i)
>  
> +all_puppet_servers = [puppet_server] + puppet_servers
> +all_puppet_servers = all_puppet_servers.uniq
> +all_puppet_servers.shuffle.each do |mirror_server|

Nicely done

::: modules/toplevel/files/server/puppet/update.sh
@@ +1,4 @@
> +#! /bin/bash
> +
> +errormailto="release@mozilla.com"
> +changemailto="release@mozilla.com"

These should go to releng-shared@mozilla.com.

::: modules/toplevel/files/server/puppet/yum_mirrors.conf
@@ +1,2 @@
> +Alias /repos /data/repos
> +Alias /python /data/python

You'll need to add a new one of these for every tree.  It's easier to just use DocumentRoot /data.  See releng_puppetmaster_web.conf on a releng puppetmaster for how we do it there.

::: modules/toplevel/manifests/server/master.pp
@@ +1,1 @@
> +class toplevel::server::master inherits toplevel::base {

We're both wrong :)  This should inherit toplevel::server -- toplevel classes' naming hierarchy and class hierarchy should be in lock-step.

::: modules/toplevel/manifests/server/puppet.pp
@@ +1,2 @@
> +# toplevel class for running a puppet master
> +class toplevel::server::puppet {

Just about everything here, and the files/templates associated with it, should be in modules/puppetmaster.

@@ +2,5 @@
> +class toplevel::server::puppet {
> +    include toplevel::server
> +    include nrpe::check::ntp_time
> +    include nrpe::check::ganglia
> +    include puppet

This will probably need to be 'include ::puppet' to avoid relative includes..

@@ +4,5 @@
> +    include nrpe::check::ntp_time
> +    include nrpe::check::ganglia
> +    include puppet
> +
> +    package {

These should all be done via the package module.  You can bundle them up (e.g., httpd + mod_ssl)

@@ +15,5 @@
> +            ensure => installed;
> +        "puppet-server":
> +            require => Package["puppet"],
> +            ensure => $puppet::puppet_version;
> +        "createrepo":

Does this end up being used during the setup?  I don't see it used elsewhere in the patch.

@@ +17,5 @@
> +            require => Package["puppet"],
> +            ensure => $puppet::puppet_version;
> +        "createrepo":
> +            ensure => installed;
> +        "mercurial":

Shouldn't this be py27_mercurial?  Do we want multiple mercurials installed?

@@ +30,5 @@
> +            owner => root,
> +            group => root,
> +            require => Package["puppet-server"],
> +            source => "puppet:///modules/toplevel/server/puppet/fileserver.conf";
> +        "/etc/puppet/autosign.conf":

Since this is empty, it may be easier and more obvious to just write content => "".

@@ +36,5 @@
> +            owner => root,
> +            group => root,
> +            require => Package["puppet-server"],
> +            source => "puppet:///modules/toplevel/server/puppet/autosign.conf";
> +        # TODO: add a crontab entry for the following file

And, as a note, use a file in /etc/cron.d/ - the cron {} resource will leave you unhappy.
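Concretely, that can be a plain file resource; the schedule and the deployed script path here are illustrative only:

```puppet
file {
    "/etc/cron.d/puppet-update":
        mode    => 0644,
        owner   => root,
        group   => root,
        content => "*/15 * * * * root /etc/puppet/production/setup/update.sh\n";
}
```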

@@ +47,5 @@
> +        "/etc/httpd/conf.d/yum_mirrors.conf":
> +            require => Package["httpd"],
> +            source => "puppet:///modules/toplevel/server/puppet/yum_mirrors.conf";
> +        "/etc/httpd/conf.d/puppetmaster_passenger.conf":
> +            require => [Package["httpd"], Package["mod_ssl"], Package["mod_passenger"]],

Most of the requires here and above aren't necessary just to install the file.  For example, here you only need httpd (well, should be Class['packages::httpd']), as that will create the conf.d directory.  The rest can get installed after the file is created with no harm.  Similarly, if puppet is running then /etc/puppet exists, so there's no need to require puppet-server for update.sh.

@@ +73,5 @@
> +        "iptables":
> +            require => [
> +                File["/etc/sysconfig/iptables"],
> +            ],
> +            subscribe => File["/etc/sysconfig/iptables"],

subscribe implies require, so only the second is necessary.

You may need ensure => running, and possibly hasrestart => true, to get this to refresh.

I'm on the fence about suggesting that this and the iptables file above be moved to network::firewall.  Given infinite time for writing code, I'd love to see

  network::firewall {
    [ '22', '80', '443', '8140' ]:
      proto => tcp;
  }

but that's asking a lot for something not related to the purpose of this patch.

@@ +81,5 @@
> +                Package["puppet-server"],
> +                File["/etc/puppet/puppet.conf"],
> +                File["/etc/puppet/fileserver.conf"],
> +                File["/var/lib/puppet/ssl/ca"],
> +            ],

This will only need to require Class['package::puppet-server']

@@ +96,5 @@
> +                File['/etc/httpd/conf.d/puppetmaster_passenger.conf']
> +            ],
> +            ensure => running,
> +            enable => true;
> +    }

We'll end up using httpd again, so this should be moved out to an httpd module.  You can have the file resources above notify => Class['httpd::service'] to get the effect of this subscription.
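Something along these lines (class and file names are sketches, not a final layout):

```puppet
class httpd::service {
    service {
        "httpd":
            ensure     => running,
            enable     => true,
            hasrestart => true;
    }
}

# elsewhere, the config files just notify the service class:
file {
    "/etc/httpd/conf.d/puppetmaster_passenger.conf":
        require => Class["packages::httpd"],
        notify  => Class["httpd::service"],
        source  => "puppet:///modules/puppetmaster/puppetmaster_passenger.conf";
}
```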

::: setup/hosts-bootstrap
@@ +1,3 @@
> +127.0.0.1               localhost.localdomain localhost
> +::1             localhost6.localdomain6 localhost6
> +127.0.0.1 puppet mrepo repos mrepo.mozilla.org

Where are we talking to mrepo?

::: setup/masterize.sh
@@ +1,5 @@
> +#!/bin/bash
> +set -e
> +set -x
> +
> +echo "Setting up this host to be a puppet master"

It'd be nice to have this script bootstrap itself entirely, doing the initial hg clone.  Then we could write up a one-liner to masterize:
 wget http://hg.m.o/puppet/file/tip/setup/masterize.sh && sh masterize.sh
Currently there are a number of things unnecessarily done by aws_create_puppetmaster.py.

@@ +28,5 @@
> +
> +/bin/echo "=== Installing apache, setting up mirrors ==="
> +(/bin/rpm -q httpd  > /dev/null 2>&1 || /usr/bin/yum -y -q install httpd)
> +/bin/cp /etc/puppet/production/modules/toplevel/files/server/puppet/yum_mirrors.conf /etc/httpd/conf.d/yum_mirrors.conf
> +/etc/init.d/httpd restart

This serves up /data, but where does that come from?  Once the public mirror is up (bug 757285), you should be able to initialize with an rsync from there.

@@ +35,5 @@
> +#if ( `ps ax | grep -v grep | grep -q ntpd` ); then
> +    ## Stop ntpd running. Puppet will start it back up
> +    #/sbin/service ntpd stop
> +#fi
> +#/usr/sbin/ntpdate ntp.build.mozilla.org

This should probably just use pool.ntp.org?

@@ +48,5 @@
> +/sbin/service puppetmaster stop
> +# Make sure puppet-server isn't installed already.
> +# We can end up with file permission issues if we upgrade puppet-server between
> +# different rpm repos (e.g. from epel to puppetlabs)
> +(/bin/rpm -e puppet-server > /dev/null 2>&1 || exit 0)

Shouldn't this uninstall happen *before* the install?

@@ +54,5 @@
> +/bin/echo "=== Applying toplevel::server::puppet..."
> +/bin/cp /etc/puppet/production/setup/puppet.conf /etc/puppet/puppet.conf
> +
> +# TODO: This actually can return 0 if it fails to satisfy some dependencies
> +# /usr/bin/puppet apply --modulepath /etc/puppet/production/modules --manifestdir /etc/puppet/production/manifests /etc/puppet/production/manifests/puppetmaster.pp || exit 1

You may want

* --detailed-exitcodes:
  Provide transaction information via exit codes. If this is enabled, an exit
  code of '2' means there were changes, an exit code of '4' means there were
  failures during the transaction, and an exit code of '6' means there were both
  changes and failures.
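For example, a small wrapper in masterize.sh could treat 0 and 2 as success and anything else as failure (the manifest path below is illustrative):

```shell
#!/bin/bash
# Interpret "puppet apply --detailed-exitcodes" results:
# 0 = no changes, 2 = changes applied, 4 = failures, 6 = changes + failures.
puppet_apply_ok() {
    case "$1" in
        0|2) return 0 ;;
        *)   return 1 ;;
    esac
}

# /usr/bin/puppet apply --detailed-exitcodes \
#     --modulepath /etc/puppet/production/modules \
#     --manifestdir /etc/puppet/production/manifests \
#     /etc/puppet/production/manifests/puppetmaster.pp
# puppet_apply_ok $? || exit 1
```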
Attachment #637979 - Flags: feedback?(dustin) → feedback-
Depends on: 770848
Depends on: 769341
Depends on: 769329
Comment on attachment 637979 [details] [diff] [review]
AWS puppet

An updated one will be here soon.
Attachment #637979 - Flags: feedback?(bugspam.Callek)
Attached patch AWS puppet v2 (obsolete) — Splinter Review
Comments incoming
Attachment #639139 - Flags: review?(dustin)
Attachment #637979 - Attachment is obsolete: true
(In reply to Dustin J. Mitchell [:dustin] from comment #1)
> 
> The CSV contents are intended to allow different puppetagain installations
> to distinguish themselves, rather than distinguishing different hosts in the
> same installation.  I think it would make more sense to determine whether a
> host is a puppetmaster or puppetca in the node definitions, and include
> e.g., toplevel::server::puppetmaster or toplevel::server::puppetca
> depending.  Then those toplevel classes can call the puppetmaster module:
> 
>   class {
>     puppetmaster:
>       is_puppetca => true; # or false for toplevel::server::puppetmaster
>   }
> 
> I'm not sure how node naming works in the AWS setup - is that not possible?

puppet.conf.erb is in the puppet module, so in this case, to access that variable, you'd need to initialize toplevel::server::puppetmaster and read it via $toplevel::server::puppetmaster::is_puppetca, which sounds very problematic to me.

In this case we use 2 sets of CSV files: one (puppetca-config.csv) for running "puppet apply" on the puppetca master and one (puppetmaster-config.csv) for "puppet apply" on puppetmasters. Both "puppet apply" runs are going to be triggered by the update.sh cronjob when something changes in the hg repo.

At the same time, puppetmasters will be running puppet agent to sync CRL and CSV data from puppetca (puppetmaster::sync_crl and puppetmaster::sync_csv) fairly often (every 5 mins? see runinterval).
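For reference, the agent side would just get something like this in puppet.conf (the value is a guess):

```
[agent]
    runinterval = 300
```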


> Overall, I'm not sure what the intent of the two toplevel classes is, nor do
> I see a lot of detail on how the puppetmaster/puppetca stuff is handled --
> are CA requests forwarded from puppetmasters to the puppetca?

We create instances and force a "puppet agent ... --server puppetmaster-xx --ca_server=puppetca-xx" run on the slaves. puppetca autosigns all requests atm. After this run, slaves don't need to talk to puppetca anymore; they will use /etc/puppet/puppetmasters.txt.

>  How is the CRL synchronized back?

It's in puppetmaster::sync_crl (puppetmasters-xx only).

> We should probably have a talk with opsec when this
> is ready to roll, just to make sure the cert handling is OK.  I suspect we
> can do so quickly, as an extension of the sign-off on the infra
> configuration.

What's the procedure here?
 
> ::: manifests/nodes.pp
> @@ +20,5 @@
> >  
> > +node /.*\.build\.aws-us-west-1\.mozilla\.com/ {
> > +    # Make sure we get our /etc/hosts set up
> > +    class {
> > +        "network::aws": stage => packagesetup,
> 
> Let's make a new stage for this, say "network", and add the contents of
> network::aws to the network class, conditioned on the host being in AWS.

Done.

> @@ +26,5 @@
> > +    include toplevel::slave::build::mock
> > +}
> > +
> > +node /^puppetmaster-\d+/ {
> > +    include toplevel::server::puppetmaster
> 
> This includes toplevel::server::puppetmaster, but it looks like most of the
> code is in toplevel::server::puppet.. ::confused::

Yeah, see my explanation above.

> ::: modules/packages/manifests/mozilla/py27_mercurial.pp
> @@ +10,5 @@
> > +            file {
> > +                "/usr/local/bin/hg":
> > +                    ensure => "/tools/python27-mercurial/bin/hg",
> > +                    require => Package["mozilla-python27-virtualenv"];
> > +            }
> 
> This should really be in the package, and should be consistent across all of
> the custom packages.  Currently none use symlinks like this, IIRC.

I'm not sure what to do here. catlee added it to make slaves work properly, I believe.

> ::: modules/packages/manifests/setup.pp
> @@ +28,5 @@
> >              url_path => "repos/yum/mirrors/puppetlabs/el/6/products/$hardwaremodel";
> >          "releng-public-${operatingsystem}${majorver}-${hardwaremodel}":
> >              url_path => "repos/yum/releng/public/$operatingsystem/$majorver/$hardwaremodel";
> > +        "passenger-${operatingsystem}${majorver}-${hardwaremodel}":
> > +            url_path => "repos/yum/mirrors/passenger.stealthymonkeys.com/rhel/6.2/$hardwaremodel";
> 
> This should probably just be named 'passenger'.  We can document the links
> on the wiki as has been done for puppetlabs.  Also, the repo title doesn't
> necessarily need all of those variables in it..

Filed bug 770848 to mirror the packages. Renamed to "passenger".
 
> ::: modules/puppet/templates/puppet.conf.erb
> @@ +3,5 @@
> >  [main]
> >      logdir = /var/log/puppet
> >      rundir = /var/run/puppet
> >      ssldir = /var/lib/puppet/ssl
> > +    confgdir = /etc/puppet/production
> 
> missing an 'i'?

I just removed it.


> ::: modules/toplevel/files/server/puppet/update.sh
> @@ +1,4 @@
> > +#! /bin/bash
> > +
> > +errormailto="release@mozilla.com"
> > +changemailto="release@mozilla.com"
> 
> These should go to releng-shared@mozilla.com.

Done.
 
> ::: modules/toplevel/files/server/puppet/yum_mirrors.conf
> @@ +1,2 @@
> > +Alias /repos /data/repos
> > +Alias /python /data/python
> 
> You'll need to add a new one of these for every tree.  It's easier to just
> use DocumentRoot /data.  See releng_puppetmaster_web.conf on a releng
> puppetmaster for how we do it there.

Done.
 
> ::: modules/toplevel/manifests/server/master.pp
> @@ +1,1 @@
> > +class toplevel::server::master inherits toplevel::base {
> 
> We're both wrong :)  This should inherit toplevel::server -- toplevel
> classes' naming hierarchy and class hierarchy should be in lock-step.


Heh, fixed.
 
> ::: modules/toplevel/manifests/server/puppet.pp
> @@ +1,2 @@
> > +# toplevel class for running a puppet master
> > +class toplevel::server::puppet {
> 
> Just about everything here, and the files/templates associated with it,
> should be in modules/puppetmaster.

Done
 
> @@ +2,5 @@
> > +class toplevel::server::puppet {
> > +    include toplevel::server
> > +    include nrpe::check::ntp_time
> > +    include nrpe::check::ganglia
> > +    include puppet
> 
> This will probably need to be 'include ::puppet' to avoid relative includes..

Moved to the puppetmaster package.

> @@ +4,5 @@
> > +    include nrpe::check::ntp_time
> > +    include nrpe::check::ganglia
> > +    include puppet
> > +
> > +    package {
> 
> These should all be done via the package module  You can bundle them up
> (e.g., httpd + mod_ssl)

Done.
 
> @@ +15,5 @@
> > +            ensure => installed;
> > +        "puppet-server":
> > +            require => Package["puppet"],
> > +            ensure => $puppet::puppet_version;
> > +        "createrepo":
> 
> Does this end up being used during the setup?  I don't see it used elsewhere
> in the patch.

We used to use it, but we shouldn't. Removed.
 
> @@ +17,5 @@
> > +            require => Package["puppet"],
> > +            ensure => $puppet::puppet_version;
> > +        "createrepo":
> > +            ensure => installed;
> > +        "mercurial":
> 
> Shouldn't this by py27_mercurial?  Do we want multiple mercurials installed?

This is moved to packages::mercurial and is needed on puppetca/puppetmasters only, so there's no need for the custom, non-system-wide one. Don't want to fight with paths. :)


> @@ +30,5 @@
> > +            owner => root,
> > +            group => root,
> > +            require => Package["puppet-server"],
> > +            source => "puppet:///modules/toplevel/server/puppet/fileserver.conf";
> > +        "/etc/puppet/autosign.conf":
> 
> Since this is empty, it may be easier and more obvious to just write content
> => "".

Actually it is "*\". Done.
 
> @@ +36,5 @@
> > +            owner => root,
> > +            group => root,
> > +            require => Package["puppet-server"],
> > +            source => "puppet:///modules/toplevel/server/puppet/autosign.conf";
> > +        # TODO: add a crontab entry for the following file
> 
> And, as a note, use a file in /etc/cron.d/ - the cron {} resource will leave
> you unhappy.

Yeah, I hit that before.

> @@ +47,5 @@
> > +        "/etc/httpd/conf.d/yum_mirrors.conf":
> > +            require => Package["httpd"],
> > +            source => "puppet:///modules/toplevel/server/puppet/yum_mirrors.conf";
> > +        "/etc/httpd/conf.d/puppetmaster_passenger.conf":
> > +            require => [Package["httpd"], Package["mod_ssl"], Package["mod_passenger"]],
> 
> Most of the requires here and above aren't necessary just to install the
> file.  For example, here you only need httpd (well, should be
> Class['packages::httpd']), as that will create the conf.d directory.  The
> rest can get installed after the file is created with no harm.  Similarly,
> if puppet is running then /etc/puppet exists, so there's no need to require
> puppet-server for update.sh.

Cut and stitched everything here. :)
 
> @@ +73,5 @@
> > +        "iptables":
> > +            require => [
> > +                File["/etc/sysconfig/iptables"],
> > +            ],
> > +            subscribe => File["/etc/sysconfig/iptables"],
> 
> subscribe implies require, so only the second is necessary.

I removed this for now. Adding firewall is in my TODO list.

> You may need ensure => running, and possibly hasrestart => true, to get this
> to refresh.

Done. Also added hasstatus => true.
 
> I'm on the fence about suggesting that this and the iptables file above be
> moved to network::firewall.  Given infinite time for writing code, I'd love
> to see
> 
>   network::firewall {
>     [ '22', '80', '443', '8140' ]:
>       proto => tcp;
>   }

I think we can use http://forge.puppetlabs.com/puppetlabs/firewall. TODO.
> but that's asking a lot for something not related to the purpose of this
> patch.
> 
> @@ +81,5 @@
> > +                Package["puppet-server"],
> > +                File["/etc/puppet/puppet.conf"],
> > +                File["/etc/puppet/fileserver.conf"],
> > +                File["/var/lib/puppet/ssl/ca"],
> > +            ],
> 
> This will only need to require Class['package::puppet-server']

Fixed.

> @@ +96,5 @@
> > +                File['/etc/httpd/conf.d/puppetmaster_passenger.conf']
> > +            ],
> > +            ensure => running,
> > +            enable => true;
> > +    }
> 
> We'll end up using httpd again, so this should be moved out to an httpd
> module.  You can have the file resources above notify =>
> Class['httpd::service'] to get the effect of this subscription.

Done.
 
> ::: setup/hosts-bootstrap
> @@ +1,3 @@
> > +127.0.0.1               localhost.localdomain localhost
> > +::1             localhost6.localdomain6 localhost6
> > +127.0.0.1 puppet mrepo repos mrepo.mozilla.org
> 
> Where are we talking to mrepo?

Removed.
 
> ::: setup/masterize.sh
> @@ +1,5 @@
> > +#!/bin/bash
> > +set -e
> > +set -x
> > +
> > +echo "Setting up this host to be a puppet master"
> 
> It'd be nice to have this script bootstrap itself entirely, doing the
> initial hg clone.  Then we could write up a one-liner to masterize:
>  wget http://hg.m.o/puppet/file/tip/setup/masterize.sh && sh masterize.sh
> Currently there are a number of things unnecessarily done by
> aws_create_puppetmaster.py.


 
> @@ +28,5 @@
> > +
> > +/bin/echo "=== Installing apache, setting up mirrors ==="
> > +(/bin/rpm -q httpd  > /dev/null 2>&1 || /usr/bin/yum -y -q install httpd)
> > +/bin/cp /etc/puppet/production/modules/toplevel/files/server/puppet/yum_mirrors.conf /etc/httpd/conf.d/yum_mirrors.conf
> > +/etc/init.d/httpd restart
> 
> This serves up /data, but where does that come from?  Once the public mirror
> is up (bug 757285), you should be able to initialize with an rsync from
> there.

Initially we create instances with prepopulated /data. I added a cronjob to sync files. We'll also need to create snapshots of /data before we create another puppetmaster instance.

> @@ +35,5 @@
> > +#if ( `ps ax | grep -v grep | grep -q ntpd` ); then
> > +    ## Stop ntpd running. Puppet will start it back up
> > +    #/sbin/service ntpd stop
> > +#fi
> > +#/usr/sbin/ntpdate ntp.build.mozilla.org
> 
> This should probably just use pool.ntp.org?

The machines are inside of build-vpn.

> @@ +48,5 @@
> > +/sbin/service puppetmaster stop
> > +# Make sure puppet-server isn't installed already.
> > +# We can end up with file permission issues if we upgrade puppet-server between
> > +# different rpm repos (e.g. from epel to puppetlabs)
> > +(/bin/rpm -e puppet-server > /dev/null 2>&1 || exit 0)
> 
> Shouldn't this uninstall happen *before* the install?

I removed the rpm -e step by fixing the root problem: there was no puppetlabs repo in the bootstrap repo file.

> @@ +54,5 @@
> > +/bin/echo "=== Applying toplevel::server::puppet..."
> > +/bin/cp /etc/puppet/production/setup/puppet.conf /etc/puppet/puppet.conf
> > +
> > +# TODO: This actually can return 0 if it fails to satisfy some dependencies
> > +# /usr/bin/puppet apply --modulepath /etc/puppet/production/modules --manifestdir /etc/puppet/production/manifests /etc/puppet/production/manifests/puppetmaster.pp || exit 1
> 
> You may want
> 
> * --detailed-exitcodes:
>   Provide transaction information via exit codes. If this is enabled, an exit
>   code of '2' means there were changes, an exit code of '4' means there were
>   failures during the transaction, and an exit code of '6' means there were
> both
>   changes and failures.

Unfortunately that doesn't work with puppet apply afaik. :(

Patch comments incoming.
Comment on attachment 639139 [details] [diff] [review]
AWS puppet v2

Review of attachment 639139 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/config/manifests/init.pp
@@ +9,3 @@
>      $puppet_notif_email = extlookup("puppet_notif_email")
>      $puppet_server = extlookup("puppet_server")
> +    $puppet_server_ca_enabled = extlookup("puppet_server_ca_enabled", "false")

Explicitly used "false" here, to prevent conversion from bool->string which didn't work.

@@ +9,4 @@
>      $puppet_notif_email = extlookup("puppet_notif_email")
>      $puppet_server = extlookup("puppet_server")
> +    $puppet_server_ca_enabled = extlookup("puppet_server_ca_enabled", "false")
> +    $puppet_agent_runinterval = extlookup("puppet_agent_runinterval", undef)

To make puppetmasters sync more frequently.
We talked by vidyo, and identified a few issues here:
 - puppetmasters are running both 'puppet agent' (talking to puppetca, with toplevel::server::puppetmaster, so only sync'ing) and 'puppet apply' (in update.sh, with toplevel::server::puppet, doing all of the other server-like stuff)
 - puppetca is autosigning, which precludes putting secrets into puppet.
 - puppetca is only used on first connect, to get certificates.  We could get more security and remove an SPOF by having whatever instantiates new slaves (kitten reaper) generate and supply the certs instead.

From what I understand, none of this is blocking B2G-in-the-clouds tomorrow, as there are already puppetmasters there running just fine.

In the interests of getting some of this committed, without committing to designs we don't like, the plan is:
 - remove the bootstrapping support from the patch for now
 - add a $config::puppetca_host to moco-settings.csv or local-settings.csv so that puppetmasters know who their puppetca is
 - use rsync, rather than puppet, to synchronize CSV's and CRLs from $config::puppetca_host
 - use only toplevel::server::puppetmaster for puppetmasters
 - puppetmasters should run puppet agent *only*, with themselves as the server
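A minimal sketch of that rsync step, shipped as a cron.d file per the earlier advice (the $config::puppetca_host variable and the rsync module names/paths are assumptions):

```puppet
$puppetca = $config::puppetca_host
file {
    "/etc/cron.d/sync-from-puppetca":
        # pull the CRL and extlookup CSVs from the puppetca host every 5 min
        content => "*/5 * * * * root rsync -a rsync://${puppetca}/crl/ /var/lib/puppet/ssl/crl/ && rsync -a rsync://${puppetca}/csv/ /etc/puppet/production/manifests/extlookup/\n";
}
```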

Once that's committed, we can more easily attack the bootstrapping problem with another patch on this bug.
Comment on attachment 639139 [details] [diff] [review]
AWS puppet v2

I'll ask for r? once the new patch is ready.
Attachment #639139 - Flags: review?(dustin)
Depends on: 771598
Depends on: 771600
Attached patch AWS puppet v3 (obsolete) — Splinter Review
comments incoming
Attachment #639139 - Attachment is obsolete: true
Attachment #641887 - Flags: review?(dustin)
Comment on attachment 641887 [details] [diff] [review]
AWS puppet v3

Review of attachment 641887 [details] [diff] [review]:
-----------------------------------------------------------------

* You can ignore everything under setup/ if you prefer to have manifests only.
* CRL sync isn't covered here
* There are 2 ways to install puppet masters: standalone and puppet master talking to another (super!) puppet master

::: manifests/nodes.pp
@@ +26,5 @@
> +    include toplevel::slave::build::mock
> +}
> +
> +node /puppetmaster-\d+\..*\.aws-.*\.mozilla\.com/ {
> +    include toplevel::server::puppetmaster_standalone

We are going to use a standalone one in AWS for now.

::: manifests/site.pp
@@ +1,3 @@
>  # Setup extlookup which we only use for config magic
>  $extlookup_datadir = "$settings::manifestdir/extlookup"
> +$extlookup_precedence = ["$domain", "local-config", "default-config", "secrets"]

to simplify puppetmaster settings for slaves in different regions.

::: modules/puppet/manifests/atboot.pp
@@ -32,5 @@
>                      # installed first
>                      require => Class['puppet::install'],
>                      source => "puppet:///modules/puppet/puppet-centos-initd";
> -                "/etc/sysconfig/puppet":
> -                    content => template("puppet/sysconfig-puppet.erb");

I moved this to periodic, atboot doesn't use it.

::: modules/puppetmaster/manifests/config.pp
@@ +38,5 @@
> +            group  => puppet,
> +            source => "puppet:///modules/puppetmaster/config.ru";
> +        "/etc/incron.d/crl.incron":
> +            require => [Class['packages::httpd'], Class['packages::incron']],
> +            content => "/var/lib/puppet/ssl IN_MODIFY /sbin/service httpd graceful\n";

Magic! :)

::: modules/puppetmaster/manifests/sync_data.pp
@@ +1,3 @@
> +class puppetmaster::sync_data {
> +    file {
> +        # TODO: add --delete once passenger repo is copied over

Will drop it.
Comment on attachment 641887 [details] [diff] [review]
AWS puppet v3

Review of attachment 641887 [details] [diff] [review]:
-----------------------------------------------------------------

OK, well, if you can make the puppet_server thing not break production, and turn off autosigning, this is good to go.  I ignored the setup/ stuff as directed.

You'll need docs on the wiki for the new module, and a writeup of how the install process works, too, when it lands.

::: manifests/extlookup/default-config.csv
@@ -1,5 @@
>  puppet_notif_email,nobody@mozilla.com
>  data_server,repos
>  data_servers,%{data_server}
> -puppet_server,puppet
> -puppet_servers,%{puppet_server}

This will break the moco install, since the puppet servers are the failover pool (same for the config.pp change)

::: modules/packages/manifests/mozilla/py27_mercurial.pp
@@ +10,5 @@
> +            file {
> +                "/usr/local/bin/hg":
> +                    ensure => "/tools/python27-mercurial/bin/hg",
> +                    require => Package["mozilla-python27-virtualenv"];
> +            }

The package now creates this link itself (bug 738040).

::: modules/puppet/templates/puppet.conf.erb
@@ +7,5 @@
> +    srand(old_seed)
> +else
> +    server = puppet_servers
> +end
> +-%>

Ah, I see why the parameter was removed from default-config.csv, then: if no puppet server is set, then pick one "randomly" from the set of available servers.  Can you switch this around to use a sentinel value in puppet_server, instead?  Something unambiguous like "<<shuffle>>"?  That way the default ('puppet') can stay in place.

Also, since this snippet occurs twice (and should be used in puppetmasters.txt, too), you can DRY things up by putting it in its own template, say calculate-puppet-server.erb, that calculates server and then outputs it (with no whitespace), and then setting
  $puppet_server = template("puppet/calculate-puppet-server.erb");
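The seeded-shuffle trick the template uses can be sketched in plain Ruby (the names here are illustrative, not the actual template code): seed the global RNG from a hash of the hostname so each host always gets the same "random" master, then restore the previous seed so the rest of the catalog compilation is unaffected.

```ruby
require 'zlib'

# Pick a stable per-host ordering of masters: seeding from a hash of the
# fqdn makes repeated runs on one host agree, while different hosts
# spread across the pool. srand returns the old seed, which we restore.
def shuffled_masters(fqdn, masters)
  old_seed = srand(Zlib.crc32(fqdn))
  order = masters.shuffle
  srand(old_seed)
  order
end

puts shuffled_masters("slave01.build.example.com",
                      %w[puppetmaster1 puppetmaster2 puppetmaster3]).first
```

The same helper can back both puppet.conf and puppetmasters.txt, which is the DRY point above: compute the order once, emit it wherever needed.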

::: modules/puppetmaster/manifests/config.pp
@@ +13,5 @@
> +            mode => 0644,
> +            owner => root,
> +            group => root,
> +            require => Class["puppet"],
> +            content => "*";

I thought you were disabling autosigning?

@@ +38,5 @@
> +            group  => puppet,
> +            source => "puppet:///modules/puppetmaster/config.ru";
> +        "/etc/incron.d/crl.incron":
> +            require => [Class['packages::httpd'], Class['packages::incron']],
> +            content => "/var/lib/puppet/ssl IN_MODIFY /sbin/service httpd graceful\n";

Nice!  But probably deserves a comment :)

::: modules/toplevel/manifests/server/puppetmaster_standalone.pp
@@ +2,5 @@
> +class toplevel::server::puppetmaster_standalone {
> +    include toplevel::base
> +    include toplevel::server::puppetmaster_base
> +    include puppetmaster::update_manifests
> +}

So these three toplevel classes don't follow the name/class hierarchy coordination.  I think you could have the same effect as follows (with classes at the corresponding filenames):

class toplevel::server::puppetmaster inherits toplevel::server {
    include nrpe::check::ntp_time
    include nrpe::check::ganglia
    include ::puppetmaster
}

# toplevel class for running a puppet master
class toplevel::server::puppetmaster::standalone inherits toplevel::server::puppetmaster {
    include toplevel::server::puppetmaster
    # update manifests regularly from hg
    include puppetmaster::update_manifests
}

note that most of the things listed in puppetmaster_base.pp are included by toplevel::server, so don't need to be repeated.
Attachment #641887 - Flags: review?(dustin) → review-
Joes, this is work to build puppet masters that will run (are running?) in AWS.  The plan, as I understand it, is to disable autosigning, and to sign new hosts out-of-band, with a CA in the moco network, when the hosts are created, rather than using the normal puppet CA mechanism to do so.  Rail, anything to add?
Everything sounds correct.
So, we'd like to review the current AWS/VPC deployment before we move ahead any further. Dustin or Rail, can you set aside some time this week to meet with us?
I'd like to sit in if the timing works out, but catlee and/or rail are the folks to talk to.
Depends on: 774638
Attached patch AWS puppet v4 (obsolete) — Splinter Review
This is the updated and tested version of puppet manifests.

I would really like to go with the "puppet apply" way of setting up multiple puppet masters, for 2 important reasons:

1) You don't need a puppet master-master (sic!) to set up a puppet master. Even if you use localhost there is a chance of breaking apache (see 2).

I believe this will be important for the SeaMonkey puppet master - no need to deal with netflows, and they can even use their own fork of the puppet repo if they need to.

2) If you use localhost as a puppet master-master, there is a chance of breaking apache with broken configs, invalid certificates, etc. This way it's easier to recover multiple masters: you don't need to ssh to every single master and kick them. Just push your change to hg and relax. :)

I've been testing this configuration and it has been quite stable.

Changes:
* Refactored and simplified the puppetmaster and puppetmaster::standalone manifests. puppet::periodic runs puppet agent / puppet apply (update.sh) depending on the existence of the /etc/puppet/standalone file.

* Dropped the incrond-based apache reloads in favour of a script which pulls CRLs from a central location and reloads apache when it detects a CRL update. Originally I thought about pushing the CRL from a central location, but that would require us to keep the destination list up to date. Publishing the CRL is much easier and less error prone.

* update.sh converted to a template so you don't need to hardcode emails and hg repo url

* Removed /etc/sysconfig/puppet, which isn't used by any script. Originally it was used by the original /etc/init.d/puppet, but we replaced it with our own script that supports multiple puppet masters.

* Added /root/.hgrc with dotencode=False
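The pull-based CRL refresh described above boils down to "fetch, compare, reload only on change". A minimal Ruby sketch of that core logic (the real implementation is the update_crl.sh shell template; the fetch and the apache reload are represented only as comments here):

```ruby
require 'tmpdir'

# Write new_body to crl_path only when it differs from what is already
# there; return true when the file changed. The caller would then run
# `/sbin/service httpd graceful`, as the dropped incron rule used to.
def update_crl(new_body, crl_path)
  old = File.exist?(crl_path) ? File.binread(crl_path) : nil
  return false if old == new_body
  File.binwrite(crl_path, new_body)
  true
end

Dir.mktmpdir do |dir|
  crl = File.join(dir, "crl.pem")
  # new_body would normally come from fetching the published CRL URL
  puts update_crl("crl v1", crl)  # changed: would reload apache
  puts update_crl("crl v1", crl)  # unchanged: no reload
  puts update_crl("crl v2", crl)  # revocation added: reload again
end
```

Because publishing is one-way, the central CA never needs an up-to-date list of masters; each master polls on its own schedule.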
Attachment #641887 - Attachment is obsolete: true
Attachment #650584 - Flags: review?(dustin)
Comment on attachment 650584 [details] [diff] [review]
AWS puppet v4

Review of attachment 650584 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great!  I'm confused about how the SSL part works, though: if each master generates its own certs (in setup/masterize.sh), but clients are signed out-of-band, how do the masters authenticate those clients?  Is there a manual step to install $ssldir/ca/ca_pub.pem?

So, r+ with a few minor tweaks in the comments here, and we should probably talk about the SSL stuff together with someone from opsec.  We can get this committed before the opsec review (since it sounds like it's already in use anyway).  Do you want to schedule a meeting for that?

It'll be good to coordinate with buildduty, callek, and kim for the landing - there's a nontrivial risk of breaking existing production machines with this patch.

::: manifests/extlookup/default-config.csv
@@ -1,5 @@
>  puppet_notif_email,nobody@mozilla.com
>  data_server,repos
>  data_servers,%{data_server}
> -puppet_server,puppet
> -puppet_servers,%{puppet_server}

I think that the default case should be a single puppet server; the two other cases are
 * having more with each client assigned to a "local" master (mozilla)
 * having more with clients randomly assigned (AWS)

Could you keep the puppet_server arg in this file, but change it around so that if it is some obvious sentinel value like "<<RANDOM>>", it's selected randomly from puppet_servers?  I think that will be more obvious than the *lack* of that parameter causing the randomization.

::: modules/puppet/manifests/atboot.pp
@@ -40,5 @@
>                      # installed first
>                      require => Class['packages::puppet'],
>                      content => template("puppet/puppet-centos-initrd.erb");
> -                "/etc/sysconfig/puppet":
> -                    content => template("puppet/sysconfig-puppet.erb");

+1 - this has been on my "minor fix" list for a while..

::: modules/puppetmaster/templates/update.sh.erb
@@ +64,5 @@
> +            echo "Puppet changes applied at $hostname:"
> +            echo ''
> +            hg diff -r $rev_before -r $rev_after
> +            echo ''
> +            echo "puppet apply output:"

This should probably be conditional on /etc/puppet/standalone existing.  I *think* that with that change, both toplevel::server::puppetmaster and toplevel::server::puppetmaster::standalone will work (with, admittedly, no bootstrapping available for the former).

::: modules/puppetmaster/templates/update_crl.sh.erb
@@ +8,5 @@
> +
> +tmp=$(mktemp)
> +trap "rm -f $tmp" EXIT
> +
> +if ! wget -q -O "$tmp" "$crl_sync_url"; then

We'll need to get opsec to look at this.  I assume crl_sync_url is https://?
Attachment #650584 - Flags: review?(dustin) → review+
Attached patch AWS puppet v5Splinter Review
* addressed comments
* carrying over r=dustin

(In reply to Dustin J. Mitchell [:dustin] from comment #16)
> Comment on attachment 650584 [details] [diff] [review]
> AWS puppet v4
> 
> Review of attachment 650584 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks great! 

Thanks!

> I'm confused about how the SSL part works, though: if
> each master generates its own certs (in setup/masterize.sh), but clients are
> signed out-of-band, how do the masters authenticate those clients?  Is there
> a manual step to install $ssldir/ca/ca_pub.pem?

There is a bunch of CA scripts to generate and revoke the certs:
https://github.com/rail/briar-patch/tree/aws/releng/ca-scripts (slightly edited your scripts)

and 

some additional steps to put the certs
for puppet masters: https://github.com/rail/briar-patch/blob/aws/releng/aws_create_puppetmaster.py#L96
slaves: https://github.com/rail/briar-patch/blob/aws/releng/aws_create_instance.py#L58

The *.py part isn't polished yet.
 
> So, r+ with a few minor tweaks in the comments here, and we should probably
> talk about the SSL stuff together with someone from opsec.  We can get this
> committed before the opsec review (since it sounds like it's already in use
> anyway).  Do you want to schedule a meeting for that?

It's in use, but not for production slaves yet. In any case it's better than an autosigning puppetca.
 
Actually, we had a meeting with opsec a couple of weeks ago, assuming that we'll go this way. There were no show stoppers, IIRC; the only concern was about putting ssh keys on AWS slaves, but nothing puppet related.

> It'll be good to coordinate with buildduty, callek, and kim for the landing
> - there's a nontrivial risk of breaking existing production machines with
> this patch.

Sounds good. I'll be on buildduty next week. One less person to coordinate with. :)
 
> Could you keep the puppet_server arg in this file, but change it around so
> that if it is some obvious sentinel value like "<<RANDOM>>", it's selected
> randomly from puppet_servers?  I think that will be more obvious than the
> *lack* of that parameter causing the randomization.

Done and tested.
 
> ::: modules/puppetmaster/templates/update.sh.erb
> @@ +64,5 @@
> > +            echo "Puppet changes applied at $hostname:"
> > +            echo ''
> > +            hg diff -r $rev_before -r $rev_after
> > +            echo ''
> > +            echo "puppet apply output:"
> 
> This should probably be conditional on /etc/puppet/standalone existing.  I
> *think* that with that change, both toplevel::server:puppetmaster and
> toplevel::server::puppetmaster::standalone will work (with, admittedly, no
> bootstrapping available for the former).

Good point, will be easier to run masterize.sh.
 
> ::: modules/puppetmaster/templates/update_crl.sh.erb
> @@ +8,5 @@
> > +
> > +tmp=$(mktemp)
> > +trap "rm -f $tmp" EXIT
> > +
> > +if ! wget -q -O "$tmp" "$crl_sync_url"; then
> 
> We'll need to get opsec to look at this.  I assume crl_sync_url is https://?

Yes, I'm going to run CA on cruncher and publish the files to https://build.m.o:
https://github.com/rail/briar-patch/blob/aws/releng/ca-scripts/ca_config.sh#L8
Attachment #650584 - Attachment is obsolete: true
Attachment #650627 - Flags: review+
Better to put them on https://secure.pub.build.mozilla.org, since we're retiring https://build.mozilla.org (bug 604688).  We can limit access to that to the AWS network.  File a bug for me to get that set up?  I can set up an rsync from cruncher.
Depends on: 781642
Filed bug 781642
Attached patch ganglia setupSplinter Review
I'm not sure which ganglia cluster these servers should be pointed at. Assuming RelEngSCL3Srv.
Attachment #650924 - Flags: review?(dustin)
Comment on attachment 650924 [details] [diff] [review]
ganglia setup

Review of attachment 650924 [details] [diff] [review]:
-----------------------------------------------------------------

Ganglia needs to be in the same subnet as the server, so I'm almost certain this won't work.  Amy can correct me if I'm wrong, but for the moment I'd advise going without ganglia, or setting it up in EC2.
Attachment #650924 - Flags: review?(dustin) → review-
Comment on attachment 650627 [details] [diff] [review]
AWS puppet v5

Refreshed and unbitrotten:
http://hg.mozilla.org/build/puppet/rev/48ffe79f824f
Attachment #650627 - Flags: checked-in+
Attached patch fix randomnessSplinter Review
It turned out that there is an issue with order of puppet servers in /etc/puppet/puppetmasters.txt.

We need to put $puppet_master at the top of the file if puppet_master in the CSV files isn't set to <<RANDOM>>.

The patch needs testing.
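The ordering rule can be sketched as follows (illustrative Ruby, not the template's actual code): produce a stable per-host shuffle of the pool, then pin the configured master first unless the <<RANDOM>> sentinel is in effect.

```ruby
require 'zlib'

# Build the puppetmasters.txt ordering: a deterministic per-host shuffle
# of the pool, with the configured master moved to the top when
# puppet_server is a concrete name rather than the <<RANDOM>> sentinel.
def masters_order(puppet_server, puppet_servers, fqdn)
  # sort by a hash of "fqdn:master" -> stable but host-dependent order
  shuffled = puppet_servers.sort_by { |m| Zlib.crc32("#{fqdn}:#{m}") }
  return shuffled if puppet_server == "<<RANDOM>>"
  [puppet_server] + (shuffled - [puppet_server])
end

puts masters_order("puppet2", %w[puppet1 puppet2 puppet3], "slave01")
```

With this shape the failover list still covers the whole pool; only the first entry (the master tried first) is forced to match the configured server.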
Depends on: 783346
Comment on attachment 652195 [details] [diff] [review]
fix randomness

I tested the patch on mtnlion and hp slaves, looks good:
http://pastebin.mozilla.org/1761523
http://pastebin.mozilla.org/1761524
Attachment #652195 - Flags: review?(dustin)
Depends on: 783541
Comment on attachment 652195 [details] [diff] [review]
fix randomness

Looks good.  Can you write up some wiki docs on this, too?
Attachment #652195 - Flags: review?(dustin) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #25)
> Can you write up some wiki docs on this, too?

Sure. I'm going to do this this week, probably tomorrow.
I updated the documentation https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Modules/puppet and added https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Modules/puppetmaster.

Still need to document CA and certificate management, that's bug 784716
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Product: Release Engineering → Infrastructure & Operations