Closed Bug 1350506 Opened 5 years ago Closed 5 years ago

sea-puppet's puppet setup is horked.

Categories

(SeaMonkey :: Release Engineering, defect)

defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ewong, Assigned: ewong)

Attachments

(1 file, 3 obsolete files)

This looks like bug 1311822. 

(No, I don't recall running puppet cert on sea-puppet)

puppet agent --test on -3 returns:

Error: Could not request certificate: Error 400 on SERVER: this master is not a CA
Exiting; failed to retrieve certificate and waitforcert is disabled
Following the instructions in that bug, I did:

1) backed up all the stuff in /var/lib/puppetmaster/ssl/ca to ../backup.zip
2) removed the last entry in inventory.txt
3) removed the *.pems
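
For reference, the three steps above could be scripted roughly as follows. This is only a sketch under assumptions: `reset_ca_dir` is a made-up helper name, it writes a tar archive where the comment used a zip file, it assumes `inventory.txt` and the `*.pem` files sit directly in the directory passed in, and it relies on a sed that supports `-i`.

```shell
#!/bin/sh
# Sketch only: back up a puppet CA directory, drop the newest inventory
# entry, and remove the *.pem files found directly in that directory.
# reset_ca_dir is a hypothetical helper, not part of puppet.
reset_ca_dir() {
    ca_dir=$1        # e.g. /var/lib/puppetmaster/ssl/ca
    backup=$2        # archive path OUTSIDE ca_dir, e.g. ../backup.tar.gz

    # 1) back up everything first so the old state can be restored
    tar -czf "$backup" -C "$ca_dir" . || return 1

    # 2) remove the last entry (last line) of inventory.txt
    if [ -s "$ca_dir/inventory.txt" ]; then
        sed -i '$d' "$ca_dir/inventory.txt" || return 1
    fi

    # 3) remove the *.pem files at the top level of the directory
    rm -f "$ca_dir"/*.pem
}
```

It would be called as `reset_ca_dir /var/lib/puppetmaster/ssl/ca /var/lib/puppetmaster/ssl/backup.tar.gz`; the backup comes first so a failed attempt can be rolled back.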

rebooted puppetmaster.

rebooted -3.

waiting for it to return
Ran |puppet agent --trace --test| on -3 and got:


Error: Could not request certificate: Error 400 on SERVER: this master is not a CA
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:207:in `is_http_200?'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:100:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/certificate/rest.rb:12:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/indirection.rb:201:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:207:in `certificate'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:36:in `localhost'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator/default_validator.rb:27:in `initialize'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator.rb:27:in `new'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/validator.rb:27:in `default_validator'
/usr/lib/ruby/site_ruby/1.8/puppet/network/http_pool.rb:35:in `http_instance'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:57:in `network'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:82:in `http_request'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:62:in `http_get'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:96:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:190:in `do_request'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/request.rb:264:in `do_request'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:190:in `do_request'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:90:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/certificate/rest.rb:12:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/indirection.rb:201:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:207:in `certificate'
/usr/lib/ruby/site_ruby/1.8/puppet/ssl/host.rb:326:in `wait_for_cert'
/usr/lib/ruby/site_ruby/1.8/puppet/application/agent.rb:478:in `wait_for_certificates'
/usr/lib/ruby/site_ruby/1.8/puppet/application/agent.rb:319:in `run_command'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:510:in `plugin_hook'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:488:in `exit_on_fail'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:384:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:146:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/util/command_line.rb:92:in `execute'
/usr/bin/puppet:8
Exiting; failed to retrieve certificate and waitforcert is disabled
Summary: sea-puppet/ sea-hp-linux64-3 are not talking to each other. → sea-puppet's puppet setup is horked.
Blocks: SM2.48b1
Severity: normal → blocker
Blocks: SM2.48
After wrestling with puppet and certificate chaining, I think I've managed to
unhork the puppet infrastructure.

I got the basic gist of the fix from [1]:


1) generate a RootCA self-signed cert
   i) generate the RootCA CRL
2) generate a puppetmaster ca csr
   - sign the puppetmaster ca csr with the RootCA cert
   - Generate the puppetmaster CA CRL
     i) copy the ca-cert to /var/lib/puppetmaster/ssl/git/ca-certs
         (and rename to sea-puppet.community.scl3.mozilla.crt)
    ii) copy the puppetmaster CA and RootCA certs and CRLs to
         /var/lib/puppetmaster/ssl/git/certdir
         - then run the following script in that dir:

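         # Create OpenSSL hash-named symlinks (<hash>.r0 for CRLs,
         # <hash>.0 for certificates) so the directory works as a CApath.
         for i in *.crl; do
             h=$(openssl crl -hash -noout -in "$i")
             fn="$h.r0"
             echo "    Linking ${fn} to $i..."
             [ -e "$fn" ] || ln -s "$i" "$fn"
         done
         for i in *.pem; do
             h=$(openssl x509 -hash -noout -in "$i")
             fn="$h.0"
             echo "    Linking ${fn} to $i..."
             [ -e "$fn" ] || ln -s "$i" "$fn"
         done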

3) generate a puppetmaster (leaf) csr
   - sign puppetmaster (leaf) csr with the puppetmaster ca cert
      i) copy this puppetmaster leaf cert to
          /var/lib/puppetmaster/ssl/git/master-certs

4) generate the hosts (2->13 + sea-puppet + sea-master1) csrs
   - sign with the puppetmaster ca's cert
   - for each host:
      i) copy the crt to /var/lib/puppet/ssl/public_keys (rename to pem)
     ii) copy the crt also to /var/lib/puppet/ssl/certs (and rename to pem)
    iii) copy the key to /var/lib/puppet/ssl/private_keys (and rename to pem)
     iv) copy the RootCA's crt to /var/lib/puppet/ssl/certs (and rename to ca.pem)

5) then run |puppet agent -t| on every host.

[1] - https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Certificate_Chaining
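
To make the chain in steps 1) to 4) concrete, here is a minimal openssl sketch that builds RootCA -> puppetmaster CA -> leaf in a scratch directory and checks that the leaf verifies. It is not the exact procedure from [1]: `build_chain`, the subject names, the file names, the key size and the 30-day lifetime are all placeholders. CRL generation is left out, since it is normally done with `openssl ca -gencrl` against a CA config and database.

```shell
#!/bin/sh
# Sketch: build a root CA -> intermediate (puppetmaster) CA -> leaf chain.
# build_chain is a hypothetical helper; every name below is illustrative.
# The body runs in a subshell so the caller's working directory is kept.
build_chain() (
    cd "$1" || exit 1

    # Extension file that marks the intermediate as a CA certificate.
    printf 'basicConstraints=critical,CA:TRUE\n' > ca.ext || exit 1

    # 1) self-signed RootCA cert
    openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
        -subj "/CN=Example Root CA" \
        -keyout root.key -out root.crt || exit 1

    # 2) puppetmaster CA: key and CSR, then signed with the RootCA cert
    openssl req -newkey rsa:2048 -nodes \
        -subj "/CN=Example Puppetmaster CA" \
        -keyout ca.key -out ca.csr || exit 1
    openssl x509 -req -in ca.csr -CA root.crt -CAkey root.key \
        -CAcreateserial -days 30 -extfile ca.ext -out ca.crt || exit 1

    # 3)/4) leaf (master or host) CSR, signed with the puppetmaster CA cert
    openssl req -newkey rsa:2048 -nodes \
        -subj "/CN=host.example.com" \
        -keyout leaf.key -out leaf.csr || exit 1
    openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key \
        -CAcreateserial -days 30 -out leaf.crt || exit 1

    # The leaf should verify against the root once the intermediate
    # is supplied as an untrusted helper certificate.
    openssl verify -CAfile root.crt -untrusted ca.crt leaf.crt
)
```

Running `build_chain "$(mktemp -d)"` should end with a `leaf.crt: OK` line from `openssl verify`. A failure at that point usually means the intermediate is missing the CA:TRUE constraint or the wrong cert was used for signing.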

So I'm going to ask Callek for confirmation on whether I did it right.
Flags: needinfo?(bugspam.Callek)
I think it's unhorked, judging by the fact that puppet has overwritten my new
mercurial installation with the old one.

So I need to update the puppet mercurial module.
Attached patch [puppet] proposed patch (obsolete) — Splinter Review
Attached patch [puppet] proposed patch (obsolete) — Splinter Review
Attachment #8860731 - Attachment is obsolete: true
Attachment #8860734 - Flags: review?(bugspam.Callek)
Attached patch [puppet] proposed patch (v3) (obsolete) — Splinter Review
realized I hadn't flushed the repo version.
Attachment #8860734 - Attachment is obsolete: true
Attachment #8860734 - Flags: review?(bugspam.Callek)
Attachment #8860735 - Flags: review?(bugspam.Callek)
Attachment #8860735 - Attachment is obsolete: true
Attachment #8860735 - Flags: review?(bugspam.Callek)
then on the puppetmaster:

1) copied the mozilla-python27-mercurial-3.9.1-1.el6.x86_64.rpm to 
   /data/repos/yum/releng/public/CentOS/6/x86_64

2) cd /data/repos/yum/releng/public/CentOS/6/x86_64

3) createrepo --update ./

then on each puppet slave (as root):

1) yum clean all
2) puppet agent -t
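
The puppetmaster half of the steps above could be wrapped up like this. It is a sketch only: `publish_rpm` and the `CREATEREPO` variable are invented here (the variable exists just so the command can be swapped out for a dry run), and nothing beyond `cp` and `createrepo --update` from the steps above is assumed.

```shell
#!/bin/sh
# Hypothetical helper: publish one rpm into a yum repo directory and
# refresh the repo metadata in place.
CREATEREPO=${CREATEREPO:-createrepo}

publish_rpm() {
    rpm_file=$1     # e.g. mozilla-python27-mercurial-3.9.1-1.el6.x86_64.rpm
    repo_dir=$2     # e.g. /data/repos/yum/releng/public/CentOS/6/x86_64

    [ -f "$rpm_file" ] || { echo "missing rpm: $rpm_file" >&2; return 1; }
    [ -d "$repo_dir" ] || { echo "missing repo dir: $repo_dir" >&2; return 1; }

    # Copy the package next to the existing ones, keeping its timestamps.
    cp -p "$rpm_file" "$repo_dir"/ || return 1

    # --update reuses the metadata of packages that have not changed.
    "$CREATEREPO" --update "$repo_dir"
}
```

The agents still need `yum clean all` before `puppet agent -t`, as listed above, so that they drop cached metadata and see the new package.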
FWIW, as of this writing there has been one successful L64 (Linux64) trunk hourly build since the outage, which started immediately after buildID=20170420023553.
This new build is at http://ftp.mozilla.org/pub/seamonkey/tinderbox-builds/comm-central-trunk-linux64/1492959533/ ; it has buildID=20170423075853 which is after comment #9 (2 hours after if the build ID is in Mozilla time zone or 9 hours after if in UTC).
oops, got my arithmetic wrong: it is 5 hours earlier if in UTC.
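
One way to check which time zone the build ID is in is to compare it with the directory name in the URL. The sketch below assumes that `1492959533` is a Unix timestamp for the same build (the bug does not say so) and that GNU `date` is available for the `-d @N` syntax. It says nothing about the timing relative to comment #9, whose timestamp is not shown here.

```shell
#!/bin/sh
# Directory name from the tinderbox URL, assumed to be seconds since the epoch.
stamp=1492959533

# Render the same instant in build ID format (YYYYMMDDHHMMSS).
# TZ=PDT7 is a POSIX zone string meaning a fixed offset of UTC-7.
utc_id=$(date -u -d "@$stamp" +%Y%m%d%H%M%S)
pdt_id=$(TZ=PDT7 date -d "@$stamp" +%Y%m%d%H%M%S)

echo "UTC:   $utc_id"    # 20170423145853
echo "UTC-7: $pdt_id"    # 20170423075853, same digits as the build ID
```

Under that assumption, the build ID is in Pacific time (the Mozilla time zone), 7 hours behind UTC.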
fixed.
Assignee: nobody → ewong
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Flags: needinfo?(bugspam.Callek)