Closed Bug 1242146 Opened 8 years ago Closed 8 years ago

getcert.cgi not returning certs for y-2008-ec2-golden

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: grenade, Unassigned)

Details

The puppetize call from y-2008-ec2-golden to getcert.cgi is producing a certs.sh file that contains only a single line, which reads:

echo 'Certificate request for y-2008-ec2-golden.try.releng.use1.mozilla.com'

The file contains no certificates.

This does not appear to be the case for other Windows hosts: a recent run of the same userdata script from a host named b-2008-ec2-1983 resulted in valid certificates being downloaded.
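
A quick way to confirm the symptom is to count the PEM certificate blocks in the downloaded file. This is only a sketch: it assumes a healthy certs.sh embeds the certificates as PEM blocks, and `count_pem_certs` is a hypothetical helper name, not part of the puppetize script.

```shell
# Count PEM certificate blocks in a file (0 means no certificates).
# Assumption: a good certs.sh carries the certs inline as PEM text.
# Usage: count_pem_certs FILE
count_pem_certs() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1" || true
}
```

For the file described above, which holds only the echo line, this would print 0.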
The echo line should be going to stdout if run interactively, or to the log file if not (at least on Linux; I don't know if we're using a different puppetize method on Windows). It sounds like it's writing data to the wrong file and perhaps ditching the data from the pipe that the getcert.cgi call in the puppetize script is supposed to make.
The echo line is definitely coming from the puppet server. wget returns the same one line file:

C:\Users\Administrator>wget --no-check-certificate https://deploy:xxxxxx@releng-puppet2.srv.releng.scl3.mozilla.com/deploy/getcert.cgi
--2016-01-23 07:30:19--  https://deploy:*password*@releng-puppet2.srv.releng.scl3.mozilla.com/deploy/getcert.cgi
Resolving releng-puppet2.srv.releng.scl3.mozilla.com (releng-puppet2.srv.releng.scl3.mozilla.com)... 10.26.48.50
Connecting to releng-puppet2.srv.releng.scl3.mozilla.com (releng-puppet2.srv.releng.scl3.mozilla.com)|10.26.48.50|:443... connected.
WARNING: cannot verify releng-puppet2.srv.releng.scl3.mozilla.com's certificate, issued by '/CN=CA on releng-puppet2.srv.releng.scl3.mozilla.com':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 401 Authorization Required
Connecting to releng-puppet2.srv.releng.scl3.mozilla.com (releng-puppet2.srv.releng.scl3.mozilla.com)|10.26.48.50|:443... connected.
WARNING: cannot verify releng-puppet2.srv.releng.scl3.mozilla.com's certificate, issued by '/CN=CA on releng-puppet2.srv.releng.scl3.mozilla.com':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: 'getcert.cgi'

C:\Users\Administrator>cat getcert.cgi
echo 'Certificate request for y-2008-ec2-golden.try.releng.use1.mozilla.com'
Remember that the manifests are not consulted here, so any issues there are irrelevant.  The key is often DNS.

dmitchell@admin1a ~ $ host y-2008-ec2-golden.try.releng.use1.mozilla.com
y-2008-ec2-golden.try.releng.use1.mozilla.com has address 10.134.64.5
dmitchell@admin1a ~ $ host 10.134.64.5
5.64.134.10.in-addr.arpa domain name pointer y-2008-ec2-golden.try.releng.use1.mozilla.com.
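
The forward/reverse comparison above can be scripted. The following is a minimal sketch that parses captured `host` output rather than querying DNS itself; the function names are hypothetical, and the parsing assumes the "has address" / "domain name pointer" wording shown in the output above.

```shell
# Extract the IPv4 address from `host NAME` output read on stdin.
forward_addr() {
  awk '/ has address / { print $NF; exit }'
}

# Extract the PTR name from `host IP` output read on stdin,
# dropping the trailing dot.
reverse_name() {
  awk '/ domain name pointer / { sub(/\.$/, "", $NF); print $NF; exit }'
}

# Succeed only if the forward output yields an address and the reverse
# output points back at NAME.
# Usage: dns_round_trip_ok NAME FORWARD_OUTPUT REVERSE_OUTPUT
dns_round_trip_ok() {
  rt_ip=$(printf '%s\n' "$2" | forward_addr)
  rt_back=$(printf '%s\n' "$3" | reverse_name)
  [ -n "$rt_ip" ] && [ "$rt_back" = "$1" ]
}

# Example against live DNS (not run here):
#   fwd=$(host "$name")
#   rev=$(host "$(printf '%s\n' "$fwd" | forward_addr)")
#   dns_round_trip_ok "$name" "$fwd" "$rev" && echo "forward and reverse align"
```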

So, forward and reverse align, and match the host's record in EC2.  Here's the access_log for the request in comment 2:

10.134.64.5 - - [23/Jan/2016:07:30:22 -0800] "GET /deploy/getcert.cgi HTTP/1.1" 401 510 "-" "Wget/1.15 (mingw32)"
10.134.64.5 - deploy [23/Jan/2016:07:30:22 -0800] "GET /deploy/getcert.cgi HTTP/1.1" 200 77 "-" "Wget/1.15 (mingw32)"

So it's getting a 200 OK.  There's nothing matching that time in the error log.

Looking at getcert.cgi, it does a bunch of access checking before sending the echo line:

https://github.com/mozilla/build-puppet/blob/master/modules/puppetmaster/templates/getcert.cgi.erb#L71

Fortunately, that's the point where it starts logging to email.  The corresponding email is:

Creating certificate for y-2008-ec2-golden.try.releng.use1.mozilla.com (10.134.64.5) from 10.134.64.5.
+ set -e
+ source /var/lib/puppetmaster/ssl/scripts/ssl_common.sh
++ scripts_dir=/var/lib/puppetmaster/ssl/scripts
++ source /var/lib/puppetmaster/ssl/scripts/vars.sh
+++ scripts_dir=/var/lib/puppetmaster/ssl/scripts
+++ git_dir=/var/lib/puppetmaster/ssl/git
+++ git_common_dir=/var/lib/puppetmaster/ssl/git-common
+++ distinguished_master=releng-puppet2.srv.releng.scl3.mozilla.com
+++ distinguished_common=/var/lib/puppetmaster/ssl/git-common
+++ fqdn=releng-puppet2.srv.releng.scl3.mozilla.com
+++ root_ca_cert=/var/lib/puppetmaster/ssl/git/ca-certs/root.crt
+++ root_ca_crl=/var/lib/puppetmaster/ssl/git/ca-certs/root.crl
+++ ca_dir=/var/lib/puppetmaster/ssl/ca
+++ ca_certs_dir=/var/lib/puppetmaster/ssl/git/ca-certs
+++ agent_certs_dir=/var/lib/puppetmaster/ssl/git/agent-certs
+++ revocation_requests_dir=/var/lib/puppetmaster/ssl/git/revocation-requests
+++ my_revocation_requests_dir=/var/lib/puppetmaster/ssl/git/revocation-requests/releng-puppet2.srv.releng.scl3.mozilla.com
+++ pvt_dir=/var/lib/puppetmaster/ssl/pvt
+++ tmp_dir=/var/lib/puppetmaster/ssl/tmp
+++ certdir=/var/lib/puppetmaster/ssl/git/certdir
+++ master_ca_key=/var/lib/puppetmaster/ssl/pvt/master-ca.key
+++ master_ca_cert=/var/lib/puppetmaster/ssl/git/ca-certs/releng-puppet2.srv.releng.scl3.mozilla.com.crt
+++ master_ca_crl=/var/lib/puppetmaster/ssl/git/ca-certs/releng-puppet2.srv.releng.scl3.mozilla.com.crl
+++ master_key=/var/lib/puppetmaster/ssl/pvt/master.key
+++ master_cert=/var/lib/puppetmaster/ssl/git/master-certs/releng-puppet2.srv.releng.scl3.mozilla.com.crt
+++ id -un
++ '[' root = root ']'
+ host=y-2008-ec2-golden.try.releng.use1.mozilla.com
+ ip=10.134.64.5
++ echo y-2008-ec2-golden.try.releng.use1.mozilla.com
++ tr ' /' --
+ host=y-2008-ec2-golden.try.releng.use1.mozilla.com
+ lock_ca_dir
+ lockfile -10 -r 3 -l 120 /var/lib/puppetmaster/ssl/ca/lock
+ trap 'rm -f /var/lib/puppetmaster/ssl/ca/lock; exit' SIGHUP SIGINT SIGTERM EXIT
+ revoke_leaf_cert y-2008-ec2-golden.try.releng.use1.mozilla.com
+ local hostname=y-2008-ec2-golden.try.releng.use1.mozilla.com
+ local i dest master
+ read master
+ cd /var/lib/puppetmaster/ssl/git/agent-certs
+ ls -1
+ '[' -f /var/lib/puppetmaster/ssl/git/agent-certs/releng-puppet1.srv.releng.scl3.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt ']'
+ read master
+ '[' -f /var/lib/puppetmaster/ssl/git/agent-certs/releng-puppet1.srv.releng.use1.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt ']'
+ read master
+ '[' -f /var/lib/puppetmaster/ssl/git/agent-certs/releng-puppet1.srv.releng.usw2.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt ']'
+ read master
+ '[' -f /var/lib/puppetmaster/ssl/git/agent-certs/releng-puppet2.srv.releng.scl3.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt ']'
+ read master
+ run_revocations
+ local need_crl=false
+ local to_revoke
++ mktemp
+ local tempfile=/tmp/tmp.0srHAXjQN2
+ read to_revoke
+ cd /var/lib/puppetmaster/ssl/git/revocation-requests/releng-puppet2.srv.releng.scl3.mozilla.com
+ ls -1
+ rm -f /tmp/tmp.0srHAXjQN2
+ days=6
+ '[' -f /var/lib/puppetmaster/ssl/git/ca-certs/releng-puppet2.srv.releng.scl3.mozilla.com.crl ']'
++ openssl crl -nextupdate -noout -in /var/lib/puppetmaster/ssl/git/ca-certs/releng-puppet2.srv.releng.scl3.mozilla.com.crl
++ sed -e 's/.*=//'
+ exp='Jan 29 17:10:01 2016 GMT'
++ date '--date=Jan 29 17:10:01 2016 GMT' +%s
+ exp=1454087401
++ date +%s
+ now=1453563022
+ [[ 1454087401 < 1454081422 ]]
+ false
++ mktemp /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-XXXXXX.key
+ keyfile=/var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-TX8Cy3.key
+ certfile=/var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
+ make_leaf_cert y-2008-ec2-golden.try.releng.use1.mozilla.com agent /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-TX8Cy3.key /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
+ local leaf_fqdn=y-2008-ec2-golden.try.releng.use1.mozilla.com
+ local keyfile=/var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-TX8Cy3.key
+ local certfile=/var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
+ '[' agent = master ']'
+ extns=master_ca_exts
+ subj=/CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
+ openssl genrsa -out /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-TX8Cy3.key 2048
+ openssl req -subj /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com -new -key /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com-TX8Cy3.key
+ openssl ca -batch -config /var/lib/puppetmaster/ssl/ca/openssl.conf -extensions master_ca_exts -in /dev/stdin -notext -out /var/lib/puppetmaster/ssl/tmp/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
+ rm -f /var/lib/puppetmaster/ssl/ca/lock
+ exit
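
In the trace above, the `revoke_leaf_cert` step appears to test each master's directory under agent-certs for a `<host>.crt` file, and every test comes up empty, so nothing is revoked before the new certificate is requested. A minimal sketch of that lookup follows; it is inferred from the trace only, is not the actual ssl_common.sh code, and `find_agent_certs` is a hypothetical name.

```shell
# Print the path of every agent certificate stored for HOST, one per line.
# Prints nothing when no master directory holds a cert for HOST.
# Usage: find_agent_certs AGENT_CERTS_DIR HOST
find_agent_certs() {
  for fac_master_dir in "$1"/*/; do
    fac_cert="${fac_master_dir}$2.crt"
    [ -f "$fac_cert" ] && printf '%s\n' "$fac_cert"
  done
  return 0
}

# Example (paths taken from the trace, not run here):
#   find_agent_certs /var/lib/puppetmaster/ssl/git/agent-certs \
#       y-2008-ec2-golden.try.releng.use1.mozilla.com
```

Empty output here means the revocation step has no file to act on, even if the CA's own database still lists a valid certificate for that host.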

We're seeing

  https://github.com/mozilla/build-puppet/blob/master/modules/puppetmaster/templates/deployment_getcert.sh.erb

invoke make_leaf_cert,

  https://github.com/mozilla/build-puppet/blob/master/modules/puppetmaster/templates/ssl_common.sh.erb#L57

and failing in openssl req | openssl ca.  Reproducing the latter:

[root@releng-puppet2.srv.releng.scl3.mozilla.com git]# openssl ca -batch -config /var/lib/puppetmaster/ssl/ca/openssl.conf -in /dev/stdin -notext -out /tmp/tmpcert -extensions master_ca_exts <<EOF
Using configuration from /var/lib/puppetmaster/ssl/ca/openssl.conf
-----BEGIN CERTIFICATE REQUEST-----
MIICfTCCAWUCAQAwODE2MDQGA1UEAwwteS0yMDA4LWVjMi1nb2xkZW4udHJ5LnJl
bGVuZy51c2UxLm1vemlsbGEuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIB
CgKCAQEA7E898JUs/FULerXL53rSwfYnfvl5C7qBMfzo5K23nYiR6v0qqJ8CzL4+
LZZtnQMsyznEh/RLWCFa5LIik14J44DADKpUrzuCTRAFXuRU+iru83CuF50Q7lsA
Uh7EfM4IhmX/sbeIWMAIvRYNWOZNbCtNjmCIFsDw4xPcFQDZ+lhPPUvucbXihEDz
ZWfDEvCzcxrACrAkDqAzT4fCF4zYKBtfaejpSC9JqjLIY+QxtNAyOPLKNkPMrr7n
KU1NdVnLPBN0wfI6yKtDBxnCvVuxjCLJKhmIFiGMOD/j5AuWKqnKPaH4udpjXyRt
9XS7y74YjMJz1N+3lzv2kHVVGUUE5QIDAQABoAAwDQYJKoZIhvcNAQEFBQADggEB
ADlnQhX9V94GaeM6u0+Ijo8YVRzT833OD8i1VHxxda2+Uuzeto5kJEdq88PsqmNp
x6UzGOk87lDU1YW5Udd2HTR4VrkyK5p0vxYwTWJ1L/nrVYNJVvECUl+iKxYxSeC/
QJ0sGjq/8zOIfWYNGh019gmtKyekb6J1Z7ZPJjeRftazqvXkaD4XVwqm4pUtn/ig
ZEAFQ/1NNSAl5n5bHW7ZU2jjDWj3yfYVCJmX8tAzRYRHIxrbx4e+c3OUK91OcRiQ
ECAj3A2ODQ3CWgoGpll+5KIY9osbiJUrrtw/uNT8AMQrRTW+3FCzNPYpl+e5ePNM
mrkvDsFtHyYcd+4xBJgGY0s=
-----END CERTIFICATE REQUEST-----
EOF
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
commonName            :ASN.1 12:'y-2008-ec2-golden.try.releng.use1.mozilla.com'
Certificate is to be certified until Jan 22 14:52:09 2021 GMT (1825 days)
failed to update database
TXT_DB error number 2

That error sometimes indicates a serial number clash, so checking:

[root@releng-puppet2.srv.releng.scl3.mozilla.com ca]# tail inventory.txt
R       210120094751Z   160123094724Z   2E18    unknown /CN=tst-linux32-ec2-golden.test.releng.use1.mozilla.com
R       210120094828Z   160122095057Z   2E19    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
R       210120095058Z   160123094820Z   2E1A    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
V       210120113004Z           2E1B    unknown /CN=b-2008-ec2-1983.build.releng.use1.mozilla.com
R       210121094724Z   160124094923Z   2E1C    unknown /CN=tst-linux32-ec2-golden.test.releng.use1.mozilla.com
R       210121094820Z   160123095042Z   2E1D    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
R       210121095042Z   160124094700Z   2E1E    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
R       210121145344Z   160123151001Z   2E1F    unknown /CN=releng-puppet2.srv.releng.scl3.mozilla.com
R       210122094700Z   160124094923Z   2E20    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
V       210122094923Z           2E21    unknown /CN=t-w732-ec2-golden.test.releng.use1.mozilla.com
[root@releng-puppet2.srv.releng.scl3.mozilla.com ca]# cat serial
2E22

#2E1F concerns me a little -- why is there a recently-issued client cert for the puppetmaster itself?  Checking the cert in use by the puppetmaster, it's a much older one, but with the same CN:

[root@releng-puppet2.srv.releng.scl3.mozilla.com ca]# openssl x509 -text -in /var/lib/puppet/ssl/certs/releng-puppet2.srv.releng.scl3.mozilla.com.pem  | grep -E 'CN=releng-puppet2.srv.releng.scl3.mozilla.com|Serial'
        Serial Number: 319 (0x13f)
        Subject: CN=releng-puppet2.srv.releng.scl3.mozilla.com

So I have no idea what that is, but probably unrelated.  However,

R       201208160651Z   151210170649Z   2D9D    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201208170649Z   151214181928Z   2D9E    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201212181928Z   151214182258Z   2DA8    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201212182258Z   151214182558Z   2DA9    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201212182558Z   151214182721Z   2DAA    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201212182722Z   151214182853Z   2DAB    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       201212182853Z   160107151001Z   2DAC    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       210105171053Z   160107171121Z   2DEF    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
R       210105171121Z   160107180304Z   2DF0    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com
V       210105203033Z           2DF4    unknown /CN=y-2008-ec2-golden.try.releng.use1.mozilla.com

shows a *lot* of certs generated for that CN, and apparently #2DF4 (issued, it looks like, on 2016-01-07) hasn't been revoked.  However, it's not in the git repository, either:

[root@releng-puppet2.srv.releng.scl3.mozilla.com ca]# find /var/lib/puppetmaster/ssl/git | grep /y-2008-ec2-golden.try.releng.use1.mozilla.com
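
`TXT_DB error number 2` is commonly what `openssl ca` reports when the new entry would clash with one already in its database, which, assuming the CA requires unique subjects, includes a still-valid certificate for the same subject. The inventory check above can be sketched as a small filter that lists the serials of unrevoked entries for a CN. The helper name is hypothetical, and the parsing assumes subjects of the bare `/CN=...` form with no spaces, as in the listings above.

```shell
# Print the serial of every still-valid (status "V") entry for CN.
# Fields are split on whitespace; the serial is the third field from
# the end whether or not a revocation date is present.
# Usage: valid_serials_for_cn INVENTORY_FILE CN
valid_serials_for_cn() {
  awk -v subj="/CN=$2" '$1 == "V" && $NF == subj { print $(NF - 2) }' "$1"
}

# Example (not run here):
#   valid_serials_for_cn inventory.txt y-2008-ec2-golden.try.releng.use1.mozilla.com
```

Any serial printed for a host whose certificate file is missing from agent-certs points at the same mismatch seen here: the CA still considers a cert valid that the revocation step can no longer find.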

Looking in the error log around that time:

[Thu Jan 07 03:32:49 2016] [error] [client 10.134.64.5] Certificate Verification: Error (23): certificate revoked
[Thu Jan 07 03:32:52 2016] [error] [client 10.134.64.5] Certificate Verification: Error (23): certificate revoked
[Thu Jan 07 03:32:52 2016] [error] [client 10.134.64.5] Certificate Verification: Error (23): certificate revoked
[Thu Jan 07 03:32:59 2016] [error] [client 10.134.64.5] Certificate Verification: Error (23): certificate revoked
[Thu Jan 07 03:32:59 2016] [error] [client 10.134.64.5] Certificate Verification: Error (23): certificate revoked

which, now that I look back in the mail history, was around the time we had some git errors from re-generating certificates too quickly.  If I recall, Amy deleted the certificate in question, which was probably the certificate we needed.  Indeed:

commit 03b65525677a14deef39caf68d5a5f1b7626ca89
Author: Automation <puppetsync@releng-puppet1.srv.releng.use1.mozilla.com>
Date:   Thu Jan 7 23:38:40 2016 -0800

    removed  y-2008-ec2-golden.try.releng.use1.mozilla.com

 ...2008-ec2-golden.try.releng.use1.mozilla.com.crt | 22 ----------------------
 1 file changed, 22 deletions(-)

so I've reverted that:

commit 8b390088ed4cd07eecdc633335d74bda3a8e4d8e
Author: Automation <puppetsync@releng-puppet2.srv.releng.scl3.mozilla.com>
Date:   Sun Jan 24 07:17:35 2016 -0800

    Bug 1242146: Revert "removed  y-2008-ec2-golden.try.releng.use1.mozilla.com"
    
    This reverts commit 03b65525677a14deef39caf68d5a5f1b7626ca89.

diff --git a/agent-certs/releng-puppet2.srv.releng.scl3.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt b/agent-certs/releng-puppet2.srv.releng.scl3.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
new file mode 100644
index 0000000..2d6a2d0
--- /dev/null
+++ b/agent-certs/releng-puppet2.srv.releng.scl3.mozilla.com/y-2008-ec2-golden.try.releng.use1.mozilla.com.crt
@@ -0,0 +1,22 @@
+-----BEGIN CERTIFICATE-----
+MIIDsjCCApqgAwIBAgICLfQwDQYJKoZIhvcNAQEFBQAwOzE5MDcGA1UEAwwwQ0Eg
+b24gcmVsZW5nLXB1cHBldDIuc3J2LnJlbGVuZy5zY2wzLm1vemlsbGEuY29tMB4X
+DTE2MDEwNzIwMzAzM1oXDTIxMDEwNTIwMzAzM1owODE2MDQGA1UEAwwteS0yMDA4
+LWVjMi1nb2xkZW4udHJ5LnJlbGVuZy51c2UxLm1vemlsbGEuY29tMIIBIjANBgkq
+hkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAyqVCyvUkJRBoG5VabCH9n5iPej+kKUH1
+M4R2vcbsknzMNssi73x2bW/fYL/8WoeQS9glqIyswVZe5bkadu28aYqgZfvRmu3/
+Drkx841GOJyCM8CY0g90VjQlt2yCh6CklAWkgCVb7C89gpOx2s3yW91fFKx0Ui8U
+b3RvI7FMHYTtMjCUqzB35rLTi4p5M6lC0dZNQR7I/FyY3yfAVM9EbEUW/Qjv9Izv
+LyH1DGswuLgX5I99PnY3Cf8aw1e8ZjncDALU+ilIMyl+4A9Fwb7G1b0T9a1wSrrB
+bGbhXRSLRsylg6Gg6uJMYEmqzkHurVzkAQiPauEuQ0DDnWGnDF+mtQIDAQABo4HC
+MIG/MIGMBgNVHSMEgYQwgYGhfKR6MHgxHDAaBgNVBAMTE1B1cHBldEFnYWluIEJh
+c2UgQ0ExIjAgBgkqhkiG9w0BCQEWE3JlbGVhc2VAbW96aWxsYS5jb20xHDAaBgNV
+BAsTE1JlbGVhc2UgRW5naW5lZXJpbmcxFjAUBgNVBAoTDU1vemlsbGEsIEluYy6C
+AQswDAYDVR0TAQH/BAIwADALBgNVHQ8EBAMCBaAwEwYDVR0lBAwwCgYIKwYBBQUH
+AwIwDQYJKoZIhvcNAQEFBQADggEBAEhYAuRUWlCQdHsLIBapGFI4GUypgi9lPJpU
+BsNKQkLTJut9PvMsg1VfVD/oyB9XObKwJdgTXm0Nmk5vo/d6BxqNqKdVtkOacP0Q
+J6RdS/OyGJov8IKxPUMgYs8HvcxwSSSZtfKDuCEVOCOdvat0HJD+DK4vgPurf8ht
+yxmzl82xzuRe1I+7xF0IpCGll9xZ2wSGZ71mNmuXYsbFr/NEzYC8RI+rHYmOIuR9
+aQT/UTCcCrTnMBUksAAHIirJuYW4XP8ZprQhhIHZHl7sDrHJuTD/t+5htETr04lB
+oWV2ieVoFyk4FCjetwiKD2/C/4dnly4qUKRmC4/UwBmFZnAngZw=
+-----END CERTIFICATE-----

and I expect things should work fine now.
Dustin's fixes have sorted this!
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED