Closed Bug 792009 Opened 12 years ago Closed 12 years ago

balrog client getting ssl errors

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

Attachments

(3 files)

I'm wondering if this is related to the new IT root certificate. I see this in balrog client output on Linux, Win32, and Win64:

retry: Calling <function run_with_timeout at 0xb7d082cc> with args: (['/tools/python-2.6.5/bin/python', '/builds/slave/m-cen-lnx-ntly/tools/scripts/updates/balrog-client.py', '--build-properties', 'buildprops_balrog.json', '--api-root', 'https://aus4-admin-dev.allizom.org', '--verbose', '--credentials-file', 'BuildSlaves.py'], 1260, None, None, False, True), kwargs: {}, attempt #1
Executing: ['/tools/python-2.6.5/bin/python', '/builds/slave/m-cen-lnx-ntly/tools/scripts/updates/balrog-client.py', '--build-properties', 'buildprops_balrog.json', '--api-root', 'https://aus4-admin-dev.allizom.org', '--verbose', '--credentials-file', 'BuildSlaves.py']
Balrog request to https://aus4-admin-dev.allizom.org/releases/Firefox-mozilla-central-nightly-20120918030553
Data sent: None
Starting new HTTPS connection (1): aus4-admin-dev.allizom.org
Traceback (most recent call last):
  File "/builds/slave/m-cen-lnx-ntly/tools/scripts/updates/balrog-client.py", line 36, in <module>
    runner.run()
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/balrog/client/cli.py", line 80, in run
    buildData=data, copyTo=copyTo)
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/balrog/client/api.py", line 121, in update_build
    url_template_vars=url_template_vars)
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/balrog/client/api.py", line 76, in request
    res = self.do_request(prerequest_url, None, 'HEAD', {})
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/balrog/client/api.py", line 101, in do_request
    verify=self.verify, auth=self.auth)
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/vendor/requests-0.10.8/requests/sessions.py", line 203, in request
    r.send(prefetch=prefetch)
  File "/builds/slave/m-cen-lnx-ntly/tools/lib/python/vendor/requests-0.10.8/requests/models.py", line 557, in send
    raise SSLError(e)
requests.exceptions.SSLError: [Errno 1] _ssl.c:480: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Blocks: balrog
Assignee: nobody → bhearsum
From curl:

* Server certificate:
* subject: serialNumber=QFblspylXort2BviK0LdJuDx7haU0SBy; C=US; ST=California; L=Mountain View; O=Mozilla Corporation; CN=*.allizom.org
* start date: 2011-10-10 22:11:59 GMT
* expire date: 2013-12-11 16:30:26 GMT
* subjectAltName: aus4-admin-dev.allizom.org matched
* issuer: C=US; O=GeoTrust, Inc.; CN=GeoTrust SSL CA
* SSL certificate verify ok.

The cert for aus4-admin-dev.allizom.org is a real/public cert, and not signed by the old (or new) Mozilla root CA cert.

*.allizom.org is signed by GeoTrust SSL CA (which definitely needs the following intermediate, which is included in the config for this service)
GeoTrust SSL CA is signed by GeoTrust Global CA (this is in most modern ca-bundle packages)
GeoTrust Global CA is signed by Equifax... this is also included in the config, and is only rarely necessary (most clients have a ca-bundle containing GeoTrust Global CA directly). In particular certain Android phones sometimes need this.
Thanks Jake. I'm pretty certain this is a client-side problem. Rail showed me how to use the openssl command line client, and it's able to authenticate the server just fine:

openssl s_client -connect aus4-admin-dev.allizom.org:443 -CAfile geo.crt
CONNECTED(00000003)
depth=3 C = US, O = Equifax, OU = Equifax Secure Certificate Authority
verify return:1
depth=2 C = US, O = GeoTrust Inc., CN = GeoTrust Global CA
verify return:1
depth=1 C = US, O = "GeoTrust, Inc.", CN = GeoTrust SSL CA
verify return:1
depth=0 serialNumber = QFblspylXort2BviK0LdJuDx7haU0SBy, C = US, ST = California, L = Mountain View, O = Mozilla Corporation, CN = *.allizom.org
verify return:1
---
Certificate chain
 0 s:/serialNumber=QFblspylXort2BviK0LdJuDx7haU0SBy/C=US/ST=California/L=Mountain View/O=Mozilla Corporation/CN=*.allizom.org
   i:/C=US/O=GeoTrust, Inc./CN=GeoTrust SSL CA
 1 s:/C=US/O=GeoTrust, Inc./CN=GeoTrust SSL CA
   i:/C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
 2 s:/C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
   i:/C=US/O=Equifax/OU=Equifax Secure Certificate Authority
<snip>
SSL-Session:
    Protocol : TLSv1.1
    Cipher : RC4-SHA
    Session-ID: 82EE76692B0692F15ECEF58178507A1BC100D53E7387A9931EB4DC004DB88893
    Session-ID-ctx:
    Master-Key: E27CA84DE01F128DE188B9BAABFA2A2517A10F822524FBC0CB893DA920B583DEA5C5764C78197A9867A8636571B6A38A
    Key-Arg : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1348065167
    Timeout : 300 (sec)
    Verify return code: 0 (ok)
---

But when I give requests.get() the same cert it spits out the traceback from comment #0. I did wireshark captures of both and they're the same up to and including the "Server Hello Done" message. After that, the python client sends "Alert (Level: Fatal, Description: Unknown CA)", whereas s_client starts a Client Key Exchange. This makes me think that there's a bug in the requests library causing this.

Out of curiosity, did anything about the SSL config of aus4-admin-dev change in the past couple of days?
Even while we were hitting bug 754067 our Python client was able to perform the SSL handshake without issue.
Just in case anyone has any ideas, I'm attaching wireshark captures from both the s_client conversation and the requests one.
I think I found the problem. After getting some help from #python-requests, I straced the openssl s_client and found that it was falling back on system certs. Specifically, it was opening Equifax_Secure_CA.pem. When I pointed requests at anything that includes that cert, or even just at that cert itself, it was able to verify the certificate. I think this means that the server isn't sending the cert anymore, and the wireshark captures seem to confirm that. I'm going to fix up our client to look at a full cert bundle rather than just a specific certificate/chain -- I don't want us to break if the next certificate we get has a different chain.
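To make the diagnosis above concrete, here is a toy model of why the same server chain can verify against one trust store and fail against another. It is only a sketch: certificates are reduced to (subject, issuer) name pairs taken from the s_client output, the contents of `geotrust_only` are a hypothetical stand-in for the single-cert file (its real contents are not shown in this bug), and the rule that the verifier follows the server-sent chain to its end before consulting the trust store is an assumption, not a statement about how requests or OpenSSL is implemented.

```python
# Toy model only: real X.509 validation also checks signatures, validity
# dates, and more. Here a certificate is just a (subject, issuer) pair.

# Chain as presented by the server in the s_client output (certs 0, 1, 2).
SERVER_CHAIN = [
    ("*.allizom.org", "GeoTrust SSL CA"),
    ("GeoTrust SSL CA", "GeoTrust Global CA"),
    ("GeoTrust Global CA", "Equifax Secure Certificate Authority"),
]


def chain_verifies(server_chain, trusted_subjects):
    """Return True if the issuer of the last server-sent cert is trusted.

    Assumption: the verifier walks the server-sent chain to its end and only
    then looks for the final issuer in the local trust store.
    """
    # Each cert must be issued by the next cert in the chain.
    for (_, issuer), (next_subject, _) in zip(server_chain, server_chain[1:]):
        if issuer != next_subject:
            return False
    top_issuer = server_chain[-1][1]
    return top_issuer in trusted_subjects


# Hypothetical trust stores, for illustration only.
geotrust_only = {"GeoTrust SSL CA", "GeoTrust Global CA"}
full_bundle = geotrust_only | {"Equifax Secure Certificate Authority"}

print(chain_verifies(SERVER_CHAIN, geotrust_only))  # False
print(chain_verifies(SERVER_CHAIN, full_bundle))    # True
```

Under these assumptions, a full bundle is the more robust choice because it keeps working when the chain's top issuer changes.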
This patch removes the single GeoTrust cert and replaces it with a full cert bundle, freshly converted from mozilla-release. I used curl's mk-ca-bundle.pl tool as follows in my local mozilla-release clone to create it:

cd security/nss/lib/ckfw/builtins
perl ~/tmp/mk-ca-bundle.pl -n

It spat out ca-bundle.crt. It all works now:

(blah)➜ tools PYTHONPATH=lib/python/vendor/requests-0.10.8 python
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> requests.get('https://aus4-admin-dev.allizom.org:443/rules.html', verify='misc/certs/ca-bundle.crt')
<Response [401]>
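A quick sanity check on a generated bundle might look like the sketch below. It only counts PEM certificate blocks in a string, so it can tell a single-cert file from a multi-cert bundle but says nothing about whether the certificates are valid. The helper name `count_pem_certs` and the sample text are hypothetical and not part of the patch.

```python
import re

# Matches one PEM-encoded certificate block, including its delimiters.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_pem_certs(bundle_text):
    """Count PEM certificate blocks in the text of a CA bundle."""
    return len(PEM_BLOCK.findall(bundle_text))


# Placeholder text standing in for a real bundle; the payloads are not
# real certificates.
sample = (
    "Some CA\n"
    "-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----\n"
    "Another CA\n"
    "-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n"
)

print(count_pem_certs(sample))  # 2
```

Against a real file, the text would come from reading the bundle (for example misc/certs/ca-bundle.crt) before passing it to the helper.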
Attachment #662595 - Flags: review?(rail)
Comment on attachment 662595
use a full cert bundle

ALL the certs!
Attachment #662595 - Flags: review?(rail) → review+
Comment on attachment 662595
use a full cert bundle

With this landed, Windows and Linux should be fixed, I _think_.
Attachment #662595 - Flags: checked-in+
I updated the tools checkout on the slave that did the most recent 64-bit Linux nightly and it worked:

[cltbld@linux64-ix-slave03 m-cen-lnx64-ntly]$ python /builds/slave/m-cen-lnx64-ntly/tools/buildfarm/utils/retry.py -s 1 -r 5 -t 1260 /tools/python-2.6.5/bin/python /builds/slave/m-cen-lnx64-ntly/tools/scripts/updates/balrog-client.py --build-properties buildprops_balrog.json --api-root https://aus4-admin-dev.allizom.org --verbose --credentials-file BuildSlaves.py
retry: Calling <function run_with_timeout at 0x2aaaaf7412a8> with args: (['/tools/python-2.6.5/bin/python', '/builds/slave/m-cen-lnx64-ntly/tools/scripts/updates/balrog-client.py', '--build-properties', 'buildprops_balrog.json', '--api-root', 'https://aus4-admin-dev.allizom.org', '--verbose', '--credentials-file', 'BuildSlaves.py'], 1260, None, None, False, True), kwargs: {}, attempt #1
Executing: ['/tools/python-2.6.5/bin/python', '/builds/slave/m-cen-lnx64-ntly/tools/scripts/updates/balrog-client.py', '--build-properties', 'buildprops_balrog.json', '--api-root', 'https://aus4-admin-dev.allizom.org', '--verbose', '--credentials-file', 'BuildSlaves.py']
Balrog request to https://aus4-admin-dev.allizom.org/releases/Firefox-mozilla-central-nightly-20120919030602
Data sent: None
Starting new HTTPS connection (1): aus4-admin-dev.allizom.org
"HEAD /releases/Firefox-mozilla-central-nightly-20120919030602 HTTP/1.1" 200 0
Got CSRF Token: 20120919104515##f9445353add75cc5ec2dcabe90b65657ca03b690
Balrog request to https://aus4-admin-dev.allizom.org/releases/Firefox-mozilla-central-nightly-20120919030602/builds/Linux_x86_64-gcc3/en-US
Data sent: {'product': u'Firefox', 'csrf_token': '20120919104515##f9445353add75cc5ec2dcabe90b65657ca03b690', 'data_version': '433', 'copyTo': '["Firefox-mozilla-central-nightly-latest"]', 'version': u'18.0a1', 'data': '{"buildID": "20120919030602", "appv": "18.0a1", "partial": {"fileUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/09/2012-09-19-03-06-02-mozilla-central/firefox-18.0a1.en-US.linux-x86_64.partial.20120918030553-20120919030602.mar", "hashValue": "29655922e1b06d6bb193a289dabcd7ab88ec0466b31081a35164b1f80c019a39aad78a71d0a991b3b1eaf939e9351ced2648765d4d68c6496959b4e5a0dca8b5", "from": "Firefox-mozilla-central-nightly-20120918030553", "filesize": "5440506"}, "complete": {"fileUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012/09/2012-09-19-03-06-02-mozilla-central/firefox-18.0a1.en-US.linux-x86_64.complete.mar", "hashValue": "e4c9508aacc8b738b1dbdb441aa43e62edc62dac9125d50217cb934ac92533c3814656a29de5943b87a16180b0880b56f74231c467e7dd25acd5fa48aaa38a49", "from": "*", "filesize": "29762257"}, "extv": "18.0a1"}'}
Starting new HTTPS connection (2): aus4-admin-dev.allizom.org
"PUT /releases/Firefox-mozilla-central-nightly-20120919030602/builds/Linux_x86_64-gcc3/en-US HTTP/1.1" 201 25

Jake, I'm not sure if you want to fix the server to include the Equifax cert or not, but the RelEng problem is fixed now.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General