Closed Bug 967452 Opened 8 years ago Closed 8 years ago

nightly builds failing because of switch to ftp-ssl

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
All
task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cbook, Unassigned)

Attachments

(1 file)

This seems to affect nightly builds, for example:

command: START
command: wget -O previous.apk https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-android-armv6/gecko-unsigned-unaligned.apk
command: cwd: /builds/slave/m-cen-and-a6-ntly-000000000000/build
command: output:
--2014-02-04 04:11:14--  https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-android-armv6/gecko-unsigned-unaligned.apk
Resolving ftp-ssl.mozilla.org... 63.245.215.129
Connecting to ftp-ssl.mozilla.org|63.245.215.129|:443... connected.
ERROR: certificate common name “ftp.mozilla.org” doesn’t match requested host name “ftp-ssl.mozilla.org”.
To connect to ftp-ssl.mozilla.org insecurely, use ‘--no-check-certificate’.
command: ERROR
Traceback (most recent call last):
  File "/builds/slave/m-cen-and-a6-ntly-000000000000/tools/scripts/android/../../lib/python/util/commands.py", line 47, in run_cmd
    return subprocess.check_call(cmd, **kwargs)
  File "/usr/lib64/python2.6/subprocess.py", line 502, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['wget', '-O', 'previous.apk', 'https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-android-armv6/gecko-unsigned-unaligned.apk']' returned non-zero exit status 5
command: END (0.51s elapsed)

https://tbpl.mozilla.org/php/getParsedLog.php?id=34054624&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=34054194&tree=Mozilla-Central
https://tbpl.mozilla.org/php/getParsedLog.php?id=34051087&tree=Mozilla-Aurora
dustin@cerf ~ $ openssl s_client -connect ftp-ssl.mozilla.org:443 | openssl x509 -text -noout
depth=1 C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA
verify error:num=20:unable to get local issuer certificate
verify return:0
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            0c:89:b8:4c:a0:61:12:a0:b4:eb:ec:3b:54:32:b4:4b
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=US, O=DigiCert Inc, CN=DigiCert SHA2 Secure Server CA
        Validity
            Not Before: Jan 28 00:00:00 2014 GMT
            Not After : Sep 28 12:00:00 2016 GMT
        Subject: C=US, ST=CA, L=Mountain View, O=Mozilla Foundation, CN=ftp.mozilla.org
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:cf:53:5d:ae:c9:9e:05:b5:cb:c7:70:f8:28:36:
                    ed:e5:59:7f:49:e1:6a:26:ef:f6:2e:18:c2:db:71:
                    4f:06:be:1e:9d:1e:5c:99:69:e9:cf:a2:7d:8d:c2:
                    cd:5d:32:c8:70:34:76:fe:e9:86:e6:8b:22:8c:17:
                    fd:5e:86:36:63:b7:49:8f:9a:d9:1f:e0:d4:ae:6b:
                    4e:53:bf:76:0b:da:22:60:5a:52:fe:86:76:8f:8d:
                    80:c6:1f:61:49:d3:1e:3e:9e:3d:3b:92:96:f0:70:
                    ea:2c:f4:4d:23:35:d2:2f:d1:cd:91:d0:e4:7d:ed:
                    65:19:59:9f:62:eb:25:df:38:5e:98:0b:90:c3:b8:
                    1f:b3:5b:4a:90:60:32:bd:22:c6:e9:6f:8a:6d:b9:
                    3c:18:cd:04:3a:bf:dd:d5:6d:62:cb:d4:f6:a3:6b:
                    0a:73:02:a5:eb:e1:46:6d:e3:d8:c9:f3:64:cd:86:
                    8a:53:5d:ee:63:bd:e2:52:6b:ae:eb:13:70:c0:96:
                    88:07:c4:0e:e3:da:38:05:fb:43:be:46:e4:82:7f:
                    d3:00:72:66:70:e8:dc:1d:cc:92:50:8f:ce:1b:84:
                    55:a7:1d:76:62:7b:0d:3c:bc:c8:49:f3:d0:97:21:
                    26:d8:2d:22:93:96:16:e7:73:34:26:1c:95:e1:f6:
                    a7:7b
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Authority Key Identifier:
                keyid:0F:80:61:1C:82:31:61:D5:2F:28:E7:8D:46:38:B4:2C:E1:C6:D9:E2

            X509v3 Subject Key Identifier:
                CC:E7:93:A5:48:A7:42:33:01:43:F7:41:FC:70:6C:E4:DE:65:B9:EE
            X509v3 Subject Alternative Name:
                DNS:ftp.mozilla.org, DNS:ftp-ssl.mozilla.org
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 CRL Distribution Points:

                Full Name:
                  URI:http://crl3.digicert.com/ssca-sha2-g1.crl

                Full Name:
                  URI:http://crl4.digicert.com/ssca-sha2-g1.crl

            X509v3 Certificate Policies:
                Policy: 2.16.840.1.114412.1.1
                  CPS: http://www.digicert.com/ssl-cps-repository.htm
                  User Notice:
                    Explicit Text:

            Authority Information Access:
                OCSP - URI:http://ocsp.digicert.com
                CA Issuers - URI:http://cacerts.digicert.com/DigiCertSHA2SecureServerCA.crt

            X509v3 Basic Constraints: critical
                CA:FALSE
    Signature Algorithm: sha256WithRSAEncryption
         d1:d0:a5:77:0a:73:9f:62:2f:01:83:4f:ee:6d:b9:11:fd:dc:
         3c:dd:79:94:d4:b2:e6:85:a8:8b:e3:aa:10:5c:ed:ae:df:25:
         17:81:7d:df:bb:86:0f:a4:c0:d9:13:25:ee:0e:88:6e:fe:3c:
         1c:11:e9:15:f9:08:1e:2f:d8:4d:29:3a:7b:f6:f2:d2:3a:6a:
         8c:b1:6b:44:6e:8a:2e:79:59:63:f8:ed:4e:55:09:22:79:3c:
         e8:34:ed:62:16:17:ae:b8:6f:7a:26:cc:ed:4e:bf:cf:81:e2:
         16:5d:b0:72:2f:9e:03:e0:0e:be:b1:45:47:95:61:fe:07:0f:
         14:1f:ce:f8:e7:d0:97:ab:ac:3e:9e:9c:e5:5e:0d:f2:30:5a:
         09:e1:f3:74:09:41:ff:2a:c0:8e:6f:68:31:17:b0:ac:02:32:
         e7:df:db:85:c0:c2:14:cc:66:8d:81:0b:f4:32:e4:92:b9:53:
         aa:9b:9c:13:c3:70:2c:6e:62:40:82:85:d7:52:4c:8a:47:cb:
         01:ce:97:22:d2:31:2c:08:0f:74:b6:b9:06:14:94:d4:43:e7:
         34:83:08:0f:56:0a:41:94:75:e1:a7:4c:9d:93:95:a0:7f:47:
         a9:65:c8:4b:18:4b:bb:27:3d:a7:7c:a3:12:7e:27:01:62:01:
         0e:83:c6:7c
The alternative name is there in the "X509v3 Subject Alternative Name" extension.  I wonder if that wget is too old.
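
To make the suspected difference concrete, here is a minimal Python sketch of the two matching strategies. It is only an illustration: the function names and the dict layout are made up, it is not wget's actual code, and it ignores wildcard certificates. The certificate values are taken from the dump above.

```python
# Hypothetical model of the certificate shown above (values from the openssl dump).
CERT = {
    "common_name": "ftp.mozilla.org",
    "subject_alt_names": ["ftp.mozilla.org", "ftp-ssl.mozilla.org"],
}


def cn_only_match(hostname, cert):
    """Mimic a client that compares the host name with the subject CN only."""
    return hostname.lower() == cert["common_name"].lower()


def san_aware_match(hostname, cert):
    """Check the SAN DNS entries, falling back to the CN only if no SAN exists."""
    names = cert.get("subject_alt_names") or [cert["common_name"]]
    return any(hostname.lower() == name.lower() for name in names)


if __name__ == "__main__":
    hosts = (
        "ftp.mozilla.org",
        "ftp-ssl.mozilla.org",
        "ftp-ssl-zlb.vips.scl3.mozilla.com",
    )
    for host in hosts:
        print(host, cn_only_match(host, CERT), san_aware_match(host, CERT))
```

Under this model, a CN-only client rejects ftp-ssl.mozilla.org while a SAN-aware client accepts it, which would be consistent with the wget error in the build log.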

[root@bld-linux64-ec2-090.build.releng.use1.mozilla.com ~]# wget --version
GNU Wget 1.12 built on linux-gnu.
...
[root@bld-linux64-ec2-090.build.releng.use1.mozilla.com ~]# wget -O /dev/null https://ftp-ssl.mozilla.org
--2014-02-04 05:10:46--  https://ftp-ssl.mozilla.org/
Resolving ftp-ssl.mozilla.org... 63.245.215.129
Connecting to ftp-ssl.mozilla.org|63.245.215.129|:443... connected.
ERROR: certificate common name “ftp.mozilla.org” doesn’t match requested host name “ftp-ssl.mozilla.org”.
To connect to ftp-ssl.mozilla.org insecurely, use ‘--no-check-certificate’.

Yet

dustin@euclid ~ $ wget --version
GNU Wget 1.14 built on linux-gnu.
...
dustin@euclid ~ $ wget -O /dev/null https://ftp-ssl.mozilla.org
--2014-02-04 08:11:15--  https://ftp-ssl.mozilla.org/
Resolving ftp-ssl.mozilla.org... 63.245.215.129
Connecting to ftp-ssl.mozilla.org|63.245.215.129|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 561 [text/html]
Saving to: ‘/dev/null’

100%[====================================================================================================================================================================================================>] 561         --.-K/s   in 0s

2014-02-04 08:11:16 (37.9 MB/s) - ‘/dev/null’ saved [561/561]
So, I think on these systems at least, we'll need to upgrade wget.  Unfortunately, even CentOS-6.5 doesn't have a newer version.
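
For finding hosts that still need the upgrade, a small Python sketch along these lines could parse the `wget --version` banner. The 1.14 threshold is an assumption: it is simply the oldest version seen working in this bug, and the real cutoff may be lower. The function names are hypothetical.

```python
import re

# Assumption: 1.14 is the oldest version observed working in this bug;
# the actual version that introduced the fix may be earlier.
MIN_WORKING = (1, 14)


def parse_wget_version(banner):
    """Extract (major, minor) from the first line of `wget --version` output."""
    match = re.search(r"GNU Wget (\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("unrecognized wget banner: %r" % banner)
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(banner, minimum=MIN_WORKING):
    """Return True if the reported version is older than the assumed minimum."""
    return parse_wget_version(banner) < minimum


if __name__ == "__main__":
    for banner in (
        "GNU Wget 1.12 built on linux-gnu.",
        "GNU Wget 1.14 built on linux-gnu.",
    ):
        print(banner, "->", "upgrade" if needs_upgrade(banner) else "ok")
```

The banner text would come from running `wget --version` on each host; that step is left out so the sketch stays free of side effects.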
I built 1.15.  It's based on GnuTLS, which I'm sure brings its own set of fun bugs.

This is filtering out to the other puppetmasters now, and the current puppet manifests specify ensure => latest, so as RPM repos update, this will deploy.  So we'll see this on some systems soon.

---

[root@rpmpackager1 /]# wget --version
GNU Wget 1.15 built on linux-gnu.

+digest +https +ipv6 -iri +large-file +nls -ntlm +opie +ssl/gnutls

Wgetrc:
    /etc/wgetrc (system)
Locale:
    /usr/share/locale
Compile:
    gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
    -DLOCALEDIR="/usr/share/locale" -I. -I../lib -I../lib -O2 -g -pipe
    -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
    --param=ssp-buffer-size=4 -m64 -mtune=generic
Link:
    gcc -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
    -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
    /usr/lib64/libgnutls.so -lz -lrt ftp-opie.o gnutls.o
    ../lib/libgnu.a

Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://www.gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Originally written by Hrvoje Niksic <hniksic@xemacs.org>.
Please send bug reports and questions to <bug-wget@gnu.org>.
<mock-chroot>[root@rpmpackager1 /]# wget https://ftp-ssl.mozilla.org
--2014-02-04 08:32:36--  https://ftp-ssl.mozilla.org/
ERROR: Failed to open cert /etc/ssl/certs/make-dummy-cert: (-34).
ERROR: Failed to open cert /etc/ssl/certs/Makefile: (-34).
ERROR: Failed to open cert /etc/ssl/certs/ca-bundle.trust.crt: (-34).
Resolving ftp-ssl.mozilla.org... 63.245.215.129
Connecting to ftp-ssl.mozilla.org|63.245.215.129|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 561 [text/html]
Saving to: ‘index.html.3’

100%[====================================================================================================================================================================================================>] 561         --.-K/s   in 0s

2014-02-04 08:32:37 (10.7 MB/s) - ‘index.html.3’ saved [561/561]

<mock-chroot>[root@rpmpackager1 /]# wget https://ftp-ssl-zlb.vips.scl3.mozilla.com
--2014-02-04 08:32:39--  https://ftp-ssl-zlb.vips.scl3.mozilla.com/
ERROR: Failed to open cert /etc/ssl/certs/make-dummy-cert: (-34).
ERROR: Failed to open cert /etc/ssl/certs/Makefile: (-34).
ERROR: Failed to open cert /etc/ssl/certs/ca-bundle.trust.crt: (-34).
Resolving ftp-ssl-zlb.vips.scl3.mozilla.com... 63.245.215.129
Connecting to ftp-ssl-zlb.vips.scl3.mozilla.com|63.245.215.129|:443... connected.
The certificate's owner does not match hostname ‘ftp-ssl-zlb.vips.scl3.mozilla.com’
<mock-chroot>[root@rpmpackager1 /]#
Assignee: nobody → dustin
Attached patch bug967452.patch
Attachment #8370126 - Flags: review?(bhearsum)
I suspect this and bug 967472 are the same thing?
(In reply to Axel Hecht [:Pike] from comment #6)
> I suspect this and bug 967472 are the same thing?

Yes.
Attachment #8370126 - Flags: review?(bhearsum) → review+
Duplicate of this bug: 967472
Foopies need this fix:

slave: tegra-136
--2014-02-04 13:10:37--  https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-aurora-android-armv6/1391542405/fennec-29.0a2.en-US.android-arm-armv6.apk
Resolving ftp-ssl.mozilla.org... 63.245.215.129
Connecting to ftp-ssl.mozilla.org|63.245.215.129|:443... connected.
ERROR: certificate common name `ftp.mozilla.org' doesn't match requested host name `ftp-ssl.mozilla.org'.
To connect to ftp-ssl.mozilla.org insecurely, use `--no-check-certificate'.
program finished with exit code 5

Not sure if I should back out post_upload or if we can push this to foopies.
Also happening on OSX jetpack tests.  I have already backed out post_upload.py but it's taking its sweet time updating.
Seeing a similar situation on Windows:
https://tbpl.mozilla.org/php/getParsedLog.php?id=34081821&tree=Mozilla-Central

As Aki said, waiting for staging.m.o to update.
Chris, it looks like we don't have a lot of HTTP clients that can handle the X.509 Subject Alternative Name extension.  Windows will be particularly nasty to upgrade.

Could we just put a different cert on ftp-ssl.mozilla.org, rather than piggybacking it with ftp.mozilla.org's cert?
Assignee: dustin → nobody
I'd prefer not to do that. It would require us to manage two virtual servers and pools for the ftp cluster. Currently the VIP for both ftp and ftp-ssl is bound to the same virtual server (where SSL is terminated).

Keep in mind that the way this is configured, using the X.509 Subject Alternative Name extension, is the standard way we serve groups of services out of single VIPs in other places, like the generic cluster. It's a standard and pretty common practice.
The two virtual servers could share the same pool, right?  So it's really only adding another virtual server with a different cert configured?

The alternative involves building and deploying wget for several platforms, and wget may not be the only weak tool in use here.  At any rate, it's going to take quite a lot of time to get all that up and working, during which the ftp-ssl.mozilla.org vhost can't be used for most purposes.
When we first hit this, I opened ftp-ssl in Firefox and looked at what the security dialog said.

It's not obvious from looking at it in Firefox that this is an OK cert, or why it would be. Once you click through technical details, you end up with some text saying

Not Critical
DNS Name: ftp.mozilla.org
DNS Name: ftp-ssl.mozilla.org


In particular for the place where people download our software from, I wish the certificate was more obvious, and perhaps even had bells and whistles like EV.
That's really a separate concern - ftp-ssl.mozilla.org isn't for use by end-users and really *is* an alternative name for ftp.mozilla.org.  If there are weaknesses in or problems with the certificates for ftp.mozilla.org, please file another bug.

The only reason I brought up the cert is that some of the software used in automation (wget, at least) isn't compatible with alternative names.
It seems that Linux nightlies are up today, but both Windows and Mac failed on certificate errors again.
I got poked about this bug in the Firefox Coordination Meeting, however, I don't see this happening anymore:
https://tbpl.mozilla.org/?jobname=nightly
https://tbpl.mozilla.org/?tree=Mozilla-Aurora&jobname=nightly

Where are people seeing these issues?
Is this bug closed?
I'm fixing Android nightlies in bug 968232. I'm not sure if that fully encompasses this bug or not.
We haven't had Android nightly updates since Monday. I have been told it is because of this. Is there an ETA for a fix?
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #22)
> We haven't had Android nightly updates since Monday. I have been told it is
> because of this. Is there an ETA for a fix?

I just triggered a new set of Android nightlies, tracked in bug 968232. ETA: 90min.
(In reply to Francesco Lodolo [:flod] from comment #20)
> I'm still not seeing nightly builds for l10n and certificate errors
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014-02-05-03-02-03-
> mozilla-central-l10n/mozilla-central-win32-l10n-nightly-it-bm84-build1-
> build993.txt.gz

This still needs addressing - I haven't seen anyone doing anything about it yet...
Updating summary to reflect that other platforms are affected too.

(In reply to Ben Hearsum [:bhearsum] from comment #24)
> (In reply to Francesco Lodolo [:flod] from comment #20)
> > I'm still not seeing nightly builds for l10n and certificate errors
> > http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014-02-05-03-02-03-
> > mozilla-central-l10n/mozilla-central-win32-l10n-nightly-it-bm84-build1-
> > build993.txt.gz
> 
> This still needs addressing - I haven't seen anyone doing anything about it
> yet...

Aki is working up a patch for this.
Summary: Android Nightly Fail: ERROR: certificate common name “ftp.mozilla.org” doesn’t match requested host name “ftp-ssl.mozilla.org”. → nightly builds failing because of switch to ftp-ssl
(In reply to Ben Hearsum [:bhearsum] from comment #25)
> Updating summary to reflect that other platforms are affected too.
> 
> (In reply to Ben Hearsum [:bhearsum] from comment #24)
> > (In reply to Francesco Lodolo [:flod] from comment #20)
> > > I'm still not seeing nightly builds for l10n and certificate errors
> > > http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2014-02-05-03-02-03-
> > > mozilla-central-l10n/mozilla-central-win32-l10n-nightly-it-bm84-build1-
> > > build993.txt.gz
> > 
> > This still needs addressing - I haven't seen anyone doing anything about it
> > yet...
> 
> Aki is working up a patch for this.

Which ended up at https://bugzilla.mozilla.org/attachment.cgi?id=8370971&action=edit.

We're already halfway to the next set of scheduled nightlies, so we're just going to wait for those to pick this up. Triggering an extra set now would have minimal benefit, and hurt our capacity during peak time.
(In reply to Ben Hearsum [:bhearsum] from comment #23)
> (In reply to Brad Lassey [:blassey] (use needinfo?) from comment #22)
> > We haven't had Android nightly updates since Monday. I have been told it is
> > because of this. Is there an ETA for a fix?
> 
> I just triggered a new set of Android nightlies, tracked in bug 968232. ETA:
> 90min.

They're done, and working. I'm on 2014-02-05 now.
Windows and Mac nightlies should be fixed by the patch in bug 960571. Mac is already confirmed to be working (https://bugzilla.mozilla.org/show_bug.cgi?id=960571#c77) and we're just waiting for the Windows nightly to finish, and then the subsequent repacks will run (and should succeed).
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Do we know why the new wget gives three errors on every HTTPS download? For example:

$ wget https://www.google.com
--2014-03-11 01:21:44--  https://www.google.com/
ERROR: Failed to open cert /etc/ssl/certs/Makefile: (-34).
ERROR: Failed to open cert /etc/ssl/certs/ca-bundle.trust.crt: (-34).
ERROR: Failed to open cert /etc/ssl/certs/make-dummy-cert: (-34).
Resolving www.google.com... 74.125.239.51, 74.125.239.48, 74.125.239.49, ...
Connecting to www.google.com|74.125.239.51|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

Is this some issue with GnuTLS?
That seems innocuous enough.  It's trying to open all of the files in that dir as certificates, and judging by the filenames those aren't certificates.  So yes, I think it's a difference in how GnuTLS and OpenSSL handle that directory.  I assume the wget succeeds?
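
A rough Python sketch of that guess, for anyone who wants to see which files in a directory would trip this up. It is not GnuTLS's actual loading logic: it only checks for a PEM certificate marker, and the file contents in the demo are invented placeholders.

```python
import os
import tempfile

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def split_cert_dir(path):
    """Return (loadable, skipped) file names, judged only by the PEM marker."""
    loadable, skipped = [], []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue
        with open(full, "rb") as handle:
            data = handle.read()
        if PEM_MARKER in data:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


if __name__ == "__main__":
    # Build a throwaway directory with invented contents for the demo.
    with tempfile.TemporaryDirectory() as cert_dir:
        samples = {
            "ca-bundle.crt": PEM_MARKER + b"\n(placeholder)\n",
            "Makefile": b"all:\n\techo not a certificate\n",
            "make-dummy-cert": b"#!/bin/sh\n",
        }
        for name, body in samples.items():
            with open(os.path.join(cert_dir, name), "wb") as handle:
                handle.write(body)
        loadable, skipped = split_cert_dir(cert_dir)
        print("loadable:", loadable)
        print("skipped:", skipped)
```

Pointing `split_cert_dir` at /etc/ssl/certs on an affected host should show whether the files named in the errors are the ones without a certificate marker.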
It does, just pollutes the logs with spurious error messages as it goes.
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard