Closed Bug 914065 Opened 11 years ago Closed 9 years ago

Ciphersuites on https://*.mozilla.org differ from best practice being advanced in Firefox

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: hsivonen, Assigned: nmaul)

References

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/261] [change - configuration])

Steps to reproduce:
 1) Compare https://www.ssllabs.com/ssltest/analyze.html?d=www.mozilla.org&s=63.245.215.20 (similar with hacks.mozilla.org, etc.) with https://briansmith.org/browser-ciphersuites-01.html

Actual results:
Three ways the server config differs from what's being advanced as best practice in Firefox:
 1) The server prefers TLS_RSA_* suites over TLS_DHE_RSA_* suites (and overrides the client's order of preference), thereby negotiating non-forward-secret ciphersuites with browsers, including Firefox. Expected Mozilla's https services to be forward secrecy-enabled.
 2) The server prefers RC4 over AES, thereby ending up negotiating RC4 with browsers despite BEAST having been mitigated on the client side and RC4 having known problems.
 3) The server does not have TLS 1.2 enabled, so there is no chance of negotiating a GCM ciphersuite even once browsers start supporting them.

Expected results:
Expected Mozilla to practice the best practice as proposed in https://briansmith.org/browser-ciphersuites-01.html also on its servers.

Additional info:
It looks like www.mozilla.org has been configured with 1024-bit DHE despite the RSA key size being 2048 bits. Probably worth fixing before moving TLS_DHE_RSA_* up in the order of preference.
Assignee: nobody → server-ops-webops
Component: General → Server Operations: Web Operations
Product: www.mozilla.org → mozilla.org
QA Contact: nmaul
Version: unspecified → other
mail.mozilla.com, too.
This has been previously reported in https://bugzilla.mozilla.org/show_bug.cgi?id=896760 . The issue here is that most mozilla sites sit behind a Riverbed Zeus Load Balancer that does not support TLS1.2, ECDH, OCSP stapling and so on...
We are discussing a timeline with them to get that rolled out, but until then our options are very limited: either wait, or replace it with something else. The latter involves a huge amount of re-engineering of our web frontend, and hasn't been started yet.

FYI, OpSec has a recommendation on the SSL configuration of various equipment. This is what all Infra teams should follow: https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=35069456
As far as prioritizing AES above RC4, the problem is that BEAST is mitigated in TLS1.1 and above, but Zeus doesn't support TLS1.1. And there is no way to make the ciphersuite version specific, so that if we put DHE-RSA-AES256-SHA above RC4-SHA, it will be true regardless of the SSL/TLS version negotiated.
For this reason, RC4-SHA is always preferred before AES.

If you think there's a better way of doing this, please do comment on this bug.
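
To illustrate the constraint described above, here is a minimal Python sketch of server-preference negotiation: the server walks its single ordered list and picks the first suite the client also offers, and the negotiated protocol version plays no part in the choice. The function name `negotiate` and the client offer list are hypothetical, made up for illustration; this is not a description of how Zeus implements it internally.

```python
def negotiate(server_order, client_offer):
    """Return the first suite in the server's list that the client offers."""
    offered = set(client_offer)
    for suite in server_order:
        if suite in offered:
            return suite
    return None  # no suite in common: the handshake would fail


# Server list with RC4 first, as described in this comment.
rc4_first = ["RC4-SHA", "AES256-SHA", "DHE-RSA-AES256-SHA"]
# Same suites with DHE/AES first.
dhe_first = ["DHE-RSA-AES256-SHA", "AES256-SHA", "RC4-SHA"]

# Hypothetical client that puts DHE first in its own preference order.
client = ["DHE-RSA-AES256-SHA", "AES256-SHA", "RC4-SHA"]

print(negotiate(rc4_first, client))  # RC4-SHA
print(negotiate(dhe_first, client))  # DHE-RSA-AES256-SHA
```

Because the list is the only input besides the client's offer, whichever suite is placed first wins for every protocol version the client can speak.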
(In reply to Julien Vehent [:ulfr] from comment #2)
> This has been previously reported in
> https://bugzilla.mozilla.org/show_bug.cgi?id=896760 . The issue here is that
> most mozilla sites sit behind a Riverbed Zeus Load Balancer that does not
> support TLS1.2, ECDH, OCSP stapling and so on...

I hope that the next time when Mozilla buys SSL-sensitive equipment, the money goes to an entity whose products are advancing the state of crypto on the Web (though I haven't surveyed the load balancer market and don't know if something better actually exists).

(In reply to Julien Vehent [:ulfr] from comment #3)
> As far as prioritizing AES above RC4, the problem is that BEAST is mitigated
> in TLS1.1 and above, but Zeus doesn't support TLS1.1. 

Point #2 in comment 0 was based on the assumption that BEAST has been mitigated on the client side. Has it not?
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> Point #2 in comment 0 was based on the assumption that BEAST has been
> mitigated on the client side. Has it not?

Looks like Apple hasn't mitigated it yet in Safari:
http://blog.ivanristic.com/2013/09/is-beast-still-a-threat.html
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> I hope that the next time when Mozilla buys SSL-sensitive equipment, the
> money goes to an entity whose products are advancing the state of crypto on
> the Web (though I haven't surveyed the load balancer market and don't know
> if something better actually exists).

Load balancing is a lot more than just SSL... that certainly factors in, but we can't sacrifice everything else to have the optimal SSL stack. We have to choose something that gives us a good mix of security, reliability, operational maintainability, feature set, and performance.

That said, there are other concerns with Stingray. I don't claim that it's perfect except for SSL. :)

The question is simply one of priority. How do we justify the time and financial investment to switch to another platform? For instance, should this be more important than auditing and securing our many Wordpress implementations? OpSec and AppSec indicated to me that no, it is not... WP is a bigger concern to them.

(In reply to Julien Vehent [:ulfr] from comment #3)
> As far as prioritizing AES above RC4, the problem is that BEAST is mitigated
> in TLS1.1 and above, but Zeus doesn't support TLS1.1.

This is incorrect. Zeus/Stingray *does* support TLSv1.1. It does not support TLSv1.2 yet, or any ECDHE / GCM ciphers.

https://www.ssllabs.com/ssltest/analyze.html?d=www.mozilla.org

It doesn't support many ciphers, unfortunately, and Firefox only supports up to TLSv1.0 out of the box right now. But even so, Chrome, IE, Opera, and Safari (in certain versions/platforms) all will negotiate a 1.1 session.

> And there is no way to
> make the ciphersuite version specific, so that if we put DHE-RSA-AES256-SHA
> above RC4-SHA, it will be true regardless of the SSL/TLS version negotiated.
> For this reason, RC4-SHA is always preferred before AES.

Just for the record, this seems to be the case for most platforms, not just Zeus... you generally have to craft a single ciphersuite string that works for all SSLv3/TLSv1/TLSv1.1/TLSv1.2. Apache, nginx, and haproxy's new SSL support are all like this.

(In reply to Henri Sivonen (:hsivonen) from comment #5)
> (In reply to Henri Sivonen (:hsivonen) from comment #4)
> > Point #2 in comment 0 was based on the assumption that BEAST has been
> > mitigated on the client side. Has it not?
> 
> Looks like Apple hasn't mitigated it yet in Safari:
> http://blog.ivanristic.com/2013/09/is-beast-still-a-threat.html

That article (and ssllabs.com) indicate that they think site operators should consider BEAST mitigated well enough client-side that we can stop preferring RC4 now, even though Apple/Safari is still a small risk ("small" because it's believed to be not exploitable).

By contrast, RC4 continues to be attacked, and it's never going to get stronger. BEAST will continue to get weaker. RC4 affects everyone, while BEAST currently only affects a portion of the Internet. So at some point, it starts making a lot of sense to get rid of RC4 and forget about server-side BEAST mitigation.

In light of that, would we like to rethink our cipher ordering? It's trivial to change... there are only 8 ciphers to choose from. :)

The default (and currently-used) ordering is:

    1 SSL_RSA_WITH_RC4_128_SHA
    2 SSL_RSA_WITH_RC4_128_MD5
    3 SSL_RSA_WITH_AES_256_CBC_SHA
    4 SSL_DHE_RSA_WITH_AES_256_CBC_SHA
    5 SSL_RSA_WITH_3DES_EDE_CBC_SHA
    6 SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
    7 SSL_RSA_WITH_AES_128_CBC_SHA
    8 SSL_DHE_RSA_WITH_AES_128_CBC_SHA

Certainly RC4-MD5 should go. We could also move #4 up to the top, which gets us forward secrecy on several of the browsers tested by ssllabs.com (including Chrome, Firefox, Opera, and Safari, but not IE or Java). I don't know if moving any of these would net us increased TLSv1.1 coverage.. I suspect not.

Given that list, I think my ideal ordering would be something like this:

    4 SSL_DHE_RSA_WITH_AES_256_CBC_SHA
    8 SSL_DHE_RSA_WITH_AES_128_CBC_SHA
    6 SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
    3 SSL_RSA_WITH_AES_256_CBC_SHA
    7 SSL_RSA_WITH_AES_128_CBC_SHA
    5 SSL_RSA_WITH_3DES_EDE_CBC_SHA
    1 SSL_RSA_WITH_RC4_128_SHA

I don't know where to put the 3DES_EDE ciphers, but this seems at least reasonable. Note the total absence of RC4-MD5.

We can also adjust the DH key size. The default is 1024 bits, which is kinda small these days... we can easily bump it up to 2048.
After some more reading (https://briansmith.org/browser-ciphersuites-01.html), I think I'll revise my ideal ordering from above. AES-128 is preferred over AES-256, and 3DES is reduced in priority but still preferred over RC4. RC4-MD5 is still gone.

    8 SSL_DHE_RSA_WITH_AES_128_CBC_SHA
    4 SSL_DHE_RSA_WITH_AES_256_CBC_SHA
    7 SSL_RSA_WITH_AES_128_CBC_SHA
    3 SSL_RSA_WITH_AES_256_CBC_SHA
    6 SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
    5 SSL_RSA_WITH_3DES_EDE_CBC_SHA
    1 SSL_RSA_WITH_RC4_128_SHA

Any thoughts on this?
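
For reference, the revised ordering above can be reproduced mechanically from the default list. The Python sketch below is only an illustration of the stated rules (drop RC4-MD5; AES before 3DES before RC4; DHE before plain RSA within each cipher; AES-128 before AES-256). The names `sort_key` and `revised_order` are hypothetical helpers, not anything Zeus provides.

```python
DEFAULT_ORDER = [
    "SSL_RSA_WITH_RC4_128_SHA",
    "SSL_RSA_WITH_RC4_128_MD5",
    "SSL_RSA_WITH_AES_256_CBC_SHA",
    "SSL_DHE_RSA_WITH_AES_256_CBC_SHA",
    "SSL_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_AES_128_CBC_SHA",
]


def sort_key(suite):
    """Rank a suite: AES < 3DES < RC4, then DHE < RSA, then AES-128 < AES-256."""
    if "_AES_" in suite:
        cipher_rank = 0
    elif "_3DES_" in suite:
        cipher_rank = 1
    else:
        cipher_rank = 2  # RC4
    kx_rank = 0 if suite.startswith("SSL_DHE_") else 1
    aes256_rank = 1 if "_AES_256_" in suite else 0
    return (cipher_rank, kx_rank, aes256_rank)


def revised_order(suites):
    """Drop MD5-based suites and sort the rest by the rules above."""
    kept = [s for s in suites if not s.endswith("_MD5")]
    return sorted(kept, key=sort_key)


for suite in revised_order(DEFAULT_ORDER):
    print(suite)
```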
(In reply to Jake Maul [:jakem] from comment #6)
> Load balancing is a lot more than just SSL... that certainly factors in, but
> we can't sacrifice everything else to have the optimal SSL stack.

Sure.

> We have to
> choose something that gives us a good mix of security, reliability,
> operational maintainability, feature set, and performance.

If it indeed turned out that purchasing a load balancer in the future required trading off some security in order to get better other characteristics, that would be a pretty bad sign: it would suggest that practicing what we preach is only available to huge sites that can roll their own load balancing or to small sites that don't need load balancers, and that what we preach isn't actually feasible for big but not huge sites.

Thankfully, since it appears that the load balancer in question here can actually be configured with DHE ciphersuites, the situation isn't actually that bad.

> We can also adjust the DH key size. The default is 1024 bits, which is kinda
> small these days... we can easily bump it up to 2048.

Makes sense.

Note that if Mozilla has Java-based things connecting to Mozilla's own sites over https, those Java things may need to get Bouncy Castle-enabled while waiting for JDK8: http://stackoverflow.com/questions/6851461/java-why-does-ssl-handshake-give-could-not-generate-dh-keypair-exception . But since Python seems to be preferred over Java here, hopefully this is a non-issue. In general, giving worse crypto to browsers in order to accommodate broken Java libs seems like a bad tradeoff.

(In reply to Jake Maul [:jakem] from comment #7)
>     8 SSL_DHE_RSA_WITH_AES_128_CBC_SHA
>     4 SSL_DHE_RSA_WITH_AES_256_CBC_SHA
>     7 SSL_RSA_WITH_AES_128_CBC_SHA
>     3 SSL_RSA_WITH_AES_256_CBC_SHA
>     6 SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
>     5 SSL_RSA_WITH_3DES_EDE_CBC_SHA
>     1 SSL_RSA_WITH_RC4_128_SHA
> 
> Any thoughts on this?

Makes sense to me.
I believe most browsers support TLS1.1. And with more CPUs having support for AES-NI, I do see an incentive to put AES ciphersuites at the top. However, prioritizing DHE ciphersuites will have an impact on CPU usage. Do we have headroom? Can we activate it on a couple of virtual hosts to measure the impact first?

I'm crunching AES speed numbers to decide whether AES-128 should be set before AES-256. I posted the results on dev-tech-crypto and would like to wait for feedback before we make any change to the ZLB. But the gist is: speed isn't the issue, AES-256 is apparently not more, and maybe less, secure than AES-128 due to timing attacks.
https://groups.google.com/d/msg/mozilla.dev.tech.crypto/36na1B2brGU/xUMMPMgkmEMJ
Jake: can we try the following ciphersuite on a smaller set of sites?

SSL_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_AES_256_CBC_SHA
SSL_RSA_WITH_AES_128_CBC_SHA
SSL_RSA_WITH_AES_256_CBC_SHA
SSL_RSA_WITH_RC4_128_SHA
Flags: needinfo?(nmaul)
I've put this into place on the primary PHX1 external Zeus cluster. This affects 1/2 of www.mozilla.org traffic, aus3.mozilla.org, support.mozilla.org, and many others.

The CPU difference for just the cipher suite changes was very minor... about 5% on the most-heavily loaded in the cluster. However, once that was done the next required change was to increase the DH key size from 1024 bits to 2048... I consider this to be a very important change if we are to start using DH ciphers on a regular basis, as we do with this cipher ordering change.

This change (1024->2048 DH key size) caused a *very* sharp increase in CPU usage on the load balancer hosting aus3.mozilla.org. It was not maxxed out, but enough that I felt it was necessary to restructure things a bit. I moved some other VIPs to other load balancers (which needed done anyway- it was already higher than most other LBs). I also added 2 extra VIPs for aus3, to spread the load out across 3 total LBs. This was also something I was already intending to do, so I'm not too concerned about doing it. 2 total LBs would be enough under normal circumstances, but an extra one gives us some more failover cushion.


Let's sit on this for a few days, and if we're still happy with the performance we can work on altering the other frontend LB clusters (SCL3 and HCI).
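
As a rough illustration of why the 1024->2048 change is so much more expensive, the Python sketch below times modular exponentiation at both operand sizes. It is only a sketch under assumptions: it uses a random odd modulus rather than real DH parameters, it uses full-length exponents (the exponent length Zeus actually uses is not known here), and `time_modexp` is a hypothetical helper. Absolute numbers and the exact ratio will vary by machine.

```python
import secrets
import time


def time_modexp(bits, rounds=20):
    """Average seconds for one modular exponentiation with `bits`-bit operands."""
    # Random odd modulus with the top bit set. It is NOT a safe prime; only
    # the operand size matters for this rough timing comparison.
    modulus = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    total = 0.0
    for _ in range(rounds):
        base = secrets.randbits(bits) % modulus
        exponent = secrets.randbits(bits)
        start = time.perf_counter()
        pow(base, exponent, modulus)
        total += time.perf_counter() - start
    return total / rounds


t1024 = time_modexp(1024)
t2048 = time_modexp(2048)
print(f"1024-bit: {t1024 * 1e3:.3f} ms per exponentiation")
print(f"2048-bit: {t2048 * 1e3:.3f} ms per exponentiation")
print(f"ratio: {t2048 / t1024:.1f}x")
```

A DHE handshake needs this kind of exponentiation on the server for every new session, which is consistent with the load increase being most visible on the busiest VIPs.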
Flags: needinfo?(nmaul)
Exciting! Thanks for taking care of it Jake!

$ ./CiphersScan.sh www.mozilla.org:443
prio  ciphersuite         protocol  pfs_keysize
1     DHE-RSA-AES128-SHA  TLSv1.1   DH,2048bits
2     DHE-RSA-AES256-SHA  TLSv1.1   DH,2048bits
3     AES128-SHA          TLSv1.1
4     AES256-SHA          TLSv1.1
5     RC4-SHA             TLSv1.1
6     (NONE)
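
For anyone who wants to post-process this kind of output, here is a small Python sketch that parses the table above and lists the suites that report a forward-secrecy key size. The column layout is taken from the output shown; `parse_scan` is a hypothetical helper and not part of CiphersScan.sh.

```python
SCAN_OUTPUT = """\
prio  ciphersuite         protocol  pfs_keysize
1     DHE-RSA-AES128-SHA  TLSv1.1   DH,2048bits
2     DHE-RSA-AES256-SHA  TLSv1.1   DH,2048bits
3     AES128-SHA          TLSv1.1
4     AES256-SHA          TLSv1.1
5     RC4-SHA             TLSv1.1
6     (NONE)
"""


def parse_scan(text):
    """Parse the whitespace-separated table into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # e.g. the trailing "(NONE)" terminator row
        rows.append({
            "prio": int(fields[0]),
            "ciphersuite": fields[1],
            "protocol": fields[2],
            # The last column is empty for suites without forward secrecy.
            "pfs": fields[3] if len(fields) > 3 else None,
        })
    return rows


rows = parse_scan(SCAN_OUTPUT)
pfs_suites = [r["ciphersuite"] for r in rows if r["pfs"]]
print(pfs_suites)  # ['DHE-RSA-AES128-SHA', 'DHE-RSA-AES256-SHA']
```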
dhe - nice. 

and a compact 3 packet server hello (with cert chain) - also nice! Now firefox has to catch up with you and do TLS1.1 itself. thanks!

would you consider doing OCSP stapling? (the OCSP call can really slow down the effective handshake time).
TLS1.2 is in Nightly. OCSP stapling is an ongoing discussion, but won't happen right away for technical reasons.
(In reply to Julien Vehent [:ulfr] from comment #14)
> TLS1.2 is in Nightly.

preffed off

> OCSP stapling is an ongoing discussion, but won't
> happen right away for technical reasons.

can you elaborate?
> 
> > OCSP stapling is an ongoing discussion, but won't
> > happen right away for technical reasons.
> 
> can you elaborate?

just for a little context - the Firefox median time for a successful OCSP verification is just shy of 400ms - so it's a very important bottleneck.
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> Expected results:
> Expected Mozilla to practice the best practice as proposed in
> https://briansmith.org/browser-ciphersuites-01.html also on its servers.

A quick clarification: The *set* of cipher suites I recommend above, not marked "deprecated", is something that probably should be shared between servers and clients. However, the best ordering of the cipher suites may be different between clients and servers. For example, Firefox mitigates BEAST and Lucky 13 even when SSL 3.0 and TLS 1.0 are used so there is no reason for us to put RC4 above any CBC-based cipher suites. However, because servers have to deal with clients that are vulnerable to such attacks, a server might make different choices.

(In reply to Julien Vehent [:ulfr] from comment #10)
> Jake: can we try the following ciphersuite on a smaller set of sites?
> 
> SSL_DHE_RSA_WITH_AES_128_CBC_SHA
> SSL_DHE_RSA_WITH_AES_256_CBC_SHA
> SSL_RSA_WITH_AES_128_CBC_SHA
> SSL_RSA_WITH_AES_256_CBC_SHA
> SSL_RSA_WITH_RC4_128_SHA

Given that Safari implements TLS 1.1 and the others either implement TLS 1.0 or have BEAST mitigations, I agree with Julien's suggestion here, until we can get ECDHE support.

(In reply to Jake Maul [:jakem] from comment #11)
> This change (1024->2048 DH key size) caused a *very* sharp increase in CPU
> usage on the load balancer hosting aus3.mozilla.org. It was not maxxed out,
> but enough that I felt it was necessary to restructure things a bit. I moved
> some other VIPs to other load balancers (which needed done anyway- it was
> already higher than most other LBs). I also added 2 extra VIPs for aus3, to
> spread the load out across 3 total LBs. This was also something I was
> already intending to do, so I'm not too concerned about doing it. 2 total
> LBs would be enough under normal circumstances, but an extra one gives us
> some more failover cushion.

I don't know if there are any artificial limitations in the product we are using, but you can also choose a key size between 1024 bits and 2048 bits if the performance hit becomes unbearable. Anyway, if you lose control of your RSA private key then it's game over for all previously-encrypted content when you use RSA key exchange. I think loss of RSA private key is more of a risk than the current risk of people breaking even 1024-bit DHE keys, so I would prefer any DHE key 1024 bits or larger over 2048 RSA key exchange.

If we were worried about NSA-level threats regarding the key size, we'd have to worry about NSA-level threats regarding stealing the key. And, honestly, the level of capability to steal the RSA private key almost definitely doesn't require an NSA-level threat.

Also, keep in mind that some of our servers, like aus*.mozilla.org, are used only by Mozilla products and are used in specific ways and have different threats. I would say that (EC)DHE cipher suites should be mandatory for https://mail.mozilla.org and https://bugzilla.mozilla.org because good encryption is so important for them. For aus*, encryption is more of a "nice-to-have" so if using smaller keys for AUS gives you headroom for the most critical services (things that security-group people log in to), I recommend you go that route.
Bit of bad news... due to compatibility issues with New Relic and Service Now, we have had to degrade the DH key size from 2048 back down to 1024.

Not yet sure what the limitation is with NR, but on SN's side, they're using Java 6, which only supports 1024-bit DH key sizes. Larger ones are supposedly going to be supported in Java 8, which is not yet in general release.
Considering the change in comment 18, do we still want to use the cipher ordering in comment 10?
New Relic has informed me that they too are using Java 6 for that particular service, so it is indeed precisely the same issue as ServiceNow had.
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
The Java 6 limitation sucks, and there might be other libraries out there with similar issues.
I still think we should standardize on DHE, even with 1024-bit primes. Because if one DH handshake is broken, which would still take significant resources, it wouldn't impact any other session.

Some notes I added today on DHE: https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=35069456#SSL%26TLSCiphersuites-ForwardSecrecy

Can we deploy the new ciphersuite on the rest of the infrastructure?
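
To make the "one broken handshake doesn't impact any other session" point concrete, here is a toy Python sketch of ephemeral Diffie-Hellman. The group parameters are assumptions chosen only for illustration (2**127 - 1 is a Mersenne prime and far too small for real use, and 3 is just a convenient base), and `ephemeral_session` is a hypothetical helper. Real deployments use standardized groups of 1024 bits or more.

```python
import secrets

# Toy group parameters for illustration only; NOT suitable for real use.
P = 2**127 - 1
G = 3


def ephemeral_session():
    """Run one toy DHE exchange and return (client_secret, server_secret)."""
    a = secrets.randbelow(P - 2) + 1  # client's one-time private value
    b = secrets.randbelow(P - 2) + 1  # server's one-time private value
    A = pow(G, a, P)                  # sent client -> server
    B = pow(G, b, P)                  # sent server -> client
    return pow(B, a, P), pow(A, b, P)


c1, s1 = ephemeral_session()
c2, s2 = ephemeral_session()
print(c1 == s1, c2 == s2)  # both sides derive the same secret in each session
print(c1 != c2)            # each session draws fresh private values
```

The private values `a` and `b` are discarded after each handshake, so recovering one session's secret gives no shortcut to another's. With RSA key exchange, by contrast, every session secret is protected by the same long-term private key.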
(In reply to Julien Vehent [:ulfr] from comment #21)
> The Java 6 limitation sucks, and there might be other libraries out there
> with similar issues.
> I still think we should standardize on DHE, even with 1024-bit primes.
> Because if one DH handshake is broken, which would still take significant
> resources, it wouldn't impact any other session.
> 
> Some notes I added today on DHE:
> https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=35069456#SSL%26TLSCiphersuites-ForwardSecrecy
> 
> Can we deploy the new ciphersuite on the rest of the infrastructure?

I don't think you should limit the key size on mail.mozilla.com, intranet.mozilla.org, bugzilla.mozilla.org, or other places because of the Java limitation. We should limit the small key sizes to the servers that Service Now and New Relic need to access. Right now all the servers I mentioned are using SSL_RSA_WITH_RC4_128_SHA, which is one of the worst choices.
(In reply to Julien Vehent [:ulfr] from comment #21)
> The Java 6 limitation sucks, and there might be other libraries out there
> with similar issues.
> I still think we should standardize on DHE, even with 1024-bit primes.
> Because if one DH handshake is broken, which would still take significant
> resources, it wouldn't impact any other session.
> 
> Some notes I added today on DHE:
> https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=35069456#SSL%26TLSCiphersuites-ForwardSecrecy
> 
> Can we deploy the new ciphersuite on the rest of the infrastructure?

This is completed. All 3 of our public-facing Zeus clusters (PHX1, SCL3, HCI) are using the new settings. With 1024-bit DH keys, the CPU difference between RC4-SHA and the new ordering is negligible.


(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #22)
> I don't think you should limit the key size on mail.mozilla.com,
> intranet.mozilla.org, bugzilla.mozilla.org, or other places because of the
> Java limitation. We should limit the small key sizes to the servers that
> Service Now and New Relic need to access. Right now all the servers I
> mentioned are using SSL_RSA_WITH_RC4_128_SHA, which is one of the worst
> choices.

Sadly this is a per-environment setting, not a per-domain one.

We could potentially do higher key sizes in one of those 3 environments... mail.mozilla.com, for instance, is in HCI. Other things are too though, and I have no good way of knowing in advance whether or not anything in a given environment is being accessed by a problematic client. Some things hosted there are not even HTTPS... there's things like LDAP+SSL too, which seems likely to be problematic in whole new ways.
What about others like login.persona.com? This is a mozilla server, too.
Careful with domain names... login.persona.com is *not* a Mozilla property. :)

login.persona.org, however, is handled by Mark Mayo's Cloud Services team. It's hosted in AWS, and appears to be fronted by Amazon's ELB service. I don't know if they have ELB doing SSL offloading or not... but if so, I do know that it's relatively weak in terms of what it can be configured to do- you can select what ciphers to offer, but can't set an ordering. I don't know what plans his team might have to improve this, or what plans Amazon might have to improve the service itself. If anyone has info on the latter, I'd love to know about it!
login.persona.org is discussed here: https://bugzilla.mozilla.org/show_bug.cgi?id=904077
Whiteboard: [change - configuration]
Jake, can we enable ciphers logging on the VIP that hosts www.mozilla.org ? (and possibly all of them?)

https://intranet.mozilla.org/Services/Ops/ZeusSetup#Enable_request_logging
Flags: needinfo?(nmaul)
We can, but it takes coordination with Annie's BI/DW team to make sure we don't break their log processing.

One problem we have is that our log formats are not well standardized. They're "basically" all the same, but there's quite a few small differences. There's no way to save a template in Zeus, or to make a template the default.



Anurag, can you possibly provide a list of the domain names you care about (w.r.t. load balancer logs)? Unfortunately the way our stuff is currently constructed, we just send you everything and let you sort out what's needed, and we haven't kept a list.

Then for everything you don't care about, we can try to standardize the log format on our side... then coordinate with you to see if we can standardize the things you do care about.

In fact, perhaps it would be possible for us to come up with a single format that works for everything you need and everything we need, and just use that everywhere. Does that seem doable?
Flags: needinfo?(nmaul) → needinfo?(aphadke)
hey jake,
Here's the list of domains that we collect/care about:
"releases.mozilla.com","pfs.mozilla.org", "data.mozilla.com", "marketplace.mozilla.org","addons.mozilla.org","services.addons.mozilla.org","static.addons.mozilla.org","www.mozilla.com","support.mozilla.com", "versioncheck.addons.mozilla.org","download.mozilla.org", "snippets-stats.mozilla.org", "www.mozilla.org", "input.mozilla.org","videos-origin.mozilla.org","videos-cdn.mozilla.net", "ftp.mozilla.org","download-stats.mozilla.org","bugzilla.mozilla.org", "aus2.mozilla.org", "aus3.mozilla.org", "aus4.mozilla.org", "marketplace.firefox.com", "snippets.mozilla.com"

I actually prefer having a standardized log format.

-anurag
Flags: needinfo?(aphadke)
What about metrics.mozilla.com, telemetry-dash.mozilla.org and telemetry.mozilla.org?
Jake & Anurag: what is the next step for this? In order to replace RC4 with 3DES in the ZLB ciphersuite, we need logs. And these logs need to be standardized for metrics.
Should a new bug be created to work on that standardization step?
Flags: needinfo?(nmaul)
:ulfr - Can you provide the location of log files on metrics-logger1 so we can start collecting them in the DWH?

-anurag
Flags: needinfo?(nmaul)
The script that runs on the ZLB says /data/stats/logs. I came to the same conclusion after logging in to metrics-logger and looking around.
(In reply to Anurag Phadke[:aphadke@mozilla.com] from comment #33)
> :ulfr - Can you provide the location of log files on metrics-logger1 so we
> can start collecting them in the DWH?

I'm confused ... the logs go where they've always gone. This hasn't changed in over 9000 years. :)

LOGSERVER='metrics-logger1.private.scl3.mozilla.com'
sudo -u logpull rsync -av --include='*.gz' --exclude='*' --remove-source-files -e 'ssh -o StrictHostKeyChecking=no' $DIR/ logpull@$LOGSERVER:/data/stats/logs/$HOSTNAME/ > "$RESULTLOG" 2>&1

Looks like /data/stats/logs.


(In reply to Julien Vehent [:ulfr] from comment #32)
> Jake & Anurag: what is the next step for this? In order to replace RC4 with
> 3DES in the ZLB ciphersuite, we need logs. And these logs need to be
> standardized for metrics.
> Should a new bug be created to work on that standardization step?

Yes, definitely, please do open a new bug for that... that will be a major effort. I would also recommend trying to find some way to tie it into the 2014 IT goals for best prioritization. If it's not, it's likely to be overridden by those projects.

In fact, this whole project is not on the top-level IT goals list. I recommend that :joes contact Sylvie and see where it is on her radar. That will make it worlds easier for us to justify spending time on this (versus other IT projects that currently have higher visibility).

But... why do we need log data to change the prioritization? Why don't we just add 3DES into the order chain where we want it, and then decide (based on log data at that time) if we still need to keep RC4 at all? For reference, the current order is:
SSL_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_AES_256_CBC_SHA
SSL_RSA_WITH_AES_128_CBC_SHA
SSL_RSA_WITH_AES_256_CBC_SHA
SSL_RSA_WITH_RC4_128_SHA

My point here is simply that standardizing the log data seems likely to be a long process. For faster results, parallelize what you can before then, and only block on the bare minimum that must wait.
(In reply to Jake Maul [:jakem] from comment #35)
> My point here is simply that standardizing the log data seems likely to be a
> long process. For faster results, parallelize what you can before then, and
> only block on the bare minimum that must wait.

The question addressed in this bug - which ciphersuite should be used on *.mozilla.org ? - has mostly been answered. The remaining bit is "can we replace RC4 with 3DES". And to answer that, we need an idea of the volume of connections still using RC4, before replacing it with 3DES which is 30 times slower.

I have no particular interest in standardizing the ZLB log format. I just want to know how many connections still use RC4. If modifying the current log format is too cumbersome, then I will try to obtain this information from our NSM instead.

The NSM approach would also allow us to count the clients that only support RC4 ciphers in their CLIENT HELLO. Something we cannot do with ZLB logs anyway.
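
Whichever source the data comes from, the measurement itself is simple. The Python sketch below counts connections per cipher and reports the RC4 share. The log lines are entirely hypothetical: the real ZLB log format is not shown in this bug, so the sketch assumes the negotiated cipher is the last whitespace-separated field. `cipher_counts` and `rc4_share` are made-up helper names.

```python
from collections import Counter

# HYPOTHETICAL log lines: the real ZLB log format is not shown in this bug.
# Assumed here: the negotiated cipher is the last whitespace-separated field.
SAMPLE_LOG = """\
www.mozilla.org 200 SSL_DHE_RSA_WITH_AES_128_CBC_SHA
www.mozilla.org 200 SSL_RSA_WITH_RC4_128_SHA
www.mozilla.org 304 SSL_DHE_RSA_WITH_AES_128_CBC_SHA
www.mozilla.org 200 SSL_RSA_WITH_AES_128_CBC_SHA
"""


def cipher_counts(log_text):
    """Count connections per negotiated cipher (last field of each line)."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if fields:
            counts[fields[-1]] += 1
    return counts


def rc4_share(counts):
    """Fraction of counted connections that negotiated an RC4 suite."""
    total = sum(counts.values())
    rc4 = sum(n for suite, n in counts.items() if "_RC4_" in suite)
    return rc4 / total if total else 0.0


counts = cipher_counts(SAMPLE_LOG)
print(counts.most_common())
print(f"RC4 share: {rc4_share(counts):.0%}")  # RC4 share: 25%
```

Note that this only sees the suite that was negotiated, not what the client offered, which is the limitation of ZLB logs mentioned above.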

Please keep this bug open until one of these resolutions is decided:
* keep RC4 in the ciphersuite
* replace RC4 with 3DES

I'll open another bug to block this one.
Ah, I had forgotten the performance impact. That does make it worthwhile to gather data first.

It's worth noting that the vast majority of the total traffic encountered on our load balancers is from Firefox. This is not just because we're Mozilla, but because most of the high-traffic properties are Firefox services... crash-stats, FHR, snippets, etc. Something like www.mozilla.org is actually comparatively lightly trafficked. :)

Still, www.mozilla.org is probably one of the best places to gather data. MDN might also be good if you want a second opinion... that's probably one of the biggest non-Mozilla / non-Firefox apps hosted on ZLB's.
I have added the DHE and RSA 3DES ciphers to the list, in both PHX1 and SCL3, globally.

After watching our CPU load and how much traffic (bytes/sec) was being encrypted via RC4 as compared to AES, I judged the increased CPU load from 3DES to be negligible. After enabling, load did increase by a very small amount.

Sadly, this new version of Stingray (from 8.1r1 up to 9.6r1) does not bring with it TLSv1.2 or any new ciphers. Also sadly, adding these 3DES ciphers results in no additional "Forward Secrecy" browsers being reported by ssllabs.com... notably still not supported is virtually every version of IE, on any version of Windows.

For reference, here's the current cipher ordering after adding these in:

SSL_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_AES_256_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
SSL_RSA_WITH_AES_128_CBC_SHA
SSL_RSA_WITH_AES_256_CBC_SHA
SSL_RSA_WITH_3DES_EDE_CBC_SHA
SSL_RSA_WITH_RC4_128_SHA

A cursory examination indicates that AES is used about 10x as often as 3DES (in our PHX1 datacenter), and that RC4 is now barely used at all... very possibly low enough we could remove it altogether, though we should look more closely before doing that.


I do note that our Server-Side TLS guidelines (https://wiki.mozilla.org/Security/Server_Side_TLS#Recommended_Ciphersuite) recommend not using 3DES. My understanding is that this is purely because its performance is not particularly good, and when you have a very wide range of other ciphers there's often no need for including it. In this case we have a very limited cipher selection, so I suspect it's more acceptable than it might otherwise be. And if it does allow us to eliminate RC4, so much the better.
Thanks Jake!

The concern with 3DES was indeed performance. We didn't know initially how much traffic would be impacted.

The Server Side TLS recommendation is due for an update. I will replace RC4 with 3DES in the backward compatible ciphersuite, and create a new ciphersuite that has neither.

Next I would like to identify sites that do not need backward compatibility, and remove SSL3 and 3DES/RC4 altogether from them.
Assignee: server-ops-webops → nmaul
Whiteboard: [change - configuration] → [kanban:https://kanbanize.com/ctrl_board/4/124] [change - configuration]
Guidelines updated. I added a non-backward compatible ciphersuite. 
https://wiki.mozilla.org/Security/Server_Side_TLS#Non-Backward_Compatible_Ciphersuite

Jake: does the ZLB allow us to define two parameter templates that could just be used in vhosts directly, instead of having each vhost redefine them?
Flags: needinfo?(nmaul)
Nope, no templating capabilities. Wish it had some, we could use templates for lots more than just this. :(
Flags: needinfo?(nmaul)
(In reply to Jake Maul [:jakem] from comment #20)
> New Relic has informed me that they too are using Java 6 for that particular
> service, so it is indeed precisely the same issue as ServiceNow had.

I checked with New Relic, and this seems fixed now.
Depends on: 1085135
(In reply to Reed Loden [:reed] from comment #43)
> (In reply to Jake Maul [:jakem] from comment #20)
> > New Relic has informed me that they too are using Java 6 for that particular
> > service, so it is indeed precisely the same issue as ServiceNow had.
> 
> I checked with New Relic, and this seems fixed now.

Thanks for the update. Looking through various logs, I still see a negligible number of connections announce the user-agent 'Java/1.6'.

Regardless, that's only good for bumping the DH parameter to 2048, because we still need SSLv3 for Win XP pre-sp3.
Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/124] [change - configuration] → [kanban:https://webops.kanbanize.com/ctrl_board/2/261] [change - configuration]
Julien, I'm having trouble discerning if there's further work here for us to do - the Zeus seem to have been tuned as far as they can go, and we've left certain sites in a 'low' security mode to work with XP SP2 and so forth - so, my opinion is that we've done as much as we can here *today*, and once the vendor provides us TLS 1.2/ECDH/etc, we'll enable those through software upgrades tracked as part of those releases.
Flags: needinfo?(jvehent)
That's correct, we've done as much as possible given our current capabilities, and all new deployments follow guidelines from https://wiki.mozilla.org/Security/Server_Side_TLS

I'm closing this bug as resolved; we'll continue to monitor and harden ciphersuites as continual improvements.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(jvehent)
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard