Closed Bug 1485864 (CVE-2018-12404) Opened 6 years ago Closed 5 years ago

Vulnerability disclosure Bleichenbacher attack

Categories

(NSS :: Libraries, defect)

3.37
defect
Not set
normal

Tracking

(firefox-esr6064+ fixed, firefox63 wontfix, firefox64+ fixed, firefox65+ fixed)

RESOLVED FIXED
Tracking Status
firefox-esr60 64+ fixed
firefox63 --- wontfix
firefox64 + fixed
firefox65 + fixed

People

(Reporter: eyal.ronen, Assigned: franziskus)

References

Details

(Keywords: csectype-other, sec-moderate, Whiteboard: [adv-main64-][adv-esr60.4-]Embargoed until 30 November 2018 15:00 GMT)

Attachments

(4 files)

***** IMPORTANT: This information is under 120 days embargo (exact time will be coordinated). This MUST NOT be talked about before that time, except on a need-to-know basis.

Dear NSS team,

Thanks for your quick response.

The vulnerability that we found is to a cache side channel based variant of the Bleichenbacher attack.
We are currently in the process of writing the paper and implementing PoCs for some of the vulnerable TLS implementations.

We decided to rush the disclosure process because this type of attack is now getting a lot of attention due to the recent “Pseudo Constant Time Implementations of TLS Are Only Pseudo Secure” paper by Ronen et al. that was just  published (https://eprint.iacr.org/2018/747). We though it will be prudent to give you a heads up as soon as possible.

We have looked at your rsa decryption code (confirmed in the latest commit 0c3fab7e731af1ced04c63c194cea0660d4a1ae2). Although you have implemented countermeasures against the Bleichenbacher attack in your code, we think it is still vulnerable to cache attacks.

We have looked at the function ssl3_HandleRSAClientKeyExchange at ssl3con.c.
This function have several non constant time operations. For example the random number buffer is only accessed after the RSA decryption if the padding fails. Another example is the conditional call to the function SSL3_SendAlert. We can detect those conditional operations using a cache attack to create a Bleichenbacher padding oracle.

We have implemented similar PoC on other TLS implementations.
We strongly recommend to switch to a full constant time constant memory access solution such as the one in the latest version of OpenSSL.

We will be happy to answer any questions, and to provide you with more details.
Best regards,
Yuval, Eyal, Adi, David and Daniel
OK, this seems straightforward enough.  We don't read both values and we might leak which of the two values we read.

Both boringssl and openssl have the same fix here, but that doesn't work very well for us.  The padding and version checks are in the PK11 module.  I need to check that those are constant time still, but assuming that they are (or are made to be), then the fix is simple enough.  Rather than reading from an array of pre-master secrets as we do now, we copy a new value that is constructed by reading both:

useFauxPms is 1 or 0 right now, we propagate that bit to all indices:

uint8_t mask = useFauxPms<<1|useFauxPms;
mask = mask<<2|mask;
mask = mask<<4|mask;

for (i = 0; i < SSL3_RANDOM_BYTES; ++i) {
  pms[i] = (fauxPms[i] & mask) | (realPms[i] & ~mask);
}
Hi,

It is indeed required, but I think that there might be some more side channels.
For example, the usage of the free function. It frees different arrays at different times (e.g., if I understand correctly the random buffer is only freed if the padding check succeeds).
Again this can be solved by always allocating and freeing the arrays at the same place and time regardless of the padding. 
It is important to make sure all operations and memory access are constant in regards to the padding of the RSA.
As we disclosed in a very early stage, I didn't have time to go over the entire call tree and find all non constant time / memory code.
There's a limit to how much we're willing to do here.  We don't see RSA decryption used very much now, so at the point that it is easier for us to simply disable it, we will do that instead.
That sounds great. Deprecating  RSA decryption is by far the best possible option in our opinion. If this is your decision, we just ask that you wait with this till we lift the embargo. We will coordinate the exact date when we hear from all of the other vendors.
Hmm, trends are in the right direction, but static RSA rates are still at ~6% (https://mzl.la/2BQjcMO).  That indicates that we can't disable it.  Once we have more concrete information about the conditions that a practical attack might require, we can look at what closing the gap will take beyond what I sketched out in comment 1.
OK, that is a shame.
What makes the attack practical is mostly the conditional memory access. This is handled by your sketched solution. We didn't check all the full code path yet, but any conditional branches along the way will be relatively easy to exploit.
Hi all,

I wanted to check what are your plans regarding the bug and fixes.
We have received some requests to cut down on the embargo time, so I wanted to know when do you think you will be ready?

We are currently working on the write up of the paper.
We are not sure we are going to implement a PoC on the NSS code as the problems are similar to out we found in other vendors, but I think I can map most of the problems we have seen.

A relatively simple attack will be on the high-level function that was mentioned before.

However, as I mentioned before all of the code should be checked. This is what we have:
RSA_DecryptBlock has a conditional memcpy. It is relatively easy to exploit as we have seen in other implementations.
Moreover, the different checks for the padding are not done in constant time but with if branching. We believe this can be exploited to get a much stronger Bleichenbacher requiring much less TLS handshakes. 

The lower level code has non-constant checks that can result in a Mangar type Oracle that is very efficient and requires a small number of handshakes.
The for loop in mp_unsigned_octet_size and the for loop in mp_to_fixlen_octets. This is due to the different behaviors if the MSByte is 0 or not.

Notice that the low-level problems also affect the OAEP RSA implementation.

We believe that if some domain uses multiple servers with the same RSA key, a MiTM downgrade attack (both on TLS 1.2 and TLS 1.3) is possible, and that puts all of the users at risk. 

This leads me to another issue. The logjam paper mentions a method to keep the TLS handshake alive in firefox indefinitely using TLS warning alerts. In our tests, this still seems to be the issue. If this is correct, this makes the downgrade attack much easier for us.

Best regards,
Eyal
Flags: needinfo?(martin.thomson)
Hi,

As we mentioned before, we would like to cut down on the embargo period.

As most vendors are in the final stages of creating the fix, we would like to lift the embargo one month earlier.

The new date for lifting the embargo is on the 22nd of November 2018 15:00 GMT.

We will be happy to help with any remaining issues concerning the patches.

Please confirm the new date.
We would also like to get your permission to add you to a general mail of all the vendors, so it will be easier to do more coordination if necessary.

Best regards,
Eyal
I am sorry,

The date is the 23nd of November 2018 15:00 GMT
So because of PK11 we can't really do this the way we would want to. This patch does a little nasty pointer arithmetic to do a constant time swap of the fauxPms and the realPms.
I think this is better than what there was before but I didn't check yet if it's enough.
Eyal, I'd appreciate if you could have a look at the attached patch (note that you might have to link phabricator to you bugzilla account if you didn't do so yet). It should fix the most serious issues here.
Flags: needinfo?(eyal.ronen)
QA Contact: franziskuskiefer
Hi,

I think that the patch does improve things, but I have two concerns:

1. Currently two different memory locations will be accessed depending on the PKCS validation. I think it is safer to do a constant time copy/ choose from either  the real or fake pms into single target buffer, and then continue the run using this taret buffer. This will also create an overall constant memory access pattern unlike the pointer swap.
2. Even if you think the code in this level, the conditional memcpy in RSA_DecryptBlock is trivial to exploit. I strongly recommend to also address this issue. You can look at what was done in BearSSL, as I think they have a very simple constant time approach. 

Best regards,
Eyal
Flags: needinfo?(eyal.ronen)
Thanks for the feedback.

> I think it is safer to do a constant time copy/ choose from either  the real or fake pms into single target buffer, and then continue the run using this taret buffer.

Unfortunately we're limited by the PK11 API here. If we were to copy the key, we still wouldn't copy the actual key bytes and thus still access a different memory location.

> the conditional memcpy in RSA_DecryptBlock is trivial to exploit

I agree. But that should be addressed in a separate patch.
This should improve protection against timing attacks in RSA_DecryptBlock.
Note that test coverage of this function isn't great and this should probably get a couple tests to make sure that it's correct.
Attached file bleichenbacher.pdf
Hi,

This is the current draft of the paper we are going to publish.
Hi,

I attached a draft of the paper and will appreciate any feedback. 

I would try to review the patches again in the next couple of days.

Do you have any comment about the issue with the TLS handshakes timeouts? I think that really puts Firefox users at risk.
I suggest that you also have a look at the mp_to_fixlen_octets function. It should be a much simpler fix than the ones that you already implemented.
Whiteboard: Embargoed until 23 November 2018 15:00 GMT
Per team, embargo extended a week.
Whiteboard: Embargoed until 23 November 2018 15:00 GMT → Embargoed until 30 November 2018 15:00 GMT
Alias: CVE-2018-12404
This assumes that the mp_int isn't truncated, though that seems to be a common
practice throughout mpi.  If we can tweak those functions that do the trimming,
we might end up in good shape out of this.
I spent a little time on seeing if I could retain leading zeros for mp_int so that other functions could be closer to constant time.

Fixing mp_read_unsigned_octets() was fairly simple; I also changed mp_cmp() to be constant time.

However, I haven't tracked down the resulting issues with mp_div and friends.  Those are spectacularly non-constant-time, so it's possible that s_mp_clamp() inserted at the right points would be a reasonable way to proceed, but a real fix likely hits all the code that uses MP_USED()/USED(), which is a lot.  My biggest concern is the other code, like that in mpmontg.c and assembly, that might rely on the MSD being non-zero.  Our test coverage here is pretty dire, so we would need to invest significant effort to fix anything here.

That's a sad indictment on the state of the code, but this is clearly not code that was built to modern standards from the ground up.  We would do well to invest more in verified implementations.
Flags: needinfo?(martin.thomson)
Just a ping that the embargo on this ends in 2 days.
Minusing for Firefox advisories in discussion with Dan Veditz. This is something for the NSS project to assign a CVE to and put into their release notes.
Whiteboard: Embargoed until 30 November 2018 15:00 GMT → [adv-main64-][adv-esr60.4-]Embargoed until 30 November 2018 15:00 GMT
OK, I'll need some direction on how to do a CVE for this, then. I had previously noted the aliased CVE (CVE-2018-12404) to the researchers and the NSS release notes.
Group: crypto-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.