Closed Bug 1190248 (CVE-2016-1938) Opened 9 years ago Closed 9 years ago

mp_div and mp_exptmod sometimes produce wrong calculation results

Categories

(NSS :: Libraries, defect, P1)

3.19.2
defect

Tracking

(firefox40 wontfix, firefox41 wontfix, firefox42 wontfix, firefox43 wontfix, firefox44+ fixed, firefox45+ fixed, firefox46+ fixed, firefox-esr3846+ fixed, b2g-v2.0 wontfix, b2g-v2.0M wontfix, b2g-v2.1 wontfix, b2g-v2.1S wontfix, b2g-v2.2 affected, b2g-v2.2r affected, b2g-master affected)

RESOLVED FIXED
Tracking Status
firefox40 --- wontfix
firefox41 --- wontfix
firefox42 --- wontfix
firefox43 --- wontfix
firefox44 + fixed
firefox45 + fixed
firefox46 + fixed
firefox-esr38 46+ fixed
b2g-v2.0 --- wontfix
b2g-v2.0M --- wontfix
b2g-v2.1 --- wontfix
b2g-v2.1S --- wontfix
b2g-v2.2 --- affected
b2g-v2.2r --- affected
b2g-master --- affected

People

(Reporter: hanno, Assigned: wtc)

References

(Blocks 1 open bug)

Details

(Keywords: sec-high, Whiteboard: [post-critsmash-triage][b2g-adv-main2.5?][adv-main44+][adv-esr38.8+] see comment 27 for severity)

Attachments

(5 files, 2 obsolete files)

Sometimes calculations with mp_div produce wrong results. An example is this:
0x10000000000000001000000000000000000000 / 0x10000000000000001001
(both numbers are hex)

This should produce
0xFFFFFFFFFFFFFFFFFF

However mp_div will produce
0xFFFFFFFFFFFFFFFFFE

I have attached a sample c code that will show this (compile libmpi.a and then compile with "gcc -I[path_to_freebl] [input] libmpi.a"). It will do a div calculation and then multiply the result with the divisor, add the modulus and then substract the original dividend. If the calculation would be correct this would always result in zero, no matter what the input values were.

While mp_div is not exposed via the nss api it is used within several crypto functions (rsa, dsa, diffie hellman), so this may have severe security impacts.

Here are some other values that will also result in wrong results:
801C4D3A9DE4691ADDBFC7D2D2DE2DA7114125A77FECE17FDA204B73FA3CDA4FF783AEE4453952BEF81DF7A9114125A700
801C4D3A9DE4691ADDBFC7D2D2DE2DA7114125A7CEF66E9B76BAC1

B7B7B7B7B7B7B7B7B7B7B7939393939393939393939393939393B7B7B7B7B7B70080FFFFC1B7BAB70001B7
B7B7B7B7B7B7B7B7B7B7C1B7BAB7DE00648B

BFC7D2E66986FFFCE1B4D2DEF82CBEF80000FADACE4B73F710AED25E000004001DE9BEF81DE91DE9A9556301F7A9551C
BFC7D2E66986FFFCE1B4D2DEF82CBEF81DE9A9556301F7A9551CBFC7D2E46986FFFCE1B4D2BAC1

I found these issues by comparing the results of openssl's bignum functions with the ones from nss. I then ran afl and fuzzed the inputs.
Blocks: nss-fuzz
> I found these issues by comparing the results of openssl's bignum functions with the ones from nss. I then ran afl and fuzzed the inputs.

Differential testing FTW!
Flags: needinfo?(rlb)
I just check our usage of mp_div. They are:

1) pqg generation. From a Firefox perspective, this is not an issue, but it is an NSS issue we should probably fix.
2) full rsa construction from partial keys. This is also not a firefox issue. NSS can reconstruct a full RSA key from partial components. So if you have a short private key (private exponent, public exponent, and modulus) you can construct the full rsa key with CRT parameters. Also if you have stored a short private key (public exponent, modulus, Encrypted(prime1) - like the in a TPM, you can reconstruct the full rsa key with CRT parameters. Firefox does not use either of these features.

Note; mp_div_2, mp_div_d  and mp_div_2d are completely different functions and work in different ways from mp_div. dh.c uses mp_div_2 not mp_div. Ecc used mp_div_2 and mp_div_d.

We should fix this, but there isn't an issue for firefox.
I had a look at potential exploit vectors, too, and came to the same conclusion as Robert: Looks like there is nothing that will have serious impact for Firefox. (Also found another miscalculation bug, but I think it's independent from this one, will open another bug.)
Mass cc to get some NSS eyes on these bugs.
Trimming non-NSS people from the CC list.
Flags: needinfo?(rlb)
Keywords: sec-low
What architectures does this appear on? Which architectures can we test on to isolate the bug? Does this bug appear if we force the use of pure C with no assembly optimizations? I'm a little unfamiliar with the build system, and I don't know what systems we have access to to hunt for bugs.
I looked at the mpi code for the div and didn't see any assembler optimizations for the div code. Not surprising since non of the core crypto needed mp_div. PQG generation is an offline part of generating DSA keys (Usually PQG is given rather than generated). In the RSA case, only corner cases where only part of the RSA key is stored (outside of NSS) and loaded, so it's a pretty rare case.

The error, in any case, must be in the C code. It is possible there may be a platform dependency based on internal word size. It would be interesting to compare 32 bit results versus 64 bit results.

bob
The bug lies in an incorrect assumption made in the div code about the algorithm. When dividing the examples given, the dividend is 3 words, being divided by a 2 word divisor, and the first word of the divisor equals that of the dividend,and the dividend's first 2 words are greater then the divisor. The code thus guesses a 1 as the initial digit of the quotient because it assumes when the dividend's first word is greater then or equal to the first word of the divisor the initial guess for the quotient digit can be taken to be 1. This is then clobbered on the next iteration. The relevant line is 4239 in mpi.c.
wladd: Unfortunately, Your analysis right isn't quite right, but close.

The first example does not have matching first words:
64 bit case (given in variable names):
 
rem (in) 0x0000000000000000
         0x0000000000100000
	 0x0000000000100000

div (in) 0x0000000000001001
         0x0000000000001000

Normalize will shift the result left by 51 bits, but it will shift both rem and div.

rem (after norm) 0x0000000000000000
                 0x0000000000000000
                 0x0000000000000080
                 0x0000000000000080

div (after norm) 0x0000000000000000
                 0x8008000000000000
                 0x8000000000000000

Thie issue happens in the second round, where most significant digit of part will equal most significant digit of div, but part has been extended because the initial part (both words together) was smaller than div. In that case q_msd guess of 1 is wrong, it should be more like MP_MAX_DIGITS.
Attached patch mpi.c.patch (obsolete) — Splinter Review
Patch that fixes the problem
Attachment #8649037 - Flags: review?(dkeeler)
Thanks wladd for you analysis. While is wasn't exactly right, it did point to where the issue was.

bob
Hi Robert, I believe your patch in attachment #8649037 [details] [diff] [review] is not correct. If I run the fix with the test values I get this for the first example:
div +00000000000000FF0000000000000000
mod +00000000000010000000000000000F010000000000000000
Now the input was
10000000000000001000000000000000000000 div 10000000000000001001

As you can see the modulus is larger than the divisor (I get similar outputs for the other examples). This can't be correct.
Comment on attachment 8649037 [details] [diff] [review]
mpi.c.patch

Review of attachment 8649037 [details] [diff] [review]:
-----------------------------------------------------------------

I guess if this isn't quite fixed it doesn't make sense to do a full review right now. (Also, I'm not very familiar with this code at all - perhaps there's someone else who would be better suited?)

::: lib/freebl/mpi/mpi.c
@@ +4215,5 @@
>      MP_DIGITS(&part) = MP_DIGITS(rem) + unusedRem;
>      MP_ALLOC(&part)  = MP_ALLOC(rem)  - unusedRem;
>      MP_USED(&part)   = MP_USED(div);
> +
> +    /* we have now truncated the part of the remainder to the same length as 

nit: trailing whitespace
Attachment #8649037 - Flags: review?(dkeeler)
Attached patch A version of Bob's mpi.c patch (obsolete) — Splinter Review
I studied the code that Bob's patch changed. Here is a variant
of Bob's patch. (I omitted the comments, which should be added
back.)

Hanno, could you please test this patch? Thanks.
Attachment #8651461 - Flags: review?(rrelyea)
Attachment #8651461 - Flags: feedback?(hanno)
Hi Wan-Teh, patch looks good. On my samples it calculates correct results on all test cases. Also ran a fuzzer again for some time, no errors so far (which of course is no proof that it is correct in all cases).
Comment on attachment 8651461 [details] [diff] [review]
A version of Bob's mpi.c patch

Review of attachment 8651461 [details] [diff] [review]:
-----------------------------------------------------------------

r+ rrelyea.

If you want to take my suggested optimization, that's fine as well. r+ because wtc's code is correct (and in performance critical parts of the crypto library.

bob

::: lib/freebl/mpi/mpi.c
@@ +4246,1 @@
>        q_msd /= div_msd;

I think this case is always 0 or 1. div_msd has the high bit set, so div_msd*2 is larger than a word (which is larger than q_msd). 

This code should work with:

   q_msd = q_msd >= div_msd ? 1 : 0;

If assert below is correct, then it can only be 1.
Attachment #8651461 - Flags: review?(rrelyea) → review+
OK I think it's clear that in this case, q_msd /= div_msd is always 1. First, since q_msd <= MP_DIGIT_MAX, and div_msd is normalized to have the high bit set, 2*div_msd > MP_DIGIT_MAX. Line 4126 makes sure that q_msd >= div_msd if we didn't extend. Therefore we know that div_msd <= q_msd < 2*div_msd. Dividing out you get:
1 <= q_msd/div_msd < 2. Since q_msd then q_msd /= div_msd must be 1.

so your optimization is simply q_msd = 1.

bob
Bob: thanks a lot for the review and analysis of the !partExtended case.
I came to the same conclusion independently (but didn't have time to post
it), so I am happy to see your confirmation.

I incorporated your comments and checked in the patch:
https://hg.mozilla.org/projects/nss/rev/a555bf0fc23a
Attachment #8649037 - Attachment is obsolete: true
Attachment #8651461 - Attachment is obsolete: true
Attachment #8651461 - Flags: feedback?(hanno)
Attachment #8652429 - Flags: review?(rrelyea)
Attachment #8652429 - Flags: checked-in?
Attachment #8652429 - Flags: checked-in? → checked-in+
Comment on attachment 8652429 [details] [diff] [review]
A version of Bob's mpi.c patch, v2

Review of attachment 8652429 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/freebl/mpi/mpi.c
@@ +4213,5 @@
>      MP_DIGITS(&part) = MP_DIGITS(rem) + unusedRem;
>      MP_ALLOC(&part)  = MP_ALLOC(rem)  - unusedRem;
>      MP_USED(&part)   = MP_USED(div);
> +
> +    /* We have now truncated the part of the remainder to the same length as 

I deleted the space at the end of this line.

https://hg.mozilla.org/projects/nss/rev/608645309ab9

@@ +4244,5 @@
>        if (q_msd == RADIX)
>          --q_msd;
>  #else
> +      if (q_msd == div_msd) {
> +        q_msd = MP_DIGIT_MAX;

I should add that the checks on line 4244 and line 4247 handle
the same condition.

When we use s_mpv_div_2dx1d, we must do this check first because
the Windows implementation of s_mpv_div_2dx1d will crash
otherwise.

We can use the check on line 4247 for both cases, as I did in
the previous version of this patch. But I kept the check at line
4244 because it matches the location of the check in Knuth's
algorithm.
Attachment #8652429 - Flags: review?(rrelyea) → review+
Hi Hanno,

Thanks a lot for testing my patch. I have a question for you.

(In reply to Hanno Boeck from comment #13)
> Hi Robert, I believe your patch in attachment #8649037 [details] [diff] [review] [diff]
> [review] is not correct. If I run the fix with the test values I get this
> for the first example:
> div +00000000000000FF0000000000000000
> mod +00000000000010000000000000000F010000000000000000
> Now the input was
> 10000000000000001000000000000000000000 div 10000000000000001001
> 
> As you can see the modulus is larger than the divisor (I get similar outputs
> for the other examples). This can't be correct.

On what platform did you test Bob's patch?

I believe my patch is essentially the same as Bob's patch on
platforms that don't use the s_mpv_div_2dx1d function, so I'd
like to get to the bottom of this.

Here is the diff between the two patches:

https://bugzilla.mozilla.org/attachment.cgi?oldid=8649037&action=interdiff&newid=8652429&headers=1
I did my test on 64 bit Linux and got the same result as Hanno. Your patch fixed the problem on that platform. 

I think most platforms use s_mpv_div_2dx1d(). There's a C implementation in the file that came from nspr.

bob
Bob, thanks. I reproduced Hanno's result after I switched to an optimized build.
I was doing a debug build and got an assertion failure instead.

So the crucial change on top of your patch is indeed the check I added to the
s_mpv_div_2dx1d() code path:

-      mp_digit r;
-      MP_CHECKOK( s_mpv_div_2dx1d(q_msd, MP_DIGIT(&part, MP_USED(&part) - 2), 
-				  div_msd, &q_msd, &r) );
+      if (q_msd == div_msd) {
+        q_msd = MP_DIGIT_MAX;
+      } else {
+        mp_digit r;
+        MP_CHECKOK( s_mpv_div_2dx1d(q_msd, MP_DIGIT(&part, MP_USED(&part) - 2),
+				    div_msd, &q_msd, &r) );
+      }
The |alloc|, |used|, and |dp| fields of |part| are immediately overwritten
inside the while loop, so we just need to initialize the |sign| field before
the while loop.

Note: it is possible to merge the two if statements inside the while loop
and eliminate the |partExtended| variable. I tried it, and found the resulting
code less readable. So I didn't submit a patch for that.
Attachment #8652612 - Flags: review?(rrelyea)
Comment on attachment 8652612 [details] [diff] [review]
Just initialize the |sign| field of |part|

Review of attachment 8652612 [details] [diff] [review]:
-----------------------------------------------------------------

r+ rrelyea
Attachment #8652612 - Flags: review?(rrelyea) → review+
Comment on attachment 8652612 [details] [diff] [review]
Just initialize the |sign| field of |part|

Review of attachment 8652612 [details] [diff] [review]:
-----------------------------------------------------------------

Patch checked in: https://hg.mozilla.org/projects/nss/rev/cfd0ad4726cb
Attachment #8652612 - Flags: checked-in+
Assignee: nobody → wtc
Status: NEW → RESOLVED
Closed: 9 years ago
OS: Unspecified → All
Priority: -- → P1
Hardware: Unspecified → All
Resolution: --- → FIXED
Target Milestone: --- → 3.21
Group: core-security → core-security-release
Hi, I just figured out that this issue is the same as #1194947 - a miscalculation in mp_exptmod(). The fix from Wan-Teh fixes both issues.

However - the assumption in comments #2 and #3 that this is only a minor issue might be wrong, so the severity should be reconsidered. In fact mp_div is called from s_mp_exptmod() (which is the "rsa-function" and therefore should be considered security sensitive). Not sure why I didn't see that in the first place.

I think #1194947 can be marked as a duplicate. Judging the real severity and whether this has any attack scenarios probably requires some skilled cryptographer. But given that this is the core function of a major crypto algorithm I'd suggest that the fix should be backported to all supported firefox versions.
Whiteboard: [b2g-adv-main2.5?]
Yikes, I did my grep again and still didn't see it, then looked exactly in mpmontg.c and there is is where we divide as part of the montgomery code. 

I'm not sure how this can be exploited, but prudence dictates that we treat it as exploitable. (there are certainly cases where the attacker can control the inputs to the modexp code), at least for signature verification.

bob
In a few days a very simliar bug to this one will be made public in openssl.

I would like to write a public advisory covering both issues, though I'm not sure what the state here is. Shall I re-open, considering that with exptmod is affected I think the severity is higher.
Please email us at mozilla@security.org if you have a time sensitive question. Since you left a comment with no needinfo? on anyone, there is no guarantee folks will see this quickly. 

I'm ni? Dan since this is probably his call. I'll ni? Wan-teh and Richard as well.
Flags: needinfo?(wtc)
Flags: needinfo?(rlb)
Flags: needinfo?(dveditz)
Al: I only know this bug is fixed in NSS 3.21 (there is no NSS 3.20
release) and NSS 3.21 was released on Friday, Nov 13, 2015. I don't
know which Firefox releases use NSS 3.21.

Hanno: don't re-open this bug.

Dan: please reassess the severity of this bug.
Flags: needinfo?(wtc)
(In reply to Wan-Teh Chang from comment #32)
> 
> I don't know which Firefox releases use NSS 3.21.

This is an automatically updated page that can you always refer to for such questions:
http://kuix.de/mozilla/versions/
And the answer is "Firefox 44" has this fix so the world is still vulnerable to it.
(In reply to Wan-Teh Chang from comment #32)
> Al: I only know this bug is fixed in NSS 3.21 (there is no NSS 3.20
> release) and ...

Correction: there are NSS 3.20 and 3.20.1 releases.

My main statement is still correct: this bug is only fixed in NSS 3.21.
(In reply to Al Billings [:abillings] from comment #31)
> Please email us at mozilla@security.org if you have a time sensitive
> question.

security@mozilla.org, just to be clear. :)
sec-high might be overstating the present danger--it would take a lot of work and number crunching to come up with a possibly exploitable condition--but if someone does figure it out it's a pretty crack. I'll go with it rather than sec-moderate as a signal that we should land this on the ESR branch, too.
Flags: needinfo?(dveditz)
Keywords: sec-lowsec-high
Summary: mp_div produces sometimes wrong calculation results → mp_div and mp_exptmod sometimes produce wrong calculation results
Whiteboard: [b2g-adv-main2.5?] → [b2g-adv-main2.5?] see comment 27 for severity
Flags: needinfo?(rlb)
Hanno, can you verify that this has been properly fixed? Thank you.
Flags: needinfo?(hanno)
Yes, just re-checked 3.21, there everything is fixed.
Flags: needinfo?(hanno)
Another potential candidate for a fix in an NSS uplift to ESR38.
Flags: needinfo?(lhenry)
Whiteboard: [b2g-adv-main2.5?] see comment 27 for severity → [b2g-adv-main2.5?][adv-main44+] see comment 27 for severity
Tim, Steve, here is another NSS issue we would want for ESR38.  I am planning to go to build for 38.6.0 on Thursday. I'm sorry that's not a lot of warning; I hope we can coordinate NSS updates and our Firefox release cycle more closely for 45 and onwards.
Flags: needinfo?(ttaubert)
Flags: needinfo?(sworkman)
Flags: needinfo?(lhenry)
Deferring this till 38.7.0esr.
Clearing my needinfo - Tim can handle this one too :)
Flags: needinfo?(sworkman)
Alias: CVE-2016-1938
Flags: needinfo?(ttaubert)
Whiteboard: [b2g-adv-main2.5?][adv-main44+] see comment 27 for severity → [b2g-adv-main2.5?][adv-main44-] see comment 27 for severity
Kai, will this fix be included in the NSS 3.21 update as well? If yes, one more reason to uplift to esr38.7 branch? Thanks!
Flags: needinfo?(kaie)
I don't know if NSS 3.21 will be uplifted to the ESR 38.x branch.

I'll start a discussion with the NSS developers about the recommended way to fix this bug in ESR 38.x
Flags: needinfo?(kaie)
There have been three commits in this bug, and they apply cleanly on the NSS_3_19_2_X_BRANCH. This is a patch that combines the three commits.
Attachment #8725227 - Attachment filename: mpi-combined.patch → combined patch that applies on NSS_3_19_2_X_BRANCH
Whiteboard: [b2g-adv-main2.5?][adv-main44-] see comment 27 for severity → [b2g-adv-main2.5?][adv-main44+] see comment 27 for severity
Hi Kai, I noticed that you have a patch ready. Do you mean to land them in time for esr38.7? Or is this planned for after March 8 and therefore esr38.8? I am just trying to decide whether I can go to build esr38.7 build1 or wait for this to be uplifted? Thanks!
Flags: needinfo?(kaie)
Canceling the NI as I just noticed this https://bugzilla.mozilla.org/show_bug.cgi?id=1211568#c81. No more NSS updates for esr38.7, the next update will be for esr 38.8.
Flags: needinfo?(kaie)
Blocks: 1254986
Flags: sec-bounty?
Flags: qe-verify-
Whiteboard: [b2g-adv-main2.5?][adv-main44+] see comment 27 for severity → [post-critsmash-triage][b2g-adv-main2.5?][adv-main44+] see comment 27 for severity
Flags: sec-bounty? → sec-bounty+
According to https://kuix.de/mozilla/versions/ ESR38.7 has NSS 3.19.2.4

Mozilla software version on branch mozilla-esr38: 38.7.1esrpre
NSPR (portable runtime) version: NSPR_4_10_10_RTM
NSS (security library) version: NSS_3_19_2_4_RTM
CKBI (builtin trust) version: 2.4

Did ESR38.8 take an NSS 3.19.2.5 or similar update?

This is marked tracking for ESR38.8.
Flags: needinfo?(sledru)
Flags: needinfo?(rkothari)
Flags: needinfo?(kaie)
This is the last NSS update that I see, based on Bug 1254986 - Upgrade Firefox 38.8 ESR to NSS 3.19.2.4
Flags: needinfo?(rkothari)
Looks like it is going to be a wontfix for 38.
Given that it was fixed first in 44, I don't see that as a big issue.
Flags: needinfo?(sledru)
Flags: needinfo?(sledru)
(In reply to Ritu Kothari (:ritu) from comment #52)
> This is the last NSS update that I see, based on Bug 1254986 - Upgrade
> Firefox 38.8 ESR to NSS 3.19.2.4

And does 3.19.2.4 contain this fix or not?

You can't "won't fix" this if you don't know that as it might *already* be fixed.
Flags: needinfo?(rkothari)
https://hg.mozilla.org/releases/mozilla-esr38/rev/c64ca84020a1 is the update we took.

Comment 50 shows NSS took changes 12 days before this ESR38 checkin.
NSS 3.19.2.4 landed on the ESR-38 branch in bug 1254986
Whiteboard: [post-critsmash-triage][b2g-adv-main2.5?][adv-main44+] see comment 27 for severity → [post-critsmash-triage][b2g-adv-main2.5?][adv-main44+][adv-esr38.8+] see comment 27 for severity
oups, thanks!
Flags: needinfo?(sledru)
Group: core-security-release
Flags: needinfo?(kaie)
You need to log in before you can comment on or make changes to this bug.