Closed Bug 1334108 Opened 8 years ago Closed 8 years ago

ECParams shouldn't have been modified, revert the change

Categories

(NSS :: Libraries, defect)

3.28
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: KaiE, Assigned: rrelyea)

References

Details

Attachments

(3 files, 1 obsolete file)

In bug 957105, the struct ECParams was extended with another member: int pointSize That caused issues with java-1.8.0-openjdk, which is manually creating and filling that structure, and therefore the member has random data, but the NSS implementation depends on that new attribute. Bob said it was probably a mistake to extend that structure, as it's apparently part of a stable interface. How can we avoid more problems when stable environments pick a newer NSS? Can we revert that change?
Bob, could you please attach a suggested patch, how you think we should revert the API change and remedy the situation ?
Flags: needinfo?(rrelyea)
See Also: → 1338500
This bug is really a big problem for the Linux distribution world, and it's also urgent for us to get a fix. I've looked into this a bit, and I'm trying to make some suggestions. Maybe the suggestions aren't good ideas. But at least we'll have a starting point for discussion... The new ECParams.pointSize member is used to transport pubKeyLen, which is used to allocate a buffer of the correct size. In the past, prior to adding ECCurve25519, the code used ecParams->fieldID.size to calculate pubKeyLen, without requiring additional information. Could we do the following: - we introduce new enum values for ECParamsType typedef enum { ec_params_explicit, ec_params_named, ec_params_explicit_v2, ec_params_named_v2, } ECParamsType; - we declare, in order to use ECCurve25519, or any additional future curve, the application MUST use the new *_v2 enum values - if code attempts to use any new curve (including ECCurve25519), but the code still uses an old type enum, the library rejects that attempt with a graceful runtime failure - if code attempts to use an old curve, and the code still uses the old type enum, then the library will use the classic calculation that works with the old curves, without needing pointSize - we remove pointSize from ECParams - we declare that, if a _v2 type is used, then one of the SECItems is interpreted as a NESTED item, that actually contains two attributes, e.g. curveOID. We could define that with enum _v2, curveOID is an opaque type. /* private header */ struct ECParamsExtraPrivate { SECItem *dummy; size_t self_size; SECItem curveOID; int pointSize; } /* public header */ struct ECParamsStr { PLArenaPool *arena; ECParamsType type; ECFieldID fieldID; ECCurve curve; SECItem base; SECItem order; int cofactor; SECItem DEREncoding; ECCurveName name; union { SECItem curveOID_v1; <- change void *extraParams_v2; <- change } /* int pointSize REMOVED */ /* add some kind of static assert, that sizeof(void*) <= sizeof(SECItem) */ }; typedef struct ECParamsStr ECParams; We define new accessor functions, that are required to get/set any of the attributes curveOID, pointSize, or future members. We'll publicly declare/announce that the NSS 3.28, 3.28.1, 3.28.2 and 3.29 releases contained an API incompatible change, and that any code that started to use the pointSize attribute or the ECCurve25519 must be changed to continue to function. I believe that any application, that was compiled against these versions, and that attempted to create an ECParams structure by itself, will have discovered that it was necessary to assign to the pointSize argument, to make it work. For such applications the change will be noticed, if the newer NSS version removes the pointSize attribute, when they are recompiled. But they will malfunction without recompilation. What about application code that was built against a version having pointSize, but which doesn't access the pointSize attribute, but which handles data structures involving ECParams? When such an application is used with remove-pointSize NSS at runtime, but wasn't rebuilt against removed-pointSize, that application will malfunction, because of the smaller structure size. Can we identify/invent a runtime check, which the library could use, to find out what which version of NSS the application was built against? I guess we cannot. Can we declare that anyone who built against NSS 3.28, 3.28.1, 3.28.2, 3.29, REALLY REALLY really must rebuild their application, at the time of upgrading to NSS 3.28.3 or NSS 3.29.1 ?
So it's worse than I thought. I thought we could just add new entry points and the ones compiled with 3.28 will just work because the value was added at the end of the struct, but the struct is imbeded in other structs (namely the keys).. At this point the only way forward is to abandoned the 3.28 headers and release a new version with the exact fields as the other. We can then add an accessor function to get the point size (since it doesn't change for a given curve). So revert the change to ECParams but not to ECCurveParams. The the function takes the ECParams and returns a point size. The two places in softoken and two places in Freebl will use that new function. That function needs to be exported through the loader table. I'll supply a patch. bob
Flags: needinfo?(rrelyea)
Attached patch Compiled but untested patch. (obsolete) — Splinter Review
Thanks Bob for working on it. I've exected experimental builds, on both NSS trunk, and NSS 3.28 branch, and tests are green. (I couldn't yet test Windows, because of an infrastructure issue.) https://treeherder.mozilla.org/#/jobs?repo=nss-try&revision=86b3767a57c5e4741325387187c172e729809d7b https://treeherder.mozilla.org/#/jobs?repo=nss-try&revision=6e13af442d1240ae1fdeece243bad545d25a8fee
(the merged patch that I used for testing the 3.28 branch)
Franziskus, Bob hasn't yet asked for review, but I might make sense to start the discussione early. Because you had working on the original change, ( https://hg.mozilla.org/projects/nss/rev/047ab976840a ) do you have any concerns regarding the approach that Bob suggests?
My all.sh tests work. The only thing to test is if this fixes the actual Java issue on Fedora. I believe that we did tell Java that they could use the API, which is why we created the ECDecode so that Java could create an ECParam from our internal ECCurveParams. It would be good to release NSS since 3.28 breaks compatibility with 3.27 and this change will only be compatible with 3.27. bob
(In reply to Robert Relyea from comment #9) > My all.sh tests work. The only thing to test is if this fixes the actual > Java issue on Fedora. I believe that we did tell Java that they could use > the API, which is why we created the ECDecode so that Java could create an > ECParam from our internal ECCurveParams. This request for testing needs to be made in Fedora land. I've added a comment here: https://bugzilla.redhat.com/show_bug.cgi?id=1415137#c65 > It would be good to release NSS > since 3.28 breaks compatibility with 3.27 and this change will only be > compatible with 3.27. I assume Bob requests a NSS 3.28.x release. It will also require a NSS 3.29.x release.
Looks good to me in general. I put some comments here [1]. [1] https://nss-review.dev.mozaws.net/D210
The indentation issue was tabs instead of spaces. The DISABLE_ECC defines. I think they are already broken and should now be removed (I don't think we have anyone that builds without ECC). In any case the ones in blapit.h are wrong since these functions are part of the fixed interface table in freebl, so they should always be there (just return errors). Note that the #defines aren't around the usages in the freebl vector table. If we want DISABLE_ECC to work, then we'd need to adjust the #defines in ecdecode.c
Attachment #8837341 - Flags: review?(franziskuskiefer)
See Also: → 1339720
Comment on attachment 8837341 [details] [diff] [review] Fix binary compatibility for ECParams. - updated to fix white space issues. Review of attachment 8837341 [details] [diff] [review]: ----------------------------------------------------------------- > The indentation issue was tabs instead of spaces. It still looks a bit odd to me. I'd suggest running clang-format before landing. > The DISABLE_ECC defines. I think they are already broken and should now be removed (I don't think we have anyone that builds without ECC). I wouldn't mind removing DISABLE_ECC. I filed bug 1339720 to track the issue. It shouldn't block us here.
Attachment #8837341 - Flags: review?(franziskuskiefer) → review+
Checked in with adjusted formatting: https://hg.mozilla.org/projects/nss/rev/917d4935a790
Target Milestone: --- → 3.30
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
backport to 3.29 branch for 3.29.1: https://hg.mozilla.org/projects/nss/rev/a521665e45b9 backport to 3.28 branch for 3.28.3: https://hg.mozilla.org/projects/nss/rev/0e25df041c8f
See Also: → curve25519
Blocks: 1339790
Blocks: 1339789
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: