Closed
Bug 321350
Opened 19 years ago
Closed 19 years ago
Implement optimized code for NIST Suite B elliptic curves
Categories
(NSS :: Libraries, enhancement, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.11.1
People
(Reporter: douglas, Assigned: douglas)
Details
(Whiteboard: ECC FIPS)
Attachments
(7 files, 2 obsolete files)
673 bytes,
text/plain
|
Details | |
7.81 KB,
text/plain
|
vipul.gupta
:
review+
wtc
:
superreview+
|
Details |
7.91 KB,
text/plain
|
vipul.gupta
:
review+
wtc
:
superreview+
|
Details |
4.82 KB,
text/plain
|
vipul.gupta
:
review+
wtc
:
superreview+
|
Details |
6.13 KB,
patch
|
vipul.gupta
:
review+
wtc
:
superreview+
|
Details | Diff | Splinter Review |
30.22 KB,
patch
|
Details | Diff | Splinter Review | |
82.01 KB,
patch
|
douglas
:
review+
vipul.gupta
:
superreview+
|
Details | Diff | Splinter Review |
The elliptic curve library (ECL) in NSS currently has optimized code only for curves secp160r1, secp192r1, and secp224r1, in the files lib/freebl/ecl/ecp_192.c, ecp_224.c, ecp_fp160.c, ecp_fp192.c, and ecp_fp224.c. Optimized code should be developed for the curves in the NIST Suite B cryptography recommendations (http://www.nsa.gov/ia/industry/crypto_suite_b.cfm?MenuID=10.2.7), namely for the nistp224 (==secp256r1), nistp384 (==secp384r1), and nistp521 (==secp521r1) curves. This and bug #320583 would allow NSS to support all recommendations (except ECMQV) of NIST Suite B.
Comment 1•19 years ago
|
||
Hmm, our test are showing p192 to be slower then p224 and p256. p224 and p256 are about on par, p384 is much slower.
bob
Comment 2•19 years ago
|
||
Above, this bug says "nistp224 (==secp256r1)". That's a typo, right?
If this work is going to be done, it needs to be done in time for FIPS
evaluation, right?
Assigning this to Vipul, so it doesn't appear to be Wan-Teh's responsibility.
Assignee: wtchang → vipul.gupta
Priority: -- → P2
Whiteboard: ECC FIPS
Target Milestone: --- → 3.11.2
Version: unspecified → 3.11
Updated•19 years ago
|
Assignee: vipul.gupta → douglas
Priority: P2 → P1
Assignee | ||
Comment 3•19 years ago
|
||
Implementation of fast reduction for the remaining NIST prime curves: P-256, P-384, and P-521.
On my test systems (PowerPC G4 and PowerPC G5), the fast reduction code is actually slower than the Montgomery modular reduction code for all curves except P-521, so the algorithm wiring in ecl.c only uses the fast reduction for P-521. I've attached timings from my system showing this behaviour. It's probably worth checking in all the code, even though only the P-521 code will actually be wired in.
I wasn't able to force cvs diff to create a patch that also contained new files, so also part of the CVS submission are the attachments ecp_256.c, ecp_384.c, and ecp_521.c, all of which should be placed in the lib/freebl/ecl directory.
Attachment #213840 -
Flags: review?(vipul.gupta)
Assignee | ||
Comment 4•19 years ago
|
||
Assignee | ||
Comment 5•19 years ago
|
||
Attachment #213843 -
Flags: review?(vipul.gupta)
Assignee | ||
Comment 6•19 years ago
|
||
Attachment #213844 -
Flags: review?(vipul.gupta)
Assignee | ||
Comment 7•19 years ago
|
||
Attachment #213845 -
Flags: review?(vipul.gupta)
Comment 8•19 years ago
|
||
(In reply to comment #3)
> On my test systems (PowerPC G4 and PowerPC G5), the fast reduction code is
> actually slower than the Montgomery modular reduction code for all curves
> except P-521, so the algorithm wiring in ecl.c only uses the fast reduction for
> P-521. I've attached timings from my system showing this behaviour. It's
> probably worth checking in all the code, even though only the P-521 code will
> actually be wired in.
According to comment 1, shouldn't the patch also rewire the code for
nistp192 to use montgomery reduction? and possibly even nistp224?
nit: could you please post a unified diff (use diff -u)?
vipul
Assignee | ||
Comment 9•19 years ago
|
||
Attachment #213840 -
Attachment is obsolete: true
Attachment #213889 -
Flags: review?(vipul.gupta)
Attachment #213840 -
Flags: review?(vipul.gupta)
Assignee | ||
Comment 10•19 years ago
|
||
(In reply to comment #8)
> According to comment 1, shouldn't the patch also rewire the code for
> nistp192 to use montgomery reduction? and possibly even nistp224?
It does that.
Douglas
Comment 11•19 years ago
|
||
(In reply to comment #10)
> (In reply to comment #8)
> > According to comment 1, shouldn't the patch also rewire the code for
> > nistp192 to use montgomery reduction? and possibly even nistp224?
>
> It does that.
Ok. I'll take a closer look later today. Since I'm not familiar
with the algorithmic details of the, and I don't see any stylistic issues,
I'll simply be treating this code as a blackbox for tests.
I'm curious to know if you've run all.sh using the merged
ECC and non-ECC test scripts (you'll need to apply the
latest patches for the cert, smime, ssl and tools directory
available as part of bug 324887). These tests have been
modified to exercise the three NIST B curves at the highest
level (i.e. at the certutil/tstclnt/selfserv/cmsutil level).
[Ideally, we should do these tests in both 32 and 64 bit mode
with the new code wired in for *all* the curves so we know
the code behaves correctly should someone choose to change
the wiring later.]
Also, does the optimized reduction implemented here
implement what I remember as the shift and subtract
technique seen in Dan Bernstein's P224 code or does that
only make sense for floating point code?
Thanks for putting in all this effort and doing it so
promptly. BTW, any insight into why the public-key
verification tests are failing for the NIST Koblitz curves
as per Wan-Teh's email?
vipul
Assignee | ||
Comment 12•19 years ago
|
||
(In reply to comment #11)
> I'm curious to know if you've run all.sh using the merged
> ECC and non-ECC test scripts (you'll need to apply the
> latest patches for the cert, smime, ssl and tools directory
> available as part of bug 324887). These tests have been
> modified to exercise the three NIST B curves at the highest
> level (i.e. at the certutil/tstclnt/selfserv/cmsutil level).
> [Ideally, we should do these tests in both 32 and 64 bit mode
> with the new code wired in for *all* the curves so we know
> the code behaves correctly should someone choose to change
> the wiring later.]
No, I haven't tried the merged all.sh scripts. Could you give it a try?
> Also, does the optimized reduction implemented here
> implement what I remember as the shift and subtract
> technique seen in Dan Bernstein's P224 code or does that
> only make sense for floating point code?
I haven't read Dan Bernstein's P224 code, but the P224 algorithm as implemented is basically some shifts and some additions and subtractions.
> Thanks for putting in all this effort and doing it so
> promptly. BTW, any insight into why the public-key
> verification tests are failing for the NIST Koblitz curves
> as per Wan-Teh's email?
No, I don't know why the KIST K Koblitz curves are failing. My initial guess was that maybe they use different irreducibles than the NIST B and so they were failing for the same reason that the c2pnb163v? curves failed, but that's not the case as far as I can tell. I've asked Wan-Teh to send me more info about the tests so I can look into it.
Comment 13•19 years ago
|
||
Comment on attachment 213889 [details] [diff] [review]
Optimized reduction for NIST P-256, P-384, and P-521 (unified)
Looks ok. Will run merged tests after wiring in the new code (hopefully, later today). Didn't want to hold up other reviews in the meantime.
Attachment #213889 -
Flags: superreview?(wtchang)
Attachment #213889 -
Flags: review?(vipul.gupta)
Attachment #213889 -
Flags: review+
Comment 14•19 years ago
|
||
Comment on attachment 213843 [details]
lib/freebl/ecl/ecp_256.c
Looks ok. Will run merged tests after wiring in the new code (hopefully, later today). Didn't want to hold up other reviews in the meantime.
Attachment #213843 -
Flags: superreview?(wtchang)
Attachment #213843 -
Flags: review?(vipul.gupta)
Attachment #213843 -
Flags: review+
Comment 15•19 years ago
|
||
Comment on attachment 213844 [details]
lib/freebl/ecl/ecp_384.c
Looks ok. Will run merged tests after wiring in the new code (hopefully, later today). Didn't want to hold up other reviews in the meantime.
Attachment #213844 -
Flags: superreview?(wtchang)
Attachment #213844 -
Flags: review?(vipul.gupta)
Attachment #213844 -
Flags: review+
Comment 16•19 years ago
|
||
Comment on attachment 213845 [details]
lib/freebl/ecl/ecp_521.c
Looks ok. Will run merged tests after wiring in the new code (hopefully, later today). Didn't want to hold up other reviews in the meantime.
Attachment #213845 -
Flags: superreview?(wtchang)
Attachment #213845 -
Flags: review?(vipul.gupta)
Attachment #213845 -
Flags: review+
Comment 17•19 years ago
|
||
Comment on attachment 213889 [details] [diff] [review]
Optimized reduction for NIST P-256, P-384, and P-521 (unified)
r=wtc.
In ecl/ecl.c, we have
> } else if ((name == ECCurve_SECG_PRIME_192K1)) {
> group =
> ECGroup_consGFp_mont(&irr, &curvea, &curveb, &genx, &geny,
> &order, params->cofactor);
> if (group == NULL) { res = MP_UNDEF; goto CLEANUP; }
>- MP_CHECKOK(ec_group_set_gfp192(group, name));
Is this rewiring or bug fixing? I don't understand when ec_group_set_gfp<ddd> should be used.
>- /* XXX secp521r1 fails ecp_test with &ec_GFp_pts_mul_jac */
>- if (name == ECCurve_SECG_PRIME_521R1) {
>- group->points_mul = &ec_pts_mul_simul_w2;
>- }
Why is this removed?
Attachment #213889 -
Flags: superreview?(wtchang) → superreview+
Updated•19 years ago
|
Attachment #213843 -
Flags: superreview?(wtchang) → superreview+
Updated•19 years ago
|
Attachment #213844 -
Flags: superreview?(wtchang) → superreview+
Updated•19 years ago
|
Attachment #213845 -
Flags: superreview?(wtchang) → superreview+
Assignee | ||
Comment 18•19 years ago
|
||
(In reply to comment #17)
> In ecl/ecl.c, we have
>
> > } else if ((name == ECCurve_SECG_PRIME_192K1)) {
> > group =
> > ECGroup_consGFp_mont(&irr, &curvea, &curveb, &genx, &geny,
> > &order, params->cofactor);
> > if (group == NULL) { res = MP_UNDEF; goto CLEANUP; }
> >- MP_CHECKOK(ec_group_set_gfp192(group, name));
>
> Is this rewiring or bug fixing? I don't understand when ec_group_set_gfp<ddd>
> should be used.
This is the removal of a line that was unnecessary. ec_group_set_grp<ddd> functions have their own switch statement in them to determine what to do based on the curve. The secp192k1 curve doesn't have any optimized code in the ecp_192.c file, so the ec_group_set_grp192 call on this curve doesn't do anything and hence can be removed. Not a bug, but an unnecessary line of code.
> >- /* XXX secp521r1 fails ecp_test with &ec_GFp_pts_mul_jac */
> >- if (name == ECCurve_SECG_PRIME_521R1) {
> >- group->points_mul = &ec_pts_mul_simul_w2;
> >- }
>
> Why is this removed?
It no longer fails. Three years ago I never figured out why it failed, and now I don't know why it's not failing. It might be related to one of the array size bugs that was fixed a few months ago, but I'm not certain.
Comment 19•19 years ago
|
||
(In reply to comment #12)
> No, I haven't tried the merged all.sh scripts. Could you give it a try?
I've tried this (but with only the optimized reduction of secp521r1 wired in)
and all tests pass. This bug looks ready to be marked fixed/resolved.
vipul
Comment 20•19 years ago
|
||
Douglas, how do I build and run the performance test
you used to get the numbers in comment 4? I'd like
to get the numbers for Linux and Windows.
Assignee | ||
Comment 21•19 years ago
|
||
(In reply to comment #20)
> Douglas, how do I build and run the performance test
> you used to get the numbers in comment 4? I'd like
> to get the numbers for Linux and Windows.
You need to create standalone builds of the mpi and ecl libraries and ecl test programs:
> cd lib/freebl/mpi
> make libs
> cd../ecl
> make tests
Then run the ecp_test program with the timing option:
> ./ecp_test --time
This will run with the default algorithm wirings according to the patch. To change the wirings, you need to edit lib/freebl/ecl/ecl.c and change the wirings in ecgroup_fromNameAndHex. For example, if you want to use the Montgomery arithmetic, each curve just needs to have the following lines:
> group = ECGroup_consGFp_mont(&irr, &curvea, &curveb, &genx, &geny, &order, params->cofactor);
> if (group == NULL) { res = MP_UNDEF; goto CLEANUP; }
Whereas if you want fast reduction, each curve needs to have the following lines, with the appropriate function call on the third line:
> group = ECGroup_consGFp(&irr, &curvea, &curveb, &genx, &geny, &order, params->cofactor);
> if (group == NULL) { res = MP_UNDEF; goto CLEANUP; }
> MP_CHECKOK(ec_group_set_gfp224(group, name));
Assignee | ||
Comment 22•19 years ago
|
||
I plan to do optimized base point multiplication as well for the NIST suite B curves, but as a separate patch. Wan-Teh, should that be filed under a new bug or as future attachments to this bug?
Comment 23•19 years ago
|
||
Douglas, please open a new bug for optimized base point multiplication.
Bob, could you or Rob Crittenden do the performance measurement described
in comment 20 and comment 21?
Comment 24•19 years ago
|
||
ecperf is a NSS test tool I created from douglas's raw performance tool. I created it for a couple of reasons: 1) I was most interested in the raw performance of the tool algorithms at both the freebl and the pkcs #11 levels (so I could compare the relative performance, and 2) I wanted to build and link with mpi as configured in our running NSS (the stand alone mpi does not necessarily have all of our platform performance enhancements built in). 3) I needed a threaded version of the tool to me able to measure the true performance on hyperthreaded/dual core and smp systems (non-threaded versions produced numbers just more than half the true throughput).
Below is the result of 3 different configurations:
1) All the optimized curve code turned off.
2) The old 'optimized' curve code.
3) The new optimized curve code.
These were run on a 2 way Intel EMT 64 running RHEL 4 (Linux).
Summary.
1. All the old 'optimized' curves are very slow (much slower than unoptimized) - we already knew this.
2. The new optimized code is percentage slower than optimized on Intel 64 bit, except for 521 (I think douglas was predicting that).
I suspect this would not be the case on 32bit platforms, where the relative difference between bignum multiplies and other operations is greater.
-------------No optimizations----------------------------------------
Testing SECP-160K1 using freebl implementation...
ECDH_Derive count: 600 sec: 0.66 op/sec: 909.20
ECDSA_Sign count: 600 sec: 0.67 op/sec: 892.46
ECDSA_Verify count: 600 sec: 1.42 op/sec: 422.96
... okay.
Testing NIST-P192 using freebl implementation...
ECDH_Derive count: 600 sec: 0.79 op/sec: 759.06
ECDSA_Sign count: 600 sec: 0.83 op/sec: 724.61
ECDSA_Verify count: 600 sec: 1.78 op/sec: 337.66
... okay.
Testing NIST-P224 using freebl implementation...
ECDH_Derive count: 600 sec: 1.02 op/sec: 586.95
ECDSA_Sign count: 600 sec: 1.05 op/sec: 573.38
ECDSA_Verify count: 600 sec: 2.15 op/sec: 279.68
... okay.
Testing NIST-P256 using freebl implementation...
ECDH_Derive count: 600 sec: 1.20 op/sec: 498.32
ECDSA_Sign count: 600 sec: 1.24 op/sec: 484.67
ECDSA_Verify count: 600 sec: 2.54 op/sec: 235.80
... okay.
Testing NIST-P384 using freebl implementation...
ECDH_Derive count: 600 sec: 2.34 op/sec: 256.44
ECDSA_Sign count: 600 sec: 2.39 op/sec: 250.84
ECDSA_Verify count: 600 sec: 4.59 op/sec: 130.85
... okay.
Testing NIST-P521 using freebl implementation...
ECDH_Derive count: 600 sec: 4.21 op/sec: 142.50
ECDSA_Sign count: 600 sec: 4.28 op/sec: 140.12
ECDSA_Verify count: 600 sec: 7.48 op/sec: 80.19
... okay.
----------------Old optimized code--------------------------
Testing SECP-160K1 using freebl implementation...
ECDH_Derive count: 600 sec: 0.67 op/sec: 900.41
ECDSA_Sign count: 600 sec: 0.68 op/sec: 885.87
ECDSA_Verify count: 600 sec: 1.47 op/sec: 407.85
... okay.
Testing NIST-P192 using freebl implementation...
ECDH_Derive count: 600 sec: 2.51 op/sec: 239.18
ECDSA_Sign count: 600 sec: 2.59 op/sec: 231.55
ECDSA_Verify count: 600 sec: 4.43 op/sec: 135.38
... okay.
Testing NIST-P224 using freebl implementation...
ECDH_Derive count: 600 sec: 2.12 op/sec: 282.45
ECDSA_Sign count: 600 sec: 2.15 op/sec: 279.37
ECDSA_Verify count: 600 sec: 3.53 op/sec: 170.05
... okay.
Testing NIST-P256 using freebl implementation...
ECDH_Derive count: 600 sec: 1.22 op/sec: 490.35
ECDSA_Sign count: 600 sec: 1.24 op/sec: 483.94
ECDSA_Verify count: 600 sec: 2.55 op/sec: 234.85
... okay.
Testing NIST-P384 using freebl implementation...
ECDH_Derive count: 600 sec: 2.38 op/sec: 251.72
ECDSA_Sign count: 600 sec: 2.40 op/sec: 249.65
ECDSA_Verify count: 600 sec: 4.61 op/sec: 130.24
... okay.
Testing NIST-P521 using freebl implementation...
ECDH_Derive count: 600 sec: 4.16 op/sec: 144.08
ECDSA_Sign count: 600 sec: 4.27 op/sec: 140.48
ECDSA_Verify count: 600 sec: 57.27 op/sec: 10.48
... okay.
----------------New optimized code--------------------------
ecperf -t 2 -i 300
Testing SECP-160K1 using freebl implementation...
ECDH_Derive count: 600 sec: 0.66 op/sec: 907.78
ECDSA_Sign count: 600 sec: 0.69 op/sec: 868.81
ECDSA_Verify count: 600 sec: 1.49 op/sec: 402.64
... okay.
Testing NIST-P192 using freebl implementation...
ECDH_Derive count: 600 sec: 0.83 op/sec: 723.07
ECDSA_Sign count: 600 sec: 0.86 op/sec: 699.67
ECDSA_Verify count: 600 sec: 1.84 op/sec: 326.88
... okay.
Testing NIST-P224 using freebl implementation...
ECDH_Derive count: 600 sec: 1.04 op/sec: 575.97
ECDSA_Sign count: 600 sec: 1.08 op/sec: 558.00
ECDSA_Verify count: 600 sec: 2.19 op/sec: 273.99
... okay.
Testing NIST-P256 using freebl implementation...
ECDH_Derive count: 600 sec: 1.24 op/sec: 484.37
ECDSA_Sign count: 600 sec: 1.27 op/sec: 471.96
ECDSA_Verify count: 600 sec: 2.60 op/sec: 230.80
... okay.
Testing NIST-P384 using freebl implementation...
ECDH_Derive count: 600 sec: 2.41 op/sec: 248.64
ECDSA_Sign count: 600 sec: 2.47 op/sec: 242.82
ECDSA_Verify count: 600 sec: 4.72 op/sec: 127.22
... okay.
Testing NIST-P521 using freebl implementation...
ECDH_Derive count: 600 sec: 3.76 op/sec: 159.37
ECDSA_Sign count: 600 sec: 3.89 op/sec: 154.18
ECDSA_Verify count: 600 sec: 6.59 op/sec: 91.10
... okay.
Comment 25•19 years ago
|
||
Just to be clear for those that haven't been following:
The old curve optimizations actually slowed down performance (this lead to bizzarre situations where 256 bit EC was faster than 192 bit on all platforms because 256 bit wasn't optimized). Both douglas and I noticed this last week.
Douglas found a major source of the problems (appearantly not calling the montgomery multipy routine) and fixed it for this patch. Since we found the differential between the optimized curves and the unoptimized curves to be greater on 64 bit, we wanted to measure the actual gain (or loss) on that platform. The numbers I posted in the bug are all running on exactly the same platform (a 64 bit Intel EMT).
There are no 32-bit numbers posted. My guess is the new optimizations will be a win on 32-bit.
bob
Comment 26•19 years ago
|
||
Bob, in comment 24, you wrote:
> The new optimized code is percentage slower than optimized on Intel 64 bit,
That says "X is slower than X".
Now, perhaps you meant "new optimized code is slower than old optimized code",
or perhaps you meant "new optimized code is slower than unoptimized code",
or perhaps you meant " unoptimized code is slower than optimized code",
but I seriously doubt you meant that X is slower than X. Please clarify.
Assignee | ||
Comment 27•19 years ago
|
||
(In reply to comment #24)
> ecperf is a NSS test tool I created from douglas's raw performance tool.
Where can ecperf be found?
Comment 28•19 years ago
|
||
Here's ECperf that I used.
Comment 29•19 years ago
|
||
this patch takes douglas's patch and adds the following:
1) inline adds/subracts in the optimized curve mod functions for 192, 224, and 256 curves.
2) replaces the mp_mod() function call in 192, 224, and 256 with optimized versions.
2) makes a single 512 bit curve independent of the MP_WORD_SIZE.
3) Creates in-line versions for add and subtract for general gf curves of 3=6 words in length.
This code increases 192 and 224 performance by 70% over the montgomery code, and 256 bit performance by about 10%.
Obvious future todos: More tweaking with the 256 curves, doing the same optimizations for 384.
Attachment #215161 -
Flags: review?
Updated•19 years ago
|
Attachment #215161 -
Flags: review? → review?(douglas)
Assignee | ||
Comment 30•19 years ago
|
||
Comment on attachment 215161 [details] [diff] [review]
Optimized curves, faster adds.
The lib/freebl/ecl/tests/ecp_test.c programs fails with this patch on my PowerPC G5 system. The error is as follows:
Testing NIST-P192 using specific implementation...
base point P:
188DA80EB03090F67CBF20EB43A18800F4FF0AFD82FF1012
7192B95FFC8DA78631011ED6B24CDD573F977A11E794811
(order-1)*P (ECPoint_mul):
188DA80EB03090F67CBF20EB43A18800F4FF0AFD82FF1012
F8E6D46A003725879CEFEE1294DB32298C06885EE186B7EE
(order-1)*P (ECPoint_mul):
188DA80EB03090F67CBF20EB43A18800F4FF0AFD82FF1012
F8E6D46A003725879CEFEE1294DB32298C06885EE186B7EE
(order)*P (ECPoint_mul):
0
0
(order)*P (ECPoint_mul):
0
0
(order-1)*P + (order-1)*P + P == (order-1)*P (ECPoints_mul):
188DA80EB03090F67CBF20EB43A18800F4FF0AFD82FF1012
F8E6D46A003725879CEFEE1294DB32298C06885EE186B7EE
Error: validate point on base point failed.
Error: exiting with error value -1
Error: exiting with error value -1
I have not had a chance to determine the cause of this failure as the patch is quite a large patch. Have you run it on 64-bit platforms other than AMD? (The PowerPC G5 is a 64-bit platform.)
Attachment #215161 -
Flags: review?(douglas) → review-
Comment 31•19 years ago
|
||
Actually I haven't been able to run it on AMD 64. The only 64 bit platform I have access to is Linux 64 bit.
I've run both 32bit and 64 bit. I've also run the code without turning on the assembler.
If you are failing on the 192 bit curve, the problem should be isolated to to ecp_192.c and ecl-curve.h (well you also have to turn on the 192 optimized curve in ecl.c).
Do you have test vectors I can run on my machine?
If it's failing on a single test (are a set of tests), it's probably something wrong with the final reduction test (though 192's test is pretty simple). If it is failing generally, it could be a problem with the MP_ADD_CARRY or MP_SUB_BORROW macros. They were copied from mpi add and subtract routines, but I may have missed something.
bob
Assignee | ||
Comment 32•19 years ago
|
||
(In reply to comment #31)
> Do you have test vectors I can run on my machine?
You should be able to run the same tests I did by building the standline MPI and ECL libraries and then running ecp_test --print in the ecl directory.
Comment 33•19 years ago
|
||
(In reply to comment #31)
> Actually I haven't been able to run it on AMD 64. The only 64 bit platform I
> have access to is Linux 64 bit.
Did you mean Intel 64 bit?
Comment 34•19 years ago
|
||
yes, I mean Linux 64 bit running on an Intel EMT 64 processor.
Comment 35•19 years ago
|
||
Found 192 problem (missing clamp).
Fixed 256 final reduce check.
Fixed MP_NO_MP_WORD ADD and Subract macros
Attachment #215161 -
Attachment is obsolete: true
Attachment #215707 -
Flags: review?(douglas)
Comment 36•19 years ago
|
||
I was able to reproduce the ecp_test failure on my 64 bit platform (missing clamp). updated patch attached.
Assignee | ||
Comment 37•19 years ago
|
||
Comment on attachment 215707 [details] [diff] [review]
Optimized curves, faster adds. Version 2
Patch applies cleanly and all tests (both freebl/ecl/ec?_test and all.sh) pass.
Attachment #215707 -
Flags: review?(douglas) → review+
Comment 38•19 years ago
|
||
Comment on attachment 215707 [details] [diff] [review]
Optimized curves, faster adds. Version 2
need 2nd review for 3.11
Attachment #215707 -
Flags: superreview?(vipul.gupta)
Comment 39•19 years ago
|
||
Comment on attachment 215707 [details] [diff] [review]
Optimized curves, faster adds. Version 2
I didn't have time to actually apply the patch and run tests so I'm relying on Douglas's assertion that everything checks out. I've read through the patch and it seems fine.
Attachment #215707 -
Flags: superreview?(vipul.gupta) → superreview+
Comment 40•19 years ago
|
||
This patch broke the Sparc 32-bit solaris build on the tip. You can see that in the tinderbox.
gmake[1]: Entering directory `/h/monstre/export/home/nss/tip/mozilla/security/nss/lib/freebl'
cc -o SunOS5.9_DBG.OBJ/SunOS_ABI32_INT32/ecl_gf.o -c -g -KPIC -DSVR4 -DSYSV -D__svr4 -D__svr4__ -DSOLARIS -D_REENTRANT -DSOLARIS2_9 -D_SVID_GETTOD -xarch=v8 -xs -DXP_UNIX -DSHLIB_SUFFIX=\"so\" -DSHLIB_PREFIX=\"lib\" -DSHLIB_VERSION=\"_3\" -DSOFTOKEN_SHLIB_VERSION=\"3\" -DNSS_ENABLE_ECC -DRIJNDAEL_INCLUDE_TABLES -DDEBUG -UNDEBUG -DDEBUG_jp96085 -DMP_USE_UINT_DIGIT -DMP_ASSEMBLY_MULTIPLY -DSHA_NO_LONG_LONG -DMP_API_COMPATIBLE -I/usr/dt/include -I/usr/openwin/include -I../../../../dist/SunOS5.9_DBG.OBJ/include -I../../../../dist/public/nss -I../../../../dist/private/nss -Impi -Iecl ecl/ecl_gf.c
"ecl/ecl_gf.c", line 278: syntax error before or at: r0
"ecl/ecl_gf.c", line 279: syntax error before or at: r1
"ecl/ecl_gf.c", line 280: syntax error before or at: r2
"ecl/ecl_gf.c", line 365: syntax error before or at: r0
"ecl/ecl_gf.c", line 366: syntax error before or at: r1
"ecl/ecl_gf.c", line 367: syntax error before or at: r2
"ecl/ecl_gf.c", line 368: syntax error before or at: r3
"ecl/ecl_gf.c", line 425: cannot recover from previous errors
cc: acomp failed for ecl/ecl_gf.c
gmake[1]: *** [SunOS5.9_DBG.OBJ/SunOS_ABI32_INT32/ecl_gf.o] Error 2
gmake[1]: Leaving directory `/h/monstre/export/home/nss/tip/mozilla/security/nss/lib/freebl'
gmake: *** [libs] Error 2
Comment 41•19 years ago
|
||
Sheriff Nelson backed out a certain set of changes that broke the build.
Please remember to watch Tinderbox after doing a checkin.
Checking in Makefile; new revision: 1.77; previous revision: 1.76
Checking in manifest.mn; new revision: 1.46; previous revision: 1.45
Checking in ecl/Makefile; new revision: 1.4; previous revision: 1.3
Checking in ecl/ecl-priv.h; new revision: 1.6; previous revision: 1.5
Checking in ecl/ecl.c; new revision: 1.8; previous revision: 1.7
Checking in ecl/ecl_gf.c; new revision: 1.4; previous revision: 1.3
Checking in ecl/ecp_192.c; new revision: 1.4; previous revision: 1.3
Checking in ecl/ecp_224.c; new revision: 1.4; previous revision: 1.3
Comment 42•19 years ago
|
||
Checked in on tip :
Checking in manifest.mn;
/cvsroot/mozilla/security/nss/lib/freebl/manifest.mn,v <-- manifest.mn
new revision: 1.47; previous revision: 1.46
done
Checking in Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/Makefile,v <-- Makefile
new revision: 1.78; previous revision: 1.77
done
Checking in ecl/Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/Makefile,v <-- Makefile
new revision: 1.5; previous revision: 1.4
done
Checking in ecl/ecl-priv.h;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl-priv.h,v <-- ecl-priv.h
new revision: 1.7; previous revision: 1.6
done
Checking in ecl/ecl.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl.c,v <-- ecl.c
new revision: 1.9; previous revision: 1.8
done
Checking in ecl/ecl_gf.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl_gf.c,v <-- ecl_gf.c
new revision: 1.5; previous revision: 1.4
done
Checking in ecl/ecp_192.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_192.c,v <-- ecp_192.c
new revision: 1.5; previous revision: 1.4
done
Checking in ecl/ecp_224.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_224.c,v <-- ecp_224.c
new revision: 1.5; previous revision: 1.4
done
This has now been tested good on solaris 32bit Sparc.
bob
Updated•19 years ago
|
QA Contact: jason.m.reid → libraries
Comment 43•19 years ago
|
||
Checked into NSS 3.11 branch:
Checking in Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/Makefile,v <-- Makefile
new revision: 1.70.2.4; previous revision: 1.70.2.3
done
Checking in manifest.mn;
/cvsroot/mozilla/security/nss/lib/freebl/manifest.mn,v <-- manifest.mn
new revision: 1.44.2.1; previous revision: 1.44
done
Checking in ecl/Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/Makefile,v <-- Makefile
new revision: 1.2.28.1; previous revision: 1.2
done
Checking in ecl/ecl-priv.h;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl-priv.h,v <-- ecl-priv.h
new revision: 1.4.2.1; previous revision: 1.4
done
Checking in ecl/ecl.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl.c,v <-- ecl.c
new revision: 1.5.2.2; previous revision: 1.5.2.1
done
Checking in ecl/ecl_gf.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl_gf.c,v <-- ecl_gf.c
new revision: 1.2.28.1; previous revision: 1.2
done
Checking in ecl/ecp_192.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_192.c,v <-- ecp_192.c
new revision: 1.2.28.1; previous revision: 1.2
done
Checking in ecl/ecp_224.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_224.c,v <-- ecp_224.c
new revision: 1.2.28.1; previous revision: 1.2
done
Checking in ecl/ecp_256.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_256.c,v <-- ecp_256.c
new revision: 1.1.2.2; previous revision: 1.1.2.1
done
Checking in ecl/ecp_384.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_384.c,v <-- ecp_384.c
new revision: 1.1.2.2; previous revision: 1.1.2.1
done
Checking in ecl/ecp_521.c;
/cvsroot/mozilla/security/nss/lib/freebl/ecl/ecp_521.c,v <-- ecp_521.c
new revision: 1.1.2.2; previous revision: 1.1.2.1
done
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Updated•19 years ago
|
Priority: P1 → P2
Target Milestone: 3.11.2 → 3.11.1
Comment 44•19 years ago
|
||
Comment on attachment 215707 [details] [diff] [review]
Optimized curves, faster adds. Version 2
>+#ifdef MPI_AMD64_ADD
>+ /* compiler fakeout? */
>+ if ((r3 == b0) && (r1 == b0) && (r0 == b0)) {
>+ MP_CHECKOK(s_mp_pad(r, 4));
>+ }
>+#endif
>+ MP_CHECKOK(s_mp_pad(r, 4));
>+ MP_DIGIT(r, 3) = r3;
>+ MP_DIGIT(r, 2) = r2;
>+ MP_DIGIT(r, 1) = r1;
>+ MP_DIGIT(r, 0) = r0;
>+ MP_SIGN(r) = MP_ZPOS;
>+ MP_USED(r) = 4;
>+ s_mp_clamp(r);
Bob, what's the purpose of the "compiler fakeout"?
Why doesn't it also test r2 == b0? Why does it call
s_mp_pad(r, 4), which we call again right below?
You need to log in
before you can comment on or make changes to this bug.
Description
•