Closed Bug 311432 Opened 20 years ago Closed 17 years ago

ECC's ECL_USE_FP code (for Linux x86) fails pairwise consistency test

Categories

(NSS :: Libraries, defect, P4)

3.11
x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED
3.12.1

People

(Reporter: wtc, Assigned: rrelyea)

Details

(Whiteboard: ECC)

Attachments

(1 file)

I'm building the NSS tip (NSS 3.11 pre-release) on Red Hat Enterprise Linux 4 (very close to Fedora Core 3) on x86 with NSS_ENABLE_ECC=1. When I run all.sh, certutil fails because the new EC key pair fails the pairwise consistency test; we use the new key pair to sign and then verify the signature; the signature verification fails. all.sh passes on Windows. I looked at nss/lib/freebl/Makefile for anything that's specific to Linux, and found this block of code: ifeq ($(OS_TARGET),Linux) ifeq ($(CPU_ARCH),x86_64) ASFILES = arcfour-amd64-gas.s mpi_amd64_gas.s ASFLAGS += -march=opteron -m64 -fPIC DEFINES += -DNSS_BEVAND_ARCFOUR -DMPI_AMD64 -DMP_ASSEMBLY_MULTIPLY DEFINES += -DNSS_USE_COMBA MPI_SRCS += mpi_amd64.c mp_comba.c endif ifeq ($(CPU_ARCH),x86) ASFILES = mpi_x86.s DEFINES += -DMP_ASSEMBLY_MULTIPLY -DMP_ASSEMBLY_SQUARE DEFINES += -DMP_ASSEMBLY_DIV_2DX1D USE_FP_CODE = 1 endif ifdef NSS_ENABLE_ECC ifdef USE_FP_CODE #enable floating point ECC code DEFINES += -DECL_USE_FP ECL_SRCS += ecp_fp160.c ecp_fp192.c ecp_fp224.c ecp_fp.c ECL_HDRS += ecp_fp.h endif endif # NSS_ENABLE_ECC endif # Linux If I comment out the "USE_FP_CODE = 1" statement, all.sh passes. Vipul, Douglas, what is this floating point ECC code, and why is it only used on Linux x86? Do you remember when you last tested it? Any idea why it doesn't work now?
(In reply to comment #0) > Vipul, Douglas, what is this floating point ECC code, and why is > it only used on Linux x86? Do you remember when you last tested > it? Any idea why it doesn't work now? The floating point ECC code is described in the file freebl/ecl/README.FP. It represents field elements using arrays of double instead of int because the floating point unit on some platforms (notably, UltraSPARC) is much faster than the integer unit. I know we tested it and it worked on Solaris/UltraSPARC, but I don't remember how much testing was done on Linux/x86. Could you try running the self-contained floating point code tests and let me know the result? That could help diagnose the problem. To do so, do the following: 1. export TARGET=x86LINUX 2. cd .../freebl/mpi 3. make libmpi.a 4. cd .../freebl/ecl 5. make tests 6. ./ecp_fpt This should give some indication as to where the problem lies.
Douglas: the ECL_USE_FP code is used not only on Linux x86 but also on Solaris SPARC, as you said. I just followed your instructions on Red Hat Enterprise Linux (RHEL) 4 x96 to build the ecp_fpt test. The test has unresolved references to ec_GFp_pt_mul_jac: ecp_fpt.o(.text+0x2b59): In function `testPointMulRandom': : undefined reference to `ec_GFp_pt_mul_jac' ecp_fpt.o(.text+0x30b0): In function `testPointMulTime': : undefined reference to `ec_GFp_pt_mul_jac' collect2: ld returned 1 exit status This is because that function is inside an ifdef in ecp_jac.c: /* by default, this routine is unused and thus doesn't need to be compiled */ #ifdef ECL_ENABLE_GFP_PT_MUL_JAC /* Computes R = nP where R is (rx, ry) and P is (px, py). The parameters * a, b and p are the elliptic curve coefficients and the prime that * determines the field GFp. Elliptic curve points P and R can be * identical. Uses mixed Jacobian-affine coordinates. Assumes input is * already field-encoded using field_enc, and returns output that is still * field-encoded. Uses 4-bit window method. */ mp_err ec_GFp_pt_mul_jac(const mp_int *n, const mp_int *px, const mp_int *py, mp_int *rx, mp_int *ry, const ECGroup *group)' { ... } #endif If I remove the ifdef, the ecp_fpt can be built successfully. When I run it, I get this failure: Testing SECG-160R1 using specific floating point implementation... Error: Jacobian Floating Point Incorrect. floating point result rx 8CEF852E1242A7A780B439BB1D1FA3AD846B5C2F ry 70D5F73A40657E3276EBA46CF886932BB15740E6 integer result x DF389CD53604A97D308C056A38FA8ACB222BE7C5 y 7CCB6AC9F683B820FA8BD44631ABADF3DC05D213 TEST FAILED - Point Addition - Jacobian & Affine Error: exiting with error value -1
Summary: ECC's USE_FP_CODE=1 code (for Linux x86) fails pairwise consistency test → ECC's ECL_USE_FP code (for Linux x86) fails pairwise consistency test
On Solaris 9 UltraSPARC (I had to remove the same ifdef to get the ecp_fpt test to link): TARGET=v8plusSOLARIS Test passed TARGET=v9SOLARIS Test passed TARGET=v8SOLARIS Test passed
This patch works around the floating point ECC code bug on Linux x86 by disabling it. It also contains some code cleanup changes. Here is a summary of the changes. 1. Rename the USE_FP_CODE variable as ECL_USE_FP, which is the variable used in the standalone makefiles in lib/freebl/ecl. 2. Comment out ECL_USE_FP = 1 for Linux x86 to work around this bug. 3. Share the common makefile code for ECL_USE_FP between Linux x86 and Solaris SPARC. 4. Move the dependency rule for $(OBJDIR)/sysrand$(OBJ_SUFFIX) to the appropriate section of the makefile.
Attachment #202178 - Flags: review?(nelson)
Comment on attachment 202178 [details] [diff] [review] Work around the bug, plus cleanup r=nelson
Attachment #202178 - Flags: review?(nelson) → review+
Comment on attachment 202178 [details] [diff] [review] Work around the bug, plus cleanup I checked in this patch on the NSS trunk for NSS 3.11. Checking in Makefile; /cvsroot/mozilla/security/nss/lib/freebl/Makefile,v <-- Makefile new revision: 1.68; previous revision: 1.67 done
Douglas, Do you know this code? Are either of the other contributors, Stephen Fung <fungstep@hotmail.com> and Nils Gura <nils.gura@sun.com>, available to look at it? There is an interesting comment in README.FP, which says: Some platforms, notably x86 Linux, may use an extended-precision floating point representation that has a 64-bit mantissa. [6] Although this implementation is optimized for the IEEE standard 53-bit mantissa, it should work with the 64-bit mantissa. A check is done at run-time in the function ec_set_fp_precision that detects if the precision is greater than 53 bits, and runs code for the 64-bit mantissa accordingly. Seems to me that the ability to represent an "extended" 64-bit mantissa would be a property of the hardware more than of the OS (x86 Linux). Anyway, I wonder if the code to detect the extended precision ability is not working properly.
Priority: -- → P2
Whiteboard: ECC
Target Milestone: --- → 3.11.2
Per our meeting, reassigning to Bob. Questions: 1) is FP really a win over integer on x86 ? If not, we should not enable it on any x86 OS . 2) why is this only set for Linux x86 and not other x86 OSes (eg. Solaris x86, Windows). We should do all or none.
Assignee: wtchang → rrelyea
QA Contact: jason.m.reid → libraries
Retargetting all P2s to 3.11.3 .
Target Milestone: 3.11.2 → 3.11.3
We need to freeze the lib/freebl code on the NSS_3_11_BRANCH because we have finished the FIPS algorithm testing. Moved the target milestone to NSS 3.12.
Target Milestone: 3.11.3 → 3.12
patch in hand, going ahead and bumping for the bug council. bob
Priority: P2 → P1
Target Milestone: 3.12 → 3.12.1
Blocks: FIPS2008
Priority: P1 → P4
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
No longer blocks: FIPS2008
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: