Closed Bug 311432 Opened 19 years ago Closed 16 years ago

ECC's ECL_USE_FP code (for Linux x86) fails pairwise consistency test

Categories

(NSS :: Libraries, defect, P4)

3.11
x86
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED
3.12.1

People

(Reporter: wtc, Assigned: rrelyea)

Details

(Whiteboard: ECC)

Attachments

(1 file)

I'm building the NSS tip (NSS 3.11 pre-release) on Red Hat
Enterprise Linux 4 (very close to Fedora Core 3) on x86 with
NSS_ENABLE_ECC=1.

When I run all.sh, certutil fails because the new EC key pair
fails the pairwise consistency test: we use the new key pair
to sign and then verify the signature, and the signature
verification fails.  all.sh passes on Windows.
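
For readers unfamiliar with the term, a pairwise consistency test can be
sketched roughly as below.  This is only an illustration of the
sign-then-verify idea: it uses toy textbook RSA numbers (n = 3233, e = 17,
d = 2753) in place of an EC signature, and ToyKeyPair, toy_sign, toy_verify
and pairwise_consistency_ok are hypothetical names, not NSS functions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy key pair (assumption: textbook RSA stands in for the EC key pair). */
typedef struct {
    uint32_t n; /* public modulus */
    uint32_t e; /* public exponent */
    uint32_t d; /* private exponent */
} ToyKeyPair;

/* Square-and-multiply modular exponentiation. */
static uint32_t mod_pow(uint32_t base, uint32_t exp, uint32_t mod)
{
    uint64_t result = 1;
    uint64_t b;

    assert(mod != 0);
    b = base % mod;
    while (exp > 0) {
        if (exp & 1u)
            result = (result * b) % mod;
        b = (b * b) % mod;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* "Sign" a digest with the private half of the key pair. */
uint32_t toy_sign(const ToyKeyPair *k, uint32_t digest)
{
    return mod_pow(digest, k->d, k->n);
}

/* Verify a signature with the public half; returns 1 on success. */
int toy_verify(const ToyKeyPair *k, uint32_t digest, uint32_t sig)
{
    return mod_pow(sig, k->e, k->n) == digest % k->n;
}

/* Pairwise consistency: sign a known value, then verify the result.
 * A key pair whose halves do not match fails this check. */
int pairwise_consistency_ok(const ToyKeyPair *k)
{
    const uint32_t known_digest = 65;
    uint32_t sig = toy_sign(k, known_digest);

    return toy_verify(k, known_digest, sig);
}
```

With the matching pair {3233, 17, 2753} the check passes; changing d to
2754 makes it fail, which is the kind of failure certutil reports here when
the underlying arithmetic produces a wrong result.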

I looked at nss/lib/freebl/Makefile for anything that's specific
to Linux, and found this block of code:

ifeq ($(OS_TARGET),Linux)
ifeq ($(CPU_ARCH),x86_64)
    ASFILES  = arcfour-amd64-gas.s mpi_amd64_gas.s
    ASFLAGS += -march=opteron -m64 -fPIC
    DEFINES += -DNSS_BEVAND_ARCFOUR -DMPI_AMD64 -DMP_ASSEMBLY_MULTIPLY
    DEFINES += -DNSS_USE_COMBA
    MPI_SRCS += mpi_amd64.c mp_comba.c
endif
ifeq ($(CPU_ARCH),x86)
    ASFILES  = mpi_x86.s
    DEFINES += -DMP_ASSEMBLY_MULTIPLY -DMP_ASSEMBLY_SQUARE
    DEFINES += -DMP_ASSEMBLY_DIV_2DX1D
    USE_FP_CODE = 1
endif
ifdef NSS_ENABLE_ECC
    ifdef USE_FP_CODE
        #enable floating point ECC code
        DEFINES  += -DECL_USE_FP
        ECL_SRCS += ecp_fp160.c ecp_fp192.c ecp_fp224.c ecp_fp.c
        ECL_HDRS += ecp_fp.h
    endif
endif # NSS_ENABLE_ECC
endif # Linux

If I comment out the "USE_FP_CODE = 1" statement, all.sh passes.

Vipul, Douglas, what is this floating point ECC code, and why is
it only used on Linux x86?  Do you remember when you last tested
it?  Any idea why it doesn't work now?
(In reply to comment #0)
> Vipul, Douglas, what is this floating point ECC code, and why is
> it only used on Linux x86?  Do you remember when you last tested
> it?  Any idea why it doesn't work now?

The floating point ECC code is described in the file freebl/ecl/README.FP.  It
represents field elements using arrays of double instead of int because the
floating point unit on some platforms (notably, UltraSPARC) is much faster than
the integer unit.  
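
To illustrate the idea of holding multi-precision values in arrays of
double, here is a minimal sketch.  It is not the ecp_fp code: the 16-bit
limb size and all function names (fp_from_u32, fp_mul, fp_to_u64,
fp_mul_matches_integer) are assumptions, chosen so that every intermediate
value stays exactly representable in a 53-bit mantissa.

```c
#include <assert.h>
#include <stdint.h>

#define LIMB_BASE 65536.0 /* 2^16: each double holds one 16-bit limb */

/* Split a 32-bit integer into two limbs, least significant first. */
void fp_from_u32(uint32_t v, double limbs[2])
{
    limbs[0] = (double)(v & 0xFFFFu);
    limbs[1] = (double)(v >> 16);
}

/* Schoolbook multiplication on double limbs, then carry propagation.
 * Column sums stay below 2^34, so the double arithmetic is exact. */
void fp_mul(const double a[2], const double b[2], double out[4])
{
    double col[4] = { 0.0, 0.0, 0.0, 0.0 };
    double carry = 0.0;
    int i, j;

    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++)
            col[i + j] += a[i] * b[j];

    for (i = 0; i < 4; i++) {
        double t = col[i] + carry;
        /* Truncation equals floor here because t is non-negative. */
        carry = (double)(uint64_t)(t / LIMB_BASE);
        out[i] = t - carry * LIMB_BASE;
    }
    /* A 32x32-bit product always fits in four 16-bit limbs. */
    assert(carry == 0.0);
}

/* Reassemble four normalized limbs into a 64-bit integer. */
uint64_t fp_to_u64(const double limbs[4])
{
    uint64_t v = 0;
    int i;

    for (i = 3; i >= 0; i--)
        v = (v << 16) | (uint64_t)limbs[i];
    return v;
}

/* Cross-check the floating point result against integer arithmetic. */
int fp_mul_matches_integer(uint32_t x, uint32_t y)
{
    double a[2], b[2], p[4];

    fp_from_u32(x, a);
    fp_from_u32(y, b);
    fp_mul(a, b, p);
    return fp_to_u64(p) == (uint64_t)x * (uint64_t)y;
}
```

The cross-check in fp_mul_matches_integer mirrors, in a very reduced form,
the approach of comparing a floating point result against an integer
reference result.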

I know we tested it and it worked on Solaris/UltraSPARC, but I don't remember
how much testing was done on Linux/x86.  

Could you try running the self-contained floating point code tests and let me
know the result?  That could help diagnose the problem.  To do so, do the following:
1. export TARGET=x86LINUX
2. cd .../freebl/mpi
3. make libmpi.a
4. cd .../freebl/ecl
5. make tests
6. ./ecp_fpt
This should give some indication as to where the problem lies.
Douglas: the ECL_USE_FP code is used not only on Linux x86
but also on Solaris SPARC, as you said.

I just followed your instructions on Red Hat Enterprise Linux
(RHEL) 4 x86 to build the ecp_fpt test.  The test has unresolved
references to ec_GFp_pt_mul_jac:

ecp_fpt.o(.text+0x2b59): In function `testPointMulRandom':
: undefined reference to `ec_GFp_pt_mul_jac'
ecp_fpt.o(.text+0x30b0): In function `testPointMulTime':
: undefined reference to `ec_GFp_pt_mul_jac'
collect2: ld returned 1 exit status

This is because that function is inside an ifdef in ecp_jac.c:

/* by default, this routine is unused and thus doesn't need to be compiled */
#ifdef ECL_ENABLE_GFP_PT_MUL_JAC
/* Computes R = nP where R is (rx, ry) and P is (px, py). The parameters
 * a, b and p are the elliptic curve coefficients and the prime that
 * determines the field GFp.  Elliptic curve points P and R can be
 * identical.  Uses mixed Jacobian-affine coordinates. Assumes input is
 * already field-encoded using field_enc, and returns output that is still
 * field-encoded. Uses 4-bit window method. */
mp_err
ec_GFp_pt_mul_jac(const mp_int *n, const mp_int *px, const mp_int *py,
                                  mp_int *rx, mp_int *ry, const ECGroup *group)
{
...
}
#endif

If I remove the ifdef, the ecp_fpt can be built successfully.
When I run it, I get this failure:

Testing SECG-160R1 using specific floating point implementation...
  Error: Jacobian Floating Point Incorrect.
floating point result
rx    8CEF852E1242A7A780B439BB1D1FA3AD846B5C2F
ry    70D5F73A40657E3276EBA46CF886932BB15740E6
integer result
x   DF389CD53604A97D308C056A38FA8ACB222BE7C5
y   7CCB6AC9F683B820FA8BD44631ABADF3DC05D213
TEST FAILED - Point Addition - Jacobian & Affine
Error: exiting with error value -1

Summary: ECC's USE_FP_CODE=1 code (for Linux x86) fails pairwise consistency test → ECC's ECL_USE_FP code (for Linux x86) fails pairwise consistency test
On Solaris 9 UltraSPARC (I had to remove the same ifdef to get
the ecp_fpt test to link):

TARGET=v8plusSOLARIS
Test passed

TARGET=v9SOLARIS
Test passed

TARGET=v8SOLARIS
Test passed



This patch works around the floating point ECC code bug
on Linux x86 by disabling it.  It also contains some code
cleanup changes.  Here is a summary of the changes.

1. Rename the USE_FP_CODE variable as ECL_USE_FP, which is
the variable used in the standalone makefiles in lib/freebl/ecl.

2. Comment out ECL_USE_FP = 1 for Linux x86 to work around this
bug.

3. Share the common makefile code for ECL_USE_FP between Linux x86
and Solaris SPARC.

4. Move the dependency rule for $(OBJDIR)/sysrand$(OBJ_SUFFIX) to
the appropriate section of the makefile.
Attachment #202178 - Flags: review?(nelson)
Comment on attachment 202178 [details] [diff] [review]
Work around the bug, plus cleanup

r=nelson
Attachment #202178 - Flags: review?(nelson) → review+
Comment on attachment 202178 [details] [diff] [review]
Work around the bug, plus cleanup

I checked in this patch on the NSS trunk for NSS 3.11.

Checking in Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/Makefile,v  <--  Makefile
new revision: 1.68; previous revision: 1.67
done
Douglas, do you know this code?

Are either of the other contributors, Stephen Fung <fungstep@hotmail.com> and
Nils Gura <nils.gura@sun.com>, available to look at it?  

There is an interesting comment in README.FP, which says:

   Some platforms, notably x86 Linux, may use an extended-precision floating
   point representation that has a 64-bit mantissa.  [6]  Although this
   implementation is optimized for the IEEE standard 53-bit mantissa,
   it should work with the 64-bit mantissa.  A check is done at run-time
   in the function ec_set_fp_precision that detects if the precision is
   greater than 53 bits, and runs code for the 64-bit mantissa accordingly.

Seems to me that the ability to represent an "extended" 64-bit mantissa 
would be a property of the hardware more than of the OS (x86 Linux).
Anyway, I wonder if the code to detect the extended precision ability is 
not working properly.
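
A run-time check of the kind the README describes could look roughly like
the sketch below.  This is a guess at the general technique, not the actual
ec_set_fp_precision code, and count_mantissa_bits is a hypothetical name.
The result depends on the compiler and its flags as well as the hardware:
with SSE2 arithmetic it should report 53, while with x87 arithmetic and
intermediates kept in registers it may report 64.

```c
#include <assert.h>
#include <float.h>

/* Count how many mantissa bits "double" arithmetic effectively carries.
 * Halve eps until adding it to 1.0 no longer changes the sum; the number
 * of halvings equals the effective mantissa width. */
int count_mantissa_bits(void)
{
    double one = 1.0;
    double eps = 1.0;
    int bits = 0;

    while (one + eps != one) {
        eps /= 2.0;
        bits++;
    }
    /* Evaluation precision is never narrower than the double format. */
    assert(bits >= DBL_MANT_DIG);
    return bits;
}

/* Compile-time hint from <float.h>: a value of 2 means the compiler
 * evaluates double expressions in long double precision. */
int double_eval_is_extended(void)
{
    return FLT_EVAL_METHOD == 2;
}
```

Because the answer can change with optimization level and code generation
options, a check like this is fragile, which fits the suspicion that the
detection may not be working properly.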
Priority: -- → P2
Whiteboard: ECC
Target Milestone: --- → 3.11.2
Per our meeting, reassigning to Bob.
Questions:
1) Is FP really a win over integer on x86? If not, we should not enable it on any x86 OS.
2) Why is this only set for Linux x86 and not other x86 OSes (e.g. Solaris x86, Windows)? We should do all or none.
Assignee: wtchang → rrelyea
QA Contact: jason.m.reid → libraries
Retargeting all P2s to 3.11.3.
Target Milestone: 3.11.2 → 3.11.3
We need to freeze the lib/freebl code on the NSS_3_11_BRANCH
because we have finished the FIPS algorithm testing.  Moved
the target milestone to NSS 3.12.
Target Milestone: 3.11.3 → 3.12
Patch in hand; going ahead and bumping for the bug council.

bob
Priority: P2 → P1
Target Milestone: 3.12 → 3.12.1
Blocks: FIPS2008
Priority: P1 → P4
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
No longer blocks: FIPS2008