Closed Bug 416508 Opened 16 years ago Closed 16 years ago

Fix a _MSC_VER typo in sha512.c, and use SEC_BEGIN_PROTOS/SEC_END_PROTOS in secport.h

Categories

(NSS :: Libraries, defect)

defect
Not set
trivial

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wtc, Assigned: wtc)

Details

Attachments

(3 files)

Attached patch Proposed patchSplinter Review
The attached patch fixes two minor problems in sha512.c and secport.h.

I verified that MSC_VER is not defined.  The ifdefs in sha512.c
seem to have been copied from sha_fast.h, which uses the correct
_MSC_VER macro.
Attachment #302260 - Flags: review?(nelson)
Comment on attachment 302260 [details] [diff] [review]
Proposed patch

Wan-Teh, please tell me (us), does this change to sha512.c actually make any difference in the generated code on Windows with VC7 or VC8?
Attachment #302260 - Flags: review?(nelson) → review+
I don't know how to look at the code generated with Visual C++.
Does it have an option equivalent to GCC's -S option to output
an assembly code file?

I checked in the patch on the NSS trunk for NSS 3.12.

Checking in freebl/sha512.c;
/cvsroot/mozilla/security/nss/lib/freebl/sha512.c,v  <--  sha512.c
new revision: 1.10; previous revision: 1.9
done
Checking in util/secport.h;
/cvsroot/mozilla/security/nss/lib/util/secport.h,v  <--  secport.h
new revision: 1.15; previous revision: 1.14
done
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
The patch made sha512.obj smaller.  The following is the size of sha512.obj,
compiled by Visual C++ 2005 in optimized builds.

Before: 29036 Feb 14 13:27 sha512.obj
After:  28714 Feb 14 13:28 sha512.obj
To look at the generated code, I used:
dumpbin /disasm:NOBYTES WIN954.0_OPT.OBJ/WIN95_SINGLE_SHLIB/sha512.obj

I looked at the code before and after this change.  The differences were
bigger than I expected.  I did find that the bswap instructions were not
being inlined before, but were being inlined after.  But I also found a
lot of other differences in the code, including a lot of different 
register assignments and instruction reordering.  

This change made the size of the disassembled code MUCH larger.  
We definitely should compare the performance to see if this has had the 
desired and intended effect.  Otherwise, it's bloat. :/
Nelson, do you want me to back out the MSC_VER => _MSC_VER change?  Note that
the .obj file is smaller, even though the disassembly is larger.

I found that recent versions of Visual C++ has an intrinsic function _byteswap_ulong
for all architectures.  This patch uses that.
Attachment #303424 - Flags: review?(nelson)
Comment on attachment 303424 [details] [diff] [review]
Use Visual C++ intrinsic function _byteswap_ulong

This looks OK.
Wait for the tree to go green before committing it, please.
Let's not rush to undo the _MSC_VER change.  It'll probably be good.
Attachment #303424 - Flags: review?(nelson) → review+
(In reply to comment #4)
One explanation is that inline assembly prevents MSVC from performing
some optimizations inside the function:
http://msdn2.microsoft.com/en-us/library/5hd5ywk0.aspx

The _byteswap_ulong intrinsic function should give us the best of both
worlds.
Comment on attachment 303424 [details] [diff] [review]
Use Visual C++ intrinsic function _byteswap_ulong

I checked in this patch on the NSS trunk for NSS 3.12.

Checking in sha512.c;
/cvsroot/mozilla/security/nss/lib/freebl/sha512.c,v  <--  sha512.c
new revision: 1.11; previous revision: 1.10
done
Checking in sha_fast.h;
/cvsroot/mozilla/security/nss/lib/freebl/sha_fast.h,v  <--  sha_fast.h
new revision: 1.6; previous revision: 1.5
done
Looking at the code more closely, I now understand what
the #if defined(IS_LITTLE_ENDIAN) I removed was for --
fast implementation of SHA_HTONL is only necessary for
little endian machines.  For big endian machines, we
define SHA_NTONL(x) as (x) later in the file.

So I restored the ifdef.

Checking in sha_fast.h;
/cvsroot/mozilla/security/nss/lib/freebl/sha_fast.h,v  <--  sha_fast.h
new revision: 1.7; previous revision: 1.6
done
wtc: Doesn't _X86_ imply IS_LITTLE_ENDIAN? And you define SHA_HTONL as _byteswap for all _MSC_VER >= 1300, not just _X86_ (or IS_LITTLE_ENDIAN), so you might want to add that check there.
jag: SHA_HTONL is defined in both sha_fast.h and sha512.c.  Which
file were you referring to?

In sha512.c, SHA_HTONL is defined simply as a byteswap
(which means the definition would be wrong for big endian), but is
only used for little endian.  So the code is correct.
wtc: I was talking about sha_fast.h. So yeah, while the compiled code works, I wouldn't call SHA_HTONL correct.

And I just noticed that the IS_LITTLE_ENDIAN doesn't just guard the _X86_ section but also the _byteswap intrinsic.
jag: in sha_fast.h, the definition of SHA_HTONL is correct.  sha_fast.h
doesn't have the problem of sha512.c that I described in comment 11.
You must have missed something.

To avoid misunderstanding, please use a patch to identify the problem.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: