Last Comment Bug 416508 - Fix a _MSC_VER typo in sha512.c, and use SEC_BEGIN_PROTOS/SEC_END_PROTOS in secport.h
: Fix a _MSC_VER typo in sha512.c, and use SEC_BEGIN_PROTOS/SEC_END_PROTOS in s...
Status: RESOLVED FIXED
:
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: unspecified
: All All
: -- trivial (vote)
: 3.12
Assigned To: Wan-Teh Chang
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2008-02-08 23:12 PST by Wan-Teh Chang
Modified: 2008-02-19 17:28 PST (History)
3 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Proposed patch (2.93 KB, patch)
2008-02-08 23:12 PST, Wan-Teh Chang
nelson: review+
Details | Diff | Review
Use Visual C++ intrinsic function _byteswap_ulong (2.62 KB, patch)
2008-02-14 17:14 PST, Wan-Teh Chang
nelson: review+
Details | Diff | Review
Undo the removal of #if defined(IS_LITTLE_ENDIAN) (1.25 KB, patch)
2008-02-15 21:12 PST, Wan-Teh Chang
no flags Details | Diff | Review

Description Wan-Teh Chang 2008-02-08 23:12:53 PST
Created attachment 302260 [details] [diff] [review]
Proposed patch

The attached patch fixes two minor problems in sha512.c and secport.h.

I verified that MSC_VER is not defined.  The ifdefs in sha512.c
seem to have been copied from sha_fast.h, which uses the correct
_MSC_VER macro.
Comment 1 Nelson Bolyard (seldom reads bugmail) 2008-02-12 16:26:18 PST
Comment on attachment 302260 [details] [diff] [review]
Proposed patch

Wan-Teh, please tell me (us), does this change to sha512.c actually make any difference in the generated code on Windows with VC7 or VC8?
Comment 2 Wan-Teh Chang 2008-02-14 10:45:03 PST
I don't know how to look at the code generated with Visual C++.
Does it have an option equivalent to GCC's -S option to output
an assembly code file?

I checked in the patch on the NSS trunk for NSS 3.12.

Checking in freebl/sha512.c;
/cvsroot/mozilla/security/nss/lib/freebl/sha512.c,v  <--  sha512.c
new revision: 1.10; previous revision: 1.9
done
Checking in util/secport.h;
/cvsroot/mozilla/security/nss/lib/util/secport.h,v  <--  secport.h
new revision: 1.15; previous revision: 1.14
done
Comment 3 Wan-Teh Chang 2008-02-14 13:32:34 PST
The patch made sha512.obj smaller.  The following is the size of sha512.obj,
compiled by Visual C++ 2005 in optimized builds.

Before: 29036 Feb 14 13:27 sha512.obj
After:  28714 Feb 14 13:28 sha512.obj
Comment 4 Nelson Bolyard (seldom reads bugmail) 2008-02-14 13:37:42 PST
To look at the generated code, I used:
dumpbin /disasm:NOBYTES WIN954.0_OPT.OBJ/WIN95_SINGLE_SHLIB/sha512.obj

I looked at the code before and after this change.  The differences were
bigger than I expected.  I did find that the bswap instructions were not
being inlined before, but were being inlined after.  But I also found a
lot of other differences in the code, including a lot of different 
register assignments and instruction reordering.  

This change made the size of the disassembled code MUCH larger.  
We definitely should compare the performance to see if this has had the 
desired and intended effect.  Otherwise, it's bloat. :/
Comment 5 Wan-Teh Chang 2008-02-14 17:14:01 PST
Created attachment 303424 [details] [diff] [review]
Use Visual C++ intrinsic function _byteswap_ulong

Nelson, do you want me to back out the MSC_VER => _MSC_VER change?  Note that
the .obj file is smaller, even though the disassembly is larger.

I found that recent versions of Visual C++ has an intrinsic function _byteswap_ulong
for all architectures.  This patch uses that.
Comment 6 Nelson Bolyard (seldom reads bugmail) 2008-02-14 17:31:27 PST
Comment on attachment 303424 [details] [diff] [review]
Use Visual C++ intrinsic function _byteswap_ulong

This looks OK.
Wait for the tree to go green before committing it, please.
Let's not rush to undo the _MSC_VER change.  It'll probably be good.
Comment 7 Wan-Teh Chang 2008-02-14 21:49:18 PST
(In reply to comment #4)
One explanation is that inline assembly prevents MSVC from performing
some optimizations inside the function:
http://msdn2.microsoft.com/en-us/library/5hd5ywk0.aspx

The _byteswap_ulong intrinsic function should give us the best of both
worlds.
Comment 8 Wan-Teh Chang 2008-02-15 18:25:29 PST
Comment on attachment 303424 [details] [diff] [review]
Use Visual C++ intrinsic function _byteswap_ulong

I checked in this patch on the NSS trunk for NSS 3.12.

Checking in sha512.c;
/cvsroot/mozilla/security/nss/lib/freebl/sha512.c,v  <--  sha512.c
new revision: 1.11; previous revision: 1.10
done
Checking in sha_fast.h;
/cvsroot/mozilla/security/nss/lib/freebl/sha_fast.h,v  <--  sha_fast.h
new revision: 1.6; previous revision: 1.5
done
Comment 9 Wan-Teh Chang 2008-02-15 21:12:50 PST
Created attachment 303662 [details] [diff] [review]
Undo the removal of #if defined(IS_LITTLE_ENDIAN)

Looking at the code more closely, I now understand what
the #if defined(IS_LITTLE_ENDIAN) I removed was for --
fast implementation of SHA_HTONL is only necessary for
little endian machines.  For big endian machines, we
define SHA_NTONL(x) as (x) later in the file.

So I restored the ifdef.

Checking in sha_fast.h;
/cvsroot/mozilla/security/nss/lib/freebl/sha_fast.h,v  <--  sha_fast.h
new revision: 1.7; previous revision: 1.6
done
Comment 10 jag (Peter Annema) 2008-02-17 00:48:10 PST
wtc: Doesn't _X86_ imply IS_LITTLE_ENDIAN? And you define SHA_HTONL as _byteswap for all _MSC_VER >= 1300, not just _X86_ (or IS_LITTLE_ENDIAN), so you might want to add that check there.
Comment 11 Wan-Teh Chang 2008-02-19 11:42:05 PST
jag: SHA_HTONL is defined in both sha_fast.h and sha512.c.  Which
file were you referring to?

In sha512.c, SHA_HTONL is defined simply as a byteswap
(which means the definition would be wrong for big endian), but is
only used for little endian.  So the code is correct.
Comment 12 jag (Peter Annema) 2008-02-19 15:25:37 PST
wtc: I was talking about sha_fast.h. So yeah, while the compiled code works, I wouldn't call SHA_HTONL correct.

And I just noticed that the IS_LITTLE_ENDIAN doesn't just guard the _X86_ section but also the _byteswap intrinsic.
Comment 13 Wan-Teh Chang 2008-02-19 17:28:35 PST
jag: in sha_fast.h, the definition of SHA_HTONL is correct.  sha_fast.h
doesn't have the problem of sha512.c that I described in comment 11.
You must have missed something.

To avoid misunderstanding, please use a patch to identify the problem.

Note You need to log in before you can comment on or make changes to this bug.