Closed Bug 350948 Opened 18 years ago Closed 18 years ago

freebl macro change can give 1% improvement in RSA performance on amd64

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

3.12

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Details

Attachments

(1 file)

simple Makefile change. Use a different word-at-a-time implementation of the weave code (18 years ago, Julien Pierre, 1.26 KB, patch; nelson: review+)

Julien Pierre

Assignee

Description

•

18 years ago

The change selects a different implementation of the weave code. With it, I measured a 1% improvement in rsaperf 1024-bit private-key operations on Solaris amd64 (64-bit).
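To illustrate the kind of difference being discussed, here is a minimal sketch of byte-at-a-time versus word-at-a-time interleaving ("weaving") of several byte arrays. It is not the freebl code: the names weave_bytes, weave_words and WEAVE_COLS are hypothetical, the column count of 4 is chosen only so that one row fits a 32-bit word, and a little-endian host (such as amd64) is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WEAVE_COLS 4 /* hypothetical: number of values interleaved per row */

/* Byte-at-a-time: one 8-bit store for every source byte. */
static void weave_bytes(unsigned char *out,
                        const unsigned char *src[WEAVE_COLS], size_t len)
{
    for (size_t i = 0; i < len; i++)
        for (size_t c = 0; c < WEAVE_COLS; c++)
            out[i * WEAVE_COLS + c] = src[c][i];
}

/* Word-at-a-time: gather one row into a register, then issue a single
 * 32-bit store. The shift order below assumes a little-endian host. */
static void weave_words(unsigned char *out,
                        const unsigned char *src[WEAVE_COLS], size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint32_t w = (uint32_t)src[0][i]
                   | (uint32_t)src[1][i] << 8
                   | (uint32_t)src[2][i] << 16
                   | (uint32_t)src[3][i] << 24;
        /* memcpy avoids alignment and aliasing problems; compilers
         * typically turn it into one word store. */
        memcpy(out + i * WEAVE_COLS, &w, sizeof w);
    }
}
```

Both functions produce the same output layout; they differ only in how many store instructions reach memory, which is the kind of property that can behave differently from one CPU to another.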

Julien Pierre

Assignee

Comment 1

•

18 years ago

Attached patch: simple Makefile change. Use a different word-at-a-time implementation of the weave code

Julien Pierre

Assignee

Updated

•

18 years ago

Priority: -- → P2

Target Milestone: --- → 3.12

Julien Pierre

Assignee

Updated

•

18 years ago

Attachment #236345 - Flags: review?(nelson)

Wan-Teh Chang

Comment 2

•

18 years ago

Comment on attachment 236345
simple Makefile change. Use a different word-at-a-time implementation of the weave code

You can use this on Solaris x86, too.

Nelson Bolyard (seldom reads bugmail)

Comment 3

•

18 years ago

Comment on attachment 236345
simple Makefile change. Use a different word-at-a-time implementation of the weave code

r=nelson for trunk

Attachment #236345 - Flags: review?(nelson) → review+

Julien Pierre

Assignee

Comment 4

•

18 years ago

Wan-Teh,

I tried it on Solaris x86 and I saw a decrease in performance of 9.3%.

Julien Pierre

Assignee

Comment 5

•

18 years ago

Checked in on the trunk:

Checking in Makefile;
/cvsroot/mozilla/security/nss/lib/freebl/Makefile,v  <--  Makefile
new revision: 1.87; previous revision: 1.86
done

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Wan-Teh Chang

Comment 6

•

18 years ago

Julien, thanks for the Solaris x86 info.  I wonder if we
should turn it off for Linux x86, too.

Julien Pierre

Assignee

Comment 7

•

18 years ago

I would expect the results to be similar between Solaris and Linux, given that it's the same code.

Note that I benchmarked on a dual Opteron machine with
rsaperf -d . -n none -p 30 -t 2

On the Opteron machine, I got the 1% increase in 64-bit mode, and a 9.3% decrease in 32-bit mode, when adding the defines from attachment 236345 to the Solaris x86 block.

I think it is a property of AMD CPUs that 8-bit accesses are slow in the 64-bit instruction set, but not in the 32-bit instruction set. The same may not necessarily be true on Intel chips. You would have to experiment. I don't have access to any 64-bit Intel CPUs to do this comparison.

I tested the 32-bit code on a 32-bit dual Xeon CPU with hyperthreading, using 4 threads in rsaperf. On that chip, there was a 1% increase from using MP_CHAR_STORE_SLOW, with the same binaries that gave a 9.3% decrease on the AMD. That just goes to show that the chips are really very different. You have to decide which chip you want to optimize for. If you want to optimize for both, you should create multiple freebl libraries. But at that point, you would probably want to look into using SSE2 for that additional freebl lib. SSE2 will perform even better on the Intel; on the AMD the improvement is smaller, though still better than the standard multiply instruction.
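As a rough sketch of how a Makefile define can flip implementations, the function below compiles to a single word store when the build passes -DMP_CHAR_STORE_SLOW and to four byte stores otherwise. Only the macro name comes from this bug; the function store4 is hypothetical, a little-endian host is assumed, and the real freebl code may be organized differently.

```c
#include <stdint.h>
#include <string.h>

/* Store four bytes at dst. The body is selected at compile time by
 * -DMP_CHAR_STORE_SLOW; store4 itself is a hypothetical helper. */
static void store4(unsigned char *dst, unsigned char b0, unsigned char b1,
                   unsigned char b2, unsigned char b3)
{
#ifdef MP_CHAR_STORE_SLOW
    /* Byte stores are assumed slow: assemble a word (little-endian
     * order) and write it with one store. */
    uint32_t w = (uint32_t)b0 | (uint32_t)b1 << 8
               | (uint32_t)b2 << 16 | (uint32_t)b3 << 24;
    memcpy(dst, &w, sizeof w);
#else
    /* Byte stores are assumed cheap: write each byte directly. */
    dst[0] = b0;
    dst[1] = b1;
    dst[2] = b2;
    dst[3] = b3;
#endif
}
```

Because the choice is fixed at compile time, one library can only favor one CPU family, which is why covering both would mean building more than one freebl library.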


Categories

(NSS :: Libraries, defect, P2)
