432516 - SM: consider doubleToInt32 optimization

Reporter

Description

•

17 years ago

+++ This bug was initially created as a clone of Bug #412978 +++

Intel posted a straightforward patch there that improves tamarin-tracing performance dramatically for some benchmarks. We should try it out in SpiderMonkey.

Robert Sayre

Comment 1

•

17 years ago

I'll check this out.

Assignee: general → sayrer

Igor Bukanov

Comment 2

•

17 years ago

We should also look into using that idea to optimize JSDOUBLE_IS_INT(). In particular, the sequence JSDOUBLE_IS_INT(d, i_) && INT_FITS_IN_JSVAL(i_) should benefit from this.

Moh Haghighat

Comment 3

•

17 years ago

I'll also look at what Igor mentions in Comment #2 above.

Igor Bukanov

Comment 4

•

17 years ago

(In reply to comment #2)
> We should also look into using that idea to optimize JSDOUBLE_IS_INT(). In
> particular, the sequence JSDOUBLE_IS_INT(d, i_) && INT_FITS_IN_JSVAL(i_) should
> benefit from this.

To clarify: it would be nice to have a fast macro or function like:

    jsval v = JSDOUBLE_FITS_IN_JSVAL(d);
    if (v != JSVAL_NULL) {
        // v here is JSVAL_INT
    }
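For illustration only, here is a minimal C sketch of the combined check that comment 4 asks for. It is not SpiderMonkey code: the function name, the `SKETCH_INT_MIN`/`SKETCH_INT_MAX` bounds (an assumed 31-bit tagged-integer range), and the bool-plus-out-parameter interface (instead of returning a jsval with JSVAL_NULL as the sentinel) are all assumptions made for this example.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tagged-integer range for this sketch (31-bit signed payload). */
#define SKETCH_INT_MIN (-(INT32_C(1) << 30) + 1)
#define SKETCH_INT_MAX ((INT32_C(1) << 30) - 1)

/* Hypothetical helper: returns true and stores the integer in *out when d is
 * an integral value (not -0, NaN, or infinity) inside the assumed range. */
static bool sketch_double_fits_in_tagged_int(double d, int32_t *out)
{
    /* Range check first so the cast below is always defined.
     * NaN fails both comparisons and is rejected here. */
    if (!(d >= (double)SKETCH_INT_MIN && d <= (double)SKETCH_INT_MAX))
        return false;

    int32_t i = (int32_t)d;        /* truncates toward zero */
    if ((double)i != d)            /* d had a fractional part */
        return false;
    if (i == 0 && signbit(d))      /* reject -0.0, which is distinct from 0 in JS */
        return false;

    *out = i;
    return true;
}
```

A single function like this folds the "is it an integer" and "does it fit" tests into one pass over the value, which is the shape of the sequence mentioned in comment 2.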

Moh Haghighat

Comment 5

•

17 years ago

Attached file A test driver to evaluate the performance gain of the optimized version of js_DoubleToECMAInt32() (obsolete)

This test measures the performance gain of the proposed optimized version of js_DoubleToECMAInt32(). The speedup depends on the value of the argument of the function. The test generates a large number of random values in various ranges and measures the speedup.

To run the test, do the following:

    cl /O2 d2i.c
    d2i.exe

Here's an example output on a Core2 Duo laptop:

    c:\code\doubletoint32>cl /O2 d2i.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
    Copyright (C) Microsoft Corporation. All rights reserved.

    d2i.c
    Microsoft (R) Incremental Linker Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    /out:d2i.exe
    d2i.obj

    c:\code\doubletoint32>d2i.exe
    max = 1
    orig time = 5313, res = 0
    new time = 468, res = 0
    speedup = 11.35x

    max = 2147483648
    orig time = 5875, res = 31759025306893781
    new time = 688, res = 31759025306893781
    speedup = 8.54x

    max = 4294967296
    orig time = 6546, res = -151450805860420
    new time = 1766, res = -151450805860420
    speedup = 3.71x

    max = 1782633656570947600000000000000
    orig time = 9047, res = 0
    new time = 469, res = 0
    speedup = 19.29x

Moh Haghighat

Comment 6

•

17 years ago

Attached file A test driver to evaluate the performance gain of the optimized version of js_DoubleToECMAInt32().

This test measures the performance gain of the proposed optimized version of js_DoubleToECMAInt32(). The speedup depends on the value of the argument of the function. The test generates a large number of random values in various ranges and measures the speedup.

To run the test, do the following:

    cl /O2 d2i.c
    d2i.exe

Here's an example output on a Core2 Duo laptop:

    c:\code\doubletoint32>cl /O2 d2i.c
    Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
    Copyright (C) Microsoft Corporation. All rights reserved.

    d2i.c
    Microsoft (R) Incremental Linker Version 9.00.21022.08
    Copyright (C) Microsoft Corporation. All rights reserved.

    /out:d2i.exe
    d2i.obj

    c:\code\doubletoint32>d2i.exe
    max = 1
    orig time = 5313, res = 0
    new time = 468, res = 0
    speedup = 11.35x

    max = 2147483648
    orig time = 5875, res = 31759025306893781
    new time = 688, res = 31759025306893781
    speedup = 8.54x

    max = 4294967296
    orig time = 6546, res = -151450805860420
    new time = 1766, res = -151450805860420
    speedup = 3.71x

    max = 1782633656570947600000000000000
    orig time = 9047, res = 0
    new time = 469, res = 0
    speedup = 19.29x

Attachment #319700 - Attachment is obsolete: true

Igor Bukanov

Comment 7

•

17 years ago

Attached patch patch against SM v1

This is what I am going to performance test.

Robert Sayre

Updated

•

17 years ago

Assignee: sayrer → igor

Moh Haghighat

Comment 8

•

17 years ago

In SpiderMonkey on Windows, the total number of calls to js_DoubleToECMAInt32() during a complete run of Sunspider (5 iterations) is currently ~7.07M, while in TT, the corresponding function is currently called ~33.36M times (a factor of almost 5). So, the performance impact of this optimization on SpiderMonkey is going to be correspondingly much smaller.

Brendan Eich [:brendan]

Updated

•

17 years ago

Blocks: js1.8.5

Mike Schroepfer

Updated

•

17 years ago

Flags: wanted-next+

David Mandelin [:dmandelin]

Comment 9

•

17 years ago

Attached file Sunspider analysis

I just did some Sunspider perf testing with this modification.

A. My test procedure was to build a JS shell with "make -f Makefile.ref BUILD_OPT=1" and run sunspider with 50 trials. This is on my MacBook Pro with 2.2 GHz Intel Core 2 Duo and 2 GB 667 MHz DDR2 SDRAM. I left my other processes (emacs, etc) open, but I didn't touch the machine while the tests were running.

B. The 3 configurations I tested were:
- base (baseline trunk SM)
- ints (replace toInt32 operation with Moh's version)
- uint (replace toInt32 and toUint32 operation with Moh's version)

I don't have code for the toUint32 operation specifically. I just used Moh's toInt32 code with the return type changed to uint32. I think that might actually even give correct answers.

C. I wrote my own script to analyze the results, as the Sunspider comparison script doesn't give really detailed statistics. Also, they add in an extra degree of freedom for the t test that I think shouldn't be there, but maybe I'm missing something.

My results, which are attached, have 3 columns:
- %diff: the percentage relative difference in average time taken. For example, the very first result is a %diff of -0.4 for base vs. ints. This means the total time was 0.4% less with the ints patch applied, i.e., a 1.004x speedup.
- t: the t statistic for the difference in means (i.e., the ratio (estimated difference in means) / (standard error)). Bigger absolute value means more significant. For reference, |t| >= about 2 means significance at the 95% level (note: I don't recommend using 95% confidence intervals for analyzing this).
- p-value: A p-value of 'z' means this: "If the means are in fact equal, the probability of observing this big of a difference (i.e., this big a t statistic) is 'z'." (This is true only if the assumptions of the model hold, in this case that test runs are independent and normally distributed with equal variance.)

Thus, if you decide to call anything with p-value <= 0.05 significant (equivalent to using the 95% confidence interval), then about 1 in 20 tests with no real difference will still be flagged as significant. Because there are 26 tests, at this level one would be pretty likely to be making some false inferences of significance. I also print *s for significance level on each line: * means better than 0.05, ** for 0.01, *** for 0.001, and **** for 0.0001. In this context, * doesn't mean much, but I would say *** is safe to consider a real difference.

D. With all that explained, in my tests we're looking at a highly significant but small 0.3-0.4% speedup from using the new toInt32 code. Some tests see a greater speedup. For example, access-nbody is 5% faster. Some tests slow down using the new code, which is surprising. For example, crypto-sha1 is about 2% slower in both experimental runs. I have no idea what's going on there, but i-cache effects would be my dumb guess. Perhaps VTune can shed more light on this.

David Mandelin [:dmandelin]

Comment 10

•

17 years ago

Attached patch Patch used for Sunspider analysis

I switch the #if 0/#if 1 blocks to enable each item. The Uint one can be enabled only if the Int one is.
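The dependency between the two switches, and the reason the uint32 variant in comment 9 can plausibly reuse the int32 code, can be sketched in C as follows. The macro and function names are hypothetical stand-ins and do not come from the attached patch.

```c
#include <stdint.h>

/* Hypothetical switches standing in for the patch's #if 0 / #if 1 blocks. */
#define SKETCH_FAST_TOINT32  1
#define SKETCH_FAST_TOUINT32 1

#if SKETCH_FAST_TOUINT32 && !SKETCH_FAST_TOINT32
#error "the uint32 variant depends on the int32 variant"
#endif

/* ToInt32 and ToUint32 select the same residue modulo 2^32 and differ only
 * in which 2^32-wide window they report it in. Converting the signed result
 * to uint32_t is well-defined in C (reduction modulo 2^32), so it yields the
 * unsigned answer. */
static uint32_t sketch_touint32_from_toint32(int32_t r)
{
    return (uint32_t)r;
}
```

For example, an int32 result of -1 maps to 4294967295, which is what ToUint32 produces for the same input.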

Igor Bukanov

Updated

•

16 years ago

Assignee: igor → general

Brendan Eich [:brendan]

Updated

•

16 years ago

Whiteboard: DUPEME

Tom S. (please needinfo tschuster)

Comment 11

•

15 years ago

Committed here: http://hg.mozilla.org/tracemonkey/rev/319361b18289
Inlined here: http://hg.mozilla.org/tracemonkey/rev/fcd321cd60c1

Status: NEW → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

sjw

Updated

•

12 years ago

Whiteboard: DUPEME

A test driver to evaluate the performance gain of the optimized version of js_DoubleToECMAInt32(), 17 years ago, Moh Haghighat, 5.61 KB, text/plain
A test driver to evaluate the performance gain of the optimized version of js_DoubleToECMAInt32()., 17 years ago, Moh Haghighat, 5.61 KB, text/plain
patch against SM v1, 17 years ago, Igor Bukanov, 4.84 KB, patch
Sunspider analysis, 17 years ago, David Mandelin [:dmandelin], 6.89 KB, text/plain
Patch used for Sunspider analysis, 17 years ago, David Mandelin [:dmandelin], 6.72 KB, patch

Categories (Core :: JavaScript Engine, defect)

People (Reporter: jorendorff, Unassigned)

Security (public)

Attachments (4 files, 1 obsolete file)
