Closed
Bug 1416289
Opened 2 years ago
Closed Last year
Don't use the MathCache for Math.sign and Math.trunc
Categories
(Core :: JavaScript Engine, enhancement, P3)
Core
JavaScript Engine
P3
Tracking
()
RESOLVED
FIXED
mozilla62
Tracking  Status  

firefox62    fixed 
People
(Reporter: anba, Assigned: anba)
References
Details
Attachments
(3 files, 3 obsolete files)
39.74 KB,
patch

jandem
:
review+

Details  Diff  Splinter Review 
15.46 KB,
patch

jandem
:
review+

Details  Diff  Splinter Review 
27.66 KB,
patch

anba
:
review+

Details  Diff  Splinter Review 
Math.sign() [1] and Math.trunc() [2] seem hardly expensive enough to justify using the MathCache. [1] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/jsmath.cpp#13941400 [2] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/modules/fdlibm/src/s_trunc.cpp#3162
Assignee  
Comment 1•2 years ago


No longer use the Math cache for Math.sign() and Math.trunc(), instead directly inline the math_function<...> code in js::math_{trunc, sign}(...). Math functions which don't use the cache mustn't set MMathFunction::cache_ [1], assert this in the MMathFunction constructor and update IonBuilder::inlineMathFunction() accordingly. And I didn't rename math_{trunc,sign}_uncached to math_{trunc,sign}_impl (normally functions which don't use the cache have the _impl suffix) to avoid complicating this macro in WasmBuiltins.cpp [2]. [1] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/jit/CodeGenerator.cpp#70047008 [2] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/wasm/WasmBuiltins.cpp#665
Attachment #8927431 
Flags: review?(jdemooij)
Assignee  
Comment 2•2 years ago


Always return double values from MMathFunction specialized functions to allow inlining them in MCallOptimize [1]. Also applies to Math.random [2] and Math.hypot [3]. [1] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/jit/MCallOptimize.cpp#483484 [2] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/jit/MCallOptimize.cpp#13721373 [3] https://searchfox.org/mozillacentral/rev/30ead7d1ae5bf95b8bc0fd67b950cd46ca05e32c/js/src/jit/MCallOptimize.cpp#13201321
Attachment #8927433 
Flags: review?(jdemooij)
Assignee  
Comment 3•2 years ago


Part 1 gives the following improvements to current tip. // 290ms > 220ms function trunc1() { var q = 0; var t = dateNow(); for (var i = 0; i < 10000000; ++i) { // Constant value to always hit the math cache. q += Math.trunc(1.23456); } return [dateNow()  t, q]; } // 245ms > 140ms function sign1() { var q = 0; var t = dateNow(); for (var i = 0; i < 10000000; ++i) { // Constant value to always hit the math cache. q += Math.sign(1.23456); } return [dateNow()  t, q]; } Part 2 gives the following improvements compared to current tip. // 310ms > 100ms function trunc2() { // Constant value to always hit the math cache. // Not passed as a constant to avoid optimizing away // the call in the patched version. var xs = Array(8).fill(1.23456); var q = 0; var t = dateNow(); for (var i = 0; i < 10000000; ++i) { q += Math.trunc(xs[i & 7]); } return [dateNow()  t, q]; } // 300ms > 55ms function sign2() { // Constant value to always hit the math cache. // Not passed as a constant to avoid optimizing away // the call in the patched version. var xs = Array(8).fill(1.23456); var q = 0; var t = dateNow(); for (var i = 0; i < 10000000; ++i) { q += Math.sign(xs[i & 7]); } return [dateNow()  t, q]; }
Assignee  
Comment 4•2 years ago


Selfhosting Math.sign (bug 1021888) and Math.trunc could lead to additional speedups. For example with these implementations  function MathSign(x) { x = +x; // Inlined: ToNumber if (x > 0) return 1; if (x < 0) return 1; return x; } function MathTrunc(x) { x = +x; // Inlined: ToNumber if (x < 0) { // Equivalent to Math.ceil, but avoids repeated bailouts for certain // inputs like 2147483648.1. return std_Math_floor(x); } return std_Math_floor(x); }  The Math.trunc() µbenchmark from comment 3 improves to 25ms and the Math.sign() µbenchmark improves to 20ms.
Updated•2 years ago

Priority:  → P3
Comment 5•2 years ago


Comment on attachment 8927431 [details] [diff] [review] bug1416289part1dontcachesignandtrunc.patch Review of attachment 8927431 [details] [diff] [review]:  Makes sense. ::: js/src/jit/MIR.cpp @@ +1912,5 @@ > + case Floor: return false; > + case Ceil: return false; > + case Round: return false; > + default: > + MOZ_CRASH("Unknown math function"); Nit: if you remove the 'default:' and add the MOZ_CRASH after this switch statement, the compiler will warn/error when we add a new enum value without updating this code.
Attachment #8927431 
Flags: review?(jdemooij) → review+
Comment 6•2 years ago


Comment on attachment 8927433 [details] [diff] [review] bug1416289part2doublereturnvalues.patch Review of attachment 8927433 [details] [diff] [review]:  (In reply to André Bargull [:anba] from comment #2) > Always return double values from MMathFunction specialized functions to > allow inlining them in MCallOptimize [1]. Also applies to Math.random [2] > and Math.hypot [3]. This makes sense for Math.random because it ~always returns a double, but I think for sign() and trunc() it would be nice to specialize for int32 return types in Ion because these cases are so common. This patch is a performance improvement so maybe it's fine but the problem with returning doubles is that it can affect a lot of other expressions. I'll think about it more.
Assignee  
Comment 7•Last year


(Note: The three new patches completely replace the original two patches.) js/src/jsmath.{h,cpp}  math_trunc_uncached wasn't renamed to math_trunc_impl to keep supporting this code in WasmBuiltins.cpp: https://searchfox.org/mozillacentral/rev/b28b94dc81d60c6d9164315adbd4a5073526d372/js/src/wasm/WasmBuiltins.cpp#764795 js/src/jit/{Lowering,MCallOptimize,MIR,Recover,LIRshared}.{h,cpp}:  I've basically only copied the existing approach for the inlining code of Math.{floor,ceil,round}().  This includes the code for recovering and float32 specializations. js/src/jit/{arm,arm64,mipsshared,x86shared} files:  The assembler code in CodeGenerator::visitTrunc(F) for x86 resp. in MacroAssemblerARMCompat::trunc(f) for ARM was copied from the existing code for ceil/floor (including comments and formatting).  While I could have done the same for MIPS, I considered it too risky to check in MIPS code which cannot be tested by me, therefore the MIPS code currently simply crashes for the two new methods. js/src/jittest/tests/ion/dcewithrinstructions.js  Fixed what looks like a copypaste error in rceil_double().  Added rtrunc_number and rtrunc_double based on the existing tests for Math.round in that file. js/src/jittest/tests/ion/math{Floor,Round}.js  Replaced trycatchstatement with a withstatement to actually make sure the function is not run in Ion. js/src/jittest/tests/ion/mathTrunc.js  Copied and adapted from the existing math{Floor,Round}.js files in the same directory. js/src/jittest/tests/ion/testFloat32.js  Added new test for Math.trunc().  Also added missing test Math.round().  And added negative tests for Math.log() and Math.exp().
Attachment #8927431 
Attachment is obsolete: true
Attachment #8927433 
Attachment is obsolete: true
Attachment #8927433 
Flags: review?(jdemooij)
Attachment #8973679 
Flags: review?(jdemooij)
Assignee  
Comment 8•Last year


Open question: I've first tried to have only two lowering classes (LSignI and LSignD) and then use MToNumberInt32 for the case when the input is MIRType::Double and the expected output is MIRType::Int32 (in MCallOptimize). Unfortunately this breaks the recover tests for Math.sign(), because MToNumberInt32 itself is not recoverable. As a workaround I've simply added another lowering class (LSignDI) to handle the Double>Int32 case. Does that make sense or is there another, more preferable approach I should have taken? CodeGenerator::visitSignI(): The assembler code was adapted from the Clang assembly for this C function (per godbolt.org for Clang6 with O3). int sign(int num) { if (num > 0) return 1; if (num < 0) return 1; return 0; } sign(int): # @sign(int) mov ecx, edi sar ecx, 31 test edi, edi mov eax, 1 cmovle eax, ecx ret Except that I didn't know how to encode the conditional move with the existing MacroAssembler methods, so instead I've simply used a normal branch. CodeGenerator::visitSignD():  I don't know if there's much of a difference which comparisons are executed in which exact order. CodeGenerator::visitSignDI():  Copied (again) the code to detect negative zero. Maybe one day branchNegativeZero will be available for all archs. :)  The generated assembler code is probably still not really efficient, but at least it's better than what we had before...
Attachment #8973680 
Flags: review?(jdemooij)
Assignee  
Comment 9•Last year


js/src/jsmath.h:  Removed duplicate declaration of math_imul().  Moved math_sqrt_handle() next to math_sqrt() to match how other nearby code is structured. js/src/jsmath.cpp:  Removed unnecessary includes.  Moved the math_function template function to the top, so it can be used by more functions.  Also changed math_function to use MutableHandleValue::setDouble() instead of MutableHandleValue::setNumber() to ensure the functions using math_function can be inlined by MMathFunction.  That means the following functions will now always return a DoubleValue:  Math.cbrt  Math.exp, Math.expm1  Math.log, Math.log10, Math.log2, Math.log1p  Math.cosh, Math.sinh, Math.tahn  Math.acosh, Math.asinh, Math.atanh  Also changed Math.random() and Math.hypot() to always return DoubleValue for inlining purposes.  Do you think it's okay if those remaining functions always return DoubleValues?
Attachment #8973682 
Flags: review?(jdemooij)
Comment 10•Last year


Ah I wanted to get back to this bug this week :) I'll review today or tomorrow.
Comment 11•Last year


Comment on attachment 8973679 [details] [diff] [review] bug1416289part1inlinetrunc.patch Review of attachment 8973679 [details] [diff] [review]:  Nice!
Attachment #8973679 
Flags: review?(jdemooij) → review+
Comment 12•Last year


Comment on attachment 8973680 [details] [diff] [review] bug1416289part2inlinesign.patch Review of attachment 8973680 [details] [diff] [review]:  We could make MToNumberInt32(double) recoverable maybe, but I actually like having the SignD and SignDI versions :) ::: js/src/jit/CodeGenerator.cpp @@ +7444,5 @@ > + > + Label done; > + masm.move32(input, output); > + masm.rshift32Arithmetic(Imm32(31), output); > + masm.branch32(Assembler::LessThanOrEqual, input, Imm32(0), &done); The cmov version requires an extra temp register, but then you could use cmp32move32 like this: masm.rshift32Arithmetic(Imm32(31), output); masm.move32(Imm32(1), temp); masm.cmp32move32(Assembler::GreaterThan, output, Imm32(0), temp, output); What you have is also fine though. ::: js/src/jit/MCallOptimize.cpp @@ +1528,5 @@ > + > + if (returnType != MIRType::Int32 && returnType != MIRType::Double) > + return InliningStatus_NotInlined; > + > + if (!IsFloatingPointType(argType) && !(argType == MIRType::Int32 && returnType == argType)) Maybe returnType == MIRType::Int32 instead of returnType == argType here? ::: js/src/jit/MIR.cpp @@ +1898,5 @@ > + if (!input>isConstant()  !input>toConstant()>isTypeRepresentableAsDouble()) > + return this; > + > + double in = input>toConstant()>numberToDouble(); > + double out = js::math_sign_uncached(in); This will use AutoUnsafeCallWithABI offthread. Probably not the end of the world, but it would be nice to avoid it. Can we add a sign function we can call directly here? Maybe in jslibmath.h?
Attachment #8973680 
Flags: review?(jdemooij) → review+
Comment 13•Last year


Comment on attachment 8973682 [details] [diff] [review] bug1416289part3doublereturn.patch Review of attachment 8973682 [details] [diff] [review]:  Thanks for these patches! Please double check at least Kraken doesn't regress a lot (or check AWFY/Talos after landing), it has some very hot Math calls (but these return doubles IIRC).
Attachment #8973682 
Flags: review?(jdemooij) → review+
Assignee  
Comment 14•Last year


Thanks for the reviews! (In reply to Jan de Mooij [:jandem] from comment #12) > ::: js/src/jit/CodeGenerator.cpp > What you have is also fine though. Okay, then I'll stick with the current version for now. :) > ::: js/src/jit/MIR.cpp > > + double out = js::math_sign_uncached(in); > > This will use AutoUnsafeCallWithABI offthread. Probably not the end of the > world, but it would be nice to avoid it. Can we add a sign function we can > call directly here? Maybe in jslibmath.h? Hmm, is it okay to file a followup bug for this one, because it actually affects all Math functions: https://searchfox.org/mozillacentral/rev/f30847c12e5fb13791401ed4330ced3e1b9c8d81/js/src/jit/MIR.cpp#19452020 (In reply to Jan de Mooij [:jandem] from comment #13) > Thanks for these patches! Please double check at least Kraken doesn't > regress a lot (or check AWFY/Talos after landing), it has some very hot Math > calls (but these return doubles IIRC). I didn't spot any major performance differences when running Kraken locally (the first two entries for each subtest are without part 3, the last two entries with part 3). But I guess we'll see if Talos decides differently here. ;) [#0] aiastar Cycles:10 Average:152.10 Median:153.00 stddev:2.81 (1.8%) stddevsansfirst:2.60 [#0] aiastar Cycles:10 Average:149.00 Median:148.00 stddev:3.06 (2.1%) stddevsansfirst:1.92 [#0] aiastar Cycles:10 Average:152.00 Median:152.00 stddev:4.67 (3.1%) stddevsansfirst:4.94 [#0] aiastar Cycles:10 Average:150.90 Median:151.00 stddev:3.38 (2.2%) stddevsansfirst:3.50 [#1] audiobeatdetection Cycles:10 Average:260.70 Median:256.00 stddev:40.95 (16.0%) stddevsansfirst:42.19 [#1] audiobeatdetection Cycles:10 Average:274.30 Median:278.00 stddev:27.94 (10.1%) stddevsansfirst:25.98 [#1] audiobeatdetection Cycles:10 Average:255.10 Median:244.50 stddev:33.87 (13.9%) stddevsansfirst:35.14 [#1] audiobeatdetection Cycles:10 Average:266.10 Median:279.00 stddev:28.90 (10.4%) stddevsansfirst:28.38 [#2] audiodft Cycles:10 Average:265.50 Median:271.00 stddev:8.77 (3.2%) stddevsansfirst:9.15 [#2] audiodft Cycles:10 Average:272.30 Median:270.50 stddev:10.46 (3.9%) stddevsansfirst:11.08 [#2] audiodft Cycles:10 Average:274.70 Median:275.00 stddev:15.92 (5.8%) stddevsansfirst:15.84 [#2] audiodft Cycles:10 Average:268.30 Median:265.50 stddev:12.96 (4.9%) stddevsansfirst:13.61 [#3] audiofft Cycles:10 Average:187.20 Median:177.50 stddev:32.60 (18.4%) stddevsansfirst:34.33 [#3] audiofft Cycles:10 Average:189.40 Median:182.00 stddev:30.51 (16.8%) stddevsansfirst:31.71 [#3] audiofft Cycles:10 Average:178.60 Median:176.50 stddev:16.81 (9.5%) stddevsansfirst:17.75 [#3] audiofft Cycles:10 Average:180.60 Median:172.00 stddev:21.73 (12.6%) stddevsansfirst:22.64 [#4] audiooscillator Cycles:10 Average:135.10 Median:130.50 stddev:20.04 (15.4%) stddevsansfirst:21.14 [#4] audiooscillator Cycles:10 Average:133.20 Median:129.00 stddev:15.41 (11.9%) stddevsansfirst:16.09 [#4] audiooscillator Cycles:10 Average:132.80 Median:126.00 stddev:18.87 (15.0%) stddevsansfirst:17.87 [#4] audiooscillator Cycles:10 Average:129.50 Median:125.50 stddev:13.95 (11.1%) stddevsansfirst:14.66 [#5] imaginggaussianblur Cycles:10 Average:164.50 Median:166.50 stddev:11.48 (6.9%) stddevsansfirst:11.53 [#5] imaginggaussianblur Cycles:10 Average:164.80 Median:165.00 stddev:9.05 (5.5%) stddevsansfirst:9.40 [#5] imaginggaussianblur Cycles:10 Average:174.20 Median:169.50 stddev:22.99 (13.6%) stddevsansfirst:8.00 [#5] imaginggaussianblur Cycles:10 Average:165.70 Median:164.50 stddev:9.23 (5.6%) stddevsansfirst:9.69 [#6] imagingdarkroom Cycles:10 Average:190.00 Median:199.00 stddev:15.84 (8.0%) stddevsansfirst:16.70 [#6] imagingdarkroom Cycles:10 Average:199.70 Median:200.50 stddev:15.74 (7.9%) stddevsansfirst:15.55 [#6] imagingdarkroom Cycles:10 Average:206.40 Median:207.00 stddev:16.40 (7.9%) stddevsansfirst:16.96 [#6] imagingdarkroom Cycles:10 Average:199.20 Median:202.00 stddev:6.86 (3.4%) stddevsansfirst:7.26 [#7] imagingdesaturate Cycles:10 Average:154.00 Median:154.50 stddev:13.04 (8.4%) stddevsansfirst:12.95 [#7] imagingdesaturate Cycles:10 Average:146.70 Median:150.00 stddev:7.24 (4.8%) stddevsansfirst:7.68 [#7] imagingdesaturate Cycles:10 Average:158.80 Median:164.00 stddev:20.62 (12.6%) stddevsansfirst:21.63 [#7] imagingdesaturate Cycles:10 Average:151.80 Median:154.00 stddev:17.54 (11.4%) stddevsansfirst:17.98 [#8] jsonparsefinancial Cycles:10 Average:64.70 Median:65.00 stddev:1.77 (2.7%) stddevsansfirst:1.87 [#8] jsonparsefinancial Cycles:10 Average:64.90 Median:66.00 stddev:2.64 (4.0%) stddevsansfirst:2.77 [#8] jsonparsefinancial Cycles:10 Average:65.20 Median:65.00 stddev:4.18 (6.4%) stddevsansfirst:4.39 [#8] jsonparsefinancial Cycles:10 Average:65.10 Median:66.50 stddev:3.87 (5.8%) stddevsansfirst:3.84 [#9] jsonstringifytinderbox Cycles:10 Average:47.90 Median:46.00 stddev:6.24 (13.6%) stddevsansfirst:6.53 [#9] jsonstringifytinderbox Cycles:10 Average:51.60 Median:45.50 stddev:18.45 (40.6%) stddevsansfirst:19.42 [#9] jsonstringifytinderbox Cycles:10 Average:45.90 Median:47.00 stddev:1.73 (3.7%) stddevsansfirst:1.66 [#9] jsonstringifytinderbox Cycles:10 Average:45.90 Median:46.00 stddev:1.97 (4.3%) stddevsansfirst:1.96 [#10] stanfordcryptoaes Cycles:10 Average:118.10 Median:116.50 stddev:7.52 (6.5%) stddevsansfirst:7.94 [#10] stanfordcryptoaes Cycles:10 Average:120.10 Median:122.00 stddev:10.54 (8.6%) stddevsansfirst:10.22 [#10] stanfordcryptoaes Cycles:10 Average:120.50 Median:119.00 stddev:12.34 (10.4%) stddevsansfirst:13.08 [#10] stanfordcryptoaes Cycles:10 Average:118.80 Median:118.50 stddev:5.83 (4.9%) stddevsansfirst:5.92 [#11] stanfordcryptoccm Cycles:10 Average:164.50 Median:166.00 stddev:5.89 (3.5%) stddevsansfirst:6.22 [#11] stanfordcryptoccm Cycles:10 Average:176.20 Median:171.50 stddev:23.59 (13.8%) stddevsansfirst:24.99 [#11] stanfordcryptoccm Cycles:10 Average:167.50 Median:169.00 stddev:5.15 (3.0%) stddevsansfirst:5.43 [#11] stanfordcryptoccm Cycles:10 Average:176.60 Median:172.00 stddev:24.25 (14.1%) stddevsansfirst:25.52 [#12] stanfordcryptopbkdf2 Cycles:10 Average:233.00 Median:234.00 stddev:10.41 (4.5%) stddevsansfirst:9.93 [#12] stanfordcryptopbkdf2 Cycles:10 Average:235.30 Median:234.50 stddev:13.47 (5.7%) stddevsansfirst:13.20 [#12] stanfordcryptopbkdf2 Cycles:10 Average:230.00 Median:230.50 stddev:7.92 (3.4%) stddevsansfirst:8.26 [#12] stanfordcryptopbkdf2 Cycles:10 Average:239.30 Median:239.00 stddev:10.52 (4.4%) stddevsansfirst:10.42 [#13] stanfordcryptosha256iterative Cycles:10 Average:109.10 Median:92.00 stddev:32.15 (34.9%) stddevsansfirst:33.50 [#13] stanfordcryptosha256iterative Cycles:10 Average:95.90 Median:96.00 stddev:5.53 (5.8%) stddevsansfirst:5.86 [#13] stanfordcryptosha256iterative Cycles:10 Average:104.00 Median:96.50 stddev:12.23 (12.7%) stddevsansfirst:12.97 [#13] stanfordcryptosha256iterative Cycles:10 Average:107.50 Median:90.50 stddev:44.70 (49.4%) stddevsansfirst:47.25
Comment 15•Last year


(In reply to André Bargull [:anba] from comment #14) > Hmm, is it okay to file a followup bug for this one, because it actually > affects all Math functions: Ah in that case it's probably fine asis.
Assignee  
Comment 16•Last year


Updated part 2 per review comments, carrying r+.
Attachment #8973680 
Attachment is obsolete: true
Attachment #8974091 
Flags: review+
Assignee  
Comment 17•Last year


Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2860bbcf47df304252da336a384ce9a4bbc3db02
Keywords: checkinneeded
Comment 18•Last year


Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozillainbound/rev/06bee42b4d49 Part 1: Add Ioninline support for Math.trunc. r=jandem https://hg.mozilla.org/integration/mozillainbound/rev/d37d926c33fa Part 2: Add Ioninline support for Math.sign. r=jandem https://hg.mozilla.org/integration/mozillainbound/rev/93f3dc3cecc3 Part 3: Always use double values for Math functions which are inlined through MMathFunction. r=jandem
Keywords: checkinneeded
Comment 19•Last year


bugherder 
https://hg.mozilla.org/mozillacentral/rev/06bee42b4d49 https://hg.mozilla.org/mozillacentral/rev/d37d926c33fa https://hg.mozilla.org/mozillacentral/rev/93f3dc3cecc3
Status: ASSIGNED → RESOLVED
Closed: Last year
statusfirefox62:
 → fixed
Resolution:  → FIXED
Target Milestone:  → mozilla62
Updated•Last year

statusfirefox58:
affected → 
You need to log in
before you can comment on or make changes to this bug.
Description
•