Closed Bug 815737 Opened 11 years ago Closed 11 years ago

IonMonkey: Inline Math.exp(), Math.atan(), Math.acos(), and Math.asin()


(Core :: JavaScript Engine, defect)






(Reporter: zbraniecki, Assigned: fiveop)


(Blocks 1 open bug)


(Whiteboard: [good first bug][mentor=nbp][lang=c++])


(2 files, 1 obsolete file)

Attached file demo
We seem to be missing a fast path for Math.exp.

From the demo:

home:~/ $ ~/projects/v8/out/native/d8 ./demo.js
mathExp: 72
mathExp: 52
mathExp: 51
mathExp: 51
home:~/ $ ~/projects/mozilla/builds/obj-opt-x86_64-apple-darwin12.2.0/dist/ ~/demo.js 
mathExp: 1079
mathExp: 1081
mathExp: 1054
mathExp: 1067
We do not currently have a fast path for Math.exp(). We clearly should. Potentially a good first bug.
Summary: Math.exp is 20x faster in V8 vs. IonMonkey → IonMonkey: Inline Math.exp()
Blocks: IonSpeed
I am actually not sure if v8 inlines anything unless you compile with "fast-math".
Sean, do you think you could put some steps here to help a new contributor fix this bug? I don't expect this is as simple as using the inline keyword on math_exp_body.

I looked through the source code and saw that js_math_exp and math_exp_body are likely to be the places where the work would be done.
Flags: needinfo?(sstangl)
This isn't about inlining the C function; it's about not making a C function call at all and instead doing Math.exp in JIT-generated assembly...
Sure. The goal is to generate inline assembly in IonMonkey that makes a very quick call to math_exp_body(). Most of the work has already been done for sin()/cos()/tan(); this bug will just require making math_exp_body() non-static (probably renamed as math_exp_impl()) then plugging the function into IonMonkey's inlining system, reusing the same functions that sin()/cos()/tan() use.

The main inlining code is in js/src/ion/MCallOptimize.cpp -- search for inlineMathFunction(). This requires defining a new type in the MMathFunction::Function enum, then using it in inlineNativeCall() as with js_math_sin.

The MMathFunction is then lowered to an LMathFunctionD in js/src/ion/Lowering.cpp's visitMathFunction(). This part should work already; no modifications are required.

Finally, the LMathFunctionD must be translated to assembly, done in js/src/ion/CodeGenerator.cpp's visitMathFunctionD(). This requires adding a new "MMathFunction::Exp" case to the switch.
Flags: needinfo?(sstangl)
(In reply to Boris Zbarsky (:bz) from comment #4)
> This isn't about inlining the C function; it's about not making a C function
> call at all and instead doing Math.exp in JIT-generated assembly...

In the future, maybe. For now, exp() is part of the family of cache-reliant math functions, and it's much easier to just reuse the existing machinery. If further oomph is necessary, we can look into actually inlining it later, but the rest of the functions seem to be quick enough provided that we just avoid a full callVM().
(In reply to Sean Stangl from comment #6)
> provided that we just avoid a full callVM().

Ahem, that should read "a full visitCallNative()". The call is slowed down by generation of an exit frame, because we don't know that the target native function cannot GC.
This bug implies some modifications in js/src/ion/MCallOptimize.cpp and likely js/src/ion/CodeGenerator.cpp, or one of the arch-specific variants if the implementation uses architecture-specific instructions for double exponentiation instead of a native function call (callWithABI).

Once done, the attached benchmark should be an order of magnitude faster, and the performance improvement should be visible on
Whiteboard: [good first bug][mentor=nbp][lang=c++]
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #8)
> This bug implies some modifications ...

Comment 5 states exactly what to do to reuse the MMathFunction infrastructure. No need for arch-specific paths.
Attached patch patch as described in comment 5 (obsolete) — Splinter Review
improves performance of 'demo' attachment by a factor of 120 on my machine
Attachment #720295 - Flags: review?(jAwS)
Nice work!!

I haven't followed the conversation on #jsapi, but I assume somebody from the IM team should do the review — Sean Stangl in this case.

FYI 1:
To know who should review the code, you can always run "hg annotate" on the source file or just go to the site.

In this case:

You will see sstangl on all "return inlineMathFunction(callInfo, MMathFunction::Cos);" lines. He wrote that code and is best placed to judge whether your change is correct. So I think you should flip the review to Sean.

FYI 2:
To flip the review, you need to click "Details" on the patch. Under flags, you will see the input box to change the reviewer. I'm mentioning this because it wasn't immediately obvious to me the first time I did it ;).
Attachment #720295 - Flags: review?(jAwS) → review?(sstangl)
Assignee: general → philipp.schaefer
OS: Mac OS X → All
Hardware: x86 → All
Comment on attachment 720295 [details] [diff] [review]
patch as described in comment 5

Review of attachment 720295 [details] [diff] [review]:

I'm not Sean, and I don't claim to know much about how to write this patch, but from glancing at the code it looks like not much has changed. The call to mathCache->lookup is now deferred until the previous infinity checks have passed, but besides that I don't see where the large perf increase can come from. Is the uploaded patch missing some files? (I also could just be reading it wrong, please correct me if so.)

::: js/src/jsmath.cpp
@@ +284,3 @@
>              return 0.0;
>      }
> +    endif

This is missing the hash character, #endif, and both the #ifdef and #endif should be at the beginning of the line (no preceding whitespace).
Apart from the #endif issue and the new whitespace at the top of two files, the patch is perfect. We actually have a very good abstraction with MMathFunction, which makes this change so small. What happens is that in the CodeGenerator we emit a call to math_exp_impl, with the MathCache as argument. So we skip all of the VM call overhead and can probably even load from the cache directly in the called function.
Comment on attachment 720295 [details] [diff] [review]
patch as described in comment 5

Review of attachment 720295 [details] [diff] [review]:

Good patch! Just some minor nits to fix. When they're fixed, upload a new version and flag me for review, and I'll r+ and push it.

Note that the crazy 120x speedup is because the "demo" micro-benchmark always calls Math.exp(100), and 100 is cached, so exp() is rarely executed. It will still be a major speedup for code that hits uncached entries, though.

::: js/src/ion/MCallOptimize.cpp
@@ +1,1 @@
> +

nit: whitespace added to top of file.

::: js/src/ion/MIR.h
@@ +1,1 @@
> +

nit: whitespace added to top of file.

::: js/src/jsmath.cpp
@@ +278,2 @@
>  {
> +    #ifdef _WIN32

nit: preprocessor directives should not be prefixed with whitespace.

@@ +284,3 @@
>              return 0.0;
>      }
> +    endif

This should be "#endif", without whitespace.
Attachment #720295 - Flags: review?(sstangl)
Philipp -- looking through jsmath.cpp, it looks like we're also missing MMathFunction support for acos(), asin(), and atan() (the only other missing one is sqrt(), but IonMonkey already handles that using inline assembly that bypasses the cache).

Rather than filing separate bugs, since you already know how to connect those functions to IonMonkey, would you mind updating the patch to also include those? It will require separating the functions into the main body and an _impl() version that takes a double.
Attachment #720295 - Attachment is obsolete: true
Attachment #720399 - Flags: review?(sstangl)
Comment on attachment 720399 [details] [diff] [review]
fixed review nits and added inlining for atan/acos/asin

Review of attachment 720399 [details] [diff] [review]:

Perfect. Thanks for adding support for the trig functions!

If you're interested, we're still missing inlining support for many native functions. Inlining is suitable for natives that cannot trigger garbage collection (by creating a new object), which rules out most of the String and Array methods -- but there are, from memory, a number of Date functions and global functions like parseInt() that should have fast-paths.

Math.random() should also have a fast-path that just calls out to C++ to avoid the exit frame/callVM overhead in visitCallNative().

Plenty of work left :)
Attachment #720399 - Flags: review?(sstangl) → review+
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Summary: IonMonkey: Inline Math.exp() → IonMonkey: Inline Math.exp(), Math.atan(), Math.acos(), and Math.asin()