Last Comment Bug 705302 - IonMonkey: Add an OOL path for truncating doubles outside int32 range
: IonMonkey: Add an OOL path for truncating doubles outside int32 range
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 1 vote (vote)
: ---
Assigned To: Tom Schuster [:evilpie]
:
Mentors:
: 721245 734387 (view as bug list)
Depends on: 709423
Blocks: IonSpeed
  Show dependency treegraph
 
Reported: 2011-11-25 08:39 PST by Jan de Mooij [:jandem]
Modified: 2012-03-15 03:25 PDT (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (4.20 KB, patch)
2012-01-18 11:02 PST, Tom Schuster [:evilpie]
no flags Details | Diff | Splinter Review
current state (7.91 KB, patch)
2012-02-08 13:37 PST, Tom Schuster [:evilpie]
no flags Details | Diff | Splinter Review
WIP v2 (10.30 KB, patch)
2012-03-02 08:31 PST, Tom Schuster [:evilpie]
no flags Details | Diff | Splinter Review
v1 (11.45 KB, patch)
2012-03-13 10:36 PDT, Tom Schuster [:evilpie]
jdemooij: review+
Details | Diff | Splinter Review

Description Jan de Mooij [:jandem] 2011-11-25 08:39:56 PST
We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken crypto tests.

JM has a stub path for doubles within 2^32 of a signed integer. It works by adding either -4294967296.0 or 4294967296.0 before truncating. This covers the most common case, 0xffffffff.
Comment 1 Jan de Mooij [:jandem] 2011-11-25 09:03:09 PST
(In reply to Jan de Mooij (:jandem) from comment #0)
> We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken
> crypto tests.

To be fair, with OSR we probably don't need this for bitwise-and and (maybe) nsieve-bits so this is not high priority for now.
Comment 2 Tom Schuster [:evilpie] 2012-01-17 07:48:18 PST
Doesn't sound too hard, going to try doing that.
Comment 3 Tom Schuster [:evilpie] 2012-01-18 11:02:56 PST
Created attachment 589576 [details] [diff] [review]
WIP

After a few hours of trying to implement this in some reasonable way cross platform, I came up with this. emitTruncateDouble was platform specific anyways. I think this should work on x86, but haven't tested that yet.
Comment 4 Tom Schuster [:evilpie] 2012-01-26 11:56:11 PST
Fwiw, v8 has this impressive functions doing the conversation for a lot more case, https://code.google.com/codesearch#W9JxUuHYyMg/trunk/src/ia32/lithium-codegen-ia32.cc&q=LDoubleToI%20package:http://v8%5C.googlecode%5C.com&l=3846. I still don't really understand what they are doing, but I am going to investigate it a bit.
Comment 5 Sean Stangl [:sstangl] 2012-01-26 11:58:40 PST
*** Bug 721245 has been marked as a duplicate of this bug. ***
Comment 6 Jan de Mooij [:jandem] 2012-02-06 00:17:16 PST
evilpie, this will be a very nice ss-nsieve-bits win, so let me know if there's anything I can do to help you get this landed.

If your inline path fails it can just call ValueToECMAInt32 instead of bailing out. This matches JM and we can always optimize later (I don't think we have to, though, since most benchmarks/scripts use doubles <= 0xffffffff).
Comment 7 Tom Schuster [:evilpie] 2012-02-08 09:59:28 PST
So well, because doing OOL calls isn't that cool, I looked at the code generated by clang for js_DoubleToECMAInt32 and it actually looks like something that could be simplified a bit and just be used in-line.
Comment 8 Tom Schuster [:evilpie] 2012-02-08 13:37:17 PST
Created attachment 595514 [details] [diff] [review]
current state

They way we use this functions, makes most of the easy stuff not so easy, and ugly (We don't use this like a normal visit functions only responsible for lir instruction). This is at least a working version. (No optimization for arm)
Comment 9 Tom Schuster [:evilpie] 2012-02-10 13:47:17 PST
So I refactored this patch to actually use OOL CodeGenerators yaay. But now I am blocked by Bug 709423 to actually call the function.
Comment 10 Tom Schuster [:evilpie] 2012-03-02 08:31:49 PST
Created attachment 602374 [details] [diff] [review]
WIP v2

So this seems to mostly work, but isn't very beautiful. I decided to use a OOL path for truncate on x86, instead of inline calling js_DoubleToECMAInt32 like arm, but I haven't yet tested if this is actually worthwhile. (It should be, because I think we never actually call the function on our benchmarks).

It also has some bugs around saving and recovering registers, because I don't really understand what is needed there.
Comment 11 Kannan Vijayan [:djvj] 2012-03-09 08:35:14 PST
*** Bug 734387 has been marked as a duplicate of this bug. ***
Comment 12 Tom Schuster [:evilpie] 2012-03-13 10:36:27 PDT
Created attachment 605448 [details] [diff] [review]
v1

\o/ Thanks to jandem etc. for you support! I currently have so much other stuff to do so, so I hope this is just going to work.

I introduced saveVolatile and restoreVolatile so we can easily call such non VM functions.
Comment 13 Jan de Mooij [:jandem] 2012-03-14 05:54:25 PDT
Comment on attachment 605448 [details] [diff] [review]
v1

Review of attachment 605448 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work! r=me with these comments addressed.

I think it's okay to handle LTruncateDToInt32 in x86-shared, since ARM may want to implement this differently. We should be able to share emitTruncateDouble though.

::: js/src/ion/IonRegisters.h
@@ +489,5 @@
> +    { }
> +    RegisterSet(const FloatRegisterSet &fpu)
> +      : gpr_(),
> +        fpu_(fpu)        
> +    { }

These should be "explicit" to avoid unexpected casts.

::: js/src/ion/shared/CodeGenerator-shared.h
@@ +204,4 @@
>      }
>  
>    public:
> +    void saveVolatile(Register reg) {

Nit: s/reg/returnReg

Something like this seems a bit simpler btw:

RegisterSet regs = RegisterSet::Volatile();
regs.maybeTake(returnReg);

masm.PushRegsInMask(regs);

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +957,4 @@
>  JS_STATIC_ASSERT(INT_MIN == int(0x80000000));
>  
>  void
> +CodeGeneratorX86Shared::emitTruncateDouble(const FloatRegister &src, const Register &dest)

If we add truncateDouble to the macro assembler, we can move this function to CodeGenerator.

@@ +1047,5 @@
> +    masm.cvtsi2sd(output, ScratchFloatReg);
> +
> +    masm.ucomisd(temp, ScratchFloatReg);
> +    masm.j(Assembler::Parity, &fail);
> +    masm.j(Assembler::Equal, &done);

You can jump to ool->rejoin() here (and remove the |done| label).
Comment 14 Jan de Mooij [:jandem] 2012-03-15 03:25:14 PDT
Pushed with nits fixed (per our IRC conversation yesterday):

https://hg.mozilla.org/projects/ionmonkey/rev/49a7d5a3b400

Note You need to log in before you can comment on or make changes to this bug.