IonMonkey: Add an OOL path for truncating doubles outside int32 range




JavaScript Engine
6 years ago
6 years ago


(Reporter: jandem, Assigned: evilpie)


(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)



(2 attachments, 2 obsolete attachments)



6 years ago
We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken crypto tests.

JM has a stub path for doubles within 2^32 of a signed integer. It works by adding either -4294967296.0 or 4294967296.0 before truncating. This covers the most common case, 0xffffffff.

Comment 1

6 years ago
(In reply to Jan de Mooij (:jandem) from comment #0)
> We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken
> crypto tests.

To be fair, with OSR we probably don't need this for bitwise-and and (maybe) nsieve-bits so this is not high priority for now.
Doesn't sound too hard, going to try doing that.
Assignee: general → evilpies
Created attachment 589576 [details] [diff] [review]

After a few hours of trying to implement this in some reasonable way cross platform, I came up with this. emitTruncateDouble was platform specific anyways. I think this should work on x86, but haven't tested that yet.
Fwiw, v8 has this impressive functions doing the conversation for a lot more case, I still don't really understand what they are doing, but I am going to investigate it a bit.


6 years ago
Duplicate of this bug: 721245

Comment 6

6 years ago
evilpie, this will be a very nice ss-nsieve-bits win, so let me know if there's anything I can do to help you get this landed.

If your inline path fails it can just call ValueToECMAInt32 instead of bailing out. This matches JM and we can always optimize later (I don't think we have to, though, since most benchmarks/scripts use doubles <= 0xffffffff).
So well, because doing OOL calls isn't that cool, I looked at the code generated by clang for js_DoubleToECMAInt32 and it actually looks like something that could be simplified a bit and just be used in-line.
Created attachment 595514 [details] [diff] [review]
current state

They way we use this functions, makes most of the easy stuff not so easy, and ugly (We don't use this like a normal visit functions only responsible for lir instruction). This is at least a working version. (No optimization for arm)
So I refactored this patch to actually use OOL CodeGenerators yaay. But now I am blocked by Bug 709423 to actually call the function.
Depends on: 709423
Created attachment 602374 [details] [diff] [review]
WIP v2

So this seems to mostly work, but isn't very beautiful. I decided to use a OOL path for truncate on x86, instead of inline calling js_DoubleToECMAInt32 like arm, but I haven't yet tested if this is actually worthwhile. (It should be, because I think we never actually call the function on our benchmarks).

It also has some bugs around saving and recovering registers, because I don't really understand what is needed there.
Attachment #589576 - Attachment is obsolete: true
Attachment #595514 - Attachment is obsolete: true


6 years ago
Duplicate of this bug: 734387
Created attachment 605448 [details] [diff] [review]

\o/ Thanks to jandem etc. for you support! I currently have so much other stuff to do so, so I hope this is just going to work.

I introduced saveVolatile and restoreVolatile so we can easily call such non VM functions.
Attachment #605448 - Flags: review?(jdemooij)

Comment 13

6 years ago
Comment on attachment 605448 [details] [diff] [review]

Review of attachment 605448 [details] [diff] [review]:

Nice work! r=me with these comments addressed.

I think it's okay to handle LTruncateDToInt32 in x86-shared, since ARM may want to implement this differently. We should be able to share emitTruncateDouble though.

::: js/src/ion/IonRegisters.h
@@ +489,5 @@
> +    { }
> +    RegisterSet(const FloatRegisterSet &fpu)
> +      : gpr_(),
> +        fpu_(fpu)        
> +    { }

These should be "explicit" to avoid unexpected casts.

::: js/src/ion/shared/CodeGenerator-shared.h
@@ +204,4 @@
>      }
>    public:
> +    void saveVolatile(Register reg) {

Nit: s/reg/returnReg

Something like this seems a bit simpler btw:

RegisterSet regs = RegisterSet::Volatile();


::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +957,4 @@
>  JS_STATIC_ASSERT(INT_MIN == int(0x80000000));
>  void
> +CodeGeneratorX86Shared::emitTruncateDouble(const FloatRegister &src, const Register &dest)

If we add truncateDouble to the macro assembler, we can move this function to CodeGenerator.

@@ +1047,5 @@
> +    masm.cvtsi2sd(output, ScratchFloatReg);
> +
> +    masm.ucomisd(temp, ScratchFloatReg);
> +    masm.j(Assembler::Parity, &fail);
> +    masm.j(Assembler::Equal, &done);

You can jump to ool->rejoin() here (and remove the |done| label).
Attachment #605448 - Flags: review?(jdemooij) → review+

Comment 14

6 years ago
Pushed with nits fixed (per our IRC conversation yesterday):
Last Resolved: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.