Closed Bug 705302 Opened 13 years ago Closed 12 years ago

IonMonkey: Add an OOL path for truncating doubles outside int32 range

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jandem, Assigned: evilpie)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken crypto tests.

JM has a stub path for doubles within 2^32 of a signed integer. It works by adding either -4294967296.0 or 4294967296.0 before truncating. This covers the most common case, 0xffffffff.
(In reply to Jan de Mooij (:jandem) from comment #0)
> We need this for at least bitwise-and, nsieve-bits and IIRC some Kraken
> crypto tests.

To be fair, with OSR we probably don't need this for bitwise-and and (maybe) nsieve-bits so this is not high priority for now.
Doesn't sound too hard, going to try doing that.
Assignee: general → evilpies
Attached patch WIP (obsolete) — Splinter Review
After a few hours of trying to implement this in some reasonable way cross platform, I came up with this. emitTruncateDouble was platform specific anyways. I think this should work on x86, but haven't tested that yet.
Fwiw, v8 has this impressive functions doing the conversation for a lot more case, https://code.google.com/codesearch#W9JxUuHYyMg/trunk/src/ia32/lithium-codegen-ia32.cc&q=LDoubleToI%20package:http://v8%5C.googlecode%5C.com&l=3846. I still don't really understand what they are doing, but I am going to investigate it a bit.
evilpie, this will be a very nice ss-nsieve-bits win, so let me know if there's anything I can do to help you get this landed.

If your inline path fails it can just call ValueToECMAInt32 instead of bailing out. This matches JM and we can always optimize later (I don't think we have to, though, since most benchmarks/scripts use doubles <= 0xffffffff).
Status: NEW → ASSIGNED
So well, because doing OOL calls isn't that cool, I looked at the code generated by clang for js_DoubleToECMAInt32 and it actually looks like something that could be simplified a bit and just be used in-line.
Attached patch current state (obsolete) — Splinter Review
They way we use this functions, makes most of the easy stuff not so easy, and ugly (We don't use this like a normal visit functions only responsible for lir instruction). This is at least a working version. (No optimization for arm)
So I refactored this patch to actually use OOL CodeGenerators yaay. But now I am blocked by Bug 709423 to actually call the function.
Depends on: 709423
Attached patch WIP v2Splinter Review
So this seems to mostly work, but isn't very beautiful. I decided to use a OOL path for truncate on x86, instead of inline calling js_DoubleToECMAInt32 like arm, but I haven't yet tested if this is actually worthwhile. (It should be, because I think we never actually call the function on our benchmarks).

It also has some bugs around saving and recovering registers, because I don't really understand what is needed there.
Attachment #589576 - Attachment is obsolete: true
Attachment #595514 - Attachment is obsolete: true
Attached patch v1Splinter Review
\o/ Thanks to jandem etc. for you support! I currently have so much other stuff to do so, so I hope this is just going to work.

I introduced saveVolatile and restoreVolatile so we can easily call such non VM functions.
Attachment #605448 - Flags: review?(jdemooij)
Comment on attachment 605448 [details] [diff] [review]
v1

Review of attachment 605448 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work! r=me with these comments addressed.

I think it's okay to handle LTruncateDToInt32 in x86-shared, since ARM may want to implement this differently. We should be able to share emitTruncateDouble though.

::: js/src/ion/IonRegisters.h
@@ +489,5 @@
> +    { }
> +    RegisterSet(const FloatRegisterSet &fpu)
> +      : gpr_(),
> +        fpu_(fpu)        
> +    { }

These should be "explicit" to avoid unexpected casts.

::: js/src/ion/shared/CodeGenerator-shared.h
@@ +204,4 @@
>      }
>  
>    public:
> +    void saveVolatile(Register reg) {

Nit: s/reg/returnReg

Something like this seems a bit simpler btw:

RegisterSet regs = RegisterSet::Volatile();
regs.maybeTake(returnReg);

masm.PushRegsInMask(regs);

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +957,4 @@
>  JS_STATIC_ASSERT(INT_MIN == int(0x80000000));
>  
>  void
> +CodeGeneratorX86Shared::emitTruncateDouble(const FloatRegister &src, const Register &dest)

If we add truncateDouble to the macro assembler, we can move this function to CodeGenerator.

@@ +1047,5 @@
> +    masm.cvtsi2sd(output, ScratchFloatReg);
> +
> +    masm.ucomisd(temp, ScratchFloatReg);
> +    masm.j(Assembler::Parity, &fail);
> +    masm.j(Assembler::Equal, &done);

You can jump to ool->rejoin() here (and remove the |done| label).
Attachment #605448 - Flags: review?(jdemooij) → review+
Pushed with nits fixed (per our IRC conversation yesterday):

https://hg.mozilla.org/projects/ionmonkey/rev/49a7d5a3b400
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.