Closed Bug 521353 Opened 15 years ago Closed 15 years ago

optimized fast path for OP_add in Interpreter preserves too much precision on 64bit cpu's

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P1)

x86_64
All

Tracking

(Not tracked)

VERIFIED FIXED
flash10.1

People

(Reporter: edwsmith, Assigned: stejohns)

References

Details

Attachments

(2 files, 2 obsolete files)

On 64bit cpu's, kIntegerType atoms have a 61 bit value payload.  The Interpreter's OP_add implementation detects when both atoms are kIntegerType and performs a fast integer add with an overflow check.

this add allows integers to grow up to 61 bits which is more precision than allowed by the ECMAScript standard (numbers should conform to IEEE double which is limited to 53 bits of integer precision).

other opcodes and other functions in the VM may have similar bugs.

Tamarin-redux's current JIT will invoke Toplevel::add2(), which is missing any optimizations for integers; since add2() adds all numeric values as doubles, it doesn't show this problem.  However, a pending optimization (bug 517127) exposes the bug by adding the integer case to add2()'s implementation.  And anyway, the interpreter & jit should agree.

Here's a test case:

function f() {
    // d starts life as a kIntegerType atom
    var d = 0xffff|0
    var i = 16

    // double d over and over to exceed the range limits of atom and 32 bit,
    // to try to demonstrate extra precision that shouldn't be there.  To test
    // for precision, we explicitly convert d to double by multiplication, then
    // mask of the low bits and compare with an unconverted (masked) d.
    //
    // if d has more precision than double supports, the masked result should
    // be different.

    while ((d&15) == ((d*1.0)&15) && i < 64) {
        d = d + d + 1
        i++
        print(i + " " + d + " " + d*1.0)
    }
    if (i == 64) {
        // d's precision stayed the same
        print("pass")
    } else {
        // d's kIntegerType precision exceed that of kDoubleType
        print("fail")
    }
}
f()

Correct behavior prints:

...
50 1125899906842623 1125899906842623
51 2251799813685247 2251799813685247
52 4503599627370495 4503599627370495
53 9007199254740991 9007199254740991
54 18014398509481984 18014398509481984
55 36028797018963970 36028797018963970
56 72057594037927940 72057594037927940
57 144115188075855870 144115188075855870
58 288230376151711740 288230376151711740
59 576460752303423500 576460752303423500
60 1152921504606847000 1152921504606847000
61 2305843009213694000 2305843009213694000
62 4611686018427388000 4611686018427388000
63 9223372036854776000 9223372036854776000
64 18446744073709552000 18446744073709552000
pass

buggy behavior (use a 64bit build and -Dinterp)

...
50 4294967295 1125899906842623
51 4294967295 2251799813685247
52 4294967295 4503599627370495
53 4294967295 9007199254740991
54 4294967295 18014398509481984
fail

Note that the printed value 4294967295 seems to be due to some code down in string conversion assuming that kIntegerType implies int32_t, which is a bug all on its own.  (shows up starting at 33 bits).  At least below 54 bits, and at least with the existing Atom encoding, a kIntegerType atom should be formatted and printed correctly with its full precision.

some possible fixes, none seem ideal:

* remove integer fast paths from 64bit builds

* keep fast paths, but limit kIntegerAtom to 29 bits on all cpu's

* modify kIntegerType atom representation to shift the value part farther left, limiting its payload to (say) 53 bits or less, e.g. (value<<16|kIntegerType) would limit to 48 bits before overflowing to double.  probably best on 64bit cpus but riskiest -- any naked code (*) assuming the value is <<3 or >>3 will break

(*) this is code we'd have to identify and fix for any atom encoding change anyway
Flags: in-testsuite?
The string formatting bug is in AvmCore::string(), which downcasts the Atom integer value to int32_t or uint32_t before calling intToString() or uintToString().  

a fix is to use a string conversion that takes intptr_t, and not do anything special for unsigned values (kIntegerType atoms are always signed values), all the way down the call stack into MathUtils::convertIntegerToStringBase10().
This doesn't fix the real bug but at least formats these large atoms correctly.
Assignee: nobody → edwsmith
Attachment #405456 - Flags: review?(lhansen)
buggy behavior with patch applied:

50 1125899906842623 1125899906842623
51 2251799813685247 2251799813685247
52 4503599627370495 4503599627370495
53 9007199254740991 9007199254740991
54 18014398509481983 18014398509481984  <-- notice: off by one digit
fail
Attachment #405456 - Flags: review?(lhansen) → review+
on inspection, integer fast paths for add, subtract, increment, decrement, are affected.  the _i variants truncate results to 32bits so seem fine as-is.
Comment on attachment 405456 [details] [diff] [review]
Formats large kIntegerType atoms (with >32 bits of precision) correctly as strings

string formatting patch pushed http://hg.mozilla.org/tamarin-redux/rev/92e9d5a62ff5
Attachment #405456 - Attachment is obsolete: true
(In reply to comment #0)
> some possible fixes, none seem ideal:
>
> * keep fast paths, but limit kIntegerAtom to 29 bits on all cpu's

Actually, can't we have the best of both worlds? have integer atoms be

   (intptr_t(int32val)<<32)|kIntegerType
That would somewhat arbitrarily exclude 1/2 of the uint32_t values.  33 bits of precision (<<31) would be the next possible choice that would include all of int32_t and uint32_t.  

If we agree to go past 32, I can't think of any reason to stop short of 53 bits of precision (<<11), except maybe on FPU's that are sloppy in the final couple bits when working on integral doubles.  are any that sloppy?  Remember this is 64-bit cpu's too, so arm vfp fast fp mode doesnt apply.
Blocks: 517127
Fair enough, but any change of this sort is (semi-)dependent on https://bugzilla.mozilla.org/show_bug.cgi?id=523192 as there is oodles of code (inside the VM and embedders) that cheat and peek directly at Atom contents. Finding all of these will be almost impossible to do by inspection.
(In reply to comment #7)
> If we agree to go past 32, I can't think of any reason to stop short of 53 bits
> of precision (<<11), except maybe on FPU's that are sloppy in the final couple
> bits when working on integral doubles.  are any that sloppy?  Remember this is
> 64-bit cpu's too, so arm vfp fast fp mode doesnt apply.

Seems like that would be a massive failure; 53 bits of integral precision is pretty much required by IEEE754 iirc.
arm's vfp "fast" mode is a mode that lets it violate IEEE754 in ways to speed up code.  other chips might have similar corners you can optionally cut.  i'm just hedging here, we need to investigate.  33 or 48 might be safe compromises.
iirc vfp fast mode only affects denormalized numbers, so shouldn't affect this.

that said, I could live with 48.
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → flash10.1
IMHO, the simplest and safest short-term fix is to constrain 64-bit int atoms to the union of int/uint range in the interpreter (and in OP_add etc). Longer term, using 53-bit ints would be worth doing but bug 523192 has to land first, which will take a while. 

Opinions?
cool.  adding this:

   result = (result<<N)>>N  // N in the range 32-53)

before the overflow test should do the trick
I still don't get why 53-bit would be worth it.  Most numbers used as numbers are small.  When they are large it's because they are bit vectors represented as int; when they are bit vectors, they are 32 bits in AS3 or JS.  So IMO a 32-bit payload seems the most reasonable for 64-bit systems.
I buy that argument for 34-53 bits.  I contend that 32 bits isn't enough because bitmasks and pixel values tend to be either int or uint, and restricting kIntegerType (which is signed) to 32 bits will exclude large uint32_t values.
(In reply to comment #15)
> I buy that argument for 34-53 bits.  I contend that 32 bits isn't enough
> because bitmasks and pixel values tend to be either int or uint, and
> restricting kIntegerType (which is signed) to 32 bits will exclude large
> uint32_t values.

Ah, but isn't the bug that the VM conflates int and uint?  :-)
(In reply to comment #14)
> I still don't get why 53-bit would be worth it.

They'd be worth it because they'd be basically cost-free once an opaque atom api is in place. (In reply to comment #16)

> Ah, but isn't the bug that the VM conflates int and uint?  :-)

Arguably, yep, but that one isn't solvable in the short-term.
poaching from Edwin
Assignee: edwsmith → stejohns
FWIW, this fails for me in NON-interp debug-debugger 64-bit builds, with 

54 18014398509481983 18014398509481984

investigating...
(In reply to comment #13)
> cool.  adding this:
> 
>    result = (result<<N)>>N  // N in the range 32-53)
> 
> before the overflow test should do the trick

We use that approach in a few places already (see SIGN_EXTEND in Interpreter.cpp) but it won't work for all cases of addition and subtraction, since the integer range we're talking about is not symmetrical (specifically, -2147483648 to 4294967295).
After fiddling with this a bit, trying to have an asymmetrical range for ints on 64-bit is making me nervous... I think we can get it right but I'm nervous there are going to be corner cases we miss, or optimizations we lose. I also suspect (without proof) that there is code that expects that numbers-in-the-int-range won't overflow with numbers-outside-the-int-range when it comes to atom (e.g., you can find 0x7fffffff in kIntegerType or kDoubleType but not both) -- this is an identity enforced by our current Atom-creation code.

Long-term, I think moving to 53-bit ints will be the best option, but now I'm leaning towards having 64-bit-build int atoms be 32 signed bits (+ 3 tag bits)... meaning that signed ints will always fit into an atom, but uints > 0x7fffffff will overflow to double. Granted, this is suboptimal, but IMHO might be safer for the short term.

Opinions?
I dont think we need an asymetrical limit.  It is fine (imo) for kIntegerType to have 33 bits of precision.  This will get the int and uint range, and won't perserve too much precision compared to double.

Any safety issues about kIntegerType and kDoubleType overlapping already exist for sub-29 bit atoms, right?

How about a compromise.  If we're adding code to limit, lets limit to 33 bits.  That will fix this bug which is the only known problem, without regressing 64-bit cpu performance.  If as the release progresses we still aren't comfortable with the risk, or if we find new problems, then we can drop to 32 bits and give up the performance for uint values, or drop to 29 bits for parity with 32-bit CPU's.

The eventual future optimization of shifting int's farther left lets us optimize out these range-limiting shifts on fast path.
(In reply to comment #22)
> Any safety issues about kIntegerType and kDoubleType overlapping already exist
> for sub-29 bit atoms, right?

Possibly but unlikely. This is admittedly a paranoid concern.
 
> How about a compromise.  If we're adding code to limit, lets limit to 33 bits. 
> That will fix this bug which is the only known problem, without regressing
> 64-bit cpu performance. 

Yep, but it introduces a new potential latent bug: int atoms can now have values that won't fit into 32 bits (either signed or unsigned). (Granted, we already have such a latent bug for code that assumes "int" means "int32").

> If as the release progresses we still aren't
> comfortable with the risk, or if we find new problems, then we can drop to 32
> bits and give up the performance for uint values, or drop to 29 bits for parity
> with 32-bit CPU's.

fair enough.
perhaps one thing we could do to reduce paranoi (and find bugs) is set aside some time for a thorough review of all the atom handling code.  It wont catch problems in code we dont know about, but it ought to put to rest worry about the avm code that handles atoms.
(In reply to comment #23)
> Yep, but it introduces a new potential latent bug: int atoms can now have
> values that won't fit into 32 bits (either signed or unsigned). (Granted, we
> already have such a latent bug for code that assumes "int" means "int32").

So if we can decide that it's ok to have larger-than-32-bit int atoms, no point in stopping at 33 bits.(In reply to comment #24)

> perhaps one thing we could do to reduce paranoi (and find bugs) is set aside
> some time for a thorough review of all the atom handling code. 

It's not code in the VM I'm worried about, it's the code in flash/air glue. I don't think it's practical to review it for errors.
spotted in AvmCore::istype:

        if (bt == BUILTIN_uint)
        {   
            return atomInt(atom) >= 0;
        }

not 64bit safe.  on 64bit, should be atomInt(atom) >= 0 && < (1LL<<32)
Attached patch Patch (obsolete) — Splinter Review
This implements a subset of the atom changes from bug 523192 to fix this issue. This normalizes 64-bit integer atoms to be signed 53-bit ints (to allow for precise conversion to double).

Note that this change renames kIntegerAtom to kIntptrAtom to reinforce the fact that an int atom can be larger than int (which has always been true). It also forces embedders to examine (and correct) all instances of direct int-atom manipulation.

Ideally, all details of int-atom manipulation will someday be encapsulated in atom.h and friends; for now, some bits are still spread in a few places (eg AvmCore, Interpreter), but that is expected to improve incrementally with the atom api in the future; in the meantime, this accomplishes the short-term goal of removing int-atom manipulation from embedder code. (The changes necessary in Flash + AIR are fairly modest.)

We now pass all sanities (including the new testcase) on mac in 32+64, jit+interp.
Attachment #408681 - Flags: superreview?(lhansen)
Attachment #408681 - Flags: review?(edwsmith)
whitespace:

atom-inlines.h: no tabs please (eventually all source files will be like this, with matching modelines)

atom.h: kIntptrType  = 6 should line up columns with other defs.  and no tabs, please.

(more soon)
Comment on attachment 408681 [details] [diff] [review]
Patch

Interpreter.cpp:1461: in INSTR(add_i), you could just CLAMP32 the result, which is equivalent to clamping both operands before adding, but with less ALU ops.

Interpreter.cpp (everywhere): once we know something is kIntptrType, removing the mask with "- kIntptrType" instead of "^ kIntptrType" allows better code since, for example, (a - 6) + (b - 6) = (a + b - 12); whereas with ^ we don't exect the same constant folding with (a ^ 6) + (b ^ 6)

INSTR(urshift): maybe the guard around MAKE_INTEGER can be simpler once we know IS_BOTH_INTEGER(a1,a2) :  if a2 != 0 and a1 was already a legal signed intptr_t (29 bits or 53 bits) then we'll definitely insert at least one zero in the high bit, and the result has to be a valid unsigned kIntptrType atom (28 or 52 bit uint).  if a2 == 0, then INSTR(urshift) is equivalent to ToUint32(a1); on 32-bit cpus we still must check if it fits in kIntptrType, but on 64-bit cpus it always will.  ergo, we just need to check for a2 == 0 on 32 bit only.

in BITOP_TWO_VALUES_AND_NEXT, SIGN_EXTEND() was removed. presumably all callers have been checked for validity?

INSTR(lookupswitch): integer_u became integer_i - why?  should index be int32_t?
Attachment #408681 - Flags: review?(edwsmith) → review+
(In reply to comment #28)
> atom-inlines.h: no tabs please (eventually all source files will be like this,
> with matching modelines)

I finally bit the bullet and switch my editor over to spaces instead of tabs, so dunno how that crept in.

> (From update of attachment 408681 [details] [diff] [review])
> Interpreter.cpp:1461: in INSTR(add_i), you could just CLAMP32 the result, 
> which is equivalent to clamping both operands before adding, but with 
> less ALU ops.

nice, will do (ditto for subtract_i and BITOP_TWO_VALUES_AND_NEXT)
 
> Interpreter.cpp (everywhere): once we know something is kIntptrType, removing
> the mask with "- kIntptrType" instead of "^ kIntptrType" allows better code
> since, for example, (a - 6) + (b - 6) = (a + b - 12); whereas with ^ we don't
> exect the same constant folding with (a ^ 6) + (b ^ 6)

agreed, but didn't want to increase the churn for this patch. let's do that in a subsequent patch.

> INSTR(urshift): maybe the guard around MAKE_INTEGER can be simpler...

nice, I'll try it out

> in BITOP_TWO_VALUES_AND_NEXT, SIGN_EXTEND() was removed. presumably 
> all callers have been checked for validity?

Yep, only 3 callers

> INSTR(lookupswitch): integer_u became integer_i - why?  should index be
> int32_t?

Verifier actually verifies that this value is int, not uint, so integer_u was never correct. And in fact, at least one acceptance test calls it with a negative value (I can dig it up if you're curious), so integer_u asserted (since it now checks for valid range). So the code was always wrong, it just didn't complain.
Attached patch Revised Patch Splinter Review
Same as previous patch, but with the minor changes suggested by Edwin
Attachment #408681 - Attachment is obsolete: true
Attachment #408877 - Flags: superreview?(lhansen)
Attachment #408877 - Flags: review?(edwsmith)
Attachment #408681 - Flags: superreview?(lhansen)
Attachment #408877 - Flags: review?(edwsmith) → review+
Should there be a method to insert a tag in an intptr_t value, so that the shift-by-3 is not exposed the way it is, or are we completely wedded to that?  I'm looking at the changes to WordcodeEmitter.cpp.  I see we have AvmThunkSmallIntAtom in the API.

Indeed there are two FIXMEs in WordcodeEmitter.cpp that say "wrong for 64-bit", should they be fixed?

The change to ArrayClass.cpp needs a version check; see bug #524122.

The definition of atomMaxIntValue is wrong for 64-bit, it must use 1LL, like the corresponding negative value.

In AvmCore::compare you can use atomIsBothIntptr.

Interpreter.cpp: The comment in INSTR(bitnot) ("// implicitly do AvmCore::integer prior to op, since 64-bit ints may be too large") scarcely makes sense in the context, as AvmCore::integer is nowhere in sight.

INSTR(urshift): I'm pretty sure that 

  #ifdef AVMPLUS_64BIT
  #else
  ...
  #endif

is not really conducive to good reading; I'm certain that indented #ifdefs are not.  Neither fits the style of the rest of the file (I have tried to outdent #ifdefs generally, except in file headers).
Comment on attachment 408877 [details] [diff] [review]
Revised Patch 

Approved with the provision that the problem with 1LL and 'sort' must be fixed, and #ifdeffery in the interpreter ought to be fixed.
Attachment #408877 - Flags: superreview?(lhansen) → superreview+
(In reply to comment #32)
> Should there be a method to insert a tag in an intptr_t value, so that the
> shift-by-3 is not exposed the way it is, or are we completely wedded to that? 
> I'm looking at the changes to WordcodeEmitter.cpp.  I see we have
> AvmThunkSmallIntAtom in the API.

there very much should be, but I decided to hold off on that and deal with it as part of bug 523192 instead. (this bug is shaping up to be a subset of the likely fix for that.) 

> Indeed there are two FIXMEs in WordcodeEmitter.cpp that say "wrong for 64-bit",
> should they be fixed?

oops. will investigate.

> The change to ArrayClass.cpp needs a version check; see bug #524122.

ah. you are correct. will back out that change.
 
> The definition of atomMaxIntValue is wrong for 64-bit, it must use 1LL, like
> the corresponding negative value.

oops, good catch, typo.
 
> In AvmCore::compare you can use atomIsBothIntptr.

yep.
 
> Interpreter.cpp: The comment in INSTR(bitnot) ("// implicitly do
> AvmCore::integer prior to op, since 64-bit ints may be too large") scarcely
> makes sense in the context, as AvmCore::integer is nowhere in sight.

um. yeah. brain fart.

> INSTR(urshift): I'm pretty sure that 
> 
>   #ifdef AVMPLUS_64BIT
>   #else
>   ...
>   #endif
> 
> is not really conducive to good reading; I'm certain that indented #ifdefs are
> not.  Neither fits the style of the rest of the file (I have tried to outdent
> #ifdefs generally, except in file headers).

will change to conform to the file. FWIW, though, I find this style easier to read, as I find the difference between "ifdef" and "ifndef" easy to miss.
pushed as changeset:   2919:85d77d66abae
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Depends on: 525479
Attachment #424907 - Flags: review?(brbaker)
Attachment #424907 - Flags: review?(brbaker) → review+
Testcase pushed to redux changeset 3727	3bedd1d5a4db, verified fixed r3871
Status: RESOLVED → VERIFIED
Flags: in-testsuite? → in-testsuite+
Blocks: 559321
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: