Closed Bug 1401624 Opened 7 years ago Closed 5 years ago

Use object-biased NaN boxing instead of double-biased on x86-64

Categories

(Core :: JavaScript Engine, enhancement, P3)

56 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla69
Performance Impact low
Tracking Status
firefox56 --- wontfix
firefox57 --- wontfix
firefox69 --- fixed

People

(Reporter: djvj, Assigned: iain)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [js:perf])

Attachments

(16 files, 41 obsolete files)

2.77 KB, text/x-csrc
Details
695 bytes, application/x-javascript
Details
748 bytes, application/x-javascript
Details
12.99 KB, text/plain
Details
42.32 KB, patch
Details | Diff | Splinter Review
21.53 KB, patch
Details | Diff | Splinter Review
2.32 KB, patch
Details | Diff | Splinter Review
10.96 KB, patch
Details | Diff | Splinter Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
We have discussed shifting over to object-biased boxing instead of double-biased NaN-boxing.

The main reasoning is that objects are presumably boxed and unboxed far more often than doubles in web code.

I had a proposal called Ex-Boxing that used the rotated high exponent bits in a double as low-bit tags, forcing heap-boxing of some doubles, but allowing low-bit tagging of boxed objects.  The theory was that high-bit tagging and the need to load high-bit constants into registers before using them in operations was a significant hit.

I tested my theory on a microbench and it turned out that low-bit tagging and high-bit tagging are basically equivalent from a performance perspective.  So I'm shelving my ex-boxing idea.

On the upside, the difference between a nop and high-bit tagging is significant.

To measure the potential impact of switching to object-biased boxing, I instrumented Value.h to print out diagnostics whenever a value was boxed or unboxed, disabled the jits, and ran the engine.  The results on shell benchmarks showed 76% of all operations on Values being |toObject|.  The remaining operations are rounding errors.

Turning this operation into a no-op should be a significant benefit.
Attached patch object-biased-boxing.patch (obsolete) — Splinter Review
This is an initial patch that doesn't work and fails on basically every test.  But the core code is there to work off of.

After some investigation, I am going with the following approach:

The original number space we want to use for boxing is the value:

0xFFF0_0000_0000_0000

This is -Inf, but any non-zero value in the low 52 bits will make the value into a NaN.

We rotate the number space by adding 0x0010_0000_0000_0000.  This shifts -Inf to 0, and all signalling NaNs (high-bit set) to non-zero values with the high 16 bits being zero.

Our canonical NaN and +Inf, which all have high bits 0x7FF, will shift to have high bits 0xFFF.
All other doubles will shift to have non-zero in the 12 high bits.

To handle -Inf being 0, we test for 0 after the transform (should be a simple CPU flag check after the adjustment), and encode -Inf specially as 0xFFF0_0000_0000_0001 (i.e. the equivalent of a transformed non-canonical NaN with the low-bit set).

This handles doubles.

We use the tag 0 for objects, leading to objects having the high 17 bits be 0, and needing no operations for boxing and unboxing.

Remaining values are boxed as before.
Assignee: nobody → kvijayan
Exciting. How does this scheme compare to JSC (32-bit and 64-bit)? JSC tends to get these things right so it can't hurt to compare to them.
So JSC's system uses the following tag patterns for the high bits:

0x0000... for pointers
0x0001... to 0xFFFE for doubles
0xFFFF... for int32s

They use low-value constants to encode singleton constants (null, undef, false, true).  So basically they use a restricted subset of the NaN space (taking 15 high bits), and using 1 bit to distinguish between pointers and int32s.

I think this is a great approach, but it doesn't fit well with our current codebase.  There are heavy assumptions in our codebase around the high-bits in a Value always storing the type.  We also store more type information directly in the boxedval.  We could change this, but given the evidence indicating that object unboxing is the high-value target, it doesn't seem worthwhile to go to the effort of completely revamping the boxing format and breaking all assumptions.. when we can go with a more incremental approach and get most of the gains.

I'm also somewhat concerned about regressions due to the change, and want to minimize the changes in assumptions to reduce that risk.

Our current approach is heavily focused on the high bits explicitly representing type tags, and we have lots of types (Null, Undef, Bool, Int32, Double, String, Symbol, Magic, Object).  So I'm mostly trying to keep as much the same as possible.

Adding 0x0010... shifts the 0xFFF... numberspace (containing -Inf and SigNaNs) to 0x000...
We manually shift out -Inf, and we have our 5 bits for type tags again.

Thoughts?
Priority: -- → P3
Whiteboard: [js:perf][qf:p3]
(In reply to Kannan Vijayan [:djvj] from comment #1)
> Created attachment 8910363 [details] [diff] [review]
> object-biased-boxing.patch
> 
> We use the tag 0 for objects, leading to objects having the high 17 bits be
> 0, and needing no operations for boxing and unboxing.

At the risk of having missed something, we are now seeing 49-bit virtual addresses in the wild on 64-bit systems (aarch64, significantly, and less importantly on SPARC).  I know you wrote x86_64 in the title of this bug, but at this point it seems a shame not to consider arm64 or the real risk that x64 is going to move beyond 48-bit virtual addresses at some point.

Also, this scheme only seems to work if kernel memory is in the high half of the virtual address space (so that bit 47 is 0 in user mode addresses); this is common, but is it universal?
A few thoughts:

* One thing I like about JSC's system is that the upper 16 bits store the tag, so if you have a Value in memory you can always use load16/branch16 to load/check the tag without any bitops. (For some of the "32-bit payload types" we already do something similar by comparing the upper 32 bits but we can't do that for objects/strings/symbols.)

* On 32-bit, our current boxing scheme is pretty ideal (at least assuming 8-byte Values). One issue is that we probably have shared code using masm.storeDouble et al to "box" doubles. That's fixable though.

* Overall the idea makes sense - the main thing I don't like is the extra branch required when boxing doubles. I haven't really thought about it but is there really no way to avoid this?
(In reply to Lars T Hansen [:lth] from comment #4)
> I know you wrote x86_64 in the title of this bug,
> but at this point it seems a shame not to consider arm64 or the real risk
> that x64 is going to move beyond 48-bit virtual addresses at some point.

IIUC, we've eliminated all the cases of fake GC things allocated in static memory and thus every pointer that gets nan-boxed is allocated by our own GC (right?).  Given that, if OSes start allowing more than 47 significant pointer bits, we should be able to use mmap()/VirtualAlloc() address hints to place our GC allocations in the desired range.  That doesn't cover the hypothetical case where kernel takes low memory, forcing all user space into high memory, but given all the 64-bit precedent keeping user space in low memory, this seems like a big risk for an architecture to go this route and thus it seems less likely.
(In reply to Luke Wagner [:luke] from comment #6)
> Given that, if OSes start allowing more than 47 significant
> pointer bits, we should be able to use mmap()/VirtualAlloc() address hints
> to place our GC allocations in the desired range.

We have some (pretty rough) code for this already in the GC allocator. Unfortunately from tests on arm64 distros with 48-bit pointers it seems that mmap doesn't treat the passed address as a hint, instead just failing if the address is already allocated, so we do this awful loop: http://searchfox.org/mozilla-central/source/js/src/gc/Memory.cpp#580

A better way would be to map large regions at a time and just commit/decommit subsections (like we do for executable memory used by the JITs, but more dynamic), so we'd only incur the overhead of getting an appropriate region very rarely.
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #7)
> A better way would be to map large regions at a time and just
> commit/decommit subsections (like we do for executable memory used by the
> JITs, but more dynamic), so we'd only incur the overhead of getting an
> appropriate region very rarely.

That sounds like a good idea.
(In reply to Lars T Hansen [:lth] from comment #4)
> At the risk of having missed something, we are now seeing 49-bit virtual
> addresses in the wild on 64-bit systems (aarch64, significantly, and less
> importantly on SPARC).  I know you wrote x86_64 in the title of this bug,
> but at this point it seems a shame not to consider arm64 or the real risk
> that x64 is going to move beyond 48-bit virtual addresses at some point.

This was a concern of mine as well.  I actually took a while to convince myself that yes, we were _actually_ assuming that ptrs only have 47 significant bits.

Our current approach of always storing the type-tag in the high bits requires at least 4 bits.  I used 5 bits because I'm keeping this as incremental as possible to reduce risks and get it in fast.

We can reduce that burden in several ways.  JSC (and V8 for that matter) doesn't distinguish between Object/String/Symbol ptrs in the value, and instead stores the type in the object itself.  JSC additionally moves the singleton constants (undef, null, true, false) under the object type tag, but with small-int-invalid-pointer values for the payload.

There's room for us to maneuver here.

> Also, this scheme only seems to work if kernel memory is in the high half of
> the virtual address space (so that bit 47 is 0 in user mode addresses); this
> is common, but is it universal?

I'll go with Luke's judgement on this.  I was super sketched out about the 47 bit + userspace-in-lowmem assumption too.  Considering the comments, I think it's reasonable to go ahead allow 48 bits of ptr payload, since we're not using 5 type bits in any case (we have 8 types, so technically we can get away with 3 type bits), and also considering jandem's comments about the nice properties of a 16-bit tag.


(In reply to Jan de Mooij [:jandem] (PTO Sep 21-22) from comment #5)
> A few thoughts:
> 
> * One thing I like about JSC's system is that the upper 16 bits store the
> tag, so if you have a Value in memory you can always use load16/branch16 to
> load/check the tag without any bitops. (For some of the "32-bit payload
> types" we already do something similar by comparing the upper 32 bits but we
> can't do that for objects/strings/symbols.)

Oh, that's a very interesting point.  I noticed the tricks we do with the 32-bit payload cases.  This is a good argument for moving to a JSVAL_TAG_SHIFT of 48, and there's really nothing stopping us from doing that.

> 
> * On 32-bit, our current boxing scheme is pretty ideal (at least assuming
> 8-byte Values). One issue is that we probably have shared code using
> masm.storeDouble et al to "box" doubles. That's fixable though.
> 
> * Overall the idea makes sense - the main thing I don't like is the extra
> branch required when boxing doubles. I haven't really thought about it but
> is there really no way to avoid this?

Yeah, that branch is a wart.  Thinking about it..

Instead of incrementing by (2^52), we can increment by (2^51), which would keep -Inf out of the picture (0xfff0... => 0xfff8...).  Canonical SigNaN (0xfff8_0000_0000_0000) would get sent to 0, and we'd still have 4 free type bits in the tag.  This would also get us the 16-bit tag as well, allowing the easy type checks you mention earlier.
So I think I've run across the first place in the code where we assume our current boxing format.  When we finish emitting bytecode and are filling our JSScript info in, we copy constant values into the script's constant table.  The table is initialized to 0s before it's written to.  The write into the table triggers a post-barrier which tests if the existing value is an object and if so, does some stuff.

Currently the 0 value presents as a double and this code is skipped.  With my patch it presents as an object and we crash when the barrier code looks into it.  Sigh.  Hope there are not too many places like this.
Attached patch object-biased-boxing.patch (obsolete) — Splinter Review
Updated patch.  Caught a few more places where we're using implicit zeroes, but still haven't gotten everything yet.

This version passes most jsreftests (didn't run to completion, but >99% success rate), and most jit-tests pass (around 3.5% fail rate).  Haven't tested performance yet.  Need to be relatively sure it won't crash a browser run before testing that.
Attachment #8910363 - Attachment is obsolete: true
After following up on more test errors, I realized that I was being too aggressive with the amount of space in the 64-bit numberspace I was allocating to tagged values.  Namely, negative NaNs (which I had previously erroneously thought were signalling NaNs) can be generated by code.  This uses the top 13 bits of a double-word, and leaves only 3 type bits in the tag, which is not quite sufficient for our needs.

There are a couple of viable solutions here.  One is to free up some type-bits by unifying certain types.  JSC unifies the representation of undef, null, true, false, string, and symbol all under the object type.  This frees them up to only require 3 different types to be represented in the boxed value: Int, Double, and Object.

For now, that seems like a much more involved approach, so instead I'm going back to using 17-bit tags instead of 16-bit tags.  This will give us the 4 type bits we need and makes for a smaller change to the existing assumptions.

The core set of changes seem to be OK at this point.  The main source of errors ATM seems to be the jit.  I'm mostly running into correctness errors now, and few crashes.
Keywords: perf
:djvj, I didn't get a chance to talk much at the All-Hands, but I think I'm going to be helping out with this bug as part of the performance team. Is there anything in particular I should take a look at first here?
Flags: needinfo?(kvijayan)
Attached patch fix-bitrot.patch (obsolete) — Splinter Review
Ok, I this fixes the bitrot and should hopefully do everything right to avoid regressing the Spectre patch. Of course, there are still lots of test failures that I'm investigating.
Assignee: kvijayan → jporter+bmo
Attachment #8913337 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8991405 - Flags: feedback?(kvijayan)
Hi Jim,

Sorry the long wait.  I think I spoke with you after this feedback request and we went over the patch 1:1, right?  How are you faring with the tests?  If you want we can meet again and discuss.
Flags: needinfo?(kvijayan)
Attached patch nanbox-series.patch (obsolete) — Splinter Review
Attachment #8991405 - Attachment is obsolete: true
Attachment #8991405 - Flags: feedback?(kvijayan)
Attached patch nanbox-series.patch (obsolete) — Splinter Review
Attachment #9009214 - Attachment is obsolete: true
Attached patch object-biased-boxing.patch (obsolete) — Splinter Review
Just posting this so I don't lose track.  Found a bug in boxing when pair-debugging with Jim.  After that, the following jit-test run produces the following failures:

./jit-test/jit_test.py --args='--no-ion --baseline-eager --no-wasm-baseline --no-wasm-ion' ./BUILD-DEBUG64/dist/bin/js

FAILURES:
    asm.js/testFFI.js
    asm.js/testFloatingPoint.js
    asm.js/testKVKV.js
    asm.js/testBullet.js
    ion/bug1165905.js
    ion/bug860838.js
    sunspider/check-string-tagcloud.js
TIMEOUTS:
    debug/bug1370905.js
    gc/bug-1292564.js
Benjamin, perhaps you already know the answer here and can save them some time? It looks like the FFI of a double between wasm and js is missing a boxing or unboxing call and assuming float conversion is a no-op. Do you have ideas where in the code we copy in/out of wasm with a double, or maybe even can spot the copy that is missing a box?
Flags: needinfo?(bbouvier)
Minimal test case (extracted from asm.js/testFFI):


function ffiDoubleFew(a,b,c,d) { return d+1 }
var f = asmLink(asmCompile('glob', 'imp', USE_ASM + 'var ffi=imp.ffi; function f(i) { i=+i; return +ffi(i,i+1.0,i+2.0,i+3.0) } return f'), null, {ffi:ffiDoubleFew});
for (var i = 0; i < 15; i++)
    assertEq(f(i), i+4);


It appears the code that fills up the arguments when doing a fast call from wasm to the JITs is using plain storeDouble() to put operands of type Double onto the stack, when it should not assume anything about the Value encoding and probably use boxDouble() instead.

You need to update the lines in the `toValue` branches of FillArgumentArray function:

https://searchfox.org/mozilla-central/source/js/src/wasm/WasmStubs.cpp#1098-1124
https://searchfox.org/mozilla-central/source/js/src/wasm/WasmStubs.cpp#1136-1146

If following this, there are issues for the return values, it could be that the return value isn't correctly converted, deep into one of the callees of these functions: https://searchfox.org/mozilla-central/source/js/src/wasm/WasmStubs.cpp#1513-1518

Happy to help more if more issues like this pop up; I've checked the JIT entries and they seem to properly use boxDouble when filling up the arguments.
Flags: needinfo?(bbouvier)
Depends on: 1485347
Gary, before this actually gets landed we'll want to do some fuzzing. Is it straightforward to do differential testing of the existing vs proposed boxing when this is ready? Is a build flag to select boxing style sufficient?

(I figure we should start this discussion now..)
Flags: needinfo?(nth10sd)
Depends on: 1493739
> Is a build flag to select boxing style sufficient?

A runtime flag is needed.
Flags: needinfo?(nth10sd)
I'm not sure it will be possible to have a runtime flag for this patch. We'll need to think more on this.
It's definitely not possible to have a runtime flag to make this switch.  It's a deep change of the value-representation.

We'll need to fuzz the patch directly.  Gary, what are our options here?
Flags: needinfo?(nth10sd)
This patch fixes up existing places in the VM (non-jitcode) where we assume that the value format naturally boxes doubles.
Attachment #9012664 - Flags: review?(jdemooij)
Comment on attachment 9012664 [details] [diff] [review]
fix-implicit-value-format-usage.patch

Review of attachment 9012664 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for breaking up. Can you leave BytecodeEmitter and JSScript alone here?
Bug 1485347 (which I need to land as soon as I'm back) has fixed them.
Attachment #9012664 - Flags: review?(jdemooij)
Comment on attachment 9012664 [details] [diff] [review]
fix-implicit-value-format-usage.patch

Oops.. didn't mean to set flags.
Attachment #9012664 - Flags: review?(jdemooij)
(In reply to Benjamin Bouvier [:bbouvier] from comment #20)
> You need to update the lines in the `toValue` branches of FillArgumentArray
> function: [snip]

Thanks for these code references. It looks like changing these helps a few things, but not as much as I would have expected. The minimal test case you posted fails in the same way with or without patching those lines. Does it work for you? Maybe I'm just doing something wrong...
Ah yes.  I actually looked at that and went "oh right, it's masm.whatever.. that's jitcode".  Forgot that wasm always at least baseline-jits and will interact badly with even non-jitted js with the value-boxing changes.
(In reply to Ted Campbell [:tcampbell] (PTO until Oct 9) from comment #26)
> Thanks for breaking up. Can you leave BytecodeEmitter and JSScript alone
> here?
> Bug 1485347 (which I need to land as soon as I'm back) has fixed them.

I've landed some of those patches. Don't worry about any remaining conflicts. I'll sort them out later.
This moves the type tag over by one bit, leaving us with 47 bits of payload
for objects. The end result is that this patch changes less than it used to.
(Note: this should get rolled into the previous patch later!)

Depends on D7138
Attachment #9009262 - Attachment is obsolete: true
What's the compile-time flag needed to trigger this mode?

(I'm still not promising anything yet but I have ideas...)
Flags: needinfo?(nth10sd)
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #35)
> What's the compile-time flag needed to trigger this mode?
> 
> (I'm still not promising anything yet but I have ideas...)

Currently, there is none. It's always on for x86-64 builds.
And to be clear: it'll crash on any non-x86-64 builds if you enable the jits (and probably even if you don't when running wasm code).
Fixed more cases in the current code where the current value-boxing format (with noop-boxed doubles) are implicitly assumed.

This is not ready for review yet as wasm tests are still failing jit-tests running '--no-ion --baseline-eager':

FAILURES:
    ion/bug1165905.js
    wasm/basic.js
    --no-wasm-baseline wasm/basic.js
    --no-wasm-ion wasm/basic.js
    --test-wasm-await-tier2 wasm/basic.js
    wasm/import-export.js
    --no-wasm-baseline wasm/import-export.js
    --no-wasm-ion wasm/import-export.js
    --test-wasm-await-tier2 wasm/import-export.js
    wasm/js-reexport.js
    --no-wasm-baseline wasm/js-reexport.js
    --no-wasm-ion wasm/js-reexport.js
    --test-wasm-await-tier2 wasm/js-reexport.js
    wasm/nan-semantics.js
    --no-wasm-baseline wasm/nan-semantics.js
    --no-wasm-ion wasm/nan-semantics.js
    --test-wasm-await-tier2 wasm/nan-semantics.js
Attachment #9012664 - Attachment is obsolete: true
Attachment #9012664 - Flags: review?(jdemooij)
Updated patch.  Just testing with interpreter-only (no baseline and no ion) to isolate the wasm failures that don't relate to jit for now:

FAILURES:
    asm.js/testBullet.js
    wasm/nan-semantics.js
    --no-wasm-baseline wasm/nan-semantics.js
    --test-wasm-await-tier2 wasm/nan-semantics.js
    --no-wasm-ion wasm/nan-semantics.js
Attachment #9012767 - Attachment is obsolete: true
Rebased to tip.. some of the deltas go away since :tcampbell landed them already.
Attachment #9012778 - Attachment is obsolete: true
Minor cleanup on earlier patch.
Attachment #9012779 - Attachment is obsolete: true
Benjamin, question for you.  This is a test failure where it should be obvious to the right person (e.g. you) where the right place to look is.

In wasm/nan-semantics.js, the following test fails:

    var checkBitPatterns = {
        "": {
            float32(x) {
                f32[1] = x;
                assertSameBitPattern(0, 4, 4);
            },
            float64(x) {
                f64[1] = x;
                assertSameBitPattern(0, 8, 8);
            }
        }
    }

    f64[0] = NaN;
    f64[0] = f64[0]; // Force canonicalization.
    f64[1] = wasmEvalText('(module (func (result f64) (f64.const nan:0x123456)) (export "" 0))').exports[""]();
assertSameBitPattern(0, 8, 8);

    wasmEvalText('(module (import "" "float64" (param f64)) (func (call 0 (f64.const nan:0x123456))) (export "" 0))', checkBitPatterns).exports[""]();



The issue seems to be the call to "float64(x)" - I don't think I'm catching the location where we're calling "float64()" with the nan-constant and boxing it.  As far as I can tell, I _did_ address the locations where that happens in the fix-implicit-value-format-usage patch (the FillArgumentsArray changes).

Tried searching around for other places where the call might be getting invoked and I'm coming up with nothing.  Do you know where I could look.. or am I doing something wrong when boxing the double for a call to an JS FFI?
Flags: needinfo?(bbouvier)
Exciting to see we're getting close. I'm happy to review once ready; I might be on PTO (ish) for a few days but I should have time next week I think.
(In reply to Kannan Vijayan [:djvj] from comment #42)
> Benjamin, question for you.  This is a test failure where it should be
> obvious to the right person (e.g. you) where the right place to look is.
> 
> In wasm/nan-semantics.js, the following test fails:
> 
>     var checkBitPatterns = {
>         "": {
>             float32(x) {
>                 f32[1] = x;
>                 assertSameBitPattern(0, 4, 4);
>             },
>             float64(x) {
>                 f64[1] = x;
>                 assertSameBitPattern(0, 8, 8);
>             }
>         }
>     }
> 
>     f64[0] = NaN;
>     f64[0] = f64[0]; // Force canonicalization.
>     f64[1] = wasmEvalText('(module (func (result f64) (f64.const
> nan:0x123456)) (export "" 0))').exports[""]();
> assertSameBitPattern(0, 8, 8);
> 
>     wasmEvalText('(module (import "" "float64" (param f64)) (func (call 0
> (f64.const nan:0x123456))) (export "" 0))', checkBitPatterns).exports[""]();
> 
> 
> 
> The issue seems to be the call to "float64(x)" - I don't think I'm catching
> the location where we're calling "float64()" with the nan-constant and
> boxing it.  As far as I can tell, I _did_ address the locations where that
> happens in the fix-implicit-value-format-usage patch (the FillArgumentsArray
> changes).
> 
> Tried searching around for other places where the call might be getting
> invoked and I'm coming up with nothing.  Do you know where I could look.. or
> am I doing something wrong when boxing the double for a call to an JS FFI?

The line which is calling it is the following:

    wasmEvalText('(module (import "" "float64" (param f64)) (func (call 0 (f64.const nan:0x123456))) (export "" 0))', checkBitPatterns).exports[""]();

This creates a wasm module that imports a JS function called "float64" from the "" namespace of the imports object, and then `call 0` calls the function that has index 0 (and imported functions are the first one in the function index range, since the import section is placed before the function declaration section).

Looking at which failures are triggered, my best guess is that the issue isn't in the stub itself. The entry stub passes the right information to the Instance::callImport function, which does this for f32/f64 arguments:

          case ValType::F64:
            args[i].set(JS::CanonicalizedDoubleValue(*(double*)&argv[i]));
            break;

The CanonicalizedDoubleValue function sets a Value from the raw bits, which needs fixing for the new boxing format:

    return MOZ_UNLIKELY(mozilla::IsNaN(d))
           ? Value::fromRawBits(detail::CanonicalizedNaNBits)
           : Value::fromDouble(d);

https://searchfox.org/mozilla-central/source/js/public/Value.h#994

Maybe it's already fixed in another patch than the fix-implicit-value-format-usage one? If it's not this, I'm happy to investigate more if you have a rolled up patch containing all the fixes.
Flags: needinfo?(bbouvier)
I don't think this is an update, but just in case - re-uploading from patch queue.
Attachment #9012781 - Attachment is obsolete: true
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Update to object-biased-nanboxing with fix to CanonicalizedNaN value constructor.

Now with interpreter-only, exactly one test fails: testBullet.js

That's a huge test case, so I'll address the baseline and ion failures first and see if the fixes there catches the issue.
Attachment #9011287 - Attachment is obsolete: true
Documenting remaining test failures so I don't have to keep running the tests again:

FAILURES:
    asm.js/testFFI.js
    asm.js/testFloatingPoint.js
    asm.js/testKVKV.js
    asm.js/testBullet.js
    ion/bug1165905.js
    ion/bug860838.js
    sunspider/check-string-tagcloud.js
    --no-wasm-baseline wasm/builtin.js
    wasm/builtin.js
    --no-wasm-ion wasm/builtin.js
    --test-wasm-await-tier2 wasm/builtin.js
    wasm/conversion.js
    --no-wasm-baseline wasm/conversion.js
    --no-wasm-ion wasm/conversion.js
    --test-wasm-await-tier2 wasm/conversion.js
    wasm/import-export.js
    --no-wasm-baseline wasm/import-export.js
    --no-wasm-ion wasm/import-export.js
    --test-wasm-await-tier2 wasm/import-export.js
    wasm/ion-args.js
    --no-wasm-baseline wasm/ion-args.js
    --no-wasm-ion wasm/ion-args.js
    --test-wasm-await-tier2 wasm/ion-args.js
    --no-wasm-baseline wasm/js-reexport.js
    wasm/js-reexport.js
    --no-wasm-ion wasm/js-reexport.js
    --test-wasm-await-tier2 wasm/js-reexport.js
The "testKVKV.js" is a reduced testcase I obtained from testFFI.js.

Benjamin, can you point me in the right direction with this?


The test case is:

load(libdir + "asm.js");
load(libdir + "asserts.js");

var imp = { };

var recurse = function(i,j) { if (i == 0) return j; return f(i-1,j+1)+j }
imp.recurse = recurse;
var f = asmLink(asmCompile('glob', 'imp', USE_ASM + 'var r=imp.recurse; function f(i,j) { i=i|0;j=+j; return +r(i|0,j) } return f'), null, imp);
assertEq(f(0,3.3), 3.3);
print("HERE4.1.1");
assertEq(f(1,3.3), 3.3+4.3);
print("HERE4.1.2");

The test case fails only when it is run with '--baseline-eager --no-asmjs' flags passed:


The output from jit-tests is:

HERE4.1.1
/home/kvijayan/checkouts/mozilla-inbound/js/src/jit-test/tests/asm.js/testKVKV.js:11:1 Error: Assertion failed: got 4.600000000000001, expected 7.6
Stack:
  @/home/kvijayan/checkouts/mozilla-inbound/js/src/jit-test/tests/asm.js/testKVKV.js:11:1
Flags: needinfo?(bbouvier)
Apologies, I meant this only fails when run with '--baseline-eager' and *with asmjs enabled*.
Cancelling needinfo, sorry for the bother.  After some deep-diving with rr, this seems like it's NOT coming from asmjs code at all - but rather (probably) the bytecode compiler (haven't finished it yet).

Tracing back I can see a JSOP_DOUBLE pushing a script constant with the "4.6-ish" raw value I'm seeing showing up.  I'm suspecting some constant-folding code in the bytecode-compiler at this point.
Flags: needinfo?(bbouvier)
Looking at this again.. the issue seems to actually be in the baseline code:

Traced with rr and I'm seeing the following lhs and rhs at the _second_ run of the `recurse` function, at the '+' operator which adds the result of the recursive call to argument (i.e. the  '+' in `f(...)+j` towards the end of the `recurse` function):

(rr) p lhs.ptr->asDouble_
$116 = 3.3000000000000003
(rr) p rhs.ptr->asDouble_
$117 = 3.2999999999999998

This doesn't seem normal: the second invocation of that JSOP_ADD should be `4.3 + 3.3` (or something close to it).  Now, looking at how our new boxing layout works.. we see this:

1: Double 3.3000000000000003
  Raw U64: 400a666666666667
  Box U64: 4012666666666666

2: Double 3.2999999999999998
  Raw U64: 400a666666666666
  Box U64: 4012666666666665

3: Double 3.3
  Raw U64: 400a666666666666
  Box U64: 4012666666666665

Note that the representation of 3.3 and 3.2999...98 are identical.  These are the same values.

Also note that the representation of 3.3 and 3.30000...3 are the result of incrementing the least significant bit of the mantissa.

I suspect a 'uint64_t(1) + uint64_t(X)' is happening somewhere, where X is the float64, or boxed-float64 representation of a double value.
An updated testcase paints a better picture of the problem:

load(libdir + "asm.js");
load(libdir + "asserts.js");

var imp = { };

var recurse = function(i,j) { print("I=" + i + ", " + " j=" + j); if (i == 0) return j; return f(i-1,j+1)+j }
imp.recurse = recurse;
var f = asmLink(asmCompile('glob', 'imp', USE_ASM + 'var r=imp.recurse; function f(i,j) { i=i|0;j=+j; return +r(i|0,j) } return f'), null, imp);
assertEq(f(0,3.3), 3.3);
print("HERE4.1.1");
assertEq(f(1,3.3), 3.3+4.3);
print("HERE4.1.2");



The Output:

I=0,  j=3.3
HERE4.1.1
I=1,  j=2.3000000000000003
I=0,  j=2.3000000000000007
/home/kvijayan/checkouts/mozilla-inbound/js/src/jit-test/tests/asm.js/testKVKV-2.js:11:1 Error: Assertion failed: got 4.600000000000001, expected 7.6
Stack:
  @/home/kvijayan/checkouts/mozilla-inbound/js/src/jit-test/tests/asm.js/testKVKV-2.js:11:1


The first line after the "HERE4.1.1" is interesting.  It should be the same values being passed in via `f(1, 3.3)`, but it's not.

If I remove the first call to the asm.js `f` (i.e the call to `f` in `assertEq(f(0,3.3), 3.3)`), the test passes fine.

This suggests that baseline calling is the issue here: the first time we call, we use the fallback stub for the call and the arguments go through fine.  The second time we call, we should be using the optimized call stub.  I think this is failing to handle doubles properly.
Attachment #9012998 - Attachment is obsolete: true
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Found the issue causing most of the remaining test failures on '--no-ion --baseline-eager'.

Remaining jit-test failures with that mode are:

    asm.js/testBullet.js
    sunspider/check-string-tagcloud.js


Still getting a lot of test failures with '--ion-eager --baseline-eager'.
Attachment #9012999 - Attachment is obsolete: true
Attached patch ion-fixes.patch (obsolete) — Splinter Review
Started fixing bugs showing up in ion.  Patch attached.  Testing with '--no-ion --baseline-eager' produces too few failures.. so now I've moved onto '--ion-eager --baseline-eager'.

With this patch, the test failures are:

    TypedObject/jit-read-float64.js
    --no-asmjs asm.js/testFFI.js
    asm.js/testFFI.js
    asm.js/testFloatingPoint.js
    --no-asmjs asm.js/testFloatingPoint.js
    --no-asmjs asm.js/testMathLib.js
    asm.js/testMathLib.js
    asm.js/testZOOB.js
    asm.js/testBullet.js
    basic/bug652054.js
    basic/doMath.js
    basic/homogenous-literals.js
    basic/math-random.js
    basic/math-jit-tests.js
    basic/parseIntTests.js
    basic/testBug504520Harder.js
    basic/testBug554043.js
    basic/testBug720695.js
    basic/testDoubleToStr.js
    basic/testNativeLog.js
    basic/testParseInt.js
    basic/testTypedArrayClamping.js
    basic/testTypedArrayInit.js
    closures/set-outer-trace-3.js
    closures/t004.js
    debug/wasm-10.js
    debug/wasm-13.js
    --no-wasm-baseline debug/wasm-13.js
    for-of/array-iterator-generic.js
    ion/array-push-length-overflow.js
    ion/bug1000960.js
    ion/bug1107011-2.js
    ion/bug1189137.js
    ion/bug705351.js
    ion/bug764432.js
    ion/bug860838-3.js
    ion/bug915301.js
    ion/ceil.js
    ion/dce-with-rinstructions.js
    ion/lsra-bug1112164.js
    ion/setelem-float32-typedarray-ic.js
    ion/test-scalar-replacement-float32.js
    ion/testFloat32.js
    ion/inlining/inline-getelem-args.js
    jaeger/bug618863.js
    jaeger/bug639792.js
    jaeger/bug658240.js
    jaeger/bug771871.js
    jaeger/recompile/bug659639.js
    jaeger/recompile/incdec.js
    jaeger/recompile/inlinestubs.js
    jaeger/recompile/memory-02.js
    self-hosting/tolength.js
    sunspider/check-3d-cube.js
    sunspider/check-3d-raytrace.js
    sunspider/check-3d-morph.js
    sunspider/check-access-nbody.js
    sunspider/check-bitops-nsieve-bits.js
    sunspider/check-math-cordic.js
    sunspider/check-math-partial-sums.js
    sunspider/check-math-spectral-norm.js
    sunspider/check-date-format-tofte.js
    sunspider/check-date-format-xparb.js
    sunspider/check-string-fasta.js
    sunspider/check-string-tagcloud.js
    v8-v5/check-raytrace.js
    typedarray/sort.js
    sunspider/check-string-unpack-code.js
    v8-v5/check-splay.js
    wasm/bce.js
    --no-wasm-baseline wasm/bce.js
    wasm/builtin.js
    --no-wasm-ion wasm/bce.js
    --test-wasm-await-tier2 wasm/bce.js
    --no-wasm-baseline wasm/builtin.js
    --test-wasm-await-tier2 wasm/builtin.js
    --no-wasm-ion wasm/builtin.js
    wasm/const.js
    --no-wasm-baseline wasm/const.js
    --test-wasm-await-tier2 wasm/const.js
    --no-wasm-ion wasm/const.js
    wasm/conversion.js
    --no-wasm-baseline wasm/conversion.js
    --no-wasm-ion wasm/conversion.js
    --test-wasm-await-tier2 wasm/conversion.js
    --no-wasm-baseline wasm/errors.js
    --no-wasm-ion wasm/errors.js
    wasm/errors.js
    --test-wasm-await-tier2 wasm/errors.js
    --no-wasm-ion wasm/ion-args.js
    --test-wasm-await-tier2 wasm/ion-args.js
    wasm/memory.js
    --test-wasm-await-tier2 wasm/memory.js
    --no-wasm-baseline wasm/memory.js
    --no-wasm-ion wasm/memory.js
    wasm/spec/harness/wast.js

Seems like a lot, but probably tracks back to a handful of places once again.
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Rolled the ion fixes into the main object-biased-nanboxing patch to avoid messiness between patch changes.
Attachment #9013915 - Attachment is obsolete: true
Attachment #9014143 - Attachment is obsolete: true
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Updated rebased patch.
Assignee: jporter+bmo → kvijayan
Attachment #9012739 - Attachment is obsolete: true
Attachment #9012740 - Attachment is obsolete: true
Attachment #9012741 - Attachment is obsolete: true
Attachment #9013914 - Attachment is obsolete: true
Attachment #9014145 - Attachment is obsolete: true
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Whoops, uploaded stale unified patch before.  This is the right one.
Attachment #9019573 - Attachment is obsolete: true
With this version of the patch, there are no test failures with '--no-ion --no-baseline'.

Test failures with '--no-ion --baseline-eager':
    asm.js/testBullet.js
    ion/bug1165905.js
    sunspider/check-string-tagcloud.js


Test failures with '--ion-eager':
    TypedObject/inlineopaque.js
    TypedObject/jit-read-float64.js
    asm.js/testFFI.js
    --no-asmjs asm.js/testFFI.js
    asm.js/testFloatingPoint.js
    --no-asmjs asm.js/testFloatingPoint.js
    asm.js/testMathLib.js
    --no-asmjs asm.js/testMathLib.js
    asm.js/testZOOB.js
    basic/array-copyWithin.js
    basic/doMath.js
    basic/homogenous-literals.js
    basic/math-random.js
    basic/math-jit-tests.js
    basic/parseIntTests.js
    basic/testBug504520.js
    basic/testBug504520Harder.js
    basic/testBug720695.js
    basic/testTypedArrayClamping.js
    basic/testTypedArrayInit.js
    closures/set-outer-trace-3.js
    closures/t004.js
    debug/wasm-10.js
    debug/wasm-13.js
    --no-wasm-baseline debug/wasm-13.js
    for-of/array-iterator-generic.js
    gc/gcparam.js
    ion/array-push-length-overflow.js
    ion/bug1000960.js
    ion/bug1107011-2.js
    ion/bug764432.js
    ion/bug860838-3.js
    ion/bug878510.js
    ion/bug925305.js
    ion/ceil.js
    ion/doubleArrays.js
    ion/dce-with-rinstructions.js
    ion/lsra-bug1112164.js
    ion/setelem-float32-typedarray-ic.js
    ion/test-scalar-replacement-float32.js
    ion/testFloat32.js
    ion/inlining/array-push.js
    ion/inlining/inline-getelem-args.js
    jaeger/bug639792.js
    jaeger/bug652305.js
    jaeger/bug658240.js
    jaeger/recompile/bug659639.js
    jaeger/recompile/incdec.js
    sunspider/check-3d-morph.js
    sunspider/check-3d-raytrace.js
    sunspider/check-3d-cube.js
    sunspider/check-access-nbody.js
    sunspider/check-bitops-nsieve-bits.js
    sunspider/check-math-cordic.js
    sunspider/check-math-partial-sums.js
    sunspider/check-math-spectral-norm.js
    sunspider/check-date-format-tofte.js
    sunspider/check-date-format-xparb.js
    sunspider/check-string-tagcloud.js
    v8-v5/check-raytrace.js
    typedarray/sort.js
    sunspider/check-string-unpack-code.js
    v8-v5/check-deltablue.js
    v8-v5/check-splay.js
    --no-wasm-baseline wasm/bce.js
    wasm/bce.js
    --no-wasm-ion wasm/bce.js
    --no-wasm-baseline wasm/builtin.js
    wasm/builtin.js
    --test-wasm-await-tier2 wasm/bce.js
    --no-wasm-ion wasm/builtin.js
    --test-wasm-await-tier2 wasm/builtin.js
    wasm/const.js
    --no-wasm-baseline wasm/const.js
    --no-wasm-ion wasm/const.js
    --test-wasm-await-tier2 wasm/const.js
    wasm/conversion.js
    --no-wasm-ion wasm/conversion.js
    --no-wasm-baseline wasm/conversion.js
    --test-wasm-await-tier2 wasm/conversion.js
    wasm/errors.js
    --no-wasm-baseline wasm/errors.js
    --no-wasm-ion wasm/errors.js
    --test-wasm-await-tier2 wasm/errors.js
    --test-wasm-await-tier2 wasm/float.js
    wasm/ion-args.js
    --no-wasm-baseline wasm/ion-args.js
    wasm/memory.js
    --no-wasm-ion wasm/memory.js
    --no-wasm-baseline wasm/memory.js
    --test-wasm-await-tier2 wasm/memory.js
    wasm/spec/harness/wast.js
Attached file double-helper.c
Small C program to help convert double values to hex representations (boxed and unboxed), and the reverse.
A few notes:

1. The failure of ion/bug1165905.js with --no-ion --baseline-eager has nothing to do with this bug. The testcase also fails on central with those options: "Error: Can't turn off JITs with JIT code on the stack."

2. The failure in sunspider/check-string-tagcloud.js is caused by unboxed objects. In MacroAssembler.cpp, storeUnboxedProperty and loadUnboxedProperty both assume that converting an unboxed double to a Value is a no-op, and use store/loadValue. If those functions are fixed to box on load and unbox on store, check-string-tagcloud.js passes. 

3. Our ion code to push doubles into an array is broken. Here's a minimal testcase (run with --ion-eager --ion-offthread-compile=off:

```
function f(n) {
    let x = [];
    x.push(n + 0.5);
    return x;
}
print(f(1)[0]);
print(f(2)[0]);
print(f(3)[0]); // should be 3.5; actually prints 2.5000000000000004.
```

The problem is that we generate a ToDouble instruction, but we don't ever box the resulting double. This was fine when boxing doubles was a no-op, but it breaks with object-biased nan-boxing.

Boxing at this point fixes the testcase: https://searchfox.org/mozilla-central/source/js/src/jit/MCallOptimize.cpp#913
Attached patch nan-boxing-fixes.patch (obsolete) — Splinter Review
Here's a big one: https://searchfox.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#3475

Passing a double argument in Ion stores it to the stack unboxed. Loading it from the stack will try to unbox it again, giving the wrong result.

Fixing this fixes quite a few of the failing testcases. (A number of the remaining patches all fail with the same GC-related error, which I'm going to look at today.)

Uploading a patch with my fixes. The fix I am least certain about is array.push.
Attached patch changes.patch (obsolete) — Splinter Review
Updated set of '--ion-eager --baseline-eager' failures after fixing some of the stuff identified by Iain:

FAILURES:
    asm.js/testMathLib.js
    --no-asmjs asm.js/testMathLib.js
    asm.js/testZOOB.js
    basic/bug652054.js
    basic/homogenous-literals.js
    basic/math-jit-tests.js
    basic/testBug504520.js
    basic/testBug504520Harder.js
    basic/testBug720695.js
    closures/set-outer-trace-3.js
    debug/wasm-10.js
    debug/wasm-13.js
    --no-wasm-baseline debug/wasm-13.js
    ion/bug1107011-2.js
    ion/bug705351.js
    ion/bug736135.js
    ion/bug816786.js
    ion/bug860838-3.js
    ion/bug878510.js
    ion/bug915301.js
    ion/bug925305.js
    ion/getprop-cache.js
    ion/dce-with-rinstructions.js
    ion/lsra-bug1112164.js
    ion/setelem-float32-typedarray-ic.js
    ion/test-scalar-replacement-float32.js
    ion/testFloat32.js
    jaeger/recompile/memory-02.js
    sunspider/check-3d-raytrace.js
    sunspider/check-3d-morph.js
    sunspider/check-3d-cube.js
    sunspider/check-access-nbody.js
    sunspider/check-bitops-nsieve-bits.js
    sunspider/check-math-cordic.js
    sunspider/check-math-spectral-norm.js
    sunspider/check-math-partial-sums.js
    sunspider/check-string-fasta.js
    v8-v5/check-raytrace.js
    typedarray/sort.js
    v8-v5/check-splay.js
    wasm/builtin.js
    --no-wasm-baseline wasm/builtin.js
    --no-wasm-ion wasm/builtin.js
    --test-wasm-await-tier2 wasm/builtin.js
    wasm/conversion.js
    --no-wasm-baseline wasm/conversion.js
    --no-wasm-ion wasm/conversion.js
    --test-wasm-await-tier2 wasm/conversion.js
    --no-wasm-baseline wasm/errors.js
    wasm/errors.js
    --no-wasm-ion wasm/errors.js
    --test-wasm-await-tier2 wasm/errors.js
    --no-wasm-baseline wasm/memory.js
    wasm/memory.js
    --no-wasm-ion wasm/memory.js
    --test-wasm-await-tier2 wasm/memory.js
    wasm/spec/harness/wast.js
One additional fix:

MacroAssembler::storeTypedOrValue fails to box doubles when storing them: https://searchfox.org/mozilla-central/source/js/src/jit/MacroAssembler.h#2219 

Fixing this problem requires implementing boxDouble with a BaseIndex dest. I'm attaching a patch with this fix.

Two more closely related problems: 

1. emitLoadElementT loads doubles without unboxing if loadDoubles is set on the MIR node:
https://searchfox.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#11665

2. If loadDoubles is not set, emitLoadElementT calls loadUnboxedValue, which in turn calls loadInt32OrDouble, which doesn't unbox:
https://searchfox.org/mozilla-central/source/js/src/jit/x64/MacroAssembler-x64-inl.h#1002

Unfortunately, both of these functions are templated to handle both Address and BaseIndex sources. The existing implementation of unboxDouble only handles Address sources. I don't know x86 assembly well enough to devise a way to unbox a BaseIndex without using an additional scratch register, so neither of these bugs is fixed in the attached patch.
Attached patch object-biased-nan-boxing.patch (obsolete) — Splinter Review
I've made some more fixes. Instead of tracing through individual failures in rr, I bit the bullet and looked at every instance of storeDouble/loadDouble in Spidermonkey, and updated the ones that looked sketchy. This approach got me down to the following failures in jit_test.py --ion:

TIMEOUTS:
    --ion-eager --ion-offthread-compile=off basic/bug620532.js
    --ion-eager --ion-offthread-compile=off basic/testShiftRightLogical.js
    --ion-eager --ion-offthread-compile=off ion/bug925305.js
    --ion-eager --ion-offthread-compile=off ion/range-analysis.js
    --ion-eager --ion-offthread-compile=off ion/setelem-float32-typedarray-ic.js
    --baseline-eager ion/testFloat32.js
    --ion-eager --ion-offthread-compile=off ion/testFloat32.js
    --ion-eager --ion-offthread-compile=off jaeger/recompile/incdec.js
    --baseline-eager sunspider/check-access-binary-trees.js
    --ion-eager --ion-offthread-compile=off sunspider/check-3d-raytrace.js
    --ion-eager --ion-offthread-compile=off sunspider/check-3d-cube.js
    --ion-eager --ion-offthread-compile=off sunspider/check-math-spectral-norm.js
    --ion-eager --ion-offthread-compile=off sunspider/check-string-fasta.js
    --baseline-eager typedarray/sort.js
    --baseline-eager v8-v5/check-deltablue.js
    --ion-eager --ion-offthread-compile=off v8-v5/check-raytrace.js
    --ion-eager --ion-offthread-compile=off typedarray/sort.js
    --baseline-eager v8-v5/check-richards.js
    --ion-eager --ion-offthread-compile=off wasm/errors.js

This patch wraps up all the work that we've done so far. Note that the above results were run without one change Kannan made (in inlineArrayPush) that I hadn't merged yet. I don't think it should matter much, though.
Attachment #9019575 - Attachment is obsolete: true
Attachment #9021144 - Attachment is obsolete: true
Attachment #9021302 - Attachment is obsolete: true
Attachment #9021363 - Attachment is obsolete: true
After fixing a problem with masm.Push(TypedOrValueRegister), we're down to three failures (all in --baseline-eager):

sunspider/check-access-binary-trees.js
v8-v5/check-richards.js
v8-v5/check-deltablue.js

These fail because of some code that we deleted in storeUnboxedPayload (https://searchfox.org/mozilla-central/source/js/src/jit/x64/MacroAssembler-x64.h#997-1003): 

            if (type == JSVAL_TYPE_OBJECT) {
                // Ideally we would call unboxObjectOrNull, but we need an extra
                // scratch register for that. So unbox as object, then clear the
                // object-or-null bit.
                mov(ImmWord(~JSVAL_OBJECT_OR_NULL_BIT), scratch);
                andq(scratch, Operand(address));
            }
The issue is that, now that we've deleted this, if the value we are storing is Null, we end up storing a tagged Null (0x2800000000000) instead of 0x0. The loadUnboxed code knows that it should test for Null, but it is looking for an actual 0 value, not a tagged null, so we end up trying to dereference 0x2800000000000 and crashing.

Adding that code back in, and changing JSVAL_OBJECT_OR_NULL_BIT to JSVAL_SHIFTED_TAG_NULL, fixes the last three failures.

I'm not sure it's the best fix, though. Is there a reason, in our brave new object-biased world, that we don't just represent Null as an actual nullptr? We couldn't do it when 0x0 was a valid Double, but that's no longer a concern, because 0x0 is not a valid object.

(Also, for future reference: I'm pretty sure something still needs to be done in TraceDataRelocations (https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/Assembler-x86-shared.cpp#44), but I haven't figured out what yet.)
Here's the latest version of the patch. It passes all the jit-tests. It still needs a lot of cleanup before it's ready for review, though. (For example, I would be very surprised if this compiled for anything other than x64.)

Kannan, what's the plan moving forward?
Attachment #9022305 - Attachment is obsolete: true
Flags: needinfo?(kvijayan)
(In reply to Iain Ireland [:iain] from comment #67)
> Created attachment 9024092 [details] [diff] [review]
> object-biased-nan-boxing-no-failures.patch
> 
> Here's the latest version of the patch. It passes all the jit-tests. It
> still needs a lot of cleanup before it's ready for review, though. (For
> example, I would be very surprised if this compiled for anything other than
> x64.)
> 
> Kannan, what's the plan moving forward?

This is awesome.  Let me run the jit-tests.  I am still under the weather today and am taking PTO.

I am not too worried about the other platforms.  The 32-bit platforms (arm32 and x86) aren't affected by this change, and the other major 64-bit archs are mips64 and arm64.  Both of them have plenty of extra regs (scratch and secondscratch) that are generally available.

Next steps are to fuzz this on x64, while we get the patch ready for general try-green.
Flags: needinfo?(kvijayan)
I've been doing some performance comparisons with and without this patch. So far, I'm actually not seeing any difference between them (I've tried a few build configs too, just to be extra-thorough). I'll continue to try on different benchmarks, but on the plus side, the patch no longer crashes during the tests!
Have you been doing perftests using a browser build or a shell?  When I tried to build the browser in an opt (or debug-opt) build.. it crashes on startup (debug builds run fine).  Tracking that crash right now (I found the bad value coming into JS, but don't know where it's generated).
I'm doing a browser build with optimizations on using the latest patch posted here. It seems to work fine (previous patches would crash on startup). I've checked that the diffs are applied right and all, so I'm pretty sure the build is right, but it's possible something weird happened. We didn't add a build-time flag for this or anything, right?
No, no build time flag. I'm guessing you're building 64-bit?

Maybe different platform or arch?  I'm building on x64 linux and I can't even get to about:blank.
Yeah, I'm building x64 Linux (verified via the `file` command) and it works just fine for me. At this point, the only thing I can think of left to try is rerunning `./mach bootstrap` and seeing if it changes anything.

It's especially surprising to me since I've tried previous versions of the patch and I got startup crashes at the time, but this one doesn't, and I'm not doing anything different...
Ok, now that I've rerun `./mach bootstrap`, I'm getting startup crashes again. Something must have gotten screwed up in my local config, though I can't imagine what.
I was feeling curious and took a quick look at the startup crash. I think we are falling victim to the Dread C++ Standard.

We fail while trying to call DefineConstants (https://searchfox.org/mozilla-central/source/dom/bindings/BindingUtils.cpp#617-630) on the static const table generated here: https://searchfox.org/mozilla-central/source/__GENERATED__/dom/bindings/NodeFilterBinding.cpp#29-46. Specifically, we fail while trying to root the value of "SHOW_ALL", which is supposed to be JS::NumberValue(4294967295U). If you load up the browser in rr and |print mozilla::dom::NodeFilter_Binding::sConstants_specs[3]|, you will see this: 

$55 = {name = 0x7f0d55443a2c "SHOW_ALL", value = {asBits_ = 0, asDouble_ = 0, debugView_ = {payload47_ = 0, 
      tag_ = JSValueTag::JSVAL_TAG_OBJECT}, s_ = {payload_ = {i32_ = 0, u32_ = 0, why_ = JS_ELEMENTS_HOLE}}}}

Because 4294967295 > JSVAL_INT_MAX, NumberValue returns |Value::fromDouble(double(4294967295))|. (https://searchfox.org/mozilla-central/source/js/public/Value.h#1037-1040). That's not supposed to be 0, obviously. But if you look at our current implementation of JSPun64BoxDouble, it is playing some type-punning tricks with a union of double and uint64_t. IIRC, that's technically undefined in C++. I bet the constexpr evaluation code in LLVM is taking advantage of that undefined behaviour to optimize JSPun64DoubleToUint64(x) into constant 0 (and probably the same in the other direction).

Somebody who knows the spec better than me should figure out how to do constexpr type punning correctly.
Iain is right, this kind of type punning is undefined. And not even possible within a constexpr as far as the language is concerned (C++20 will add bit_cast for this type of thing). Luckily constexpr isn't really needed here so we can get away with removing that, at least for the purpose of testing, it should be feasible to write a complex constexpr that does work if we really want this. This could be a non-performant version that only serves as a compile-time version of this function.

In any case this is the patch adjusted such that it runs all of speedometer. Currently it appears like it may be a small improvement, but if there is an improvement it's very small. More testing should probably tell us more, and there may be some low hanging fruits in terms of optimizations, I haven't profiled this yet.
On a hunch, I disabled IonMonkey and re-ran the speedometer benchmark - remembering that speedometer actually hits Ion pretty hard.  IonMonkey tends to obscure any benefit from this change as it will unbox an object once, and then carry the unboxed value through most of the computation.

I re-ran the tests (thrice with each of reference and patched) with Ion disabled and got a small, but reasonably stable improvement: 65.766 vs 66.176 - which amounts to about 0.6% improvement.

Speedometer spends a lot of time re-doing layout and rendering (all it does is repeatedly render a 100-item TODO list using various different frameworks, so it hits DOM/style/render pretty heavily).

Both on this result and based on intuition, I would expect that the benefit is more pronounced on the following sorts of code:

1. Not hot enough to have entered ion (i.e. cold and moderately warm code that has not yet been Ion-compiled)

2. Code where JS execution forms a larger chunk of the time slice.


I'm also concerned that removing the constexpr from the C++ implementation might have led to poorer optimization of the compiled C++ code and diminished the results.
(In reply to Kannan Vijayan [:djvj] from comment #77)
 
> 1. Not hot enough to have entered ion (i.e. cold and moderately warm code
> that has not yet been Ion-compiled)
> 
> 2. Code where JS execution forms a larger chunk of the time slice.

would this be opening gmail, gdocs and general polymer.js stuff ?
(In reply to Mayank Bansal from comment #78)
> (In reply to Kannan Vijayan [:djvj] from comment #77)
>  
> > 1. Not hot enough to have entered ion (i.e. cold and moderately warm code
> > that has not yet been Ion-compiled)
> > 
> > 2. Code where JS execution forms a larger chunk of the time slice.
> 
> would this be opening gmail, gdocs and general polymer.js stuff ?

Those all sound like workloads that we'd be interested in exploring. It's hard to be fully intuitive on these things though, it does somewhat depend on the type of workloads.
(In reply to Kannan Vijayan [:djvj] from comment #77)
> 
> I'm also concerned that removing the constexpr from the C++ implementation
> might have led to poorer optimization of the compiled C++ code and
> diminished the results.

I'm somewhat skeptical of that I suppose? All this does is a reinterpret cast, this is all in headers, if this function is passed as a constant this just implies a reinterpret cast of a constant (taking an integer in memory and considering that a double, i.e. a movsd into an XMM register), I don't see how that could be significantly 'less optimal'.
I suspect that removing constexpr there may mean that many things start generating static constructors where they were not before.
(In reply to Mayank Bansal from comment #78)
> (In reply to Kannan Vijayan [:djvj] from comment #77)
>  
> > 1. Not hot enough to have entered ion (i.e. cold and moderately warm code
> > that has not yet been Ion-compiled)
> > 
> > 2. Code where JS execution forms a larger chunk of the time slice.
> 
> would this be opening gmail, gdocs and general polymer.js stuff ?

Yeah - anywhere where we are primarily expecting to execute interpreter and baseline code.  The effect is more pronounced on baseline jitcode because other overheads are lower compared to Interpreter, but the boxing/unboxing time stays constant, leading to them being more prominent in the performance profile of baseline jitcode.

I also measured the improvement on some micro-benchmarks, just to confirm that the patch is doing what we want.  Posting those results as a follow-up.
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review
Unified patch that doesn't crash browser on startup.
Attachment #9024092 - Attachment is obsolete: true
Attachment #9029373 - Attachment is obsolete: true
Micro-benchmark that does a bunch of object-unboxy stuff (it builds a long chain of the form: `{sub_obj: {sub_obj: {sub_obj: .... { sub_obj: { n: number } } ... }`

Then it repeatedly borrows down the object and pulls out the number inside.  On a no-ion run, I get average score of ~850ms for reference build, and ~731ms for the patched build, which is a 14% improvement.
This micro-bench attempts to do a bunch of double unboxing, and test how much the slowdown in double-boxing affects "double-heavy" code in Interp and Baseline (ion will unbox doubles and just keep them around, so it won't be affected that much).

It builds a large array of doubles which it reads out of, to compute a sum.

Results are about a 7% improvement with the patched build.  This might seem surprising, but remember that there is a lot of OTHER stuff that happens during execution that requires object boxing and unboxing (functions, arrays, etc.).

So the improvement in object boxing is big enough that the benefit of all that other code running faster outweighs the double manipulation in the core of the test.

This is a good result.
:decoder - I'd like to start fuzzing this on AMD64 archs only, whenever possible.  Of course we're in the middle of an all-hands so I understand things may be delayed a bit.  Just bringing it to your attention early so you're aware.
Flags: needinfo?(choller)
Data point: I saw no improvements in Ion for test-nanboxing-micro.js, but after talking to Luke about it, I tested it out with --no-unboxed-objects and saw ~1% improvement. Given that we're planning on eliminating unboxed objects, that's relevant.
(In reply to Kannan Vijayan [:djvj] from comment #86)
> :decoder - I'd like to start fuzzing this on AMD64 archs only, whenever
> possible.  Of course we're in the middle of an all-hands so I understand
> things may be delayed a bit.  Just bringing it to your attention early so
> you're aware.

Can you specify which m-c revision this applies on? It does not seem to apply to tip. Thanks!
Flags: needinfo?(choller) → needinfo?(kvijayan)
Oh boy, forgot about the style changes to the code.  I'll rebase this week and post a new patch.
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Updated patch that applies to tip. Passes with-ion jit-tests, baseline-only jit-tests, and interpreter-only jit-tests.

Testing against '--baseline-eager --no-ion' and '--ion-eager' right now. Will run for green on try (x64 arch only), and then we should be ready for fuzzing.

Still need to run on try and add arch support for arm64 and mips64, but that can be worked on in parallel with fuzzing on x64.

Attachment #9029635 - Attachment is obsolete: true
Flags: needinfo?(kvijayan)
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Rebased object-biased nanboxing patch that applies cleanly to m-i and m-c tip.

Attachment #9041287 - Attachment is obsolete: true
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Updated patch, but with the enum-define-hack removed. This is causing build problems for decoder on fuzzing infra.

Synopsis of problem:

This patch changes the JSValueTag constants from large (17-bit values representable only with uint32_ts) to small (<8-bit values).

The JS_ENUM_BEGIN hack in Value.h was skipping the type info for the struct definition, and passing 'packed' instead. For the previous constants this would generate a uint32_t type. The new constants generate a uint8_t, I presume.

This breaks on some compilers, presumably wherever MOZ_IS_GCC is defined.

Attachment #9041874 - Attachment is obsolete: true
Comment on attachment 9041931 [details] [diff] [review]
object-biased-nanboxing.patch

I'm also testing this on a box, setting feedback? on myself.
Attachment #9041931 - Flags: feedback?(nth10sd)
This is an automated crash issue comment:

Summary: Assertion failure: gc::IsCellPointerValid(obj), at dist/include/js/Value.h:1435
Build version: mozilla-central revision 4677e6074704+
Build flags: --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize
Runtime options: --fuzzing-safe --ion-offthread-compile=off --ion-offthread-compile=off test.js

Testcase:

    gczeal(20, 4);
    let lfPromise = import("javascript: " + '');

Backtrace:

    received signal SIGSEGV, Segmentation fault.
    #0  js::DispatchTyped<js::TenuringTraversalFunctor<JS::Value>, js::TenuringTracer*>(js::TenuringTraversalFunctor<JS::Value>, JS::Value const&, js::TenuringTracer*&&) (f=..., val=..., args#0=args#0@entry=<unknown type>) at dist/include/js/Value.h:1435
    #1  0x000055a140a374c6 in js::TenuringTracer::traverse<JS::Value> (thingp=0x7ffd11b06d10, this=<optimized out>) at js/src/gc/Marking.cpp:2774
    #2  js::gc::TraceEdgeInternal<JS::Value> (trc=<optimized out>, thingp=0x7ffd11b06d10, name=<optimized out>) at js/src/gc/Marking.cpp:574
    #3  0x000055a140a376ec in js::gc::TraceRangeInternal<JS::Value> (trc=0x7ffd11b05b90, len=3, vec=<optimized out>, name=0x55a1414e7b01 "js::AutoValueArray") at js/src/gc/Marking.cpp:599
    #4  0x000055a140a477f3 in JS::AutoGCRooter::traceAll (cx=cx@entry=0x7fc2adf17000, trc=trc@entry=0x7ffd11b05b90) at js/src/gc/RootMarking.cpp:222
    #5  0x000055a140a592f4 in js::gc::GCRuntime::traceRuntimeCommon (this=this@entry=0x7fc2adf1c6c8, trc=trc@entry=0x7ffd11b05b90, traceOrMark=traceOrMark@entry=js::gc::GCRuntime::TraceRuntime) at js/src/gc/RootMarking.cpp:355
    #6  0x000055a140a59b84 in js::gc::GCRuntime::traceRuntimeForMinorGC (this=0x7fc2adf1c6c8, trc=0x7ffd11b05b90, session=...) at js/src/gc/RootMarking.cpp:301
    #7  0x000055a140a5b6cd in js::Nursery::doCollection (this=this@entry=0x7fc2adf1f178, reason=reason@entry=JS::GCReason::DEBUG_GC, tenureCounts=...) at js/src/gc/Nursery.cpp:926
    #8  0x000055a140a5c08e in js::Nursery::collect (this=0x7fc2adf1f178, reason=reason@entry=JS::GCReason::DEBUG_GC) at js/src/gc/Nursery.cpp:755
    #9  0x000055a1409d62f8 in js::gc::GCRuntime::minorGC (this=this@entry=0x7fc2adf1c6c8, reason=reason@entry=JS::GCReason::DEBUG_GC, phase=phase@entry=js::gcstats::PhaseKind::EVICT_NURSERY_FOR_MAJOR_GC) at js/src/gc/GC.cpp:7810
    #10 0x000055a140a09dc0 in js::gc::GCRuntime::gcCycle (this=this@entry=0x7fc2adf1c6c8, nonincrementalByAPI=nonincrementalByAPI@entry=false, budget=..., reason=reason@entry=JS::GCReason::DEBUG_GC) at js/src/gc/GC.cpp:7384
    #11 0x000055a140a0a4ed in js::gc::GCRuntime::collect (this=this@entry=0x7fc2adf1c6c8, nonincrementalByAPI=nonincrementalByAPI@entry=false, budget=..., reason=reason@entry=JS::GCReason::DEBUG_GC) at js/src/gc/GC.cpp:7592
    #12 0x000055a140a0c4cb in js::gc::GCRuntime::runDebugGC (this=this@entry=0x7fc2adf1c6c8) at js/src/gc/GC.cpp:8214
    #13 0x000055a140a0c61e in js::gc::GCRuntime::gcIfNeededAtAllocation (this=0x7fc2adf1c6c8, cx=cx@entry=0x7fc2adf17000) at js/src/gc/Allocator.cpp:338
    #14 0x000055a140a40a78 in js::gc::GCRuntime::checkAllocatorState<(js::AllowGC)1> (this=<optimized out>, cx=cx@entry=0x7fc2adf17000, kind=kind@entry=js::gc::AllocKind::OBJECT_LIMIT) at js/src/gc/Allocator.cpp:302
    #15 0x000055a140a4116d in js::Allocate<JSScript, (js::AllowGC)1> (cx=cx@entry=0x7fc2adf17000) at js/src/gc/Allocator.cpp:247
    #16 0x000055a140577cea in JSScript::New (cx=cx@entry=0x7fc2adf17000, sourceObject=..., sourceStart=192059, sourceEnd=193358, toStringStart=192032, toStringEnd=toStringEnd@entry=193358) at js/src/vm/JSScript.cpp:3131
    #17 0x000055a140577e79 in JSScript::Create (cx=<optimized out>, cx@entry=0x7fc2adf17000, options=..., sourceObject=..., sourceObject@entry=..., sourceStart=<optimized out>, sourceEnd=<optimized out>, toStringStart=<optimized out>, toStringEnd=193358) at js/src/vm/JSScript.cpp:3162
    #18 0x000055a1405780de in CreateEmptyScriptForClone (cx=0x7fc2adf17000, src=..., src@entry=..., sourceObject=...) at js/src/vm/JSScript.cpp:4091
    #19 0x000055a140588463 in js::CloneScriptIntoFunction (cx=<optimized out>, enclosingScope=..., enclosingScope@entry=..., fun=..., fun@entry=..., src=src@entry=..., sourceObject=..., sourceObject@entry=...) at js/src/vm/JSScript.cpp:4133
    #20 0x000055a140589016 in js::CloneFunctionAndScript (cx=<optimized out>, fun=fun@entry=..., enclosingEnv=..., newScope=..., newScope@entry=..., sourceObject=..., sourceObject@entry=..., allocKind=allocKind@entry=js::gc::AllocKind::FUNCTION_EXTENDED, proto=...) at js/src/vm/JSFunction.cpp:2333
    #21 0x000055a14063dcde in CloneObject (cx=<optimized out>, cx@entry=0x7fc2adf17000, selfHostedObject=selfHostedObject@entry=...) at js/src/vm/SelfHosting.cpp:3254
    #22 0x000055a14063fc8a in CloneValue (cx=cx@entry=0x7fc2adf17000, selfHostedValue=..., selfHostedValue@entry=..., vp=...) at js/src/vm/SelfHosting.cpp:3313
    #23 0x000055a140640023 in JSRuntime::cloneSelfHostedValue (this=0x7fc2adf1c000, cx=cx@entry=0x7fc2adf17000, name=..., name@entry=..., vp=...) at js/src/vm/SelfHosting.cpp:3460
    #24 0x000055a1406401ac in js::GlobalObject::getIntrinsicValue (value=..., name=..., global=..., cx=0x7fc2adf17000) at js/src/vm/GlobalObject.h:767
    #25 js::CallSelfHostedFunction (cx=0x7fc2adf17000, name=..., thisv=..., args=..., rval=..., rval@entry=...) at js/src/vm/SelfHosting.cpp:1920
    #26 0x000055a1403962d7 in js::ModuleObject::GetOrCreateModuleNamespace (cx=<optimized out>, self=..., self@entry=...) at js/src/builtin/ModuleObject.cpp:1095
    #27 0x000055a1403b7998 in js::FinishDynamicModuleImport (cx=<optimized out>, referencingPrivate=..., specifier=..., promiseArg=...) at js/src/builtin/ModuleObject.cpp:1749
    #28 0x000055a14027d536 in FinishDynamicModuleImport (cx=<optimized out>, cx@entry=0x7fc2adf17000, argc=<optimized out>, vp=<optimized out>) at js/src/shell/js.cpp:4940
    #29 0x000055a14032d5e9 in CallJSNative (cx=0x7fc2adf17000, native=0x55a14027d3a0 <FinishDynamicModuleImport(JSContext*, unsigned int, JS::Value*)>, args=...) at js/src/vm/Interpreter.cpp:442
    #30 0x000055a14031f6b7 in js::InternalCallOrConstruct (cx=<optimized out>, cx@entry=0x7fc2adf17000, args=..., construct=construct@entry=js::NO_CONSTRUCT) at js/src/vm/Interpreter.cpp:534
    #31 0x000055a14031fdf0 in InternalCall (cx=0x7fc2adf17000, args=...) at js/src/vm/Interpreter.cpp:589
    #32 0x000055a140312309 in js::CallFromStack (args=..., cx=<optimized out>) at js/src/vm/Interpreter.cpp:593
    #33 Interpret (cx=0x7fc2adf17000, state=...) at js/src/vm/Interpreter.cpp:3069
    #34 0x000055a14031f0f6 in js::RunScript (cx=0x7fc2adf17000, state=...) at js/src/vm/Interpreter.cpp:422
    #35 0x000055a14031f98f in js::InternalCallOrConstruct (cx=<optimized out>, cx@entry=0x7fc2adf17000, args=..., construct=construct@entry=js::NO_CONSTRUCT) at js/src/vm/Interpreter.cpp:562
    #36 0x000055a14031fdf0 in InternalCall (cx=cx@entry=0x7fc2adf17000, args=...) at js/src/vm/Interpreter.cpp:589
    #37 0x000055a14031ff70 in js::Call (cx=cx@entry=0x7fc2adf17000, fval=..., fval@entry=..., thisv=..., thisv@entry=..., args=..., rval=...) at js/src/vm/Interpreter.cpp:605
    #38 0x000055a140376545 in js::Call (cx=0x7fc2adf17000, fval=..., thisv=..., arg0=..., rval=...) at js/src/vm/Interpreter.h:98
    #39 0x000055a1403c08c6 in PromiseReactionJob (cx=<optimized out>, cx@entry=0x7fc2adf17000, argc=<optimized out>, vp=<optimized out>) at js/src/builtin/Promise.cpp:1652
    #40 0x000055a14032d5e9 in CallJSNative (cx=0x7fc2adf17000, native=0x55a1403c0050 <PromiseReactionJob(JSContext*, unsigned int, JS::Value*)>, args=...) at js/src/vm/Interpreter.cpp:442
    [...]
    #48 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:11215
    rax	0x55a14263b280	94151091925632
    rbx	0x0	0
    rcx	0x7fc2aec112dd	140474132271837
    rdx	0x0	0
    rsi	0x55a141516ca0	94151073950880
    rdi	0x7fc2aeedf540	140474135213376
    rbp	0x7ffd11b05930	140724900223280
    rsp	0x7ffd11b05900	140724900223232
    r8	0x7fc2aeee0770	140474135218032
    r9	0x7fc2affd9c80	140474153016448
    r10	0x58	88
    r11	0x7fc2aeb877a0	140474131707808
    r12	0x7ffd11b05950	140724900223312
    r13	0x3	3
    r14	0x7ffd11b06d10	140724900228368
    r15	0x0	0
    rip	0x55a140a37389 <js::DispatchTyped<js::TenuringTraversalFunctor<JS::Value>, js::TenuringTracer*>(js::TenuringTraversalFunctor<JS::Value>, JS::Value const&, js::TenuringTracer*&&)+553>
    => 0x55a140a37389 <js::DispatchTyped<js::TenuringTraversalFunctor<JS::Value>, js::TenuringTracer*>(js::TenuringTraversalFunctor<JS::Value>, JS::Value const&, js::TenuringTracer*&&)+553>:	movl   $0x0,0x0
       0x55a140a37394 <js::DispatchTyped<js::TenuringTraversalFunctor<JS::Value>, js::TenuringTracer*>(js::TenuringTraversalFunctor<JS::Value>, JS::Value const&, js::TenuringTracer*&&)+564>:	ud2

Similar crash to the last comment, but without import:

gczeal(8, 4);
var ProxiedConstructor = new Proxy(Intl.PluralRules, new Proxy({}, {}));
obj = Reflect.construct(Intl.PluralRules, [], ProxiedConstructor);
Comment on attachment 9041931 [details] [diff] [review]
object-biased-nanboxing.patch

Found the assert in comment 95 as well, will await a new patch.
Attachment #9041931 - Flags: feedback?(nth10sd) → feedback-

Found the issue. Just documenting for posterity - there's a boxed null-ptr Value being created in a (rooted) AutoValueArray on the C stack (within some code for calling into FinishDynamicModuleImport).

With acknowledgements to rr, the culprit was traced back to here:

https://searchfox.org/mozilla-central/source/js/src/jsapi.h#73

Rolling updated patch.

Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Updated patch. Passes both the failing test cases posted so far.

Also rebased to apply to tip (there were some conflicting changes in a wasm file).

Attachment #9041931 - Attachment is obsolete: true

(In reply to Kannan Vijayan [:djvj] from comment #99)

Also rebased to apply to tip (there were some conflicting changes in a wasm file).

Please specify the exact m-c rev hash in the future, "tip" can still mean different hashes and entail a bunch of guesswork. Thanks!

Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Updated patch. This should apply to the current m-c revision c849fb69e2e7.

Attachment #9042237 - Attachment is obsolete: true

I only got it to apply on m-c rev b482c6618d72.

I'd like a rebase on current m-c, that m-c rev has bug 1526279 throwing constantly (since fixed on m-c). Thanks!

Flags: needinfo?(kvijayan)

Cool. Let me rebase and re-post the main patch after I post the arm64 delta. Just got that to build (simulator at least).

Flags: needinfo?(kvijayan)
Attached patch obnan-arm64.patch (obsolete) — Splinter Review
Attached patch object-biased-nanboxing.patch (obsolete) — Splinter Review

Main patch rebased to tip. Should apply to m-c revision 'bf3951daded0'.

Attachment #9043376 - Attachment is obsolete: true
Comment on attachment 9045366 [details] [diff] [review]
object-biased-nanboxing.patch

Setting feedback? for patch testing from :decoder and I.

I've started a round of this.
Attachment #9045366 - Flags: feedback?(nth10sd)
Attachment #9045366 - Flags: feedback?(choller)

I found something that might be related in bug 1529377 comment 2 so please take a look there.

Thanks for the heads up gary.

Commented on the other bug but will repeat here for posterity - the ARM64 build doesn't work yet. I haven't gotten it to pass tests yet, but will do so soon.

For now, the fuzzing with this patch should be restricted to x86-64 archs. This is where we expect to have a correct implementation right now, and a large chunk of the code involved is arch-independent (the first early fuzzbug found on this was cross-arch, for example).

If we can get x86-64 + cross-arch code fuzzed while I get ARM64 stood up generally (or at least no regressions from wherever sstangl/nbp are at), we should be able to fuzz the delta for arm64 more quickly (the differences should simply be in the core macroassembler code).

Comment on attachment 9045366 [details] [diff] [review]
object-biased-nanboxing.patch

Giving this a cautious feedback+ from some light fuzzing (and also because I'm about to head off on PTO soon...)
Attachment #9045366 - Flags: feedback?(nth10sd) → feedback+

Thanks, gary!

Repinging :decoder to make sure he's aware of the latest patch and can carry on from the light testing. I sort of assumed from the last post I made, but I realize no I didn't explicitly put a needinfo on it.

Flags: needinfo?(choller)
Comment on attachment 9045366 [details] [diff] [review]
object-biased-nanboxing.patch

I ran the updated patch for several days and didn't find any connected crashes.
Flags: needinfo?(choller)
Attachment #9045366 - Flags: feedback?(choller) → feedback+
Attached patch nanbox-main-x64.patch (obsolete) — Splinter Review

Rebased main patch.

Attachment #9045366 - Attachment is obsolete: true
Attached patch nanbox-arm64-support.patch (obsolete) — Splinter Review

Delta patch for ARM64 support rebased and one fix.

Attachment #9045057 - Attachment is obsolete: true

Iain: so I'm pinging you finally. Test failures on ARM64 are still pretty heavy and happening in TI. The x86-64 patch works ok. If you can help out, it would be much appreciated.

The arm64 patch is a delta patch, so to build you want to apply the main-x64 patch prior. Should apply clean to m-i tip.

Thanks again!

Flags: needinfo?(iireland)

Sounds good. Taking a look.

Flags: needinfo?(iireland)
Attached patch nanbox-arm64-support.patch (obsolete) — Splinter Review

Ok, this arm64 patch passes all tests on all modes (timeouts excluded), with the simulator.

Gkw & Decoder, when you guys have the time can you fuzz this delta patch (on top of the nanbox-main-x64.patch) on ARM64?

Time to send this for review.

Attachment #9048566 - Attachment is obsolete: true
Flags: needinfo?(choller)
This is an automated crash issue comment:

Summary: Assertion failure: !available->IsEmpty(), at js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp:1999
Build version: mozilla-central revision c89f024c023f+
Build flags: --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize --enable-simulator=arm64
Runtime options: --fuzzing-safe --ion-offthread-compile=off --ion-warmup-threshold=100 test.js

Testcase:

    var count = 0;
    function callbackfn(v) {
        if (++count == 98) count = 0x7ffffff0;
    }
    arr = [1, 2, 3, 4, 5];
    for (var i = 0; i < 50; i++) arr = arr.map(callbackfn);

Backtrace:

    received signal SIGSEGV, Segmentation fault.
    #0  vixl::UseScratchRegisterScope::AcquireNextAvailable (available=0x7f5acfeb0788) at js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp:1999
    #1  0x00005564722f1a35 in vixl::UseScratchRegisterScope::AcquireSameSizeAs (this=this@entry=0x7ffc1cc140f0, reg=...) at js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp:1882
    #2  0x00005564723188af in vixl::MacroAssembler::AddSubMacro (this=this@entry=0x7f5acfeb0040, rd=..., rn=..., operand=..., S=S@entry=vixl::LeaveFlags, op=vixl::SUB) at js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp:1056
    #3  0x000055647231ab65 in vixl::MacroAssembler::Sub (this=this@entry=0x7f5acfeb0040, rd=..., rn=..., operand=..., S=S@entry=vixl::LeaveFlags) at js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp:847
    #4  0x0000556472231d06 in js::jit::MacroAssemblerCompat::unboxDouble (this=this@entry=0x7f5acfeb0040, src=..., dest=...) at js/src/jit/arm64/MacroAssembler-arm64.h:1339
    #5  0x000055647239e652 in js::jit::MacroAssemblerCompat::int32OrDouble (this=0x7f5acfeb0040, src=..., dest=...) at js/src/jit/arm64/MacroAssembler-arm64.h:1855
    #6  0x00005564723f8971 in js::jit::MacroAssemblerCompat::loadUnboxedValue (this=0x7f5acfeb0040, address=..., type=<optimized out>, dest=...) at js/src/jit/arm64/MacroAssembler-arm64.h:1868
    #7  0x00005564723afa7b in js::jit::CodeGenerator::visitLoadSlotT (this=this@entry=0x7f5acfeb0000, lir=lir@entry=0x7f5acea227b0) at js/src/jit/CodeGenerator.cpp:3951
    #8  0x00005564723f4ee8 in js::jit::CodeGenerator::generateBody (this=this@entry=0x7f5acfeb0000) at js/src/jit/CodeGenerator.cpp:6430
    #9  0x00005564723f7089 in js::jit::CodeGenerator::generate (this=this@entry=0x7f5acfeb0000) at js/src/jit/CodeGenerator.cpp:11057
    #10 0x00005564724166f1 in js::jit::GenerateCode (mir=mir@entry=0x7f5acfe9b348, lir=0x7f5acea1a6e8) at js/src/jit/Ion.cpp:1757
    #11 0x0000556472483b89 in js::jit::CompileBackEnd (mir=mir@entry=0x7f5acfe9b348) at js/src/jit/Ion.cpp:1778
    #12 0x000055647248c031 in js::jit::IonCompile (cx=<optimized out>, cx@entry=0x7f5acfe18000, script=<optimized out>, baselineFrame=baselineFrame@entry=0x7f5aced7fe28, osrPc=osrPc@entry=0x0, recompile=<optimized out>, optimizationLevel=<optimized out>) at js/src/jit/Ion.cpp:2096
    #13 0x000055647248c8db in js::jit::Compile (cx=cx@entry=0x7f5acfe18000, script=script@entry=..., osrFrame=osrFrame@entry=0x7f5aced7fe28, osrPc=osrPc@entry=0x0, forceRecompile=forceRecompile@entry=false) at js/src/jit/Ion.cpp:2311
    #14 0x000055647248d344 in BaselineCanEnterAtEntry (frame=0x7f5aced7fe28, script=..., cx=0x7f5acfe18000) at js/src/jit/Ion.cpp:2433
    #15 js::jit::IonCompileScriptForBaseline (cx=<optimized out>, frame=0x7f5aced7fe28, pc=<optimized out>) at js/src/jit/Ion.cpp:2569
    #16 0x000055647230b18e in vixl::Simulator::VisitCallRedirection (this=0x7f5acfe3b800, instr=0x7f5acfe2ec88) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:563
    #17 0x00005564723141ed in vixl::Simulator::VisitException (this=0x7f5acfe3b800, instr=0x7f5acfe2ec88) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:411
    #18 0x00005564722e6800 in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7f5acfe2ec88) at js/src/jit/arm64/vixl/Decoder-vixl.cpp:872
    #19 0x00005564722f4d65 in vixl::Decoder::Decode (instr=<optimized out>, this=<optimized out>) at js/src/jit/arm64/vixl/Decoder-vixl.h:158
    #20 vixl::Simulator::ExecuteInstruction (this=this@entry=0x7f5acfe3b800) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:192
    #21 0x000055647232627c in vixl::Simulator::Run (this=0x7f5acfe3b800) at js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
    [...]
    #35 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:10958
    rax	0x556473e4f180	93889929474432
    rbx	0x7f5acfeb0788	140028012070792
    rcx	0x7f5ad0b8b2dd	140028025549533
    rdx	0x0	0
    rsi	0x556472dd4a40	93889912195648
    rdi	0x7f5ad0e59540	140028028491072
    rbp	0x7ffc1cc14030	140720790913072
    rsp	0x7ffc1cc13ff0	140720790913008
    r8	0x7f5ad0e5a770	140028028495728
    r9	0x7f5ad1f53cc0	140028046294208
    r10	0x58	88
    r11	0x7f5ad0b017a0	140028024985504
    r12	0x7f5acfeb0040	140028012068928
    r13	0x7ffc1cc14200	140720790913536
    r14	0x7ffc1cc141e8	140720790913512
    r15	0x0	0
    rip	0x5564722f19b9 <vixl::UseScratchRegisterScope::AcquireNextAvailable(vixl::CPURegList*)+249>
    => 0x5564722f19b9 <vixl::UseScratchRegisterScope::AcquireNextAvailable(vixl::CPURegList*)+249>:	movl   $0x0,0x0
       0x5564722f19c4 <vixl::UseScratchRegisterScope::AcquireNextAvailable(vixl::CPURegList*)+260>:	ud2
This is an automated crash issue comment:

Summary: Crash [@ js::IsObjectValueInCompartment]
Build version: mozilla-central revision c89f024c023f+
Build flags: --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize --enable-simulator=arm64
Runtime options: --fuzzing-safe --ion-offthread-compile=off --ion-warmup-threshold=0 test.js

Testcase:

    function complex(aReal, aImag) {
        this.i = aImag;
    }
    function mandelbrotValueOO(aC, aIterMax) {
        let Z = new complex(0.0, 0.0);
    }
    const width = 60;
    const height = 60;
    const max_iters = 50;
    for (let img_x = 0; img_x < width; img_x++) {
        for (let img_y = 0; img_y < height; img_y++) {
            let C = new complex(-2 + (img_x / width) * 3, -1.5 + (img_y / height) * 3);
            var res = mandelbrotValueOO(C, max_iters);
        }
    }

Backtrace:

    received signal SIGSEGV, Segmentation fault.
    #0  js::IsObjectValueInCompartment (comp=<optimized out>, v=...) at js/src/vm/JSObject.h:1102
    #1  js::NativeObject::checkStoredValue (v=..., this=0x1a68a2c00628) at js/src/vm/NativeObject.h:1013
    #2  js::NativeObject::setSlot (this=0x1a68a2c00628, slot=0, value=...) at js/src/vm/NativeObject.h:1019
    #3  0x0000562fea82e461 in js::NativeObject::setSlotWithType (overwriting=false, value=..., shape=0x333ca7eb9150, cx=0x7f9301518000, this=0x1a68a2c00628) at js/src/vm/NativeObject-inl.h:457
    #4  UpdateShapeTypeAndValueForWritableDataProp (cx=0x7f9301518000, obj=0x1a68a2c00628, shape=0x333ca7eb9150, id=..., value=...) at js/src/vm/NativeObject.cpp:1252
    #5  0x0000562fea8613c5 in AddDataProperty (v=..., id=..., obj=..., cx=0x7f9301518000) at js/src/vm/NativeObject.cpp:1498
    #6  DefineNonexistentProperty (result=..., v=..., id=..., obj=..., cx=<optimized out>) at js/src/vm/NativeObject.cpp:2065
    #7  SetNonexistentProperty<(js::QualifiedBool)1> (result=..., receiver=..., v=..., id=..., obj=..., cx=<optimized out>) at js/src/vm/NativeObject.cpp:2816
    #8  js::NativeSetProperty<(js::QualifiedBool)1> (cx=<optimized out>, obj=..., id=id@entry=..., v=..., v@entry=..., receiver=..., receiver@entry=..., result=...) at js/src/vm/NativeObject.cpp:2983
    #9  0x0000562fead8ae7e in js::jit::SetProperty (cx=<optimized out>, obj=..., name=..., name@entry=..., value=value@entry=..., strict=<optimized out>, pc=pc@entry=0x7f93000a09a5 "6") at js/src/jit/VMFunctions.cpp:596
    #10 0x0000562feafe6fa7 in js::jit::IonSetPropertyIC::update (cx=<optimized out>, outerScript=..., ic=0x7f9301533ac0, obj=..., idVal=..., rhs=...) at js/src/jit/IonIC.cpp:323
    #11 0x0000562feae2bd67 in vixl::Simulator::VisitCallRedirection (this=0x7f930153b800, instr=0x7f930152ed88) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:578
    #12 0x0000562feae351ed in vixl::Simulator::VisitException (this=0x7f930153b800, instr=0x7f930152ed88) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:411
    #13 0x0000562feae07800 in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7f930152ed88) at js/src/jit/arm64/vixl/Decoder-vixl.cpp:872
    #14 0x0000562feae15d65 in vixl::Decoder::Decode (instr=<optimized out>, this=<optimized out>) at js/src/jit/arm64/vixl/Decoder-vixl.h:158
    #15 vixl::Simulator::ExecuteInstruction (this=this@entry=0x7f930153b800) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:192
    #16 0x0000562feae4727c in vixl::Simulator::Run (this=0x7f930153b800) at js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
    #17 0x0000562feae15f54 in vixl::Simulator::call (this=0x7f930153b800, entry=entry@entry=0x458ed9e7920 "\376w\277\251\375\003", argument_count=argument_count@entry=8) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:277
    [...]
    #29 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:10958
    rax	0x0	0
    rbx	0x1a68a2c00628	29036709414440
    rcx	0x0	0
    rdx	0x7fffffffffff	140737488355327
    rsi	0x0	0
    rdi	0x7f930047fec0	140269341638336
    rbp	0x7ffc27329a80	140720966113920
    rsp	0x7ffc273299b0	140720966113712
    r8	0x7f930047fec0	140269341638336
    r9	0x1	1
    r10	0x1b	27
    r11	0x7ffc27329c90	140720966114448
    r12	0x0	0
    r13	0x7f930047fec0	140269341638336
    r14	0x7f92ffff9550	140269336892752
    r15	0x7f9301518000	140269359038464
    rip	0x562fea562bff <js::NativeObject::setSlot(unsigned int, JS::Value const&)+687>
    => 0x562fea562bff <js::NativeObject::setSlot(unsigned int, JS::Value const&)+687>:	mov    (%rax),%rax
       0x562fea562c02 <js::NativeObject::setSlot(unsigned int, JS::Value const&)+690>:	mov    0x10(%rax),%rax
This is an automated crash issue comment:

Summary: Assertion failure: gc::IsCellPointerValid(obj), at dist/include/js/Value.h:1436
Build version: mozilla-central revision c89f024c023f+
Build flags: --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize --enable-simulator=arm64
Runtime options: --fuzzing-safe --ion-offthread-compile=off --ion-warmup-threshold=100 --baseline-eager

Testcase:

    const RECORDLOOP = 8;
    const RUNLOOP = RECORDLOOP + 1;
    function testmath(funcname, args, expected) {
        var i, j;
        var arg_value_list = eval("[" + args + "]");
        var arity = arg_value_list.length;
        var mapfunc = eval("(function(a) {})");
        var dummies_and_input = [];
        for (i = 0; i < RUNLOOP; i++) {
            var dummy_list = [];
            for (j = 0; j < arity; j++)
                dummy_list[j] = .0078125 * ((i + j) % 128);
            dummies_and_input[i] = dummy_list;
        }
    }
    testmath("Math.abs", "void 0", Number.NaN)
    testmath("Math.abs", "null", 0)
    testmath("Math.abs", "\"a string primitive\"", Number.NaN)
    testmath("Math.abs", "new String( 'a String object' )", Number.NaN)
    testmath("Math.abs", "Number.NaN", Number.NaN)
    testmath("Math.tan", "Number.POSITIVE_INFINITY", Number.NaN)
    testmath("Math.tan", "Number.NEGATIVE_INFINITY", Number.NaN)
    testmath("Math.tan", "Math.PI/4", 1)
    testmath("Math.tan", "3*Math.PI/4", -1)
    testmath("Math.tan", "Math.PI", -0)
    testmath("Math.tan", "5*Math.PI/4", 1)
    testmath("Math.tan", "7*Math.PI/4", -1)

Backtrace:

    received signal SIGSEGV, Segmentation fault.
    #0  _ZN2js15MapGCThingTypedIZNS_14TenuringTracer8traverseIN2JS5ValueEEEvPT_EUlS5_E_EEDaRKS4_OS5_ (val=..., f=f@entry=<unknown type>) at dist/include/js/Value.h:1436
    #1  0x000055555603dbf1 in js::TenuringTracer::traverse<JS::Value> (this=this@entry=0x7fffffffce60, thingp=thingp@entry=0x3f667a986d0) at js/src/gc/Marking.cpp:2733
    #2  0x0000555555ff8c0f in js::TenuringTracer::traceSlots (end=0x3f667a986d8, vp=0x3f667a986d0, this=0x7fffffffce60) at js/src/gc/Marking.cpp:2944
    #3  js::TenuringTracer::traceObject (this=this@entry=0x7fffffffce60, obj=obj@entry=0x3f667a986a0) at js/src/gc/Marking.cpp:2919
    #4  0x0000555555ff92b6 in js::Nursery::collectToFixedPoint (this=this@entry=0x7ffff5f1f170, mover=..., tenureCounts=...) at js/src/gc/Marking.cpp:3197
    #5  0x0000555556060f6f in js::Nursery::doCollection (this=this@entry=0x7ffff5f1f170, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME, tenureCounts=...) at js/src/gc/Nursery.cpp:973
    #6  0x000055555606184e in js::Nursery::collect (this=0x7ffff5f1f170, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/Nursery.cpp:783
    #7  0x0000555555fd81b8 in js::gc::GCRuntime::minorGC (this=this@entry=0x7ffff5f1c6b8, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME, phase=phase@entry=js::gcstats::PhaseKind::EVICT_NURSERY_FOR_MAJOR_GC) at js/src/gc/GC.cpp:7787
    #8  0x000055555600d8c0 in js::gc::GCRuntime::gcCycle (this=this@entry=0x7ffff5f1c6b8, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7364
    #9  0x000055555600dfed in js::gc::GCRuntime::collect (this=this@entry=0x7ffff5f1c6b8, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7569
    #10 0x000055555600e489 in js::gc::GCRuntime::gc (this=this@entry=0x7ffff5f1c6b8, gckind=gckind@entry=GC_NORMAL, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7657
    #11 0x0000555555bf976f in JSRuntime::destroyRuntime (this=this@entry=0x7ffff5f1c000) at js/src/vm/Runtime.cpp:284
    #12 0x0000555555b2772d in js::DestroyContext (cx=0x7ffff5f18000) at js/src/vm/JSContext.cpp:197
    #13 0x000055555583430c in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:10981
    rax	0x555557cdf180	93825033695616
    rbx	0x7fffffffcd10	140737488342288
    rcx	0x7ffff6c1c2dd	140737333281501
    rdx	0x555556c0d478	93825016059000
    rsi	0x7ffff6eeb770	140737336227696
    rdi	0x7ffff6eea540	140737336223040
    rbp	0x7fffffffccf0	140737488342256
    rsp	0x7fffffffccb0	140737488342192
    r8	0x7ffff6eeb770	140737336227696
    r9	0x7ffff7fe6cc0	140737354034368
    r10	0x58	88
    r11	0x7ffff6b927a0	140737332717472
    r12	0x0	0
    r13	0x7fffffffcd08	140737488342280
    r14	0x3f667a986d8	4356836001496
    r15	0x7ffff5f18000	140737319632896
    rip	0x55555603da59 <_ZN2js15MapGCThingTypedIZNS_14TenuringTracer8traverseIN2JS5ValueEEEvPT_EUlS5_E_EEDaRKS4_OS5_+665>
    => 0x55555603da59 <_ZN2js15MapGCThingTypedIZNS_14TenuringTracer8traverseIN2JS5ValueEEEvPT_EUlS5_E_EEDaRKS4_OS5_+665>:	movl   $0x0,0x0
       0x55555603da64 <_ZN2js15MapGCThingTypedIZNS_14TenuringTracer8traverseIN2JS5ValueEEEvPT_EUlS5_E_EEDaRKS4_OS5_+676>:	ud2
This is an automated crash issue comment:

Summary: Assertion failure: GCPolicy<T>::isValid(ptr), at dist/include/js/RootingAPI.h:1007
Build version: mozilla-central revision c89f024c023f+
Build flags: --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize --enable-simulator=arm64
Runtime options: --fuzzing-safe --cpu-count=2 --ion-offthread-compile=off

Testcase:

    See attachment.

Backtrace:

    received signal SIGSEGV, Segmentation fault.
    #0  JS::Rooted<JS::Value>::Rooted<JSContext*, JS::Handle<JS::Value>&> (this=0x7fffffffbd60, cx=@0x7fffffffbd48: 0x7ffff5f18000, initial=...) at dist/include/js/RootingAPI.h:1007
    #1  0x0000555555e65a2b in js::ToNumberSlow (cx=<optimized out>, cx@entry=0x7ffff5f18000, v_=..., out=out@entry=0x7fffffffbdf0) at js/src/jsnum.cpp:1646
    #2  0x0000555555b9dae5 in JS::ToNumber (out=0x7fffffffbdf0, v=..., cx=0x7ffff5f18000) at dist/include/js/Conversions.h:136
    #3  SetTypedArrayElement (result=..., v=..., index=15, obj=..., cx=0x7ffff5f18000) at js/src/vm/NativeObject.cpp:2831
    #4  SetExistingProperty (cx=<optimized out>, id=..., id@entry=..., v=v@entry=..., receiver=..., receiver@entry=..., pobj=..., pobj@entry=..., prop=..., prop@entry=..., result=...) at js/src/vm/NativeObject.cpp:2898
    #5  0x0000555555bcfd75 in js::NativeSetProperty<(js::QualifiedBool)1> (cx=<optimized out>, cx@entry=0x7ffff5f18000, obj=..., id=id@entry=..., v=..., v@entry=..., receiver=..., receiver@entry=..., result=...) at js/src/vm/NativeObject.cpp:2968
    #6  0x00005555558fb95c in js::SetProperty (cx=0x7ffff5f18000, obj=..., id=..., v=..., receiver=..., result=...) at js/src/vm/ObjectOperations-inl.h:284
    #7  0x00005555558d9029 in SetObjectElementOperation (cx=cx@entry=0x7ffff5f18000, obj=..., id=id@entry=..., value=..., receiver=..., strict=strict@entry=true, script=0x0, pc=0x0) at js/src/vm/Interpreter.cpp:1571
    #8  0x00005555558d9766 in js::SetObjectElementWithReceiver (cx=0x7ffff5f18000, obj=..., index=..., value=..., receiver=..., strict=<optimized out>) at js/src/vm/Interpreter.cpp:4825
    #9  0x000055555619ad67 in vixl::Simulator::VisitCallRedirection (this=0x7ffff5f3f800, instr=0x7ffff5f87088) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:578
    #10 0x00005555561a41ed in vixl::Simulator::VisitException (this=0x7ffff5f3f800, instr=0x7ffff5f87088) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:411
    #11 0x0000555556176800 in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7ffff5f87088) at js/src/jit/arm64/vixl/Decoder-vixl.cpp:872
    #12 0x0000555556184d65 in vixl::Decoder::Decode (instr=<optimized out>, this=<optimized out>) at js/src/jit/arm64/vixl/Decoder-vixl.h:158
    #13 vixl::Simulator::ExecuteInstruction (this=this@entry=0x7ffff5f3f800) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:192
    #14 0x00005555561b627c in vixl::Simulator::Run (this=0x7ffff5f3f800) at js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
    #15 0x0000555556184f54 in vixl::Simulator::call (this=0x7ffff5f3f800, entry=entry@entry=0x1267742f920 "\376w\277\251\375\003", argument_count=argument_count@entry=8) at js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:277
    #16 0x00005555561f78ab in EnterBaseline (data=..., cx=0x7ffff5f18000) at js/src/jit/BaselineJIT.cpp:113
    #17 js::jit::EnterBaselineAtBranch (cx=0x7ffff5f18000, fp=0x7ffff56dc028, pc=<optimized out>) at js/src/jit/BaselineJIT.cpp:189
    #18 0x00005555558eb3a5 in Interpret (cx=0x7ffff5f18000, state=...) at js/src/vm/Interpreter.cpp:1980
    [...]
    #28 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:10958
    rax	0x555557cdf180	93825033695616
    rbx	0x7fffffffbd60	140737488338272
    rcx	0x555556b81fd8	93825015488472
    rdx	0x0	0
    rsi	0x7ffff6eeb770	140737336227696
    rdi	0x7ffff6eea540	140737336223040
    rbp	0x7fffffffbd30	140737488338224
    rsp	0x7fffffffbd10	140737488338192
    r8	0x7ffff6eeb770	140737336227696
    r9	0x7ffff7fe6cc0	140737354034368
    r10	0x58	88
    r11	0x7ffff6b927a0	140737332717472
    r12	0x7fffffffbdf0	140737488338416
    r13	0x7fffffffc070	140737488339056
    r14	0x7ffff5a7fcf0	140737314815216
    r15	0x7fffffffc101	140737488339201
    rip	0x5555558fd1ae <JS::Rooted<JS::Value>::Rooted<JSContext*, JS::Handle<JS::Value>&>(JSContext* const&, JS::Handle<JS::Value>&)+142>
    => 0x5555558fd1ae <JS::Rooted<JS::Value>::Rooted<JSContext*, JS::Handle<JS::Value>&>(JSContext* const&, JS::Handle<JS::Value>&)+142>:	movl   $0x0,0x0
       0x5555558fd1b9 <JS::Rooted<JS::Value>::Rooted<JSContext*, JS::Handle<JS::Value>&>(JSContext* const&, JS::Handle<JS::Value>&)+153>:	ud2
Attached patch nanbox-arm64-support.patch (obsolete) — Splinter Review

Updated arm64 patch. This takes a slightly cruder approach to doing unboxDouble() (unboxing in place, writing to destination, reboxing in place), which avoids the scratchreg and register exhaustion issue.

:decoder - no need to test this one yet - let's fix the fuzzbugs found and then I'll ping you again.

Attachment #9049300 - Attachment is obsolete: true
Attached patch nanbox-arm64-support.patch (obsolete) — Splinter Review

This seems to fix fuzzbugs in comments 119 and 120. Other still failing.

Attachment #9049627 - Attachment is obsolete: true

Rebased main patch to tip.

Attachment #9048565 - Attachment is obsolete: true

Reposting arm64 patch to maintain ordering in list of attachments.

Attachment #9051822 - Attachment is obsolete: true

I took another shot at this. I'm now confused about why we thought the previous patches passed jit-tests, but with the attached patch (plus the two above) everything works, including decoder's fuzz bugs.

I've done a semi-rigorous audit to make sure that all of the x64 changes have an arm64 equivalent, so I feel pretty good about this version.

When boxing changes get landed, we should fix [1] to just use UndefinedValue() instead of it's weird 0x0 optimization.

[1] https://searchfox.org/mozilla-central/rev/8d78f219702286c873860f39f9ed78bad1a6d062/js/src/vm/JSFunction-inl.h#144

Comment on attachment 9054668 [details] [diff] [review]
nanbox-main-x64.patch

Review of attachment 9054668 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/Value.h
@@ -381,4 @@
>  
>   private:
>    explicit constexpr Value(uint64_t asBits) : asBits_(asBits) {}
> -  explicit constexpr Value(double d) : asDouble_(d) {}

The |asDouble_| union arm should be removed in the new boxing configuration.

I believe this all needs to move to phabricator as well to be reviewable

Comment on attachment 9054668 [details] [diff] [review]
nanbox-main-x64.patch

Review of attachment 9054668 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/public/Value.h
@@ +454,4 @@
>  
>   private:
>    explicit constexpr Value(uint64_t asBits) : asBits_(asBits) {}
> +  explicit Value(double d) : asBits_(JSPun64BoxDouble(d)) {}

Specifically, this section needs to be #ifdef-wrapped on JS_PUNBOX64, and the prior implementation (asDouble_(d)) kept for 32-bit code.

Keeping the asDouble_() constructor on the NUNBOX32 Value implementation should be fine (as doubles still behave that way there).

One more patch to clear up a number of build problems on try.

  1. As mentioned above, we need to keep the old Value(double) constructor for 32-bit.
  2. We were doing dumb things with macros to convince gcc to let us put an enum in a bitfield, purely for the purpose of making it easier to read values in a debugger. I got rid of that code.
  3. 32-bit x86 and ARM needed implementations of box/unboxDouble (which just redirect to store/loadDouble). I added them.

Combined with the previous three patches, this patch looks pretty good on try, except for SM(rust): https://treeherder.mozilla.org/#/jobs?repo=try&revision=279609dc122ba22a9a491647a223df2f9de902e5&selectedJob=240836842

The failure summary doesn't give much info, and I couldn't find anything useful in the logs, either.

As described in this comment re BigInt, changes to Value.h require corresponding changes to js/src/gdb/mozilla/jsval.py to keep the JS::Value prettyprinter working and js/rust/src/jsval.rs to keep the rust bindings working.

I've fixed the rust bindings. The pretty printer will take a bit more thought to figure out how to extract the double value.

I am currently working on splitting this all up into a cleaner patch stack for review.

Assignee: kvijayan → iireland

Using the old NaN-boxing scheme, a zero-initialized value was a double, which was safe to trace. Under the new scheme, it is a null object pointer.

This patch manually initializes Value arrays to Undefined.

It took me a while to convince myself that this code was still okay for 0-tagged object Values. Adding a comment to make it clearer for future readers (and in the hope that a reviewer will notice my mistake if I am wrong.)

Depends on D29051

In the past, we have been somewhat sloppy when storing / loading double values, because a boxed double and an unboxed double had the same representation. This is no longer true with object-biased NaN-boxing. This patch goes through and cleans up those cases.

Depends on D29052

Similarly to Part 3: when we push a double value, we currently don't distinguish between pushing a boxed double and pushing an unboxed double. This has to change for object-biased NaN-boxing.

Depends on D29053

This patch is where the actual changes to our value representation happens. A few notes:

  1. We did some weird macro tricks to work around a GCC bug with enums in bitfields. Those bitfields were only useful for poking at values in gdb, and the trick no longer worked with object-biased nanboxing, so I removed it. I also got rid of asDouble_, because it's no longer possible to read the double value right out of the enum without unboxing.

  2. In the previous boxing scheme, there was a mechanical conversion between a JSValueType and a JSValueTag. That's no longer true, which is why the big conversion switches exist.

  3. Waldo, you were included as a reviewer specifically to look at Value.h and make sure that our gross bit twiddling is just gross and not undefined.

Depends on D29054

We are changing the representation of values on 64-bit. Part 5 of this patch stack has more details on the changes.

Depends on D29055

This is a bit gross, but it passes all the tests that were not already failing.

Depends on D29056

(From: https://people.eecs.berkeley.edu/~krste/papers/EECS-2016-1.pdf)

Here are default NaN encodings for different platforms:

ISA Sign Significand QNaN Polarity


SPARC 0 11111111111111111111111 1
MIPS 0 01111111111111111111111 0
PA-RISC 0 01000000000000000000000 0
x86 1 10000000000000000000000 1
Alpha 1 10000000000000000000000 1
ARM 0 10000000000000000000000 1
RISC-V 0 10000000000000000000000 1

This seems like Part 5 is going to break on SPARC/MIPS. It seems like the general approach still works with an ADJ of 0x0008_0000_0000_0001 on those platforms. The mips simulator is a bit of a lie right now.

There should also be a few more static asserts, helper constexpr and a jsapi-test to validate this. We should also find someone with a physical sparc and mips to confirm at least a basic smoke test works. While they are unsupported platforms, it feels silly to break them with no notice in the next release to hit ESR. I think :jcristau might have access to some SPARC machines if I've remembering right.

Canceling the needinfo here, please re-request when the patch is ready for another round of fuzzing. Thanks!

Flags: needinfo?(choller)
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d5d97222de6f
Part 1: Stop zero-initializing Value arrays r=jandem
https://hg.mozilla.org/integration/autoland/rev/90a104231405
Part 2: Improve comments in TraceDataRelocations r=sfink
https://hg.mozilla.org/integration/autoland/rev/8f02092ab0b4
Part 3: Convert store/loadDouble to box/unboxDouble r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/caac1da9ae80
Part 4: Add PushBoxed for float registers r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/1a488ead5ac5
Part 5: Change Value representation r=tcampbell,jwalden
https://hg.mozilla.org/integration/autoland/rev/bccc5a509ebe
Part 6: Update Rust bindings r=fitzgen
https://hg.mozilla.org/integration/autoland/rev/46030572ffde
Part 7: Update gdb pretty-printer r=sfink
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/90769708bfc2
Part 1: Stop zero-initializing Value arrays r=jandem
https://hg.mozilla.org/integration/autoland/rev/037012f898dd
Part 2: Improve comments in TraceDataRelocations r=sfink
https://hg.mozilla.org/integration/autoland/rev/098a1fc45a87
Part 3: Convert store/loadDouble to box/unboxDouble r=tcampbell,mgaudet
https://hg.mozilla.org/integration/autoland/rev/0636d29595ff
Part 4: Add PushBoxed for float registers r=mgaudet
https://hg.mozilla.org/integration/autoland/rev/b82660f5a787
Part 5: Change Value representation r=tcampbell,jwalden
https://hg.mozilla.org/integration/autoland/rev/b5afafec0097
Part 6: Update Rust bindings r=fitzgen
https://hg.mozilla.org/integration/autoland/rev/ae799ea48e2c
Part 7: Update gdb pretty-printer r=sfink
Pushed by iireland@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2ad263987c39
Part 8: Convert C-style macros to JS::detail constexprs r=jwalden
Regressions: 1559072
Regressions: 1559319
Flags: needinfo?(iireland)
Performance Impact: --- → P3
Whiteboard: [js:perf][qf:p3] → [js:perf]
You need to log in before you can comment on or make changes to this bug.