Closed Bug 1845728 Opened 1 year ago Closed 10 months ago

Make Object.keys elidable when only the length property is used.

Categories

(Core :: JavaScript Engine: JIT, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
firefox122 --- fixed

People

(Reporter: nbp, Assigned: nbp, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sp3])

Attachments

(3 files)

In Bug 1827268 comment 0, the code snippet shows that Object.keys is called twice, once per object, in order to compare the lengths against each other; only one of the two resulting arrays is then iterated over.

If we do not implement a cache for Object.keys, we could replace one of the two Object.keys calls with a special function which only computes the length, avoiding building the array.

Attached file shallowTest.js

I whipped together an egregiously broken prototype of this. Suppose you have a builtin "KeysCount" which doesn't allocate the array but simply returns the length the array would have had. You could then write shallowEquals as follows:

// What if we had a hypothetical builtin that could count keys without actually allocating? and we could
// transform this to use said builtin?
function shallowEqualKeysCount(objA, objB) {
    if (Object.is(objA, objB)) return true;
    if (typeof objA !== "object" || objA === null || typeof objB !== "object" || objB === null) return false;
    if (KeysCount(objA) !== KeysCount(objB)) {
        return false;
    }

    var keysA = Object.keys(objA);
    for (let i = 0; i < keysA.length; i++) {
        if (!Object.prototype.hasOwnProperty.call(objB, keysA[i]) || !Object.is(objA[keysA[i]], objB[keysA[i]])) {
            return false;
        }
    }
    return true;
}

(The expectation being that we could hypothetically teach the JITs to make use of this intrinsic.)

The challenge in evaluating this is designing a representative workload. Unfortunately, I don't have any insight into what actual React values going through shallowEquals look like. However, I built the attached benchmark; you have 3 groups of objects: A, B, and C. A and B differ in the length of the property key set. A and C differ only in value.
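The attached shallowTest.js is not reproduced here, but as a rough, hypothetical sketch of the group setup described above (the object shapes and names below are my own, not taken from the attachment):

```javascript
// Hypothetical sketch of the three benchmark groups (shapes assumed
// from the description, not copied from the attached shallowTest.js).
function shallowEqual(objA, objB) {
  if (Object.is(objA, objB)) return true;
  if (typeof objA !== "object" || objA === null ||
      typeof objB !== "object" || objB === null) return false;
  var keysA = Object.keys(objA);
  var keysB = Object.keys(objB);
  // A vs. B fails here: this is the check a counting builtin could
  // answer without allocating either array.
  if (keysA.length !== keysB.length) return false;
  // A vs. C reaches the property walk, where keysA is genuinely needed.
  for (var i = 0; i < keysA.length; i++) {
    if (!Object.prototype.hasOwnProperty.call(objB, keysA[i]) ||
        !Object.is(objA[keysA[i]], objB[keysA[i]])) return false;
  }
  return true;
}

var groupA = { x: 1, y: 2, z: 3 }; // baseline
var groupB = { x: 1, y: 2 };       // differs from A in key-set length
var groupC = { x: 1, y: 2, z: 4 }; // same keys as A, differs only in value
```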

Using a jerry-rigged builtin that borrows all its code from the obj_keys paths, I measure about a 20% speedup from the counting version of shallowEquals over the non-counting version.

hyperfine '/home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js' 'COUNT=1 /home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js' 
Benchmark 1: /home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js
  Time (mean ± σ):      1.012 s ±  0.016 s    [User: 1.042 s, System: 0.036 s]
  Range (min … max):    0.982 s …  1.032 s    10 runs
 
Benchmark 2: COUNT=1 /home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js
  Time (mean ± σ):     847.0 ms ±  10.6 ms    [User: 850.0 ms, System: 11.6 ms]
  Range (min … max):   836.9 ms … 873.3 ms    10 runs
 
Summary
  'COUNT=1 /home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js' ran
    1.19 ± 0.02 times faster than '/home/matthew/unified/obj-opt-shell-nodebug-x86_64-pc-linux-gnu/dist/bin/js ../perf/shallowTest.js'

I'll attach my poor patch to this bug.

Note: again, the problem with this evidence is that it's not clear what the values actually look like in React applications! Note, for example, that if you drop Group B from the benchmark, you see no speedup whatsoever. This surprised me, by the way: I would have expected some speedup from avoiding an array allocation, but I don't measure any without Group B.

Whiteboard: [sp3]
Severity: -- → N/A
Priority: -- → P1

FWIW, looking at the DFG-generated code from JSC: it is not eliding the array allocation (but is doing the allocation inline in JIT code), and is still 3.3x faster than SpiderMonkey on shallowEquals.

That's a fair point. Before experimenting with scalar replacement, we could start by simply generating an inline fast allocation path for cases where the native iterator is available.

I literally do not care how others are implementing things, unless I have no clue and I need some reading material for inspiration.

Recover instructions do not exist in other engines, and they are unlikely to make use of recover instructions if they do not already have an implementation. I also doubt they are going to implement recover instructions for this one use case.

I am not here to implement the same feature as other engines, I am here to make us faster, and we would not be faster if we keep copying.

Fear not; knowing about how other engines work doesn't mean you have to discard your own ideas for optimization. It just means that by combining both ideas you may become faster than the other engines.

(In reply to Nicolas B. Pierron [:nbp] from comment #5)

and we would not be faster if we keep copying.

Not strictly true, because there are multiple engines to copy from. If the time in a function is made up of its subparts A and B, and if Chrome is fast at A and slow at B, while Safari is slow at A and fast at B, then we can be faster than either of them by copying Chrome's optimization for A and Safari's optimization for B, and so be fast at both A and B.

Testing on Speedometer 3, so far I only see improvements on Editor-TipTap, and neither confidence nor improvement on any React benchmark.

https://treeherder.mozilla.org/perfherder/compare?originalProject=try&originalRevision=279c4c3ebadc9dc251c7318ec0d7625c42703109&newProject=try&newRevision=85cd5cb74a92f1ea40019d2ea5d975e0016ee142&framework=13&page=1

Looking at the profiles, I do see stack frames with the obj_keys_length call (only when all conditions are met for this optimization) in TodoMVC-React-Redux.CompletingAllItems-sync, TodoMVC-React.DeletingAllItems-sync and TodoMVC-React.CompletingAllItems-sync.

Thus, the optimization is used, but not as much as Bug 1827268 suggested.

I am still working on cleaning up the patch and fixing the latest issues I found while adding more test cases…
I will upload the next, and hopefully latest, version on Monday.

This optimization is made to handle cases where Object.keys(..) is called only to
compute the number of owned properties as a short-cut to iterating over the
properties of the object.

However, this short-cut, which is widely deployed on the Web, is wasteful given
that Object.keys(..) iterates over the owned properties to allocate an array
storing the names of the properties. This optimization elides the allocation of
the array when the content of the array is not needed.

Note, this optimization skips the case of a Proxy, where the Object.keys(..)
call might be effectful, as well as the case where the object might be mutated
after Object.keys(..) is called. Both of these would require much more involved
checks to guarantee that the optimization remains valid.
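For illustration only (the names below are mine, not from the patch), here is the shape of code this targets, alongside a Proxy case where eliding the call would be observable:

```javascript
// The pattern the optimization targets: Object.keys(..) used only for
// its .length, so the array allocation can be elided.
function ownKeyCount(obj) {
  return Object.keys(obj).length;
}

// A case the optimization must skip: a Proxy's ownKeys trap is
// observable, so the Object.keys(..) call cannot be removed.
let trapCalls = 0;
const proxied = new Proxy({ a: 1, b: 2 }, {
  ownKeys(target) {
    trapCalls++; // side effect: eliding the call would skip this
    return Reflect.ownKeys(target);
  },
});
```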

Attachment #9361045 - Attachment description: Bug 1845728 - Ellide keys call in Object.keys(..).length. WIP → Bug 1845728 - Ellide keys call in Object.keys(..).length.
Pushed by npierron@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/18d07964513a Ellide keys call in Object.keys(..).length. r=iain
Regressions: 1869091

This optimization doesn't seem to be triggering on React's shallowEqual.

I see two ObjectKeys ops in the IR, each of which seems to correspond to a call to VMWrapper: ObjectKeys. I'd expect to see one ObjectKeysLength and one ObjectKeys.

Actually, it seems like it does trigger on the shell branch's shallowEqual but not on the no-compress branch's.

shell:

function shallowEqual(objA, objB) {
  if (objectIs(objA, objB)) return !0;
  if ("object" !== typeof objA || null === objA || "object" !== typeof objB || null === objB) return !1;
  var keysA = Object.keys(objA),
    keysB = Object.keys(objB);
  if (keysA.length !== keysB.length) return !1;
  for (keysB = 0; keysB < keysA.length; keysB++) if (!hasOwnProperty$1.call(objB, keysA[keysB]) || !objectIs(objA[keysA[keysB]], objB[keysA[keysB]])) return !1;
  return !0;
}

no-compress

function shallowEqual(objA, objB) {
  if (objectIs(objA, objB)) return true;
  if (typeof objA !== "object" || objA === null || typeof objB !== "object" || objB === null) return false;
  var keysA = Object.keys(objA);
  var keysB = Object.keys(objB);
  if (keysA.length !== keysB.length) return false;
  for (var i = 0; i < keysA.length; i++) if (!hasOwnProperty$1.call(objB, keysA[i]) || !objectIs(objA[keysA[i]], objB[keysA[i]])) return false;
  return true;
}

The main difference that stands out to me is that the no-compress version has a separate variable i instead of reusing the name keysB

Is it practical to make it work on the no-compress version?

Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → 122 Branch

needinfo nbp for comment 12. Perhaps this should now be a new bug?

Flags: needinfo?(nicolas.b.pierron)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #12)

The main difference that stands out to me is that the no-compress version has a separate variable i instead of reusing the name keysB

Is it practical to make it work on the no-compress version?

This is strange: by reusing the keysB variable, the shell version is basically trashing the resume-point usage of the variable.
Using a separate variable i keeps the old value of keysB alive, but it should have been replaced by a magic value in EliminateDeadResumePointOperands.

The fact that resume points hold the value of keysB prevents the foldsTo operation from kicking in, as it would be invalid to recover Object.keys(objB) if objB were mutated. This is why we have extra checks preventing this optimization when resume points capture the value of keysB after any aliasing instruction. However, the loop contains many calls, which most likely trigger this check and prevent the optimization.

I guess this might be doable, but this is likely going to be a fight between EliminateDeadResumePointOperands and some annotation which might suggest that we have to keep some value alive in case of a bailout.
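To make the aliasing concern concrete in plain JavaScript (a hand-written illustration, not code from the patch): recomputing Object.keys(objB) at bailout time would be invalid if objB was mutated after the original call, because the recovered array would differ from the one the program actually observed.

```javascript
// Illustration of why Object.keys(objB) cannot simply be re-executed
// at bailout time once an aliasing instruction may have run.
const objB = { a: 1, b: 2 };
const keysB = Object.keys(objB); // snapshot of the key set at call time
delete objB.b;                   // mutation after the keys call
// The original snapshot and a recomputed key set now disagree, so a
// "recovered" Object.keys(objB) would not match what the program saw.
```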

Regressions: CVE-2024-29943