Open Bug 1405509 Opened 7 years ago Updated 2 years ago

[jsdbg2] Debugger.Object should handle massive objects better

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

Tracking Status
firefox57 --- wontfix

People

(Reporter: jimb, Assigned: jimb)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [js:tech-debt])

Debugger.Object does not provide a convenient API for dealing with objects with massive numbers of properties like TypedArrays or web content globals. You can either get them all, with D.O.p.getOwnPropertyNames, or don't ask.

Calling getOwnPropertyNames on a mega-element TypedArray is clearly ridiculous, handing you back an array containing a freshly allocated string for every numeric index (after the first 256, which are atomized).
For named properties, D.O.p.getOwnPropertyNames allocates only the array; all the property names are atoms, so the strings in the returned array are the same memory that the object's own shape uses to record the property names. So even an object with tens of thousands of properties would allocate only hundreds of thousands of bytes to hold the array.

So the real problem is arrays. The JavaScript concept of length isn't quite helpful here:

js> a = [];
[]
js> a[10000] = "yo";
"yo"
js> a.length
10001
js> Object.getOwnPropertyNames(a)
["10000", "length"]
js> 

This array wouldn't benefit from any special treatment.

Perhaps the API should distinguish between what SpiderMonkey calls dense and sparse properties. The API could have a way to report the number of dense properties (which would be the length of a fully-allocated array), and also to retrieve only the non-dense properties (which would be 'length').
Flags: needinfo?(jimb)
Here's what I'm thinking of at the moment.

## Accessor Properties

`denseLength`
:   Either `undefined` or an integer. If this property's value is some integer
    *l*, then the referent has zero or more properties whose names are
    `i.toString()` for integers `i` between 0 and *l*.

    This accessor is meant to help inspect large arrays (including typed arrays)
    efficiently, when used together with the `getDensePropertyRange` and
    `getSparseOwnPropertyNames` methods. The specific value it returns is left
    to the JavaScript engine's discretion, but typically, an array that has most
    or all of its elements populated will have a `denseLength` greater than the
    largest populated index, so you can use `densePropertyRange` to inspect
    portions of the array, and `getSparseOwnPropertyNames` to find the
    referent's named properties.

    If this accessor returns `undefined`, the referent may still have properties
    whose names are decimal integer strings. If it returns an integer, the
    referent may have properties whose names are decimal integer strings greater
    than or equal to that value, or less than zero, and the array may not be
    fully populated.

## Function Properties

<code>getDensePropertyRange(<i>start<i>, <i>length</i>)</code>
:   Return an array of debuggee values representing the values of the referent's
    properties in the range <code>[<i>start</i>, <code><i>start</i> +
    <i>length</i>)</code>. The returned array will have holes for properties in
    that range that are absent on the referent.

    The values <code>Number(<i>start</i>)</code> and
    <code>Number(<i>length</i>)</code> must be integers. The range
    <code>[<i>start</i>, <code><i>start</i> + <i>length</i>)</code> must not
    extend outside <code>[0, <i>denseLength</i>)</code>, where
    <i>denseLength</i> is the value of this `Debugger.Object`'s `denseLength`
    property.

    If the referent is a typed array, the return value will be a typed array of
    the same kind.

`getSparseOwnPropertyNames()`
:   Return an array of strings naming all the referent's own properties,
    omitting those whose names are the string form of integers in the range
    <code>[0, <i>denseLength</i>)</code>, where <i>denseLength</i> is the value
    of this `Debugger.Object`'s `denseLength` property.
Flags: needinfo?(jimb) → needinfo?(jorendorff)
The idea here is that you'd first check `denseLength`, and if it's a number large enough to concern you, then you can use `getDensePropertyRange` to look at just a portion of the array, and `getSparseOwnPropertyNames` to find any other random non-array properties stuck on the object.

The "dense" and "spare" terminology has no connection with anything in ECMAScript as far as I know. This is deliberate, since the value returned is left entirely to the engine's discretion. In practice, SpiderMonkey will provide helpful values for arrays and typed arrays, which should be enough to address bug 1387823 effectively.
Also needinfoing Nicolas to make sure this proposal will be enough to address bug 1387823, specifically https://bugzilla.mozilla.org/show_bug.cgi?id=1387823#c9.  I'm not sure, but it seems like the proposed `getSparseOwnPropertyNames` function could be a replacement for the logic in `enumObjectProperties`  when `options.ignoreIndexedProperties` is true, and `denseLength` could be a replacement for the logic when `options.ignoreNonIndexedProperties` is true.
Flags: needinfo?(nchevobbe)
Here's Nicolas' comment:

> We want to be able to retrieve non-indexed properties on arrays (length,
> buffer, bufferOffset, custom-set properties; i.e. `x = []; x.foo ="bar"`, …).
> In the function, it uses getOwnPropertyNames, find the length property and cut
> the array after that index. For large typed arrays, this takes too much time,
> but i'm not aware of any way to only get those non-indexed properties in a
> fast way.

The proposed `getSparseOwnPropertyNames` would return only non-indexed and
custom-set properties. On typed arrays, SpiderMonkey would provide the array's
length as the value of `denseLength`, so `getSparseOwnPropertyNames` would skip
them.

(Not trying to answer for Nicolas, but just indicate how the proposal is meant to address his concerns.)
Look good to me, thanks for looking at this Jim
Flags: needinfo?(nchevobbe)
(In reply to Jim Blandy :jimb from comment #1)
> Perhaps the API should distinguish between what SpiderMonkey calls dense and sparse properties.

Is this property classification deterministic?

(In reply to Jim Blandy :jimb from comment #2)
> `denseLength`
> :   Either `undefined` or an integer. If this property's value is some integer
>     *l*, then the referent has zero or more properties whose names are
>     `i.toString()` for integers `i` between 0 and *l*.

If not undefined, is `denseLength` guaranteed to be a Uint32? Or might it be a bigger safe integer?

> <code>getDensePropertyRange(<i>start<i>, <i>length</i>)</code>
> :   Return an array of debuggee values representing the values of the referent's
>     properties in the range <code>[<i>start</i>, <code><i>start</i> +
>     <i>length</i>)</code>. The returned array will have holes for properties in
>     that range that are absent on the referent.

But since the properties are dense, maybe it would be better to return a collection of the missing properties between `start` and `length`?

>     If the referent is a typed array, the return value will be a typed array of
>     the same kind.

This would mean that `getDensePropertyRange` could only return properties in the 0..127 range for a Int8Array, or 0..255 and some would become negative. But typed arrays can be much larger.
I would make `getDensePropertyRange` always return a Uint32Array (assuming `denseLength < 2**32`)

(In reply to Brian Grinstead [:bgrins] from comment #4)
> I'm not sure, but it seems like the proposed `getSparseOwnPropertyNames` function
> could be a replacement for the logic in `enumObjectProperties`  when
> `options.ignoreIndexedProperties` is true, and `denseLength` could be a
> replacement for the logic when `options.ignoreNonIndexedProperties` is true.

Currently, indexed properties are the own array index properties (except for sparse arrays, then non-existing array indices are also included), and non-indexed properties are the other ones.

I would change the definitions so that indexed properties are the properties returned by `getDensePropertyRange(0, denseLength)`, and non-indexed properties are the ones returned by `getSparseOwnPropertyNames()`.
Priority: -- → P3
Whiteboard: [js:tech-debt]
Assignee: nobody → jimb
Note that the Proxy/Reflect methods that are built into the ES standard also don't support this use case well.

Comment 2 looks good. I would rename `.getDensePropertyRange()` to `.denseElements()` or `.denseSlice()`. The documentation should say explicitly whether or not the dense APIs ever apply to array-like objects that are neither arrays nor typed arrays: DOM objects, `arguments`, and plain Objects (such as jQuery objects).

In the case of DOM objects, it's a tricky question. Many are proxies that do have dense elements; see BaseProxyHandler::getElements(): https://searchfox.org/mozilla-central/search?q=symbol:_ZNK2js16BaseProxyHandler11getElementsEP9JSContextN2JS6HandleIP8JSObjectEEjjPNS_12ElementAdderE&redirect=false
Flags: needinfo?(jorendorff)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.