Last Comment Bug 586842 - Store properties named by uint32_t separately from properties named otherwise
: Store properties named by uint32_t separately from properties named otherwise
Status: ASSIGNED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: -- normal with 17 votes (vote)
: ---
Assigned To: Jeff Walden [:Waldo] (remove +bmo to email)
:
Mentors:
: 820820 (view as bug list)
Depends on: 714079 751377 674775 676738 713183 713965 714218 739380 745944 786880 792108 799122 823283 827449 827490 837773
Blocks: 581180 602189 crockmark 679710 707686 726423 JSIL JaegerSpeed 594655 611423 654410 677079 687935 695438 766847
  Show dependency treegraph
 
Reported: 2010-08-12 17:57 PDT by Luke Wagner [:luke]
Modified: 2014-02-06 02:18 PST (History)
52 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments

Description Luke Wagner [:luke] 2010-08-12 17:57:53 PDT
If a dense array gets a single non-integer property (or becomes sufficiently sparse) the dense array becomes sparse which is way way slower.  One way to avoid that, that keeps being being brought up and has been used by other engines, notably Lua's, is to give an object both dense and sparse slot storage.  Without thinking too hard about the nitty gritty details, it seems like this could make general-case property access slightly slower, so we'd have to demonstrate by measurement that there are enough cases that win to justify this strategy.

One example where this really shows up is in building the result object/array of a RegExp exec (where bug 581595 shows we are 3x slower than JSC on v8-regexp).

This is probably too big to do before FF4; this bug is just to officially record the idea and catch discussion.
Comment 1 Nicholas Nethercote [:njn] 2010-08-16 19:15:29 PDT
Bug 486356 comment 3 from Brendan said this:

> An idea that seems very much worth thinking about is sharing dense array
> goodness with objects having named properties (where access is optimized
> differently), so we are not stuck with either-or.
> 
> Lua does this, and Jason and I have talked about it in terms of extending
> stored 32-bit shapes to have an 8 bit "array tag". This tag would overlay the
> property-cache four-bit scope/proto-chain coordinates (see jsinterp.h
> PCVCAP_TAG*). 8 bits is enough to express "array of shape to the left, or else
> one of the machine-typed primitives, or of JSString or other jsval primitives".
> 
> How would this help? We should also get rid of a map per dense array, which is
> killing us right now (see bug 486321). Instead we would want shape to be in a
> dedicated object field, right after classword in the current world (we might
> get to a different world but let's do it with patches).
> 
> To unify fully, we would have an aslots optional field of JSObject, in addition
> to dslots, and probably void*. It would be scanned by the GC at the discretion
> of the class, as is done today even (see array_trace).
> 
> If you take this slightly longer road, I think you avoid higher-overhead steps
> and dead-end traps along the way. Having more than one *ArrayClass is one trap.
> The ongoing lack of array/object unification is another, bigger than this bug
> and of course not a good precedent -- but let's not have more, and let's try to
> fix it here and now. Or soon, anyway.
Comment 2 Brendan Eich [:brendan] 2010-09-08 23:36:07 PDT
Some of my words cited in comment 1 are way out of date (no more map allocation for dense arrays, the map has been a shared singleton for two releases, IIRC). Also I'm not sure we need the shape array-of-primitive trick.

But the idea of a universal aslots vector and zero-storage map for it (just index directly) has a lot of appeal, once you swallow the JSObject size increase.

Not sure this is out of scope for Fx4/Moz2 -- more study needed.

/be
Comment 3 David Mandelin [:dmandelin] 2010-10-11 13:53:19 PDT
I just found out that this is the main cause for the perf gap on string-unpack-code. That benchmark builds an array like this:

    // c and d are args in the original benchmark.
    var c = 1976;
    var d = {};
    while (c--)
        d[e(c)]=k[c]||e(c);

The resulting object's only properties are the integer indices from 0..1975, but it is a vanilla object all the way, and it has too many properties for the standard elem PIC to help.

We're seeing the need for this more and more: regexp-dna from comment 0 (Luke, do you have the data/details on that?), Processing Space Invaders (bug 602189), and now string-unpack-code. Is there any chance we can do this for Fx4?
Comment 4 David Mandelin [:dmandelin] 2010-10-11 15:02:26 PDT
I made a little mistake in preparing the last comment: as you can see, the keys are not necessarily numeric, since they are generated by calling |e| (and in fact they are not numeric). So this is not a case for hybrid storage.

Anyway, that still seems like an important optimization, but back to post-Fx4.
Comment 5 Tom Schuster [:evilpie] 2010-12-15 23:33:11 PST
I am currently doing some tests with this approach. It's likelly that array access will get a tiny bit faster, as we lose on class check. What kind of relation dslots/aslots should have, and what kind of usage for aslots should be still acceptable, needs to be investigated. Eg x[10000] would we like to allocate so much, probably not for the first elment.
Comment 6 Luke Wagner [:luke] 2011-02-01 14:11:20 PST
I think another use case is the length property on dense arrays: currently, we can't do the same thing for dense arrays as we do on slow arrays (property with array length setter/getter) since dense arrays can't have normal properties.
Comment 7 Jeff Walden [:Waldo] (remove +bmo to email) 2011-02-15 11:11:34 PST
I'll take this once 4.0 wraps up.
Comment 8 Paul Biggar 2011-02-16 18:25:05 PST
(In reply to comment #7)
> I'll take this once 4.0 wraps up.

I'm interested in helping, if you need anything.
Comment 9 Tom Schuster [:evilpie] 2011-07-27 07:47:56 PDT
Jeff are you working on this? Maybe we could do this in project branch together?
Comment 10 Brian Hackett (:bhackett) 2011-07-27 08:04:04 PDT
I think this bug should wait until after we have a generational GC and can do rapid allocation of dense slot arrays for objects.  The problem right now is that an object's fixed slots are used to hold the dense slots of small arrays, and to do this right it should be possible for an object to have both a dense slot vector *and* inline slots which can be accessed in one dereference by JITs.

The only reason small dense arrays use inline slots for storage is that malloc is slow, and with a nursery we can just do a single bump allocation reserving space for an object plus its fixed slots plus initial space for its dense slots.
Comment 11 David Anderson [:dvander] 2011-07-27 11:00:57 PDT
Maybe, though if someone's ready and willing to do the work now, it's not technically blocked on GC stuff. At least it seems like most of the bug's difficulty is in the VM.
Comment 12 Brian Hackett (:bhackett) 2011-07-27 11:12:17 PDT
Sure, it could be implemented now, but getting good performance characteristics for small arrays will I think require a very large increase in complexity vs. what would be necessary with a fast allocator.
Comment 13 Jeff Walden [:Waldo] (remove +bmo to email) 2011-07-27 13:02:35 PDT
I started work on this on Friday.  I hadn't thought of this as particularly well-suited to a project branch, and so far I've just been trying to carve out small pieces at a time to work on incrementally.  But maybe I just haven't thought it through hard enough.  I'll try to get back with more in the next few days.
Comment 14 David Mandelin [:dmandelin] 2011-07-27 16:13:13 PDT
(In reply to comment #12)
> Sure, it could be implemented now, but getting good performance
> characteristics for small arrays will I think require a very large increase
> in complexity vs. what would be necessary with a fast allocator.

I'd like this to start now, in the interest of getting it finished sooner. This is a big performance fault for us, and we can certainly fix that now. All the general VM adaptations should still apply with new GC. 

I'm fine with not fully optimizing and tuning arrays until we get the other pieces in place.
Comment 15 Jeff Walden [:Waldo] (remove +bmo to email) 2012-04-09 16:00:25 PDT
I'm touching so much of the same code that this seems like the right time to begin implementing the proto-climbing refactoring:

http://wiki.ecmascript.org/doku.php?id=harmony:proto_climbing_refactoring

This should also have benefits for JSAPI users, where we currently hack in workarounds for some of that stuff in awkward ways.  It is scope creep, somewhat (but note "touching so much of the same code"), but it also means I have a clearer understanding of what each bit of functionality I implement corresponds to, because there are straightforward spec algorithms to point to.  These patches are landing piecemeal over time, adjacent to the current code, but not yet used (to enable easy switching back and forth until we're completely comfortable with the new representation).
Comment 16 Jan de Mooij [:jandem] (PTO until July 31) 2012-12-12 06:45:02 PST
*** Bug 820820 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.