Closed Bug 661374 Opened 14 years ago Closed 3 months ago

Implement bytecode-based escape analysis

Categories

(Core :: JavaScript Engine, enhancement)

x86
Windows 7
enhancement


RESOLVED WONTFIX

People

(Reporter: kael, Unassigned)

Details

In certain types of applications (games, for example), many common patterns are designed around the assumed presence of structured value types, used to represent things like locations, transformation matrices, etc. When these patterns are implemented in JavaScript, they end up creating large amounts of garbage in the form of Objects representing those structured value types.

I believe that some degree of static analysis is possible in order to detect Objects that satisfy a set of requirements to be classified as a 'value type object': they do not hold references to other objects within the GC heap and do not contain complex behavior; all their values are either primitives (like Number) or other value type objects. If we can statically verify this, we can then optimize the use of these value types. They don't need to participate in cycle collection, and they may be able to be entirely ignored by the GC; we could stack-allocate them in cases where we are able to determine that they do not escape, and in cases where they do escape, we can combine simple heap allocation with naive refcounting, since we know they will not participate in cycles. We can also potentially optimize their layout in memory, since we know the exact shape of the object in advance.

cdleary suggested that a good way to prototype this would be to implement a simple form of this analysis on SpiderMonkey bytecode, so that's probably my starting point.

Some gotchas that might prevent this from being possible:

- No matter what, you have Object.prototype involved, which means that value type objects can inherit references to other objects from their prototype, and can inherit references to functions that might allow the 'this'-reference to escape.
- While optimistically, emulating struct semantics should provide a performance win, it's possible that there are cases where objects are used like structs, but treating them like actual structs will make the resulting JS perform worse.
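For illustration, here is a hypothetical example (not from the bug report) of the pattern being described: a 2D vector "value type" built from plain objects, where every arithmetic operation allocates a fresh short-lived object.

```javascript
// Hypothetical value-type pattern: a 2D vector as a plain object.
// Each arithmetic operation allocates a new object, so a tight loop
// produces one piece of garbage per iteration.
function vec2(x, y) {
  return { x: x, y: y };
}

function addVec2(a, b) {
  // Allocates a new temporary object on every call.
  return vec2(a.x + b.x, a.y + b.y);
}

function integrate(position, velocity, steps) {
  var p = position;
  for (var i = 0; i < steps; i++) {
    p = addVec2(p, velocity); // one short-lived object per iteration
  }
  return p;
}

var end = integrate(vec2(0, 0), vec2(1, 2), 1000);
console.log(end.x, end.y); // 1000 2000
```

A value-type-aware engine could observe that each intermediate vector never escapes the loop and avoid materializing it at all.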
If this is implemented, it will need some carefully tuned heuristics to ensure we don't regress performance on real-world applications. Generational GCs reduce the cost associated with short-lived simple objects, so if we end up shipping a generational GC, this optimization might not be worthwhile anymore.
Well, as I mentioned in IRC, the analysis that's feasible with the current engine infrastructure is a function-local scope optimization. You can detect object literals that can be decomposed into fields, like so:

function foo() {
  var o = {a: 1, b: 2, c: 3};
  o.c *= o.a;
  print(o.a + o.b);
}

Because you know that the property lookups are definitely going to o (and not something on the prototype chain), and you conservatively see that o can't escape from the local scope (no method calls or name uses), you can lower the properties of o into local slots, like vars.

As soon as multiple functions get involved, you need to assume things about the identity of the function that gave you a new object; i.e.

function Point(x, y) {
  return {x: x, y: y};
}

function foo() {
  var p = new Point(1, 2);
  p.x *= p.y;
  print(p.x + p.y);
}

This is going to be tougher, because you can't reason about the object source locally, within a single function. I'd recommend that we start with the local one and then brainstorm how best to apply it to cross-function analysis (there are some possibilities there).
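For illustration, the local lowering described above would in effect rewrite the first example into something like the following (a hand-written sketch of the transformation, not actual engine output; return is used instead of print() so the result is observable):

```javascript
// Before: a non-escaping object literal.
function fooOriginal() {
  var o = { a: 1, b: 2, c: 3 };
  o.c *= o.a;
  return o.a + o.b;
}

// After scalar replacement: each property of the non-escaping literal
// becomes an ordinary local variable, so no object is ever allocated.
function fooLowered() {
  var o_a = 1;
  var o_b = 2;
  var o_c = 3;
  o_c *= o_a;
  return o_a + o_b;
}

console.log(fooOriginal(), fooLowered()); // 3 3
```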
Summary: JavaScript 'value type' optimization → Implement bytecode-based escape analysis
Whiteboard: [mentor:cdleary]
There is a fair amount of overlap here with how property accesses are treated in the type inference branch (bug 608741, merging to TM hopefully within a week or so). When constructing objects, we analyze uses of 'this' to determine when it escapes and which properties are definitely added before it does so. We then use that information at access sites to reach the property at a fixed offset into the object: a single dereference (if the type of the property is known statically), with no IC or branching. This only covers accesses, not allocation or GC management --- you might want to look at the discussion in bug 642002 on escape analysis.
Note that all objects always hold references into the GC heap: to their proto and parent. We're working on nixing the latter but the former is not going to go away... now in the case of literals, the proto will be the default Object.prototype, which we can ensure lives long enough as long as there's no escaping.
See also http://wiki.ecmascript.org/doku.php?id=strawman:binary_data This is approved for ES.next, should be getting implementation love soon I hear. /be
(In reply to comment #4)
> See also
>
> http://wiki.ecmascript.org/doku.php?id=strawman:binary_data
>
> This is approved for ES.next, should be getting implementation love soon I
> hear.
>
> /be

That is awesome, I hadn't realized Binary Data was already approved for ES.next. That proposal provides wins in a lot of the same cases I'm considering here. One thing that's unclear: I see mention of blocks living in the heap instead of being ECMAScript objects; does that mean that in practice, single Struct instances would not count as garbage? Or is that beyond the scope of the binary data proposal, and something the optimization proposed in this bug would still benefit?
> That is awesome, I hadn't realized Binary Data was already approved for
> ES.next.

Well, it's promoted to Harmony status, but nothing is officially "approved" yet. However, the committee is very supportive of the proposal, so we just have to do the work to finish it. This is high-priority and I need to get on this ASAP, as Brendan says.

> That proposal provides wins in a lot of the same cases I'm
> considering here. One thing that's unclear - I see mention of blocks living
> in the heap instead of being ECMAScript objects; does that mean that in
> practice, single Struct instances would not count as garbage? Or is that
> beyond the scope of the binary data proposal, and something the proposed
> optimization from this bug would still benefit?

Hm, I didn't expect the spec to address garbage collection, but I may not be understanding you. The internal binary data at least conceptually lives on the heap, because there may be multiple Struct objects that point to shared data in a block (i.e., when you pull out a substruct it's pointing to the same mutable binary data). If you want to allocate data internally when you can prove it isn't shared, I think that should be fine.

Dave
In particular, an ArrayType containing StructType elements is one contiguous allocation -- not a Java-esque vector of pointers to the heap, where the vector is itself heap-allocated. And yeah, this bug wants to do stack allocation transparently too, but that should be doable with binary data as well as with other JS data. /be
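The layout difference being described can be sketched in plain JavaScript using a typed array (this is a hypothetical hand-packed Point layout for illustration, not the binary data API itself):

```javascript
// Java-esque layout: a heap-allocated vector of pointers to N
// separately heap-allocated point objects.
function makePointObjects(n) {
  var pts = [];
  for (var i = 0; i < n; i++) {
    pts.push({ x: i, y: i * 2 });
  }
  return pts;
}

// Contiguous layout: one allocation holding x0, y0, x1, y1, ...
// This is roughly what an ArrayType of StructType elements gives you.
function makePointBuffer(n) {
  var buf = new Float64Array(n * 2);
  for (var i = 0; i < n; i++) {
    buf[i * 2] = i;         // x field of point i
    buf[i * 2 + 1] = i * 2; // y field of point i
  }
  return buf;
}

var objs = makePointObjects(4);
var buf = makePointBuffer(4);
console.log(objs[3].x, objs[3].y); // 3 6
console.log(buf[6], buf[7]);       // 3 6
```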
Kevin, if you want to defer working on this until type inference comes online or if you're interested in working on the binary data strawman implementation instead of a bytecode-oriented analysis in the near-term, let's close this bug out.
This bug seems worth keeping unless we know TI subsumes it -- and even then, if there's a unit of TI work summarized with words close to this bug's summary. Or close it, if there's a better bug, but my main point is that implementing the binary data strawman is a fine thing, but it is not the same bug as this one, so filing and fixing it should not close this one. /be
I think we should keep this bug. TI doesn't subsume this, but this should definitely be implemented on top of the analyses available with TI --- escape analysis becomes much simpler working on SSA rather than the raw bytecode, and cross-script analysis becomes much more tractable with a callgraph and other type info available.
I'll be discussing whether this makes sense in the short term with Taras, but in the long term I still plan to do it. I'm going to be researching the bytecode analysis part of it anyway since it overlaps with a personal project, so if that research is fruitful it'll provide a starting point for this bug (whenever the work actually gets done).
Is this still relevant, or is nbp's work on Ion escape analysis in bug 856533 all that should be done here?
Whiteboard: [mentor:cdleary]
Escape analysis in bytecode is not a good idea. Figuring out whether values escape the context at a bytecode level is simply too weak for most of the perf gains we're looking to get from this analysis. Consider the following simple case:

function addVec(a, b) {
  return [a[0] + b[0], a[1] + b[1]];
}

function scaleVec(a, scale) {
  return [a[0] * scale, a[1] * scale];
}

function addScaleVec(a, b, scale) {
  return addVec(a, scaleVec(b, scale));
}

This kind of code can be seen in v8/raytrace, and on the web in situations where objects are used as structured arguments. Looking above, just at the bytecode, does |scaleVec(b, scale)| escape the scope of the |addScaleVec| call? To answer that, we need to figure out whether the second argument of |addVec| escapes addVec, which is not immediately available when analyzing addScaleVec. These kinds of problems abound when analyzing bytecode. There's just not enough information to be able to conclude anything about escaping values.

A much more workable plan for escape analysis is to investigate a completed IonGraph after construction:

1. At that time, type information will tell us what functions are called.
2. TI will also tell us whether some innocuous-looking assignment to a slot in a non-escaping object is actually a call to a setter that may make the assigned value escape, or whether it can be guaranteed to be a plain set.
3. Inlining will expand the scope of analysis "for free".

Basically, bytecode analysis is close to useless for figuring out escape properties. It'll give us a very limited amount of information that'll allow for meager optimizations only. It's better to move this work to bug 856533.
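To make the inlining point concrete, here is a hand-inlined version of addScaleVec (a sketch, not compiler output, and with scaleVec's apparent b[0]/a[1] typo fixed). Once scaleVec and addVec are inlined, the temporary array from scaleVec is visibly confined to a single function body, so an SSA-level analysis can replace it with scalars; the bytecode of each small function on its own cannot prove this.

```javascript
// Original composition: each call allocates a temporary array.
function addVec(a, b) { return [a[0] + b[0], a[1] + b[1]]; }
function scaleVec(a, scale) { return [a[0] * scale, a[1] * scale]; }
function addScaleVec(a, b, scale) { return addVec(a, scaleVec(b, scale)); }

// Hand-inlined equivalent: the intermediate scaled vector is now
// obviously local, and its elements can be replaced by scalars.
function addScaleVecInlined(a, b, scale) {
  var t0 = b[0] * scale; // elements of the would-be temporary array
  var t1 = b[1] * scale;
  return [a[0] + t0, a[1] + t1]; // only the result array remains
}

var r1 = addScaleVec([1, 2], [3, 4], 2);
var r2 = addScaleVecInlined([1, 2], [3, 4], 2);
console.log(r1, r2); // [ 7, 10 ] [ 7, 10 ]
```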
Assignee: kevin.gadd → general
Severity: normal → enhancement
Assignee: general → nobody
Severity: normal → S3

:nbp, is this still relevant?

Flags: needinfo?(nicolas.b.pierron)

I concur with Kannan's previous comment, and with the fact that bytecode analysis gets in the way of running the code, which adds cost to the page-load metric, whereas escape analysis, which is currently a phase in Ion, benefits from inlining, the SSA representation, and aliasing information too.

Status: NEW → RESOLVED
Closed: 3 months ago
Flags: needinfo?(nicolas.b.pierron)
Resolution: --- → WONTFIX