Closed Bug 772329 Opened 12 years ago Closed 1 month ago

Optimize nested property fetches where possible

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: kael, Unassigned)

References

(Blocks 1 open bug, )

Details

(Whiteboard: [js:t])

V8 has a useful optimization for nested property fetches, like:

X.Y.Z.W

I'm not entirely sure how it's implemented, but in cases where the properties do not change frequently, the cost of the deeply nested fetch is nearly equivalent to the cost of simply fetching X. This is really really helpful for generated code that uses nesting to reproduce object models (like basically every line of code my translator JSIL generates).

In the case where the JS is being generated by a translator, it could probably locate nested fetches like the above and cache them in advance, but that will eat up valuable local variable slots and potentially incur additional overhead beyond that (for example, having to pre-fetch properties that are only used in certain branched code paths).

Here is a simple test case:
http://jsperf.com/deeply-nested-property-fetches/2

This is a likely contributor to terrible performance for all JSIL test cases in FF (hard to know for sure until JS profiling lands).
That looks like LICM (loop invariant code motion) which will turn

  for (...)
     X.Y.Z.W

into

  t = X.Y.Z
  for (...)
     t.Z

What do you see on IM (which has LICM)?
What flags would I use to force-enable IM? I tried this in the latest Nightly and it's still very slow there. Is there a set of flags I could pass to a modern js.exe build to test out IM's implementation of LICM to see if it's faster?
Since my test case actually would have been optimized by *either* really smart LICM or optimized nested fetches, I tried to add a second set of cases that is less likely to get optimized by LICM (I think it would need a combination of inlining and smart LICM, which is much less likely):

http://jsperf.com/deeply-nested-property-fetches/3

It is still blazing fast in Chrome. Unfortunately, it's still possible that this is just really awesome LICM - as long as the benchmark calls a pure function in a loop, I suppose it's possible for them to optimize everything down to 'if (constant !== constant)', which is basically instantaneous. Any suggestions on how to make this test less likely to be inaccurate would be great.

I've seen claims that V8 optimizes for the case of chained property fetches like this, specifically because they show up really often in JS code that's namespaced/abstracted. But I don't know offhand how you would optimize for it in the function call case (where the nested fetch only occurs say, once, in a given scope) - maybe you have a native trampoline for doing chained fetches instead of calling GetProp repeatedly on its own result, and that gets you most of the win? It may also be that they optimize for the best case, where the entire chained fetch is successful, and as a result end up deoptimized horribly in the failure case but it doesn't matter because it's rare?
(In reply to Kevin Gadd (:kael) from comment #3)
Chrome (and IM) do inlining early in the pipeline so this still looks like LICM.

To try out IM, check out http://hg.mozilla.org/projects/ionmonkey.  dvander: is IM on by default in content in the browser?
Blocks: WebJSPerf
Whiteboard: [js:t]
(In reply to Kevin Gadd (:kael) from comment #3)
> http://jsperf.com/deeply-nested-property-fetches/3

Noticed this was much slower in FF 18 and assumed it was IM, but FF 17 has a similar regression…
Blocks: 885526
Blocks: JSIL
Just checked and v8 is just doing LICM here and that's why the times are the same.

We fail to LICM it, because our current heurstic decide that the "new Error()" could potentially override the values of X.Y.Z.W. So we should reload the value every time. This is caused due to our Alias Analysis not being aware of the graph structure.

There are two possible solutions:
1) bug 844779: Improve our rpo graph. That way AA won't see the the contents of the "if" as part of the loop.
2) Teach Alias Analysis about graph structure.

(Applying the patch from bug 844779 improves score to 110ms, as fast as d8)
Depends on: 844779
Assignee: general → nobody
Severity: normal → S3

the link to jsperf does not work anymore.
Bryan, should this bug be closed?

Flags: needinfo?(bthrall)

Reading through the comments:

  1. The title is misleading, since it's actually just a question of having sufficiently precise alias analysis to enable LICM.
  2. It looks like our alias analysis wasn't flow-sensitive when this was opened, but comment 6 implies that improving our AA fixed the perf gap.

Should be safe to close this.

Status: NEW → RESOLVED
Closed: 1 month ago
Flags: needinfo?(bthrall)
Resolution: --- → WORKSFORME

FWIW, flow sensitive alias analysis was implemented in bug 1255008, but was never enabled. It was removed in bug 1455280

You need to log in before you can comment on or make changes to this bug.