Closed Bug 545051 Opened 15 years ago Closed 15 years ago

TM: Survey tracing of closure variable access

Categories

(Core :: JavaScript Engine, defect)

Not set
normal

RESOLVED FIXED

People

(Reporter: dmandelin, Assigned: dmandelin)

The reason to do this is to figure out which fast paths are important and how important they are, so we know how hard to work to restore them after removing the callee parent guard (which enables those fast paths).
In v8, the only benchmark that has closure access that we trace is earley-boyer. There, we do 3144 closure accesses, and all of them use JSOP_NAME, call the builtin, and read from an interpreter frame. Thus, we are not using any fast paths there. But I think we could use bz's fast path to help us there with the extra guard I recommended over in bug 510554.
Can you instrument closure accesses from the interpreter too? /be
(In reply to comment #2)
> Can you instrument closure accesses from the interpreter too?

That was bug 543149. Is there something more you want to know that wasn't discovered there?
In SunSpider, 2 benchmarks do traced closure access:

tagcloud: 190 reads with NAME, calling the builtin and reading from an interpreter frame.
unpack: 59,753 reads with NAME, using the tracker.

unpack uses the tracker fast path, but its perf seemed unaffected by the parent guard removal patch. I'll retest that at some point. For unpack at least, the parent guard doesn't hurt us, so that's a case where we could emit that guard on demand in order to enable tracker use.
In Dromaeo, we have these closure variable accesses:

read
  builtin, interpreter frame        8,245
  bz callobj fast path          2,874,597
  tracker fast path               149,187
write
  bz callobj fast path          1,460,504

So both kinds of fast path look important here. We also have an opportunity to extend the callobj fast path to cover the other 8,245, and then we would be using fast paths for everything.
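To put the Dromaeo numbers in proportion, here is a small sketch that computes each read path's share of the total. The counts are copied from the comment above; the variable and function names are made up for illustration and are not part of any TraceMonkey instrumentation.

```python
# Dromaeo closure-variable access counts, copied from the comment above.
reads = {
    "builtin, interpreter frame": 8_245,
    "bz callobj fast path": 2_874_597,
    "tracker fast path": 149_187,
}
writes = {
    "bz callobj fast path": 1_460_504,
}

total_reads = sum(reads.values())


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


for path, count in reads.items():
    print(f"read  {path:<28} {count:>10,} {share(count, total_reads):6.2f}%")
for path, count in writes.items():
    print(f"write {path:<28} {count:>10,}")

# Share of reads that take no fast path today (the builtin ones).
slow_read_share = share(reads["builtin, interpreter frame"], total_reads)
```

The builtin reads come out well under 1% of all Dromaeo reads, which is why extending the callobj fast path to cover them would put essentially everything on a fast path.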
In the fluid simulator, we get 1,964,434 reads through the callobj fast path, and 45 through the builtin and interpreter frames.

In the go benchmark (bug 460050), we get 31,937 reads through the builtin and interpreter frames. We also get 31,937 sets through the builtin.

In the bug 504829 benchmark, we get 1,034,659 reads (in a partial run, I got sick of waiting) all through the tracker. In this test case we get slower as the test goes on because of the parent guard.
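For a rough overall picture, the read counts reported so far in this bug can be summed per access path. This is only a sketch: the table and the `reads_by_path` helper are illustrative, the benchmarks are not weighted in any way, and the bug 504829 figure is from a partial run, so the tracker total is a lower bound.

```python
from collections import defaultdict

# (benchmark, path, read count) as reported in the comments of this bug.
# The bug 504829 figure comes from a partial run, so it is a lower bound.
READS = [
    ("v8 earley-boyer", "builtin", 3_144),
    ("SunSpider tagcloud", "builtin", 190),
    ("SunSpider unpack", "tracker", 59_753),
    ("Dromaeo", "builtin", 8_245),
    ("Dromaeo", "callobj", 2_874_597),
    ("Dromaeo", "tracker", 149_187),
    ("fluid simulator", "callobj", 1_964_434),
    ("fluid simulator", "builtin", 45),
    ("go (bug 460050)", "builtin", 31_937),
    ("bug 504829 (partial run)", "tracker", 1_034_659),
]


def reads_by_path(rows):
    """Sum read counts per access path across all benchmarks."""
    totals = defaultdict(int)
    for _benchmark, path, count in rows:
        totals[path] += count
    return dict(totals)


totals = reads_by_path(READS)
# Print paths from most to least used.
for path, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{path:<8} {count:>10,}")
```

Writes are left out here because only Dromaeo and the go benchmark report them.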
Discussion:

As we already knew, the callobj fast paths are valuable. Fortunately, it is easy to preserve (and in fact extend) them without the parent guard simply by adding a LIR branch.

The tracker fast paths also get used, but they are more problematic. They can be easily preserved by adding back the parent guard on demand, but the test case in bug 504829 requires the parent guard to be absent in order to have stable perf. For these, I think I will start by disabling the fast paths and measuring the effect on the known important workloads that use them. If that hurts too much, then I will try to add back a restricted version that gives fairly good perf without the parent guard.

The go benchmark does sets through the builtin. I think it would be pretty hard to make that fast on trace with variations of our current technique, so I don't really want to worry about that just yet.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
(In reply to comment #3)
> (In reply to comment #2)
> > Can you instrument closure accesses from the interpreter too?
>
> That was bug 543149. Is there something more you want to know that wasn't
> discovered there?

Sorry, forgot about that bug (and the TM: in this one's summary).

Ok, flat closures matter in general but not always, and for v8 in particular, also for the go game, full closures are necessary to make things fast, and we want to do them via fast scope-chain navigation (interp and JIT). Do I have the right summary conclusions?

/be
(In reply to comment #8)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > Can you instrument closure accesses from the interpreter too?
> >
> > That was bug 543149. Is there something more you want to know that wasn't
> > discovered there?
>
> Sorry, forgot about that bug (and the TM: in this one's summary). Ok, flat
> closures matter in general but not always, and for v8 in particular, also for
> the go game, full closures are necessary to make things fast, and we want to do
> them via fast scope-chain navigation (interp and JIT). Do I have the right
> summary conclusions?

I think so. A couple of additional detail points:

- To really optimize the fast scope-chain navigation, we will want more info from the front end (e.g., the different kinds of heavyweights, speculative static binding even when there are intervening dynamic scopes).

- On trace, we seem to have most of the fast paths we need for now, but we do need to keep them. We may discover a need for new fast paths once the parent guard is removed.