Closed
Bug 586533
Opened 14 years ago
Closed 14 years ago
TM: Flag rarely used JSStackFrame members
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: bhackett1024, Assigned: bhackett1024)
References
Details
Attachments
(6 files, 5 obsolete files)
13.48 KB,
patch
|
Details | Diff | Splinter Review | |
37.76 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
75.14 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
15.30 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
168.28 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
49.06 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
It would be nice to flag JSStackFrame members rarely used in JIT code, stubs and natives so that they can be left uninitialized during frame construction. Ones I see (naively, probably wrong on some of these) include imacpc (covered by bug 586358), callobj, argsobj, script, function, annotation, scopeChain, blockChain, hookData, maybe thisv and rval. Attached patch covers callobj and argsobj, passes trace-tests and a couple jstests failures. I doubt this will improve benchmarks for TM much; JM constructs more frames, and the goal is to simplify frame construction enough that work can start on inlining frame construction and things it is blocking (scripted call ICs), without waiting for frame evisceration to finish. All these writes close together in frame construction are expensive; microbenchmarking C programs (large grain of salt) for 1.2 million calls times 10 writes is 15-20ms on my machine.
Assignee | ||
Comment 1•14 years ago
|
||
This patch gets the same failures for me as clean TM.
Attachment #465056 -
Attachment is obsolete: true
Attachment #465099 -
Flags: review?(lw)
Comment 2•14 years ago
|
||
Out of curiosity, does this have a measurable effect on TM/JM? As I was saying earlier today, I have already made a good number of these changes in my WIP argv/argc removal patch, so I'd like to make a few syntactic requests: - Spelling out ArgumentsObject and CallObject is a is a bit long. Perhaps just getArgsObj/getCallObj()? - Could getCallObj()/getArgsObj() return a JSObject&, since its non-null? We're starting to do this more. - Also, for returning a nullable object, we tend to use 'maybe'. So perhaps getArgumentsObjectOptional could just be maybeArgsObj, and the same for maybeCallObj. More substantial comments: - I don't think this syncForTrace() is the right strategy to deal with argsobj/callobj and the tracer. It looks like callobj is only accessed in a single case in setCallProp that can be changed to interpreted JSStackFrame::flags appropriately, and argsobj isn't accessed from trace at all, but rather just during compilation and when entering/leaving trace. (We can talk about these in more detail if you like.) But basically, I don't think it should be necessary to introduce this concept of 'syncing' a stack frame or littering calls to it in various places. - in js_NewGenerator, I think newfp->flags = ... will clobber the flag info set by setCallObject above. I think the call/args obj stealing can be moved below setting the flags. With this there is no need to clearCallObject() in the else branch.
Comment 3•14 years ago
|
||
bhackett: cdleary added assertions like this in his imacpc patch: + JS_ASSERT(!fp->hasIMacroPC()); everywhere that a "fp->imacpc = NULL;" assignment was removed. (Well, the assert had to be moved after fp->flags was assigned.) I think you should do the same for callobj and argsobj.
Comment 4•14 years ago
|
||
Here's a patch doing this for annotations. I wrote the patch before I saw this bug, it'll conflict horribly, sorry. It'll also conflict with cdleary's imacpc patch (which is against TM instead of JM, just to confuse things further). I tried to rebase cdleary's patch against JM and apply my patch on top of it, but had trace-test failures with cdleary's patch, I don't know why. But at least it shows that you can do this easily for annotations. I measured the change, it's a 0.1% instruction reduction for JM on SS. To avoid us all stepping on each other's toes, one person should do all of these changes, and to a single repo (cdleary's patch hasn't landed yet, so it could still be rebased to JM if necessary). bhackett, cdleary, if either of you want to take it that's fine, or I can do it if you have other useful stuff to work on. I'm about to sign off for today, let me know what you decide.
Comment 5•14 years ago
|
||
Why shouldn't this stuff land on tracemonkey first, to minimize JM/TM diff to the changes essential to JM? /be
Assignee | ||
Comment 6•14 years ago
|
||
(In reply to comment #2) > Out of curiosity, does this have a measurable effect on TM/JM? > > As I was saying earlier today, I have already made a good number of these > changes in my WIP argv/argc removal patch, so I'd like to make a few syntactic > requests: > - Spelling out ArgumentsObject and CallObject is a is a bit long. Perhaps just > getArgsObj/getCallObj()? > - Could getCallObj()/getArgsObj() return a JSObject&, since its non-null? > We're starting to do this more. > - Also, for returning a nullable object, we tend to use 'maybe'. So perhaps > getArgumentsObjectOptional could just be maybeArgsObj, and the same for > maybeCallObj. > > More substantial comments: > - I don't think this syncForTrace() is the right strategy to deal with > argsobj/callobj and the tracer. It looks like callobj is only accessed in a > single case in setCallProp that can be changed to interpreted > JSStackFrame::flags appropriately, and argsobj isn't accessed from trace at > all, but rather just during compilation and when entering/leaving trace. (We > can talk about these in more detail if you like.) But basically, I don't think > it should be necessary to introduce this concept of 'syncing' a stack frame or > littering calls to it in various places. > - in js_NewGenerator, I think newfp->flags = ... will clobber the flag info set > by setCallObject above. I think the call/args obj stealing can be moved below > setting the flags. With this there is no need to clearCallObject() in the else > branch. I'll make the API changes you suggest. I think the syncForTrace is pretty nasty too; it's in there because if it's not then the tracer breaks, and I don't know how to fix the tracer, even which path leads to the errors (random assertion failures in jsrecursion.cpp, corruption elsewhere). The tracer is taking the address of argsobj for specific frames, but it also seems to generate code which accesses that offset for an unknown frame, and even with TMFLAGS=full I don't know what happens to these pointers. Alternative: make patches that just hide the fields in JSStackFrame behind an interface (get/setCallObj etc.) but doesn't change the semantics of the system at all. Land on TM. Then, either in TM or JM, make local modifications to jsinterp.h and code constructing frames that gets the intended behavior (and sleeks down frame construction), for experimentation/measuring. Would this conflict with your argc/argv patch? This will be quick to put together and I'm happy to do it.
Assignee | ||
Comment 7•14 years ago
|
||
This patch flags blockChain and scopeChain, passes trace-tests / jstests.
Assignee | ||
Comment 8•14 years ago
|
||
This patch hides callobj and argsobj behind an interface, and does not change the semantics of accessing them.
Attachment #465099 -
Attachment is obsolete: true
Attachment #465375 -
Flags: review?(lw)
Attachment #465099 -
Flags: review?(lw)
Comment 9•14 years ago
|
||
Comment on attachment 465375 [details] [diff] [review] patch for callobj/argsobj preserving semantics Awesome, thanks!
Attachment #465375 -
Flags: review?(lw) → review+
Assignee | ||
Comment 10•14 years ago
|
||
This patch hides scopeChain and blockChain behind an interface, and does not change the semantics of accessing them.
Attachment #465375 -
Attachment is obsolete: true
Attachment #465410 -
Flags: review?(lw)
Assignee | ||
Updated•14 years ago
|
Attachment #465375 -
Attachment is obsolete: false
Assignee | ||
Updated•14 years ago
|
Attachment #465347 -
Attachment is obsolete: true
Assignee | ||
Comment 11•14 years ago
|
||
(In reply to comment #8) > Created attachment 465375 [details] [diff] [review] http://hg.mozilla.org/tracemonkey/rev/d8e6eff0f77e
Comment 12•14 years ago
|
||
(In reply to comment #5) > Why shouldn't this stuff land on tracemonkey first, to minimize JM/TM diff to > the changes essential to JM? My motivation in doing the annotation patch against JM was to measure the effect. But landing on TM first is a good idea.
Comment 13•14 years ago
|
||
bhackett, I'll take your addition of new patches as indicating that you're happy to do all the work for this bug. Please reply if I've misinterpreted.
Assignee | ||
Comment 14•14 years ago
|
||
Yes, I'm happy to work on this bug. My biggest interest right now is scripted calls, the rough plan is: - Abstract away JSStackFrame internals with access methods (TM). - Replace with new internals that initialize most fields lazily (TM or JM). - Get fast path and ICs working for scripted calls/new with lazy stack frame. This should be pretty close to what is needed once stack frame evisceration finishes.
Comment 15•14 years ago
|
||
(In reply to comment #14) That all sounds good. How does the hoisting-ensureSpace check and merging-with-inlineCallCount-check fit in to this plan?
Updated•14 years ago
|
Attachment #465410 -
Flags: review?(lw) → review+
Assignee | ||
Comment 16•14 years ago
|
||
(In reply to comment #15) > (In reply to comment #14) > That all sounds good. How does the hoisting-ensureSpace check and > merging-with-inlineCallCount-check fit in to this plan? Those are in the fairly open-ended third part, optimizations to do once the call fast path is in place and the big wad of frame writes is (mostly) removed.
Comment 17•14 years ago
|
||
this was checked in, but I backed it out, since it didn't compile on any platform. kind of a dealbreaker.
Assignee | ||
Comment 18•14 years ago
|
||
This builds on OSX x64 and passes trace-tests, jstests, xpcshell-tests, and jsreftests.
Attachment #465410 -
Attachment is obsolete: true
Attachment #465641 -
Flags: review?(lw)
Assignee | ||
Comment 19•14 years ago
|
||
This patch hides the JSStackFrame members annotation, debugHook and callerVersion behind an interface, and does not change the semantics of accessing them.
Attachment #465664 -
Flags: review?(lw)
Assignee | ||
Comment 20•14 years ago
|
||
This patch hides the JSStackFrame members fun, script, thisv and rval behind an interface, and does not change the semantics of accessing them.
Attachment #465665 -
Flags: review?(lw)
Comment 21•14 years ago
|
||
Comment on attachment 465641 [details] [diff] [review] scopeChain/blockChain build break fix >@@ -5666,24 +5666,17 @@ SynthesizeFrame(JSContext* cx, const Fra >- fp->blockChain = fi.block; >- fp->blockChain = fi.block; >-#ifdef DEBUG >- if (fi.block != fp->blockChain) { >- for (JSObject* obj = fi.block; obj != fp->blockChain; obj = obj->getParent()) >- JS_ASSERT(obj); >- } >-#endif lol.
Attachment #465641 -
Flags: review?(lw) → review+
Updated•14 years ago
|
Attachment #465664 -
Flags: review?(lw) → review+
Comment 22•14 years ago
|
||
Comment on attachment 465665 [details] [diff] [review] patch for fun/script/thisv/rval preserving semantics Phew! This will also make the removal patch of these members smaller, so thanks. >-JS_STATIC_ASSERT(offsetof(JSStackFrame, rval) % sizeof(Value) == 0); >-JS_STATIC_ASSERT(offsetof(JSStackFrame, thisv) % sizeof(Value) == 0); >+// FIXME >+//JS_STATIC_ASSERT(offsetof(JSStackFrame, rval) % sizeof(Value) == 0); >+//JS_STATIC_ASSERT(offsetof(JSStackFrame, thisv) % sizeof(Value) == 0); It would be nice to keep these on as things get shuffled around in the stack frame. JS_STATIC_ASSERT/offsetof have all sorts of annoying restrictions when used in class context; one thing I've found that works is: struct JSStackFrame { ... void staticAsserts() { JS_STATIC_ASSERT(offsetof(JSStackFrame, rval) % sizeof(Value) == 0); ... } }; r+ with that and assuming benchmarks don't show the compiler doing anything silly like not inlining.
Attachment #465665 -
Flags: review?(lw) → review+
Assignee | ||
Comment 23•14 years ago
|
||
(In reply to comment #18) > Created attachment 465641 [details] [diff] [review] > scopeChain/blockChain build break fix > > This builds on OSX x64 and passes trace-tests, jstests, xpcshell-tests, and > jsreftests. http://hg.mozilla.org/tracemonkey/rev/b630d28840ea
Assignee | ||
Comment 24•14 years ago
|
||
(In reply to comment #19) > Created attachment 465664 [details] [diff] [review] > patch for annotation/debugHook/callerVersion preserving semantics > > This patch hides the JSStackFrame members annotation, debugHook and > callerVersion behind an interface, and does not change the semantics of > accessing them. http://hg.mozilla.org/tracemonkey/rev/0bc3ab72c26b
Assignee | ||
Comment 25•14 years ago
|
||
(In reply to comment #20) > Created attachment 465665 [details] [diff] [review] > patch for fun/script/thisv/rval preserving semantics > > This patch hides the JSStackFrame members fun, script, thisv and rval behind an > interface, and does not change the semantics of accessing them. http://hg.mozilla.org/tracemonkey/rev/13e72485d286
Comment 26•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/d8e6eff0f77e
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 27•14 years ago
|
||
One more lazifiable field. savedPC will also be lazified for JM (compute using ncode), but it's pretty much private to JSStackFrame/JSContext as is.
Assignee: general → bhackett1024
Status: RESOLVED → REOPENED
Attachment #467439 -
Flags: review?(lw)
Resolution: FIXED → ---
Comment 28•14 years ago
|
||
Comment on attachment 467439 [details] [diff] [review] patch for argc preserving semantics The argv-removal patch has: Value *formalArgs(); Value *actualArgs(); (which are disjoint sets in the case of argc > fun->nargs). To parallel those nice, could you s/getArgCount/numActualArgs/ and s/getFormalCount/numFormalArgs/ ? (The XCount to numX change is just a personal preference to hear up front, when reading the identifier in some bigger expression, that this is a number, and not the array.)
Assignee | ||
Comment 29•14 years ago
|
||
Fixes for comment 28.
Attachment #467439 -
Attachment is obsolete: true
Attachment #467449 -
Flags: review?(lw)
Attachment #467439 -
Flags: review?(lw)
Comment 30•14 years ago
|
||
Comment on attachment 467449 [details] [diff] [review] revised argc patch Thanks! Also, I forgot, for consistency: s/setArgCount/setNumActualArgs/
Attachment #467449 -
Flags: review?(lw) → review+
Comment 31•14 years ago
|
||
Style guide should say we prefer noun phrases (empty not isEmpty) for infallible getters and predicates, but there's obviously a mix in the current source. That's another reason to prefer numFormalArgs to getArgCount. The formal vs. actual distinction is crisp, better still than trying to use param for formal and arg for actual. /be
Comment 32•14 years ago
|
||
Random vaguely-related comment. If you're interested in measuring which offsets of which objects see a lot of action and which don't (iow, finding hot vs cold fields) by observing jsengine as it runs, the DHAT tool in bug 576861 might be a good starting point. It can't currently measure exactly that, but it sounds to me feasible to extend it to do so.
Assignee | ||
Comment 33•14 years ago
|
||
A related property is which fields have far more writes than reads, which makes them good candidates for lazifying. It's not too hard to do this for specific objects (JSStackFrame, JSContext), but an analysis that could extend this to the rest of JS and the browser would be great.
Comment 34•14 years ago
|
||
(In reply to comment #32) Julian, that sounds like an awesome tool; have you considered applying it to the fields of JSContext?
Comment 35•14 years ago
|
||
(In reply to comment #34) > have you considered applying it to the fields of JSContext? At the moment it can only map memory accesses to specific blocks, but it doesn't build up a histogram, for each allocation stack, of the number of times each offset from the start of the block is accessed. If you see what I mean. And that histogram is needed to find what you want, viz, offsets which are never or very rarely accessed. Probably doable in a day's hacking tho.
Comment 36•14 years ago
|
||
(In reply to comment #35) > Probably doable in a day's hacking tho. That could really pay off: JSContext is large and I would think only a small subset of the fields are hot. A related valgrind tool that keeps coming up in discussions is one that counts unaligned loads/stores. We've had at least 3 cases of serious perf regressions from this. I wouldn't be surprised if such a tool uncovered a few more. Also, we could use it to track the overall number and catch such regressions automatically.
Comment 37•14 years ago
|
||
See also the static analysis (complementary, want both) at bug 492185. FYI, /be
Status: REOPENED → ASSIGNED
Comment 38•14 years ago
|
||
(In reply to comment #36) > > A related valgrind tool that keeps coming up in discussions is one that counts > unaligned loads/stores. We've had at least 3 cases of serious perf regressions > from this. I wouldn't be surprised if such a tool uncovered a few more. Also, > we could use it to track the overall number and catch such regressions > automatically. Here's one I prepared earlier: bug 476122. I haven't run it since I wrote it 18 months ago, but there's a good chance it'll work out of the box.
Comment 39•14 years ago
|
||
> (In reply to comment #35) > > Probably doable in a day's hacking tho. > > That could really pay off: JSContext is large and I would think only a small > subset of the fields are hot. Working on it now. > (In reply to comment #37) > See also the static analysis (complementary, want both) at bug 492185. FYI, Will need some kind of static support for full automation, since at best the tool can produce a histogram of number of memory references per-block-offset for each allocation stack. But it doesn't know the C++ type of any (heap allocated) object since that information is not present in Dwarf3. Maybe Taras has some magic trick that can help. In the short term we can work around this merely by peering at Fx sources to see the type of object allocated at each allocation point.
Comment 40•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/76ede469d9a3
Status: ASSIGNED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 41•14 years ago
|
||
This can be marked FIXED, right?
Updated•14 years ago
|
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•