Closed
Bug 463238
Opened 16 years ago
Closed 15 years ago
TM: Support calling arbitrary JSFastNatives from trace
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla1.9.1b4
People
(Reporter: jorendorff, Assigned: gal)
References
Details
(Keywords: fixed1.9.1, Whiteboard: fixed-in-tracemonkey)
Attachments
(2 files, 7 obsolete files)
18.33 KB,
patch
|
brendan
:
review+
|
Details | Diff | Splinter Review |
9.62 KB,
patch
|
brendan
:
review+
|
Details | Diff | Splinter Review |
Bug 462027 aims to make it safe to call arbitrary native code from trace. Once it's safe, we should trace fast natives. (And then getters and setters--separate bug--and then we can check to see if we're tracing the DOM already or what.)
Reporter | ||
Comment 1•16 years ago
|
||
I need someplace to build the vp vector. My (vague) plan is to put it on the JS stack, exactly where it would be if we were in the interpreter. As part of bug 462021, I'll pre-allocate all the JS stack space I could possibly need in js_ExecuteTree.
Reporter | ||
Comment 2•15 years ago
|
||
Here's what's left, to finish this. 1. I need to emit code to calculate, when calling a JSFastNative, the address in the interpreter stack that corresponds to the location of the arguments on the native stack. (It doesn't sound hard to me, but a very similar task took me days to get right in bug 462027. So.) 2. When we deep-bail from a JSFastNative, we LeaveTree in pre-call state. But we need to *not* clobber the interpreter stack locations corresponding to vp with values from the native stack. The native may have changed those values.[*] My plan for now is to add another fake tag next to JSVAL_BOXED and JSVAL_TNULL. (My theory is that we are not actually limited to 8 typemap values, but I haven't looked closely.) [*] It is in fact totally normal to do this; see e.g. https://developer.mozilla.org/En/SpiderMonkey/JSAPI_Reference/JS_ConvertArguments .
Reporter | ||
Comment 3•15 years ago
|
||
Task #2 was easy, but task #1 looks harder now. I can emit a constant for the offset of the vp array in the *current tree's* stack, but since we might be in a nested tree, we have to calculate the total offset from the callstack--information that is only available at run time. :-P
Reporter | ||
Comment 4•15 years ago
|
||
Do we want this feature? We can instead modify qsgen.py to generate traceable fast natives. That would run faster, too.
Comment 5•15 years ago
|
||
This is preferable for non-Firefox embedders, though, is it not? Not that that consideration should trump expediency....
Comment 6•15 years ago
|
||
Andreas and I were talking about a generic traceable native dispatcher that could call any fast native, boxing as needed and assuming fallibility. Better coverage, seems doable. Any reason why not? /be
Reporter | ||
Comment 7•15 years ago
|
||
Making qsgen.py emit real FASTCALL versions of the quick stubs would avoid a lot of unnecessary boxing and unboxing at run time. It might be significantly faster, or not. It's less general but probably general enough for our purposes, and it's closer to what we *really* want to do which is emit those FASTCALL functions using NanoJIT. In comments 1-3 I was trying too hard to preserve the invariant that when we're not on trace, the interpreter state is exactly what it would be if you had never gone on trace. The way you describe in comment 6 will, in the case of a deep bail, leave the vp vector (part of the interpreter stack) in the pre-call state (see comment 2 item #2). And that's OK.
Comment 8•15 years ago
|
||
I agree it's ok (making sure I'm following here, correct me if I'm confused) to let a native canonicalize args, store conversions in local argv roots, maybe even compute a fortune cookie and overwrite an incoming arg with it. Caveat native, and I don't know of any bad fortune cookie scenarios. Canonicalization and local GC rooting of conversions should happen once, the effects should "stick". Hope this is what you meant. /be
Reporter | ||
Comment 9•15 years ago
|
||
I don't know that it's worth puzzling out what I meant, beyond "gee, I wish I had thought of that a month or two ago". Anyway, if I can make getters and setters work, I'll go after this as Brendan suggests in comment 6. Note that it is not exactly how Ben Turner, Andreas, and I decided to go about it; but it might be easier.
Assignee | ||
Comment 10•15 years ago
|
||
#6 was intended as the fallback if no fast path is available. It might make sense to implement it first though, and then generate the FASTCALL versions.
Comment 11•15 years ago
|
||
We should have a generic fallback, and then make type-specific, method-specific traceable natives as needed. Otherwise the coverage problem will dog us forever, by which I mean we'll find an indefinite supply of benchmarks where we fall off trace and underperform, compared to falling back on a generic native dispatcher. /be
Assignee | ||
Comment 12•15 years ago
|
||
This bug will do 2 things 1) Generate some straight forward code on trace to call arbitrary JSFastNatives as a fallback if we don't find a matching traceable native. 2) Remove all *_tn implementations that are not clearly a performance win. For example, if a FASTCALL implementation merely wraps an existing fast native right now, its likely not a perf win and we can generate that wrapper code on trace instead.
Assignee: general → gal
Flags: blocking1.9.1?
Priority: -- → P1
Summary: TM: Call JSFastNatives from trace → TM: Support calling arbitrary JSFastNatives from trace
Assignee | ||
Comment 13•15 years ago
|
||
This patch reorganizes the code a bit so we can add the generic JSFastNative argument marshaling code.
Assignee | ||
Comment 14•15 years ago
|
||
x.js: for (var i = 0; i < 10; ++i) print(5.5); whale:src gal$ TRACEMONKEY=stats ./Darwin_DBG.OBJ/js -j x.js 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 recorder: started(1), aborted(0), completed(1), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(0), breaks(0), returns(0), unstableInnerCalls(0) monitor: triggered(1), exits(1), type mismatch(0), global mismatch(0) Generated code: mov eax,152(ebx) ebx(cx) esi(sp) edi(state) mov eax,60(eax) eax(ld2) ebx(cx) esi(sp) edi(state) mov 0(esi),2737528 eax(ld3) ebx(cx) esi(sp) edi(state) mov 8(esi),eax eax(ld3) ebx(cx) esi(sp) edi(state) mov -16(ebp),0 eax(ld3) ebx(cx) esi(sp) edi(state) mov -12(ebp),1075183616 eax(ld3) ebx(cx) esi(sp) edi(state) movq xmm0,-16(ebp) eax(ld3) ebx(cx) esi(sp) edi(state) mov 16(esi),0 eax(ld3) ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) mov 20(esi),1075183616 eax(ld3) ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) mov -16(ebp),2737528 eax(ld3) ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) mov -12(ebp),eax eax(ld3) ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) sub esp,8 ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) sub esp,8 ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) movq 0(esp),xmm0 ebx(cx) esi(sp) edi(state) xmm0(#0x40160000:0) mov ecx,ebx ebx(cx) esi(sp) edi(state) call js_BoxDouble ebx(cx) esi(sp) edi(state) add esp,8 ebx(cx) esi(sp) edi(state) mov ecx,eax eax(js_BoxDouble1) ebx(cx) esi(sp) edi(state) lea eax,-16(ebp) ebx(cx) esi(sp) edi(state) cmp ecx,16 eax(alloc1) ecx(js_BoxDouble1) ebx(cx) esi(sp) edi(state) je 0x2a7fcd eax(alloc1) ecx(js_BoxDouble1) ebx(cx) esi(sp) edi(state) mov -8(ebp),ecx eax(alloc1) ecx(js_BoxDouble1) ebx(cx) esi(sp) edi(state) mov 480(ebx),2490708 eax(alloc1) ebx(cx) esi(sp) edi(state) sub esp,4 eax(alloc1) ebx(cx) esi(sp) edi(state) push eax eax(alloc1) ebx(cx) esi(sp) edi(state) push 1 ebx(cx) esi(sp) edi(state) push ebx ebx(cx) esi(sp) edi(state) call JSFastNative ebx(cx) esi(sp) edi(state) add esp,16 ebx(cx) esi(sp) edi(state) mov ecx,edi ebx(cx) esi(sp) edi(state) mov 0(esi),eax eax(JSFastNative1) ecx(state) ebx(cx) esi(sp) mov 480(ebx),0 ecx(state) ebx(cx) esi(sp) mov ebx,484(ebx) ecx(state) ebx(cx) esi(sp) test ebx,ebx ecx(state) ebx(ld5) esi(sp) jne 0x2a7fdc ecx(state) esi(sp) We crash somewhere half-way through trace-tests. The patch is by no means complete, but it definitively works as advertised and looks very promising. We will have to compare performance between fastcall quickstubs and marshaling with this patch. It might still be worth it. On the other hand only doubles are really expensive to box, and that DOM always never needs. This patch does not yet remove any existing _tn wrappers. I will do that in a separate patch once everything is done and landed, and only if I can prove that its not a perf loss (I suspect it might be a small perf win, actually).
Attachment #367964 -
Attachment is obsolete: true
Assignee | ||
Comment 15•15 years ago
|
||
Note: I only intend to remove _tn wrappers that are not actually a fast path and were merely added "for the sake of staying on trace". That we can much better with this patch. The goal is to only code special cases for frequently executed relevant fast paths. Everything else we fall back on this. Note 2: Everything except constructors and eval are fast natives, so this really traces almost everything we possibly care about. Constructors still need work.
Assignee | ||
Comment 16•15 years ago
|
||
v3 adds proper unboxing of the return value and the boolean return value of the fast native is propagated into builtinStatus. This runs SS, and a couple other test cases but still dies in trace-tests.js. Beyond that we also need some further work on: - converting Object arguments into strings using an imacro where needed (some string functions need this, we can't do this from inside the native on trace) - deal with the funny String object vs JSString business. This is simply untested but shouldn't be hard. More work on this tomorrow.
Attachment #367974 -
Attachment is obsolete: true
Updated•15 years ago
|
Flags: wanted1.9.1+
Flags: blocking1.9.1?
Flags: blocking1.9.1-
Assignee | ||
Comment 17•15 years ago
|
||
Sayrer is right. Not blocking 1.9.1. I meant to ask for wanted 1.9.1, which is what I got.
tracking-fennec: --- → ?
Assignee | ||
Comment 18•15 years ago
|
||
The patch dies on this code: function testObjectOrderedCmp() { var a = new Array(5); for(var i=0;i<5;++i) a[i] = ({} < {}); return a.join(","); } testObjectOrderedCmp(); whale:src gal$ ./Darwin_DBG.OBJ/js -j x.js x.js:4: TypeError: Function.prototype.toString called on incompatible object whale:src gal$
Assignee | ||
Comment 19•15 years ago
|
||
Passes trace-tests.
Attachment #367980 -
Attachment is obsolete: true
Comment on attachment 368166 [details] [diff] [review] v4 So this breaks at least one page: http://www.chromeexperiments.com/hosted/gravity/index.html Mousing over the page triggers the gravity effect without this patch but there are lots of 'undefined function' errors in the console when this patch is applied.
Comment 21•15 years ago
|
||
For what it's worth, Ben Turner is working hard on tracing DOM access: https://bugzilla.mozilla.org/show_bug.cgi?id=trace-into-dom and is going through contortions to get the right #defines necessary to include jsbuiltins.h outside of js/src. So a proper solution here would be appreciated and useful immediately.
Assignee | ||
Comment 22•15 years ago
|
||
Attachment #368166 -
Attachment is obsolete: true
Assignee | ||
Comment 23•15 years ago
|
||
Attachment #368343 -
Attachment is obsolete: true
Assignee | ||
Comment 25•15 years ago
|
||
Comment on attachment 368422 [details] [diff] [review] v5, fix return value processing Passes all mochitests!
Assignee | ||
Comment 27•15 years ago
|
||
Passes mochi tests. The traceable quickstubs are already in, so we already have the same bug surface for the DOM side. This patch adds full support for all traceable natives. The upside is no more perf nightmares due to missing natives in the engine. The downside will be mostly limited to engine builtins going wrong as DOM going wrong the traceable quickstubs already bring into play. In other words I think we should take this for b4. Outstanding issues: - trace setters/getters (WIP, easier than this, shouldn't increase risk since the general risk is calling all this stuff in general, a bit more due to setters/getters shouldn't create more hazards) - avoid side-exiting by doing object2string and defaultvalue for object arguments early (not a priority I think, I don't see many aborts due to this)
Flags: blocking1.9.1- → blocking1.9.1?
Target Milestone: --- → mozilla1.9.1b4
Comment on attachment 368422 [details] [diff] [review] v5, fix return value processing FYI in order to build browser this patch needs to remove JSTraceableNative from jsprvtd.h as it was added to jspubtd.h
Assignee | ||
Comment 29•15 years ago
|
||
Attachment #368422 -
Attachment is obsolete: true
Attachment #368627 -
Flags: review?(brendan)
Assignee | ||
Comment 30•15 years ago
|
||
Attachment #368455 -
Attachment is obsolete: true
Attachment #368629 -
Flags: review?(brendan)
Comment 31•15 years ago
|
||
Comment on attachment 368627 [details] [diff] [review] refreshed fast native patch > #define JS_FS(name,call,nargs,flags,extra) \ >- {name, call, nargs, flags, extra} >+ {name, call, nargs, flags, extra } Undo gratuitous change. >+typedef struct JSCallInfo { >+ JSWord callinfo[2]; >+#ifdef DEBUG >+ JSWord debug; >+#endif >+} JSCallInfo; s/JSWord/jsword/g -- don't use JSWord, JSInt32, JSUint64, etc. Could try to use same as Nanojit here: uintptr_t? >+ this->generatedTraceableNative = new JSTraceableNative(); Why does this need to be new'ed and delete'd? Why not a struct member? >+ case FAIL_NEG: >+ { >+ res_ins = lir->ins1(LIR_i2f, res_ins); >+ guard(false, lir->ins2(LIR_flt, res_ins, lir->insImmq(0)), OOM_EXIT); >+ break; >+ } Now that this code moves and starts new blame history, please remove the unnecessary braces. >+ jsval& fval = stackval(0 - (2 + argc)); >+ jsval& tval = stackval(0 - (argc + 1)); 0 - (1 + argc) to match style above and previously. >+TraceRecorder::callNative(JSFunction* fun, uintN argc, bool constructing) >+{ >+ if (fun->flags & JSFUN_TRACEABLE) >+ if (callTraceableNative(fun, argc, constructing)) >+ return true; Brace multi-line then (outer then). >+ alloca_ins = lir->insAlloc((2 + argc) * sizeof(jsval)); Nit: is alloca name appropriate here? args_ins? argv_ins? vp_ins? Maybe invokevp_ins. >+ CallInfo* ci = (CallInfo*)lir->skip(sizeof(struct CallInfo))->payload(); Nit: space after cast for readability. >+ jsval& tval = stackval(0 - (argc + 1)); 0 - (1 + argc) >+ JS_ASSERT(*cx->fp->regs->pc == JSOP_CALL || >+ *cx->fp->regs->pc == JSOP_APPLY); Is js_GetOpcode required here? I guess so. Elsewhere to? /be
Comment 32•15 years ago
|
||
Comment on attachment 368629 [details] [diff] [review] remove some traceable natives from jsstr.cpp Wunderbar! /be
Attachment #368629 -
Flags: review?(brendan) → review+
Assignee | ||
Comment 33•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/1c6be1c210b9 http://hg.mozilla.org/tracemonkey/rev/4c0e77401ac2
Whiteboard: fixed-in-tracemonkey
Comment 34•15 years ago
|
||
(In reply to comment #31) > >+ this->generatedTraceableNative = new JSTraceableNative(); > > Why does this need to be new'ed and delete'd? Why not a struct member? This comment was overlooked, waahhh. /be
Assignee | ||
Comment 35•15 years ago
|
||
Was not overlooked. Include dependencies. Callinfo not visible can only use the pointer in .h
Comment 36•15 years ago
|
||
That's fine: JSCallInfo has only word members, but these can hold pointers into the TraceRecorder at the generatedTraceableNative member-struct. My point is that there's no cause for dynamic allocation of store pointed at by the current pointer typed member of that same name. /be
Comment 37•15 years ago
|
||
Comment on attachment 368627 [details] [diff] [review] refreshed fast native patch Thought I r+'ed this already. /be
Attachment #368627 -
Flags: review?(brendan) → review+
Comment 38•15 years ago
|
||
(In reply to comment #33) > http://hg.mozilla.org/tracemonkey/rev/1c6be1c210b9 > http://hg.mozilla.org/tracemonkey/rev/4c0e77401ac2 This broke ARM tinderbox.
Comment 39•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/1c6be1c210b9
Status: NEW → RESOLVED
Closed: 15 years ago
Flags: blocking1.9.1? → blocking1.9.1+
Resolution: --- → FIXED
Reporter | ||
Comment 40•15 years ago
|
||
I think this has a GC hazard. The JSFastNative being called can modify vp in place, deep-bail, and then GC.
Comment 41•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/2a858cc9b051
Keywords: fixed1.9.1
Updated•10 years ago
|
tracking-fennec: ? → ---
You need to log in
before you can comment on or make changes to this bug.
Description
•