Closed
Bug 513190
Opened 15 years ago
Closed 15 years ago
avoiding jsint tagging of the private slot data
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta1-fixed |
People
(Reporter: igor, Assigned: igor)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 1 obsolete file)
86.86 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
With the bug 493457 and 511425 fixed it is no longer necessary to tag the data stored in the private slot as jsint. Instead the value can be stored there directly with direct cast to jsval avoiding setting and clearing the int bit. The GC can just check for that when marking object's slots. +++ This bug was initially created as a clone of Bug # +++ Currently SpiderMonkey sources use the following ways to access the private data pointer stored in the private slot: Reading a slot that can be not-yet-initialized: JS_GetPrivate(cx, obj); jsval v = STOBJ_GET_SLOT(obj, JSSLOT_PRIVATE); if (!JSVAL_IS_VOID(v)) p = JSVAL_TO_PRIVATE(v); Reading a previously initialized slot: OBJ_GET_PRIVATE(cx, obj); STOBJ_GET_PRIVATE(cx, obj); JSVAL_TO_PRIVATE(OBJ_GET_SLOT(cx, obj, JSSLOT_PRIVATE)) JSVAL_TO_PRIVATE(obj->fslots[JSSLOT_PRIVATE]) Writing: STOBJ_SET_SLOT(obj, JSSLOT_PRIVATE, JSVAL_TO_PRIVATE(data)); obj->fslots[JSSLOT_PRIVATE] = JSVAL_TO_PRIVATE(data); JS_SetPrivate(cx, obj, data); This suggests to consolidate all of these cases using few inline functions covering all the intended usage. It would not only improve the readability of the code but also reduce useless code bloat resulting from using function calls like JS_(Get|Set)Property for what is effectively a field access. Another source of the bloat is using OBJ_GET_SLOT(cx, obj, JSSLOT_PRIVATE) that adds a locking/unlocking code for accessing this fixed slot.
Comment 1•15 years ago
|
||
Closely related to that, XP connect, DOM and layout all implement their own lock-free variant of JS_SetPrivate. We should use a public API instead. JS_SetPrivateUnlocked or Immutable or something like that.
Comment 2•15 years ago
|
||
JS_InitPrivate to imply up-front one time setting? Don't want too long a name or it'll see less use than bad old JS_SetPrivate. /be
Comment 3•15 years ago
|
||
Init is not the problem. GetPrivate is the really issue. It requires a JSContext* that it never uses and DOM/XPConnect/content don't seem to have handy when calling it.
Comment 4•15 years ago
|
||
(In reply to comment #3) > Init is not the problem. GetPrivate is the really issue. It requires a > JSContext* that it never uses and DOM/XPConnect/content don't seem to have > handy when calling it. Right, I was bouncing off your comment 1. How about JS_Private(obj)? /be
obj->getPrivate() ?
Comment 6•15 years ago
|
||
Love #5. I take 3 of those. Brendan?
Comment 7•15 years ago
|
||
Comment 5 is already in jsobj.h:JSObject. Do try to keep up :-P. Problem is that jsobj.h is not public API. If we are just adding cx-free versions of JS_[GS]etPrivate to jsapi.h, then comment 1 and comment 3 suggest some names in keeping with existing API. If we want a C++ API, I contend we nevertheless do not want a fragile base class or a bunch of private stuff in jsobj.h to be visited upon all and sundry. Cc'ing Luke and Jason. /be
Just do a context-less overload, #ifdef CPLUSPLUS?
Comment 9•15 years ago
|
||
(In reply to comment #8) > Just do a context-less overload, #ifdef CPLUSPLUS? BRILLIANT! Also possibly confusing. I'm in favor. /be
Comment 10•15 years ago
|
||
We wanted to hide JS-engine internals anyway, so why not expose a different version of JSObject via JSAPI? I really would like to hide all the engine internals with hard #error guards against including anything but public headers. namespace JSPrivate { class JSObject; } namespace JSPublic { class JSObject { operator JSPrivate::JSObject() { union { JSPrivate::JSObject* priv; JSPublic::JSObject* pub; } u; u.priv = this; return u.pub; } JSObject* getPrivate() { return JS_GetPrivateUnlocked(this); } } } Something like this should work and JSPrivate::JSObject remains completely opaque. Its 2009. Lets use some C++ magic =)
(In reply to comment #9) > Also possibly confusing. I'm in favor. Classic! (This might also point the way towards JS_GetStringBytes(cx, str) overload to solve a problem that I dimly recall Andreas having a bit recently.)
Comment 12•15 years ago
|
||
Make it really confusing. Ifdef C++ then use a default parameter with cx = NULL. I think that works.
Comment 13•15 years ago
|
||
#error if you don't define the secret password? C'mon, this is all advisory and you can't stop people from hacking right into the most intimate source file. That does not mean we support all such "embedders". If we want a *real* C++ API, it shouldn't be jammed into jsapi.h. Shortest path says some additions to jsapi.h could win, but too many hops and we have made a bigger mess. Just one more hop? Heard that before... /be
Comment 14•15 years ago
|
||
Comment 10 is painfully overwrought (at least use an anonymous union :-P). The problem with embedders is not holding a "unsupported private API usage, use a public API" line, it's having decent public APIs. We exported all .h files in the ancient days to avoid starving those embedders who might be best situated to build better mousetraps without first trying to specify them for us so we could provide the blessed public API. That kind of astronaut architecture waterfall process never works. OTOH we didn't go out and harvest the real-world "embedding-side APIs" that were built from private and public parts. Not much, anyway (some of the ById and Iterator APIs, and a few others, come to mind). If we want a *real* C++ API, it shouldn't be jammed into jsapi.h. /be
Comment 15•15 years ago
|
||
Could see 1. Some small-effort/short-path evolution of jsapi.h 1a. C only, because embedders haven't had to swallow C++ yet for this header 1b. C++ with overloads as shaver says, to fix old parameterization bugs. 2. Big effort/win C++ API -- more to say later. 3. Both, with 1 fading away as 2 comes online. I think 1a is the better short-term course, on reflection. We should definitely newsgroup and wiki up proposals for 2. /be
Comment 16•15 years ago
|
||
I agree, 1a then 2. Design for 2 should follow GC design decisions. Applause for the actual change proposed in the description, too.
Assignee | ||
Comment 17•15 years ago
|
||
A simple hack to export inlined version of JSObject::getPrivate in jsapi.cpp without exposing jsobj.h would be something like this: #define JS_PRIVATE_OFFSET (4 * JS_BYTES_PER_WORD) static JS_INLINE JS_GetPrivate(JSObject *obj) { asserts; return (void *) ((uint8 *) obj + PRIVATE_OFFSET); } with a static assert in jsobj.h that validates JS_PRIVATE_OFFSET. This assumes that this bug is fixed.
Assignee: general → igor
Assignee | ||
Comment 18•15 years ago
|
||
The patch changes the private slot to store the private pointer directly and changes code accordingly. That required to add an extra check during object construction for the JSCLASS_HAS_PRIVATE flag to set JSSLOT_PRIVATE to either JSVAL_VOID or NULL. But at least with GCC 4.2 that initialization code is branch-free. In the patch I also added JSSLOT_PRIMITIVE_THIS as an alias for JSSLOT_PRIVATE to emphasize that Boolean, Number and String uses that slot to store the corresponding jsval, not the private value.
Attachment #397538 -
Flags: review?(jorendorff)
Comment 19•15 years ago
|
||
What is the advantage of this patch? Why should we make object creation slower (even if its just a little bit slower)? What is the problem with tagging that field?
Assignee | ||
Comment 20•15 years ago
|
||
(In reply to comment #19) > What is the advantage of this patch? Why should we make object creation slower > (even if its just a little bit slower)? What is the problem with tagging that > field? The problem is that the initialization of that field and its each read for HAS_PRIVATE classes right now requires a conditional check and bit manipulations. This means that reads like JSObject->JSFunction or JSObject->xpconnect_private or JSObject->ISupport is slower than necessary. On the other hand the cost of 3 extra branchless insructions can be avoided for all hot paths like object creation in the same way as it is avoided now in couple of places with explicit passing of the value for JSSLOT_PRIVATE. I will fail that in a separated bug together with a suggestion to skip initializing unused object's slots to void until their scope is grown.
Comment 21•15 years ago
|
||
Why does initialization of that field require a conditional check? The bit manipulation to set and get the field is completely hidden by the API overhead at the moment. Also, masking off that bit is pretty inexpensive. Do we have some profiles showing that getPrivate() is a performance problem at the moment? I am strongly opposed to adding a single cycle to the object creation path. Its already several times slower than where we want to be.
Comment 22•15 years ago
|
||
This shouldn't go in while Andreas is strongly opposed. Apart from that, this patch would get my r+. Andreas, the place I want to get to is: - we can subclass JSObject - objects can be different sizes and the GC can cope - object fields don't have to be jsvals and the GC can cope - we should never go groveling about for a parent or proto except in jsapi.cpp--the proto cache in bug 512195 works but isn't the final answer, imho - whatever we need to cache in Function objects etc. to make that happen, let's do it - each class we care about should come with an initialization function slim enough to call on trace, instead of js_NewObject having to be all things to all classes It seems to me igor's patch gets us a little closer to that eventual goal, and unless it regresses benchmarks we should take it.
Comment 23•15 years ago
|
||
Also, object creation at high rate, only to do nothing with the object, is stupid benchmarketing and we should not over-optimize for it. We certainly can and should separate (and inline where appropriate) code to init an object without private data, compared to one with. /be
Comment 24•15 years ago
|
||
Originally (1996 fall) objects were a pair {map, slots} and the GC blindly scanned all slots. Even then I could have made the GC check JSCLASS_HAS_PRIVATE and not interpret that slot. That would have been a good idea from the get-go, since it saves cycles in the GC and (most importantly) on every DOM get-private. That's the real workload to tune for, not a fake-o bazillion unused new objects benchmark. /be
Comment 25•15 years ago
|
||
Comment on attachment 397538 [details] [diff] [review] v1 The hacks in DOM and Caps make me queasy. Is caps really so fast that we can't afford a DSO call there? Is that code even correct anymore, given Block objects? I guess fixing this API leakage is another bug. r?gal for further performance review.
Attachment #397538 -
Flags: review?(jorendorff)
Attachment #397538 -
Flags: review?(gal)
Attachment #397538 -
Flags: review+
Comment 26•15 years ago
|
||
My only objection is that we are assuming that we are making something faster here, but there is no proof for that. Nobody has actually measured the speedup on every DOM get-private Brendan is referring to. Please measure before you optimize, otherwise we just add complexity and chase ghosts which has no benefit and costs complexity (which we already have more than enough of). whale:tmp gal$ cat x.cc unsigned priv = 12; struct Object { unsigned priv; unsigned* getPrivate() { return (unsigned*)priv; } }; Object o; int main() { o.priv = unsigned(&priv); unsigned x = 0; for (long l = 0; l < 1000000000; ++l) x += *o.getPrivate(); return 0; } whale:tmp gal$ cat x2.cc unsigned priv = 12; struct Object { unsigned priv; unsigned* getPrivate() { return (unsigned*)(priv & 0xfffffffe); } }; Object o; int main() { o.priv = unsigned(&priv) | 1; unsigned x = 0; for (long l = 0; l < 1000000000; ++l) x += *o.getPrivate(); return 0; } whale:tmp gal$ time ./x real 0m4.070s user 0m4.024s sys 0m0.013s whale:tmp gal$ time ./x2 real 0m4.066s user 0m4.026s sys 0m0.013s This benchmark is synthetic. If you don't believe it, I invite you to do better, but the bit masking is clearly completely hidden by the pointer fetch, even in case of a cache hit. Please note that on top of this the GetPrivate() API calls a non-inlineable function from the DOM side, which hides the pointer shuffling even further. That effect isn't even measured here. I don't think my review is needed for this patch, jorendorff is the expert for that code. I am just pointing out that we are optimizing a non-existing problem with added complexity and added object-creation cost.
Updated•15 years ago
|
Attachment #397538 -
Flags: review?(gal)
Comment 27•15 years ago
|
||
Heh, my original design was performance-neutral, at least. Andreas is right that by requiring a tag, it avoided a special case for JSCLASS_HAS_PRIVATE. Genius! :-) But Jason's on the right track. We want to move away from untyped jsval fslots if we can, for cleaner code. If this does not win perf, at least we get cleaner code. /be
Assignee | ||
Comment 28•15 years ago
|
||
(In reply to comment #26) > Nobody has actually measured the speedup > on every DOM get-private Brendan is referring to. In past quite a few cases of JS_GetPrivate were replaced by a custom inlined version of it precisely because JS_GetPrivate noticeably affected performance. > unsigned* getPrivate() { > return (unsigned*)(priv & 0xfffffffe); > } The bit manipulation is not the problem. The issue is an extra check to return NULL when the stored value is JSVAL_VOID. With modified test cases I have with GCC 4.3: ~/s> cat x.cc volatile unsigned priv = 12; struct Object { long priv; Object(volatile unsigned *ptr) : priv(long(ptr)) { } volatile unsigned* getPrivate() volatile { return (volatile unsigned *) priv; } }; volatile Object o(&priv); int main() { unsigned x = 0; for (long l = 0; l < 1000000000; ++l) x += *o.getPrivate(); return x; } ~/s> cat x2.cc volatile unsigned priv = 12; struct Object { long priv; Object(volatile unsigned *ptr) : priv(long(ptr) | 1L) { } volatile unsigned* getPrivate() volatile { // JSVAL_VOID is 22 return priv != 22 ? (volatile unsigned*)(priv & ~1L) : 0; } }; volatile Object o(&priv); int main() { unsigned x = 0; for (long l = 0; l < 1000000000; ++l) x += *o.getPrivate(); return x; } ~/s> time ./x real 0m2.814s user 0m2.778s sys 0m0.002s ~/s> time ./x2 real 0m4.626s user 0m4.603s sys 0m0.002s That is, the if check and bit manipulations slow dowm the test case by 70%.
Assignee | ||
Comment 29•15 years ago
|
||
The new version avoids JSVAL_VOID/JSVAL_NULL check for Object construcor called from the trace and unifies JSObject initialization behind the single init method.
Attachment #397538 -
Attachment is obsolete: true
Attachment #398424 -
Flags: review?(jorendorff)
Updated•15 years ago
|
Attachment #398424 -
Attachment mime type: application/octet-stream → text/plain
Updated•15 years ago
|
Attachment #398424 -
Attachment is patch: true
Comment 30•15 years ago
|
||
Comment on attachment 398424 [details] [diff] [review] v2 Even better.
Attachment #398424 -
Flags: review?(jorendorff) → review+
Assignee | ||
Comment 31•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/45772700955a
Whiteboard: fixed-in-tracemonkey
Assignee | ||
Updated•15 years ago
|
Flags: wanted1.9.2?
Comment 32•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/45772700955a
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 33•15 years ago
|
||
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/d38f890b7279
status1.9.2:
--- → beta1-fixed
Flags: wanted1.9.2? → wanted1.9.2+
You need to log in
before you can comment on or make changes to this bug.
Description
•