Closed
Bug 365851
(js-propcache)
Opened 18 years ago
Closed 17 years ago
get/set/call optimizations (return of the property cache)
Categories
(Core :: JavaScript Engine, defect, P1)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla1.9beta4
People
(Reporter: brendan, Assigned: brendan)
References
Details
(Keywords: perf)
Attachments
(5 files, 22 obsolete files)
27.68 KB,
patch
|
Details | Diff | Splinter Review | |
44.64 KB,
application/x-tar
|
Details | |
20.16 KB,
patch
|
igor
:
review+
brendan
:
approval1.9+
|
Details | Diff | Splinter Review |
12.59 KB,
text/plain
|
Details | |
201.78 KB,
patch
|
shaver
:
review+
|
Details | Diff | Splinter Review |
First patch here will re-factor getter and setter code to pave the way for further optimizations.
/be
Assignee | ||
Comment 1•18 years ago
|
||
Attachment #250391 -
Flags: review?(igor.bukanov)
Updated•18 years ago
|
Attachment #250391 -
Flags: review?(igor.bukanov) → review+
Assignee | ||
Comment 2•18 years ago
|
||
Attachment #250391 -
Attachment is obsolete: true
Attachment #250497 -
Flags: review?(igor.bukanov)
Updated•18 years ago
|
Attachment #250497 -
Flags: review?(igor.bukanov) → review+
Assignee | ||
Comment 3•18 years ago
|
||
Landed on trunk:
Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v <-- jsapi.c
new revision: 3.301; previous revision: 3.300
done
Checking in jscntxt.c;
/cvsroot/mozilla/js/src/jscntxt.c,v <-- jscntxt.c
new revision: 3.98; previous revision: 3.97
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v <-- jscntxt.h
new revision: 3.134; previous revision: 3.133
done
Checking in jsdbgapi.c;
/cvsroot/mozilla/js/src/jsdbgapi.c,v <-- jsdbgapi.c
new revision: 3.83; previous revision: 3.82
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v <-- jsgc.c
new revision: 3.196; previous revision: 3.195
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v <-- jsinterp.c
new revision: 3.317; previous revision: 3.316
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v <-- jsobj.c
new revision: 3.312; previous revision: 3.311
done
Checking in jsobj.h;
/cvsroot/mozilla/js/src/jsobj.h,v <-- jsobj.h
new revision: 3.60; previous revision: 3.59
done
Checking in jsscope.c;
/cvsroot/mozilla/js/src/jsscope.c,v <-- jsscope.c
new revision: 3.56; previous revision: 3.55
done
Checking in jsscope.h;
/cvsroot/mozilla/js/src/jsscope.h,v <-- jsscope.h
new revision: 3.41; previous revision: 3.40
done
Onward!
/be
Status: NEW → ASSIGNED
Assignee | ||
Comment 4•18 years ago
|
||
This was a >2K code size reduction in jsinterp.o on my MacBook Pro, using -Os. But it seems on argo-vm and luna to have bloated jsinterp.o. Lack of -Os or -O2?
Anyone who can test Windows before/after jsinterp.obj size, please report it. Thanks,
/be
Comment 5•18 years ago
|
||
(In reply to comment #4)
> Anyone who can test Windows before/after jsinterp.obj size, please report it.
MSVC7.1 optimized:
Before: 64,924 bytes
After : 61,754 bytes
Assignee | ||
Comment 6•18 years ago
|
||
So we're using the wrong gcc options, I guess. See bug 255433.
/be
Comment 7•18 years ago
|
||
Makefile.ref uses -O2 by default on Win32, my before/after results are:
Before patch applied:
jsinterp.obj 74387
After patch applied:
jsinterp.obj 72869
Testing with -O1 (which cl claims optimizes for minimum size) yields:
Before Patch applied:
jsinterp.obj 59343
After Patch applied:
jsinterp.obj 56935
Comment 8•18 years ago
|
||
Maybe I'm being dense, but I don't see the compile-flags relation to bug 255433
Assignee | ||
Comment 9•18 years ago
|
||
(In reply to comment #8)
> Maybe I'm being dense, but I don't see the compile-flags relation to bug 255433
Sorry, typoed bug 225433, which Stan Shebs has kindly adopted.
/be
Assignee | ||
Comment 10•18 years ago
|
||
Top of last /tmp/calltable.dump.N file after startup/load-gmail/click-message:
Method call summary:
method callsites: 1801
total method calls: 17240
native method calls: 12631 (73.27%)
monomorphic calls: 12373 (71.77%)
monomorphic natives: 8595 (49.85%)
polymorphic-2 calls: 2057 (11.93%)
polymorphic-3 calls: 838 (4.86%)
polymorphic-4 calls: 618 (3.58%)
polymorphic-5 calls: 161 (0.93%)
polymorphic-6 calls: 120 (0.70%)
polymorphic-7 calls: 30 (0.17%)
polymorphic-8 calls: 1043 (6.05%)
megamorphic calls: 0 (0.00%)
The majority of calls are native; a majority of callsites are monomorphic; over two-thirds of native callsites are monomorphic. There are zero megamorphic calls with a PIC of size 8, < 10% with size 4.
/be
Assignee | ||
Comment 11•18 years ago
|
||
Generalizing summary to fit problem space.
/be
Summary: getter and setter hard cases bloat and slow down critical paths → get/set/call optimizations
Assignee | ||
Comment 12•18 years ago
|
||
Suspect a bug due to mode at polymorphic-8 and 0 megamorphic... more tomorrow.
/be
Assignee | ||
Comment 13•18 years ago
|
||
Latest results (I've seen runs with non-zero megamorphic counts, so forget about the bug worry):
Method call summary:
method callsites: 2742
total method calls: 33663
computed method calls: 3162
prototype chain searches: 31969
prototype chain steps: 25279
scope chain searches: 252
scope chain steps: 59
anonymous method calls: 2660
native method calls: 25537 (75.86%)
monomorphic natives: 15629 (46.43%)
monomorphic calls: 23092 (68.60%)
polymorphic-2 calls: 2127 (6.32%)
polymorphic-3 calls: 1804 (5.36%)
polymorphic-4 calls: 1270 (3.77%)
polymorphic-5 calls: 494 (1.47%)
polymorphic-6 calls: 414 (1.23%)
polymorphic-7 calls: 615 (1.83%)
polymorphic-8 calls: 685 (2.03%)
megamorphic calls: 0 (0.00%)
This was startup/gmail/message/click-on-bugzilla-link/quit-closing-two-tabs.
Considering the low number of call sites, we could have a 4-entry PIC and use less than 175K. Each entry would be either a method (function object) pointer, or an sprop (as with the old direct-mapped runtime-wide property cache that I recently removed). It should be possible to split cases further to optimize for stub vs. non-stub getter and setter.
Since almost 80% of method calls search up one step in the prototype chain, would like to optimize to avoid looking in the directly-referenced receiver object when the cache says we're hitting one object "up" the prototype chain. Notice how the scope chain searches hit on the head object more often (>75%, and such function or constructor calls are small potatoes compared to method calls).
/be
Attachment #250933 -
Attachment is obsolete: true
Assignee | ||
Comment 14•18 years ago
|
||
I added function bytecode length dumping (at the end of the header line for each function), and sure enough:
$ grep ' calls ' /tmp/calltable.dump.3 | sort -n +2 -3 | tail -20
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:8085 calls 88 (4) argc 1/1 length 94
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 101 (0) argc 0/0 length 24
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 103 (0) argc 0/0 length 104
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 103 (0) argc 0/0 length 54
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:2 calls 106 (60) argc 1/1 length 149
chrome://global/content/globalOverlay.js:113 calls 112 (92) argc 1/1 length 92
chrome://global/content/globalOverlay.js:132 calls 112 (92) argc 2/2 length 79
chrome://global/content/globalOverlay.js:85 calls 112 (92) argc 1/1 length 238
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:909 calls 116 (0) argc 1/1 length 65
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:25 calls 122 (101) argc 1/1 length 36
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:973 calls 133 (0) argc 0/0 length 5
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:27 calls 138 (124) argc 1/1 length 11
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:6929 calls 159 (236) argc 2/2 length 83
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:6949 calls 159 (109) argc 2/2 length 101
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsSafebrowsingApplication.js:1867 calls 162 (122) argc 1/1 length 10
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:25 calls 164 (128) argc 1/1 length 13
chrome://browser/content/bookmarks/bookmarksMenu.js:1013 calls 181 (281) argc 4/4 length 101
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:7878 calls 188 (161) argc 3/3 length 104
chrome://browser/content/browser.js:3371 calls 191 (0) argc 0/0 length 60
chrome://browser/content/bookmarks/bookmarksMenu.js:1040 calls 214 (113) argc 2/2 length 53
The 20 most frequently called functions tend to be small. The top 10 might even inline into the PIC space if these functions are called from monomorphic sites. Have to work on the dump format and stats a bit more....
/be
Assignee | ||
Comment 15•17 years ago
|
||
Hope to make up for loss of property cache (bug 128150 by way of bug 408144), and then some.
/be
Flags: wanted1.9+
Priority: -- → P1
Target Milestone: mozilla1.9alpha1 → mozilla1.9 M11
Assignee | ||
Comment 16•17 years ago
|
||
Assignee | ||
Comment 17•17 years ago
|
||
Almost all callsites (all native method callsites) are monomorphic:
callsites 2047 mean polymorphism 1.03566 sigma 0.267493
hits 1765843
misses 126516
overloaded 0
megamorphic 3182
erecycles 126808
trecycles 129787
esweeps 292
tsweeps 311
No need for poly- when mono- will do, and we aren't inlining for 1.9 (tracing after 1.9 for scripted functions, and natives can't be inlined), so I'm focusing on caller->callee fast path using a single-entry cache per callsite.
Also want to avoid memoizing function values in the property tree. Finally, want to handle getters and setters but may split that out into a followup bug.
/be
Assignee | ||
Comment 18•17 years ago
|
||
From the patch, megamorphic here means > 4 callees for the cached callsite -- collisions in the two-way associative direct cache will overwrite some entries, possibly undercounting these; see |erecycles - esweeps|. Still highly likely to be one callee per callsite in our code, only a bit less likely in gmail code.
/be
Assignee | ||
Comment 19•17 years ago
|
||
Preliminary patch to clear out this old code, which I do not want to maintain or leave bloating the source and rusting. Better approach is the jspic.[ch] approach, which could be improved to have fewer collisions, but which has the virtue of being out-of-line and less intrusive. Could even use (possibly beefed up) debugger call hooks for zero source code touch plugability.
I'd like to get this in today, ASAP. Thanks,
/be
Attachment #293190 -
Flags: review?(igor)
Updated•17 years ago
|
Attachment #293190 -
Flags: review?(igor) → review+
Assignee | ||
Comment 20•17 years ago
|
||
Comment on attachment 293190 [details] [diff] [review]
remove DUMP_CALL_TABLE
a=me, not part of build.
/be
Attachment #293190 -
Flags: approval1.9+
Assignee | ||
Comment 21•17 years ago
|
||
DUMP_CALL_TABLE removed:
js/src/jsgc.c 3.257
js/src/jsinterp.c 3.394
js/src/jsinterp.h 3.68
js/src/jsobj.c 3.407
js/src/jsstr.c 3.179
/be
Updated•17 years ago
|
Flags: blocking1.9?
Assignee | ||
Comment 22•17 years ago
|
||
Updated•17 years ago
|
Flags: blocking1.9? → blocking1.9+
Assignee | ||
Comment 23•17 years ago
|
||
Passes js testsuite (I exclude spidermonkey-n.tests and slow-n.tests) -- bc, cayou try it out some time? More work to do, missing capability revocation on scopechain shadowing, e.g.
/be
Attachment #295891 -
Attachment is obsolete: true
Comment 24•17 years ago
|
||
(In reply to comment #23)
>
> bc, > can you try it out some time?
sure, I normally don't run with slow-n.tests either. Should I? I'll make a stab a bit later with browser|shell. If you have a wip 3 patch in the works, let me know so I can test that instead.
Comment 25•17 years ago
|
||
(In reply to comment #23)
Lots of Assertion failure: PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v), at jsinterp.c:128
just one example of many: ecma/Types/8.6.2.1-1.js shell
#0 JS_Assert (s=0x81365c4 "PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v)", file=0x8136568 "jsinterp.c", ln=128) at jsutil.c:63
#1 0x0809b7fa in js_FillPropertyCache (cache=0x9bf8a70, pc=0x9c0a75b "6", scopeIndex=0, obj=0xb7f4bae0, protoIndex=0, sprop=0x9c13910, entry=0x9bfb3c8) at jsinterp.c:128
#2 0x080c9222 in js_SetPropertyHelper (cx=0x9c00e08, obj=0xb7f4bae0, id=-1208674788, vp=0xbf92ec60, entry=0x9bfb3c8) at jsobj.c:3775
#3 0x080ae484 in js_Interpret (cx=0x9c00e08, pc=0x9c0a75b "6", result=0xbf92ef88) at jsinterp.c:3815
#4 0x0809d4d5 in js_Invoke (cx=0x9c00e08, argc=4, vp=0x9c05880, flags=1) at jsinterp.c:1178
#5 0x0809e984 in js_InvokeConstructor (cx=0x9c00e08, vp=0x9c05880, argc=4) at jsinterp.c:1748
#6 0x080a8ef3 in js_Interpret (cx=0x9c00e08, pc=0x9c1239b "#", result=0xbf92f6b4) at jsinterp.c:3402
#7 0x0809dcba in js_Execute (cx=0x9c00e08, chain=0xb7f50000, script=0x9c122c8, down=0x0, flags=0, result=0xbf92f75c) at jsinterp.c:1403
#8 0x08061aa8 in JS_ExecuteScript (cx=0x9c00e08, obj=0xb7f50000, script=0x9c122c8, rval=0xbf92f75c) at jsapi.c:4774
#9 0x0804a270 in Load (cx=0x9c00e08, obj=0xb7f50000, argc=1, argv=0x9c0585c, rval=0xbf92f7f0) at js.c:626
#10 0x0809d449 in js_Invoke (cx=0x9c00e08, argc=1, vp=0x9c05854, flags=0) at jsinterp.c:1162
#11 0x080afc29 in js_Interpret (cx=0x9c00e08, pc=0x9c0df9e ":", result=0xbf92fe94) at jsinterp.c:4084
#12 0x0809dcba in js_Execute (cx=0x9c00e08, chain=0xb7f50000, script=0x9c0df60, down=0x0, flags=0, result=0xbf930f68) at jsinterp.c:1403
#13 0x08061aa8 in JS_ExecuteScript (cx=0x9c00e08, obj=0xb7f50000, script=0x9c0df60, rval=0xbf930f68) at jsapi.c:4774
#14 0x0804980d in Process (cx=0x9c00e08, obj=0xb7f50000, filename=0x0, forceTTY=0) at js.c:306
#15 0x08049f08 in ProcessArgs (cx=0x9c00e08, obj=0xb7f50000, argv=0xbf9310f8, argc=0) at js.c:532
#16 0x0804f216 in main (argc=0, argv=0xbf9310f8, envp=0xbf9310fc) at js.c:3561
or in the browser during startup: Assertion failure: PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v), at /work/mozilla/builds/1.9.0-test/mozilla/js/src/jsinterp.c:128
wip'it up again, maestro!
Assignee | ||
Comment 26•17 years ago
|
||
Silly bogus assertion -- pay it no mind ;-). New patch soon, opt testing of that patch instead of dbg should turn out ok (pls. holler if not).
/be
Comment 27•17 years ago
|
||
Plenty of crashes (signal 11's) on linux (centos5) and maci386 in the browser version of the test but with one crash which was not repeatable, I could not reproduce it locally on fedora7 or on maci386. Not sure I trust that. I'll test the new patch when its available.
Comment 28•17 years ago
|
||
Assignee | ||
Comment 29•17 years ago
|
||
Seems stable -- sayrer, go nuts.
Needs to be cleaned up and be split into blocking bugs, for less mega-fix-mashup landing/auditing pain.
Narrative about what this is soon, although Ecma TC39 meeting will take time the next two days.
/be
Attachment #296794 -
Attachment is obsolete: true
Assignee | ||
Comment 30•17 years ago
|
||
Comment 31•17 years ago
|
||
bc any chance you can run through the regression suite (if you haven't already)
Assignee | ||
Comment 32•17 years ago
|
||
I had been running the suite until I moved to Firefox to battle with gmail (which does all sorts of fun prototype chain foreshadowing); running suite now, seeing fresh problems. Bob, no need to attach results here, I should have a new patch at some point today.
/be
Comment 33•17 years ago
|
||
(In reply to comment #32)
k.
Assignee | ||
Comment 34•17 years ago
|
||
This passes the suite with the same test failures (failure details differ in one case due to assertion differences here and on the trunk). Bob, please confirm.
Sayrer, I'm hoping you can take this for a spin.
/be
Attachment #298896 -
Attachment is obsolete: true
Assignee | ||
Comment 35•17 years ago
|
||
Attachment #298900 -
Attachment is obsolete: true
Comment 36•17 years ago
|
||
I'm running the shell test suite (e4x, ecma*, js1_[1-8]; excluding slow-n.tests) under valgrind on sm-valgrind01 now, it'll take a couple of hours. (Restarting the shell and reloading the drivers for every test takes its toll!)
Note to self or sayrer:
cd tests/v-logs-bacon; ../../summarize-valgrind-logs.sh
summary-errors-only.txt is the compact output (test nerrors ncontext)
summary-unified.txt is the full list for sanity checking
Have a sunspider-under-valgrind run going now as well.
Comment 37•17 years ago
|
||
Sunspider is valgrind-clean on 64-bit Linux.
Assignee | ||
Comment 38•17 years ago
|
||
Attachment #299534 -
Attachment is obsolete: true
Assignee | ||
Comment 39•17 years ago
|
||
Attachment #299535 -
Attachment is obsolete: true
Comment 40•17 years ago
|
||
(In reply to comment #34)
> Created an attachment (id=299534) [details]
> This passes the suite with the same test failures (failure details differ in
> one case due to assertion differences here and on the trunk). Bob, please
> confirm.
Tested in shell on linux. I didn't seen any difference with this patch in either opt or debug builds. I'll try attachment 299713 [details] [diff] [review]) on other platforms later today.
Comment 41•17 years ago
|
||
Took 10 hours, but valgrind says you're clean on e4x+ecma*+js1_*.
Assignee | ||
Updated•17 years ago
|
Alias: js-propcache
Assignee | ||
Comment 42•17 years ago
|
||
Attachment #299713 -
Attachment is obsolete: true
Assignee | ||
Comment 43•17 years ago
|
||
Attachment #299714 -
Attachment is obsolete: true
Assignee | ||
Comment 44•17 years ago
|
||
Fix was this, in jsscope.c:
-#define SPROP_FLAGS_NOT_MATCHED SPROP_MARK
+#define SPROP_FLAGS_NOT_MATCHED (SPROP_MARK | SPROP_FLAG_PCTYPE_REGEN)
/be
Assignee | ||
Updated•17 years ago
|
Attachment #299958 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Attachment #299959 -
Attachment is obsolete: true
Comment 45•17 years ago
|
||
Does this need review?
Assignee | ||
Comment 46•17 years ago
|
||
Beltzner, we're getting there. This is going to go in tomorrow, I hope -- special dispensation. The prior patches are getting in (last one I want to split out is bug 414452).
/be
Comment 47•17 years ago
|
||
Sorry - it was an honest question, as no review flag was set, so I didn't know if it was pending, etc. As you were!
Assignee | ||
Comment 48•17 years ago
|
||
Still have to deal with some issues, tomorrow.
/be
Attachment #300229 -
Attachment is obsolete: true
Assignee | ||
Comment 49•17 years ago
|
||
I need to rest and write some comments tomorrow. Shaver has heard some of the pitch, and read some whiteboard scribblings. His suggestions for where to write a sentence or an essay are welcome.
/be
Attachment #300624 -
Attachment is obsolete: true
Attachment #300827 -
Flags: review?(shaver)
Assignee | ||
Comment 50•17 years ago
|
||
Attachment #300827 -
Attachment is obsolete: true
Attachment #300827 -
Flags: review?(shaver)
Assignee | ||
Comment 51•17 years ago
|
||
I'm too sick to write good comments where needed, but shaver will guide me.
/be
Attachment #250497 -
Attachment is obsolete: true
Attachment #250999 -
Attachment is obsolete: true
Attachment #300911 -
Attachment is obsolete: true
Attachment #301231 -
Flags: review?(shaver)
Assignee | ||
Comment 52•17 years ago
|
||
Attachment #301231 -
Attachment is obsolete: true
Attachment #301235 -
Flags: review?(shaver)
Attachment #301231 -
Flags: review?(shaver)
Comment 53•17 years ago
|
||
Pushing this out to block beta 4 ...
Target Milestone: mozilla1.9beta3 → mozilla1.9beta4
Comment 54•17 years ago
|
||
Don't let the TM deter you - we want this in on the trunk as fast as humanly possible :-)
Assignee | ||
Comment 55•17 years ago
|
||
Some other checkin regressed gmail. My last pull was 1/31. Grumble. Rebuilding and testing, want to attach this before retiring. More tomorrow.
/be
Attachment #301235 -
Attachment is obsolete: true
Attachment #301457 -
Flags: review?(shaver)
Attachment #301235 -
Flags: review?(shaver)
Assignee | ||
Comment 56•17 years ago
|
||
Updating cured whatever ailed gmail. Not this patch, which seems to be working well. Shaver, you reviewing?
/be
Comment 57•17 years ago
|
||
Yeah, I'm a little under halfway through, hope to get done tonight. So far my list is just a couple of places that want more expansive comments, and a wish that you hadn't rolled as many clean-up fixes into such a patch. :)
Assignee | ||
Comment 58•17 years ago
|
||
(In reply to comment #57)
> Yeah, I'm a little under halfway through, hope to get done tonight. So far my
> list is just a couple of places that want more expansive comments,
Great -- lay them on me sooner rather than later.
> and a wish
> that you hadn't rolled as many clean-up fixes into such a patch. :)
My weakness, but I think you protest too much: the only such clean-up left (I split real work into blocking bugs) is the JOF_TYPE/JOF_OPTYPE macrology, and that is easy on the eyes.
I'm a bit concerned that I'll ding Ts/Txul/Tp somehow (e.g., top-level scripts are manually managed, not GC-managed, so js_FlushPropertyCacheForScript may rear its ugly head). I'd like to check into a quiet tree for a test cycle or three. Suggestions?
/be
Comment 59•17 years ago
|
||
> I'm a bit concerned that I'll ding Ts/Txul/Tp somehow (e.g., top-level scripts
> are manually managed, not GC-managed, so js_FlushPropertyCacheForScript may
> rear its ugly head). I'd like to check into a quiet tree for a test cycle or
> three. Suggestions?
>
Let's close the tree for this. Just need to know when and we'll do it.
Comment 60•17 years ago
|
||
I was thinking of the OBJ_SET_SLOT -> STOBJ_SET_SLOT changes as cleanup -- if there is other significance to them I haven't figured it out yet, but I am still not to the exciting conclusion!
The comment preceding JSRuntime.pcTypeGen says the GC prevents wraparound, but by my eyes it's js_GeneratePCType that does so, and the GC merely assists by compressing the space. In the js_GeneratePCType code, we don't handle the case in which GC can't free up a pctype, other than by asserting in DEBUG builds. If that case is safe for non-DEBUG, it'd be good to have a comment there expressing how; likewise in js_TraceObject and js_SweepScopeProperties. (Is the code in js_GC re-enabling the property cache only if the overflow bit is unset sufficient? I'm pondering the other cases in which we increment rt->pcTypeGen with only an assertion as defense.)
js_EmitTree's TOK_DOT/TOK_NAME sharing via fall-through is pretty convoluted now, and I wonder if it's worth it in order to avoid duplication of the JSOP_DUP and EMIT_INDEX_OP (though without the ternary operator selection, were it split out).
+#define PCVCAP_SCOPEMASK_LO JS_BITMASK(PCVCAP_SCOPEBITS)
+#define PCVCAP_SCOPEMASK_HI (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
I think HI wants to shift by SCOPEBITS, if that's what's defining the width of LO. (Saved by the numeric coincidence!)
Have to finish making dinner now, back after to untangle the horrible diff produced in js_Interpret under the out: label!
Assignee | ||
Comment 61•17 years ago
|
||
(In reply to comment #60)
> I was thinking of the OBJ_SET_SLOT -> STOBJ_SET_SLOT changes as cleanup -- if
> there is other significance to them I haven't figured it out yet, but I am
> still not to the exciting conclusion!
They are totally significant (I was a bit over-eager before the most recent patches that fixed this).
The layering is such that all native object arbitrary slot setting must funnel through LOCKED_OBJ_WRITE_BARRIER/GC_WRITE_BARRIER, to re-brand the scope when a branded scope's object sees function-valued properties (aka "methods") change value (old one overwritten, new one replacing a non-function value).
To make a long story short, most slot-setting was properly monitored, but I had forgotten to fix js_SetSlotThreadSafe to use the right macros. Recall that the OBJ_SET_SLOT macro calls js_SetSlotThreadSafe if the object is not native or if its scope->ownercx != cx. Yes, all my JS_THREADSAFE tests involved request-locked, contention-free object usage.
Fixing this left assertions when bad old pre-fslots/dslots code used OBJ_SET_SLOT on the well-known fslots, especially JSSLOT_PRIVATE but also JSSLOT_PROTO and JSSLOT_PARENT.
> The comment preceding JSRuntime.pcTypeGen says the GC prevents wraparound, but
> by my eyes it's js_GeneratePCType that does so, and the GC merely assists by
> compressing the space.
Good point, I will re-word.
> In the js_GeneratePCType code, we don't handle the case
> in which GC can't free up a pctype,
Nit: it's not a matter of freeing anything -- the GC renumbers the live world. But I know what you mean: In case there are 2^28 distinct branded structural types and prefixes of those types.
> other than by asserting in DEBUG builds.
I do not expect this to happen in practice, but I would like to know. Am I abusing an assertion? Could do a printf. Anyway, overflow is not a problem, because in this unlikely event (2^28 distinct pctypes among the property tree nodes and live scopes) the GC leaves the property cache disabled, until the regenerated type count reduces to not overflow.
Oops, it disables only the current cx->thread's cache! Will fix in revised patch along with other stuff you point out. Thanks.
> If that case is safe for non-DEBUG, it'd be good to have a comment there
> expressing how;
Will be safe once all caches are disabled ;-).
> likewise in js_TraceObject and js_SweepScopeProperties. (Is
> the code in js_GC re-enabling the property cache only if the overflow bit is
> unset sufficient? I'm pondering the other cases in which we increment
> rt->pcTypeGen with only an assertion as defense.)
It has to increment, or we get aliasing. Are you worried about it wrapping in 32 bits before the GC notices? I don't think that's possible, at least not on 32-bit systems -- too many sprops and scopes.
> js_EmitTree's TOK_DOT/TOK_NAME sharing via fall-through is pretty convoluted
> now, and I wonder if it's worth it in order to avoid duplication of the
> JSOP_DUP and EMIT_INDEX_OP (though without the ternary operator selection, were
> it split out).
Will give it a try.
> +#define PCVCAP_SCOPEMASK_LO JS_BITMASK(PCVCAP_SCOPEBITS)
> +#define PCVCAP_SCOPEMASK_HI (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
>
> I think HI wants to shift by SCOPEBITS, if that's what's defining the width of
> LO. (Saved by the numeric coincidence!)
No, PROTOBITS. Low-order nybble of vcap is protobits, high-order nybble is scope. This is the typical bitfield defining idiom where a non-right-justified field's mask is left-shifted by the width of the right-justified field after it.
> Have to finish making dinner now, back after to untangle the horrible diff
> produced in js_Interpret under the out: label!
Sorry, diff -w is clean there. Check out earlier -w attachments here. You caught me, that was purely an indentation fix (Igor got off by one due to a major comment ;-).
/be
Assignee | ||
Comment 62•17 years ago
|
||
Tired, sick: for 2^28 above read 2^24. Still way more types than any 32-bit embedding is likely to see. Will ponder 64-bit scaling.
/be
Comment 63•17 years ago
|
||
(In reply to comment #61)
> > In the js_GeneratePCType code, we don't handle the case
> > in which GC can't free up a pctype,
>
> Nit: it's not a matter of freeing anything -- the GC renumbers the live world.
> But I know what you mean: In case there are 2^28 distinct branded structural
> types and prefixes of those types.
Yeah, I meant "make available a no-longer-live pctype".
> Oops, it disables only the current cx->thread's cache! Will fix in revised
> patch along with other stuff you point out. Thanks.
I'd like to pretend that this is the case I was eventually going to come to, if all readers here would indulge my delusion.
> It has to increment, or we get aliasing. Are you worried about it wrapping in
> 32 bits before the GC notices? I don't think that's possible, at least not on
> 32-bit systems -- too many sprops and scopes.
Yeah, that was the case I was worried about, possibly on 64-bit systems. Agree that it's far-fetched.
> > +#define PCVCAP_SCOPEMASK_LO JS_BITMASK(PCVCAP_SCOPEBITS)
> > +#define PCVCAP_SCOPEMASK_HI (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
> >
> > I think HI wants to shift by SCOPEBITS, if that's what's defining the width of
> > LO. (Saved by the numeric coincidence!)
>
> No, PROTOBITS. Low-order nybble of vcap is protobits, high-order nybble is
> scope. This is the typical bitfield defining idiom where a non-right-justified
> field's mask is left-shifted by the width of the right-justified field after
> it.
If the low-order nybble is PROTOBITS, then why is _LO defined in terms of SCOPEBITS? That's the inconsistency I was pointing out: we shift according to the width of PROTOBITS, but define the right-justified field in terms of SCOPEBITS, so if we were to increase SCOPEBITS by one, _LO would overlap _HI, no?
> Sorry, diff -w is clean there. Check out earlier -w attachments here. You
> caught me, that was purely an indentation fix (Igor got off by one due to a
> major comment ;-).
I'm out of steam here, been running on Dayquil all day. I'll resume in the morning; bacon makes a fine breakfast, and the -w patch will keep it from being too gloomy a start. :)
Assignee | ||
Comment 64•17 years ago
|
||
(In reply to comment #63)
> Yeah, that was the case I was worried about, possibly on 64-bit systems. Agree
> that it's far-fetched.
I'll add more defense.
> > > +#define PCVCAP_SCOPEMASK_LO JS_BITMASK(PCVCAP_SCOPEBITS)
> > > +#define PCVCAP_SCOPEMASK_HI (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
[. . .]
> If the low-order nybble is PROTOBITS, then why is _LO defined in terms of
> SCOPEBITS? That's the inconsistency I was pointing out: we shift according to
> the width of PROTOBITS, but define the right-justified field in terms of
> SCOPEBITS, so if we were to increase SCOPEBITS by one, _LO would overlap _HI,
> no?
No, _LO is the right-justified version of _HI -- you can see this because _HI = (_LO << PROTOBITS). Perhaps the names could be better? I thought about using just SCOPEMASK for the shifted one, but that name is ambiguous.
/be
Assignee | ||
Comment 65•17 years ago
|
||
Still pondering 64-bit scaling.
/be
Attachment #301457 -
Attachment is obsolete: true
Attachment #301636 -
Flags: review?(shaver)
Attachment #301457 -
Flags: review?(shaver)
Assignee | ||
Comment 66•17 years ago
|
||
Tail of /tmp/propcache.stats after big gmail/bugzilla/sunspider session:
Property cache stats for thread 17854784, GC 555
fills 2893315
nofills 87320
rofills 0
disfills 183878
oddfills 11
modfills 2264207
brandfills 9380
longchains 0
recycles 2659730
pcrecycles 105492
tests 58421604
pchits 54454495
protopchits 3997099
settests 7871408
addpchits 692338
setpchits 6906280
setpcmisses 63954
setmisses 269528
idmisses 2699781
komisses 117232
vcmisses 344551
misses 3161564
flushes 549
hit rates: pc 93.2095% (proto 6.84182%), set 96.5344%, full 94.5884%
Sharing a single direct mapped cache between the two levels of keys (kpc,ktype) and the slower (atom,kobj) seems to be working well -- no sign of thrashing an entry from first to second level (unrelated) key.
I'm surprised setmisses is not higher, since I left the hazard
slot < STOBJ_NSLOTS(obj)) {
instead of calling js_AllocSlot in the "addpchit" fast path. But JSOP_INITPROP is not participating here, and setpchits (existing props hit by the first level cache test) are nice and high. Add JSOP_INITPROP to the fun is something for another bug.
/be
Comment 67•17 years ago
|
||
- why do we NULL out entry in JSOP_CALLPROP's non-native cache-miss case? It doesn't look like we use it again afterward.
- in PROPERTY_CACHE_TEST, JS_ASSERT(&obj != &pobj) in service of the just-so restrictions?
- {LOCKED_,ST}OBJ_SET_SLOT, assertions against the function-to-non, non-to-function sets that hackers must note well?
I've finished mentally executing the paths for non-native getters/setters, and I think I'm ready to stamp this bad boy. Have to take care of some real work, and then I'm over to arrays tomorrow for JSFastLockable refactoring or whatever, I very much hope!
Comment 68•17 years ago
|
||
Comment on attachment 301636 [details] [diff] [review]
with shaver's comments so far addressed
Oh yeah, I had to read this comment a couple of times to figure out where the list of initial conditions ended and the list of subsequent actions began. Might be worth rewording for clarity, though it's not critical.
+ * At this point we are inevitably leaving an interpreted function or a
+ * top-level script, returning to an inline call in the same js_Interpret,
+ * an "out of line" call made through js_Invoke, a js_Execute activation,
+ * a generator (SendToGenerator, jsiter.c), or else js_FilterXMLList.
r=shaver, with gusto.
Attachment #301636 -
Flags: review?(shaver) → review+
Assignee | ||
Comment 69•17 years ago
|
||
Renamed pctype => shape, etc., based on Andreas Gal's amazing parallel development of same idea.
A final once over would be good. I didn't burden DEBUG builds with extra checking under {LOCKED,ST}OBJ_SET_SLOT, since that would be pretty costly and it would require splitting out yet another sub-layer (LOCKED_OBJ_WRITE_BARRIER uses LOCKED_OBJ_SET_SLOT uses STOBJ_SET_SLOT). I'd rather do something to reduce the number of *OBJ_SET_SLOT macros, in a followup bug.
/be
Attachment #301636 -
Attachment is obsolete: true
Attachment #301809 -
Flags: review?(shaver)
Assignee | ||
Comment 70•17 years ago
|
||
Filed bug 416058 on *OBJ_SET_SLOT elimination.
/be
Comment 71•17 years ago
|
||
Comment on attachment 301809 [details] [diff] [review]
final patch for committing
I want to understand why you would do this mass renaming before checking in the long-sought-after patch, and mix other (albeit minor) changes into it for my review. I thought we had trust.
Nonetheless, I have inspected the bugzilla interdiff, and I again grant my r+.
Attachment #301809 -
Flags: review?(shaver) → review+
Assignee | ||
Comment 72•17 years ago
|
||
Shaver, "trust" would have been broken if I had checked in without re-requesting review. If you would rather I do the renaming and add a few more meters in a separate bug, just say so. I figured you could quickly confirm that nothing else changed, approve it without biting back and we could get on with our lives. But I am chastened....
/be
Comment 73•17 years ago
|
||
Ah, me, I was poking fun. I am glad that we will be able to celebrate bacon together on the trunk of our software soon; my toasting glass stands ready.
Assignee | ||
Comment 74•17 years ago
|
||
I checked in:
js/src/js.c 3.192
js/src/jsapi.c 3.404
js/src/jsapi.h 3.182
js/src/jsbool.c 3.36
js/src/jscntxt.h 3.193
js/src/jsemit.c 3.301
js/src/jsexn.c 3.95
js/src/jsfun.c 3.258
js/src/jsgc.c 3.275
js/src/jsgc.h 3.103
js/src/jsinterp.c 3.417
js/src/jsinterp.h 3.70
js/src/jsiter.c 3.86
js/src/jslock.c 3.75
js/src/jslock.h 3.37
js/src/jsnum.c 3.105
js/src/jsobj.c 3.424
js/src/jsobj.h 3.83
js/src/jsopcode.c 3.287
js/src/jsopcode.h 3.58
js/src/jsopcode.tbl 3.106
js/src/jsprvtd.h 3.38
js/src/jspubtd.h 3.98
js/src/jsscope.c 3.78
js/src/jsscope.h 3.55
js/src/jsscript.c 3.169
js/src/jsscript.h 3.40
js/src/jsstr.c 3.189
Would not be surprised to see a regression, but last I tested, I was mochitest clean. Ts/Txul/Tp again make me wonder about js_FlushPropertyCacheForScript. We shall see....
/be
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 75•17 years ago
|
||
This seems to have caused a trace malloc leak increase on all platforms, e.g.
http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_leaks&units=bytes&autoscale=1&days=2&avg=1&
Possibly also a Ts regression on Linux and Mac.
Assignee | ||
Comment 76•17 years ago
|
||
(In reply to comment #75)
> This seems to have caused a trace malloc leak increase on all platforms, e.g.
>
> http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_leaks&units=bytes&autoscale=1&days=2&avg=1&
That's an existing leak that grew with this bug's patch. See bug 399112.
> Possibly also a Ts regression on Linux and Mac.
Will be addressed by bug 408871.
/be
Updated•17 years ago
|
Summary: get/set/call optimizations → get/set/call optimizations (return of the property cache)
Updated•17 years ago
|
Flags: in-testsuite-
Flags: in-litmus-
You need to log in
before you can comment on or make changes to this bug.
Description
•