Last Comment Bug 365851 - (js-propcache) get/set/call optimizations (return of the property cache)
(js-propcache)
: get/set/call optimizations (return of the property cache)
Status: RESOLVED FIXED
: perf
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: All All
: P1 normal with 2 votes (vote)
: mozilla1.9beta4
Assigned To: Brendan Eich [:brendan]
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 416462 335700 366292 366396 367080 411630 413097 414452 416302 416404 416406 416460 416601 416665 418139 424636 425828 428282 452189 461158 462734
Blocks: jsbcopt 408144 410994 463239
  Show dependency treegraph
 
Reported: 2007-01-03 15:27 PST by Brendan Eich [:brendan]
Modified: 2009-03-08 11:16 PDT (History)
19 users (show)
mtschrep: blocking1.9+
brendan: wanted1.9+
bob: in‑testsuite-
bob: in‑litmus-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
cleanup patch (42.48 KB, patch)
2007-01-03 15:28 PST, Brendan Eich [:brendan]
igor: review+
Details | Diff | Splinter Review
cleanup patch, v2 (42.89 KB, patch)
2007-01-04 13:09 PST, Brendan Eich [:brendan]
igor: review+
Details | Diff | Splinter Review
checkpoint work on type adaptation (51.90 KB, patch)
2007-01-08 23:11 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
checkpoint work on type adaptation, v2 (60.77 KB, patch)
2007-01-09 15:58 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
brute-force (not for production use) per-thread PIC, to measure calls (27.68 KB, patch)
2007-12-13 18:03 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
sample output from patch amid gmail and bugzilla in tabs (44.64 KB, application/x-tar)
2007-12-13 18:13 PST, Brendan Eich [:brendan]
no flags Details
remove DUMP_CALL_TABLE (20.16 KB, patch)
2007-12-14 13:24 PST, Brendan Eich [:brendan]
igor: review+
brendan: approval1.9+
Details | Diff | Splinter Review
work in progress, backup mostly (73.45 KB, patch)
2008-01-07 20:19 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
wip backup 2, getting closer (130.11 KB, patch)
2008-01-12 23:57 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
GMail crash stack (12.59 KB, text/plain)
2008-01-15 18:10 PST, Robert Sayre
no flags Details
Daddy brings home the bacon! (202.43 KB, patch)
2008-01-24 01:18 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
diff -w version of last attachment (183.80 KB, patch)
2008-01-24 01:34 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
Makin' bacon on the beach... (201.95 KB, patch)
2008-01-27 01:30 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
diff -w version of last attachment (183.31 KB, patch)
2008-01-27 01:31 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
Updated bacon to latest trunk state (201.24 KB, patch)
2008-01-28 00:18 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
diff -w version of last attachment (182.61 KB, patch)
2008-01-28 00:20 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
Updated bacon to latest trunk state (174.86 KB, patch)
2008-01-28 23:36 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
diff -w version of last attachment (156.26 KB, patch)
2008-01-28 23:36 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
update to track trunk changes, plus one crucial fix (201.23 KB, patch)
2008-01-29 18:43 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
better bacon (183.04 KB, patch)
2008-01-31 03:37 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
better bacon, passing mochitests (185.08 KB, patch)
2008-01-31 21:53 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
without debugging glorp (183.27 KB, patch)
2008-02-01 12:34 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
lightly polished for review (183.48 KB, patch)
2008-02-03 21:48 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
heavier polish, plus completeness fixes (196.34 KB, patch)
2008-02-03 23:38 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
latest patch (198.56 KB, patch)
2008-02-05 00:54 PST, Brendan Eich [:brendan]
no flags Details | Diff | Splinter Review
with shaver's comments so far addressed (201.60 KB, patch)
2008-02-05 23:56 PST, Brendan Eich [:brendan]
shaver: review+
Details | Diff | Splinter Review
final patch for committing (201.78 KB, patch)
2008-02-06 19:20 PST, Brendan Eich [:brendan]
shaver: review+
Details | Diff | Splinter Review

Description Brendan Eich [:brendan] 2007-01-03 15:27:22 PST
First patch here will re-factor getter and setter code to pave the way for further optimizations.

/be
Comment 1 Brendan Eich [:brendan] 2007-01-03 15:28:15 PST
Created attachment 250391 [details] [diff] [review]
cleanup patch
Comment 2 Brendan Eich [:brendan] 2007-01-04 13:09:33 PST
Created attachment 250497 [details] [diff] [review]
cleanup patch, v2
Comment 3 Brendan Eich [:brendan] 2007-01-04 13:15:51 PST
Landed on trunk:

Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.301; previous revision: 3.300
done
Checking in jscntxt.c;
/cvsroot/mozilla/js/src/jscntxt.c,v  <--  jscntxt.c
new revision: 3.98; previous revision: 3.97
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.134; previous revision: 3.133
done
Checking in jsdbgapi.c;
/cvsroot/mozilla/js/src/jsdbgapi.c,v  <--  jsdbgapi.c
new revision: 3.83; previous revision: 3.82
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v  <--  jsgc.c
new revision: 3.196; previous revision: 3.195
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v  <--  jsinterp.c
new revision: 3.317; previous revision: 3.316
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.312; previous revision: 3.311
done
Checking in jsobj.h;
/cvsroot/mozilla/js/src/jsobj.h,v  <--  jsobj.h
new revision: 3.60; previous revision: 3.59
done
Checking in jsscope.c;
/cvsroot/mozilla/js/src/jsscope.c,v  <--  jsscope.c
new revision: 3.56; previous revision: 3.55
done
Checking in jsscope.h;
/cvsroot/mozilla/js/src/jsscope.h,v  <--  jsscope.h
new revision: 3.41; previous revision: 3.40
done

Onward!

/be
Comment 4 Brendan Eich [:brendan] 2007-01-04 14:47:53 PST
This was a >2K code size reduction in jsinterp.o on my MacBook Pro, using -Os.  But it seems on argo-vm and luna to have bloated jsinterp.o.  Lack of -Os or -O2?

Anyone who can test Windows before/after jsinterp.obj size, please report it. Thanks,

/be
Comment 5 :Gavin Sharp [email: gavin@gavinsharp.com] 2007-01-04 14:58:25 PST
(In reply to comment #4)
> Anyone who can test Windows before/after jsinterp.obj size, please report it.

MSVC7.1 optimized:
Before: 64,924 bytes
After : 61,754 bytes
Comment 6 Brendan Eich [:brendan] 2007-01-04 14:59:31 PST
So we're using the wrong gcc options, I guess.  See bug 255433.

/be
Comment 7 Brian Crowder 2007-01-04 15:17:34 PST
Makefile.ref uses -O2 by default on Win32, my before/after results are:

Before patch applied:
jsinterp.obj 74387

After patch applied:
jsinterp.obj 72869


Testing with -O1 (which cl claims optimizes for minimum size) yields:

Before Patch applied:
jsinterp.obj 59343

After Patch applied:
jsinterp.obj 56935
Comment 8 Brian Crowder 2007-01-04 15:19:25 PST
Maybe I'm being dense, but I don't see the compile-flags relation to bug 255433
Comment 9 Brendan Eich [:brendan] 2007-01-04 15:36:22 PST
(In reply to comment #8)
> Maybe I'm being dense, but I don't see the compile-flags relation to bug 255433

Sorry, typoed bug 225433, which Stan Shebs has kindly adopted.

/be
Comment 10 Brendan Eich [:brendan] 2007-01-08 23:11:18 PST
Created attachment 250933 [details] [diff] [review]
checkpoint work on type adaptation

Top of last /tmp/calltable.dump.N file after startup/load-gmail/click-message:

Method call summary:
   method callsites:   1801
 total method calls:  17240
native method calls:  12631 (73.27%)
  monomorphic calls:  12373 (71.77%)
monomorphic natives:   8595 (49.85%)
polymorphic-2 calls:   2057 (11.93%)
polymorphic-3 calls:    838 (4.86%)
polymorphic-4 calls:    618 (3.58%)
polymorphic-5 calls:    161 (0.93%)
polymorphic-6 calls:    120 (0.70%)
polymorphic-7 calls:     30 (0.17%)
polymorphic-8 calls:   1043 (6.05%)
  megamorphic calls:      0 (0.00%)

The majority of calls are native; a majority of callsites are monomorphic; over two-thirds of native callsites are monomorphic. There are zero megamorphic calls with a PIC of size 8, < 10% with size 4.

/be
Comment 11 Brendan Eich [:brendan] 2007-01-09 00:06:43 PST
Generalizing summary to fit problem space.

/be
Comment 12 Brendan Eich [:brendan] 2007-01-09 00:22:42 PST
Suspect a bug due to mode at polymorphic-8 and 0 megamorphic... more tomorrow.

/be
Comment 13 Brendan Eich [:brendan] 2007-01-09 15:58:19 PST
Created attachment 250999 [details] [diff] [review]
checkpoint work on type adaptation, v2

Latest results (I've seen runs with non-zero megamorphic counts, so forget about the bug worry):

Method call summary:
        method callsites:   2742
      total method calls:  33663
   computed method calls:   3162
prototype chain searches:  31969
   prototype chain steps:  25279
    scope chain searches:    252
       scope chain steps:     59
  anonymous method calls:   2660
     native method calls:  25537 (75.86%)
     monomorphic natives:  15629 (46.43%)
       monomorphic calls:  23092 (68.60%)
     polymorphic-2 calls:   2127 (6.32%)
     polymorphic-3 calls:   1804 (5.36%)
     polymorphic-4 calls:   1270 (3.77%)
     polymorphic-5 calls:    494 (1.47%)
     polymorphic-6 calls:    414 (1.23%)
     polymorphic-7 calls:    615 (1.83%)
     polymorphic-8 calls:    685 (2.03%)
       megamorphic calls:      0 (0.00%)

This was startup/gmail/message/click-on-bugzilla-link/quit-closing-two-tabs.

Considering the low number of call sites, we could have a 4-entry PIC and use less than 175K.  Each entry would be either a method (function object) pointer, or an sprop (as with the old direct-mapped runtime-wide property cache that I recently removed).  It should be possible to split cases further to optimize for stub vs. non-stub getter and setter.

Since almost 80% of method calls search up one step in the prototype chain, would like to optimize to avoid looking in the directly-referenced receiver object when the cache says we're hitting one object "up" the prototype chain.  Notice how the scope chain searches hit on the head object more often (>75%, and such function or constructor calls are small potatoes compared to method calls).

/be
Comment 14 Brendan Eich [:brendan] 2007-01-09 16:20:36 PST
I added function bytecode length dumping (at the end of the header line for each function), and sure enough:

$ grep ' calls ' /tmp/calltable.dump.3 | sort -n +2 -3 | tail -20
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:8085 calls 88 (4) argc 1/1 length 94
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 101 (0) argc 0/0 length 24
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 103 (0) argc 0/0 length 104
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:668 calls 103 (0) argc 0/0 length 54
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:2 calls 106 (60) argc 1/1 length 149
chrome://global/content/globalOverlay.js:113 calls 112 (92) argc 1/1 length 92
chrome://global/content/globalOverlay.js:132 calls 112 (92) argc 2/2 length 79
chrome://global/content/globalOverlay.js:85 calls 112 (92) argc 1/1 length 238
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:909 calls 116 (0) argc 1/1 length 65
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:25 calls 122 (101) argc 1/1 length 36
http://mail.google.com/mail/?view=page&name=js&ver=1u6fu1tdvx6hk:973 calls 133 (0) argc 0/0 length 5
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:27 calls 138 (124) argc 1/1 length 11
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:6929 calls 159 (236) argc 2/2 length 83
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:6949 calls 159 (109) argc 2/2 length 101
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsSafebrowsingApplication.js:1867 calls 162 (122) argc 1/1 length 10
http://mail.google.com/mail/?view=page&name=m_base&ver=aycm2mrckk0u:25 calls 164 (128) argc 1/1 length 13
chrome://browser/content/bookmarks/bookmarksMenu.js:1013 calls 181 (281) argc 4/4 length 101
file:///Users/brendaneich/Hacking/trunk/mozilla/dist/MinefieldDebug.app/Contents/MacOS/components/nsExtensionManager.js:7878 calls 188 (161) argc 3/3 length 104
chrome://browser/content/browser.js:3371 calls 191 (0) argc 0/0 length 60
chrome://browser/content/bookmarks/bookmarksMenu.js:1040 calls 214 (113) argc 2/2 length 53

The 20 most frequently called functions tend to be small.  The top 10 might even inline into the PIC space if these functions are called from monomorphic sites. Have to work on the dump format and stats a bit more....

/be
Comment 15 Brendan Eich [:brendan] 2007-12-13 18:01:59 PST
Hope to make up for loss of property cache (bug 128150 by way of bug 408144), and then some.

/be
Comment 16 Brendan Eich [:brendan] 2007-12-13 18:03:09 PST
Created attachment 293055 [details] [diff] [review]
brute-force (not for production use) per-thread PIC, to measure calls
Comment 17 Brendan Eich [:brendan] 2007-12-13 18:13:05 PST
Created attachment 293057 [details]
sample output from patch amid gmail and bugzilla in tabs

Almost all callsites (all native method callsites) are monomorphic:

callsites 2047 mean polymorphism 1.03566 sigma 0.267493

hits           1765843
misses          126516
overloaded           0
megamorphic       3182
erecycles       126808
trecycles       129787
esweeps            292
tsweeps            311

No need for poly- when mono- will do, and we aren't inlining for 1.9 (tracing after 1.9 for scripted functions, and natives can't be inlined), so I'm focusing on caller->callee fast path using a single-entry cache per callsite.

Also want to avoid memoizing function values in the property tree. Finally, want to handle getters and setters but may split that out into a followup bug.

/be
Comment 18 Brendan Eich [:brendan] 2007-12-13 19:12:08 PST
From the patch, megamorphic here means > 4 callees for the cached callsite -- collisions in the two-way associative direct cache will overwrite some entries, possibly undercounting these; see |erecycles - esweeps|. Still highly likely to be one callee per callsite in our code, only a bit less likely in gmail code.

/be
Comment 19 Brendan Eich [:brendan] 2007-12-14 13:24:50 PST
Created attachment 293190 [details] [diff] [review]
remove DUMP_CALL_TABLE

Preliminary patch to clear out this old code, which I do not want to maintain or leave bloating the source and rusting. Better approach is the jspic.[ch] approach, which could be improved to have fewer collisions, but which has the virtue of being out-of-line and less intrusive. Could even use (possibly beefed up) debugger call hooks for zero source code touch plugability.

I'd like to get this in today, ASAP. Thanks,

/be
Comment 20 Brendan Eich [:brendan] 2007-12-14 13:33:48 PST
Comment on attachment 293190 [details] [diff] [review]
remove DUMP_CALL_TABLE

a=me, not part of build.

/be
Comment 21 Brendan Eich [:brendan] 2007-12-14 13:36:19 PST
DUMP_CALL_TABLE removed:

js/src/jsgc.c 3.257
js/src/jsinterp.c 3.394
js/src/jsinterp.h 3.68
js/src/jsobj.c 3.407
js/src/jsstr.c 3.179

/be
Comment 22 Brendan Eich [:brendan] 2008-01-07 20:19:42 PST
Created attachment 295891 [details] [diff] [review]
work in progress, backup mostly
Comment 23 Brendan Eich [:brendan] 2008-01-12 23:57:25 PST
Created attachment 296794 [details] [diff] [review]
wip backup 2, getting closer

Passes js testsuite (I exclude spidermonkey-n.tests and slow-n.tests) -- bc, cayou try it out some time? More work to do, missing capability revocation on scopechain shadowing, e.g.

/be
Comment 24 Bob Clary [:bc:] 2008-01-13 08:50:06 PST
(In reply to comment #23)

> 
>  bc, > can you try it out some time? 

sure, I normally don't run with slow-n.tests either. Should I? I'll make a stab a bit later with browser|shell. If you have a wip 3 patch in the works, let me know so I can test that instead.

Comment 25 Bob Clary [:bc:] 2008-01-14 07:21:29 PST
(In reply to comment #23)

Lots of Assertion failure: PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v), at jsinterp.c:128

just one example of many: ecma/Types/8.6.2.1-1.js shell

#0  JS_Assert (s=0x81365c4 "PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v)", file=0x8136568 "jsinterp.c", ln=128) at jsutil.c:63
#1  0x0809b7fa in js_FillPropertyCache (cache=0x9bf8a70, pc=0x9c0a75b "6", scopeIndex=0, obj=0xb7f4bae0, protoIndex=0, sprop=0x9c13910, entry=0x9bfb3c8) at jsinterp.c:128
#2  0x080c9222 in js_SetPropertyHelper (cx=0x9c00e08, obj=0xb7f4bae0, id=-1208674788, vp=0xbf92ec60, entry=0x9bfb3c8) at jsobj.c:3775
#3  0x080ae484 in js_Interpret (cx=0x9c00e08, pc=0x9c0a75b "6", result=0xbf92ef88) at jsinterp.c:3815
#4  0x0809d4d5 in js_Invoke (cx=0x9c00e08, argc=4, vp=0x9c05880, flags=1) at jsinterp.c:1178
#5  0x0809e984 in js_InvokeConstructor (cx=0x9c00e08, vp=0x9c05880, argc=4) at jsinterp.c:1748
#6  0x080a8ef3 in js_Interpret (cx=0x9c00e08, pc=0x9c1239b "#", result=0xbf92f6b4) at jsinterp.c:3402
#7  0x0809dcba in js_Execute (cx=0x9c00e08, chain=0xb7f50000, script=0x9c122c8, down=0x0, flags=0, result=0xbf92f75c) at jsinterp.c:1403
#8  0x08061aa8 in JS_ExecuteScript (cx=0x9c00e08, obj=0xb7f50000, script=0x9c122c8, rval=0xbf92f75c) at jsapi.c:4774
#9  0x0804a270 in Load (cx=0x9c00e08, obj=0xb7f50000, argc=1, argv=0x9c0585c, rval=0xbf92f7f0) at js.c:626
#10 0x0809d449 in js_Invoke (cx=0x9c00e08, argc=1, vp=0x9c05854, flags=0) at jsinterp.c:1162
#11 0x080afc29 in js_Interpret (cx=0x9c00e08, pc=0x9c0df9e ":", result=0xbf92fe94) at jsinterp.c:4084
#12 0x0809dcba in js_Execute (cx=0x9c00e08, chain=0xb7f50000, script=0x9c0df60, down=0x0, flags=0, result=0xbf930f68) at jsinterp.c:1403
#13 0x08061aa8 in JS_ExecuteScript (cx=0x9c00e08, obj=0xb7f50000, script=0x9c0df60, rval=0xbf930f68) at jsapi.c:4774
#14 0x0804980d in Process (cx=0x9c00e08, obj=0xb7f50000, filename=0x0, forceTTY=0) at js.c:306
#15 0x08049f08 in ProcessArgs (cx=0x9c00e08, obj=0xb7f50000, argv=0xbf9310f8, argc=0) at js.c:532
#16 0x0804f216 in main (argc=0, argv=0xbf9310f8, envp=0xbf9310fc) at js.c:3561


or in the browser during startup: Assertion failure: PCVAL_IS_NULL(v) || PCVAL_IS_SPROP(v), at /work/mozilla/builds/1.9.0-test/mozilla/js/src/jsinterp.c:128

wip'it up again, maestro!
Comment 26 Brendan Eich [:brendan] 2008-01-14 10:02:29 PST
Silly bogus assertion -- pay it no mind ;-). New patch soon, opt testing of that patch instead of dbg should turn out ok (pls. holler if not).

/be
Comment 27 Bob Clary [:bc:] 2008-01-14 10:37:27 PST
Plenty of crashes (signal 11's) on linux (centos5) and maci386 in the browser version of the test but with one crash which was not repeatable, I could not reproduce it locally on fedora7 or on maci386. Not sure I trust that. I'll test the new patch when its available.
Comment 28 Robert Sayre 2008-01-15 18:10:59 PST
Created attachment 297291 [details]
GMail crash stack
Comment 29 Brendan Eich [:brendan] 2008-01-24 01:18:31 PST
Created attachment 298896 [details] [diff] [review]
Daddy brings home the bacon!

Seems stable -- sayrer, go nuts.

Needs to be cleaned up and be split into blocking bugs, for less mega-fix-mashup landing/auditing pain.

Narrative about what this is soon, although Ecma TC39 meeting will take time the next two days.

/be
Comment 30 Brendan Eich [:brendan] 2008-01-24 01:34:41 PST
Created attachment 298900 [details] [diff] [review]
diff -w version of last attachment
Comment 31 Mike Schroepfer 2008-01-24 08:48:17 PST
bc any chance you can run through the regression suite (if you haven't already)
Comment 32 Brendan Eich [:brendan] 2008-01-24 09:08:33 PST
I had been running the suite until I moved to Firefox to battle with gmail (which does all sorts of fun prototype chain foreshadowing); running suite now, seeing fresh problems. Bob, no need to attach results here, I should have a new patch at some point today.

/be
Comment 33 Bob Clary [:bc:] 2008-01-24 10:23:59 PST
(In reply to comment #32)

k. 
Comment 34 Brendan Eich [:brendan] 2008-01-27 01:30:42 PST
Created attachment 299534 [details] [diff] [review]
Makin' bacon on the beach...

This passes the suite with the same test failures (failure details differ in one case due to assertion differences here and on the trunk). Bob, please confirm.

Sayrer, I'm hoping you can take this for a spin.

/be
Comment 35 Brendan Eich [:brendan] 2008-01-27 01:31:46 PST
Created attachment 299535 [details] [diff] [review]
diff -w version of last attachment
Comment 36 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-01-27 08:36:41 PST
I'm running the shell test suite (e4x, ecma*, js1_[1-8]; excluding slow-n.tests) under valgrind on sm-valgrind01 now, it'll take a couple of hours. (Restarting the shell and reloading the drivers for every test takes its toll!)

Note to self or sayrer:
  cd tests/v-logs-bacon; ../../summarize-valgrind-logs.sh
  summary-errors-only.txt is the compact output (test nerrors ncontext)
  summary-unified.txt is the full list for sanity checking

Have a sunspider-under-valgrind run going now as well.
Comment 37 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-01-27 09:06:15 PST
Sunspider is valgrind-clean on 64-bit Linux.
Comment 38 Brendan Eich [:brendan] 2008-01-28 00:18:31 PST
Created attachment 299713 [details] [diff] [review]
Updated bacon to latest trunk state
Comment 39 Brendan Eich [:brendan] 2008-01-28 00:20:50 PST
Created attachment 299714 [details] [diff] [review]
diff -w version of last attachment
Comment 40 Bob Clary [:bc:] 2008-01-28 07:44:22 PST
(In reply to comment #34)
> Created an attachment (id=299534) [details]

> This passes the suite with the same test failures (failure details differ in
> one case due to assertion differences here and on the trunk). Bob, please
> confirm.

Tested in shell on linux. I didn't seen any difference with this patch in either opt or debug builds. I'll try attachment 299713 [details] [diff] [review]) on other platforms later today.

Comment 41 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-01-28 08:20:09 PST
Took 10 hours, but valgrind says you're clean on e4x+ecma*+js1_*.
Comment 42 Brendan Eich [:brendan] 2008-01-28 23:36:01 PST
Created attachment 299958 [details] [diff] [review]
Updated bacon to latest trunk state
Comment 43 Brendan Eich [:brendan] 2008-01-28 23:36:49 PST
Created attachment 299959 [details] [diff] [review]
diff -w version of last attachment
Comment 44 Brendan Eich [:brendan] 2008-01-29 18:43:48 PST
Created attachment 300229 [details] [diff] [review]
update to track trunk changes, plus one crucial fix

Fix was this, in jsscope.c:

-#define SPROP_FLAGS_NOT_MATCHED SPROP_MARK
+#define SPROP_FLAGS_NOT_MATCHED (SPROP_MARK | SPROP_FLAG_PCTYPE_REGEN)

/be
Comment 45 Mike Beltzner [:beltzner, not reading bugmail] 2008-01-29 18:53:00 PST
Does this need review?
Comment 46 Brendan Eich [:brendan] 2008-01-29 21:27:21 PST
Beltzner, we're getting there. This is going to go in tomorrow, I hope -- special dispensation. The prior patches are getting in (last one I want to split out is bug 414452).

/be
Comment 47 Mike Beltzner [:beltzner, not reading bugmail] 2008-01-29 22:46:42 PST
Sorry - it was an honest question, as no review flag was set, so I didn't know if it was pending, etc. As you were!
Comment 48 Brendan Eich [:brendan] 2008-01-31 03:37:51 PST
Created attachment 300624 [details] [diff] [review]
better bacon

Still have to deal with some issues, tomorrow.

/be
Comment 49 Brendan Eich [:brendan] 2008-01-31 21:53:38 PST
Created attachment 300827 [details] [diff] [review]
better bacon, passing mochitests

I need to rest and write some comments tomorrow. Shaver has heard some of the pitch, and read some whiteboard scribblings. His suggestions for where to write a sentence or an essay are welcome.

/be
Comment 50 Brendan Eich [:brendan] 2008-02-01 12:34:55 PST
Created attachment 300911 [details] [diff] [review]
without debugging glorp
Comment 51 Brendan Eich [:brendan] 2008-02-03 21:48:19 PST
Created attachment 301231 [details] [diff] [review]
lightly polished for review

I'm too sick to write good comments where needed, but shaver will guide me.

/be
Comment 52 Brendan Eich [:brendan] 2008-02-03 23:38:41 PST
Created attachment 301235 [details] [diff] [review]
heavier polish, plus completeness fixes
Comment 53 Mike Beltzner [:beltzner, not reading bugmail] 2008-02-04 13:13:24 PST
Pushing this out to block beta 4 ...
Comment 54 Mike Schroepfer 2008-02-04 13:14:33 PST
Don't let the TM deter you - we want this in on the trunk as fast as humanly possible :-)  

Comment 55 Brendan Eich [:brendan] 2008-02-05 00:54:00 PST
Created attachment 301457 [details] [diff] [review]
latest patch

Some other checkin regressed gmail. My last pull was 1/31. Grumble. Rebuilding and testing, want to attach this before retiring. More tomorrow.

/be
Comment 56 Brendan Eich [:brendan] 2008-02-05 15:31:57 PST
Updating cured whatever ailed gmail. Not this patch, which seems to be working well. Shaver, you reviewing?

/be
Comment 57 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-05 15:51:18 PST
Yeah, I'm a little under halfway through, hope to get done tonight.  So far my list is just a couple of places that want more expansive comments, and a wish that you hadn't rolled as many clean-up fixes into such a patch. :)
Comment 58 Brendan Eich [:brendan] 2008-02-05 16:21:32 PST
(In reply to comment #57)
> Yeah, I'm a little under halfway through, hope to get done tonight.  So far my
> list is just a couple of places that want more expansive comments,

Great -- lay them on me sooner rather than later.

> and a wish
> that you hadn't rolled as many clean-up fixes into such a patch. :)

My weakness, but I think you protest too much: the only such clean-up left (I split real work into blocking bugs) is the JOF_TYPE/JOF_OPTYPE macrology, and that is easy on the eyes.

I'm a bit concerned that I'll ding Ts/Txul/Tp somehow (e.g., top-level scripts are manually managed, not GC-managed, so js_FlushPropertyCacheForScript may rear its ugly head). I'd like to check into a quiet tree for a test cycle or three. Suggestions?

/be
Comment 59 Mike Schroepfer 2008-02-05 16:28:53 PST
> I'm a bit concerned that I'll ding Ts/Txul/Tp somehow (e.g., top-level scripts
> are manually managed, not GC-managed, so js_FlushPropertyCacheForScript may
> rear its ugly head). I'd like to check into a quiet tree for a test cycle or
> three. Suggestions?
> 

Let's close the tree for this.  Just need to know when and we'll do it.
Comment 60 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-05 16:53:41 PST
I was thinking of the OBJ_SET_SLOT -> STOBJ_SET_SLOT changes as cleanup -- if there is other significance to them I haven't figured it out yet, but I am still not to the exciting conclusion!

The comment preceding JSRuntime.pcTypeGen says the GC prevents wraparound, but by my eyes it's js_GeneratePCType that does so, and the GC merely assists by compressing the space.  In the js_GeneratePCType code, we don't handle the case in which GC can't free up a pctype, other than by asserting in DEBUG builds.  If that case is safe for non-DEBUG, it'd be good to have a comment there expressing how; likewise in js_TraceObject and js_SweepScopeProperties.  (Is the code in js_GC re-enabling the property cache only if the overflow bit is unset sufficient?  I'm pondering the other cases in which we increment rt->pcTypeGen with only an assertion as defense.)

js_EmitTree's TOK_DOT/TOK_NAME sharing via fall-through is pretty convoluted now, and I wonder if it's worth it in order to avoid duplication of the JSOP_DUP and EMIT_INDEX_OP (though without the ternary operator selection, were it split out).

+#define PCVCAP_SCOPEMASK_LO     JS_BITMASK(PCVCAP_SCOPEBITS)
+#define PCVCAP_SCOPEMASK_HI     (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)

I think HI wants to shift by SCOPEBITS, if that's what's defining the width of LO.  (Saved by the numeric coincidence!)

Have to finish making dinner now, back after to untangle the horrible diff produced in js_Interpret under the out: label!
Comment 61 Brendan Eich [:brendan] 2008-02-05 17:22:54 PST
(In reply to comment #60)
> I was thinking of the OBJ_SET_SLOT -> STOBJ_SET_SLOT changes as cleanup -- if
> there is other significance to them I haven't figured it out yet, but I am
> still not to the exciting conclusion!

They are totally significant (I was a bit over-eager before the most recent patches that fixed this).

The layering is such that all native object arbitrary slot setting must funnel through LOCKED_OBJ_WRITE_BARRIER/GC_WRITE_BARRIER, to re-brand the scope when a branded scope's object sees function-valued properties (aka "methods") change value (old one overwritten, new one replacing a non-function value).

To make a long story short, most slot-setting was properly monitored, but I had forgotten to fix js_SetSlotThreadSafe to use the right macros. Recall that the OBJ_SET_SLOT macro calls js_SetSlotThreadSafe if the object is not native or if its scope->ownercx != cx. Yes, all my JS_THREADSAFE tests involved request-locked, contention-free object usage.

Fixing this left assertions when bad old pre-fslots/dslots code used OBJ_SET_SLOT on the well-known fslots, especially JSSLOT_PRIVATE but also JSSLOT_PROTO and JSSLOT_PARENT.

> The comment preceding JSRuntime.pcTypeGen says the GC prevents wraparound, but
> by my eyes it's js_GeneratePCType that does so, and the GC merely assists by
> compressing the space.

Good point, I will re-word.

> In the js_GeneratePCType code, we don't handle the case
> in which GC can't free up a pctype,

Nit: it's not a matter of freeing anything -- the GC renumbers the live world. But I know what you mean: In case there are 2^28 distinct branded structural types and prefixes of those types.

> other than by asserting in DEBUG builds.

I do not expect this to happen in practice, but I would like to know. Am I abusing an assertion? Could do a printf. Anyway, overflow is not a problem, because in this unlikely event (2^28 distinct pctypes among the property tree nodes and live scopes) the GC leaves the property cache disabled, until the regenerated type count reduces to not overflow.

Oops, it disables only the current cx->thread's cache! Will fix in revised patch along with other stuff you point out. Thanks.
 
> If that case is safe for non-DEBUG, it'd be good to have a comment there
> expressing how;

Will be safe once all caches are disabled ;-).

> likewise in js_TraceObject and js_SweepScopeProperties.  (Is
> the code in js_GC re-enabling the property cache only if the overflow bit is
> unset sufficient?  I'm pondering the other cases in which we increment
> rt->pcTypeGen with only an assertion as defense.)

It has to increment, or we get aliasing. Are you worried about it wrapping in 32 bits before the GC notices? I don't think that's possible, at least not on 32-bit systems -- too many sprops and scopes.

> js_EmitTree's TOK_DOT/TOK_NAME sharing via fall-through is pretty convoluted
> now, and I wonder if it's worth it in order to avoid duplication of the
> JSOP_DUP and EMIT_INDEX_OP (though without the ternary operator selection, were
> it split out).

Will give it a try.

> +#define PCVCAP_SCOPEMASK_LO     JS_BITMASK(PCVCAP_SCOPEBITS)
> +#define PCVCAP_SCOPEMASK_HI     (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
> 
> I think HI wants to shift by SCOPEBITS, if that's what's defining the width of
> LO.  (Saved by the numeric coincidence!)

No, PROTOBITS. Low-order nybble of vcap is protobits, high-order nybble is scope. This is the typical bitfield defining idiom where a non-right-justified field's mask is left-shifted by the width of the right-justified field after it.

> Have to finish making dinner now, back after to untangle the horrible diff
> produced in js_Interpret under the out: label!

Sorry, diff -w is clean there. Check out earlier -w attachments here. You caught me, that was purely an indentation fix (Igor got off by one due to a major comment ;-).

/be
Comment 62 Brendan Eich [:brendan] 2008-02-05 17:27:49 PST
Tired, sick: for 2^28 above read 2^24. Still way more types than any 32-bit embedding is likely to see. Will ponder 64-bit scaling.

/be
Comment 63 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-05 19:16:17 PST
(In reply to comment #61)
> > In the js_GeneratePCType code, we don't handle the case
> > in which GC can't free up a pctype,
> 
> Nit: it's not a matter of freeing anything -- the GC renumbers the live world.
> But I know what you mean: In case there are 2^28 distinct branded structural
> types and prefixes of those types.

Yeah, I meant "make available a no-longer-live pctype".

> Oops, it disables only the current cx->thread's cache! Will fix in revised
> patch along with other stuff you point out. Thanks.

I'd like to pretend that this is the case I was eventually going to come to, if all readers here would indulge my delusion.

> It has to increment, or we get aliasing. Are you worried about it wrapping in
> 32 bits before the GC notices? I don't think that's possible, at least not on
> 32-bit systems -- too many sprops and scopes.

Yeah, that was the case I was worried about, possibly on 64-bit systems.  Agree that it's far-fetched.

> > +#define PCVCAP_SCOPEMASK_LO     JS_BITMASK(PCVCAP_SCOPEBITS)
> > +#define PCVCAP_SCOPEMASK_HI     (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
> > 
> > I think HI wants to shift by SCOPEBITS, if that's what's defining the width of
> > LO.  (Saved by the numeric coincidence!)
> 
> No, PROTOBITS. Low-order nybble of vcap is protobits, high-order nybble is
> scope. This is the typical bitfield defining idiom where a non-right-justified
> field's mask is left-shifted by the width of the right-justified field after
> it.

If the low-order nybble is PROTOBITS, then why is _LO defined in terms of SCOPEBITS?  That's the inconsistency I was pointing out: we shift according to the width of PROTOBITS, but define the right-justified field in terms of SCOPEBITS, so if we were to increase SCOPEBITS by one, _LO would overlap _HI, no?

> Sorry, diff -w is clean there. Check out earlier -w attachments here. You
> caught me, that was purely an indentation fix (Igor got off by one due to a
> major comment ;-).

I'm out of steam here, been running on Dayquil all day.  I'll resume in the morning; bacon makes a fine breakfast, and the -w patch will keep it from being too gloomy a start. :)
Comment 64 Brendan Eich [:brendan] 2008-02-05 21:35:49 PST
(In reply to comment #63)
> Yeah, that was the case I was worried about, possibly on 64-bit systems.  Agree
> that it's far-fetched.

I'll add more defense.

> > > +#define PCVCAP_SCOPEMASK_LO     JS_BITMASK(PCVCAP_SCOPEBITS)
> > > +#define PCVCAP_SCOPEMASK_HI     (PCVCAP_SCOPEMASK_LO << PCVCAP_PROTOBITS)
[. . .]
> If the low-order nybble is PROTOBITS, then why is _LO defined in terms of
> SCOPEBITS?  That's the inconsistency I was pointing out: we shift according to
> the width of PROTOBITS, but define the right-justified field in terms of
> SCOPEBITS, so if we were to increase SCOPEBITS by one, _LO would overlap _HI,
> no?

No, _LO is the right-justified version of _HI -- you can see this because _HI = (_LO << PROTOBITS). Perhaps the names could be better? I thought about using just SCOPEMASK for the shifted one, but that name is ambiguous.

/be

Comment 65 Brendan Eich [:brendan] 2008-02-05 23:56:02 PST
Created attachment 301636 [details] [diff] [review]
with shaver's comments so far addressed

Still pondering 64-bit scaling.

/be
Comment 66 Brendan Eich [:brendan] 2008-02-06 09:11:27 PST
Tail of /tmp/propcache.stats after big gmail/bugzilla/sunspider session:

Property cache stats for thread 17854784, GC 555
      fills    2893315
    nofills      87320
    rofills          0
   disfills     183878
   oddfills         11
   modfills    2264207
 brandfills       9380
 longchains          0
   recycles    2659730
 pcrecycles     105492
      tests   58421604
     pchits   54454495
protopchits    3997099
   settests    7871408
  addpchits     692338
  setpchits    6906280
setpcmisses      63954
  setmisses     269528
   idmisses    2699781
   komisses     117232
   vcmisses     344551
     misses    3161564
    flushes        549
hit rates: pc 93.2095% (proto 6.84182%), set 96.5344%, full 94.5884%

Sharing a single direct mapped cache between the two levels of keys (kpc,ktype) and the slower (atom,kobj) seems to be working well -- no sign of thrashing an entry from first to second level (unrelated) key.

I'm surprised setmisses is not higher, since I left the hazard

                                slot < STOBJ_NSLOTS(obj)) {

instead of calling js_AllocSlot in the "addpchit" fast path. But JSOP_INITPROP is not participating here, and setpchits (existing props hit by the first level cache test) are nice and high. Add JSOP_INITPROP to the fun is something for another bug.

/be
Comment 67 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-06 13:04:00 PST
- why do we NULL out entry in JSOP_CALLPROP's non-native cache-miss case?  It doesn't look like we use it again afterward.

- in PROPERTY_CACHE_TEST, JS_ASSERT(&obj != &pobj) in service of the just-so restrictions?

- {LOCKED_,ST}OBJ_SET_SLOT, assertions against the function-to-non, non-to-function sets that hackers must note well?

I've finished mentally executing the paths for non-native getters/setters, and I think I'm ready to stamp this bad boy.  Have to take care of some real work, and then I'm over to arrays tomorrow for JSFastLockable refactoring or whatever, I very much hope!
Comment 68 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-06 13:14:42 PST
Comment on attachment 301636 [details] [diff] [review]
with shaver's comments so far addressed

Oh yeah, I had to read this comment a couple of times to figure out where the list of initial conditions ended and the list of subsequent actions began.  Might be worth rewording for clarity, though it's not critical.

+     * At this point we are inevitably leaving an interpreted function or a
+     * top-level script, returning to an inline call in the same js_Interpret,
+     * an "out of line" call made through js_Invoke, a js_Execute activation,
+     * a generator (SendToGenerator, jsiter.c), or else js_FilterXMLList.

r=shaver, with gusto.
Comment 69 Brendan Eich [:brendan] 2008-02-06 19:20:50 PST
Created attachment 301809 [details] [diff] [review]
final patch for committing

Renamed pctype => shape, etc., based on Andreas Gal's amazing parallel development of same idea.

A final once over would be good. I didn't burden DEBUG builds with extra checking under {LOCKED,ST}OBJ_SET_SLOT, since that would be pretty costly and it would require splitting out yet another sub-layer (LOCKED_OBJ_WRITE_BARRIER uses LOCKED_OBJ_SET_SLOT uses STOBJ_SET_SLOT). I'd rather do something to reduce the number of *OBJ_SET_SLOT macros, in a followup bug.

/be
Comment 70 Brendan Eich [:brendan] 2008-02-06 20:33:13 PST
Filed bug 416058 on *OBJ_SET_SLOT elimination.

/be
Comment 71 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-07 06:33:17 PST
Comment on attachment 301809 [details] [diff] [review]
final patch for committing

I want to understand why you would do this mass renaming before checking in the long-sought-after patch, and mix other (albeit minor) changes into it for my review.  I thought we had trust.

Nonetheless, I have inspected the bugzilla interdiff, and I again grant my r+.
Comment 72 Brendan Eich [:brendan] 2008-02-07 12:35:28 PST
Shaver, "trust" would have been broken if I had checked in without re-requesting review. If you would rather I do the renaming and add a few more meters in a separate bug, just say so. I figured you could quickly confirm that nothing else changed, approve it without biting back and we could get on with our lives. But I am chastened....

/be
Comment 73 Mike Shaver (:shaver -- probably not reading bugmail closely) 2008-02-07 12:44:44 PST
Ah, me, I was poking fun.  I am glad that we will be able to celebrate bacon together on the trunk of our software soon; my toasting glass stands ready.
Comment 74 Brendan Eich [:brendan] 2008-02-07 15:20:17 PST
I checked in:

js/src/js.c 3.192
js/src/jsapi.c 3.404
js/src/jsapi.h 3.182
js/src/jsbool.c 3.36
js/src/jscntxt.h 3.193
js/src/jsemit.c 3.301
js/src/jsexn.c 3.95
js/src/jsfun.c 3.258
js/src/jsgc.c 3.275
js/src/jsgc.h 3.103
js/src/jsinterp.c 3.417
js/src/jsinterp.h 3.70
js/src/jsiter.c 3.86
js/src/jslock.c 3.75
js/src/jslock.h 3.37
js/src/jsnum.c 3.105
js/src/jsobj.c 3.424
js/src/jsobj.h 3.83
js/src/jsopcode.c 3.287
js/src/jsopcode.h 3.58
js/src/jsopcode.tbl 3.106
js/src/jsprvtd.h 3.38
js/src/jspubtd.h 3.98
js/src/jsscope.c 3.78
js/src/jsscope.h 3.55
js/src/jsscript.c 3.169
js/src/jsscript.h 3.40
js/src/jsstr.c 3.189

Would not be surprised to see a regression, but last I tested, I was mochitest clean. Ts/Txul/Tp again make me wonder about js_FlushPropertyCacheForScript. We shall see....

/be
Comment 75 Dave Townsend [:mossop] 2008-02-08 05:12:02 PST
This seems to have caused a trace malloc leak increase on all platforms, e.g.

http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_leaks&units=bytes&autoscale=1&days=2&avg=1&

Possibly also a Ts regression on Linux and Mac.
Comment 76 Brendan Eich [:brendan] 2008-02-08 13:20:41 PST
(In reply to comment #75)
> This seems to have caused a trace malloc leak increase on all platforms, e.g.
> 
> http://build-graphs.mozilla.org/graph/query.cgi?tbox=bm-xserve11.build.mozilla.org&testname=trace_malloc_leaks&units=bytes&autoscale=1&days=2&avg=1&

That's an existing leak that grew with this bug's patch. See bug 399112.

> Possibly also a Ts regression on Linux and Mac.

Will be addressed by bug 408871.

/be

Note You need to log in before you can comment on or make changes to this bug.