Last Comment Bug 630456 - Convert objects to dictionary mode less aggressively
: Convert objects to dictionary mode less aggressively
Status: RESOLVED FIXED
fixed-in-tracemonkey
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 1 vote (vote)
: ---
Assigned To: Nicholas Nethercote [:njn]
:
Mentors:
Depends on:
Blocks: 598466
  Show dependency treegraph
 
Reported: 2011-01-31 21:39 PST by Nicholas Nethercote [:njn]
Modified: 2011-02-11 11:49 PST (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
instrumentation patch (1.36 KB, patch)
2011-02-01 15:51 PST, Nicholas Nethercote [:njn]
no flags Details | Diff | Review
quick-and-dirty SHAPE_METER patch (3.33 KB, patch)
2011-02-01 17:08 PST, Brendan Eich [:brendan]
no flags Details | Diff | Review
histogram instrumentation patch (1.63 KB, patch)
2011-02-03 16:08 PST, Nicholas Nethercote [:njn]
no flags Details | Diff | Review
patch (685 bytes, patch)
2011-02-03 16:09 PST, Nicholas Nethercote [:njn]
brendan: review+
bzbarsky: approval2.0+
Details | Diff | Review
new SHAPE_METER patch handles indexed and array props (5.13 KB, patch)
2011-02-07 19:20 PST, Brendan Eich [:brendan]
no flags Details | Diff | Review

Description Nicholas Nethercote [:njn] 2011-01-31 21:39:21 PST
MAX_HEIGHT is currently 64.  In some very preliminary measurements I increased it to 128 with negligible change in speed, but a ~15% reduction in the number of Shapes allocated, which translates to a few MB in moderately large workloads.

I plan to do some more rigorous measurements.
Comment 1 Nicholas Nethercote [:njn] 2011-01-31 22:00:37 PST
A suggestion from Shaver:

Could you instrument to print out script and lineno when our height goes from 64 to 65? would be interesting to know what code patterns cause it in normal browsing.
Comment 2 Brendan Eich [:brendan] 2011-01-31 22:34:12 PST
Even better, let's histogram to count the population of objects in various obj->propertyCount() bins. See jsutil.h's JSBasicStats. I should have added this ifdef DEBUG and conditioned by a one-time-getenv, a la other such instrumentation.

/be
Comment 3 Nicholas Nethercote [:njn] 2011-02-01 15:10:42 PST
I started by measuring the number of Shapes allocated and recycled for SS and V8 with some different values of MAX_HEIGHT:

- SS, MAX_HEIGHT=32:  21723 new, 0 recycled
- SS, MAX_HEIGHT=64:  22122 new, 0 recycled
- SS, MAX_HEIGHT=128: 20688 new, 0 recycled

- V8, MAX_HEIGHT=32:  6032 new, 0 recycled
- V8, MAX_HEIGHT=64:  6132 new, 0 recycled
- V8, MAX_HEIGHT=128: 5934 new, 0 recycled

First point: shape recycling isn't working.  What's up with that?

Second point: MAX_HEIGHT=64 results in more Shapes that 32 or 128 for both SS and V8.  I don't claim to understand that.  I thought the idea was that when converting to dictionary mode a bunch of extra Shapes are made, so the number of Shapes allocated should increase monotonically as MAX_HEIGHT decreases.


> Even better, let's histogram to count the population of objects in various
> obj->propertyCount() bins.

Sounds good, but I don't know where/when to count the populations.  If JSObject had a destructor that would be the obvious place;  is there something similar that indicates when an object's lifetime ends?
Comment 4 Brendan Eich [:brendan] 2011-02-01 15:45:50 PST
(In reply to comment #3)
> I started by measuring the number of Shapes allocated and recycled for SS and
> V8 with some different values of MAX_HEIGHT:
> 
> - SS, MAX_HEIGHT=32:  21723 new, 0 recycled
> - SS, MAX_HEIGHT=64:  22122 new, 0 recycled
> - SS, MAX_HEIGHT=128: 20688 new, 0 recycled
> 
> - V8, MAX_HEIGHT=32:  6032 new, 0 recycled
> - V8, MAX_HEIGHT=64:  6132 new, 0 recycled
> - V8, MAX_HEIGHT=128: 5934 new, 0 recycled

Need to know dictionary shapes (see js::Shape::newDictionaryShape) vs. total.

> First point: shape recycling isn't working.  What's up with that?

You have to collect some but not all shapes for the ones that were GC'ed to be on the freelist.

How many GC runs were there? Perhaps the GCs (if any) collected all shapes.

Try a browser session, you'll see shape recycling.

> Second point: MAX_HEIGHT=64 results in more Shapes that 32 or 128 for both SS
> and V8.  I don't claim to understand that.  I thought the idea was that when
> converting to dictionary mode a bunch of extra Shapes are made, so the number
> of Shapes allocated should increase monotonically as MAX_HEIGHT decreases.

A dictionary-mode object owns its shapes and they are mutable, so more dicts makes for fewer shapes than if they're shared in the empty-shape-rooted forest.

But sharing can win too, if there's no in-situ mutation, and good re-use of common shape lineages.

Dynamics!

I'm taking your numbers as gospel here, since the patch must be simple. Want to post it (with the dict-shapes accounted), though, for sanity-check review?

> > Even better, let's histogram to count the population of objects in various
> > obj->propertyCount() bins.
> 
> Sounds good, but I don't know where/when to count the populations.  If JSObject
> had a destructor that would be the obvious place;  is there something similar
> that indicates when an object's lifetime ends?

Sure, look for thing->finalize(cx) calls (one call site, but it's in a template) in jsgc.cpp.

/be
Comment 5 Nicholas Nethercote [:njn] 2011-02-01 15:51:26 PST
Created attachment 508929 [details] [diff] [review]
instrumentation patch

For sanity-check.
Comment 6 Brendan Eich [:brendan] 2011-02-01 16:22:08 PST
That's too simple to go wrong. With a browser, though, you'd want not to have to grep stdout or stderr, right?

/be
Comment 7 Brendan Eich [:brendan] 2011-02-01 17:08:12 PST
$3 = {
  alloc = 161665, 
  oom = 0, 
  recycle = 10514, 
  dprop = 64965, 
  dprop4add = 0
}

is the third js_GC pre-order sample of atomic-incremented meters I just threw together. Some recycling, and dictionary shapes (dprops) account 40% of all shapes.

/be
Comment 8 Brendan Eich [:brendan] 2011-02-01 17:08:57 PST
Created attachment 508958 [details] [diff] [review]
quick-and-dirty SHAPE_METER patch

Could clean this up and add it to one-time-getenv-triggered DEBUG only dumping.

/be
Comment 9 Brendan Eich [:brendan] 2011-02-01 17:10:07 PST
Comment 7 should have said for a browser session with gmail and news.google.com open, and third js_GC bp hit was after when loading news. Odd that so few js_GC calls are hit -- I barely recognize the old jsgc.cpp code any longer. Yay for compartment GC, I think.

/be
Comment 10 Nicholas Nethercote [:njn] 2011-02-01 17:12:27 PST
(In reply to comment #6)
> That's too simple to go wrong. With a browser, though, you'd want not to have
> to grep stdout or stderr, right?

I just pipe stderr into a script I have that counts how many times each unique line was printed.  Works fine with the browser, I do this all the time.
Comment 11 Brendan Eich [:brendan] 2011-02-01 17:15:06 PST
Ok, I like pipes -- good to know. I am generally leery of adding to the sewage in stdout (or is it stderr)? False positive sewage matching, ewww.

/be
Comment 12 Nicholas Nethercote [:njn] 2011-02-01 17:21:42 PST
Opt build don't have sewage problems :)  And my script output makes it easy to sort the sewage from the gold.
Comment 13 Brendan Eich [:brendan] 2011-02-01 18:02:05 PST
I shut down and the final meters were

$14 = {
  alloc = 161665, 
  oom = 0, 
  recycle = 105903, 
  dprop = 116662, 
  dprop4add = 0
}

/be
Comment 14 Nicholas Nethercote [:njn] 2011-02-01 20:21:37 PST
Opening techcrunch.com:

MAX_HEIGHT=32:
286898  new
147671  dprop
 15068  recycle

MAX_HEIGHT=64:
280959  new
115006  dprop
 23308  recycle

MAX_HEIGHT=128:
238852  new
 53095  dprop
 18784  recycle

MAX_HEIGHT=192:
243166  new
 51992  dprop
 20038  recycle

"new" is the most important number, that's how many shapes we allocated.  With MAX_HEIGHT=128 we get a ~15% drop.  If we have 20MB of Shapes (as in http://blog.mozilla.com/nnethercote/2011/02/01/memory-profiling-firefox-on-techcrunch/) that'll give us a ~3MB reduction.  MAX_HEIGHT=192 gave similar results, so 128 seems a better choice as it's a smaller change from the status quo.

"dprop" is a good sanity check -- it drops as MAX_HEIGHT increases, ie. we convert to dictionary mode less often.

"recycle" doesn't change much;  I think it's more a function of how many GCs have happened.

So, MAX_HEIGHT=128 seems a good choice for minimizing memory usage.  What's the speed impact?  For Sunspider and V8 it's negligible:

---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | compiled (may overestimate)  |
---------------------------------------------------------------
|    68.331    68.369 (0.999x) |    44.762    44.761 (------) | 3d-cube
|    40.232    40.233 (------) |    25.189    25.189 (------) | 3d-morph
|    63.756    63.785 (------) |    37.220    37.220 (------) | 3d-raytrace
|    25.283    25.288 (------) |    11.101    11.100 (------) | access-binary-
|    91.814    91.816 (------) |    85.809    85.809 (------) | access-fannkuc
|    29.085    29.082 (------) |    16.514    16.514 (------) | access-nbody
|    36.156    36.144 (------) |    29.335    29.335 (------) | access-nsieve
|     7.373     7.373 (------) |     3.253     3.253 (------) | bitops-3bit-bi
|    36.509    36.509 (------) |    31.742    31.742 (------) | bitops-bits-in
|    15.818    15.818 (------) |    12.018    12.018 (------) | bitops-bitwise
|    38.001    37.986 (------) |    32.968    32.967 (------) | bitops-nsieve-
|    16.960    16.953 (------) |    12.874    12.874 (------) | controlflow-re
|    23.123    23.169 (0.998x) |    11.792    11.792 (------) | crypto-md5
|    19.215    19.245 (0.998x) |     8.495     8.495 (------) | crypto-sha1
|    68.846    68.860 (------) |    21.930    21.930 (------) | date-format-to
|    51.907    51.915 (------) |     9.617     9.617 (------) | date-format-xp
|    41.597    41.590 (------) |    27.329    27.329 (------) | math-cordic
|    22.733    22.745 (0.999x) |     6.305     6.305 (------) | math-partial-s
|    31.510    31.514 (------) |    26.589    26.588 (------) | math-spectral-
|    47.090    47.109 (------) |    34.563    34.563 (------) | regexp-dna
|    25.927    25.940 (------) |     9.252     9.252 (------) | string-base64
|    60.807    60.820 (------) |    32.705    32.705 (------) | string-fasta
|    95.116    95.152 (------) |    17.269    17.269 (------) | string-tagclou
|    98.217    99.021 (0.992x) |    12.626    12.628 (------) | string-unpack-
|    38.620    38.643 (0.999x) |     8.976     8.976 (------) | string-validat
-------
|  1094.037  1095.088 (0.999x) |   570.242   570.242 (------) | all

---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | compiled (may overestimate)  |
---------------------------------------------------------------
|  1207.737  1195.445 (1.010x) |  1035.899  1035.769 (------) | v8-crypto
|  1144.137  1144.135 (------) |   842.543   842.543 (------) | v8-deltablue
|   889.240   889.344 (------) |   448.816   448.817 (------) | v8-earley-boye
|   503.378   503.386 (------) |   237.960   237.960 (------) | v8-raytrace
|   870.299   870.346 (------) |   371.779   371.779 (------) | v8-regexp
|  1143.380  1143.400 (------) |  1115.519  1115.519 (------) | v8-richards
|   430.371   429.986 (1.001x) |   101.115   101.088 (------) | v8-splay
-------
|  6188.544  6176.046 (1.002x) |  4153.634  4153.477 (------) | all

The v8-crypto change is just noise.

Measuring browser changes seems more difficult.
Comment 15 Brendan Eich [:brendan] 2011-02-01 21:00:54 PST
Changing MAX_HEIGHT should not affect performance much. It's a "reasonable" limit on growth of the shape tree, the lexicographic tree of all properties (the alphabet is almost everything identifying that property), since that tree is sparse and has the potential for very long linear subtrees. So we need to clamp down on the worst case, but not defeat the win of sharing shapes.

So 128 seems just fine. Browser numbers would be good. And histogrammed obj->propertyCount() bins counting populations at finalization would be great.

/be
Comment 16 Nicholas Nethercote [:njn] 2011-02-02 20:35:32 PST
(In reply to comment #13)
> 
> $14 = {
>   alloc = 161665, 
>   oom = 0, 
>   recycle = 105903, 
>   dprop = 116662, 
>   dprop4add = 0
> }

dprop4add is 0 because newDictionaryShapeForAdd() is dead.  I'll remove it in the patch I write.
Comment 17 Brendan Eich [:brendan] 2011-02-02 22:24:31 PST
Cc'ing jorendorff -- did the SetPropHit-ectomy eliminate dprop4add?

/be
Comment 18 Nicholas Nethercote [:njn] 2011-02-03 16:08:39 PST
Created attachment 509625 [details] [diff] [review]
histogram instrumentation patch

Results from running a bunch of sites (techcrunch.com, google.com, theage.com.au, etc, etc):

mean property counts 1.3802, std. deviation 7.66435, max 1054
        [     0]:   253252 ******************
        [     1]:    22897 ***************
        [     2]:    21902 ***************
        [     3]:     8488 **************
        [     4]:     6193 *************
        [     5]:     3368 ************
        [     6]:     4814 *************
        [     7]:     3036 ************
        [     8]:     3200 ************
        [     9]:      523 **********
[    10,   +inf]:     8075 *************

Seems like a lot of empty objects.
Comment 19 Nicholas Nethercote [:njn] 2011-02-03 16:09:50 PST
Created attachment 509626 [details] [diff] [review]
patch
Comment 20 Boris Zbarsky [:bz] 2011-02-04 05:33:17 PST
> Seems like a lot of empty objects.

Your typical Function and typical DOM object will be empty (since all the interesting stuff is on the prototype), right?
Comment 21 Brendan Eich [:brendan] 2011-02-04 15:39:50 PST
(In reply to comment #20)
> > Seems like a lot of empty objects.
> 
> Your typical Function and typical DOM object will be empty (since all the
> interesting stuff is on the prototype), right?

Exactly. The original design for JS was a lot like Harmony proxies but using the prototype as the handler: empty objects fronting for various peer C++ (C in the old Netscape days) data structures referenced by each object's private slot. This is more efficient than copying everything from one space to the other and back.

/be
Comment 22 Brendan Eich [:brendan] 2011-02-04 15:41:20 PST
(In reply to comment #20)
> > Seems like a lot of empty objects.
> 
> Your typical Function and typical DOM object will be empty (since all the
> interesting stuff is on the prototype), right?

(on the prototype and in the case of Function instances, in the private data structure, JSFunction, and hanging off it.)

/be
Comment 23 Brendan Eich [:brendan] 2011-02-04 15:43:15 PST
Comment on attachment 509626 [details] [diff] [review]
patch

Seems like a win for Firefox 4, low risk (just a cutoff on exponential worst-case growth, still low). Nick, stop me if you disagree.

/be
Comment 24 Brendan Eich [:brendan] 2011-02-04 15:45:54 PST
Comment on attachment 509625 [details] [diff] [review]
histogram instrumentation patch

This could be made landable, but boy did you recompile the world! Instead of touching jsapi.h, a single extern in JSObject::finish would suffice. And all #ifdef DEBUG, with the JS_DumpBasicStats conditioned on a getenv naming the file?

Or not, it's easy to recreate or find and rebase this patch. We do take DEBUG-only but maintained metering, though, so I wanted to raise the option.

/be
Comment 25 Nicholas Nethercote [:njn] 2011-02-04 17:43:38 PST
(In reply to comment #24)
> 
> This could be made landable, but boy did you recompile the world! Instead of
> touching jsapi.h, a single extern in JSObject::finish would suffice. And all
> #ifdef DEBUG, with the JS_DumpBasicStats conditioned on a getenv naming the
> file?

Sure, I was doing it as quick-and-dirty as possible :)
Comment 26 Brendan Eich [:brendan] 2011-02-07 18:05:54 PST
Nick, can you get the patch landed?

I bet a lot of shapes come from (perhaps one or two) large slow arrays. We should not be using shapes for slow array elements, esp. if the array is dense but not quite dense enough to avoid being slow-ified. Can you measure?

/be
Comment 27 Nicholas Nethercote [:njn] 2011-02-07 18:47:34 PST
(In reply to comment #26)
> Nick, can you get the patch landed?

Don't I have to wait for 2.0 approval?
Comment 28 Boris Zbarsky [:bz] 2011-02-07 18:52:22 PST
Comment on attachment 509626 [details] [diff] [review]
patch

Approval, schmaproval!
Comment 29 Nicholas Nethercote [:njn] 2011-02-07 19:12:17 PST
http://hg.mozilla.org/tracemonkey/rev/cffa2568bcb0
Comment 30 Brendan Eich [:brendan] 2011-02-07 19:20:11 PST
Created attachment 510501 [details] [diff] [review]
new SHAPE_METER patch handles indexed and array props

From a run on perfect.js, which indexes a plain object, at JS_ShutDown:

$1 = {
  alloc = 909, 
  oom = 0, 
  recycle = 0, 
  dprop = 540, 
  aprop = 0, 
  iprop = 501
}

Rebuilding Minefield now after updating and reconfiguring.

/be
Comment 31 Jason Orendorff [:jorendorff] 2011-02-08 13:51:29 PST
This regressed js1_8_5/regress/regress-595365-2.js which contains the following comment:

/*
 * NB: this test hardcodes the value of PropertyTree::MAX_HEIGHT.
 */

(The failing test hasn't upset the tinderboxen because it uses shapeOf and is therefore shell-only.)
Comment 32 Nicholas Nethercote [:njn] 2011-02-08 14:06:38 PST
(In reply to comment #31)
> This regressed js1_8_5/regress/regress-595365-2.js 

Fixed.  Sorry for the breakage.
Comment 33 Nicholas Nethercote [:njn] 2011-02-08 17:36:25 PST
I'm going to pull a Brendan:  Brendan, can you file a new bug for the indexed prop measurements and details on how those Shapes can be optimized?  This bug is done.  Thanks.
Comment 34 Brendan Eich [:brendan] 2011-02-08 18:30:23 PST
Is that a "Brendan"? I know what a "gal" is. Anyway, totally agree -- was gonna file separately. I'm still working on better instrumentation. My own browsing did not find a huge slow array or two causing lots of shapes. More in the new bug.

/be
Comment 35 Brendan Eich [:brendan] 2011-02-08 18:32:27 PST
Bug 632653.

/be
Comment 36 Chris Leary [:cdleary] (not checking bugmail) 2011-02-11 11:49:15 PST
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/cffa2568bcb0
http://hg.mozilla.org/mozilla-central/rev/53763d7e9e54

Note You need to log in before you can comment on or make changes to this bug.