Closed Bug 123668 Opened 21 years ago Closed 18 years ago

Lots of mallocs from js_NewObject

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.8alpha6

People

(Reporter: sfraser_bugs, Assigned: brendan)

References

Details

(Keywords: js1.5, memory-footprint, perf)

Attachments

(12 files, 19 obsolete files)

7.56 KB, text/plain
Details
1.06 KB, text/plain
Details
1.04 KB, text/plain
Details
1.18 KB, text/plain
Details
1.44 KB, text/plain
Details
1.28 KB, text/plain
Details
7.06 KB, patch
Details | Diff | Splinter Review
100.13 KB, patch
Details | Diff | Splinter Review
97.82 KB, patch
brendan
: review+
Details | Diff | Splinter Review
20.34 KB, patch
Details | Diff | Splinter Review
20.38 KB, patch
brendan
: review+
Details | Diff | Splinter Review
120.12 KB, patch
Details | Diff | Splinter Review
The JS_malloc() in js_NewObject, which allocates the slots array for new JS 
Objects, does 6043 mallocs for startup + first browser window, allocating a total 
of 144996 bytes.

All these allocations are 24 bytes in size, apart from those for JavaArray, 
JavaClass, and JavaObject objects, which are 20 bytes (I was running on Mac with 
the MRJ Plugin installed).

This bug should address:
1) if we can reduce the number of JSObjects allocated, and
2) optimize allocation of obj->slots, perhaps by making it fixed size, or using
   arena allocation.

I'll attach a breakdown of the object classes.
Reassigning to Kenton; cc'ing Roger -
Assignee: rogerl → khanson
>1) if we can reduce the number of JSObjects allocated, and

Simon: this bug may not be well-assigned, in the case where chrome JS analysis
and fixes are needed.  Should we have another bug, and maybe a tracking bug? 
For that matter, what tracking bug should depend on this one?

>2) optimize allocation of obj->slots, perhaps by making it fixed size, or
>using arena allocation.

Arenas are appropriate for LIFO allocation patterns, which do not describe JS
object slots allocations.  Arena + freelist (called a recycler by some around
here :-) might help, but only by cutting down on malloc overhead if all these
objects are indeed live after startup (and it seems most of them will be, see
below).

Shaver keeps hyping the slab allocator, mabye he'll take this bug.

I think the answer to question 1 is more important than this one.  It looks like
4679 out of 6043 objects are functions, which do need a parent slot (waldemar
asked today about why SpiderMonkey burns a word per object on the parent slot,
when only functions and DOM objects need it -- but most of these objects are
function and DOM objects), and 246 are XULElement objects (which I believe
require the silly DOM scope link).

An interesting question, one maybe sfraser can answer quickly, is how many of
these functions are actually called.  See bug 107907, the initial description --
the numbers there don't square with the ones attached here.

/be
Keywords: footprint
I repeat that malloc is not evil, but JS could fuse an object's initial slots
with its GC-thing (arena-amortized) allocation.  I'll take this bug for 1.1.

/be
Assignee: khanson → brendan
Keywords: mozilla1.1
Target Milestone: --- → mozilla1.1
Priority: -- → P1
Blocks: 129496
Keywords: perf
Doesn't seem like a beta change, but what the heck.

/be
Target Milestone: mozilla1.1alpha → mozilla1.1beta
Status: NEW → ASSIGNED
I did some work on this over my vacation, but it's not going to make 1.1.

/be
Keywords: mozilla1.1mozilla1.2
Target Milestone: mozilla1.1beta → mozilla1.2alpha
Blocks: 21762
Keywords: nsbeta1
brendan, what is the current status?
Moving out, some of these may move to 1.3alpha shortly.

/be
Target Milestone: mozilla1.2alpha → mozilla1.2beta
This should work, I tested lightly.  Phil, can you run it past the suites and
anything else?	Thanks, gotta crash.

/be
This is the patch that fixes a few bugs:

- Don't finalize extension-things unconditionally, only if we just finalized a
non-extension thing (this little bit of state machinery consolidates code, as
opposed to a scheme with an inner, inner loop that runs ahead of the finalized
thing's flagp).

- Don't set thing->flagp when freeing thing if we're coalescing it with the
last_free_thing (this seemed necessary only because of another bug I fixed just
before attaching the original patch, but it never made sense; it was late and I
wanted to get a working patch in the bug).

/be
Attachment #99841 - Attachment is obsolete: true
The latest patch passes the JS testsuite in both the debug/optimized
JS shell. No test regressions are introduced by the patch -
Attached patch latest and greatest (obsolete) — Splinter Review
This ought to help quite a bit.  Testers, please measure if you can and report
improved memory use.

/be
Attachment #99843 - Attachment is obsolete: true
Anyone had time to try the last patch yet?

/be
Attachment #99917 - Attachment is obsolete: true
I will run the JS testsuite on it now -
For some reason, my patch program is failing (silently) on the
latest attachment (in Comment #14). It stops after patching jsexn.c:

[//d/JS_EXP/mozilla/js/src] patch < 123668.patch
patching file `jsapi.c'
patching file `jsapi.h'
patching file `jsbool.c'
patching file `jscntxt.h'
patching file `jsdate.c'
patching file `jsexn.c'
[//d/JS_EXP/mozilla/js/src]


It should continue to patch jsfun.c, jsgc.c, etc., but it's not.
I will look into this problem -
Patch worked fine for me - test suite ran fine (debug, win2k)
Should work on a current trunk js/src.

Anyone have DHTML or pure JS performance test results?

/be
Attachment #100026 - Attachment is obsolete: true
System:  Windows 2000,  Memory 256 MB,  Processor:  1.7  GHz

I have applied the latest patch and tested in the optimized JS shell 
with loops like this, using object constructors:

function test()
{
  //Array
  var t0=new Date();
  for(x=0;x<1000000;x++)
  { 
    var arr=new Array();
  }
  print("Array:\t\t" +(new Date()-t0) + "\tms\n");

        etc.
        etc.
}

               BEFORE PATCH    AFTER PATCH
Array:          4453    ms      3204  ms
Object:         2781    ms      1718  ms
RegExp:         6953    ms      5172  ms
String:         2937    ms      1735  ms
Date:           3594    ms      2437  ms
Function:       14078   ms     10859  ms
Number:         3609    ms      2422  ms
Same thing, only this time testing object literals:

function test()
{
  //Array
  var t0=new Date();
  for(x=0;x<1000000;x++)
  { 
    var a=[];
  }
  print ("Array:\t\t" +(new Date()-t0) + "\tms");

             etc.
             etc.
}

               BEFORE PATCH    AFTER PATCH
Array:          4953    ms       3250  ms
Object:         2984    ms       1750  ms
RegExp:         578     ms        563  ms
String:         579     ms        609  ms
Date:           n/a               n/a
Function:       2437    ms       1250  ms
Number:         578     ms        547  ms
Oops, is that x not declared as a local variable?

Can you post the whole testcase as an attachment to this bug?  Thanks.

/be
Attached file xp.js (obsolete) —
> xpcshell xp.js
Array:		64312	ms

> cscript /nologo xp.js
Array:		12338	ms

my xpcshell is unpatched from today. cscript is:
Microsoft (R) Windows Script Host Version 5.1 for Windows
[p2/450, 512mb of ram, xpcshell seemed to get 75-80% of the cpu]
Attached file xp.js
> cscript /nologo xp.js
Array:		11056	ms
Object: 	11456	ms
RegExp: 	29052	ms
String: 	18266	ms
Date:		21451	ms
Function:		78844	ms
Number: 	11246	ms

(extra newlines stripped from xpcshell output)
> f:xpcshell xp.js
Array:		65014	ms
Object: 	50232	ms
RegExp: 	77742	ms
String: 	58273	ms
Date:		80586	ms
Function:		158949	ms
Number: 	65764	ms
Attachment #100277 - Attachment is obsolete: true
These are the results I get when declaring the variable x inside the for-loop.  

Testing object constructors:

               BEFORE PATCH    AFTER PATCH
Array:          5375    ms      3250  ms
Object:         3078    ms      1657  ms
RegExp:         7813    ms      5297  ms
String:         3141    ms      1687  ms
Date:           3890    ms      2422  ms
Function:      16672    ms     11844  ms
Number:         3922    ms      2390  ms

Testing object literals:

               BEFORE PATCH    AFTER PATCH
Array:          5375    ms      3250  ms
Object:         3110    ms      1687  ms
RegExp:         437     ms       438  ms
String:         453     ms       453  ms
Date:            n/a             n/a 
Function:       2750    ms      1125  ms
Number:         422     ms       437  ms


I noticed in timeless' test, the object variable |arr| is global:
   t0=new Date(); for(x=0;x<1000000;x++) arr=new Array(); t1=(new Date()-t0);

In my test it is a local variable.
Comment on attachment 100294 [details]
Object constructor test, with Brendan's correction (var x)

The number test is calling new Date() in the loop.

/be
Attachment #100294 - Flags: needs-work+
Comment on attachment 100294 [details]
Object constructor test, with Brendan's correction (var x)

Sorry, I will add the corrected test case.
Attachment #100294 - Attachment is obsolete: true
Testing object constructors with updated number constructor:

	       BEFORE PATCH    AFTER PATCH
Array:		5672	ms	 1350  ms
Object: 	3406	ms	 1609  ms
RegExp: 	9047	ms	 5391  ms
String: 	3235	ms	 1687  ms
Date:		4359	ms	 2406  ms
Function:	18234	ms	11875  ms
Number: 	3422	ms	 1750  ms
Today, I am getting an intermittent crash in the optimized JS shell 
with the latest patch applied.  It crashes in the |new Number()| loop,
but only when the JS shell is first started up.  That's probably why 
I didn't see this on Monday.  I had an error -|new Date()| instead of 
|new Number()|,  and I never restarted the JS shell after the correction.

Here is the crash:

The instruction at "0x44014444" referenced memory at "0x44014444".
The memory could not be "read".

Unhandled exception is js.exe:0xC0000005:Access Violation.

Call Stack in optimized JS shell with patch:
44014444()
JS32! js_SetClassPrototype + 87 bytes
JS32! JS_InitClass + 191 bytes
JS32! js_InitNumberClass + 83 bytes
JS32! JS_ResolveStandardClass + 346 bytes
JS! main + -45 bytes
JS32! js_LookupProperty + 628 bytes
JS32! js_FindProperty + 115 bytes
JS32! js_Interpret + 22997 bytes


I tried to crash in the Debug JS shell, but simply ran out of memory:

js>  
load("D:/Performance/169531-literals,constructors/169531-javascript-constructors-MILLION.js");
Array:          6594    ms
Object:         5953    ms
RegExp:         9344    ms
String:         93375   ms
Date:           124562  ms
D:/Performance/169531-literals,constructors/169531-javascript-constructors-MILLION.js:53:
out of memory
js> 
load("D:/Performance/169531-literals,constructors/169531-javascript-constructors-MILLION.js");
3: out of memory
Please help me test this, I'd like to land it soon.  It passes mazie's test. 
Thanks, good find on that last crash!

/be
Attachment #100263 - Attachment is obsolete: true
With Brendan's latest patch,  I no longer crash.

Here are my results with CScript, the optimized JS shell before and after the
latest patch, and the XPCshell from today's Mozilla trunk build (2002092508):

System:  Windows 2000,  Memory 256 MB,  Processor:  1.7  GHz 
CScript: Microsoft (R) Windows Script Host Version 5.6


Testing constructors:

              CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:      2578   ms        3672  ms             2875 ms            11875 ms
Object:     2578   ms        2312  ms             1594 ms             9391 ms
RegExp:     8687   ms        5672  ms             4141 ms            14687 ms
String:     6797   ms        2187  ms             2344 ms            10735 ms
Date:       5563   ms        3141  ms             3031 ms            12578 ms
Function:  18859   ms       12891  ms            10172 ms            33406 ms
Number:     2672   ms        2656  ms             2640 ms            12500 ms


Testing literals:

              CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:      2437   ms        3750  ms             2969 ms            12032 ms
Object:     2375   ms        2359  ms             1703 ms             9406 ms
RegExp:     2656   ms         406  ms              422 ms              312 ms
String:      532   ms         406  ms              406 ms              313 ms
Date:           n/a            n/a                 n/a                 n/a
Function:   3296   ms        1954  ms             1250 ms             9641 ms
Number:      422   ms         390  ms              390 ms              328 ms
The latest patch passes the JS testsuite on WinNT in both the 
debug and optimized JS shell. No regressions were introduced.

However, I notice a new warning when building the shell:

  jsinterp.c(1434) : warning C4047: '=' : 'long ' differs in levels of 
  indirection from 'struct JSObject *'

From jsinterp.c:

          rval = OBJ_GET_PARENT(cx, withobj);       <------------- LINE 1434
I'll have a new patch in a day or so, but this clearly missed 1.2.

/be
Keywords: mozilla1.2mozilla1.3
Target Milestone: mozilla1.2beta → mozilla1.3alpha
Attached patch latest patch (obsolete) — Splinter Review
This patch uses a thread-private data structure containing a gc-thing freelist,
for a fast lock-free allocation path through js_AllocGCThing.  The
thread-private freelist is not built until about 8K of GC space has been
allocated by contexts active on the thread.  Once a thread has a private
freelist, it will be refreshed as the thread continues to allocate about 8K. 
This should remove most of the cost of thread safety in the allocator.

I could use lots of testing help, both to show correctness, and to benchmark js
vs. xpcshell vs. mozilla.  I need to run a debug xpcshell under MSVC on the
js/src/xpconnect/tests/js/old/threads.js testcase, which seems to hang for me
on Linux when run not under gdb, and which wedges earlier under gdb in a way
that leaves gdb unable to find the current thread id.  Maybe a thread died
without taking the whole app down?  Who knows.

Thanks for any help,

/be
Attachment #100777 - Attachment is obsolete: true
The latest patch passes the JS testsuite on WinNT in both the
debug and optimized shell. No test regressions were introduced.

Mazie is getting the performance data now -
No chance of getting this into 1.2 then?
Why put 1.2 at risk?  A patch this big is prime alpha material.  I expect more
change, followup fixes, further improvements, to be needed.  The latest patch is
not the one I'd check in today, or tomorrow.  Anyway, back to testing.

/be
I have applied Brendan's latest patch which he posted in Comment #38.

Here are my results with CScript, optimized JS shell before and after the latest
patch, and 

XPCShell:

System:  Windows 2000,  Memory 256 MB,  Processor:  1.7  GHz 
CScript: Microsoft (R) Windows Script Host Version 5.6
Trunk XPCShell: Mozilla build ID-2002102108


Testing constructors:

             CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:       2594 ms          3625 ms             2922 ms           11859 ms
Object:      2609 ms          2297 ms             1672 ms            9375 ms
RegExp:      8704 ms          5797 ms             4172 ms           14938 ms
String:      7125 ms          2281 ms             2265 ms           10703 ms
Date:        5219 ms          3109 ms             3125 ms           12391 ms
Function:   18859 ms         13016 ms            10016 ms           31844 ms
Number:      2594 ms          2593 ms             2734 ms           12609 ms


Testing literals:

              CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:       2313 ms          3688 ms             3078 ms           11750 ms
Object:      2297 ms          2360 ms             1797 ms            9265 ms
RegExp:      2562 ms           390 ms              406 ms             313 ms
String:       532 ms           422 ms              407 ms             328 ms
Date:           n/a            n/a                 n/a                 n/a
Function:    3203 ms          1938 ms             1250 ms            9406 ms
Number:       390 ms           390 ms              390 ms             297 ms
"cvs diff -u" of jsobj.c including current patch from this bug as well as
additional checks to benchmark timing in js_NewObject().  Put here as requested
in bug 171262 comment 5.  This measures time differences from start to return
of function and partitions-out the time taken in objectHook(), the last call
made before the return.  Counters are reset after 10sec of idleness, total
calls through here are counted to work-out per call averages.

Benchmarked with simple9999.html (attached to bug 171262) on a circa 1995 HP-UX
11.00 workstation, and taking the first and last reports...
stopwatch time 24 sec
js_NewObject	64 ..  0.006707s total	0.000105s ave per obj  56% for hook
js_NewObject  1408 .. 17.542389s total	0.012459s ave per obj  100% for hook

Interesting to note that the objectHook() portion is the substantial piece of
the timing for the objects in this test, and that the first allocations take
0.1ms, but 1400 objects later, allocations are taking 12.5ms (over 100x
longer).
Attachment #103637 - Attachment mime type: application/octet-stream → text/plain
Attachment #103637 - Attachment is patch: true
Test results for nums9999.html (attached to bug 171262) in the same config:
js_NewObject    64 ..   0.006632s total  0.000104s ave per obj   54% for hook
js_NewObject 21632 .. 435.227305s total  0.020120s ave per obj  100% for hook
Overall stopwatch time: 461sec
In this case, allcoation 21000 takes 20.1ms, 200x as long as the first few.

For reference, calling calloc(1,1024) 100,000 times in a loop on this machine
takes 41 seconds, averaging 0.4ms per call.
Dan, please disable venkman (type /startup-init off in its console and restart
mozilla -- you can verify that it's off at startup by starting venkman again and
typing /startup-init), then re-run the benchmark.  The runtime's objectHook
should be null, so your timing code should measure essentially zero time in the
hook.

Phil, would you please spin off a bug against the js debugger where this
apparent slowdown can be investigated separately from this bug?  Thanks.

/be
I have filed bug 176087 against the JS Debugger on the above issue -
Depends on: 176182
I applied Brendan's latest patch and built the browser with this in my
mozconfig file:   ac_add_options --disable-debug
                  ac_add_options --enable-optimize=-O1

I have tried the tests with several different upper bounds. While running the
tests, I watched the Windows Task Manager and have made observations.

Here are the results with the patched and unpatched XPCshell:

                           Testing Constructors

Results with an upper bound of 10,000:

           Moz XPCshell(10-21-02)        XPCshell w/patch
Array:          78      ms                    78    ms
Object:         78      ms                    63    ms
RegExp:         110     ms                    93    ms
String:         62      ms                    63    ms
Date:           94      ms                    94    ms
Function:       234     ms                   218    ms
Number:         328     ms                    94    ms

Observations on Mozilla XPCshell: uses barely any RAM.
Observations on XPCshell w/patch: uses little RAM.


Results with an upper bound of 100,000:

           Moz XPCshell(10-21-02)        XPCshell w/patch
Array:          984     ms                   750    ms
Object:         906     ms                   650    ms
RegExp:         1532    ms                  1062    ms
String:         843     ms                   813    ms
Date:           1297    ms                  1250    ms
Function:       3360    ms                  3016    ms
Number:         1282    ms                  1234    ms

Observations on Mozilla XPCshell: uses barely any RAM.
Observations on XPCshell w/patch: uses half the RAM.


Results with an upper bound of 1,000,000 (bound of original tests):

           Moz XPCshell(10-21-02)        XPCshell w/patch
Array:          11782   ms                  10297    ms
Object:         9359    ms               couldn't continue: out of RAM
RegExp:         14516   ms
String:         10546   ms
Date:           12266   ms
Function:       32047   ms
Number:         12453   ms

Observations on Mozilla XPCshell: uses barely any RAM.
Observations on XPCshell w/patch: After 10 secs, almost all my RAM is used up, 
              
must be forced to quit.


The results for object literal creation were the same -
Turns-out that disabling venkman isn't as easy as it should be (detailed in bug
176331), but once that was out of the way, I could gets some numbers on those
same tests with venkman disabled:
simple 9999
js_NewObject    64 .. 0.004076s total  0.000064s ave per obj  19% for hook
js_NewObject  1344 .. 0.059689s total  0.000044s ave per obj  28% for hook

nums9999
js_NewObject    64 .. 0.004232s total  0.000066s ave per obj  25% for hook
js_NewObject 21312 .. 1.209004s total  0.000057s ave per obj  23% for hook

Given that the average allocations started at 1/2 their previous starting-point
and stayed at roughly the same point, I have to believe that the venkman hook is
the major component to the problem here: following up with details for bug
176087.  Note also that the times here per-call are small enough that the calls
to gettimeofday() are showing up as a fairly significant part of the
measurement, as the "hook" would be null and not called in this case.
Regarding comment #47, something's clearly wrong with the patch.  The unpatched
xpcshell shouldn't use more memory that the patched one does.

Can you verify that the js shell, w/ and w/o the patch, works as reported in
comment #42 ?

Also, the unpatched xpcshell memory use should not be less than the js shell use
-- it ought to be a bit more, due to JS_THREADSAFE and XPCOM overheads.

/be
I went over and retried the constructor tests in the optimized JS shell
on Mazie's machine, with and without the latest patch.

I observed the RAM as the tests ran, in each case starting from an
indicated value of 130M available RAM.

Results with an upper bound of 1,000,000 (bound of original tests):
Testing constructors:

             Opt JS(w/o patch)   Opt JS(w/patch)
Array:          3750    ms          2922    ms
Object:         2375    ms          1672    ms
RegExp:         5937    ms          4234    ms
String:         2313    ms          2266    ms
Date:           3250    ms          3141    ms
Function:       13234   ms         10484    ms
Number:         2656    ms          2781    ms

RAM:          130M ---> 114M      130M ---> 129M


The RAM behavior is exactly the opposite of what was observed in
the XPCshell. Here, in the optimized JS shell, less RAM is used
with the patch in place. By contrast, in the XPCshell, more RAM
was used with the patch in place; in fact, so much more that the
tests could not be run with the original upper bound of 1,000,000.

I don't know why that would be. In any case, the timings I got in
the JS shell are virtually the same as Mazie got in Comment #42 -
Does this still need tweaking or is it already ready to bake on the trunk?
Blocks: js-perf
Flags: blocking1.3b?
Fixing TM.

/be
Target Milestone: mozilla1.3alpha → mozilla1.3beta
The bug, to be solved in 1.3 beta, depends on bug #176182, which will
be solved in 1.4 alpha.  Does that mean the bug can be solved before
176182, despite the dependency?
no, it just means you should not put absolute trust in the target milestone field.
Flags: blocking1.3b? → blocking1.3b-
But this bug won't make 1.3b, right? :/
no, because You should put absolute trust in the flags field :)
But this might make 1.3 final then?
Target Milestone: mozilla1.3beta → mozilla1.4alpha
No longer blocks: 129496
If I can get my 1.4alpha-targeted patches landed in that milestone, I think I
can fit this into 1.4beta without adding too much risk.  Time will tell.

/be
Target Milestone: mozilla1.4alpha → mozilla1.4beta
Attached patch patch updated to trunk today (obsolete) — Splinter Review
A diff -w version is next.

/be
Attachment #103555 - Attachment is obsolete: true
Attached patch diff -w version of last patch (obsolete) — Splinter Review
The latest patch passes the JS testsuite for me on WinNT, in both
the debug and optimized JS shell. No test regressions are observed -
Attached patch fixed patch, please test heavily (obsolete) — Splinter Review
This works for me on the constructors test.

/be
Attachment #120664 - Attachment is obsolete: true
Attached patch diff -w of last patch (obsolete) — Splinter Review
This one is for reviewing.

/be
Attachment #120665 - Attachment is obsolete: true
I have applied Brendan's latest patch in Comment #62.

Here are my results with CScript, the optimized JS shell before/after
the latest patch, and the XPCShell from a recent Mozilla trunk binary:

System:  WinNT 4.0, 128M RAM, 500 MHz CPU
CScript: Microsoft (R) Windows Script Host Version 5.6
Trunk XPCShell: Mozilla build ID 2003041609


Testing constructors:

             CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:       5672 ms          10031 ms           9609 ms           32062 ms
Object:      5593 ms           4797 ms           4172 ms           24125 ms
RegExp:     19797 ms          12782 ms          13047 ms           38235 ms
String:     14750 ms           4765 ms           4797 ms           28062 ms 
Date:       11172 ms           6531 ms           5578 ms           32891 ms
Function:   52984 ms          36032 ms          35140 ms           89000 ms 
Number:      5766 ms           5125 ms           4610 ms           33234 ms  


Testing literals:

             CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
Array:       4641 ms          10110 ms           9766 ms           32078 ms
Object:      4531 ms           4875 ms           4250 ms           24125 ms
RegExp:      5391 ms            969 ms            953 ms            1141 ms
String:      1359 ms            969 ms            969 ms            1140 ms
Date:          n/a               n/a               n/a                n/a
Function:    6438 ms           3296 ms           2844 ms           21141 ms  
Number:      1046 ms            938 ms            922 ms            1078 ms
The latest patch in Comment #62 also passes the JS testsuite on WinNT,
in both the debug and optimized JS shell. No test regressions occurred.

I will also build the browser with this patch in, and report back -
The patch doesn't pass
http://lxr.mozilla.org/mozilla/source/js/src/xpconnect/tests/js/old/threads.js,
alas.  I don't want to push this patch into a non-alpha release, so I'm going to
fix the remaining problems and land it in 1.5alpha, early.

/be
Target Milestone: mozilla1.4beta → mozilla1.5alpha
adt: nsbeta1-
Keywords: nsbeta1nsbeta1-
Blocks: 203448
1.5a freeze is 9th of july - can we get this in ?
Blocks: 169531
Will this make it to 1.5b or will the new milestone be 1.6a?
Blocks: 168157
Target Milestone: mozilla1.5alpha → mozilla1.6alpha
Keywords: nsbeta1-nsbeta1
Markush, there is no nsbeta1 keyword consumer any longer.  I'm removing it, lest
it cause confusion.

I hope to have a new patch here soon.

/be
Keywords: nsbeta1
Bump.

/be
Target Milestone: mozilla1.6alpha → mozilla1.7alpha
Target Milestone: mozilla1.7alpha → mozilla1.8alpha
When will this really be ready for baking on the trunk? (The TM is missed 
again)
This is a trickier than it sounds bug.  Probably too tricky for anything but an
alpha release (and 1.8a4 is already packed with other feature work).  The
sticking point is that the allocator cannot afford much contention but still
needs to be memory efficient and handle multiple consumers.  There are some
design bugs, they need to be found, but teasing them out is going to take some
careful sleuthing about the concurrency.
Can we say "it's for 1.9a1"?
This has been postponed from 1.2 on - so if this can make by any chance 1.8
(a4) that'll be highly appreciated.
if the problem is lack of testing, we can try to involve as many testers as
possible, prepare compiled versions of branch with this patch so we can make
sure it's ready to serve.
I belive that this patch not only speeds up JS on websites, but also internal
Mozilla Suite parts that are written in JS.
Many people will contribute in finally getting this patch in. What can we do 
about it?!
This is a lot simpler, and paves the way for a fix here and for other bugs.

/be
Attachment #120788 - Attachment is obsolete: true
diff -w of that patch coming next.

/be
Keywords: js1.5
Target Milestone: mozilla1.8alpha1 → mozilla1.8alpha6
The patch this obsoletes would fragment like crazy, it was a bad idea.

/be
Attachment #170118 - Flags: review?(shaver)
Oh, one more thing.  The patch at attachment 170117 [details] [diff] [review] works only if you turn off
E4X (#define JS_HAS_XML_SUPPORT 0 in jsconfig.h for the JS_VERSION == 150 case).

I'll have a patch for jsxml.[ch] to make E4X work with this patch shortly.

/be
Comment on attachment 170118 [details] [diff] [review]
phase 1 diff -w version (to review)

That was just to whet your appetite ;-).  Full patch that passes all testsuites
in a little while.

/be
Attachment #170118 - Flags: review?(shaver)
Money all around.

The jsxml.c changes look big, but they're pretty much "pattern matching": 
Since all JSXML, JSXMLQName, and JSXMLNamespace are GC-things now, no need to
get JSObject wrappers for them when referencing them from more than one
JSXMLArray element.  JSXMLArrayFinish doesn't need a dtor argument.  No manual
destroys of object-less privates on error return paths.  Etc.

/be
Attachment #170117 - Attachment is obsolete: true
Attachment #120789 - Attachment is obsolete: true
Attachment #170166 - Attachment is obsolete: true
Attachment #170167 - Flags: review?(shaver)
Attachment #170166 - Attachment is obsolete: false
Attachment #170118 - Attachment is obsolete: true
I checked in the phase 1 patch, plus some warning fixes and an E4X PutProperty
fix that dmose reported, to JSGC_REVAMP_20050103_BRANCH.  On to phase 2!

/be
The goal here is to preserve friend API compatibility by not changing JSObject,
but always allocating a JSNativeObject that contains strongly-typed members for
proto, parent, clasp, and private data, moving those out of obj->slots and
thereby avoiding malloc'ing slots altogether for unmutated prototype clones.

This required more OBJ_GET/SET_* macroization, so it looks big.

It may be that this loses; it certainly breaks recent work to prevent very long
object chains from crashing the GC, if the attacker chains via __proto__, due
to the special-case JSNativeObject member storage of what used to be slots.

I'm not checking this into the branch yet, just recording it here for review
and comments (testing and benchmarking, ideally).  I'd like to try an
alternative patch that leaves JSObject alone, while allocating short-ish slots
vectors from the GC (as well as JSFunctions).

/be
Attachment #170215 - Flags: review?(shaver)
This is better.  It passes the regular and e4x testsuites.  The only issues
are:

1.  Does it speed up real-world apps?  It obviously reduces malloc calls
substantially.	Numbers in a bit.

2.  Can we stand to break the memory pressure API compatibility, requiring
callers of JS_NewRuntime (aka JS_Init) to pass a larger maxbytes parameter than
they have, if they don't pass 0xffffffff?  Because we're allocating small-ish
slots vectors, and JSFunctions, from the GC-heap, we'll need to bump
XPConnect's 4MB argument, probably to 16MB or so.

/be
Comment on attachment 170215 [details] [diff] [review]
phase 2: avoid malloc'ing slots, etc., diff -w (review)

This is not the way to go, violating the obj->slots uniformity is bad, and for
most objects (which mutate) you still malloc a slots vector.

/be
Attachment #170215 - Attachment is obsolete: true
Attachment #170215 - Flags: review?(shaver)
Attachment #170214 - Attachment is obsolete: true
Comment on attachment 170167 [details] [diff] [review]
diff -w, v2 (review)

shaver gave r=him over IRC, he's still on vacation but popped in just in time.

/be
Attachment #170167 - Flags: review?(shaver) → review+
Comment on attachment 170275 [details] [diff] [review]
alternate patch2: GC-allocate small obj->slots, etc., diff -w (review)

Getting ready to land these patches, woohoo!

/be
Attachment #170275 - Flags: review?(shaver) → review+
For testing.  The liveconnect problem was because it uses JSObjectOps with a
NULL mark hook.  The previous patch here marked small obj->slots vectors in the
native implementation of that hook, js_Mark.  I moved that marking step into
jsgc.c for all objects (MARK_GC_THING).

There's no way, given the layering, for any user of JS_NewObject or JSClass (or
JSObjectOps, curse it) to allocate slots other than by using the common jsobj.c
AllocSlots/FreeSlots stuff.  So it's ok for the GC to know that small-enough
slots are GC-allocated, and need to be marked.	For now, at any rate!

/be
Comment on attachment 170315 [details] [diff] [review]
consolidated branch landing patch, with liveconnect fix

Passes pageload test on Mac OS X.

/be
Did this turn out to be a win?  Comment 88 indicates that it's not totally clear.
I will try to show results of DHTML/JS tests for 2005.01.01 build and todays.
I addressed issue 2 in comment 88 by keeping separate account of bytes allocated
for "private" data (small objects' slots and for JSFunction private data), so as
not to increase memory pressure incompatibly.

Another bug I'm hoping to fix during 1.8, bug 157334, should help shave cycles
off the GC-thing allocator by avoiding the runtime's GC lock.

But does any of this beat good ol' malloc?  Benchmarking on various OSes is the
only way to say.

This patch was a small codesize win:

  libmozjs.so
  	Total:	       -352 (+2444/-2796)
  	Code:	       -417 (+2312/-2740)
  	Data:	        +65 (+132/-56)

Thanks to Gandalf for his help benchmarking, and I'm interested in the results,
but I bet we'll have a hard time showing that this patch sped anything up more
than a few percent.  It would be interesting to measure RSS or other dynamic
footprint measures.

/be
gandalf, anyone: any numbers to compare before vs. after?

tinderbox stats didn't change much AFAICT.  Marking FIXED, feel free to report
results here.  Any problems seemingly in the code I touched should go in new
bugs, as usual.

/be
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
I found this URL for javascript performance test:

http://www.24fun.com/downloadcenter/benchjs/benchjs.html

I don't know to which extent it is relevant to the bug discussed here but
here's the result I got, on a win2k Celeron  1.2G machine:

       IE     ff 1.0  ff 050120
test#1 3.245 10.846   3.014
test#2 3.355  2.133   1.462
test#3 1.422  1.071   1.051
test#4 1.272  1.091   1.031
test#5 1.522  0.681   0.641
test#6 2.844  5.588   3.425
test#7 3.284  2.644   2.644
> I don't know to which extent it is relevant to the bug discussed here

It's not. In fact, it's not relevant to JavaScript performance at all.  All
those are tests of DOM performance.

Also, note that if you search on that URL we have bugs for tracking performance
on those tests...
No longer blocks: 21762
Blocks: 21762
The fixed bug depends on bug 176182 but the latter is not fixed yet.  Is the fix
just a workaround?
Bogus dependency.

/be
No longer depends on: 176182
Flags: testcase-
You need to log in before you can comment on or make changes to this bug.