Closed Bug 171262 Opened 22 years ago Closed 13 years ago

Performance: improve speed of new Array creation

Categories

(Core :: JavaScript Engine, defect, P2)


Tracking


RESOLVED WORKSFORME
Future

People

(Reporter: pschwartau, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: js1.5, perf)

Attachments

(9 files)

This arose out of bug 169531,
"Creating new objects in Mozilla is sometimes even 5 times slower than in IE"

After some testing there, it was found that the creation of new objects
in the JS shell is actually quite fast - faster than in the CScript shell,
for example, with only one exception: new Array creation.

See http://bugzilla.mozilla.org/show_bug.cgi?id=169531#c21

So we are filing this bug to investigate performance improvements
on new Array creation -
Blocks: 117611
Keywords: perf
bug 169531 comment 23 says in part "Second, the new Array slowness [...] can be
fixed more readily with a patch to jsarray.c only (to avoid defining a length
property on each Array instance, requiring a mutable scope for each instance)."

The idea is to use a reserved slot (JSCLASS_HAS_RESERVED_SLOTS(1) among the
initializer flags for js_ArrayClass) in each instance to hold length, and a
JSPROP_SHARED|JSPROP_PERMANENT prototype property for length -- a singleton
member descriptor shared among all instances, allowing the instances to remain
lightweight while of length 0 (the benchmark in bug 169531 creates zero-length
Array instances).
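The slot-plus-shared-accessor idea can be sketched in plain JavaScript (illustrative only -- this is not the engine's C code; LightArray and the symbol-keyed slot are made-up stand-ins for the reserved slot and the JSPROP_SHARED|JSPROP_PERMANENT prototype property):

```javascript
// Sketch: one 'length' accessor defined once on the prototype, backed by a
// hidden per-instance slot, instead of an own 'length' data property on every
// instance (which would force a mutable scope for each instance).
const SLOT = Symbol('lengthSlot'); // stand-in for the reserved slot

function LightArray() {
  // Each instance only initializes its slot; no own property is defined.
  this[SLOT] = 0;
}

// Defined once, shared by all instances -- the analogue of a
// JSPROP_SHARED|JSPROP_PERMANENT singleton member descriptor.
Object.defineProperty(LightArray.prototype, 'length', {
  get() { return this[SLOT]; },
  set(v) { this[SLOT] = v; },
  configurable: false,
});

const a = new LightArray();
const b = new LightArray();
a.length = 3;
console.log(a.length, b.length); // each instance keeps its own value via the slot
```

Note that `a` carries no own `length` property at all; the instances stay lightweight because the single descriptor on the prototype serves every one of them.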

To optimize further and avoid a map, we'd want to specialize jsarray.c to
implement JSObjectOps, not just JSClass.  That could be done to speed up arrays
of non-zero length.

/be
I've been investigating slow array creation issues as well -- notably with
pages which use arrays like this test case to create multi-level selections. 
This test example (gzipped to fit as a posting here -- uncompresses to 1/2 MB)
is based on some pages which use js to modify one selection object based on the
choice in another (e.g. "Select your country" dictates what goes into the
"Select your locality" box), so you end up with large n-dimensional arrays to
store all possible selection combinations.

With the 1.1 Moz release, this test case takes ~1.5 minutes to load/display on
my PIII/800 Linux box, and 30 seconds on a 1GHz Mac w/OS 10.1.  The same case
in either Opera on Linux or IE on Mac takes about 3 seconds.  

While I haven't had the chance to test this specific case on my main test box
-- an older HP-UX 11.00 workstation with Mozilla 1.2beta and including the js
allocator performance improvements suggested in bug #123668 -- I have run a
similar page through that machine with some timing checks in js_NewObject: that
page displays in about 45 seconds on that machine, 42 seconds of which is spent
at and below js_NewObject.  That page results in about 10000 calls to
js_NewObject, each allocating about 200 bytes total.  Note that a simple
program on that machine to calloc() 200 bytes in a loop of 100000 iterations
takes about 2 seconds (to allocate and clear 10x as much memory).  Also of
interest is that the first few calls to js_NewObject take less than 1/10th the
time of the last few calls.
Same test as above, but using only 3 digits -- a 3-dimensional array to
allocate 1/10 the space of the larger test.  For testing on slower machines or
direct download from this page, where the larger test is prohibitively large.
Attachment #103516 - Attachment mime type: text/html → application/octet-stream
I can confirm Dan's timing results. IE6 is much faster than Mozilla
in loading his testcases. But these testcases are mixing two issues:

var digits = new Array();
digits[0] = new Array();
digits[0][0] = new Array();
digits[0][0][0] = new Array();
digits[0][0][0][0] = new Option("0000", "0000");
digits[0][0][0][1] = new Option("0001", "0001");
digits[0][0][0][2] = new Option("0002", "0002");
digits[0][0][0][3] = new Option("0003", "0003");

                  etc.
                  etc.


The right side of these assignments involve new object creation,
and DOM objects, at that. As such, it is not a pure (or standalone)
JS benchmark.

Dan: by any chance, do you have a version of your tests that 
simply assigns the string values of the numbers, like this:

digits[0][0][0][0] = "0000";
digits[0][0][0][1] = "0001";
digits[0][0][0][2] = "0002";
digits[0][0][0][3] = "0003";

          etc.
         

If not, we can cook that up. This will remove DOM performance from the
equation, and the new object assignments on the right side as well.
That way, we can test new array creation speed independently of 
those other considerations -
Dan, thanks for the data and effort to get it.  A couple of quick thoughts:

- Could you (or maybe pschwartau or mazielobo will do it) fork the test into a
version that uses only core JS objects, not any DOM objects such as Options
arrays?  It would be helpful to get timings for the js shell and xpcshell with
such reduced tests.

- It's best to wrap your top-level script in a function so as to use local
variables only, not global ones.  This isn't to say that real-world pages do
such awkward things; only that, for the purpose of isolating costs in the code,
we want to use locals and avoid the confounding effects of the global variable
tax exacted by DOM security code.  That tax is not paid in reduced tests run in
the js shell or in xpcshell (no DOM code in those shells).
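As a sketch of that advice (nothing here is from the bug, and the numbers depend entirely on the machine; the function names are made up):

```javascript
// Hedged sketch: the same loop run once against top-level variables (which
// stand in for globals here) and once against function locals.  On engines
// where global access pays a lookup or DOM-security tax, wrapping the script
// in a function isolates the cost actually being measured.
var gArr, gI; // top-level variables standing in for globals

function globalVersion() {
  var start = Date.now();
  gArr = new Array();
  for (gI = 0; gI < 100000; gI++) gArr[gI] = gI;
  return Date.now() - start;
}

function localVersion() {
  var start = Date.now();
  var arr = new Array(); // locals: resolved without touching the global object
  for (var i = 0; i < 100000; i++) arr[i] = i;
  return Date.now() - start;
}

console.log('globals: ' + globalVersion() + ' ms, locals: ' + localVersion() + ' ms');
```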

- Any timing results common to all classes, not peculiar to Arrays, should go
over in bug 123668.  I'm working on a new patch for that bug for 1.3alpha, one
that will, I hope, make the multi-threaded costs paid by xpcshell and Mozilla
almost go away, so that asymptotic benchmarks (ones that create lots and lots of
objects, to amortize the costs remaining in the multi-threaded case even with
the new patch) in the js shell run no faster than in xpcshell or in Mozilla
(again, modulo DOM security check costs on global variables).

Could you give details (in bug 123668) of the timing code that you added to
js_NewObject, and the results you get for a reduced test, or even for the
current test?  Thanks again,

/be
Assignee: khanson → brendan
Keywords: js1.5, mozilla1.3
Priority: -- → P2
Hardware: PC → All
Target Milestone: --- → mozilla1.3alpha
The array initialized with strings rather than "new Option"s -- definitely
quite a bit faster: now just over 2x the times seen elsewhere.  Still
slow, but 1/5 the time seen with the Option objects: it seems those account
for the bulk of the time.

As for the timing code: mostly just a call to gettimeofday() upon entering the
function and again just before the return.  I'll post the specifics when I'm
back at the other machine tomorrow.
Timing code posted to bug 123668 as requested, along with some timing info.  I'll
leave the encapsulation of these tests into something more refined, to exercise
the individual components more cleanly, to someone who understands the various
interactions better than I do.
System:  Windows 2000,  Memory 256 MB,  Processor:  1.7  GHz 
CScript: Microsoft (R) Windows Script Host Version 5.6
Trunk XPCShell: Mozilla build ID-2002102108

Here are my results with CScript, optimized JS shell, and XPCShell.
I also applied Brendan's patch from bug #123668 comment #38.

To obtain greater resolution in the results, I ran Dan's test from Comment #6
in loops of 1, 10, and 100 cycles, all within one function:

               CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
1 cycle:        109 ms          15 ms                 16 ms            703 ms
10 cycles:     1703 ms         235 ms                219 ms          20265 ms
100 cycles:   17672 ms        2656 ms               2781 ms         338016 ms

I ran the same tests except the loops are now separated into three functions.
Notice the XPCshell times are much faster than before.

               CScript       Opt JS(w/o patch)   Opt JS(w/patch)   Trunk XPCshell
1 cycle:         94 ms          16 ms                 15 ms            672 ms
10 cycles:     1672 ms         234 ms                203 ms           6922 ms
100 cycles:   17875 ms        2641 ms               2875 ms          70594 ms
Someone (user@domain.invalid) posted an interesting text in n.p.m.performance
that might be related. Please say if this should be a bug of its own or if it's
a known issue. The text is below:

Why do the forms of the Array constructor that specify a size take more time
than the forms that don't specify a size?
I wrote two JavaScript functions, unspecify() and specify(), as displayed below,
and ran them several times in the Mozilla browser.
The results show that unspecify() has a better runtime than specify().

function unspecify( )
{
    var n = 10000;
    var m = 10;
    var a = new Array();
    for (var i=0; i< n; i++)
    {
        a[i] = new Array();
        for(var j=0; j<m; j++)
            a[i][j] = j;
    }
}

function specify( )
{
    var n = 10000;
    var m = 10;
    var a = new Array(n);
    for (var i=0; i< n; i++)
    {
        a[i] = new Array(m);
        for(var j=0; j<m; j++)
            a[i][j] = j;
    }
}

I found one of the JavaScript performance hints posted in the Rhino project. It
claims that using the forms of the Array constructor that specify a size is
better than the forms that don't specify a size -- that specifying a size to the
Array constructor helps speed up the JavaScript code. I wrote two
test functions, bad() and good(), as displayed below and tested them in Mozilla.

>  Please say if this should be a bug of its own
>  or if it's a known issue. 

I will have to let the developers comment on whether it's a known
issue, but I would prefer that we not file a separate bug on it.

I added timing and reporting code to the above tests, as follows.
Note I'm using the new Date.now() method, not available in IE:

function unspecify( )
{
    var n = 10000;
    var m = 10;

    var start = Date.now();
    var a = new Array();
    for (var i=0; i< n; i++)
    {
        a[i] = new Array();
        for(var j=0; j<m; j++)
            a[i][j] = j;
    }
    var end = Date.now();
    print('\n\tunspecify():\t' + (end - start) + ' ms');
}

function specify( )
{
    var n = 10000;
    var m = 10;

    var start = Date.now();
    var a = new Array(n);
    for (var i=0; i< n; i++)
    {
        a[i] = new Array(m);
        for(var j=0; j<m; j++)
            a[i][j] = j;
    }

    var end = Date.now();
    print('\tspecify():\t' + (end - start) + ' ms\n');
}


I got these results on my WinNT box (500 MHz CPU; 128M RAM):

unspecify():    1703 ms
specify():      1282 ms

unspecify():    1703 ms
specify():      1281 ms

unspecify():    1703 ms
specify():      1266 ms
Fixing TM.

/be
Target Milestone: mozilla1.3alpha → mozilla1.3beta
Target Milestone: mozilla1.3beta → mozilla1.5alpha
Status: NEW → ASSIGNED
I'm doing a performance evaluation of javascript vs. a few other languages. 
It's probably a pretty bogus evaluation, but hopefully my professor won't
notice.  In any case, here's one of my test cases:

var arr = new Array(1024 * 1024 * 10);
for(var i = 0 ; i < 1024 * 1024 * 10 ; i++)
  arr[i] = 0;

The equivalent java test case takes about 50M using latest jvm for linux, but js
1.5rc4 takes well over 100M.  Actually I had to kill the test while top reported
memory use still increasing rapidly.

Does this bug cover this test case?  Do you care about this admittedly pretty
bogus (in the sense of not existing in real code) test case?
Ash, a script that runs from 0 to a huge index creating elements will find its
performance improved some day when this bug is fixed.  However, such a benchmark
represents no useful workload that we've seen in the real world.  If you're ok
with a synthetic benchmark not representative of real-world workload, feel free
to report numbers here.

/be
FWIW, on my WinNT box (128M RAM), I don't have enough memory to
complete ash's test.  When I began, my available memory was 67M,
which went down to 0, at which point the JS shell gave me the
"out of memory" error.

If I reduce the upper bound on ash's test by a factor of 10, 
my memory usage goes from 67M down to 26M = 41M memory usage:

var arr = new Array(1024 * 1024);
for(var i = 0 ; i < 1024 * 1024; i++)
  arr[i] = 0;
The js shell from the 1.5 rc5 tarball uses 391M on the test case I gave,
according to top.  If I change the array size and the loop bound to 1024*1024*2,
it uses 82M.  For comparison (though not a good or useful comparison), Java uses
40M above its minimum size of 10M (!) on the equivalent test case.  Now on to
python, perl, and ruby...
btw, here's a timing test:
var length = 1024*1024;
var arr = new Array(length);
for(var j = 0 ; j < 100 ; j++)
  for(var i = 0 ; i < length ; i++)
    arr[i] = i;

The equivalent test in java runs in ~1 second.  This test in js 1.5 rc5 takes
270 seconds.  This is all wall clock time.  The disk light on my machine turned
on very infrequently during the test, so this is not a massive thrashing problem.
Turn off your Java VM's JIT if you can, and let us know the comparison.  Java
will still win even in bytecoded mode, because it's a statically typed language.

/be
In interpreted mode, the last test I gave takes 11.6 seconds.  Since apparently
Java uses 10x the data to store the array elements, it looks like js is only
about 2.5x slower per byte than Java in interpreted mode, though who knows if
that's a useful metric.

I kind of figured the typing might be an issue for speed (type checks) and for
space (storing type info), so I think the comparisons with perl and python will
be more interesting.
Blocks: 203448
Target Milestone: mozilla1.5alpha → Future
Depends on: native-arrays
Blocks: 334935
No longer blocks: 334935
QA Contact: pschwartau → general
Throwing back to pool -- crowder, igor feel free to grab.

/be
Assignee: brendan → general
Status: ASSIGNED → NEW
I noticed that new Array(x, y, ...) is much faster (up to 100%, depending on the number of elements) than the equivalent array literal [x, y, ...].  That was a bit surprising at first, but not after I looked at the bytecode the engine generates for these expressions.  Is that covered by this bug?
It sounds like a great sub-bug to file -- nice find!  Can you file one with details on the bytecode and a test, and mark it blocking this?
Depends on: 396191
(In reply to comment #24)
> I noticed that new Array(x, y, ...) is much faster (up to 100%, depending on
> the number of elements) than the equivalent array literal [x, y, ...] which was
> a bit surprising at first, but not after I looked at the bytecode the engine
> generates for these expressions. Is that covered by this bug?

So I looked into this.  We have JSOP_NEWARRAY, which I found to be 8% faster than the equivalent constructor call; however, it only gets emitted for (1) arrays shorter than 2^16 and (2) arrays being created inside a function.  So, by running the performance tests in the global scope, we force the codegen to produce JSOP_NEWINIT and a sequence of JSOP_INITELEM, which is much slower.  Wrapping it all in a function call makes literal syntax a bit faster.
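A minimal way to compare the two creation forms inside a function body, where the faster bytecode can be emitted (timeIt and N are made-up names, and the ratio will vary by engine and era):

```javascript
// Sketch of a micro-benchmark run inside functions rather than at global
// scope, so the codegen observations above apply to what is measured.
function timeIt(make, n) {
  var start = Date.now();
  for (var i = 0; i < n; i++) make();
  return Date.now() - start;
}

var N = 100000;
var tLiteral = timeIt(function () { return [1, 2, 3, 4]; }, N);
var tCtor    = timeIt(function () { return new Array(1, 2, 3, 4); }, N);
console.log('literal: ' + tLiteral + ' ms, constructor: ' + tCtor + ' ms');
```

Both forms produce identical four-element arrays; only the bytecode used to build them differs.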
This bug affects Peacekeeper (#499198) and so it should probably block that.

Their arrayWeighted test creates 25 arrays 10,000 times, using [] to initialize each array.  I'll be attaching the test case here, as well as a slightly modified version that uses "new Array".  The times I get vs Chrome are:

FF (Mozilla/5.0 (Windows NT 5.1; rv:2.0b8pre) Gecko/20101105 Firefox/4.0b8pre):
test13c.html: 60ms ([] construct)
test13d.html: 35ms (new Array construct)

Chrome (7.0.517.44):
test13c.html: 3ms ([] construct)
test13d.html: 18ms (new Array construct)

So, FF is 20 times slower than Chrome with [] and 2 times slower with new Array.
Attached file test13c.html
This uses the [] construct to create the arrays.  This is also what is used in the actual Peacekeeper test.
Attached file test13d.html
This modifies test13c.html to use "new Array" instead of [].
There has been a lot of optimization of object creation and constructor calls since the OP.  I downloaded the test in comment 2, and the script takes 52ms.

Peacekeeper (comment 28 and comment 29) is a separate issue; the benchmarks create pure garbage, so to match Chrome (which is about 2x faster) we need bug 619558 (generational gc).  We see this often for synthetic benchmarks (just look at the dependencies of bug 619558), so I don't think anything is added by keeping this bug open, since the original issue has been resolved and the Peacekeeper issue is well known.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME