Closed
Bug 360964
Opened 18 years ago
Closed 18 years ago
Native array push is really slow
Categories
(Rhino Graveyard :: Core, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
1.6R6
People
(Reporter: james, Unassigned)
Details
Attachments
(1 file, 4 obsolete files)
12.37 KB,
patch
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1) Gecko/20060601 Firefox/2.0 (Ubuntu-edgy)
Build Identifier: Rhino 1.6 R4
The code below shows a hundred fold speed difference between the push function and setting array elements directly. Furthermore, when running push some calls do execute quickly and some don't.
This looks like some kind of simple optimisation problem in the push function, I've had a look but can't spot quite what it is....
Reproducible: Always
Steps to Reproduce:
var iMax = 100, jMax = 1000;
var sys = java.lang.System;
var t0 = sys.currentTimeMillis ();
var x = [];
for (var i = 0; i < iMax; i++) {
for (var j = 0; j < jMax; j++) {
x.push ({});
}
sys.out.print ('.');
}
var t1 = java.lang.System.currentTimeMillis ();
java.lang.System.out.println (t1 - t0);
var x = [], n = 0;
for (var i = 0; i < iMax; i++) {
for (var j = 0; j < jMax; j++) {
x [n++] = {};
}
sys.out.print ('.');
}
var t2 = java.lang.System.currentTimeMillis ();
java.lang.System.out.println (t2 - t1);
Actual Results:
The first line of dots are slow and appear erraticaly. The second line appear very quickly.
....................................................................................................36170.0
....................................................................................................352.0
Expected Results:
Both lines should appear quickly and show short (< 1 second) times.
Comment 1•18 years ago
|
||
The problem here seems to be the lookup of the Array.push function property. The slot allocation mechanism in ScriptableObject scales really badly for dense arrays, because it completely fills the slot array without leaving any gaps. Now if a function property or other non-index property is looked up, ScriptableObject.getSlotPosition() searches from the property's hashcode value until it finds either the slot in question or a null slot. If the first n thousand slots are occupied by array elements, this causes a lot of useless looping.
I created an experimental patch that warps the hash/slot index to leave a free slot every 500 elements. With this, push() is roughly half as fast as direct element access for very large arrays.
Maybe the right solution would be to always store numeric Array properties in the dense array in NativeArray (how common are sparse arrays, really?), or separate the array elements from other properties by some other means.
Comment 2•18 years ago
|
||
Reporter | ||
Comment 3•18 years ago
|
||
Is there a good reason not to just use a Java HashMap for this?
Comment 4•18 years ago
|
||
I don't know for sure, but I guess the reasons were performance considerations and keeping order for numeric array elements.
I think the easiest way to fix this would be to use chaining instead of linear probing (see <http://en.wikipedia.org/wiki/Hash_table#Open_addressing_versus_chaining>). I'll see if I can come up with a patch.
Reporter | ||
Comment 5•18 years ago
|
||
"I don't know for sure, but I guess the reasons were performance considerations
and keeping order for numeric array elements."
FWIW, this doesn't prevent numeric array elements getting out of order... eg:var
a = {};
a [0] = 0;
a [2] = 2;
a.b = 'b';
a.n = 'n';
a [1] = 1;
for (var i in a) java.lang.System.out.println (''+i');
0
n
2
1
b
Comment 6•18 years ago
|
||
Here's a patch that implements the ScriptableObject slot table using chaining rather than linear probing for hash collisions. I've been using it for a few days locally and also run the test suite on it without any noticable problems. From my testing, performance is quite a bit better for large objects/arrays, and pretty much unchanged (maybe a hint slower, but almost unmeasurable) for small ones.
Comment 7•18 years ago
|
||
Contains a few fixes and improvements over the previous patch:
* Actually check integer index in slot lookup (!!!) This caused numeric slots to get lost in sparse arrays.
* Make Slot.next transient. This should preserve serialization compatibility, and should be ok since the slot table is newly built upon deserialization (haven't actually tested).
* Unify the two local getSlot() methods, no real need for the static one.
* Only use REMOVED object for lastAccess cache, we never actually use it in the slot table.
* Various fixes in replacing setterSlot: Check for slot identity, set wasDeleted on old slot, break the loop when done.
Attachment #246156 -
Attachment is obsolete: true
Attachment #246454 -
Attachment is obsolete: true
Comment 8•18 years ago
|
||
Hannes,
as you seem to be familiar with ScriptableObject's hashing code, I'd like to ask whether you see any reason why couldn't we just simply use a HashMap in the ScriptableObject implementation?
In fact, we could go one step further and introduce a sort of a "MapFactory" somewhere (probably pluggable into the Context object), so people could plug in whatever map implementations they want, i.e.
- the Trove <http://trove4j.sourceforge.net/> THashMap, to restore open-addressing implementation if they're unsatisfied with HashMap's strategy
- TreeMap if they'd want ordering guarantees in "for"
- LinkedHashMap if they'd want weaker ordering guarantees
Comment 9•18 years ago
|
||
(In reply to comment #8)
> as you seem to be familiar with ScriptableObject's hashing code, I'd like to
> ask whether you see any reason why couldn't we just simply use a HashMap in the
> ScriptableObject implementation?
That would normally seem like the way to go. But from a gut level, I'd hesitate to make such a big change at a spot that is so central to Rhino. For example, this might affect those people using the small-footprint version of Rhino in an environment where memory or cpu performance are scarse. So the goal of my patch was to fix the problem but change as little as possible otherwise.
> In fact, we could go one step further and introduce a sort of a "MapFactory"
> somewhere (probably pluggable into the Context object), so people could plug in
> whatever map implementations they want
That would definitely be interesting to experiment with, and to compare it with the current implementation. I don't think it would be useful for me to have as a permanent feature, but maybe it would be for others.
Reporter | ||
Comment 10•18 years ago
|
||
(In reply to comment #9)
> (In reply to comment #8)
> > as you seem to be familiar with ScriptableObject's hashing code, I'd like to
> > ask whether you see any reason why couldn't we just simply use a HashMap in the
> > ScriptableObject implementation?
>
> That would normally seem like the way to go. But from a gut level, I'd hesitate
> to make such a big change at a spot that is so central to Rhino. For example,
> this might affect those people using the small-footprint version of Rhino in an
> environment where memory or cpu performance are scarse. So the goal of my patch
> was to fix the problem but change as little as possible otherwise.
> > In fact, we could go one step further and introduce a sort of a "MapFactory"
>
> That would definitely be interesting to experiment with, and to compare it with
> the current implementation. I don't think it would be useful for me to have as
> a permanent feature, but maybe it would be for others.
To satisfy both of these efficiently you could create a factory for the default Scriptable implementation. Then you can have a hand-crafted version if you want, or one which uses java libraries and things like map factories internally. I'd definitely like to see this kind of flexibility in Rhino...
James
Comment 11•18 years ago
|
||
The patch for bug 224334 that has been committed breaks my patch. Since the problem with linear probing in ScriptableObject.java persists, I'll work on a new, updated patch when I have time.
Comment 12•18 years ago
|
||
This is a rewrite of my previous patch adapted to recent changes in ScriptableObject. Running the test suite over it with 5 second timeout, I got the same results as with CVS HEAD, with performance slightly improved.
Attachment #248648 -
Attachment is obsolete: true
Comment 13•18 years ago
|
||
David: adding you to CC list, maybe you are interested in looking into this.
Comment 14•18 years ago
|
||
I get a compile error building with the 2007-02-06 07:13 PDT patch:
[javac] Compiling 112 source files to /home/norris/rhino/mozilla/js/rhino/build/classes
[javac] /home/norris/rhino/mozilla/js/rhino/src/org/mozilla/javascript/ScriptableObject.java:2293: cannot find symbol
[javac] symbol : variable tableSize
[javac] location: class org.mozilla.javascript.ScriptableObject
[javac] int insertPos = getSlotIndex(tableSize, slot.indexOrHash);
[javac] ^
[javac] 1 error
Comment 15•18 years ago
|
||
The definition of int tableSize was removed in revision 1.120 of ScriptableObject. This new patch puts it in again. From a quick test I did, everything looks ok.
Attachment #254164 -
Attachment is obsolete: true
Comment 16•18 years ago
|
||
Patch committed on both 1.6R6 branch and trunk.
The idea of switching to HashMap is good, but that's more of a task for next release rather than this release.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Comment 17•18 years ago
|
||
Created Bug 383592 for the future enhancement of using HashMap in ScriptableObject.
Updated•18 years ago
|
Target Milestone: --- → 1.6R6
You need to log in
before you can comment on or make changes to this bug.
Description
•