Closed Bug 360964 Opened 18 years ago Closed 18 years ago

Native array push is really slow

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

1.6R6

People

(Reporter: james, Unassigned)

Details

Attachments

(1 file, 4 obsolete files)

Improves performance of demo script by leaving gaps in dense array slot allocation 18 years ago Hannes Wallnoefer 1.89 KB, patch		Details \| Diff \| Splinter Review
Implement slot table using chaining instead of linear probing 18 years ago Hannes Wallnoefer 10.57 KB, patch		Details \| Diff \| Splinter Review
Implement slot table using chaining instead of linear probing 18 years ago Hannes Wallnoefer 10.71 KB, patch		Details \| Diff \| Splinter Review
New implementation for slot table using chaining instead of linear probing 18 years ago Hannes Wallnoefer 12.30 KB, patch		Details \| Diff \| Splinter Review
Updated version of previous patch 18 years ago Hannes Wallnoefer 12.37 KB, patch		Details \| Diff \| Splinter Review

James Pharaoh

Reporter

Description

•

18 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1) Gecko/20060601 Firefox/2.0 (Ubuntu-edgy) Build Identifier: Rhino 1.6 R4 The code below shows a hundred fold speed difference between the push function and setting array elements directly. Furthermore, when running push some calls do execute quickly and some don't. This looks like some kind of simple optimisation problem in the push function, I've had a look but can't spot quite what it is.... Reproducible: Always Steps to Reproduce: var iMax = 100, jMax = 1000; var sys = java.lang.System; var t0 = sys.currentTimeMillis (); var x = []; for (var i = 0; i < iMax; i++) { for (var j = 0; j < jMax; j++) { x.push ({}); } sys.out.print ('.'); } var t1 = java.lang.System.currentTimeMillis (); java.lang.System.out.println (t1 - t0); var x = [], n = 0; for (var i = 0; i < iMax; i++) { for (var j = 0; j < jMax; j++) { x [n++] = {}; } sys.out.print ('.'); } var t2 = java.lang.System.currentTimeMillis (); java.lang.System.out.println (t2 - t1); Actual Results: The first line of dots are slow and appear erraticaly. The second line appear very quickly. ....................................................................................................36170.0 ....................................................................................................352.0 Expected Results: Both lines should appear quickly and show short (< 1 second) times.

Hannes Wallnoefer

Comment 1

•

18 years ago

The problem here seems to be the lookup of the Array.push function property. The slot allocation mechanism in ScriptableObject scales really badly for dense arrays, because it completely fills the slot array without leaving any gaps. Now if a function property or other non-index property is looked up, ScriptableObject.getSlotPosition() searches from the property's hashcode value until it finds either the slot in question or a null slot. If the first n thousand slots are occupied by array elements, this causes a lot of useless looping. I created an experimental patch that warps the hash/slot index to leave a free slot every 500 elements. With this, push() is roughly half as fast as direct element access for very large arrays. Maybe the right solution would be to always store numeric Array properties in the dense array in NativeArray (how common are sparse arrays, really?), or separate the array elements from other properties by some other means.

Hannes Wallnoefer

Comment 2

•

18 years ago

Attached patch Improves performance of demo script by leaving gaps in dense array slot allocation (obsolete) — Details — Splinter Review

James Pharaoh

Reporter

Comment 3

•

18 years ago

Is there a good reason not to just use a Java HashMap for this?

Hannes Wallnoefer

Comment 4

•

18 years ago

I don't know for sure, but I guess the reasons were performance considerations and keeping order for numeric array elements. I think the easiest way to fix this would be to use chaining instead of linear probing (see <http://en.wikipedia.org/wiki/Hash_table#Open_addressing_versus_chaining>). I'll see if I can come up with a patch.

James Pharaoh

Reporter

Comment 5

•

18 years ago

"I don't know for sure, but I guess the reasons were performance considerations and keeping order for numeric array elements." FWIW, this doesn't prevent numeric array elements getting out of order... eg:var a = {}; a [0] = 0; a [2] = 2; a.b = 'b'; a.n = 'n'; a [1] = 1; for (var i in a) java.lang.System.out.println (''+i'); 0 n 2 1 b

Hannes Wallnoefer

Comment 6

•

18 years ago

Attached patch Implement slot table using chaining instead of linear probing (obsolete) — Details — Splinter Review

Here's a patch that implements the ScriptableObject slot table using chaining rather than linear probing for hash collisions. I've been using it for a few days locally and also run the test suite on it without any noticable problems. From my testing, performance is quite a bit better for large objects/arrays, and pretty much unchanged (maybe a hint slower, but almost unmeasurable) for small ones.

Hannes Wallnoefer

Comment 7

•

18 years ago

Attached patch Implement slot table using chaining instead of linear probing (obsolete) — Details — Splinter Review

Contains a few fixes and improvements over the previous patch: * Actually check integer index in slot lookup (!!!) This caused numeric slots to get lost in sparse arrays. * Make Slot.next transient. This should preserve serialization compatibility, and should be ok since the slot table is newly built upon deserialization (haven't actually tested). * Unify the two local getSlot() methods, no real need for the static one. * Only use REMOVED object for lastAccess cache, we never actually use it in the slot table. * Various fixes in replacing setterSlot: Check for slot identity, set wasDeleted on old slot, break the loop when done.

Attachment #246156 - Attachment is obsolete: true

Attachment #246454 - Attachment is obsolete: true

Attila Szegedi

Comment 8

•

18 years ago

Hannes, as you seem to be familiar with ScriptableObject's hashing code, I'd like to ask whether you see any reason why couldn't we just simply use a HashMap in the ScriptableObject implementation? In fact, we could go one step further and introduce a sort of a "MapFactory" somewhere (probably pluggable into the Context object), so people could plug in whatever map implementations they want, i.e. - the Trove <http://trove4j.sourceforge.net/> THashMap, to restore open-addressing implementation if they're unsatisfied with HashMap's strategy - TreeMap if they'd want ordering guarantees in "for" - LinkedHashMap if they'd want weaker ordering guarantees

Hannes Wallnoefer

Comment 9

•

18 years ago

(In reply to comment #8) > as you seem to be familiar with ScriptableObject's hashing code, I'd like to > ask whether you see any reason why couldn't we just simply use a HashMap in the > ScriptableObject implementation? That would normally seem like the way to go. But from a gut level, I'd hesitate to make such a big change at a spot that is so central to Rhino. For example, this might affect those people using the small-footprint version of Rhino in an environment where memory or cpu performance are scarse. So the goal of my patch was to fix the problem but change as little as possible otherwise. > In fact, we could go one step further and introduce a sort of a "MapFactory" > somewhere (probably pluggable into the Context object), so people could plug in > whatever map implementations they want That would definitely be interesting to experiment with, and to compare it with the current implementation. I don't think it would be useful for me to have as a permanent feature, but maybe it would be for others.

James Pharaoh

Reporter

Comment 10

•

18 years ago

(In reply to comment #9) > (In reply to comment #8) > > as you seem to be familiar with ScriptableObject's hashing code, I'd like to > > ask whether you see any reason why couldn't we just simply use a HashMap in the > > ScriptableObject implementation? > > That would normally seem like the way to go. But from a gut level, I'd hesitate > to make such a big change at a spot that is so central to Rhino. For example, > this might affect those people using the small-footprint version of Rhino in an > environment where memory or cpu performance are scarse. So the goal of my patch > was to fix the problem but change as little as possible otherwise. > > In fact, we could go one step further and introduce a sort of a "MapFactory" > > That would definitely be interesting to experiment with, and to compare it with > the current implementation. I don't think it would be useful for me to have as > a permanent feature, but maybe it would be for others. To satisfy both of these efficiently you could create a factory for the default Scriptable implementation. Then you can have a hand-crafted version if you want, or one which uses java libraries and things like map factories internally. I'd definitely like to see this kind of flexibility in Rhino... James

Hannes Wallnoefer

Comment 11

•

18 years ago

The patch for bug 224334 that has been committed breaks my patch. Since the problem with linear probing in ScriptableObject.java persists, I'll work on a new, updated patch when I have time.

Hannes Wallnoefer

Comment 12

•

18 years ago

Attached patch New implementation for slot table using chaining instead of linear probing (obsolete) — Details — Splinter Review

This is a rewrite of my previous patch adapted to recent changes in ScriptableObject. Running the test suite over it with 5 second timeout, I got the same results as with CVS HEAD, with performance slightly improved.

Attachment #248648 - Attachment is obsolete: true

Hannes Wallnoefer

Comment 13

•

18 years ago

David: adding you to CC list, maybe you are interested in looking into this.

Norris Boyd

Comment 14

•

18 years ago

I get a compile error building with the 2007-02-06 07:13 PDT patch: [javac] Compiling 112 source files to /home/norris/rhino/mozilla/js/rhino/build/classes [javac] /home/norris/rhino/mozilla/js/rhino/src/org/mozilla/javascript/ScriptableObject.java:2293: cannot find symbol [javac] symbol : variable tableSize [javac] location: class org.mozilla.javascript.ScriptableObject [javac] int insertPos = getSlotIndex(tableSize, slot.indexOrHash); [javac] ^ [javac] 1 error

Hannes Wallnoefer

Comment 15

•

18 years ago

Attached patch Updated version of previous patch — Details — Splinter Review

The definition of int tableSize was removed in revision 1.120 of ScriptableObject. This new patch puts it in again. From a quick test I did, everything looks ok.

Attachment #254164 - Attachment is obsolete: true

Norris Boyd

Comment 16

•

18 years ago

Patch committed on both 1.6R6 branch and trunk. The idea of switching to HashMap is good, but that's more of a task for next release rather than this release.

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Norris Boyd

Comment 17

•

18 years ago

Created Bug 383592 for the future enhancement of using HashMap in ScriptableObject.

Norris Boyd

Updated

•

18 years ago

Target Milestone: --- → 1.6R6

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Native array push is really slow

Categories

(Rhino Graveyard :: Core, defect)

Tracking

(Not tracked)

People

(Reporter: james, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 4 obsolete files)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Comment 16

Comment 17

Updated

Attachment

General

Description

File Name

Content Type