Javascript: Object construction time O(n^2) when getter closures used

RESOLVED FIXED in mozilla1.9alpha1

Status

()

Core
JavaScript Engine
P2
normal
RESOLVED FIXED
11 years ago
10 years ago

People

(Reporter: Charles Verdon, Assigned: brendan)

Tracking

({perf})

1.8 Branch
mozilla1.9alpha1
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(5 attachments, 6 obsolete attachments)

(Reporter)

Description

11 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.2) Gecko/20060308 Firefox/1.5.0.2

Allocating an object is fast if it contains normal functions but VERY slow when the same functionality is coded with getters:

function ObjectWithFunction()
{
	var m_var = null;
	this.init = function(p_var) {
		m_var = p_var;
	}
	this.getVar = function()
	{
		return m_var;
	}
	this.setVar = function(v)
	{
		m_var = v;
	}
}

versus

function ObjectWithGetter()
{
	var m_var = null;
	this.init = function(p_var) {
		m_var = p_var;
	}
	this.Var getter = function() {
		return m_var;
	}
	this.Ver setter = function(v) {
		m_var = v;
	}
}

(See test case)

Charles

Reproducible: Always

Steps to Reproduce:
1. Allocate sequentially objects with getters. 


Actual Results:  
It takes close to 1 minute on my old P4.

Expected Results:  
Around 1 sec like objects with functions.


Currently this causes my extension (Data Analytics) to take a very long time to load a table with lots of rows (1 row = 1 obj). I don't really want to remove getters because they are convenient...
(Reporter)

Comment 1

11 years ago
Created attachment 220029 [details]
Test case

Updated

11 years ago
Assignee: general → general
Component: General → JavaScript Engine
Product: Mozilla Application Suite → Core
QA Contact: general → general
Version: unspecified → 1.8 Branch
(Assignee)

Comment 2

11 years ago
This needs investigating, because there shouldn't be any difference.

However, you've written an inefficient constructor pattern.  There is no reason for methods, getters or setters to be assigned to the same property ids in each and every instance.  Just set the methods, etc., once, on ObjectWithWhatever.prototype.

/be
(Assignee)

Updated

11 years ago
Summary: Javascript: Object allocation time abismal when getters used → Javascript: Object allocation time abysmal when getters used
(Reporter)

Comment 3

11 years ago
Hi,
Getters and setters could be in the prototype if the underlying field was public (ie this.m_var) but the code relies on closures to provide private members of 'classes'. The getters/setters in the prototype does not have access to the scope of the object so cannot access the value (var m_var). At least this is my understanding of how it works.

cv
(Assignee)

Comment 4

11 years ago
Information hiding is overrated.  Yet using closures as objects is not causing a performance problem for the first case.  Is it possible you have turned on strict warnings?  Check your JavaScript console -- this could easily be the problem.

/be
(Reporter)

Comment 5

11 years ago
Strict warnings are not activated...
(Assignee)

Comment 6

11 years ago
Ok, that was a shot in the right direction.  I suspect the security check at http://lxr.mozilla.org/mozilla/source/js/src/jsinterp.c#5133 is costly in Firefox (which means this is not a JS engine bug).

Please try this variation on your constructor:

function ObjectWithGetter()
{
    var m_var = null;
    var self = {
        init: function(p_var) {
            m_var = p_var;
        },
        get Var() {
            return m_var;
        },
        set Ver(v) {
            m_var = v;
        }
    };
    return self;
}

If this performs about as well as the constructor closure form that does not use getters and setters, then the runtime access check, which calls into mozilla/caps code, is where the cost arises.

/be
(Assignee)

Comment 7

11 years ago
Created attachment 220150 [details] [diff] [review]
optimize strict warning costs in parsing getter/setter

This is not the mysterious cost, but it's an avoidable minor cost (atomizing a printable version of the getter or setter name, which is needed only if the strict option is enabled).

/be
Assignee: general → brendan
Status: UNCONFIRMED → ASSIGNED
Attachment #220150 - Flags: review?(mrbkap)
(Assignee)

Updated

11 years ago
Priority: -- → P3
Target Milestone: --- → mozilla1.9alpha

Updated

11 years ago
Attachment #220150 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 8

11 years ago
(In reply to comment #6)
>         get Var() {
>             return m_var;
>         },
>         set Ver(v) {

I copied the Ver typo in comment 0 -- obviously the setter for m_var should be named Var.

/be
(Assignee)

Comment 9

11 years ago
Created attachment 220153 [details]
JS shell testcase, should work in browser too

I've reduced the testcase provided, cutting out unrelated costs and browser-only gewgaws.  There's a real bug here, and it does not involve access checks:

bash-3.1$ ./WINNT5.1_OPT.OBJ/js foopy.js
100 ms
210 ms
58715 ms

Thanks for reporting it.  Now to profile...

/be
(Assignee)

Comment 10

11 years ago
BTW, commenting out the Var set in the loop doesn't fix the disparity:

                var obj = new ctor();
                //obj.Var = 42;
                arr.push(obj);

It's the constructor function runtime that is abysmal, as (with corrected summary) reported.

/be
Summary: Javascript: Object allocation time abysmal when getters used → Javascript: Object construction time abysmal when getters used
(Tweaking summary to make it include the helpful word "slow".  I had to eventually remember the editorial "abysmal" to find this again.)

I did some profiling with Shark (what a fantastic tool!), and while I'm still trying to figure out how to generate a friendly exported report, it seems to show pretty clearly that the getter case is spending a lot more time in InsertPropertyTreeChild and GetPropertyTreeChild.  (They show up like red-hot lava in the comparison of a session using just ObjectWithFunction with one using just ObjectWithGetter.  Together, they make up about 50% of the different in time, measured as "Self".)

In GetPropertyTreeChild itself, the hotspot is the chunk-enumeration loop (~lines 753-762); it might be enlightening to add some instrumentation and see if we're violating the low-fanout assumption described in the preceding comment.

Similarly in InsertPropertyTree, where the chunk-scanning loop from 584 to 604 is hottest.

Shark recommends some unrolling and doing something about the store/load dispatch-group dependencies in the NULL-checking of sprop, and the alignment of the loop, but I'm probably not profiling an opt build, so use a grain of salt there.

The SPROP_MATCH macro dominates those loops in both cases, perhaps obviously.   (The cmpw instruction at the heart of SPROP_MATCH is unsurprisingly hot when I drill down to assembly.)

Brendan: I'd be more than happy to show you the Shark results on Monday, lemme know if you want to get together.  Someone with an Intel Mac can get the IA32-analyzed variant as well, without too much trouble at all.
Summary: Javascript: Object construction time abysmal when getters used → Javascript: Object construction time abysmally slow when getters used
(Assignee)

Comment 12

11 years ago
Oh, of course.  The property tree is not designed to scale better than O(n^2) for n kids below the root ply.  It didn't seem worth optimizing, but this bug shows how getter and setter closure-constructor triggers it too easily.  Patching...

/be
Flags: blocking1.9?
Keywords: perf
(Assignee)

Updated

11 years ago
OS: Windows XP → All
Priority: P3 → P2
Hardware: PC → All
(Assignee)

Updated

11 years ago
Summary: Javascript: Object construction time abysmally slow when getters used → Javascript: Object construction time O(n^2) when getter closures used
(Assignee)

Comment 13

11 years ago
Created attachment 240892 [details] [diff] [review]
patch to consider

Time without patch:

107 ms
249 ms
51923 ms

Time with patch:

102 ms
233 ms
2469 ms

/be
Attachment #240892 - Flags: superreview?(shaver)
Attachment #240892 - Flags: review?(igor.bukanov)

Comment 14

11 years ago
Comment on attachment 240892 [details] [diff] [review]
patch to consider

My review+ would have no weight. I think I got the idea behind hashtable sharing/scope tree, but I never looked at  the details. In particular I have no idea why the patch improves things.
Attachment #240892 - Flags: review?(igor.bukanov)
(Assignee)

Comment 15

11 years ago
(In reply to comment #14)
> (From update of attachment 240892 [details] [diff] [review] [edit])
> My review+ would have no weight. I think I got the idea behind hashtable
> sharing/scope tree, but I never looked at  the details.

Ok, thanks anyway.

> In particular I have no idea why the patch improves things.

Each constructor call results in a getter and setter closure being captured by the new object, and each getter and setter function object must be cloned in order to hold the different scope chain link (parent slot) to the particular activation of the constructor.  So below the root ply of the property tree, the test will make 32000 children (each, at ply 1 and ply 2) for the get/set variant.

With the patch, after 30 kids in three chunks the code switches to using a double hashtable in the first chunk, avoiding O(n^2) growth.

I put a table member in each chunk because it's messy/hard to optimize just to extend the first chunk's allocation.  For one thing, the first chunk may be deleted by the GC if no scopes need any properties that reach kids in it, but then  we'd have to move the table pointer to the new first chunk, reallocating it, etc.

We don't want to add locking costs at all when searching and inserting in the property tree, unless only in rare, exceptional cases such as surpassing the 30 threshold.  And then we must cope with racing insertions that make their own hashtable (loser has to throw its table away).

I think the bug is fixed, and the runtime differences reflect greater costs of getter/setter cloning, but I'll need to spend time in shark to confirm.

Hoping shaver can sr+.  I'll r? mrbkap.

/be
(Assignee)

Updated

11 years ago
Attachment #240892 - Flags: review?(mrbkap)
(Assignee)

Comment 16

11 years ago
(In reply to comment #15)
> So below the root ply of the property tree, the
> test will make 32000 children (each, at ply 1 and ply 2) for the get/set
> variant.

Just to be clear: the new properties are needed because a JSScopeProperty is identified in the property tree by almost all of its members, including the getter and setter members, which if JSPROP_GETTER|JSPROP_SETTER are actually cloned function object pointers (so must differe for each construction of the environment-capturing object containing a getter and a setter bound to lambda expressions).

> With the patch, after 30 kids in three chunks the code switches to using a
> double hashtable in the first chunk, avoiding O(n^2) growth.

In order to leave the rest of the code alone, I should have added that the kids chunks are still maintained, up to long list length.  The hashtable is a dual data structure to optimize lookups in GetPropertyTreeChild.

We can't switch from chunks to use only the table, because (a) we would need to lock on every insertion, pessimistically; (b) we need to cope with "duplicate child" cases that arise from "middle delete" followed by GC (e.g., x-y-z and x-w-y are in two scopes and therefore mapped by the property tree; then the user deletes w in the second object, then the GC runs, collecting w's node and leaving x-y-z and x-y' where y' is identical to y under the JSScopeProperty hash equivalence). Duplicate child cases require a list or array, not (only) a hashtable.

/be
(Assignee)

Comment 17

11 years ago
(In reply to comment #16)
> (b) we need to cope with "duplicate
> child" cases that arise from "middle delete" followed by GC (e.g., x-y-z and
> x-w-y are in two scopes and therefore mapped by the property tree; then the
> user deletes w in the second object, then the GC runs, collecting w's node and
> leaving x-y-z and x-y' where y' is identical to y under the JSScopeProperty
> hash equivalence). Duplicate child cases require a list or array, not (only) a
> hashtable.

And therefore (to finish the story), when using the hashtable to optimize lookups, care must be taken to avoid hitting y' when looking for y, and vice versa, when removing.  IOW we must remove the right kid, no matter which kid the hashtable maps.

Yes, this means that the hashtable when optimizing lookups will lead to only one kid (wherefore the need to maintain all the chunks, reason (b) here).  Since y and y' are equivalent under the property tree identity relation, that's ok.  The only reason we can't fuse y and y' is because the GC would then have to forward all pointers to y' to y.  That didn't and does not seem worth the effort.

/be
(Assignee)

Comment 18

11 years ago
Created attachment 245261 [details] [diff] [review]
updated to trunk

I'd like to get this in.  I think mrbkap and shaver are the reviewers to request.  Please demur quickly if you don't agree or don't have time, you two!  Thanks,

/be
Attachment #240892 - Attachment is obsolete: true
Attachment #245261 - Flags: superreview?(shaver)
Attachment #245261 - Flags: review?(mrbkap)
Attachment #240892 - Flags: superreview?(shaver)
Attachment #240892 - Flags: review?(mrbkap)
Comment on attachment 245261 [details] [diff] [review]
updated to trunk

I think that looks fine. sr=shaver
Attachment #245261 - Flags: superreview?(shaver) → superreview+
Comment on attachment 245261 [details] [diff] [review]
updated to trunk

>+static JSDHashTable *
>+HashChunks(PropTreeKidsChunk *chunk, uintN n)
>+{
...
>+            entry = (JSPropertyTreeEntry *)
>+                    JS_DHashTableOperate(table, sprop, JS_DHASH_ADD);

Do you need to null-check entry here?

>+                                chunk = KIDS_TO_CHUNK(parent->kids);
>+                                if (!chunk->table) {
>+                                    table = HashChunks(chunk, n);
>+                                    JS_LOCK_RUNTIME(rt);
>+                                    if (!table)
>+                                        goto out_of_memory;
>+                                    if (chunk->table)

How will |chunk->table| ever be non-null here?

r=mrbkap
Attachment #245261 - Flags: review?(mrbkap) → review+
(In reply to comment #20)
> >+                                chunk = KIDS_TO_CHUNK(parent->kids);
> >+                                if (!chunk->table) {
> >+                                    table = HashChunks(chunk, n);
> >+                                    JS_LOCK_RUNTIME(rt);
> >+                                    if (!table)
> >+                                        goto out_of_memory;
> >+                                    if (chunk->table)
> 
> How will |chunk->table| ever be non-null here?

If this thread loses the race to create the table, it will be non-null.  From comment 15:

"We don't want to add locking costs at all when searching and inserting in the
property tree, unless only in rare, exceptional cases such as surpassing the 30
threshold.  And then we must cope with racing insertions that make their own
hashtable (loser has to throw its table away)."
(In reply to comment #21)
> threshold.  And then we must cope with racing insertions that make their own
> hashtable (loser has to throw its table away)."

Ah, sorry. I'd forgotten about that comment... Threads suck!
(Assignee)

Updated

11 years ago
Blocks: 356116
(Assignee)

Comment 23

11 years ago
Created attachment 245416 [details] [diff] [review]
with pre-loaded capacity fix

Mac sed is broken, so I would appreciate it if someone can attach the xpcom/glue pldhash.[ch] patch based on running js/src/plify_jsdhash.sed on jsdhash.[ch] with this patch applied.

/be
Attachment #245261 - Attachment is obsolete: true
Attachment #245416 - Flags: superreview?(shaver)
Attachment #245416 - Flags: review?(mrbkap)

Updated

11 years ago
Attachment #245416 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 24

11 years ago
Created attachment 246124 [details] [diff] [review]
My Mac now has gnused, yay
(Assignee)

Comment 25

11 years ago
Comment on attachment 245416 [details] [diff] [review]
with pre-loaded capacity fix

Shaver already reviewed the jsscope.c change -- dbaron, could you sr the *dhash.[ch] changes?  Thanks,

/be
Attachment #245416 - Flags: superreview?(shaver) → superreview?(dbaron)
(Assignee)

Comment 26

11 years ago
Created attachment 247447 [details] [diff] [review]
updated to merge other jsscope.c changes

The "My Mac now has gnused, yay" attachment for pldhash.[ch] is still fresh.

/be
Attachment #245416 - Attachment is obsolete: true
Attachment #247447 - Flags: superreview?(dbaron)
Attachment #247447 - Flags: review+
Attachment #245416 - Flags: superreview?(dbaron)
Comment on attachment 247447 [details] [diff] [review]
updated to merge other jsscope.c changes

sr=dbaron on the jsdhash.* changes, although I wonder whether JS_DHASH_CAP should have some name that more clearly warns people not to use it.
Attachment #247447 - Flags: superreview?(dbaron) → superreview+
(Assignee)

Comment 28

11 years ago
(In reply to comment #27)
> (From update of attachment 247447 [details] [diff] [review] [edit])
> sr=dbaron on the jsdhash.* changes, although I wonder whether JS_DHASH_CAP
> should have some name that more clearly warns people not to use it.

The name wants to be short too.  It could just be obscured, but probably a warning comment is best.

/be
(Assignee)

Comment 29

11 years ago
Created attachment 247487 [details] [diff] [review]
jsscope.c-only patch

This depends on the patch that I just split out in bug 356116.

/be
Attachment #247487 - Flags: review+
(Assignee)

Comment 30

11 years ago
Fixed on trunk:

Checking in jsscope.c;
/cvsroot/mozilla/js/src/jsscope.c,v  <--  jsscope.c
new revision: 3.51; previous revision: 3.50
done

/be
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
(Assignee)

Comment 31

11 years ago
Backed out jsscope.c rev 3.51, so reopening.  If this back-out fixes balsa's orangeness, there must be a bug in that rev, or a more subtle bug in the macro added for bug 356116 (JS_DHASH_CAPACITY) that bites only the jsscope.c change.

/be
Status: RESOLVED → UNCONFIRMED
Resolution: FIXED → ---
(Assignee)

Updated

11 years ago
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
(Assignee)

Updated

11 years ago
Blocks: 362762
(Assignee)

Updated

11 years ago
Blocks: 365851
(Assignee)

Updated

11 years ago
No longer blocks: 356116
Depends on: 356116
(Assignee)

Comment 32

11 years ago
Created attachment 251349 [details] [diff] [review]
fixed jsscope.c patch

Fixed RemovePropertyTreeChild to remove from chunk->table; obviously this function must not return early, hence the new freeChunk variable initialized to null, and the goto out usage.

/be
Attachment #246124 - Attachment is obsolete: true
Attachment #247447 - Attachment is obsolete: true
Attachment #247487 - Attachment is obsolete: true
Attachment #251349 - Flags: review?(mrbkap)

Updated

11 years ago
Attachment #251349 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 33

11 years ago
Fixed on trunk:

js/src/jsscope.c rev 3.56

/be
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago11 years ago
Resolution: --- → FIXED

Comment 34

11 years ago
Created attachment 251751 [details]
js1_5/Object/regress-335700.js

I am attaching this for your enjoyment but not checking it in since the bug as described and it's fix aren't quite accurate and the actual result can't be tested independently of the machine being used.

For small ranges of 0..10,000 the test with ObjectWithGetter is O(N^2) in both 1.8.0 and 1.9.0 and is O(N) for 30,000..40,000. So really this fix didn't change the big O behavior but did substantially change the scale factor.

Updated

11 years ago
Flags: in-testsuite-

Comment 35

11 years ago
/cvsroot/mozilla/js/tests/js1_5/extensions/regress-335700.js,v  <--  regress-335700.js

I've gone ahead and checked this test in rather than try to keep track of it locally. I've also moved it to the extensions subdir since it uses getters.
Flags: in-testsuite- → in-testsuite+
Flags: blocking1.9?
You need to log in before you can comment on or make changes to this bug.