Closed Bug 417131 Opened 12 years ago Closed 11 years ago

JS Enumeration Allocation Consternation

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

VERIFIED FIXED

People

(Reporter: sayrer, Assigned: igor)

References

Details

(Keywords: perf, testcase, verified1.9.1)

Attachments

(3 files, 25 obsolete files)

90.96 KB, patch
Details | Diff | Splinter Review
1.38 KB, text/plain
Details
v5
19.66 KB, patch
igor
: review+
Details | Diff | Splinter Review
string-fasta debian shootout test shows that enumeration is slow because we malloc to make enumerators used in for-in loops. It would be nice to get these on the stack for native stuff.
Assignee: general → crowder
Just finishing up testing on this now.
Comment on attachment 303302 [details] [diff] [review]
keep a stack of leftover iterator structs to be reused as we go; free in gc

Victory in the JS test-suite, I think this is a decent fix with a light touch.
Attachment #303302 - Flags: review?(brendan)
Version: unspecified → Trunk
This seems to yields a small, but measurable perf-bump across the board.
Flags: blocking1.9?
Priority: -- → P2
Target Milestone: --- → mozilla1.9beta4
Comment on attachment 303302 [details] [diff] [review]
keep a stack of leftover iterator structs to be reused as we go; free in gc

Duhrr, this doesn't need the extra next pointer, I'll just overload the original one.  Patch next.
Attachment #303302 - Attachment is obsolete: true
Attachment #303302 - Flags: review?(brendan)
Attached patch less wasteful (obsolete) — Splinter Review
Attachment #303595 - Flags: review?(brendan)
Actually, ditching my blocking request here.  Might be worth a small speed-up, but I don't think worth blocking on.
Flags: blocking1.9?
Comment on attachment 303595 [details] [diff] [review]
less wasteful

The jsarena.[ch] stuff is unrelated to this bug. 

This doesn't avoid allocating the JSIdArray, however -- and it can hoard memory bounded by nesting depth of for-in constructs. If we want to use only interpreter stack space, we should look for a way to bypass the id-array.

/be
Attachment #303595 - Flags: review?(brendan) → review-
Attached patch WIP, good for testing (obsolete) — Splinter Review
More to do...

/be
Assignee: crowder → brendan
Attachment #303595 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Comment on attachment 303836 [details] [diff] [review]
WIP, good for testing

This is failing JSON tests. I think I'm using it right.

*** CHECK FAILED: {"child has two members":{"this":"one"}} == {"child has two members":{"this":"one","2":"and this one"}}
JS frame :: xpcshell-simple/head.js :: do_throw :: line 99
JS frame :: xpcshell-simple/head.js :: do_check_eq :: line 114
JS frame :: xpcshell-simple/json_test/unit/test_encode.js :: testStringEncode :: line 118
JS frame :: xpcshell-simple/json_test/unit/test_encode.js :: run_test :: line 232
JS frame :: xpcshell-simple/tail.js :: _execute_test :: line 43
JS frame :: xpcshell-simple/execute_test.js :: <TOP_LEVEL> :: line 38
Attached patch change JSON iterator call (obsolete) — Splinter Review
sayrer: thanks, I will supply a full patch for the next WIP, which should work -- may need your help testing, but won't break overtly.

/be
OS: Mac OS X → All
Hardware: PC → All
Target Milestone: mozilla1.9beta4 → mozilla1.9
Priority: P2 → P1
Target Milestone: mozilla1.9 → mozilla1.9beta5
This includes patched nsJSON.{cpp,h}, so patch -p0 < fasta.patch from the mozilla directory. Testing help welcome, more tests coming.

/be
Attachment #303836 - Attachment is obsolete: true
Attachment #304269 - Attachment is obsolete: true
Some important fixes here. Comments and tests coming soon.

/be
Attachment #307865 - Attachment is obsolete: true
(In reply to comment #14)
> Some important fixes here. Comments and tests coming soon.

Suspense level: [XXXXXXXXXXXXX  ]

Please save us!
Sorry, optimizing is easy but recovering correctness in the case of deletion of not-yet-enumerated properties after a loop starts, and addition of same, is not. This is for 3.next.

/be
Target Milestone: mozilla1.9beta5 → ---
Getting this on the 1.9.1 triage queue.
Flags: blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1+
Attached patch brute force alterna-patch (obsolete) — Splinter Review
** TOTAL **:           1.039x as fast    2904.9ms +/- 0.3%   2795.8ms +/- 0.1%     significant

Shape-based JSNativeEnumerator (formerly JSNativeIteratorState) caching in the runtime. Comments welcome.

The first patch (attachment 307967 [details] [diff] [review]) tries to optimize the short-ish property map case by avoiding allocating anything but another stack slot. The downside is that it has to cope with the reverse-linked property tree path, and to compensate for adds and deletes (js_EnumeratingBarrier -- not yet written and debugged).

This patch has the virtue of being simple, and saving non-trivial micro-benchmark runtime. It could be considered contrived, though. How often does real-world code re-enumerate the same shape?

/be
Attachment #326858 - Flags: review?(igor)
(In reply to comment #18)

> This patch has the virtue of being simple, and saving non-trivial
> micro-benchmark runtime. It could be considered contrived, though. How often
> does real-world code re-enumerate the same shape?

I'll instrument the hit rate and report it from the GC, so we can see how it trends over real browsing sessions using gmail, etc.

/be
Depends on: 443039
Still need to instrument hit rate.

/be
Attachment #327692 - Flags: review?(igor)
Attachment #326858 - Attachment is obsolete: true
Attachment #326858 - Flags: review?(igor)
After loading gmail at a message, going back to Inbox, with six other tabs all loading bugzilla pages:

nativeEnumCache hit rate 79.8891
nativeEnumCache hit rate 79.9094
nativeEnumCache hit rate 80.0898
nativeEnumCache hit rate 80.1295
--WEBSHELL 0x25eb42e0 == 18
--WEBSHELL 0x25ea44b0 == 17

After SunSpider in-browser run:

nativeEnumCache hit rate 81.6874

Patch with instrumentation next.

/be
Attached patch with ifdef'ed instrumentation (obsolete) — Splinter Review
hg qdiff again -- this is pushed on top of the JSOP_ITER patch. I can attach the `hg diff -r qparent -U 8 -p` result but it will show both patches of course.

I let the hit rate accumulate -- resetting probe and miss counters on each gc seems too forgetful, often gives 100% hit rate (nan% on shutdown ;-). Suggestions welcome, but this seems winning. Latest final rate on shutdown after SunSpider and gmail:

nativeEnumCache hit rate 98.4947%

/be
Attachment #327692 - Attachment is obsolete: true
Attachment #327711 - Flags: review?(igor)
Attachment #327692 - Flags: review?(igor)
The pat(In reply to comment #22)
> Created an attachment (id=327711) [details]
> with ifdef'ed instrumentation

The patch in JSENUMERATE_INIT removes JSNativeEnumerator from the cache and then cache it back in JSENUMERATE_DESTROY which requires to take the lock again. AFAICS that can be avoided if JSENUMERATE_INIT would keep the state in the cache and not put it to doubly-linked list. The state is pushed to the doubly-linked list only if it is evicted from the cache while the enumerator is still running.

Then JSENUMERATE_DESTROY can check that the state is not on the list and simply set next_index to some sentinel value indicating that the state struct can be reused. A possible race with another thread that evicts the state from the cache and puts it to the list is harmless. It would just mean that the deallocation of the struct happens during the GC when it finds a struct with a sentinel value of next_index on the list.

Another thing that can be improved is to allocate the id array together with the state object improving cache locality.
(In reply to comment #23)
> The patch in JSENUMERATE_INIT removes JSNativeEnumerator from the cache and
> then cache it back in JSENUMERATE_DESTROY which requires to take the lock
> again.

The JSENUMERATE_DESTROY case already takes the look to remove the state from the linked list, I didn't add extra lock-taking. The critical path is only slightly longer.

> AFAICS that can be avoided if JSENUMERATE_INIT would keep the state in
> the cache and not put it to doubly-linked list. The state is pushed to the
> doubly-linked list only if it is evicted from the cache while the enumerator is
> still running.

That is an interesting idea. It would require marking from the cache, though, instead of purging the cache as we do now. How then would the cache entries be collected under memory pressure?

> Then JSENUMERATE_DESTROY can check that the state is not on the list and simply
> set next_index to some sentinel value indicating that the state struct can be
> reused. A possible race with another thread that evicts the state from the
> cache and puts it to the list is harmless. It would just mean that the
> deallocation of the struct happens during the GC when it finds a struct with a
> sentinel value of next_index on the list.

There are several other things to try, as you note:

* Fuse allocations more.

* Avoid an object (use a PRIVATE jsval as the iterator, which untags to the state pointer by itself -- this requires another stack slot for the current object being enumerated but I had a patch that supplied jsval *vp instead of JSObject *obj to js_CallIteratorNext, etc. -- wasn't hard);

* Try sharing cache hits as you propose here.

But I'm hoping we can do these in follow up bugs. I'm happy to give this bug and patch away too, if you are interested. Or r+ it and I'll file the followup bugs, which are up for grabs. Thoughts?

/be
(In reply to comment #25)
> > AFAICS that can be avoided if JSENUMERATE_INIT would keep the state in
> > the cache and not put it to doubly-linked list. The state is pushed to the
> > doubly-linked list only if it is evicted from the cache while the enumerator is
> > still running.
> 
> That is an interesting idea. It would require marking from the cache, though,
> instead of purging the cache as we do now. How then would the cache entries be
> collected under memory pressure?

The entry can be collected as long as the enumerator is done as indicated by a sentinel value in next_index. Otherwise the entry must be traced. In general the idea is to view the cache as an extension to the current doubly-linked list. First the enumerator goes to the cache and then to the list after eviction from the cache.

> 
> But I'm hoping we can do these in follow up bugs.

Well, fusing the state and id array would make the changes from the current patch about JS_DestroyIdArray -> js_DestroyIdArray renames unnecessary (and may even require to undo them to minimize code bloat). So it is better to do it here within this bug. 

On the other hand that lock-free JSENUMERATE_DESTROY can be built upon the patch in separated bugs.

To hide the empty id array singleton I would be willing to swallow the js_ vs. JS_DestroyIdArray layering -- we do that for similar reasons elsewhere. Hope this is not controversial.

I'll invite whoever is up to it to steal this bug, but do a fusing pass when I get time (may be next week) and keep this fresh in my mq.

/be
The patch is a result of late night of hacking to check if fusing state and ids would minimize the patch and code. AFAICS this is indeed the case.

I have not done any testing with the patch.
The empty_id_array singleton was a win in past testing, but I forget whether with SunSpider or my usual gmail+bugzilla session. Worth testing.

Steal this bug, please! Or we can ping-pong patches, but want to converge.

/be
(In reply to comment #29)
> 
> Steal this bug, please! Or we can ping-pong patches, but want to converge.

OK, I will test the fused patch tomorrow and ask for a review. All those lock-less optimizations can go to a separated bug.
Assignee: brendan → igor
Status: ASSIGNED → NEW
The test should provide a good coverage for the cache code including accessing the cache from different threads.
The new version puts the enumerator states to the global list only when it is evicted from the cache and implements that idea of lock-free JSENUMERATION_DESTROY. For the states in the cache it also takes the GC lock only once.

The patch passes shell and mochi tests.
Attachment #327997 - Flags: review?(brendan)
(In reply to comment #29)
> The empty_id_array singleton was a win in past testing, but I forget whether
> with SunSpider or my usual gmail+bugzilla session. Worth testing.

I should have remembered: empty_id_array is an obvious win for all for-in loops that use enumeration, where the standard prototype(s) -- Object.prototype at least, Array.prototype before it for arrays -- have not been extended with enumerable ad-hoc properties.

/be
(In reply to comment #33)
> I should have remembered: empty_id_array is an obvious win for all for-in loops
> that use enumeration, where the standard prototype(s) -- Object.prototype at
> least, Array.prototype before it for arrays -- have not been extended with
> enumerable ad-hoc properties.

Year, but with fused state/id array this is no longer an issue: no extra memory will be allocated for id-less enumerations.
Can you eliminate the state allocation altogether for objects with zero enumerable properties? More stack encoding of native-empty state pointer, with another slot for current obj, and friend API changes to pass jsval *vp -- this means nsJSON.cpp changes too, I will look for the patch I hacked a few months ago that did this -- let me know.

/be
(In reply to comment #35)
> Can you eliminate the state allocation altogether for objects with zero
> enumerable properties? 

That would not work since Object.prototype and Array.prototype would have different shape values for the scope. So a different empty state would be required for each of them. Assuming good hashing, this is exactly what happens with the cache. 

A shared state would be possible only if state->shape is moved outside the state into the hashtable itself. But this would increase the size of the cache by the factor of 2 and probably hurt CPU caching.
(In reply to comment #36)
> (In reply to comment #35)
> > Can you eliminate the state allocation altogether for objects with zero
> > enumerable properties? 
> 
> That would not work since Object.prototype and Array.prototype would have
> different shape values for the scope. So a different empty state would be
> required for each of them. Assuming good hashing, this is exactly what happens
> with the cache. 

Still the empty state can be made better: one do not need to to put it to the doubly linked list. In fact prevp/next/next_index can be eliminated for it completely. So the cost for each empty scope would be a hash pointer to 2-word struct.
(In reply to comment #36)
> (In reply to comment #35)
> > Can you eliminate the state allocation altogether for objects with zero
> > enumerable properties? 
> 
> That would not work since Object.prototype and Array.prototype would have
> different shape values for the scope.

I was proposing a more radical change: push a pseudo-boolean meaning "empty state" instead of any state pointer. This builds on the idea of avoiding making an object to wrap the state for the native enumerator case.

/be
No need for pseudo-boolean novelty, just use JSVAL_VOID:

vp[0] - current object along prototype chain
vp[1] - JSVAL_VOID if zero enumerable properties,
        else PRIVATE_TO_JSVAL(state),
        else OBJECT_TO_JSVAL(iterobj).

This works with something like the current js_ValueToIterator signature. With more changes, we could simply stop enumerating on zero enumerable properties if at the last object in the proto chain, or skip that object if not at last.

/be
Ok, so the patch "passes js tests in the shell" (attachment 307967 [details] [diff] [review]) does have the js_CallIteratorNext(cx, vp, rval) signature change I've been talking up.

/be
Attached patch optimizing empty enumerator (obsolete) — Splinter Review
The new version returns JSVAL_NULL as a state value for empty enumerations. In addition for such enumeration the cache entry points to a heap-allocated uint32 value holding the shape. It avoids the need to allocate the full JSNativeEnumerator struct. To distinguish between normal and empty cache entries the pointer for normal entry is tagged with 1 in the lowest bit. 

I checked the patch just against the stress test.
Attachment #327997 - Attachment is obsolete: true
Attachment #327997 - Flags: review?(brendan)
(In reply to comment #40)
> Ok, so the patch "passes js tests in the shell" (attachment 307967 [details] [diff] [review]) does have
> the js_CallIteratorNext(cx, vp, rval) signature change I've been talking up.
> 
> /be
> 

I filed the bug 443599 to implement this iterator object elimination as this is orthogonal to optimizing JSNativeEnumerator allocation.
Attached patch optimizing empty enumerator v2 (obsolete) — Splinter Review
The new patch fixes various issues and passes the tests. But before asking for a review I want to check if caching shapes of just objects with no properties to enumerate would alone provide most of the benchmark speedups. 

The observation is that allocation of the state object and looping through its properties to collect ids may not be that expensive if the script later accesses the values of such properties as typically happens with for-in loops.
Attachment #328064 - Attachment is obsolete: true
Attachment #327898 - Attachment is obsolete: true
Attached patch optimizing empty enumerator v3 (obsolete) — Splinter Review
The new version does not push to the cache enumerators with more then 10 entries to release their state arrays ASAP. The number 10 comes from benchmarks.
Attachment #328126 - Attachment is obsolete: true
Attachment #328181 - Flags: review?(brendan)
(In reply to comment #42)
> I filed the bug 443599 to implement this iterator object elimination as this is
> orthogonal to optimizing JSNativeEnumerator allocation.

Agreed, I wanted to get the idea out -- great to have that filed separately. Reviewing next.

/be
Comment on attachment 328181 [details] [diff] [review]
optimizing empty enumerator v3

What kind of SunSpider speedup do you get, and what's the string-fasta individual-test speedup?

>+ * The structure is not allocated when the number of enumeratable properties

Typo/misspelling: s/enumeratable/enumerable/

>+ * in the scope is zero. js_Enumerate represents such enumerators with
>+ * JSVAL_ZERO. To avoid repeated scanning in JSENUMERATE_INIT of scopes for
>+ * Object.prototype and similar objects with no properties to enumerate
>+ * JSRuntime.nativeEnumCache is still filled with a pointer to a heap-
>+ * allocated uint32 containing the shape of such scope. To distinguish cached
>+ * JSNativeEnumerator* entries from such uint32* pointers the former is tagged
>+ * with 1 in the lowest bit.

No need for heap-allocated word -- shapes fit in 24 bits. See jsinterp.h:

#define SHAPE_OVERFLOW_BIT      JS_BIT(32 - PCVCAP_TAGBITS)

/be
Attached patch optimizing empty enumerator v4 (obsolete) — Splinter Review
The new version stores the shapes of scope without properties to enumerate directly in the cache.
Attachment #328181 - Attachment is obsolete: true
Attachment #328219 - Flags: review?(brendan)
Attachment #328181 - Flags: review?(brendan)
The results contain too much noise to draw any conclusion.
Here string-fasta shows 7.9% improvment but I guess the penalty from the increased code size offsets that improvement.
Blocks: 443746
No longer blocks: 443746
Depends on: 443746
Comment on attachment 328219 [details] [diff] [review]
optimizing empty enumerator v4

I will update the patch after considering bug 443746.
Attachment #328219 - Attachment is obsolete: true
Attachment #328219 - Flags: review?(brendan)
(In reply to comment #48)
> Created an attachment (id=328220) [details]
> sunspider results, thread-safe jsshell, GCC 4.3, Fedora 9, Petium-M 1.1GHz
> 
> The results contain too much noise to draw any conclusion.

With the patch tested here, could you reproduce my results with thread-unsafe js shell from comment 18?

/be
(In reply to comment #51)
> (In reply to comment #48)
> > Created an attachment (id=328220) [details] [details]
> > sunspider results, thread-safe jsshell, GCC 4.3, Fedora 9, Petium-M 1.1GHz
> > 
> > The results contain too much noise to draw any conclusion.
> 
> With the patch tested here, could you reproduce my results with thread-unsafe
> js shell from comment 18?

That patch does not apply to the trunk. I guess this is due to changes that went into bug 443039. Also note that string-fasta is not realistic benchmark for for-in loops. It contains the following code:

      for (var c in table) {
        if (r < table[c]) {
          line[i] = c;
          break;
        }
      }

Here the nature of algorithm is such that the loop almost always exits via the break. So Object.prototype is almost never touched and any explicit optimizations for zero-length enumerators would only slow things down.

Moreover, the loop benefits from the id caching substantially as it breaks often rather early. So id array caching avoids repeated scanning of properties in JSENUMERATE_INIT that would not be enumerated during the loop.

I would claim that such for-in loops are not typical. for-in loop in string-tagcloud is more representative as it does not break. But there, due to the complex loop body, improvements from faster enumeration setup are not observable.
(In reply to comment #51)
> With the patch tested here, could you reproduce my results with thread-unsafe
> js shell from comment 18?

One more thing: on sm-valgrind01 string-fasta is 17% faster with thread-unsafe build than with thread-safe one. In fact the benchmark shows the biggest improvement from turning JS_THREADSAFE off. Given that the objects under the benchmark contains 15 and 4 properties (some of which are not enumerated due to the early break) the cost of extra locking in JSENUMERATE_INIT/DESTROY must be contributing to that substantially.
SunSpider benchmarks (some from the Great Language Shootout, others not) are to be sure far from representative of anything but themselves.

The locking cost is non-trivial, and we could use a JSThinLock instead of the GC lock. It's hard to use compare-and-swap since several fields are being updated at once, but there may be a way.

/be
(In reply to comment #54)
> The locking cost is non-trivial, and we could use a JSThinLock instead of the
> GC lock. It's hard to use compare-and-swap since several fields are being
> updated at once, but there may be a way.

I meant "hard to use compare-and-swap directly" of course, since thin locks are built on CAS.

/be
Attached patch v5 (obsolete) — Splinter Review
Here is a patch update to reflect changes from the bug 443746. This version avoids taking the GC lock when checking the cache for scopes with no enumerable properties. It does not benefit string-fasta but it avoids the lock when enumerating Object.prototype and friends.

The patch passes the tests but before asking for a review I would like to try an idea of freeing JSNativeEnumerator only from the GC. This way the code can read JSNativeEnumerator.shape without taking the lock, it should be possible to update JSNativeEnumerator.cursor from -1 to length using compare-and-swap and only singly-linked list for the GC safety would be necessary.
Attached patch native enumerator as GC thing (obsolete) — Splinter Review
The new patch implements that idea of deallocating JSNativeEnumerator only from the GC. It allowed to implement mostly-lock-free cache using js_CompareAndSwap to check for running enumerators that are in the cache. The lock is only taken when adding the newly created enumerator to the global list for the GC and updating the cache with the new value. When the enumerator is empty, lock is never taken.

Since only the GC frees the enumerators, it is necessary to update gcBytes counters to account for increased memory pressure. Hence the reason for js_AddAsGCBytes/js_RemoveAsGCBytes functions.

The patch also adds the missing js_CompareAndSwap implementation on 64-bit Linux using compiler's atomic built-ins. 

With the patch on sm-valgrind01 (64 bit Linux) string-fasta is 11% faster with thread-safe built and 6% faster with thread-unsafe built.
Attachment #328273 - Attachment is obsolete: true
Attachment #328371 - Flags: review?(brendan)
The new version of the patch fixes stalled comments.
Attachment #328220 - Attachment is obsolete: true
Attachment #328224 - Attachment is obsolete: true
Attachment #328371 - Attachment is obsolete: true
Attachment #328374 - Flags: review?(brendan)
Attachment #328371 - Flags: review?(brendan)
Attachment #327711 - Attachment is obsolete: true
Attachment #327711 - Flags: review?(igor)
Sorry for reviewing out of order -- possible to get an updated patch after the NativeCompareAndSwap stuff lands?

/be
Here is an updated patch to sync with trunk changes.
Attachment #328374 - Attachment is obsolete: true
Attachment #328374 - Flags: review?(brendan)
Attached patch native enumerator as GC thing v2 (obsolete) — Splinter Review
The new patch just syncs the previous version with the trunk. The patch was ready for a review but I forgot to ask for one.
Attachment #330808 - Attachment is obsolete: true
Attachment #332987 - Flags: review?(brendan)
Comment on attachment 332987 [details] [diff] [review]
native enumerator as GC thing v2

>  * Increase runtme->gcBytes by sz bytes to account for an allocation outside

"runtime" misspelled

> #error "NSPR_LOCK should be on when the platform lacks native comprae-and-swap."

"compare" misspelled

Fold JS_ATOMIC_INCREMENT calls into ENUM_CACHE_METER macro

Where is newenumerator set to true?

/be
Attached patch native enumerator as GC thing v3 (obsolete) — Splinter Review
The new version of the patch fixes the nits and that missing newenumerator = true embarrassing bug.
Attachment #333873 - Flags: review?(brendan)
Attachment #332987 - Attachment is obsolete: true
Attachment #332987 - Flags: review?(brendan)
Attached patch v2-v3 delta (obsolete) — Splinter Review
Comment on attachment 333874 [details] [diff] [review]
v2-v3 delta

>-# define ENUM_CACHE_METER(x)    x
>+# define ENUM_CACHE_METER_INCR(name)    JS_ATOMIC_INCREMENT(&cx->runtime->name)

The _INCR suffix is unnecessary (METER means measure) and we don't use it in analogous situations -- at most UNMETER is added if a decrementing form is needed. Shorter and similar naming wins?

>@@ -4330,42 +4328,43 @@ js_Enumerate(JSContext *cx, JSObject *ob
>                     !(sprop->flags & SPROP_IS_ALIAS) &&
>                     (!SCOPE_HAD_MIDDLE_DELETE(scope) ||
>                      SCOPE_HAS_PROPERTY(scope, sprop))) {
>                     JS_ASSERT(ids < ne->ids + length);
>                     *ids++ = sprop->id;
>                 }
>             }
>             JS_ASSERT(ids == ne->ids + length);
>+            newenumerator = JS_TRUE;
>         } while (0);
>         JS_UNLOCK_SCOPE(cx, scope);
> 
[snip]
>         if (!ne) {
>             JS_ASSERT(length == 0);
>+            JS_ASSERT(!newenumerator);
>             *statep = JSVAL_ZERO;
>         } else {
>             JS_ASSERT(length != 0);
>             JS_ASSERT(ne->cursor == length);
>+            if (newenumerator) {
>+                JS_LOCK_GC(cx->runtime);
>+                if (!js_AddAsGCBytes(cx, NativeEnumeratorSize(length))) {

How about commoning this NativeEnumeratorSize(length) with the one used by JS_malloc above when newenumerator is going to be set true, and storing the result in a allocated size variable -- then you can test if (allocated) instead of if (newenumerator)? The alllowercase name caught my eye but the CSE seems worthwhile.

r=me with these addressed.

/be
Attachment #333874 - Flags: review+
Attached patch v4 (obsolete) — Splinter Review
The new version of the patch addresses the nits from the comment 65.
Attachment #333873 - Attachment is obsolete: true
Attachment #333874 - Attachment is obsolete: true
Attachment #334430 - Flags: review+
Attachment #333873 - Flags: review?(brendan)
Attached patch v3-v4 delta (obsolete) — Splinter Review
For the sheriff:

To check in - attachment 334430 [details] [diff] [review]

the check in message: bug 417131 - caching enumerators to speedup for-in loops. r=brendan
Comment on attachment 334430 [details] [diff] [review]
v4

>+            cachep = &cx->runtime->
>+                     nativeEnumCache[NATIVE_ENUM_CACHE_HASH(shape)];
>+            oldcache = *cachep;
>+            if (oldcache & (jsuword) 1) {
>+                if ((uint32) (oldcache >> 1) == shape) {
>+                    /*scope has a shape with no enumerable properties. */

Uber-nit: Missing space before "scope has a shape".

>+                    break;
>+                }
>+            } else if (oldcache != (jsuword) 0) {
>+                /*
>+                 * We can safely read ne->shape without taking the GC lock as
>+                 * ne is only deleted when running the GC and ne->shape is

s/only deleted/deleted only/

and perhaps add "outside of this request" before "when running the GC".

Sorry for missing these earlier.

/be
Comment on attachment 334430 [details] [diff] [review]
v4

>+    while (!!(ne = *nep)) {

Missed this too -- house style prefers nesting assignment in a loop condition only, and then comparing != NULL explicitly. The !! idiom is rarely used, not in loop conditions at any rate. It stands out due to ! being exclamation point, and lack of common use of it makes it harder to read and understand for most readers. If we used it all over, maybe... but I'd rather not.

/be
(In reply to comment #70)
> (From update of attachment 334430 [details] [diff] [review])
> >+    while (!!(ne = *nep)) {
> 
> Missed this too -- house style prefers nesting assignment in a loop condition
> only, and then comparing != NULL explicitly.

Using !! was from the desire to avoid using NULL in new C++ code. Due to operator overloading and templates it can bring more harm than clarity.
C++ and NULL, sigh. Does 0x have null built-in?

We should be careful about operators and templates. I take it the other while ((p=q) != NULL) instances in the code do not cause problems. I guess I would rather cross that bridge if (not when) we come to it, than start incrementally injecting !! while leaving a majority case of != NULL around.

/be
Attached patch v5Splinter Review
Fixing the latest nits, here is the delta:

@@ -4272,13 +4272,13 @@ js_Enumerate(JSContext *cx, JSObject *ob
             oldcache = *cachep;
             if (oldcache & (jsuword) 1) {
                 if ((uint32) (oldcache >> 1) == shape) {
-                    /*scope has a shape with no enumerable properties. */
+                    /* scope has a shape with no enumerable properties. */
                     break;
                 }
             } else if (oldcache != (jsuword) 0) {
                 /*
                  * We can safely read ne->shape without taking the GC lock as
-                 * ne is only deleted when running the GC and ne->shape is
+                 * ne is deleted only when running the GC and ne->shape is
                  * read-only after initialization.
                  */
                 ne = (JSNativeEnumerator *) *cachep;
@@ -4415,7 +4415,7 @@ js_TraceNativeEnumerators(JSTracer *trc)
     }

     nep = &rt->nativeEnumerators;
-    while (!!(ne = *nep)) {
+    while ((ne = *nep) != NULL) {
         JS_ASSERT(ne->length != 0);
         if (ne->cursor != 0) {
             /* Trace ids of the running enumerator. */
-
Attachment #334430 - Attachment is obsolete: true
Attachment #334431 - Attachment is obsolete: true
Attachment #334501 - Flags: review+
For the sheriff:

To check in - attachment 334501 [details] [diff] [review]

the check in message: bug 417131 - caching enumerators to speedup for-in loops.
r=brendan 
Is it bug ready to close?

The change of js/src/jslock.cpp caused a problem on Solaris x86.

-#ifndef NSPR_LOCK
-
 /* Implement NativeCompareAndSwap. */
[snip]
 #error "JS_HAS_NATIVE_COMPARE_AND_SWAP should be 0 if your platform lacks a compare-and-swap instruction."
 
 #endif /* arch-tests */
 
+#if JS_HAS_NATIVE_COMPARE_AND_SWAP
+

"ifndef NSPR_LOCK" moved down caused build error if JS_HAS_NATIVE_COMPARE_AND_SWAP is 0.
We should either remove the #error, or add #if  JS_HAS_NATIVE_COMPARE_AND_SWAP.
Depends on: 451504
Filed bug 451504
No longer depends on: 451504
Depends on: 451504
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
Flags: in-litmus-
Keywords: testcase
added test http://hg.mozilla.org/mozilla-central/rev/46fb3c7d6faa and cvs.
Flags: in-testsuite? → in-testsuite+
v 1.9.1, 1.9.2
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.