Last Comment Bug 633219 - Replacing resolving hash with a linked list
: Replacing resolving hash with a linked list
Status: RESOLVED FIXED
fixed-in-tracemonkey
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86_64 Linux
: -- normal (vote)
: ---
Assigned To: Igor Bukanov
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-02-10 08:54 PST by Igor Bukanov
Modified: 2011-06-15 08:27 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v1 (38.64 KB, patch)
2011-02-10 09:06 PST, Igor Bukanov
no flags Details | Diff | Splinter Review
v2 (41.90 KB, patch)
2011-02-11 16:02 PST, Igor Bukanov
no flags Details | Diff | Splinter Review
v3 (45.78 KB, patch)
2011-02-15 16:26 PST, Igor Bukanov
no flags Details | Diff | Splinter Review
v4 (47.85 KB, patch)
2011-02-16 08:10 PST, Igor Bukanov
luke: review+
Details | Diff | Splinter Review
v5 (28.07 KB, patch)
2011-03-10 13:39 PST, Igor Bukanov
luke: review+
Details | Diff | Splinter Review

Description Igor Bukanov 2011-02-10 08:54:05 PST
Currently to implement run-away recursion detection during Class resolve hook and watch handler invocation the code uses a hashtable stored in JSContext.

With multiple JSContext instances in a typical browser session we end up with multiple hashtable instances that are empty most of the time. So the idea is to to move the hash into JSThread while simplifying the interface for the class replacing js_(Start|Stop) resolving with a single auto class.
Comment 1 Igor Bukanov 2011-02-10 09:06:48 PST
Created attachment 511411 [details] [diff] [review]
v1

The patch replaces per-context JSDHashTable with HashSet stored in JSThreadData. To minimize the amount of checks when calling the resolve hook the patch makes the resolvingSet an eagerly initialized field of JSThreadData. That required to provide proper constructors and destructors to JSThread and JSThreadData.

As this removed some direct and indirect calls that JSDHashTable this also made calling the resolve hooks faster. In particular, for the following synthetic benchmark I observed a 2% speedup with method jit:

// Minimize global object lookup timing for absent properties
this.__proto__ = null;

function test(N) {
    var t = Date.now();
    for (var i = N; i != 0; --i)
        "not_existing_property" in this;
    return Date.now() - t;
}

test(1e3); // warm up

var results = [];
for (var i = 0; i != 100; ++i)
    results.push(test(1e6));

var sum = 0;
for (var i = 0; i != results.length; ++i)
    sum += results[i];
var average = sum / results.length;
var min = Math.min.apply(null, results);
var max = Math.max.apply(null, results);

print("Min: "+min);
print("Max: "+max);
print("Average: "+average.toFixed(1));
Comment 2 Igor Bukanov 2011-02-11 16:02:37 PST
Created attachment 511872 [details] [diff] [review]
v2

To support the most common case of 1 or 2 entries in the resolving set the patch uses a separated fixed-size array and inlines the fast path that works on that array without touching the hashset. With that depending on CPU the patch shows 10%-20% speedup for the test case above. 

But we need a good test coverage for the resolve hook machinery to have a confidence in the changes.
Comment 3 Luke Wagner [:luke] 2011-02-15 11:00:52 PST
Comment on attachment 511872 [details] [diff] [review]
v2

This patch has a lot of great simplifications.

I like the constructor/destructor-ification of JSThread/JSThreadData/JSContext!  Instead of js_calloc and doing all the work of JS_ASSERT'ing that various fields have been initialized to zero, could you just use js_new<> and change the asserts to member-initialization?  I don't think thread creation/destruction is a hot enough path for any of this to matter perf-wise, and it would just be more regular.

In CallResolveOp, the RAII guards let you just 'return true' on success.  But, for failure, you still need to set *propp = NULL, requiring the do{}while(0)/break trick.  Is it measurably slower to just set *propp = NULL before hand so that, on failure, you can just 'return false' ?

>+    Set::AddPtr p = data->set.lookupForAdd(key);
>+    if (p) {
>+        /*
>+         * If we just populated the table from the array we know the new key
>+         * is not their.
>+         */
>+        JS_ASSERT(data->size > JS_ARRAY_LENGTH(data->array));

s/their/there/... but I think the assert says it clearly enough without the comment.

>+     * We do not have a recursion, add the key and prepare for
>+     * set->remove() call in stopImpl().

s/stopImpl/~AutoResolving/

>+    ~AutoResolving() {
...
>+        if (data->size <= JS_ARRAY_LENGTH(data->array)) {
>+#ifdef DEBUG
>+            size_t i = 0;
>+            do {
>+                JS_ASSERT(i < data->size);
>+            } while (!(data->array[i++] == key));
>+#endif

Perhaps replace this block with just

  JS_ASSERT_IF(data->size != JS_ARRAY_LENGTH(data->array),
               data->array[data->size-1] == key) ?

Also explains why you can just do data->size-- below.

So there is a comment "We must clear it first as we do not empty the hash in removeFromSet when transition to use the array."  Wouldn't it be simpler (and ultimately as efficient) to just have fremoveFromSet clear the table?

>+            JS_STATIC_ASSERT(sizeof(word) == 4 || sizeof(word) == 8);
>+            if (sizeof(word) == 8)
>+                word ^= (word >> 32);

Yeah... we should really do this in DefaultHasher<T*>, that or make sizeof(HashNumber) == sizeof(uintptr_t).

>+        static bool match(const Key &a, const Key &b) {
>+            return a == b;
>+        }
>+    };
>+
>+
>+    static const size_t INITIAL_SET_CAPACITY = 4;
>+    typedef HashSet<Key, Hasher, SystemAllocPolicy> Set;

Extra line.  Also INITIAL_SET_CAPACITY seems unused.

>+    /* Per-thread structure. */
>+    struct Data {

While reading other parts, I wondered what 'Data' was and what is was associated with.  ThreadData I think gives the right impression.

>+        size_t      size;

As a convention, I think we use 'size' for number-of-bytes and 'length' for number of elements.  'length' here then?

A last request is that you take this as a motivating use case to make a js::UsuallySmallHashSet<T,N,HP,AP> template implemented in terms of a js::HashTable that takes care of all the array<-->set transitions.  However, perhaps this could just be a followup bug.
Comment 4 Igor Bukanov 2011-02-15 16:13:40 PST
(In reply to comment #3)
>  Instead of js_calloc and doing all the work of JS_ASSERT'ing that various
> fields have been initialized to zero, could you just use js_new<> and change
> the asserts to member-initialization?

I tried that, but after missing an initializer and spending some time to track TryServer failure I want to do that in another bug. The focus here is simply to get a constructor/destructor for ThreadData so I it can contain a field with constructor/destructor. 

> In CallResolveOp, the RAII guards let you just 'return true' on success.  But,
> for failure, you still need to set *propp = NULL, requiring the
> do{}while(0)/break trick.  Is it measurably slower to just set *propp = NULL
> before hand so that, on failure, you can just 'return false' ?

I have a patch for the bug 627016 that is build on top of this bug that changes CallResolveOp to return the found shape via the return value. 

> > A last request is that you take this as a motivating use case to make a
> js::UsuallySmallHashSet<T,N,HP,AP> template implemented in terms of a
> js::HashTable that takes care of all the array<-->set transitions.

It turned out that we do not need the hashtable, a vector is enough. A sane embedding would not have unbounded recursion of resolve hooks - I could not get more then 4 nested resolve calls in firefox. Also a runaway recursion in the watchpoint handlers would be stopped via the protection against too-deep-native-stack-usage long before O(N) search in the vector would matter. So in a new patch I replaced HashSet with Vector which simplified things substantially.

Still we should definitely consider the small hash case optimization. But this is not that straightforward. For example, for maximum performance in the small size case one wants to pack keys together with values stored separately to minimize the number of cache lines to read. But that breaks the current interface built on the notion of the hash entry storing both key and value.
Comment 5 Igor Bukanov 2011-02-15 16:26:37 PST
Created attachment 512650 [details] [diff] [review]
v3

The new patch uses Vector, not HashSet, for resolving set.
Comment 6 Igor Bukanov 2011-02-16 08:10:46 PST
Created attachment 512814 [details] [diff] [review]
v4

The new patch uses linked list for the resolving set. The infallible allocation allowed to simplify the usage and implementation significantly.
Comment 7 Igor Bukanov 2011-02-16 08:47:40 PST
(In reply to comment #6)
> Created attachment 512814 [details] [diff] [review]
> v4

The patch speeds up the browser startup time by about 4ms or 1%. This is comes from the fact that the resolve hook is called about 15000 times during start up and the patch effectively removes the hashing overhead of adding and removing entries from JSDHashTable per each resolve call.
Comment 8 Luke Wagner [:luke] 2011-02-16 11:40:30 PST
Comment on attachment 512814 [details] [diff] [review]
v4

Good points!
Comment 10 Robert Sayre 2011-02-17 08:17:57 PST
is this a blocker?
Comment 11 Mike Shaver (:shaver -- probably not reading bugmail closely) 2011-02-17 08:53:10 PST
No, please back it out of tracemonkey.  Thanks.
Comment 12 Phil Ringnalda (:philor) 2011-02-17 08:54:42 PST
Looks like this broke http://mxr.mozilla.org/mozilla-central/source/accessible/tests/mochitest/text/test_whitespaces.html?force=1 and http://mxr.mozilla.org/mozilla-central/source/accessible/tests/mochitest/text/test_singleline.html?force=1, on 32-bit (32-bit Linux but not 64-bit Linux, and Win7/WinXP, no OS X since we don't do a11y or mochitest-a11y there).
Comment 13 Igor Bukanov 2011-02-17 10:44:20 PST
backed out
Comment 14 Chris Leary [:cdleary] (not checking bugmail) 2011-02-17 17:50:34 PST
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/510c42c0d472
http://hg.mozilla.org/mozilla-central/rev/ce54ab33686b (backout)
http://hg.mozilla.org/mozilla-central/rev/c91eb9df95a9 (backout)
http://hg.mozilla.org/mozilla-central/rev/41d1950c22a2 (backout)
Note: not marking as fixed because last changeset is a backout.
Comment 15 Igor Bukanov 2011-03-10 13:37:10 PST
To determine the origin by mochitest-a11y failures I tried to minimize the patch. After extensive testing it looks like a weird compiler bug. But when I figures that out the end result was much smaller patch that, particular, keeps the resolving data structure in JSContext. So I change the title for the bug accordingly.
Comment 16 Igor Bukanov 2011-03-10 13:39:52 PST
Created attachment 518503 [details] [diff] [review]
v5

here is that small patch - I will add the proper constructors to JSThread/JSThreadData in a separated bug.
Comment 17 Igor Bukanov 2011-03-15 05:28:11 PDT
Luke: review ping
Comment 18 Luke Wagner [:luke] 2011-03-15 19:19:14 PDT
Comment on attachment 518503 [details] [diff] [review]
v5

Sorry for the delay, this is a great patch and I didn't mean to delay it!

> JSObject *
> js_InitFunctionAndObjectClasses(JSContext *cx, JSObject *obj)
...
>+    AutoResolving resolving1(cx, obj, ATOM_TO_JSID(classAtoms[JSProto_Function]));
>+    AutoResolving resolving2(cx, obj, ATOM_TO_JSID(classAtoms[JSProto_Object]));

The logic is pre-existing so I doubt its a bug but, for my own personal edification:
I don't see resolving1/2 being tested for alreadyStarted().  Is the assumption that you can't get an infinite recursion of js_InitFunctionAndObjectClasses?
Comment 19 Igor Bukanov 2011-03-16 05:07:23 PDT
(In reply to comment #18)
> The logic is pre-existing so I doubt its a bug but, for my own personal
> edification:
> I don't see resolving1/2 being tested for alreadyStarted().  Is the assumption
> that you can't get an infinite recursion of js_InitFunctionAndObjectClasses?

This is rather old code indeed. The assumption is that one does not call js_InitFunctionAndObjectClasses lazily but rather together with the creation of the global object.
Comment 20 Igor Bukanov 2011-06-15 08:27:18 PDT
I forgot to add fixed-in-tracemonkey when landing this. This patch is long fixed now:

http://hg.mozilla.org/mozilla-central/rev/4bd4c1d52b8b

Note You need to log in before you can comment on or make changes to this bug.