Closed Bug 633219 Opened 13 years ago Closed 13 years ago

Replacing resolving hash with a linked list

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: igor, Assigned: igor)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 4 obsolete files)

Currently to implement run-away recursion detection during Class resolve hook and watch handler invocation the code uses a hashtable stored in JSContext.

With multiple JSContext instances in a typical browser session we end up with multiple hashtable instances that are empty most of the time. So the idea is to to move the hash into JSThread while simplifying the interface for the class replacing js_(Start|Stop) resolving with a single auto class.
Attached patch v1 (obsolete) — Splinter Review
The patch replaces per-context JSDHashTable with HashSet stored in JSThreadData. To minimize the amount of checks when calling the resolve hook the patch makes the resolvingSet an eagerly initialized field of JSThreadData. That required to provide proper constructors and destructors to JSThread and JSThreadData.

As this removed some direct and indirect calls that JSDHashTable this also made calling the resolve hooks faster. In particular, for the following synthetic benchmark I observed a 2% speedup with method jit:

// Minimize global object lookup timing for absent properties
this.__proto__ = null;

function test(N) {
    var t = Date.now();
    for (var i = N; i != 0; --i)
        "not_existing_property" in this;
    return Date.now() - t;
}

test(1e3); // warm up

var results = [];
for (var i = 0; i != 100; ++i)
    results.push(test(1e6));

var sum = 0;
for (var i = 0; i != results.length; ++i)
    sum += results[i];
var average = sum / results.length;
var min = Math.min.apply(null, results);
var max = Math.max.apply(null, results);

print("Min: "+min);
print("Max: "+max);
print("Average: "+average.toFixed(1));
Attachment #511411 - Flags: review?(lw)
Attached patch v2 (obsolete) — Splinter Review
To support the most common case of 1 or 2 entries in the resolving set the patch uses a separated fixed-size array and inlines the fast path that works on that array without touching the hashset. With that depending on CPU the patch shows 10%-20% speedup for the test case above. 

But we need a good test coverage for the resolve hook machinery to have a confidence in the changes.
Attachment #511411 - Attachment is obsolete: true
Attachment #511411 - Flags: review?(lw)
Comment on attachment 511872 [details] [diff] [review]
v2

This patch has a lot of great simplifications.

I like the constructor/destructor-ification of JSThread/JSThreadData/JSContext!  Instead of js_calloc and doing all the work of JS_ASSERT'ing that various fields have been initialized to zero, could you just use js_new<> and change the asserts to member-initialization?  I don't think thread creation/destruction is a hot enough path for any of this to matter perf-wise, and it would just be more regular.

In CallResolveOp, the RAII guards let you just 'return true' on success.  But, for failure, you still need to set *propp = NULL, requiring the do{}while(0)/break trick.  Is it measurably slower to just set *propp = NULL before hand so that, on failure, you can just 'return false' ?

>+    Set::AddPtr p = data->set.lookupForAdd(key);
>+    if (p) {
>+        /*
>+         * If we just populated the table from the array we know the new key
>+         * is not their.
>+         */
>+        JS_ASSERT(data->size > JS_ARRAY_LENGTH(data->array));

s/their/there/... but I think the assert says it clearly enough without the comment.

>+     * We do not have a recursion, add the key and prepare for
>+     * set->remove() call in stopImpl().

s/stopImpl/~AutoResolving/

>+    ~AutoResolving() {
...
>+        if (data->size <= JS_ARRAY_LENGTH(data->array)) {
>+#ifdef DEBUG
>+            size_t i = 0;
>+            do {
>+                JS_ASSERT(i < data->size);
>+            } while (!(data->array[i++] == key));
>+#endif

Perhaps replace this block with just

  JS_ASSERT_IF(data->size != JS_ARRAY_LENGTH(data->array),
               data->array[data->size-1] == key) ?

Also explains why you can just do data->size-- below.

So there is a comment "We must clear it first as we do not empty the hash in removeFromSet when transition to use the array."  Wouldn't it be simpler (and ultimately as efficient) to just have fremoveFromSet clear the table?

>+            JS_STATIC_ASSERT(sizeof(word) == 4 || sizeof(word) == 8);
>+            if (sizeof(word) == 8)
>+                word ^= (word >> 32);

Yeah... we should really do this in DefaultHasher<T*>, that or make sizeof(HashNumber) == sizeof(uintptr_t).

>+        static bool match(const Key &a, const Key &b) {
>+            return a == b;
>+        }
>+    };
>+
>+
>+    static const size_t INITIAL_SET_CAPACITY = 4;
>+    typedef HashSet<Key, Hasher, SystemAllocPolicy> Set;

Extra line.  Also INITIAL_SET_CAPACITY seems unused.

>+    /* Per-thread structure. */
>+    struct Data {

While reading other parts, I wondered what 'Data' was and what is was associated with.  ThreadData I think gives the right impression.

>+        size_t      size;

As a convention, I think we use 'size' for number-of-bytes and 'length' for number of elements.  'length' here then?

A last request is that you take this as a motivating use case to make a js::UsuallySmallHashSet<T,N,HP,AP> template implemented in terms of a js::HashTable that takes care of all the array<-->set transitions.  However, perhaps this could just be a followup bug.
(In reply to comment #3)
>  Instead of js_calloc and doing all the work of JS_ASSERT'ing that various
> fields have been initialized to zero, could you just use js_new<> and change
> the asserts to member-initialization?

I tried that, but after missing an initializer and spending some time to track TryServer failure I want to do that in another bug. The focus here is simply to get a constructor/destructor for ThreadData so I it can contain a field with constructor/destructor. 

> In CallResolveOp, the RAII guards let you just 'return true' on success.  But,
> for failure, you still need to set *propp = NULL, requiring the
> do{}while(0)/break trick.  Is it measurably slower to just set *propp = NULL
> before hand so that, on failure, you can just 'return false' ?

I have a patch for the bug 627016 that is build on top of this bug that changes CallResolveOp to return the found shape via the return value. 

> > A last request is that you take this as a motivating use case to make a
> js::UsuallySmallHashSet<T,N,HP,AP> template implemented in terms of a
> js::HashTable that takes care of all the array<-->set transitions.

It turned out that we do not need the hashtable, a vector is enough. A sane embedding would not have unbounded recursion of resolve hooks - I could not get more then 4 nested resolve calls in firefox. Also a runaway recursion in the watchpoint handlers would be stopped via the protection against too-deep-native-stack-usage long before O(N) search in the vector would matter. So in a new patch I replaced HashSet with Vector which simplified things substantially.

Still we should definitely consider the small hash case optimization. But this is not that straightforward. For example, for maximum performance in the small size case one wants to pack keys together with values stored separately to minimize the number of cache lines to read. But that breaks the current interface built on the notion of the hash entry storing both key and value.
Attached patch v3 (obsolete) — Splinter Review
The new patch uses Vector, not HashSet, for resolving set.
Attachment #511872 - Attachment is obsolete: true
Attachment #512650 - Flags: review?(lw)
Attached patch v4 (obsolete) — Splinter Review
The new patch uses linked list for the resolving set. The infallible allocation allowed to simplify the usage and implementation significantly.
Attachment #512650 - Attachment is obsolete: true
Attachment #512814 - Flags: review?(lw)
Attachment #512650 - Flags: review?(lw)
(In reply to comment #6)
> Created attachment 512814 [details] [diff] [review]
> v4

The patch speeds up the browser startup time by about 4ms or 1%. This is comes from the fact that the resolve hook is called about 15000 times during start up and the patch effectively removes the hashing overhead of adding and removing entries from JSDHashTable per each resolve call.
Comment on attachment 512814 [details] [diff] [review]
v4

Good points!
Attachment #512814 - Flags: review?(lw) → review+
http://hg.mozilla.org/tracemonkey/rev/510c42c0d472
Whiteboard: fixed-in-tracemonkey
is this a blocker?
No, please back it out of tracemonkey.  Thanks.
Looks like this broke http://mxr.mozilla.org/mozilla-central/source/accessible/tests/mochitest/text/test_whitespaces.html?force=1 and http://mxr.mozilla.org/mozilla-central/source/accessible/tests/mochitest/text/test_singleline.html?force=1, on 32-bit (32-bit Linux but not 64-bit Linux, and Win7/WinXP, no OS X since we don't do a11y or mochitest-a11y there).
backed out
Whiteboard: fixed-in-tracemonkey
To determine the origin by mochitest-a11y failures I tried to minimize the patch. After extensive testing it looks like a weird compiler bug. But when I figures that out the end result was much smaller patch that, particular, keeps the resolving data structure in JSContext. So I change the title for the bug accordingly.
Summary: Moving resolving hash to JSThread → Replacing resolving hash with a linked list
Attached patch v5Splinter Review
here is that small patch - I will add the proper constructors to JSThread/JSThreadData in a separated bug.
Attachment #512814 - Attachment is obsolete: true
Attachment #518503 - Flags: review?
Attachment #518503 - Flags: review? → review?(luke)
Luke: review ping
Comment on attachment 518503 [details] [diff] [review]
v5

Sorry for the delay, this is a great patch and I didn't mean to delay it!

> JSObject *
> js_InitFunctionAndObjectClasses(JSContext *cx, JSObject *obj)
...
>+    AutoResolving resolving1(cx, obj, ATOM_TO_JSID(classAtoms[JSProto_Function]));
>+    AutoResolving resolving2(cx, obj, ATOM_TO_JSID(classAtoms[JSProto_Object]));

The logic is pre-existing so I doubt its a bug but, for my own personal edification:
I don't see resolving1/2 being tested for alreadyStarted().  Is the assumption that you can't get an infinite recursion of js_InitFunctionAndObjectClasses?
Attachment #518503 - Flags: review?(luke) → review+
(In reply to comment #18)
> The logic is pre-existing so I doubt its a bug but, for my own personal
> edification:
> I don't see resolving1/2 being tested for alreadyStarted().  Is the assumption
> that you can't get an infinite recursion of js_InitFunctionAndObjectClasses?

This is rather old code indeed. The assumption is that one does not call js_InitFunctionAndObjectClasses lazily but rather together with the creation of the global object.
I forgot to add fixed-in-tracemonkey when landing this. This patch is long fixed now:

http://hg.mozilla.org/mozilla-central/rev/4bd4c1d52b8b
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-tracemonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: