Closed Bug 1342345 Opened 7 years ago Closed 7 years ago

AutoCycleDetector and JSON CycleDetector should not use a HashSet

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(2 files, 1 obsolete file)

See bug 1341902 comment 2. This is especially bad because we have to look up a separate unique ID nowadays for stuff we put in these sets.

We should fix both of them to use a Vector or an InlineSet. Using a GCVector for the JSON one wins a few ms on Kraken stringify-tinderbox. What's nice about a Vector is that the CycleDetector destructor can just pop the last value.

Using an InlineSet instead of a Vector could be faster in some pathological cases (let's say > 20 levels deep). It also makes things a bit slower, and more complicated because there's no GCInlineSet at the moment, and we'll probably run out of native stack size first, so I'm not sure it's worth worrying about. JSC just uses a vector for the JSON stringify case FWIW.
Below is a testcase that can be used to measure the worst-case here (deeply nested object graph). Here are some numbers:

depth  HashSet  Vector
2      182 ms   144 ms
100    195 ms   179 ms
2500   302 ms   893 ms
3000   <overrecursion error in the shell>

The Vector beats the HashSet up to level 200 or so on my machine.

And yes, depth 2500 is a clear win for the HashSet, but considering we're talking super unlikely corner cases here, the 2-3x difference is really not that big. Chrome gets about 680 ms for this and Safari 1511 ms.

Based on this, let's go with a Vector. It's much faster in the common cases, about as fast in the bad cases, and the really unlikely cases aren't that much slower.

---------

function getObject(depth) {
    if (depth === 2500) {
        var a = [];
        for (var i=0; i<5000; i++)
            a.push({x: i});
        return a;
    }
    return {x: getObject(depth + 1), y: {}};
}
function test() {
    var o = getObject(0);
    var t = new Date;
    for (var i=0; i<100; i++)
        s = JSON.stringify(o);
    print(new Date - t);
}
test();
If I add a MOZ_UNLIKELY to the compare loop for the Vector, it gets even faster. Even depth 200 is still a small win and depth 2500 improves to 727 ms :)
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8840810 - Flags: review?(evilpies)
Attachment #8840810 - Flags: review?(evilpies)
Attachment #8840810 - Attachment is obsolete: true
Attachment #8840812 - Flags: review?(evilpies)
Comment on attachment 8840812 [details] [diff] [review]
Part 1 - Use a Vector for JSON cycle detection

Review of attachment 8840812 [details] [diff] [review]:
-----------------------------------------------------------------

With those numbers this total makes sense to implement with a Vector. In the Google Drive json file the highest depth seems to be just around 5.
Attachment #8840812 - Flags: review?(evilpies) → review+
Using GCVector is also a nice simplification.

The join microbenchmark below improves from 274 ms to 112 ms, a much bigger win than I expected. I think we should be really careful where we use MovableCellHasher, it's convenient but it has very bad performance overhead.

function f() {
    var t = new Date;
    for (var i=0; i<1000000; i++) {
        var arr = [1, 2, 3];
        arr.join(",");
    }
    print(new Date - t);
}
f();
Attachment #8840828 - Flags: review?(jcoppeard)
Comment on attachment 8840828 [details] [diff] [review]
Part 2 - Use a Vector for AutoCycleDetector

Review of attachment 8840828 [details] [diff] [review]:
-----------------------------------------------------------------

Nice result, and it's simpler too.
Attachment #8840828 - Flags: review?(jcoppeard) → review+
Heh. I'm kind of curious what the scaling behavior would be if you did something like

 js::AutoCycleDetector::Vector& cycleDetectorVector(void* ptr) {
        return cycleDetectorVectors_[(uintptr_t(ptr) >> 5) & 3].ref();
 }

(Make 4 vectors, and choose one based on 2 bits grabbed from the pointer. It's sort of a mini hash in front of a vector.)
Using a set data structure here was deliberate, from the time I added spec-mandated cycle detection back in the day in bug 578273, to avoid pathological behavior in the cases already discussed in comments here.  Why are we knowingly adding non-linear behavior, rather than taking the time to implement an InlineSet (GCInlineSet) that will likely have value in the future for other uses?
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #9)
> Why
> are we knowingly adding non-linear behavior, rather than taking the time to
> implement an InlineSet (GCInlineSet) that will likely have value in the
> future for other uses?

(1) As the numbers here show, this GCInlineSet would need an inline capacity of 200 or so because at that point the vector is *still* faster. This means the InlineSet we currently have can't be used without wasting stack space and we need another data structure that combines a Vector and a HashSet.

(2) The pathological cases here are limited by our native stack size (the depth can't get *too* bad or we will run out of stack space first). I think that's the key part here and IMO a 2x slowdown for the worst case is worth it to speed up the 99.9% of sane calls that actually show up in the wild.

(3) An InlineSet would still add some amount of overhead to 99.9% of calls.

A lot of my time is spent fixing performance issues and we still have some bad performance cliffs, so I have to be smart about where I spend my time. At this point I don't think optimizing the depth > 200 cases to be 1-2x faster is a good use of my/our time.
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d82cdc1a1bf3
part 1 - Use a Vector for JSON cycle detector. r=evilpie
https://hg.mozilla.org/integration/mozilla-inbound/rev/7e6204be142a
part 2 - Use a Vector for AutoCycleDetector. r=jonco
See Also: → 1342650
https://hg.mozilla.org/mozilla-central/rev/d82cdc1a1bf3
https://hg.mozilla.org/mozilla-central/rev/7e6204be142a
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54

Worth noting for posterity that this change inadvertently fixed a hard to find OOM bug. The old ~CycleDetector() could wind up triggering an OOM situation after JA()/JO() returned true. This would result in an OOM being left on cx, but the stack unwinding with success codepaths, leading to some other later unrelated call into the engine failing. After this change it should no longer be possible for ~CycleDetector to OOM, even though the code structure is still similarly laid out in the latest revision.

You need to log in before you can comment on or make changes to this bug.