GC: Eliminate JSObject finalizers using special mark bits

NEW
Unassigned

Status

Core
JavaScript Engine
6 years ago
3 years ago

People

(Reporter: billm, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

6 years ago
Created attachment 567609
patch

The goal in bug 648320 is to eliminate JSObject finalizers by allocating dynamic slots arrays in the GC heap, rather than using malloc. I realized that there is a simpler way to do this. We can have a bitmap that keeps track of which objects have dynamic slots arrays. During sweeping, we can avoid calling a finalizer if an object's bit is not set. This patch implements that idea.
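
A minimal sketch of the bitmap idea, not the code from the attached patch. The `Arena`, `Cell`, `finalizerBits`, `allocateSlots` and `sweepArena` names, and the 64-cell arena size, are all assumptions made up for illustration. The point it shows is that sweeping only touches dead cells whose finalizer bit is set, so objects without a malloc'ed slots array cost nothing to finalize.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical model of one GC arena; names and sizes are illustrative only.
constexpr std::size_t kCellsPerArena = 64;

struct Cell {
    void *slots = nullptr;  // malloc'ed dynamic slots array, or null
};

struct Arena {
    Cell cells[kCellsPerArena];
    std::bitset<kCellsPerArena> markBits;       // set during marking for live cells
    std::bitset<kCellsPerArena> finalizerBits;  // set when a cell owns a slots array

    // Give cell i a dynamic slots array and record that it now needs finalizing.
    void allocateSlots(std::size_t i, std::size_t bytes) {
        assert(i < kCellsPerArena && !cells[i].slots);
        cells[i].slots = std::malloc(bytes);
        finalizerBits.set(i);
    }
};

// Sweep one arena and return the number of finalizer calls that were needed.
std::size_t sweepArena(Arena &arena) {
    // Only dead cells that own a slots array need any work.
    std::bitset<kCellsPerArena> needFinalize = arena.finalizerBits & ~arena.markBits;
    std::size_t finalized = 0;
    for (std::size_t i = 0; i < kCellsPerArena; i++) {
        if (!needFinalize.test(i))
            continue;  // live, or dead without a slots array: nothing to do
        std::free(arena.cells[i].slots);
        arena.cells[i].slots = nullptr;
        finalized++;
    }
    arena.finalizerBits &= arena.markBits;  // only survivors keep their bit
    arena.markBits.reset();                 // ready for the next GC cycle
    return finalized;
}
```

When `needFinalize` has no bits set, the loop does no per-cell finalization work, which is the case this approach is meant to make cheap.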

It doesn't give a big speedup, but we do get a little faster on earley-boyer (by about 1.5% in a threadsafe build). Nothing else seems to be affected much. I was hoping to pretty much eliminate the cost of sweeping, but it turns out to be hard to do this. A lot of the cost of sweeping comes from the fact that there are so many arenas we have to iterate over and release--approximately 1 million in earley-boyer. The objshrink work will help here somewhat, but not a huge amount.

One nice thing about this patch is that it should be a big help on single-core machines where background finalization doesn't help.
Attachment #567609 - Flags: review?(igor)
(Reporter)

Comment 1

6 years ago
Luke asked about the single-threaded speedup. It's 4.8% on earley-boyer.

Also, I ran the threaded version more times. I still think it's a speedup, but there's a lot of variability. So please treat the 1.5% number as a rough approximation.

I'd be interested if you could try this patch yourself, Igor. It's useful to test it against bug 693426, since otherwise the patch looks like a bigger speedup than it really is.
(Reporter)

Comment 2

6 years ago
Luke had another idea, which was to see how many chunks appear to be completely free based on the bits. I just did a very simple loop before the sweep phase of every GC:
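    size_t empty = 0;  // chunks with no mark bits and no finalizer bits set
    for (GCChunkSet::Enum e(rt->gcChunkSet); !e.empty(); e.popFront()) {
        Chunk *chunk = e.front();
        // Nothing marked and nothing needing finalization: chunk looks free.
        if (chunk->bitmap.noBitsSet() && chunk->finalizerBitmap.noBitsSet())
            empty++;
    }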

I got 2362 empty chunks and 1862 non-empty chunks during earley-boyer. This overestimates the number of empty chunks, since the finalizer bits are only maintained for JSObjects. However, I don't think this should have too much of an effect, since virtually everything in this benchmark is a JSObject.
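
To make the overestimate concrete, here is a toy model rather than engine code. The `Chunk` layout, the `untrackedFinalizable` counter, and the `countEmptyChunks` helper are hypothetical names invented for this sketch. It counts chunks that look empty from the two bitmaps alone, and separately those that are empty once dead non-JSObject things that still need finalizing are taken into account.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCellsPerChunk = 128;  // illustrative size only

// Hypothetical stand-in for a GC chunk; not the real Chunk layout.
struct Chunk {
    std::bitset<kCellsPerChunk> markBits;       // live things after marking
    std::bitset<kCellsPerChunk> finalizerBits;  // JSObjects owning slots arrays
    std::size_t untrackedFinalizable = 0;       // dead non-JSObject things the bits do not cover

    // What the bitmaps alone can tell us.
    bool looksEmpty() const { return markBits.none() && finalizerBits.none(); }
    // What is actually true once untracked finalizable things are considered.
    bool reallyEmpty() const { return looksEmpty() && untrackedFinalizable == 0; }
};

struct EmptyCounts {
    std::size_t looksEmpty = 0;
    std::size_t reallyEmpty = 0;
};

EmptyCounts countEmptyChunks(const std::vector<Chunk> &chunks) {
    EmptyCounts counts;
    for (const Chunk &chunk : chunks) {
        if (chunk.looksEmpty())
            counts.looksEmpty++;
        if (chunk.reallyEmpty())
            counts.reallyEmpty++;
    }
    // The bitmap-only count can only overestimate, never underestimate.
    assert(counts.reallyEmpty <= counts.looksEmpty);
    return counts;
}
```

The gap between the two counts is the error introduced by maintaining finalizer bits only for JSObjects. It shrinks as the fraction of non-JSObject GC things in the heap shrinks.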

This looks really nice, but it will take some more work to exploit. We would have to either segregate JSObject chunks from non-JSObject chunks or maintain the finalizer bits for non-JSObject GC things as well. I'll look into this tomorrow. I'll leave the current patch up, since I don't think it will be invalidated by the new stuff.
(Reporter)

Comment 3

6 years ago
Comment on attachment 567609
patch

Last night I realized there's a simpler way to implement this. Hopefully a little faster too.
Attachment #567609 - Flags: review?(igor)
(Assignee)

Updated

3 years ago
Assignee: general → nobody