Closed Bug 648320 Opened 14 years ago Closed 8 years ago

GC: Eliminate standard JSObject finalizers

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE
blocking-kilimanjaro -

People

(Reporter: billm, Assigned: igor)

References

(Depends on 1 open bug)

Details

I think we should try, as much as possible, to get rid of finalizers. Instead, we would allocate stuff like object slots, empty shapes, and arguments data from the GC heap rather than from malloc. This would require some changes to the GC to make allocation classes more flexible, but it shouldn't be too difficult. Objects with custom finalizers would be allocated from special "penalty" arenas. That way, we could avoid running finalizers at all for most arenas. If we can get rid of most finalizers, we might be able to eliminate the background free thread, which would simplify the code somewhat.
Allocating slots from the GC heap means that the GC would need to do more during the marking phase to mark the slots arrays. Finalization, that we run in parallel after the GC, allows to avoid that.
The extra work would be setting one additional mark bit for each slot array. As we get better at predicting how big an object should be (once I get around to finishing bug 547327), slot arrays should become more rare. Mostly they'll only be needed for arrays with more than four elements. Since we'll already be doing a lot of marking work on those (to scan their slots), I doubt the difference will matter. The benefit is that we won't need to do any finalization work. In order for generational GC to work, the cost of collecting garbage objects needs to be zero. This bug should get us there.
How would the penalty arenas work when we don't know at the time of allocation whether an object will end up with a slot array? The simple thing to me seemed to be for objects in arenas to malloc stuff and have finalizers, and for objects in nurseries to allocate stuff from the nursery --- their classes would have finalizers, but we deliberately wouldn't call those finalizers as they would try to free stuff allocated in the nursery. Two issues: - Objects whose finalizers do stuff other than free malloc'ed data. These would have to be allocated outside of nurseries from the get go. - For array objects still in the nurseries but with tons of slots we probably want to allow them to have malloc'ed slots, would need a list of such malloc'ed pointers associated with the nursery to make sure they get freed if the objects are destroyed on the next sweep of the nursery. Also, we should figure out how bug 547327 interacts with the stuff in the TI branch, as I've since had to do object-size prediction in order to do IC-less property accesses on properties definitely held by objects.
(In reply to comment #3) > How would the penalty arenas work when we don't know at the time of > allocation whether an object will end up with a slot array? Er, to be clearer, some arrays are far bigger than an arena. Would the GC still manage memory for these?
(In reply to comment #4) > Er, to be clearer, some arrays are far bigger than an arena. Would the GC > still manage memory for these? I think we'd probably want to do two things: - Add support for "big arenas" that are larger than 4K. I don't see any reason why some arenas can't be larger than others, as long as we don't do conservative marking of objects in the big arenas. - Add a special "very large object" allocation path. Basically, it would just mmap the memory directly and add it to a list (pretty much what you described). The "very large" path would be sufficient, but I think it would be better to have both.
(In reply to comment #2) > The extra work would be setting one additional mark bit for each slot array. Right - we need to scan those arrays for GC things any any case so an additional marking would be pretty much cost-free. (In reply to comment #4) > Er, to be clearer, some arrays are far bigger than an arena. Would the GC > still manage memory for these? For big sizes we may just put all those malloced slots into a linked list or something like that.
(In reply to comment #5) > - Add support for "big arenas" that are larger than 4K. This would bring a lot of complexity and in that case we may well consider adopting something like jemalloc+GC marking bits rather then creating a universal allocator on our own. I don't see any > reason why some arenas can't be larger than others, as long as we don't do > conservative marking of objects in the big arenas. > - Add a special "very large object" allocation path. Basically, it would > just mmap the memory directly and add it to a list (pretty much what you > described). This is what any sensible malloc implementation would do on its own. So lets delegate managing of big allocations to it.
(In reply to comment #7) > This is what any sensible malloc implementation would do on its own. So lets > delegate managing of big allocations to it. This sounds right to me. Is the plan to move away from arenas that allocate fixed size things? I really like that property of our arenas (makes things clean, simple, fast), but it seems like something we wouldn't have if we needed to allocate variable-length slot buffers out of the GC heap.
(In reply to comment #8) > I > really like that property of our arenas (makes things clean, simple, fast), > but it seems like something we wouldn't have if we needed to allocate > variable-length slot buffers out of the GC heap. It is rather simple to add an extra marking bit for malloced things. For example, for any allocation over, say, 512 bytes, an implementation can call malloc to get the necessary size + 1 word for the linked list linkage and marking. The marking code would just set a bit in that linkage word and the sweeping part would walk over the linked list clearing the bit and freeing unmarked nodes. This adds some complexity, but I assume this would easier to do than to implement variable-sized arenas.
> > This is what any sensible malloc implementation would do on its own. So lets > > delegate managing of big allocations to it. > > This sounds right to me. Yes yes yes! Please. Using malloc makes things like profiling with Massif much easier.
Blocks: 505308
Igor, do you have any time to look into this bug? It's a big priority for the generational GC work, and it'd be very exciting to get rid of much of the cost of finalizers. I suspect that the best way to resolve questions about how to handle variable-sized objects is to start coding and see what works.
I can look at this prototyping the suggestion from the comment 9.
Assignee: general → igor
(In reply to comment #12) > I can look at this prototyping the suggestion from the comment 9. Great, thanks!
Any progress on this, Igor? This would make it easy to GC-allocate scripts even though they're not fixed size. Also, dvander wants to be able to GC-allocate non-fixed size objects for IonMonkey.
(In reply to comment #14) > Any progress on this, Igor? This would make it easy to GC-allocate scripts > even though they're not fixed size. Also, dvander wants to be able to > GC-allocate non-fixed size objects for IonMonkey. I will do this after finishing those small chunks patches. I suppose I will have some patches during the next week.
This might help with bug 673840, if it reduces the number of arena kinds.
Blocks: 673840
(In reply to comment #14) > Any progress on this, Igor? This would make it easy to GC-allocate scripts > even though they're not fixed size. Also, dvander wants to be able to > GC-allocate non-fixed size objects for IonMonkey. That exposed a problem with my initial approach of delegating big-sized allocations to malloc. Such allocations would not be recognized by the conservative GC scanner and could be GC-ed before getting stored in GC things. One workaround here is to always allocate the variable part together with GC the thing that stores it, but that would not help JSScript. Another alternative is to use an explicit autorooting but that requires more work for JIT. Another problem is that even for small allocations from GC arenas the conservative GC still needs to know the type of allocation to properly scan it for GC things. That implies that we cannot just use untyped GC-kind that depends only on the size of GC things. The type should be recorded somewhere or we my end up with proliferation with GC arena kinds. So I think it is worth to consider a simple universal allocator for non-fixed GC things among the lines of the current bump allocator for arenas and see how it would fair.
Could you talk more about what you mean in the last paragraph?
(In reply to comment #18) > Could you talk more about what you mean in the last paragraph? Bump-like allocators has been investigate for at least 15 years. Writing one is not complex, see, for example, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.271&rep=rep1&type=pdf So my idea is to check if we can quickly adopt an existing solution.
I thought you were worried about how to take a pointer and find the start of the thing it points to and what kind of thing it is. How will bump pointer allocation solve that?
(In reply to comment #20) > I thought you were worried about how to take a pointer and find the start of > the thing it points to and what kind of thing it is. How will bump pointer > allocation solve that? Yes, it turned out trickier than I thought. So forget about modifying pre-existing bump allocators. My new plan is the following: 1. Add a few arena types for variable-length things to cover the sizes up to halve of the GC arena. As this will proliferate the number of arenas we may need to shrink the arena size to 2K. For the conservative GC the type of the allocation would be stored either in a word before the allocation or in a separated type map. The first approach is the most bump-allocator and cache friendly but the second would waste at most few bits per GC thing and would allow to have the allocation sizes like 2^N, not 2^N - (sizeof(type word) + alignment), as would be with the first approach. 2. For allocation sizes over arena size but less than the chunk size I borrow the code from jemalloc and its so-called large allocation management. The GC arenas that belongs to such allocations would be marked with a special type to help the conservative GC identify them and to find the allocation start/type. Note that this requires to move ArenaHeader out of Arena (bug 600234). 3. The allocation in chunk size or greater would be allocated as a series of chunks. Each chunk from the sequence would be recorded in the chunk hash together with a special marker to identify the first chunk. To make marking of such multi-chunk GC things transparent the first chunk would still have the markbitmap, but it would be moved to the start of the chunk. Alternatively the marking functions for variable-size things may require to pass the allocation length so we may skip the mark bitmap for multi-chunk allocations and store the marking bits in the chunk hash.
Depends on: 600234
blocking-kilimanjaro: --- → ?
Sounds like it's not critical to k9o: rather it allows us to get good finalizer perf without having the background thread. If this would in fact have a dramatic effect on sweep times and can be done quickly, please renominate.
blocking-kilimanjaro: ? → -
We've mostly made the background finalization thread tractable via other changes.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.