648320 - GC: Eliminate standard JSObject finalizers

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Description

•

14 years ago

I think we should try, as much as possible, to get rid of finalizers. Instead, we would allocate stuff like object slots, empty shapes, and arguments data from the GC heap rather than from malloc. This would require some changes to the GC to make allocation classes more flexible, but it shouldn't be too difficult. Objects with custom finalizers would be allocated from special "penalty" arenas. That way, we could avoid running finalizers at all for most arenas. If we can get rid of most finalizers, we might be able to eliminate the background free thread, which would simplify the code somewhat.

Igor Bukanov

Assignee

Comment 1

•

14 years ago

Allocating slots from the GC heap means that the GC would need to do more during the marking phase to mark the slots arrays. Finalization, that we run in parallel after the GC, allows to avoid that.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 2

•

14 years ago

The extra work would be setting one additional mark bit for each slot array. As we get better at predicting how big an object should be (once I get around to finishing bug 547327), slot arrays should become more rare. Mostly they'll only be needed for arrays with more than four elements. Since we'll already be doing a lot of marking work on those (to scan their slots), I doubt the difference will matter. The benefit is that we won't need to do any finalization work. In order for generational GC to work, the cost of collecting garbage objects needs to be zero. This bug should get us there.

Brian Hackett [Laid off!]

Comment 3

•

14 years ago

How would the penalty arenas work when we don't know at the time of allocation whether an object will end up with a slot array? The simple thing to me seemed to be for objects in arenas to malloc stuff and have finalizers, and for objects in nurseries to allocate stuff from the nursery --- their classes would have finalizers, but we deliberately wouldn't call those finalizers as they would try to free stuff allocated in the nursery. Two issues: - Objects whose finalizers do stuff other than free malloc'ed data. These would have to be allocated outside of nurseries from the get go. - For array objects still in the nurseries but with tons of slots we probably want to allow them to have malloc'ed slots, would need a list of such malloc'ed pointers associated with the nursery to make sure they get freed if the objects are destroyed on the next sweep of the nursery. Also, we should figure out how bug 547327 interacts with the stuff in the TI branch, as I've since had to do object-size prediction in order to do IC-less property accesses on properties definitely held by objects.

Brian Hackett [Laid off!]

Comment 4

•

14 years ago

(In reply to comment #3) > How would the penalty arenas work when we don't know at the time of > allocation whether an object will end up with a slot array? Er, to be clearer, some arrays are far bigger than an arena. Would the GC still manage memory for these?

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 5

•

14 years ago

(In reply to comment #4) > Er, to be clearer, some arrays are far bigger than an arena. Would the GC > still manage memory for these? I think we'd probably want to do two things: - Add support for "big arenas" that are larger than 4K. I don't see any reason why some arenas can't be larger than others, as long as we don't do conservative marking of objects in the big arenas. - Add a special "very large object" allocation path. Basically, it would just mmap the memory directly and add it to a list (pretty much what you described). The "very large" path would be sufficient, but I think it would be better to have both.

Igor Bukanov

Assignee

Comment 6

•

14 years ago

(In reply to comment #2) > The extra work would be setting one additional mark bit for each slot array. Right - we need to scan those arrays for GC things any any case so an additional marking would be pretty much cost-free. (In reply to comment #4) > Er, to be clearer, some arrays are far bigger than an arena. Would the GC > still manage memory for these? For big sizes we may just put all those malloced slots into a linked list or something like that.

Igor Bukanov

Assignee

Comment 7

•

14 years ago

(In reply to comment #5) > - Add support for "big arenas" that are larger than 4K. This would bring a lot of complexity and in that case we may well consider adopting something like jemalloc+GC marking bits rather then creating a universal allocator on our own. I don't see any > reason why some arenas can't be larger than others, as long as we don't do > conservative marking of objects in the big arenas. > - Add a special "very large object" allocation path. Basically, it would > just mmap the memory directly and add it to a list (pretty much what you > described). This is what any sensible malloc implementation would do on its own. So lets delegate managing of big allocations to it.

Brian Hackett [Laid off!]

Comment 8

•

14 years ago

(In reply to comment #7) > This is what any sensible malloc implementation would do on its own. So lets > delegate managing of big allocations to it. This sounds right to me. Is the plan to move away from arenas that allocate fixed size things? I really like that property of our arenas (makes things clean, simple, fast), but it seems like something we wouldn't have if we needed to allocate variable-length slot buffers out of the GC heap.

Igor Bukanov

Assignee

Comment 9

•

14 years ago

(In reply to comment #8) > I > really like that property of our arenas (makes things clean, simple, fast), > but it seems like something we wouldn't have if we needed to allocate > variable-length slot buffers out of the GC heap. It is rather simple to add an extra marking bit for malloced things. For example, for any allocation over, say, 512 bytes, an implementation can call malloc to get the necessary size + 1 word for the linked list linkage and marking. The marking code would just set a bit in that linkage word and the sweeping part would walk over the linked list clearing the bit and freeing unmarked nodes. This adds some complexity, but I assume this would easier to do than to implement variable-sized arenas.

Nicholas Nethercote [inactive]

Comment 10

•

14 years ago

> > This is what any sensible malloc implementation would do on its own. So lets > > delegate managing of big allocations to it. > > This sounds right to me. Yes yes yes! Please. Using malloc makes things like profiling with Massif much easier.

Tom S [:evilpie]

Updated

•

14 years ago

Blocks: 505308

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 11

•

14 years ago

Igor, do you have any time to look into this bug? It's a big priority for the generational GC work, and it'd be very exciting to get rid of much of the cost of finalizers. I suspect that the best way to resolve questions about how to handle variable-sized objects is to start coding and see what works.

Igor Bukanov

Assignee

Comment 12

•

14 years ago

I can look at this prototyping the suggestion from the comment 9.

Assignee: general → igor

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 13

•

14 years ago

(In reply to comment #12) > I can look at this prototyping the suggestion from the comment 9. Great, thanks!

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 14

•

13 years ago

Any progress on this, Igor? This would make it easy to GC-allocate scripts even though they're not fixed size. Also, dvander wants to be able to GC-allocate non-fixed size objects for IonMonkey.

Igor Bukanov

Assignee

Comment 15

•

13 years ago

(In reply to comment #14) > Any progress on this, Igor? This would make it easy to GC-allocate scripts > even though they're not fixed size. Also, dvander wants to be able to > GC-allocate non-fixed size objects for IonMonkey. I will do this after finishing those small chunks patches. I suppose I will have some patches during the next week.

Nicholas Nethercote [inactive]

Comment 16

•

13 years ago

This might help with bug 673840, if it reduces the number of arena kinds.

Blocks: 673840

Igor Bukanov

Assignee

Comment 17

•

13 years ago

(In reply to comment #14) > Any progress on this, Igor? This would make it easy to GC-allocate scripts > even though they're not fixed size. Also, dvander wants to be able to > GC-allocate non-fixed size objects for IonMonkey. That exposed a problem with my initial approach of delegating big-sized allocations to malloc. Such allocations would not be recognized by the conservative GC scanner and could be GC-ed before getting stored in GC things. One workaround here is to always allocate the variable part together with GC the thing that stores it, but that would not help JSScript. Another alternative is to use an explicit autorooting but that requires more work for JIT. Another problem is that even for small allocations from GC arenas the conservative GC still needs to know the type of allocation to properly scan it for GC things. That implies that we cannot just use untyped GC-kind that depends only on the size of GC things. The type should be recorded somewhere or we my end up with proliferation with GC arena kinds. So I think it is worth to consider a simple universal allocator for non-fixed GC things among the lines of the current bump allocator for arenas and see how it would fair.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 18

•

13 years ago

Could you talk more about what you mean in the last paragraph?

Igor Bukanov

Assignee

Comment 19

•

13 years ago

(In reply to comment #18) > Could you talk more about what you mean in the last paragraph? Bump-like allocators has been investigate for at least 15 years. Writing one is not complex, see, for example, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.271&rep=rep1&type=pdf So my idea is to check if we can quickly adopt an existing solution.

Bill McCloskey [inactive unless it's an emergency] (:billm)

Reporter

Comment 20

•

13 years ago

I thought you were worried about how to take a pointer and find the start of the thing it points to and what kind of thing it is. How will bump pointer allocation solve that?

Igor Bukanov

Assignee

Comment 21

•

13 years ago

(In reply to comment #20) > I thought you were worried about how to take a pointer and find the start of > the thing it points to and what kind of thing it is. How will bump pointer > allocation solve that? Yes, it turned out trickier than I thought. So forget about modifying pre-existing bump allocators. My new plan is the following: 1. Add a few arena types for variable-length things to cover the sizes up to halve of the GC arena. As this will proliferate the number of arenas we may need to shrink the arena size to 2K. For the conservative GC the type of the allocation would be stored either in a word before the allocation or in a separated type map. The first approach is the most bump-allocator and cache friendly but the second would waste at most few bits per GC thing and would allow to have the allocation sizes like 2^N, not 2^N - (sizeof(type word) + alignment), as would be with the first approach. 2. For allocation sizes over arena size but less than the chunk size I borrow the code from jemalloc and its so-called large allocation management. The GC arenas that belongs to such allocations would be marked with a special type to help the conservative GC identify them and to find the allocation start/type. Note that this requires to move ArenaHeader out of Arena (bug 600234). 3. The allocation in chunk size or greater would be allocated as a series of chunks. Each chunk from the sequence would be recorded in the chunk hash together with a special marker to identify the first chunk. To make marking of such multi-chunk GC things transparent the first chunk would still have the markbitmap, but it would be moved to the start of the chunk. Alternatively the marking functions for variable-size things may require to pass the allocation length so we may skip the mark bitmap for multi-chunk allocations and store the marking bits in the chunk hash.

Igor Bukanov

Assignee

Updated

•

13 years ago

Depends on: 600234

David Mandelin [:dmandelin]

Updated

•

13 years ago

blocking-kilimanjaro: --- → ?

David Mandelin [:dmandelin]

Comment 22

•

13 years ago

Sounds like it's not critical to k9o: rather it allows us to get good finalizer perf without having the background thread. If this would in fact have a dramatic effect on sweep times and can be done quickly, please renominate.

blocking-kilimanjaro: ? → -

Terrence Cole [:terrence]

Comment 23

•

8 years ago

We've mostly made the background finalization thread tractable via other changes.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE