Closed Bug 993347 Opened 9 years ago Closed 8 years ago
PJS: must clear result array of discarded results after parallel bailout
If a parallel operation is applied to an array of non-primitive values - so that a cursor is supplied to the kernel - a non-fatal bailout followed by a retry allows the kernel's second pass to observe the results written by the previous pass. This is semantically undesirable if the values written are primitives, and possibly fatal if they are pointers once the generational parallel GC is operational (it's possible to have pointers into one worker's nursery be available to another worker). The most reasonable way to fix this is to clear the result array when restarting the operation after a bailout. Clearing should be done in parallel. It can be done in preparation for each slice or in bulk before control is given to the workers, though in the latter case work the fact that we're work stealing probably means there needs to be an additional synchronization point.
Obvious temporary fix, sits on top of the patch for bug 993396.
Type descriptors have a "initialize" method for initializing memory in a suitable way. Presumably we want to call that, exposed via intrinsic or some such. I think that path is parallel-safe, though I don't recall for sure.
After some discussion on IRC, Lars and I came upon the following possible strategy. It's a bit more aggressive, but it seems to address a couple of concerns: 1. For arrays, we would track the initialized length separate from the full length. 2. Before invoking the user callback on each slice, we would initialize the slice in question. 3. In seq mode, we can advance the initialized length as we go. 4. In par mode, if a bailout occurs, we never change initialized length, if no bailout occurs, we set initialized length to the end of the array. So it's all or nothing.
(In reply to Niko Matsakis [:nmatsakis] from comment #3) > After some discussion on IRC, Lars and I came upon the following possible > strategy. It's a bit more aggressive, but it seems to address a couple of > concerns: > > 1. For arrays, we would track the initialized length separate from the full > length. > 2. Before invoking the user callback on each slice, we would initialize the > slice in question. > 3. In seq mode, we can advance the initialized length as we go. > 4. In par mode, if a bailout occurs, we never change initialized length, if > no bailout occurs, we set initialized length to the end of the array. So > it's all or nothing. This is probably insufficient - by itself - to support PJS generational GC, since the GC needs to scan the result array during a minor GC. There are some workarounds for that, such as a per-worker byte map that tracks the slices touched by a particular worker. My current thinking is that I will land an obviously-correct but suboptimal stopgap solution as part of the PJS GC work, and then work on a more sophisticated solution here later.
The problem in this bug is orthogonal to PJS GC provided that (a) PJS GC always evacuates its nurseries whether there's a bailout or not and (b) never clears and always incorporates its tenured areas into the general heap on bailout. It's easily observed that those conditions will result in a consistent heap, regardless of where an object is allocated and where references to any object are found. (It may also have good performance given that most objects will never make it out of the PJS nurseries; the PJS tenured areas are likely very small in practice.)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.