Bug 509067 (Closed) - Opened 15 years ago, Closed 14 years ago
The thread safety of the OOM callback firing and management is broken
Categories: Tamarin Graveyard :: Garbage Collection (MMgc), defect, P1
Tracking: Not tracked
Status: VERIFIED FIXED
Target Milestone: flash10.1
People: Reporter: treilly; Assigned: treilly
Attachments: 3 files, 2 obsolete files
- patch, 2.86 KB: lhansen: superreview+
- patch, 12.20 KB: lhansen: superreview+
- patch, 2.94 KB: lhansen: superreview+
Description: I *think* we can protect these with the heap lock and be done.
Comment 1 • 15 years ago (Assignee)
Attachment #398442 - Flags: superreview?(lhansen)
Updated • 15 years ago (Assignee)
Attachment #398442 - Flags: superreview?(lhansen)
Comment 2 • 15 years ago (Assignee)
Attachment #398442 - Attachment is obsolete: true
Attachment #398457 - Flags: superreview?(lhansen)
Updated • 15 years ago (Assignee)
Summary: The thread safety of the OOM callback list and gc manager list is suspect → The thread safety of the OOM callback firing and management is broken
Comment 3 • 15 years ago (Assignee)
We also need to hold the GCHeap lock while invoking callbacks, while still allowing on-thread callback/GC-list remove operations and free-block operations to occur.
Updated • 15 years ago (Assignee)
Attachment #398457 - Flags: superreview?(lhansen)
Comment 4 • 15 years ago (Assignee)
Comment on attachment 398457 [details] [diff] [review], "use a separate lock for list":

Asking for r+ as a code-cleanup change, not the complete fix; pushing this will bring us up to speed with the internal branches and make integrations easier.
Attachment #398457 - Flags: superreview?(lhansen)
Updated • 15 years ago
Attachment #398457 - Flags: superreview?(lhansen) → superreview+
Comment 5 • 15 years ago
Comment on attachment 398457 [details] [diff] [review], "use a separate lock for list":

r+ as requested, but with the proviso that this is not correct: there is a race condition in how callbacks are handled. It's possible to grab a callback off the list and then have another thread cause the callback to be destroyed before it's used by the thread that grabbed it. The correct fix appears to be to change how callbacks are handled globally.
Comment 6 • 15 years ago (Assignee)
In addition to the race conditions, we need to block other threads from making GCHeap allocations while the callbacks are being fired. Currently, if one thread gets into ExpandHeap while another thread is already there firing callbacks, we abort.
Updated • 15 years ago
Target Milestone: --- → flash10.1
Updated • 15 years ago
Priority: P2 → P1
Comment 7 • 14 years ago (Assignee)
Attachment #420790 - Flags: superreview?(lhansen)
Comment 8 • 14 years ago
Comment on attachment 420790 [details] [diff] [review], "Keeps spinlock during callback firing and allows recursion in select cases":

Most allocs don't go through GC::Alloc any more, so the assertion in GC::Alloc is insufficient. If I understand the new restriction correctly, it will again be impossible for a finalizer to allocate memory reliably (GC::memoryStatusChange calls GC::Collect, which runs arbitrary finalizers). It seems to me we worked hard to make that possible. I'm inclined to r- this, but I'll hold off; can you explain what you're trying to do here and why this fix is appropriate?
Comment 9 • 14 years ago (Assignee)
(In reply to comment #8)

> Most allocs don't go through GC::Alloc any more so the assertion in GC::Alloc is insufficient.

I should move it to GCAlloc::Alloc, I take it.

> If I understand the new restriction correctly, it will again be impossible for a finalizer to allocate memory reliably. (GC::memoryStatusChange calls GC::Collect which runs arbitrary finalizers.) It seems to me we worked hard to make that possible.

We could allow recursive allocations, I suppose; we'll still just Abort if they can't be satisfied. The assertions were nice, though, because I found a bunch of shutdown allocations that could have/should have been avoided.

> I'm inclined to r- this but I'll hold off - can you explain what you're trying to do here and why this fix is appropriate?

It's been a known problem that unlocking the heap to send the notifications lets other threads start performing allocations while we're sending them; on the thread-heavy desktop this commonly results in an abort before we even finish sending the notifications. The idea is that when we run out of memory and send the notifications, all incoming requests from other threads should be blocked until we're done sending them out.
Comment 10 • 14 years ago
(In reply to comment #9)

> I should move it to GCAlloc::Alloc I take it.

Also GCLargeAlloc::Alloc.

> We could make recursive allocations allowed I suppose, we'll still just Abort if they can't be satisfied. The assertions were nice though because I found a bunch of shutdown allocations that could have/should have been avoided.

I believe that, but we should not introduce a new restriction here. Aborting on a failure is probably OK for now. During the next round, finalization will not be running "inside" the GC and the problem will go away.

> The idea is that when we run out of memory and send the notifications all incoming requests from other threads should be blocked until we're done sending out the notifications.

OK, that makes sense. If you can post a patch that doesn't create the new restriction on operations within finalizers, I will try to be prompt with the review.
Comment 11 • 14 years ago (Assignee)
I left in the asserts but limited them to the kMemAbort case, coolio?
Attachment #420790 - Attachment is obsolete: true
Attachment #421081 - Flags: superreview?(lhansen)
Attachment #420790 - Flags: superreview?(lhansen)
Updated • 14 years ago
Attachment #421081 - Flags: superreview?(lhansen) → superreview+
Comment 12 • 14 years ago (Assignee)
changeset: 3478:2b22a41cae41
Comment 13 • 14 years ago (Assignee)
(In reply to comment #5)

> r+ as requested but with the proviso that this is not correct - there is a race condition in how callbacks are handled; it's possible to grab a callback off the list and then have another thread cause the callback to be destroyed before it's used by the thread that grabbed it. The correct fix appears to be to change how callbacks are handled globally.

This is still a problem, and I think it could be solved by removing the callback list lock and relying on the GCHeap spinlock instead. Then the other thread would be blocked from removing the callback (but the thread issuing the notifications could remove a callback by using an ALLOW_RECURSION lock). We should probably fix this last bit for 10.1, no?
Comment 14 • 14 years ago
(In reply to comment #13)

> This is still a problem and I think it could be solved by removing the callback list lock and relying on the GCHeap spinlock instead. Then the other thread would be blocked from removing the callback (but the thread issuing the notifications could remove a callback by using an ALLOW_RECURSION lock). We should probably fix this last bit for 10.1 no?

Sounds plausible.
Comment 15 • 14 years ago (Assignee)
This removes the list lock and uses the GCHeap's main lock instead. In theory this will prevent a callback from being removed/deleted by one thread while it is being used to send notifications by another. In practice we needed to accommodate the BasicList's re-entries into the GCHeap, so I used a hack to take advantage of the recent recursive-lock stuff. Really we want a completely generic recursive lock here; arguably, restricting the notification recursion to that one case isn't buying us much, and we should allow general recursion on the GCHeap lock, but that's a bit much to bite off at this point, I think. So we go with this "hack" for now.
Attachment #422765 - Flags: superreview?(lhansen)
Comment 16 • 14 years ago
Comment on attachment 422765 [details] [diff] [review], "crap":

I should not be approving patches called "crap".
Attachment #422765 - Flags: superreview?(lhansen) → superreview+
Comment 17 • 14 years ago (Assignee)
tamarin-redux: 3638; tamarin-redux-argo: 3603
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED