Closed Bug 626684 Opened 14 years ago Closed 13 years ago

Add assertions that check against exactly-tracing dangling pointers

Categories

(Tamarin Graveyard :: Garbage Collection (mmGC), defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: lhansen, Assigned: rulohani)

References

Details

Attachments

(1 file)

From bug #604733: "We can add checks to the tracer in DEBUG builds (or even in
RELEASE builds, with a separate tweak) that check that every traced pointer
points to live storage; this guards against the dangling pointer problem and
may be important in practice."

Smallish job, probably goes into TracePointer and maybe a few others (especially if we start type-specializing tracers, bug 604704).  Will probably help us find some bugs more quickly.
This is needed today (1 February 2011) - needs to be in TR before merge to FRR / FR, will make it much less painful to find deletion errors in the existing code base.
Priority: P2 → P1
Lars, I see the assert was added and commented:

        // We cannot yet assert the following.  The reason is an interaction between
        // exact tracing, stack pinning, and reference counting.  Reference counts
        // must reflect at least the number of references from exactly traced objects.
        // But code that maintains arrays of reference counts and cleverly "transfers"
        // the counts from one copy of the array to the other without resetting the
        // exact-trace bit or clearing the object or freeing it will end up in a
        // situation where, if the dead array copy is reached during tracing, the
        // reference counts will be wrong and the assert will hit.  The object can be
        // reached even if the C++ code is correct since we trace the stack and some
        // objects conservatively.  The problem will disappear when we remove
        // reference counting.
        //
        // Another problem here is that explicit deletion of objects can create
        // dangling pointers that will also hit the assert, but in that case there are
        // other, more insidious bugs, so explicit deletion has to be dealt with in
        // any case.
        // GCAssert((bits2 & (kMark|kQueued)) != (kMark|kQueued)); 

How often was the assert hit because of the first problem above? I thought this is the assert we want for checking dangling pointers.
The assert you want for dangling pointers probably goes in GC::TracePointer or in functions that call that, they should assert that a non-null pointer value points to live storage.
A less muddled comment (after talking to Ruchi and Steven):

Of course the code above is in GC::TracePointer, and it's even the correct code.  The comment is even correct.  There's a bug, which is that the array code - and maybe other, similar code - does not clear storage containing pointers to rc'd objects when the RCs are transfered.  Since RC deletes objects from underneath the GC that creates dangling pointers, which is a big deal if the object containing the pointers is reached by GC.  It can be reached by conservative GC, at least.

I did not realize previously that that is a bug, but it is.  In fact it is a bug simply by application of the existing dangling-pointer invariant.

More interestingly perhaps it uncovers an implied "RC must be partially correct" invariant that states that if the RC drops to zero then there are /no/ pointers to that object from exactly-traced storage.

Together with the ContainsPointers assert the test is probably even sufficient, because ContainsPointers checks that the pointer points into the heap somewhere (I think - more checking might be worthwhile).  Anyway that's the second half of the test this bug calls for.
Flags: flashplayer-bug-
I haven't hit this assert (enabled on my system) till now with the Flash player 4 ATSes. Should this be enabled? How do I find the code which needs cleanup after RC transfer?
(In reply to comment #5)
> I haven't hit this assert (enabled on my system) till now with the Flash player
> 4 ATSes. Should this be enabled? How do I find the code which needs cleanup
> after RC transfer?

I assume that means that the array zeroing was implemented in TR and that you're looking for where that code is so that you can transfer it to FR?  I'd talk to Steven and agree with him whether he thinks it's OK to enable it, and find out the code to copy over (if any).
Talked with Steven. There wasn't any code found in TR but not in FR that was probably causing false assert failure. avmplusList is one possibility which is doing copy and free etc. Specifically this code:
        newData->len = m_data->len;
        // don't call ListHelper::clearRange here; we want the refCounts to be transferred
        ListHelper::LISTDATA::free(gc, m_data);
        ListHelper::wbData(this, &m_data, newData);

but I wonder how will the deleted memory be accessible while exact tracing as m_data gets updated with the newData?

Other than this I filed another bug related to refcounting : 634785. Though, I doubt that it would have caused the assert to hit. 

I think we should just enable this assert. At least it will help find any dangling pointers or for that matter the other bug in question.

Steven if you agree that the commented out assert should be enabled, I will attach a patch with updated comments.
Have we enabled the assert, then run thru ATS9/ATS10 in Flash? That would be a good test.
Yep, I enabled the assert in my local branch and ran ATSes without hitting this assert.
(In reply to comment #9)
> Yep, I enabled the assert in my local branch and ran ATSes without hitting this
> assert.

Most excellent.  I guess Steven has the final word.
Well ok then, enable that sucker!
Attachment #514695 - Flags: superreview?(lhansen)
Attachment #514695 - Flags: review?(stejohns)
Attachment #514695 - Flags: review?(stejohns) → review+
Attachment #514695 - Flags: superreview?(lhansen) → superreview+
TR 5996:4e563e1a9caf
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
in FRR (Mac Standalone Debug Debugger), when I run ATS9 build smoke, I hit the assert when I get to test 12518 (23/1633)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: