Closed Bug 615490 Opened 14 years ago Closed 13 years ago

Remove explicit deletion of managed objects

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Q3 11 - Serrano

People

(Reporter: lhansen, Assigned: rulohani)

References

Details

Attachments

(4 files, 1 obsolete file)

Explicit deletion of managed objects is benign in the context of conservative tracing but creates a dangerous dangling pointer problem in the context of exact tracing.  Explicit deletion may be safe in some circumstances, and it may be an important optimization in some cases, but generally it should not be used.

We cannot find sites that perform explicit deletion by making the delete operator private, as C++ requires the delete operator to be visible to any virtual destructor (ISO C++ 1998 2.4 para 11).  So long as we keep using destructors for finalizers we'll have to find some other method for tracking down explicit deletion.  For the time being I've been content with setting breakpoints in the base delete operators and running tests; as neither DRC nor GC invokes the delete operator this finds exactly the cases where the delete operator is invoked explicitly.

The other problem is direct use of GC::Free, of course, which is much more prevalent but the uses of which are easier to find by making GC::Free private and friending the few methods that must have access to it.
Note that the notion of a dangling pointer is subtle.  For example, there's one in this code:

        const Atom* atoms = getAtoms();
        gc->Free(atoms);
        setAtoms(newAtoms);

The dangling pointer is in InlineHashtable::m_atomsAndFlags which will point to deleted storage when setAtoms is called.  If setAtoms allocates storage (even indirectly) then the tracer may see the dangling pointer.  setAtoms may allocate storage because it invokes the write barrier; the GC could in principle be driven forward by the overflow of a write barrier buffer or something like that.  And if there's any kind of concurrent or background marking then the dangling pointer is obvious.

As a consequence of that, GC::Free will probably have to go away and be replaced by something obnoxious, eg, GC::ClearLastHeapReferenceAndFree(void**), which would take the address of m_atomsAndFlags, would read it, clear it, and then free the storage.
Note further that explicit deletion of String objects has been disabled in the String code by implementing an empty delete operator.  I have not investigated but I assume that change came in with Michael's string rewrite.
(In reply to comment #1)
>
> The dangling pointer is in InlineHashtable::m_atomsAndFlags which will point to
> deleted storage when setAtoms is called.  If setAtoms allocates storage (even
> indirectly) then the tracer may see the dangling pointer. 

And in a threaded GC, marking could be happening on another thread anyway, so a whole new (and potentially very large) class of dangling pointer hazards arises even when the invoking thread doesn't allocate storage.
Calling Free() from only one place means it's easier to control the dangling pointer situation.
Attachment #494340 - Flags: review?(stejohns)
See also bug #615830 (regarding StringOutputStream and StringBuffer).
I'm not sure I completely understand the problem, is the problem that delete happens in a step wise fashion down the class hierarchy and the exact tracing flag isn't unset until we get to the bottom?   We guard against deleted objects being on the mark stack/barrier stacks right?    

The list problem was that we WEREN'T calling free on ditched array during a grow operation and left its exact tracing flag intact.   So that introduces a new rule that if you can't have possibly dead RCObjects reachable from exactly traced memory.   I'm having trouble making the jump from that new rule to we must eradicate all explicit deletions (although I'd love it if our GC was so efficient we could remove all explicit deletions and not suffer for it).

As long as we have DRC and conservative stack scanning I think we'll always have to tolerate finding deleted objects on the stack (and in false positives in conservatively scanned heap objects).    I guess I don't understand why the exact tracer can't be similarly tolerant (at least until DRC and explicit deletion is gone).

There's a class of objects in the player that are always explicitly deleted when operating in the old VM where display object couldn't move about, ie when something drops off the display list its deleted and the dtor does a lot of work to remove any dangling pointers.     Do we care about freeing of objects that aren't exactly traced?    

In those cases it should be relatively straight forward to refactor the dtor's out into separate methods and get the same behavior through a different mechanism than calling delete, although explicit deletion is visible in weak ref behavior so we would risk breaking things dependent on weak ref behavior.
The problem has nothing to do with the delete operator per se, it has to do with freeing storage explicitly.

The problem is that the exact tracer will not bother to determine whether a pointer field points to an object that has been deleted - the field has to be valid, and it has to be NULL or a pointer to a live object.  This is a central part of what it means for the tracer to be "exact".

Consequently, if the freeing of an object leaves a field in an object pointing to the memory previously occupied by the object (and that memory can be unmapped so it can "not exist" in some sense) and the exact tracer can be exposed to that field, then we have a bug.

Stack scanning has nothing to do with this.  The stack scan may identify a deleted object as reachable; if that object is exactly scanned then it must have been cleared.  

Since RC smart pointers clear the pointers (or ought to) I don't think RC is really part of the central problem here; explicit freeing of storage through GC::Free is, however.
Attachment #494340 - Flags: review?(stejohns) → review+
Attachment #494341 - Flags: review?(stejohns) → review+
changeset: 5604:738293714b0a
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 615490 - Remove explicit deletion of managed objects: centralize the use of GC::Free in InlineHashtable code (r=stejohns)

http://hg.mozilla.org/tamarin-redux/rev/738293714b0a
changeset: 5605:12efad62783c
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 615490 - Remove explicit deletion of managed objects: centralize the use of GC::Free in MultinameHashtable code (r=stejohns)

http://hg.mozilla.org/tamarin-redux/rev/12efad62783c
Also corrects some dodgy comments in that code.
Attachment #494659 - Flags: review?(stejohns)
Attachment #494659 - Flags: review?(stejohns) → review+
changeset: 5609:902b13f1c358
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 615490 - Remove explicit deletion of managed objects: Avoid dangling the pointer when freeing a String buffer (r=stejohns)

http://hg.mozilla.org/tamarin-redux/rev/902b13f1c358
Depends on: 625297
Depends on: 625328
Assignee: lhansen → rulohani
BTW the bug should not be taken literally, there's no need to remove all explicit deletion.  But we need to control it somehow.
Related: bug 601271
Depends on: 626684
Summary of investigation till now:
GC::Free called ~40 times ( this includes both player and avm)
GCRef<>::Delete called ~56 times
(Not uploading files with call locations as they are flash player specific)

Clearing the raw pointer can be handled in GCRef<>::Delete(). Some of the places actually assign the object to Null after calling Delete() so they will also need clean up. 

Most of the deletes have already been converted to GCRef<>::Delete depending upon if that class got converted for GCRef usage. There are few of those calling delete directly. I think I could only find 1 "delete this;" for ScriptPlayer by just greping for all delete.
Attachment #505284 - Flags: review?(bgetlin)
Attachment #505284 - Flags: feedback?(treilly)
Comment on attachment 505284 [details] [diff] [review]
Set the raw GC object pointer to NULL in GCRef<T>::Delete()

A couple questions:
1) Is NULLifying t really necessary? The idea here is to mimic general pointer behavior as much as possible, and general pointer behavior doesn't automatically set the value to NULL when 'delete' is called.

2)Why the 'tCopy' variable?  Wouldn't this be sufficient:

delete t;
t = NULL;
Attachment #505284 - Flags: review?(bgetlin) → review-
(In reply to comment #17)

> A couple questions:
> 1) Is NULLifying t really necessary? The idea here is to mimic general pointer
> behavior as much as possible, and general pointer behavior doesn't
> automatically set the value to NULL when 'delete' is called.

With exactgc its necessary for the pointer to be set to NULL to avoid dangling pointers. With Delete as a method of GCRef, I thought it would be best to do it there. This would also clean up client/player code at several places where it sets the ref to NULL after calling Delete(). 

> 2)Why the 'tCopy' variable?  Wouldn't this be sufficient:
> 
> delete t;
> t = NULL;

I think this should be fine. I was just setting it to NULL before deleting it.
(In reply to comment #17)
> Comment on attachment 505284 [details] [diff] [review]
> Set the raw GC object pointer to NULL in GCRef<T>::Delete()
> 
> A couple questions:
> 1) Is NULLifying t really necessary? The idea here is to mimic general pointer
> behavior as much as possible, and general pointer behavior doesn't
> automatically set the value to NULL when 'delete' is called.

What Ruchi said.

> 2)Why the 'tCopy' variable?  Wouldn't this be sufficient:
> 
> delete t;
> t = NULL;

It is not sufficient.  The moment you delete the object pointed to by t, t is a dangling pointer.  While it looks like that may be OK here because no allocation activity comes between the deletion and the nulling, it is not OK because the deletion can itself cause allocator activity (eg, the destructor allocates and causes the GC to run).
(In reply to comment #19)

> It is not sufficient.  The moment you delete the object pointed to by t, t is a
> dangling pointer.  While it looks like that may be OK here because no
> allocation activity comes between the deletion and the nulling, it is not OK
> because the deletion can itself cause allocator activity (eg, the destructor
> allocates and causes the GC to run).

I thought about the first fact when I said "It should be fine" but i did not think about the second ( allocations can occur in dtor). So yes, it should set the pointer to NULL before calling delete.
(In reply to comment #19)
> It is not sufficient.  The moment you delete the object pointed to by t, t is a
> dangling pointer.  While it looks like that may be OK here because no
> allocation activity comes between the deletion and the nulling, it is not OK
> because the deletion can itself cause allocator activity (eg, the destructor
> allocates and causes the GC to run).

Let's capture that in a comment, so that future coders won't be tempted to "optimize" it...
Added comment.
Attachment #505284 - Attachment is obsolete: true
Attachment #505550 - Flags: superreview?(lhansen)
Attachment #505550 - Flags: review?(bgetlin)
Attachment #505284 - Flags: feedback?(treilly)
Comments == good.  Also all the invariants will be documented in the exactgc documentation (multiple comments == good as long as they agree).
Attachment #505550 - Flags: superreview?(lhansen) → superreview+
(In reply to comment #13)
> BTW the bug should not be taken literally, there's no need to remove all
> explicit deletion.  But we need to control it somehow.

To clarify my clarification, there are two problems with dangling pointers and exact GC.  The first is that the explicit deletion operates on the last reference to the object; in this case the dangling pointer is a local problem that's fixed by just NULLing out the reference before the deletion:

   Fnord* tmp = obj;
   obj = NULL;
   delete tmp;

The second problem is when the explicit deletion does not operate on the last reference to the object; in this case the dangling pointer is a global problem without an easy fix.

Unfortunately it's usually very difficult to know from looking at the code whether we're dealing with the one problem or the other.
Attachment #505550 - Flags: review?(bgetlin) → review+
I just had a long IM conversation with Brent and several things are worth recording here.

In the specific case of the Delete method it is not actually necessary for GC correctness to use the local temp and NULL-before-delete idiom.  (The reason is that the logic in GC::Free that ends up in AbortFree prevents anything bad from happening with the storage.)  It is however desirable to use it because it prevents the pointer to a half-destroyed object to be observed by eg a presweep callback, should the GC be triggered by the freeing.

In the case of a plain use of the 'delete' operator it is also not necessary for GC correctness to use the idiom, but it is somewhat more desirable, as these two code sequences are not equivalent:

  delete obj;
  obj = NULL;
  obj = new (gc, ...) C(...)

vs 

  delete obj;
  obj = new (gc, ...) C(...)

Spot the difference?  In the second case a dangling pointer is observable to the GC if the GC runs during the allocation on the right hand side of the assignment statement.  But most programmers would deem the nulling in the first case to be redundant and remove it.  By placing it before the delete call that is less likely to happen.

It gets more interesting however.  Suppose we have this code:

  DRCWB(Fnord*) obj;

Observe that in this case, this is not correct:

  tmp = obj;
  obj = NULL;
  delete tmp;

because the NULL assignment will decrement the ref on obj, which may then be taken by the reaper, so tmp may be dangling.  Nor is this:

  delete obj;
  obj = NULL;

because the assignment will result in decrementref being called on the object, which is definitely garbage (obj is dangling).

(For non-RC barriers either is fine, however: assigning a NULL to a non-RC barrier just stores NULL.)
Brent also pointed out that though we cannot make the delete operator private in GCFinalizedObject (due to C++ rules) in order to track down where it's used, we can do so in GCObject.
> It gets more interesting however.  Suppose we have this code:
> 
>   DRCWB(Fnord*) obj;
> 
> Observe that in this case, this is not correct:
> 
>   tmp = obj;
>   obj = NULL;
>   delete tmp;
> 
> because the NULL assignment will decrement the ref on obj, which may then be
> taken by the reaper, so tmp may be dangling.  

Not true, the reaper scans the stack.  

> Nor is this:
> 
>   delete obj;
>   obj = NULL;
> 
> because the assignment will result in decrementref being called on the object,
> which is definitely garbage (obj is dangling).

All this points to really needing the GC to handle the delete/clear as one atomic operation, ie:

obj.Delete();

Right?
(In reply to comment #27)
> All this points to really needing the GC to handle the delete/clear as one
> atomic operation, ie:
> 
> obj.Delete();

The phrase "one atomic operation" needs clarification.  Does "atomic" mean that "the gc cannot run until both the delete and clear have completed" -- if so, then obviously we cannot do that, because of the problem of storage-allocating destructors.

If you just mean that we need to encapsulate the pattern
  tmp = obj; obj = NULL; delete tmp;
in some reasonable way, then I'm good with that.
(In reply to comment #28)
> If you just mean that we need to encapsulate the pattern
>   tmp = obj; obj = NULL; delete tmp;
> in some reasonable way, then I'm good with that.

(But isn't that exactly what the patch for this ticket (attachment 505550 [details] [diff] [review]) is doing??)
Yep, we should push it!
Depends on: 629391
changeset: 5857:13af43aa28ce
user:      Tommy Reilly <treilly@adobe.com>
summary:   Bug 615490 - Set explicitly deleted GC object pointer to NULL (author=rlohani,r=lhansen,r=bgetlin,sr=lhansen)

http://hg.mozilla.org/tamarin-redux/rev/13af43aa28ce
Depends on: 630924
Flags: flashplayer-bug-
We've done what we're going to do.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: