Closed Bug 972861 Opened 10 years ago Closed 10 years ago

DrawTarget's destructor should never crash, and users should not need to assert on its refcount

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: bjacob, Unassigned)

Details

DrawTarget is a reference-counted class, but we have an assertion

  MOZ_ASSERT(mDrawTarget->refCount() == 1);

in TextureClient. It first appeared in TextureClientD3D11::Unlock(), and bug 968807 copies it in BufferTextureClient::Unlock().

Apparently, the reason for that is that DrawTargetD2D's destructor crashes, at least in debug builds, if, by the time it is called, something else is still locking the surface. Specifically, I was told that

DrawTargetD2D::~DrawTargetD2D()
{
  if (mRT) {  
    PopAllClips();

    mRT->EndDraw();

this EndDraw() call is what makes Direct2D calls that have a fatal assertion about the locking state of the surface.

I'm disturbed that this unfortunate detail of Direct2D has such deep consequences on the design of our own non-Direct2D-specific code, as to cause us to make an assertion on the refcount of DrawTarget, effectively changing DrawTarget's lifetime model from "just refcounting" to something custom.

Can we somehow make DrawTargetD2D's destructor never crash?

If there is indeed a fundamental reason why we can not, can we then prevent that from leaking badness onto users, forcing them to assert on the refcount of a refcounted object that they are using?

If we can't do anything about that, then please at least write a detailed explanation here before WONTFIXING. I can't imagine that this strange aspect of DrawTarget's lifetime model won't be haunting us, and it will then be at least be useful to have a place to direct people to to read an explanation.
(In reply to Benoit Jacob [:bjacob] from comment #0)
> DrawTarget is a reference-counted class, but we have an assertion
> 
>   MOZ_ASSERT(mDrawTarget->refCount() == 1);

Personally, I have no problem whatsoever with assertions like this, it's a valid form of verifying the state of affairs, which is exactly what assertions are meant for (and one of the very few valid uses of the refCount() function :)).

Does this assert mean that an object should be explicitly allocated and de-allocated rather than refcounted? Maybe. Objects are refcounted when they're used in situations where you want to be able to easily have multiple owners and not worry about who the last owner is. However an object like that can sometimes be used in a situation where you really -do- want to know you're the last owner.

Regardless of the assertion in Direct2D that is most certainly the case here. Since the texture client decides when the unlock happens, and knows that at that point the DrawTargets becomes completely unusable (and even dangerous to use), it's good that it can confirm that noone else is holding on to it to prevent someone outside of the class from accidentally using it.

This even has an advantage over implicit allocation, since if you'd be using explicit new/delete, a class accidentally holding a pointer to the DrawTarget would be dereferencing free'd memory if it'd use it anyway, causing a potential security bug. In this situation this would trigger this assertion earlier on in execution, making things much easier to debug.
(In reply to Benoit Jacob [:bjacob] from comment #0)
> DrawTarget is a reference-counted class, but we have an assertion
> 
>   MOZ_ASSERT(mDrawTarget->refCount() == 1);
> 
> in TextureClient. It first appeared in TextureClientD3D11::Unlock(), and bug
> 968807 copies it in BufferTextureClient::Unlock().
> 
> Apparently, the reason for that is that DrawTargetD2D's destructor crashes,
> at least in debug builds, if, by the time it is called, something else is
> still locking the surface. Specifically, I was told that

This is *one of the reasons* we have this assertion. We have a similar situation with GrallocTextureClientOGL.

I think during our skype meeting we got too focused on the specific case of ~DrawTargetD2D.
What happens here is that Moz2D is a drawing API, and not an API to share surfaces across processes.
For performance reasons it provides API's to create DrawTarget around already existing memory/textures under the following contract: "The DrawTarget is wrapping memory that it doesn't own, therefore the caller that created the draw target is responsible for keeping the memory available to the draw target". 

This is nothing new. It is like creating a gfxImageSurface around an existing buffer.

Now, in the very specific case of TextureClient, the memory must be shared with a compositor process. The way we do it with certain API's (D3D11, Gralloc) is that we make the memory accessible by locking it, and when we release the lock, it makes the memory inaccessible (as in, trying to access it crashes and rightfully so.).

TextureClient being the abstraction that creates the draw target around special memory, it knows when the memory is accessible and when it is not. DrawTarget, however, doesn't, because as far as it knows, this memory is just a pointer, or a D3D11 texture. It doesn't know that this texture will be unmapped, because it is a drawing api, and not a sharing api, and because we are using the "power user" feature of wrapping a DrawTarget around already existing memory, the nature of which the draw target doesn't fully know.

So things work as follows:
Someone wants to paint on a TextureClient. It locks it, and after checking that lock did not fail, *borrows* a DrawTarget the TextureClient.
After it is done drawing, it Unlocks the TextureClient.

At this point, if the user tries to draw after the texture has been unlocked, it is wrong. Wrong and should be punished with an assertion. That said, TextureClient is sitting above the DrawTarget and not below so it cannot punish the user for trying to use the draw target.

What it can do, however, is punish the user if he tries to keep a reference to the draw target longer than the duration of the lock, since there is no reason this reference should be held.

And that is the assertion Benoit points to at the beginning of his comment.

The problem that this assertion solve, is that some TextureClients (Gralloc, D3D11, ...) will crash if we try to draw after unlock, while some other (Shmem, Memory, ...) will not because the memory stays mapped in the process after unlock.
This means that a feature that is developed on a platform that uses shmem (such as Mac) will not be crashing if the texture not locked.

Typically the new ContentClient was developed without the constraint of locking correctly, and landed. It took me weeks to get it to work on platforms that are stricter about locking. So we strongly need to ensure that all platform crash the same way when the user is doing something wrong.

This assertion is making sure users of TextureClient/DrawTarget are not doing something they should not be doing.
I was missing the fact that this drawtarget is entirely internal to the TextureClient except between the Lock and Unlock calls.

That makes it sort of OK to assert that no outside references remain by the time Unlock is called. A problem, though, is that GetAsDrawTarget() returns a TemporaryRef, requiring the caller to assign that to a RefPtr, so, as the comment on GetAsDrawTarget() suggests, the only way for the user to avoid crashing is to enclose that in a scope that ends before the user calls Unlock. It would be nicer in this case for GetAsDrawTarget() to return a raw pointer instead of a TemporaryRef, as we agreed on the phone.

I tried doing that, but this runs into the problem that some GetAsDrawTarget implementations call Factory functions that return a TemporaryRef themselves, and there isn't really a clean way to get a raw pointer from that (returning a TemporaryRef AddRef'd the object, and it would be very ugly to pair that with a manual call to Release()).

I suppose that the only thing that can be done here, then, is to write a comment near the assertions referring to the comment near GetAsDrawTarget().
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Uploaded patch on bug 968807 adding such comments.
You need to log in before you can comment on or make changes to this bug.