Closed Bug 574481 Opened 10 years ago Closed 10 years ago

Fix ownership model and lifetimes of nsWindow, its layer manager, its GLContext, managed layers, and their textures

Categories

(Core :: Graphics, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- final+

People

(Reporter: cjones, Assigned: vlad)

References

Details

Attachments

(6 files, 1 obsolete file)

See bug 573929 for a problem arising when the X window for fullscreen video dies while there are still live references to the layer manager's GLContext.  Trying to use the context for freeing textures makes for crashy.

This is likely a more general problem than just on X platforms, and getting it right will probably be hard, so this bug is spun off to get the known crashers off the critical path to beta1.
There are a few related problems here:
A) how to clean up GL resources owned by content when a window might have gone away but the content persists
B) currently Layers and ImageContainers can stay alive longer than the LayerManagers they reference
C) special case of B: reparenting <video> elements from one window to another, we need to be able to reparent the ImageContainer to the new window without interrupting playback

Here's a proposal:

1) Make all our GL contexts share resources. Then we can have a single global GL context that we use to clean up all resources.
2) Add a requirement to the layers API that outside a transaction, all layers for a given layer manager are in the layer tree for that manager.
3) Make the layer->layerManager pointer weak, in the sense that destroying the layer manager will null out the pointer. There can also be per-layer cleanup code executed when the layer manager goes away. Since the layer manager can't die during a transaction, it's enough to just walk the layer tree to find the layers that need to be notified.
4) Make the ImageContainer->LayerManager pointer weak too. We'll need to keep a list of ImageContainers in the LayerManager to notify them, but I think that's fine. ImageContainers will be required to keep working even after the LayerManager is gone.
5) Add an ImageContainer::SetLayerManager API to attach an ImageContainer to a new layer manager. It's allowed to fail, e.g. if the new layer manager is the wrong type. However, that would be a weird uncommon case.
6) Make nsVideoFrame painting check to see if the ImageContainer has the right layer manager. If it doesn't, try to set it with SetLayerManager. If that fails, we'll keep using the ImageContainer with the old LayerManager, which may in fact be null (if that LayerManager went away). That could be slow but things should still work.

Note on point 4: BasicImageContainer doesn't use its layer manager so it's no problem to keep working when the layer manager is gone. If we share GL or D3D resources across contexts then we can keep using a global context to keep the ImageContainer working. If we can't use resources after the original LayerManager is dead, then we could resort to software fallback at least temporarily; it could work something like this:
-- ImageContainer keeps a list of the live Images that it owns
-- when the LayerManager goes away, recover the contents of each of the Images from textures into main memory
-- while there is no LayerManager, behave like a BasicImageContainer doing everything in main memory
That's complicated, but hopefully we won't have to actually do it. The important thing is that we can set up the APIs according to points 1-6 and then if we find we have problems sharing resources, we can recover by implementing a scheme like this, without breaking the design.
blocking2.0: --- → ?
blocking2.0: ? → final+
Chris, are you the right person for this bug?
Assignee: nobody → jones.chris.g
Vlad's already working on this.
Assignee: jones.chris.g → vladimir
Attached patch checkpoint (obsolete) — Splinter Review
just posting this patch in progress; it fixes a number of crashes, though there are still problems remaining.  Specifically, video-in-browser with MOZ_ACCELERATED=1 doesn't work (fullscreen video works fine), and I still have to implement the ImageContainer layer manager handoff.

For the ImageContainer handoff, what's a good way to test that?  I'm not sure when/where that would actually happen currently.
Drag a tab containing video from one window to another?
This adds a layer lifetime API to the layers interfaces.

The intent is that when Destroy is called on a layer manager, it will walk its layer chain and call Destroy on each layer.  The Layer Manager and Layers are still alive after that, but it's invalid to do anything to them other than to access user data on the layer manager.  Perhaps we should make it so that doing anything is a noop.

This also adds a SetLayerManager API to ImageContainer, for switching owner layer managers.
Attachment #459958 - Attachment is obsolete: true
Attachment #460346 - Flags: review?(roc)
Add some useful things to GLContext, including a flag indicating whether a context is the global share 'root', a helper for reading a texture as a gfxImageSurface, and removing the now-not-needed CreateTexture/DeleteTexture/WindowDestroyed API.  Also adds a MarkDestroyed() which explcitly nulls out all GL function params, so that any misuses of a dead GLContext can get caught as null deref crashes.
Attachment #460347 - Flags: review?(bas.schouten)
I don't think anything special needs to happen for basic & d3d layers here
Attachment #460349 - Flags: review?(bas.schouten)
Teach nsVideoFrame to call SetLayerManager in an attempt to swap the ImageContainer's manager.  Also only treat the layer as inactive if the basic layer manager is in play, not for other manager types.
Attachment #460351 - Flags: review?(roc)
Comment on attachment 460348 [details] [diff] [review]
part 3, implement lifetime functions in GL layers

This looks mostly OK to me except for two things.  First, I'm not convinced we need mDestroyed fields for all the *LayerOGLs.  While working on bug 570620, which implements something similar for IPC layers, I found that the layers had existing field(s) that marked them as dead after Destroy().  For example, |if (!mBuffer)| rather than |if (mDestroyed)|.  I think that style will be cleaner in the long run, because it unifies after-Destroy() no-ops with failed-resource-allocation no-ops, for example.  The second thing that bugged me was the |if (destroyed)| checks in *LayerOGL::RenderLayer(), which can only be called through LayerManagerOGL::Render() (right?).  Maybe replace these with |NS_ASSERTION(!mManager->IsDestroyed()|?  Or just drop.

I don't feel qualified to review the ImageContainer code.  Will wait for response to above before doing a detailed review.
Comment on attachment 460346 [details] [diff] [review]
part 1, add lifetime management API to layers interface

+  PRBool mDestroyed;

PRPackedBool
I put a patch in bug 582467 which duplicates the last part of part 3. I'll cancel my patch. Bug 582416 has a patch that duplicates the second part of part 3, I'll cancel that too.
Vlad, why are you removing the fTexSubImage2D path in SetupPlaneTexture? That was a significant performance win when I added it on Mac.
(In reply to comment #15)
> Vlad, why are you removing the fTexSubImage2D path in SetupPlaneTexture? That
> was a significant performance win when I added it on Mac.

Hmm, interesting -- I dropped those because in /theory/ they should have had no effect, or been slower, at least on mobile; but I guess on OSX the drivers always reallocate memory even if they don't need to?  Ugh.  I'll add it back.
(Well, on my crappy ATI system they do)
Attachment #460347 - Flags: review?(bas.schouten) → review+
(In reply to comment #12)
> Comment on attachment 460348 [details] [diff] [review]
> part 3, implement lifetime functions in GL layers
> 
> This looks mostly OK to me except for two things.  First, I'm not convinced we
> need mDestroyed fields for all the *LayerOGLs.  While working on bug 570620,
> which implements something similar for IPC layers, I found that the layers had
> existing field(s) that marked them as dead after Destroy().  For example, |if
> (!mBuffer)| rather than |if (mDestroyed)|.  I think that style will be cleaner
> in the long run, because it unifies after-Destroy() no-ops with
> failed-resource-allocation no-ops, for example.

That's fair, though it doesn't give an easy way to ask if a layer has actually been destroyed.. since that then becomes up to each layer to decide.  Maybe that's not a problem

> The second thing that bugged
> me was the |if (destroyed)| checks in *LayerOGL::RenderLayer(), which can only
> be called through LayerManagerOGL::Render() (right?).  Maybe replace these with
> |NS_ASSERTION(!mManager->IsDestroyed()|?  Or just drop.

Hm, yeah, I guess LayerOGL is private so noone other than LayerManagerOGL can call RenderLayer.  I'll just drop it.
Comment on attachment 460349 [details] [diff] [review]
part 4, stub out lifetime functions in other layer impls

As per IRC the D3D9 method should always return FALSE for now.
Attachment #460349 - Flags: review?(bas.schouten) → review+
Comment on attachment 460348 [details] [diff] [review]
part 3, implement lifetime functions in GL layers

Looks good to me, we should make sure to note that GetCurrentAsSurface can be really slow on OGL.
Attachment #460348 - Flags: review?(bas.schouten) → review+
Note for those following along: I'm trying to land this, but I'm running into PPC-only MacOS X compile breakage due to something with <exception> and objc exceptions and @try and #define try and other awfulness.
Looks like this got landed, I had to add THEBES_API to GLContext::MarkHistory to get shared builds to work again:

http://hg.mozilla.org/mozilla-central/rev/22ebb1687a52
Looks like bug 604044 is caused by code introduced by these commits: we're crashing on exit because a GL context is marked as destroyed in nsWindow::Destroy():

        if (gl) {
            gl->MarkDestroyed();
        }

and in GLTexture::Release() we're still trying to call fDeleteTextures which is still a null function pointer.

Continuing discussion in bug 604044.
Blocks: 604044
Attachment #460348 - Flags: review?(jones.chris.g)
You need to log in before you can comment on or make changes to this bug.