Closed Bug 792663 Opened 7 years ago Closed 7 years ago

BasicBuffer holds on to AutoOpenSurface buffers when it shouldn't

Categories

(Core :: Graphics: Layers, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla18
blocking-basecamp +

People

(Reporter: cjones, Assigned: cjones)

References

Details

(Whiteboard: [LOE:S])

Attachments

(2 files)

nrc found this.  Easy fix.

It's a violation of the new invariants in this code, but I don't believe it would cause any problems in the field.  Needs to be fixed though.
An invariant in ThebesBuffers that are using mBufferProvider is, !mBufferProvider => !mBuffer.  When the code here was called outside of BasicShadowableThebesLayer::Paint(), then we don't use a buffer provider and there's no buffer tracker around the buffer set on the ThebesBuffer.  So we end up in a case where !mBufferProvider && mBuffer.
Comment on attachment 662796 [details] [diff] [review]
Don't SyncFrontBufferToBackBuffer() while a BufferTracker isn't around to revoke our buffer provider

Review of attachment 662796 [details] [diff] [review]:
-----------------------------------------------------------------

Can we assert that mLayer->mBufferTracker is non-null in SyncFrontBufferToBackBuffer? Also can we assert that mBuffer is null in ThebesLayerBuffer::SetBufferProvider if aProvider is non-null?
Attachment #662796 - Flags: review?(ncameron) → review+
nrc, was going to ping on IRC --- did this incidentally fix the leak you were seeing?
I ran all the crashtest-ipc and reftest-ipc tests locally and tested on a b2g device, but there were aborts on some mediastorage tests.  They're probably not using MOZ_LAYERS_FORCE_SHMEM_SURFACES, which is unfortunate because the shadow-layers code isn't in a shippable state on X11.  Oh well.  Backed out.

https://hg.mozilla.org/integration/mozilla-inbound/rev/a0b08694ac6d
(In reply to Chris Jones [:cjones] [:warhammer] from comment #6)
> nrc, was going to ping on IRC --- did this incidentally fix the leak you
> were seeing?

Yes, it did, or at least a version modified inline with the refactorings did. Thanks!
I wasn't able to reproduce the failure locally with bug 784244 applied.  Perhaps they need to be a pair.
tryserver for this and bug 784244

https://tbpl.mozilla.org/?tree=Try&rev=f58910e8053c

and an additional push that hacks off xlib surfaces, in case some xlib-specific problem is the cause of the crashes

https://tbpl.mozilla.org/?tree=Try&rev=6e80c32f87bd
Still broken on tryserver, but only on opt builds apparently.  I was testing debug locally.  That would be odd.

Still awaiting results with the patch that disables cross-process sharing of xlib surfaces.
Looks good so far.  Sorry xlib.
I know that there's a real bug here, but we're not shipping xlib shadow layers for a while and b2g is under a crunch.  We can hunt this down in a followup.

Will try to engineer a review steal while you enjoy your weekend :).
Attachment #663630 - Flags: review?(karlt)
This is a transitive blocker now.
blocking-basecamp: --- → +
Whiteboard: [LOE:S]
(In reply to Chris Jones [:cjones] [:warhammer] from comment #5)
> With requested assertions
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/1ba44c7c860a

(In reply to Chris Jones [:cjones] [:warhammer] from comment #7)
> I ran all the crashtest-ipc and reftest-ipc tests locally and tested on a
> b2g device, but there were aborts on some mediastorage tests.  They're
> probably not using MOZ_LAYERS_FORCE_SHMEM_SURFACES, which is unfortunate
> because the shadow-layers code isn't in a shippable state on X11.  Oh well. 
> Backed out.
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/a0b08694ac6d

The mochitest-2 failure at
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=1ba44c7c860a ,
the one marked with the backout changeset id, looks like bug 782505, but just happened to be during a different test.
The reason for that crash is described in bug 763449 comment 19.

Do you know why attachment 662796 [details] [diff] [review] would make that more common?

If there is a reason why switching to shmem surfaces would address that problem, then can you explain please?  Would DestroySharedSurface not destroy a shmem surface that is still in use by the other process?
(In reply to Karl Tomlinson (:karlt) from comment #15)
> Do you know why attachment 662796 [details] [diff] [review] would make that
> more common?
> 

I couldn't reproduce the crashes locally.  I think the patch just changes timing.  I don't have a guess beyond that.

> If there is a reason why switching to shmem surfaces would address that
> problem, then can you explain please?  Would DestroySharedSurface not
> destroy a shmem surface that is still in use by the other process?

Correct.  Each process has a "strong reference" to shmem through the mapped fd.
(In reply to Karl Tomlinson (:karlt) from comment #15)
> The mochitest-2 failure [...] looks like bug 782505, but
> just happened to be during a different test.

Actually it was the same test, so don't know why tbpl didn't suggest that bug.
Comment on attachment 663630 [details] [diff] [review]
Temporary require opting in to allocating cross-process xlib surfaces

r+ based on comment 17, but this is not a prerequisite for landing attachment 662796 [details] [diff] [review].  Bug 782505 is a common failure, and I don't think there's any evidence that attachment 662796 [details] [diff] [review] made it more common.  However this patch is still a sensible solution to deal with bug 782505.
Attachment #663630 - Flags: review?(karlt) → review+
Blocks: 782505
https://hg.mozilla.org/mozilla-central/rev/cf7824a17b2b
https://hg.mozilla.org/mozilla-central/rev/780a4b8551e6
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Blocks: 796182
No longer blocks: 796182
Depends on: 796182
You need to log in before you can comment on or make changes to this bug.