Closed Bug 1280110 Opened 4 years ago Closed 3 years ago

Crash in mozilla::layers::CompositingRenderTargetOGL::~CompositingRenderTargetOGL

Categories

(Core :: Graphics: Layers, defect, critical)

Unspecified
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 --- fixed
firefox50 --- fixed

People

(Reporter: njn, Assigned: nical)

References

Details

(Keywords: crash, Whiteboard: [gfx-noted])

Crash Data

Attachments

(4 files)

This bug was filed from the Socorro interface and is 
report bp-42d04018-7f31-4b70-9f16-8a29d2160507.
=============================================================

This crash has occurred 349 times in the past week, in versions 47--50. It's only on Mac, and the crash address is always zero.

jgilbert, any ideas?
Flags: needinfo?(jgilbert)
Component: Canvas: WebGL → Graphics: Layers
Flags: needinfo?(jgilbert) → needinfo?(nical.bugzilla)
Whiteboard: [gfx-noted]
Not a great solution, but i'd uplift this patch eyes closed.
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
Attachment #8763523 - Flags: review?(jnicol)
Return a null render target rather than creating one without a GL context, make sure the callers of CreateRenderTarget either null-check or are never called if the compositor is destroyed (which would be the reason mGLContext is null when creating the render target).
Attachment #8763524 - Flags: review?(sotaro.ikeda.g)
Attachment #8763523 - Flags: review?(jnicol) → review+
From the crash, it seems that there could be case that CompositorOGL::Destroy() is called before CompositingRenderTargetOGL destruction. From the source, when CompositorBridgeParent::ActorDestroy() is called without calling CompositorBridgeParent::RecvWillClose(), ClearCachedResources() is not called during the destruction.

Isn't it better to add calling the ClearCachedResources() in CompositorBridgeParent::ActorDestroy() like the RecvWillClose(), if mLayerManager is still valid, is it? It could ensure intermediate surfaces' CompositingRenderTarget destructions.

 https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorBridgeParent.cpp#673
Flags: needinfo?(nical.bugzilla)
(In reply to Sotaro Ikeda [:sotaro] from comment #3)
> Isn't it better to add calling the ClearCachedResources() in
> CompositorBridgeParent::ActorDestroy() like the RecvWillClose(), if
> mLayerManager is still valid, is it? It could ensure intermediate surfaces'
> CompositingRenderTarget destructions.

We should do both. My patch explicitly handles failure cases of this kind, while calling ClearCachedResource in ActorDestroy will remove one of the possible ways (the most likely one) to be in this situations, but I wouldn't be surprised that other ways to get there exist or will exist eventually.

I have a patch that implements your suggestion which I am about to upload. landing both patches (well, the three of them) makes the code more robust without adding significant complexity.
Flags: needinfo?(nical.bugzilla)
Attachment #8763524 - Flags: review?(sotaro.ikeda.g) → review+
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1db92044b8de
Null check mGL when destroying CompositingRenderTargetOGL. r=jnicol
Comment on attachment 8763523 [details] [diff] [review]
Simple null check

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: Some crashes (rather low volume).
[Describe test coverage new/current, TreeHerder]: None.
[Risks and why]: None. A trivial null-check before dereferencing the pointer.
[String/UUID change made/needed]: None.
Attachment #8763523 - Flags: approval-mozilla-beta?
Attachment #8763523 - Flags: approval-mozilla-aurora?
Attachment #8763849 - Flags: review?(sotaro.ikeda.g) → review+
Keywords: leave-open
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/042ea219692c
Ensure CompositingRenderTargetOGL is never created without a gl context. r=sotaro
https://hg.mozilla.org/integration/mozilla-inbound/rev/0da5b65544ce
Make sure compositor and layer resources are cleared when shutting down CompositorBridgeParent. r=sotaro
Status: NEW → RESOLVED
Closed: 4 years ago
Keywords: leave-open
Resolution: --- → FIXED
Comment on attachment 8763523 [details] [diff] [review]
Simple null check

Fix a crash, taking it.
Merci!

Should be in 48 beta 3
Attachment #8763523 - Flags: approval-mozilla-beta?
Attachment #8763523 - Flags: approval-mozilla-beta+
Attachment #8763523 - Flags: approval-mozilla-aurora?
Attachment #8763523 - Flags: approval-mozilla-aurora+
Maybe this is expected, but a crash with this signature is still happening. For instance:
  https://crash-stats.mozilla.com/report/index/e7b26517-8e56-4c96-a0c4-ca2822160701
Ah, we forgot to clean up ContainerLayerComposite::mPrepared in ContainerLayerComposite::CleanupResources().
Flags: needinfo?(nical.bugzilla)
Yes indeed, patch incoming!
Status: RESOLVED → REOPENED
Flags: needinfo?(nical.bugzilla)
Resolution: FIXED → ---
Comment on attachment 8767595 [details] [diff] [review]
Cleanup ContainerLayerComposite's intermediate targets in ClearCachedResources.

Looks good!
Attachment #8767595 - Flags: review?(sotaro.ikeda.g) → review+
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/093656bb1702
Cleanup ContainerLayerComposite's intermediate render targets during shutdown. r=sotaro
https://hg.mozilla.org/mozilla-central/rev/093656bb1702
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Comment on attachment 8767595 [details] [diff] [review]
Cleanup ContainerLayerComposite's intermediate targets in ClearCachedResources.

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: Some crashes.
[Describe test coverage new/current, TreeHerder]: None.
[Risks and why]: Rather low. Too late for beta, maybe (unless the crash volume is significant), but simple enough to take in aurora without second thoughts. 
[String/UUID change made/needed]: None.
Attachment #8767595 - Flags: approval-mozilla-aurora?
5 crashes or so per week on beta versions, and we are heading into beta 6 build tonight. So I think this is ok to keep to 49.
Marking this affected so that it will show up on sheriff's queries as needing to land on aurora.
Comment on attachment 8767595 [details] [diff] [review]
Cleanup ContainerLayerComposite's intermediate targets in ClearCachedResources.

Crash fix cleanup, ok to uplift to aurora.
Attachment #8767595 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Hi Nicolas,
Do you consider to uplift this patch in 48 beta 7 if this is not too risky?
Flags: needinfo?(nical.bugzilla)
(In reply to Gerry Chang [:gchang] from comment #26)
> Hi Nicolas,
> Do you consider to uplift this patch in 48 beta 7 if this is not too risky?

We could if the crash volume warrants it. But quoting liz:

(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #22)
> 5 crashes or so per week on beta versions, and we are heading into beta 6
> build tonight. So I think this is ok to keep to 49.

The patch is not very risky but there isn't a lot of time left for beta to bake. At this point it is a release management question more than an engineering one.
Flags: needinfo?(nical.bugzilla)
Ok, let's ride the train on 49/50 and won't fix in 48.
Duplicate of this bug: 1277754
The crash is still happening, bp-1d5ab4c7-eced-4faf-ae5d-446e62160822.
Flags: needinfo?(nical.bugzilla)
(In reply to Ting-Yu Chou [:ting] from comment #30)
> The crash is still happening, bp-1d5ab4c7-eced-4faf-ae5d-446e62160822.

Let's reopen, then. Only 3 crashes since the patch landed, so it's fairly low priority for me right now.
Status: RESOLVED → REOPENED
Flags: needinfo?(nical.bugzilla)
Resolution: FIXED → ---
Blocks: 1297390
(In reply to Nicolas Silva [:nical] from comment #31)
> (In reply to Ting-Yu Chou [:ting] from comment #30)
> > The crash is still happening, bp-1d5ab4c7-eced-4faf-ae5d-446e62160822.
> 
> Let's reopen, then. Only 3 crashes since the patch landed, so it's fairly
> low priority for me right now.

I've just created bug 1297390 as a new one - I've been reproducing it running activity stream tests.

I would recommend leaving this one as fixed, as its already shipped on non-nightly versions, and it makes the tracking awkward if there's a second round of uplifts.
Status: REOPENED → RESOLVED
Closed: 4 years ago3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.