Crashes caused by SharedSurfaces outliving their GLContext.

RESOLVED FIXED

Status

()

Core
Graphics
RESOLVED FIXED
2 years ago
a year ago

People

(Reporter: nical, Assigned: jgilbert)

Tracking

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox45 wontfix, firefox46 wontfix, firefox47+ fixed, firefox48+ fixed, firefox-esr4547+ fixed)

Details

(Whiteboard: [gfx-noted])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(7 attachments, 1 obsolete attachment)

1.83 KB, patch
Details | Diff | Splinter Review
3.37 KB, patch
nical
: review+
Details | Diff | Splinter Review
1.88 KB, patch
nical
: review+
Details | Diff | Splinter Review
58 bytes, text/x-review-board-request
jrmuizel
: review+
Details | Review
58 bytes, text/x-review-board-request
jrmuizel
: review+
Details | Review
58 bytes, text/x-review-board-request
jrmuizel
: review+
Details | Review
58 bytes, text/x-review-board-request
jrmuizel
: review+
Details | Review
(Reporter)

Description

2 years ago
Destroying a GLContext causes the destruction of its ScreenBuffer, which causes the destruction of a SurfaceFactory object, which keeps some textures alive in its recycle pool. As the pool is destroyed, its textures are destroyed too in the process, and destroying these textures causes the destruction of SharedSurface objects. SharedSurface destruction involves some calls to the GLContext that is being destroyed (and is pretty much at the end of its destructor) which causes crashes a crash in MakeCurrent.

I ended up in this situation by clicking on the background webgl canvas with the dom inspector from the devtool in http://www.ro.me/ (not sure what is important and what isn't in these STR, but that's how I found this crash).
Whiteboard: [gfx-noted]
I've been taking a look at this crash on Fennec, and it looks to be canvas/surface recycling related after working back from the crash stacks.

The story seems to go:
 - A canvas is updated, and CanvasClientSharedSurface::Updated() is called. Before switching out the front TextureClient, it calls WaitForCompositorRecycle() on it. [1]

 - This causes the TextureChild to keep a reference to the TextureClient, and calls SendClientRecycle to the compositor. The compositor is expected to reply with SendCompositorRecycle once it is finished with the corresponding TextureHost, at which point the TextureChild derefs the TextureClient. [2]

 - Content finishes with the GLContext (the canvas is being destroyed or the context lost or...?) and it kills its GLScreenBuffer, which kills its SurfaceFactory. The SurfaceFactory *should* kill its entire pool of SharedSurfaceTextureClients by dereferencing them [3]. This is fine at this point because the GLContext hasn't fully died yet and we can still use its function table.

 - Some time later, RecvCompositorRecycle() is called on a TextureChild, and it clears the last reference to the TextureClient. At this point, the GLContext has already died so we crash.


[1] https://dxr.mozilla.org/mozilla-central/rev/4ea7408b3eef059aa248f4b00328f8fdb4475112/gfx/layers/client/CanvasClient.cpp#459
[2] https://dxr.mozilla.org/mozilla-central/rev/4ea7408b3eef059aa248f4b00328f8fdb4475112/gfx/layers/client/TextureClient.cpp#118
[3] https://dxr.mozilla.org/mozilla-central/rev/eb25b90a05c194bfd4f498ff3ffee7440f85f1cd/gfx/gl/SharedSurface.cpp#304
Assignee: nobody → edwin
Created attachment 8726308 [details] [diff] [review]
Patch to make this *way* easier to reproduce on Fennec
Created attachment 8726312 [details] [diff] [review]
1224199.patch

Fix. This patch simply gets rid of the waiting TextureClient reference in TextureChild when SurfaceFactory is killing its SharedSurfaceTextureClient pool.
Attachment #8726312 - Flags: review?(nical.bugzilla)
Created attachment 8726319 [details] [diff] [review]
1224199.patch

Fixes a UAF I noticed in the last patch just after uploading...
Attachment #8726312 - Attachment is obsolete: true
Attachment #8726312 - Flags: review?(nical.bugzilla)
Attachment #8726319 - Flags: review?(nical.bugzilla)
Created attachment 8726323 [details] [diff] [review]
1224199-unused.patch

Get rid of some unused recycling code. Not sure if it's meant to be used, but it certainly isn't at the moment.
Attachment #8726323 - Flags: review?(nical.bugzilla)
(Reporter)

Updated

2 years ago
Attachment #8726319 - Flags: review?(nical.bugzilla) → review+
(Reporter)

Updated

2 years ago
Attachment #8726323 - Flags: review?(nical.bugzilla) → review+
(Reporter)

Comment 6

2 years ago
AFAICT, the problem I described in the bug description still exists: Textures depend on the gl context, and the gl context frees textures from its destructor, wich means textures are themselfs calling into a half-dead gl context in their own destruction code. So marking leave-open, but it's cool that this patch improves on the current situation.
Keywords: leave-open

Comment 7

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/79214adeaeac
https://hg.mozilla.org/integration/mozilla-inbound/rev/b380eca3966c

Comment 8

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/79214adeaeac
https://hg.mozilla.org/mozilla-central/rev/b380eca3966c
(Reporter)

Comment 9

2 years ago
Carrying stats over from bug 1223810.
Jeff and Dan, I think this is your area. It's causing various crashes like bug 1223810 among other things. Is that something you can look into soon-ish?
Assignee: edwin → nobody
status-firefox45: --- → affected
status-firefox46: --- → affected
status-firefox47: --- → affected
status-firefox48: --- → affected
status-firefox-esr45: --- → affected
Flags: needinfo?(jgilbert)
Flags: needinfo?(dglastonbury)
Flags: needinfo?(milan)
Jeff should be able to take care of this, don't need Dan.
Flags: needinfo?(milan)
Flags: needinfo?(dglastonbury)
(Assignee)

Comment 11

2 years ago
I'll take this.
Assignee: nobody → jgilbert
(Reporter)

Updated

2 years ago
Flags: needinfo?(jgilbert)
(Assignee)

Comment 12

2 years ago
Created attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

From 320888354ebfbedc1f0509b5baf57861c66ab7ae Mon Sep 17 00:00:00 2001
---
 gfx/gl/GLContext.cpp | 12 ++----------
 gfx/gl/GLContext.h   |  2 --
 2 files changed, 2 insertions(+), 12 deletions(-)

Review commit: https://reviewboard.mozilla.org/r/46261/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/46261/
Attachment #8741181 - Flags: review?(jmuizelaar)
Attachment #8741182 - Flags: review?(jmuizelaar)
Attachment #8741183 - Flags: review?(jmuizelaar)
Attachment #8741184 - Flags: review?(jmuizelaar)
(Assignee)

Comment 13

2 years ago
Created attachment 8741182 [details]
MozReview Request: Null eagerly in MarkDestroyed. r?jrmuizel

From 5ee22edfd3cc4364806d7d1cf82e651760dbcbb9 Mon Sep 17 00:00:00 2001
---
 gfx/gl/GLBlitHelper.cpp         | 3 +++
 gfx/gl/GLContext.cpp            | 7 ++++---
 gfx/gl/GLReadTexImageHelper.cpp | 3 +++
 3 files changed, 10 insertions(+), 3 deletions(-)

Review commit: https://reviewboard.mozilla.org/r/46263/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/46263/
(Assignee)

Comment 14

2 years ago
Created attachment 8741183 [details]
MozReview Request: Assert IsCurrent in TextureGarbageBin::EmptyGarbage. r?jrmuizel

From 5bbe6aa209356262374466fbe91629285ba1c637 Mon Sep 17 00:00:00 2001
---
 gfx/gl/TextureGarbageBin.cpp | 1 +
 1 file changed, 1 insertion(+)

Review commit: https://reviewboard.mozilla.org/r/46265/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/46265/
(Assignee)

Comment 15

2 years ago
Created attachment 8741184 [details]
MozReview Request: Fortify SharedSurface dtors against MakeCurrent failure. r?jrmuizel

From 1f9516bc5b0a938fd26b32ae1892cd325e600752 Mon Sep 17 00:00:00 2001
---
 gfx/gl/SharedSurfaceANGLE.cpp        |  4 +++-
 gfx/gl/SharedSurfaceD3D11Interop.cpp |  7 +++++--
 gfx/gl/SharedSurfaceEGL.cpp          | 20 +++++++++++---------
 gfx/gl/SharedSurfaceGLX.cpp          |  8 ++++----
 gfx/gl/SharedSurfaceGralloc.cpp      |  6 +++---
 gfx/gl/SharedSurfaceIO.cpp           |  9 ++++-----
 6 files changed, 30 insertions(+), 24 deletions(-)

Review commit: https://reviewboard.mozilla.org/r/46267/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/46267/
Comment on attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

https://reviewboard.mozilla.org/r/46261/#review43077
Attachment #8741181 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8741182 [details]
MozReview Request: Null eagerly in MarkDestroyed. r?jrmuizel

https://reviewboard.mozilla.org/r/46263/#review43079
Attachment #8741182 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8741183 [details]
MozReview Request: Assert IsCurrent in TextureGarbageBin::EmptyGarbage. r?jrmuizel

https://reviewboard.mozilla.org/r/46265/#review43081
Attachment #8741183 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8741184 [details]
MozReview Request: Fortify SharedSurface dtors against MakeCurrent failure. r?jrmuizel

https://reviewboard.mozilla.org/r/46267/#review43083

It would be nice if had some way to cause MakeCurrent to fail from a test harness so that we could better test this code.
Attachment #8741184 - Flags: review?(jmuizelaar) → review+

Comment 20

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/22f36f2a2e4c
Tracking this as it seems like a good candidate for uplift to 47 once it lands in m-c and has some verification in 48.
status-firefox46: affected → wontfix
tracking-firefox47: --- → +
tracking-firefox48: --- → +

Comment 22

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/22f36f2a2e4c
(Assignee)

Updated

2 years ago
status-firefox45: affected → wontfix
status-firefox48: affected → fixed
(Assignee)

Updated

2 years ago
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Keywords: leave-open
Resolution: --- → FIXED
(Assignee)

Comment 23

2 years ago
Comment on attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

Approval Request Comment
[Feature/regressing bug #]: very old
[User impact if declined]: Crashes in some cases when using WebGL.
[Describe test coverage new/current, TreeHerder]: none
[Risks and why]: Low: This is very unlikely to make anything worse, but there's a chance it doesn't completely fix the issues here.
[String/UUID change made/needed]: none
Attachment #8741181 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 24

2 years ago
That approval request goes for all the patches that follow it as well.

The patches previous to mine are already in 47.
(Assignee)

Comment 25

2 years ago
I'm going to leave this open until we're done with uplifts.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

Crash fixes in WebGL, Aurora47+
Attachment #8741181 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Updated

2 years ago
Attachment #8741182 - Flags: approval-mozilla-aurora+

Updated

2 years ago
Attachment #8741183 - Flags: approval-mozilla-aurora+

Updated

2 years ago
Attachment #8741184 - Flags: approval-mozilla-aurora+
(Assignee)

Updated

2 years ago
status-firefox47: affected → fixed
(Assignee)

Comment 27

2 years ago
Comment on attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Occasional crashes when WebGL is GC'd or otherwise destroyed.
Fix Landed on Version: 47+
Risk to taking this patch (and alternatives if risky): Low: This is very unlikely to make anything worse, but there's a chance it doesn't completely fix the issues here.
String or UUID changes made by this patch: None

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8741181 - Flags: approval-mozilla-esr45?
Duplicate of this bug: 1272753
This patch also seemed to have stopped a crash from bug 1272753.  20 or so crashes in the ESR versions last week.  400 crashes total between all versions, stopping with 46.0.1.
(Assignee)

Comment 30

a year ago
[Tracking Requested - why for this release]:
It's a crash, and needed for bug 1223810.
tracking-firefox-esr45: --- → ?
Comment on attachment 8741181 [details]
MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel

Crash fix, landed already in 47, ok to uplift to esr45. 
Jeff, double checking, only this patch to uplift to esr?
Flags: needinfo?(jgilbert)
Attachment #8741181 - Flags: approval-mozilla-esr45? → approval-mozilla-esr45+
(Assignee)

Comment 32

a year ago
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #31)
> Comment on attachment 8741181 [details]
> MozReview Request: Bug 1224199 - Remove DestroyScreenBuffer. r?jrmuizel
> 
> Crash fix, landed already in 47, ok to uplift to esr45. 
> Jeff, double checking, only this patch to uplift to esr?

All patches that were landed on Central should be landed on ESR45.
Flags: needinfo?(jgilbert)
has problems to apply to esr45:

grafting 339897:22f36f2a2e4c "Bug 1224199 - Destroy SharedSurfaces before ~GLContext(). - r=jrmuizel"
merging gfx/gl/GLBlitHelper.cpp
merging gfx/gl/GLContext.cpp
merging gfx/gl/GLContext.h
merging gfx/gl/GLReadTexImageHelper.cpp
merging gfx/gl/SharedSurfaceANGLE.cpp
merging gfx/gl/SharedSurfaceD3D11Interop.cpp
merging gfx/gl/SharedSurfaceEGL.cpp
merging gfx/gl/SharedSurfaceGLX.cpp
merging gfx/gl/SharedSurfaceGralloc.cpp
merging gfx/gl/SharedSurfaceIO.cpp
warning: conflicts while merging gfx/gl/SharedSurfaceD3D11Interop.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use 'hg resolve' and 'hg graft --continue')
Tomcats-MacBook-Pro-2:mozilla-central Tomcat$
Flags: needinfo?(jgilbert)
milan can you help out here if Jeff isn't around? For this bug and bug 1223810.
Flags: needinfo?(milan)
(Assignee)

Comment 35

a year ago
I've got this.
Flags: needinfo?(milan)
(Assignee)

Comment 36

a year ago
Pulling in the fix from bug 1241484 seems important here, as part of rebasing.
Depends on: 1241484
Flags: needinfo?(jgilbert)
(Assignee)

Comment 37

a year ago
https://hg.mozilla.org/releases/mozilla-esr45/rev/39b745d6d352
https://hg.mozilla.org/releases/mozilla-esr45/rev/b24e1cc592ec
https://hg.mozilla.org/releases/mozilla-esr45/rev/dc190bd03d24
(Assignee)

Updated

a year ago
status-firefox-esr45: affected → fixed
(Assignee)

Comment 38

a year ago
That should be it!
Status: REOPENED → RESOLVED
Last Resolved: 2 years agoa year ago
Resolution: --- → FIXED
tracking-firefox-esr45: ? → 47+
Depends on: 1287444
You need to log in before you can comment on or make changes to this bug.