crash in mozilla::layers::SharedSurfaceTextureHost::Lock()

RESOLVED FIXED in Firefox 36

Status

()

Core
Graphics: Layers
--
critical
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: snorp, Assigned: snorp)

Tracking

({crash, topcrash, topcrash-android-armv7})

unspecified
mozilla38
All
Android
crash, topcrash, topcrash-android-armv7
Points:
---

Firefox Tracking Flags

(firefox35 unaffected, firefox36+ fixed, firefox37 fixed, firefox38 affected, fennec36+)

Details

(Whiteboard: [gfx-noted], crash signature, URL)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-5702e34b-a0e1-4684-85bf-fa9fa2150118.
=============================================================
I get this crash by clicking one of the retailers in the URL. I think it pans the embedded Google map.
[Tracking Requested - why for this release]: New signature for Firefox 36 with STR.
status-firefox35: --- → unaffected
status-firefox36: --- → affected
status-firefox37: --- → affected
status-firefox38: --- → affected
tracking-firefox36: --- → ?
Component: Graphics → Graphics: Layers
This appears to be android-specific, or at least it does not occur on Windows.
Whiteboard: [gfx-noted]
I understand this is a top crasher?  Can somebody with info confirm or deny?
Related to bug 1111889?
See Also: → bug 1111889
Yes this is a top crash in Firefox 36 beta 2, currently ranked #5. https://crash-stats.mozilla.com/report/list?product=FennecAndroid&signature=mozilla%3A%3Alayers%3A%3ASharedSurfaceTextureHost%3A%3ALock%28%29#tab-sigsummary
Keywords: topcrash-android-armv7
Created attachment 8556040 [details] [diff] [review]
Always deallocate SharedSurface on the client

This patch seems to fix it for me, and does appear to just enforce what the SharedSurfaceTexture host/client already assume.
Attachment #8556040 - Flags: review?(jgilbert)
Comment on attachment 8556040 [details] [diff] [review]
Always deallocate SharedSurface on the client

Adding :nical for review as well.
Attachment #8556040 - Flags: review?(nical.bugzilla)
Top crash, tracking!
tracking-firefox36: ? → +
Comment on attachment 8556040 [details] [diff] [review]
Always deallocate SharedSurface on the client

Review of attachment 8556040 [details] [diff] [review]:
-----------------------------------------------------------------

A patch that makes only the last texture synchronous landed in bug 1119019 and was uplifted recently. It should fix this issue with a much smaller performance penalty. I don't know if this made it into a beta build already (it was uplifted around the time the last build was made, a little before or a little after), but I would hold this one off until we know whether the patch that landed fixes the issue (which it should). If the problem persists in the next beta build (which will be made in a few hours), I'll r+ this (marking f- in the mean time).
Attachment #8556040 - Flags: feedback-
Actually, the fix I am talking about was included in the beta build from 2015/01/26 and crash-stats isn't showing this crash on beta builds from after 2015/01/20 so I think that we don't need this patch.
Duplicate of this bug: 1091677
I can reproduce this crash on Nightly with ease by following the STR in comment #2. With my patch, I can't.
Nical, I'm not sure I understand why only synchronously destroying the last texture client would fix this. If we destroy the client side (and the SharedSurface that it holds), we will have problems if we don't synchronously destroy it on the host -- for all textures, not just the last.

I guess we could ref count the SharedSurface, but I think some surface types must be deallocated on the client side anyway.
Flags: needinfo?(nical.bugzilla)
tracking-fennec: --- → ?
Assignee: nobody → snorp
tracking-fennec: ? → 36+
Keywords: topcrash
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #14)
> Nical, I'm not sure I understand why only synchronously destroying the last
> texture client would fix this. If we destroy the client side (and the
> SharedSurface that it holds), we will have problems if we don't
> synchronously destroy it on the host -- for all textures, not just the last.
> 
> I guess we could ref count the SharedSurface, but I think some surface types
> must be deallocated on the client side anyway.

The SharedSurface is kept by the CanvasClient 1 more frame after it's not used and then recycled in the factory in SharedSurface.cpp so in the general case it should live long enough (while writing this and reading through the code I see that there are other cases where it can be destroyed instead of being recycled), The moment where it is most likely to race is when clearing the CanvasClient because all shared surfaces are cleared at once or something like that.

Anyways, I thought we could get away with synchronizing only when clearing the CanvasClient. This is bad news because it means WebGL perf will suffer until we make some bigger changes to SurfaceStream (and TextureClient a bit).

Updated

3 years ago
Attachment #8556040 - Flags: review?(nical.bugzilla)
Attachment #8556040 - Flags: review+
Attachment #8556040 - Flags: feedback-

Updated

3 years ago
Flags: needinfo?(nical.bugzilla)
Ah, yeah, if we destroy the TC on every frame, that'll be a bad time. Ugh.
Is there a way to recognize this bad situation and report/assert/telemetry whatever, so that we know how much we're hitting it?
(In reply to Milan Sreckovic [:milan] from comment #17)
> Is there a way to recognize this bad situation and report/assert/telemetry
> whatever, so that we know how much we're hitting it?

We'll hit the synchronous ipc every time we update a SurfaceStream backed canvas. So pretty much constantly in games and pages that animate something in a canvas, never on pages that don't use SkiaGL-canvas or WebGL.
We talked about it in the last gfx meeting and decided we should take this patch now and work on a fix that will avoid the synchronous ipc calls in the medium term. I have an idea about how to fix this but it's not something I would implement in a day and uplift all the way to beta.
Attachment #8556040 - Flags: review?(jgilbert) → review+
Comment on attachment 8556040 [details] [diff] [review]
Always deallocate SharedSurface on the client

Approval Request Comment
[Feature/regressing bug #]: ?
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: none so far
[Risks and why]: fairly low risk, but does introduce a known performance penalty
[String/UUID change made/needed]: none
Attachment #8556040 - Flags: approval-mozilla-beta?
Attachment #8556040 - Flags: approval-mozilla-aurora?
Attachment #8556040 - Flags: approval-mozilla-beta?
Attachment #8556040 - Flags: approval-mozilla-beta+
Attachment #8556040 - Flags: approval-mozilla-aurora?
Attachment #8556040 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/b5e80541fef4
https://hg.mozilla.org/releases/mozilla-beta/rev/47e26f891d66
status-firefox36: affected → fixed
status-firefox37: affected → fixed
https://hg.mozilla.org/integration/mozilla-inbound/rev/dafe59b07492
https://hg.mozilla.org/mozilla-central/rev/dafe59b07492
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Depends on: 1108527
I think I just ran into this again visiting: http://jam3.github.io/three-bmfont-text/test/
https://crash-stats.mozilla.com/report/index/52a74be6-1af8-471e-a868-e5e142150917
You need to log in before you can comment on or make changes to this bug.