Closed
Bug 1298058
Opened 9 years ago
Closed 9 years ago
Crash in mozilla::layers::BasicCompositor::DrawQuad
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla51
People
(Reporter: milan, Assigned: jerry)
References
(Blocks 1 open bug)
Details
(Keywords: crash, Whiteboard: [gfx-noted])
Crash Data
Attachments
(1 file, 4 obsolete files)
This bug was filed from the Socorro interface and is
report bp-f12bfd98-be91-47c0-89ca-d15ef2160824.
=============================================================
Follow up on bug 1187464. These crashes should only show up in nightly and aurora, but this particular crash has some information we didn't have last time - we are in fact in the basic compositor with DataTextureSourceD3D11, so something is out of sync after a device reset.
| Reporter | ||
Updated•9 years ago
|
Whiteboard: [gfx-noted]
A quick look suggests that the media decoder is shipping D3D11 frames after a device reset. We're supposed to protect against this by destroying layers and barring old or incompatible textures from getting attached in CompositableTransactionParent.
But something's not working right, the texture is still leaking into the D3D11 compositor.
Comment 2•9 years ago
|
||
Crash volume for signature 'mozilla::layers::BasicCompositor::DrawQuad':
- nightly (version 51): 10 crashes from 2016-08-01.
- aurora (version 50): 4 crashes from 2016-08-01.
- beta (version 49): 21 crashes from 2016-08-02.
- release (version 48): 25 crashes from 2016-07-25.
- esr (version 45): 1975 crashes from 2016-05-02.
Crash volume on the last weeks (Week N is from 08-22 to 08-28):
W. N-1 W. N-2 W. N-3
- nightly 4 0 1
- aurora 1 1 1
- beta 8 3 3
- release 12 9 1
- esr 204 217 178
Affected platforms: Windows, Mac OS X, Linux
Crash rank on the last 7 days:
Browser Content Plugin
- nightly #110
- aurora #1573
- beta #1506
- release #1812
- esr #37
status-firefox48:
--- → affected
status-firefox49:
--- → affected
status-firefox50:
--- → affected
status-firefox-esr45:
--- → affected
| Reporter | ||
Comment 3•9 years ago
|
||
(In reply to David Anderson [:dvander] from comment #1)
> A quick look suggests that the media decoder is shipping D3D11 frames after
> a device reset. We're supposed to protect against this by destroying layers
> and barring old or incompatible textures from getting attached in
> CompositableTransactionParent.
>
> But something's not working right, the texture is still leaking into the
> D3D11 compositor.
To be clear, in CompositableTransactionParent, https://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositableTransactionParent.cpp#80, we don't need to worry about early return if compositor->GetCompositor() is null? Is it possible we have a valid compositor, but it's already the "new one"?
If the compositable hasn't been attached to any compositor yet, the compositor will be null. The attach happens in a separate operation, which we also try to guard against:
https://dxr.mozilla.org/mozilla-central/rev/7963ebdd52b93f96b812eff2eab8d94097147b9c/gfx/layers/ipc/LayerTransactionParent.cpp#625
But maybe this is not robust enough. Possibly we should attach a device counter to all layer transactions so they can be verified against the compositor.
(On a side note, wow that's a super confusing check - "IsValid" actually returns whether it's invalid. There must have been a refactoring gone wrong on my part... will fix.)
| Assignee | ||
Updated•9 years ago
|
Assignee: nobody → hshih
Status: NEW → ASSIGNED
| Assignee | ||
Comment 5•9 years ago
|
||
If we set the basicCompositor at this line:
https://hg.mozilla.org/mozilla-central/annotate/1a5b53a831e5a6c20de1b081c774feb3ff76756c/gfx/layers/composite/ImageHost.cpp#l331
The textureHost::Lock() should failed at this line:
https://hg.mozilla.org/mozilla-central/annotate/1a5b53a831e5a6c20de1b081c774feb3ff76756c/gfx/layers/composite/ImageHost.cpp#l335
So I still can't know why there is a path for BasicCompositor::DrawQuad with d3d11TextureHost.
Let me digging more.
| Assignee | ||
Comment 6•9 years ago
|
||
| Assignee | ||
Comment 7•9 years ago
|
||
Oh. I found the problem! \o/
This change break the protection of driver removed. The dxr doesn't show the latest code before.
https://hg.mozilla.org/mozilla-central/diff/7b1349cb7487/gfx/layers/d3d11/TextureD3D11.cpp#l736
Let me think about how to fix this.
| Assignee | ||
Updated•9 years ago
|
Attachment #8786284 -
Attachment is obsolete: true
| Assignee | ||
Comment 8•9 years ago
|
||
Hi Kip,
I still can't find the usage for the removing of compositor checking:
https://hg.mozilla.org/mozilla-central/diff/7b1349cb7487/gfx/layers/d3d11/TextureD3D11.cpp#l736
I try to search the lock() call usage in that webvr implementations, but I can't see that.
Could you please tell more about that?
TextureHostD3D11 uses that checking to prevent the mismatched textureHost and compositor pair(e.g. d3d11Texture and basicCompositor). If we remove the checking, we might hit an assert like bp-f12bfd98-be91-47c0-89ca-d15ef2160824.
Flags: needinfo?(kgilbert)
Comment 9•9 years ago
|
||
Given that we've had these crashes before and it's regressed, then it might be worth trying to think about a way to add tests for this.
I guess we'd need to fake a error that causes us to recreate the compositor and fallback to Basic. Maybe we could add something to the PCompositor protocol that forces a compositor recreation and tries to composite with the current layer tree?
Could be difficult, but seems like a nice thing to have if it's possible.
Comment 10•9 years ago
|
||
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #8)
> Hi Kip,
> I still can't find the usage for the removing of compositor checking:
> https://hg.mozilla.org/mozilla-central/diff/7b1349cb7487/gfx/layers/d3d11/
> TextureD3D11.cpp#l736
> I try to search the lock() call usage in that webvr implementations, but I
> can't see that.
> Could you please tell more about that?
>
> TextureHostD3D11 uses that checking to prevent the mismatched textureHost
> and compositor pair(e.g. d3d11Texture and basicCompositor). If we remove the
> checking, we might hit an assert like
> bp-f12bfd98-be91-47c0-89ca-d15ef2160824.
In the case of WebVR, we do not use a compositor object at all and VR is not supported without DX11 and a discrete GPU. In this case mCompositor will always be null. The VR equivalent currently uses DX11 calls only. Perhaps we could come up with a higher level interface that could be shared between the 2d compositor types and the VR compositing functions? Alternately, would it help to have a boolean set by the VR case to inform TextureHost that mCompositor is expected to be null?
Flags: needinfo?(kgilbert)
| Reporter | ||
Comment 11•9 years ago
|
||
If there is a special case for "mCompositor is null and that's OK", it sounds like a boolean to tag that case would be a way to go.
| Assignee | ||
Comment 12•9 years ago
|
||
There are to cases for video update.
a) If the image compositable hasn't been attached, the attachment will be done at:
https://dxr.mozilla.org/mozilla-central/rev/7963ebdd52b93f96b812eff2eab8d94097147b9c/gfx/layers/ipc/LayerTransactionParent.cpp#625
and we have a guard checking for driver-reset.
b) If the image compositable is already attached and chrome side just receives the new textureHosts, the textureHosts will be set at:
https://dxr.mozilla.org/mozilla-central/rev/3ba5426a03b495b6417fffb872d42874edb80855/gfx/layers/ipc/CompositableTransactionParent.cpp#80
We also have a driver-reset guard checking here, too.
For case b, bug 1298501 fix the typo for that invalid compositor checking. Before that change, I think that's the reason why we still have the combination of basicCompositor and d3d11TextureHost.
I add some patch in textureHost::Lock() for the last-ditch driver-reset checking before. Maybe that changes could reverted after bug 1298501.
Depends on: 1298501
| Comment hidden (obsolete) |
| Assignee | ||
Comment 14•9 years ago
|
||
After bug 1298501, maybe there will be no incompatible textureHost/compositor pair later.
However, I think we still could have a last-ditch checking before the DrawQuad() call.
This patch is try to fix the checking which we had bfore in WebVR case.
Attachment #8787544 -
Flags: review?(nical.bugzilla)
Attachment #8787544 -
Flags: review?(milan)
Attachment #8787544 -
Flags: review?(dvander)
| Assignee | ||
Updated•9 years ago
|
Attachment #8787542 -
Attachment is obsolete: true
Attachment #8787542 -
Flags: review?(nical.bugzilla)
Attachment #8787542 -
Flags: review?(milan)
Attachment #8787542 -
Flags: review?(dvander)
| Reporter | ||
Comment 15•9 years ago
|
||
Comment on attachment 8787544 [details] [diff] [review]
use different lock/unlock function for WebVR composition path. v2
Review of attachment 8787544 [details] [diff] [review]:
-----------------------------------------------------------------
Overall, this is fine. Minor changes to consider:
Naming things - while the "I don't need the compositor" is currently (only) used by VR, it seems that at the level of TextureHost, the distinction is LockWithoutCompositor and UnlockWithoutCompositor or some such, rather than LockVR/UnlockVR. TextureHost doesn't and shouldn't know about VR, so polluting its function names may be better avoided. In other words, name things for what they do, not how they're currently used.
The AutoLock... classes are small enough that it probably doesn't make sense to have a base and derived classes, but if we start adding more things in there, it would be something to consider.
Attachment #8787544 -
Flags: review?(milan) → review+
| Assignee | ||
Comment 16•9 years ago
|
||
The comment 14 is wrong.
The behavior of [1] in bug 1298501 is eventually the same as before. So, it's still a path go with an incompatible textureHost/compositor pair.
[1]
https://bug1298501.bmoattachments.org/attachment.cgi?id=8785454
| Assignee | ||
Comment 17•9 years ago
|
||
refine the comment and function naming for comment 15.
| Assignee | ||
Updated•9 years ago
|
Attachment #8787544 -
Attachment is obsolete: true
Attachment #8787544 -
Flags: review?(nical.bugzilla)
Attachment #8787544 -
Flags: review?(dvander)
| Assignee | ||
Comment 18•9 years ago
|
||
fix the typo of lock/unlock function prototypes in textureHost.h.
| Assignee | ||
Updated•9 years ago
|
Attachment #8788081 -
Attachment is obsolete: true
| Assignee | ||
Comment 19•9 years ago
|
||
Please land the attachment 8788090 [details] [diff] [review] to m-c.
try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=575b2a936380
Keywords: checkin-needed
Comment 20•9 years ago
|
||
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d64ac5fa8914
Use different lock/unlock function for WebVR composition path. r=milan
Keywords: checkin-needed
Comment 21•9 years ago
|
||
| bugherder | ||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment 22•9 years ago
|
||
Not a very big volume but we could take an uplift to 50. Thanks
Flags: needinfo?(hshih)
| Assignee | ||
Comment 23•9 years ago
|
||
In comment 7, the change
https://hg.mozilla.org/mozilla-central/diff/7b1349cb7487/gfx/layers/d3d11/TextureD3D11.cpp#l736
is in firefox51. So, I think this patch doesn't need to uplift.
Sylvestre, do you see the similar drawQuad() crash report in ff50?
Flags: needinfo?(hshih) → needinfo?(sledru)
Comment 24•9 years ago
|
||
On:
https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3Alayers%3A%3ABasicCompositor%3A%3ADrawQuad
I can see 4 crashes over the last 7 days.
We see it on many different older version (44, 45, etc)
Maybe this is a different issue ?!
Flags: needinfo?(sledru)
| Assignee | ||
Comment 25•9 years ago
|
||
ff50 crash:
https://crash-stats.mozilla.com/report/index/8f7630ca-37fe-4c31-b4c1-441372160908
https://crash-stats.mozilla.com/report/index/1d8c6d46-ed82-4d78-8c2f-8b9d52160907
https://crash-stats.mozilla.com/report/index/afd3d541-846b-49fc-b53d-83e572160907
These three are crashed at https://hg.mozilla.org/mozilla-unified/annotate/3a0fd1bb116a/gfx/layers/basic/BasicCompositor.cpp#l522 which is different from this bug.
Will have a bug to track this.
But this https://crash-stats.mozilla.com/report/index/01d71a7e-b9f5-40f4-ade6-7ddd22160905 is weird. I think we should not have this crashed path. I will check this, either.
For ff48, I have a quick look for these log.
https://crash-stats.mozilla.com/report/index/125e2231-d423-4295-a066-6d3462160908
https://crash-stats.mozilla.com/report/index/79564e0a-8a2f-4b82-8879-c27f92160906
https://crash-stats.mozilla.com/report/index/edfe96a9-468e-4413-9c64-05a082160905
are not related to this and the fix of bug 1187466.
Let me look more about these logs.
Flags: needinfo?(sledru)
Comment 26•9 years ago
|
||
OK, thanks. Are you going to report a new bug for the other? thanks
Flags: needinfo?(sledru)
50 is unaffected based on comment 25.
Comment 28•9 years ago
|
||
Crash volume for signature 'mozilla::layers::BasicCompositor::DrawQuad':
- nightly (version 52): 2 crashes from 2016-09-19.
- aurora (version 51): 0 crashes from 2016-09-19.
- beta (version 50): 29 crashes from 2016-09-20.
- release (version 49): 179 crashes from 2016-09-05.
- esr (version 45): 2546 crashes from 2016-07-25.
Crash volume on the last weeks (Week N is from 10-17 to 10-23):
W. N-1 W. N-2 W. N-3 W. N-4
- nightly 0 2 0 0
- aurora 0 0 0 0
- beta 2 10 9 3
- release 66 57 36 7
- esr 283 248 296 312
Affected platform: Windows
Crash rank on the last 7 days:
Browser Content Plugin
- nightly
- aurora
- beta #2314 #4182
- release #841
- esr #33
status-firefox52:
--- → affected
Comment 29•8 years ago
|
||
Sounds like nothing useful is going to come from this bug for 52 at this point.
You need to log in
before you can comment on or make changes to this bug.
Description
•