Closed
Bug 1397040
Opened 7 years ago
Closed 7 years ago
Increase in Intel graphics driver crashes during 56.0b
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla57
People
(Reporter: philipp, Assigned: bas.schouten)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
59 bytes,
text/x-review-board-request
|
mattwoodrow
:
review+
lizzard
:
approval-mozilla-beta+
|
Details |
crash stats shows that there are a number of signatures with intel graphics drivers in the browser process that are on the rise in firefox 56.0b again. those same signatures where already spiking back in 55.0b until we implemented a blocklisting mechanism for 6th gen intel gpus from using D3D11 keyed mutex in bug 1359416. this new spike in 56.0b is affecting the same devices again - has something changed again in this regard since 55? Adapter device id facet 1 0x0102 718 39.02 % 2 0x0116 533 28.97 % 3 0x0106 223 12.12 % 4 0x0126 195 10.60 % 5 0x0166 66 3.59 % 6 0x0152 21 1.14 % 7 0x0122 16 0.87 % 8 0x0112 15 0.82 % 9 0x0046 13 0.71 % 10 0x0156 13 0.71 % 11 0x2a42 10 0.54 % Adapter driver version facet 1 9.17.10.4459 1026 54.69 % 2 8.15.10.2418 595 31.72 % 3 9.17.10.4229 90 4.80 % 4 8.15.10.2353 29 1.55 % 5 9.17.10.2843 28 1.49 % 6 8.15.10.2900 14 0.75 % 7 9.17.10.2770 14 0.75 % 8 9.17.10.2828 12 0.64 % crash stats search link for affected signatures: http://bit.ly/2f0L7M6 movement of intel signatures between 55.0b99 and 56.0b: https://mozilla.github.io/stab-crashes/scomp.html?common=product%3DFirefox%26signature%3D^igd%26date%3D%3E2017-06-01&p1=version%3D55.0b99&p2=version%3D56.0b
Flags: needinfo?(milan)
We have the same blocklisting code in 56 as it went out in 55, at least in the source code, so this is probably a different thing. About a third of the crashes report a device reset, which really means that the stack is not that helpful - the culprit is whatever code caused the device reset, rather than where we eventually crash trying to recover from it. More than a half seem to come from VideoData::CreateAndCopyData/FFmpegVideoDecoder, and I think there were recent changes, but I'm not sure if they were in 56. Jean-Yves would know.
Flags: needinfo?(milan) → needinfo?(jyavenard)
Updated•7 years ago
|
Flags: needinfo?(ajones)
Comment 2•7 years ago
|
||
This is likely related to the direct upload of the YUV buffer into a D3D11 texture, which was added in bug 1223270. May have to disable that change :(
Flags: needinfo?(jyavenard)
Assignee | ||
Comment 4•7 years ago
|
||
I have an idea, the vast majority of these crashes happen when the main thread is waiting for the internal device lock in https://hg.mozilla.org/releases/mozilla-beta/annotate/dcb3e24852c4/gfx/layers/d3d11/TextureD3D11.cpp#l463 and the video thread is inside UpdateSubResource. (There's a few exceptions but not many) During our video thread uploads we already enter the ID3D10Multithread mutex. I bet we can avoid a lot of the crashes by explicitly locking that mutex while we call CreateTexture2D at the other callsite as well, I will provide a patch, that at the very least will be harmless, but will probably solve most of the issue.
Flags: needinfo?(bas)
Comment hidden (mozreview-request) |
Comment 6•7 years ago
|
||
mozreview-review |
Comment on attachment 8905232 [details] Bug 1397040: During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. https://reviewboard.mozilla.org/r/177024/#review182026 ::: gfx/layers/d3d11/TextureD3D11.cpp:490 (Diff revision 1) > uploadDataPtr = &uploadData; > } > > + // See bug 1397040 > + RefPtr<ID3D10Multithread> mt; > + device->QueryInterface((ID3D10Multithread**)getter_AddRefs(mt)); Would be nice if we had an RAII helper for the QI + Enter/Leave!
Attachment #8905232 -
Flags: review?(matt.woodrow) → review+
Comment 7•7 years ago
|
||
mozreview-review |
Comment on attachment 8905232 [details] Bug 1397040: During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. https://reviewboard.mozilla.org/r/177024/#review182068 ::: gfx/layers/d3d11/TextureD3D11.cpp:490 (Diff revision 1) > uploadDataPtr = &uploadData; > } > > + // See bug 1397040 > + RefPtr<ID3D10Multithread> mt; > + device->QueryInterface((ID3D10Multithread**)getter_AddRefs(mt)); we do have D3D11MTAutoEnter.
Comment 8•7 years ago
|
||
mozreview-review |
Comment on attachment 8905232 [details] Bug 1397040: During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. https://reviewboard.mozilla.org/r/177024/#review182070 ::: gfx/layers/d3d11/TextureD3D11.cpp:502 (Diff revision 1) > HRESULT hr = device->CreateTexture2D(&newDesc, uploadDataPtr, getter_AddRefs(texture11)); > > if (FAILED(hr) || !texture11) { > gfxCriticalNote << "[D3D11] 2 CreateTexture2D failure Size: " << aSize > << "texture11: " << texture11 << " Code: " << gfx::hexa(hr); > return nullptr; no mt->Leave() here? ::: gfx/layers/d3d11/TextureD3D11.cpp:526 (Diff revision 1) > // on it to be synchronized using it. If we did an initial upload using aSurface > // then bizarely this isn't covered, so we insert a manual lock/unlock pair > // to force this. > if (aSurface && newDesc.MiscFlags == D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX) { > if (!LockD3DTexture(texture11.get())) { > return nullptr; and there too?
Pushed by bschouten@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/52433c05f306 During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. r=mattwoodrow
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → bas
Status: NEW → ASSIGNED
Comment 10•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/52433c05f306
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Comment 11•7 years ago
|
||
Please request Beta approval on this when you get a chance.
status-firefox55:
--- → wontfix
status-firefox-esr52:
--- → wontfix
Flags: needinfo?(ajones) → needinfo?(bas)
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #11) > Please request Beta approval on this when you get a chance. I feel this patch is probably too risky to move to beta if we don't have beta builds left to test this with :-(
Flags: needinfo?(bas)
Assignee | ||
Comment 13•7 years ago
|
||
Comment on attachment 8905232 [details] Bug 1397040: During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. Approval Request Comment [Feature/Bug causing the regression]: Client side video upload [User impact if declined]: Potential crashes with video [Is this code covered by automated tests?]: Yes [Has the fix been verified in Nightly?]: Yes [Needs manual test from QE? If yes, steps to reproduce]: No [List of other uplifts needed for the feature/fix]: None [Is the change risky?]: Slightly, if there's driver bugs in existance [Why is the change risky/not risky?]: Relies on driver internals [String changes made/needed]: None
Attachment #8905232 -
Flags: approval-mozilla-beta?
Comment 14•7 years ago
|
||
Comment on attachment 8905232 [details] Bug 1397040: During D3D11TextureData::Create lock the device multithread mutex explicitly when accessing the device. A bit risky for late beta, but we still have beta 11, 12, and the RC build next week to notice and fix any fallout, and I think it is worth the risk to try and get rid of this new batch of driver crashes.
Attachment #8905232 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 15•7 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/9d789fc951a7
You need to log in
before you can comment on or make changes to this bug.
Description
•