Closed Bug 1044975 Opened 10 years ago Closed 10 years ago

Intermittent 789933-1.html | application crashed [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()]

Categories

(Core :: Graphics: Layers, defect)

x86
Windows 8
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox33 --- fixed
firefox34 --- fixed
firefox35 --- fixed
firefox-esr31 --- unaffected

People

(Reporter: cbook, Assigned: nical)

References

()

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file, 4 obsolete files)

WINNT 6.2 mozilla-inbound opt test crashtest on 2014-07-28 00:39:19 PDT for push b70f46a09115

slave: t-w864-ix-093

https://tbpl.mozilla.org/php/getParsedLog.php?id=44693233&tree=Mozilla-Inbound

00:44:06  WARNING -  PROCESS-CRASH | file:///C:/slave/test/build/tests/reftest/tests/dom/canvas/crashtests/789933-1.html | application crashed [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()]
00:44:06     INFO -  Crash dump filename: c:\users\cltbld~1.t-w\appdata\local\temp\tmpfrhgr6.mozrunner\minidumps\548f6217-d415-4f38-a082-0adf492d6079.dmp
00:44:06     INFO -  Operating system: Windows NT
00:44:06     INFO -                    6.2.9200
00:44:06     INFO -  CPU: x86
00:44:06     INFO -       GenuineIntel family 6 model 30 stepping 5
00:44:06     INFO -       8 CPUs
00:44:06     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_WRITE
00:44:06     INFO -  Crash address: 0x6fddf853
00:44:06     INFO -  Thread 29 (crashed)
00:44:06     INFO -   0  xul.dll!mozilla::layers::CompositorD3D11::UpdateConstantBuffers() [CompositorD3D11.cpp:b70f46a09115 : 970 + 0xc]
00:44:06     INFO -      eip = 0x71a42c29   esp = 0x0bffefb4   ebp = 0x0bffefd0   ebx = 0x0a364820
00:44:06     INFO -      esi = 0x0a364940   edi = 0x6fddf853   eax = 0x00000000   ecx = 0x00000008
00:44:06     INFO -      edx = 0x0bffefc0   efl = 0x00010246
00:44:06     INFO -      Found by: given as instruction pointer in context
00:44:06     INFO -  Thread 0
00:44:06     INFO -   0  ntdll.dll + 0x41318
00:44:06     INFO -      eip = 0x77a01318   esp = 0x004cec04   ebp = 0x004ced84   ebx = 0x004cedac
00:44:06     INFO -      esi = 0x004ced2c   edi = 0x00000000   eax = 0x00000044   ecx = 0x00000000
00:44:06     INFO -      edx = 0x00000000   efl = 0x00200202
00:44:06     INFO -      Found by: given as instruction pointer in context
00:44:06     INFO -   1  user32.dll + 0x9878
00:44:06     INFO -      eip = 0x774b9879   esp = 0x004ced8c   ebp = 0x004ceddc
00:44:06     INFO -      Found by: previous frame's frame pointer
00:44:06     INFO -   2  user32.dll + 0x9777
00:44:06     INFO -      eip = 0x774b9778   esp = 0x004cede4   ebp = 0x004cedfc
00:44:06     INFO -      Found by: previous frame's frame pointer
00:44:06     INFO -   3  user32.dll + 0x992f
00:44:06     INFO -      eip = 0x774b9930   esp = 0x004cee04   ebp = 0x004cee18
00:44:06     INFO -      Found by: previous frame's frame pointer
00:44:06     INFO -   4  xul.dll!mozilla::ipc::MessageChannel::WaitForSyncNotify() [WindowsMessageLoop.cpp:b70f46a09115 : 847 + 0x18]
00:44:06     INFO -      eip = 0x71779eb3   esp = 0x004cee20   ebp = 0x004cee78
00:44:06     INFO -      Found by: previous frame's frame pointer
00:44:06     INFO -   5  xul.dll!mozilla::ipc::MessageChannel::SendAndWait(IPC::Message *,IPC::Message *) [MessageChannel.cpp:b70f46a09115 : 676 + 0x6]
00:44:06     INFO -      eip = 0x71781ced   esp = 0x004cee80   ebp = 0x004cee8c
00:44:06     INFO -      Found by: call frame info
00:44:06     INFO -   6  xul.dll!mozilla::ipc::MessageChannel::Send(IPC::Message *,IPC::Message *) [MessageChannel.cpp:b70f46a09115 : 589 + 0xf]
00:44:06     INFO -      eip = 0x71782598   esp = 0x004cee94   ebp = 0x004ceec0
00:44:06     INFO -      Found by: call frame info
00:44:06     INFO -   7  xul.dll!mozilla::layers::PCompositorChild::SendFlushRendering() [PCompositorChild.cpp:b70f46a09115 : 335 + 0xc]
Component: General → Graphics: Layers
This is odd, no stack trace. in UpdateConstantBuffers we are mapping buffers which may fail (for example in case of device reset) but we don't check the result of map.
Assignee: nobody → nical.bugzilla
Attachment #8499516 - Flags: review?(bas)
Comment on attachment 8499516 [details] [diff] [review]
Check that mapping didn't fail, don't crash if it did.

Review of attachment 8499516 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +983,5 @@
>    D3D11_MAPPED_SUBRESOURCE resource;
> +
> +  hr = mContext->Map(mAttachments->mVSConstantBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);
> +  if (FAILED(hr)) {
> +    return false;

Crash here if our device -didn't- reset. Just to make sure we still crash when this is an unrecoverable situation that will result in an unusable browser (see ID3D11Device::GetDeviceRemovedReason())
Attachment #8499516 - Flags: review?(bas) → review-
Attachment #8499516 - Attachment is obsolete: true
Attachment #8499699 - Flags: review?(bas)
Comment on attachment 8499699 [details] [diff] [review]
v2, choose whether to crash or not depending on the error code

Review of attachment 8499699 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +983,3 @@
>    D3D11_MAPPED_SUBRESOURCE resource;
> +
> +  hr = mContext->Map(mAttachments->mVSConstantBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &resource);

See my previous comments about checking whether the device has been removed. Only on hr = D3DERR_DEVICE_REMOVED should we crash. An even more solid way would be to use GetDeviceRemovedReason and crash if that returns SUCCEEDED inside this if clause.
Attachment #8499699 - Flags: review?(bas) → review-
Arf, sorry. I re-uploaded the first patch instead of giving you the actual v2. Here it is.
Attachment #8499699 - Attachment is obsolete: true
Attachment #8500377 - Flags: review?(bas)
Comment on attachment 8500377 [details] [diff] [review]
v2, choose whether to crash or not depending on the error code

Review of attachment 8500377 [details] [diff] [review]:
-----------------------------------------------------------------

NS_ABORT_IF_FALSE doesn't crash in release builds.

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +964,5 @@
> +bool ShouldRecoverFromMapFailure(HRESULT hr, ID3D11Device* device)
> +{
> +  // XXX - it would be nice to use gfxCriticalError, but it needs to
> +  // be made to work off the main thread first.
> +  if (!FAILED(hr)) {

nit: SUCCEEDED(hr)

@@ +967,5 @@
> +  // be made to work off the main thread first.
> +  if (!FAILED(hr)) {
> +    return true;
> +  }
> +  if (!device) {

It would seem to me this check is redundant, I'd personally remove it and be happy with crashing in a nullptr deref if this occurs.

@@ +977,5 @@
> +      case DXGI_ERROR_DEVICE_REMOVED:
> +      case DXGI_ERROR_DEVICE_RESET:
> +        return true;
> +      case DXGI_ERROR_INVALID_CALL:
> +      case DXGI_ERROR_DRIVER_INTERNAL_ERROR:

this one should be recoverable as well in general.
Attachment #8500377 - Flags: review?(bas) → review-
Attachment #8500377 - Attachment is obsolete: true
Attachment #8500390 - Flags: review?(bas)
I forgot to address the comment about NS_ABORT_IF_FALSE in the previous patch
Attachment #8500390 - Attachment is obsolete: true
Attachment #8500390 - Flags: review?(bas)
Attachment #8500394 - Flags: review?(bas)
Attachment #8500394 - Attachment is patch: true
Attachment #8500394 - Flags: review?(bas) → review+
Comment on attachment 8500394 [details] [diff] [review]
v4 - with MOZ_CRASH instead of NS_ABORT_IF_FALSE

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: Some crashes on Windows
[Describe test coverage new/current, TBPL]:
[Risks and why]: low, simply try to recover in a code path where we would otherwise crash, doesn't look risky from a security standpoint.
[String/UUID change made/needed]:
Attachment #8500394 - Flags: approval-mozilla-beta?
Attachment #8500394 - Flags: approval-mozilla-aurora?
Comment on attachment 8500394 [details] [diff] [review]
v4 - with MOZ_CRASH instead of NS_ABORT_IF_FALSE

we merged beta => release and it does not seem to impact android.
Attachment #8500394 - Flags: approval-mozilla-release+
Attachment #8500394 - Flags: approval-mozilla-beta?
Attachment #8500394 - Flags: approval-mozilla-beta-
Attachment #8500394 - Flags: approval-mozilla-aurora?
Attachment #8500394 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/dd3e8945b40d
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
https://hg.mozilla.org/releases/mozilla-aurora/rev/4b8e5cb2bbf5

Sylvestre, I didn't see the release approval in time unfortunately :(. Did you still want me to push it at this point? Maybe not worth a respin by itself, but at least so it's there if something else does require one.
Flags: needinfo?(sledru)
Because of the hosting issue, we delayed the build of the RC. We will take it. Thanks!
Flags: needinfo?(sledru)
(In reply to Sylvestre Ledru [:sylvestre] from comment #17)
> Because of the hosting issue, we delayed the build of the RC. We will take
> it. Thanks!

landed as remote:   https://hg.mozilla.org/releases/mozilla-release/rev/9bf2a5b5162d
Attachment #8500394 - Flags: approval-mozilla-release+
Attachment #8500394 - Flags: approval-mozilla-beta-
For some context, Kairo noticed that the number of crashes increased with RC. Since we are going to build a new 33.0, we backout the three patches (bug 1074378, bug 1076825 and bug 1044975) to go back to the beta 9 levels (we were happy with them). We are unsure which one causes the increase.
https://crash-analysis.mozilla.com/rkaiser/2014-10-09/2014-10-09.firefox.33.explosiveness.html

Leaving the status flag as affected, we could take it back for 33.1 once we have more information (with 34 in beta)
See bug 1081171 on new the crashes we saw in RC. Bas said it's probable that bug 1074378 is actually at fault, but we backed out all gfx changed between b9 and RC to be on the safe side.
Comment on attachment 8500394 [details] [diff] [review]
v4 - with MOZ_CRASH instead of NS_ABORT_IF_FALSE

Like for bug 1083071, I would like this patch in both mozilla-release and mozilla-release branch GECKO330_2014101104_RELBRANCH
Thanks
Flags: needinfo?(ryanvm)
Flags: needinfo?(cbook)
Attachment #8500394 - Flags: approval-mozilla-release+
You need to log in before you can comment on or make changes to this bug.