Closed Bug 1116812 Opened 9 years ago Closed 9 years ago

CompositorD3D11::HandleError coming from mozilla::layers::CompositorD3D11::UpdateRenderTarget() probably from TDRs

Categories

(Core :: Graphics: Layers, defect)

All
Windows 7
defect
Not set
critical

Tracking


RESOLVED FIXED
mozilla40
Tracking Status
firefox36 + wontfix
firefox37 + wontfix
firefox38 + fixed
firefox38.0.5 --- fixed
firefox39 --- fixed
firefox40 --- fixed
firefox-esr38 --- fixed

People

(Reporter: stephend, Assigned: bas.schouten)

References

Details

(Keywords: crash, topcrash-win, Whiteboard: [tbird crash])

Crash Data

Attachments

(4 files, 1 obsolete file)

This bug was filed from the Socorro interface and is report bp-53c615b9-efef-4886-9769-1a2992141230.
=============================================================

STR:

1. Using yesterday's Nightly build, I updated my Nvidia driver to this: http://www.nvidia.com/download/driverResults.aspx/80913/en-us

Version: 	347.09  WHQL
Release Date: 	2014.12.23
Operating System: 	Windows 7 64-bit, Windows 8.1 64-bit, Windows 8 64-bit, Windows Vista 64-bit
Language: 	English (US)

2. While installing the above update, my screen flashed black for a second (while the driver reset/detected the display, I guess)
3. As soon as the screen flashed, Nightly crashed

(Sorry, but I can't repro -- still filing because it'll likely help.)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 	gfx/layers/d3d11/CompositorD3D11.cpp
3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/d3d11/CompositorD3D11.cpp
4 	xul.dll 	mozilla::layers::LayerManagerComposite::Render() 	gfx/layers/composite/LayerManagerComposite.cpp
5 	xul.dll 	mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
6 	xul.dll 	mozilla::layers::LayerManagerComposite::EndEmptyTransaction(mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
7 	xul.dll 	mozilla::layers::CompositorParent::CompositeToTarget(mozilla::gfx::DrawTarget*, nsIntRect const*) 	gfx/layers/ipc/CompositorParent.cpp
8 	xul.dll 	RunnableMethod<mozilla::layers::CompositorParent, void ( mozilla::layers::CompositorParent::*)(mozilla::TimeStamp), Tuple1<mozilla::TimeStamp> >::Run() 	ipc/chromium/src/base/task.h
9 	xul.dll 	MessageLoop::DoWork() 	ipc/chromium/src/base/message_loop.cc
10 	xul.dll 	base::MessagePumpForUI::DoRunLoop() 	ipc/chromium/src/base/message_pump_win.cc
11 	xul.dll 	base::MessagePumpWin::Run(base::MessagePump::Delegate*) 	ipc/chromium/src/base/message_pump_win.h
12 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc
13 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
14 	xul.dll 	base::Thread::ThreadMain() 	ipc/chromium/src/base/thread.cc
15 	xul.dll 	`anonymous namespace'::ThreadFunc(void*) 	ipc/chromium/src/base/platform_thread_win.cc
16 	kernel32.dll 	BaseThreadInitThunk 	
17 	ntdll.dll 	RtlUserThreadStart 	
18 	kernel32.dll 	BasepReportFault 	
19 	kernel32.dll 	BasepReportFault
[Tracking Requested - why for this release]:
This is the #4 topcrash on 36.0b1 with 2.8% of all crashes as of now.
Summary: crash in mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget() → CompositorD3D11::HandleError comping from mozilla::layers::CompositorD3D11::UpdateRenderTarget()
Top crash, tracking! 
Milan, can you help?
Flags: needinfo?(milan)
This is "Invalid D3D API Call" (DXGI_ERROR_INVALID_CALL), and is preceded by a bunch of failures to create "normal" sized bitmaps (e.g., 22x21, 59x22, etc.) with failure code 0x8899000c in DrawTargetD2D1::CreateSourceSurfaceFromData.
Assignee: nobody → bas
Flags: needinfo?(milan) → needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #4)
> This is "Invalid D3D API Call" (DXGI_ERROR_INVALID_CALL), and is preceded by
> a bunch of failures to create "normal" sized bitmaps (e.g., 22x21, 59x22,
> etc.) with failure code 0x8899000c in
> DrawTargetD2D1::CreateSourceSurfaceFromData.

This is another crash that happens on a TDR: a situation where we always used to crash, so it's unlikely to be a regression, but one that we can probably fix.
Flags: needinfo?(bas)
On the bright side, this really shows how much of a long tail of crashes we've consolidated.
(In reply to Bas Schouten (:bas.schouten) from comment #6)
> On the bright side, this really shows how much of a long tail of crashes
> we've consolidated.

I know! Fixing this would have a major impact now.

We can't tell what's causing driver resets, can we?  I'm thinking of the bug 1124427, where "stress testing webgl" causes driver resets, wonder if a good portion of them (resets) is coming from there...
(In reply to Milan Sreckovic [:milan] from comment #7)
> (In reply to Bas Schouten (:bas.schouten) from comment #6)
> > On the bright side, this really shows how much of a long tail of crashes
> > we've consolidated.
> 
> I know! Fixing this would have a major impact now.
> 
> We can't tell what's causing driver resets, can we?  I'm thinking of the bug
> 1124427, where "stress testing webgl" causes driver resets, wonder if a good
> portion of them (resets) is coming from there...

Not really, I think technically Windows doesn't even really know.
Fwiw, I -suspect- most of the resets are just a GPU driver crashing because of a random driver bug that happens to get hit.
Bas, do you think you will be able to fix that during the 36 cycle? thanks beta5 gtb is tomorrow
Flags: needinfo?(bas)
(In reply to Sylvestre Ledru [:sylvestre] from comment #10)
> Bas, do you think you will be able to fix that during the 36 cycle? thanks
> beta5 gtb is tomorrow

Perhaps, I want to reiterate this is -not- a regression.
Flags: needinfo?(bas)
(In reply to Bas Schouten (:bas.schouten) from comment #11)
> Perhaps, I want to reiterate this is -not- a regression.
Sure but it does not really matter since it is a crash.
(In reply to Sylvestre Ledru [:sylvestre] from comment #13)
> (In reply to Bas Schouten (:bas.schouten) from comment #11)
> > Perhaps, I want to reiterate this is -not- a regression.
> Sure but it does not really matter since it is a crash.

Right, the claim is that we have just grouped the existing crashes into a single signature.  Still want to see if we can fix it, just clarifying.
(In reply to Sylvestre Ledru [:sylvestre] from comment #13)
> (In reply to Bas Schouten (:bas.schouten) from comment #11)
> > Perhaps, I want to reiterate this is -not- a regression.
> Sure but it does not really matter since it is a crash.

Yes, let me try to be clearer: no one is crashing from this crash who wasn't already crashing in this situation on 36, or Release.
pragmatic hit this crash on her older Win 7 machine while watching youtube.com in full screen. https://crash-stats.mozilla.com/report/index/8270542a-e9a6-4031-8104-be1cd2150130. She might have a good machine to use for reproducing this if needed.
Bas, any more info we can ask for to help us?  Let's assume we can fix this on our side and see what we can do short term.
Flags: needinfo?(bas)
We may just drop the assert in Beta and see what happens; the worst-case scenario is that we crash elsewhere.  Bas will provide more details and the patch.
Milan, any progress on this? Thanks
Flags: needinfo?(milan)
(In reply to Sylvestre Ledru [:sylvestre] from comment #20)
> Milan, any progress on this? Thanks

As per my earlier e-mail, I don't think there's going to be any point in dropping this assert. I believe it would just spread the crashes out and make them harder to diagnose. I think we simply need to try disabling DXVA and see how that affects this signature. We can always remove this assert later; -that-'s never going to have a true negative effect.
Flags: needinfo?(bas)
Flags: needinfo?(milan)
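For readers wondering what "this assert" is: roughly, failed D3D11 calls in the compositor funnel through Failed(), which calls HandleError(), and HandleError() deliberately aborts on serious errors, which is why the whole long tail of TDR fallout now lands under this one signature. A rough sketch of that funnel, with names taken from the crash stack (the real implementation differs, and the Severity value name is assumed):

bool
CompositorD3D11::Failed(HRESULT hr, Severity aSeverity)
{
  if (SUCCEEDED(hr)) {
    return false;
  }
  HandleError(hr, aSeverity);
  return true;   // caller bails out of the current frame
}

void
CompositorD3D11::HandleError(HRESULT hr, Severity aSeverity)
{
  // Severity value name is assumed for this sketch.
  if (aSeverity == Severity::Critical) {
    MOZ_CRASH("Unrecoverable D3D11 error");   // the crash reported in this bug
  }
  // Less severe failures are only logged and compositing carries on.
  gfxCriticalError() << "D3D11 error " << hr;
}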
I've been hearing from folks that there is confusion around this bug.  Let me try to explain this a bit for those not entirely accustomed to graphics.

There is a phenomenon on Windows called TDR (Timeout Detection and Recovery), which is when the graphics driver resets beneath us. This causes Firefox to crash. This bug is essentially the catch-all area where we land when these kinds of things happen to us.

In beta 36 we have been running two simultaneous pieces of unproven tech:
* MSE - video playback improvements over flash
* D2D 1.1 - a newish method for doing some drawing

Note that our normal hardware acceleration is provided by d3d11/9. I'm still not sure how d3d and d2d interact with each other per se (other than the obvious 2d vs 3d difference).

We have some crashes we think are D2D related.
We have some crashes that are MSE related. Now this comes back to the TDR issue because we are hypothesizing that due to the relatively untested codepath that we are using with MSE for hardware accelerated video decoding (what Bas is referring to above when he talks about DXVA) we might be pushing some graphics drivers beyond their limits and hitting TDR's. Thus causing crashes with this signature.

Of course it could also be D2D related too.

Now that we have disabled MSE in beta 8, if the spike in this signature was caused by hardware decoding as a result of MSE, we *should* see the crash rate of this issue diminish.  If we don't, then we likely have some other culprit.

Hope this helps fix some of the confusion.
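To make the mechanics a bit more concrete: after a TDR the D3D11 device is "removed" and every later call on it fails; the way an application tells a reset apart from an ordinary failure is to ask the device why it died. A minimal sketch using the plain D3D11 API (this is an illustration, not our compositor code):

#include <d3d11.h>

// True if the device behind the compositor has been reset (e.g. by a TDR) and
// everything created on it must be thrown away and rebuilt.
bool DeviceWasReset(ID3D11Device* aDevice)
{
  HRESULT reason = aDevice->GetDeviceRemovedReason();
  switch (reason) {
    case S_OK:
      return false;                         // device is still alive
    case DXGI_ERROR_DEVICE_RESET:           // the OS reset a hung GPU (a TDR)
    case DXGI_ERROR_DEVICE_REMOVED:         // driver crashed or was upgraded
    case DXGI_ERROR_DEVICE_HUNG:            // our own command stream hung the GPU
    case DXGI_ERROR_DRIVER_INTERNAL_ERROR:  // driver bug
    default:
      return true;                          // tear down and recreate the device
  }
}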
> Now that we have disabled MSE in beta 8, if the spike in this signature was
> caused by hardware decoding as a result of MSE, we *should* see the crash
> rate of this issue diminish.  If we don't, then we likely have some other
> culprit.

I don't have absolute numbers but in early b8 data this is still at the same relative volume.
(In reply to Clint Talbert ( :ctalbert ) from comment #22)
> I've been hearing from folks that there is confusion around this bug.  Let
> me try to explain this a bit for those not entirely accustomed to graphics.
> 
> There is a phenomenon on windows called TDR which is when the graphics
> driver resets beneath us. This causes Firefox to crash. This bug is
> essentially in the catch-all area of where we land when these kinds of
> things happen to us.
> 
> In beta 36 we have been running two simultaneous pieces of unproven tech:
> * MSE - video playback improvements over flash
> * D2D 1.1 - a newish method for doing some drawing
> 
> Note that our normal hardware acceleration is provided by d3d11/9. I'm still
> not sure how d3d and d2d interact with each other per se (other than the
> obvious 2d vs 3d difference).
> 
> We have some crashes we think are D2D related.
> We have some crashes that are MSE related. Now this comes back to the TDR
> issue because we are hypothesizing that due to the relatively untested
> codepath that we are using with MSE for hardware accelerated video decoding
> (what Bas is referring to above when he talks about DXVA) we might be
> pushing some graphics drivers beyond their limits and hitting TDR's. Thus
> causing crashes with this signature.
> 
> Of course it could also be D2D related too.
> 
> Now that we have disabled MSE in beta 8, if the spike in this signature was
> caused by hardware decoding as a result of MSE, we *should* see the crash
> rate of this issue diminish.  If we don't, then we likely have some other
> culprit.
> 
> Hope this helps fix some of the confusion.

It should be noted that we were -already- using D2D 1.0, which, on systems that have D2D 1.1, uses the -exact- same libraries as D2D 1.1 (D2D 1.1 is a superset of D2D 1.0). Having said that, with D2D 1.1 we've started using a small number of APIs that exist only in the superset; in theory, that could make some difference in TDR occurrence, but it's not the most likely cause.
Reproduced this on Windows 7, Windows 8.1 and Vista while updating the graphics card driver; if you need any information please needinfo me.
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #25)
> Reproduced this on Windows 7, Windows 8.1 and Vista while graphics card
> driver update, if you need any information please needinfo me.

That triggers a driver reset and is expected to trigger this on beta. On release it would trigger a different crash, but also crash. It would be nice if you could confirm on Aurora this no longer causes a crash.
(In reply to Bas Schouten (:bas.schouten) from comment #26)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #25)
> > Reproduced this on Windows 7, Windows 8.1 and Vista while graphics card
> > driver update, if you need any information please needinfo me.
> 
> That triggers a driver reset and is expected to trigger this on beta. On
> release it would trigger a different crash, but also crash. It would be nice
> if you could confirm on Aurora this no longer causes a crash.

Just tried using latest Aurora on various Windows operating systems and I still receive crashes, only on Vista with this signature.

Windows 8.1 64-bit
bp-3024813c-0327-43ae-9c54-df4ce2150213
bp-40471cf6-2d13-45a3-bf56-4b5ae2150213

Windows Vista 64-bit:
bp-8ccdec4e-edbf-4891-8012-9df002150213

Windows 7 32-bit:
bp-593a164a-8ff8-4e02-90a4-85b342150213
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #27)
> (In reply to Bas Schouten (:bas.schouten) from comment #26)
> > (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #25)
> > > Reproduced this on Windows 7, Windows 8.1 and Vista while graphics card
> > > driver update, if you need any information please needinfo me.
> > 
> > That triggers a driver reset and is expected to trigger this on beta. On
> > release it would trigger a different crash, but also crash. It would be nice
> > if you could confirm on Aurora this no longer causes a crash.
> 
> Just tried using latest Aurora on various Windows operating systems and I
> still receive crashes, only on Vista with this signature.
> 
> Windows 8.1 64-bit
> bp-3024813c-0327-43ae-9c54-df4ce2150213
> bp-40471cf6-2d13-45a3-bf56-4b5ae2150213
> 
> Windows Vista 64-bit:
> bp-8ccdec4e-edbf-4891-8012-9df002150213
> 
> Windows 7 32-bit:
> bp-593a164a-8ff8-4e02-90a4-85b342150213

I guess we need to uplift bug 1126490 to Aurora, how about nightly?
We still have this bug in 36 beta 9, right?
(In reply to Bas Schouten (:bas.schouten) from comment #28)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #27)
> > (In reply to Bas Schouten (:bas.schouten) from comment #26)
> > > (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #25)
> > > > Reproduced this on Windows 7, Windows 8.1 and Vista while graphics card
> > > > driver update, if you need any information please needinfo me.
> > > 
> > > That triggers a driver reset and is expected to trigger this on beta. On
> > > release it would trigger a different crash, but also crash. It would be nice
> > > if you could confirm on Aurora this no longer causes a crash.
> > 
> > Just tried using latest Aurora on various Windows operating systems and I
> > still receive crashes, only on Vista with this signature.
> > 
> > Windows 8.1 64-bit
> > bp-3024813c-0327-43ae-9c54-df4ce2150213
> > bp-40471cf6-2d13-45a3-bf56-4b5ae2150213
> > 
> > Windows Vista 64-bit:
> > bp-8ccdec4e-edbf-4891-8012-9df002150213
> > 
> > Windows 7 32-bit:
> > bp-593a164a-8ff8-4e02-90a4-85b342150213
> 
> I guess we need to uplift bug 1126490 to Aurora, how about nightly?

I get some interesting results using Nightly:

 Windows Vista 64-bit

Nightly e10s enabled
- no Firefox crash but tabs do crash and Firefox has no buttons.
'See attachment'

Nightly e10s disabled
bp-ea783782-ef12-404e-b739-379c12150216

 Windows 8.1 64-bit

Nightly e10s enabled
bp-4445bcd0-d23f-4d72-8fe5-c28f72150216

Nightly e10s disabled
- no crash.

 Windows 7 32-bit

Nightly e10s enabled
- no crash.

Nightly e10s disabled
- no crash

(In reply to Sylvestre Ledru [:sylvestre] from comment #29)
> We still have this bug in 36 beta 9, right?

Yes, 36 beta 9 is still affected, just reproduced on Windows 7 32-bit:
bp-4a2ba501-e8b0-4a4a-8139-711632150216
(In reply to Sylvestre Ledru [:sylvestre] from comment #29)
> We still have this bug in 36 beta 9, right?

This crash will occur on beta 9 some of the time when a TDR occurs (other times other crashes will happen, if D2D 1.1 is enabled usually the FillRectangle one, otherwise a large variety of other crashes). There also have been no attempts to make beta 9 resilient to driver resets (Release crashes on driver resets, although it has an even larger range of associated signatures than beta). On 37 we intend to make sure our TDR issues are mostly addressed and a driver reset should generally become survivable.
OK. So, if I understand correctly, we will ship 36 with this bug.
(In reply to Sylvestre Ledru [:sylvestre] from comment #32)
> OK. So, if I understand correctly, we will ship 36 with this bug.

Unless we can bring our TDRs down, which I still think -may- be related to DXVA (it's the only change we've made in 36 that I could see significantly affecting TDRs, and there are some reports of people being on YouTube when this occurs), then yes. This in itself is not a bug, or at least not anything new; we've always crashed somehow when the graphics device resets. It just appears that, unless this signature is simply a consolidation of other crashes, we've increased the number of driver resets that we cause.
Attached image TDR_awesome_crash.png
So I hit this crash with a pretty awesome TDR. Here's what happened.
1. Starting my workday, I connected to my external monitor, used skype for an earlier call and did not start vidyo
2. Started my browsers - I'm running nightly and beta each with a lot of tabs in their own different profiles. Nightly is running in E10s mode
3. To get on my next call, I started up vidyo desktop and attempted to join my vidyo room.

At this point, both the laptop screen and the external display went black, flashed their screens back on, went black again, and then flashed back on in the state captured in the screen shot. My theory is that the TDR was triggered by something vidyo did, and this caused Beta to crash. 
However, on nightly, you can see the state of it - the black background and no content. While the chrome process still works in nightly - I can switch tabs and it responds, no content loads. So the content process has likely crashed but I get NO indication of this, which is even more serious from a UX point of view.

I was also running dev edition at the time (yes, I actually run all three versions at once). And dev edition's UI was on the laptop monitor (the nightly and beta builds were displayed on the external monitor) and it escaped mostly unscathed. It did *not* crash, it still renders its content, however, the "minimize, maximize, close" icons from windows have been replaced with a solid line of color. They still work though, and if I click on them, they re-render themselves.

This is all on windows 8.1. The crash reports that were filed were:
* From beta: https://crash-stats.mozilla.com/report/index/bp-4419de6a-c784-450c-8798-8cef02150220
* There seems to be no crash reported from the nightly browser, even though it's clearly been broken.
Summary: CompositorD3D11::HandleError comping from mozilla::layers::CompositorD3D11::UpdateRenderTarget() → CompositorD3D11::HandleError coming from mozilla::layers::CompositorD3D11::UpdateRenderTarget() probably from TDRs
(In reply to Clint Talbert ( :ctalbert ) from comment #34)
> Created attachment 8567162 [details]
> TDR_awesome_crash.png
> 
> So I hit this crash with a pretty awesome TDR. Here's what happened.
> 1. Starting my workday, I connected to my external monitor, used skype for
> an earlier call and did not start vidyo
> 2. Started my browsers - I'm running nightly and beta each with a lot of
> tabs in their own different profiles. Nightly is running in E10s mode
> 3. To get on my next call, I started up vidyo desktop and attempted to join
> my vidyo room.
> 
> At this point, both the laptop screen and the external display went black,
> flashed their screens back on, went black again, and then flashed back on in
> the state captured in the screen shot. My theory is that the TDR was
> triggered by something vidyo did, and this caused Beta to crash. 
> However, on nightly, you can see the state of it - the black background and
> no content. While the chrome process still works in nightly - I can switch
> tabs and it responds, no content loads. So the content process has likely
> crashed but I get NO indication of this, which is even more serious from a
> UX point of view.
> 
> I was also running dev edition at the time (yes, I actually run all three
> versions at once). And dev edition's UI was on the laptop monitor (the
> nightly and beta builds were displayed on the external monitor) and it
> escaped mostly unscathed. It did *not* crash, it still renders its content,
> however, the "minimize, maximize, close" icons from windows have been
> replaced with a solid line of color. They still work though, and if I click
> on them, they re-render themselves.
> 
> This is all on windows 8.1. The crash reports that were filed were:
> * From beta:
> https://crash-stats.mozilla.com/report/index/bp-4419de6a-c784-450c-8798-
> 8cef02150220
> * There seems to be no crash reported from the nightly browser, even though
> it's clearly been broken.

Can you file a separate bug for the issue you had on nightly?
It is the top browser crasher for YouTube in release but it is dwarfed by Flash crashes. It is the top crasher in Firefox 37b2 but is in 2nd place on YouTube. The competitor for first place is OOM. We have a few things that improve the OOM situation.

What needs to happen to resolve this issue?
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #36)
> It is the top browser crasher for YouTube in release but it is dwarfed by
> Flash crashes. It is the top crasher in Firefox 37b2 but is in 2nd place on
> YouTube. The competitor for first place is OOM. We have a few things that
> improve the OOM situation.
> 
> What needs to happen to resolve this issue?

We need to diagnose if these are TDRs triggered by youtube issues. As I've suggested in several different forums, A/B testing on a channel with a sufficient population, where we use acceleration in one case for youtube and no acceleration in the other, is the best method of diagnosis.
I personally do not think there is any time to do experimentation or A/B testing before we intend to ship MSE as the final beta of this cycle is going to build in two weeks.
(In reply to Bas Schouten (:bas.schouten) from comment #37)
> (In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #36)
> > It is the top browser crasher for YouTube in release but it is dwarfed by
> > Flash crashes. It is the top crasher in Firefox 37b2 but is in 2nd place on
> > YouTube. The competitor for first place is OOM. We have a few things that
> > improve the OOM situation.
> > 
> > What needs to happen to resolve this issue?
> 
> We need to diagnose if these are TDRs triggered by youtube issues. As I've
> suggested in several different forums, A/B testing on a channel with a
> sufficient population, where we use acceleration in one case for youtube and
> no acceleration in the other, is the best method of diagnosis.

It contributes to a slightly smaller proportion of the crashes on YouTube than in the web at large in beta 37. This means we don't have anything to support the hypothesis that hardware decoding (or even video more generally) is making the problem worse.

I'm assuming accelerated layers is correlated with TDRs on the basis that we're simply not doing much in the GPU.

What is involved in recovering from a driver reset? Do we need to tear down all of our layers, images, video frames, etc. and regenerate them? I'm guessing that would be an involved process and we'd end up being at a loose end for canvas and the like.
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #39)
> (In reply to Bas Schouten (:bas.schouten) from comment #37)
> > (In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #36)
> > > It is the top browser crasher for YouTube in release but it is dwarfed by
> > > Flash crashes. It is the top crasher in Firefox 37b2 but is in 2nd place on
> > > YouTube. The competitor for first place is OOM. We have a few things that
> > > improve the OOM situation.
> > > 
> > > What needs to happen to resolve this issue?
> > 
> > We need to diagnose if these are TDRs triggered by youtube issues. As I've
> > suggested in several different forums, A/B testing on a channel with a
> > sufficient population, where we use acceleration in one case for youtube and
> > no acceleration in the other, is the best method of diagnosis.
> 
> It contributes to a slightly smaller proportion of the crashes on YouTube
> than in the web at large in beta 37. This means we don't have anything to
> support the hypothesis that hardware decoding (or even video more generally)
> is making the problem worse.
> 
> I'm assuming accelerated layers is correlated with TDRs on the basis that
> we're simply not doing much in the GPU.
> 
> What is involved in recovering from a driver reset? Do we need to tear down
> all of our layers, images, video frames, etc. and regenerate them? I'm
> guessing that would be an involved process and we'd end up being at a loose
> end for canvas and the like.

That code all exists; we -should- be able to recover from a TDR on nightly, but it's very, very tricky (indeed, canvas is just screwed). I've added telemetry data to attempt to detect TDRs; the first bits of data on that should be in soon and should provide us with some information as to what is causing the TDRs.

In general, though, triggering a driver reset is a -very- bad end-user experience: some programs don't deal with it and crash, all screens flicker black, etc. So we really need to remove the cause of the increase in TDRs we're seeing. If it's not DXVA that's causing it, I'm not sure what could be causing an increase; we could, of course, try switching off D2D 1.1 again to see if that brings it down, in case something about D2D 1.1 is causing TDRs (which seems unlikely but is not impossible).
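For anyone wondering what "that code" has to do when we try to survive a reset, the rough shape is below; the helper names are invented for this sketch and the real logic is spread across gfx/layers:

// Illustrative only: the overall shape of TDR recovery.
void OnDeviceReset()
{
  // Every texture, render target, swap chain and shader created on the old
  // device is now invalid and must never be used again.
  DiscardDeviceBoundResources();   // hypothetical helper

  // Acquire a fresh ID3D11Device and swap chain, then rebuild render targets.
  RecreateDeviceAndSwapChain();    // hypothetical helper

  // Content that only ever lived on the GPU (canvas, WebGL) cannot be restored
  // from anywhere, which is why those surfaces come back blank after a reset.
  ScheduleFullRecomposite();       // hypothetical helper
}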
There is definitely something odd going on with the crashes http://tiny.cc/o4l7ux - most builds have very few crashes with the occasional build going ballistic.
I'm not sure I trust that table. It looks like it has all channels. It makes sense for the releases and betas to be higher. But it claims version is "36" which doesn't make sense for nightlies and auroras in 2015-03. Hmm.
(In reply to David Major [:dmajor] (UTC+13) from comment #42)
> I'm not sure I trust that table. It looks like it has all channels. It makes
> sense for the releases and betas to be higher. But it claims version is "36"
> which doesn't make sense for nightlies and auroras in 2015-03. Hmm.

Maybe I should leave the crash analysis to the experts, eh!
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #41)
> There is definitely something odd going on with the crashes
> http://tiny.cc/o4l7ux - most builds have very few crashes with the
> occasional build going ballistic.

That table is bogus. It adds up everything seen for all builds from any channel. Given that we build two beta builds a week and Nightly/DevEdition have way fewer users, it's pretty clear that those numbers will fluctuate wildly. There is a bug open to fix this but it seems like nobody is willing to work on it.
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #43)
> Maybe I should leave the crash analysis to the experts, eh!

It shouldn't be that way. That said, there are a few idiosyncrasies and traps, and this is one of them. Bug 898432 is filed on that one.
Although this is a top crash, as Bas said multiple times in this bug, the underlying crashes have been around for a while but are now consolidated and may be triggered more often. Given that we have not made progress on this recently, I'm doubtful that we'll be able to produce a fix before 37 ships. I'm marking this bug as wontfix for 37.

Bas - Do you have enough information to continue to investigate this bug? What does the Telemetry data show?
Flags: needinfo?(bas)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #46)
> Although this is a top crash, as Bas said multiple times in this bug, the
> underlying crashes have been around for a while but are now consolidated and
> may be triggered more often. Given that we have not made progress on this
> recently, I'm doubtful that we'll be able to produce a fix before 37 ships.
> I'm marking this bug as wontfix for 37.
> 
> Bas - Do you have enough information to continue to investigate this bug?
> What does the Telemetry data show?

Telemetry data shows we're TDR-ing a lot, which is basically not unexpected. But in reality there are just two things we can do here to check -why- we're TDR-ing more and whether we can easily reduce it:

1. Disable DXVA, see what happens to TDR rates.
2. Disable D2D 1.1, see what happens to TDR rates.
Flags: needinfo?(bas)
(In reply to Bas Schouten (:bas.schouten) from comment #47)
> Telemetry data shows we're TDR-ing a lot, which is basically not unexpected.
> But in reality there's just two things we can do here to check whether -why-
> we're TDR-ing more and if we can easily reduce it:
> 
> 1. Disable DXVA, see what happens to TDR rates.
> 2. Disable D2D 1.1, see what happens to TDR rates.

Just to be clear, are you planning to try those experiments?
A user is able to reproduce the same crash in bug 1145143.
(In reply to :dmajor (semi-away, use needinfo) from comment #48)
> (In reply to Bas Schouten (:bas.schouten) from comment #47)
> > Telemetry data shows we're TDR-ing a lot, which is basically not unexpected.
> > But in reality there's just two things we can do here to check whether -why-
> > we're TDR-ing more and if we can easily reduce it:
> > 
> > 1. Disable DXVA, see what happens to TDR rates.
> > 2. Disable D2D 1.1, see what happens to TDR rates.
> 
> Just to be clear, are you planning to try those experiments?

I'm not just going to push stuff to Beta :) Nor would I know when to do it and how to analyze the data. I've suggested this on this bug as well as in at least 2 e-mail threads. But there wasn't much of a reply.
(In reply to Loic from comment #49)
> A user is able to reproduce the same crash in bug 1145143.

More or less unrelated, yes: if something causes a driver crash, Firefox crashes with this signature. The trick here is figuring out what, for most people, causes the driver crash, and that particular user's usage pattern is not going to be it.
This signature has about tripled on the 37 (beta) train over the last few days, in 37.0b7 it's actually #1 in front of the OOM|small signature now.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #52)
> This signature has about tripled on the 37 (beta) train over the last few
> days, in 37.0b7 it's actually #1 in front of the OOM|small signature now.

Has something occurred that might have caused people to be watching more videos or something along those lines? That's the only change I could think of that would occur without us doing anything and that might cause this. Another option, I suppose, is if some buggy driver update was pushed, but the adoption of those is really never that fast.
(In reply to Bas Schouten (:bas.schouten) from comment #53)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #52)
> > This signature has about tripled on the 37 (beta) train over the last few
> > days, in 37.0b7 it's actually #1 in front of the OOM|small signature now.
> 
> Has something occurred that might have caused people to be watching more
> videos or something along those lines?

With bug 1138967, we upload textures in the ImageBridge thread and share them with DXGI, rather than doing the upload on the compositor side. This change was uplifted in the last beta and is the only video-related thing I can think of that made it to beta lately.
(In reply to Bas Schouten (:bas.schouten) from comment #53)
> Another option, I suppose, is if some buggy driver update was pushed, but the
> adoption of those is really never that fast.

In that case, we would see it across channels and builds, but it looks like this is isolated to 37.0b7. What Nical points to is much more likely as the issue.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #55)
> (In reply to Bas Schouten (:bas.schouten) from comment #53)
> > Another option, I suppose, is if some buggy driver update was pushed, but the
> > adoption of those is really never that fast.
> 
> I that case, we would see it across channels and builds, but it looks like
> this is isolated to 37.0b7. What Nical points to is much more likely as the
> issue.

Right, a change like that in the realm of video could certainly cause this by increasing the amount of driver crashes somehow. I don't keep a close eye on video changes, but the bug Nical points at will certainly affect what driver codepaths we're hitting.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #52)
> This signature has about tripled on the 37 (beta) train over the last few
> days, in 37.0b7 it's actually #1 in front of the OOM|small signature now.

Where can I find this information?

Looking at the report list linked from the crash report in comment #0, it appears that the total number of crashes with this signature is about the same as other betas (3121 for b7, 3509 for b6, 3866 for b5 etc).

Is it possible to get a list of reports only for a specific version? I can't find a way to do that search.

I'm particularly interested in looking at the Graphics Adapter Report for these lists, so I can see if the increase in crashes correlates with a specific graphics card.

It appears that 'PCI\VEN_8086&DEV_2E32&SUBSYS_31031565&REV_03 Intel G41 express graphics' has spiked massively (from 2% -> 9%) in the last 3 days vs the last 28.

Does that card alone explain the rise in these crashes, or are we really seeing an increase across the board? It's really hard to tell from the relative percentages. Average number of crashes/day for each given device id for each version would be the easiest metric to compare these with I think.

Bug 1146313 also appears to correlate really strongly (>80%) with that same device id.
Flags: needinfo?(kairo)
(In reply to Matt Woodrow (:mattwoodrow) from comment #57)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #52)
> > This signature has about tripled on the 37 (beta) train over the last few
> > days, in 37.0b7 it's actually #1 in front of the OOM|small signature now.
> 
> Where can I find this information?

https://crash-analysis.mozilla.com/rkaiser/2015-03-23/2015-03-23.firefox.37.explosiveness.html is a report that looks at the crash rates (crashes / 1M ADI) for signatures on the 37 train in total. When you find this signature there, to the right of the explosiveness factor columns (which are the result of some stats to tell how fast they are rising), you will find the rates for this signature on various days, here in simplified fashion:

   03-23   03-22   03-21   03-20   03-19   03-18   03-17   03-16   03-15   03-14   03-13
     960    1316    1043     549     541     629     461     406     528     473     449

I guessed "tripled" without having the 03-23 data yet; on a weekday it looks more like we doubled.

The "#1" thing comes from looking at https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/37.0b7

If you compare the percentages of this signature vs. OOM|small to https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/37.0b6 (and to https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/36.0.4 to determine if an external factor has made it spike for everyone) then you'll see that it's definitely higher in b7 than anywhere else.

> Is it possible to get a list of reports only for a specific version? I can't
> find a way to do that search.

The topcrash list for 37.0b7 I linked above should give you that link: https://crash-stats.mozilla.com/report/list?product=Firefox&range_value=7&range_unit=days&date=2015-03-24&signature=mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AHandleError%28long%2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+%7C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AFailed%28long%2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+%7C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AUpdateRenderTarget%28%29&version=Firefox%3A37.0b7

This should contain the graphics adapter report.

You could also do an equivalent search and get to https://crash-stats.mozilla.com/signature/?build_id=20150319212106&product=Firefox&release_channel=beta&process_type=browser&process_type=content&version=37.0&signature=mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AHandleError%28long%2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+|+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AFailed%28long%2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+|+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AUpdateRenderTarget%28%29&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1 which lets you look at stats for any annotation/field in the "Aggregations" section.

> Does that card alone explain the rise in these crashes, or are we really
> seeing an increase across the board?

That's a bit hard to determine, one would need to do some math on that based on the numbers above.

> Bug 1146313 also appears to correlate really strongly (>80%) with that same
> device id.

If it's that ID in general, that's surely an interesting find. Is there something we could do on that specifically?
Flags: needinfo?(kairo)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #58)
> > Is it possible to get a list of reports only for a specific version? I can't
> > find a way to do that search.
> 
> The topcrash list for 37.0b7 I linked above should give you that link:
> https://crash-stats.mozilla.com/report/
> list?product=Firefox&range_value=7&range_unit=days&date=2015-03-
> 24&signature=mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AHandleError%28long
> %2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+%7C+mozilla%3A%
> 3Alayers%3A%3ACompositorD3D11%3A%3AFailed%28long%2C+mozilla%3A%3Alayers%3A%3A
> CompositorD3D11%3A%3ASeverity%29+%7C+mozilla%3A%3Alayers%3A%3ACompositorD3D11
> %3A%3AUpdateRenderTarget%28%29&version=Firefox%3A37.0b7
> 
> This should contain the graphics adapter report.

I don't know how much I trust the above URL. It says it's only for 37.0b7 but if you look at the product breakdown on that page it has all sorts of different versions.

> You could also do an equivalent search and get to
> https://crash-stats.mozilla.com/signature/
> ?build_id=20150319212106&product=Firefox&release_channel=beta&process_type=br
> owser&process_type=content&version=37.
> 0&signature=mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AHandleError%28long%
> 2C+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3ASeverity%29+|+mozilla%3A%3Al
> ayers%3A%3ACompositorD3D11%3A%3AFailed%28long%2C+mozilla%3A%3Alayers%3A%3ACom
> positorD3D11%3A%3ASeverity%29+|+mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3
> AUpdateRenderTarget%28%29&_columns=date&_columns=product&_columns=version&_co
> lumns=build_id&_columns=platform&_columns=reason&_columns=address&page=1
> which lets you look at stats for any annotation/field in the "Aggregations"
> section.

This seems more useful. For b6 it looks like the top-hit adapter device ids are:

Rank   Adapter device id   Count   %
1      0x0166              342     8.92 %
2      0x0116              341     8.90 %
3      0x0046              249     6.50 %
4      0x0102              244     6.37 %
5      0x0a16              181     4.72 %
6      0x0106              155     4.04 %
7      0x2e32              139     3.63 %
...
24     0x2a02               25     0.65 %
...

and for b7:

Rank    Adapter device id  Count   %
1       0x2e32             2783    41.98 %
2       0x2a02             714     10.77 %
3       0x0116             268     4.04 %
4       0x0046             260     3.92 %
5       0x0166             225     3.39 %
6       0x0106             179     2.70 %
...

so it looks like adapter id 0x2e32 accounts for most of the spike, and 0x2a02 also seems to have increased significantly.
(Note: I got the b6 list by using the link kairo provided and changing the buildid in the search parameters to 20150316202753)
I also grepped my way through the raw crash data from the 19th to the 23rd (because I still don't fully trust the crash-stats web interface) and posted the relevant data to http://people.mozilla.org/~kgupta/bug/1116812/. b6-devices.txt and b7-devices.txt show the number of HandleError crashes (at the start of the line) on b6 and b7 grouped by { AdapterVendorID, AdapterDeviceID, AdapterSubsysID, AdapterDriverVersion }. b6-devices-nosubsys.txt and b7-devices-nosubsys.txt are the same but I stripped out the AdapterSubsysID since it seemed pretty noisy.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #59)
> I don't know how much I trust the above URL. It says it's only for 37.0b7
> but if you look at the product breakdown on that page it has all sorts of
> different versions.

The product breakdown is the one thing on the Signature Summary that is always shown without version filters, as otherwise it wouldn't be too useful in a case like this. Might make sense to flag that in some way in the UI, though.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #59)
> so it looks like adapter ids 0x2e32 accounts for most of the spike, and
> 0x2a02 also seems to have increased significantly.

Those seem to be "Intel G41 express graphics" and "Intel GM965, Intel X3100".

See http://www.pcidatabase.com/search.php?device_search_str=0x2e32&device_search=Search and http://www.pcidatabase.com/search.php?device_search_str=0x2a02&device_search=Search
Also added b6-idonly.txt and b7-idonly.txt which are grouped by { AdapterVendorID, AdapterDeviceID }. The numbers seem to agree pretty closely with crash-stats web interface, that adapter ids 0x2e32 and 0x2a02 seem to be affected the most.
Awesome, thanks for that!

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #58)

> If it's that ID in general, that's surely an interesting find. Is there
> something we could do on that specifically?

Yeah, we can blacklist those devices so that they no longer get accelerated layers, or we could add a new blacklist type to specifically avoid this crash.

They're both pretty old cards, blacklisting them entirely doesn't seem like a big deal.
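To make that concrete, the proposed fix goes through the driver blocklist; conceptually it amounts to a device-id check like the sketch below before enabling accelerated layers (this is not the actual GfxInfo blocklist syntax, just the idea, using the two ids from the data above):

#include <cstdint>

// Intel G41 (0x2e32) and GM965/X3100 (0x2a02), the devices that spiked in 37.0b7.
static const uint32_t kBlockedIntelDeviceIds[] = { 0x2e32, 0x2a02 };

bool BlockAcceleratedLayersForDevice(uint32_t aVendorId, uint32_t aDeviceId)
{
  if (aVendorId != 0x8086) {  // 0x8086 is the Intel vendor id
    return false;
  }
  for (uint32_t blocked : kBlockedIntelDeviceIds) {
    if (aDeviceId == blocked) {
      return true;            // fall back to basic (or WARP) compositing
    }
  }
  return false;
}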
(In reply to Matt Woodrow (:mattwoodrow) from comment #65)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #58)
> > If it's that ID in general, that's surely an interesting find. Is there
> > something we could do on that specifically?
> 
> Yeah, we can blacklist those devices so that they no longer get accelerated
> layers, or we could add a new blacklist type to specifically avoid this
> crash.
> 
> They're both pretty old cards, blacklisting them entirely doesn't seem like
> a big deal.

Sounds like something to do for 38, then, I guess.
Is it possible to get data on driver versions for the two affected devices?

It would be nice to blacklist only a certain range of driver versions, rather than the entire device.
http://people.mozilla.org/~kgupta/bug/1116812/b7-devices-nosubsys.txt has driver versions as well. If you have a more specific query you'd like data for let me know and I can probably extract it.
Comment on attachment 8582839 [details] [diff] [review]
Blacklist the two devices that spiked with 37b7

Review of attachment 8582839 [details] [diff] [review]:
-----------------------------------------------------------------

I guess so. But let's see if we can figure out what's going on here.
Attachment #8582839 - Flags: review?(bas) → review+
Do you have any suggestions for how we might do that? Do we have any of the affected devices in the TO office?
Using the STR from bug 1145102 comment 4, we could cause a lot of types of crashes. CompositorD3D11::HandleError also seemed to happen on a Lenovo W530 and an Inspiron 5547.
Comment on attachment 8582839 [details] [diff] [review]
Blacklist the two devices that spiked with 37b7

Review of attachment 8582839 [details] [diff] [review]:
-----------------------------------------------------------------

Hrm, we should -really- only blacklist video for these devices longer term since everything else seems to be okay. Can you make sure we create a way for only blacklisting video?
https://hg.mozilla.org/mozilla-central/rev/2118109cc0e2
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
The checkin does not fix the complete bug but probably only fixes the additional spike we saw in 37.0b7.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #77)
> The checkin does not fix the complete bug but probably only fixes the
> additional spike we saw in 37.0b7.

Kairo, when will we have enough data to check this?
Flags: needinfo?(kairo)
(In reply to Milan Sreckovic [:milan] from comment #79)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #77)
> > The checkin does not fix the complete bug but probably only fixes the
> > additional spike we saw in 37.0b7.
> 
> Kairo, when will we have enough data to check this?

Probably once we have this patch on beta. Right now it only just merged to aurora, from what I can tell. That said, the signature will still be the #2 topcrash after this patch is on beta.
Flags: needinfo?(kairo)
Crash Signature: [@ mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] → [@ mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] [@ m…
Crash Signature: , mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozil... ] → , mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozil... ] [@ xul.dll@0xab50e6 | xul.dll@0xab3bc3]
Crash after right-clicking a link to a YT movie
https://crash-stats.mozilla.com/report/index/0d563e1a-13a0-4917-99db-5a7d12150409
Matt, can we have an uplift request to 38? Thanks
Flags: needinfo?(matt.woodrow)
Comment on attachment 8582839 [details] [diff] [review]
Blacklist the two devices that spiked with 37b7

Approval Request Comment
[Feature/regressing bug #]: HTML5 Video
[User impact if declined]: Crashes on some devices.
[Describe test coverage new/current, TreeHerder]: None
[Risks and why]: Low, simple blacklist change.
[String/UUID change made/needed]: None
Flags: needinfo?(matt.woodrow)
Attachment #8582839 - Flags: approval-mozilla-beta?
#8 crash for TB38.0b1, so this will also benefit Thunderbird
Whiteboard: [tbird crash]
Attachment #8582839 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Should be in 38 beta 4.
Time for my $0.02: I just hope the blacklist check-in is or was intended to be temporary and will be backed out before reaching release, though that would be exceptional. So is it permanent?
This is causing us to use WARP on these devices, which seems to cause us not to invalidate properly, giving us stale content.
"This" meaning the bug or the fix/blacklist?
Keywords: topcrash-win
Bas, this is the most important crash in 38. Could you help on this? Thanks
(In reply to Sylvestre Ledru [:sylvestre] from comment #90)
> Bas, this is the most important crash in 38. Could you help on this? Thanks

Are we talking about TDRs being the most important crash?

If so, is there a rise relative to 37?

As I've said in numerous places before, there appears to be a rise in TDRs lately, and there is a decent amount of evidence that this is related to video in a bunch of cases. I've raised bug 1157764 for this. In parallel, we're also trying to make TDRs survivable, but that is not something we will be able to uplift.
Flags: needinfo?(bas)
We're trying to reduce the number of TDRs in bug 1157764, which we're planning on uplifting up to 38.
Comment on attachment 8599029 [details] [diff] [review]
Consider DXGI_ERROR_INVALID_CALL a recoverable error for GetBuffer

Review of attachment 8599029 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1224,5 @@
> +  if (hr == DXGI_ERROR_INVALID_CALL) {
> +    // This happens on some GPUs/drivers when there's a TDR.
> +    gfxCriticalError() << "GetBuffer returned invalid call!";
> +    return;
> +  }

Does this cause us to reset the device or stop rendering?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #94)
> Comment on attachment 8599029 [details] [diff] [review]
> Consider DXGI_ERROR_INVALID_CALL a recoverable error for GetBuffer
> 
> Review of attachment 8599029 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/d3d11/CompositorD3D11.cpp
> @@ +1224,5 @@
> > +  if (hr == DXGI_ERROR_INVALID_CALL) {
> > +    // This happens on some GPUs/drivers when there's a TDR.
> > +    gfxCriticalError() << "GetBuffer returned invalid call!";
> > +    return;
> > +  }
> 
> Does this cause us to reset the device or stop rendering?

If DidRenderingDeviceReset returns an error as well, yes (which presumably, it does).
(In reply to Bas Schouten (:bas.schouten) from comment #95)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #94)
> > Comment on attachment 8599029 [details] [diff] [review]
> > Consider DXGI_ERROR_INVALID_CALL a recoverable error for GetBuffer
> > 
> > Review of attachment 8599029 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: gfx/layers/d3d11/CompositorD3D11.cpp
> > @@ +1224,5 @@
> > > +  if (hr == DXGI_ERROR_INVALID_CALL) {
> > > +    // This happens on some GPUs/drivers when there's a TDR.
> > > +    gfxCriticalError() << "GetBuffer returned invalid call!";
> > > +    return;
> > > +  }
> > 
> > Does this cause us to reset the device or stop rendering?
> 
> If DidRenderingDeviceReset returns an error as well, yes (which presumably,
> it does).

We should check DidRenderingDeviceReset here, and if we don't have a reset we should continue to crash.
Attachment #8599029 - Flags: review?(jmuizelaar) → review-
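For the record, the direction that review feedback points in looks roughly like this; the member names (mSwapChain, mDevice) and the Severity value are assumed for the sketch, reference handling is omitted, and this is not the landed patch verbatim:

// Inside CompositorD3D11::UpdateRenderTarget(), sketched per the review above.
ID3D11Texture2D* backBuffer = nullptr;
HRESULT hr = mSwapChain->GetBuffer(0, __uuidof(ID3D11Texture2D),
                                   reinterpret_cast<void**>(&backBuffer));
if (hr == DXGI_ERROR_INVALID_CALL) {
  // Some GPUs/drivers return this instead of DXGI_ERROR_DEVICE_REMOVED after a TDR.
  if (mDevice->GetDeviceRemovedReason() != S_OK) {
    // The device really was reset: treat the error as recoverable, skip this
    // frame and let the device-reset handling rebuild everything.
    gfxCriticalError() << "GetBuffer returned invalid call after a device reset.";
    return;
  }
  // No reset happened, so this is a genuine bug: fall through to the fatal path.
}
if (Failed(hr, Severity::Critical)) {  // enum value name assumed
  return;
}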
We need to stop TDRs; I'm not convinced we can recover from too many of these. This is bug 1157764.
Attachment #8599377 - Flags: review?(jmuizelaar) → review+
Bas, could you fill the uplift request to 38 ? I guess we want this... Thanks
Flags: needinfo?(bas)
Comment on attachment 8599377 [details] [diff] [review]
Consider DXGI_ERROR_INVALID_CALL a recoverable error for GetBuffer and make sure we check the correct device

High volume crash.  Another guard against it.
Attachment #8599377 - Flags: approval-mozilla-beta?
Attachment #8599377 - Flags: approval-mozilla-aurora?
Comment on attachment 8599377 [details] [diff] [review]
Consider DXGI_ERROR_INVALID_CALL a recoverable error for GetBuffer and make sure we check the correct device

[Triage Comment]
Should be in 38 RC1
Attachment #8599377 - Flags: approval-mozilla-release+
Attachment #8599377 - Flags: approval-mozilla-beta?
Attachment #8599377 - Flags: approval-mozilla-aurora?
Attachment #8599377 - Flags: approval-mozilla-aurora+
(In reply to Milan Sreckovic [:milan] from comment #100)
> Also, Bas:
> http://hg.mozilla.org/mozilla-central/annotate/caf25344f73e/gfx/layers/d3d11/
> TextureD3D11.cpp#l1028 may return device as nullptr, based on one of the
> FinalizeFrame crashes -
> https://crash-stats.mozilla.com/report/index/095c5cb7-0806-4c99-937c-
> 3100d2150428.  Something we should deal with?

Yes, we probably do.
Flags: needinfo?(bas)
Target Milestone: mozilla39 → ---
https://hg.mozilla.org/mozilla-central/rev/b7d29990d645
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
¡Hola Bas!

Is this really fixed on 39?

Please see below.

¡Gracias!

bp-f485d360-9ae9-4b6f-9f21-362502150606
	06/06/2015	05:44 p.m.

Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 	gfx/layers/d3d11/CompositorD3D11.cpp
3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/d3d11/CompositorD3D11.cpp
4 	xul.dll 	mozilla::layers::LayerManagerComposite::Render() 	gfx/layers/composite/LayerManagerComposite.cpp
5 	xul.dll 	mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
Flags: needinfo?(bas)
(In reply to alex_mayorga from comment #110)
> ¡Hola Bas!
> 
> Is this really fixed on 39?
> 
> Please see below.
> 
> ¡Gracias!
> 
> bp-f485d360-9ae9-4b6f-9f21-362502150606
> 	06/06/2015	05:44 p.m.
> 
> Crashing Thread
> Frame 	Module 	Signature 	Source
> 0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long,
> mozilla::layers::CompositorD3D11::Severity) 
> gfx/layers/d3d11/CompositorD3D11.cpp
> 1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long,
> mozilla::layers::CompositorD3D11::Severity) 
> gfx/layers/d3d11/CompositorD3D11.cpp
> 2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 
> gfx/layers/d3d11/CompositorD3D11.cpp
> 3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&,
> mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*,
> mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&,
> mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*,
> mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 
> gfx/layers/d3d11/CompositorD3D11.cpp
> 4 	xul.dll 	mozilla::layers::LayerManagerComposite::Render() 
> gfx/layers/composite/LayerManagerComposite.cpp
> 5 	xul.dll 	mozilla::layers::LayerManagerComposite::EndTransaction(void
> (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&,
> mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*,
> mozilla::layers::LayerManager::EndTransactionFlags) 
> gfx/layers/composite/LayerManagerComposite.cpp

It looks like you were running very low on memory there; low-memory situations can still trigger this crash.
Flags: needinfo?(bas)
Which may very well be tracked in bug 1172351.
¡Hola Bas!

This still happens on 40:

bp-3ad85bf4-1ed6-4872-99a0-04bd12150804
	04/08/2015	10:13 a.m.

Shall I file a new bug, reopen this one or just let this be?
Flags: needinfo?(bas)
(In reply to alex_mayorga from comment #113)
> ¡Hola Bas!
> 
> This still happens on 40:
> 
> bp-3ad85bf4-1ed6-4872-99a0-04bd12150804
> 	04/08/2015	10:13 a.m.
> 
> Shall I file a new bug, reopen this one or just let this be?

There should still be an open bug tracking the remaining crashes somewhere.
Flags: needinfo?(bas)