Crash in nvwgf2umx.dll | RtlAllocateMemoryBlockLookaside | ... (advapi32.dll)
Categories
(Core :: Graphics, defect, P2)
Tracking
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox-esr60 | --- | wontfix |
firefox59 | --- | unaffected |
firefox60 | --- | wontfix |
firefox61 | --- | wontfix |
firefox63 | --- | wontfix |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox66 | --- | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | fix-optional |
People
(Reporter: philipp, Unassigned)
References
Details
(Keywords: crash, regression, Whiteboard: [gfx-noted])
Crash Data
This bug was filed from the Socorro interface and is report bp-e2b60a52-6910-4ce5-bd93-b12bd0180319.
=============================================================
Content crashes with this signature are regressing in volume in the 60.0b cycle from users on Windows 7. This seems to be constrained to a number of device/driver configurations. Would there be a way to blocklist them, or should we reach out to NVIDIA?

Adapter driver version facet
Rank | Adapter driver version | Count | % |
---|---|---|---|
1 | 10.18.13.6510 | 61 | 30.50 % |
2 | 10.18.13.6472 | 37 | 18.50 % |
3 | 10.18.13.6175 | 31 | 15.50 % |
4 | 10.18.13.5900 | 21 | 10.50 % |
5 | 10.18.13.5921 | 18 | 9.00 % |
6 | 10.18.13.6519 | 13 | 6.50 % |
7 | 10.18.13.6451 | 8 | 4.00 % |
8 | 10.18.13.5850 | 6 | 3.00 % |
9 | 10.18.13.5582 | 4 | 2.00 % |
10 | 10.18.13.5906 | 1 | 0.50 % |

Adapter device id facet
Rank | Adapter device id | Count | % |
---|---|---|---|
1 | 0x128b | 77 | 38.50 % |
2 | 0x1380 | 41 | 20.50 % |
3 | 0x11c2 | 21 | 10.50 % |
4 | 0x1402 | 19 | 9.50 % |
5 | 0x11c0 | 13 | 6.50 % |
6 | 0x1187 | 12 | 6.00 % |
7 | 0x0f02 | 10 | 5.00 % |
8 | 0x1287 | 4 | 2.00 % |
9 | 0x104a | 3 | 1.50 % |
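To make the blocklisting question concrete, here is a minimal sketch of how a four-part Windows driver version could be parsed and tested against a version range. The names `DriverVersion`, `ParseDriverVersion` and `IsInBlockedRange` are hypothetical and are not Firefox's actual blocklist API. The bounds used in the usage note are simply the lowest and highest versions in the facet above, not a vetted range.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Four-part Windows driver version, e.g. 10.18.13.6510.
// Hypothetical type for illustration only.
using DriverVersion = std::array<std::uint32_t, 4>;

// Parses "a.b.c.d" into `out`. Returns false (leaving `out` untouched)
// if the text does not have exactly four numeric components.
bool ParseDriverVersion(const std::string& text, DriverVersion& out) {
  DriverVersion parsed{};
  std::istringstream stream(text);
  std::string part;
  std::size_t index = 0;
  while (std::getline(stream, part, '.')) {
    // Reject extra components, empty components, and values that
    // could overflow a 32-bit integer.
    if (index >= parsed.size() || part.empty() || part.size() > 9) {
      return false;
    }
    for (char c : part) {
      if (c < '0' || c > '9') return false;
    }
    parsed[index++] = static_cast<std::uint32_t>(std::stoul(part));
  }
  if (index != parsed.size()) return false;
  out = parsed;
  return true;
}

// Hypothetical policy check: true when `version` lies in [first, last].
// std::array compares lexicographically, which matches how dotted
// version numbers are ordered.
bool IsInBlockedRange(const DriverVersion& version,
                      const DriverVersion& first,
                      const DriverVersion& last) {
  return first <= version && version <= last;
}
```

With bounds of 10.18.13.5582 and 10.18.13.6519, every driver version in the facet falls inside the range, while a later build such as 10.18.13.6520 does not. Whether such a range would be too broad is a separate question that this sketch does not answer.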
Comment 1•6 years ago
This looks like 60 only, but low volume. Let's wait and see.
Updated•6 years ago
Still small number of installs, low numbers in B13, maybe gone?
Comment 3•6 years ago
Milan: Verify that it's no longer showing up?
Comment 4•6 years ago
(In reply to Marion Daly [:mdaly] from comment #3)
> Milan: Verify that it's no longer showing up?

Still present, very low user count, high volume for those users.
Comment 5•6 years ago
Adding 63 and 64 as affected.
Comment 6•6 years ago
The number of crashes and users affected has been going up since we shipped 63. Maybe this should be reprioritized as a P2 instead of a P3.
Comment 7•6 years ago
This has been scaling up on Windows on release even prior to 63.
Comment 8•6 years ago
Jeff, any thoughts on who could look at this? (Normally I'd ask Bas)
Comment 9•6 years ago
Not really. The stacks are pretty unhelpful. It might be best to ask nvidia on mozilla-nvidia discuss.
Comment 11•6 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #9)
> Not really. The stacks are pretty unhelpful. It might be best to ask nvidia
> on mozilla-nvidia discuss.

Agreed. Jeff, could you kick that off?
Comment 15•5 years ago
Any suggestions on anything else that could be done for investigation?
Comment 16•5 years ago
The most productive thing is probably to continue to try to get in touch with Nvidia. I'll try harder.
Comment 17•5 years ago
Thanks for the report. This is NV bug number 2446669. I'll look into this.
Comment 18•5 years ago
I am emailing 6 users who submitted crash reports including their email addresses with one of these signatures. If anyone replies with permission to share the crash dumps, then Jeff or I can provide them, in accordance with Mozilla's data protection policies.
Comment 19•5 years ago
Jeff, just noting here that we have permission from the user in email to pass on the dump from this crash to NVIDIA: https://crash-stats.mozilla.com/signature/?email=%3Ddungeonus%40ukr.net&product=Firefox&signature=nvwgf2umx.dll%20%7C%20VirtualQuery&date=%3E%3D2018-07-16T06%3A36%3A32.000Z&date=%3C2019-01-16T04%3A36%3A32.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_sort=-date&page=1#reports
Comment 20•5 years ago
Though looking at it I'm not sure how useful that's going to be!
Comment 22•5 years ago
This particular minidump used driver version 341.44, driver date "2-3-2015", on a GeForce GTX 260.
The latest "supported" driver for the card is version 342.01 WHQL, release date 2016.12.14.
The other crash reports show drivers from newer branches, but from around the same era. Example: one crashed with 364.72, which is from 2016/H1.
I tried Win7 + 342.01 + GT210 to repro this. The callstack looks like initialization code and IIRC the reports indicate the process lives only a few seconds. For the repro I tried to just open and close the app and open multiple windows. I did not succeed. In theory 342.01 could have fixed this, but I doubt that's the case. A more probable explanation is that this is more transient and/or needs a more specific repro scenario. I'll try the exact 341.44 later.
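For cross-referencing the NVIDIA release numbers quoted here (341.44, 364.72) with the four-part versions in the crash facets, here is a small sketch of the commonly cited conversion. It takes the last digit of the third component plus the four digits of the fourth component, then puts a decimal point before the last two digits. This is an assumption based on convention rather than an official API, and the function name `NvidiaReleaseFromWindowsVersion` is hypothetical. It is at least consistent with this bug, since 10.18.13.6472 in the facet list maps to the 364.72 mentioned above.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Converts the third and fourth components of a Windows driver version
// (e.g. 13 and 6472 from 10.18.13.6472) into an NVIDIA-style release
// string (e.g. "364.72").
// Assumption: relies on the conventional mapping, not a documented API.
// Returns an empty string if the fourth component has more than 4 digits.
std::string NvidiaReleaseFromWindowsVersion(std::uint32_t third,
                                            std::uint32_t fourth) {
  if (fourth > 9999) return std::string();
  // Last digit of the third component followed by the fourth component,
  // e.g. 3 and 6472 -> 36472.
  const std::uint32_t digits = (third % 10) * 10000 + fourth;
  char buffer[16];
  std::snprintf(buffer, sizeof(buffer), "%u.%02u",
                static_cast<unsigned>(digits / 100),
                static_cast<unsigned>(digits % 100));
  return std::string(buffer);
}
```

Under this convention, the top facet entry 10.18.13.6510 would correspond to release 365.10, which also places it in the 2016 timeframe.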
The main thread crashes at
xul.dll!mozilla::gfx::DoesTextureSharingWorkInternal(ID3D11Device * device, DXGI_FORMAT format, unsigned int bindflags) Line 253 C++
xul.dll!mozilla::gfx::DeviceManagerDx::CreateCompositorDevice(mozilla::gfx::FeatureState & d3d11) Line 495 C++
xul.dll!mozilla::gfx::DeviceManagerDx::CreateCompositorDevices() Line 180 C++
xul.dll!gfxWindowsPlatform::InitializeD3D11() Line 1522 C++
xul.dll!gfxWindowsPlatform::InitializeDevices() Line 1497 C++
xul.dll!gfxWindowsPlatform::HandleDeviceReset() Line 431 C++
xul.dll!gfxWindowsPlatform::UpdateRenderMode() Line 486 C++
xul.dll!nsWindow::OnPaint(HDC__ * aDC, unsigned int aNestingLevel) Line 181 C++
xul.dll!nsWindow::ProcessMessage(unsigned int msg, unsigned __int64 & wParam, __int64 & lParam, __int64 * aRetValue) Line 5576 C++
xul.dll!nsWindow::WindowProcInternal(HWND * hWnd, unsigned int msg, unsigned __int64 wParam, __int64 lParam) Line 5034 C++
xul.dll!nsWindow::WindowProc(HWND * hWnd, unsigned int msg, unsigned __int64 wParam, __int64 lParam) Line 4986 C++
[External Code]
xul.dll!nsAppShell::ProcessNextNativeEvent(bool mayWait) Line 569 C++
xul.dll!nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal * thr, bool mayWait) Line 273 C++
xul.dll!nsThread::ProcessNextEvent(bool aMayWait, bool * aResult) Line 1160 C++
xul.dll!NS_ProcessNextEvent(nsIThread * aThread, bool aMayWait) Line 530 C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate) Line 97 C++
xul.dll!MessageLoop::RunHandler() Line 319 C++
xul.dll!MessageLoop::Run() Line 299 C++
xul.dll!nsBaseAppShell::Run() Line 160 C++
xul.dll!nsAppShell::Run() Line 420 C++
xul.dll!nsAppStartup::Run() Line 291 C++
xul.dll!XREMain::XRE_mainRun() Line 4777 C++
xul.dll!XREMain::XRE_main(int argc, char * * argv, const mozilla::BootstrapConfig & aConfig) Line 4922 C++
xul.dll!XRE_main(int argc, char * * argv, const mozilla::BootstrapConfig & aConfig) Line 5014 C++
The last moz point in the call stack causes some sort of d3d11 driver flush. The flush drains the work in the driver worker thread, and some bug in the driver code then crashes the process. I got the driver callstacks and did some internal bug queries. Unfortunately the queries did not help: I cannot pinpoint which operation would trigger the bug, nor what the bug is.
If the other minidumps indicate the same crash point, then one workaround could be blacklisting D3D11 sharing on these old drivers. My knowledge doesn't extend to the driver internals, so I'm not sure how realistic this is.
If checking the other minidumps is a fast operation, then that could be one avenue of investigation.
The driver call stack is related to texture uploads, if that makes sense. I'm not sure if creating a d3d11 keyedmutex sharetexture internally needs to upload something. If this is part of gfx system initialization, I'm not sure how many other textures have been created. Of course, if this is done per window, then most likely there could already be multiple textures in flight from other windows.
Note: I could not find the main thread in the reports shown in the crash report web UI.
Comment 24•5 years ago
Hey Jeff - is this actionable + important?
Comment 25•5 years ago
It seems like our symbol server is not serving up binaries any more. That makes analyzing this a bit of a pain.
Comment 26•2 years ago
Closing because no crashes reported for 12 weeks.