Closed Bug 830531 Opened 12 years ago Closed 11 months ago

[Win8] crash in XPC_WN_Helper_NewResolve mainly with AMD Radeon HD 6290/6310/6320/7290/7310/7340 (Wrestler Asic)

Categories

(Core :: JavaScript Engine, defect)

19 Branch
x86
Windows 8
defect

RESOLVED INCOMPLETE
Tracking Status
firefox19 + affected
firefox20 + affected

People

(Reporter: scoobidiver, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, Whiteboard: [Win8][qa-not-actionable])

Attachments

(3 files)

It first showed up in 20.0a2/20130113 and is currently #8 top browser crasher in Aurora (high for a crash specific to Windows 8).

The regression range is:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=4f74542c3678&tochange=cf2ccc84268f

The stack trace usually looks like:

Frame Module Signature Source
0 KERNELBASE.dll TlsGetValue
1 @0x3757fe04

But there are a few crashes with a better one:

Signature: TlsGetValue
UUID: aa5e3fda-7925-40d1-a3f1-bbe232130114
Date Processed: 2013-01-14 22:14:41
Uptime: 21449
Last Crash: 7.8 hours before submission
Install Age: 8.2 hours since version was first installed.
Install Time: 2013-01-14 14:01:27
Product: Firefox
Version: 20.0a2
Build ID: 20130113042017
Release Channel: aurora
OS: Windows NT
OS Version: 6.2.9200
Build Architecture: x86
Build Architecture Info: AuthenticAMD family 20 model 2 stepping 0
Crash Reason: EXCEPTION_BREAKPOINT
Crash Address: 0x7501be04
App Notes: AdapterVendorID: 0x1002, AdapterDeviceID: 0x9809, AdapterSubsysID: 00000000, AdapterDriverVersion: 8.982.7.0 D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers+
EMCheckCompatibility: True
Adapter Vendor ID: 0x1002
Adapter Device ID: 0x9809
Total Virtual Memory: 4294836224
Available Virtual Memory: 3831201792
System Memory Use Percentage: 51
Available Page File: 2244403200
Available Physical Memory: 1862819840

Frame Module Signature Source
0 KERNELBASE.dll TlsGetValue
1 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:391
2 mozjs.dll js::Invoke js/src/jsinterp.cpp:439
3 mozjs.dll js::GetPropertyOperation js/src/jsinterpinlines.h:279
4 mozjs.dll js::Interpret js/src/jsinterp.cpp:2235
5 mozjs.dll js::RunScript js/src/jsinterp.cpp:348
6 mozjs.dll UncachedInlineCall js/src/methodjit/InvokeHelpers.cpp:372
7 mozjs.dll js::mjit::stubs::UncachedCallHelper js/src/methodjit/InvokeHelpers.cpp:460
8 mozjs.dll js::mjit::CallCompiler::update js/src/methodjit/MonoIC.cpp:1236
9 mozjs.dll js::mjit::ic::Call js/src/methodjit/MonoIC.cpp:1317
10 mozjs.dll js::mjit::JaegerShot js/src/methodjit/MethodJIT.cpp:1117
11 mozjs.dll js::Interpret js/src/jsinterp.cpp:2419
12 mozjs.dll JS_DHashTableOperate js/src/jsdhash.cpp:581
13 mozjs.dll js::InvokeKernel js/src/jsinterp.cpp:406
14 mozjs.dll js::Invoke js/src/jsinterp.cpp:439
15 mozjs.dll JS_CallFunctionValue js/src/jsapi.cpp:5805
16 xul.dll nsXPCWrappedJSClass::CallMethod js/xpconnect/src/XPCWrappedJSClass.cpp:1432
17 xul.dll nsXPCWrappedJS::CallMethod js/xpconnect/src/XPCWrappedJS.cpp:580
18 xul.dll PrepareAndDispatch xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:85
19 xul.dll SharedStub xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:112
20 xul.dll nsBrowserStatusFilter::OnStateChange toolkit/components/statusfilter/nsBrowserStatusFilter.cpp:150
21 xul.dll nsDocLoader::DoFireOnStateChange uriloader/base/nsDocLoader.cpp:1305
22 xul.dll nsDocLoader::doStopDocumentLoad uriloader/base/nsDocLoader.cpp:896
23 xul.dll nsDocLoader::DocLoaderIsEmpty uriloader/base/nsDocLoader.cpp:775
24 xul.dll nsDocLoader::OnStopRequest uriloader/base/nsDocLoader.cpp:659
25 xul.dll nsLoadGroup::RemoveRequest netwerk/base/src/nsLoadGroup.cpp:676
26 xul.dll nsLoadGroup::QueryInterface netwerk/base/src/nsLoadGroup.cpp:155
27 xul.dll nsDocument::UnblockOnload content/base/src/nsDocument.cpp:7322
28 xul.dll nsRunnableMethodImpl<void obj-firefox/dist/include/nsThreadUtils.h:367
29 nspr4.dll _MD_CURRENT_THREAD nsprpub/pr/src/md/windows/w95thred.c:312
30 nspr4.dll PR_Unlock nsprpub/pr/src/threads/combined/prulock.c:315
31 xul.dll nsTimerImpl::Cancel xpcom/threads/nsTimerImpl.cpp:337
32 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:208
33 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:182
34 xul.dll nsBaseAppShell::Run widget/xpwidgets/nsBaseAppShell.cpp:163
35 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:232
36 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:288
37 xul.dll XREMain::XRE_mainRun toolkit/xre/nsAppRunner.cpp:3823
38 xul.dll XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:3890
39 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:4093

More reports at: https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A20.0a2&signature=TlsGetValue
URLs:
9 https://www.facebook.com/
4 https://www.facebook.com/messages/mozhgan.golzarian
3 https://www.facebook.com/marissa.oneal2012/friends?ft_ref=mni
2 about:blank
2 https://www.facebook.com/cher0308?ref=tn_tnmn
2 http://forum.paradoxplaza.com/forum/showthread.php?657612-A-Federation-of-quot-E
2 http://forum.paradoxplaza.com/forum/showthread.php?657612-A-Federation-of-quot-E
...and a longer list of random pages with a single hit. There doesn't seem to be any notable connection with URLs here.

Correlations:
Modules:
100% (51/51) vs. 9% (180/1920) bcryptPrimitives.dll
100% (51/51) vs. 10% (184/1920) WINMMBASE.dll
100% (51/51) vs. 10% (184/1920) combase.dll
100% (51/51) vs. 10% (184/1920) SHCore.dll
98% (50/51) vs. 11% (217/1920) winhttp.dll
98% (50/51) vs. 13% (246/1920) aticfx32.dll
92% (47/51) vs. 7% (141/1920) atiu9pag.dll
92% (47/51) vs. 8% (146/1920) atiumdva.dll
92% (47/51) vs. 8% (159/1920) atiumdag.dll
(nothing that interesting in add-on correlations, though)
Keywords: needURLs
There are no crashes in 20.0a2/20130114 and above so it's likely caused by two connected patches landed in Aurora with one day of delay. Let's wait a few days to confirm it's gone.
It was a one-day spike.
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: steps-wanted
Resolution: --- → WORKSFORME
Back in 20.0a2/20130117. It seems PGO related.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
David/Naveed - I know how much you love spikey PGO-related crashes :) Is there any hardening or disabling of PGO around this code that we can perform?
It's currently #77 top browser crasher in 20.0a2.
Keywords: topcrash
It spiked again in 20.0a2/20130207.
is firefox 19 really unaffected?
(In reply to philipp from comment #9) > is firefox 19 really unaffected? It's indeed #1 top browser crasher in 19.0 while it was a very low volume crash in 19.0b6. Based on comments, Firefox 19 is unusable on Windows 8.
Severity: critical → blocker
Crash Signature: [@ TlsGetValue] → [@ TlsGetValue] [@ InterlockedIncrement]
Summary: [Win8] crash in TlsGetValue → [Win8] crash in XPC_WN_Helper_NewResolve
Keywords: topcrash
Version: 20 Branch → 19 Branch
first support question is coming in - any useful information we can gather or troubleshooting to suggest? https://support.mozilla.org/en-US/questions/950825
(In reply to philipp from comment #14) > any useful information we can gather or troubleshooting to suggest? The workaround is to downgrade to 18.0.2. With combined signatures, it accounts for 57% of all crashes. More reports also at: https://crash-stats.mozilla.com/report/list?signature=XPC_WN_Helper_NewResolve
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve]
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x2b]
(In reply to Scoobidiver from comment #11) > (In reply to philipp from comment #9) > > is firefox 19 really unaffected? > It's indeed #1 top browser crasher in 19.0 while it was a very low volume > crash in 19.0b6. > Based on comments, Firefox 19 is unusable on Windows 8. Unusable on Win8 for affected users. Let's hold judgement until we get some data around # of unique affected users. Scoobidiver/KaiRo - are there any high correlations for this crash signature? We'll ask QA to test Win8 until we have more leads
Flags: needinfo?(kairo)
It's hard to tell, but this looks like an xpconnect problem. The crash happens when constructing an XPCCallContext. My guess is that it calls nsXPConnect::GetXPConnect(), which calls NS_IsMainThread(), and that calls this: http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadUtils.cpp#129 However, it's a bit hard to know if that's where the crash is happening. The code that's triggering this is a release-mode assertion. However, it looks like it's not the assertion itself that's crashing us--it's the act of checking the assertion condition. Also, the fact that it's sporadic across versions suggests that it's not an assertion firing. To relieve the crash, we could try removing the assertion. It might just cause us to crash elsewhere, but it might fix the problem.
Taking QA Contact to help coordinate any testing necessary.
(In reply to Bill McCloskey (:billm) from comment #17)
> It's hard to tell, but this looks like an xpconnect problem. The crash
> happens when constructing an XPCCallContext. My guess is that it calls
> nsXPConnect::GetXPConnect(), which calls NS_IsMainThread(), and that calls
> this:
>
> http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsThreadUtils.cpp#129
>
> However, it's a bit hard to know if that's where the crash is happening.
>
> The code that's triggering this is a release-mode assertion. However, it
> looks like it's not the assertion itself that's crashing us--it's the act of
> checking the assertion condition. Also, the fact that it's sporadic across
> versions suggests that it's not an assertion firing.
>
> To relieve the crash, we could try removing the assertion. It might just
> cause us to crash elsewhere, but it might fix the problem.

Is it possible to glean anything about why TlsGetValue is crashing? If we could tell that the segment registers are bogus that would be big. If we can't, maybe the TLS is corrupt or something but that'd be harder to tell without a full crash dump.
So far I tested on two different Windows 8 machines and have not yet been able to reproduce the crash. If you need the machine specs I can provide them.
Something weird is going on, and I don't know what to make of it.

(1) If you look at the second thread in these crash reports - either on crash-stats or in Visual Studio - it's breakpad. I've never seen this before. Is that normal? Could it be that breakpad is either masking the fault with another one, or somehow misreporting or double-crashing?

To make things weirder, the crashing thread is totally unreadable to Visual Studio. WinDbg seems to be okay.

(2) The crash as seen by WinDbg is:

> 0:000> .ecxr
> eax=00000022 ebx=05dc5400 ecx=00000000 edx=00000022 esi=768c5895 edi=2d6284e0
> eip=746abe04 esp=00cdc744 ebp=00cdc758 iopl=0 nv up ei pl nz na po nc
> cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00200202
> KERNELBASE!TlsGetValue+0x4:
> 746abe04 ec in al,dx
>
> 0:000> u 746abe00
> KERNELBASE!TlsGetValue:
> 746abe00 8bff mov edi,edi
> 746abe02 55 push ebp
> 746abe03 8bec mov ebp,esp
> 746abe05 648b0d18000000 mov ecx,dword ptr fs:[18h]

So, if the breakpad information is correct, the main thread jumped to some random address inside TlsGetValue.

Again, I don't know what to make of this. Next steps might be (1) seeing what's going on with breakpad or (2) seeing if there's a correlation to binary addons.
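To make the "jumped to some random address" reading concrete, here is a minimal sketch that checks where the reported EIP falls relative to the instruction boundaries in the WinDbg listing above. The addresses and bytes are copied from that listing; the helper names `locate` and `byte_at` are made up for illustration and are not part of any debugger API.

```python
# Prologue of KERNELBASE!TlsGetValue as shown in the WinDbg output above:
# (start address, instruction bytes as hex, mnemonic).
LISTING = [
    (0x746ABE00, "8bff", "mov edi,edi"),
    (0x746ABE02, "55", "push ebp"),
    (0x746ABE03, "8bec", "mov ebp,esp"),
    (0x746ABE05, "648b0d18000000", "mov ecx,dword ptr fs:[18h]"),
]


def locate(eip, listing):
    """Return (mnemonic, byte offset inside that instruction), or None."""
    for start, hex_bytes, text in listing:
        length = len(hex_bytes) // 2
        if start <= eip < start + length:
            return text, eip - start
    return None


def byte_at(eip, listing):
    """Return the single byte stored at eip, or None if outside the listing."""
    for start, hex_bytes, _ in listing:
        raw = bytes.fromhex(hex_bytes)
        if start <= eip < start + len(raw):
            return raw[eip - start]
    return None


EIP = 0x746ABE04  # eip from the .ecxr output
mnemonic, offset = locate(EIP, LISTING)
print(f"eip is {offset} byte(s) into '{mnemonic}', byte there is {byte_at(EIP, LISTING):#04x}")
```

An offset of 0 would mean EIP sits on an instruction boundary. Here it is 1 byte into `mov ebp,esp`, and the byte at that address is 0xEC, which on its own decodes as `in al,dx`. That matches what WinDbg printed for TlsGetValue+0x4.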
CCing ted and bsmedberg. Maybe they can figure out what's going on here.
CCing Ioana so she can keep Softvision informed of what's going on here. Ioana, depending on the state of this bug when you get online later tonight, please see if your team can find steps to reproduce.
QA Contact: anthony.s.hughes
(In reply to Bill McCloskey (:billm) from comment #17) > To relieve the crash, we could try removing the assertion. It might just > cause us to crash elsewhere, but it might fix the problem. The release-mode assertion that XPConnect is always used on the main thread is an important one in terms of keeping addons honest. I'd be pretty unhappy about removing it.
Top URLs for TlsGetValue:
127 about:blank
105 https://www.facebook.com/
57 http://www.facebook.com/
38 about:sessionrestore
21 about:newtab
21 about:home
18 https://mail.google.com/mail/?shva=1#inbox
16 https://mail.google.com/mail/u/0/?shva=1#inbox
16 http://www.facebook.com/?ref=tn_tnmn

And here's a breakdown of installations:

breakpad=> SELECT version,
                  COUNT(*) as crashes,
                  COUNT(DISTINCT client_crash_date - install_age * interval '1 second') as installations
           FROM reports
           WHERE product='Firefox'
             AND signature='TlsGetValue'
             AND utc_day_is(date_processed, '2013-02-19')
           GROUP BY version;
  version   | crashes | installations
------------+---------+---------------
 10.0.12esr |       1 |             1
 17.0.1     |       1 |             1
 18.0.1     |       1 |             1
 18.0.2     |      29 |            29
 19.0       |    2479 |          1792
 3.0b1      |       1 |             1
 3.6        |       2 |             2
 3.6.2      |       1 |             1
 4.0b4      |       1 |             1
 5.0        |       1 |             1
 9.0        |       1 |             1
(11 rows)
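The query above uses "crash time minus install age" as a stand-in for a unique installation. A rough sketch of the same idea outside the database follows; the three sample reports are invented for illustration and are not real crash data, and `count_installations` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def count_installations(reports):
    """Return (crash count, distinct estimated install times).

    Each report is (client_crash_date, install_age_in_seconds), mirroring the
    two columns the SQL query combines.
    """
    install_times = {
        crash_time - timedelta(seconds=install_age)
        for crash_time, install_age in reports
    }
    return len(reports), len(install_times)


# Hypothetical sample: the first two crashes resolve to the same install time.
sample = [
    (datetime(2013, 2, 19, 10, 0, 0), 3600),   # installed 09:00:00
    (datetime(2013, 2, 19, 12, 30, 0), 12600), # installed 09:00:00
    (datetime(2013, 2, 19, 11, 0, 0), 600),    # installed 10:50:00
]
crashes, installations = count_installations(sample)
print(crashes, installations)
```

Two different machines installed in the same second would be merged by this estimate, so the installation count is a lower bound rather than an exact user count.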
Flags: needinfo?(kairo)
(In reply to David Anderson [:dvander] from comment #21) > So, if the breakpad information is correct, the main thread jumped to some > random address inside TlsGetValue. "jumped to some random address" reminds me of bug 839270 - do we know what graphics card this is happening on?
(In reply to David Anderson [:dvander] from comment #21)
> Something weird is going on, and I don't know what to make of it.
>
> (1) If you look at the second thread in these crash reports - either on
> crash-stats or in Visual Studio - it's breakpad. I've never seen this
> before. Is that normal? Could it be that breakpad is either masking the
> fault with another one, or somehow misreporting or double-crashing?

Do you have an example report that shows this? I clicked through a few reports and didn't see what you were talking about. Note that the minidumps always include the Breakpad thread that does the dump writing (the minidump-writing code includes all threads), but Breakpad includes a special stream with the thread ID of that thread so it knows to skip it while printing stack traces. It's possible that information is simply missing. If you can show me an example I can tell for sure.

> To make things weirder, the crashing thread is totally unreadable to Visual
> Studio. WinDbg seems to be okay.
>
> (2) The crash as seen by WinDbg is:
>
> > 0:000> .ecxr
> > eax=00000022 ebx=05dc5400 ecx=00000000 edx=00000022 esi=768c5895 edi=2d6284e0
> > eip=746abe04 esp=00cdc744 ebp=00cdc758 iopl=0 nv up ei pl nz na po nc
> > cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00200202
> > KERNELBASE!TlsGetValue+0x4:
> > 746abe04 ec in al,dx
> >
> > 0:000> u 746abe00
> > KERNELBASE!TlsGetValue:
> > 746abe00 8bff mov edi,edi
> > 746abe02 55 push ebp
> > 746abe03 8bec mov ebp,esp
> > 746abe05 648b0d18000000 mov ecx,dword ptr fs:[18h]
>
> So, if the breakpad information is correct, the main thread jumped to some
> random address inside TlsGetValue.

Generally the exception record is pretty reliable. This comes directly from Windows' EXCEPTION_POINTERS data, so I tend to believe it. However, that doesn't mean that anything following it is reliable--obviously the register state could be corrupted in myriad ways.

Unfortunately it's really hard to figure out the root cause after-the-fact when something like that happens. I've downloaded a few dumps, I'll take a look at them tomorrow.
Depends on: 842855
https://bugzilla.mozilla.org/show_bug.cgi?id=842855#c4 has the reasoning for temporarily disabling updates for Win8 users tomorrow (morning?) instead of today. I'd like us to get more data.
Most of the individual reports I looked at had AMD. (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #26) > (In reply to David Anderson [:dvander] from comment #21) > > So, if the breakpad information is correct, the main thread jumped to some > > random address inside TlsGetValue. > > "jumped to some random address" reminds me of bug 839270 - do we know what > graphics card this is happening on?
No longer depends on: 842855
Depends on: 842855
Cc:ing Andrew who might have some insight here. If anyone needs access to crash dumps here, I'm happy to help...
(In reply to Marcia Knous [:marcia] from comment #29) > Most of the individual reports I looked at had AMD. It's the right lead to follow. It's even restricted to the following device IDs: 0x9802, 0x9806, 0x9807, 0x9808, 0x9809, 0x980a (see http://developer.amd.com/resources/hardware-drivers/ati-catalyst-pc-vendor-id-1002-li/ for the matching GPUs). It's a kind of bug 839270 but in the XPConnect component and with more device IDs. D2D and D9D are disabled because of bug 840161.
Summary: [Win8] crash in XPC_WN_Helper_NewResolve → [Win8] crash in XPC_WN_Helper_NewResolve mainly with AMD Radeon HD 6290/6310/6320/7290/7310/7340
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x2b] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x2b] [@ nsXPConnect::GetXPConnect()]
FWIW, I dug hard into one of these crashes: https://crash-stats.mozilla.com/report/index/d1805cc2-9606-4def-8ecb-64ec92130219

The crash is happening under this line: http://hg.mozilla.org/releases/mozilla-release/file/20238b786063/js/xpconnect/src/nsXPConnect.cpp#l139

In the debugger, the inlined call to TlsGetValue looks like this:

136: // Do a release-mode assert that we're not doing anything significant in
137: // XPConnect off the main thread. If you're an extension developer hitting
138: // this, you need to change your code. See bug 716167.
139: if (!MOZ_LIKELY(NS_IsMainThread() || NS_IsCycleCollectorThread()))
03401857 mov eax,dword ptr ds:[10E63428h]
0340185C push esi
0340185D mov esi,dword ptr ds:[109BF4E8h]
03401863 push eax
03401864 call esi <-- crash under this call

The disassembled version of this looks like this:

10171857: A1 28 34 0B 11 mov eax,dword ptr [?gTLSThreadIDIndex@@3KA]
1017185C: 56 push esi
1017185D: 8B 35 E8 F4 C0 10 mov esi,dword ptr [__imp__TlsGetValue@4]
10171863: 50 push eax
10171864: FF D6 call esi

The correct relocated value is 0x03e9f4e8 __imp__TlsGetValue@4

So, just like the prior bugs, we appear to have an incorrect relocation or a 2-byte memory corruption. This is not a code bug.

I would *love* to find a person who can reproduce this and set up some kind of debugging mechanism to actually watch the relocation/corruption happen.
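The relocation arithmetic behind that conclusion can be worked through from the numbers quoted in this comment alone. The sketch below assumes the code address in the debugger (0x03401857) and in the on-disk disassembly (0x10171857) refer to the same instruction, so their difference is the relocation delta. The variable names are illustrative, not taken from any tool.

```python
def le32(hex_bytes):
    """Decode a 4-byte little-endian operand given as a hex string."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


# Same instruction, seen on disk and in the crashed process.
FILE_CODE_ADDR = 0x10171857
LIVE_CODE_ADDR = 0x03401857
delta = FILE_CODE_ADDR - LIVE_CODE_ADDR

# Operand of "8B 35 E8 F4 C0 10" (mov esi, [__imp__TlsGetValue@4]) on disk.
file_import = le32("E8F4C010")
expected_import = file_import - delta  # where a correct relocation should point
observed_import = 0x109BF4E8           # what the debugger actually showed

# XOR exposes which bytes differ between expected and observed.
diff = observed_import ^ expected_import

print(f"delta           = {delta:#010x}")
print(f"expected import = {expected_import:#010x}")
print(f"observed import = {observed_import:#010x}")
print(f"differing bits  = {diff:#010x}")
```

The expected value comes out as 0x03e9f4e8, matching the "correct relocated value" stated above, and the difference is confined to the upper two bytes of the operand. That is consistent with either a wrong relocation or a 2-byte corruption; this arithmetic alone cannot tell the two apart.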
Tried to hunt this bug looking at the comments from crash reports but with no success. Upgraded from 18.0.2 to 19.0, played with gmail, facebook, yahoo, youtube, pdf. Using Windows 8 x32, Firefox 19.0 RC, with AMD Radeon HD 6450 (this is the only GPU we have here that is close to the ones related to the summary of the bug).
I have this problem on Windows 8 x64 with AMD HD6320
Win 8 x86, AMD Radeon HD 7700 Series

I updated from Firefox 18.0.2 to 19.0 and used facebook, google services, about:newtab, session restore, etc with HWA both disabled and enabled, but I haven't encountered any crash.
So, here are the correlations:

Modules:
100% (1732/1732) vs. 9% (5527/61447) bcryptPrimitives.dll
100% (1732/1732) vs. 9% (5605/61447) SHCore.dll
100% (1732/1732) vs. 9% (5617/61447) WINMMBASE.dll
100% (1732/1732) vs. 9% (5617/61447) combase.dll
94% (1625/1732) vs. 9% (5433/61447) winhttp.dll
85% (1473/1732) vs. 7% (4567/61447) aticfx32.dll
80% (1383/1732) vs. 6% (3487/61447) atiu9pag.dll
80% (1383/1732) vs. 6% (3692/61447) atiumdva.dll
80% (1383/1732) vs. 6% (3757/61447) atiumdag.dll
96% (1670/1732) vs. 24% (14933/61447) explorerframe.dll
96% (1670/1732) vs. 24% (14943/61447) dui70.dll
99% (1709/1732) vs. 27% (16493/61447) NapiNSP.dll
99% (1708/1732) vs. 27% (16493/61447) pnrpnsp.dll
99% (1709/1732) vs. 27% (16567/61447) nlaapi.dll
100% (1729/1732) vs. 28% (17344/61447) DWrite.dll
100% (1728/1732) vs. 28% (17313/61447) cryptsp.dll
96% (1670/1732) vs. 25% (15179/61447) duser.dll
96% (1656/1732) vs. 26% (16102/61447) FWPUCLNT.DLL
100% (1732/1732) vs. 43% (26508/61447) sspicli.dll
80% (1383/1732) vs. 26% (16076/61447) d3d9.dll
98% (1698/1732) vs. 45% (27577/61447) ntmarta.dll

Nothing interesting in Add-ons.

Cores:
100% (1728/1732) vs. 58% (35899/61447) x86 with 2 cores

So, this is mostly ATI-graphics machines (as ati*.dll are the drivers for those), as we know, and it's all (!) dual-core machines - no single-core, no more-than-dual-core.
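For readers unfamiliar with this report format, each row compares the share of crashes with this signature that have a module loaded against the share of all crashes that have it. A small sketch of how one row can be recomputed and turned into a ratio follows. The counts are taken from the rows above; `share` and `lift` are illustrative helper names, not crash-stats terminology.

```python
def share(hits, total):
    """Percentage of reports in a group that contain the module."""
    return 100.0 * hits / total


def lift(sig_hits, sig_total, all_hits, all_total):
    """How many times more common the module is in this signature's crashes."""
    return (sig_hits / sig_total) / (all_hits / all_total)


# aticfx32.dll: 85% (1473/1732) vs. 7% (4567/61447)
ati_sig = share(1473, 1732)
ati_all = share(4567, 61447)
ati_lift = lift(1473, 1732, 4567, 61447)

# x86 with 2 cores: 100% (1728/1732) vs. 58% (35899/61447)
cores_lift = lift(1728, 1732, 35899, 61447)

print(f"aticfx32.dll: {ati_sig:.0f}% vs. {ati_all:.0f}%, about {ati_lift:.1f}x over baseline")
print(f"2 cores: about {cores_lift:.1f}x over baseline")
```

The ratio makes the contrast between rows easier to see: the ATI driver module is over-represented by roughly an order of magnitude, while the dual-core row is less than twice the baseline despite its near-100% share.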
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #37) > ... and it's all (!) dual-core machines - no single-core, no > more-than-dual-core. Actually, almost all - the 4 other crashes with this signature are on single-core machines.
Here's a few module version correlations, first the Windows ones:

100% (1732/1732) vs. 9% (5527/61447) bcryptPrimitives.dll
    0% (0/1732) vs. 0% (35/61447) 6.2.8400.0
    100% (1732/1732) vs. 9% (5492/61447) 6.2.9200.16384
100% (1732/1732) vs. 9% (5605/61447) SHCore.dll
    0% (0/1732) vs. 0% (53/61447) 6.2.8102.0
    0% (0/1732) vs. 0% (37/61447) 6.2.8250.0
    0% (0/1732) vs. 0% (35/61447) 6.2.8400.0
    21% (356/1732) vs. 2% (1193/61447) 6.2.9200.16384
    4% (70/1732) vs. 0% (258/61447) 6.2.9200.16420
    75% (1306/1732) vs. 7% (4029/61447) 6.2.9200.16433
100% (1732/1732) vs. 9% (5617/61447) WINMMBASE.dll
    0% (0/1732) vs. 0% (53/61447) 6.2.8102.0
    0% (0/1732) vs. 0% (37/61447) 6.2.8250.0
    0% (0/1732) vs. 0% (35/61447) 6.2.8400.0
    100% (1732/1732) vs. 9% (5492/61447) 6.2.9200.16384
100% (1732/1732) vs. 9% (5617/61447) combase.dll
    0% (0/1732) vs. 0% (53/61447) 6.2.8102.0
    0% (0/1732) vs. 0% (37/61447) 6.2.8250.0
    0% (0/1732) vs. 0% (35/61447) 6.2.8400.0
    22% (374/1732) vs. 2% (1270/61447) 6.2.9200.16384
    78% (1358/1732) vs. 7% (4222/61447) 6.2.9200.16420

I would guess that the 8000 versions are the previews/betas of Win8 and the 16000 ones the release(s), which would make this only happen with the latter.

And now the ATI drivers - leaving out all the versions this is not happening with.

85% (1473/1732) vs. 7% (4567/61447) aticfx32.dll
    80% (1388/1732) vs. 4% (2735/61447) 8.17.10.1140
    3% (47/1732) vs. 0% (241/61447) 8.17.10.1151
    2% (38/1732) vs. 0% (291/61447) 8.17.10.1172
80% (1383/1732) vs. 6% (3487/61447) atiu9pag.dll
    80% (1380/1732) vs. 4% (2719/61447) 8.14.1.6268
    0% (1/1732) vs. 0% (32/61447) 8.14.1.6278
    0% (2/1732) vs. 0% (64/61447) 8.14.1.6290
80% (1383/1732) vs. 6% (3692/61447) atiumdva.dll
    80% (1380/1732) vs. 4% (2711/61447) 8.14.10.363
    0% (1/1732) vs. 0% (26/61447) 8.14.10.370
    0% (2/1732) vs. 0% (57/61447) 8.14.10.381
80% (1383/1732) vs. 6% (3757/61447) atiumdag.dll
    80% (1380/1732) vs. 4% (2713/61447) 9.14.10.924
    0% (1/1732) vs. 0% (25/61447) 9.14.10.926
    0% (2/1732) vs. 0% (53/61447) 9.14.10.945
FWIW the AMD Wrestler GPUs are part of AMD's low-cost, low-power Bobcat core (http://en.wikipedia.org/wiki/Bobcat_(microarchitecture)). These are all one- and two-core CPUs with on-chip GPUs. Do we have any evidence of this problem on any other CPU/GPU combo? If it is restricted to this subset, are we able to selectively target FF to not update on those machines?
I have an Acer netbook with the AMD C70 dualcore CPU and an integrated Radeon 7290. Since installing a Windows update today I've been encountering intermittent blue screens of death. The last time this occurred was while scrolling an email in Gmail with Firefox 19. If someone can guide me to where those dumps are stored in Windows 8 I'm happy to provide them for debugging.
Check http://support.microsoft.com/kb/315263 and its last chapter. This tool (http://www.nirsoft.net/utils/blue_screen_view.html) can help to open minidumps.
I noticed a bunch of Windows updates were installed today. I'm trying to uninstall them one at a time to see which one resolves my blue screens. Unfortunately this is going to be a long process. I'll comment back here when I find something.
Incredible amount of crashes. 20 times. Windows 8 with AMD dual core. With the graphics card. Also blue screened three times as well. 6 GB RAM, x64. I think it's crashing too much to be usable. IE 10 for now. It happened today.
After uninstalling http://support.microsoft.com/?kbid=2805940 I no longer experience blue screens but I do experience consistent startup crashes with the signature @nsXPConnect::GetXPConnect(). Starting a new profile I no longer see the crashes. Please advise how I can debug this further.
Here is one of my crash reports: https://crash-stats.mozilla.com/report/index/bp-64154ba2-3f16-48c4-9812-82cef2130220 I did some research (thanks Loic for the tip) and the blue screen I was seeing was this: http://msdn.microsoft.com/en-us/library/windows/hardware/ff558949%28v=vs.85%29.aspx "This error has been linked to excessive paged pool usage and may occur due to user-mode graphics drivers crossing over and passing bad data to the kernel code."
I have an easily reproducible case where I can just be using GMail for a minute or so when it crashes. This happens on a new profile as well. Benjamin is helping me debug this with WinDbg.
I installed Win8 on our trusty HP Pavilion dm1 with an AMD Radeon HD 6310, and I tried running Fx18.0.2 with several sites. I then installed Fx19 (pave over installation) and tried the same sites (netflix, facebook, facebook games, yelp, hotmail, yahoo), and I was not able to crash after maybe 15 minutes of user interaction. Then I enabled the about:config pref layers.acceleration.force-enable, restarted and tried the same things, and after a little bit of closing and opening tabs I was able to crash with this related signature: https://crash-stats.mozilla.com/report/index/bp-46c82cc6-e6f3-4e57-966b-371462130220 It wasn't easy to crash, and I had to force enable hardware acceleration. The machine has not installed about 30 pending Windows system updates. You can access it through the MV network at 10.250.6.86 with VNC.
I'm not sure if it's useful information but under the same circumstances as comment 47, Firefox 19.0-final crashes but 19.0b6 does not.
Dropping stepswanted and qawanted from this bug since we've made progress on it and I have a reproducible case. I will continue to assist Benjamin with investigation as required.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #49) > I'm not sure if it's useful information but under the same circumstances as > comment 47, Firefox 19.0-final crashes but 19.0b6 does not. Some of these other AMD-only crashes we've had seemed to be PGO related, and would appear or disappear from build to build. See bug 772330 and the various blocking bugs.
(In reply to Scoobidiver from comment #31) > D2D and D9D are disabled because of bug 840161. In fact, Direct3D 9 is enabled in most crash reports I've checked because bug 840161 only applies to Windows 7. Direct2D is disabled because we required 9.10.8.0 or above for AMD GPUs on Windows 8.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #41) > I have an Acer netbook with the AMD C70 dualcore CPU and an integrated > Radeon 7290. Since installing a Windows update today I've been encountering > intermittent blue screens of death. The last time this occurred was while > scrolling an email in Gmail with Firefox 19. If someone can guide me to > where those dumps are stored in Windows 8 I'm happy to provide them for > debugging. Mine is exactly the same netbook as yours. BSOD is caused by Windows update 2778344 so you can just uninstall it to stop BSOD happening again. However, Firefox 19 still crashes regardless but no BSOD. I have switched to Chrome for a while until Firefox is fixed.
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #47) > I have an easily reproducible case where I can just be using GMail for a > minute or so when it crashes. This happens on a new profile as well. > > Benjamin is helping me debug this with WinDbg. Any updates around the success of debugging? To help guide a final solution here, it's highly desirable to find a pref (or something similar) that we could flip instead of rolling a 19.0.1. The reasoning is: * This is our only 19.0.1 driver right now * We'd only want to push out updates to Win8 users, which is apparently difficult from the RelEng side of things * Even if we could update only Win8 users, it's possible that we'd continue playing whack-a-mole with this crash (unless we find a true fix)
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x2b] [@ nsXPConnect::GetXPConnect()] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x0 | XPC_WN_Helper_NewResolve ] [@ @0x2b] [@ nsXPConnect::GetXPConnect()] [@ XPC_WN_NoHelper_Resolve ]
(In reply to Scoobidiver from comment #52)
> (In reply to Scoobidiver from comment #31)
> > D2D and D9D are disabled because of bug 840161.
> In fact, Direct3D 9 is enabled in most crash reports I've checked because
> bug 840161 only applies to Windows 7.
> Direct2D is disabled because we required 9.10.8.0 or above for AMD GPUs on
> Windows 8.

Let's see if blocklisting for Win8 would have the intended effect.

gfx team - what pref should ashughes set to emulate the blocklist w/o needing to stage one?
To emulate blocklisted graphics features, use the .disabled prefs, in particular:

layers.acceleration.disabled = true to emulate blocklisting d3d9 / d3d10 layers
gfx.direct2d.disabled = true to emulate blocklisting direct2d
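These prefs can be flipped in about:config. For repeated test runs on fresh profiles, one option is to write them into the profile's user.js instead. The sketch below only formats the two prefs named above into user.js lines; the helper name `user_js_lines` is made up, and using user.js here is an assumption about the tester's workflow, not something this bug prescribes.

```python
def user_js_lines(prefs):
    """Format boolean prefs as user.js lines: user_pref("name", true);"""
    lines = []
    for name, value in prefs.items():
        literal = "true" if value else "false"
        lines.append(f'user_pref("{name}", {literal});')
    return lines


# The two prefs named in the comment above.
BLOCKLIST_EMULATION = {
    "layers.acceleration.disabled": True,
    "gfx.direct2d.disabled": True,
}

lines = user_js_lines(BLOCKLIST_EMULATION)
print("\n".join(lines))
```

Firefox has to be restarted for either approach to take effect.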
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #39)
> And now the ATI drivers - leaving out all the versions this is not happening
> with.
> 85% (1473/1732) vs. 7% (4567/61447) aticfx32.dll
> 80% (1388/1732) vs. 4% (2735/61447) 8.17.10.1140 -> Catalyst 12.08
> 3% (47/1732) vs. 0% (241/61447) 8.17.10.1151 -> Catalyst 12.10
> 2% (38/1732) vs. 0% (291/61447) 8.17.10.1172 -> Catalyst 13.1

Based on http://amddevcentral.com/Resources/hardware-drivers/ccc/Pages/default.aspx, I wrote the matching driver versions. See bug 806991 for Direct2D-blocking on Windows 8.

(In reply to Alex Keybl [:akeybl] from comment #56)
> gfx team - what pref should ashughes set to emulate the blocklist w/o
> needing to stage one?

With Catalyst 12.10 or lower (Direct2D already disabled), set layers.acceleration.disabled to true. Compare with the default value. Based on comment 48, it should be OK.
Windows 8 x64 on Toshiba Satellite C660D (AMD E-450, AMD Radeon HD 6320)
layers.acceleration.disabled = true
gfx.direct2d.disabled = true
With Catalyst 13.1 or Catalyst 12.10 I have crashes.
(In reply to sergantjohns from comment #59) > layers.acceleration.disabled = true > gfx.direct2d.disabled = true You need to restart Firefox to make those changes apply.
Setting layers.acceleration.disabled=TRUE and gfx.direct2d.disabled=TRUE and restarting Firefox did not make a difference. I'm still crashing. https://crash-stats.mozilla.com/report/bp-8729f688-57dd-458f-ae62-883752130221
(In reply to Scoobidiver from comment #60)
> (In reply to sergantjohns from comment #59)
> > layers.acceleration.disabled = true
> > gfx.direct2d.disabled = true
> You need to restart Firefox to make those changes apply.

I'm still crashing after restarting Firefox.
sergantjohns, Anthony, thanks for that testing - so we now know that the blocklisting would be ineffective. :(
Firefox 20.0b1 (installer) crashes, sometimes with a BSOD, but 20.0b1-candidates (build2 from zip) does not. Why?
(In reply to sergantjohns from comment #64)
> Firefox 20.0b1(installer) crashes with some times BSOD but
> 20.0b1-candidates(build2 from zip) does not.Why?

For the same reason that 19.0b6 doesn't crash but 19.0 does, despite identical code. It's a random bug that depends on how the compiler optimizes the code.
(In reply to Scoobidiver from comment #65) > (In reply to sergantjohns from comment #64) > > Firefox 20.0b1(installer) crashes with some times BSOD but > > 20.0b1-candidates(build2 from zip) does not.Why? > For the same reason 19.0b6 doesn't crash and not 19.0 despite an identical > code. It's a random bug depending on how the compiler optimizes the code. Actually, if both are build2 then they have been built and optimized exactly the same. If it's different builds, then this is possible.
Benjamin asked me via IRC to clarify something. In comment 45 I mentioned uninstalling http://support.microsoft.com/?kbid=2805940 resolved the BSODs I was experiencing. This was not entirely factual. Removing http://support.microsoft.com/?kbid=2805940 reduced the occurrence of BSODs. Gary advised removing http://support.microsoft.com/?kbid=2778344 in comment 54. Doing so resolved my BSODs completely. To clarify, I had to remove both 2805940 and 2778344 to resolve my blue screens (it was not enough to remove one or the other). None of this resolves the Firefox crashes.
Summary: [Win8] crash in XPC_WN_Helper_NewResolve mainly with AMD Radeon HD 6290/6310/6320/7290/7310/7340 → [Win8] crash in XPC_WN_Helper_NewResolve mainly with AMD Radeon HD 6290/6310/6320/7290/7310/7340 (Wrestler Asic)
19.0.1-candidates/build1/ I don't have crashes!
(In reply to sergantjohns from comment #69) > 19.0.1-candidates/build1/ I don't have crashes! Thanks! Let's leave status-firefox20 as affected so that the investigation here continues into the FF20 beta cycle (even if we call this resolved for FF19).
I confirm that I'm not getting crashes with the 19.0.1 candidate builds.
Depends on: 844156
Depends on: 772330
I can also confirm that there's no crash with the 19.0.1 build. A Windows update for AMD processors has just come out: http://support.microsoft.com/kb/2818604 It's a small update; could this be it?
The update from comment 72 is very unlikely to be relevant. It's for the CPU not the graphics, and it's only for very specific stepping numbers.
Gary and ashughes, can you both please attach your dxdiag output from your affected netbooks to this bug? https://help.ea.com/article/how-to-gather-dxdiag-information
Flags: needinfo?(enquiry)
Flags: needinfo?(anthony.s.hughes)
This is the DxDiag from the HP Pavilion from comment 48 which experiences the crash but only after significant usage.
Attached file DxDiag from ashughes
Here is my DxDiag for the Acer netbook used in comment 47.
Flags: needinfo?(anthony.s.hughes)
Attached file dxdiag
Flags: needinfo?(enquiry)
Since QA has been asked to spot-check release builds against this bug on known affected hardware until we're confident it is fixed, I'm reporting here that the 19.0.2 build1 candidates do not appear to hit this bug on my netbook. Gary, or anyone else who has seen this bug before: if you'd like to confirm for yourself, please test the following builds: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/19.0.2-candidates/build1/win32/ Thanks
Windows 8 x64 on Toshiba C660D (AMD E-450, AMD Radeon HD 6320): with 19.0.2-candidates/build1/ I don't have crashes!
Dropping QAWANTED from this bug as there doesn't seem to be any specific assistance we can provide any longer. We'll continue to track and verify mitigated for releases until this is actually fixed. Please re-add QAWANTED if there's some specific way we can be of service.
Keywords: qawanted
It's no longer a top crasher.
Severity: blocker → critical
Keywords: topcrash
Should be resolved as fixed; it no longer crashes on Radeon HD 7340 (E2-1800).
(In reply to Nick from comment #82) > Should be solved as fix, no longer crashes on Radeon HD 7340 E2-1800. Unfortunately we can't call this categorically "fixed" unless we knowingly landed a patch which fixed this. We could resolve this as WORKSFORME if it's started to disappear for those who were previously affected. When did you first notice this crash went away? Did you recently receive any Windows or AMD Driver updates? It would be good to know if this was fixed internally or "magically" by some other code we landed.
I just queried crash-stats and it looks like there are still people out there who are experiencing this crash. I'm seeing nearly 600 reports in the last week, most of which are in the latest release (Firefox 28). Nick, based on these numbers I think this is very unlikely to be fixed. Perhaps something in recent days/months has changed on your system to make encountering this crash much more rare.
I had this issue last year; it crashed about 20 times in one day. I quit using Firefox until an update (I have forgotten which one) and started using it again several months later. It was early 2013, I think. I have Windows 8.1 x64 now, so I think that is the reason. However, I still have WDDM 1.2, as AMD has not updated it yet, but I no longer have the issue. WDDM 1.2 was probably the issue, but I think those still crashing probably have old drivers. Remember that many people rarely update their graphics drivers, but this is speculative. Maybe there is something different between the Radeon 7340 and the earlier Bobcat. The E2-1800 APU is kind of rare; not many have it. AMD did some very minor rework in it. The most likely thing is that they have old drivers. AMD had stability issues in its early Windows 8 drivers: when I got my HP computer in 2012, it would crash very often. I no longer had this when I went back to it in 2013. Regardless, they are probably using the drivers that would often crash. I don't know, but what I do know is that I used the Windows 8.1 beta from July 2013, and then Windows 8.1 when it came out. That was when the issues ended in Firefox. Maybe Microsoft and/or AMD fixed it. I'm probably no help for those still on Windows 8.
Assignee: general → nobody
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x0 | XPC_WN_Helper_NewResolve ] [@ @0x2b] [@ nsXPConnect::GetXPConnect()] [@ XPC_WN_NoHelper_Resolve ] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x0 | XPC_WN_Helper_NewResolve ] [@ @0x2b] [@ nsXPConnect::GetXPConnect()] [@ XPC_WN_NoHelper_Resolve ] [@ nsXPConnect::GetXPConnect]
Whiteboard: [Win8] → [Win8][qa-not-actionable]
Severity: critical → S2

Since the crash volume is low (less than 5 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit auto_nag documentation.

Severity: S2 → S3
Crash Signature: [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x0 | XPC_WN_Helper_NewResolve ] [@ @0x2b] [@ nsXPConnect::GetXPConnect()] [@ XPC_WN_NoHelper_Resolve ] [@ nsXPConnect::GetXPConnect] → [@ TlsGetValue] [@ InterlockedIncrement] [@ XPC_WN_Helper_NewResolve] [@ @0x0 | XPC_WN_Helper_NewResolve ] [@ @0x2b] [@ nsXPConnect::GetXPConnect] [@ XPC_WN_NoHelper_Resolve ] [@ nsXPConnect::GetXPConnect]

Let's close this. Windows 8 is no longer supported and any remaining crashes are better tracked in a new bug.

Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 months ago
Resolution: --- → INCOMPLETE