Open Bug 714320 Opened 8 years ago Updated Last year

Firefox Crash @ nsStyleContext::AddChild with AMD Radeon HD 6xxx series

Categories

(Core :: Layout, defect, critical)

Version: 10 Branch
Platform: x86
OS: Windows 7
Type: defect
Priority: Not set
Severity: critical

Tracking


Tracking Status
firefox10 - ---
firefox11 - ---
firefox12 - ---
firefox20 --- affected
firefox21 --- affected
firefox22 --- affected
firefox-esr10 - ---

People

(Reporter: marcia, Unassigned)

References

(Blocks 1 open bug)

Details

(4 keywords, Whiteboard: [platform-rel-AMD])

Crash Data

Seen while looking at 10.0b2 stats. https://crash-stats.mozilla.com/report/list?signature=nsStyleContext::AddChild%28nsStyleContext*%29. Not enough volume to get correlations. This was seen in other versions but in lower volume.

https://crash-stats.mozilla.com/report/index/f03b1b5d-db19-4fcd-bead-bc4ce2111229

Frame 	Module 	Signature 	Source
0 	xul.dll 	nsStyleContext::AddChild 	layout/style/nsStyleContext.cpp:148
1 	xul.dll 	nsStyleContext::nsStyleContext 	layout/style/nsStyleContext.cpp:85
2 	xul.dll 	NS_NewStyleContext 	layout/style/nsStyleContext.cpp:718
3 	xul.dll 	nsStyleSet::GetContext 	layout/style/nsStyleSet.cpp:621
4 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1233
5 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
6 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
7 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
8 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
9 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1576
10 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
11 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
12 	xul.dll 	nsFrameManager::ReResolveStyleContext 	layout/base/nsFrameManager.cpp:1597
13 	xul.dll 	nsFrameManager::ComputeStyleChangeFor 	layout/base/nsFrameManager.cpp:1683
14 	xul.dll 	mozilla::css::RestyleTracker::ProcessRestyles 	layout/base/RestyleTracker.cpp:240
15 	xul.dll 	nsCSSFrameConstructor::ProcessPendingRestyles 	layout/base/nsCSSFrameConstructor.cpp:11615
16 	xul.dll 	PresShell::FlushPendingNotifications 	layout/base/nsPresShell.cpp:4051
17 	xul.dll 	nsDocument::FlushPendingNotifications 	content/base/src/nsDocument.cpp:6268
18 	xul.dll 	xpc_qsUnwrapThis<nsGenericElement> 	js/xpconnect/src/nsDOMQS.h:121
19 	xul.dll 	nsIDOMNSElement_GetBoundingClientRect 	obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:6432
20 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:629
21 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:3948
22 	mozjs.dll 	js::types::TypeMonitorCall 	js/src/jsinferinlines.h:330
23 	mozjs.dll 	UncachedInlineCall 	js/src/methodjit/InvokeHelpers.cpp:392
24 	mozjs.dll 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
25 	mozjs.dll 	js::mjit::stubs::CompileFunction 	js/src/methodjit/InvokeHelpers.cpp:305
26 	mozjs.dll 	js::mjit::EnterMethodJIT 	js/src/methodjit/MethodJIT.cpp:1064
27 	mozjs.dll 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:1142
28 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:3987
29 	mozjs.dll 	js::types::TypeMonitorCall 	js/src/jsinferinlines.h:330
30 	mozjs.dll 	UncachedInlineCall 	js/src/methodjit/InvokeHelpers.cpp:392
31 	mozjs.dll 	js::mjit::stubs::UncachedCallHelper 	js/src/methodjit/InvokeHelpers.cpp:479
32 	mozjs.dll 	js::mjit::stubs::CompileFunction 	js/src/methodjit/InvokeHelpers.cpp:305
33 	mozjs.dll 	js::mjit::EnterMethodJIT 	js/src/methodjit/MethodJIT.cpp:1064
34 	mozjs.dll 	js::mjit::JaegerShot 	js/src/methodjit/MethodJIT.cpp:1142
35 	mozjs.dll 	js::RunScript 	js/src/jsinterp.cpp:581
36 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:647
37 	mozjs.dll 	js::Invoke 	js/src/jsinterp.h:148
38 	mozjs.dll 	js_fun_apply 	js/src/jsfun.cpp:1817
39 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:629
40 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:3948
41 	mozjs.dll 	js::RunScript 	js/src/jsinterp.cpp:584
42 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:647
43 	mozjs.dll 	js::Invoke 	js/src/jsinterp.cpp:679
44 	mozjs.dll 	JS_CallFunctionValue 	js/src/jsapi.cpp:5199
45 	xul.dll 	nsJSContext::CallEventHandler 	dom/base/nsJSEnvironment.cpp:1937
46 	xul.dll 	nsGlobalWindow::RunTimeout 	dom/base/nsGlobalWindow.cpp:9307
47 	xul.dll 	nsGlobalWindow::TimerCallback 	dom/base/nsGlobalWindow.cpp:9747
48 	xul.dll 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:425
49 	xul.dll 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:521
50 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
51 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:134
52 	xul.dll 	xul.dll@0xbb999f 	
53 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
54 	xul.dll 	_SEH_epilog4 	
55 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
56 	xul.dll 	nsHTMLBodyElement::AddRef 	content/html/content/src/nsHTMLBodyElement.cpp:319
57 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
58 	xul.dll 	xul.dll@0xbb999f 	
59 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:228
60 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3551
61 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
62 	firefox.exe 	firefox.exe@0x4033 	
63 	firefox.exe 	__tmainCRTStartup 	crtexe.c:594
64 	firefox.exe 	_SEH_epilog4 	
65 	kernel32.dll 	kernel32.dll@0x51113 	
66 	ntdll.dll 	__RtlUserThreadStart 	
67 	kernel32.dll 	kernel32.dll@0x62acc 	
68 	kernel32.dll 	kernel32.dll@0x62acc 	
69 	ntdll.dll 	LdrpGetShimEngineInterface 	
70 	ntdll.dll 	_RtlUserThreadStart 	
71 	firefox.exe 	pre_c_init 	crtexe.c:304
72 	firefox.exe 	pre_c_init 	crtexe.c:304
73 		@0x7ffd3fff
This is officially an explosive crash with 11041 crashes in Firefox 10 B2 on Windows according to the signature summary. I will dig into manual correlations and look at the 51 comments. In the meantime adding the relevant keywords.
OS: Mac OS X → Windows 7
Version: 9 Branch → 10 Branch
Can someone look at nightly data and get an accurate range for when this started?
Facebook appears a lot in the comments. Do the URLs in crash stats also point to Facebook? Other sites?
Keywords: needURLs
Keywords: needURLs
21 crashes in 8.0.1 in the last 4 weeks.
3 crashes in 9.0.1 in the last 4 weeks.
1 crash in 10.0b1.

In the last 4 weeks of data I see other versions such as the 4.0 betas and 3.6.x represented in small numbers, but trunk is not among the versions that are showing up in my query.

I have tried reproducing it on facebook.com, but no luck so far. What I am seeing on this machine while running Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0) Gecko/20100101 Firefox/10.0 is that Firefox periodically freezes with a few tabs open (including Facebook).
Marcia and I just looked through the crash data a bit more. Based upon the fact that:

* 10.0b1 has 1 crash report in the past 4 weeks
* 9.0.1 has 4 crash reports in the past 4 weeks
* 10.0b2 has 16888 crash reports in the past 4 weeks

we can safely assume that this bug is caused by a change that we've taken in 10.0b2. The full changeset from 10.0b1 to 10.0b2 is https://bugzilla.mozilla.org/buglist.cgi?quicksearch=710060%2C697215%2C699668%2C711195%2C712169%2C712506%2C701662%2C&list_id=1993989

In lieu of STR, I'm CC'ing the assignees of some of the more suspicious changes (JS/layout related). The investigation of this bug should be a top priority since it may force a portion of our beta audience to use other FF versions or browsers to surf Facebook. Thanks in advance.
What about 11.0a1 and 12.0a1?
A Socorro query shows no crashes in that signature for either 11.0a1 or 12.0a1.

(In reply to David Baron [:dbaron] from comment #7)
> What about 11.0a1 and 12.0a1?
Severity: normal → critical
Could be http://hg.mozilla.org/releases/mozilla-beta/rev/7a0d36baf5be (bug 697215). I think we should try backing it out; crashstats results should be quick, and I think we can live without the patch.
Wasn't this patch checked into Aurora as well as Beta? If it's the culprit, wouldn't we see crashes in 11.0a2 - I don't find any. In any case, if this looks suspicious, I am all for backing it out. As Rob said, it will be easy to verify.
Darn, we missed beta 3's build. Should we respin with that backed out to see? Can we verify this any other way than by shipping a build with the backout to our beta audience?
A quick query shows in a 24 hr time we have accumulated over 5K in crashes. I would be in favor of respinning b3 with the backout at this point.

We haven't yet been able to repro the crash but the remote testing team will be working on it later this evening.
Ok, backed out and commented in bug 697215. We'll respin the beta.
(In reply to Sheila Mooney from comment #10)
> Wasn't this patch checked into Aurora as well as Beta? If it's the culprit,
> wouldn't we see crashes in 11.0a2 - I don't find any.

Maybe some other change on Aurora prevents the crashes on Aurora and trunk?

The only other reasonable candidate would be bug 701662.

Can you look back through the reports to see the exact day crashes started showing up?
So the actual signature has been around for a long time at very low volume, i.e. <10 a week for any particular release, from 3.6 to 9.0. I searched back through the last couple of months. I see a single crash on 10.0a1 only in the past 2 weeks. I didn't find any crashes on 11.0a1, 11.0a2 or 12.0a1 in the past 3 weeks. All the crashes seem to be exclusively with 10.0b2 (20111228055358). I don't see any trend of increase/decrease on Aurora since we had so few to begin with. Looking back historically, few of these signatures appeared in pre-release or beta anyhow.
From a date perspective, the first crash in this signature showed up in crash-stats on Dec 06, 2011. That is going back in time 4 weeks which is the max that Socorro allows in the UI.
This signature goes way back. You can see the odd crash in 3.6. In Socorro you are restricted to a 4 week search window but you can change the dates and go back in time. I did a few searches back in Aug and Sept. We averaged about 10-20 or so crashes with this signature a week across all versions. That seemed to be the steady state up until we released 10.0b2, when it really exploded.
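Since the Socorro UI caps a search at four weeks, covering a longer period means stepping a window back through time. A minimal sketch of generating those windows follows; the function name and the example dates are illustrative assumptions, and nothing here talks to Socorro itself.

```python
from datetime import date, timedelta


def search_windows(start, end, max_days=28):
    """Split [start, end] into consecutive windows of at most max_days days,
    newest first, so that each one fits a 4-week search limit."""
    windows = []
    window_end = end
    while window_end > start:
        window_start = max(start, window_end - timedelta(days=max_days))
        windows.append((window_start, window_end))
        window_end = window_start
    return windows


# Hypothetical example: walk back from early January 2012 to August 2011.
for window_start, window_end in search_windows(date(2011, 8, 1), date(2012, 1, 6)):
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each window can then be entered as the date range of one manual search.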
What's the next step here? Are we going to try a backout and see if that works?
(In reply to Sheila Mooney from comment #18)
> What's the next step here? Are we going to try a backout and see if that
> works?

As Christian noted in https://bugzilla.mozilla.org/show_bug.cgi?id=714320#c13, we backed out 697215 for beta 3. We'll continue to look for STR, but if the crash volume goes back to normal amounts in beta 3, I think we can safely say it was bug 697215.
Duh...sorry, my bad.
Blocks: 697215
Sheila - Could you send this to somebody to check whether the volume goes down in beta 3?  We want to make sure that all tracked bugs are assigned to the person with next action, or closed. Thanks!
Assignee: nobody → smooney
One thing I mentioned to Sheila is I am not seeing the NPSWF version being reported in the module section of these crashes. I do see it in other Windows crashes, however.
I will be monitoring what happens in b3. Nothing in b3 yet but the volume is still too low. I will update over the weekend.
I was able to look at the report that rhelmer generated (http://people.mozilla.org/~rhelmer/temp/Firefox-10.0b2-correlation/) - shows a high correlation to some ATI dlls:

Windows NT
  nsStyleContext::AddChild(nsStyleContext*)|EXCEPTION_ACCESS_VIOLATION_WRITE (6022 crashes)
     96% (5808/6022) vs.  28% (7391/26633) atiuxpag.dll
     96% (5808/6022) vs.  28% (7392/26633) atidxx32.dll
     94% (5678/6022) vs.  27% (7271/26633) aticfx32.dll

The addon correlation showed:

Windows NT
  nsStyleContext::AddChild(nsStyleContext*)|EXCEPTION_ACCESS_VIOLATION_WRITE (6022 crashes)
     90% (5420/6022) vs.  82% (21863/26633) testpilot@labs.mozilla.com (Mozilla Labs - Test Pilot, https://addons.mozilla.org/addon/13661)
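
The percentages in these reports are plain ratios: the share of crashes with this signature that had a module loaded, against the share of all sampled Windows crashes that had it loaded. A small sketch reproducing the figures above; the function name and the "lift" ratio are additions for illustration and are not part of the report format.

```python
def correlation(sig_hits, sig_total, all_hits, all_total):
    """Return (share within signature, share within all crashes, ratio of the two)."""
    sig_share = sig_hits / sig_total
    base_share = all_hits / all_total
    return sig_share, base_share, sig_share / base_share


# Counts copied from the module and add-on correlation reports above.
rows = {
    "atiuxpag.dll": (5808, 6022, 7391, 26633),
    "atidxx32.dll": (5808, 6022, 7392, 26633),
    "aticfx32.dll": (5678, 6022, 7271, 26633),
    "testpilot@labs.mozilla.com": (5420, 6022, 21863, 26633),
}

for name, counts in rows.items():
    sig_share, base_share, lift = correlation(*counts)
    print(f"{name:28s} {sig_share:4.0%} vs. {base_share:4.0%}  lift {lift:.1f}x")
```

By this measure the ATI modules are roughly 3.5 times over-represented in the signature, while Test Pilot at 90% vs. 82% sits close to its baseline.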
When I drilled down to the by version module report (http://people.mozilla.org/~rhelmer/temp/Firefox-10.0b2-correlation/20120106_Firefox_10.0-interesting-modules-with-versions.txt), it seems correlated to different versions:

   95% (548/575) vs.  28% (7391/26633) atiuxpag.dll
          0% (0/575) vs.   0% (92/26633) 8.14.1.6117
          0% (0/575) vs.   0% (30/26633) 8.14.1.6126
          0% (0/575) vs.   0% (20/26633) 8.14.1.6136
          0% (0/575) vs.   0% (49/26633) 8.14.1.6143
          0% (0/575) vs.   0% (35/26633) 8.14.1.6150
         22% (125/575) vs.   6% (1568/26633) 8.14.1.6160
          6% (37/575) vs.   2% (496/26633) 8.14.1.6170
         12% (70/575) vs.   3% (696/26633) 8.14.1.6178
          5% (26/575) vs.   2% (405/26633) 8.14.1.6187
          3% (19/575) vs.   1% (248/26633) 8.14.1.6195
          0% (0/575) vs.   0% (6/26633) 8.14.1.6203
          2% (13/575) vs.   1% (196/26633) 8.14.1.6210
         34% (196/575) vs.   9% (2378/26633) 8.14.1.6214
          1% (3/575) vs.   0% (56/26633) 8.14.1.6221
          7% (38/575) vs.   2% (463/26633) 8.14.1.6226
          0% (1/575) vs.   0% (78/26633) 8.14.1.6229
          1% (3/575) vs.   0% (53/26633) 8.14.1.6233
          1% (8/575) vs.   1% (137/26633) 8.14.1.6237
          2% (9/575) vs.   1% (348/26633) 8.14.1.6243
          0% (0/575) vs.   0% (37/26633) 8.14.1.6248
comment 25 makes it look like atiuxpag.dll versions 8.14.1.6160 and newer are related to something that causes the crash, but versions 8.14.1.6150 and older are not.
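
The version cutoff can be checked mechanically by comparing the dotted versions numerically. This is a sketch over the rows quoted above; the helper names are illustrative assumptions.

```python
# (atiuxpag.dll version, crashes with this signature out of 575,
#  crashes overall out of 26633), copied from the by-version report above.
ATIUXPAG_ROWS = [
    ("8.14.1.6117", 0, 92), ("8.14.1.6126", 0, 30), ("8.14.1.6136", 0, 20),
    ("8.14.1.6143", 0, 49), ("8.14.1.6150", 0, 35), ("8.14.1.6160", 125, 1568),
    ("8.14.1.6170", 37, 496), ("8.14.1.6178", 70, 696), ("8.14.1.6187", 26, 405),
    ("8.14.1.6195", 19, 248), ("8.14.1.6203", 0, 6), ("8.14.1.6210", 13, 196),
    ("8.14.1.6214", 196, 2378), ("8.14.1.6221", 3, 56), ("8.14.1.6226", 38, 463),
    ("8.14.1.6229", 1, 78), ("8.14.1.6233", 3, 53), ("8.14.1.6237", 8, 137),
    ("8.14.1.6243", 9, 348), ("8.14.1.6248", 0, 37),
]


def version_key(version):
    """Turn '8.14.1.6160' into (8, 14, 1, 6160) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def first_affected(rows):
    """Lowest version with at least one crash under the signature, or None."""
    affected = [version_key(v) for v, sig_hits, _ in rows if sig_hits > 0]
    return min(affected) if affected else None


threshold = first_affected(ATIUXPAG_ROWS)
older = [(v, hits, total) for v, hits, total in ATIUXPAG_ROWS
         if version_key(v) < threshold]
print("first affected version:", ".".join(map(str, threshold)))
print("older versions:", older)
```

The versions below the cutoff account for only 226 of the 26633 sampled crashes, so their zero counts are fairly weak evidence; 8.14.1.6203 and 8.14.1.6248 also show zero hits on similarly small samples.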
It would explain why I have not had any of these crashes; all my systems have either NVIDIA GPUs or Intel IGPs.

I swear, the junk DLLs that AMD drivers keep running 24/7 are quickly bringing them to Creative Labs levels of bloat.
Since 10.0 Beta 3 is live, there have been no crashes in this version so the backout of bug 697215 did it.
I'm more than a little curious about why current Aurora works with bug 697215 applied and Beta crashes out. Are we sure Aurora doesn't suffer from the same problem, and just isn't tested widely enough?
Assigning to roc for further investigation since we haven't seen a crash in beta 3. Also untracking for FF10, but tracking for FF11 in case, as Jeff suggests, Aurora isn't yet being tested enough to uncover this crasher.
Assignee: smooney → roc
It's no longer a top crasher in 10.0 Beta.

There's a spike in crashes from 12.0a1/20120114.
The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=964b118ac852&tochange=3eaa7d9f1c69
Keywords: topcrash
(In reply to Scoobidiver from comment #31)
> There's a spike in crashes from 12.0a1/20120114.
The spike lasts one (build)day.
Now there's a spike in 11.0a2/20120131.

All crashes I checked happen with AMD Radeon HD 6xxx series.
Blocks: 605780
Depends on: 722538
Summary: Firefox Crash [@ nsStyleContext::AddChild(nsStyleContext*) ] → Firefox Crash @ nsStyleContext::AddChild with AMD Radeon HD 6xxx series
With Beta 10, this ended up being an explosive crasher. Have we seen anything similar for Firefox 11? If so, we should consider backing out bug 697215 for Firefox 11 Beta 4.
I think you should first contact AMD and ask why their drivers are the most affected.
Signature summary shows there was only 1 crash in 11.0b1. I will check some of the other signatures that were correlated to Radeon as well and see what their volume is.
(In reply to Alex Keybl [:akeybl] from comment #33)
> With Beta 10, this ended up being an explosive crasher. 
It's probably another crash signature form of other bugs that depend on bug 722538, which is applicable to Fx 10.
Most of these still seem to be in 10.0b2 for some reason. Very few in FF11 at all. I say we remove the tracking flag for now, see if it comes back in significant volume. There are none in FF11b3 and only 1 in FF11b2. 10.0.1 has 5 and none in 10.0.2 so far.
We are probably handling this with the blocklist in bug 722538.
Am getting crashes with this signature with the 20120319 moz-central nightly build. Am using win7 32bit with D2D off. AMD E-350 with HD6310 GPU.

https://crash-stats.mozilla.com/report/index/bp-f80285f6-a022-45d1-ae3f-ebc052120319
https://crash-stats.mozilla.com/report/index/bp-5d5cce50-29bd-4cad-8b9e-bbb6f2120319

About:support



  Application Basics

        Name: Firefox
        Version: 14.0a1
        User Agent: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20120319 Firefox/14.0a1
        Profile Folder: Show Folder
        Enabled Plugins: about:plugins
        Build Configuration: about:buildconfig
        Crash Reports: about:crashes
        Memory Use: about:memory

  Extensions

        Name                                    Version      Enabled  ID
        Adblock Plus                            2.0.4a.3417  true     {d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}
        British English Dictionary              1.19.1       true     en-GB@dictionaries.addons.mozilla.org
        Close Tab By Double Click               1.14         true     close@doubleclick
        Element Hiding Helper for Adblock Plus  1.2.2a.410   true     elemhidehelper@adblockplus.org
        Nightly Tester Tools                    3.2.1.1      true     {8620c15f-30dc-4dba-a131-7c5d20cf4a29}
        NoScript                                2.3.5        true     {73a6fe31-595d-460b-a920-fcc0f8843232}
        Adobe Acrobat - Create PDF              1.2          false    web2pdfextension@web2pdf.adobedotcom
        Readability                             2.1          false    readability@readability.com
        Zotero                                  3.0.3        false    zotero@chnm.gmu.edu

  Important Modified Preferences

        accessibility.typeaheadfind.flashBar: 0
        browser.cache.disk.smart_size.enabled: false
        browser.cache.disk.smart_size.first_run: false
        browser.cache.disk.smart_size_cached_value: 1048576
        browser.places.smartBookmarksVersion: 3
        browser.startup.homepage: http://www.google.co.uk
        browser.startup.homepage_override.buildID: 20120319031122
        browser.startup.homepage_override.mstone: 14.0a1
        extensions.checkCompatibility: false
        extensions.checkCompatibility.3.6: false
        extensions.checkCompatibility.3.6b: false
        extensions.checkCompatibility.3.6p: false
        extensions.checkCompatibility.3.6pre: false
        extensions.checkCompatibility.3.7a: false
        extensions.checkCompatibility.4.0b: false
        extensions.checkCompatibility.4.0p: false
        extensions.checkCompatibility.4.0pre: false
        extensions.checkCompatibility.4.2: false
        extensions.checkCompatibility.4.2a: false
        extensions.checkCompatibility.4.2a1: false
        extensions.checkCompatibility.4.2a1pre: false
        extensions.checkCompatibility.4.2b: false
        extensions.checkCompatibility.5.0: false
        extensions.checkCompatibility.5.0a: false
        extensions.checkCompatibility.5.0b: false
        extensions.checkCompatibility.6.0: false
        extensions.checkCompatibility.6.0a: false
        extensions.checkCompatibility.6.0b: false
        extensions.checkCompatibility.7.0: false
        extensions.checkCompatibility.7.0a: false
        extensions.checkCompatibility.7.0b: false
        extensions.checkCompatibility.nightly: false
        extensions.lastAppVersion: 14.0a1
        gfx.content.azure.enabled: false
        gfx.direct2d.disabled: true
        gfx.direct3d.prefer_10_1: true
        gfx.font_rendering.cleartype_params.force_gdi_classic_for_families: Arial,Courier New,Tahoma,Trebuchet MS,Verdana,Segoe UI,Lucida Grande
        image.mem.discardable: false
        network.cookie.cookieBehavior: 1
        network.cookie.prefsMigrated: true
        network.prefetch-next: false
        places.database.lastMaintenance: 1332145200
        places.history.expiration.transient_current_max_pages: 69886
        places.history.expiration.transient_optimal_database_size: 111816048
        privacy.donottrackheader.enabled: true
        privacy.popups.showBrowserMessage: false
        privacy.sanitize.migrateFx3Prefs: true
        security.warn_viewing_mixed: false

  Graphics

        Adapter Description: AMD Radeon HD 6310 Graphics
        Vendor ID: 0x1002
        Device ID: 0x9802
        Adapter RAM: 384
        Adapter Drivers: aticfx32 aticfx32 aticfx32 atiumdag atidxx32 atiumdva
        Driver Version: 8.950.0.0
        Driver Date: 2-14-2012
        Direct2D Enabled: false
        DirectWrite Enabled: false (6.1.7601.17776)
        ClearType Parameters: ClearType parameters not found
        WebGL Renderer: Google Inc. -- ANGLE (AMD Radeon HD 6310 Graphics) -- OpenGL ES 2.0 (ANGLE 1.0.0.963)
        GPU Accelerated Windows: 1/1 Direct3D 9

  JavaScript

        Incremental GC: 1

  Library Versions

                    Expected minimum version  Version in use
        NSPR        4.9                       4.9
        NSS         3.13.3.0 Basic ECC        3.13.3.0 Basic ECC
        NSS Util    3.13.3.0                  3.13.3.0
        NSS SSL     3.13.3.0 Basic ECC        3.13.3.0 Basic ECC
        NSS S/MIME  3.13.3.0 Basic ECC        3.13.3.0 Basic ECC
(In reply to DB Cooper from comment #39)
> Am getting crashes with this signature with the 20120319 moz-central nightly
> build. Am using win7 32bit with D2D off. AMD E-350 with HD6310 GPU.

We're trying to move forward with Bug 722538 after some speed bumps on staging last week. In the meantime, we should try to reproduce with our most similar hardware and add-ons in QA.

Roc - with D2D already disabled, will the blocklist in bug 722538 have an effect on this top crasher? Thanks.
This page reliably produces the crash when you scroll down through it:

http://www.flatpanelshd.com/review.php?subaction=showfull&id=1331899332
(In reply to DB Cooper from comment #39)
>         Device ID: 0x9802
>         Direct2D Enabled: false
If it happens with D2D disabled, the blocklist planned in bug 722538 won't be helpful.
(In reply to Scoobidiver from comment #42)
> (In reply to DB Cooper from comment #39)
> >         Device ID: 0x9802
> >         Direct2D Enabled: false
> If it happens with D2D disabled, the blocklist planned in bug 722538 won't
> be helpful.

To be clear, this crash has only happened with today's (20120319) moz-central nightly. 

Build config: Built from http://hg.mozilla.org/mozilla-central/rev/58a2cd0203ee
(In reply to DB Cooper from comment #43)
> To be clear, this crash has only happened with today's (20120319)
> moz-central nightly. 
The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e94edfdb1f5b&tochange=58a2cd0203ee
It might be a regression from bug 666041.
Keywords: reproducible
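The regression-range links in this bug all follow the same pushlog pattern, so they can be rebuilt from the two changeset IDs at either end of a range. A minimal sketch, assuming only the URL shape visible in the links quoted in this bug; the function name and the `repo` parameter are illustrative.

```python
from urllib.parse import urlencode


def pushlog_url(from_change, to_change, repo="mozilla-central"):
    """Build a pushlog link for the range between two changesets.

    `repo` is the repository path under hg.mozilla.org, for example
    "mozilla-central" or "releases/mozilla-beta".
    """
    query = urlencode({"fromchange": from_change, "tochange": to_change})
    return f"http://hg.mozilla.org/{repo}/pushloghtml?{query}"


print(pushlog_url("e94edfdb1f5b", "58a2cd0203ee"))
```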
(In reply to Scoobidiver from comment #44)
> (In reply to DB Cooper from comment #43)
> > To be clear, this crash has only happened with today's (20120319)
> > moz-central nightly. 
> The regression range for the spike is:
> http://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=e94edfdb1f5b&tochange=58a2cd0203ee
> It might be a regression from bug 666041.

Tomorrow I can test a try build with that bug's patches backed out if you like.
It's #7 top crasher in 14.0a1 over the last 3 days.

(In reply to DB Cooper from comment #45)
> Tomorrow I can test a try build with that bug's patches backed out if you
> like.
Yes, please.
Keywords: topcrash
(In reply to Scoobidiver from comment #46)
> It's #7 top crasher in 14.0a1 over the last 3 days.
> 
> (In reply to DB Cooper from comment #45)
> > Tomorrow I can test a try build with that bug's patches backed out if you
> > like.
> Yes, please.

I'd need someone to compile that build for me though, unfortunately.
[Triage Comment]
Roc - can you provide a try build with the necessary backouts?  Also, should we be looking at filing a separate bug for this as it was first found in a recent nightly?
Bug 666041 is almost certainly not the problem.
Also, this went away again in today's builds (March 20).
(In reply to David Baron [:dbaron] from comment #50)
> Also, this went away again in today's builds (March 20).
It's the second bug that lasts one build after bug 736507 on March 16. It's odd.
Any luck tracking this down further? Is there anything QA can do to help?
Removing qawanted as per discussion in channel meeting -- will be covered under the AMD blocklisting meta issue.
Keywords: qawanted
We just resolved bug 722538 on Friday. We'll need to see whether the crash numbers fall now.

Since this is no longer a top crasher, we'll untrack for FF12.
There is a spike in crashes from 14.0a1/20120419. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0c7e2911be75&tochange=da53be684794
Looks like we should keep an eye on this for 14, along with bug 700288. What else can we do here to close in on a fix? There is mention of a static variable not initialized somewhere in bug 700288 comment 29 - can someone try to track that down and create some builds for qa to test?
Depends on: 755974
No longer appears to be a top crasher - untracking.
Renominating as this has moved up to the top of the crash data in early Firefox 14b9 crash stats. It doesn't appear as if this was a problem in b8.
(In reply to Marcia Knous [:marcia] from comment #59)
> Renominating as this has moved up to the top of the crash data in early
> Firefox 14b9 crash stats. It doesn't appear as if this was a problem in b8.
It's the normal behavior of this crash that appears or disappears depending on the build (likely an uninitialized static variable). But this time, its usual brother crashes are not there but instead there's a new one, bug 768383.

Maybe the Beta regression range can help finding the underlying issue:
http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=f8d3886db65a&tochange=d050090e578c
Keywords: topcrash
This has jumped up quite a bit in volume, 24563 crashes since we shipped B9.
(In reply to Marcia Knous [:marcia] from comment #61)
> This has jumped up quite a bit in volume, 24563 crashes since we shipped B9.
50% of crashes in 14.0b9 is indeed a big jump!
Let's wait to see if this continues to trend in b10 before devoting much qa/engineering time to this bug.
Is it possible that this is showing up with a different signature when it's not showing up under this one?
Also, the 14.0b9 crashes are EXCEPTION_BOUNDS_EXCEEDED crashes, unlike the 10.0b2 crashes which were EXCEPTION_ACCESS_VIOLATION_WRITE.  Do they have similar correlations or is it no longer an ATI-related problem?
(In reply to David Baron [:dbaron] from comment #65)
> Is it possible that this is showing up with a different signature when it's
> not showing up under this one?
Its brother crashes in 14.0b9 are bug 768383 and bug 768560.

(In reply to David Baron [:dbaron] from comment #66)
> Do they have similar correlations or is it no longer an ATI-related problem?
Neither https://crash-analysis.mozilla.com/rkaiser/, nor https://crash-analysis.mozilla.com/crash_analysis/ contain correlations per device and vendor IDs, but I checked manually some crash reports and there were those ATI/AMD devices (0x98..).
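
In the absence of per-device correlations, the manual check described here amounts to filtering reports by adapter vendor and device ID. A sketch of that filter, assuming reports are available as dictionaries with hex-string `vendor_id` and `device_id` fields; that record layout is hypothetical and is not Socorro's actual schema.

```python
AMD_VENDOR_ID = 0x1002  # vendor ID shown in the about:support dump earlier in this bug


def is_suspect_adapter(vendor_id, device_id, device_prefix=0x98):
    """True for AMD adapters whose 16-bit device ID has the given high byte."""
    vendor = int(vendor_id, 16)
    device = int(device_id, 16)
    return vendor == AMD_VENDOR_ID and (device >> 8) == device_prefix


# Hypothetical sample records; only the first comes from this bug.
reports = [
    {"vendor_id": "0x1002", "device_id": "0x9802"},  # HD 6310 from about:support
    {"vendor_id": "0x10de", "device_id": "0x0640"},  # made-up non-AMD entry
]

suspects = [r for r in reports if is_suspect_adapter(r["vendor_id"], r["device_id"])]
print(f"{len(suspects)} of {len(reports)} reports match vendor 0x1002, device 0x98xx")
```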
There are no crashes in 14.0b10.
Keywords: topcrash
It's back in 16.0a1/20120713030548.
Bad news! It's #1 top crasher in 10.0.6 ESR with 19% of all crashes.
How does this line trigger an EXCEPTION_ACCESS_VIOLATION_WRITE?
http://hg.mozilla.org/releases/mozilla-esr10/annotate/6c432561c1fd/layout/style/nsStyleContext.cpp#l137

The only thing we seem to be writing to is the newly defined "nsStyleContext **list". "aChild->mRuleNode->IsRoot() ? &mEmptyChild : &mChild;" looks like they're all reads, although I guess there could be some operators going on in there. If there is memory writing going on here, and if someone can find a reproducible case (at 18% of crashes it should be possible to catch) then this is a potential security vulnerability.
Group: core-security
There is no operator magic there.  Those should all be pure reads.

The most obvious conclusion is that something corrupted page permissions on the stack?  ;)
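
To make the read/write distinction concrete, here is a simplified Python model of the logic being discussed. It is not the Gecko source: the attribute names only mirror the members quoted above (`mRuleNode`, `mChild`, `mEmptyChild`), and the circular sibling list and the meaning of `IsRoot()` (no parent rule node) are assumptions made for illustration.

```python
class RuleNode:
    def __init__(self, parent=None):
        self.parent = parent

    def is_root(self):
        # Assumption: a root rule node is one without a parent.
        return self.parent is None


class StyleContext:
    def __init__(self, rule_node):
        self.rule_node = rule_node
        self.child = None        # stands in for mChild
        self.empty_child = None  # stands in for mEmptyChild
        self.prev_sibling = self
        self.next_sibling = self

    def add_child(self, child):
        # Reads only: pick which list head to use, as in the quoted ternary.
        slot = "empty_child" if child.rule_node.is_root() else "child"
        head = getattr(self, slot)
        if head is not None:
            # Non-empty list: link the new child in front of the old head.
            child.next_sibling = head
            child.prev_sibling = head.prev_sibling
            head.prev_sibling.next_sibling = child
            head.prev_sibling = child
        # The store through `list`: the new child becomes the head.
        setattr(self, slot, child)
```

In this model, when the chosen list is empty the only write is the final assignment through the selected slot, which matches the observation that the ternary itself consists of reads.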
There are nearly a dozen of these bugs (the crash comes and goes under a few different signatures, but isn't around in most releases); bug 772330 is the metabug.
We'll track for the ESR shipping alongside FF16 given this is a top crasher, but it's not clear if there will be progress in that timeframe.
Building a new version is usually enough to make it disappear, so it will likely be fixed in 10.0.7esr (15).
As expected, there are no crashes in 10.0.7esr.
Closing during CritSmash triage. We are not seeing this crash in 17+
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
It shows up in 19.0b4 along with bug 837371.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
This affects AddChild, like bug 839280, although AddChild was at a pretty different address back in FF14b9 (439CE). However, I'm not so sure what was going on. We crash at an instruction that's actually valid, so there's no clear evidence we took an unexpected jump. The code looks like this:
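  100439CE: 8B 51 1C           mov         edx,dword ptr [ecx+1Ch]   ; load the pointer stored at ECX+1Ch
  100439D1: 83 7A 04 00        cmp         dword ptr [edx+4],0       ; is the field at EDX+4 zero?
  100439D5: 74 0C              je          100439E3                  ; if so, branch past this listing
  100439D7: 83 C0 04           add         eax,4                     ; advance EAX by 4 bytes
  100439DA: 8B 10              mov         edx,dword ptr [eax]       ; read the value EAX points at
  100439DC: 85 D2              test        edx,edx
  100439DE: 75 08              jne         100439E8                  ; if non-zero, branch past this listing
  100439E0: 89 08              mov         dword ptr [eax],ecx       ; store ECX through EAX (crash address)
  100439E2: C3                 ret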
We crash at 100439E0. I can't tell whether the value at EAX is valid or where it came from. The stack shows that we've only just entered AddChild, but I don't know how EAX has a bad address here. AddChild never modifies EAX to point to invalid memory AFAICT.
I had a poke around with a disassembler to see if a wild jump near here could put us in the middle of an instruction sequence that would corrupt EAX and then reach 100439E0, but I couldn't find one. I can't rule out something like that, though.
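One way to reason about the listing is to step through it with a toy memory model. The sketch below follows only the nine quoted instructions and assumes execution enters at 100439CE. The memory representation (address mapped to a value and a writable flag) is an assumption for illustration and says nothing about what actually happened in the crashing process.

```python
def run_listing(eax, ecx, memory):
    """Follow the quoted instructions from 100439CE.

    memory maps address -> (value, writable); a missing address is unmapped.
    Returns a short description of how execution leaves the listing.
    """
    def load(addr):
        return memory[addr][0] if addr in memory else None

    edx = load(ecx + 0x1C)                 # 100439CE: mov edx,[ecx+1Ch]
    if edx is None:
        return "fault: read at 100439CE"
    field = load(edx + 4)                  # 100439D1: cmp dword ptr [edx+4],0
    if field is None:
        return "fault: read at 100439D1"
    if field == 0:                         # 100439D5: je 100439E3
        return "jump to 100439E3"
    eax = (eax + 4) & 0xFFFFFFFF           # 100439D7: add eax,4
    edx = load(eax)                        # 100439DA: mov edx,[eax]
    if edx is None:
        return "fault: read at 100439DA"
    if edx != 0:                           # 100439DC/DE: test edx,edx ; jne 100439E8
        return "jump to 100439E8"
    _, writable = memory[eax]
    if not writable:                       # 100439E0: mov [eax],ecx
        return "fault: write at 100439E0"
    memory[eax] = (ecx, writable)
    return "ret at 100439E2"


def example_memory(slot_writable=True):
    # Made-up addresses: ECX = 0x2000, EAX = 0x1000 on entry.
    return {0x201C: (0x3000, True), 0x3004: (1, True), 0x1004: (0, slot_writable)}


print(run_listing(0x1000, 0x2000, example_memory()))
print(run_listing(0x1000, 0x2000, example_memory(slot_writable=False)))
```

In this model 100439DA reads from the same address that 100439E0 later writes, so a completely unmapped EAX would already fault at the read. A fault at the store requires a location that can be read but not written, or an entry into the sequence that skips the earlier instructions.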
There are crashes in 21.0a1/20130213031137.
Crash Signature: [@ nsStyleContext::AddChild(nsStyleContext*) ] → [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)]
There are also crashes in 20.0a2/20130218.
Status: REOPENED → ASSIGNED
Keywords: topcrash
When 20.0 switched from Aurora to Beta, crashes stopped.
Keywords: topcrash
It spiked in 22.0a1/20130325105600.
(In reply to Daniel Veditz [:dveditz] from comment #71)
> How does this line trigger an EXCEPTION_ACCESS_VIOLATION_WRITE?
> http://hg.mozilla.org/releases/mozilla-esr10/annotate/6c432561c1fd/layout/
> style/nsStyleContext.cpp#l137
> 
> The only thing we seem to be writing to is the newly defined "nsStyleContext
> **list". "aChild->mRuleNode->IsRoot() ? &mEmptyChild : &mChild;" looks like
> they're all reads, although I guess there could be some operators going on
> in there. If there is memory writing going on here, and if someone can find
> a reproducible case (at 18% of crashes it should be possible to catch) then
> this is a potential security vulnerability.

Boris replied in comment 72 that it's not a write.  Also, roc's patch in
bug 839270 added a bunch of NOPS before that line which did seem to help
(at least for that particular build):
https://hg.mozilla.org/releases/mozilla-beta/rev/23f455023faf

I don't see a reason to keep this bug hidden / sec-high.  We already have
a dozen or so crash bugs with "AMD Radeon" in the subject that are public
so I think we can open this one too (if we hide comments with URLs).
I don't see anything more sensitive here compared to the other bugs.
Flags: needinfo?(dveditz)
OK.
Group: core-security
Flags: needinfo?(dveditz)
Keywords: sec-vector
Keywords: sec-high
22.0a2/20130511 is a bad build.
Let's not pollute the topcrash list with what we know is a single bad Aurora build, please. We don't need to put any special radar on the individual signatures of the Radeon thing as long as it's not in a beta or release we shipped to hundreds of thousands of people at least.
I'll invoke the "bugs that spearhead investigation or fixes across a large collection of crashes" clause on the meta tracker bug of those crashes, though.
Keywords: topcrash
Crash Signature: [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)] → [@ nsStyleContext::AddChild(nsStyleContext*) ] [@ xul.dll@0x640256 | nsStyleSet::GetContext(nsStyleContext*, nsRuleNode*, nsRuleNode*, nsIAtom*, nsCSSPseudoElements::Type, mozilla::dom::Element*, unsigned int)] [@ nsStyleContext::AddChild ] [@ xul.dll@…
Socorro reports show that this crash signature hasn't appeared in the last 28 days:
[@ xul.dll@0x640256 | nsStyleSet::GetContext ]

Also, Socorro shows that this signature is still present in the last 7 days:
[@ nsStyleContext::AddChild ]

Considering this, I don't think we should close this bug, until this last crash signature is no longer present.
Assignee: roc → nobody
Status: ASSIGNED → NEW
platform-rel: --- → ?
Whiteboard: [platform-rel-AMD]
platform-rel: ? → ---