Closed Bug 855898 Opened 12 years ago Closed 12 years ago

crash in SelectorMatches due to excessive recursion in frame construction

Categories

(Core :: Layout, defect)

22 Branch
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox21 --- unaffected
firefox22 --- affected
firefox23 --- fixed

People

(Reporter: bomfog, Assigned: MatsPalmgren_bugz)

References

()

Details

(Keywords: crash, regression, reproducible)

Crash Data

Attachments

(1 file, 4 obsolete files)

This bug was filed from the Socorro interface and is report bp-db0ee04b-2150-4520-9218-131d42130328 . ============================================================= PGO'd ionmonkey builds since changeset 6cac55064722 (merge from m-c noon-ish 2013-03-27) crash on the descendant selector test in Robohornet, with EXCEPTION_STACK_OVERFLOW as the reason. Non-pgo builds don't crash. Clean profile, but didn't think to try safe mode yet. bp-08446e55-779b-4009-8fbb-0b5ca2130328 bp-41e09c43-81f0-4875-9ee9-08bfd2130328 bp-17bfd3e6-84ff-4182-94d3-435d22130328 bp-6217ba12-8c9c-4dfa-aa5c-8dab82130328
From the crash data: "AdapterVendorID: 0x1002, AdapterDeviceID: 0x94c1" which google tells me is a "ATI Radeon HD 2400 Pro". Given the PGO/non-PGO difference it might be a variation of bug 772330?
Component: General → Style System (CSS)
Product: Firefox → Core
(In reply to Mats Palmgren [:mats] from comment #1) > From the crash data: "AdapterVendorID: 0x1002, AdapterDeviceID: 0x94c1" > which google tells me is a "ATI Radeon HD 2400 Pro". > > Given the PGO/non-PGO difference it might be a variation of bug 772330? Also happens on a laptop with the following crash report/about:support info: bp-d2c858c1-e7a8-419b-a03f-2d79c2130329 Graphics Adapter Description Mobile Intel(R) 4 Series Express Chipset Family Adapter Drivers igdumd64 igd10umd64 igdumdx32 igd10umd32 ... Device ID 0x2a42 DirectWrite Enabled false (6.2.9200.16492) Driver Date 2-11-2011 Driver Version 8.15.10.2302 GPU #2 Active false GPU Accelerated Windows 0/2 Basic Vendor ID 0x8086 WebGL Renderer Google Inc. -- ANGLE (Mobile Intel(R) 4 Series Express Chipset Family) AzureCanvasBackend skia AzureContentBackend none AzureFallbackCanvasBackend cairo
Ignore "non-pgo" comment in bp-bd4bf9b9-9c61-4fb1-a061-d59d72130329 Still pgo only.
PGO'd hourly Inbound build now also crashes on the descendant selector test. Built from http://hg.mozilla.org/integration/mozilla-inbound/rev/753f285620fe bp-bef29686-bef6-4b60-bbf2-0aa6b2130329 Non-PGO inbound still doesn't crash.
(In reply to Mats Palmgren [:mats] from comment #1) > Given the PGO/non-PGO difference it might be a variation of bug 772330? Ignore that. This crash is a "just" a stack overflow from to much recursion constructing frames for the very deep content tree in that test. The cycle that repeats is: #19 nsCSSFrameConstructor::ConstructBlock layout/base/nsCSSFrameConstructor.cpp:10945 #20 nsCSSFrameConstructor::ConstructNonScrollableBlock layout/base/nsCSSFrameConstructor.cpp:4407 #21 nsCSSFrameConstructor::ConstructFrameFromItemInternal layout/base/nsCSSFrameConstructor.cpp:3530 #22 nsCSSFrameConstructor::ConstructFramesFromItem layout/base/nsCSSFrameConstructor.cpp:5480 #23 nsCSSFrameConstructor::ConstructFramesFromItemList layout/base/nsCSSFrameConstructor.cpp:9811 #24 nsCSSFrameConstructor::ProcessChildren layout/base/nsCSSFrameConstructor.cpp:9953 ... "ulimit -s 2400" crashes for me on Linux64. Maybe frame construction should have a frame depth limit as we have in reflow? (it's no much point creating them if we're not going to render them ;-) )
Component: Style System (CSS) → Layout
OS: Windows 7 → All
Hardware: x86 → All
Attached patch wip (obsolete) — Splinter Review
Would something as simple as this work to prevent at least some crashes? It seems to prevent the crash for this test. Fwiw, this makes the test run in ~400ms compared to ~3700ms without it, in a Linux64 debug build.
Attachment #731419 - Flags: feedback?(bzbarsky)
It would prevent some crashes, yeah. Obviously if you keep doing appendChild you can work around this check... We may also want a somewhat bigger multiplier. Not sure. Past that, I'm not sure what we want to do here. :(
Flags: needinfo?(dbaron)
Comment on attachment 731419 [details] [diff] [review] wip Feedback+, I guess, though if we do this I'd prefer an autorestore.
Attachment #731419 - Flags: feedback?(bzbarsky) → feedback+
As of at least IM nightly built from: http://hg.mozilla.org/projects/ionmonkey/rev/61f7ebb9f3d9 this seems to have become WFM. Also WFM on a PGO'd Inbound hourly of recent vintage.
Scratch the WFM. Crashes still/again. Current Ion nightly, dirty profile: bp-64a0d0cb-bae5-4422-a997-a65a42130402 Same nightly, clean profile: bp-b328eb01-a74a-4944-9e4d-2b3782130402 in which the stack is different: Frame Module Signature Source 0 xul.dll SelectorMatches layout/style/nsCSSRuleProcessor.cpp:2108 1 xul.dll SelectorMatches layout/style/nsCSSRuleProcessor.cpp:2198 2 xul.dll AtomSelector_InitEntry layout/style/nsCSSRuleProcessor.cpp:867 but was back to the long version when I reran with another clean profile (clean profile = "Use temporary profile" option in the Profile Manager app thingy) bp-65b5881d-bdf0-4a17-b363-8bb0f2130402 Haven't tried inbound or central yet.
Current central nightly crashes bp-c74b124e-7aad-41a3-a2b6-4ec882130402 Non-pgo central hourly, same changeset, doesn't. Done again....
Thanks bomfog, we understand the problem, there's no need for more crash data.
Attached patch fix (obsolete) — Splinter Review
Assignee: nobody → matspal
Attachment #732514 - Flags: review?(bzbarsky)
Attachment #731419 - Attachment is obsolete: true
Comment on attachment 732514 [details] [diff] [review] fix So looking at this more carefully, we can't do it this way. And the reason is that this will also fail to construct anonymous kids for the frame, which is just bad. What you probably want to do instead is if the depth is exceeded skip the block that's skipped right now in the IsLeaf() case, but do everything else in here as it's done right now.
Attachment #732514 - Flags: review?(bzbarsky) → review-
Attached patch fix, v2 (obsolete) — Splinter Review
Like so?
Attachment #732514 - Attachment is obsolete: true
Attachment #732574 - Flags: review?(bzbarsky)
Comment on attachment 732574 [details] [diff] [review] fix, v2 No, you need to clear the lazy bits if you're not going to construct frames for those kids; why wouldn't you clear those?
Attached patch fix, v3 (obsolete) — Splinter Review
Attachment #732574 - Attachment is obsolete: true
Attachment #732574 - Flags: review?(bzbarsky)
Attachment #732629 - Flags: review?(bzbarsky)
Comment on attachment 732629 [details] [diff] [review] fix, v3 I guess it's better to construct the ::before and ::after bits too, yeah... >+ ClearLazyBits(child, nullptr); This should be ClearLazyBits(child, child->GetNextSibling()). r=me with that fixed, and the warning moved so it only happens once, not once per child, perhaps?
Attachment #732629 - Flags: review?(bzbarsky) → review+
Attached patch fix, v4Splinter Review
> This should be ClearLazyBits(child, child->GetNextSibling()). fixed
Attachment #732629 - Attachment is obsolete: true
Attachment #732644 - Flags: review+
WFM.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Sorry for not responding sooner. Having a frame construction recursion limit seems fine to me. (Maybe we should get rid of the reflow recursion limitation while we're at it? There doesn't seem much point to constructing frames if we're not going to reflow them, so it makes sense to use a single limit and have it in frame construction.) Though we should probably be careful that the limit is about depth of the frame tree so we don't get weird appearing/disappearing effects for dynamic manipulation, like we used to with the reflow depth limit before we fixed it. Maybe file a followup bug on that?
Flags: needinfo?(dbaron)
Summary: crash in SelectorMatches → crash in SelectorMatches due to excessive recursion in frame construction
I wasn't able to reproduce this issue on an Ubuntu 12.10 32bit machine, while connecting to neither http://www.robohornet.org/tests/descendantselector.html nor http://ie.microsoft.com/testdrive/Performance/RoboHornetPro/, with Nightly builds from 27 and 28 march 2013 and 2013-04-02. There seem to be some crash reports in Socorro, within last month, regarding Firefox 23 beta 1, 2, 3, 4 and 5, and 23.0a2: https://crash-stats.mozilla.com/report/list?signature=SelectorMatches&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-07-16+13%3A00%3A00&range_value=4 Does anyone have any thoughts/suggestions?
Flags: needinfo?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: