Closed Bug 654002 Opened 9 years ago Closed 9 years ago

Stack overflow with signs of memory corruption (under nsContainerFrame::RemoveFrame)

Categories

(Core :: Layout, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: rh01, Assigned: mats)

References

Details

(Keywords: testcase, Whiteboard: [sg:dos] stack exhaustion)

Attachments

(9 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:2.0.1) Gecko/20110430 Firefox/4.0.1 Iceweasel/4.0.1
Build Identifier: 

Performing the steps described, a stack overflow occurs due to a lot of recursive calls starting at nsFrameManager::RemoveFrame


Reproducible: Always

Steps to Reproduce:
1) Open firefox
2) Open testcase file
3) Open firebug
4) Access DOM panel
5) Expand Variable A
6) Unexpand Variable A
7) Crash (Stack Overflow)
Attached file testcase
Attached file Calltrace WinDBG
The calltrace done in WinDBG before falling into the recursion and after the stack overflow
Once as I hit the stack overflow crash, some pointers were overwritten with 0x41414141
By this bug, Firefox 3.6.17 and Firefox 4.01 with the corresponding Firebug version is affected

Always reproducible:

1) Win XP SP3 (1-2 GB RAM VirtualBox 3.2 Host: Debian 6.0)

1.1) Firefox 4.01 Firebug 1.7.0
bp-9008e4f7-d33c-426c-946d-14cc72110501
bp-1ec5d670-50b4-4284-94d3-689dd2110501

1.2) Firefox 3.6.17 Firebug 1.6.2
bp-febca605-115f-4cd1-be44-577072110501
bp-bb345367-b99a-4893-ba10-fbc452110501


2) Windows 7 Home Starter (on a Samsung N130 1GB RAM)

2.1) Firefox 4.01 Firebug 1.7.0
bp-bcaaa27b-0ef7-4e7e-812e-d09282110501
bp-af827d17-a2fb-4484-82b1-682c52110501

2.2) Firefox 3.6.17 Firebug 1.6.2
bp-396930b7-8ff2-4400-a7ab-a56c32110501
bp-1e6c4f1e-3b2c-47ed-bb3d-a7ba62110501


Exploit classification from WinDBG: UNKNOWN

Not reproducible on Windows 7 in Virtualbox, and once in a while it crashed on Ubuntu 10.04 in Virtualbox.
Maybe it is related to bug 651990 and/or bug 633927
The crash stacks do not at all look dangerous (stack exhaustion is a safe denial of service). We've seen similar reports where WinDBG shows pointers being overwritten that we cannot otherwise reproduce. I wonder if testcases like this are corrupting --windbg-- memory?
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #6)
> The crash stacks do not at all look dangerous (stack exhaustion is a safe
> denial of service). We've seen similar reports where WinDBG shows pointers
> being overwritten that we cannot otherwise reproduce. I wonder if testcases
> like this are corrupting --windbg-- memory?

I don't know if winDBG is showing wrong memory contents.
When setting a breakpoint at xul!nsContainerFrame::RemoveFrame and then at xul!SearchTable, the recursion can be interrupted at xul!SearchTable. The stackframes at this point look like the ones from the stack overflow except that the addresses are different (see WinDBG_Outputs.txt). The first argument 'table' is on the stack and points back into the code section of xul.dll (0x1010e3e7). I don't know if this is supposed to be so as the PLDHashTable struct seems to be corrupted (see PLDHashTable.png). There the executable code at address 0x1010e3e7 (pop edi; pop esi; pop ebp; pop ebx) is interpreted as a data location.
After the stack overflow, the first argument 'table' is somewhere at address 0x27a50ab0 and not accessible. I saw that these addresses differ from crash to crash and sometimes point to accessible memory regions filled with the unicode string created by the testcase. But still this could be a winDBG issue.
Please correct me if something in this analysis is wrong.
Attached file WinDBG logs
WinDBG outputs related to comment 7
Attached image PLDHashTable.png
structure of the PLDHashTable struct after hitting the breakpoint at xul!SearchTable during the recursion
Component: Security → Layout
Product: Firefox → Core
QA Contact: firefox → layout
Summary: Stack overflow with signs of memory corruption → Stack overflow with signs of memory corruption (under nsContainerFrame::RemoveFrame)
Whiteboard: screen shots look bad, crash-stats says stack exhaustion?
roc: please assign someone from your team to look at this bug.
Assignee: nobody → roc
Whiteboard: screen shots look bad, crash-stats says stack exhaustion? → [sg:dos] screen shots look bad, crash-stats says stack exhaustion?
The STR produces an inline frame that has ~27000 continuations.
Bug 395316 made nsContainerFrame::RemoveFrame recurse on the
continuation chain.
Blocks: 395316
Keywords: testcase
Whiteboard: [sg:dos] screen shots look bad, crash-stats says stack exhaustion? → [sg:dos] stack exhaustion
Here's a testcase that produces ~32000 continuations and then destroys them.
It crashes on Windows XP, but not on Linux64.
Attached patch wip1 (obsolete) — Splinter Review
Comment on attachment 539973 [details] [diff] [review]
wip1

Note the change from "RemoveFrame(nsnull" to "RemoveFrame(aListName" -
I don't see any reason to not propagate 'nsGkAtoms::nextBidi'.

The crash test is awfully slow, not sure if it's acceptable:
Linux opt:     22903ms
Linux debug:   46961ms
Linux64 opt:   26744ms
Linux64 debug: 53332ms
OS X opt:      20642ms
OS X debug:    49760ms
OS X64 opt:    27883ms
OS X64 debug:  59593ms
Win opt:       22872ms
Win debug:     49847ms
WinXP opt:     22599ms
WinXP debug:   51188ms
Android opt:  109663ms
WinXP opt:     22599ms

I guess we could exclude the debug builds...
Attachment #539973 - Flags: review?(roc)
Comment on attachment 539973 [details] [diff] [review]
wip1

Review of attachment 539973 [details] [diff] [review]:
-----------------------------------------------------------------

I think you should disable the crashtest though, that's too long :-(
Attachment #539973 - Flags: review?(roc) → review+
I think we need a way to run slow tests like this, even if we can't
run them for every build.  Filed bug 666064 on that.
Landed without the crashtest:
http://hg.mozilla.org/integration/mozilla-inbound/rev/bc0b2201ca85
Flags: in-testsuite?
Whiteboard: [sg:dos] stack exhaustion → [sg:dos][inbound] stack exhaustion
Backed out by philor. 

"Back out bc0b2201ca85 (bug 654002) on suspicion of causing WinXP opt mochitest crashes"  http://hg.mozilla.org/integration/mozilla-inbound/rev/8c598532de9f
Whiteboard: [sg:dos][inbound] stack exhaustion → [sg:dos] stack exhaustion
The last mozilla-inbound changesets has the same oranges so I believe this patch didn't create them. Otherwise, the last changeset causes the exact same crash...
Looks like it landed on mozilla-central without incidents:
http://hg.mozilla.org/mozilla-central/rev/3e54c496db92

(the real culprit for the M1/M3 crashes is bug 664821)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Version: unspecified → Trunk
I didn't mean to reopen...
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla7
Depends on: 667025
Backed out for causing crash bug 667025.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch fix v2Splinter Review
Attachment #539973 - Attachment is obsolete: true
Attachment #574532 - Flags: review?(roc)
Attached patch crash testsSplinter Review
https://hg.mozilla.org/integration/mozilla-inbound/rev/32f510d1c4c2
(holding the crash tests until this bug is public)
Whiteboard: [sg:dos] stack exhaustion → [sg:dos] stack exhaustion [inbound]
Target Milestone: mozilla7 → mozilla11
https://hg.mozilla.org/mozilla-central/rev/32f510d1c4c2
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Hardware: x86 → All
Resolution: --- → FIXED
Whiteboard: [sg:dos] stack exhaustion [inbound] → [sg:dos] stack exhaustion
Blocks: 696449
Landed the crash tests:
https://hg.mozilla.org/integration/mozilla-inbound/rev/902189a8bf87
Group: core-security
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.