Last Comment Bug 468211 - Crash [@ nsCSSFrameConstructor::AdjustParentFrame] with DOMAttrModified, observes, binding and focusing
: Crash [@ nsCSSFrameConstructor::AdjustParentFrame] with DOMAttrModified, obse...
Status: VERIFIED FIXED
[sg:critical]
: crash, regression, testcase, verified1.9.0.12, verified1.9.1
Product: Core
Classification: Components
Component: XUL (show other bugs)
: Trunk
: x86 Windows XP
: P2 critical (vote)
: ---
Assigned To: Olli Pettay [:smaug] (vacation Aug 25-28)
:
Mentors:
Depends on: 471166 506212
Blocks: 467422 454361 472668
  Show dependency treegraph
 
Reported: 2008-12-05 21:26 PST by Martijn Wargers [:mwargers] (not working for Mozilla)
Modified: 2011-06-13 10:01 PDT (History)
18 users (show)
roc: blocking1.9.1+
dveditz: blocking1.9.0.12+
dveditz: wanted1.9.0.x+
jruderman: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (834 bytes, application/vnd.mozilla.xul+xml)
2008-12-05 21:26 PST, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
testcase2 (zipped) (700 bytes, application/zip)
2008-12-12 15:36 PST, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
testcase3 (579 bytes, application/vnd.mozilla.xul+xml)
2009-01-08 18:07 PST, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
stack of node removal (14.05 KB, text/plain; charset=UTF-8)
2009-01-15 11:51 PST, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details
valgrind's explanation for the crash on testcase 2, with preceding assertions (6.47 KB, text/plain; charset=UTF-8)
2009-01-15 13:48 PST, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details
patch to add an assertion in EndReconstruct that makes things clearer (4.33 KB, patch)
2009-01-15 15:40 PST, David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch)
no flags Details | Diff | Splinter Review
patch, requires bug 471166 (1.11 KB, patch)
2009-01-26 02:26 PST, Olli Pettay [:smaug] (vacation Aug 25-28)
jonas: review+
jonas: superreview+
Details | Diff | Splinter Review

Description Martijn Wargers [:mwargers] (not working for Mozilla) 2008-12-05 21:26:10 PST
Created attachment 351659 [details]
testcase

See testcase, which crashes current trunk build within 100ms.

This regressed between 2008-09-11 and 2008-09-12:
http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2008-09-11+10%3A00%3A00&enddate=2008-09-12+03%3A00%3A00
I think a regression from bug 454361 somehow.

http://crash-stats.mozilla.com/report/index/6e989257-4557-4111-aa89-7695d2081205?p=1
0  	xul.dll  	nsCSSFrameConstructor::AdjustParentFrame  	 layout/base/nsCSSFrameConstructor.cpp:3407
1 	xul.dll 	nsCSSFrameConstructor::ConstructFrameInternal 	layout/base/nsCSSFrameConstructor.cpp:7494
2 	xul.dll 	nsCSSFrameConstructor::ConstructFrame 	layout/base/nsCSSFrameConstructor.cpp:7406
3 	xul.dll 	nsCSSFrameConstructor::ContentInserted 	layout/base/nsCSSFrameConstructor.cpp:8997
4 	xul.dll 	nsCSSFrameConstructor::RecreateFramesForContent 	layout/base/nsCSSFrameConstructor.cpp:11143
5 	xul.dll 	nsCSSFrameConstructor::ProcessRestyledFrames 	layout/base/nsCSSFrameConstructor.cpp:9868
6 	xul.dll 	nsCSSFrameConstructor::RestyleElement 	layout/base/nsCSSFrameConstructor.cpp:9942
7 	xul.dll 	nsCSSFrameConstructor::ProcessOneRestyle 	layout/base/nsCSSFrameConstructor.cpp:13280
8 	xul.dll 	nsCSSFrameConstructor::ProcessPendingRestyles 	layout/base/nsCSSFrameConstructor.cpp:13378
9 	xul.dll 	PresShell::DoFlushPendingNotifications 	layout/base/nsPresShell.cpp:4541
10 	xul.dll 	PresShell::FlushPendingNotifications 	layout/base/nsPresShell.cpp:4505
11 	xul.dll 	nsCSSFrameConstructor::RestyleEvent::Run 	layout/base/nsCSSFrameConstructor.cpp:13451
12 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:510
13 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:170
14 	nspr4.dll 	PR_GetEnv 	
15 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:87
16 	firefox.exe 	firefox.exe@0x2197 	
17 	kernel32.dll 	BaseProcessStart
Comment 1 Martijn Wargers [:mwargers] (not working for Mozilla) 2008-12-05 21:28:36 PST
Because of the load delay by loading it over the network, there is a chance it doesn't crash. In that case, load the testcase in offline mode or test the testcase locally.
Comment 2 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2008-12-09 21:02:53 PST
I don't see a crash on Linux, either from the network or from a local file, although I do see some assertions and warnings.
Comment 3 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2008-12-10 15:38:43 PST
I crash on Windows from a local file, though.
Comment 4 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2008-12-10 15:42:16 PST
The variable parentFrame in ContentInserted is full of 0xdddddddd, and then we crash when we start accessing it.
Comment 5 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2008-12-10 15:50:14 PST
And the thing we're calling RecreateFramesForContent on is the tooltip element.
Comment 6 Martijn Wargers [:mwargers] (not working for Mozilla) 2008-12-12 15:36:04 PST
Created attachment 352799 [details]
testcase2 (zipped)

This is probably the same issue.
Comment 7 Martijn Wargers [:mwargers] (not working for Mozilla) 2009-01-08 18:07:17 PST
Created attachment 356113 [details]
testcase3
Comment 8 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 11:38:00 PST
testcase2 crashes reliably on Linux.
Comment 9 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 11:51:02 PST
Created attachment 357202 [details]
stack of node removal

This is from debugging testcase2.

So I think the problem is related to this stack:  we're removing the node here, and then on unwind from here, we're then trying to do frame construction on a node that's no longer in the document, which crashes.

Before this point, we get:

###!!! ASSERTION: killing mutation events: 'nsContentUtils::IsSafeToRunScript()', file ../../../dist/include/content/nsContentUtils.h, line 1528

called from what's frame #13 in this stack (but line 4336), but then we just go on and run the mutation events anyway, at this stack.  I'd have though that the assertion would imply that we wouldn't run the mutation events given that some of the script blockers were not removable.
Comment 10 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 11:57:23 PST
... in particular, we unwind to frame #24, and then get a pile of assertions when it calls ConstructFrameByDisplayType ... ConstructBlock ... ProcessChildren

But we actually don't crash until reflow, where we try to reflow some deleted frames.
Comment 11 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 13:48:50 PST
Created attachment 357228 [details]
valgrind's explanation for the crash on testcase 2, with preceding assertions
Comment 12 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 15:33:45 PST
But the crash is a result of having frames for an element with a null document pointer, because of the warning from RecreateFramesForContent:

WARNING: NS_ENSURE_TRUE(aContent->GetDocument()) failed: file /home/dbaron/builds/mozilla-central/mozilla/layout/base/nsCSSFrameConstructor.cpp, line 10984

since this is in the middle of a style context rebuild, we end up crashing as a result of skipping recreation of frames, since we depended on the frame reconstruction to delete the style contexts with dangling pointers.

There are a few things that are fragile here that we could perhaps fix:  

  1. rule tree reconstruction assuming all pointers into the old rule tree will be destroyed

  2. style reresolution relying on frame reconstruction to destroy all the frames with old style context pointers (and the resulting skipping of the style context pointer fixup)

but it does look like the underlying problem comes down to constructing frames for a node that's not in the document.
Comment 13 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 15:40:20 PST
Created attachment 357249 [details] [diff] [review]
patch to add an assertion in EndReconstruct that makes things clearer

I don't like this; I'm going to rewrite it.  But I want to save it here because it does have potentially useful child enumeration code (which I'm going to get rid of for the next version).
Comment 14 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-15 15:50:30 PST
(In reply to comment #12)
> There are a few things that are fragile here that we could perhaps fix:  

  3. Make RecreateFramesForContent destroy the frames but not rebuild them in the case we're hitting.
Comment 15 Jonas Sicking (:sicking) No longer reading bugmail consistently 2009-01-15 16:19:19 PST
The problem is somewhere in frame 15-18. We really shouldn't be calling SetAttr with aNotify=1 when it's unsafe to run script. One of the functioncalls in those frames need to be put in a scriptrunner. I'm guessing either the call to SynchronizeBroadcastListener or the calls to SetAttr.

There really is no other way to fully fix the crash here since the script in the mutation observer can change the whole world under the frame-ctor.
Comment 16 Jonas Sicking (:sicking) No longer reading bugmail consistently 2009-01-15 16:27:33 PST
Neil, do you have an opinion on what to put off for when it's safe to run scripts? See above comment 15.
Comment 17 Jonas Sicking (:sicking) No longer reading bugmail consistently 2009-01-15 16:30:28 PST
Since this can lead to exploitable crashes I'm marking this security sensitive.
Comment 18 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-17 00:21:29 PST
I moved the business about asserting on EndReconstruct into bug 473871.
Comment 19 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-01-23 21:15:55 PST
And I've decided I am going to fix the rule tree thing in bug 475128.

However, the script could have done much worse things here that layout can't protect against; this is really a content bug.  (Sorry, should have moved this last week.)
Comment 20 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-01-24 03:44:15 PST
I need to test if Bug 471166 fixes this one.
Comment 21 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-01-26 01:46:04 PST
This doesn't crash anymore, at least not in trunk.
I get a "killing mutation events" assertion.
Comment 22 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-01-26 01:51:26 PST
Ah, sorry, the testcase 3 does still crash :( And Bug 471166 doesn't help here.
Comment 23 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-01-26 02:26:12 PST
Created attachment 358815 [details] [diff] [review]
patch, requires bug 471166     

Fixes also Bug 472668.

I need to still run all the tests.
Comment 24 Jesse Ruderman 2009-01-30 13:05:08 PST
FWIW, the patch in bug 475128 didn't fix this.  I get a crash in nsCSSFrameConstructor::ConstructFrameInternal when I load the testcase.
Comment 25 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-01-31 04:58:47 PST
http://hg.mozilla.org/mozilla-central/rev/b01ef42a6723
Comment 26 Martijn Wargers [:mwargers] (not working for Mozilla) 2009-02-01 08:17:25 PST
Verified fixed, using:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090201 Minefield/3.2a1pre (.NET CLR 3.5.30729)
Comment 27 Daniel Veditz [:dveditz] 2009-02-02 14:31:53 PST
Shouldn't we take this one on 1.9.0.7 with the others (bug 466057 et al.) you've fixed like this?
Comment 28 Daniel Veditz [:dveditz] 2009-02-02 14:42:05 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/c47c92bbf492
Comment 29 Daniel Veditz [:dveditz] 2009-02-02 14:48:55 PST
Comment on attachment 358815 [details] [diff] [review]
patch, requires bug 471166     

Approved for 1.9.0.7, a=dveditz for release-drivers.
Comment 30 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-02-03 05:41:51 PST
The patch depends on 1.9.1/trunk broadcasting changes.
Comment 31 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-02-03 05:43:15 PST
Changing blocking1.9.0.7+ to blocking1.9.0.7?
Comment 32 Daniel Veditz [:dveditz] 2009-02-04 16:25:35 PST
Can you stop this crash working around the missing trunk changes? Would it be hard to back-port the trunk broadcasting changes?
Comment 33 Tony Chung [:tchung] 2009-02-04 18:16:08 PST
Verified fix on Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3pre) Gecko/20090204 Shiretoko/3.1b3pre Ubiquity/0.1.5
Comment 34 Daniel Veditz [:dveditz] 2009-03-01 10:34:00 PST
These self-referencing testcases no longer work from bugzilla. Is there something we could change on the server end, maybe let the documents be cached so the self-references don't trigger a re-load and therefore a re-redirect?
Comment 35 Daniel Veditz [:dveditz] 2009-03-01 10:42:04 PST
testcase3 (at least) is crashing in the same place in 1.9.0.8pre as reported here for trunk. We ought to get this fix into 1.9.0.8
Comment 36 Reed Loden [:reed] (use needinfo?) 2009-03-01 15:24:04 PST
(In reply to comment #34)
> These self-referencing testcases no longer work from bugzilla. Is there
> something we could change on the server end, maybe let the documents be cached
> so the self-references don't trigger a re-load and therefore a re-redirect?

I don't think so... the tokens used for security bugs are one-time-use-only.
Comment 37 Daniel Veditz [:dveditz] 2009-03-16 14:20:01 PDT
Still need an answer to comment 32 from Olli, but clearly this isn't going to make tomorrow's code freeze regardless.
Comment 38 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-03-17 03:21:21 PDT
(In reply to comment #37)
> but clearly this isn't going to
> make tomorrow's code freeze regardless.
Sorry about that.

I think all the broadcaster changes should be ported to branch. There are quite
a few... :/
Comment 39 Daniel Veditz [:dveditz] 2009-04-17 11:05:41 PDT
Now that 3.5b4 is essentially done you've got time to roll-up all the broadcaster changes you're talking about? Not going to make 1.9.0.10 because that code-freezes next week, but please start working on it now so it doesn't miss another release.
Comment 40 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-06-14 08:14:22 PDT
bug 445177
Comment 41 Olli Pettay [:smaug] (vacation Aug 25-28) 2009-06-15 00:43:40 PDT
The assertions are fixed now in 1.9.0.12. I can't reliably reproduce the
crash on 1.9.0.x.
Martijn, could you help verifying this and bug 472668.
Comment 42 Al Billings [:abillings] 2009-06-30 16:59:00 PDT
Testcase 3 crashes reliably for me on Windows XP with 3.0.11. With 3.0.12pre (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.12pre) Gecko/2009063005 GranParadiso/3.0.12pre), it does not crash. Marking verified1.9.0.12.

Other cases do not reproduce a crash.
Comment 43 Jesse Ruderman 2009-08-08 16:48:50 PDT
Added crashtests, but had to disable one temporarily due to bug 509269.

http://hg.mozilla.org/mozilla-central/rev/b3f6bc8b17e8

Note You need to log in before you can comment on or make changes to this bug.