Closed Bug 426987 Opened 16 years ago Closed 16 years ago

Crash [@ nsSubDocumentFrame::Reflow] with iframe onunload that modifies styles in parent frame

Categories

(Core :: Layout, defect, P2)

x86
macOS
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: jruderman, Assigned: roc)

References

()

Details

(Keywords: crash, testcase, topcrash)

Crash Data

Attachments

(3 files, 1 obsolete file)

Steps to reproduce:
1. Load the testcase.
2. Click the Home button.

Result: Crash [@ nsSubDocumentFrame::Reflow].

Reduced from http://store.nexternal.com/shared/StoreFront/default.asp.  I am not making this up; the page really does crazy stuff during an iframe's onunload.  I got the URL from bp-1b3e6854-0121-11dd-8a9b-001a4bd43ef6.

nsSubDocumentFrame::Reflow is the #14 topcrash in Firefox 3 Beta 5.  Bug 369547 has some more testcases that trigger crashes there.
I can take this
Assignee: nobody → roc
+'ing.  
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
nsSubDocumentFrame::Reflow is crashing because we have no mInnerView. We have no mInnerView because nsSubDocumentFrame::ShowDocShell, called by nsSubDocumentFrame::Init, failed in nsSubDocumentFrame::GetDocShell and bailed out. GetDocShell failed because the underling <iframe> element has no mFrameLoader. And that is null because we've already called DestroyContent on this <iframe> --- nsSubDocumentFrame::Reflow is in a flush caused by setting the value of some text control during the firing of OnLocationChange, so I guess the new viewer has already been installed.

So is the right thing to do here to make nsSubDocumentFrame tolerate the frame element having no frameloader? Is it expected that DestroyContent() can happen while associated frames are still alive and even still subject to flushed reflow and painting?
My gut reaction is that that is not expected no.
I would certainly expect us to be in that state during paint suppression, given our current code.  Just like we used to be in a state where the content nodes had no document pointer during paint suppression.

I think we should be disabling reflow on the zombie document during paint suppression, frankly.
I'm not sure why we'd be in paint suppression here, since the document has loaded and displayed already. Perhaps because we called Freeze() on the presshell, but why would we Freeze() it for the bfcache and then destroy the document?
> I'm not sure why we'd be in paint suppression here, since the document has
> loaded and displayed already.

Hold on...  Which document?  There are at least 3 here?  Which presshell is the one showing?  Which presshell is the flush happening on?

DOM mutations to a bfcached document would throw it out of bfcache, but those should tear down the presshell.

I do think we definitely shouldn't be reflowing in a presshell whose document has been destroyed.
Sorry for the confusion.

After loading the testcase ("the old document"), when you navigate to a new document (about:blank, say) nsDocShell::SetupNewViewer is called on the docshell. That calls mContentViewer->Close on the old document; mSavingOldViewer is false (not sure why but it doesn't matter) so we don't put the old document in bfcache, and Close calls nsDocument::Destroy. But we still have the old document's presentation, as the comment in SetupNewViewer says:

    // We don't show the mContentViewer yet, since we want to draw the old page
    // until we have enough of the new page to show.  Just return with the new
    // viewer still set to hidden.

It's a bit hard to be sure but it seems to me that we continue to process resize reflows on the old document in this state, and hover restyling too.

So perhaps we're calling nsDocument::Destroy too early and we shouldn't be calling it until the old document's presentation actually is no longer needed?
Yeah, that's the paint suppression thing I was thinking of.

It would make sense to me to not Destroy() the old document if we're going to paint-suppress it for a bit.
Attached patch wallpaper fixSplinter Review
This wallpaper fix in nsSubDocumentFrame is super easy and low risk.

I'll try moving the nsDocument::Destroy call, but I'm afraid that the resulting patch might be too scary for RC.
It looks like we could actually remove the nsDocument::Destroy call from nsDocumentViewer::Close. When the content viewer is destroyed, we'll call it again. When a new document loads and leaves paint suppression, it will call nsDocumentViewer::Show on its viewer which will call Destroy() on the previous viewer and reach nsDocument::Destroy that way.

However, that nsDocument::Destroy call was added here:
https://bugzilla.mozilla.org/show_bug.cgi?id=294258
So it looks like removing that nsDocument::Destroy call will regress that bug by making nsGenericHTMLElement::SaveState happen too late :-(.

So what should we do? Should we split the state-saving pass out of nsDocument::Destroy into a separate method? Perhaps it would be more efficient (in time and code size) to make nsDocument::Destroy and nsIContent::DestroyContent take a mode parameter which tells us whether to just save state or actually tear stuff down as well?
The state-saving really should not be happening from nsDocument::Destroy. IMHO it's a gross hack that it happens there. Everything else nsDocument::Destroy does is to make sure we don't leak so it's very out-of-place to be doing state-saving as well as ideally nsDocument::Destroy should go away completely in an XPCOMGC world.

The only scary thing is changing this stuff so close to RC :(

It is unfortunately very important still that nsDocument::Destroy gets called. It is trivial (and does happen on top pages) to create cycles that we can't collect unless it's called. So we have to make sure it gets called in all situations.
It will be called in all situations, via nsDocumentViewer::Destroy.

I'll make a patch to split out the state saving. We can take the wallpaper if the resulting patch looks to scary upon review.
Is bug 427004 related/duplicate of this one?
Attached patch SaveState fix (obsolete) — Splinter Review
This fixes the crash. Please estimate scariness level.

After we call Close() on the viewer, we either call Destroy() on it immediately after that, or we set it as the previous viewer for some new viewer. When that new viewer gets Destroy() or Show() called on it, we'll destroy the old viewer.

I suppose it's possible that this patch could create leaks if there is a chain of references leading from the old viewer's document to the new viewer. How possible is that.
Whiteboard: [need feedback]
> I suppose it's possible that this patch could create leaks if there is a chain
> of references leading from the old viewer's document to the new viewer. How
> possible is that.

Is this a new risk? Won't we still end up calling destroy on the old viewer when it goes out of the bfcache which we are now already relying on breaking cycles to the new viewer.
I guess so. OK, you can review this then :-).
Whiteboard: [need feedback] → [needs review]
Attachment #314069 - Flags: superreview?(jonas)
Attachment #314069 - Flags: review?(jonas)
looks like this might be #10 topcrash in fx3 beta 5 so it would be a good one to get.   friendster and a few other sites might also have some code that tickles the crash...

http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=contains&range_unit=weeks&version=Firefox%3A3.0b5&signature=nsSubDocumentFrame%3A%3AReflow(nsPresContext*%2C+nsHTMLReflowMetrics%26%2C+nsHTMLReflowState+const%26%2C+unsigned+int%26)&range_value=1

then sort on comments to see.

Keywords: topcrash
Comment on attachment 314069 [details] [diff] [review]
SaveState fix

You should rev the IID of nsIContent and nsIDocument. Looks good otherwise.
Attachment #314069 - Flags: superreview?(jonas)
Attachment #314069 - Flags: superreview+
Attachment #314069 - Flags: review?(jonas)
Attachment #314069 - Flags: review+
Attached patch updated fixSplinter Review
patch with updated UUIDs, should be ready for checkin
Attachment #314069 - Attachment is obsolete: true
Attachment #314544 - Flags: superreview+
Attachment #314544 - Flags: review+
I should be able to check ths in today but if someone else gets to it first, that's fine.
Keywords: checkin-needed
Whiteboard: [needs review] → [reviewed patch in hand]
Comment on attachment 314544 [details] [diff] [review]
updated fix

#10 topcrash, risk is quite low.
Attachment #314544 - Flags: approval1.9?
removing checkin-needed until approval is granted
Keywords: checkin-needed
Blocks: 427982
Blocks: 427952
Comment on attachment 314544 [details] [diff] [review]
updated fix

a1.9=beltzner
Attachment #314544 - Flags: approval1.9? → approval1.9+
Checked in, with crashtest.
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Blocks: 428840
Blocks: 427004
No longer blocks: 428840
Depends on: 428840
No longer depends on: 428840
Depends on: 428840
Depends on: 429172
So it looks like this caused bug 428840 and bug 429172.
verified fixed using  Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9pre) Gecko/2008050621 Firefox/3.0pre and the testcase - no crash on testcase

--> Verified fixed
Status: RESOLVED → VERIFIED
Whiteboard: [reviewed patch in hand]
Crash Signature: [@ nsSubDocumentFrame::Reflow]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: