Closed Bug 319551 Opened 19 years ago Closed 18 years ago

crash when browsing testcase for bug 314922 [@ nsDocLoader::QueryInterface] [@ DocumentViewerImpl::Destroy]

Categories

(Core :: DOM: Navigation, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED

People

(Reporter: ispiked, Assigned: bryner)

References

()

Details

(Keywords: crash, verified1.8.0.5, verified1.8.1)

Crash Data

Attachments

(4 files)

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a1) Gecko/20051207 Firefox/1.6a1

While messing around with the testcase in bug 314922 I crashed when I hit the back button. 

#9  0x000000c9 in ?? ()
#10 0xb5aebd54 in nsDocLoader::QueryInterface (this=0x89c9e88, aIID=@0xb5b1fef0, aInstancePtr=0xbfdbe70c) at /moz/mozilla/uriloader/base/nsDocLoader.cpp:240
#11 0xb5aa2bfe in nsDocShell::QueryInterface (this=0x89c9e88, aIID=@0xb5b1fef0, aInstancePtr=0xbfdbe774) at /moz/mozilla/docshell/base/nsDocShell.cpp:376
#12 0xb5af0098 in CallQueryInterface<nsISupports, nsDocLoader> (aSource=0x89c9f1c, aDestination=0xbfdbe774) at nsISupportsUtils.h:223
#13 0xb5aeca14 in nsDocLoader::GetAsDocLoader (aSupports=0x89c9f1c) at /moz/mozilla/uriloader/base/nsDocLoader.cpp:272
#14 0xb5ab739b in nsDocShell::AddChild (this=0x864ee40, aChild=0x89c9f1c) at /moz/mozilla/docshell/base/nsDocShell.cpp:2341
#15 0xb5abc89d in nsDocShell::RestoreFromHistory (this=0x864ee40) at /moz/mozilla/docshell/base/nsDocShell.cpp:5531
#16 0xb5abce62 in HandleRestorePresentationEvent (aEvent=0x8952260) at /moz/mozilla/docshell/base/nsDocShell.cpp:5094
#17 0xb7e349ca in PL_HandleEvent (self=0x8952260) at /moz/mozilla/xpcom/threads/plevent.c:688
#18 0xb7e34856 in PL_ProcessPendingEvents (self=0x80a5810) at /moz/mozilla/xpcom/threads/plevent.c:623
#19 0xb7e378d9 in nsEventQueueImpl::ProcessPendingEvents (this=0x80a0940) at /moz/mozilla/xpcom/threads/nsEventQueue.cpp:417
#20 0xb6656190 in event_processor_callback (source=0x84873b0, condition=G_IO_IN, data=0x80a0940) at /moz/mozilla/widget/src/gtk2/nsAppShell.cpp:67
#21 0xb779531c in g_vasprintf () from /usr/lib/libglib-2.0.so.0
#22 0xb776e4ee in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#23 0xb77714f6 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#24 0xb77717e3 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#25 0xb7b72e65 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#26 0xb6656b86 in nsAppShell::Run (this=0x813b2f0) at /moz/mozilla/widget/src/gtk2/nsAppShell.cpp:139
#27 0xb65af77f in nsAppStartup::Run (this=0x813b2a8) at /moz/mozilla/toolkit/components/startup/src/nsAppStartup.cpp:161
#28 0xb7f8b0ed in XRE_main (argc=2, argv=0xbfdbef34, aAppData=0x8049820) at /moz/mozilla/toolkit/xre/nsAppRunner.cpp:2289
#29 0x08048676 in main (argc=2, argv=0xbfdbef34) at /moz/mozilla/browser/app/nsBrowserApp.cpp:61
I get crashes pretty often when browsing http://www.seanet.com/~bradbell/omhelp/framelink.xml with a nightly build on Mac.  I can't get my debug build to crash there.  Many of the crashes are at DocumentViewerImpl::Destroy, and many of them don't involve the back button.

I set browser.sessionhistory.max_total_viewers set to 3 to test, and that seemed to make the crashes more frequent, but I'm not sure.

See also bug 333050.
Keywords: crash
OS: Linux → All
Hardware: PC → All
Summary: crash when browsing testcase for bug 314922 [@ nsDocLoader::QueryInterface] → crash when browsing testcase for bug 314922 [@ nsDocLoader::QueryInterface] [@ DocumentViewerImpl::Destroy]
Here's what I've been doing:
1. Load http://www.seanet.com/~bradbell/omhelp/framelink.xml
2. In the sidebar, click each of the four links near the top, starting with "Framelink" (last) and going to "overview" (first).
3. Click "Back" a few times.

That crashes about 60% of the time, but not always at the same point.  Sometimes the first click on "Framelink" causes a crash; sometimes the first click on the Back button causes a crash.
Blocks: 333050
No longer depends on: 314922
Flags: blocking1.9a1?
Flags: blocking1.8.1?
Flags: blocking1.8.0.3?
Flags: blocking1.8.0.3? → blocking1.8.0.3+
looks bfcache related -> bryner
Assignee: adamlock → bryner
Too late to bake a patch for 1.8.0.4
Flags: blocking1.8.0.5?
Flags: blocking1.8.0.4-
Flags: blocking1.8.0.4+
Flags: blocking1.8.0.5? → blocking1.8.0.5+
I finally figured out what's going on here.  The key thing about this site is that the toplevel frameset document has an XSLT transform applied to it.  Here's the sequence of events:

- frameset document starts to load
- 2 frames start to load in the frameset, creating child docshells
- xslt replaces the frameset document with a transformed document
- the 2 "new" frames start to load in the new frameset document

At this point, the top docshell has 4 children, because the original 2 were never cleared.  They do go away eventually, because the original document gets garbage collected.  However, if you navigate away before that happens, you end up with the orphaned docshells in the SHEntry, where they aren't strongly referenced.

It seems like the right approach would be along the lines of clearing the child docshell list when DocumentViewerImpl::SetDOMDocument() is called.  I'll try that out and see if it looks workable.  If it's problematic, another approach would be to disable bfcache for XSLT-generated documents.  In either case, we should probably also change nsSHEntry to use strong references for the child shells.

This bug would potentially affect any site which uses XSLT and containes a frame or iframe.  I also think this must be a different problem from bug 333050, since tinderbox does not use XSLT.
No longer blocks: 333050
adding peterv in case he has thoughts on the XSLT involvement here
Attached patch patchSplinter Review
This is the fix I described above.  With this patch, I no longer see the extra docshell children persisted in session history.  I also changed the references in the SHEntry to be owning.

One thing I'm not completely sure about is the comment in DocumentViewerImpl:::SetDOMDocument() which says that we assume that the layout of the current document hasn't started yet.  In the case of XSLT, I'm not sure this is necessarily true -- it seems like we've at least started creating child docshells by the time this method is called.  Is there something else bad going on?
Attachment #226178 - Flags: superreview?(bzbarsky)
Attachment #226178 - Flags: review?(roc)
Comment on attachment 226178 [details] [diff] [review]
patch

>Index: layout/base/nsDocumentViewer.cpp
>+    // Clear the list of old child docshells. CChild docshells for the new

s/CChild/Child/

I assume this plays nice with all three XSLT callers of SetDOMDocument, including the one where we pass in the source document on transform error?  Please make sure to test all three modes.

Also, I guess that CreateAboutBlankContentViewer is still happy with this?

>+    // document will be constructed as frames are created.

We create child docshells even if frames are not created, for HTML.  Does that matter here?  In any case, the comment doesn't really make sense...

The code in general looks ok, with the above fixed.
Attachment #226178 - Flags: superreview?(bzbarsky) → superreview+
Flags: blocking1.8.1? → blocking1.8.1+
not the same as 333050 (discussed with dbaron)
No longer blocks: 333050
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Oops, I meant to mark another bug fixed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 226178 [details] [diff] [review]
patch

r=sicking, but please do test with an error-document per comment 8

Ping me if you need an xslt that produces an error.
Attachment #226178 - Flags: review?(roc) → review+
Comment on attachment 226178 [details] [diff] [review]
patch

a=sicking for drivers assuming it passes the XSLT tests.
Attachment #226178 - Flags: approval1.8.1?
Attachment #226178 - Flags: approval1.8.0.5+
(In reply to comment #8)
> I assume this plays nice with all three XSLT callers of SetDOMDocument,
> including the one where we pass in the source document on transform error? 
> Please make sure to test all three modes.

I'm not sure which case you're referring to, actually.  For a transform error, it creates a new error document and passes it to OnTransformDone.
OnTransformDone will then call SetDOMDocument. Actually, I just realized that I should add another testcase where we will probably call SetDOMDocument multiple times for a single transform.
Attached file testcase 3
This one should make us call SetDOMDocument twice, once when we realize that the transform will produce an HTML document, and once when the transform fails.
Attachment #226178 - Flags: approval1.8.1? → approval1.8.1+
(In reply to comment #17)
> Created an attachment (id=226571) [edit]
> testcase 3
> 
> This one should make us call SetDOMDocument twice, once when we realize that
> the transform will produce an HTML document, and once when the transform fails.
> 

This testcase actually crashes for me on the branch, but I'm not sure it has anything to do with my changes:

 	xpcom_core.dll!PL_DHashTableEnumerate(PLDHashTable * table=0x03975a80, PLDHashOperator (PLDHashTable *, PLDHashEntryHdr *, unsigned int, void *)* etor=0x024c62e0, void * arg=0x0012f920)  Line 673 + 0x24	C
 	gklayout.dll!nsBaseHashtable<nsURIHashKey,SheetLoadData *,SheetLoadData *>::Enumerate(PLDHashOperator (nsIURI *, SheetLoadData * &, void *)* enumFunc=0x024c4320, void * userArg=0x03975a38)  Line 221 + 0x13	C++
>	gklayout.dll!CSSLoaderImpl::Stop()  Line 1889	C++
 	transformiix.dll!txTransformNotifier::SignalTransformEnd(unsigned int aResult=0x80600008)  Line 954	C++
 	transformiix.dll!txTransformNotifier::OnTransformEnd(unsigned int aResult=0x80600008)  Line 909	C++
 	transformiix.dll!txMozillaXMLOutput::endDocument(unsigned int aResult=0x80600008)  Line 229	C++
 	transformiix.dll!txExecutionState::end(unsigned int aResult=0x80600008)  Line 207	C++
 	transformiix.dll!txMozillaXSLTProcessor::TransformToDoc(nsIDOMDocument * aOutputDoc=0x00000000, nsIDOMDocument * * aResult=0x00000000)  Line 634	C++
 	transformiix.dll!HandleTransformBlockerEvent(PLEvent * aEvent=0x039b9860)  Line 483	C++
 	xpcom_core.dll!PL_HandleEvent(PLEvent * self=0x039b9860)  Line 688 + 0xa	C
 	xpcom_core.dll!PL_ProcessPendingEvents(PLEventQueue * self=0x00fcf4f8)  Line 623 + 0x9	C
(In reply to comment #18)
> This testcase actually crashes for me on the branch, but I'm not sure it has
> anything to do with my changes:

I just tried this testcase on a clean branch build, didn't crash for me.
ok, unfortuantely I won't be able to look at this more until tonight at the earliest.  If someone wants to verify proper behavior with these testcases and get the patch in before then, that would be much appreciated.
Looks like the testcase 3 crash is something else entirely (crash calling CSSLoader::Stop if CreateSheet had never been called).  So, I think this patch is safe to land, and I'll do so.
filed bug 342766 on the "testcase 3" crash
fix checked in everywhere
Status: REOPENED → RESOLVED
Closed: 18 years ago18 years ago
Resolution: --- → FIXED
v.fixed on 1.8.0 branch with Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.5) Gecko/20060706 Firefox/1.5.0.5, no crashes for me with any of the testcases.
also verified with Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1b1) Gecko/20060815 BonEcho/2.0b1, no crash on the testcases
Status: RESOLVED → VERIFIED
Flags: blocking1.9a1?
Depends on: 354908
Crash Signature: [@ nsDocLoader::QueryInterface] [@ DocumentViewerImpl::Destroy]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: