I've been seeing intermittent crashes for months (since around when fastback landed, or stabilized) while using tinderbox. In particular, the crash usually happens after I:
1. load tinderbox
2. click on one of the things (name / "L") that causes a popup in an iframe
3. click on one of the links in that popup
4. go back

although sometimes (as in the case I'm debugging now), the steps seem to be:
1-2. same as above
3. let page meta-refresh, probably after closing popup

The steps may require having the same tinderbox page loaded in multiple windows (independently meta-refreshing).

I have not yet found steps to reliably reproduce this, but I've been seeing it for months, and it's clearly not getting fixed, so I'm thinking some code inspection may be in order.

(More data coming, in the form of attachments, which will take me some time to produce.)
(In reply to comment #0)
> although sometimes (as in the case I'm debugging now), the steps seem to be:
> 1-2. same as above
> 3. let page meta-refresh, probably after closing popup

Ignore this bit -- I was misinterpreting the log. And my memory is certainly that I usually crash when going back from a bonsai page (reached through a who.cgi popup) to a tinderbox page.
That has the earmarks of us calling QI on a deleted docshell, which would be Very Bad (like "probably exploitable" bad). :( bryner, are you going to have a chance to look at this in time for 18.104.22.168?
Severity: normal → critical
FWIW, this is quite hard to reproduce. I probably do these actions a few times a day on average, and I crash this way about once every week or two.
(But when I say I crash once every week or two, that's probably over 50% of the total crashes I experience when not testing other people's crash bugs.)
> 2. click on one of the things (name / "L") that causes a popup in an iframe

For me, the name popups are iframes, but the L popups are just absolutely positioned divs.
See also bug 319551, which (sometimes?) has the same stack signature, and is easier to reproduce (at least for me).
I think the trick to reproducing this crash is to click on the tinderbox page to close the popup as the bonsai page is loading.
I just crashed in a slightly different way (from DocumentViewerImpl::Destroy) trying to reproduce this -- I think because the tinderbox page didn't go in the fastback cache for some reason:

#5 <signal handler called>
#6 0x00010001 in ?? ()
#7 0x03ac64c7 in ns_if_addref<nsIDocShellTreeItem*> (expr=0x90815c4) at ../../../dist/include/xpcom/nsISupportsUtils.h:114
#8 0x03af6176 in nsSHEntry::ChildShellAt (this=0x8e2d310, aIndex=0, aShell=0xbffc6724) at /builds/reflow/mozilla/docshell/shistory/src/nsSHEntry.cpp:574
#9 0x0728cd47 in DocumentViewerImpl::Destroy (this=0x8c1f5f0) at /builds/reflow/mozilla/layout/base/nsDocumentViewer.cpp:1502
#10 0x0728b2f7 in DocumentViewerImpl::Show (this=0x8e7e298) at /builds/reflow/mozilla/layout/base/nsDocumentViewer.cpp:1828
#11 0x072a63f1 in nsPresContext::EnsureVisible (this=0x8f84740, aUnsuppressFocus=0) at /builds/reflow/mozilla/layout/base/nsPresContext.cpp:1319
#12 0x072b4e0a in PresShell::UnsuppressAndInvalidate (this=0x907a158) at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:4393
#13 0x072b5023 in PresShell::UnsuppressPainting (this=0x907a158) at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:4441
#14 0x072aa798 in PresShell::sPaintSuppressionCallback (aTimer=0x8f42878, aPresShell=0x907a158) at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:2574
#15 0x0021f9fd in nsTimerImpl::Fire (this=0x8f42878) at /builds/reflow/mozilla/xpcom/threads/nsTimerImpl.cpp:400
...<the usual>

Why is nsSHEntry::mChildShells holding weak pointers rather than refcounting? Especially given the comment in DocumentViewerImpl::Destroy:

// Do the same for our children. Note that we need to get the child
// docshells from the SHEntry now; the docshell will have cleared them.
So, in the reflow branch tree where I see that (which is more likely because of the branchpoint), when I make the obvious change to nsCOMArray, I actually end up fixing the immediate crash, but after going back, getting into a weird state, and reloading, I crash in nsGlobalWindow::ClearAllTimeouts.
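For anyone following along, the difference between a history entry that only remembers its children and one that keeps them alive can be sketched in standalone C++. This is not the Mozilla code: ChildShell, OwningEntry and NonOwningEntry are hypothetical stand-ins, std::shared_ptr plays the role of a refcounted array such as nsCOMArray, and std::weak_ptr models the non-owning array in a form that reports a dead child instead of handing back a dangling pointer.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a child docshell; not the real Mozilla type.
struct ChildShell {
    int id;
};

// Hypothetical history entry that owns its children (refcounted style).
struct OwningEntry {
    std::vector<std::shared_ptr<ChildShell>> children;
};

// Hypothetical history entry that only observes its children.
// A raw pointer here would dangle; weak_ptr lets us detect the dead child.
struct NonOwningEntry {
    std::vector<std::weak_ptr<ChildShell>> children;
};

// Counts remembered children that are still alive.
inline int LiveChildren(const OwningEntry& entry) {
    int live = 0;
    for (const auto& child : entry.children) {
        if (child) ++live;
    }
    return live;
}

inline int LiveChildren(const NonOwningEntry& entry) {
    int live = 0;
    for (const auto& child : entry.children) {
        if (!child.expired()) ++live;
    }
    return live;
}

// Simulates the docshell dropping its last reference to the child
// while the non-owning entry still remembers it.
inline int LiveAfterRelease_NonOwning() {
    auto child = std::make_shared<ChildShell>(ChildShell{1});
    NonOwningEntry entry;
    entry.children.push_back(child);
    child.reset();  // the only owner lets go; the child is destroyed
    return LiveChildren(entry);
}

// Same scenario, but the entry holds its own reference.
inline int LiveAfterRelease_Owning() {
    auto child = std::make_shared<ChildShell>(ChildShell{1});
    OwningEntry entry;
    entry.children.push_back(child);
    child.reset();  // the entry's reference keeps the child alive
    return LiveChildren(entry);
}
```

With owning references the entry can still hand out a valid child after the docshell has cleared its own list, which is what the comment in DocumentViewerImpl::Destroy seems to rely on. It does not by itself explain the later crash in nsGlobalWindow::ClearAllTimeouts.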
Doesn't look like this makes the current release, bumping to the next.
Assignee: nobody → bryner
Flags: blocking22.214.171.124? → blocking126.96.36.199+
FWIW, just hit this on a trunk build from within the past few days; hadn't seen it for a while, but I think I'd been pretty careful to avoid keeping tinderbox windows up. This time it was just while loading the bonsai query, not while going back to the tinderbox page. But perhaps I'd previously gone back to get to the tinderbox page from which I was getting the bonsai query.
I do have one idea here -- RestoreWindowState, which is called before we reattach the child docshells, has the potential to fire events. If those events removed the child document, then we'd end up with a dangling pointer.
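A rough standalone sketch of that ordering hazard, under stated assumptions: Child, Parent and RestoreAndReattach are hypothetical names, and the fireEvents callback stands in for whatever events RestoreWindowState might dispatch. The point is only that anything captured before the callback has to be re-validated after it.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-ins; not the real docshell types.
struct Child {
    int id;
};

struct Parent {
    std::vector<std::shared_ptr<Child>> attached;
};

// Restores state (which may run arbitrary event handlers) and then
// reattaches the children that were saved beforehand.
// Returns the number of children actually reattached.
inline int RestoreAndReattach(Parent& parent,
                              std::vector<std::shared_ptr<Child>>& saved,
                              const std::function<void()>& fireEvents) {
    // Snapshot the children without taking ownership, as a raw-pointer
    // list would. weak_ptr makes the "is it still there?" check possible.
    std::vector<std::weak_ptr<Child>> snapshot(saved.begin(), saved.end());

    // Event handlers run here and may remove children from 'saved'.
    fireEvents();

    int reattached = 0;
    for (const auto& weak : snapshot) {
        // Re-validate each child; skip any that an event handler destroyed.
        if (auto child = weak.lock()) {
            parent.attached.push_back(child);
            ++reattached;
        }
    }
    return reattached;
}

// Demo: an event handler removes the first saved child during restore.
inline int DemoEventRemovesChild() {
    Parent parent;
    std::vector<std::shared_ptr<Child>> saved{
        std::make_shared<Child>(Child{1}),
        std::make_shared<Child>(Child{2})};
    return RestoreAndReattach(parent, saved,
                              [&saved] { saved.erase(saved.begin()); });
}
```

If the snapshot held raw pointers instead, the loop after fireEvents() would touch a destroyed object, which is the dangling-pointer case described above.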
Isn't going to make the 188.8.131.52 train at this point, not fixed on trunk or 1.8 for regression testing. Not going to bother bumping to the next .0.x release.
Flags: blocking184.108.40.206+ → blocking220.127.116.11-
We really do want to make sure we get this in on 1.8.0.x once we have a fix... dangling pointers are bad.
Bryner, do you have any time to take a look at this for 1.8.1?
Flags: blocking1.8.1? → blocking1.8.1+
I can try, I've just never been able to reproduce it. The patch from bug 319551 may solve the immediate crash but there's still something bad happening.
Between the 2006-06-27-04-trunk and 2006-06-29-04-trunk build the behavior changed: in the latter build, the popup automatically goes away when the new page starts loading. I can reproduce the crash pretty easily in 2006-06-26-04-trunk (without the patch from bug 319551). In 2006-06-27-04-trunk (with the patch, but without the behavior change), I don't crash, but the browser goes into a state where it's just spinning in a loading state showing the old (forward) page when going back (easy to get out of, though, since hitting stop shows the page I went back to). In 2006-06-29-04-trunk (with the behavior change), I don't see either problem. (And note that comment 0 was incorrect in mentioning the "L" popups as Jesse pointed out in comment 7; it's only the name popups that cause the problem.)
Filed bug 343169 on the latter regression.
Changing to blocking1.8.1- since it shouldn't be crashing anymore on the branch (although I'll try to test that tonight).
Flags: blocking1.8.1+ → blocking1.8.1-
Not making 18.104.22.168, no patch, maybe WFM now.
Flags: blocking22.214.171.124? → blocking126.96.36.199-
Flags: blocking1.9a1? → blocking1.9-
Keywords: crash → regression
Reassigning my bugs, since I'm not actually working on them.
Assignee: bryner → nobody
per crash-stats none currently on trunk (brief spike at 3/24-25). but there are some crashes on branch 188.8.131.52 (not a topcrash)

TB44916586 for example
Stack Signature nsDocLoader::QueryInterface 90983893
Product ID Firefox2
Build ID 2008020121
Trigger Time 2008-05-08 17:23:47.0
Platform Win32
Operating System Windows NT 5.1 build 2600
Module FIREFOX.EXE + (0038a427)
URL visited
User Comments
Since Last Crash 20387 sec
Total Uptime 78957 sec
Trigger Reason Access violation
Source File, Line No. c:/builds/tinderbox/Fx-Mozilla1.8-Release/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 230
Stack Trace
nsDocLoader::QueryInterface [mozilla/uriloader/base/nsDocLoader.cpp, line 230]
nsDocShell::QueryInterface [mozilla/docshell/base/nsDocShell.cpp, line 404]
nsWebShell::QueryInterface [mozilla/docshell/base/nsWebShell.cpp, line 236]
nsQueryInterfaceWithError::operator() [mozilla/xpcom/build/nsCOMPtr.cpp, line 69]
nsGetInterface::operator() [mozilla/xpcom/build/nsIInterfaceRequestorUtils.cpp, line 52]
nsCOMPtr_base::assign_from_helper [mozilla/xpcom/build/nsCOMPtr.cpp, line 150]
nsGlobalWindow::GetTop [mozilla/dom/src/base/nsGlobalWindow.cpp, line 2080]
XPTC_InvokeByIndex [mozilla/xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp, line 102]
XPCWrappedNative::CallMethod [mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 2169]
XPC_WN_GetterSetter [mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp, line 1487]
js_Invoke [mozilla/js/src/jsinterp.c, line 1379]
js_InternalInvoke [mozilla/js/src/jsinterp.c, line 1473]
js_InternalGetOrSet [mozilla/js/src/jsinterp.c, line 1544]
js_NativeGet [mozilla/js/src/jsobj.c, line 3469]
js_Interpret [mozilla/js/src/jsinterp.c, line 4036]
js_Execute [mozilla/js/src/jsinterp.c, line 1638]
JS_EvaluateUCScriptForPrincipals [mozilla/js/src/jsapi.c, line 4298]
nsJSContext::EvaluateString [mozilla/dom/src/base/nsJSEnvironment.cpp, line 1100]
nsScriptLoader::EvaluateScript [mozilla/content/base/src/nsScriptLoader.cpp, line 813]
nsScriptLoader::ProcessRequest [mozilla/content/base/src/nsScriptLoader.cpp, line 711]
nsScriptLoader::DoProcessScriptElement [mozilla/content/base/src/nsScriptLoader.cpp, line 644]
nsScriptLoader::ProcessScriptElement [mozilla/content/base/src/nsScriptLoader.cpp, line 396]
nsHTMLScriptElement::MaybeProcessScript [mozilla/content/html/content/src/nsHTMLScriptElement.cpp, line 663]
nsHTMLScriptElement::BindToTree [mozilla/content/html/content/src/nsHTMLScriptElement.cpp, line 456]
nsGenericElement::AppendChildTo [mozilla/content/base/src/nsGenericElement.cpp, line 2876]
HTMLContentSink::ProcessSCRIPTTag [mozilla/content/html/document/src/nsHTMLContentSink.cpp, line 4177]
HTMLContentSink::AddLeaf [mozilla/content/html/document/src/nsHTMLContentSink.cpp, line 3043]
CNavDTD::AddLeaf [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 3579]
CNavDTD::HandleDefaultStartToken [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 1283]
CNavDTD::HandleStartToken [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 1668]
CNavDTD::HandleToken [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 955]
CNavDTD::BuildModel [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 458]
nsParser::BuildModel [mozilla/parser/htmlparser/src/nsParser.cpp, line 2169]

TB44575343
Stack Signature nsDocLoader::QueryInterface f556741e
Product ID Firefox2
Build ID 2008040413
Trigger Time 2008-04-30 14:42:41.0
Platform Win32
Operating System Windows NT 5.1 build 2600
Module FIREFOX.EXE + (0038b4fb)
URL visited
User Comments
Since Last Crash 0 sec
Total Uptime 68678 sec
Trigger Reason Access violation
Source File, Line No. c:/builds/tinderbox/Fx-Mozilla1.8-Release/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 236
Stack Trace
nsDocLoader::QueryInterface [mozilla/uriloader/base/nsDocLoader.cpp, line 236]
nsDocLoader::GetAsDocLoader [mozilla/uriloader/base/nsDocLoader.cpp, line 273]
nsDocLoader::AddDocLoaderAsChildOfRoot [mozilla/uriloader/base/nsDocLoader.cpp, line 285]
nsDocShell::Init [mozilla/docshell/base/nsDocShell.cpp, line 348]
nsWebShellConstructor [mozilla/docshell/build/nsDocShellModule.cpp, line 89]
CallCreateInstance [mozilla/xpcom/build/nsComponentManagerUtils.cpp, line 171]
nsWebShellWindow::Initialize [mozilla/xpfe/appshell/src/nsWebShellWindow.cpp, line 229]
nsAppShellService::JustCreateTopWindow [mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 361]
nsAppShellService::CreateHiddenWindow [mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 177]
nsAppStartup::CreateHiddenWindow [mozilla/toolkit/components/startup/src/nsAppStartup.cpp, line 141]
XRE_main [mozilla/toolkit/xre/nsAppRunner.cpp, line 2695]
main [mozilla/browser/app/nsBrowserApp.cpp, line 61]
kernel32.dll + 0x16fd7 (0x7c816fd7)
Component: History: Session → Document Navigation
QA Contact: history.session → docshell
dbaron, are you still hitting this? If not, should we close the bug?
crash keyword missing
Crash Signature: [@ nsDocLoader::QueryInterface]
I don't see this anymore. Resolving as Works For Me. There is another similar signature and Marcia will log that as a new bug.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WORKSFORME