intermittent crashes going back to tinderbox page after using popup iframe [@ nsDocLoader::QueryInterface]

RESOLVED WORKSFORME

Status

()

Core
Document Navigation
--
critical
RESOLVED WORKSFORME
12 years ago
6 years ago

People

(Reporter: dbaron, Unassigned)

Tracking

({crash, regression})

Trunk
mozilla1.8.1beta2
x86
Linux
crash, regression
Points:
---
Bug Flags:
blocking1.9 -
wanted1.9 +
blocking1.8.1 -
blocking1.8.0.4 -
blocking1.8.0.5 -
blocking1.8.0.7 -

Firefox Tracking Flags

(Not tracked)

Details

(crash signature, URL)

Attachments

(2 attachments)

(Reporter)

Description

12 years ago
I've been seeing intermittent crashes for months (since around when fastback landed, or stabilized) while using tinderbox.  In particular, the crash usually happens after I:
 1. load tinderbox
 2. click on one of the things (name / "L") that causes a popup in an iframe
 3. click on one of the links in that popup
 4. go back
although sometimes (as in the case I'm debugging now), the steps seem to be:
 1-2. same as above
 3. let page meta-refresh, probably after closing popup
The steps may require having the same tinderbox page loaded in multiple windows (independently meta-refreshing).

I have not yet found steps to reliably reproduce this, but I've been seeing it for months, and it's clearly not getting fixed, so I'm thinking some code inspection may be in order.

(More data coming, in the form of attachments, which will take me some time to produce.)
(Reporter)

Comment 1

12 years ago
Created attachment 217482 [details]
debugging information on crash from 2006-04-06
(Reporter)

Comment 2

12 years ago
Created attachment 217484 [details]
debugging information on crash from 2006-04-03
(Reporter)

Comment 3

12 years ago
(In reply to comment #0)
> although sometimes (as in the case I'm debugging now), the steps seem to be:
>  1-2. same as above
>  3. let page meta-refresh, probably after closing popup

Ignore this bit -- I was misinterpreting the log.  And my memory is certainly that I usually crash when going back from a bonsai page (reached through a who.cgi popup) to a tinderbox page.
That has the earmarks of us calling QI on a deleted docshell, which would be Very Bad (like "probably exploitable" bad).  :(

bryner, are you going to have a chance to look at this in time for 1.8.0.3?
Blocks: 274784
Severity: normal → critical
Flags: blocking1.9a1?
Flags: blocking1.8.1?
Flags: blocking1.8.0.3?
(Reporter)

Comment 5

12 years ago
FWIW, this is quite hard to reproduce.  I probably do these actions a few times a day on average, and I crash this way about once every week or two.
(Reporter)

Comment 6

12 years ago
(But when I say I crash once every week or two, that's probably over 50% of the total crashes I experience when not testing other people's crash bugs.)

Updated

12 years ago
Flags: blocking1.8.0.3? → blocking1.8.0.3+

Comment 7

12 years ago
>  2. click on one of the things (name / "L") that causes a popup in an iframe

For me, the name popups are iframes, but the L popups are just absolutely positioned divs.
Keywords: crash

Comment 8

12 years ago
See also bug 319551, which (sometimes?) has the same stack signature, and is easier to reproduce (at least for me).

Updated

12 years ago
Depends on: 319551
(Reporter)

Comment 9

12 years ago
I think the trick to reproducing this crash is to click on the tinderbox page to close the popup as the bonsai page is loading.
(Reporter)

Comment 10

12 years ago
I just crashed in a slightly different way (from DocumentViewerImpl::Destroy) trying to reproduce this -- I think because the tinderbox page didn't go in the fastback cache for some reason:

#5  <signal handler called>
#6  0x00010001 in ?? ()
#7  0x03ac64c7 in ns_if_addref<nsIDocShellTreeItem*> (expr=0x90815c4)
    at ../../../dist/include/xpcom/nsISupportsUtils.h:114
#8  0x03af6176 in nsSHEntry::ChildShellAt (this=0x8e2d310, aIndex=0,
    aShell=0xbffc6724)
    at /builds/reflow/mozilla/docshell/shistory/src/nsSHEntry.cpp:574
#9  0x0728cd47 in DocumentViewerImpl::Destroy (this=0x8c1f5f0)
    at /builds/reflow/mozilla/layout/base/nsDocumentViewer.cpp:1502
#10 0x0728b2f7 in DocumentViewerImpl::Show (this=0x8e7e298)
    at /builds/reflow/mozilla/layout/base/nsDocumentViewer.cpp:1828
#11 0x072a63f1 in nsPresContext::EnsureVisible (this=0x8f84740,
    aUnsuppressFocus=0)
    at /builds/reflow/mozilla/layout/base/nsPresContext.cpp:1319
#12 0x072b4e0a in PresShell::UnsuppressAndInvalidate (this=0x907a158)
    at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:4393
#13 0x072b5023 in PresShell::UnsuppressPainting (this=0x907a158)
    at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:4441
#14 0x072aa798 in PresShell::sPaintSuppressionCallback (aTimer=0x8f42878,
    aPresShell=0x907a158)
    at /builds/reflow/mozilla/layout/base/nsPresShell.cpp:2574
#15 0x0021f9fd in nsTimerImpl::Fire (this=0x8f42878)
    at /builds/reflow/mozilla/xpcom/threads/nsTimerImpl.cpp:400
...<the usual>

Why is nsSHEntry::mChildShells holding weak pointers rather than refcounting?  Especially given the comment in DocumentViewerImpl::Destroy:
    // Do the same for our children.  Note that we need to get the child
    // docshells from the SHEntry now; the docshell will have cleared them.
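
To illustrate the question above, here is a minimal sketch of the difference between an entry that owns its child shells and one that only observes them. It is not the actual nsSHEntry code: std::shared_ptr and std::weak_ptr stand in for XPCOM refcounting, and ChildShell, OwningEntry, ObservingEntry and ChildSurvivesOwnerRelease are hypothetical names. A raw weak pointer (as in the crash above) would dangle in the observing case; std::weak_ptr is used here only so the stale reference can be detected safely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a child docshell.
struct ChildShell {
    int id;
};

// Hypothetical history entry that holds strong (refcounted) references.
class OwningEntry {
public:
    void AddChildShell(std::shared_ptr<ChildShell> shell) {
        assert(shell != nullptr);
        mChildShells.push_back(std::move(shell));
    }
    // Returns a strong reference, or nullptr if the index is out of range.
    std::shared_ptr<ChildShell> ChildShellAt(std::size_t index) const {
        if (index >= mChildShells.size()) {
            return nullptr;
        }
        return mChildShells[index];
    }
private:
    std::vector<std::shared_ptr<ChildShell>> mChildShells;
};

// Hypothetical history entry that holds non-owning references.
class ObservingEntry {
public:
    void AddChildShell(const std::shared_ptr<ChildShell>& shell) {
        assert(shell != nullptr);
        mChildShells.push_back(shell);
    }
    // Returns nullptr if the index is out of range or the child is gone.
    std::shared_ptr<ChildShell> ChildShellAt(std::size_t index) const {
        if (index >= mChildShells.size()) {
            return nullptr;
        }
        return mChildShells[index].lock();
    }
private:
    std::vector<std::weak_ptr<ChildShell>> mChildShells;
};

// Simulates the docshell clearing its own reference to the child, then
// reports whether the entry can still hand out a live child.
template <typename Entry>
bool ChildSurvivesOwnerRelease() {
    Entry entry;
    auto shell = std::make_shared<ChildShell>(ChildShell{1});
    entry.AddChildShell(shell);
    shell.reset();  // the "docshell" drops its reference
    return entry.ChildShellAt(0) != nullptr;
}
```

With these stand-ins, ChildSurvivesOwnerRelease<OwningEntry>() reports that the child is still available after the owner lets go, while ChildSurvivesOwnerRelease<ObservingEntry>() reports that it is gone.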
(Reporter)

Comment 11

12 years ago
So, in the reflow branch tree where I see that (which is more likely because of the branchpoint), when I make the obvious change to nsCOMArray, I actually end up fixing the immediate crash, but after going back, getting into a weird state, and reloading, I crash in nsGlobalWindow::ClearAllTimeouts.
Doesn't look like this makes the current release, bumping to the next.
Flags: blocking1.8.0.5?
Flags: blocking1.8.0.4-
Flags: blocking1.8.0.4+
Assignee: nobody → bryner
Flags: blocking1.8.0.5? → blocking1.8.0.5+
(Reporter)

Comment 13

12 years ago
FWIW, just hit this on a trunk build from within the past few days; hadn't seen it for a while, but I think I'd been pretty careful to avoid keeping tinderbox windows up.  This time it was just while loading the bonsai query, not while going back to the tinderbox page.  But perhaps I'd previously gone back to get to the tinderbox page from which I was getting the bonsai query.
I do have one idea here -- RestoreWindowState, which is called before we reattach the child docshells, has the potential to fire events.  If those events removed the child document, then we'd end up with a dangling pointer.
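
A minimal sketch of that hypothesis, assuming a restore step that records the children first, fires events, and only then reattaches. This is not the docshell code: ChildDoc and RestoreAndReattach are hypothetical names, and std::weak_ptr is used in place of a raw pointer so that a child removed by an event handler is skipped rather than dereferenced.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a child document/docshell.
struct ChildDoc {
    bool attached = false;
};

// Hypothetical restore routine. The children are recorded before the
// events fire, mirroring the ordering described above. Returns the number
// of children that were still alive and got reattached.
std::size_t RestoreAndReattach(
        std::vector<std::shared_ptr<ChildDoc>>& children,
        const std::function<void()>& fireEvents) {
    assert(fireEvents != nullptr);

    // Snapshot taken before any event handler runs (non-owning).
    std::vector<std::weak_ptr<ChildDoc>> snapshot(children.begin(),
                                                  children.end());

    // Event handlers may remove entries from `children`.
    fireEvents();

    std::size_t reattached = 0;
    for (const auto& weak : snapshot) {
        // lock() yields nullptr if the child was destroyed meanwhile;
        // a raw pointer here would be dangling instead.
        if (auto child = weak.lock()) {
            child->attached = true;
            ++reattached;
        }
    }
    return reattached;
}
```

If the handler passed as fireEvents removes a child from the vector, that child is destroyed before the reattach loop runs, and only the survivors are reattached.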
No longer depends on: 319551
This isn't going to make the 1.8.0.5 train at this point; it's not fixed on trunk or 1.8 for regression testing. Not going to bother bumping to the next .0.x release.
Flags: blocking1.8.0.5+ → blocking1.8.0.5-
We really do want to make sure we get this in on 1.8.0.x once we have a fix... dangling pointers are bad.
Flags: blocking1.8.0.6?
(Reporter)

Updated

12 years ago
Depends on: 319551
No longer depends on: 319551

Comment 17

12 years ago
Bryner, do you have any time to take a look at this for 1.8.1?
Flags: blocking1.8.1? → blocking1.8.1+
I can try; I've just never been able to reproduce it.  The patch from bug 319551 may solve the immediate crash, but there's still something bad happening.

Updated

12 years ago
Target Milestone: --- → mozilla1.8.1beta2
(Reporter)

Comment 19

12 years ago
Between the 2006-06-27-04-trunk and 2006-06-29-04-trunk builds, the behavior changed:  in the latter build, the popup automatically goes away when the new page starts loading.

I can reproduce the crash pretty easily in 2006-06-26-04-trunk (without the patch from bug 319551).  In 2006-06-27-04-trunk (with the patch, but without the behavior change), I don't crash, but the browser goes into a state where it's just spinning in a loading state showing the old (forward) page when going back (easy to get out of, though, since hitting stop shows the page I went back to).  In 2006-06-29-04-trunk (with the behavior change), I don't see either problem.

(And note that comment 0 was incorrect in mentioning the "L" popups as Jesse pointed out in comment 7; it's only the name popups that cause the problem.)
(Reporter)

Comment 20

12 years ago
Filed bug 343169 on the latter regression.
(Reporter)

Comment 21

12 years ago
Changing to blocking1.8.1- since it shouldn't be crashing anymore on the branch (although I'll try to test that tonight).
Flags: blocking1.8.1+ → blocking1.8.1-
Not making 1.8.0.7, no patch, maybe WFM now.
Flags: blocking1.8.0.7? → blocking1.8.0.7-
Flags: blocking1.9a1? → blocking1.9-
Keywords: crash → regression
Whiteboard: [wanted-1.9]
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
Reassigning my bugs, since I'm not actually working on them.
Assignee: bryner → nobody

Comment 24

10 years ago
Per crash-stats, there are currently none on trunk (brief spike on 3/24-25),
but there are some crashes on branch 2.0.0.14 (not a topcrash).

TB44916586 for example
Stack Signature	 nsDocLoader::QueryInterface 90983893
Product ID	Firefox2
Build ID	2008020121
Trigger Time	2008-05-08 17:23:47.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	FIREFOX.EXE + (0038a427)
URL visited	
User Comments	
Since Last Crash	20387 sec
Total Uptime	78957 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8-Release/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 230
Stack Trace 	
nsDocLoader::QueryInterface  [mozilla/uriloader/base/nsDocLoader.cpp, line 230]
nsDocShell::QueryInterface  [mozilla/docshell/base/nsDocShell.cpp, line 404]
nsWebShell::QueryInterface  [mozilla/docshell/base/nsWebShell.cpp, line 236]
nsQueryInterfaceWithError::operator()  [mozilla/xpcom/build/nsCOMPtr.cpp, line 69]
nsGetInterface::operator()  [mozilla/xpcom/build/nsIInterfaceRequestorUtils.cpp, line 52]
nsCOMPtr_base::assign_from_helper  [mozilla/xpcom/build/nsCOMPtr.cpp, line 150]
nsGlobalWindow::GetTop  [mozilla/dom/src/base/nsGlobalWindow.cpp, line 2080]
XPTC_InvokeByIndex  [mozilla/xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp, line 102]
XPCWrappedNative::CallMethod  [mozilla/js/src/xpconnect/src/xpcwrappednative.cpp, line 2169]
XPC_WN_GetterSetter  [mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp, line 1487]
js_Invoke  [mozilla/js/src/jsinterp.c, line 1379]
js_InternalInvoke  [mozilla/js/src/jsinterp.c, line 1473]
js_InternalGetOrSet  [mozilla/js/src/jsinterp.c, line 1544]
js_NativeGet  [mozilla/js/src/jsobj.c, line 3469]
js_Interpret  [mozilla/js/src/jsinterp.c, line 4036]
js_Execute  [mozilla/js/src/jsinterp.c, line 1638]
JS_EvaluateUCScriptForPrincipals  [mozilla/js/src/jsapi.c, line 4298]
nsJSContext::EvaluateString  [mozilla/dom/src/base/nsJSEnvironment.cpp, line 1100]
nsScriptLoader::EvaluateScript  [mozilla/content/base/src/nsScriptLoader.cpp, line 813]
nsScriptLoader::ProcessRequest  [mozilla/content/base/src/nsScriptLoader.cpp, line 711]
nsScriptLoader::DoProcessScriptElement  [mozilla/content/base/src/nsScriptLoader.cpp, line 644]
nsScriptLoader::ProcessScriptElement  [mozilla/content/base/src/nsScriptLoader.cpp, line 396]
nsHTMLScriptElement::MaybeProcessScript  [mozilla/content/html/content/src/nsHTMLScriptElement.cpp, line 663]
nsHTMLScriptElement::BindToTree  [mozilla/content/html/content/src/nsHTMLScriptElement.cpp, line 456]
nsGenericElement::AppendChildTo  [mozilla/content/base/src/nsGenericElement.cpp, line 2876]
HTMLContentSink::ProcessSCRIPTTag  [mozilla/content/html/document/src/nsHTMLContentSink.cpp, line 4177]
HTMLContentSink::AddLeaf  [mozilla/content/html/document/src/nsHTMLContentSink.cpp, line 3043]
CNavDTD::AddLeaf  [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 3579]
CNavDTD::HandleDefaultStartToken  [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 1283]
CNavDTD::HandleStartToken  [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 1668]
CNavDTD::HandleToken  [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 955]
CNavDTD::BuildModel  [mozilla/parser/htmlparser/src/CNavDTD.cpp, line 458]
nsParser::BuildModel  [mozilla/parser/htmlparser/src/nsParser.cpp, line 2169]


TB44575343
Stack Signature	 nsDocLoader::QueryInterface f556741e
Product ID	Firefox2
Build ID	2008040413
Trigger Time	2008-04-30 14:42:41.0
Platform	Win32
Operating System	Windows NT 5.1 build 2600
Module	FIREFOX.EXE + (0038b4fb)
URL visited	
User Comments	
Since Last Crash	0 sec
Total Uptime	68678 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8-Release/WINNT_5.2_Depend/mozilla/uriloader/base/nsDocLoader.cpp, line 236
Stack Trace 	
nsDocLoader::QueryInterface  [mozilla/uriloader/base/nsDocLoader.cpp, line 236]
nsDocLoader::GetAsDocLoader  [mozilla/uriloader/base/nsDocLoader.cpp, line 273]
nsDocLoader::AddDocLoaderAsChildOfRoot  [mozilla/uriloader/base/nsDocLoader.cpp, line 285]
nsDocShell::Init  [mozilla/docshell/base/nsDocShell.cpp, line 348]
nsWebShellConstructor  [mozilla/docshell/build/nsDocShellModule.cpp, line 89]
CallCreateInstance  [mozilla/xpcom/build/nsComponentManagerUtils.cpp, line 171]
nsWebShellWindow::Initialize  [mozilla/xpfe/appshell/src/nsWebShellWindow.cpp, line 229]
nsAppShellService::JustCreateTopWindow  [mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 361]
nsAppShellService::CreateHiddenWindow  [mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 177]
nsAppStartup::CreateHiddenWindow  [mozilla/toolkit/components/startup/src/nsAppStartup.cpp, line 141]
XRE_main  [mozilla/toolkit/xre/nsAppRunner.cpp, line 2695]
main  [mozilla/browser/app/nsBrowserApp.cpp, line 61]
kernel32.dll + 0x16fd7 (0x7c816fd7)

Updated

10 years ago
Component: History: Session → Document Navigation
QA Contact: history.session → docshell

Comment 25

8 years ago
dbaron, are you still hitting this?  If not, should we close the bug?

Comment 26

8 years ago
crash keyword missing
Keywords: crash
(Assignee)

Updated

7 years ago
Crash Signature: [@ nsDocLoader::QueryInterface]

Comment 27

6 years ago
I don't see this anymore. Resolving as Works For Me. There is another similar signature and Marcia will log that as a new bug.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WORKSFORME