Closed Bug 418378 Opened 16 years ago Closed 16 years ago

crash [@ nsGlobalWindow::SaveWindowState(nsISupports**)]

Categories

(Core :: DOM: Navigation, defect, P2)

Tracking

()

RESOLVED WORKSFORME
mozilla1.9

People

(Reporter: samuel.sidler+old, Unassigned)

References

()

Details

(Keywords: crash, topcrash, Whiteboard: [needs-testcase][needs minidump])

Crash Data

Firefox 3 beta 3 has a new topcrash. I don't see this appearing in nightly
builds, but I also don't see any bugs fixed recently that could be related to
this crash.

See also: bp-f3d0f1cc-ded8-11dc-92f9-001a4bd43e5c

Crashing Thread
Frame 	Signature 	Source
0 	nsGlobalWindow::SaveWindowState(nsISupports**) 	mozilla/dom/src/base/nsGlobalWindow.cpp:8128
1 	nsDocShell::CaptureState() 	mozilla/docshell/base/nsDocShell.cpp:5287
2 	nsDocShell::SetupNewViewer(nsIContentViewer*) 	mozilla/docshell/base/nsDocShell.cpp:6130
3 	nsDocShell::Embed(nsIContentViewer*, char const*, nsISupports*) 	mozilla/docshell/base/nsDocShell.cpp:4805


See also: bp-6479f98a-ded1-11dc-93b5-001a4bd46e84

Crashing Thread
Frame 	Signature 	Source
0 	nsGlobalWindow::SaveWindowState(nsISupports**) 	mozilla/dom/src/base/nsGlobalWindow.cpp:602
1 	nsDocShell::CaptureState() 	mozilla/docshell/base/nsDocShell.cpp:5287
2 	nsDocShell::SetupNewViewer(nsIContentViewer*) 	mozilla/docshell/base/nsDocShell.cpp:6130
3 	nsDocShell::Embed(nsIContentViewer*, char const*, nsISupports*) 	mozilla/docshell/base/nsDocShell.cpp:4805
4 	nsDocShell::CreateContentViewer(char const*, nsIRequest*, nsIStreamListener**) 	mozilla/docshell/base/nsDocShell.cpp:5997
5 	nsDSURIContentListener::DoContent(char const*, int, nsIRequest*, nsIStreamListener**, int*) 	mozilla/docshell/base/nsDSURIContentListener.cpp:138
6 	nsDocumentOpenInfo::TryContentListener(nsIURIContentListener*, nsIChannel*) 	mozilla/uriloader/base/nsURILoader.cpp:735
7 	nsDocumentOpenInfo::DispatchContent(nsIRequest*, nsISupports*) 	mozilla/uriloader/base/nsURILoader.cpp:434
8 	nsDocumentOpenInfo::OnStartRequest(nsIRequest*, nsISupports*) 	mozilla/uriloader/base/nsURILoader.cpp:280
9 	nsHttpChannel::CallOnStartRequest() 	mozilla/netwerk/protocol/http/src/nsHttpChannel.cpp:753
10 	nsHttpChannel::ProcessNormal() 	mozilla/netwerk/protocol/http/src/nsHttpChannel.cpp:935
11 	nsInputStreamPump::OnStateStart() 	mozilla/netwerk/base/src/nsInputStreamPump.cpp:439
12 	nsInputStreamPump::OnInputStreamReady(nsIAsyncInputStream*) 	mozilla/netwerk/base/src/nsInputStreamPump.cpp:395
13 	nsInputStreamReadyEvent::Run() 	mozilla/gfx/cairo/cairo/src/cairo-hull.c:107
14 	nsThread::ProcessNextEvent(int, int*) 	mozilla/xpcom/threads/nsThread.cpp:510
15 	NS_ProcessPendingEvents_P(nsIThread*, unsigned int) 	nsThreadUtils.cpp:180
16 	nsBaseAppShell::NativeEventCallback() 	mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:112
17 	nsAppShell::ProcessGeckoEvents(void*) 	mozilla/widget/src/cocoa/nsAppShell.mm:294
18 	CoreFoundation@0x7262d 	
19 	CoreFoundation@0x72d17 	
20 	HIToolbox@0x3069f 	
21 	HIToolbox@0x304b8 	
22 	HIToolbox@0x3032c 	
23 	AppKit@0x407d8 	
24 	AppKit@0x4008d 	
25 	AppKit@0x390c4 	
26 	nsAppShell::Run() 	mozilla/widget/src/cocoa/nsAppShell.mm:565
27 	nsAppStartup::Run() 	mozilla/toolkit/components/startup/src/nsAppStartup.cpp:181
28 	XRE_main 	mozilla/toolkit/xre/nsAppRunner.cpp:3135
29 	main 	mozilla/browser/app/nsBrowserApp.cpp:158
30 	start 	crt.c:272
31 	start 	
32 	@0x1


Filing in Embedding:Docshell for now, but feel free to move it to DOM if that's more appropriate.
Flags: blocking1.9?
-'ing this for now since we don't see this in nightlies, but re-nom if we see this issue again.
Flags: blocking1.9? → blocking1.9-
I'm re-requesting. This needs to get looked at – at the very least investigated – before the next beta despite the fact that we don't have reports of it on the trunk.

Often, we see topcrashes appear only in release builds because actual users are using the browser differently than nightly testers. This is not at all uncommon. Since there hasn't visibly been anything checked in that could've fixed this (at least, based on bugzilla searches), we need to investigate this crash to determine if it's been fixed magically or if it's just something that nightly testers don't run into.
Flags: blocking1.9- → blocking1.9?
Priority: -- → P1
Target Milestone: --- → mozilla1.9beta4
without a test case we can't block on a crash that may no longer exist. We'll see if it returns in b4.
Assignee: nobody → samuel.sidler
Flags: blocking1.9? → blocking1.9-
Robert,

I found a topcrash that we didn't see in builds before beta 3 or after it, but that we did see in the final beta 3 build. The crash may very well no longer exist, but there's been absolutely no proof of that. I filed this topcrash bug so that a developer could analyze the stack and determine, through code, if there's something that could cause this. Oftentimes, topcrashes without testcases do get fixed (for example, bug 418377).

Have you analyzed this stack and the relevant code and determined that either a) there's no way to tell where it's crashing in code or b) the issue that was seen in beta 3 has been fixed? If not, I'm not sure how you can say this bug is not blocking 1.9 final without more data (i.e., crash data from beta 4).

This bug should be blocking until a) we determine *for sure* that it no longer happens, b) we've made a conscious decision to ship our product with a known topcrasher without any work done to determine, programmatically, its cause, or c) we've investigated the code and can't find a reason for this crash to happen but have done everything we can to try and reproduce it to no avail.

It's unclear to me which of those, if any, was chosen from your comment.

Please don't assign topcrash bugs to me as I am not a programmer and have no way to fix them. If a bug needs a testcase, please use [needs-testcase] in the whiteboard to indicate that.
Assignee: samuel.sidler → sayrer
Flags: blocking1.9- → blocking1.9?
Whiteboard: [needs-testcase]
do not assign bugs to me.
Assignee: sayrer → samuel.sidler
QA should find a test case, or determine that one can't be found. We'll move on from there.
Didn't I just ask that bugs not be assigned to me when you want a testcase?

I manually went through crash reports with this stack filed yesterday and today and looked for comments. The following crash reports had comments:

  bp-13a9d5c2-e000-11dc-82fa-001a4bd43ed6
  bp-043a6b91-dff1-11dc-b5de-001a4bd46e84
  bp-b6947711-dfed-11dc-9317-001a4bd43ed6
  bp-131419e6-dffa-11dc-9ade-001a4bd43e5c

One of those had a URL in the comment field. Going to that URL in beta 3 did not crash it.

QA cannot find a testcase for this bug using the tools provided. Robert, where should we move from here? Re-assigning the bug to you so you can determine what's next.
Assignee: samuel.sidler → sayrer
Assignee: sayrer → nobody
Flags: blocking1.9? → blocking1.9-
not reproducible, so doesn't block and doesn't receive a priority (priorities are set by drivers). If we do find a problem, or the crash appears again, we can adjust.
Flags: wanted1.9+
Priority: P1 → --
(In reply to comment #8)
> (priorities are set by drivers)

QA has been given permission by drivers to set priority. If that's changed, drivers need to tell QA so that we no longer set priority.

The last person to touch the DOM code was Boris for the first stack and Waldo for the second. The last person to touch mozilla/docshell/base/nsDocShell.cpp:5287 was bryner.
OK, thanks for the research. Hopefully this is already gone.
Robert, Damon, Gecko drivers -- QA has assessed that this is a Priority 1 topcrash bug. It was found recurring in beta 3 builds, but not on trunk. Sam has provided adequate information here for you to analyze and trace through. Can you please assign this to the correct person and see what can be done here in time for beta 4? Thanks.
Flags: blocking1.9- → blocking1.9?
Priority: -- → P1
Latest crash stack:

http://crash-stats.mozilla.com/report/index/4a969d33-e13a-11dc-ab50-001a4bd43e5c

seems to implicate bug 414743

Given this is #6 on the b3 topcrash list, it's probably worth a look. Tony/Sam, can you comb through comments and URLs to see if we can get any STR?

Bz/Bkap, any thoughts?
Blocks: 414743
Schrep, I don't see how that stack indicates bug 414743 at all (other than being in the same file).

This is odd -- it seems like the mInnerWindow member has somehow been corrupted. At first, I thought it had just been collected, but the call in question is not a virtual call.
(In reply to comment #13)
> Schrep, I don't see how that stack indicates bug 414743 at all (other than
> being in the same file).
> 
> This is odd -- it seems like the mInnerWindow member has somehow been
> corrupted. At first, I thought it had just been collected, but the call in
> question is not a virtual call.
> 

Yeah, I'm not really sure either. It's just that that particular line was in the crash stack, which doesn't make much sense. We must be corrupting memory somewhere first.

Not much luck on the comments from the crashes. Combing through the fat list, I only found one comment, and it doesn't help much.

http://crash-stats.mozilla.com/report/index/12f51f66-e165-11dc-a70f-001a4bd43ed6

I don't think Sam had any better luck. But we'll keep looking, as it's continuing to happen.
I just went through about a thousand crash reports at this stack trace. The following are the only comments I found:


"just clicked a link"

"There was about a twenty minute period of inactivity, so then I pressed the space bar to wake the machine, I started down the page that I had been on and found the hyperlink I needed, but, before I could click Firefox took a dump. . . I have to say, that having been exposed to I.E. for years and years, plus every other browser ever known this upgrade to Firefox in the offing is without peer. I'm blown away by what I have read and seen."

"I have recently installed the"

"I was trying to get email activation @www.yousendit.com"

"I went to my google toolbar, did a search for Florence Scovel Shinn I've tried to look at the first link twice now and it has crashed each time. In order to look at it I tried to open a new tab and each time it didn't open, but when I'd click on the link it would load, but redirect me and crash."

"login in to collect email at myway.com"

"Application crashed when I was trying to download Pidgin"

"donno, was trying to change my search term in google; it wouldn't take any keyboard input and crashed..."

"standard for firefox,crashed 3 time in 15 min maybe its good to go back to explorer"

"無預警當機" (which translates to "Crashed without warning")

"Firefox just closed when I selected florida on the map."


--

As you can see, nothing in the comments that users have left is particularly useful. I tried a few of the various steps that were listed but didn't have any success. This appears to be fairly random.
Report counts with this trace (looking back 2 months):

Firefox 3.0b3pre:
  2008013000 - 5
  2008013100 - 1

Firefox 3.0b3:
  2008020500 - 2996

Firefox 3.0b4pre:
  2008020400 - 1
  2008020800 - 1
  2008020900 - 1
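
To put those counts in proportion, here is a minimal sketch (Python, not part of the original report) that totals the reports per version and computes each version's share. The dictionary layout and variable names are illustrative assumptions; the numbers are the ones listed above.

```python
# Report counts per build ID, copied from the list above.
# The nesting (version -> build ID -> count) is an illustrative choice.
reports = {
    "Firefox 3.0b3pre": {"2008013000": 5, "2008013100": 1},
    "Firefox 3.0b3": {"2008020500": 2996},
    "Firefox 3.0b4pre": {"2008020400": 1, "2008020800": 1, "2008020900": 1},
}

# Total reports per version, then across all versions.
totals = {version: sum(builds.values()) for version, builds in reports.items()}
grand_total = sum(totals.values())

# Fraction of all reports attributable to each version.
shares = {version: count / grand_total for version, count in totals.items()}

for version, count in totals.items():
    print(f"{version}: {count} reports ({shares[version]:.1%} of {grand_total})")
```

By this tally the beta 3 release build accounts for 2996 of the 3005 reports, with only a handful from the pre-release builds on either side of it.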

> QA should find a test case, or determine that one can't be found. We'll move on
> from there.

Over the years we have fixed a ton of crash bugs where we were never really able to get a reproducible test case, or even understand exactly what was going on to get into the condition that resulted in the crash.

The two big ways of getting at an understanding of these kinds of bugs were inspection of the code around the crash and close study of the blackbox details, where dbaron and others have done a lot of work. Check out https://bugzilla.mozilla.org/show_bug.cgi?id=411431#c2 for more details.

We need help making sure that we have a good system in place for doing more detailed analysis and debugging on blackbox data when the crash system just isn't giving us everything we need in the stack trace and user comments. This sounds like a good bug to test out how we might solve this problem in the new Breakpad/Socorro world.

anyone want to give it a go?

see https://bugzilla.mozilla.org/show_bug.cgi?id=411431#c5 for more details

Priority: P1 → P2
Target Milestone: mozilla1.9beta4 → mozilla1.9
Flags: tracking1.9? → blocking1.9?
Could it be that it's the outer window that's dead here somehow?
It's possible... Unlikely because the call to SaveWindowState is virtual and the outer window we're dereferencing here is the docshell's mScriptGlobal, which is a strong reference.
Renom if it persists in beta4
Flags: wanted1.9+
Flags: blocking1.9?
Whiteboard: [needs-testcase] → [needs-testcase][needs minidump]
Nothing in for b4 so far. It looks like the last we saw of this crash was around:

2008020804  	Windows NT 5.1.2600 Service Pack 2
2008020904  	Mac OS X 10.5.1 9B18

No changes in docshell that I could see around Feb 8-10, and just these changes in the DOM:

2008-02-08 13:09	bent.mozilla%gmail.com 	mozilla/dom/src/base/nsGlobalWindow.cpp 	1.988 	12/10  	Bug 353851 - "accumulation of outer chrome windows in mOpener chains (window.opener)". r+sr=jst, a=blocking1.9. Fixed small typo that caused Txul to blow up yesterday.
2008-02-08 12:23	dolske%mozilla.com 	mozilla/dom/src/base/nsGlobalWindow.cpp 	1.987 	2/6  	Reland 406686, tests went green apparently before picking up the backout.
2008-02-08 11:07	dolske%mozilla.com 	mozilla/dom/src/base/nsGlobalWindow.cpp 	1.986 	6/2  	Backout bug 406686 to determine cause of mochitest failures.
2008-02-08 05:53	enndeakin%sympatico.ca 	mozilla/dom/src/base/nsGlobalWindow.cpp 	1.985 	2/6  	Bug 406686, close popups on blur, this time with nullcheck, r=roc,sr=dveditz
2008-02-08 13:09	bent.mozilla%gmail.com 	mozilla/dom/src/base/nsGlobalWindow.h 	1.328 	1/1  	Bug 353851 - "accumulation of outer chrome windows in mOpener chains (window.opener)". r+sr=jst, a=blocking1.9. Fixed small typo that caused Txul to blow up yesterday. 

I guess it's possible that one of these fixed the problem, or that some other research could turn up what caused this to go away, but it's probably not worth the time.

works for me if everyone agrees. 
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
Crash Signature: [@ nsGlobalWindow::SaveWindowState(nsISupports**)]