Closed Bug 344305 Opened 16 years ago Closed 14 years ago

[FIX]Crash when opening and closing bookmark properties a lot [@ nsDocShell::SetupNewViewer]

Categories

(Core :: DOM: Navigation, defect, P1)

x86
Windows XP
defect

Tracking

()

VERIFIED FIXED
mozilla1.9alpha8

People

(Reporter: martijn.martijn, Assigned: bzbarsky)

References

Details

(Keywords: crash, fixed1.8.1, topcrash, Whiteboard: [fixed1.8.1 refers to microsummary service fix])

Crash Data

Attachments

(4 files, 1 obsolete file)

I'm seeing this with the latest FF2 branch build.

To reproduce:
- Right click on a bookmark (in the bookmarks toolbar)
- Click on properties in the context menu
- Click on Ok in the properties window
- Repeat the above steps a few times

Result:
Sometimes crash

I usually crash when repeating this within 5 times or so.

Gavin suggested that this could be something related to microsummaries.

Talkback ID: TB20842322Y
nsDocShell::SetupNewViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 5979]
nsDocShell::Embed  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 4587]
nsDocShell::CreateContentViewer  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDocShell.cpp, line 5718]
nsDSURIContentListener::DoContent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/docshell/base/nsDSURIContentListener.cpp, line 131]
nsDocumentOpenInfo::TryContentListener  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsURILoader.cpp, line 776]
nsDocumentOpenInfo::DispatchContent  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsURILoader.cpp, line 500]
nsDocumentOpenInfo::OnStartRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/uriloader/base/nsURILoader.cpp, line 345]
nsInputStreamChannel::OnStartRequest  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/netwerk/base/src/nsInputStreamChannel.cpp, line 359]
nsOutputStreamReadyEvent::EventHandler  [c:/builds/tinderbox/Fx-Mozilla1.8/WINNT_5.2_Depend/mozilla/xpcom/io/nsStreamUtils.cpp, line 217]
0x778b0c24
0x00200064
0x0c718d08
Component: Bookmarks → Embedding: Docshell
Product: Firefox → Core
QA Contact: bookmarks → docshell
Summary: Crash when opening and closing bookmark properties a lot → Crash when opening and closing bookmark properties a lot [@ nsDocShell::SetupNewViewer]
Version: Trunk → 1.8 Branch
I just had a case where it crashed the first time I followed the steps to reproduce.
Component: Embedding: Docshell → Build Config
Product: Core → Firefox
Version: 1.8 Branch → Trunk
Product: Firefox → Core
QA Contact: docshell → build-config
Component: Build Config → Embedding: Docshell
Flags: blocking1.8.1?
QA Contact: build-config → docshell
Flags: blocking1.8.1? → blocking1.8.1+
I just saw this with the 2006080903 Windows 1.8branch build - TB22028066:

nsDocShell::SetupNewViewer  [mozilla/docshell/base/nsDocShell.cpp, line 5983]
nsDocShell::Embed  [mozilla/docshell/base/nsDocShell.cpp, line 4587]
nsDocShell::CreateContentViewer  [mozilla/docshell/base/nsDocShell.cpp, line 5722]
nsDSURIContentListener::DoContent  [mozilla/docshell/base/nsDSURIContentListener.cpp, line 131]
nsDocumentOpenInfo::TryContentListener  [mozilla/uriloader/base/nsURILoader.cpp, line 776]
nsDocumentOpenInfo::DispatchContent  [mozilla/uriloader/base/nsURILoader.cpp, line 500]
nsDocumentOpenInfo::OnStartRequest  [mozilla/uriloader/base/nsURILoader.cpp, line 345]
nsInputStreamChannel::OnStartRequest  [mozilla/netwerk/base/src/nsInputStreamChannel.cpp, line 359]
nsInputStreamReadyEvent::EventHandler  [mozilla/xpcom/io/nsStreamUtils.cpp, line 217]
0x778b0c24
nsComboboxControlFrame::ReflowCombobox [mozilla/layout/forms/nsComboboxControlFrame.cpp, line 938]
0x0f74c085

Removing the two microsummary generators I had in my profile did not help, I still got TB22028241.
This is one of the top crashes in Firefox builds on the 1.8 branch: most of the user comments mention various actions related to bookmarks.
Keywords: topcrash
Can anyone reproduce this in a debug build?  If so, are we crashing because mContentViewer is null?  Or for some other reason?

I can't seem to reproduce this in a 1.8 branch Firefox debug build on Linux, with a clean profile.
-> myk
Assignee: nobody → myk
Target Milestone: --- → mozilla1.8.1
I haven't been able to reproduce this in debug builds on Linux and Windows.  Is there a particular URL which is known to crash?  What kind of internet connection are you using when it happens?
I can still reproduce this with a 2006-08-14 branch build, talkback ID: TB22145624X
I haven't been able to reproduce this with a branch debug build, though.
(In reply to comment #4)
> Can anyone reproduce this in a debug build?  If so, are we crashing because
> mContentViewer is null?  Or for some other reason?

I can't reproduce in a debug build on Linux, Windows, or Mac, but I was able to reproduce a couple times in an optimized Windows build with a printf statement inserted right before the problematic line.  Both times mContentViewer was null.

I should also note that the mere presence of the printf statement makes this exponentially harder to reproduce, so it seems to be a very subtle timing issue.
(In reply to comment #8)

> I should also note that the mere presence of the printf statement makes this
> exponentially harder to reproduce, so it seems to be a very subtle timing
> issue.

But minimizing the console makes it much easier to reproduce.  I've reproduced it several more times since figuring that out, and mContentViewer was always null right before the crash.

I should also note that I hit the printf twice every time I open the bookmark properties dialog, and it's always the second time I hit it that mContentViewer is null.
myk, could you add a member to nsDocShell that you set to true in nsDocShell::SetupNewViewer when we enter and reset to false before returning?  Then put a printf in nsDocShell::Destroy if that boolean is true.  Does that happen?

I suppose the safe fix would be to stick the viewer in a stack local, but I'd like to understand why we're losing it to start with...
(In reply to comment #10)
> myk, could you add a member to nsDocShell that you set to true in
> nsDocShell::SetupNewViewer when we enter and reset to false before returning? 
> Then put a printf in nsDocShell::Destroy if that boolean is true.  Does that
> happen?

Yup, the printf in nsDocShell::Destroy gets triggered for the first time right before the printf in nsDocShell::SetupNewViewer identifies mContentViewer as null and Firefox crashes.
Could this be triggered by a rapid create/load/destroy cycle?

When a user opens the Bookmark Properties dialog, the microsummary service creates a hidden iframe, loads a document into it via a string input stream, checks the document for <link> tags, and then destroys the iframe.

This all happens very quickly, taking only a fraction of a second.  Might the microsummary service be destroying the docshell faster than the docshell assumes any caller could destroy it?
No idea.  What's the stack to Destroy()?  That would help a lot here...
On IRC, bz said:

--
so here's what I think happens
1) The about:blank load starts.
2) Your [the microsummary service's] load starts.
3) Your load's data starts coming in.
4) Before it's done, the about:blank data starts coming in.
This is where things get interesting. We call into SetupNewViewer for the about:blank load while your load is still in progress.
5) SetupNewViewer calls Stop() on the old viewer (for your data)
6) Stop() cancels the parse, which triggers DOMContentLoaded
7) You tear down the docshell
8) We unwind to SetupNewViewer.
The ordering of #3 and #4 is key.
So I think we should fix docshell to be more robust here (not crash)
but in your case that would change from a crash to only getting partial data.
I think you really do want to stop the about:blank load before kicking off your own.
--

But a well-placed dump statement shows that things don't crash when the about:blank load finishes before the microsummary service-initiated load.  When the crash happens, the DOMContentLoaded event handler hasn't yet fired for the about:blank load.
Err, my bad. That's the assertion:

*** == Bookmark Properties (Aggregate) ==============
*** == Bookmark Properties (Aggregate Ends) =========
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' pr
operty!

This is pretty much always bad. It usually means that native code is
making a callback to an interface implemented in JavaScript, but the
document where the JS object was created has already been cleared and the
global properties of that document's window are *gone*. Generally this
indicates a problem that should be addressed in the design and use of the
callback code.
: 'Error', file c:/firefox/mozilla/js/src/xpconnect/src/xpcwrappednativescope.cp
p, line 587
Hey, this is a branch bug. Let's change the Version to 1.8 so I don't keep testing on trunk ;)
Version: Trunk → 1.8 Branch
This is a problem on trunk too, so the version was correct.

myk and I talked about what his code is actually, doing, and here's what's going on when myk's code creates an iframe and then starts the load in it via the URILoader (this part is key, since it doesn't stop the existing loads in the frame!):

1) The about:blank load starts.
2) myk's load starts.
3) myk's load's data starts coming in.  
4) Before it's done, the about:blank data starts coming in.
5) This is where things get interesting.  We call into SetupNewViewer for the
   about:blank load while myk's load is still in progress.
6)  SetupNewViewer calls Stop() on the old viewer (for myk's data)
7) Stop() cancels the parse, which triggers DOMContentLoaded
8) myk's DOMContentLoaded listener tears down the docshell
9) We unwind to SetupNewViewer and die.

The ordering of #3 and #4 is key here.  If those happen in the opposite order, no crash, since myk's DOMContentLoader listener ignores about:blank.

So the simple solution is to stop() the existing load before kicking off the new one via URILoader.  But in general, we could hit this any time Stop() triggers DOMContentLoaded.  We could try to fire DOMContentLoaded async (would that break anything?) or we could try to make docshell survive Stop() calls tearing it down (which could happen in several places in interesting ways, including under SetupNewViewer, when about to kick off a new load, etc)... Thoughts on which approach we should take?
Flags: blocking1.9?
Version: 1.8 Branch → Trunk
Here's a fix in the microsummary service that stops the about:blank load immediately after inserting the iframe into the window.  With this fix, the microsummary service no longer triggers the bug.  Although this doesn't actually fix the bug, it seems like the right thing to do anyway, since we don't need about:blank to load, and canceling the load seems simpler than merely dropping it on the floor afterwards.
Attachment #234857 - Flags: review?(bzbarsky)
Comment on attachment 234857 [details] [diff] [review]
microsummary service fix - stops about:blank load

>Index: browser/components/microsummaries/src/nsMicrosummaryService.js.in
>+    var webNav = this._iframe.docShell.QueryInterface(Ci.nsIWebNavigation);
>+    webNav.stop(webNav.STOP_NETWORK);

I'd prefer Ci.nsIWebNavigation.STOP_NETWORK here, just to make it clear that that comes off the interface proper.  r=bzbarsky with that.

Thanks for helping dig into this!
Attachment #234857 - Flags: review?(bzbarsky) → review+
This is the patch I checked into the trunk.  It uses Ci.nsIWebNavigation.STOP_NETWORK instead of webNav.STOP_NETWORK per comment 20.

Leaving the bug open for the doc shell fix.


Notes for drivers considering this approval request:

This patch changes the way the microsummary service deals with hidden iframes which automatically load about:blank when they get inserted into XUL windows.

Before, the microsummary service would leave the loads alone and just ignore them when they reported "DOMContentLoaded".  Now the service cancels them.

The change works around a bug in the doc shell that causes it to sometimes crash when both about:blank and some other document is loaded using it and then the other document is destroyed.  But even without the doc shell bug, this is a simpler and more dependable way of handling about:blank loads.

The patch has been on the trunk since Monday, August 21.  It presents a low risk of regression.  It is identical to the patch bz reviewed but with his nit from comment 20 fixed.
Attachment #234886 - Flags: approval1.8.1?
Whiteboard: [181approval pending]
Comment on attachment 234886 [details] [diff] [review]
microsummary service fix with nit fixed

a=schrep for drivers - approving all [181approval pending] bugs now that tree is open.
Attachment #234886 - Flags: approval1.8.1? → approval1.8.1+
Keywords: fixed1.8.1
Whiteboard: [181approval pending] → [181approval pending] [fixed1.8.1 refers to microsummary service fix]
Whiteboard: [181approval pending] [fixed1.8.1 refers to microsummary service fix] → [fixed1.8.1 refers to microsummary service fix]
*** Bug 352066 has been marked as a duplicate of this bug. ***
Assignee: myk → bzbarsky
Flags: blocking1.9? → blocking1.9+
Attached patch Fire DOMContentLoaded async (obsolete) — Splinter Review
Attachment #279138 - Flags: superreview?(jst)
Attachment #279138 - Flags: review?(jonas)
Flags: in-testsuite?
Priority: -- → P1
Summary: Crash when opening and closing bookmark properties a lot [@ nsDocShell::SetupNewViewer] → [FIX]Crash when opening and closing bookmark properties a lot [@ nsDocShell::SetupNewViewer]
Target Milestone: mozilla1.8.1 → mozilla1.9 M8
Attachment #279138 - Attachment is obsolete: true
Attachment #279221 - Flags: superreview?(jst)
Attachment #279221 - Flags: review?(jonas)
Attachment #279138 - Flags: superreview?(jst)
Attachment #279221 - Flags: superreview?(jst) → superreview+
Comment on attachment 279221 [details] [diff] [review]
Need a XULDocument change too

er... sicking already reviewed this, and I think I trust jst.  Let's just get this in.  ;)
Attachment #279221 - Flags: review?(jonas) → approval1.9?
Checked in the docshell patch.  Still need to create a test somehow...
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
verified fixed using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9pre) Gecko/2008050614 Minefield/3.0pre - i can not reproduce a crash with the steps to reproduce anymore
Status: RESOLVED → VERIFIED
Crash Signature: [@ nsDocShell::SetupNewViewer]
You need to log in before you can comment on or make changes to this bug.