Last Comment Bug 497875 - Crash [@ nsXULDocument::ResumeWalk][@ nsXULPrototypeDocument::GetURI() | nsXULDocument::ResumeWalk] with loadoverlay and script
: Crash [@ nsXULDocument::ResumeWalk][@ nsXULPrototypeDocument::GetURI() | nsXU...
[sg:dos] null deref
: crash, regression, testcase, verified1.9.0.16, verified1.9.1, verified1.9.2
Product: Core
Classification: Components
Component: XUL (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla1.9.3a1
Assigned To: Mats Palmgren (:mats)
: Neil Deakin
Depends on: 647351 448676 630748
Blocks: 519767
  Show dependency treegraph
Reported: 2009-06-12 07:02 PDT by Martijn Wargers [:mwargers] (not working for Mozilla)
Modified: 2011-06-13 10:01 PDT (History)
14 users (show)
mats: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

testcase (310 bytes, application/vnd.mozilla.xul+xml)
2009-06-12 07:02 PDT, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
gdb full stack (8.18 KB, text/plain)
2009-09-20 10:11 PDT, Henrik Skupin (:whimboo)
no flags Details
stacks (24.25 KB, text/html)
2009-11-05 07:56 PST, Mats Palmgren (:mats)
no flags Details
wallpaper? (1.06 KB, patch)
2009-11-05 07:58 PST, Mats Palmgren (:mats)
neil: review+
roc: approval1.9.2+
dveditz: approval1.9.1.6+
dveditz: approval1.9.0.16+
Details | Diff | Splinter Review
mochichrometest.diff (3.10 KB, patch)
2009-11-06 18:01 PST, Mats Palmgren (:mats)
no flags Details | Diff | Splinter Review

Description Martijn Wargers [:mwargers] (not working for Mozilla) 2009-06-12 07:02:06 PDT
Created attachment 382956 [details]

See testcase, which crashes current trunk build and Firefox3.0.x after the ftp site has bee loaded from the script tag.
I'm not sure if this crashes online, or just locally.
When testing locally, you need to have the security.fileuri.strict_origin_policy set to false to get the crash.
0  	xul.dll  	nsXULDocument::ResumeWalk  	 content/xul/document/src/nsXULDocument.cpp:2875
1 	xul.dll 	nsXULDocument::OnStreamComplete 	content/xul/document/src/nsXULDocument.cpp:3472
2 	xul.dll 	nsStreamLoader::OnStopRequest 	netwerk/base/src/nsStreamLoader.cpp:108
3 	xul.dll 	nsBaseChannel::OnStopRequest 	netwerk/base/src/nsBaseChannel.cpp:680
4 	xul.dll 	nsInputStreamPump::OnStateStop 	netwerk/base/src/nsInputStreamPump.cpp:576
5 	xul.dll 	nsInputStreamPump::OnInputStreamReady 	netwerk/base/src/nsInputStreamPump.cpp:401
6 	xul.dll 	nsOutputStreamReadyEvent::Run 	xpcom/io/nsStreamUtils.cpp:190
7 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:510
8 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:170
9 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/src/nsAppStartup.cpp:193
10 	nspr4.dll 	PR_GetEnv 	
11 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:110
12 	firefox.exe 	firefox.exe@0x21a7 	
13 	kernel32.dll 	kernel32.dll@0x17076 

Firefox3.0.x crash report:
0  	xul.dll  	nsXULDocument::ResumeWalk  	 mozilla/content/xul/document/src/nsXULDocument.cpp:2822
1 	xul.dll 	nsCOMPtr_base::assign_from_qi 	nsCOMPtr.cpp:98
2 	xul.dll 	nsXULDocument::OnStreamComplete 	mozilla/content/xul/document/src/nsXULDocument.cpp:3371

Perhaps this is related to bug 475025?
Comment 1 Martijn Wargers [:mwargers] (not working for Mozilla) 2009-06-12 07:03:38 PDT
Ok, the testcase doesn't crash online, so perhaps this isn't much of a security issue.
Comment 2 Daniel Veditz [:dveditz] 2009-06-15 01:39:39 PDT
This is crashing in ResumeWalk because mCurrentPrototype is null.
Comment 3 Marco Zehe (:MarcoZ) 2009-09-02 22:02:13 PDT
Another crash I just got this morning with the same signature, but a slightly different stack:
I opened Minefield from Thunderbird to moderate a blog entry, something I'm doing on a daily basis, but have never seen crash before.
Comment 4 Henrik Skupin (:whimboo) 2009-09-19 10:52:41 PDT
It looks like that this crash is the top crash #46 for Firefox 3.5.3. Even that only the first frame is identical but mCurrentPrototype seems to be always 0:

0  	xul.dll  	nsXULDocument::ResumeWalk  	 content/xul/document/src/nsXULDocument.cpp:2889
1 	xul.dll 	nsCOMPtr_base::~nsCOMPtr_base 	obj-firefox/xpcom/build/nsCOMPtr.cpp:81
2 	xul.dll 	xul.dll@0x334a7b

A housemate informed me that it also happens for him when doing a search in the Google toolbar.

Do we wanna handle the topcrasher in this bug or shall we better file a new one to address it?
Comment 5 Henrik Skupin (:whimboo) 2009-09-19 10:59:22 PDT
The crashing line has been added by the patch on bug 419452. Probably a regression? I'll check. For now CC'ing Neil to this bug.
Comment 6 Henrik Skupin (:whimboo) 2009-09-19 15:44:19 PDT
It's a regression which happens between the following builds:

=> Check-ins:

=> Check-ins:

The m-c merge has a couple of the fixes from trunk. Olli, do you have an idea?
Comment 7 Henrik Skupin (:whimboo) 2009-09-20 09:57:45 PDT
Same crash on OS X with another signature: bp-064322e3-f19c-40bc-89d6-637552090920

0  	XUL  	nsXULPrototypeDocument::GetURI  	 content/xul/document/src/nsXULPrototypeDocument.cpp:481
1 	XUL 	nsXULDocument::ResumeWalk 	content/xul/document/src/nsXULDocument.cpp:2886
2 	XUL 	nsXULDocument::OnStreamComplete 	content/xul/document/src/nsXULDocument.cpp:3609
3 	XUL 	nsStreamLoader::OnStopRequest 	netwerk/base/src/nsStreamLoader.cpp:127
Comment 8 Henrik Skupin (:whimboo) 2009-09-20 10:11:42 PDT
Created attachment 401718 [details]
gdb full stack
Comment 9 Olli Pettay [:smaug] 2009-09-20 13:28:20 PDT
(In reply to comment #6)
> It's a regression which happens between the following builds:
> 2008-03-20-05-trunk/
> 2008-03-21-06-trunk/
> => Check-ins:
> 2008-03-26-02-mozilla-central/
> 2008-03-27-02-mozilla-central/
> => Check-ins:

The urls are the same. On purpose?
Comment 10 Henrik Skupin (:whimboo) 2009-09-20 13:42:19 PDT
No. The correct second url would be:
Comment 11 Mats Palmgren (:mats) 2009-11-05 07:56:06 PST
Created attachment 410513 [details]

I can reproduce the crash on Linux when reloading the testcase.

The problem appears to be that there is a nsFtpState that
creates an async nsInputStreamReadyEvent which runs *after*
we attempt to load the non-existing overlay.  When we get the
OnStreamComplete callback mCurrentPrototype is NULL.
Failing to load an overlay sets mCurrentPrototype = null here:
Comment 12 Mats Palmgren (:mats) 2009-11-05 07:58:05 PST
Created attachment 410514 [details] [diff] [review]
Comment 13 Mats Palmgren (:mats) 2009-11-05 09:22:07 PST
(possibly related: bug 136346, bug 186667)
Comment 14 Nickolay_Ponomarev 2009-11-05 10:22:25 PST
Do you mean the reason for the crash is that nsXULDocument::LoadOverlay tries (and fails, thus setting mCurrentPrototype to NULL) to load an overlay while there's a pending callback to ResumeWalk for the main document load?

If that's the case, this is somewhat related to bug 330458.

And your wallpaper is basically undoing the unconditional mCurrentPrototype access that was added in bug 419452:
Comment 15 Robert O'Callahan (:roc) (email my personal email if necessary) 2009-11-05 13:20:37 PST
This probably should block on suspicion of being needed for bug 519767, which is a blocker.

Neil, maybe you could help Mats here?
Comment 16 Mats Palmgren (:mats) 2009-11-05 14:54:11 PST
(In reply to comment #14)
> If that's the case, this is somewhat related to bug 330458.

Yes, bug 330458 looks somewhat related.  It would be great to
have that fixed, but as roc noted, bug 519767 blocks 1.9.2 and
we need to fix the crash for 1.9.2 with something less risky
than what you have in bug 330458 (IMO).

> And your wallpaper is basically undoing the unconditional
> mCurrentPrototype access that was added in bug 419452

Sorry, but I don't see how it can break anything that bug 419452 fixed,
if that's what you meant.  If mCurrentPrototype is non-null the
attached patch doesn't change anything.  If it's null, the current
code crashes, and from what I can see there's no harm in making
|overlayURI| null there and continuing with the rest of the function.
The aim of my patch is just to avoid the crash, not that it must be
"correct" for all cases where |mCurrentPrototype| is null.

Can you suggest a better fix? (that is still low-risk)
Comment 17 2009-11-05 16:37:14 PST
Comment on attachment 410514 [details] [diff] [review]

Certainly in any test case using document.loadOverlay we will never actually use the URI, so it doesn't matter if we don't have one.
Comment 18 Nickolay_Ponomarev 2009-11-06 05:05:00 PST
> Sorry, but I don't see how it can break anything that bug 419452 fixed,
> if that's what you meant
No, I meant the problem you're wallpapering over (assuming that mCurrentPrototype is always non-null in ResumeWalk) was introduced in that bug. Since this actually improves the confidence in the fix (at least for me), I wanted to note that, sorry about the confusion.
Comment 19 Mats Palmgren (:mats) 2009-11-06 14:00:35 PST

Sorry for the lack of regression test, but it's not possible to write
crash tests that needs privileges currently (bug 448676), and the crash
doesn't occur when loading the overlay over http (mochitest).
Comment 20 Boris Zbarsky [:bz] (still a bit busy) 2009-11-06 16:13:26 PST
Can you write a mochichrome test?
Comment 21 Mats Palmgren (:mats) 2009-11-06 18:01:39 PST
Created attachment 410920 [details] [diff] [review]

Thanks for the suggestion.
This crashes about 80% of the time without the patch.
Comment 22 Mats Palmgren (:mats) 2009-11-06 20:58:42 PST
mochichrome test:
Comment 23 Mats Palmgren (:mats) 2009-11-07 08:08:58 PST
Comment on attachment 410514 [details] [diff] [review]

Low-risk patch, like to get this in for beta2.
Comment 24 Mats Palmgren (:mats) 2009-11-08 20:08:30 PST
Comment 25 Daniel Veditz [:dveditz] 2009-11-09 14:20:22 PST
Comment on attachment 410514 [details] [diff] [review]

Approved for and, a=dveditz for release-drivers
Comment 26 Mats Palmgren (:mats) 2009-11-09 21:38:14 PST

On CVS trunk:
mozilla/content/xul/document/src/nsXULDocument.cpp 	1.839
mozilla/content/xul/document/test/ 	1.8
mozilla/content/xul/document/test/bug497875-iframe.xul 	1.1
mozilla/content/xul/document/test/test_bug497875.xul 	1.1
Comment 27 Henrik Skupin (:whimboo) 2009-11-17 02:48:10 PST
Verified fixed on trunk, 1.9.2, and 1.9.1 with builds on Windows:

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.3a1pre) Gecko/20091116 Minefield/3.7a1pre ID:20091116050530

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2b4pre) Gecko/20091116 Namoroka/3.6b4pre ID:20091116045513

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv: Gecko/20091116 Shiretoko/3.5.6pre
Comment 28 Al Billings [:abillings] 2009-11-23 12:19:20 PST
Verified fixed for using mochichrome test. Manual testcase does not crash 1.9.0 on Windows when tested.

Note You need to log in before you can comment on or make changes to this bug.