Last Comment Bug 594699 - startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check
: startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char co...
Status: RESOLVED FIXED
[tb31needed][tb30needed]
: regression, topcrash
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: Trunk
: All All
: -- blocker with 1 vote (vote)
: mozilla2.0b9
Assigned To: Blake Kaplan (:mrbkap) (in and out until 7-14)
:
Mentors:
http://getsatisfaction.com/mozilla_me...
: 563664 (view as bug list)
Depends on:
Blocks: 532730
  Show dependency treegraph
 
Reported: 2010-09-09 01:30 PDT by Mark Banner (:standard8)
Modified: 2014-10-03 17:01 PDT (History)
44 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
betaN+
.10+
.10-fixed
.13+
.13-fixed


Attachments
Crashes/day as of 11am, 2010-09-10 (36.58 KB, image/png)
2010-09-10 11:19 PDT, christian
no flags Details
Crashes/hour as of 11am, 2010-09-10 (32.57 KB, image/png)
2010-09-10 11:24 PDT, christian
no flags Details
Proposed "fix" (1.08 KB, patch)
2010-09-14 10:47 PDT, Blake Kaplan (:mrbkap) (in and out until 7-14)
jst: review+
dveditz: approval1.9.2.10+
dveditz: approval1.9.1.13+
Details | Diff | Review
Install folder from user with this crash (146.96 KB, text/plain)
2010-09-14 14:35 PDT, [:Cww]
no flags Details

Description Mark Banner (:standard8) 2010-09-09 01:30:53 PDT
We're seeing lots of these crashes following the recent branch releases. Looking at the crash point, it is a regression from bug 532730.

http://crash-stats.mozilla.com/report/index/4e31d147-d00e-4570-81da-3c2842100908

0  	xul.dll  	nsWindowWatcher::OpenWindowJSInternal  	 embedding/components/windowwatcher/src/nsWindowWatcher.cpp:525
1 	xul.dll 	nsWindowWatcher::OpenWindow 	embedding/components/windowwatcher/src/nsWindowWatcher.cpp:425
2 	xul.dll 	NS_InvokeByIndex_P 	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
3 	xul.dll 	XPCWrappedNative::CallMethod 	js/src/xpconnect/src/xpcwrappednative.cpp:2722
4 	xul.dll 	XPC_WN_CallMethod 	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1740
5 	js3250.dll 	js_Invoke 	js/src/jsinterp.cpp:1360
6 	js3250.dll 	js_Interpret 	js/src/jsops.cpp:2240
7 	js3250.dll 	js_Invoke 	js/src/jsinterp.cpp:1368
8 	xul.dll 	nsXPCWrappedJSClass::CallMethod 	js/src/xpconnect/src/xpcwrappedjsclass.cpp:1696
9 	xul.dll 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:570
10 	xul.dll 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:114
11 	xul.dll 	SharedStub 	xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:141
12 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3392

Crashing section:

nsCOMPtr<nsIContentUtils> utils =
    do_GetService("@mozilla.org/content/contentutils;1");
if (!utils->IsSafeToRunScript()) {   // <-- crashes here.
    return NS_ERROR_FAILURE;
}
Comment 1 Robert Kaiser (not working on stability any more) 2010-09-09 04:04:02 PDT
SeaMonkey also sees such crashes, though mainly in profile manager, and there it's enough to kill compreg.dat to fix it, apparently - see bug 594571
Comment 2 Mark Banner (:standard8) 2010-09-09 04:22:11 PDT
(In reply to comment #1)
> SeaMonkey also sees such crashes, though mainly in profile manager, and there
> it's enough to kill compreg.dat to fix it, apparently - see bug 594571

I think that's unlikely - Thunderbird and Firefox are already removing compreg.dat in the same way that you have fixed it.
Comment 3 Wayne Mery (:wsmwk, NI for questions) 2010-09-09 04:30:10 PDT
#175 for v3.1.2, but #2 in v3.1.3
Comment 4 Mark Banner (:standard8) 2010-09-09 04:32:46 PDT
(In reply to comment #3)
> #175 for v3.1.2, but #2 in v3.1.3

The v3.1.2 one is bug 530447 - this bug is a crash at a slightly different location in the same bug.
Comment 5 Wayne Mery (:wsmwk, NI for questions) 2010-09-09 04:55:46 PDT
and from ~#69 in v3.0.6 to #3 in v3.0.7 (and disproportionately biased to Mac)

http://gsfn.us/t/1eu41 may be this crash (not yet confirmed)
Comment 6 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2010-09-09 05:47:56 PDT
So is the crash happening during startup or shutdown so that 'utils' is null
because the service is not yet, or isn't anymore available?

Looks like a null check could fix this.
Comment 7 Mark Banner (:standard8) 2010-09-09 06:06:40 PDT
(In reply to comment #6)
> So is the crash happening during startup or shutdown so that 'utils' is null
> because the service is not yet, or isn't anymore available?

From the reports I'm seeing, for the majority of people it is startup - though obviously not everyone sees this.

> Looks like a null check could fix this.

I would agree that is possible, though I don't see why that wouldn't be available.
Comment 8 Mark Banner (:standard8) 2010-09-09 07:57:19 PDT
I have try server builds for Thunderbird running with the following patch:

diff --git a/embedding/components/windowwatcher/src/nsWindowWatcher.cpp b/embedd
ing/components/windowwatcher/src/nsWindowWatcher.cpp
--- a/embedding/components/windowwatcher/src/nsWindowWatcher.cpp
+++ b/embedding/components/windowwatcher/src/nsWindowWatcher.cpp
@@ -522,7 +522,7 @@ nsWindowWatcher::OpenWindowJSInternal(ns
 
   nsCOMPtr<nsIContentUtils> utils =
     do_GetService("@mozilla.org/content/contentutils;1");
-  if (!utils->IsSafeToRunScript()) {
+  if (utils && !utils->IsSafeToRunScript()) {
     return NS_ERROR_FAILURE;
   }
 
I'm not convinced if that is the correct null check, or if it should be !utils || !utils->IsSafeToRunScript.
Comment 9 Olli Pettay [:smaug] (high review load, please consider other reviewers) 2010-09-09 09:36:35 PDT
I think !utils || !utils->IsSafeToRunScript() makes more sense.
It does prevent opening some dialog at strange time.
I wonder who is trying to open and what.
Comment 10 Bruno 'Aqualon' Escherl 2010-09-09 09:41:13 PDT
(In reply to comment #9)
> I think !utils || !utils->IsSafeToRunScript() makes more sense.
> It does prevent opening some dialog at strange time.
> I wonder who is trying to open and what.
For SeaMonkey it seems to occur if it's started with -ProfileManager, so the profile manager window is opened first.
Comment 11 christian 2010-09-09 09:54:03 PDT
Same crash is #6 on 3.5.12 as well.
Comment 12 christian 2010-09-09 10:10:27 PDT
Interesting that this doesn't show up in the top 300 crashes in 3.6.9pre or 3.5.12pre...
Comment 13 Robert Kaiser (not working on stability any more) 2010-09-09 10:55:53 PDT
If we can confirm a fix like comment #8 / comment #9 fast, I would very much like to include it in the "oilspill"/chemspill/firedrill release I need to do for SeaMonkey anyhow - but I'd need landing on the 1.9.1.12 relbranch.
Comment 14 christian 2010-09-09 11:43:44 PDT
I don't see anything obvious to point to a certain extension causing this...Also, looks to be multi-platform.
Comment 15 Mark Banner (:standard8) 2010-09-09 14:06:04 PDT
Just so folks are aware, I'm looking to get this into some TB builds asap.
Comment 16 christian 2010-09-09 14:53:46 PDT
Blake's looking into the underlying cause.
Comment 17 christian 2010-09-09 16:27:05 PDT
Crashes per hour for this signature on crash-stats going back 3 weeks: http://imagebin.ca/view/a2Ih8CFG.html
Comment 18 Mike Beltzner [:beltzner, not reading bugmail] 2010-09-10 06:21:10 PDT
That's a pretty big spike - not sure it's worth a chemspill until we know what the actual cause is. Any correlations between the crash stacks?
Comment 19 christian 2010-09-10 10:04:38 PDT
Not in the reports correlation that I can see and not in my manual viewing. All 3 platforms, some with plugins some not, all kinds of CPUs, etc.
Comment 20 christian 2010-09-10 11:19:02 PDT
Created attachment 474131 [details]
Crashes/day as of 11am, 2010-09-10
Comment 21 christian 2010-09-10 11:24:21 PDT
Created attachment 474132 [details]
Crashes/hour as of 11am, 2010-09-10
Comment 22 Mark Banner (:standard8) 2010-09-13 07:42:34 PDT
Over the weekend I've received a compreg.dat from a user, but it is looking like there are old components that haven't been removed from Thunderbird 2, or it was a stale compreg.dat that hadn't been updated.

I've therefore asked some more questions of the user and am waiting a response.
Comment 23 christian 2010-09-13 14:31:55 PDT
Support is also pinging users for more info fwiw.
Comment 24 christian 2010-09-13 23:10:27 PDT
I'd like to get the change in comment 8 and comment 9 landed asap on mozilla-1.9.2 (on the GECKO1929_20100824_RELBRANCH branch). The sooner this gets in, the sooner we can start a build and kick the update out the door.
Comment 25 Mark Banner (:standard8) 2010-09-13 23:47:48 PDT
(In reply to comment #24)
> I'd like to get the change in comment 8 and comment 9 landed asap on
> mozilla-1.9.2 (on the GECKO1929_20100824_RELBRANCH branch).

I'm not actually confident in that change and would want someone to verify it. If we make it

if (utils && !utils->IsSafeToRunScript())

then could that regress bug 532730 if the contentutils service isn't able to be loaded for some reason?

With

if (!utils || !utils->IsSafeToRunScript())

could we get ourselves into a situation where no windows would open and the user be equally broken?
Comment 26 christian 2010-09-14 09:27:35 PDT
I talked with blake about this yesterday. The first one will essentially revert users that are hitting the null case back to what was in the platform before, as the entire "if" was added for bug 532730. I would guess that sort of unfixes bug 532730 for the startup case, but I think we're ok otherwise. Blake will have to confirm.

The second one might make us run into a situation with no window is displayed. I agree it changes behavior to a 3rd state (3.6.8 behavior/crash on startup/possibly no window?) that needs to be investigated.

Blake, perhaps you could chime in and confirm the above and comment which is has the most possibility of fixing users without causing more problems. I am willing to essentially unfix bug 532730 if that's what we need to do.

I also want to point out here that we'll need this on mozilla-1.9.1 GECKO19112_20100824_RELBRANCH as well (comment 24 was written in haste and only included 1.9.2)

Thanks!
Comment 27 christian 2010-09-14 10:23:23 PDT
Just talked with Blake again and he confirmed the above. We will not be undoing bug 532730 in the general case if we go with  "if (utils && !utils->IsSafeToRunScript())" and we do not introduce new (possibly just as bad) behavior.

So we need the "if (utils && !utils->IsSafeToRunScript())" version, on both mozilla-1.9.2 (GECKO1929_20100824_RELBRANCH branch) and mozilla-1.9.1 (GECKO19112_20100824_RELBRANCH branch).
Comment 28 Blake Kaplan (:mrbkap) (in and out until 7-14) 2010-09-14 10:47:43 PDT
Created attachment 475126 [details] [diff] [review]
Proposed "fix"
Comment 30 Benjamin Smedberg [:bsmedberg] 2010-09-14 11:11:26 PDT
More technical details: we're seeing this crash on trunk betas as well as branch builds, which means that it is most-likely not caused by stale compreg. It's also not likely caused by recursive initialization of layout, but it is very likely caused by the layout module failing to initialize. We don't know why this is, yet.The null-check will prevent the crash, but it won't solve the underlying problem which *probably* indicates that the browser won't actually work. Hard to tell what's actually going on, though.
Comment 31 christian 2010-09-14 13:39:08 PDT
Just to note, most of the comments indicated people were running the browser previously, got the update prompt, updated, and then were unable to start after that point.
Comment 32 [:Cww] 2010-09-14 14:35:35 PDT
Created attachment 475241 [details]
Install folder from user with this crash

Got a compreg.dat and install folder from a user with this crash -- may help with repro.

Install folder too big to upload, ask me if you need a copy.
Comment 33 Daniel Veditz [:dveditz] 2010-09-14 20:54:41 PDT
Comment on attachment 475126 [details] [diff] [review]
Proposed "fix"

correcting after renaming the rest of the approval requests to the next release
Comment 34 christian 2010-09-15 11:20:09 PDT
Though I have been saying this in other places, I am going to mention this here as the bug is getting linked to...though this was a spike, the overall "crashiness" of Firefox 3.6.x has not really increased at all. At about 1k crashes a day, this crash is a drop in the bucket vs active daily users. But, because it is a crash on startup that could prevent people from using Firefox entirely, we feel it was best to get a fix out quickly.
Comment 35 Daniel Kirsch 2010-09-15 15:10:25 PDT
Interesting:
A virus was detected by clamav in the latest Firefox version. https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2269
Could this be related?
Comment 36 Robert Strong [:rstrong] (use needinfo to contact me) 2010-09-15 16:32:40 PDT
(In reply to comment #35)
> Interesting:
> A virus was detected by clamav in the latest Firefox version.
> https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2269
> Could this be related?
Very doubtful. That is in regards to a false positive for one or more of the files used by the installer and not Firefox itself.
Comment 37 alex5723 2010-09-17 09:40:57 PDT
Hi have a crash at startup after updating from 3.6.9 (which worked fine) and from 3.6.8 , to 3.6.10. "the procedure entry point sqlite3_column_double could not be located in the dynamic link library sqlite3.dll.

Windows XP SP3.  Could this crash bug be related to the latest Microsoft's Windows updates trying to fix the DLL Injection (loading DLLs from current folder) security bug ?
Comment 38 Mike Beltzner [:beltzner, not reading bugmail] 2010-09-30 13:16:00 PDT
Two questions:

 - should this block trunk?
 - do we know how that variable got made null?
Comment 39 chris hofmann 2010-09-30 14:18:47 PDT
here is the distribution across releases on sept 28


checking --- nsWindowWatcher::OpenWindowJSInternal.nsIDOMWindow...char.const...char.const...char.const...int..nsIArray...int..nsIDOMWindow.. 20100928-crashdata.cs
v
found in: 3.6.10 3.6.9 4.0b6 4.0b5 3.6.3 3.5.12 4.0b3 3.6b4 3.6.8 3.6.6 3.6 4.0b4 3.0.19 4.0b7pre 4.0b1 3.6b5 3.6.2 3.5.10 3.0.7
release total-crashes
              nsWindowWatcher::OpenWindowJSInternal.nsIDOMWindow...char.const...char.const...char.const...int..nsIArray...int..nsIDOMWindow.. crashes
                         pct.
all     373476  140     0.000374857
3.6.10  213615  55      0.000257473
3.6.9   5104    33      0.00646552
4.0b6   25035   6       0.000239664
4.0b5   3750    6       0.0016
3.6.3   12459   6       0.00048158
3.5.12  850     6       0.00705882
4.0b3   1288    5       0.00388199
3.6b4   775     4       0.00516129
3.6.8   24127   3       0.000124342
3.6.6   7131    3       0.000420698
3.6     7896    3       0.000379939
4.0b4   2376    2       0.000841751
3.0.19  10540   2       0.000189753
4.0b7pre2655    1       0.000376648
4.0b1   1091    1       0.00091659
3.6b5   609     1       0.00164204
3.6.2   2364    1       0.000423012
3.5.10  457     1       0.00218818
3.0.7   357     1       0.00280112

And daily volume on 4.0b7pre look like about 0-4 crashes per day for about the last two weeks.

I'd say no, not a blocker for 4.0b7.
Comment 40 chris hofmann 2010-10-06 17:26:46 PDT
It looks like 100% of users on 4.0b6 and 97% of users on 3.6.10 have extension compatibility checking turned off for this signature when looking at data for oct 5.

1.000	nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)	10,,10	 bugs=530447,530845,594699

0.968	nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)	62,2,60	 bugs=530447,530845,594699
Comment 41 Johnny Stenback (:jst, jst@mozilla.com) 2010-10-13 16:40:31 PDT
Blocking.
Comment 42 Mounir Lamouri (:mounir) 2010-12-22 17:12:27 PST
Pushed:
https://hg.mozilla.org/mozilla-central/rev/307f6c854845
Comment 43 Nickolay_Ponomarev 2014-10-03 17:01:15 PDT
*** Bug 563664 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.