Closed Bug 594699 Opened 14 years ago Closed 14 years ago

startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla2.0b9
Tracking Status
blocking2.0 --- betaN+
blocking1.9.2 --- .10+
status1.9.2 --- .10-fixed
blocking1.9.1 --- .13+
status1.9.1 --- .13-fixed

People

(Reporter: standard8, Assigned: mrbkap)

References

()

Details

(Keywords: regression, topcrash, Whiteboard: [tb31needed][tb30needed])

Crash Data

Attachments

(4 files)

We're seeing lots of these crashes following the recent branch releases. Looking at the crash point, it is a regression from bug 532730.

http://crash-stats.mozilla.com/report/index/4e31d147-d00e-4570-81da-3c2842100908

0  	xul.dll  	nsWindowWatcher::OpenWindowJSInternal  	 embedding/components/windowwatcher/src/nsWindowWatcher.cpp:525
1 	xul.dll 	nsWindowWatcher::OpenWindow 	embedding/components/windowwatcher/src/nsWindowWatcher.cpp:425
2 	xul.dll 	NS_InvokeByIndex_P 	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
3 	xul.dll 	XPCWrappedNative::CallMethod 	js/src/xpconnect/src/xpcwrappednative.cpp:2722
4 	xul.dll 	XPC_WN_CallMethod 	js/src/xpconnect/src/xpcwrappednativejsops.cpp:1740
5 	js3250.dll 	js_Invoke 	js/src/jsinterp.cpp:1360
6 	js3250.dll 	js_Interpret 	js/src/jsops.cpp:2240
7 	js3250.dll 	js_Invoke 	js/src/jsinterp.cpp:1368
8 	xul.dll 	nsXPCWrappedJSClass::CallMethod 	js/src/xpconnect/src/xpcwrappedjsclass.cpp:1696
9 	xul.dll 	nsXPCWrappedJS::CallMethod 	js/src/xpconnect/src/xpcwrappedjs.cpp:570
10 	xul.dll 	PrepareAndDispatch 	xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:114
11 	xul.dll 	SharedStub 	xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:141
12 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3392

Crashing section:

nsCOMPtr<nsIContentUtils> utils =
    do_GetService("@mozilla.org/content/contentutils;1");
if (!utils->IsSafeToRunScript()) {   // <-- crashes here.
    return NS_ERROR_FAILURE;
}
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
Whiteboard: [tb31needs][tb30needs]
SeaMonkey also sees such crashes, though mainly in profile manager, and there it's enough to kill compreg.dat to fix it, apparently - see bug 594571
(In reply to comment #1)
> SeaMonkey also sees such crashes, though mainly in profile manager, and there
> it's enough to kill compreg.dat to fix it, apparently - see bug 594571

I think that's unlikely - Thunderbird and Firefox are already removing compreg.dat in the same way that you have fixed it.
#175 for v3.1.2, but #2 in v3.1.3
(In reply to comment #3)
> #175 for v3.1.2, but #2 in v3.1.3

The v3.1.2 one is bug 530447 - this bug is a crash at a slightly different location in the same bug.
and from ~#69 in v3.0.6 to #3 in v3.0.7 (and disproportionately biased to Mac)

http://gsfn.us/t/1eu41 may be this crash (not yet confirmed)
So is the crash happening during startup or shutdown so that 'utils' is null
because the service is not yet, or isn't anymore available?

Looks like a null check could fix this.
(In reply to comment #6)
> So is the crash happening during startup or shutdown so that 'utils' is null
> because the service is not yet, or isn't anymore available?

From the reports I'm seeing, for the majority of people it is startup - though obviously not everyone sees this.

> Looks like a null check could fix this.

I would agree that is possible, though I don't see why that wouldn't be available.
I have try server builds for Thunderbird running with the following patch:

diff --git a/embedding/components/windowwatcher/src/nsWindowWatcher.cpp b/embedd
ing/components/windowwatcher/src/nsWindowWatcher.cpp
--- a/embedding/components/windowwatcher/src/nsWindowWatcher.cpp
+++ b/embedding/components/windowwatcher/src/nsWindowWatcher.cpp
@@ -522,7 +522,7 @@ nsWindowWatcher::OpenWindowJSInternal(ns
 
   nsCOMPtr<nsIContentUtils> utils =
     do_GetService("@mozilla.org/content/contentutils;1");
-  if (!utils->IsSafeToRunScript()) {
+  if (utils && !utils->IsSafeToRunScript()) {
     return NS_ERROR_FAILURE;
   }
 
I'm not convinced if that is the correct null check, or if it should be !utils || !utils->IsSafeToRunScript.
I think !utils || !utils->IsSafeToRunScript() makes more sense.
It does prevent opening some dialog at strange time.
I wonder who is trying to open and what.
(In reply to comment #9)
> I think !utils || !utils->IsSafeToRunScript() makes more sense.
> It does prevent opening some dialog at strange time.
> I wonder who is trying to open and what.
For SeaMonkey it seems to occur if it's started with -ProfileManager, so the profile manager window is opened first.
Same crash is #6 on 3.5.12 as well.
Assignee: nobody → mrbkap
Interesting that this doesn't show up in the top 300 crashes in 3.6.9pre or 3.5.12pre...
If we can confirm a fix like comment #8 / comment #9 fast, I would very much like to include it in the "oilspill"/chemspill/firedrill release I need to do for SeaMonkey anyhow - but I'd need landing on the 1.9.1.12 relbranch.
I don't see anything obvious to point to a certain extension causing this...Also, looks to be multi-platform.
blocking1.9.1: ? → .13+
blocking1.9.2: ? → .10+
Just so folks are aware, I'm looking to get this into some TB builds asap.
Blake's looking into the underlying cause.
Crashes per hour for this signature on crash-stats going back 3 weeks: http://imagebin.ca/view/a2Ih8CFG.html
That's a pretty big spike - not sure it's worth a chemspill until we know what the actual cause is. Any correlations between the crash stacks?
Not in the reports correlation that I can see and not in my manual viewing. All 3 platforms, some with plugins some not, all kinds of CPUs, etc.
Blocks: 595242
Over the weekend I've received a compreg.dat from a user, but it is looking like there are old components that haven't been removed from Thunderbird 2, or it was a stale compreg.dat that hadn't been updated.

I've therefore asked some more questions of the user and am waiting a response.
Support is also pinging users for more info fwiw.
I'd like to get the change in comment 8 and comment 9 landed asap on mozilla-1.9.2 (on the GECKO1929_20100824_RELBRANCH branch). The sooner this gets in, the sooner we can start a build and kick the update out the door.
(In reply to comment #24)
> I'd like to get the change in comment 8 and comment 9 landed asap on
> mozilla-1.9.2 (on the GECKO1929_20100824_RELBRANCH branch).

I'm not actually confident in that change and would want someone to verify it. If we make it

if (utils && !utils->IsSafeToRunScript())

then could that regress bug 532730 if the contentutils service isn't able to be loaded for some reason?

With

if (!utils || !utils->IsSafeToRunScript())

could we get ourselves into a situation where no windows would open and the user be equally broken?
I talked with blake about this yesterday. The first one will essentially revert users that are hitting the null case back to what was in the platform before, as the entire "if" was added for bug 532730. I would guess that sort of unfixes bug 532730 for the startup case, but I think we're ok otherwise. Blake will have to confirm.

The second one might make us run into a situation with no window is displayed. I agree it changes behavior to a 3rd state (3.6.8 behavior/crash on startup/possibly no window?) that needs to be investigated.

Blake, perhaps you could chime in and confirm the above and comment which is has the most possibility of fixing users without causing more problems. I am willing to essentially unfix bug 532730 if that's what we need to do.

I also want to point out here that we'll need this on mozilla-1.9.1 GECKO19112_20100824_RELBRANCH as well (comment 24 was written in haste and only included 1.9.2)

Thanks!
Just talked with Blake again and he confirmed the above. We will not be undoing bug 532730 in the general case if we go with  "if (utils && !utils->IsSafeToRunScript())" and we do not introduce new (possibly just as bad) behavior.

So we need the "if (utils && !utils->IsSafeToRunScript())" version, on both mozilla-1.9.2 (GECKO1929_20100824_RELBRANCH branch) and mozilla-1.9.1 (GECKO19112_20100824_RELBRANCH branch).
Attached patch Proposed "fix"Splinter Review
Attachment #475126 - Flags: review?(jonas)
Attachment #475126 - Flags: review?(jonas) → review?(jst)
Attachment #475126 - Flags: review?(jst) → review+
More technical details: we're seeing this crash on trunk betas as well as branch builds, which means that it is most-likely not caused by stale compreg. It's also not likely caused by recursive initialization of layout, but it is very likely caused by the layout module failing to initialize. We don't know why this is, yet.The null-check will prevent the crash, but it won't solve the underlying problem which *probably* indicates that the browser won't actually work. Hard to tell what's actually going on, though.
blocking1.9.1: .14+ → .13+
blocking1.9.2: .11+ → .10+
Just to note, most of the comments indicated people were running the browser previously, got the update prompt, updated, and then were unable to start after that point.
Attachment #475126 - Flags: approval1.9.2.10+
Attachment #475126 - Flags: approval1.9.1.13+
Got a compreg.dat and install folder from a user with this crash -- may help with repro.

Install folder too big to upload, ask me if you need a copy.
Comment on attachment 475126 [details] [diff] [review]
Proposed "fix"

correcting after renaming the rest of the approval requests to the next release
Attachment #475126 - Flags: approval1.9.2.11+
Attachment #475126 - Flags: approval1.9.2.10+
Attachment #475126 - Flags: approval1.9.1.14+
Attachment #475126 - Flags: approval1.9.1.13+
Though I have been saying this in other places, I am going to mention this here as the bug is getting linked to...though this was a spike, the overall "crashiness" of Firefox 3.6.x has not really increased at all. At about 1k crashes a day, this crash is a drop in the bucket vs active daily users. But, because it is a crash on startup that could prevent people from using Firefox entirely, we feel it was best to get a fix out quickly.
Interesting:
A virus was detected by clamav in the latest Firefox version. https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2269
Could this be related?
(In reply to comment #35)
> Interesting:
> A virus was detected by clamav in the latest Firefox version.
> https://wwws.clamav.net/bugzilla/show_bug.cgi?id=2269
> Could this be related?
Very doubtful. That is in regards to a false positive for one or more of the files used by the installer and not Firefox itself.
No longer blocks: 595242
Hi have a crash at startup after updating from 3.6.9 (which worked fine) and from 3.6.8 , to 3.6.10. "the procedure entry point sqlite3_column_double could not be located in the dynamic link library sqlite3.dll.

Windows XP SP3.  Could this crash bug be related to the latest Microsoft's Windows updates trying to fix the DLL Injection (loading DLLs from current folder) security bug ?
Two questions:

 - should this block trunk?
 - do we know how that variable got made null?
blocking2.0: --- → ?
here is the distribution across releases on sept 28


checking --- nsWindowWatcher::OpenWindowJSInternal.nsIDOMWindow...char.const...char.const...char.const...int..nsIArray...int..nsIDOMWindow.. 20100928-crashdata.cs
v
found in: 3.6.10 3.6.9 4.0b6 4.0b5 3.6.3 3.5.12 4.0b3 3.6b4 3.6.8 3.6.6 3.6 4.0b4 3.0.19 4.0b7pre 4.0b1 3.6b5 3.6.2 3.5.10 3.0.7
release total-crashes
              nsWindowWatcher::OpenWindowJSInternal.nsIDOMWindow...char.const...char.const...char.const...int..nsIArray...int..nsIDOMWindow.. crashes
                         pct.
all     373476  140     0.000374857
3.6.10  213615  55      0.000257473
3.6.9   5104    33      0.00646552
4.0b6   25035   6       0.000239664
4.0b5   3750    6       0.0016
3.6.3   12459   6       0.00048158
3.5.12  850     6       0.00705882
4.0b3   1288    5       0.00388199
3.6b4   775     4       0.00516129
3.6.8   24127   3       0.000124342
3.6.6   7131    3       0.000420698
3.6     7896    3       0.000379939
4.0b4   2376    2       0.000841751
3.0.19  10540   2       0.000189753
4.0b7pre2655    1       0.000376648
4.0b1   1091    1       0.00091659
3.6b5   609     1       0.00164204
3.6.2   2364    1       0.000423012
3.5.10  457     1       0.00218818
3.0.7   357     1       0.00280112

And daily volume on 4.0b7pre look like about 0-4 crashes per day for about the last two weeks.

I'd say no, not a blocker for 4.0b7.
It looks like 100% of users on 4.0b6 and 97% of users on 3.6.10 have extension compatibility checking turned off for this signature when looking at data for oct 5.

1.000	nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)	10,,10	 bugs=530447,530845,594699

0.968	nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)	62,2,60	 bugs=530447,530845,594699
Blocking.
blocking2.0: ? → betaN+
Summary: startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check → [ready to land] startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check
Pushed:
https://hg.mozilla.org/mozilla-central/rev/307f6c854845
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b9
Summary: [ready to land] startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check → startup crash [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)] at safe to run script check
Whiteboard: [tb31needs][tb30needs] → [tb31needed][tb30needed]
Crash Signature: [@ nsWindowWatcher::OpenWindowJSInternal(nsIDOMWindow*, char const*, char const*, char const*, int, nsIArray*, int, nsIDOMWindow**)]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: