Bug 733892 - crash aswJsFlt.dll (Avast) and Ant extension
Status: RESOLVED FIXED
[3rd-party-bustage][STR in comment 15]
: crash, reproducible
Product: Core
Classification: Components
Component: Widget: Win32
: 11 Branch
: x86 Windows NT
: -- critical
: ---
Assigned To: Nobody; OK to take it and work on it
:
: Jim Mathies [:jimm]
Mentors:
Depends on: 850957
Blocks: 742113
Reported: 2012-03-07 13:31 PST by Marcia Knous [:marcia - use ni]
Modified: 2013-03-13 18:24 PDT

Attachments
proof of concept hack (4.68 KB, patch)
2012-04-02 00:27 PDT, Nick Cameron [:nrc]

Description Marcia Knous [:marcia - use ni] 2012-03-07 13:31:25 PST
This bug was filed from the Socorro interface and is 
report bp-93934f25-a07e-4139-9056-e25392120307 .
============================================================= 

Seen while looking at B5 crash stats. This has been around in previous releases - https://crash-stats.mozilla.com/report/list?signature=DispatchHookW links to the reports. In the past day this ranked #24 in Beta 5 crashes.

__fnHkINLPMOUSEHOOKSTRUCTEX shows up in crash stats as a signature on its own - https://crash-stats.mozilla.com/report/list?signature=__fnHkINLPMOUSEHOOKSTRUCTEX

Frame 	Module 	Signature 	Source
0 		@0x6d7172e3 	
1 	user32.dll 	DispatchHookW 	
2 	user32.dll 	CallHookWithSEH 	
3 	user32.dll 	__fnHkINLPMOUSEHOOKSTRUCTEX 	
4 	ntdll.dll 	KiUserCallbackDispatcher 	
5 	ntdll.dll 	KiUserApcDispatcher 	
6 	user32.dll 	_PeekMessage 	
7 	shell32.dll 	shell32.dll@0x160bb1 	
8 	shell32.dll 	shell32.dll@0x328a6 	
9 	shell32.dll 	shell32.dll@0x2240e 	
10 	shell32.dll 	shell32.dll@0x21dc9 	
11 	shell32.dll 	shell32.dll@0x21ee1 	
12 	shell32.dll 	shell32.dll@0x21e6f 	
13 	xul.dll 	nsLocalFile::Launch 	xpcom/io/nsLocalFileWin.cpp:2856
14 	xul.dll 	XPCNativeMember::GetCallInfo 	js/xpconnect/src/XPCWrappedNativeInfo.cpp:59
15 	xul.dll 	NS_InvokeByIndex_P 	xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
16 	xul.dll 	xul.dll@0xfb31f 	
17 		@0x6 	
18 	mozjs.dll 	JS_DefinePropertyById 	js/src/jsapi.cpp:3544
19 	xul.dll 	DefinePropertyIfFound 	js/xpconnect/src/XPCWrappedNativeJSOps.cpp:452
20 		@0xffffff86
Comment 1 Scoobidiver (away) 2012-03-08 03:06:35 PST
The widespread stack looks like:
Frame 	Module 	Signature 	Source
0 		@0xa5a72e3 	
1 	user32.dll 	DispatchHookW 	
2 	user32.dll 	CallHookWithSEH 	
3 	user32.dll 	__fnHkINLPMOUSEHOOKSTRUCTEX 	
4 	ntdll.dll 	KiUserCallbackDispatcher 	
5 	user32.dll 	CallHookWithSEH 	
6 	user32.dll 	NtUserPeekMessage 	
7 	xul.dll 	nsAppShell::ProcessNextNativeEvent 	widget/src/windows/nsAppShell.cpp:336
8 	xul.dll 	nsBaseAppShell::OnProcessNextEvent 	widget/src/xpwidgets/nsBaseAppShell.cpp:324
9 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:622
10 	nspr4.dll 	PR_Unlock 	nsprpub/pr/src/threads/combined/prulock.c:347
11 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
12 	xul.dll 	_SEH_epilog4 	
13 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
14 	xul.dll 	nsLocalFile::SetFollowLinks 	xpcom/io/nsLocalFileWin.cpp:2611
15 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
16 	xul.dll 	nsAppShell::Run 	widget/src/windows/nsAppShell.cpp:258

It's slightly correlated with Babylon Toolbar and DealPly (a Conduit toolbar):
* 10.0.2:
     22% (70/318) vs.   5% (4012/73873) ffxtlbr@babylon.com
     14% (43/318) vs.   2% (1304/73873) {EB9394A3-4AD6-4918-9537-31A1FD8E8EDF}
* 11.0:
     17% (23/139) vs.   7% (2369/34679) ffxtlbr@babylon.com
     10% (14/139) vs.   2% (714/34679) {EB9394A3-4AD6-4918-9537-31A1FD8E8EDF}

Is DealPly susceptible to MITM attacks?
Comment 2 Robert Kaiser 2012-03-08 06:43:51 PST
This spiked significantly in yesterday's data and is now #18 on both 10.* and 11.*
Comment 3 Robert Kaiser 2012-03-12 09:00:43 PDT
#12 on 10, #7 on 11 in yesterday's data.
Comment 4 Marcia Knous [:marcia - use ni] 2012-03-12 13:29:40 PDT
Addon version correlations:

DispatchHookW|EXCEPTION_ACCESS_VIOLATION_EXEC (310 crashes)
     18% (56/310) vs.   6% (3270/58410) ffxtlbr@babylon.com
          0% (0/310) vs.   0% (1/58410) 1.1.0
          0% (0/310) vs.   0% (14/58410) 1.1.3
          0% (0/310) vs.   0% (30/58410) 1.1.8
         13% (40/310) vs.   3% (1783/58410) 1.1.9
          5% (16/310) vs.   2% (1440/58410) 1.2.0
          0% (0/310) vs.   0% (2/58410) 1.4.15.4
     13% (40/310) vs.   2% (1103/58410) {EB9394A3-4AD6-4918-9537-31A1FD8E8EDF}
          0% (0/310) vs.   0% (33/58410) 1.0.7
         13% (40/310) vs.   2% (1070/58410) 2.0
Comment 5 Marcia Knous [:marcia - use ni] 2012-03-12 13:40:01 PDT
Comment 4 has correlations from 10.0.2, here are some from the 11.0b7:

DispatchHookW|EXCEPTION_ACCESS_VIOLATION_READ (144 crashes)
     10% (14/144) vs.   3% (897/27737) mozilla_cc@internetdownloadmanager.com (IDM CC, https://addons.mozilla.org/addon/6973)
          0% (0/144) vs.   0% (6/27737) 5.7
          0% (0/144) vs.   0% (16/27737) 5.8
          0% (0/144) vs.   0% (2/27737) 6.1
          0% (0/144) vs.   0% (1/27737) 6.3
          0% (0/144) vs.   0% (15/27737) 6.4
          1% (1/144) vs.   0% (34/27737) 6.7
          0% (0/144) vs.   0% (9/27737) 6.8
          0% (0/144) vs.   0% (15/27737) 6.9.1
          0% (0/144) vs.   0% (6/27737) 6.9.7
          1% (1/144) vs.   0% (15/27737) 6.9.8
          0% (0/144) vs.   0% (2/27737) 7.1.6
          0% (0/144) vs.   0% (1/27737) 7.1.8
          0% (0/144) vs.   0% (4/27737) 7.2.2
          0% (0/144) vs.   0% (3/27737) 7.2.8
          0% (0/144) vs.   0% (15/27737) 7.3.1
          0% (0/144) vs.   0% (1/27737) 7.3.10
          0% (0/144) vs.   0% (10/27737) 7.3.11
          0% (0/144) vs.   0% (25/27737) 7.3.12
          1% (1/144) vs.   0% (128/27737) 7.3.14
          3% (4/144) vs.   1% (192/27737) 7.3.15
          5% (7/144) vs.   1% (395/27737) 7.3.16
          0% (0/144) vs.   0% (1/27737) 7.3.7
          0% (0/144) vs.   0% (1/27737) 7.3.9
      9% (13/144) vs.   3% (730/27737) wtxpcom@mybrowserbar.com
          0% (0/144) vs.   0% (2/27737) 4.3
          0% (0/144) vs.   0% (2/27737) 4.9
          6% (8/144) vs.   2% (587/27737) 5.0
          3% (5/144) vs.   1% (139/27737) 5.1
Comment 6 Robert Kaiser 2012-03-12 14:05:22 PDT
Given that together they can account for at most 30% of those crashes (and only a small minority of those occur within the first minute, so we should always have the complete add-on list), it looks a lot like it's not add-ons that trigger this.

In the modules for 10.0.2 I also don't see too much interesting, there's MpOAV.dll which is Windows Defender, some Groove stuff, some Avast libraries, ieframe.dll, but nothing with a really striking correlation.

So far, this is a tremendous spike, but without any smoking gun I could see up to this moment.

Let's take a look at URLs:

$ gunzip --stdout /data/security_group/crash_urls/20120311-crashdata.csv.gz | awk -W compat -F\t '$1 ~ /DispatchHookW/ {print $2}' | sort | uniq -c | sort -nr
    695 about:blank
     30 http://express-files.com/_welcome
     22 http://www.contaprime.com/
     17 \N
     11 http://www.facebook.com/
      8 file:///C:/Program%20Files/Babylon/Babylon-Pro/Utils/Babylon.xpi
      6 
      5 http://premiumsafe.info/v54
      5 http://premiumsafe.info/v45
      5 http://aplicativos/Babylon/Setup/BabylonTB.xpi
      5 file:///C:/Program%20Files/Wajam/Firefox/firefox_trigger_extension.htm

...and a ton more with express-files.com, Babylon(*.xpi) and other things that look like download sites or software installation.

Now we're getting somewhere...

User comments tell a similar story, e.g. "i am installing/using a new software 1clickdownload v2.1, when mozilla crashed", "downloading updated version of game previously purchased from DQ", "Declined T&C for download exe and when it shut down, it shut down firefox at the same time. Everything reloaded perfectly. Website: Filebox", "installing skype".

It looks like this is connected to downloads in some way (which might explain some anti-virus being elevated slightly in correlations as they might scan the download). 70% of those crashes are on XP, 25% on Win7, 5% on Vista, and it's across all kinds of versions from 5.0b1 to 13.0a1 (still, nothing on 4 or lower - by chance?)

Did something happen recently for a lot of people on all kinds of Windows that made a difference with downloads on Firefox 5 and higher? Some kind of malware or virus hiding in a Windows DLL so we don't see it in correlations?
Comment 7 Marcia Knous [:marcia - use ni] 2012-03-13 17:59:22 PDT
I see uTorrentControl2 Findbar in a few of the reports. That would make sense based on what KaiRo says in Comment 6. I installed it in a VM and have tried some different operations - will report back.
Comment 8 Jorge Villalobos [:jorgev] 2012-03-15 12:46:48 PDT
Bug 735931 comment #14 points to avast! as a possible culprit. Kev, can you reach out to them and see if they are getting complaints from their users? Did they update their software recently?
Comment 9 Brian King [:kinger] 2012-03-15 15:20:36 PDT
One combo that is giving me a 100% reproducible crash on Windows is Avast anti-virus and the Ant Video Downloader and Player (https://addons.mozilla.org/addon/video-downloader-player). The Avast program has various 'shields', and I have narrowed it down to the 'Script shield'. I still haven't found the code in the add-on that is triggering it yet, but am investigating.

Note that the Avast add-on installed in Firefox appears to have no impact on this at all.
Comment 10 Scoobidiver (away) 2012-03-15 23:52:52 PDT
(In reply to Jorge Villalobos [:jorgev] from comment #8)
> Bug 735931 comment #14 points to avast! as a possible culprit.
It's a low correlation:
22% (68/312) vs.  13% (9482/74694) snxhk.dll
and a geographical one (European users).
Comment 11 Robert Kaiser 2012-03-16 05:05:42 PDT
(In reply to Scoobidiver from comment #10)
> It's a low correlation:

I never said that avast! would be the culprit for all those. But this is a crash with downloads and potentially virus scanning involved, and that's what the other one seems to point to as well. It's probably NOT specific to one single security suite, but this instance that is reproducible could, when being looked at with a debugger, potentially give a clue as to why we have seen a rise of such problems.

We have no other lead as to what could be going on here, and even though the spike has subsided somewhat, this stays well within topcrash range. So I think someone who can handle a debugger and has some clue about the code should look into the case Brian found and try to find out if it points to a pattern we could be seeing here.
Comment 12 Marcia Knous [:marcia - use ni] 2012-03-19 10:15:29 PDT
Brian: Which versions of Avast and Ant are you using? I will try to reproduce it on my end.  Thanks.

(In reply to Brian King (Briks) [:kinger] from comment #9)
> One combo that is giving me a 100% reproducible crash on Windows is Avast
> anti-virus and the Ant Video Downloader and Player
> (https://addons.mozilla.org/addon/video-downloader-player). The Avast
> program has various 'shields', and I have narrowed it down to the 'Script
> shield'. I still haven't found the code in the add-on that is triggering it
> yet, but am investigating.
> 
> Note that the Avast add-on installed in Firefox appears to have no impact on
> this at all.
Comment 13 Sheila Mooney 2012-03-22 10:39:19 PDT
We don't have to have a perfect correlation but if we can come up with a repro case based on some combination of add-ons, it will help the devs dig through this.
Comment 14 Marcia Knous [:marcia - use ni] 2012-03-22 11:21:09 PDT
I tried installing the free version of Avast and the Ant Video downloader but no crash yet.
Comment 15 Brian King [:kinger] 2012-03-23 03:38:20 PDT
(In reply to Marcia Knous [:marcia] from comment #14)
> I tried installing the free version of Avast and the Ant Video downloader
> but no crash yet.

STR here are:

1. Ensure shields are enabled in Avast!, especially the script shield
2. Open the Player feature in the Ant Video Downloader

See also:
http://support.mozilla.org/en-US/questions/889022

aswJsFlt.dll is an Avast DLL.

I've narrowed it down: the trigger is mediaplayer.swf, which we bundle in the add-on. It is from JWPlayer:

http://www.longtailvideo.com/players/jw-flv-player/

This and Avast just don't play nice together. I may have to file a bug on 1 or both of those projects upstream.
Comment 16 Alex Keybl [:akeybl] 2012-03-23 13:58:48 PDT
Thanks for finding STR Brian! Since there wasn't a 100% correlation with Avast/Ant Video Downloader (from my read of the bug) we'll try to see if there's anything we can do from our side as well.

roc - could you take a look?
Comment 17 Marcia Knous [:marcia - use ni] 2012-03-23 15:20:14 PDT
I can reproduce this using Brian's STR. I don't get the crash reporter but I do consistently get a dialog that indicates "Firefox stopped working." I installed the trial version of Avast Internet Security Suite.
Comment 18 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-03-29 13:50:37 PDT
Welcome to the front lines, Nick.
Comment 19 Nick Cameron [:nrc] 2012-03-29 15:26:53 PDT
I can recreate using Brian's steps in Beta, but not in a debug build of m-c. Hopefully that means it's been fixed, but will investigate further...
Comment 20 Nick Cameron [:nrc] 2012-03-29 15:28:39 PDT
Nope, can reproduce on optimised nightly, just not in debug build.
Comment 22 Nick Cameron [:nrc] 2012-03-30 03:50:59 PDT
The crash I can pretty consistently reproduce (about 95% of the time) seems to match the support question about 'BEX', but I'm not convinced it is exactly the same problem as the crash stats, though they are no doubt related.

Looking at the exception codes in the crash reports, it looks like all the reported crashes are for exception 0xc0000005 - STATUS_ACCESS_VIOLATION, but the support question and my crash are 0xc0000417 - STATUS_INVALID_CRUNTIME_PARAMETER.
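
To make the comparison of the two codes concrete, here is a minimal sketch in portable C++. The helper name `DescribeExceptionCode` is hypothetical and only the two NTSTATUS values quoted in this comment are covered; it is not taken from the crash reporter.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps the two exception codes quoted in this bug to
// their NTSTATUS names. Any other value is reported as "unknown".
std::string DescribeExceptionCode(std::uint32_t code) {
  switch (code) {
    case 0xC0000005u:
      return "STATUS_ACCESS_VIOLATION";            // seen in the crash-stats reports
    case 0xC0000417u:
      return "STATUS_INVALID_CRUNTIME_PARAMETER";  // seen with the Avast/Ant STR
    default:
      return "unknown";
  }
}

// Two crashes with different codes are unlikely to be the same failure,
// which is the argument made in this comment.
bool SameExceptionKind(std::uint32_t a, std::uint32_t b) { return a == b; }
```

With these definitions, `SameExceptionKind(0xC0000005u, 0xC0000417u)` is false, which is the mismatch described above.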

BEX is a buffer overflow exception, usually associated with DEP (Data Execution Prevention), but with the exception code 0xc0000417, that seems unlikely. I disabled DEP, and still got the same crash details.

The crash has been surprisingly hard to catch with a debugger and I've only managed to debug a dump. The stack traces I found this way are very different from the ones on the crash reports. I'll try and get more details... I did find evidence that the exception was caused by aswJsFlt.dll, but I've no idea why.

Because of the Avast dll in my stack trace and the presence of hook functions (used to implement some anti-virus software, and notoriously buggy) in the above stack traces, the hypothesis that Avast is causing the crash by interfering with the way the video plugin is hosted in Firefox (as suggested in Briks' support thread, above) seems feasible. But KiUserCallbackDispatcher (used to dispatch callbacks from Kernel to User mode) is (apparently, could not verify) only used for windowing calls; MOUSEHOOKSTRUCTEX is a standard Windows struct for reporting mouse use. This seems to be an unlikely (but still possible) place for a crash if it is due to Avast messing with the plugin.
Comment 23 Nick Cameron [:nrc] 2012-03-30 04:01:39 PDT
Also, I could not get my crash to trigger the crash reporter despite a whole load of trying. So (along with the different stack traces and exception codes) it seems unlikely that Briks' STR actually reproduce the crash described by Marcia. Although they both seem to have Avast in common, that might be all that links them.
Comment 24 Nick Cameron [:nrc] 2012-03-30 04:32:49 PDT
I could reproduce my crash on optimised builds of FF versions 10, 11, 12, 13, and 14, but not on debug builds of 14. It is probably worth trying debug builds of older versions too.

Further supporting the hooks/Avast hypothesis - I think we use this kind of Windows hook to run plugins, including Flash, so it is easy to imagine some bug between FF and Avast here.
Comment 25 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-04-01 15:28:50 PDT
(In reply to Nick Cameron [:nrc] from comment #24)
> Further supporting the hooks/Avast hypothesis - I think we use this kind of
> Windows hook to run plugins, including Flash,

We don't.

We should see if blocking the Avast injection DLL stops the crash. Assuming it does, I think we should ask Avast to investigate and threaten them with blacklisting if they don't fix it.
Comment 26 Nick Cameron [:nrc] 2012-04-01 21:02:03 PDT
Stack trace:

00 ntdll!ZwWaitForMultipleObjects+0x15
01 kernel32!WaitForMultipleObjectsEx+0x8e
02 kernel32!WaitForMultipleObjects+0x18
03 kernel32!GetApplicationRecoveryCallback+0x2a7
04 kernel32!GetApplicationRecoveryCallback+0x166
05 kernel32!UnhandledExceptionFilter+0x161
06 kernel32!UnhandledExceptionFilter+0xe0
07 aswJsFlt+0x960f
08 aswJsFlt+0x8f08
09 aswJsFlt+0x3b42
0a xul!nsJSContext::EvaluateStringWithValue+0x3bc
0b xul!mozilla::plugins::parent::_evaluate+0x317
0c xul!mozilla::plugins::PluginScriptableObjectParent::AnswerNPN_Evaluate+0x91
0d xul!mozilla::plugins::PPluginScriptableObjectParent::OnCallReceived+0xe5
0e xul!mozilla::plugins::PPluginModuleParent::OnCallReceived+0x5f
0f mozalloc!moz_xmalloc+0xf
10 xul!mozilla::ipc::RPCChannel::DispatchIncall+0x5f
11 xul!mozilla::ipc::RPCChannel::Incall+0x11e
12 xul!mozilla::ipc::RPCChannel::OnMaybeDequeueOne+0x159
13 xul!MessageLoop::DoWork+0x246
14 xul!base::ThreadLocalPlatform::GetValueFromSlot+0xd
15 xul!MessageLoop::current+0x2d
16 xul!mozilla::ipc::DoWorkRunnable::Run+0x2a
17 xul!nsThread::ProcessNextEvent+0x1d2
18 xul!NS_ProcessNextEvent_P+0x2b
19 xul!mozilla::ipc::MessagePump::Run+0xb1
1a xul!MessageLoop::RunHandler+0xa6
1b xul!MessageLoop::Run+0x3c
1c xul!nsBaseAppShell::Run+0x2c
1d xul!nsAppShell::Run+0x49
1e xul!nsAppStartup::Run+0x20
1f xul!XRE_main+0x19ee
20 firefox!do_main+0x1f6
21 firefox!NS_internal_main+0x1f1
22 firefox!wmain+0x163
23 firefox!__tmainCRTStartup+0x122
24 kernel32!BaseThreadInitThunk+0x12
25 ntdll!RtlInitializeExceptionChain+0x63
26 ntdll!RtlInitializeExceptionChain+0x36
Comment 27 Nick Cameron [:nrc] 2012-04-01 22:32:42 PDT
We think that the Avast dll aswJsFlt is causing at least the crash I can reproduce and probably the more general one. Avast is loading aswJsFlt.dll and another dll when Firefox loads (annoyingly, outside of the mechanism which we can blocklist). Avast is patching our code to call its own (inside aswJsFlt) and then crashing (details below).

We need to reach out to Avast to see if they can fix this. Our options seem limited: we can't block the dll injection or prevent the patching. I will try a hack to get around the patching, but this will be unsatisfactory at best and easy to defeat. I can only see the hack working on one function at a time, so if Avast are patching elsewhere (which seems likely) then their bug could still cause crashes.
Comment 28 Nick Cameron [:nrc] 2012-04-01 22:44:23 PDT
Details:

The Avast dll is patching the first instructions of JS_EvaluateUCScriptForPrincipalsVersion (jsapi.cpp), replacing them with an unconditional jump into their dll. Here some code is run and when done jumps into some dynamically generated code. This seems to be a stub which executes the code from JS_EvaluateUCScriptForPrincipalsVersion that was patched over, restores some state, and then jumps back to JS_EvaluateUCScriptForPrincipalsVersion. Sometimes this works correctly, but sometimes it causes the exception we're seeing, not sure why.

Ideally we don't want our code patched like this, we certainly don't want it patched with code that crashes!
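
For readers unfamiliar with this kind of patching, a minimal sketch of how one might check whether the start of a function has been overwritten with a jump. It assumes the common x86 `E9 rel32` near-jump encoding; the comment above only says "an unconditional jump", so the exact encoding Avast uses is an assumption, and `FindDetourTarget` is a hypothetical helper that works on a plain byte buffer rather than live code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper. `code` holds the first bytes of a function that is
// mapped at `address`. If those bytes begin with an x86 near jump (opcode
// 0xE9 followed by a 32-bit little-endian displacement), store the absolute
// jump destination in *target and return true. Otherwise return false.
// Assumes a little-endian host, as on x86.
bool FindDetourTarget(const std::uint8_t* code, std::size_t size,
                      std::uint64_t address, std::uint64_t* target) {
  const std::uint8_t kJmpRel32 = 0xE9;
  const std::size_t kJmpLength = 5;  // 1 opcode byte + 4 displacement bytes
  if (size < kJmpLength || code[0] != kJmpRel32) {
    return false;
  }
  std::int32_t displacement = 0;
  std::memcpy(&displacement, code + 1, sizeof(displacement));
  // The displacement is relative to the end of the jump instruction.
  *target = address + kJmpLength +
            static_cast<std::uint64_t>(static_cast<std::int64_t>(displacement));
  return true;
}
```

A destination that falls outside the module that owns the function (here, inside aswJsFlt.dll rather than mozjs.dll) would be the sign of third-party patching described above.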
Comment 29 Nick Cameron [:nrc] 2012-04-02 00:03:43 PDT
What is bothering me a little bit is why only optimised builds crash and debug builds don't.
Comment 30 Nick Cameron [:nrc] 2012-04-02 00:27:05 PDT
Created attachment 611373
proof of concept hack

Rename JS_EvaluateUCScriptForPrincipalsVersion so that Avast does not patch its code into the function. This fixes my reproduction of the crash. But it is a very unsatisfactory solution and not suitable for patching to trunk (if we do choose to use this solution (not recommended!) then we should use a better name and fix the indentation).

Proves that the Avast patching is the cause of at least the crash I can reproduce.

Doesn't seem to cause any problems with Avast.
Comment 31 Scoobidiver (away) 2012-04-02 00:33:55 PDT
(In reply to Nick Cameron [:nrc] from comment #27)
> We think that the Avast dll aswJsFlt is causing at least the crash I can
> reproduce and probably the more general one.
About 80% of crashes don't have an Avast DLL loaded (see comment 10).
Comment 32 Nick Cameron [:nrc] 2012-04-02 00:47:56 PDT
(In reply to Scoobidiver from comment #31)
> (In reply to Nick Cameron [:nrc] from comment #27)
> > We think that the Avast dll aswJsFlt is causing at least the crash I can
> > reproduce and probably the more general one.
> About 80% of crashes don't have an Avast DLL loaded (see comment 10).

This points towards the bug with Avast/Ant not being the same bug as the one in the crash reports. Unfortunately, it's the only one we have STR for.

WRT comment 10 is the 22% or 13% the number of Avast users? Also snxhk.dll is not the dll that is causing problems, but is another Avast dll. Is it possible that aswJsFlt.dll or other Avast dlls have a higher correlation?

It is possible that there is something we (or our compilation process) are doing which is causing problems for Avast and other software that uses similar mechanisms (it is possible others are doing the same thing as Avast, but in other places). At the very least, Avast should be able to help us better understand what is going on here.
Comment 33 Scoobidiver (away) 2012-04-02 01:30:17 PDT
It's now *only* #36 top browser crasher in 11.0, #46 in 12.0b3, and #45 in 13.0a2.

(In reply to Nick Cameron [:nrc] from comment #32)
> WRT comment 10 is the 22% or 13% the number of Avast users?
13% of all crashes have this DLL loaded while 22% of the crashes with this signature have this DLL loaded. In summary, 13% of ADU use Avast! AV while 22% of ADU hitting this crash have Avast! AV.

> Also snxhk.dll is not the dll that is causing problems, but is another Avast
> dll.
They are usually loaded together although snxhk.dll is loaded more often. See today's correlations on 11.0:
     17% (10/60) vs.   8% (9889/123390) aswProperty.dll
     17% (10/60) vs.   8% (9891/123390) AavmRpch.dll
     17% (10/60) vs.   8% (9891/123390) Aavm4h.dll
     17% (10/60) vs.   8% (9891/123390) ashTask.dll
     17% (10/60) vs.   8% (9891/123390) aswEngLdr.dll
     17% (10/60) vs.   8% (9891/123390) aswAux.dll
     17% (10/60) vs.   8% (9891/123390) ashBase.dll
     17% (10/60) vs.   8% (9894/123390) aswCmnOS.dll
     17% (10/60) vs.   8% (9894/123390) aswCmnBS.dll
     17% (10/60) vs.   8% (9895/123390) aswCmnIS.dll
     22% (13/60) vs.  15% (18368/123390) snxhk.dll
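
As a worked example of how to read these correlation lines, a small sketch that recomputes the two percentages and their ratio from the raw counts. The struct and function names are hypothetical, and the ratio (often called "lift") is an added convenience that does not appear in the correlation reports themselves.

```cpp
#include <cmath>
#include <cstdint>

// One correlation line, e.g. "22% (13/60) vs. 15% (18368/123390) snxhk.dll".
struct ModuleCorrelation {
  std::uint32_t signature_with_module;  // crashes with this signature that loaded the module
  std::uint32_t signature_total;        // all crashes with this signature
  std::uint32_t overall_with_module;    // all crashes that loaded the module
  std::uint32_t overall_total;          // all crashes
};

double Share(std::uint32_t part, std::uint32_t whole) {
  return whole == 0 ? 0.0 : static_cast<double>(part) / whole;
}

// Percentage rounded to the nearest integer, as printed in the reports.
long RoundedPercent(std::uint32_t part, std::uint32_t whole) {
  return std::lround(100.0 * Share(part, whole));
}

// How much more common the module is among crashes with this signature
// than among crashes in general. 1.0 means no over-representation.
double Lift(const ModuleCorrelation& c) {
  const double overall = Share(c.overall_with_module, c.overall_total);
  return overall == 0.0
             ? 0.0
             : Share(c.signature_with_module, c.signature_total) / overall;
}
```

For the snxhk.dll line above, `{13, 60, 18368, 123390}` gives 22% vs. 15% and a ratio of about 1.46: the DLL is over-represented in this signature, but it is present in only about a fifth of the reports.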
Comment 34 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-04-02 02:17:20 PDT
This crash signature is quite likely a medley of many different bugs caused by different applications using the Windows global hook mechanism. Windows global hooks are an extremely powerful, invasive and error-prone way to mess with all the applications on the desktop. I don't think we should expect to find a single fix that will fix this bug.

Since we can reproduce the Avast issue, we should fix it or get Avast to fix it. We can do that in a different bug that this bug depends on, or in this bug. Scoobidiver, any preference?
Comment 35 Scoobidiver (away) 2012-04-02 02:45:54 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34)
> Since we can reproduce the Avast issue, we should fix it or get Avast to fix
> it. We can do that in a different bug that this bug depends on, or in this
> bug. Scoobidiver, any preference?
I prefer a new bug for the Avast testcase that is dependent on this one.
Comment 36 Robert Kaiser 2012-04-02 04:51:03 PDT
(In reply to Scoobidiver from comment #31)
> About 80% of crashes don't have an Avast DLL loaded (see comment 10).

From what I saw, some other AV software (erm, "security suites") is in the correlations as well, so it might easily be that others are using the same or a similar tactic.
I wonder if we should work with those vendors to find some clean ways for them to plug in their probes etc. into our code, so that crashes are less likely and there are clean paths of interaction.

That said, let's fix the case we can reproduce and that we know about here, and let's go into a new bug if that doesn't fix all of the problem. The vast majority of comments in here is about the case we can reproduce, so let's keep this bug to that.
And if we see this signature happen still after that, let's file a new, clean-from-avast bug report for that.
Comment 37 Jorge Villalobos [:jorgev] 2012-04-02 08:33:58 PDT
I just pinged Kev to see if we can get in touch with Avast.
Comment 38 Nick Cameron [:nrc] 2012-04-02 15:33:58 PDT
I had a look at the functions in the two stack traces in the first two comments and the functions they call, and it does not look like Avast patches code into any of them. More evidence that the two kinds of crashes are unrelated.

I'm not entirely sure about the story in the first stack trace and why it is calling a global hook on a mouse action. But the second stack trace makes sense: ProcessNextNativeEvent checks with Windows for mouse and keyboard messages (the call into user32), which triggers some global hook, presumably associated with mouse events (so it seems unlikely to have anything to do with anti-virus). We then see the call chain which dispatches to that global hook, and either the code called by the hook or the dispatching mechanism (more likely the former) causes a crash. I doubt this is either our fault or the fault of anti-virus software (neither we nor they should be hooking into mouse events). I'm not sure if we can do anything about this - we may be able to block hooking into our process, but I've no real idea.
Comment 39 Nick Cameron [:nrc] 2012-04-03 15:14:01 PDT
A minor bit of investigation shows that Avast does not patch JS_EvaluateUCScriptForPrincipalsVersion in a debug build, which is why that crash only happens on optimised builds.
Comment 40 Nick Cameron [:nrc] 2012-04-03 16:43:27 PDT
Bug 653361/Bug 681238 look related - issues with the dll block list and anti-virus hooks
Comment 41 Nick Cameron [:nrc] 2012-04-03 17:30:13 PDT
Splitting this bug.

Please use this bug for the crash reproducible by the steps in comment 9, stack trace in comment 26. Note that the impact of this crash is unknown because it does not trigger the crash reporter. The help forum question is about this crash.

Please use Bug 742113 for the probably more important (but so far not reproducible) crash discussed in comments 1 -- 8.
Comment 42 Nick Cameron [:nrc] 2012-04-03 17:37:12 PDT
Makoto Kato or Mike Hommey: we don't seem to be able to block the Avast dlls using the dll blocker - snxhk.dll loads before we get a chance, but it would be nice to be able to block aswJsFlt.dll, and we can't. Is there a way to fix this?

(I am not suggesting we should block aswJsFlt.dll in releases, but it would be nice to be able to for debugging and so that we have the option as a last resort).
Comment 43 Makoto Kato [:m_kato] 2012-04-03 18:23:36 PDT
(In reply to Nick Cameron [:nrc] from comment #42)
> Makoto Kato or Mike Hommey: we don't seem to be able to block the Avast dlls
> using the dll blocker - snxhk.dll loads before we get a chance, but it would
> be nice to be able to block aswJsFlt.dll, and we can't. Is there a way to
> fix this?

What version of Avast? The newest? If hooking LdrLoadDll fails, then Avast's hooking method has changed.
Comment 44 Makoto Kato [:m_kato] 2012-04-03 18:44:17 PDT
Also, if hooking fails, "DllBlockList Failed" will be written in the app notes of the crash report.
Comment 45 Nick Cameron [:nrc] 2012-04-03 19:04:53 PDT
Yes, the newest version of Avast. I don't see LdrLoadDll loading when I start FF, but I have set the verbose logging define in the dll block list and I'm getting lots of LdrLoadDll output, so I assume LdrLoadDll is working OK - I don't see any error messages, although I don't seem to be able to redirect output to a file, so missing the first messages. I don't see anything about the Avast dlls.

This crash does not trigger the crash reporter, so there's nothing to check.
Comment 46 Makoto Kato [:m_kato] 2012-04-03 19:26:31 PDT
(In reply to Nick Cameron [:nrc] from comment #45)
> Yes, the newest version of Avast. I don't see LdrLoadDll loading when I
> start FF, but I have set the verbose logging define in the dll block list
> and I'm getting lots of LdrLoadDll output, so I assume LdrLoadDll is working
> OK - I don't see any error messages, although I don't seem to be able to
> redirect output to a file, so missing the first messages. I don't see
> anything about the Avast dlls.
> 
> This crash does not trigger the crash reporter, so there's nothing to check.

Thanks.

I set up avast! environment.

aswJsFlt.dll is loaded by snxhk.dll when mozjs.dll is loaded. But at that point we haven't set up the DLL hook yet, because XRE_SetupDllBlocklist is in xul.dll.

If we move it to firefox.exe and call it before XPCOMGlueStartup is called, we can block aswJsFlt.dll
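
The ordering problem can be illustrated with a toy model. This is only a sketch: `FakeLoader` and `SimulateStartup` are hypothetical stand-ins for the Windows loader and Firefox startup, the real blocklist hooks LdrLoadDll rather than using a callback like this, and the load order is simplified from the description above.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the process loader: an optional filter may veto a load.
class FakeLoader {
 public:
  using Filter = std::function<bool(const std::string&)>;  // false = block

  void InstallFilter(Filter filter) { filter_ = std::move(filter); }

  bool Load(const std::string& name) {
    if (filter_ && !filter_(name)) return false;  // vetoed by the blocklist
    loaded_.push_back(name);
    return true;
  }

  bool IsLoaded(const std::string& name) const {
    return std::find(loaded_.begin(), loaded_.end(), name) != loaded_.end();
  }

 private:
  Filter filter_;
  std::vector<std::string> loaded_;
};

// Stand-in for the blocklist check; blocks only the DLL discussed here.
bool BlocklistFilter(const std::string& name) { return name != "aswJsFlt.dll"; }

// Returns true if aswJsFlt.dll ended up loaded in the simulated startup.
bool SimulateStartup(bool install_filter_early) {
  FakeLoader loader;
  if (install_filter_early) loader.InstallFilter(BlocklistFilter);  // proposed: from firefox.exe
  loader.Load("mozjs.dll");
  loader.Load("aswJsFlt.dll");  // pulled in by snxhk.dll when mozjs.dll loads
  loader.Load("xul.dll");
  if (!install_filter_early) loader.InstallFilter(BlocklistFilter);  // current: set up from xul.dll
  return loader.IsLoaded("aswJsFlt.dll");
}
```

In this model `SimulateStartup(false)` leaves aswJsFlt.dll loaded because the filter arrives too late, while `SimulateStartup(true)` blocks it.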
Comment 47 Nick Cameron [:nrc] 2012-04-03 23:28:46 PDT
> If we move it to firefox.exe and call it before XPCOMGlueStartup is called,
> we can block aswJsFlt.dll

This seems like a sensible thing to do, it sounds like it would make dll blocking more secure. Is there any reason we wouldn't want to do this?
Comment 48 Mike Hommey [:glandium] 2012-04-03 23:34:54 PDT
(In reply to Makoto Kato from comment #46)
> If we move it to firefox.exe and call it before XPCOMGlueStartup is called,
> we can block aswJsFlt.dll

We can't move it to firefox.exe. Probably the most sensible thing would be to move it to libmozglue.
Comment 49 Benjamin Smedberg [:bsmedberg] 2012-04-04 05:59:10 PDT
Why can't we move it? We can compile the blocklist into other binaries if we want, no?
Comment 50 Brian King [:kinger] 2012-04-04 08:22:12 PDT
(In reply to Nick Cameron [:nrc] from comment #41)
> Please use this bug for the crash reproducible by the steps in comment 9

Note the full STR with more info are in comment 15.
Comment 51 :Ehsan Akhgari 2012-04-04 08:45:45 PDT
We can definitely move it to firefox.exe if we want to!
Comment 52 Mike Hommey [:glandium] 2012-04-04 11:02:05 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #49)
> Why can't we move it? We can compile the blocklist into other binaries if we
> want, no?

patched_LdrLoadDll uses at least gInXPCOMLoadOnMainThread, NS_SetHasLoadedNewDLLs and NS_IsMainThread, which are all in xul.dll.
Comment 53 Mike Hommey [:glandium] 2012-04-04 11:10:11 PDT
Well, yeah, it /can/ be moved. Not without effort.
Comment 54 :Ehsan Akhgari 2012-04-04 11:12:13 PDT
Yeah, we can have a function pointer set to null in firefox.exe and to the right thing when xul.dll gets loaded, and attempt to call that function in patched_LdrLoadDll instead of directly referencing those symbols.
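
A minimal sketch of that indirection, under stated assumptions: the `XulCallbacks` table, `RegisterXulCallbacks`, `NotifyDllLoaded` and the stub functions are hypothetical names for illustration, and the real patched_LdrLoadDll does considerably more. Only the idea (a pointer that is null until xul.dll registers itself) comes from the comment above.

```cpp
// ---- Part that would be compiled into firefox.exe (hypothetical layout) ----

// Entry points that live in xul.dll and that the hook wants to call.
struct XulCallbacks {
  bool (*isMainThread)();         // stand-in for NS_IsMainThread
  void (*setHasLoadedNewDLLs)();  // stand-in for NS_SetHasLoadedNewDLLs
};

// Null until xul.dll has been loaded and has registered its callbacks.
static const XulCallbacks* gXulCallbacks = nullptr;

void RegisterXulCallbacks(const XulCallbacks* callbacks) {
  gXulCallbacks = callbacks;
}

// Stand-in for the bookkeeping part of the patched loader function.
// Returns true if the xul.dll side could be consulted.
bool NotifyDllLoaded() {
  if (!gXulCallbacks) return false;  // xul.dll not loaded yet: skip bookkeeping
  if (gXulCallbacks->isMainThread()) gXulCallbacks->setHasLoadedNewDLLs();
  return true;
}

// ---- Part that would stay in xul.dll (stubs for illustration) ----
static bool gHasLoadedNewDLLs = false;
static bool StubIsMainThread() { return true; }
static void StubSetHasLoadedNewDLLs() { gHasLoadedNewDLLs = true; }
static const XulCallbacks kXulCallbacks = {StubIsMainThread,
                                           StubSetHasLoadedNewDLLs};
```

Before `RegisterXulCallbacks(&kXulCallbacks)` is called, `NotifyDllLoaded()` returns false and touches nothing in xul.dll, so the hook can be installed safely before xul.dll is loaded.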
Comment 55 Alex Keybl [:akeybl] 2012-04-04 15:43:08 PDT
No longer a top crash, and not specific to FF12, so no need to track for release.
Comment 56 Nick Cameron [:nrc] 2012-04-17 18:54:48 PDT
One oddity with this crash is that it does not trigger our crash reporter. The reason is that when execution moves into the Avast dll, they call SetUnhandledExceptionFilter with null, i.e., use the default Windows filter, which just pops up a dialog and dies. Our crash reporter uses a custom unhandled exception filter, and so Avast is essentially disabling it.

Thus, we have no idea how many FF crashes occur which are caused by Avast after they reset the exception filter. We should probably address this, but I have no idea how.
Comment 57 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-04-17 20:16:23 PDT
This is terrible! It means our crash-stats are basically blind to Avast (and whatever else uses this trick).

I think we should hook SetUnhandledExceptionFilter to treat a "NULL" parameter as redirecting back to crashreporter.
Comment 58 Ted Mielczarek [:ted.mielczarek] 2012-04-17 21:55:33 PDT
That is pretty horrifying. We should contact them and tell them not to do that, for one, and we should also look into hooking SetUnhandledExceptionFilter to be a no-op after we call it.
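
A minimal sketch of the "no-op after we call it" policy, written as a portable model rather than a real hook. Everything here is a stand-in: `Filter` models the Windows LPTOP_LEVEL_EXCEPTION_FILTER type, `GuardedSetFilter` models a hooked SetUnhandledExceptionFilter, and the other names are hypothetical. It is not the patch from any bug mentioned here.

```cpp
#include <cassert>

// Portable stand-in for the Windows top-level exception filter type.
using Filter = long (*)(void* exceptionInfo);

static Filter gCurrentFilter = nullptr;  // models the process-wide filter slot
static bool gFilterLocked = false;       // set once the crash reporter is installed

// Models the hooked setter. Like the real API it returns the previous filter,
// but once the slot is locked it ignores the new value (including null).
Filter GuardedSetFilter(Filter newFilter) {
  Filter previous = gCurrentFilter;
  if (!gFilterLocked) {
    gCurrentFilter = newFilter;
  }
  return previous;
}

// Hypothetical crash reporter setup: install our filter, then lock the slot.
void InstallCrashReporterFilter(Filter reporterFilter) {
  assert(reporterFilter != nullptr);
  gFilterLocked = false;
  GuardedSetFilter(reporterFilter);
  gFilterLocked = true;
}

// Example filters, used only to tell the two callers apart.
long CrashReporterFilter(void*) { return 1; }
long ThirdPartyFilter(void*) { return 0; }
```

With this policy a later call that passes null, as Avast does, leaves the crash reporter's filter in place, so the variant from comment 57 (treat null as "redirect to crashreporter") falls out as a special case.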
Comment 59 Nick Cameron [:nrc] 2012-04-17 22:03:30 PDT
(In reply to Ted Mielczarek [:ted] from comment #58)
> That is pretty horrifying. We should contact them and tell them not to do
> that, for one, and we should also look into hooking
> SetUnhandledExceptionFilter to be a no-op after we call it.

I've contacted Kev to do the former and I'm testing a patch for the latter (or similar, anyway).
Comment 60 Kev Needham [:kev] 2012-04-18 06:51:06 PDT
Apologies, I missed Jorge's prod a while back, and reached out to Avast last night. I've received a response, and have included Nick and Marcia in the conversation. They're hopeful that they can get this fixed, and I'll leave it with Marcia and Nick.

(In reply to Nick Cameron [:nrc] from comment #59)
> I've contacted Kev to do the former and I'm testing a patch for the latter
> (or similar, anyway).
Comment 61 :Ehsan Akhgari 2012-04-18 18:52:43 PDT
(In reply to Ted Mielczarek [:ted] from comment #58)
> That is pretty horrifying. We should contact them and tell them not to do
> that, for one, and we should also look into hooking
> SetUnhandledExceptionFilter to be a no-op after we call it.

I think we should first do what I suggested in comment 54, and then do this.  We really want total control over SetUnhandledExceptionFilter, and we should treat all calls to it after the crash reporter as no-op, and we should continue to blacklist any DLL which attempts to mess with this function aggressively!

Nick, I'd be happy to help you with any question you have, and also to review your patches here.  :-)
Comment 62 Nick Cameron [:nrc] 2012-04-19 15:41:34 PDT
(In reply to Kev [:kev] Needham from comment #60)
> Apologies, I missed Jorge's prod a while back, and reached out to Avast last
> night. I've received a response, and have included Nick and Marcia in the
> conversation. They're hopeful that they can get this fixed, and I'll leave
> it with Marcia and Nick.
> 

Thanks!
Comment 63 Nick Cameron [:nrc] 2012-04-19 15:44:32 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #61)
> (In reply to Ted Mielczarek [:ted] from comment #58)
> > That is pretty horrifying. We should contact them and tell them not to do
> > that, for one, and we should also look into hooking
> > SetUnhandledExceptionFilter to be a no-op after we call it.
> 
> I think we should first do what I suggested in comment 54, and then do this.
> We really want total control over SetUnhandledExceptionFilter, and we should
> treat all calls to it after the crash reporter as no-op, and we should
> continue to blacklist any DLL which attempts to mess with this function
> aggressively!
> 
> Nick, I'd be happy to help you with any question you have, and also to
> review your patches here.  :-)

I don't understand what you are describing in comment 54. I am already working on protecting the exception filter in Bug 747213 (a separate bug, since it won't fix this crash). I can then look at strengthening the DLL block list (which I think is what comment 54 describes); I will probably need some help with that :-)
Comment 64 Nick Cameron [:nrc] 2012-04-19 19:47:40 PDT
From contact with the Avast! guys: they reckon this is an easy thing to fix and will get it done for their next update, which will be released at the end of May. They ask if this is OK, or whether they should look at a more aggressive way to push out the fix --- what do we think?
Comment 65 Jorge Villalobos [:jorgev] 2012-04-20 07:57:16 PDT
I think we need this sooner if possible. We don't know how much impact this crash is having because of what they are doing, and Avast and Ant are both very widely used.
Comment 66 Mike Hommey [:glandium] 2012-04-20 09:53:49 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #51)
> We can definitely move it to firefox.exe if we want to!

Note that, if we move it to firefox.exe, we also need to move it to the webrt stubs. Which leads to the question: do we ensure the stubs are up-to-date where they are installed on user machines?
Comment 67 :Ehsan Akhgari 2012-04-20 09:58:48 PDT
Basically we need to move XRE_SetupDllBlocklist into firefox.exe (it currently lives in xul.dll -- see comment 46.)  This way we can install the blocklist a lot sooner than we do today (before we load libxul, etc) and this will help us catch a lot more of these types of problems.

The reason that moving XRE_SetupDllBlocklist into firefox.exe is not trivial is that it requires symbols which are defined in xul.dll (see comment 52).  What I suggest in comment 54 is a solution to that problem.  Basically, for all of the work which requires symbols in xul.dll, we can add a helper function, and use a function pointer (which is initialized to null in firefox.exe, and set to the correct function symbol when xul.dll is loaded) to call that function if the pointer is non-null.  This way, SetupDllBlocklist can be called in firefox.exe without those symbols being available (the function pointer would be null at that time so we'd avoid calling the helper which needs those symbols), and once xul.dll is loaded and those symbols are resolved, SetupDllBlocklist will start calling the helper.

Does this make more sense?
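
A rough sketch of that indirection, as a portable model only. All names are hypothetical stand-ins: `OnDllLoad` plays the role of patched_LdrLoadDll, `gXulHelper` is the function pointer that starts out null, and `XulSideBookkeeping` stands for whatever work needs xul.dll symbols. The blocklist itself is reduced to a single hard-coded name.

```cpp
#include <cassert>
#include <string>

// Signature of the helper that needs xul.dll symbols (hypothetical).
using XulHelper = void (*)(const std::string& dllName);

// Lives in the executable; null until the library registers its helper.
static XulHelper gXulHelper = nullptr;

// Called by the library once it has been loaded and its symbols resolved.
void RegisterXulHelper(XulHelper helper) {
  assert(helper != nullptr);
  gXulHelper = helper;
}

// Stand-in for the real blocklist lookup.
static bool IsBlocklisted(const std::string& dllName) {
  return dllName == "aswjsflt.dll";
}

enum class LoadDecision { Block, AllowWithoutHelper, AllowWithHelper };

// Stand-in for the patched loader entry point. Blocking works from the very
// start; the xul-dependent work only runs once the pointer has been set.
LoadDecision OnDllLoad(const std::string& dllName) {
  if (IsBlocklisted(dllName)) {
    return LoadDecision::Block;
  }
  if (gXulHelper == nullptr) {
    return LoadDecision::AllowWithoutHelper;  // library not loaded yet
  }
  gXulHelper(dllName);
  return LoadDecision::AllowWithHelper;
}

// Example helper on the library side; it only counts calls here.
static int gBookkeepingCalls = 0;
void XulSideBookkeeping(const std::string&) { ++gBookkeepingCalls; }
```

The executable never references the library's symbols directly, so the blocklist can be installed before the library is loaded.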
Comment 68 :Ehsan Akhgari 2012-04-20 10:00:11 PDT
(In reply to Mike Hommey [:glandium] from comment #66)
> (In reply to Ehsan Akhgari [:ehsan] from comment #51)
> > We can definitely move it to firefox.exe if we want to!
> 
> Note that, if we move it to firefox.exe, we also need to move it to the
> webrt stubs.

Yeah, and also thunderbird.exe and seamonkey.exe.  But once the actual work behind this (comment 67) is done, doing this in multiple places should be easy.

> Which leads to the question: do we ensure the stubs are
> up-to-date where they are installed on user machines?

I'm not sure, Myk will probably know.
Comment 69 Myk Melez [:myk] [@mykmelez] 2012-04-20 16:28:58 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #68)
> (In reply to Mike Hommey [:glandium] from comment #66)
> > Which leads to the question: do we ensure the stubs are
> > up-to-date where they are installed on user machines?
> 
> I'm not sure, Myk will probably know.

Yup, we do ensure this.

On webapp startup, the stub locates the Firefox installation directory and checks Firefox's build ID.  If it's different from the stub's build ID (which is compiled into it when the stub is compiled with Firefox), the stub updates itself from the instance of the stub in the Firefox installation directory and then restarts.
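
The startup check described above boils down to a build ID comparison. A minimal sketch, assuming the IDs are available as plain strings; the function and enum names are made up for illustration and are not taken from the webrt code.

```cpp
#include <cassert>
#include <string>

enum class StubAction { Run, UpdateAndRestart };

// Decide what the stub should do at startup: keep running if its compiled-in
// build ID matches the installed Firefox, otherwise replace itself with the
// stub from the Firefox installation directory and restart.
StubAction DecideStubAction(const std::string& stubBuildId,
                            const std::string& firefoxBuildId) {
  assert(!stubBuildId.empty() && !firefoxBuildId.empty());
  return stubBuildId == firefoxBuildId ? StubAction::Run
                                       : StubAction::UpdateAndRestart;
}
```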
Comment 70 Nick Cameron [:nrc] 2012-04-30 17:06:58 PDT
Avast have apparently fixed their bug and pushed to users. I will test this later today and report back...
Comment 71 Nick Cameron [:nrc] 2012-04-30 21:50:01 PDT
Tested with FF 12 and 15 nightly, and seems to be fixed. I've tested only with the Ant extension as this is the only way I could recreate the crash. It required updating Avast. I'd be interested to hear if anyone else can still recreate the crash.
Comment 72 Nick Cameron [:nrc] 2012-04-30 21:57:01 PDT
I've started a new bug, bug 750601, for strengthening the dll blocker.

I'll mark this bug resolved, but please re-open if you can reproduce this crash after updating Avast.
Comment 73 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-02 22:34:40 PDT
The patch for SetUnhandledExceptionFilter is in bug 747213.
