Closed Bug 723447 Opened 12 years ago Closed 8 years ago

Crash in ChildProcessInfo::GenerateRandomChannelID @ rand_s with McAfee Host Intrusion Prevention

Categories

(Core :: IPC, defect)

x86
Windows XP
defect
Not set
critical

RESOLVED DUPLICATE of bug 1167248

People

(Reporter: scoobidiver, Unassigned)

References

Details

(Keywords: crash)

Crash Data

It's the #51 top crasher in the first days of 10.0.
It happens almost exclusively on Windows XP.

Signature 	rand_s
UUID	9d604b44-dcf1-4930-bec1-3d9fa2120202
Date Processed	2012-02-02 10:00:47
Uptime	2126
Last Crash	5.9 days before submission
Install Age	35.4 minutes since version was first installed.
Install Time	2012-02-02 09:25:07
Product	Firefox
Version	10.0
Build ID	20120129021758
Release Channel	release
OS	Windows NT
OS Version	5.1.2600 Service Pack 3
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 23 stepping 10
Crash Reason	EXCEPTION_NONCONTINUABLE_EXCEPTION
Crash Address	0x0
App Notes 	
AdapterVendorID: 8086, AdapterDeviceID: 2a42, AdapterSubsysID: 01391025, AdapterDriverVersion: 6.14.10.5102
Has dual GPUs. GPU #2: AdapterVendorID2: 8086, AdapterDeviceID2: 2582, AdapterSubsysID2: 0000000c, AdapterDriverVersion2: 6.14.10.4267
D3D10 Layers? D3D10 Layers-
D3D9 Layers? D3D9 Layers-
EMCheckCompatibility	True

Frame 	Module 	Signature 	Source
0 	ntdll.dll 	KiFastSystemCallRet 	
1 	ntdll.dll 	ZwWaitForSingleObject 	
2 	kernel32.dll 	WaitForSingleObjectEx 	
3 	kernel32.dll 	WaitForSingleObject 	
4 	xul.dll 	google_breakpad::ExceptionHandler::WriteMinidumpOnHandlerThread 	toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc:764
5 	xul.dll 	google_breakpad::ExceptionHandler::HandleInvalidParameter 	toolkit/crashreporter/google-breakpad/src/client/windows/handler/exception_handler.cc:619
6 	msvcr80.dll 	rand_s 	f:\dd\vctools\crt_bld\self_x86\crt\src\rand_s.c:86
7 	xul.dll 	`anonymous namespace'::RandUint32 	ipc/chromium/src/base/rand_util_win.cc:16
8 	xul.dll 	base::RandUint64 	ipc/chromium/src/base/rand_util_win.cc:25
9 	xul.dll 	base::RandInt 	ipc/chromium/src/base/rand_util.cc:20
10 	xul.dll 	ChildProcessInfo::GenerateRandomChannelID 	ipc/chromium/src/chrome/common/child_process_info.cc:58
11 	xul.dll 	ChildProcessHost::CreateChannel 	ipc/chromium/src/chrome/common/child_process_host.cc:78
12 	xul.dll 	mozilla::ipc::GeckoChildProcessHost::InitializeChannel 	ipc/glue/GeckoChildProcessHost.cpp:350
13 	xul.dll 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:318
14 	xul.dll 	MessageLoop::DeferOrRunPendingTask 	ipc/chromium/src/base/message_loop.cc:326
15 	xul.dll 	MessageLoop::DoWork 	ipc/chromium/src/base/message_loop.cc:426
16 	xul.dll 	base::MessagePumpForIO::DoRunLoop 	ipc/chromium/src/base/message_pump_win.cc:462
17 	xul.dll 	base::MessagePumpWin::RunWithDispatcher 	ipc/chromium/src/base/message_pump_win.cc:53
18 	xul.dll 	base::MessagePumpWin::Run 	ipc/chromium/src/base/message_pump_win.h:78
19 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
20 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
21 	xul.dll 	base::Thread::ThreadMain 	ipc/chromium/src/base/thread.cc:156
22 	xul.dll 	`anonymous namespace'::ThreadFunc 	ipc/chromium/src/base/platform_thread_win.cc:26
23 	kernel32.dll 	BaseThreadStart 	


It's correlated with McAfee Host Intrusion Prevention:
* Firefox 9.0.1:
  rand_s|EXCEPTION_NONCONTINUABLE_EXCEPTION (63 crashes)
    100% (63/63) vs.   0% (101/116098) HcThe.dll
    100% (63/63) vs.   0% (101/116098) HIPHandlers.dll
    100% (63/63) vs.   0% (162/116098) HcApi.dll
* Firefox 10.0:
  rand_s|EXCEPTION_NONCONTINUABLE_EXCEPTION (13 crashes)
    100% (13/13) vs.   0% (19/13378) HcThe.dll
    100% (13/13) vs.   0% (19/13378) HIPHandlers.dll
    100% (13/13) vs.   0% (20/13378) HcApi.dll
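
The "0%" figures above are rounded, which hides how rare these modules are overall. The following is a minimal Python sketch that recomputes the exact shares from one correlation line. It assumes that the first ratio counts crashes with this signature that had the module loaded and that the second ratio counts all crashes in the same sample; the regular expression and the name `parse_correlation` are illustrative and not part of any crash-stats tooling.

```python
import re
from fractions import Fraction

# Assumed layout of one correlation line, based on the lines quoted above:
#   "100% (63/63) vs.   0% (101/116098) HcThe.dll"
LINE_RE = re.compile(
    r"\s*\d+% \((\d+)/(\d+)\) vs\.\s+\d+% \((\d+)/(\d+)\) (\S+)"
)


def parse_correlation(line):
    """Return (module, share among signature crashes, share among all crashes)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized correlation line: {line!r}")
    sig_hits, sig_total, all_hits, all_total, module = match.groups()
    return (
        module,
        Fraction(int(sig_hits), int(sig_total)),
        Fraction(int(all_hits), int(all_total)),
    )


if __name__ == "__main__":
    sample = "    100% (63/63) vs.   0% (101/116098) HcThe.dll"
    module, in_signature, overall = parse_correlation(sample)
    # Print the unrounded shares so the "0%" column becomes meaningful.
    print(f"{module}: {float(in_signature):.1%} of rand_s crashes, "
          f"{float(overall):.3%} of all crashes")
```

Under that reading, HcThe.dll is present in every rand_s crash on 9.0.1 but in fewer than 0.1% of all crashes in the sample.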

More reports at:
https://crash-stats.mozilla.com/report/list?signature=rand_s
A friend of mine has very, very common crashes here, basically every time he visits a web page. bp-62c03a92-e498-45d9-97bc-b554a2120330 is one example.

Should we blocklist? Can we contact McAfee?
(In reply to Joe Drew (:JOEDREW!) from comment #1)
> Should we blocklist?
Certainly not, as it doesn't meet the criteria (only about 700 crashes a week):
https://wiki.mozilla.org/Blocklisting#A_High_Bar

> Can we contact McAfee?
Yes. It affects version 8:
    100% (101/101) vs.   0% (162/129384) HcThe.dll (8.0.0.1741)
I strongly suspect that this issue is the one I've been trying to isolate off and on for the last few weeks.

Specifically, I manage a number of Windows XP machines that are currently running Firefox 3.6.28 and I've been packaging and preparing an update to Firefox 10.0.3esr.  The observed behavior is that Firefox crashes almost immediately upon trying to access a webpage that uses a plugin (e.g. Adobe Flash, Shockwave, Oracle (née Sun) Java, etc.).  At first I thought it might be a graphics driver issue since I do 99% of my testing in VMware VMs.  But then I got it to reproduce on a notebook.  Then I thought it might be related to the McAfee ScriptScan extension (also known as IDS_SS_NAME IDS_SS_VERSION because McAfee hosed up the install.rdf file), but that turned out to be a red herring (disabling the extension appeared to fix it once on one machine, but I think that was a fluke).  I do not have control over the McAfee install on the machines I manage - ePO (the centralized McAfee management system) is handled by another team, so I can't remove ScriptScan, but I had been using extensions.autoDisableScopes set to 3 in order to get ScriptScan to enable silently, so I went to using extensions.enabledScopes set to 5 in order to get Firefox to completely ignore ScriptScan.  I could verify that Firefox wasn't even scanning for the McAfee ScriptScan and .NET Framework 3.5 extensions, but the crashes still persisted.

On a whim, I tried disabling the HIPS "Host IPS".  The ePO policy on our machines gives users the ability to disable the McAfee HIPS Host IPS, Network IPS, and/or Firewall - they stay disabled until the user turns them back on or until the next time ePO does a policy enforcement, which is every hour, so it could be anywhere from 1 to 3599 seconds, but it's useful for testing.  Voila!  The crashes completely disappeared.  Turn it back on, and Firefox starts crashing the minute I hit the Adobe Flash test page.  Turn it back off, rock solid.  Each test run involved running Firefox and accessing the Adobe Flash page - I did this five times on two VMs for a total of ten executions per configuration.  Here were the results:
* With Host IPS turned on: 2 successful Flash loads, 8 Firefox crashes
* With Host IPS turned off: 10 successful Flash loads
* With Host IPS turned back on: 0 successful Flash loads, 10 Firefox crashes
* With Host IPS turned back off: 10 successful Flash loads

Even though it doesn't crash 100% of the time, I think there's enough statistical evidence there to point the finger pretty aggressively at HIPS.
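
To put a rough number on that evidence, here is a minimal Python sketch of a one-sided Fisher exact test over the pooled runs above (Host IPS on: 18 crashes and 2 successful loads; Host IPS off: 0 crashes and 20 successful loads). The function name `fisher_one_sided` is illustrative, and the test assumes the runs are independent, which is only approximately true since they were repeated on the same two VMs.

```python
from math import comb


def fisher_one_sided(crash_on, ok_on, crash_off, ok_off):
    """P(at least crash_on crashes fall in the 'on' group) if Host IPS had no effect."""
    n_on = crash_on + ok_on
    n_off = crash_off + ok_off
    crashes = crash_on + crash_off
    total = n_on + n_off
    denom = comb(total, crashes)
    upper = min(n_on, crashes)
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    return sum(
        comb(n_on, k) * comb(n_off, crashes - k)
        for k in range(crash_on, upper + 1)
    ) / denom


if __name__ == "__main__":
    # Pooled counts from the four result lines above.
    p_value = fisher_one_sided(crash_on=18, ok_on=2, crash_off=0, ok_off=20)
    print(f"one-sided p-value: {p_value:.3g}")
```

A p-value this small means the split between the two configurations is very unlikely to be chance, which supports the conclusion even though the crash does not occur on every run.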

The machines I manage are running HIPS 8.0.0.1741 with IPSVer 8.0.0.4254.  The ePO/HIPS team has been rolling out HIPS 8.0.0.1981 to other machines in the company, but they haven't gotten to the machines I manage.  I will work with the ePO/HIPS people on Monday in order to get HIPS 8.0.0.1981 updated on a test VM so I can see if that resolves the Firefox issue, and I will update this thread with the results.  Given the difficulty I've had in identifying the root of this problem, I wanted to get this post out in case others were running into the same issue.

Here's the bad news: HIPS Host IPS isn't an extension and it's not a plugin, so Firefox has no control over it.  My understanding is that it hooks the processes somehow at startup, so you can't blocklist it.

If the more recent updates to HIPS resolve the issue, then we just need to make sure McAfee has a KB article about the issue out where people can find it.  If the more recent updates to HIPS do not resolve the issue, then I will have the ePO/HIPS team escalate this to McAfee immediately (since we have a very short window to get 10.0.3esr deployed before 3.6.* goes out of support) and keep my fingers crossed that there is some sort of ePO policy setting available that can disable Host IPS on a per-process basis.
I was not able to get our ePO/HIPS team to update HIPS on test VMs today, so I was unable to test whether 8.0.0.1981 resolves the issue.

I did, however, do some further testing with 8.0.0.1741.  I was able to demonstrate at least one installation of Firefox 10.0.3esr to a machine where, despite repeated reboots, Firefox 10.0.3esr resolutely refused to crash even with Host IPS turned on.  For what it's worth, I'm deploying Firefox using ZIP files combined with registry files.  I reset the VM back to a snapshot taken from this morning and re-deployed the Firefox package and, voila, crashes left and right so long as Host IPS was enabled.  This reinforces the observation that the problem is definitely non-deterministic.

That said, further testing demonstrated (with a fairly high reliability) the following:
* Having Host IPS turned off when Firefox starts is sufficient to resolve the issue.  That is to say that the following does not crash: Turn Host IPS off, launch Firefox, turn Host IPS on, access the Flash test page.
* Having Host IPS turned off when accessing the first plugin page is sufficient to resolve the issue.  That is to say that the following does not crash: Turn Host IPS on, launch Firefox, turn Host IPS off, access the Flash test page.
* Having Host IPS turned off after Firefox starts but turning it back on before accessing the first plugin page does not resolve the issue.  That is to say that the following does crash: Turn Host IPS on, launch Firefox, turn Host IPS off, turn Host IPS on, access the Flash test page (boom - instacrash).

So, Host IPS must be both on when Firefox starts and when the first plugin page launches in order to get a crash out of Firefox.
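
That rule can be written down as a small predicate and checked against the sequences described above. This is only a sketch of the observed behavior in Python; the names `expected_to_crash` and `OBSERVED` are illustrative, and the rule is an empirical summary of these tests, not a description of what HIPS does internally.

```python
def expected_to_crash(ips_on_at_launch, ips_on_at_first_plugin_load):
    """Empirical rule from the tests above: both conditions must hold."""
    return ips_on_at_launch and ips_on_at_first_plugin_load


# (Host IPS on at launch, Host IPS on at first plugin load, crash observed)
OBSERVED = [
    (False, True, False),  # off, launch, on, load Flash page: no crash
    (True, False, False),  # on, launch, off, load Flash page: no crash
    (True, True, True),    # on, launch, off, on, load Flash page: crash
]

if __name__ == "__main__":
    for at_launch, at_load, crashed in OBSERVED:
        verdict = "matches" if expected_to_crash(at_launch, at_load) == crashed else "differs"
        print(f"launch={at_launch} plugin_load={at_load} crashed={crashed}: {verdict}")
```

Note that the third sequence shows that toggling Host IPS off and back on between launch and the first plugin load does not change the outcome.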

In case anyone is curious, here's a list of 12 crashes generated during testing (in order of occurrence):

https://crash-stats.mozilla.com/report/index/923d0f1a-05fe-470e-9a8a-226012120409
https://crash-stats.mozilla.com/report/index/0fb59530-9027-4542-9958-bee1a2120409
https://crash-stats.mozilla.com/report/index/a3378757-fcdb-4960-950d-689a22120409
https://crash-stats.mozilla.com/report/index/ec0dcbe2-a501-46bf-8be5-8452f2120409
https://crash-stats.mozilla.com/report/index/b676ceb5-abbe-42d5-a663-f47592120409
https://crash-stats.mozilla.com/report/index/b685bb5b-3aab-4667-841b-133012120409
https://crash-stats.mozilla.com/report/index/9f85e3a8-4df5-4d21-bfbf-d10182120409
https://crash-stats.mozilla.com/report/index/49fdf349-da36-47f3-be16-f43662120409
https://crash-stats.mozilla.com/report/index/921733b1-7b48-4392-a3e5-8b0172120409
https://crash-stats.mozilla.com/report/index/04db6f8a-06a7-4ff2-ad9b-1de642120409
https://crash-stats.mozilla.com/report/index/6c992610-1d5f-44aa-b9fc-389df2120409
https://crash-stats.mozilla.com/report/index/3d0ad680-5dc8-4eb8-96cf-7a8002120409
I did finally manage to get some testing done with 8.0.0.1919 and 8.0.0.1981 and here's what I found.

8.0.0.1919 (AKA 8.0.0 Patch 1, a publicly released update for HIPS, see the Release Notes at https://kc.mcafee.com/corporate/index?page=content&id=PD23514) does NOT resolve the issue.
8.0.0.1981 (AKA 8.0.0 Hotfix 712198, a non-public hotfix that installs on top of 8.0.0.1919) DOES resolve the issue.

The readme for 8.0.0.1981 (which I could not find anywhere on McAfee's public website) mentions "third-party applications", "process injection", and "Kernel32.dll", all of which match up nicely with the issue seen above.  Furthermore, installing it appears to affect HcApi.dll and HcThe.dll, both of which are referenced above.

Finally, during additional testing, I was able to confirm that disabling the "Buffer Overflow" engine (one of eight HIPS engines used) did appear to resolve the issue (at least under 8.0.0.1741).  Note that 8.0.0.1919 (which didn't resolve the issue) does not update HcApi.dll or HcThe.dll.

So, to recap, users with HIPS and Firefox 10 have two options:
* Update their machines to 8.0.0.1919, then get the hotfix (8.0.0.1981) from McAfee tech support and deploy it.
* Modify their ePO policy to disable the "Buffer Overflow" engine while waiting for the above (or a subsequent patch release that incorporates the hotfix).

Hopefully this will help someone!
See Also: → 951827
See Also: → 1167248
I think this McAfee Host Intrusion Prevention crash is an instance of the more general rand_s crash, bug 1167248.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE