Closed Bug 1131180 Opened 9 years ago Closed 8 years ago

crash in free_impl | js::ctypes::CData::Finalize with McAfee SiteAdvisor

Categories

(External Software Affecting Firefox :: Other, defect)

All
Windows NT
defect
Not set
critical

Tracking

(firefox38 unaffected, firefox39 wontfix, firefox48+ fixed, firefox49+ fixed)

RESOLVED WORKSFORME
Tracking Status
firefox38 --- unaffected
firefox39 --- wontfix
firefox48 + fixed
firefox49 + fixed

People

(Reporter: lizzard, Unassigned)

References

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-26609920-5b48-4d0a-8d0b-c6fdf2150208.
=============================================================
This crash signature is rising on Firefox 38 right now. It's at #22 in the topcrash list, and it only affects Firefox 38 and to a lesser extent, 37.  I am filing it now thought it's not in the top 10 because often it appears to be a startup crash. 


More reports: https://crash-stats.mozilla.com/report/list?product=Firefox&range_value=3&range_unit=days&date=2015-02-09&signature=free_impl+|+js%3A%3Actypes%3A%3ACData%3A%3AFinalize&version=Firefox%3A38.0a1#tab-reports

Crashing thread:

0 	mozglue.dll 	free_impl 	memory/build/replace_malloc.c
1 	xul.dll 	js::ctypes::CData::Finalize 	js/src/ctypes/CTypes.cpp
2 	xul.dll 	FinalizeArenas 	js/src/jsgc.cpp
3 	xul.dll 	js::gc::ArenaLists::forceFinalizeNow(js::FreeOp*, js::gc::AllocKind, js::gc::ArenaLists::KeepArenasEnum, js::gc::ArenaHeader**) 	js/src/jsgc.cpp
4 	xul.dll 	js::gc::ArenaLists::queueForegroundObjectsForSweep(js::FreeOp*) 	js/src/jsgc.cpp
5 	xul.dll 	js::gc::GCRuntime::beginSweepingZoneGroup() 	js/src/jsgc.cpp
6 	xul.dll 	js::gc::GCRuntime::beginSweepPhase(bool) 	js/src/jsgc.cpp
7 	xul.dll 	js::gc::GCRuntime::incrementalCollectSlice(js::SliceBudget&, JS::gcreason::Reason) 	js/src/jsgc.cpp
8 	xul.dll 	js::gc::GCRuntime::gcCycle(bool, js::SliceBudget&, JS::gcreason::Reason) 	js/src/jsgc.cpp
9 	xul.dll 	js::gc::GCRuntime::collect(bool, js::SliceBudget, JS::gcreason::Reason) 	js/src/jsgc.cpp
10 	xul.dll 	js::gc::GCRuntime::gc(JSGCInvocationKind, JS::gcreason::Reason) 	js/src/jsgc.cpp
11 	xul.dll 	nsXREDirProvider::DoShutdown() 	toolkit/xre/nsXREDirProvider.cpp
12 	xul.dll 	ScopedXPCOMStartup::~ScopedXPCOMStartup() 	toolkit/xre/nsAppRunner.cpp
13 	xul.dll 	ScopedXPCOMStartup::`scalar deleting destructor'(unsigned int) 	
14 	xul.dll 	XREMain::XRE_main(int, char** const, nsXREAppData const*) 	toolkit/xre/nsAppRunner.cpp
Crash Signature: [@ free_impl | js::ctypes::CData::Finalize] → [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc]
I assume mccr8's choice of CC are the people who would be knowledgeable here :)

This has been top-10 on nightly for a while (though now with the new je_arena_salloc variant). As far as I can tell this began in 20141105030203. Does anything in here look suspicious? https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5dde8ea48fef&tochange=62990ec7ad78

Oh, and it's never been seen anywhere outside of nightly. Does that help?
Flags: needinfo?(sphink)
Flags: needinfo?(jcoppeard)
Windows only, and the vast majority of the crash addresses are 0x4000000 or 0x4100000.

The crash is from CData::Finalize() trying to free a pointer to a buffer allocated in CData::Create().  I can't see a way this can be changed after initialisation, so this is pretty strange.
(In reply to David Major [:dmajor] (UTC+13) from comment #2)
> As far as I can tell this began in 20141105030203.  Does anything in here look suspicious?

Nothing jumps out.  Enabling PGO on Windows (bug 1084162) might be responsible, or any heap corruption bug could cause this.

Although it shows up during GC, it's unlikely to be related to GC itself.
(In reply to Jon Coppeard (:jonco) from comment #3)
> Windows only, and the vast majority of the crash addresses are 0x4000000 or
> 0x4100000.

These addresses are interesting! I assume the "ends with five zeros" property is because that's jemalloc converting the pointer to the chunk that contains it.

On 32-bit, the actual address that we're trying to free is almost always 0x46206d61. It's super suspicious that the pointer is so consistent. It looks vaguely like ASCII text but nothing rings a bell.

On 64-bit, a bunch of pointers look like: 0x0000000004090409, 0x0000000004010401, 0x0000000004190419. Those looks suspiciously like LCIDs (e.g. 0x409 == en-US).
Crash Signature: [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] → [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlotsTask::run() ]
sfink, jonco - any ideas? Does comment 5 help at all?
Crash Signature: [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlotsTask::run() ] → [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlotsTask::run() ] [@ je_free | js::ctypes::CData::Finalize]
Flags: needinfo?(sphink)
Flags: needinfo?(jcoppeard)
Crash Signature: [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlotsTask::run() ] [@ je_free | js::ctypes::CData::Finalize] → [@ free_impl | js::ctypes::CData::Finalize] [@ je_arena_salloc] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlotsTask::run() ] [@ je_free | js::ctypes::CData::Finalize] [@ je_arena_salloc | ifree | moz_free | js::Nursery::FreeHugeSlot…
[Tracking Requested - why for this release]:
data from 48.0a2 and 48.0b1 suggests that this crash signature is getting more common this cycle again. it's the #5 crasher in early 48.0b1 crash data.
For crashes on 48 Beta the crashing address is either 0x46200000 (most of them) or 0x0 (a few).

> On 32-bit, the actual address that we're trying to free is almost always
> 0x46206d61. It's super suspicious that the pointer is so consistent. It
> looks vaguely like ASCII text but nothing rings a bell.

It would be "am F" as an ASCII string (because x86 is little-endian) but I looked at the contents of memory and it's not part of some larger ASCII string.
the crashes on beta have an interesting correlation to the mcafee siteadvisor addon being installed:

  je_free | js::ctypes::CData::Finalize|EXCEPTION_ACCESS_VIOLATION_READ (140 crashes)
     99% (139/140) vs.   2% (254/13660) {4ED1F68A-5463-4931-9384-8FFF5ED91D92}
See Also: → 1281610
See Also: → 1277897
There's a considerable number of startup crashes that could be related to McAfee (bug 1277897).
Component: JavaScript: GC → Other
Product: Core → External Software Affecting Firefox
Summary: crash in free_impl | js::ctypes::CData::Finalize → crash in free_impl | js::ctypes::CData::Finalize with McAfee SiteAdvisor
Version: 38 Branch → unspecified
Depends on: 1286368
i could reproduce the crashes with mcafee siteadvisor 3.7 (it's a perma-crash at startup) and 48.0b, so i filed bug 1286368 to get older versions of the addon blocklisted.
Too late for 48, let's try to fix it for 49
We are in touch with MacAfee.
Marco could share an update about it
Flags: needinfo?(mcastelluccio)
As Marco is in pto and this bug affects the release, I reached out to our contact at Intel.

They say that it impacts old version of MacAfee. They are investigation to push an update to those users.
Flags: needinfo?(mcastelluccio)
Peter, Benjamin, I would like us to consider blocking the old version of MacAfee which are breaking Firefox 48. We are in touch with them. Outside of this bug, I can share more details if needed.
Flags: needinfo?(pdolanjski)
Flags: needinfo?(benjamin)
What information do you need from me?
Flags: needinfo?(benjamin)
If you would agree on adding this old version of MacAfee to the blocklist. This is a bad bug and if we have the capability to mitigate the risk before we push 47 to a bigger audience, that would be great.
Flags: needinfo?(benjamin)
That's not my decision. I'm not even sure if you're talking about the addon blocklist or the DLL blocklist. If it's the DLL blocklist, I'm happy to write the patch, but I need to know the DLL name and version numbers to block.
Flags: needinfo?(benjamin)
Not sure either, I asked Intel earlier today
The crash with this signature is a GC-related crash due to the McAfee SiteAdvisor addon, the crash in bug 1277897 is happening in a McAfee DLL (but I think it is still due to the addon, which is using the DLL via js-ctypes). Intel is working on giving us symbols for the DLL to make the crashes tracked by bug 1277897 easier to debug.

Most of the crashes with this signature are happening with addon versions < 4.0.0, which should have been blocked by bug 1286368. I've asked Jorge and Andrew to verify that the block is actually working, since we're still seeing crashes with older versions.
Removing NI since block is being implemented in child bug.
Flags: needinfo?(pdolanjski)
as i've written in the other bug as well, one reason for this crash spiking now after the 48.0 update appears to be that old versions of the siteadvisor extension were force-disabled prior to 48.0 because they were not signed, but curiously in 48.0 those got re-enabled again & there is no hint anymore about the missing signature (i'm not sure what has changed in 48 in this regard).

since the addon blocklist patch in bug 1286368 doesn't seem to work reliably, we might want to try targeting the NPMcFFPlg32.dll file through the windows dll blocklist...
today is the first day with module correlations available for 48.0:
  je_free | js::ctypes::CData::Finalize|EXCEPTION_ACCESS_VIOLATION_READ (120 crashes)
    100% (120/120) vs.   2% (546/22761) NPMcFFPlg32.dll
         12% (14/120) vs.   0% (14/22761) 3.6.5.118
          8% (9/120) vs.   0% (9/22761) 3.6.6.110
          2% (2/120) vs.   0% (2/22761) 3.6.6.112
         32% (38/120) vs.   0% (39/22761) 3.7.0.128
          9% (11/120) vs.   0% (11/22761) 3.7.1.110
          4% (5/120) vs.   0% (5/22761) 3.7.1.118
         13% (15/120) vs.   0% (15/22761) 3.7.1.120
          1% (1/120) vs.   0% (1/22761) 3.7.2.106
          2% (2/120) vs.   0% (2/22761) 3.7.2.116
         14% (17/120) vs.   0% (17/22761) 3.7.2.117
          4% (5/120) vs.   0% (6/22761) 3.7.2.120
          0% (0/120) vs.   0% (2/22761) 3.7.2.210
          0% (0/120) vs.   0% (2/22761) 3.7.2.223
          0% (0/120) vs.   0% (28/22761) 3.7.2.227
          0% (0/120) vs.   0% (1/22761) 4.0.0.219
          1% (1/120) vs.   0% (1/22761) 4.0.1.124
          0% (0/120) vs.   0% (1/22761) 4.0.1.167
          0% (0/120) vs.   0% (1/22761) 4.0.1.203
          0% (0/120) vs.   0% (5/22761) 4.0.1.207
          0% (0/120) vs.   0% (1/22761) 4.0.2.183
          0% (0/120) vs.   0% (4/22761) 4.0.2.187
          0% (0/120) vs.   0% (14/22761) 4.0.2.189
          0% (0/120) vs.   0% (5/22761) 4.0.2.192
          0% (0/120) vs.   0% (12/22761) 4.0.2.198
          0% (0/120) vs.   0% (2/22761) 4.0.3.220
          0% (0/120) vs.   0% (89/22761) 4.0.3.227
          0% (0/120) vs.   1% (257/22761) 4.0.3.233
I believe this issue is due to the addon itself and not the DLL, I've asked more details to McAfee and I'm awaiting a reply.

In any case, I've pushed a patch to add the DLL to the DLL blocklist (https://treeherder.mozilla.org/#/jobs?repo=try&revision=4803e8f47e2d), so we can test if blocking the DLL fixes the problem for you.
Flags: needinfo?(madperson)
We have two possible solutions at the moment:
1) Fix the bug that is preventing the addon to be blocked because it is unsigned (Mossop is investigating this: https://bugzilla.mozilla.org/show_bug.cgi?id=1286368#c33).
2) Block the DLL (philipp is going to test if this fixes the crash with a custom build, https://treeherder.mozilla.org/#/jobs?repo=try&revision=279e80cee923).

The addon blocklist bug (bug 1294439) is not a viable option: https://bugzilla.mozilla.org/show_bug.cgi?id=1294439#c4.
unfortunately it seems that blocklisting NPMcFFPlg32.dll doesn't solve the crashes on startup. this is a report triggered from the try build of comment #26:
https://crash-stats.mozilla.com/report/index/5ac968ec-3a70-4e25-90d4-7ec492160811
Flags: needinfo?(madperson)
(In reply to Marco Castelluccio [:marco] from comment #26)
> We have two possible solutions at the moment:
> 1) Fix the bug that is preventing the addon to be blocked because it is
> unsigned

this is now dealt with in bug 1294483
Depends on: 1294483
The remaining crashes with this signature on 48.0.1 are not correlated to the McAfee SiteAdvisor addon.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.