Closed Bug 1760668 Opened 2 years ago Closed 1 year ago

Socket process crashes in Symantec's IPSEng64 / IPSEng32

Categories

(External Software Affecting Firefox :: Other, defect)

Desktop
Windows
defect

Tracking

(firefox108 wontfix, firefox109 fixed)

RESOLVED FIXED

People

(Reporter: jimm, Assigned: gstoll)

Attachments

(3 files)

We tried to address this in bug 1743427, but that doesn't seem to be helping. We're still seeing these crashes in background threads at pretty high volume.

https://www.mathies.com/mozilla/crashes/sockrelease.html

The fix from bug 1743427 (pre-spawn CIG) is currently limited to early beta and earlier due to compat risks (bug 1704373).

The severity field is not set for this bug.
:haik, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(haftandilian)

Some of these crashes are showing up on Socorro too.

Crash Signature: [@ ipseng32.dll | BaseThreadInitThunk]
Severity: -- → S3
Flags: needinfo?(haftandilian)

Broadcom told us they believe this problem was fixed in version 17.2.7.51 of ipseng64.dll/ipseng32.dll.

Crashstats data support this. At this time, there aren't any instances of socket process crashes with version 17.2.7.51 or later of the 32-bit or 64-bit DLLs in the stack. Some crashes are showing up for other processes (main and content) with the new version.

Over the last two weeks, module ping telemetry shows 17.2.7.51 as the latest and most popular version of the DLL: for each of the 64-bit and 32-bit DLLs, it is loaded about 4x as often as the next most recent version. Going back 4 weeks shows the new version being loaded about as often as the next most recent, indicating that adoption of the newest version is ramping up.

Hence we can expect this problem to go away over time as the later DLL version adoption continues. I'll take more time to look at the crashes in other processes to determine if a new bug should be filed for those.

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 5 socket and utility process crashes on release

:haik, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(haftandilian)
Keywords: topcrash

Changing to S2 due to this being a top crasher for the socket process.

Assignee: nobody → gstoll
Severity: S3 → S2
Flags: needinfo?(haftandilian)

We can see from this query that more than 99% of these crashes for IPSEng64 are from versions before 17.2.7.51, and similarly for IPSEng32. So it looks like this issue is still fixed in later versions, but for some reason many people still have older versions of this DLL.

I'm not sure why this would be popping up again. I'll work on a query to see the usage of these versions over time; perhaps Symantec pushed out something recently that caused older versions to get installed?

Here's a view of the most common IPSEng64 versions over the last 7 days - looks like around 70% of clients are on version 17.2.7.51 or above. I think this is a good candidate for blocklisting versions before the fix.
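
The share quoted above comes down to splitting per-version client counts at the fixed version. Here is a minimal sketch of that calculation, assuming the telemetry result has already been reduced to a mapping from version string to client count. The function names, the version strings other than 17.2.7.51, and all of the counts are made up for illustration and are not the real query output.

```python
FIXED_VERSION = "17.2.7.51"  # first version Broadcom believes contains the fix


def parse_version(text):
    """Turn a dotted version such as '17.2.7.51' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def share_at_or_above(counts, threshold=FIXED_VERSION):
    """Fraction of clients whose DLL version is >= threshold."""
    limit = parse_version(threshold)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    fixed = sum(n for version, n in counts.items()
                if parse_version(version) >= limit)
    return fixed / total


# Illustrative counts only, not real telemetry.
example_counts = {
    "17.2.7.51": 600,
    "17.2.8.3": 100,
    "17.2.6.18": 200,
    "17.1.3.8": 100,
}
print(f"{share_at_or_above(example_counts):.0%} on {FIXED_VERSION} or later")
```

Comparing parsed tuples rather than raw strings matters here: as strings, "17.2.7.100" would sort before "17.2.7.51", which would put a newer build on the wrong side of the threshold.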

Here is the DLL blocklist questionnaire:

  1. How were we aware of the problem?
    Showed up as a topcrasher for the socket process.

  2. What is a suspicious product causing the problem?
    Symantec Endpoint Protection - Intrusion Detection

  3. Is the product downloadable? If so, do we have a local repro?
    The product does not seem to be downloadable. It was in the past, but we were not able to reproduce the crash (see comment 9 on bug 1743427). Since we believe the issue has been fixed in more recent versions, it seems even more unlikely that we would be able to reproduce it now.

  4. Which OS versions does the problem occur on?
    According to telemetry this is affecting all versions of Windows, with Windows 10 at around 86% of crashes.

  5. Which process types does the problem occur on?
    This seems to only happen in the socket process. Unfortunately the blocklist does not currently support blocking a module only in the socket process. I plan on adding this functionality and only blocking this module in the socket process.

  6. What is the maximum version of the module in the crash reports?
    Of the ~24K reports, only 2 have a version higher than 17.2.7.51.

  7. Is the issue fixed by a newer version of the product?
    Yes, Broadcom believes this was fixed in version 17.2.7.51, and this is borne out by the crash reports.

  8. Do we have data about the module in the third-party-module ping?
    Yes.

  9. Do we know how the module is loaded?
    (NOTE: I believe this is true, but it would be nice to have confirmation that I'm interpreting this correctly) According to telemetry this is not a dependent module since is_dependent is false.

  10. Describe your conclusion.
    We should block ipseng64.dll and ipseng32.dll versions prior to 17.2.7.51 in the socket process only.
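
The conclusion combines two conditions: the DLL version is older than the fix, and the current process is the socket process. A minimal sketch of such a check follows. The `BlockRule` type, the process-type strings, `pack_version`, and `should_block` are hypothetical names for illustration and are not Firefox's actual blocklist code. Packing the four version fields into one integer is an assumption about how the versions could be compared, and 17.2.6.18 is just an example of an older version.

```python
from dataclasses import dataclass


def pack_version(major, minor, build, revision):
    """Pack four 16-bit version fields into one comparable integer."""
    for field in (major, minor, build, revision):
        if not 0 <= field <= 0xFFFF:
            raise ValueError(f"version field out of range: {field}")
    return (major << 48) | (minor << 32) | (build << 16) | revision


@dataclass(frozen=True)
class BlockRule:
    dll_name: str              # lowercase file name
    first_fixed_version: int   # packed; anything older is blocked
    process_types: frozenset   # processes the rule applies to


# Hypothetical rules mirroring the conclusion above.
FIXED = pack_version(17, 2, 7, 51)
RULES = [
    BlockRule("ipseng64.dll", FIXED, frozenset({"socket"})),
    BlockRule("ipseng32.dll", FIXED, frozenset({"socket"})),
]


def should_block(dll_name, version, process_type, rules=RULES):
    """Return True if this DLL version must not load in this process."""
    name = dll_name.lower()
    return any(
        rule.dll_name == name
        and process_type in rule.process_types
        and version < rule.first_fixed_version
        for rule in rules
    )


old = pack_version(17, 2, 6, 18)
print(should_block("IPSEng64.dll", old, "socket"))    # old version, socket process
print(should_block("IPSEng64.dll", old, "content"))   # same DLL, other process
print(should_block("IPSEng64.dll", FIXED, "socket"))  # fixed version
```

Keeping the process types on the rule itself, rather than hard-coding the socket process in the check, lets the same table express entries that apply to several process types.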

Status: NEW → ASSIGNED

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit auto_nag documentation.

Keywords: topcrash

There doesn't seem to be a good way to directly test the blocklisting in the socket/utility process, so I manually tested it for the socket process and added these automated tests to check that DLLs blocklisted in the socket/utility process can still load in other processes.

Depends on D160586

Pushed by gstoll@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c9a2bdc74bb5
part 1: add ability to blocklist DLLs in socket process. r=gerard-majax
https://hg.mozilla.org/integration/autoland/rev/6c4278668c2f
part 1.5: add tests for blocklists in socket and utility processes. r=handyman
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

There's a patch attached to this bug that hasn't been landed yet. Is that an oversight? Alternatively, should this bug be left open for now?

Flags: needinfo?(gstoll)

Yeah, the ability to blocklist DLLs for just the socket process went in, but actually using that blocklist to block IPSEng64/IPSEng32 has not. Reopening (and thanks!)

Status: RESOLVED → REOPENED
Flags: needinfo?(gstoll)
Resolution: FIXED → ---
See Also: → 1743427
Pushed by gstoll@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2a1fec837d80
part 2: blocklist DLL in socket process only. r=handyman

After the configuration change, I see very few crashes with newer versions, but the crash is still relatively common with older ones, so I'm going to block just the older versions.

Status: REOPENED → RESOLVED
Closed: 2 years ago → 1 year ago
Resolution: --- → FIXED

The patch landed in nightly and beta is affected.
:gstoll, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox108 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(gstoll)
Flags: needinfo?(gstoll)