Crash in ?? ?? ::FNODOBFM::`string''

RESOLVED FIXED

Status

defect
P2
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: jseward, Assigned: marco)

Tracking

({crash})

unspecified
All
Windows

Firefox Tracking Flags

(firefox-esr52 wontfix, firefox56 wontfix, firefox57 fixed, firefox58 fixed)

Details

(Whiteboard: [content process crash][AV:Avecto Privilege Guard][DLL:pghook], crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-4a23231b-f51f-4f46-9260-8b2150170728.
=============================================================

This is topcrash #19 in the Windows nightly of 20170727162801.
Per a bit of googling, the really strange "FNODOBFM" string
seems to somehow be related to msvc optimisations.
dmajor, is this Windows weirdness (FNODOBFM) known to you?
Flags: needinfo?(rjesup)
Flags: needinfo?(dmajor)
All but 2 do not have sctp* on the stack; the spike is due to crashes with TppCallbackCheckThreadAfterCallback on the stack (after BaseThreadInitThunk). So the spike is coming from somewhere else. The Tpp stack frame goes back well before 56.

Note: crashes like this appear to go back (at low frequency) to 28 or earlier.  Much older ones (in a few sampled) seem to have the weird frame called from RtlExitUserThread.

Since 54, there are a small number of crashes with sctp on the stack (and a different reason), where BaseThreadInitThunk calls user_sctp_timer_iterate and the last frame before the weird one is sowakeup (in sctp). 39 crashes in the last 6 months, all 54 aurora or later. We have not updated the sctp library since well before 54 (we should; we have been waiting on a good upstream rev to land), but there was a sec fix or two in the timer code there, perhaps in that timeframe (bwc, any thoughts?)

jimm - can you decide on a better component for this?  I suspect this has nothing or little to do with sctp
Flags: needinfo?(tuexen)
Flags: needinfo?(rjesup)
Flags: needinfo?(jmathies)
Flags: needinfo?(docfaraday)
(In reply to Julian Seward [:jseward] from comment #1)
> dmajor, is this Windows weirdness (FNODOBFM) known to you?

I wouldn't read too much into it. We crashed in some internal function that MS didn't publish symbols for, so the tools tried to guess based on the nearest visible symbol. It's the same thing that happens when we get absurdly large offsets on unhelpful function names (e.g. DllCanUnloadNow+0x56789).
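
To illustrate the guessing described above, here is a minimal sketch of nearest-symbol lookup. The symbol table, its addresses, and the helper name `nearest_symbol` are hypothetical and only meant to show why an unsymbolized internal function ends up reported as an unrelated name plus a huge offset; real symbolication tools are more involved.

```python
import bisect

# Hypothetical table of the only symbols visible for a module:
# (start address, name), sorted by address. Internal functions are absent.
SYMBOLS = [
    (0x1000, "DllGetClassObject"),
    (0x1400, "DllCanUnloadNow"),
]

def nearest_symbol(address, symbols=SYMBOLS):
    """Return 'name+0xoffset' for the closest symbol at or below address."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return hex(address)  # address lies below every known symbol
    start, name = symbols[i]
    return f"{name}+{hex(address - start)}"

# A crash deep inside an unsymbolized function is attributed to the
# last visible symbol before it, with a very large offset.
print(nearest_symbol(0x1400 + 0x56789))  # prints DllCanUnloadNow+0x56789
```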

Note that there are poison values in some registers, at least in the report at comment 0, suggesting a potential uaf. It may be worth checking to see if this happens in other reports, in case it helps narrow down the cause.
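
A rough sketch of how one might scan register dumps for poison values. It assumes the allocator fills freed memory with the byte 0xe5 (an assumption for illustration; check the allocator actually in use), and the register names and values below are made up rather than taken from any report.

```python
# Assumed poison byte written over freed heap memory (illustrative only).
POISON_BYTE = 0xE5

def looks_poisoned(value, width_bytes=8):
    """True if every byte except the two lowest equals the poison byte.

    The two low bytes are ignored so that a small offset added to a
    poisoned pointer (e.g. a field access) is still flagged.
    """
    raw = value.to_bytes(width_bytes, "big")
    return all(b == POISON_BYTE for b in raw[:-2])

# Hypothetical register dump from a crash report.
registers = {
    "rax": 0xE5E5E5E5E5E5E5E5,  # pure poison
    "rcx": 0x00007FF6DEADBEEF,  # ordinary-looking pointer
    "rdx": 0xE5E5E5E5E5E5E6A5,  # poison plus a small offset
}

suspicious = sorted(name for name, v in registers.items() if looks_poisoned(v))
print(suspicious)  # prints ['rax', 'rdx']
```

Running something like this across many reports for the same signature would show whether poison values are a consistent feature or a one-off.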
Flags: needinfo?(dmajor)
This is highly correlated with PGHook.dll, which is some sort of corporate snoop software [1], and with uBlock Origin.

[1] https://www.avecto.com/news-and-events/news/avecto-launches-privilege-guard-38

(98.73% in signature vs 00.20% overall) reason = 0xc0000710 / 0x00000000
(98.73% in signature vs 00.22% overall) Module "PGHook.dll" = true
(100.0% in signature vs 21.22% overall) moz_crash_reason = null
(100.0% in signature vs 29.70% overall) Addon "uBlock Origin" = true
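
For readers unfamiliar with the "in signature vs overall" figures above, a minimal sketch of how such a comparison can be computed. The report structure and the sample data are hypothetical; this is not the actual correlation tooling.

```python
def share(reports, predicate):
    """Percentage of reports for which predicate holds."""
    if not reports:
        return 0.0
    return 100.0 * sum(1 for r in reports if predicate(r)) / len(reports)

def correlate(all_reports, signature, predicate):
    """Return (% within the signature, % across all reports)."""
    in_sig = [r for r in all_reports if r["signature"] == signature]
    return share(in_sig, predicate), share(all_reports, predicate)

# Hypothetical sample: two reports with signature "A", two with "B".
reports = [
    {"signature": "A", "modules": {"PGHook.dll"}},
    {"signature": "A", "modules": {"PGHook.dll"}},
    {"signature": "B", "modules": set()},
    {"signature": "B", "modules": set()},
]

has_pghook = lambda r: "PGHook.dll" in r["modules"]
print(correlate(reports, "A", has_pghook))  # prints (100.0, 50.0)
```

A large gap between the two percentages is what makes a module or add-on stand out as a suspect for a given signature.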
Component: WebRTC: Networking → Other
Flags: needinfo?(jmathies)
Product: Core → External Software Affecting Firefox
Flags: needinfo?(docfaraday)
This is still a problem (#10 in the Windows nightly of 20170817100132).
It seems to have been happening more frequently since around July 20.
Seems to be a small number of installs, 9 installs most recently are affected by this - even though overall it is in the Top 5 crashes in 57.
(In reply to Marcia Knous [:marcia - use ni] from comment #6)
> Seems to be a small number of installs, 9 installs most recently are
> affected by this - even though overall it is in the Top 5 crashes in 57.

Same thing this week - Although this is the #6 browser crash, there are only 8 installs affected.
Nothing in the sctp library comes to my mind...
Flags: needinfo?(tuexen)
(In reply to Marcia Knous [:marcia - use ni] from comment #6)
> Seems to be a small number of installs, 9 installs most recently are
> affected by this - even though overall it is in the Top 5 crashes in 57.

Filtering this to the 0xc0000710 crashes (anything else would be a different root cause), every facet I can find says that these installs are actually from a single machine.

https://crash-stats.mozilla.com/search/?signature=%3D%3F%3F%20%3F%3F%20%3A%3AFNODOBFM%3A%3A%60string%27%27&reason=%3D0xc0000710%20%2F%200x00000000&version=57.0a1&product=Firefox&date=%3E%3D2017-03-15T02%3A16%3A44.000Z&date=%3C2017-09-15T02%3A16%3A44.000Z&_sort=-date&_facets=signature&_facets=reason&_facets=cpu_name&_facets=cpu_info&_facets=version&_facets=bios_manufacturer&_facets=adapter_vendor_id&_facets=adapter_device_id&_facets=adapter_subsys_id&_facets=adapter_driver_version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports
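
A sketch of how a query like the one above can be assembled programmatically. The parameter names (`signature`, `reason`, `version`, `product`, `_facets`) and the base URL are taken from the link itself; the helper `build_search_url` is hypothetical, and the code only builds the URL without contacting the server.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

BASE = "https://crash-stats.mozilla.com/search/"

def build_search_url(signature, reason, version, facets):
    """Build a search URL; the leading '=' requests an exact match."""
    params = [
        ("signature", "=" + signature),
        ("reason", "=" + reason),
        ("version", version),
        ("product", "Firefox"),
    ]
    # One _facets parameter per field we want aggregated.
    params += [("_facets", facet) for facet in facets]
    return BASE + "?" + urlencode(params, quote_via=quote)

url = build_search_url(
    "?? ?? ::FNODOBFM::`string''",
    "0xc0000710 / 0x00000000",
    "57.0a1",
    ["cpu_info", "bios_manufacturer", "adapter_device_id"],
)
print(url)
```

Faceting on hardware fields such as `cpu_info` and `bios_manufacturer` is what makes it possible to see whether many "installs" collapse into a single machine.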

If this doesn't appear when 57 hits beta, we should resolve incomplete.
There's a handful of these crashes in 57 beta. I'm wontfixing for 57. :dmajor, I leave it to you to resolve as you see fit.
Flags: needinfo?(dmajor)
OK. I retract my proposal to resolve incomplete. This should stay active (subject to prioritization based on volume).
Flags: needinfo?(dmajor)
Given the small volume now, I think this is a P3 or a P5.
Randall: Can you write up a bug for that? Then I can take it from there.
Flags: needinfo?(willkg)
Marco, this looks like a really good blocklist candidate. Crash volume puts this at #7 for the browser in Nightly and it appears to be hitting 58 hard.

(100.0% in signature vs 00.13% overall) Module "PGHook.dll" = true [129.17% vs 01.13% if startup_crash = null]
Flags: needinfo?(mcastelluccio)
Priority: -- → P1
Priority: P1 → P2
Hardware: Unspecified → All
OS: Windows 7 → Windows
(In reply to Jim Mathies [:jimm] from comment #15)
> Marco, this look like a really good blocklist candidate. Crash volume puts
> this at #7 for the browser in Nightly and it appears to be hitting 58 hard. 
> 
> (100.0% in signature vs 00.13% overall) Module "PGHook.dll" = true [129.17%
> vs 01.13% if startup_crash = null]

The only issue here is that we can't tell whether the blocklisting is effective, is ineffective, or causes other problems, as we don't have a copy of the software.
We could have tried to contact some of the users to ask them to verify the block, if only one of them left us an email address.
Flags: needinfo?(mcastelluccio)
We'll try to add this to the new blocklist for content processes.
Whiteboard: [content process crash]
Crash Signature: [@ ?? ?? ::FNODOBFM::`string''] → [@ ?? ?? ::FNODOBFM::`string''] [@ ?? ::FNODOBFM::`string''] [@ RtlpLowFragHeapFree | ?? ?? ::FNODOBFM::`string''] [@ RtlRaiseStatus | ?? ?? ::FNODOBFM::`string'']
Duplicate of this bug: 1377416
Crash Signature: [@ ?? ?? ::FNODOBFM::`string''] [@ ?? ::FNODOBFM::`string''] [@ RtlpLowFragHeapFree | ?? ?? ::FNODOBFM::`string''] [@ RtlRaiseStatus | ?? ?? ::FNODOBFM::`string''] → [@ ?? ?? ::FNODOBFM::`string''] [@ ?? ::FNODOBFM::`string''] [@ RtlpLowFragHeapFree | ?? ?? ::FNODOBFM::`string''] [@ RtlRaiseStatus | ?? ?? ::FNODOBFM::`string''] [@ BasepGetModuleHandleExW ] [@ BaseSetLastNTError]
Whiteboard: [content process crash] → [content process crash][AV:Avecto Privilege Guard][DLL:pghook]
Dhiraj, could you test if the crash is fixed with this build: https://queue.taskcluster.net/v1/task/ERrC-JdwSQCJAFdRkvJZTw/runs/0/artifacts/public/build/target.zip.
Flags: needinfo?(mishra.dhiraj95)
Hi Marco, 

Confirming the crash for pghook.dll is now fixed with this build (https://queue.taskcluster.net/v1/task/ERrC-JdwSQCJAFdRkvJZTw/runs/0/artifacts/public/build/target.zip).
Thanks!
Flags: needinfo?(mishra.dhiraj95)
Attachment #8920776 - Attachment is patch: true
Attachment #8920776 - Flags: review?(jmathies)
Attachment #8920776 - Flags: review?(jmathies) → review+
Pushed by mcastelluccio@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/985e59b99024
Blocklist pghook.dll as it causes crashes. r=jimm
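
For context, a minimal sketch of the idea behind a DLL blocklist: refuse to load a module whose file name matches a blocked entry. This is not the actual patch or Firefox's implementation, only an illustration; the function name and the example paths are hypothetical.

```python
import ntpath

# Blocked module names, stored lowercase. pghook.dll is the entry from this bug.
BLOCKED_DLLS = {"pghook.dll"}

def is_blocked(path, blocked=BLOCKED_DLLS):
    """True if the file name of a Windows path matches a blocklist entry.

    Windows file names are case-insensitive, so compare in lowercase.
    """
    name = ntpath.basename(path).lower()
    return name in blocked

# Hypothetical install paths, for illustration only.
print(is_blocked(r"C:\Program Files\Example\PGHook.dll"))  # prints True
print(is_blocked(r"C:\Windows\System32\kernel32.dll"))     # prints False
```

Matching on the file name alone is simple, which is part of why such a change is low risk, but it also means the block cannot be verified without a machine that has the software installed.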
Please nominate this for uplift to Beta & ESR52.
Assignee: nobody → mcastelluccio
Flags: needinfo?(mcastelluccio)
Comment on attachment 8920776 [details] [diff] [review]
Patch to block pghook.dll

Approval Request Comment
[Feature/Bug causing the regression]: Third-party software injecting into Firefox.
[User impact if declined]: Crashes (low volume, but the fix is pretty safe and contained)
[Is this code covered by automated tests?]: No.
[Has the fix been verified in Nightly?]: No, but the try build has been verified to fix the crash in comment 20. There are no crashes on Nightly since this landed (https://crash-stats.mozilla.com/search/?signature=~FNODOBFM&product=Firefox&version=58.0a1&date=%3E%3D2017-10-24T12%3A45%3A31.000Z&date=%3C2017-10-31T11%3A45%3A31.000Z&_sort=-date&_facets=signature&_facets=build_id&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-build_id), but the volume is low so we can't tell for sure.
[Needs manual test from QE? If yes, steps to reproduce]: Not needed, since it has already been verified.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It's only adding a new DLL to the blocklist.
[String changes made/needed]: None.
Flags: needinfo?(mcastelluccio)
Attachment #8920776 - Flags: approval-mozilla-beta?
Comment on attachment 8920776 [details] [diff] [review]
Patch to block pghook.dll

Dll blocklist, crash fix, Beta57+
Attachment #8920776 - Flags: approval-mozilla-beta? → approval-mozilla-beta+