Closed Bug 1614885 Opened 2 months ago Closed 2 months ago

Firefox 73 no longer able to load content when launched in Win7 compatibility mode

Categories

(Core :: mozglue, defect, P1, critical)

73 Branch
Desktop
Windows
defect

Tracking

()

VERIFIED FIXED
mozilla75
Tracking Status
relnote-firefox --- 73+
firefox-esr68 --- unaffected
firefox73 blocking verified
firefox74 + verified
firefox75 + verified

People

(Reporter: philipp, Assigned: toshi)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

[Tracking Requested - why for this release]:
In the first few hours of the 73 rollout there are numerous user reports that Firefox lost the ability to open any URLs (same symptoms as bug 1610790). This is sometimes accompanied by the application error code 0xc0000005 in the Windows event log for Firefox.

I'm able to somewhat reproduce that on Windows 10 by setting the compatibility mode for Firefox to "Windows 7", though it's not possible to tell whether everyone who reported these symptoms had the same configuration in place.

Given the reports we're getting, this sounds bad. Also, philipp confirmed that bug 1610790 does not fix this on Nightly builds.

Severity: normal → critical
Priority: -- → P1
Flags: needinfo?(tkikuchi)
Flags: needinfo?(jmathies)
Flags: needinfo?(jmathies) → needinfo?(gpascutto)
Assignee: nobody → tkikuchi
Flags: needinfo?(gpascutto)

I also reproduced the issue by setting the compatibility mode of firefox.exe to Win7. The root cause of this problem is a mismatch in the Import Table of firefox.exe between the browser process and the content process, whereas the root cause of bug 1610790 is a mismatch in the Export Table. We need a different fix.

Flags: needinfo?(tkikuchi)

That was the regressor for bug 1604008, so that would make sense.

Regressed by: 1522830
Has Regression Range: --- → yes
Has STR: --- → yes

When compat mode is on, some ntdll functions in the IAT of firefox.exe are replaced with functions from AcLayers.dll:

2.6 RtlAllocateHeap@712 00007ff6`4e980710 00007ff9`f5806c20 AcLayers!NS_FaultTolerantHeap::APIHook_RtlAllocateHeap
2.13 RtlFreeHeap@985 00007ff6`4e980748 00007ff9`f5806e90 AcLayers!NS_FaultTolerantHeap::APIHook_RtlFreeHeap
2.15 RtlGetVersion@1071 00007ff6`4e980758 00007ff9`f5822190 AcLayers!NS_Win7RTMVersionLie::APIHook_RtlGetVersion
2.19 RtlReAllocateHeap@1315 00007ff6`4e980778 00007ff9`f5807360 AcLayers!NS_FaultTolerantHeap::APIHook_RtlReAllocateHeap

However, this replacement happens after we copied the IAT entries into a child process. Thus there is a time window where an AcLayers address is set but the DLL is not yet loaded. This causes 0xc0000005 when the process tries to call one of the AcLayers functions.

00 0000009b`ba7ff050 00007ffa`04af99dc apphelp!SepIatPatch+0xc1
01 0000009b`ba7ff0e0 00007ffa`04aebb8a apphelp!SepRouterHookImportedApi+0xdc2c
02 0000009b`ba7ff1c0 00007ffa`04aeb808 apphelp!SepRouterHookIAT+0x2da
03 0000009b`ba7ff260 00007ffa`0996caef apphelp!SE_DllLoaded+0xa8
04 0000009b`ba7ff2c0 00007ffa`0996c98f ntdll!LdrpSendPostSnapNotifications+0x12b
05 0000009b`ba7ff330 00007ffa`0996aa4b ntdll!LdrpNotifyLoadOfGraph+0x4f
06 0000009b`ba7ff360 00007ffa`09a14546 ntdll!LdrpPrepareModuleForExecution+0x73
07 0000009b`ba7ff3a0 00007ffa`09a01df5 ntdll!LdrpInitializeProcess+0x1dbe
08 0000009b`ba7ff7e0 00007ffa`099b1853 ntdll!_LdrpInitialize+0x50589
09 0000009b`ba7ff880 00007ffa`099b17fe ntdll!LdrpInitialize+0x3b
0a 0000009b`ba7ff8b0 00000000`00000000 ntdll!LdrInitializeThunk+0xe
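
To make the timing window above concrete, here is a minimal sketch that models an IAT as a list of slots and reports which slots point into a module that the process has not loaded. It is a toy model, not Gecko or Windows loader code: `ImportSlot`, `FindDanglingImports`, `ExampleCopiedIat`, and the unhooked placeholder import are all hypothetical names introduced for illustration. The four hooked entries are taken from the listing above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one IAT slot: the imported function and the module
// that owns the address currently stored in the slot.
struct ImportSlot {
  std::string function;
  std::string targetModule;
};

// Returns the imported functions whose slot points into a module that is not
// loaded in the process. Calling through such a slot would jump to unmapped
// memory, which is the kind of access violation reported as 0xc0000005.
std::vector<std::string> FindDanglingImports(
    const std::vector<ImportSlot>& iat,
    const std::set<std::string>& loadedModules) {
  std::vector<std::string> dangling;
  for (const ImportSlot& slot : iat) {
    if (loadedModules.count(slot.targetModule) == 0) {
      dangling.push_back(slot.function);
    }
  }
  return dangling;
}

// IAT as seen in the child: the four ntdll imports from the listing above now
// point into AcLayers. The last entry is a hypothetical unhooked import,
// included only for contrast.
std::vector<ImportSlot> ExampleCopiedIat() {
  return {
      {"RtlAllocateHeap", "AcLayers"},
      {"RtlFreeHeap", "AcLayers"},
      {"RtlGetVersion", "AcLayers"},
      {"RtlReAllocateHeap", "AcLayers"},
      {"SomeUnhookedNtdllImport", "ntdll"},
  };
}
```

With only ntdll loaded, the model flags the four hooked imports as dangling; once AcLayers is also in the loaded set, none are flagged. That is the window in which a call through the IAT can fault.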

I have not yet come up with a nice solution, but as a mitigation, I think we should stop trying to enable the blocklist in a child process if the launcher process fails to enable it in the browser process. Then this issue will become a one-time issue.

Component: Untriaged → mozglue
Product: Firefox → Core

Bug 1522830 added the call to InitializeDllBlocklistOOP in SandboxBroker::LaunchApp
to enable the new dll blocklist and telemetry in sandbox processes. If the browser
process fails to bootstrap a process for some reason, Firefox starts without crashing
but also without any content processes because of that change.

What is worse is that this problem persists even after the launcher process was disabled.
To mitigate it, this patch stops an attempt to bootstrap a child process if the launcher
process already failed to do it. With this, if something bad happens in the first launch,
the launcher process is automatically disabled via registry and next time firefox will work
normally. So a user will see the launching problem only once.

We will follow up on the bootstrap issue.
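
The effect of the mitigation described above can be sketched as a small decision model. This is not the actual patch: `LaunchContext`, `ContentProcessStartsBefore`, and `ContentProcessStartsAfter` are hypothetical names, and the model assumes that a child launched without the bootstrap step starts normally, as it did before bug 1522830.

```cpp
#include <cassert>

// Hypothetical inputs for one child-process launch. These names are
// illustrative and do not correspond to real Gecko identifiers.
struct LaunchContext {
  bool browserWasBootstrapped;  // did the launcher bootstrap the browser process?
  bool childBootstrapSucceeds;  // would bootstrapping the child work here?
};

// Modeled behavior before the mitigation: the child bootstrap is always
// attempted, so a failing bootstrap leaves the browser with no content process.
bool ContentProcessStartsBefore(const LaunchContext& ctx) {
  return ctx.childBootstrapSucceeds;
}

// Modeled behavior after the mitigation: the child bootstrap is attempted only
// if the browser process itself was bootstrapped. Otherwise it is skipped and
// the child starts as it did before the regressing change (an assumption of
// this model).
bool ContentProcessStartsAfter(const LaunchContext& ctx) {
  if (!ctx.browserWasBootstrapped) {
    return true;
  }
  return ctx.childBootstrapSucceeds;
}
```

In this model, a launch where the browser was not bootstrapped and the child bootstrap would fail yields no content process before the change and a working one after it, which matches the "only once" behavior described in the patch summary.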

In https://old.reddit.com/r/firefox/comments/f2d5cu/nothing_will_load/, one user mentions Exploit Protection of Windows Defender.

In Windows 10, I had added Firefox to the "Exploit Protection" with a whole bunch of rules enabled.
It seems as of Firefox 73, the following 2 rules must be disabled:

  • Validate stack integrity (StackPivot)
  • Import address filtering (IAF)

This has been fixed in Nightly as bug 1592486. If we hit this issue in Beta or Release, the 0xc0000142 popup (instead of 0xc0000005) will appear once, and the next launch will then hit the no-content-process situation. The patch above mitigates this Windows Defender issue as well as the Windows compat mode issue.

Attached image at launch.png

I have reproduced this issue in Nightly v75.0a1 from 2020-02-12, and I thought some screenshots might help avoid confusing the related issues.

This is what I see when I open Nightly with a newly created profile on Windows 10, but Nightly is set to be opened in compatibility mode for Windows 7.

Loading any link results in infinite loading, refresh button is disabled, the user can switch tabs.

This is what the other tab displays.
Attempting to load any other link in the address bar will result in infinite loading, the "refresh" button is disabled, the browser is basically unusable.

Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/96ae269611a7
Do not attempt to bootstrap a child process if the launcher failed to boostrap the browser process.  r=aklotz
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/mozilla-central/rev/b0b5ea1916c5
Do not attempt to bootstrap a child process if the launcher failed to boostrap the browser process.  r=aklotz
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

New Windows Nightlies have been requested.

As requested for the PI-486 check, results for the test session can be found in this document.

I opened bug 1615308 to follow up on the 0xc0000005 error popup.

See Also: → 1615308

Opened bug 1615401 for the 32-bit-specific issue. It's not a regression. I saw this 32-bit crash on 72.0.2, too.

See Also: → 1615401

Comment on attachment 9126197 [details]
Bug 1614885 - Do not attempt to bootstrap a child process if the launcher failed to boostrap the browser process. r=aklotz

Beta/Release Uplift Approval Request

  • User impact if declined: Firefox launches with no content processes, meaning no functionality, if Windows compatibility mode for firefox.exe is set to Win7. This problem persists and there is no chance to run automatic update.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change is small enough that its impact is easy to understand. We skip bootstrapping a child process if the bootstrap already failed. Not bootstrapping a content process was the behavior before the regressing change in bug 1522830, so this is a fallback to the original behavior. Manual testing by the QA team is done and no new regression was observed.
  • String changes made/needed: None
Attachment #9126197 - Flags: approval-mozilla-beta?
Duplicate of this bug: 1615320

Comment on attachment 9126197 [details]
Bug 1614885 - Do not attempt to bootstrap a child process if the launcher failed to boostrap the browser process. r=aklotz

Approved for 74.0b3 so we can get wider feedback.

Attachment #9126197 - Flags: approval-mozilla-release?
Attachment #9126197 - Flags: approval-mozilla-beta?
Attachment #9126197 - Flags: approval-mozilla-beta+
Status: RESOLVED → VERIFIED
Duplicate of this bug: 1615127

Verified with 75.0a1 (2020-02-16) on Windows 10 and 7.

Comment on attachment 9126197 [details]
Bug 1614885 - Do not attempt to bootstrap a child process if the launcher failed to boostrap the browser process. r=aklotz

Works around a non-functional browser for users in some circumstances since updating to 73.0. Approved for 73.0.1.

Attachment #9126197 - Flags: approval-mozilla-release? → approval-mozilla-release+

Added to the Firefox 73.0.1 release notes:

Fixed loss of browser functionality in certain circumstances such as running in Windows compatibility mode or having custom anti-exploit settings

Verified with 73.0.1 on Windows 10/7(x64).

See Also: → 1615241
See Also: → 1603335
Flags: in-qa-testsuite+
Duplicate of this bug: 1615821
Duplicate of this bug: 1603335
See Also: → 1619200
Duplicate of this bug: 1615027