Content-Encoding not honored when e10s is turned off via browser.tabs.remote.autostart=false or MOZ_FORCE_DISABLE_E10S=1
Categories
(Core :: Networking, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | + | wontfix |
firefox78 | --- | verified |
firefox79 | --- | verified |
People
(Reporter: frankmortimer, Assigned: mattwoodrow)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged])
Attachments
(2 files)
263.75 KB,
image/png
|
Details | |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-release-
|
Details | Review |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36
Steps to reproduce:
set browser.tabs.remote.autostart to false from about:config, and then try to load any webpage with Content-Encoding: gzip or similar
Actual results:
Garbage on the screen, caused by the content not being decompressed. It seems to be vetoed by calling HttpBaseChannel::SetApplyConversion(false) from DocumentLoadListener::OnStartRequest() in hopes that the child process would apply the conversion, but it never does since there's no child in single-process mode
Expected results:
The webpage should have been loaded correctly
Reporter | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
![]() |
||
Comment 2•5 years ago
|
||
I can reproduce the issue on Nightly78.0a1 and Beta77.0b6 if disable e10s via shell env(set MOZ_FORCE_DISABLE_E10S=1)
Steps To Reproduce:
- set MOZ_FORCE_DISABLE_E10S=1 in Dos prompt
- Launch firefox.exe
- Open www.google.com
Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=d34716786f5b0159ca32bdfbc86fcb6c747618c8&tochange=e190fb6ea8c2368e7c930ebc3e5c3e12aa88ba3e
I don't know if we still have plans of supporting non-e10s mode.
But I didn't expect the Content-Encoding part to be dependent on e10s mode. Is this a bug in DocumentChannel or in the underlying nsHttpChannel?
Updated•5 years ago
|
Comment 6•5 years ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #3)
I don't know if we still have plans of supporting non-e10s mode.
But I didn't expect the Content-Encoding part to be dependent on e10s mode. Is this a bug in DocumentChannel or in the underlying nsHttpChannel?
Does this affect Thunderbird?
I don't know if TB is using the document channel. :jya might be able to answer.
Comment 8•5 years ago
•
|
||
When browser.tabs.documentchannel.ppdc got enabled many things broke for Thunderbird. See bug 1633198. Not sure if that answers your question.
Comment 9•5 years ago
|
||
Forwarding to Matt as this is one of his change.
However,
(In reply to frankmortimer from comment #0)
Garbage on the screen, caused by the content not being decompressed. It seems to be vetoed by calling HttpBaseChannel::SetApplyConversion(false) from DocumentLoadListener::OnStartRequest() in hopes that the child process would apply the conversion, but it never does since there's no child in single-process mode
This is actually missing bits of the picture.
SetApplyConversion is called again with the original value once the channel is resumed, which will happen
there:
https://searchfox.org/mozilla-central/rev/bc3600def806859c31b2c7ac06e3d69271052a89/netwerk/ipc/DocumentLoadListener.cpp#962
in DocumentLoadListener::ResumeSuspendedChannel.
This is called by the ParentProcessDocumentChannel when we redirect to the real channel here:
https://searchfox.org/mozilla-central/rev/bc3600def806859c31b2c7ac06e3d69271052a89/netwerk/ipc/ParentProcessDocumentChannel.cpp#86
I'm not sure we want to continue supporting e10s for much longer anyway, there are core changes being made for fission.
Assignee | ||
Comment 10•5 years ago
|
||
Both nsHttpChannel, and HttpChannelChild will install content conversion helpers once OnStartRequest returns, if SetApplyConversion is true.
We currently call SetApplyConversion(false) from DocumentLoadListener::OnStartRequest, to prevent nsHttpChannel from installing a conversion helper, and then we re-set it to true before passing OnStartRequest to HttpChannelChild so that it does install one, and conversion happens in the content process.
In the parent-process case there is no HttpChannelChild, so nobody installs one.
I think we need to change DocumentLoadListener::OnStartRequest to only call SetApplyConversion(false), if we have decided to send this channel to a content process. I think that'd mean waiting until after we've called MaybeTriggerProcessSwitch, and have a way of checking which process was selected (without needing to wait on the promise for the pid).
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 11•5 years ago
|
||
Updated•5 years ago
|
Comment 12•5 years ago
|
||
Comment 13•5 years ago
|
||
bugherder |
Comment 14•5 years ago
•
|
||
[Tracking Requested - why for this release]:
Given that this bug broke Google.com for all users who have disabled E10S, this should probably be uplifted into release?
STR:
- Open Powershell on Win10. Run:
$env:MOZ_FORCE_DISABLE_E10S="1"; &"C:\Program Files\Mozilla Firefox\firefox.exe"
- Open about:support: "Multiprocess Windows: 0/1 Disabled forcibly"
- Open https://www.google.com. It looks broken like comment 0.
- Run
$env:MOZ_FORCE_DISABLE_E10S="";
to unset the env var.
(Alice0775 White from bug 1642870 comment #6)
From reporter's attachment:
Multiprocess Windows: 0/1 Disabled forciblyYou have disabled e10s for some reason.
So, this seems to be duplication of Bug 1638652.
multi-process-status-8 = Disabled forcibly
kE10sForceDisabled = 8,
// Uber override pref for emergency blocking
if (gBrowserTabsRemoteAutostart && EnvHasValue("MOZ_FORCE_DISABLE_E10S")) {
gBrowserTabsRemoteAutostart = false;
status = kE10sForceDisabled;
}
bug 1548941 removed the browser.tabs.remote.force-disable pref.
The only way to force-disable e10s is setting the environment variable MOZ_FORCE_DISABLE_E10S
.
There are three workarounds:
- Best solution: Delete the
MOZ_FORCE_DISABLE_E10S
environment variable:- Win10: Open your start menu, type "env", open "Edit the system environment variables", click on the "Environment variables" button, search for a
MOZ_FORCE_DISABLE_E10S
entry, select it, click on the "Delete" button, click on "OK" and restart Win10. - Win7: https://docs.oracle.com/en/database/oracle/r-enterprise/1.5.1/oread/creating-and-modifying-environment-variables-on-windows.html
- Win10: Open your start menu, type "env", open "Edit the system environment variables", click on the "Environment variables" button, search for a
- Not a long-term solution: Use Firefox 79 as alpha version right now: https://nightly.mozilla.org.
- Don't do this: Disable gzip, deflate and brotli by changing network.http.accept-encoding and network.http.accept-encoding.secure to an empty string.
Could someone please remove all support pages that suggest to set this env var? It force-disables WebRender as well.
https://support.mozilla.org/en-US/questions/1265590
https://www.google.com/search?q=site%3Asupport.mozilla.org+MOZ_FORCE_DISABLE_E10S
Comment 16•5 years ago
|
||
Updated•5 years ago
|
Comment 17•5 years ago
|
||
(Jan Andre Ikenmeyer [:darkspirit] from comment #14)
Given that this bug broke Google.com for all users who have disabled E10S, this should probably be uplifted into release?
Or would you like to leave it as is and nudge users to remove the env var (in worst case without being able to google the problem)?
In the past, disabling e10s was unfortunately the top suggestion to reduce memory usage at the cost of performance, stability and security.
It might be desirable to restrict both disabling methods to Nightly.
Comment 19•5 years ago
|
||
As an FYI, I have this env var set because it also enables me to disable /dev/shm
usage, which is required in Kubernetes (and other memory constrained docker-ish environments).
I'll probably have to do what was suggested above in "Don't do this": Disable gzip, deflate and brotli by changing network.http.accept-encoding and network.http.accept-encoding.secure to an empty string, and wait for FF 79 to re-enable them.
Comment 20•5 years ago
|
||
https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Valgrind_test_job
The Valgrind test job builds the browser and runs it under Valgrind, which can detect various common memory-related errors. This job only runs on Linux64, which is the platform best suited to running Valgrind.
According to bug 1549856 you currently test an unsupported configuration and do not test what you ship.
Assignee | ||
Comment 22•5 years ago
|
||
Comment on attachment 9152605 [details]
Bug 1638652 - Only disable content conversion in the parent process if we're going to be delivering data to a content process. r?jya
Beta/Release Uplift Approval Request
- User impact if declined: Failure to decode compressed pages when e10s is disabled.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky):
- String changes made/needed:
Assignee | ||
Comment 23•5 years ago
|
||
Comment on attachment 9152605 [details]
Bug 1638652 - Only disable content conversion in the parent process if we're going to be delivering data to a content process. r?jya
Beta/Release Uplift Approval Request
- User impact if declined: Same as the beta request, this is a low risk fix to a bug for users with e10s disabled.
This isn't really a supported configuration, but we should consider the fix for a ride-along if we do a dot release on 78.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky):
- String changes made/needed:
Updated•5 years ago
|
Comment 25•5 years ago
|
||
Comment on attachment 9152605 [details]
Bug 1638652 - Only disable content conversion in the parent process if we're going to be delivering data to a content process. r?jya
approved for 78.0b5.
might be time to start ignoring the "disable e10s" pref on release builds...
Comment 26•5 years ago
|
||
(In reply to gil from comment #19)
As an FYI, I have this env var set because it also enables me to disable
/dev/shm
usage, which is required in Kubernetes (and other memory constrained docker-ish environments).
Can you elaborate on what you mean here? Why is disabling /dev/shm
required in Kubernetes / docker? I've tried searching for relevant documentation and have not found any.
Disabling e10s means web content is unsandboxed, and so has serious security disadvantages. We're likely to make it harder and harder, or even completely impossible, to disable it in the near future. I'd like to understand why you find it necessary to do so.
Comment 27•5 years ago
|
||
Can you elaborate on what you mean here? Why is disabling /dev/shm required in Kubernetes / docker? I've tried searching for relevant documentation and have not found any.
Back in the days, there was no way to specify shared memory use for a docker container in Kubernetes, and so I disabled E10S. Today, surprisingly, there is a way. Unfortunately, the recommended amount of memory is 2g
, which is HUGE. Why?
I run hundreds (and in the future thousands) of Firefox instances in a k8s grid of docker containers. For this, I need to specify the exact amount of memory to allocate to each container, and currently I allocate ~500mb to each container, which is enough for my purpose. If I need to add 2g (or even and additional 500mb) to each container, the price of the grid would rise significantly.
So I prefer disabling E10S, and am OK with being unsandboxed, given that only our sites run on inside Firefox, and NO scripts are run at all.
Actually, I don't care about E10S or sandboxing, I just don't want the overhead of the shared memory. From my point of view, having the processes communicate via the file system (isn't /dev/shm
treated as a file system?) good enough, even if slower: memory beats CPU in my (and probably other) environment.
(BTW, the same problem exists in Chromium, and they have a --disable-dev-shm
flag for the exact same purpose.)
Comment 28•5 years ago
|
||
bugherder uplift |
Comment 29•5 years ago
|
||
(In reply to gil from comment #27)
Can you elaborate on what you mean here? Why is disabling /dev/shm required in Kubernetes / docker? I've tried searching for relevant documentation and have not found any.
Back in the days, there was no way to specify shared memory use for a docker container in Kubernetes, and so I disabled E10S. Today, surprisingly, there is a way. Unfortunately, the recommended amount of memory is
2g
, which is HUGE. Why?
(I assume we're talking about k8s, not Firefox? It's not 100% clear. If this refers to Firefox, I have no idea what the origin of that number is...)
I run hundreds (and in the future thousands) of Firefox instances in a k8s grid of docker containers. For this, I need to specify the exact amount of memory to allocate to each container, and currently I allocate ~500mb to each container, which is enough for my purpose. If I need to add 2g (or even and additional 500mb) to each container, the price of the grid would rise significantly.
So I prefer disabling E10S, and am OK with being unsandboxed, given that only our sites run on inside Firefox, and NO scripts are run at all.
Unsure what your usecase is, but have you considered running multiple sites in a single Firefox and tabswitching? That'd get your memory costs down a lot...
Actually, I don't care about E10S or sandboxing, I just don't want the overhead of the shared memory. From my point of view, having the processes communicate via the file system (isn't
/dev/shm
treated as a file system?) good enough, even if slower: memory beats CPU in my (and probably other) environment.(BTW, the same problem exists in Chromium, and they have a
--disable-dev-shm
flag for the exact same purpose.)
I honestly don't know enough about the internals of our IPC code to comment on this, but I'll just mention that ironically, looking for this flag and background on it finds me bug reports where webdriver first made this flag the default when talking to chrome in order to deal with memory limitations as you suggest, and then had people request they undo it, precisely because for a bunch of people, this trade-off was really not what they wanted... Can't please all the people all of the time etc.
:gcp, is there any chance of supporting a similar thing via an env var so people use that instead of trying to turn off e10s all the time?
Comment 30•5 years ago
|
||
Unfortunately, the recommended amount of memory is 2g, which is HUGE. Why?
We use shared memory to communicate between the Firefox processes, and /dev/shm needs to be large enough to deal with the peak memory usage of whatever you are doing in Firefox, see e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1245239 where tons of people are crashing because docker makes it WAY too small by default. Depending on what you are doing, you probably don't need 2G, but in normal usage there's no reason whatsoever to make it small. You don't pay for what you don't use.
/dev/shm doesn't require real memory, so you only need to be able to support the peak usage across all instances you run, but if the VM/container instance you use requires real memory to back it for some reason you're going to have a hard time.
There's some workarounds with avoiding /dev/shm in newer kernels but we don't have support for it yet.
isn't /dev/shm treated as a file system?
It's a filesystem backed by virtual memory.
:gcp, is there any chance of supporting a similar thing via an env var so people use that instead of trying to turn off e10s all the time?
See the discussion in bug 1245239. It involves, among other things, reverting patches that made us use the standardized calls to manage shared memory (because those use /dev/shm, because that's where it is supposed to be!). I think I'd rather have us finish the memfd work rather than hack around it.
Comment 31•5 years ago
|
||
/dev/shm doesn't require real memory, so you only need to be able to support the peak usage across all instances you run
Yup, but unfortunately, best practices and K8s (and experience, unfortunately,...) have taught us to tell K8s to allocate the peak memory we need, and not rely on the statistics of hoping that the memory load will be balanced as containers.
It's a filesystem backed by virtual memory.
So if I map /dev/shm
to a real directory, will that work?
Comment 32•5 years ago
•
|
||
BTW, the same problem exists in Chromium, and they have a --disable-dev-shm flag for the exact same purpose.
That doesn't disable usage of shared memory! It just changes the default location from /dev/shm to /tmp, which is also backed by a tmpfs. If your /tmp is large enough, then you could as well have made /dev/shm large enough, it's the exact same thing. The Chrome flag is essentially a workaround for the broken default docker configs (that apparently have a bigger /tmp than /dev/shm), it doesn't magically turn regular filesystems into shared memory.
tmpfs doesn't need physical memory backing it, it can be swapped, and it won't "use" anything if it's empty.
So if I map /dev/shm to a real directory, will that work?
Probably not, I think libc does assume it's a tmpfs.
In your situation, if you somehow manage to run Firefox stably with a 500M instance, then I don't see any reason why simply making the /dev/shm 500M wouldn't work (if Firefox needs more shared memory, it was going to OOM anyway regardless of the shared memory situation...) Again, it won't use any actual memory (including swap) unless it's used. That's not too far of from how a normal system works, i.e. this Ubuntu workstation has 32G RAM and a 16G /dev/shm, a 32G /run, and 2x32G for every Snap app etc, so that's over 112G of tmpfs on a 32G RAM system...
Updated•5 years ago
|
Updated•5 years ago
|
Comment 33•5 years ago
|
||
Reproduced the issue on Firefox 77.0.1 and Firefox Beta 78.0b2
Verified fixed on Firefox Beta 78.0b5(20200609215727) and Firefox Nightly 79.0a1(20200611093454)
Comment 34•5 years ago
|
||
Comment on attachment 9152605 [details]
Bug 1638652 - Only disable content conversion in the parent process if we're going to be delivering data to a content process. r?jya
we're entering RC week for 78, so clearing the uplift request for 77
Updated•5 years ago
|
Description
•