Closed Bug 1251967 Opened 8 years ago Closed 2 years ago

Measure the overhead of a new content process

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: gkrizsanits, Unassigned)

References

Details

(Whiteboard: btpp-fixlater [e10s-multi:M2])

Attachments

(2 files)

How much memory does it cost per platform? How long does it take to start a new one?
What numbers do we have already?
Blocks: e10s-multi
No longer blocks: e10s-multi
Blocks: e10s-multi
Eric has insight here.
Flags: needinfo?(erahm)
I wrote a blog post detailing the overhead of multiple content processes [1], please see that for more numbers and analysis. A quick summary of the raw deltas b/w 0 and 1 content processes on the AWSY [2] test suite:

>                       Linux 64-bit    Mac 64-bit      Windows 32-bit  Windows 64-bit
> Start                 43,958,272      32,768,000      41,746,432      32,673,792
> StartSettled          48,529,408      86,228,992      43,966,464      56,852,480
> TabsOpen              91,930,624      156,463,104     80,441,344      85,569,536
> TabsOpenSettled       97,968,128      105,127,936     75,620,352      68,603,904
> TabsOpenForceGC       99,606,528      179,605,504     70,021,120      64,675,840
> TabsClosed            127,164,416     211,378,176     86,515,712      99,008,512
> TabsClosedSettled     98,668,544      196,251,648     74,473,472      68,919,296
> TabsClosedForceGC     84,549,632      186,990,592     52,158,464      50,368,512

And as a percentage:
 
>                       Linux 64-bit    Mac 64-bit      Windows 32-bit  Windows 64-bit
> Start                 22.05%          9.79%           23.08%          12.69%
> StartSettled          26.72%          26.43%          21.62%          22.97%
> TabsOpen              19.20%          16.78%          16.65%          13.20%
> TabsOpenSettled       20.84%          11.44%          15.58%          10.47%  
> TabsOpenForceGC       22.91%          21.56%          14.95%          10.28%  
> TabsClosed            31.41%          25.39%          19.22%          16.60%
> TabsClosedSettled     35.58%          25.36%          19.96%          14.56%
> TabsClosedForceGC     33.37%          28.70%          14.54%          11.12%

[1] http://www.erahm.org/2016/02/11/memory-usage-of-firefox-with-e10s-enabled/
[2] https://areweslimyet.com
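
For anyone who wants to work with the figures above, here is a minimal sketch that converts the raw deltas to MiB and estimates the baseline each percentage implies. It rests on two assumptions that the comment does not state: that the deltas are in bytes, and that each percentage is the delta divided by the measurement with 0 content processes. Only the Linux 64-bit column is transcribed, and the helper names are illustrative.

```python
# Linux 64-bit column from the two tables above: (raw delta, percentage).
# Assumptions (not stated in the comment): deltas are in bytes, and each
# percentage is delta / baseline, where baseline is the 0-content-process run.
LINUX64 = {
    "Start": (43_958_272, 22.05),
    "StartSettled": (48_529_408, 26.72),
    "TabsOpen": (91_930_624, 19.20),
    "TabsOpenSettled": (97_968_128, 20.84),
    "TabsOpenForceGC": (99_606_528, 22.91),
    "TabsClosed": (127_164_416, 31.41),
    "TabsClosedSettled": (98_668_544, 35.58),
    "TabsClosedForceGC": (84_549_632, 33.37),
}

MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


def implied_baseline(delta_bytes, percent):
    """Baseline size such that delta_bytes is `percent` percent of it."""
    return delta_bytes / (percent / 100.0)


for checkpoint, (delta, pct) in LINUX64.items():
    baseline = implied_baseline(delta, pct)
    print(f"{checkpoint:<18} +{to_mib(delta):7.1f} MiB "
          f"(implied baseline ~{to_mib(baseline):7.1f} MiB)")
```

The implied baselines are only as good as the denominator assumption; the blog post [1] is the authoritative source for the absolute numbers.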
Flags: needinfo?(erahm)
Whiteboard: btpp-fixlater
In my experience it's pretty brutal.
I opened several tabs using aurora, with e10s off and with e10s on with unlimited content processes. (And with aurora not running of course)
Then measured memory usage by looking at the sum of system commit in the three cases and physical memory usage in process explorer.

I didn't use about:memory, because it's practically useless in this case: It doesn't provide any summary data for the processes. But with the enormous differences there's no need for more accuracy anyway.

I got these results:
- With e10s on: 13.5 GB memory usage for Firefox
- With e10s off: 3.6 GB memory usage for Firefox

So the difference is about 10 gigabytes: with unlimited content processes, Firefox uses 375% of the memory of single-process Firefox.
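
The arithmetic behind those two figures can be checked with a short sketch. The variable names are illustrative; the inputs are the two totals reported above.

```python
# Totals reported above, in gigabytes.
e10s_on_gb = 13.5
e10s_off_gb = 3.6

# Absolute difference and the e10s-on total as a percentage of e10s-off.
difference_gb = e10s_on_gb - e10s_off_gb
ratio_percent = e10s_on_gb / e10s_off_gb * 100

print(f"difference: {difference_gb:.1f} GB")
print(f"e10s on is {ratio_percent:.0f}% of e10s off")
```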
(In reply to avada from comment #3)
> In my experience it's pretty brutal.
> I opened several tabs using aurora, with e10s off and with e10s on with
> unlimited content processes. (And with aurora not running of course)
> Then measured memory usage by looking at the sum of system commit in the
> three cases and physical memory usage in process explorer.
> 
> I didn't use about:memory, because it's practically useless in this case: It
> doesn't provide any summary data for the processes. But with the enormous
> differences there's no need for more accuracy anyway.
> 
> I got these results:
> - With e10s on: 13.5 GB memory usage for firefox
> - With e10s off: 3.6GB memory usage for firefox
> 
> So about 10 gigabytes difference or 375% the memory usage of single process
> firefox with unlimited content processes.

These measurements are extraordinary. How many tabs did you have open? What webpages were in them?

Despite what you say, memory reports from about:memory would be extremely useful here. https://developer.mozilla.org/en-US/docs/Mozilla/Performance/about:memory has some docs on how to gather them.
Most likely RSS is being summed; you need to look at cumulative private bytes instead (sometimes referred to as private working set on Windows).
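
To illustrate why summing RSS overstates the total, here is a toy model with made-up numbers (none of them come from this bug). It assumes every process maps the same set of shared pages, so those pages are resident once but counted in each process's RSS. The `ProcessMemory` class and its fields are hypothetical and are not a real measurement API.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class ProcessMemory:
    """Toy model of one process; all values are hypothetical."""
    name: str
    private_bytes: int  # pages used only by this process
    shared_bytes: int   # resident pages also mapped by the other processes

    @property
    def rss(self):
        # RSS counts every resident page, private or shared.
        return self.private_bytes + self.shared_bytes


def summed_rss(procs):
    """Naive total: shared pages are counted once per process."""
    return sum(p.rss for p in procs)


def summed_private(procs):
    """Cumulative private bytes: shared pages are not counted at all."""
    return sum(p.private_bytes for p in procs)


# One parent and three content processes, all mapping the same 100 MiB.
procs = [ProcessMemory("parent", 300 * MIB, 100 * MIB)]
procs += [ProcessMemory(f"content-{i}", 80 * MIB, 100 * MIB) for i in range(3)]

print(f"summed RSS:     {summed_rss(procs) // MIB} MiB")
print(f"summed private: {summed_private(procs) // MIB} MiB")
```

In this model the shared 100 MiB is counted four times in the RSS sum, so the gap between the two totals grows with the number of processes.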
(In reply to Nicholas Nethercote [:njn] from comment #4)
> These measurements are extraordinary. How many tabs did you have open? What
> webpages were in them?
> 
> Despite what you say, memory reports from about:memory would be extremely
> useful here.
> https://developer.mozilla.org/en-US/docs/Mozilla/Performance/about:memory
> has some docs on how to gather them.

I didn't make notes about the tabs/sites.

Maybe I'll run another test; it's just excruciating because I had to increase the swap file for the pages to load, and it took about a quarter of an hour with e10s.

It's not that I couldn't save a memory report, it's that it was rather useless for me.

(In reply to Eric Rahm [:erahm] (Out until 3/10) from comment #5)
> Most likely RSS is being summed, you need to look at cumulative private
> bytes (sometimes referred to as private working set on Windows).

These were the only two available as graphs, and so the simplest to "measure".
Anyway. I'm assuming an about:memory report would suffice?
So, I made some measurements with mostly the same tabs.
This time on central. I tested the x86 version on Windows 8.1, so I expect it would have been much worse with the 64-bit version.

I'll upload it anonymized because it's my personal browsing.
avada, thank you for the data. There are three files in the archive:

- anon-central-interoff: 50 content processes
- anon-eoff-central: no content processes
- anon-interpon-central: 50 content processes, not much different to anon-central-interoff?

I looked closely at the first one. Some observations...

* Several processes' measurements are incomplete, e.g. this is the entirety of
  the worst one:

> WARNING: the 'heap-allocated' memory reporter does not work for this platform
> and/or configuration. This means that 'heap-unclassified' is not shown and
> the 'explicit' tree shows less memory than it should.
> Explicit Allocations
> 
> 0.00 MB (100.0%) -- explicit
> └──0.00 MB (100.0%) ── network/spdy-zlib-buffers

  I suspect we're hitting the timeout for content processes to measure and
  report back to the parent process.

* I'm not surprised that using unlimited content processes gives problems. The
  plan is to start with one and slowly increase. It's not clear to me we'll
  ever go to unlimited because even a maximum of 8 currently hurts memory usage
  significantly.

* It looks like you have more than 50 extensions enabled -- is that right? That
  is a *lot* and it wouldn't surprise me if you had memory troubles even
  without multiple content processes. We have plenty of bugs filed where people
  have memory troubles with only 10 or 20 enabled.

* Why do we have duplicated system compartments? There are plenty where we have
  5 or 8 duplicates. All within resource://gre/modules/commonjs/sdk/ or
  similar, e.g. base64.js, content/l10n-html.js, core/disposable.js,
  core/heritage.js, core/namespace.js, core/observer.js, core/promise.js. This
  takes us over 500 system compartments. When I measure a new profile without
  any extension I only have 90. I assume this is related to having multiple
  extensions enabled, but sharing these would be much better.

* heap-unclassified is high (30--40%) for many of the content processes. The
  extensions may be a factor in this.

* Weird formatting for the 10% value here:

> 386.92 MB (100.0%) -- heap-committed
> ├──348.24 MB (90.00%) ── allocated
> └───38.68 MB (010.00%) ── overhead

Overall, this is an extremely unusual configuration -- 50+ add-ons and
unlimited content processes.
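
As a rough way to cross-check the heap-committed breakdown quoted above, the following sketch parses those three lines and recomputes each percentage from the sizes. The regular expression is an assumption based only on these lines and is not a general about:memory parser.

```python
import re

# The heap-committed tree quoted above, without the leading "> " markers.
REPORT = """\
386.92 MB (100.0%) -- heap-committed
├──348.24 MB (90.00%) ── allocated
└───38.68 MB (010.00%) ── overhead
"""

# size in MB, percentage exactly as printed, then the entry name.
LINE_RE = re.compile(r"(\d+(?:\.\d+)?) MB \((\d+(?:\.\d+)?)%\) (?:--|──) (\S+)")


def parse(report):
    """Return (name, size_mb, percent_as_printed) for each matching line."""
    entries = []
    for line in report.splitlines():
        match = LINE_RE.search(line)
        if match:
            size, shown, name = match.groups()
            entries.append((name, float(size), shown))
    return entries


entries = parse(REPORT)
total_mb = entries[0][1]  # the first line is the root of the tree

for name, size_mb, shown in entries:
    recomputed = size_mb / total_mb * 100
    print(f"{name:<15} shown={shown:>6}%  recomputed={recomputed:.2f}%")
```

The recomputed values agree with the printed ones, which suggests the stray leading zero in `010.00%` is a display problem and not a wrong measurement.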
(In reply to Nicholas Nethercote [:njn] from comment #8)
> * Why do we have duplicated system compartments? There are plenty where we
> have
>   5 or 8 duplicates. All within resource://gre/modules/commonjs/sdk/ or
>   similar, e.g. base64.js, content/l10n-html.js, core/disposable.js,
>   core/heritage.js, core/namespace.js, core/observer.js, core/promise.js.
> This
>   takes us over 500 system compartments. When I measure a new profile without
>   any extension I only have 90. I assume this is related to having multiple
>   extensions enabled, but sharing these would be much better.

I believe the SDK always loads a separate copy of itself for each SDK add-on. Each SDK module is loaded in a separate Sandbox, and a Sandbox is a global. They rely on having separate globals per add-on.
(In reply to Nicholas Nethercote [:njn] from comment #8)
> avada, thank you for the data. There are three files in the archive:
> 
> - anon-central-interoff: 50 content processes
> - anon-eoff-central: no content processes
> - anon-interpon-central: 50 content processes, not much different to
> anon-central-interoff?

It should be 79 (the number of tabs I had open) or more shouldn't it?
The last one was with extensions.interposition.enabled;false. Memory-wise it didn't make a difference.



> * It looks like you have more than 50 extensions enabled -- is that right?
> That
>   is a *lot* and it wouldn't surprise me if you had memory troubles even
>   without multiple content processes. We have plenty of bugs filed where
> people
>   have memory troubles with only 10 or 20 enabled.

I have 100 enabled, including Greasemonkey. Yes, it has a memory impact. It's also manageable.
But with e10s it's 400-500% of the memory usage, in this case 11 gigabytes more. That makes browsing way more problematic.
> It should be 79 (the number of tabs I had open) or more shouldn't it?

It's possible that some of the processes failed to report anything due to the timeout. You're way off in uncharted territory with this configuration, there are going to be some bumps :)

> The last one was with extensions.interposition.enabled;false. Memory-wise it
> didn't make a difference.

What does that option do?
(In reply to Nicholas Nethercote [:njn] from comment #11)
> What does that option do?

It disables compatibility shims for addons. Since I have a lot of addons, I thought I'd try with and without shims.
Here's a memory overhead analysis I did. I've emailed this to dev-platform as well, though it's currently held in moderation due to its length.
I've noticed that on Windows each content process uses video memory. According to Process Explorer, one of my plugin-container.exe processes is using 37.5 MB of VRAM, with similar amounts for the other 7 I currently have running.

Could that be affecting the memory used?
Whiteboard: btpp-fixlater → btpp-fixlater [e10s-multi:M2]

AFAICS the decision to ship e10s was taken long ago, so the overhead seems to be acceptable. The Fission work also made further changes in that area.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME

We also have an AWSY test that specifically measures process overhead.
