Short lived content processes should be reused.

RESOLVED FIXED in Firefox 56

Status

()

Core
DOM: Content Processes
RESOLVED FIXED
a month ago
2 days ago

People

(Reporter: krizsa, Assigned: krizsa)

Tracking

unspecified
mozilla56
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox54 unaffected, firefox55+ affected, firefox56 fixed, firefox-esr52 unaffected)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

a month ago
Based on bug 1363240 and bug 1352021 it would be useful to keep around content processes for a short while and reuse them if we need one. Not sure about the details yet, but ideally we should do this only for content processes with small memory footprints or ones with really short life span which indicates that it was probably created accidentally under some rare circumstances.
(Assignee)

Updated

a month ago
Blocks: 1363240
(Assignee)

Comment 1

a month ago
Created attachment 8869424 [details] [diff] [review]
Recycle content processes. v1
Assignee: nobody → gkrizsanits
Attachment #8869424 - Flags: review?(mrbkap)

Updated

a month ago
Blocks: 1365541
Comment on attachment 8869424 [details] [diff] [review]
Recycle content processes. v1

Review of attachment 8869424 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +1859,5 @@
> +  if (!sBrowserContentParents ||
> +      !IsAvailable() ||
> +      !mRemoteType.EqualsLiteral(DEFAULT_REMOTE_TYPE) ||
> +      !PreallocatedProcessManager::Provide(this) ||
> +      (TimeStamp::Now() - mActivateTS).ToSeconds() > kMaxLifeSpan) {

It looks like the timestamp check should happen before the PreallocatedProcessManager::Provide. Otherwise, we might end up with it being the mPreallocatedProcess *and* with it in the list.

It'd be really nice to be able to replace this time check with some sort of "memory health" check (memory usage + fragmentation) or something eventually.
Attachment #8869424 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 3

a month ago
(In reply to Blake Kaplan (:mrbkap) from comment #2)
> It looks like the timestamp check should happen before the
> PreallocatedProcessManager::Provide. Otherwise, we might end up with it
> being the mPreallocatedProcess *and* with it in the list.

Uhh... sorry about that, and thanks for spotting it!

> 
> It'd be really nice to be able to replace this time check with some sort of
> "memory health" check (memory usage + fragmentation) or something eventually.

I agree and I've added a comment about that. I think we should also base our process selecting algorithm on that number.
(Assignee)

Comment 4

a month ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1080b63a2928695f62ca553804a3db0028d8b8df
(Assignee)

Comment 5

a month ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dcd305b56d2115dc578e48a70aefad89ffb463e&selectedJob=100892442
We looked at this via bug 1365541 comment 1 during regression triage today. Guessing it stalled on some try issues?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 7

23 days ago
(In reply to David Bolter [:davidb] from comment #6)
> We looked at this via bug 1365541 comment 1 during regression triage today.
> Guessing it stalled on some try issues?

Yeah. I fixed most of the problems but there is still a rare startup hang that happens on windows: https://treeherder.mozilla.org/#/jobs?repo=try&revision=703b9d9fe5375baf9518687a257bc9d61ede1be5

I'm trying to reproduce it locally currently or understand where it might come from.
Flags: needinfo?(gkrizsanits)
I wanted to call out that code freeze for 55 Nightly starts on June 5th. I can give RelMan a head's up if we think we need a day or two; we just don't want to crash land this right before merge day on June 12th so the earlier the better. Thank you!!
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 9

18 days ago
(In reply to Erin Lancaster [:elan] from comment #8)
> I wanted to call out that code freeze for 55 Nightly starts on June 5th. I
> can give RelMan a head's up if we think we need a day or two; we just don't
> want to crash land this right before merge day on June 12th so the earlier
> the better. Thank you!!

I couldn't finish the patch in time, but I don't think it's a big deal to target 56 with this patch or uplift it later, if we think it's really necessary.
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 10

16 days ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=56c33757fb5a
(Assignee)

Comment 11

16 days ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2abe66414154
(Assignee)

Comment 12

16 days ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d5cde5f671e4
(Assignee)

Comment 13

16 days ago
I think I've got it finally:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a96ef225aa5df289bfa283b2d79705921795a525&selectedJob=105512254
(Assignee)

Comment 14

15 days ago
Created attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Since I did a few significant changes in the patch could you take a second look at it? Sorry about it, I should have made sure first that the patch is passing all the tests on windows too before asking for a review...

The patch is now makes sure that we reuse a process only if it's not about to get shut down.

Also since provide() might be called both from NotifyTabDestroying() and NotifyTabDestroyed() with the same content parent, I make sure we don't send a shut down message after the second call.

Finally I removed the |sBrowserContentParents| check from TryToRecycle(), because I run into a situation where for the first call of TryToRecycle() from NotifyTabDestroying() we remove the content parent from |sBrowserContentParents| and if it was the only content process than we destroy |sBrowserContentParents| and then shortly after for the second call from NotifyTabDestroyed() because of that check we returned false and sent the shut down message.
Attachment #8869424 - Attachment is obsolete: true
Attachment #8876070 - Flags: review?(mrbkap)

Updated

11 days ago
Attachment #8876070 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 15

10 days ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/ae3fb82c558209bc23f45bfc67ea75b21dfb40a4
Bug 1364849 - Recycle short lived content processes. r=mrbkap
https://hg.mozilla.org/mozilla-central/rev/ae3fb82c5582
Status: NEW → RESOLVED
Last Resolved: 9 days ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
A bunch of improvements have landed! Thank you

== Change summary for alert #7301 (as of June 15 2017 00:10 UTC) ==

Improvements:

 12%  damp summary windows7-32 pgo e10s     226.13 -> 198.81
  8%  damp summary windows10-64 opt e10s    265.88 -> 244.80
  5%  tart summary windows7-32 pgo e10s     5.33 -> 5.06
  4%  tart summary windows7-32 opt e10s     7.50 -> 7.17
  4%  damp summary windows7-32 opt e10s     287.24 -> 277.04

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7301
Can this be uplifted to 55 to help with bug 1363240 and bug 1365541?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 19

8 days ago
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Approval Request Comment
[Feature/Bug causing the regression]: Not a regression.

[User impact if declined]: In various cases we create a content process and then quickly destroy it, which can cause sometimes user perceivable performance regression. In all cases it's a waste of resources. This patch is about reusing these processes instead of throwing them away.

[Is this code covered by automated tests?]: It's quite thoroughly tested by our tests.

[Has the fix been verified in Nightly?]: Yes.

[Needs manual test from QE? If yes, steps to reproduce]: No.
 
[List of other uplifts needed for the feature/fix]: None.

[Is the change risky?]: After all the tests it went through at this point I think it's more risky not to have it than to have it.

[Why is the change risky/not risky?]: I had to fix a lot of unexpected test failures. They all came from the same problem: I tried to cache a process that is going away and then reuse it later while it's being shut down. I think I've caught all cases, but if I missed an edge case it could cause some troubles. I see very little chance for this. But the increased content process crash rates would give it away. IF it happened it would be a harmless crash from a security point of view.

[String changes made/needed]: None.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta?
Let's watch this on nightly for another day and uplift if things look ok, for the beta 3 build on Monday.
status-firefox55: --- → affected
tracking-firefox55: --- → +
status-firefox54: --- → unaffected
status-firefox-esr52: --- → unaffected
I don't see any obvious problems from crash-stats. Let's go ahead and uplift this as we are still in early beta.
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Fix for some content process related regressions, let's take this for beta 4.
Attachment #8876070 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment hidden (obsolete)
I had to back this out for causing mochitest-bc failures, mostly of the browser_thumbnails_bg_crash_during_capture.js variety.
https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-beta

Also some browser_637020.js failures like:
https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-beta

https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e
status-firefox55: fixed → affected
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 25

3 days ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> I had to back this out for causing mochitest-bc failures, mostly of the
> browser_thumbnails_bg_crash_during_capture.js variety.
> https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-
> beta
> 
> Also some browser_637020.js failures like:
> https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-
> beta
> 
> https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e

Only on beta or on mc as well?
Flags: needinfo?(gkrizsanits) → needinfo?(ryanvm)
Only Beta AFAICT.
Flags: needinfo?(ryanvm)
You need to log in before you can comment on or make changes to this bug.