Short lived content processes should be reused.

REOPENED
Assigned to

Status

()

Core
DOM: Content Processes
REOPENED
3 months ago
23 days ago

People

(Reporter: krizsa, Assigned: krizsa)

Tracking

(Depends on: 1 bug)

unspecified
mozilla56
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55+ wontfix, firefox56 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

3 months ago
Based on bug 1363240 and bug 1352021 it would be useful to keep around content processes for a short while and reuse them if we need one. Not sure about the details yet, but ideally we should do this only for content processes with small memory footprints or ones with really short life span which indicates that it was probably created accidentally under some rare circumstances.
(Assignee)

Updated

3 months ago
Blocks: 1363240
(Assignee)

Comment 1

3 months ago
Created attachment 8869424 [details] [diff] [review]
Recycle content processes. v1
Assignee: nobody → gkrizsanits
Attachment #8869424 - Flags: review?(mrbkap)

Updated

3 months ago
Blocks: 1365541
Comment on attachment 8869424 [details] [diff] [review]
Recycle content processes. v1

Review of attachment 8869424 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +1859,5 @@
> +  if (!sBrowserContentParents ||
> +      !IsAvailable() ||
> +      !mRemoteType.EqualsLiteral(DEFAULT_REMOTE_TYPE) ||
> +      !PreallocatedProcessManager::Provide(this) ||
> +      (TimeStamp::Now() - mActivateTS).ToSeconds() > kMaxLifeSpan) {

It looks like the timestamp check should happen before the PreallocatedProcessManager::Provide. Otherwise, we might end up with it being the mPreallocatedProcess *and* with it in the list.

It'd be really nice to be able to replace this time check with some sort of "memory health" check (memory usage + fragmentation) or something eventually.
Attachment #8869424 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 3

3 months ago
(In reply to Blake Kaplan (:mrbkap) from comment #2)
> It looks like the timestamp check should happen before the
> PreallocatedProcessManager::Provide. Otherwise, we might end up with it
> being the mPreallocatedProcess *and* with it in the list.

Uhh... sorry about that, and thanks for spotting it!

> 
> It'd be really nice to be able to replace this time check with some sort of
> "memory health" check (memory usage + fragmentation) or something eventually.

I agree and I've added a comment about that. I think we should also base our process selecting algorithm on that number.
(Assignee)

Comment 4

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1080b63a2928695f62ca553804a3db0028d8b8df
(Assignee)

Comment 5

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dcd305b56d2115dc578e48a70aefad89ffb463e&selectedJob=100892442
We looked at this via bug 1365541 comment 1 during regression triage today. Guessing it stalled on some try issues?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 7

3 months ago
(In reply to David Bolter [:davidb] from comment #6)
> We looked at this via bug 1365541 comment 1 during regression triage today.
> Guessing it stalled on some try issues?

Yeah. I fixed most of the problems but there is still a rare startup hang that happens on windows: https://treeherder.mozilla.org/#/jobs?repo=try&revision=703b9d9fe5375baf9518687a257bc9d61ede1be5

I'm trying to reproduce it locally currently or understand where it might come from.
Flags: needinfo?(gkrizsanits)
I wanted to call out that code freeze for 55 Nightly starts on June 5th. I can give RelMan a head's up if we think we need a day or two; we just don't want to crash land this right before merge day on June 12th so the earlier the better. Thank you!!
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 9

3 months ago
(In reply to Erin Lancaster [:elan] from comment #8)
> I wanted to call out that code freeze for 55 Nightly starts on June 5th. I
> can give RelMan a head's up if we think we need a day or two; we just don't
> want to crash land this right before merge day on June 12th so the earlier
> the better. Thank you!!

I couldn't finish the patch in time, but I don't think it's a big deal to target 56 with this patch or uplift it later, if we think it's really necessary.
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 10

2 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=56c33757fb5a
(Assignee)

Comment 11

2 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2abe66414154
(Assignee)

Comment 12

2 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d5cde5f671e4
(Assignee)

Comment 13

2 months ago
I think I've got it finally:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a96ef225aa5df289bfa283b2d79705921795a525&selectedJob=105512254
(Assignee)

Comment 14

2 months ago
Created attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Since I did a few significant changes in the patch could you take a second look at it? Sorry about it, I should have made sure first that the patch is passing all the tests on windows too before asking for a review...

The patch is now makes sure that we reuse a process only if it's not about to get shut down.

Also since provide() might be called both from NotifyTabDestroying() and NotifyTabDestroyed() with the same content parent, I make sure we don't send a shut down message after the second call.

Finally I removed the |sBrowserContentParents| check from TryToRecycle(), because I run into a situation where for the first call of TryToRecycle() from NotifyTabDestroying() we remove the content parent from |sBrowserContentParents| and if it was the only content process than we destroy |sBrowserContentParents| and then shortly after for the second call from NotifyTabDestroyed() because of that check we returned false and sent the shut down message.
Attachment #8869424 - Attachment is obsolete: true
Attachment #8876070 - Flags: review?(mrbkap)

Updated

2 months ago
Attachment #8876070 - Flags: review?(mrbkap) → review+
(Assignee)

Comment 15

2 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/ae3fb82c558209bc23f45bfc67ea75b21dfb40a4
Bug 1364849 - Recycle short lived content processes. r=mrbkap
https://hg.mozilla.org/mozilla-central/rev/ae3fb82c5582
Status: NEW → RESOLVED
Last Resolved: 2 months ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
A bunch of improvements have landed! Thank you

== Change summary for alert #7301 (as of June 15 2017 00:10 UTC) ==

Improvements:

 12%  damp summary windows7-32 pgo e10s     226.13 -> 198.81
  8%  damp summary windows10-64 opt e10s    265.88 -> 244.80
  5%  tart summary windows7-32 pgo e10s     5.33 -> 5.06
  4%  tart summary windows7-32 opt e10s     7.50 -> 7.17
  4%  damp summary windows7-32 opt e10s     287.24 -> 277.04

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7301
Can this be uplifted to 55 to help with bug 1363240 and bug 1365541?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 19

2 months ago
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Approval Request Comment
[Feature/Bug causing the regression]: Not a regression.

[User impact if declined]: In various cases we create a content process and then quickly destroy it, which can cause sometimes user perceivable performance regression. In all cases it's a waste of resources. This patch is about reusing these processes instead of throwing them away.

[Is this code covered by automated tests?]: It's quite thoroughly tested by our tests.

[Has the fix been verified in Nightly?]: Yes.

[Needs manual test from QE? If yes, steps to reproduce]: No.
 
[List of other uplifts needed for the feature/fix]: None.

[Is the change risky?]: After all the tests it went through at this point I think it's more risky not to have it than to have it.

[Why is the change risky/not risky?]: I had to fix a lot of unexpected test failures. They all came from the same problem: I tried to cache a process that is going away and then reuse it later while it's being shut down. I think I've caught all cases, but if I missed an edge case it could cause some troubles. I see very little chance for this. But the increased content process crash rates would give it away. IF it happened it would be a harmless crash from a security point of view.

[String changes made/needed]: None.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta?
Let's watch this on nightly for another day and uplift if things look ok, for the beta 3 build on Monday.
status-firefox55: --- → affected
tracking-firefox55: --- → +
status-firefox54: --- → unaffected
status-firefox-esr52: --- → unaffected
I don't see any obvious problems from crash-stats. Let's go ahead and uplift this as we are still in early beta.
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Fix for some content process related regressions, let's take this for beta 4.
Attachment #8876070 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment hidden (obsolete)
I had to back this out for causing mochitest-bc failures, mostly of the browser_thumbnails_bg_crash_during_capture.js variety.
https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-beta

Also some browser_637020.js failures like:
https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-beta

https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e
status-firefox55: fixed → affected
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 25

2 months ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> I had to back this out for causing mochitest-bc failures, mostly of the
> browser_thumbnails_bg_crash_during_capture.js variety.
> https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-
> beta
> 
> Also some browser_637020.js failures like:
> https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-
> beta
> 
> https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e

Only on beta or on mc as well?
Flags: needinfo?(gkrizsanits) → needinfo?(ryanvm)
Only Beta AFAICT.
Flags: needinfo?(ryanvm)
Any updates here?  Or should we give up on getting this in 55?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 28

2 months ago
(In reply to Julien Cristau [:jcristau] from comment #27)
> Any updates here?  Or should we give up on getting this in 55?

I already gave up on it sorry. Didn't have the time to figure out what's going on, and it's already a bit too late to uplift it.
Flags: needinfo?(gkrizsanits)
status-firefox55: affected → wontfix
Attachment #8876070 - Flags: approval-mozilla-beta+
(Assignee)

Comment 29

26 days ago
Currently the preallocated process manager had to be turned off on all channels, so this patch is not active right now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 30

26 days ago
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #29)
> Currently the preallocated process manager had to be turned off on all
> channels, so this patch is not active right now.

Because of which bug?
because of bug 1363601
(Assignee)

Updated

23 days ago
Depends on: 1385249
You need to log in before you can comment on or make changes to this bug.