Short lived content processes should be reused.

RESOLVED FIXED in Firefox 58

Status

()

defect
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: gkrizsanits, Assigned: gkrizsanits)

Tracking

unspecified
mozilla56
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55+ wontfix, firefox56 wontfix, firefox57 wontfix, firefox58 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Assignee

Description

2 years ago
Based on bug 1363240 and bug 1352021 it would be useful to keep around content processes for a short while and reuse them if we need one. Not sure about the details yet, but ideally we should do this only for content processes with small memory footprints or ones with really short life span which indicates that it was probably created accidentally under some rare circumstances.
Assignee

Updated

2 years ago
Blocks: 1363240
Assignee

Comment 1

2 years ago
Posted patch Recycle content processes. v1 (obsolete) — Splinter Review
Assignee: nobody → gkrizsanits
Attachment #8869424 - Flags: review?(mrbkap)
Blocks: 1365541
Comment on attachment 8869424 [details] [diff] [review]
Recycle content processes. v1

Review of attachment 8869424 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +1859,5 @@
> +  if (!sBrowserContentParents ||
> +      !IsAvailable() ||
> +      !mRemoteType.EqualsLiteral(DEFAULT_REMOTE_TYPE) ||
> +      !PreallocatedProcessManager::Provide(this) ||
> +      (TimeStamp::Now() - mActivateTS).ToSeconds() > kMaxLifeSpan) {

It looks like the timestamp check should happen before the PreallocatedProcessManager::Provide. Otherwise, we might end up with it being the mPreallocatedProcess *and* with it in the list.

It'd be really nice to be able to replace this time check with some sort of "memory health" check (memory usage + fragmentation) or something eventually.
Attachment #8869424 - Flags: review?(mrbkap) → review+
Assignee

Comment 3

2 years ago
(In reply to Blake Kaplan (:mrbkap) from comment #2)
> It looks like the timestamp check should happen before the
> PreallocatedProcessManager::Provide. Otherwise, we might end up with it
> being the mPreallocatedProcess *and* with it in the list.

Uhh... sorry about that, and thanks for spotting it!

> 
> It'd be really nice to be able to replace this time check with some sort of
> "memory health" check (memory usage + fragmentation) or something eventually.

I agree and I've added a comment about that. I think we should also base our process selecting algorithm on that number.
We looked at this via bug 1365541 comment 1 during regression triage today. Guessing it stalled on some try issues?
Flags: needinfo?(gkrizsanits)
Assignee

Comment 7

2 years ago
(In reply to David Bolter [:davidb] from comment #6)
> We looked at this via bug 1365541 comment 1 during regression triage today.
> Guessing it stalled on some try issues?

Yeah. I fixed most of the problems but there is still a rare startup hang that happens on windows: https://treeherder.mozilla.org/#/jobs?repo=try&revision=703b9d9fe5375baf9518687a257bc9d61ede1be5

I'm trying to reproduce it locally currently or understand where it might come from.
Flags: needinfo?(gkrizsanits)
I wanted to call out that code freeze for 55 Nightly starts on June 5th. I can give RelMan a head's up if we think we need a day or two; we just don't want to crash land this right before merge day on June 12th so the earlier the better. Thank you!!
Flags: needinfo?(gkrizsanits)
Assignee

Comment 9

2 years ago
(In reply to Erin Lancaster [:elan] from comment #8)
> I wanted to call out that code freeze for 55 Nightly starts on June 5th. I
> can give RelMan a head's up if we think we need a day or two; we just don't
> want to crash land this right before merge day on June 12th so the earlier
> the better. Thank you!!

I couldn't finish the patch in time, but I don't think it's a big deal to target 56 with this patch or uplift it later, if we think it's really necessary.
Flags: needinfo?(gkrizsanits)
Assignee

Comment 14

2 years ago
Since I did a few significant changes in the patch could you take a second look at it? Sorry about it, I should have made sure first that the patch is passing all the tests on windows too before asking for a review...

The patch is now makes sure that we reuse a process only if it's not about to get shut down.

Also since provide() might be called both from NotifyTabDestroying() and NotifyTabDestroyed() with the same content parent, I make sure we don't send a shut down message after the second call.

Finally I removed the |sBrowserContentParents| check from TryToRecycle(), because I run into a situation where for the first call of TryToRecycle() from NotifyTabDestroying() we remove the content parent from |sBrowserContentParents| and if it was the only content process than we destroy |sBrowserContentParents| and then shortly after for the second call from NotifyTabDestroyed() because of that check we returned false and sent the shut down message.
Attachment #8869424 - Attachment is obsolete: true
Attachment #8876070 - Flags: review?(mrbkap)
Attachment #8876070 - Flags: review?(mrbkap) → review+
https://hg.mozilla.org/mozilla-central/rev/ae3fb82c5582
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
A bunch of improvements have landed! Thank you

== Change summary for alert #7301 (as of June 15 2017 00:10 UTC) ==

Improvements:

 12%  damp summary windows7-32 pgo e10s     226.13 -> 198.81
  8%  damp summary windows10-64 opt e10s    265.88 -> 244.80
  5%  tart summary windows7-32 pgo e10s     5.33 -> 5.06
  4%  tart summary windows7-32 opt e10s     7.50 -> 7.17
  4%  damp summary windows7-32 opt e10s     287.24 -> 277.04

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7301
Can this be uplifted to 55 to help with bug 1363240 and bug 1365541?
Flags: needinfo?(gkrizsanits)
Assignee

Comment 19

2 years ago
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Approval Request Comment
[Feature/Bug causing the regression]: Not a regression.

[User impact if declined]: In various cases we create a content process and then quickly destroy it, which can cause sometimes user perceivable performance regression. In all cases it's a waste of resources. This patch is about reusing these processes instead of throwing them away.

[Is this code covered by automated tests?]: It's quite thoroughly tested by our tests.

[Has the fix been verified in Nightly?]: Yes.

[Needs manual test from QE? If yes, steps to reproduce]: No.
 
[List of other uplifts needed for the feature/fix]: None.

[Is the change risky?]: After all the tests it went through at this point I think it's more risky not to have it than to have it.

[Why is the change risky/not risky?]: I had to fix a lot of unexpected test failures. They all came from the same problem: I tried to cache a process that is going away and then reuse it later while it's being shut down. I think I've caught all cases, but if I missed an edge case it could cause some troubles. I see very little chance for this. But the increased content process crash rates would give it away. IF it happened it would be a harmless crash from a security point of view.

[String changes made/needed]: None.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta?
Let's watch this on nightly for another day and uplift if things look ok, for the beta 3 build on Monday.
I don't see any obvious problems from crash-stats. Let's go ahead and uplift this as we are still in early beta.
Comment on attachment 8876070 [details] [diff] [review]
Recycle content processes. v2

Fix for some content process related regressions, let's take this for beta 4.
Attachment #8876070 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I had to back this out for causing mochitest-bc failures, mostly of the browser_thumbnails_bg_crash_during_capture.js variety.
https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-beta

Also some browser_637020.js failures like:
https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-beta

https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e
Flags: needinfo?(gkrizsanits)
Assignee

Comment 25

2 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
> I had to back this out for causing mochitest-bc failures, mostly of the
> browser_thumbnails_bg_crash_during_capture.js variety.
> https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-
> beta
> 
> Also some browser_637020.js failures like:
> https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-
> beta
> 
> https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e

Only on beta or on mc as well?
Flags: needinfo?(gkrizsanits) → needinfo?(ryanvm)
Only Beta AFAICT.
Flags: needinfo?(ryanvm)
Any updates here?  Or should we give up on getting this in 55?
Flags: needinfo?(gkrizsanits)
Assignee

Comment 28

2 years ago
(In reply to Julien Cristau [:jcristau] from comment #27)
> Any updates here?  Or should we give up on getting this in 55?

I already gave up on it sorry. Didn't have the time to figure out what's going on, and it's already a bit too late to uplift it.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta+
Assignee

Comment 29

2 years ago
Currently the preallocated process manager had to be turned off on all channels, so this patch is not active right now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Comment 30

2 years ago
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #29)
> Currently the preallocated process manager had to be turned off on all
> channels, so this patch is not active right now.

Because of which bug?
Assignee

Updated

2 years ago
Depends on: 1385249

Updated

2 years ago
Priority: -- → P1
Assignee

Comment 32

2 years ago
Turning back on the ppm took care of this as well.
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago2 years ago
Resolution: --- → FIXED
Assignee

Updated

2 years ago
Duplicate of this bug: 1293284

Comment 34

2 years ago
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #32)
> Turning back on the ppm took care of this as well.

So I notice this is "RESOLVED FIXED in Firefox 56" but Bug 1385249 only re-enabled PPM only for 58.
So Bug 1385249 needs to be uplifted to the stable channel.
Assignee

Comment 35

2 years ago
(In reply to avada from comment #34)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #32)
> > Turning back on the ppm took care of this as well.
> 
> So I notice this is "RESOLVED FIXED in Firefox 56" but Bug 1385249 only
> re-enabled PPM only for 58.
> So Bug 1385249 needs to be uplifted to the stable channel.

Oh, you're totally right, I did not fix the tracking flags after reopening and closing again. PPM will not be uplifted to 57 as it is considered to be too risky to disrupt the 57 release.
You need to log in before you can comment on or make changes to this bug.