Closed Bug 1364849 Opened 8 years ago Closed 7 years ago

Short lived content processes should be reused.

Categories

(Core :: DOM: Content Processes, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 + wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- fixed

People

(Reporter: gkrizsanits, Assigned: gkrizsanits)

References

Details

Attachments

(1 file, 1 obsolete file)

Based on bug 1363240 and bug 1352021 it would be useful to keep around content processes for a short while and reuse them if we need one. Not sure about the details yet, but ideally we should do this only for content processes with small memory footprints or ones with really short life span which indicates that it was probably created accidentally under some rare circumstances.
Blocks: 1363240
Attached patch Recycle content processes. v1 (obsolete) — Splinter Review
Assignee: nobody → gkrizsanits
Attachment #8869424 - Flags: review?(mrbkap)
Blocks: 1365541
Comment on attachment 8869424 [details] [diff] [review] Recycle content processes. v1 Review of attachment 8869424 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/ipc/ContentParent.cpp @@ +1859,5 @@ > + if (!sBrowserContentParents || > + !IsAvailable() || > + !mRemoteType.EqualsLiteral(DEFAULT_REMOTE_TYPE) || > + !PreallocatedProcessManager::Provide(this) || > + (TimeStamp::Now() - mActivateTS).ToSeconds() > kMaxLifeSpan) { It looks like the timestamp check should happen before the PreallocatedProcessManager::Provide. Otherwise, we might end up with it being the mPreallocatedProcess *and* with it in the list. It'd be really nice to be able to replace this time check with some sort of "memory health" check (memory usage + fragmentation) or something eventually.
Attachment #8869424 - Flags: review?(mrbkap) → review+
(In reply to Blake Kaplan (:mrbkap) from comment #2) > It looks like the timestamp check should happen before the > PreallocatedProcessManager::Provide. Otherwise, we might end up with it > being the mPreallocatedProcess *and* with it in the list. Uhh... sorry about that, and thanks for spotting it! > > It'd be really nice to be able to replace this time check with some sort of > "memory health" check (memory usage + fragmentation) or something eventually. I agree and I've added a comment about that. I think we should also base our process selecting algorithm on that number.
We looked at this via bug 1365541 comment 1 during regression triage today. Guessing it stalled on some try issues?
Flags: needinfo?(gkrizsanits)
(In reply to David Bolter [:davidb] from comment #6) > We looked at this via bug 1365541 comment 1 during regression triage today. > Guessing it stalled on some try issues? Yeah. I fixed most of the problems but there is still a rare startup hang that happens on windows: https://treeherder.mozilla.org/#/jobs?repo=try&revision=703b9d9fe5375baf9518687a257bc9d61ede1be5 I'm trying to reproduce it locally currently or understand where it might come from.
Flags: needinfo?(gkrizsanits)
I wanted to call out that code freeze for 55 Nightly starts on June 5th. I can give RelMan a head's up if we think we need a day or two; we just don't want to crash land this right before merge day on June 12th so the earlier the better. Thank you!!
Flags: needinfo?(gkrizsanits)
(In reply to Erin Lancaster [:elan] from comment #8) > I wanted to call out that code freeze for 55 Nightly starts on June 5th. I > can give RelMan a head's up if we think we need a day or two; we just don't > want to crash land this right before merge day on June 12th so the earlier > the better. Thank you!! I couldn't finish the patch in time, but I don't think it's a big deal to target 56 with this patch or uplift it later, if we think it's really necessary.
Flags: needinfo?(gkrizsanits)
Since I did a few significant changes in the patch could you take a second look at it? Sorry about it, I should have made sure first that the patch is passing all the tests on windows too before asking for a review... The patch is now makes sure that we reuse a process only if it's not about to get shut down. Also since provide() might be called both from NotifyTabDestroying() and NotifyTabDestroyed() with the same content parent, I make sure we don't send a shut down message after the second call. Finally I removed the |sBrowserContentParents| check from TryToRecycle(), because I run into a situation where for the first call of TryToRecycle() from NotifyTabDestroying() we remove the content parent from |sBrowserContentParents| and if it was the only content process than we destroy |sBrowserContentParents| and then shortly after for the second call from NotifyTabDestroyed() because of that check we returned false and sent the shut down message.
Attachment #8869424 - Attachment is obsolete: true
Attachment #8876070 - Flags: review?(mrbkap)
Attachment #8876070 - Flags: review?(mrbkap) → review+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
A bunch of improvements have landed! Thank you == Change summary for alert #7301 (as of June 15 2017 00:10 UTC) == Improvements: 12% damp summary windows7-32 pgo e10s 226.13 -> 198.81 8% damp summary windows10-64 opt e10s 265.88 -> 244.80 5% tart summary windows7-32 pgo e10s 5.33 -> 5.06 4% tart summary windows7-32 opt e10s 7.50 -> 7.17 4% damp summary windows7-32 opt e10s 287.24 -> 277.04 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=7301
Can this be uplifted to 55 to help with bug 1363240 and bug 1365541?
Flags: needinfo?(gkrizsanits)
Comment on attachment 8876070 [details] [diff] [review] Recycle content processes. v2 Approval Request Comment [Feature/Bug causing the regression]: Not a regression. [User impact if declined]: In various cases we create a content process and then quickly destroy it, which can cause sometimes user perceivable performance regression. In all cases it's a waste of resources. This patch is about reusing these processes instead of throwing them away. [Is this code covered by automated tests?]: It's quite thoroughly tested by our tests. [Has the fix been verified in Nightly?]: Yes. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: After all the tests it went through at this point I think it's more risky not to have it than to have it. [Why is the change risky/not risky?]: I had to fix a lot of unexpected test failures. They all came from the same problem: I tried to cache a process that is going away and then reuse it later while it's being shut down. I think I've caught all cases, but if I missed an edge case it could cause some troubles. I see very little chance for this. But the increased content process crash rates would give it away. IF it happened it would be a harmless crash from a security point of view. [String changes made/needed]: None.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta?
Let's watch this on nightly for another day and uplift if things look ok, for the beta 3 build on Monday.
I don't see any obvious problems from crash-stats. Let's go ahead and uplift this as we are still in early beta.
Comment on attachment 8876070 [details] [diff] [review] Recycle content processes. v2 Fix for some content process related regressions, let's take this for beta 4.
Attachment #8876070 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I had to back this out for causing mochitest-bc failures, mostly of the browser_thumbnails_bg_crash_during_capture.js variety. https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla-beta Also some browser_637020.js failures like: https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla-beta https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e
Flags: needinfo?(gkrizsanits)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24) > I had to back this out for causing mochitest-bc failures, mostly of the > browser_thumbnails_bg_crash_during_capture.js variety. > https://treeherder.mozilla.org/logviewer.html#?job_id=108637094&repo=mozilla- > beta > > Also some browser_637020.js failures like: > https://treeherder.mozilla.org/logviewer.html#?job_id=108633617&repo=mozilla- > beta > > https://hg.mozilla.org/releases/mozilla-beta/rev/cd547b10e59e Only on beta or on mc as well?
Flags: needinfo?(gkrizsanits) → needinfo?(ryanvm)
Only Beta AFAICT.
Flags: needinfo?(ryanvm)
Any updates here? Or should we give up on getting this in 55?
Flags: needinfo?(gkrizsanits)
(In reply to Julien Cristau [:jcristau] from comment #27) > Any updates here? Or should we give up on getting this in 55? I already gave up on it sorry. Didn't have the time to figure out what's going on, and it's already a bit too late to uplift it.
Flags: needinfo?(gkrizsanits)
Attachment #8876070 - Flags: approval-mozilla-beta+
Currently the preallocated process manager had to be turned off on all channels, so this patch is not active right now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #29) > Currently the preallocated process manager had to be turned off on all > channels, so this patch is not active right now. Because of which bug?
Depends on: 1385249
Priority: -- → P1
Turning back on the ppm took care of this as well.
Status: REOPENED → RESOLVED
Closed: 8 years ago7 years ago
Resolution: --- → FIXED
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #32) > Turning back on the ppm took care of this as well. So I notice this is "RESOLVED FIXED in Firefox 56" but Bug 1385249 only re-enabled PPM only for 58. So Bug 1385249 needs to be uplifted to the stable channel.
(In reply to avada from comment #34) > (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #32) > > Turning back on the ppm took care of this as well. > > So I notice this is "RESOLVED FIXED in Firefox 56" but Bug 1385249 only > re-enabled PPM only for 58. > So Bug 1385249 needs to be uplifted to the stable channel. Oh, you're totally right, I did not fix the tracking flags after reopening and closing again. PPM will not be uplifted to 57 as it is considered to be too risky to disrupt the 57 release.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: