Closed Bug 1333799 Opened 8 years ago Closed 7 years ago

Implement a deterministic process selection based on tab count

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Iteration:
55.1 - Mar 20
Tracking Status
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: gkrizsanits, Assigned: gkrizsanits)

References

Details

(Whiteboard: [e10s-multi:+])

Attachments

(1 file, 2 obsolete files)

Currently we pick a process randomly if we're at max process count (module the opener check), round robin seems like a more fair and more deterministic strategy.
Thanks for filing.  Deterministic would definitely be better for test stability and reproducing bugs, etc.
Depends on: 1324428
Whiteboard: [e10s-multi:?]
Assignee: nobody → mrbkap
Whiteboard: [e10s-multi:?] → [e10s-multi:+]
Is round robin really that fair?  After opening a closing a few tabs you're bound to having some processes with more and some with fewer tabs running in them.  Perhaps as a rough measure we can start by picking the process with the fewest number of tabs open?
Status: NEW → ASSIGNED
Flags: needinfo?(gkrizsanits)
(In reply to :Ehsan Akhgari from comment #2)
> Is round robin really that fair?  After opening a closing a few tabs you're
> bound to having some processes with more and some with fewer tabs running in
> them.  Perhaps as a rough measure we can start by picking the process with
> the fewest number of tabs open?

I'm perfectly OK with your suggestion, in fact it sounds like a better approach. We just assumed we have Round-Robin in place instead of this randomness, which would have been good enough for now, that's why I filed this bug. Blake is working on this so I let him decide, but probably the tab counting is better and sounds straightforward.
Flags: needinfo?(gkrizsanits) → needinfo?(mrbkap)
Sorry if I'm noising up the bug, but would an optimal long term solution be to choose based on medium-term average CPU usage?  (I understand this particular bug is not necessarily considered long term.)

One of the advantages of multiprocess is preventing CPU intensive tabs from interfering with other tabs, so it seems like distributing tabs based on CPU usage to some extent would be desirable.
(In reply to Caspy7 from comment #4)
> Sorry if I'm noising up the bug, but would an optimal long term solution be
> to choose based on medium-term average CPU usage?  (I understand this
> particular bug is not necessarily considered long term.)
> 
> One of the advantages of multiprocess is preventing CPU intensive tabs from
> interfering with other tabs, so it seems like distributing tabs based on CPU
> usage to some extent would be desirable.

Some of the changes we're working on as part of Quantum (for example <https://wiki.mozilla.org/Quantum/DOM>) may end up vastly change how well we can treat content running in the background that consumes a lot of CPU.  We've been basically assuming that we can never get to a perfect distribution of tabs in terms of their processing needs in each content process, so we are going to instead use measures such cooperative scheduling to try to do a better job at dealing with various types of workloads in one content process.

We should wait and evaluate once some of that work is in place.  My hope is that we can ultimately get away with a simple heuristic.  Monitoring CPU usage has overhead, can be error prone, etc.
Can we please just land something *deterministic* for now?  Yes, we can do all kind of great optimizations in the future, but right now completely random behavior is bad for user experience, diagnosing test failures, etc.  All of those things can be added in the future.  This bug, however, was filed just to deal with the randomness issue.   Please file your optimization ideas, pluggable behavior, etc as follow-on bugs.
I talked to Blake about this and he also likes the tab count based version better.
Assignee: mrbkap → gkrizsanits
Flags: needinfo?(mrbkap)
Summary: Implement a round-robin algorithm for process selection → Implement a deterministic process selection based on tab count
Attached patch mintab selector v1 (obsolete) — Splinter Review
I tried to keep it simple as possible. We still fall back to the default C++ implemented random selector if something goes wrong, I'm not sure it's worth the time to fix that, it should normally never happen.
Attachment #8842463 - Flags: review?(mrbkap)
Comment on attachment 8842463 [details] [diff] [review]
mintab selector v1

Review of attachment 8842463 [details] [diff] [review]:
-----------------------------------------------------------------

r- for the XPCOM comments. Looks good otherwise.

::: dom/base/ProcessSelector.js
@@ +73,5 @@
> +      return Ci.nsIContentProcessProvider.NEW_PROCESS;
> +    }
> +
> +    let min = Number.MAX_VALUE;
> +    let candidate = -1;

This could start as `Ci.nsIContentProcessProvider.NEW_PROCESS` and then you could simply `return candidate` at the end without the ?: expression.

::: dom/base/ProcessSelector.manifest
@@ +1,2 @@
>  component {c616fcfd-9737-41f1-aa74-cee72a38f91b} ProcessSelector.js
> +contract @mozilla.org/ipc/processselector/random;1 {c616fcfd-9737-41f1-aa74-cee72a38f91b}

This line should go.

::: dom/ipc/ContentParent.cpp
@@ +810,5 @@
>        infos.AppendElement(cp->mScriptableHelper);
>      }
>  
>      nsCOMPtr<nsIContentProcessProvider> cpp =
> +      do_GetService("@mozilla.org/ipc/processselector/mintab;1");

This is the wrong way to do this. If we add a new contractid for every process selector we add, then we'll be unable to change them without modifying the binary.

Instead, leave this as it is and take over processselector;1 with the new process selector. Then, extensions can override it or we can add some code to listen for a pref to swap out which selector we use based on the CID (class ID).
Attachment #8842463 - Flags: review?(mrbkap) → review-
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #8)
> I tried to keep it simple as possible. We still fall back to the default C++
> implemented random selector if something goes wrong, I'm not sure it's worth
> the time to fix that, it should normally never happen.

I don't think we can get rid of the C++ fallback mechanism. I'm paranoid about an extension doing something dumb and causing us to allocated hundreds of processes ignoring the pref. That's the one disadvantage to exposing this stuff to JS :(
(In reply to Blake Kaplan (:mrbkap) from comment #9)
> >      nsCOMPtr<nsIContentProcessProvider> cpp =
> > +      do_GetService("@mozilla.org/ipc/processselector/mintab;1");
> 
> This is the wrong way to do this. If we add a new contractid for every
> process selector we add, then we'll be unable to change them without
> modifying the binary.

My plan was to read the name of the selector from the related pref. Then the
add-on would have to simply register its own selector and change the pref.
|dom.ipc.processSelector = "mintab"|. The benefit is that we can easily check
if someone is using a non-default selector for data collection and that when
I want to change it manually I don't have to look up the CID.

> 
> Instead, leave this as it is and take over processselector;1 with the new
> process selector. Then, extensions can override it or we can add some code
> to listen for a pref to swap out which selector we use based on the CID
> (class ID).

This will work just as well. It's a bit less straightforward when I just
want to switch the selector manually, but probably more COM standard way of
doing it, I'll change my patch.

(In reply to Blake Kaplan (:mrbkap) from comment #10)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #8)
> I don't think we can get rid of the C++ fallback mechanism. I'm paranoid
> about an extension doing something dumb and causing us to allocated hundreds
> of processes ignoring the pref. That's the one disadvantage to exposing this
> stuff to JS :(

I didn't mean I want to get rid of the fallback mechanism, I just didn't want
to add a "mintab" fallback mechanism just leave the random selector there for
this patch.

For the C++ version I planned to keep the processes partially ordered (using a
hashmap instead of an array for them) and then maybe simplify the JS version
by calling it with a partially ordered array. But this only makes sense if
we have a lot of processes, so I rather skipped it as a premature optimization.

I could add a simple C++ version for mintab for the fallback mechanism and then
remove the C++ version of random selector though.
Attached patch mintab selector v2 (obsolete) — Splinter Review
I'm not sure how to switch between multiple strategies we provide this way tbh. But that work should be in a separate bug anyway. It's super simple to change back to the other version if we decide to, so I don't think it's super important right now. Anyway, let me know if this looks like what you had in mind.
Attachment #8842463 - Attachment is obsolete: true
Attachment #8842882 - Flags: review?(mrbkap)
Comment on attachment 8842882 [details] [diff] [review]
mintab selector v2

Review of attachment 8842882 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/ProcessSelector.manifest
@@ -1,1 @@
> -component {c616fcfd-9737-41f1-aa74-cee72a38f91b} ProcessSelector.js

Leave this line in, otherwise the component registry won't know about RandomSelector (if we ever want to use it again).
Attachment #8842882 - Flags: review?(mrbkap) → review+
Comment on attachment 8842882 [details] [diff] [review]
mintab selector v2

Review of attachment 8842882 [details] [diff] [review]:
-----------------------------------------------------------------

Oops, one more comment.

::: dom/base/ProcessSelector.js
@@ +86,5 @@
> +    return candidate;
> +  },
> +};
> +
> +var components = [MinTabSelector];

I think, now, components should be [MinTabSelector, RandomSelector]
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/86801993e71d
MinTabSelector for process selection. r=mrbkap
https://hg.mozilla.org/mozilla-central/rev/86801993e71d
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Iteration: --- → 55.1 - Mar 20
(In reply to Blake Kaplan (:mrbkap) from comment #14)
> 
> I think, now, components should be [MinTabSelector, RandomSelector]

Don't worry it was kind of implicit so I fixed that one as well.

The problem for the test failures by the way were because of I used forEach in the JS version of the selector hence did not respect the currently allowed maximum process count there (this is a problem only if we have more processes available than the allowed). So for tests where we force single content process that caused failures.
Uploading the version that has landed for aurora request, carrying over the r+ from the previous version.
Attachment #8842882 - Attachment is obsolete: true
Attachment #8845497 - Flags: review+
Comment on attachment 8845497 [details] [diff] [review]
mintab selector v3

Approval Request Comment
[Feature/Bug causing the regression]: Not a regression.
[User impact if declined]: We need this for e10s-multi beta experiment for 54.
[Is this code covered by automated tests?]: The whole browser chrome test suite goes over it (+dt), so it's pretty well tested.
[Has the fix been verified in Nightly?]: It's on nightly for a few days.
[Needs manual test from QE? If yes, steps to reproduce]: I did it already.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: Not at all, especially since e10s-multi is not yet enabled on aurora.
[Why is the change risky/not risky?]: It is not taking any effect until e10s-multi is enabled. And there is extremely low chance that if there is any issue with the patch the whole tree does not turn orange.
[String changes made/needed]: None.
Attachment #8845497 - Flags: approval-mozilla-aurora?
Comment on attachment 8845497 [details] [diff] [review]
mintab selector v3

Support for e10s-multi beta experiment in 54. Aurora54+.
Attachment #8845497 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
No longer depends on: 1139795
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: