[e10s-multi] Adding back a simplified pre-allocated process mechanism

RESOLVED FIXED in Firefox 54

Status

()

Core
DOM: Content Processes
RESOLVED FIXED
4 months ago
2 months ago

People

(Reporter: krizsa, Assigned: krizsa)

Tracking

(Depends on: 1 bug, Blocks: 1 bug)

unspecified
mozilla54
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox54 fixed)

Details

(Whiteboard: [e10s-multi:M1])

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

4 months ago
To fight the slow content process start up times, we decided to use a simplified pre-allocated process mechanism. I'm going to merge back the old mechanism without all the fancy b2g/nuwa magic in it, and will try to either use the pre-allocated process or an existing one, and never keep the user waiting for a new process to spin up when opening something in a new tab.
(Assignee)

Updated

4 months ago
Blocks: 1303113
Whiteboard: [e10s-multi:M1]
(Assignee)

Updated

4 months ago
Blocks: 1290167
(Assignee)

Comment 1

4 months ago
Created attachment 8821155 [details] [diff] [review]
Simplified version. v1

So the original version was removed here: https://hg.mozilla.org/mozilla-central/rev/69d33fe8ac7a

I tried to cut out all the fancy b2g specific bits and start with the simplest version as the first step. Although we might need something like the old zombie mode for PresShell, some kind of frozen state, if an empty bg content process turns out to be too chatty or busy...

I'm not sure if we need hal type at all (process priority) does this do anything on desktop?

The pref read for the flag happens too early (so it returns always falls) I've added a detailed comment for the hack I did in the manager, we might want to do something smarter.

The gecko parent message pump seems to be quite busy, we never get to the idle tasks, so I had to change that part as well, and not wait with launch for an idle state on the parent side.

This is still quite under tested, and as it seems this never worked on desktop, so even though it sort of worked on my machine I'm a bit hesitant about this patch. Probably we will enable e10s-multi on nightly before we use this as a default strategy.
Attachment #8821155 - Flags: feedback?(wmccloskey)
Comment on attachment 8821155 [details] [diff] [review]
Simplified version. v1

Review of attachment 8821155 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay here. This looks good. Just need some small cleanups I think.

It would be nice to take the tests from the old patch.

::: dom/ipc/ContentChild.cpp
@@ +1516,5 @@
>    }
>    return IPC_OK();
>  }
>  
> +static CancelableRunnable* sFirstIdleTask;

If this isn't a member, it should be gFirstIdleTask ("s" is for "static", as in "static member").

Also, this should be a StaticRefPtr. Otherwise we could get in trouble if the dispatch fails.

@@ +1518,5 @@
>  }
>  
> +static CancelableRunnable* sFirstIdleTask;
> +
> +static void FirstIdle(void)

|static void| should be on their own line.

@@ +1594,5 @@
> +
> +    MOZ_ASSERT(!sFirstIdleTask);
> +    RefPtr<CancelableRunnable> firstIdleTask = NewCancelableRunnableFunction(FirstIdle);
> +    sFirstIdleTask = firstIdleTask;
> +    MessageLoop::current()->PostIdleTask(firstIdleTask.forget());

PostIdleTask is probably going away soon. Please use NS_IdleDispatchToCurrentThread instead.

::: dom/ipc/ContentParent.cpp
@@ +720,5 @@
>    nsAutoString contentProcessType(aLargeAllocationProcess
>                                    ? NS_LITERAL_STRING(LARGE_ALLOCATION_REMOTE_TYPE)
>                                    : aRemoteType);
> +
> +  nsTArray<ContentParent*>* contentParents = GetOrCreatePool(contentProcessType);

Might be nice if this returned a reference instead of a pointer.

@@ +734,5 @@
>      maxContentParents = 1;
>    }
>  
>    if (contentParents->Length() >= uint32_t(maxContentParents)) {
> +    return RandomSelect(*contentParents, aOpener, maxContentParents);

Before your patch, if we didn't find any process to use (because of the opener restriction), we would create a new process. With your patch, we return null. That seems bad.

::: dom/ipc/ContentParent.h
@@ +143,5 @@
> +   * respecting the index limit set by |aMaxContentParents|.
> +   * Returns null if non available.
> +   */
> +  static already_AddRefed<ContentParent>
> +  RandomSelect(const nsTArray<ContentParent*>& aContentParents,

This should be private.

::: dom/ipc/PreallocatedProcessManager.cpp
@@ +77,5 @@
> +    ClearOnShutdown(&sSingleton);
> +  }
> +
> +  // First time when we init sSingleton, the pref database might not be in a
> +  // reliable state (we are too early), so despite dom.ipc.processPrelaunch.enabled

Under what conditions does that happen? It seems like creation of a ContentParent happens pretty late in startup (certainly after prefs are read, since we need to at least know whether e10s is enabled).

@@ +126,5 @@
> +  } else if (!strcmp("nsPref:changed", aTopic)) {
> +    // The only other observer we registered was for our prefs.
> +    RereadPrefs();
> +  } else if (!strcmp(NS_XPCOM_SHUTDOWN_OBSERVER_ID, aTopic)) {
> +    mShutdown = true;

You're going to have to unregister all your observers here. Otherwise you'll probably get leaks. Also, mShutdown seems to be unused.

@@ +137,5 @@
> +
> +static bool
> +IsMaxProcessCountReached()
> +{
> +  int32_t maxCount = Preferences::GetInt("dom.ipc.processCount");

I think this pref is now more complex. We have processCount.web too now. Can you share this code with ContentParent?

@@ +185,5 @@
> +    return;
> +  }
> +
> +  // Originally AllocateOnIdle() was post here, but since the gecko parent
> +  // message loop in practice never goes idle, that didn't work out well.

That's really surprising, but okay. In practice a 1s delay seems fine.

@@ +236,5 @@
> +
> +void
> +PreallocatedProcessManagerImpl::ObserveProcessShutdown(nsISupports* aSubject)
> +{
> +  if (!mPreallocatedProcess) {

Do we also want to try pre-allocating a new process if an old one has shut down?

::: dom/ipc/PreallocatedProcessManager.h
@@ +7,5 @@
> +#ifndef mozilla_PreallocatedProcessManager_h
> +#define mozilla_PreallocatedProcessManager_h
> +
> +#include "base/basictypes.h"
> +#include "nsCOMPtr.h"

You probably want mozilla/AlreadyAddRefed.h instead.

@@ +8,5 @@
> +#define mozilla_PreallocatedProcessManager_h
> +
> +#include "base/basictypes.h"
> +#include "nsCOMPtr.h"
> +#include "nsIObserver.h"

What's this for?
Attachment #8821155 - Flags: feedback?(wmccloskey) → feedback+
(Assignee)

Comment 3

3 months ago
Created attachment 8827962 [details] [diff] [review]
interdiff v1->v2

(In reply to Bill McCloskey (:billm) from comment #2)
> 
> Sorry for the delay here. This looks good. Just need some small cleanups I
> think.

No problem, and now it's my turn to say sorry for the delay with the second round...

> 
> It would be nice to take the tests from the old patch.

I didn't find that test very useful, I'm not sure we care about the process priority that much anyway, but I tried to add a trivial test this time that actually works, I think that should be OK as a start.

One thing though... the test shuts down the process too fast it seems, and it's not the most graceful experience. The child finds out during the init that the channel is closed and crashes... Is there a more subtle way of closing a process than what I do in ::CloseProcess? Or should we add some notification when the child process finished its initialization and wait for that in the test maybe?

> Before your patch, if we didn't find any process to use (because of the
> opener restriction), we would create a new process. With your patch, we
> return null. That seems bad.

Uhh... thanks for catching this.

> > +  // First time when we init sSingleton, the pref database might not be in a
> > +  // reliable state (we are too early), so despite dom.ipc.processPrelaunch.enabled
> 
> Under what conditions does that happen? It seems like creation of a
> ContentParent happens pretty late in startup (certainly after prefs are
> read, since we need to at least know whether e10s is enabled).

It's tricky. Pref are read at that point _except_ user prefs which happens a bit later I think from
nsXREDirProvider::DoStartup which then calls into Preferences::ResetAndReadUserPrefs. It's quite disturbing and feels wrong to me but haven't got the time to figure out why are we doing it this way. Maybe I should update the comment and/or file a follow-up about this...

> Do we also want to try pre-allocating a new process if an old one has shut
> down?

I think I would want to avoid doing that for now. If that process is shut down before use
(and not because of a pref change) I wouldn't risk starting it again right away... In some
edge cases it might help in some other cases it might just make things worse.

I think I've addressed the rest of the issues. This is an interdiff, but I'll attach the new patch
in a bit as well.
(Assignee)

Comment 4

3 months ago
Created attachment 8827963 [details] [diff] [review]
Simplified version. v2
Attachment #8821155 - Attachment is obsolete: true
Attachment #8827963 - Flags: review?(wmccloskey)
Comment on attachment 8827963 [details] [diff] [review]
Simplified version. v2

Review of attachment 8827963 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +700,5 @@
> +  if (maxContentParents < 1) {
> +    maxContentParents = 1;
> +  }
> +
> +  return (uint32_t) maxContentParents;

Please use a static_cast here.

@@ +735,5 @@
>                                            ProcessPriority aPriority,
>                                            ContentParent* aOpener,
>                                            bool aLargeAllocationProcess)
>  {
> +

Can you remove the blank lines here?

@@ +754,5 @@
> +    return p.forget();
> +  }
> +
> +  // Try to take the preallocated process.
> +  p = PreallocatedProcessManager::Take();

Shouldn't we only use the preallocated process if aRemoteType == DEFAULT_REMOTE_TYPE?

::: dom/ipc/PContent.ipdl
@@ +891,5 @@
>  
>      // Notify the parent of the presence or absence of private docshells
>      async PrivateDocShellsExist(bool aExist);
>  
> +    // Tell the parent that the child has gone idle for the first time

Period at the end, please.

::: dom/ipc/PreallocatedProcessManager.cpp
@@ +70,5 @@
> +
> +/* static */ PreallocatedProcessManagerImpl*
> +PreallocatedProcessManagerImpl::Singleton()
> +{
> +  if (!sSingleton) {

Can you assert that we're on the main thread?

@@ +91,5 @@
> +
> +NS_IMPL_ISUPPORTS(PreallocatedProcessManagerImpl, nsIObserver)
> +
> +PreallocatedProcessManagerImpl::PreallocatedProcessManagerImpl()
> +  :

Formatting here is weird.

@@ +111,5 @@
> +    os->AddObserver(this, NS_XPCOM_SHUTDOWN_OBSERVER_ID,
> +                    /* weakRef = */ false);
> +  }
> +  {
> +    RereadPrefs();

Why is this in its own block?

@@ +181,5 @@
> +
> +  // Originally AllocateOnIdle() was post here, but since the gecko parent
> +  // message loop in practice never goes idle, that didn't work out well.
> +  // Let's just launch the process after the delay.
> +  MessageLoop::current()->PostDelayedTask(

Please use NS_DelayedDispatchToCurrentThread instead.

@@ +194,5 @@
> +  if (!mEnabled || mPreallocatedProcess) {
> +    return;
> +  }
> +
> +  MessageLoop::current()->PostIdleTask(NewRunnableMethod(this, &PreallocatedProcessManagerImpl::AllocateNow));

Please use NS_IdleDispatchToCurrentThread instead.
Attachment #8827963 - Flags: review?(wmccloskey) → review+

Comment 6

3 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bdd2118235e9
Simplified preallocated process mechanism. r=billm
Backed out for timing out in dom/browser-element/mochitest/priority/test_Simple.html (seems to only hit asan and debug on Linux):

https://hg.mozilla.org/integration/mozilla-inbound/rev/05a753c833288cdb2654f5fc68576471f8325254

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=bdd2118235e94bcb68cb1c91cf4172cfc724fae9
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=71224830&repo=mozilla-inbound
Flags: needinfo?(gkrizsanits)
(Assignee)

Updated

3 months ago
Blocks: 1304546
No longer blocks: 1303113
Flags: needinfo?(gkrizsanits)
See Also: → bug 1293284
(Assignee)

Updated

3 months ago
Blocks: 1333799
(Assignee)

Updated

3 months ago
Depends on: 1332054
(Assignee)

Comment 8

3 months ago
So the reason for the failure was that I forgot to move the test file to the right place and was executed only in non-e10s mode... After I fixed that it turned out that even though running locally the test works OK on try shutting down the preallocated process after the test too early results various errors (the shut down ends up as a process crash because of bug 1332054).

Now I tried various strategies to wait with the shutdown, by adding a notification during the process initialization [1] or after [2], but neither were good enough.

So we wither have to fix bug 1332054, or find the right time when it's safe to shut down the process and send a notification to the parent about it. Bill, can you give me any hint about when should I fire that notification (if this approach is even the right one) or should I just wait for bug 1332054?

[1] https://hg.mozilla.org/try/rev/0b9791069566017bdb4382eec876c13e5233c986#l2.277
[2] https://hg.mozilla.org/try/rev/b6f8f65a4c6dad8e4010365b71080d69cd599804#l3.12
Flags: needinfo?(wmccloskey)

Comment 9

3 months ago
as a note, when this landed we increase our number of constructors by 1:
== Change summary for alert #4870 (as of January 23 2017 16:26 UTC) ==

Regressions:

  1%  compiler_metrics num_constructors linux32 debug     185 -> 186
  1%  compiler_metrics num_constructors linux64 debug     185 -> 186

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4870

we typically don't take action on this, it is more information.
Can you try calling ShutDownProcess(SEND_SHUTDOWN_MESSAGE) instead of Close()? That might work.
Flags: needinfo?(wmccloskey)
(Assignee)

Comment 11

3 months ago
(In reply to Bill McCloskey (:billm) from comment #10)
> Can you try calling ShutDownProcess(SEND_SHUTDOWN_MESSAGE) instead of
> Close()? That might work.

I wanted to avoid that since it's a private method. Probably that was a bad idea. I've added the PreallocatedProcessManagerImpl as a friend class to ContentParent. Calling ShutDownProcess(SEND_SHUTDOWN_MESSAGE) works smoothly, thanks!

(tree is closed so I'll land it as soon as it's open again)

Comment 12

3 months ago
Pushed by gkrizsanits@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b44d15c23226
Simplified preallocated process manager. r=billm

Comment 13

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b44d15c23226
Status: NEW → RESOLVED
Last Resolved: 3 months ago
status-firefox54: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Assignee: nobody → gkrizsanits
This patch landed, and has been in the tree for a month, but dom.ipc.processPrelaunch.enabled is not set by default, and from what I can tell, http://searchfox.org/mozilla-central/rev/b035220d0f939559f4f8cf7b9079bc12f6adfc35/dom/ipc/PreallocatedProcessManager.cpp#145-149 means that we're not using the preallocator.

Is there a bug to turn this thing on by default?
Flags: needinfo?(gkrizsanits)
(Assignee)

Comment 15

2 months ago
(In reply to Mike Conley (:mconley) (Catching up on reviews and needinfos) from comment #14)
> This patch landed, and has been in the tree for a month, but
> dom.ipc.processPrelaunch.enabled is not set by default, and from what I can
> tell,
> http://searchfox.org/mozilla-central/rev/
> b035220d0f939559f4f8cf7b9079bc12f6adfc35/dom/ipc/PreallocatedProcessManager.
> cpp#145-149 means that we're not using the preallocator.
> 
> Is there a bug to turn this thing on by default?

Yeah, this was sort of a first step. I'm trying to enable it by default in bug 1341008. And then maybe force use it for all new content processes, but currently there are some issues with web extensions.
Flags: needinfo?(gkrizsanits)
You need to log in before you can comment on or make changes to this bug.