Closed Bug 1363240 Opened 3 years ago Closed 2 years ago

Content process limit specified by dom.ipc.processCount is not respected in some circumstances

Categories

(Core :: DOM: Content Processes, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: dqeswn, Assigned: gkrizsanits)

References

Details

(Whiteboard: [e10s-multi:+])

Attachments

(1 file)

I have a session loaded with many tabs, few loaded. When I attempt to load a session via Session Manager with only a few tabs all of the lazy tabs are inserted because of another addon (Findbar Tweak).
Now here's where the issue presents itself. Content processes are not only started the limit set by dom.ipc.processCount, but way past it, probably for all tabs. I've seen hundreds started for a session with hundreds of tabs.
Addons or not, this shouldn't be happening.
Hi, I know about two ways that can cause something like this. Setting dom.ipc.processCount.web, which overwrites dom.ipc.processCount. Or using the nsIMozBrowserFrame interface in the add-on. Could you provide an STR so I can debug what is going on exactly?
Flags: needinfo?(dqeswn)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #1)
> Hi, I know about two ways that can cause something like this. Setting
> , which overwrites dom.ipc.processCount. Or using
> the nsIMozBrowserFrame interface in the add-on. Could you provide an STR so
> I can debug what is going on exactly?

Dom.ipc.processCount.web isn't set here.

I practically did an STR, just didn't number the steps.
1. Install the two addons. Findbar tweak to cause the insertions when closing tabs, and Session Manager
2. load a session with many unloaded tabs.
3. load a session with only 1-2 tabs using the override option, so that the extra tabs are closed.
4. Observe the mass of content processes starting


Note:
Just tried it with a new profile with just using these steps and using pre-existing session backup files from another profile.
Flags: needinfo?(dqeswn)
(In reply to avada from comment #2)
> I practically did an STR, just didn't number the steps.
> 1. Install the two addons. Findbar tweak to cause the insertions when
> closing tabs, and Session Manager

Thanks, it wasn't about the numbers :) For some reason I thought the Session Manager is part of our SessionStore implementation and not another add-on and that confused me... sorry about that, I'll retry.
Blocks: 1358283
Hmm... it looks like nsFrameLoader::GetLoadContext is responsible for creating this short lived processes. So it's likely a regression from Bug 1340747.
Blocks: 1340747
Whiteboard: [e10s-multi:?]
Whiteboard: [e10s-multi:?] → [e10s-multi:+]
:gabor who should this be assigned to?
Flags: needinfo?(gkrizsanits)
(In reply to Erin Lancaster [:elan] from comment #5)
> :gabor who should this be assigned to?

I'm working on this.
Assignee: nobody → gkrizsanits
Flags: needinfo?(gkrizsanits)
Depends on: 1287330
What's happening here that we tap the messageManager property of all the tab's linked browser, which will insert an actual browser for each (lazy) tab, which will quickly launch a content process but right after that we close the tab so we kill it right away enabling all the tabs to spawn a new content process without ever hitting the cap. Depending on the number of lazy tabs this can spawn a gazillion of content processes in a chain. This is for sure a blocker for e10s-multi with add-ons combined, but reading the related bugs it seems like similar scenario might happen even without add-ons.

So questions:
- are we planning to ship lazy tabs for 54? (it will block e10s-multi for users with add-ons as it is right now)
- is there a way to avoid spawning a content process at linked browser insertion if the tab will be going away soon? (the question ofc how long can we delay the process startup without risking to regress other regular cases)
- are there any other known scenarios where we rapidly insert a massive amount of linked browser combined with the risk that many of those tabs will be closed real quickly. Note that given bug 1352021, even if we don't close the tabs right away there is a chance that under some circumstances we might spawn a massive amount of content processes for no good reason.
Flags: needinfo?(kevinhowjones)
Flags: needinfo?(dao+bmo)
Note: also I wonder if bug 1352706 is related since there the original scenario is quite similar.
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #7)
> - is there a way to avoid spawning a content process at linked browser
> insertion if the tab will be going away soon?

Dao, will bug 1363078 help with this?  (test for window.closed before inserting browser)

> - are there any other known scenarios where we rapidly insert a massive
> amount of linked browser combined with the risk that many of those tabs will
> be closed real quickly.

Other than being initiated by addons, I am not clear on what specific actions are causing massive linkedBrowser insertion natively.  Has that been identified?
Flags: needinfo?(kevinhowjones)
(In reply to Kevin Jones from comment #9)
> Other than being initiated by addons, I am not clear on what specific
> actions are causing massive linkedBrowser insertion natively.  Has that been
> identified?

Is some of this related to doing a mid-session restore, and re-using tabs?
(In reply to Kevin Jones from comment #9)
> Other than being initiated by addons, I am not clear on what specific
> actions are causing massive linkedBrowser insertion natively.  Has that been
> identified?

(In reply to Kevin Jones from comment #10)
> Is some of this related to doing a mid-session restore, and re-using tabs?

Regards of bug 1352706 (which is a without add-on STR) only the reporter could reproduce it so far. To me it's not clear how to open a directory of bookmarks in a way that the existing tabs are being re-used, but it can be just a timing thing. What do you mean by mid-session restore and re-using tabs exactly? (sorry my session store knowledge is limited)

Anyway, you mentioned a few cases in bug 1345098 where this used to happen so I wanted to check if you're aware of any cases where this is still an issue.  Given the related console.error and that no-one reported any related bugs you're aware of I would just assume that this is an add-on only issue (except maybe the very rare edge case in bug 1352706).
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #7)
> What's happening here that we tap the messageManager property of all the
> tab's linked browser, which will insert an actual browser for each (lazy)
> tab, which will quickly launch a content process but right after that we
> close the tab so we kill it right away enabling all the tabs to spawn a new
> content process without ever hitting the cap.

How's closing a tab related to all other tabs spawning a content process?

> Depending on the number of
> lazy tabs this can spawn a gazillion of content processes in a chain. This
> is for sure a blocker for e10s-multi with add-ons combined, but reading the
> related bugs it seems like similar scenario might happen even without
> add-ons.

It shouldn't happen anymore without add-ons.

> So questions:
> - are we planning to ship lazy tabs for 54? (it will block e10s-multi for
> users with add-ons as it is right now)

No, 55. If WebExtensions are somehow less prone to triggering this, we could potentially delay till 57 which drops support for XUL addons.

> - is there a way to avoid spawning a content process at linked browser
> insertion if the tab will be going away soon? (the question ofc how long can
> we delay the process startup without risking to regress other regular cases)

We could set a limit and then refuse inserting more of browsers in quick succession, but this might break add-ons in drastic ways.

> - are there any other known scenarios where we rapidly insert a massive
> amount of linked browser combined with the risk that many of those tabs will
> be closed real quickly.

Not that I know of.
Flags: needinfo?(dao+bmo)
(In reply to Dão Gottwald [::dao] from comment #12)
> How's closing a tab related to all other tabs spawning a content process?

The current cap for content processes is 4. So if we just insert 100 browsers because of a getter that hits all lazy tabs, then only the first few will spawn a process the rest will just pick one from the existing 4. If we insert/close/insert/close/... We might never hit the cap and we end up spawning short lived processes as fast as the machine can handle it pretty much, one for each lazy tabs.

> It shouldn't happen anymore without add-ons.

That's great news. :)

> 
> > So questions:
> > - are we planning to ship lazy tabs for 54? (it will block e10s-multi for
> > users with add-ons as it is right now)
> 
> No, 55. If WebExtensions are somehow less prone to triggering this, we could
> potentially delay till 57 which drops support for XUL addons.

IF we release multi with 54 there is a chance that we will only enable it for users with web extension based add-ons only. The million dollar question if a web extension can do something similar, and I'm very much afraid that the answer is yes. I think the error on the console.log is a good start, it should flag any new extension to go through the review process, although I'm not sure how reliable that is. I'll ask around in the add-on team.

> We could set a limit and then refuse inserting more of browsers in quick
> succession, but this might break add-ons in drastic ways.

It might be better to break add-ons than let add-ons break the browser... Another defensive mechanism would be to keep around unused, short lived, content processes for a while to prevent similar bugs.
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #13)
> (In reply to Dão Gottwald [::dao] from comment #12)
> > How's closing a tab related to all other tabs spawning a content process?
> 
> The current cap for content processes is 4. So if we just insert 100
> browsers because of a getter that hits all lazy tabs, then only the first
> few will spawn a process the rest will just pick one from the existing 4. If
> we insert/close/insert/close/... We might never hit the cap and we end up
> spawning short lived processes as fast as the machine can handle it pretty
> much, one for each lazy tabs.

What code is doing that insert/close/insert/close/...?

> > We could set a limit and then refuse inserting more of browsers in quick
> > succession, but this might break add-ons in drastic ways.
> 
> It might be better to break add-ons than let add-ons break the browser...

If we break an add-on like Session Manager, this would basically break the browser from the user's perspective, including data loss etc.

> Another defensive mechanism would be to keep around unused, short lived,
> content processes for a while to prevent similar bugs.

That sounds reasonable.
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #11)
> (In reply to Kevin Jones from comment #9)
> > Other than being initiated by addons, I am not clear on what specific
> > actions are causing massive linkedBrowser insertion natively.  Has that been
> > identified?
> 
> (In reply to Kevin Jones from comment #10)
> > Is some of this related to doing a mid-session restore, and re-using tabs?
> 
> Regards of bug 1352706 (which is a without add-on STR) only the reporter
> could reproduce it so far. To me it's not clear how to open a directory of
> bookmarks in a way that the existing tabs are being re-used, but it can be
> just a timing thing. What do you mean by mid-session restore and re-using
> tabs exactly? (sorry my session store knowledge is limited)

Mid session restore would be calling SessionStore.setBrowserState(...) or setWindowState(...).  I believe this may be what Session Manager uses.

> Anyway, you mentioned a few cases in bug 1345098 where this used to happen
> so I wanted to check if you're aware of any cases where this is still an
> issue.  Given the related console.error and that no-one reported any related
> bugs you're aware of I would just assume that this is an add-on only issue
> (except maybe the very rare edge case in bug 1352706).

As Dao said, yes, those issues have been taken care of, with the exception of abandoned addons Tab Groups and Findbar Tweak.  We have used shims where we could, but in some cases it is not possible and the addons must be modified.
ni'info Kris, comments 7 and 13 are questions about if WebExtensions will have a problem here.
Flags: needinfo?(kmaglione+bmo)
(In reply to Dão Gottwald [::dao] from comment #12)
> > So questions:
> > - are we planning to ship lazy tabs for 54? (it will block e10s-multi for
> > users with add-ons as it is right now)
> 
> No, 55. If WebExtensions are somehow less prone to triggering this, we could
> potentially delay till 57 which drops support for XUL addons.

At the moment, they're probably not, but we can easily make it so they are.

(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #13)
> (In reply to Dão Gottwald [::dao] from comment #12)
> > How's closing a tab related to all other tabs spawning a content process?
> 
> The current cap for content processes is 4. So if we just insert 100
> browsers because of a getter that hits all lazy tabs, then only the first
> few will spawn a process the rest will just pick one from the existing 4. If
> we insert/close/insert/close/... We might never hit the cap and we end up
> spawning short lived processes as fast as the machine can handle it pretty
> much, one for each lazy tabs.

Can we just add a short delay before destroying the content processes, and reuse them if a new content process is required before they're destroyed?
Flags: needinfo?(kmaglione+bmo)
(In reply to Dão Gottwald [::dao] from comment #14)
> What code is doing that insert/close/insert/close/...?

The above mentioned add-ons. Findbar Tweak calls messageManager on the linkedBrowser (insert) just before tab close (close). I don't have the stack any more but if it's easy to reproduce if it's important for some reason.

> If we break an add-on like Session Manager, this would basically break the
> browser from the user's perspective, including data loss etc.

Maybe we should be just detecting this behavior and flagging the add-on for not compatible with e10s. I believe there are add-ons that opt out from e10s which would solve this problem.

We can postpone the release of lazy tabs until 57 or just mark problematic add-ons to disable e10s/e10s-multi.

I will try to look into reusing these short lived processes. That might also help, that would fix one part of the bug, I'm not sure if it's enough though as inserting hundreds of browser elements in a row can be still a horrible experience, I have never tried it out.

(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #17)
> At the moment, they're probably not, but we can easily make it so they are.

I filed a bug for it: Bug 1364847.

> 
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #13)
> Can we just add a short delay before destroying the content processes, and
> reuse them if a new content process is required before they're destroyed?

I'm planning to do something like that. I would prefer letting go processes with huge memory footprint though... I'm not sure about the details yet: Bug 1364849.
Depends on: 1364847
Depends on: 1364849
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #18)
> (In reply to Dão Gottwald [::dao] from comment #14)
> Maybe we should be just detecting this behavior and flagging the add-on for
> not compatible with e10s. I believe there are add-ons that opt out from e10s
> which would solve this problem.
> 
> We can postpone the release of lazy tabs until 57 or just mark problematic
> add-ons to disable e10s/e10s-multi.

There is another issue to be considered here.  If the addon is doing this to multiple tabs (all?) in quick succession and large numbers of lazy browsers get inserted this can be resource intensive.  This could result in bad UX whether or not e10s is enabled (Imagine that we have simply deferred the bad UX on startup that was optimized with lazy tabs to somewhere in the middle of the session).
(In reply to Kevin Jones from comment #19)
> There is another issue to be considered here.  If the addon is doing this to
> multiple tabs (all?) in quick succession and large numbers of lazy browsers
> get inserted this can be resource intensive.  This could result in bad UX
> whether or not e10s is enabled (Imagine that we have simply deferred the bad
> UX on startup that was optimized with lazy tabs to somewhere in the middle
> of the session).

Yeah, in bug 1364847 I mentioned that problem as well, so for web extensions I hope that will be addressed.

For legacy add-ons I don't have any better solution than to either postpone the release of lazy tabs (until 57) or just disable them for users with legacy add-ons. I would do the later. It shouldn't be super hard, these add-ons require restart already (should be double checked but I think we still require even bootstrapped add-ons to restart before it would become active so this feature could be easily turned on/off when a legacy / SDK add-on is active). If that is something you're interested doing a good starting point would be: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/toolkit/mozapps/extensions/internal/XPIProviderUtils.js#1078
Summary: Content proces limit specified by dom.ipc.processCount is not respected in some circumstances → Content process limit specified by dom.ipc.processCount is not respected in some circumstances
avada, can you still reproduce this in Nightly now that bug 1364849 is fixed?
Blocks: 1287330
No longer depends on: 1287330
Flags: needinfo?(dqeswn)
Yes. Loading 111 tab resulted in the process count in process explorer incresing by 110. (guess some other process ended in the meanwhile) The process takes more than a minute.
Flags: needinfo?(dqeswn)
(In reply to avada from comment #22)
> Yes. Loading 111 tab resulted in the process count in process explorer
> incresing by 110. (guess some other process ended in the meanwhile) The
> process takes more than a minute.

Last time I checked the patch it fixed this issue, I'll try it again and see what's going on. The patch landed yesterday on mc so maybe it just didn't make it to the nightly or something. The missing process is probably because of the preallocated process that was launched earlier by the way.
I've just tested it again today and could not reproduce the issue (before the patch I could reproduce it reliably).
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Here it happens just the same. (With the addition of the mains process leak/hang at the end)
(In reply to avada from comment #25)
> Here it happens just the same. (With the addition of the mains process
> leak/hang at the end)

I'm sorry to hear that :( Could you retry it starting from a fresh profile? If it still happens with a fresh profile with only the mentioned two add-ons installed, could you specify me your OS and the build date of your nightly? Thanks!
Flags: needinfo?(dqeswn)
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #26)
> I'm sorry to hear that :( Could you retry it starting from a fresh profile?
> If it still happens with a fresh profile with only the mentioned two add-ons
> installed, could you specify me your OS and the build date of your nightly?
> Thanks!

Couldn't reproduce it with a fresh profile. Though that's a rather sterile test in my opinion. Did you try it with a well used profile?
Flags: needinfo?(dqeswn)
(In reply to avada from comment #27)
> (In reply to Gabor Krizsanits [:krizsa :gabor] from comment #26)
> > I'm sorry to hear that :( Could you retry it starting from a fresh profile?
> > If it still happens with a fresh profile with only the mentioned two add-ons
> > installed, could you specify me your OS and the build date of your nightly?
> > Thanks!
> 
> Couldn't reproduce it with a fresh profile. Though that's a rather sterile
> test in my opinion. Did you try it with a well used profile?

Thanks, then it's probably some non-default pref... could you post me your about:support? I might find something there. And yes I tried my regular profile but I could not reproduce it with that either.
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #28)
> Thanks, then it's probably some non-default pref... could you post me your
> about:support? I might find something there. And yes I tried my regular
> profile but I could not reproduce it with that either.

So meanwhile I updated to today's nightly from yesterdays. And it didn't seem reproducible at first in the profile I tested before. (Although the mass of processes starting to the limit (44) was still pretty heavy. And about:crashes is spammed with crashes, due to a bunch of them crashing.)
Just as I was about to send this comment it happened again. So I guess it doesn't always happen anymore.


Anyway I attached my about support.
You need to log in before you can comment on or make changes to this bug.