Open Bug 1845778 Opened 1 year ago Updated 2 months ago

ContentParent can spawn more processes than dom.ipc.processCount at shutdown

Categories

(Core :: DOM: Content Processes, defect, P3)

defect

Tracking

()

People

(Reporter: robwu, Unassigned)

References

Details

Attachments

(2 files, 1 obsolete file)

During my investigation of bug 1845352, I found that it is possible for the number of extension processes to become 2 despite dom.ipc.processCount.extension being at its default value of 1 (and dom.ipc.keepProcessesAlive.extension set to 0 or 1).

This can happen as follows:

  1. Extension process is started (when an extension is loaded).
  2. dom.ipc.keepProcessesAlive.extension=1
  3. "quit-application-granted" notification is triggered, which results in the invocation of ContentParent::BlockShutdown.
  4. ContentParent::BlockShutdown calls SignalImpendingShutdownToContentJS().
  5. (external) An attempt to open an extension page happens (moz-extension:-URL). The desired process is looked up by ContentParent::GetUsedBrowserProcess.
  6. Whatever selection is ignored because IsShuttingDown() is true due to step 4.
  7. Because the process is not found, a new process is created.
  8. While step 7 features the comment "// Launch aborted because of shutdown. Bailout.", the actual implementation does not bail at this stage and, and consequently there is an extra extension process.

I believe that ContentParent::GetNewOrUsedLaunchingBrowserProcess and/or ContentParent::BeginSubprocessLaunch should bail as soon as ShutdownPhase::AppShutdownConfirmed has been reached. And definitely not spawn another process.

See Also: → 1845785

(In reply to Rob Wu [:robwu] from comment #0)

I believe that ContentParent::GetNewOrUsedLaunchingBrowserProcess and/or ContentParent::BeginSubprocessLaunch should bail as soon as ShutdownPhase::AppShutdownConfirmed has been reached. And definitely not spawn another process.

It seems to me that we already have this check in ContentParent::GetNewOrUsedLaunchingBrowserProcess a few lines above, see bug 1811195. And I would hope that there is no way of calling things in parallel from different threads such that ShutdownPhase::AppShutdownConfirmed changes state in between. Are you able to reproduce this behavior and maybe record a pernsosco session?

Flags: needinfo?(rob)
See Also: → 1811195

Demonstrates bug 1845778: when "quit-application-granted" is triggered,
AppShutdown::IsInOrBeyond(ShutdownPhase::AppShutdownConfirmed) is
false.

The additional logging in ContentParent.cpp is not strictly required,
but does help with showing what's going on in more detail.

MOZ_LOG=Process:5,sync ./mach test toolkit/components/extensions/test/xpcshell/test_extension_process_alive.js --log-mach-verbose --verbose --setpref=toolkit.asyncshutdown.log=true

Depends on D184758

Output of test run from https://bugzilla.mozilla.org/show_bug.cgi?id=1845778#c2

To reproduce locally, apply https://phabricator.services.mozilla.com/D184758 and https://phabricator.services.mozilla.com/D185060, then run the command.

Tips to read the log:

  • Highlight /Process
  • Line 136-141 shows how the issue is triggered - i.e. "quit-application-granted" is triggered.
  • Line 142 shows the "surprising" observation that !isInOrBeyondShutdownPhase(SHUTDOWN_PHASE_APPSHUTDOWNCONFIRMED)
  • Line 214 - 283 is the full log of the start of the specific test up to the test failure; the test failure is that the extension process is unexpectedly terminated.
  • Line 253 is noteworthy; with the extra debug logging from the patch it shows that the number of extension processes is two.

I took a closer look, and attached two attachments: patch to trigger the issue (comment 2), and a log file from a run of the test case, with some annotations in comment 3.

The reported issue only happens because (until bug 1845352 was fixed), a test could trigger "quit-application-granted" without a matching call to AppShutdown::AdvanceShutdownPhase(ShutdownPhase::AppShutdownConfirmed, ...).

The question would mainly be: how realistic is this scenario be outside of unit tests?

Given this rough analysis, I suppose that it is realistic for the issue in this bug to happen. Especially because extension shutdown does not immediately happen when quit-application-granted is triggered: https://searchfox.org/mozilla-central/rev/4044c34031d035fadb588143297ba5421419d44b/toolkit/mozapps/extensions/internal/XPIProvider.jsm#2625-2633,2667

Flags: needinfo?(rob)
Flags: needinfo?(jstutte)

AppShutdownConfirmed aka "quit-application-granted" is indeed a bit special. When it runs, we have basically a 100% running browser and are informed that we should begin our shutdown sequence. It is the point of no return.

mozilla::AppShutdown::OnShutdownConfirmed() is called only once we had ferocity == eForceQuit. But as you pointed out, we already notified "quit-application-granted" observers well above, triggering all kinds of reactions.

We even have some special case inside AppShutdown::AdvanceShutdownPhaseInternal to avoid side effects from processing pending events before notifying the phase (again). I wonder if most of those side effects come exactly from the earlier notification.

I think we could try to just remove the first notification as it happens below again at the presumably better moment in time. This matches also better being "the point of no return" as we have some logic that appears to check if we can really close but that seems to not have any other effect than returning an error after having initiated the shutdown, anyways, as we will always have set ferocity = eForceQuit; before if we get there.

Edit: This is nonsense, obviously above we fire "quit-application-granted" and then AppShutdown fires "quit-application".

Please note that the modified test of course would still show the behavior you discovered as you manually call notifyObservers. If you want the correct shutdown behavior, you need to use advanceShutdownPhase. In fact I wonder if mocking AppShutdown in that test in general may have some unexpected side effects, too.

Edit (after chatting with :smaug): It seems we should overhaul the order here a bit. We should definitely be in the shutdown phase before/while sending the notification. And call CloseAllWindows only afterwards. We can then still check if the closing was successful, but if not, that probably just merits a MOZ_CRASH, as unload handlers on child processes would not be able to create new processes anymore (once the IsInOrBeyondcheck works correctly) and having unload handlers for anything running in the parent process should be considered being an error, probably.

Flags: needinfo?(jstutte)

This check might even prevent us from doing silly things also in the parent process case?

Assignee: nobody → jmarshall
Flags: needinfo?(jmarshall)

(In reply to Jens Stutte [:jstutte] from comment #8)

I threw this together, but a first smoke test seems to indicate it has bad side effects. This needs more time than I currently have, probably. Joshua, would you mind taking this over?

I suspect we have too many checks for AppShutdownConfirmedin our code that might better be AppShutdown.

Severity: -- → S3
Priority: -- → P3
Attachment #9348409 - Attachment is obsolete: true

(In reply to Jens Stutte [:jstutte] from comment #9)

I suspect we have too many checks for AppShutdownConfirmedin our code that might better be AppShutdown.

The patch was based on confusing "quit-application-granted" with "quit-application". But the last statement might still be true, especially together with the weirdness that we have a shutdown barrier quitApplicationGranted but apparently not for quitApplication. Joshua will continue investigation.

Assignee: jmarshall → nobody
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: