[geckoview] Fix content process shutdown

NEW
Assigned to

Status

defect
P2
normal
2 years ago
6 months ago

People

(Reporter: esawin, Assigned: esawin)

Tracking

Trunk
All
Android
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

Assignee

Description

2 years ago
GeckoView [e10s] currently doesn't properly shutdown the content process.

STR 1
1. Open geckoview_example app.
2. Hit the Android back button.
3. Re-open geckoview_example app.

Expected: The app opens and shows the default page.
Actual: Crash (failed IsCurrent() check) or hang.

STR 2
1. Open Focus app.
2. Search for something (or open a specific URL).
3. Delete the session (using button in bottom right corner).
4. Search for something again.

Expected: The search result are displayed.
Actual: Crash or hang.
Assignee

Updated

2 years ago
Assignee

Comment 1

2 years ago
Unbind and kill the content service process when destructing the content process host.

Alternatively, we could just _exit(0) in ~ContentChild(), but this seemed like a more explicit and symmetric solution, given that we also launch the content process in the content process host.
Assignee

Comment 2

2 years ago
Don't try to shutdown and restart the content process during start if it's still around. Instead, just abort the start and return.

We will need something more sophisticated here, should we want to support more than one content process.
Assignee

Comment 3

2 years ago
Refactor the content process handling and somewhat simplify the waiting procedure.
Assignee

Comment 4

2 years ago
During the content process startup, we may trigger a deadlock between the Gecko thread and the UI thread.

1. [Gecko thread] Blocked waiting for the content process to launch in GeckoChildProcessHost::LaunchAndWaitForProcessHandle.
2. [UI thread] Blocked in createCompositor dispatch in LayerView.updateCompositor.

With this patch we spin the Gecko thread event loop to unblock the dispatching from the UI thread.

Eventually, we need async compositor creation, which should prevent the deadlock.
Assignee

Comment 5

2 years ago
Prevent calling GL functions on a stale context when clearing the texture pool during the first call to TexturePoolOGL::Fill on a new context.
Attachment #8851983 - Flags: review?(snorp)
Assignee

Comment 6

2 years ago
Comment on attachment 8851618 [details] [diff] [review]
0004-Bug-1350924-4.0-Spin-the-Gecko-thread-event-loop-to-.patch

This fixes the deadlock in the geckoview_example app, but causes intermittent startup hangs in Focus (haven't debugged yet).

Do you see an issue with the implementation?
Attachment #8851618 - Flags: feedback?(nchen)
Assignee

Comment 7

2 years ago
Comment on attachment 8851608 [details] [diff] [review]
0003-Bug-1350924-3.0-Refactor-content-process-service-han.patch

Refactored waitForChild into more explicit waitForConnect and waitForDisconnect (not used, so left it out for now) to avoid additional wait helper members and functions.

Moved content process killing out of UI thread in case that it's blocked.
Attachment #8851608 - Flags: review?(rbarker)
Assignee

Comment 8

2 years ago
Comment on attachment 8851607 [details] [diff] [review]
0002-Bug-1350924-2.0-Abort-content-process-launch-when-se.patch

Return 0 in case that the connection hasn't been properly cleared, which signals failure to start the content process.
Attachment #8851607 - Flags: review?(snorp)
Assignee

Comment 9

2 years ago
Comment on attachment 8851606 [details] [diff] [review]
0001-Bug-1350924-1.0-Stop-and-kill-the-content-process-wh.patch

Feel free to redirect r? since I'm not sure who is best suited for this.

Stop content process on GeckoChildHost destruction. This seems to work reliably for the geckoview_example app or anytime we close the window.

We need to decide how to handle the case where GeckoView's instance state is saved, which prevents the closing of its window on detaching.
Attachment #8851606 - Flags: review?(snorp)
Attachment #8851983 - Flags: review?(snorp) → review+
Comment on attachment 8851606 [details] [diff] [review]
0001-Bug-1350924-1.0-Stop-and-kill-the-content-process-wh.patch

Review of attachment 8851606 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok, but I'd rather have rbarker review
Attachment #8851606 - Flags: review?(snorp) → review?(rbarker)
Comment on attachment 8851607 [details] [diff] [review]
0002-Bug-1350924-2.0-Abort-content-process-launch-when-se.patch

Review of attachment 8851607 [details] [diff] [review]:
-----------------------------------------------------------------

-> rbarker
Attachment #8851607 - Flags: review?(snorp) → review?(rbarker)
Comment on attachment 8851618 [details] [diff] [review]
0004-Bug-1350924-4.0-Spin-the-Gecko-thread-event-loop-to-.patch

Review of attachment 8851618 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +475,5 @@
>    while (mProcessState < PROCESS_CREATED) {
> +    lock.Wait(100);
> +    if (mProcessState < PROCESS_CREATED) {
> +      MonitorAutoUnlock unlock(mMonitor);
> +      NS_ProcessNextEvent();

NS_ProcessNextEvent() can block (mayWait defaults to true), which essentially creates a "Gecko-blocked-on-UI" situation that's similar to what caused the deadlock from before. Also, processing one event every ~100ms might not be good.

I think you want to call NS_ProcessNextEvent() repeatedly in a loop, without wait. When NS_ProcessNextEvent() blocks, you get out of that by posting an event back to the Gecko thread.
Attachment #8851618 - Flags: feedback?(nchen) → feedback+
Comment on attachment 8851618 [details] [diff] [review]
0004-Bug-1350924-4.0-Spin-the-Gecko-thread-event-loop-to-.patch

Review of attachment 8851618 [details] [diff] [review]:
-----------------------------------------------------------------

This patch scares me to be honest. I think you want to talk to Billm about it for sure. One thing to keep an eye open for is the work under bug 1348361 that might solve some of your problems.

About the preallocated process manager we talked very briefly about, I don't know how could that help your situation here exactly. All it does is launching up an extra content process in the background without hosting any tab/window in it, and by the next time someone needs a new content process, one just needs to grab and use it. Since you have a hard upper limit of having only one content process alive at all times, I don't see how that would help you here but I might be missing something.
Assignee

Updated

2 years ago
Attachment #8851606 - Flags: review?(rbarker)
Assignee

Updated

2 years ago
Attachment #8851608 - Flags: review?(rbarker)
Assignee

Updated

2 years ago
Attachment #8851607 - Flags: review?(rbarker)
Assignee

Updated

2 years ago
Depends on: 1348361
Component: Embedding: APIs → GeckoView
Product: Core → Firefox for Android
Version: unspecified → Trunk
P2 because this is not a Klar+GeckoView blocker. Klar doesn't shut down the content process.
Priority: P1 → P2

Updated

6 months ago
Product: Firefox for Android → GeckoView
You need to log in before you can comment on or make changes to this bug.