Open Bug 1353297 Opened 4 years ago Updated 3 years ago

Implement notification for refreshing page when Compositor Process crashes

Categories

(Core :: Graphics, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: asimonca, Unassigned)

Details

(Whiteboard: [gfx-noted])

Attachments

(1 file)

[Steps to reproduce]:
Preconditions:
Make sure e10s is enabled and there are 2-3 "Nightly" Background processes in the Task Manager.
1. Go to Google Maps.
2. Go to Street View in Google Maps. (Drag the little yellow man from the lower-right corner to a street)
3. End the GPU Process task from Task Manager (it usually is the last one of the 2 or 3 "Nightly" tasks that are shown in Task Manager)

[Expected result]:
- The screen flickers once but the site does not lose it's functionality.

[Actual result]:
- The screen flickers anytime I interact with it, even when just moving the mouse or clicking inside the black area.

As per Anthony's comment in www.bugzil.la/1352081#c7 this could be a nice UX improvement for when the GPU Process crashes in Google Maps. The user should be notified through a notification that a refresh is needed to regain the full functionality of the site. 

[Additional notes]:
- You can see a screencast of the website losing it's functionality following this link: https://goo.gl/XODCVf
Not just Google Maps but in fact any page.
Summary: Implement notification for refreshing "Google Maps" when Compositor Process crashes → Implement notification for refreshing page when Compositor Process crashes
Hi Alexandru,

I was not able to reproduce this bug with the latest nightly. Do you still have this issue on your environment?
Assignee: nobody → brsun
Flags: needinfo?(alexandru.simonca)
I can reproduce it by using "Terminate GPU Process" button in about:support to kill GPU process. Cancel ni flag.
Flags: needinfo?(alexandru.simonca)
Take some notes: Before GPUProcessManager::mNumProcessAttempts reaches "layers.gpu-process.max_restarts", if the GPU process has been killed, the canvas would show a black flash quickly and then working normally again. After reaching the max restart counts, the canvas would work abnormal as described on comment 0 if the GPU process has been killed again. Neither 'webglcontextlost' event nor 'webglcontextrestored' event would be fired in these cases.
Take some notes: The black screen only shows when the mouse cursor is moving. By adding gfxPlatform::GetPlatform()->CompositorUpdated() to the end of GPUProcessManager::HandleProcessLost() function, we don't suffer from this issue anymore. Probably there are something not reinitialized very well in the original scenario, or there might be some mystic timing issue.
Hi Kevin,

How do you think about this modification?
Attachment #8888630 - Flags: feedback?(kechen)
Comment on attachment 8888630 [details] [diff] [review]
bug1353297_trigger_compositor_updated_at_chrome_process.patch

Review of attachment 8888630 [details] [diff] [review]:
-----------------------------------------------------------------

I think it's reasonable to call compositorUpdated in HandleProcessLost since compositor process is killed.

But there are still two things I am not sure and still need to be figured out:
 1. Why is WebGL the only part which is affected and other contents are fine?
 2. In the current codebase, we still use Direct D3D11 as compositor after GPU process is killed and disabled; should we fall back to software solution for system stability?
Attachment #8888630 - Flags: feedback?(kechen) → feedback+
Comment on attachment 8888630 [details] [diff] [review]
bug1353297_trigger_compositor_updated_at_chrome_process.patch

Hi David,

Would you mind having a look on this patch?
Attachment #8888630 - Flags: review?(dvander)
(In reply to Kevin Chen[:kechen] (UTC + 8) from comment #7)
> Comment on attachment 8888630 [details] [diff] [review]
> bug1353297_trigger_compositor_updated_at_chrome_process.patch
> 
> Review of attachment 8888630 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think it's reasonable to call compositorUpdated in HandleProcessLost since
> compositor process is killed.
> 
> But there are still two things I am not sure and still need to be figured
> out:
>  1. Why is WebGL the only part which is affected and other contents are fine?
>  2. In the current codebase, we still use Direct D3D11 as compositor after
> GPU process is killed and disabled; should we fall back to software solution
> for system stability?

All the WebGL state is lost, there's not really any way to recover it. Chrome has the same problem.
Comment on attachment 8888630 [details] [diff] [review]
bug1353297_trigger_compositor_updated_at_chrome_process.patch

Review of attachment 8888630 [details] [diff] [review]:
-----------------------------------------------------------------

I don't understand the patch. If anything the "CompositorUpdated" call should come before we rebuild sessions. But as-is I don't understand why it would have any effect, since this path only affects the UI process and the UI process is not rendering content.

Bruce, could you explain why this fixes the issue?
After GPU process has been killed for "layers.gpu-process.max_restarts" times, the UI process uses Basic compositing as the compositor. After calling CompositorUpdated() at the UI process, the compositor has chances to use D3D11 devices for compositing D2D contents. I haven't found the exact point that induces black images though.
Flags: needinfo?(brsun)
Comment on attachment 8888630 [details] [diff] [review]
bug1353297_trigger_compositor_updated_at_chrome_process.patch

(In reply to Bruce Sun [:brsun] from comment #11)
> After GPU process has been killed for "layers.gpu-process.max_restarts"
> times, the UI process uses Basic compositing as the compositor. After
> calling CompositorUpdated() at the UI process, the compositor has chances to
> use D3D11 devices for compositing D2D contents. I haven't found the exact
> point that induces black images though.

Hrm, that still doesn't clarify things for me. Especially since the content process could have already gotten a new layer manager before this function runs.

If the GPU process pref is enabled, we never use D3D11 or D2D in the UI process. Any compositors we create in the UI after a GPU process crash, should be guaranteed to be "Basic". The content process already explicitly calls CompositorUpdated after a GPU process crash [1], and since the UI compositor will be "Basic", we should never use D2D for content [2], even if a D2D or D3D11 device is present.

Unfortunately I'm about to go on PTO and can't try the patch out myself to see what exactly the problem is. I'm forwarding the r? to :rhunt but I think an explanation is still needed.

[1] http://searchfox.org/mozilla-central/source/dom/ipc/ContentChild.cpp#1325
[2] http://searchfox.org/mozilla-central/source/gfx/thebes/gfxWindowsPlatform.cpp#474
Attachment #8888630 - Flags: review?(dvander) → review?(rhunt)
Comment on attachment 8888630 [details] [diff] [review]
bug1353297_trigger_compositor_updated_at_chrome_process.patch

This patch cannot recover other canvas pages (ex. [1]). Cancel the review flag for now.

[1] http://webglsamples.org/aquarium/aquarium.html
Attachment #8888630 - Flags: review?(rhunt)
unassign myself since I am not going to working on this bug
Assignee: brsun → nobody
Priority: -- → P3
Whiteboard: [gfx-noted]
You need to log in before you can comment on or make changes to this bug.