Bug 1600178 Comment 0 Edit History

At the moment, with WebRender on macOS, all windows share the same OpenGL context. On machines with an integrated and a discrete GPU, this shared GL context is migrated to the "active" GPU whenever the "active" GPU changes.
Here, the "active" GPU means the GPU which is currently driving the internal display.

We should figure out what we want to do long-term. I think we have two main options:
 - Option A: Use one GL context and "migrate" it between GPUs. Share this context between all Firefox windows.
 - Option B: Use one GL context per GPU and never migrate the contexts. Share the relevant context between all Firefox windows that are currently on a display driven by that context's GPU. Support moving an existing Firefox window to a different GL context. (A rough sketch of the per-GPU context lookup follows below.)
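
To make Option B a bit more concrete, here's a hypothetical sketch of that per-GPU context lookup. The map and helper names are made up for illustration, error handling is omitted, and the real pixel format attributes would need more thought:

```cpp
// Hypothetical sketch of Option B: one long-lived GL context per GPU (renderer),
// shared by all windows whose display is driven by that GPU. Names like
// gContextPerRenderer and ContextForDisplay are illustrative, not existing code.
#include <OpenGL/OpenGL.h>
#include <ApplicationServices/ApplicationServices.h>
#include <unordered_map>

static std::unordered_map<GLint, CGLContextObj> gContextPerRenderer;

// Find the accelerated renderer that drives the given display.
static GLint RendererIDForDisplay(CGDirectDisplayID display) {
  CGLRendererInfoObj info = nullptr;
  GLint rendererCount = 0;
  GLint rendererID = 0;
  CGLQueryRendererInfo(CGDisplayIDToOpenGLDisplayMask(display), &info, &rendererCount);
  for (GLint i = 0; i < rendererCount; i++) {
    GLint accelerated = 0;
    CGLDescribeRenderer(info, i, kCGLRPAccelerated, &accelerated);
    if (accelerated) {
      CGLDescribeRenderer(info, i, kCGLRPRendererID, &rendererID);
      break;
    }
  }
  CGLDestroyRendererInfo(info);
  return rendererID;
}

// Return the shared per-GPU context for the display a window currently sits on.
static CGLContextObj ContextForDisplay(CGDirectDisplayID display) {
  GLint rendererID = RendererIDForDisplay(display);
  auto it = gContextPerRenderer.find(rendererID);
  if (it != gContextPerRenderer.end()) {
    return it->second;  // All windows on this GPU share one context.
  }
  // Pin a new context to this renderer; it is never migrated afterwards.
  CGLPixelFormatAttribute attribs[] = {
      kCGLPFARendererID, (CGLPixelFormatAttribute)rendererID,
      kCGLPFAOpenGLProfile, (CGLPixelFormatAttribute)kCGLOGLPVersion_3_2_Core,
      kCGLPFAAllowOfflineRenderers,
      (CGLPixelFormatAttribute)0};
  CGLPixelFormatObj pix = nullptr;
  GLint npix = 0;
  CGLChoosePixelFormat(attribs, &pix, &npix);
  CGLContextObj ctx = nullptr;
  CGLCreateContext(pix, nullptr, &ctx);
  CGLReleasePixelFormat(pix);
  gContextPerRenderer[rendererID] = ctx;
  return ctx;
}
```

With something along these lines, a window on an eGPU-driven external display would get that GPU's context and windows on the internal display would share the internal GPU's context, so no context ever needs to migrate; the hard part is the window hand-off between contexts, discussed below.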

There are other options which probably won't work very well:
 - Option C: Don't use shared GL contexts. Instead, use one GL context per window and migrate it to the window's display's active GPU. Maybe we can use share groups to avoid shader recompilations. 
 - Maybe others I didn't think of.

So we're currently using Option A. It works ok in environments where there's only ever one "active" GPU, i.e. one "online renderer". But it has disadvantages:
 - When WebRender [initializes its GL `Device`](https://searchfox.org/mozilla-central/rev/42c2ecdc429115c32e6bcb78bf087a228a051044/gfx/wr/webrender/src/device/gl.rs#1239-1486), it makes many decisions based on the renderer's capabilities, and those decisions have a large impact; for example, the maximum texture size affects decisions on the RenderBackend thread. If the device's context can be migrated between GPUs, then we need to make those initial decisions based on the "lowest common denominator" across all available GPUs (see the sketch after this list).
 - Option A also does not work very well in environments [with multiple "online renderers" at the same time](https://developer.apple.com/documentation/metal/gpu_selection_in_macos/understanding_multi-gpu_and_multi-display_setups). This is a somewhat rare case because it only occurs for users who have an eGPU (external GPU): if you have an external GPU with an external screen attached to it, and you also have an internal screen, then two screens are active at the same time but driven by different GPUs, and you can move Firefox windows between those screens. If one Firefox window is on one screen and another is on the other, and they both share a global OpenGL context, which GPU should that context be using?
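
To illustrate the first point: under Option A, a "lowest common denominator" capability query would have to enumerate every accelerated renderer in the system, query the capability (e.g. `GL_MAX_TEXTURE_SIZE`) on each one, and take the minimum. Illustrative sketch only, using throwaway contexts pinned to each renderer:

```cpp
// Illustrative sketch (not existing code): find the smallest GL_MAX_TEXTURE_SIZE
// across all accelerated renderers, which is what Option A's up-front Device
// decisions would have to be based on.
#include <OpenGL/OpenGL.h>
#include <OpenGL/gl.h>
#include <algorithm>

static GLint MinMaxTextureSizeAcrossGPUs() {
  CGLRendererInfoObj info = nullptr;
  GLint rendererCount = 0;
  // 0xFFFFFFFF means "all displays", i.e. enumerate every renderer in the system.
  CGLQueryRendererInfo(0xFFFFFFFF, &info, &rendererCount);
  GLint result = 0;
  for (GLint i = 0; i < rendererCount; i++) {
    GLint accelerated = 0;
    GLint rendererID = 0;
    CGLDescribeRenderer(info, i, kCGLRPAccelerated, &accelerated);
    CGLDescribeRenderer(info, i, kCGLRPRendererID, &rendererID);
    if (!accelerated) {
      continue;  // Skip the software renderer.
    }
    // Create a throwaway context pinned to this renderer and query its limit.
    CGLPixelFormatAttribute attribs[] = {
        kCGLPFARendererID, (CGLPixelFormatAttribute)rendererID,
        kCGLPFAAllowOfflineRenderers,
        (CGLPixelFormatAttribute)0};
    CGLPixelFormatObj pix = nullptr;
    GLint npix = 0;
    if (CGLChoosePixelFormat(attribs, &pix, &npix) != kCGLNoError || !pix) {
      continue;
    }
    CGLContextObj ctx = nullptr;
    if (CGLCreateContext(pix, nullptr, &ctx) == kCGLNoError && ctx) {
      CGLSetCurrentContext(ctx);
      GLint maxSize = 0;
      glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxSize);
      result = result == 0 ? maxSize : std::min(result, maxSize);
      CGLSetCurrentContext(nullptr);
      CGLDestroyContext(ctx);
    }
    CGLReleasePixelFormat(pix);
  }
  CGLDestroyRendererInfo(info);
  return result;  // 0 if no accelerated renderer was found.
}
```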

So I think Option B would be better and cleaner. However, it requires us to be able to migrate a Firefox window from a WebRender instance that uses GL context A to a WebRender instance that uses GL context B, ideally without flashing the existing window contents and in less than half a second.

How should we do this? "How much" of WebRender do we re-initialize when this happens? Do the main threads need to send new display lists? Do certain external textures need to be re-uploaded? Which threads coordinate the switch, and which threads get shut down and re-launched?
I think on Windows, device reset / "context lost" handling does something similar. Sotaro, can you describe how that works?