Bug 1716049 Comment 13 Edit History

## Some notes from Jeff and me looking into this:

- A simple way to reproduce the issue is to open a page containing a CSS animation, such as https://developer.mozilla.org/en-US/docs/Web/CSS/animation, in two Firefox windows. Setting the pref "gfx.webrender.all" to true in about:config should ensure that the GPU rendering backend is enabled even if we would have disabled it by default for the current configuration.

- Our rendering setup is rather simple. All GL commands are submitted from the same thread (the Renderer thread, which is not the thread the GTK event loop lives on, in case that's relevant). Each time we render a window, we call glXMakeCurrent, submit the GL commands and finally call glXSwapBuffers.

- On every X11 setup with NVIDIA hardware that I could test, as soon as multiple windows are presenting continuously, the glXMakeCurrent calls start taking a very long time (more than 16 milliseconds on the beefy Xeon processor I am testing on).

- Nouveau drivers aren't affected.

- The stack trace is consistently the one from comment 7 on all of the configurations I tested.

- As soon as only a single window is presenting, glXMakeCurrent goes back to taking on the order of 0.005 ms.

- No error is reported to error handlers registered via XSetErrorHandler.

- To be sure, I disabled all interactions with GLX on the vsync thread and switched to a purely timer-based vsync. It does not affect the problem at all.

- We call glXSwapInterval(1) once every frame. Just in case, I changed it to be called once during the initialization of each window, but the problem remains.

- Interestingly, the problem almost goes away if glXSwapInterval is set to 0. By that I mean that glXMakeCurrent takes around 1 ms, which is a lot more than I would expect (~0.005 ms) but at least leaves enough room for interactive frame rates.

- In a sibling bug, it is reported that ASAP mode fixes the issue. What's important here is that SwapInterval is set to 0 when ASAP mode is enabled. When enabling ASAP mode and forcing SwapInterval to 1, the problem still happens.

- Arthur Huillet suggested in comment 8 that the GLX context becomes (partially?) invalid and (some of it is) recreated when we call glXMakeCurrent. Note that textures and other persistent GPU resources aren't lost in the process.
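
For anyone wanting to collect comparable numbers, here is a minimal sketch of the kind of per-call timing that separates the ~0.005 ms case from the more-than-16 ms case. It is written in Python purely for illustration; `timed`, `slow_calls` and the 60 Hz frame budget are assumptions made for this sketch, not the instrumentation actually used in Gecko.

```python
import time
from contextlib import contextmanager

# Assumption: a 60 Hz display, so one frame is roughly 16.67 ms.
FRAME_BUDGET_MS = 1000.0 / 60.0


@contextmanager
def timed(label, samples):
    """Record how long the wrapped block took, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        samples.append((label, elapsed_ms))


def slow_calls(samples, threshold_ms=FRAME_BUDGET_MS):
    """Return the samples that took at least a whole frame budget."""
    return [(label, ms) for label, ms in samples if ms >= threshold_ms]


if __name__ == "__main__":
    samples = []
    # Hypothetical stand-in for the real MakeCurrent call being measured.
    with timed("make_current", samples):
        pass
    print(samples, slow_calls(samples))
```

Wrapping only the MakeCurrent call (and not the rendering or the swap) is what makes it possible to tell that the time is spent there rather than in the swap itself.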

## To me this smells like:

 - We get into a state that the NVIDIA driver doesn't like, which causes it to reinitialize some stuff. This initialization is costly, in particular because it issues a synchronous query to something (the X server?).
 - When SwapInterval is not zero, the sync query is not answered until the next vsync (which is unfortunate because we tend to call this early in the frame and end up waiting for a full frame).

The interaction of the two is what makes the slowdown so spectacular.
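
To put the numbers from the notes above in perspective, here is a small back-of-the-envelope calculation. It assumes a 60 Hz display (the refresh rate is not stated above) and treats the 16 ms figure as a lower bound, since the measurement was "more than 16 milliseconds". The helper names are made up for this sketch.

```python
# Assumption: 60 Hz display, i.e. about 16.67 ms per frame.
REFRESH_HZ = 60.0
FRAME_BUDGET_MS = 1000.0 / REFRESH_HZ

# Per-call MakeCurrent costs quoted in the notes above (milliseconds).
MAKE_CURRENT_MS = {
    "single window": 0.005,
    "multiple windows, swap interval 0": 1.0,
    "multiple windows, swap interval 1": 16.0,  # lower bound
}


def budget_share(cost_ms, window_count):
    """Fraction of one frame budget spent in MakeCurrent alone."""
    return cost_ms * window_count / FRAME_BUDGET_MS


def max_fps(cost_ms, window_count):
    """Upper bound on full passes per second if MakeCurrent were the only cost."""
    return 1000.0 / (cost_ms * window_count)


if __name__ == "__main__":
    for label, cost in MAKE_CURRENT_MS.items():
        windows = 1 if label == "single window" else 2
        print(f"{label}: {budget_share(cost, windows):.1%} of the frame budget, "
              f"at most {max_fps(cost, windows):.0f} passes/s")
```

With two windows, the swap-interval-1 case already spends more than a whole frame budget in MakeCurrent before any rendering happens, while the swap-interval-0 case stays at a small fraction of it, which matches the "enough room for interactive frame rates" observation.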

## Things to consider:

We do:

```
For each window:
    make_current()
    render()
    swap()
```

In various places online it is suggested that the following might be better when SwapInterval is not zero:

```
For each window:
    make_current()
    render()
    Flush()

For each window:
    make_current()
    swap()
```

That requires a bit of coordination between windows that doesn't exist right now, but it should be doable-ish. In reality we would also need to group windows per screen, especially if the screens have different refresh rates. It's worth trying, but there is no indication that it will fix this bug. There's a chance that it might mitigate the sync issue by avoiding a glXMakeCurrent call right after a swap, but the "real" issue is probably whatever causes MakeCurrent to issue that sync query in the first place.
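
As a rough sketch of what the two-pass loop with per-screen grouping could look like (Python used purely for illustration; `FakeWindow`, `present_two_pass` and the screen names are made up here and are not Gecko or WebRender APIs):

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FakeWindow:
    """Hypothetical stand-in for a window that owns a GL context."""
    name: str
    screen: str
    log: list = field(default_factory=list, repr=False)

    def make_current(self):
        self.log.append((self.name, "make_current"))

    def render(self):
        self.log.append((self.name, "render"))

    def flush(self):
        self.log.append((self.name, "flush"))

    def swap(self):
        self.log.append((self.name, "swap"))


def present_two_pass(windows):
    """Render and flush every window of a screen, then swap them all."""
    by_screen = defaultdict(list)
    for window in windows:
        by_screen[window.screen].append(window)

    for group in by_screen.values():
        # First pass: submit all the rendering work without presenting.
        for window in group:
            window.make_current()
            window.render()
            window.flush()
        # Second pass: present, so no rendering is queued behind a swap.
        for window in group:
            window.make_current()
            window.swap()


# Shared log so the global call order across windows is visible.
log = []
windows = [
    FakeWindow("A", "screen0", log),
    FakeWindow("B", "screen0", log),
    FakeWindow("C", "screen1", log),
]
present_two_pass(windows)
```

In this sketch the screen groups are simply processed one after the other. With different refresh rates, each group would presumably have to be driven by its own screen's vsync, which is the coordination that doesn't exist today.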

We could also accept tearing when multiple windows are presenting. It's kind of gross, but perhaps better than losing hardware acceleration altogether for all Linux users with proprietary NVIDIA drivers, which is what we'll have to do in the short term if we don't find a solution.

Switching to EGL also appears to fix this. It's probably how we'll get this fixed in the medium/long term.
