Closed Bug 1377950 Opened 2 years ago Closed 2 years ago

Open tabs blank out in Firefox after compositing is disabled and enabled on Linux

Categories

(Core :: Graphics, defect, P3)

55 Branch
x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: shtetldik, Assigned: rhunt)

References

Details

(Keywords: regression, Whiteboard: [gfx-noted])

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170629075044

Steps to reproduce:

In KDE, disabled and enabled compositing (by opening a fullscreen video with mpv) or explicitly (Alt+Shift+F12).

Configuration:
OS: Debian testing, x86_64. KDE Plasma 5.8.7

layers.acceleration.force-enabled = true
gfx.canvas.azure.accelerated = true
gfx.content.azure.accelerated = true

OpenGL renderer string: Gallium 0.4 on AMD POLARIS10 (DRM 3.10.0 / 4.11.0-1-amd64, LLVM 3.9.1)
OpenGL core profile version string: 4.3 (Core Profile) Mesa 13.0.6


Actual results:

Open tabs in Firefox became blank, and there is no way to restore them, except re-opening same URL in a new tab.


Expected results:

Tab should restore normally.
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Component: Untriaged → Graphics
Product: Firefox → Core
I just tested it with newer Mesa, and the issue still occurs.

OpenGL renderer string: AMD Radeon RX 480 Graphics (AMD POLARIS10 / DRM 3.10.0 / 4.11.0-1-amd64, LLVM 4.0.1)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.2.0-devel (git-277621bbb7)
Correction: I sometimes can observe this issue even when layers.acceleration.force-enabled = false

I also noticed, that when the tab blanks out, rotating throbber (lading symbol) in displayed in the middle of the page.
I have the exact same issue. I did some quick investigating, and have the following observations:

- Occurs on my desktop PC, Nvidia card, Nvidia proprietary driver, version 381.22
- Does not happen on my work PC, which is also running KDE, and the same version of Firefox as my desktop. Main difference I can deduce is that work PC is running Intel graphics.
- My desktop is running Arch Linux, X11, also KDE, plasma desktop version 5.10.3
- Does not occur with Firefox 55b3, started happening with Firefox 55b4, and is still present in 55b7.
- Does not happen when E10S is completely disabled (via browser.tabs.remote.autostart* prefs). Occurs with E10S enabled, even with only 1 content process.
- Affects all open tabs when it occurs. New tabs or loading of old tabs from previous session seem to start ok.
- On my system, takes 1-2s before it manifests after compositing is disabled. However, if a tab was opened with composting already disabled, enabling compositing immediately triggers the effect
- Interestingly enough, does not affect 'about:whatever' browser tabs. At the very least, 'about:config' and 'about:performance' continue to function.
- Occurs with a clean profile, tested using the web-ext tool to temporarily open a clean profile.
- Hardware acceleration is not enabled.
Correction: this does also happen with the Intel graphics on my work device. My workflow rarely results in compositing changes, so I simply never noticed until today.
Not specific to KDE. Restarting Gnome Shell via Alt-F2 and typing 'r' also triggers this bug.
Summary: Open tabs blank out in Firefox after compositing is disabled and enabled on Linux/KDE → Open tabs blank out in Firefox after compositing is disabled and enabled on Linux
I can confirm that this is happening. It seems to happen with and without layers.acceleration.force-enabled on.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Whiteboard: [gfx-noted]
Testing on Fedora 26 with Gnome and the following STR:

1. Go to a web page (I tested with https://bugzilla.mozilla.org)
2. Hit alt-f2
3. Enter 'r'

gnome-shell will restart and then the content area of the browser will turn perma-white.

I ran mozregression and got:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=6e9b45f7ed376eea83df779d60b3c4562b188a66&tochange=457e6aea4650b4cc8bbc6581342fdc44bc8dbec1

 Bug 1364355 - Implement client side decorations - Support ARGB visuals

Martin, could you provide some insight on this? Thanks!
Flags: needinfo?(stransky)
See Also: → 1364355
Duplicate of this bug: 1383156
Sure I'll look into it.
Flags: needinfo?(stransky)
Gnome-shell restart emits "composited-changed" signal which calls CleanLayerManagerRecursive() and this causes the screen cleaning. We may need to recreate the Layer managers after the removal.
It also produces this warning after the compositor recreation:

[Parent 29448] WARNING: ConnectReferentLayer failed - Incorrect LayerManager: file /home/komat/tmp676-trunk-gtk3/src/gfx/layers/Layers.h, line 2925
Blocks: 1364355
See Also: 1364355
It seems to be also related to e10s - I can't reproduce when e10s is disabled.
Lee, any idea where I should look at? I'm not familiar with this code.

It looks like to me that we call DestroyCompositor() for living widget then create the compositor again for the same widget by void nsBaseWidget::CreateCompositor() and we miss some init/binding as it affects only remote tabs - in process tabs are not affected.

Also I expect the DestroyCompositor() is usually called only when widget is destroyed and nsBaseWidget::CreateCompositor() only when the widget is created. Maybe we need to set the new LayerManager somewhere? Thanks!
Flags: needinfo?(lsalzman)
Duplicate of this bug: 1379476
Duplicate of this bug: 1387773
I'm affected by this bug on all of my Desktops

One with Intel another With NVIDIA, all runs on updated Arch Linux with Gnome-shell. 

I can always reproduce the bug by 

Alt+F2 > r

How can I help?
There's also a crash after the restart when keyboard is hit:

Thread 1 "firefox" received signal SIGSEGV, Segmentation fault.
0x00007fffe0b6860f in mozilla::layers::FocusState::IsCurrent (this=0x7fffb4eae8a8)
    at /home/komat/tmp676-trunk-gtk3/src/gfx/layers/apz/src/FocusState.cpp:31
31	  MOZ_ASSERT(mLastContentProcessedEvent <= mLastAPZProcessedEvent);
(In reply to Martin Stránský from comment #19)
> There's also a crash after the restart when keyboard is hit:

I filed bug 1388466 for this as it looks like a separate problem.
Duplicate of this bug: 1388473
The problem here is that the layer managers in the content processes aren't being destroyed and recreated to match the new layer manager.

I think we can use GPUProcessManager:OnInProcessDeviceReset() to do this. When I stub that in here [1] instead of CleanLayerManagerRecursive() the issue is resolved.

[1] http://searchfox.org/mozilla-central/rev/bd39b6170f04afeefc751a23bb04e18bbd10352b/widget/gtk/nsWindow.cpp#3404
Flags: needinfo?(lsalzman)
Assignee: nobody → rhunt
(In reply to Ryan Hunt [:rhunt] from comment #22)
> The problem here is that the layer managers in the content processes aren't
> being destroyed and recreated to match the new layer manager.
> 
> I think we can use GPUProcessManager:OnInProcessDeviceReset() to do this.
> When I stub that in here [1] instead of CleanLayerManagerRecursive() the
> issue is resolved.
> 

Thank you. I tried that. it fixed the issue for me.
So the current behavior here is when we get a "composited-changed" signal for a widget we will destroy the layer managers and compositor for the window and it's sub widgets (using CleanLayerManagerRecursive). They will then be recreated in the next paint.

This is problematic with E10S because there are layer managers attached to this compositor in the child process that are not a part of this widget hierarchy. So when this happens a new compositor is recreated for the widget, but the layer managers in the content processes are incorrect.

We have code using GPUProcessManager to handle device resets that does what we desire here. It will recreate all compositors that exist though, so running it for "composited-changed" per widget would result in recreating compositors too often when there are multiple widgets.

So these commits wire up a signal handler for "composited-handled" for the GdkScreen which will defer to GPUProcessManager to reset the compositors.
Comment on attachment 8896345 [details]
Bug 1377950 - Stub in a function for resetting all compositors with GPUProcessManager.

https://reviewboard.mozilla.org/r/167608/#review172832
Attachment #8896345 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8896346 [details]
Bug 1377950 - Reset all compositors when GdkScreen "composited-changed" fires.

https://reviewboard.mozilla.org/r/167610/#review173220

Thank you for the explanation and fix.

::: widget/gtk/nsWindow.cpp
(Diff revision 1)
>  
>          // mozilla.widget.use-argb-visuals is a hidden pref defaulting to false
>          // to allow experimentation
>          if (Preferences::GetBool("mozilla.widget.use-argb-visuals", false))
>              useAlphaVisual = true;
> -

Please don't remove this line.

::: widget/gtk/nsWindow.cpp:3881
(Diff revision 1)
> +
>          g_signal_connect(mShell, "composited-changed",
> -                         G_CALLBACK(composited_changed_cb), nullptr);
> +                         G_CALLBACK(widget_composited_changed_cb), nullptr);
> +
> +        if (!g_signal_handler_find(screen, G_SIGNAL_MATCH_FUNC,
> +                                   0, 0, 0,

Please pass nullptr, instead of 0, for the GClosure* parameter (the one before |func|).

::: widget/gtk/nsWindow.cpp:3882
(Diff revision 1)
>          g_signal_connect(mShell, "composited-changed",
> -                         G_CALLBACK(composited_changed_cb), nullptr);
> +                         G_CALLBACK(widget_composited_changed_cb), nullptr);
> +
> +        if (!g_signal_handler_find(screen, G_SIGNAL_MATCH_FUNC,
> +                                   0, 0, 0,
> +                                   (gpointer*)screen_composited_changed_cb, 0)) {

FuncToGpointer(), for a safer cast than (gpointer*)
Attachment #8896346 - Flags: review?(karlt) → review+
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2dd2794946eb
Stub in a function for resetting all compositors with GPUProcessManager. r=jrmuizel
https://hg.mozilla.org/integration/mozilla-inbound/rev/6936e0705405
Reset all compositors when GdkScreen "composited-changed" fires. r=karlt
https://hg.mozilla.org/mozilla-central/rev/2dd2794946eb
https://hg.mozilla.org/mozilla-central/rev/6936e0705405
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Keywords: regression
Duplicate of this bug: 1390761
Duplicate of this bug: 1377664
Can the fix be included in devedition?
Karl, can you please nominate this for Beta approval since Ryan is on PTO? Thanks!
Flags: needinfo?(karlt)
Duplicate of this bug: 1394108
Comment on attachment 8896346 [details]
Bug 1377950 - Reset all compositors when GdkScreen "composited-changed" fires.

Approval Request Comment
[Feature/Bug causing the regression]:
e10s
[User impact if declined]:
Tabs go unrecoverably blank in some reasonably common situations. 
[Is this code covered by automated tests?]:
No.
[Has the fix been verified in Nightly?]:
Yes.
[Needs manual test from QE? If yes, steps to reproduce]:
No.
[List of other uplifts needed for the feature/fix]:
None AFAIK.
[Is the change risky?]:
Yes.
[Why is the change risky/not risky?]:
This is a completely different code path.

The behavior doesn't seem quite as expected.
Diagonal edges on the hamburger and identity dropdowns are still jaggy on
compositing window managers if first opened under non-compositing window
managers.  That's not a change in behavior from this patch, however.  It just
means that the behavior is not understood, and so it is hard to predict all
effects of the change.

An alternative, possibly less risky change would be to revert
https://hg.mozilla.org/mozilla-central/rev/d04f932669e7
That would remove the situation known to trigger the bug, and so wouldn't need
the workaround provided by this patch.

[String changes made/needed]:
None.
Flags: needinfo?(karlt)
Attachment #8896346 - Flags: approval-mozilla-beta?
Comment on attachment 8896346 [details]
Bug 1377950 - Reset all compositors when GdkScreen "composited-changed" fires.

Fix a regression and was verified. Beta56+.
Attachment #8896346 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I can confirm that it works (Firefox 56.0b07), but the fix is far from ideal. When compositing is toggled (for example when some video player or other application switched to fullscreen and back in KDE), all Firefox windows noticeably blink which looks pretty annoying and out of place. It's surely better than completely blanking out, but some better solution is still needed that won't have any blinking involved.
If needed, I can open a new bug for the new behavior, or rename the current one.
(In reply to Shmerl from comment #42)
> If needed, I can open a new bug for the new behavior, or rename the current
> one.

Please file a new bug instead.
I opened Bug 1396418.
Duplicate of this bug: 1378462
Duplicate of this bug: 1391216
You need to log in before you can comment on or make changes to this bug.