Closed Bug 1478454 Opened 6 years ago Closed 6 years ago

[Proprietary Nvidia driver] BadMatch failure at glXMakeCurrent()

Categories

(Core :: Graphics: WebRender, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla63
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox61 --- unaffected
firefox62 --- unaffected
firefox63 --- verified

People

(Reporter: jan, Assigned: stransky)

References

Details

(Keywords: nightly-community, regression)

Attachments

(6 files)

Yesterday, bug 1406533 introduced a regression that completely breaks WebRender for Linux users.
In bug 1357487 you are planning to enable OOP Webextensions (breaking WebRender) on Linux by default.
Risk: Linux users that have been told that they just need to flip gfx.webrender.all to true are no longer testing WebRender.

https://hg.mozilla.org/integration/autoland/graph/4b73b8c72408
> mozregression --repo autoland --launch 9bce60d44498 --pref extensions.webextensions.remote:true gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
-> last good: WebRender works, but OOP webextension panels have a white border.

> mozregression --repo autoland --launch 4b73b8c72408 --pref extensions.webextensions.remote:true gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
-> bad: Clicking on an icon of an OOP webextension breaks WebRender and lets Nightly fall back to OpenGL

STR:
1. Install https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/
2. Click on uBlock's icon.
3. The window has been white for a moment because it broke WebRender:

about:support
> Compositing	OpenGL
[...]
> Failure Log
> (#0) Error	Failed GL context creation for WebRender: 0
> (#1) Error	Compositors might be mixed (5,2)
Keywords: regression
Martin, can you please take a look into this? This would seem somewhat urgent if it breaks WebRender.
Flags: needinfo?(stransky)
(Kris Maglione [:kmag] from bug 1357487 comment 12)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/e5423d29aaf0b711e02b68b7340a2297dd6bfe16
> Bug 1357487: Enable OOP extensions by default on all platforms. r=aswan
Severity: normal → major
Component: Graphics: WebRender → Widget: Gtk
(In reply to Lee Salzman [:lsalzman] from comment #1)
> Martin, can you please take a look into this? This would seem somewhat
> urgent if it breaks WebRender.

Sure I'll look at it.
Flags: needinfo?(stransky)
Can you be more specific about where and how you see the regression? I built the latest trunk on Fedora 28/gnome-shell, force-enabled WebRender and HW acceleration (which is disabled by default on my HW - Mesa DRI Intel(R) HD Graphics 530) and I see OOP webextension panels with a white border. But I don't see any failures/crashes. about:support is also quiet.
Flags: needinfo?(jan)
Assignee: nobody → stransky
Let's see if the patch fixes that - the IsComposited() routine has not been working reliably for the WebRender backend with a GLX visual recently.

Also, we haven't covered the situation where WebRender is enabled and we're on a non-compositing WM (I don't know if such a config is even possible) - let's at least throw a warning.
(In reply to Martin Stránský [:stransky] from comment #7)
> Let's see if the patch fixes that - the IsComposited() routine has not been
> working reliably for the WebRender backend with a GLX visual recently.
> 
> Also, we haven't covered the situation where WebRender is enabled and we're
> on a non-compositing WM (I don't know if such a config is even possible) -
> let's at least throw a warning.

Given that Kwin allows users to toggle compositing on and off at runtime with Alt+Shift+F12, I'd say such a config is very possible.

(And I *do* make use of it because toggling compositing off and then on again solves a problem I experience on my Kubuntu 14.04 LTS system where YouTube playback under Firefox Developer Edition or general rendering in Chromium can spontaneously become janky after they've been running for days at a time.)
Attached video 2018-07-26_13-48-10.mp4
(In reply to Martin Stránský [:stransky] from comment #5)
Debian Testing, Gnome, Xorg, Nvidia GTX 1060

Screencast: Running the build from comment 3 where extensions.webextensions.remote was set to true by default:
mozregression --repo mozilla-inbound --launch e5423d29aaf0b711e02b68b7340a2297dd6bfe16 --pref gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'

gfx.webrender.all already internally switches layers.acceleration.force-enabled to true for convenience reasons.

(I tagged comment 3 as "assumption" because later I saw that you removed a "CreateBasicLayerManager();" with a patch in bug 1406533, so maybe Basic composition was used before? I'm not a developer. If I interpret bug 1377321 correctly, WebRender is desired for remote webextensions and Basic for non-remote?)
Flags: needinfo?(jan)
Attached video 2018-07-26_14-07-36.mp4
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fbabeba5fb1a0b3750bacee795bbea0517f2a335

mozregression --repo try --launch fbabeba5fb1a0b3750bacee795bbea0517f2a335 --pref extensions.webextensions.remote:true gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
It's not fixed in your try build.

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #0)
> -> bad: Clicking on an icon of an OOP webextension breaks WebRender and lets Nightly fall back to OpenGL

Precisely: Hovering an icon of an OOP Webextension lets Nightly fall back to OpenGL compositing.
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #9)
> and Basic for non-remote?)
I think that was related to the awesome bar (bug 1377321 comment 11)? I can only show screencasts of the regression and I wish somebody else could explain how it should be..
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #11)
> It's not fixed in your try build.
> 
> (In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #0)
> > -> bad: Clicking on an icon of an OOP webextension breaks WebRender and lets Nightly fall back to OpenGL
> 
> Precisely: Hovering an icon of an OOP Webextension lets Nightly fall back to
> OpenGL compositing.

Do I understand correctly that something is fixed by this patch (clicking on the OOP extension popup) but something remains unfixed (hovering an icon of an OOP Webextension)?

Also, do you have compositing enabled or disabled?
Flags: needinfo?(jan)
(In reply to Martin Stránský [:stransky] from comment #13)
> Do I understand correctly that something is fixed by this patch (clicking on the OOP extension popup) but something remains unfixed (hovering an icon of an OOP Webextension)?

Comment 9 (inbound) and comment 10 (your try build) have the same behavior. Your patch did not fix the regression.

           this bug: "OOP webext + WR" was regressed by bug 1406533.
FYI/OT: bug 1406230: "OOP webext + WR + GPU process" was already broken in the past. Your current patch sounds related(?)

> Also, do you have compositing enabled or disabled?
I started Gnome (because you prefer it?) on Xorg for the screencast without any change. I don't know.
My regular KDE has a Compositor in the settings which is "enabled at startup" (Output: OpenGL 3.1, VSync: auto).
Flags: needinfo?(jan)
Attachment #8995151 - Flags: review?(jhorak)
Depends on: 1478661
Attachment #8995151 - Attachment is obsolete: true
Okay, Thanks. Let's move the IsComposited patch to a different bug as we need it anyway.
Unfortunately I can't reproduce this bug, with latest nightly builds from mozilla or with my own builds on Fedora 28 / Gnome / Intel GFX. 

I'll investigate how and why the compositor is changed from WebRender to OpenGL. Also, the regression from Bug 1406533 may come from the removed popup hacks.
I can't reproduce this either, but maybe see https://bugzilla.mozilla.org/show_bug.cgi?id=1401634 which has the same error?
Yes, with layers.gpu-process.enabled set to true I can reproduce this bug; it works fine when layers.gpu-process.enabled is set to false. Can you please confirm that?
Flags: needinfo?(jan)
Bug 1406230 may be also related.
See Also: → 1406230
(In reply to Martin Stránský [:stransky] from comment #15)
> I'll investigate how and why the compositor is changed from WebRender to OpenGL. Also, the regression from Bug 1406533 may come from the removed popup hacks.

My suspicion is now that bug 1406533 just cleaned things up and the correct way to display remote widgets with OpenGL (which the gpu process might have tried to do?) was already broken.

Re comment 16:
Yes, it looks like there is no longer a difference between running with and without the GPU process when using WebRender with OOP Webextensions. (bug 1406230 comment 8 compared to comment 0)
Flags: needinfo?(jan)
Summary: regression: Bug 1406533 broke WebRender on Linux (Fallback to OpenGL) → [layers.gpu-process.enabled = true] regression: Bug 1406533 broke WebRender on Linux (Fallback to OpenGL)
[layers.gpu-process.enabled = true for stransky, but false (default) for darkspirit] ;)
Now I disabled KDE's compositor, rebooted, ran your try build with the command from comment 10 and behavior is the same.
Attached file aboutsupport.txt
Made with the command from comment 10 (try build) on Gnome Classic on Debian Testing.
Component: Widget: Gtk → Graphics
Summary: [layers.gpu-process.enabled = true] regression: Bug 1406533 broke WebRender on Linux (Fallback to OpenGL) → [layers.gpu-process.enabled = true] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL)
Ubuntu 18.04 LTS, Live system, GTX 1060 3GB, nouveau -- NV136, 3.0 Mesa 18.0.0-rc5

build from bug 1357487 comment 17 (extensions.webextensions.remote;true):
> mozregression --launch ef1550969466 --pref gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
* It does not fall back to OpenGL
* The panel has a fat white border (like the black border before bug 1406533): bug 1444595 is already about this on Mac.

> mozregression --launch ef1550969466 --pref layers.acceleration.force-enabled:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
* is just fine

> mozregression --launch ef1550969466 --pref layers.gpu-process.enabled:true gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/'
* is the old bug 1406230.

I would need to install Ubuntu 18.04 with the proprietary Nvidia driver if Debian Testing with nouveau turns out to be fine too.
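The three runs above differ only in their pref sets, so the command lines can be generated rather than retyped. The following is a minimal sketch that uses only the mozregression flags already shown in this bug (`--repo`, `--launch`, `--pref`, `-a`); the helper name `mozregression_cmd` and the labels in `PREF_SETS` are hypothetical and exist only for illustration.

```python
import shlex

# Add-on used in every reproduction command in this bug.
ADDON_URL = "https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/"


def mozregression_cmd(revision, prefs, repo=None, addon=ADDON_URL):
    """Build an argv list for mozregression (hypothetical helper).

    Only flags that appear verbatim in this bug are used.
    """
    argv = ["mozregression"]
    if repo:
        argv += ["--repo", repo]
    argv += ["--launch", revision]
    if prefs:
        # --pref takes one or more name:value pairs.
        argv.append("--pref")
        argv += [f"{name}:{value}" for name, value in prefs.items()]
    argv += ["-a", addon]
    return argv


# The pref combinations compared above; the labels are illustrative only.
PREF_SETS = {
    "webrender": {"gfx.webrender.all": "true"},
    "opengl": {"layers.acceleration.force-enabled": "true"},
    "webrender+gpu-process": {
        "layers.gpu-process.enabled": "true",
        "gfx.webrender.all": "true",
    },
}

if __name__ == "__main__":
    for label, prefs in PREF_SETS.items():
        print(f"# {label}")
        print(shlex.join(mozregression_cmd("ef1550969466", prefs)))
```

Printing the commands instead of executing them keeps the sketch free of side effects; the printed lines can be pasted into a shell on the machine under test.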
Summary: [layers.gpu-process.enabled = true] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL) → [Debian Testing, proprietary Nvidia driver] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL)
Debian Testing, GTX 1060 3GB, nouveau -- NV136, 3.1 Mesa 18.1.4, libgtk-3-0 3.22.30-2
It's the same as comment 23.
So I have to install Ubuntu with proprietary Nvidia driver tomorrow to check if Debian Testing itself is innocent.

In general: There is a small population of WebRender users with proprietary Nvidia on Linux (example: bug 1401455).
Attached file debuglog.txt
RUST_BACKTRACE=1 mozregression --launch 87bcafe428a4ad6017e59b915581ae00aa863407 -B debug --pref gfx.webrender.all:true -a 'https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/' 2>&1 > debuglog.txt

> 0:51.06 INFO: [Parent 25460, Renderer] ###!!! ASSERTION: Failed to make GL context current!: 'succeeded', file /builds/worker/workspace/build/src/gfx/gl/GLContextProviderGLX.cpp, line 619
> 0:51.09 INFO: [Parent 25460, Renderer] WARNING: GLContext::InitWithPrefix failed!: file /builds/worker/workspace/build/src/gfx/gl/GLContext.cpp, line 351
> 0:51.09 INFO: [Parent 25460, Renderer] WARNING: Failed to create GLXContext!: file /builds/worker/workspace/build/src/gfx/gl/GLContextProviderGLX.cpp, line 558
> 0:51.09 INFO: [GFX1-]: Failed GL context creation for WebRender: 0
> 0:51.09 INFO: [Parent 25460, Compositor] WARNING: Possibly dropping task posted to updater thread: file /builds/worker/workspace/build/src/gfx/layers/apz/src/APZUpdater.cpp, line 416
> 0:51.11 INFO: [GLX] window 640003b has VisualID 0x28
> 0:51.18 INFO: [GLX] window 640001d has VisualID 0x27
> 0:51.18 INFO: [GFX1-]: Compositors might be mixed (5,2)
> 0:51.20 INFO: [Parent 25460, Compositor] WARNING: Created child without a matching parent?: file /builds/worker/workspace/build/src/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp, line 104
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to allocate layer transaction: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 2889
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to recreate layer manager: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 3225
> 0:51.20 INFO: [Parent 25460, Compositor] WARNING: Created child without a matching parent?: file /builds/worker/workspace/build/src/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp, line 104
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to allocate layer transaction: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 2889
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to recreate layer manager: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 3225
> 0:51.20 INFO: [Parent 25460, Compositor] WARNING: Created child without a matching parent?: file /builds/worker/workspace/build/src/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp, line 104
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to allocate layer transaction: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 2889
> 0:51.20 INFO: [Child 25549, Main Thread] WARNING: failed to recreate layer manager: file /builds/worker/workspace/build/src/dom/ipc/TabChild.cpp, line 3225
> 0:52.39 INFO: [Parent 25460, Main Thread] ###!!! ASSERTION: Creating widget for MenuPopupFrame with children: '!mGeneratedChildren && !PrincipalChildList().FirstChild()', file /builds/worker/workspace/build/src/layout/xul/nsMenuPopupFrame.cpp, line 259
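For longer debug logs it can help to pull out just the lines that matter here. This is a rough sketch, not part of any Mozilla tooling: it collects the `[GLX] window ... has VisualID ...` lines and the three failure messages seen in the excerpt above. The function name `summarize` and the `SAMPLE` text (copied from the log above) are illustrative assumptions.

```python
import re

# Matches lines such as "[GLX] window 640003b has VisualID 0x28".
VISUAL_RE = re.compile(
    r"\[GLX\] window ([0-9a-fA-F]+) has VisualID (0x[0-9a-fA-F]+)"
)

# Messages from the log above that accompany the fallback.
MARKERS = (
    "Failed to make GL context current",
    "Failed GL context creation for WebRender",
    "Compositors might be mixed",
)


def summarize(log_text):
    """Return ({window_id: visual_id}, [markers in order of appearance])."""
    visuals = {}
    hits = []
    for line in log_text.splitlines():
        match = VISUAL_RE.search(line)
        if match:
            visuals[match.group(1)] = int(match.group(2), 16)
        hits.extend(marker for marker in MARKERS if marker in line)
    return visuals, hits


# A few lines copied from the log excerpt above.
SAMPLE = """\
0:51.06 INFO: [Parent 25460, Renderer] ###!!! ASSERTION: Failed to make GL context current!: 'succeeded', file /builds/worker/workspace/build/src/gfx/gl/GLContextProviderGLX.cpp, line 619
0:51.09 INFO: [GFX1-]: Failed GL context creation for WebRender: 0
0:51.11 INFO: [GLX] window 640003b has VisualID 0x28
0:51.18 INFO: [GLX] window 640001d has VisualID 0x27
0:51.18 INFO: [GFX1-]: Compositors might be mixed (5,2)
"""

if __name__ == "__main__":
    visuals, hits = summarize(SAMPLE)
    for window, visual in visuals.items():
        print(f"window {window}: VisualID {visual:#x}")
    print("fallback markers:", hits)
```

The log alone does not say which window the failing context belonged to, so the output is only a starting point for comparing visuals across windows.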
This bug is reproducible with:
* Ubuntu 18.04 LTS, nvidia-driver-390 (390.48-0ubuntu3)
* Debian Testing, nvidia-driver (390.67-3)
* Debian Testing, nvidia-driver (390.77-1) from unstable
* Debian Testing, nvidia-driver (396.45-1) from experimental
Summary: [Debian Testing, proprietary Nvidia driver] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL) → [Proprietary Nvidia driver] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL)
Component: Graphics → Graphics: WebRender
Yes, I can see that on nvidia proprietary driver now - it fails at glXMakeCurrent().
Summary: [Proprietary Nvidia driver] since fix for Bug 1406533 OOP webextension panels break WebRender on Linux (Fallback to OpenGL) → [Proprietary Nvidia driver] Failure at glXMakeCurrent()
Summary: [Proprietary Nvidia driver] Failure at glXMakeCurrent() → [Proprietary Nvidia driver] BadMatch failure at glXMakeCurrent()
In my case the problem is that the X Drawable is created with visual 0x28, which has an RGBA color format, but GLContextGLX::FindFBConfigForWindow() selects visual 0x27, which is RGB only.

This bug affects transparent (popup) windows, which are created with an alpha channel at nsWindow::Create(). AreCompatibleVisuals() fails to match the alpha channel info and selects a non-alpha visual (this depends on visual ID sorting).

I wonder whether the proper fix is to not call AreCompatibleVisuals() for WebRender visuals, as we already set the correct one at nsWindow::Create(), or to update AreCompatibleVisuals() to also match the alpha channel. Lee, any idea here?
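The mismatch described above can be illustrated with a toy model. This is a simplified sketch in Python, not the Gecko code: the `Visual` class, the 8-bit channel sizes, and the three selection functions are assumptions made for illustration, and the real AreCompatibleVisuals() compares different fields. Only the visual IDs 0x27 (RGB) and 0x28 (RGBA) come from this bug.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visual:
    """Toy stand-in for an X visual (hypothetical, not XVisualInfo)."""
    visual_id: int
    rgb_bits: tuple  # (red, green, blue) bits per channel, assumed values
    alpha_bits: int  # 0 for an RGB-only visual


def compatible_rgb_only(a, b):
    # Models a compatibility check that ignores the alpha channel.
    return a.rgb_bits == b.rgb_bits


def compatible_with_alpha(a, b):
    # Models the second proposed fix: also compare the alpha channel.
    return a.rgb_bits == b.rgb_bits and a.alpha_bits == b.alpha_bits


def pick_first_compatible(window_visual, candidates, compatible):
    # The first compatible candidate in visual ID order wins.
    for candidate in sorted(candidates, key=lambda v: v.visual_id):
        if compatible(window_visual, candidate):
            return candidate
    return None


def pick_by_id(window_visual, candidates):
    # Models the first proposed fix: reuse the visual the window already has.
    for candidate in candidates:
        if candidate.visual_id == window_visual.visual_id:
            return candidate
    return None


RGB = Visual(0x27, (8, 8, 8), 0)
RGBA = Visual(0x28, (8, 8, 8), 8)
CANDIDATES = [RGBA, RGB]

if __name__ == "__main__":
    window = RGBA  # transparent popup created with an alpha channel
    for name, chosen in (
        ("rgb-only check", pick_first_compatible(window, CANDIDATES, compatible_rgb_only)),
        ("alpha-aware check", pick_first_compatible(window, CANDIDATES, compatible_with_alpha)),
        ("match by visual ID", pick_by_id(window, CANDIDATES)),
    ):
        print(f"{name}: selected {chosen.visual_id:#x}")
```

In this model the RGB-only check selects 0x27 for a window created with 0x28, because 0x27 sorts first and looks compatible once alpha is ignored. Both proposed fixes select 0x28.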
Flags: needinfo?(lsalzman)
See Also: → 1401455
(It looks like Sotaro wants to fix transparency for OOP webextension panels in general.)
See Also: → 1479181
Seems to be related to Bug 1193015
See Also: 1479181 → 1193015
Can you put up your fix from bug 1479181 comment 18 here? Thanks.
Flags: needinfo?(stransky)
Sure, but that needs to be investigated more as there's also the ATI workaround involved.
Flags: needinfo?(stransky)
(In reply to Martin Stránský [:stransky] from comment #28)
> Yes, I can see that on nvidia proprietary driver now - it fails at glXMakeCurrent().

(Sotaro Ikeda [:sotaro] from bug 1479181 comment 31)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=105a82ae43a2c52163ae2b66de9550e2aad32965

It looks like the new patches in bug 1479181 might have fixed this bug:
mozregression --repo try --launch 105a82ae43a2c52163ae2b66de9550e2aad32965 --pref gfx.webrender.all:true gfx.webrender.debug.compact-profiler:true gfx.webrender.debug.profiler:true -a https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/
* No fallback to OpenGL
* Transparent OOP webextension widgets
I tracked the ATI hack here to Bug 572939 - I wonder if we still need it. Unfortunately I don't have any ATI handy to test that.
See Also: → 572939
Flags: needinfo?(lsalzman)
Comment on attachment 8995151 [details]
Bug 1478454 - [Linux/WebRender] Create glxContext with GLX visual chosen at nsWindow::Create(),

https://reviewboard.mozilla.org/r/259626/#review268322
Attachment #8995151 - Flags: review?(jgilbert) → review+
(In reply to Martin Stránský [:stransky] from comment #35)
> I tracked the ATI hack here to Bug 572939 - I wonder if we still need it.
> Unfortunately I don't have any ATI handy to test that.

I have a Radeon RX 560 set up with amdgpu, if that would help, but bug 572939 might have been about fglrx (now amdgpu-pro)?
(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #38)
> (In reply to Martin Stránský [:stransky] from comment #35)
> > I tracked the ATI hack here to Bug 572939 - I wonder if we still need it.
> > Unfortunately I don't have any ATI handy to test that.
> 
> I have a Radeon RX 560 set up with amdgpu, if that would help, but bug
> 572939 might have been about fglrx (now amdgpu-pro)?

Okay, let's remove that hack, as it may cause the BadMatch GL failure and fglrx is no longer used by AMD cards. I'll file a follow-up bug for that.
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/9600a4859665
[Linux/WebRender] Create glxContext with GLX visual chosen at nsWindow::Create(), r=jgilbert
(In reply to Martin Stránský [:stransky] from comment #39)
> Okay, let's remove that hack, as it may cause the BadMatch GL failure and
> fglrx is no longer used by AMD cards. I'll file a follow-up bug for that.

Filed as Bug 1481145.
Backed out changeset 9600a4859665 (bug 1478454) for causing leaks

Log:
https://treeherder.mozilla.org/logviewer.html#?job_id=192198467&repo=autoland&lineNumber=8538

 TEST-PASS | leakcheck | tab process: no leaks detected!
[task 2018-08-06T09:05:11.630Z] 09:05:11     INFO - 
[task 2018-08-06T09:05:11.631Z] 09:05:11     INFO - == BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 2096
[task 2018-08-06T09:05:11.632Z] 09:05:11     INFO - 
[task 2018-08-06T09:05:11.633Z] 09:05:11     INFO -      |<----------------Class--------------->|<-----Bytes------>|<----Objects---->|
[task 2018-08-06T09:05:11.634Z] 09:05:11     INFO -      |                                      | Per-Inst   Leaked|   Total      Rem|
[task 2018-08-06T09:05:11.635Z] 09:05:11     INFO -    0 |TOTAL                                 |       43      936| 3078763       12|
[task 2018-08-06T09:05:11.636Z] 09:05:11     INFO -    1 |APZCTreeManager::CheckerboardFlushObse|       32       32|       9        1|
[task 2018-08-06T09:05:11.637Z] 09:05:11     INFO -    5 |APZUpdater                            |      392      392|       9        1|
[task 2018-08-06T09:05:11.638Z] 09:05:11     INFO -    7 |APZUpdater::ClearTree                 |       40       40|       9        1|
[task 2018-08-06T09:05:11.639Z] 09:05:11     INFO -  382 |IAPZCTreeManager                      |       16       16|       9        1|
[task 2018-08-06T09:05:11.640Z] 09:05:11     INFO -  427 |InputQueue                            |       80       80|       9        1|
[task 2018-08-06T09:05:11.641Z] 09:05:11     INFO -  501 |Mutex                                 |       72      360|    3444        5|
[task 2018-08-06T09:05:11.649Z] 09:05:11     INFO - 1810 |nsTArray_base                         |        8       16| 1355346        2|
[task 2018-08-06T09:05:11.650Z] 09:05:11     INFO - 
[task 2018-08-06T09:05:11.651Z] 09:05:11     INFO - nsTraceRefcnt::DumpStatistics: 1951 entries
[task 2018-08-06T09:05:11.652Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 1 APZCTreeManager::CheckerboardFlushObse
[task 2018-08-06T09:05:11.653Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 1 APZUpdater
[task 2018-08-06T09:05:11.654Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 1 APZUpdater::ClearTree
[task 2018-08-06T09:05:11.654Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 1 IAPZCTreeManager
[task 2018-08-06T09:05:11.655Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 1 InputQueue
[task 2018-08-06T09:05:11.656Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 5 Mutex
[task 2018-08-06T09:05:11.657Z] 09:05:11     INFO - TEST-INFO | leakcheck | default process: leaked 2 nsTArray_base
[task 2018-08-06T09:05:11.658Z] 09:05:11    ERROR - TEST-UNEXPECTED-FAIL | leakcheck | default process: 936 bytes leaked (APZCTreeManager::CheckerboardFlushObse, APZUpdater, APZUpdater::ClearTree, IAPZCTreeManager, InputQueue, ...)
[task 2018-08-06T09:05:11.659Z] 09:05:11     INFO - runtests.py | Running tests: end.
[task 2018-08-06T09:05:11.676Z] 09:05:11     INFO - Buffered messages finished
[task 2018-08-06T09:05:11.677Z] 09:05:11     INFO - Running manifest: toolkit/components/extensions/test/mochitest/mochitest-common.ini
[task 2018-08-06T09:05:11.678Z] 09:05:11     INFO - The following extra prefs will be set:
[task 2018-08-06T09:05:11.679Z] 09:05:11     INFO -   security.mixed_content.upgrade_display_content=false
[task 2018-08-06T09:05:11.680Z] 09:05:11     INFO -   browser.chrome.guess_favicon=true
[task 2018-08-06T09:05:11.798Z] 09:05:11     INFO -  Setting pipeline to PAUSED ...
[task 2018-08-06T09:05:11.799Z] 09:05:11     INFO -  Pipeline is PREROLLING ...
[task 2018-08-06T09:05:11.800Z] 09:05:11     INFO -  Pipeline is PREROLLED ...
[task 2018-08-06T09:05:11.801Z] 09:05:11     INFO -  Setting pipeline to PLAYING ...
[task 2018-08-06T09:05:11.802Z] 09:05:11     INFO -  New clock: GstSystemClock
[task 2018-08-06T09:05:11.839Z] 09:05:11     INFO -  Got EOS from element "pipeline0".
[task 2018-08-06T09:05:11.839Z] 09:05:11     INFO -  Execution ended after 0:00:00.033445765
[task 2018-08-06T09:05:11.840Z] 09:05:11     INFO -  Setting pipeline to PAUSED ...
[task 2018-08-06T09:05:11.841Z] 09:05:11     INFO -  Setting pipeline to READY ...
[task 2018-08-06T09:05:11.842Z] 09:05:11     INFO -  (gst-launch-1.0:2272): GStreamer-CRITICAL **: gst_object_unref: assertion '((GObject *) object)->ref_count > 0' failed
[task 2018-08-06T09:05:11.842Z] 09:05:11     INFO -  Setting pipeline to NULL ...
[task 2018-08-06T09:05:11.843Z] 09:05:11     INFO -  Freeing pipeline ...
[task 2018-08-06T09:05:12.188Z] 09:05:12     INFO -  pk12util: PKCS12 IMPORT SUCCESSFUL
[task 2018-08-06T09:05:12.524Z] 09:05:12     INFO - MochitestServer : launching [u'/builds/worker/workspace/build/tests/bin/xpcshell', '-g', '/builds/worker/workspace/build/application/firefox', '-f', '/builds/worker/workspace/build/tests/bin/components/httpd.js', '-e', "const _PROFILE_PATH = '/tmp/tmp8lxwmB.mozrunner'; const _SERVER_PORT = '8888'; const _SERVER_ADDR = '127.0.0.1'; const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false;", '-f', '/builds/worker/workspace/build/tests/mochitest/server.js']
[task 2018-08-06T09:05:12.524Z] 09:05:12     INFO - runtests.py | Server pid: 2295
[task 2018-08-06T09:05:12.612Z] 09:05:12     INFO - runtests.py | Websocket server pid: 2299
[task 2018-08-06T09:05:12.775Z] 09:05:12     INFO - runtests.py | SSL tunnel pid: 2309
[task 2018-08-06T09:05:12.912Z] 09:05:12     INFO -  Couldn't convert chrome URL: chrome://branding/locale/brand.properties
[task 2018-08-06T09:05:12.912Z] 09:05:12     INFO -  [2295, Main Thread] WARNING: Could not get the program name for a cubeb stream.: 'NS_SUCCEEDED(rv)', file /builds/worker/workspace/build/src/dom/media/CubebUtils.cpp, line 363
[task 2018-08-06T09:05:12.929Z] 09:05:12     INFO - runtests.py | Running with e10s: True
[task 2018-08-06T09:05:12.929Z] 09:05:12     INFO - runtests.py | Running tests: start.
[task 2018-08-06T09:05:12.929Z] 09:05:12     INFO - 
[task 2018-08-06T09:05:12.950Z] 09:05:12     INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox -marionette -foreground -profile /tmp/tmp8lxwmB.mozrunner
[task 2018-08-06T09:05:13.032Z] 09:05:13     INFO - runtests.py | Application pid: 2318
[task 2018-08-06T09:05:13.033Z] 09:05:13     INFO - TEST-INFO | started process GECKO(2318)

Push with failures:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=9600a4859665fbb35e72e392fb2a08050459a3fe

Backout:
https://hg.mozilla.org/integration/autoland/rev/5ac815894d929774a528515b61742b0b24551aff
Flags: needinfo?(stransky)
(In reply to Martin Stránský [:stransky] from comment #39)
> Okay, let's remove that hack, as it may cause the BadMatch GL failure and
> fglrx is no longer used by AMD cards. I'll file a follow-up bug for that.

AMDGPU-PRO appears to be a direct descendent of fglrx (we had to care about this for sandboxing), but it uses the same kernel driver as the open-source amdgpu, and it's been 8 years since bug 572939, so possibly the double-buffering bug is either fixed or inapplicable.
Hm, I'm looking at that code and can't find anything that could possibly leak. The leak is even in a different module: this code touches gfx/gl, and APZ* is from gfx/layers.
Flags: needinfo?(stransky)
Flags: needinfo?(stransky)
(In reply to Martin Stránský [:stransky] from comment #44)
> Hm, I'm looking at that code and can't find anything that could possibly leak.
> The leak is even in a different module: this code touches gfx/gl, and APZ* is from gfx/layers.

Bug 1465658 and bug 1446181 are about some intermittent APZ leaks on Win10. Could it be related?
Looking at [1], which is the log for one of the leaking mochitest-15 jobs, I see this:

 [GFX1-]: Failed GL context creation for WebRender: 0
...
 [GFX1-]: Compositors might be mixed (5,1)

which seems to imply we're falling back from WR to Basic compositing at some point. That's likely exposing a pre-existing leak in the APZ code that happens in this sort of edge case. I can look into the leak but presumably we shouldn't be falling back to Basic compositing at all, and that might be a result of your patch.

[1] https://taskcluster-artifacts.net/Bguws0qVSMahPHY0Cqc36Q/0/public/logs/live_backing.log
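The number pairs in the "Compositors might be mixed" message can be decoded mechanically. The sketch below is an assumption-laden illustration: the mapping 5 = WebRender, 2 = OpenGL, 1 = Basic is inferred only from how the (5,2) and (5,1) messages are described in this bug and has not been checked against the Gecko source. Any other value is reported as unknown, and the name `decode_mixed` is hypothetical.

```python
import re

# Inferred from this bug only: (5,2) accompanied a fallback to OpenGL and
# (5,1) a fallback from WebRender to Basic. Not verified against Gecko.
BACKEND_NAMES = {1: "Basic", 2: "OpenGL", 5: "WebRender"}

MIXED_RE = re.compile(r"Compositors might be mixed \((\d+),(\d+)\)")


def decode_mixed(line):
    """Return a pair of backend names for a log line, or None if absent."""
    match = MIXED_RE.search(line)
    if match is None:
        return None
    return tuple(
        BACKEND_NAMES.get(int(value), f"unknown({value})")
        for value in match.groups()
    )


if __name__ == "__main__":
    for line in (
        "[GFX1-]: Compositors might be mixed (5,2)",
        "[GFX1-]: Compositors might be mixed (5,1)",
    ):
        print(line, "->", decode_mixed(line))
```

The sketch makes no claim about which number in the pair is the old backend and which is the new one; it only names the two values.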
I see, thanks for the explanation - I'll investigate the fallback.
Flags: needinfo?(stransky)
Severity: major → normal
Pushed by stransky@redhat.com:
https://hg.mozilla.org/integration/autoland/rev/c847fd578fae
[Linux/WebRender] Create glxContext with GLX visual chosen at nsWindow::Create(), r=jgilbert
https://hg.mozilla.org/mozilla-central/rev/c847fd578fae
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Manually verified using FF 63 running on Ubuntu 16.04 with an nVidia graphics card, the issue appears to be fixed as it is no longer reproducing. 08-16-2018.
Status: RESOLVED → VERIFIED
Attached video sg00001.mp4