Closed Bug 1483533 Opened 6 years ago Closed 6 years ago

tab bar flickering when interacting with tabs (open, close, scroll) (ClientStorage)

Categories

(Core :: Graphics, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla64
Tracking Status
firefox-esr60 --- unaffected
firefox62 --- unaffected
firefox63 + verified
firefox64 --- verified

People

(Reporter: soeren.hentzschel, Assigned: alexical)

References

(Blocks 1 open bug)

Details

(Keywords: regression, Whiteboard: [fxperf:p1][gfx-noted])

Attachments

(6 files, 1 obsolete file)

Attached video flickering.mov
[Tracking Requested - why for this release]:

When opening or closing a tab there are visual glitches in the tab bar. It doesn't happen in Firefox 62 Beta, only in Firefox 63 Nightly. Operating system is the latest macOS Mojave Public Beta (it was already present with the previous Betas).

Tentatively setting this as a dependency of bug 1466336 because I can't reproduce it on another machine with macOS 10.11. But since I also can't reproduce it on Firefox 62, I am not sure whether it really depends on the version of macOS.

I attached a video showing the issue. First you see the opening and closing of tabs in Firefox 62 Beta without problems, then I switched to Firefox 63 Nightly where you can see the issue.
Attached video flickering2.mov
There is also an extreme flickering when scrolling within the tab bar.
Summary: tab bar flickering when opening or closing tabs → tab bar flickering when interacting with tabs (open, close, scroll)
Hi Sören,

Do you have WebRender enabled, by chance?
Flags: needinfo?(cadeyrn)
All webrender prefs in about:config have their default values. about:support says:

WEBRENDER	
opt-in by default: WebRender is an opt-in feature

WEBRENDER_QUALIFIED	
blocked by env: No qualified hardware
Flags: needinfo?(cadeyrn)
Attached file about-support.txt
I attached my about:support.

Another interesting note: There are graphical glitches while scrolling in the tab bar, but not as bad as in the attached video. Once I have "about:welcome" open in a tab, these graphical glitches are as extreme as in the video. When I remove the animated graphic via devtools, the issue is not gone but is similar to the issue without "about:welcome" open.
This smells like a graphics issue...

Is this a recent regression? If so, do you have time to help us find a regression range with mozregression?
Component: Untriaged → Graphics
Flags: needinfo?(cadeyrn)
Product: Firefox → Core
When I filed the bug it was hard to find the regression because it didn't always occur. Luckily, 25 minutes ago I found out that an open "about:welcome" tab makes the problem always visible. So I was finally able to find the regressing bug.

16:51.05 INFO: Last good revision: 1c945ede0071fac3f11a6cddd48cb68517a1132a
16:51.05 INFO: First bad revision: 96ce98b72056cd2c19c3640b4b7834b73dcc42ac
16:51.05 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=1c945ede0071fac3f11a6cddd48cb68517a1132a&tochange=96ce98b72056cd2c19c3640b4b7834b73dcc42ac
Blocks: 1265824
Flags: needinfo?(cadeyrn)
Ruh roh.

dthayer is out, so I'm redirecting this to mstange... any idea what we should do here? Apparently, Mojave is coming out pretty soon?
Flags: needinfo?(mstange)
Oh no! :(

I wonder why this shows up in the tab bar and nowhere else. If it's caused by bad texture contents, we should be seeing it everywhere...

Matt, do you have a machine with the 10.14 Beta that you could use to investigate this?
Flags: needinfo?(mstange) → needinfo?(matt.woodrow)
Asking Camelia to look at this to see if she can reproduce the issue noted in Comment 8.
Flags: needinfo?(camelia.badau)
A note: I experienced the problem with my old MacBook Pro Late 2013. Now I have a new MacBook Pro 2018 and I can't reproduce the problem on the new machine. Unfortunately that means I can't help with testing / providing more data because I sold my old MacBook.
Attached video issue.mov
We tested on macOS 10.14 (beta 6 build - 18A365a) using latest Nightly 63.0a1 (2018-08-03) and Nightly 63.0a1 (2018-08-16), with the following results:

- on MacBook Pro (Retina, 15-inch, Late 2013) with Intel Iris Pro 1536MB - we didn't manage to reproduce the issue: tab bar DOESN'T flicker when interacting with tabs (open, close, scroll)  

- on iMac (21.5-inch, Late 2012) with NVIDIA GeForce GT 640M 512MB - we REPRODUCED the issue: the tab bar flickers when interacting with tabs (open, close, scroll). Please see "issue.mov" attachment. 

It seems it is related to macOS with NVIDIA Graphics. 

The issue isn't reproducible on Firefox 62 Beta 19.
Flags: needinfo?(camelia.badau)
Tracked for 63 since it's a recent regression.
Assignee: nobody → dothayer
Status: NEW → ASSIGNED
Whiteboard: [fxperf:p1]
Priority: -- → P3
Whiteboard: [fxperf:p1] → [fxperf:p1][gfx-noted]
We have the regression window in comment #6
Doug, do you have an update on this regression? Thanks!
Flags: needinfo?(dothayer)
(In reply to Pascal Chevrel:pascalc from comment #14)
> Doug, do you have an update on this regression? Thanks!

Took a bit to find a loaner of a Mac with nvidia hardware - it's shipping now. As soon as it arrives I'll start working on this actively.
Flags: needinfo?(dothayer)
Flags: needinfo?(matt.woodrow)
Just received the iMac with nvidia hardware - the tab bar is flickering on scroll even on High Sierra. Hopefully this is the same issue, but in either case it needs to be fixed :/
So, removing the GL_TEXTURE_STORAGE_HINT_APPLE fiddling fixes things on nvidia hardware. Unfortunately it also degrades performance a little bit. I'm trying to investigate why this breaks things for nvidia cards. My working theory is that glTestObjectAPPLE / glFinishObjectAPPLE are not properly being respected by the nvidia driver, and we're writing to the buffers before the GPU is actually done with them.

Potentially other synchronization mechanisms could work, so I'll be investigating that next.
Sören, can you verify that this build[1] fixes the issue on your end? It works for me on an iMac with Nvidia hardware which was flickering in the same way you describe, but I wanted to verify that it works for you as well.

[1] https://queue.taskcluster.net/v1/task/d3HSLehhRmi6reXvFQm1dg/runs/0/artifacts/public/build/target.dmg
Flags: needinfo?(cadeyrn)
Unfortunately I can't verify the fix because I no longer have my old MacBook with the graphics card from Nvidia and my new MacBook has a graphics card from AMD.
Flags: needinfo?(cadeyrn)
I wish I understood a little better what precisely is going on
here. What seems to be the problem is calling glDeleteTextures
too early, but I can't pin down exactly when "too early" is.
In any case I can no longer reproduce the issue with this patch
applied, and I cannot observe any performance degradation, and
it's not a remarkably risky patch, so I'm opting to cut the
investigation short. Any insights would be appreciated though.
Comment on attachment 9009704 [details]
Bug 1483533 - Delay texture delete for DirectMapTextureSource r?jrmuizel

Jeff Muizelaar [:jrmuizel] has approved the revision.
Attachment #9009704 - Flags: review+
Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/59759503b889
Delay texture delete for DirectMapTextureSource r=jrmuizel
Backout by aciure@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ecdd5ebf4a43
Backed out changeset 59759503b889 for test failures RefPtr CLOSED TREE
Flags: needinfo?(dothayer)
Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dc263497b339
Delay texture delete for DirectMapTextureSource r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/dc263497b339
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Hi Doug, this is a tracked bug for 63, can you please request beta uplift? Thanks!
Woops. This appears to not actually be resolved yet. It appears what I thought was a minor change to resolve a shutdown leak unexpectedly undid the fix.

The fix worked under the expectation that glDeleteTextures was to blame, but after going down the rabbit hole the real culprit is the freeing of the buffer[1] which is happening prematurely. Turns out, as far as I can tell, glFinishObjectAPPLE and glTestObjectAPPLE are just plainly left unimplemented, or else implemented in a way that is useless to us. The delete is just the only thing that happens quickly enough to manifest the problem on nvidia hardware, it seems. Replacing the delete with a memset to 0xff produces a white flash.

Deferring the delete[] makes the problem go away and I can't get it to come back through legitimate use, but it still makes me uneasy that the synchronization mechanism simply doesn't work.

I'm going to bang my head against this a little more.

[1]: https://searchfox.org/mozilla-central/rev/a0333927deabfe980094a14d0549b589f34cbe49/gfx/layers/composite/TextureHost.cpp#1247
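
To make the failure mode above concrete, here is a minimal, driver-free sketch. It assumes a GPU that records a pointer at draw time and only reads the pixels later; `LazyGpu`, `RunFrame` and the 0x20 pixel value are hypothetical names and values for illustration, not code from TextureHost.cpp. The release is modelled as a memset to 0xff, mirroring the white-flash experiment, since actually reading freed memory would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical model of a GPU that reads client memory lazily: Draw() only
// records the pointer, and the pixels are actually read in Flush().
struct LazyGpu {
  const uint8_t* pending = nullptr;
  std::size_t pendingSize = 0;
  std::vector<uint8_t> framebuffer;

  void Draw(const uint8_t* pixels, std::size_t size) {
    pending = pixels;
    pendingSize = size;
  }
  void Flush() {
    if (pending) framebuffer.assign(pending, pending + pendingSize);
    pending = nullptr;
  }
};

// Runs "draw, release, flush" and returns the first pixel the GPU saw.
// With deferRelease set, the release is queued until after the flush.
uint8_t RunFrame(bool deferRelease) {
  LazyGpu gpu;
  std::vector<uint8_t> buffer(16, 0x20);  // arbitrary gray "tab bar" pixels
  std::vector<std::function<void()>> deferred;

  // Stand-in for delete[]: scribble 0xff over the buffer instead of freeing it.
  auto release = [&buffer] {
    std::memset(buffer.data(), 0xff, buffer.size());
  };

  gpu.Draw(buffer.data(), buffer.size());
  if (deferRelease) {
    deferred.push_back(release);
  } else {
    release();  // too early: the GPU has not read the buffer yet
  }
  gpu.Flush();
  for (auto& fn : deferred) fn();  // safe now, the GPU is done reading
  return gpu.framebuffer[0];
}
```

In this model the early release makes the GPU see 0xff (the white flash), while the deferred release leaves the original pixels intact. It only illustrates the ordering problem; it says nothing about why the nvidia driver's synchronization does not prevent it.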
Status: RESOLVED → REOPENED
Flags: needinfo?(dothayer)
Resolution: FIXED → ---
glFinishObjectAPPLE definitely seems to be doing something on Intel hardware:

6.00 ms    0.2%	1.00 ms	 	                   mozilla::layers::DirectMapTextureSource::Sync(bool)
5.00 ms    0.2%	0 s	 	                    gleFinishObject
3.00 ms    0.1%	0 s	 	                     gldFinishObject
3.00 ms    0.1%	0 s	 	                      IOAccelResourceFinishEvent
3.00 ms    0.1%	0 s	 	                       IOConnectCallMethod
3.00 ms    0.1%	0 s	 	                        io_connect_method
3.00 ms    0.1%	0 s	 	                         mach_msg
3.00 ms    0.1%	0 s	 	                          mach_msg_trap
3.00 ms    0.1%	0 s	 	                           mach_msg_overwrite_trap
3.00 ms    0.1%	0 s	 	                            ipc_kmsg_send
3.00 ms    0.1%	0 s	 	                             ipc_kobject_server
3.00 ms    0.1%	0 s	 	                              0xffffff8000342a80
3.00 ms    0.1%	0 s	 	                               is_io_connect_method
3.00 ms    0.1%	0 s	 	                                IOAccelSharedUserClient2::externalMethod(unsigned int, IOExternalMethodArguments*, IOExternalMethodDispatch*, OSObject*, void*)
3.00 ms    0.1%	0 s	 	                                 IOUserClient::externalMethod(unsigned int, IOExternalMethodArguments*, IOExternalMethodDispatch*, OSObject*, void*)
3.00 ms    0.1%	0 s	 	                                  shim_io_connect_method_scalarI_scalarO
3.00 ms    0.1%	0 s	 	                                   IOAccelSharedUserClient2::finish_object_event(unsigned int, unsigned int)
3.00 ms    0.1%	0 s	 	                                    IOAccelEventMachineFast2::finishEventUnlocked(IOAccelEvent*)
3.00 ms    0.1%	0 s	 	                                     IOAccelEventMachine2::waitForStamp(int, unsigned int, unsigned int*)
3.00 ms    0.1%	3.00 ms	 	                                      thread_block_reason


I got this from Instruments by setting "Record Kernel Callstacks" in Recording Options and filtering by gleFinishObject.
You might be able to see if things are working a little bit by just setting a breakpoint on IOAccelResourceFinishEvent
1.00 ms    0.0%	0 s	 	                  mozilla::layers::BufferTextureHost::UnbindTextureSource()
1.00 ms    0.0%	0 s	 	                   mozilla::layers::BufferTextureHost::ReadUnlock()
1.00 ms    0.0%	0 s	 	                    mozilla::layers::DirectMapTextureSource::Sync(bool)
1.00 ms    0.0%	1.00 ms	 	                     gleFinishObject


It ends there. This coupled with the fact that glTestObjectAPPLE only ever returns false makes me pretty confident that it's just left unimplemented. I ended up implementing the same idea using glFenceSync after the texture is used for drawing and glClientWaitSync before unlocking it, and it resolves the issue. I'm sanity checking the performance on intel and AMD hardware now.
(I am a little bit surprised that any samples at all show up for gleFinishObject. I'm not sure what accounts for that. But in any case whatever it is doing is not much.)
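
For readers unfamiliar with the fence pattern mentioned above, a rough sketch of the control flow follows. It is not the landed patch. `FakeGpu` and `FencedTextureSource` are hypothetical names, and the fake stands in for the GL sync API so the sketch compiles without a GL context; in real code the three methods would presumably map to glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0), glClientWaitSync(sync, GL_SYNC_FLUSH_COMMANDS_BIT, timeout) and glDeleteSync(sync).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the driver. Each fence is an index into
// `signaled`; a real fence would be a GLsync object.
struct FakeGpu {
  std::vector<bool> signaled;  // true once "the GPU" has passed the fence
  int waits = 0;               // how many times the CPU had to block

  int InsertFence() {
    signaled.push_back(false);
    return static_cast<int>(signaled.size()) - 1;
  }
  // Simulates blocking until the fence has signaled.
  void ClientWait(int fence) {
    if (!signaled[fence]) {
      ++waits;
      signaled[fence] = true;  // pretend the GPU finished while we waited
    }
  }
  void DeleteFence(int) { /* nothing to release in the fake */ }
};

// Hypothetical texture source whose pixel buffer is shared with the GPU.
class FencedTextureSource {
 public:
  explicit FencedTextureSource(FakeGpu& gpu) : mGpu(gpu) {}

  // Call right after issuing the draw that samples from this texture.
  void NoteDrawn() {
    if (mFence >= 0) mGpu.DeleteFence(mFence);
    mFence = mGpu.InsertFence();
  }

  // Call before the CPU is allowed to write to or free the buffer.
  void SyncBeforeUnlock() {
    if (mFence < 0) return;  // never drawn, nothing to wait for
    mGpu.ClientWait(mFence);
    mGpu.DeleteFence(mFence);
    mFence = -1;
  }

  bool HasPendingFence() const { return mFence >= 0; }

 private:
  FakeGpu& mGpu;
  int mFence = -1;
};
```

The point of the pattern is that the wait is tied to a fence inserted after the specific draw that used the texture, so the unlock cannot proceed until the GPU has passed that point in the command stream.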
Summary: tab bar flickering when interacting with tabs (open, close, scroll) → tab bar flickering when interacting with tabs (open, close, scroll) (ClientStorage)
glFenceSync/glClientWaitSync just seem to be better supported
on nvidia hardware, and they work fine on AMD/intel as well, so
I'm transitioning to that.

Depends on D6463
Target Milestone: mozilla64 → ---
Comment on attachment 9010843 [details]
Bug 1483533 - Undo deferment of glDeleteTextures r?jrmuizel

Jeff Muizelaar [:jrmuizel] has approved the revision.
Attachment #9010843 - Flags: review+
Comment on attachment 9010844 [details]
Bug 1483533 - Switch to glClientWaitSync for texture syncing r?jrmuizel

Jeff Muizelaar [:jrmuizel] has approved the revision.
Attachment #9010844 - Flags: review+
Re: the cost of a fence per texture, it can show up in cases with very large numbers of textures like the motionmark clipped rects test, but it's not a huge contributor. See the profile here: https://perfht.ml/2xFPDZO

That being said it would be nice for it to not show up at all. I could look at a more coarse fencing solution, or I could scope this to only nvidia hardware. What do you think?
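
One possible reading of "more coarse fencing" is a single fence per frame shared by every texture drawn in that frame, rather than one fence per texture. The sketch below only counts fences under that assumption; `FenceCounter`, `FencePerTexture` and `FencePerFrame` are hypothetical names, and each `InsertFence()` stands in for one glFenceSync call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical counter standing in for the driver.
struct FenceCounter {
  int fencesInserted = 0;
  int InsertFence() { return fencesInserted++; }
};

// Per-texture fencing: every drawn texture gets its own fence.
int FencePerTexture(FenceCounter& gpu, int texturesDrawn) {
  for (int i = 0; i < texturesDrawn; ++i) gpu.InsertFence();
  return gpu.fencesInserted;
}

// Coarse fencing: one fence at the end of the frame covers every texture
// drawn before it, because a GL_SYNC_GPU_COMMANDS_COMPLETE fence signals
// only after all earlier commands have completed.
int FencePerFrame(FenceCounter& gpu, int texturesDrawn) {
  std::vector<int> frameFenceOfTexture(texturesDrawn, -1);
  int frameFence = texturesDrawn > 0 ? gpu.InsertFence() : -1;
  for (int& f : frameFenceOfTexture) f = frameFence;  // all share one fence
  return gpu.fencesInserted;
}
```

The trade-off is that a texture used early in the frame would wait on work issued after it, so unlocks could block somewhat longer in exchange for far fewer sync objects.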
Flags: needinfo?(jmuizelaar)
I think it's ok to go with the fence solution for a first pass. Let's look into other solutions in a follow-up.
Flags: needinfo?(jmuizelaar)
Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fa12e9425f90
Undo deferment of glDeleteTextures r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/dc678b66fcb3
Switch to glClientWaitSync for texture syncing r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/fa12e9425f90
https://hg.mozilla.org/mozilla-central/rev/dc678b66fcb3
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment on attachment 9010844 [details]
Bug 1483533 - Switch to glClientWaitSync for texture syncing r?jrmuizel

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1265824
[User impact if declined]: Flickering in the tab bar on OSX with nvidia hardware
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: Ideally if we could test on more OSX nvidia devices to confirm the fix.
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: Mildly
[Why is the change risky/not risky?]: It changes the synchronization mechanism for ClientStorage on OSX. If it went wrong it could lead to similar flickering on different hardware. This is unlikely though.
[String changes made/needed]: None
Attachment #9010844 - Flags: approval-mozilla-beta?
Comment on attachment 9010844 [details]
Bug 1483533 - Switch to glClientWaitSync for texture syncing r?jrmuizel

Fix for a tracked 63 regression, uplift approved for 63 beta 10. Thanks.
Attachment #9010844 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Reproduced the issue on Nightly 63.0a1 (2018-08-03) on iMac (21.5-inch, Late 2012) with NVIDIA GeForce GT 640M 512MB, macOS 10.14.
Verified as fixed using Beta 63.0b10 and Nightly 64.0a1 on iMac (21.5-inch, Late 2012) with NVIDIA GeForce GT 640M 512MB macOS 10.14.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
Attachment #9009704 - Attachment is obsolete: true