Closed Bug 1808155 Opened 2 years ago Closed 1 year ago

Screen Flickering Issue on MacOS Ventura

Categories

(Core :: Graphics, defect)

Firefox 109
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: anmolsharma010, Assigned: bradwerth)

References

Details

Attachments

(7 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/109.0

Steps to reproduce:

While in full screen mode on MacOS ventura, hover the cursor to the menu bar while on youtube.com with video playing

Actual results:

This results in screen flickering. I'm using the developer edition of firefox however it also happens on release and nightly edition. The flicker happens on both any external display or even the built-in display.

Further Details :-
OS :- Ventura 13.1
System :- Macbook Air M1

Furthermore you can find a screen recording showcasing this here :- https://ln5.sync.com/dl/fb05ed6f0/sw6td59j-nxpremxh-7gcsjsd5-gs7m34rv

Expected results:

The screen should not flicker.

The Bugbug bot thinks this bug should belong to the 'Core::Widget: Cocoa' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Widget: Cocoa
Product: Firefox → Core

Changing component because only the video content appears to be flickering.

Component: Widget: Cocoa → Audio/Video

This seems more a gfx issue to me.

Component: Audio/Video → Graphics

I was not able to repro on monterey.

I can reproduce this. It seems like a perfect issue for Brad to look at.

Flags: needinfo?(bwerth)

If I disable gfx.core-animation.specialize-video the problem doesn't happen

Severity: -- → S2

I'll figure it out.

Assignee: nobody → bwerth
Flags: needinfo?(bwerth)

I'm unable to reproduce on macOS 13.1 on an M1 Max. I'll keep trying some other settings to see if I can reproduce.

While I am attempting to reproduce on macOS 13.1, I see app console errors. There is an Apple issue with macOS 13 animating the menubar, where revealing the menubar would dump out stack traces ending with _ZN15MenuBarInstance22EnsureAutoShowObserverEv. I wonder if this is affecting the flickering issue. This problem is supposedly fixed in macOS 13.2. Would you please update to macOS 13.2 and re-test? I will be doing the same.

Flags: needinfo?(anmolsharma010)

In local logging, it appears that animating the menubar does not force a rebuild of the video layer, which was one theory of what might cause flickering, and something that we could do something about. What it does do is trigger the output of one of those _ZN15MenuBarInstance22RemoveAutoShowObserverEv call stacks in the console. I'll put a patch together that makes the video logging more useful.

This change makes the VIDEO_LOG only output the first surface enqueued on
each specialized video layer, and notes whenever the video layer has been
recreated. This will be helpful in diagnosing cases degenerate cases
where display of video content toggles between a specialized video layer
and a normal layer.

Updating to macOS 13.2 does not fix the errors with _ZN15MenuBarInstance22EnsureAutoShowObserverEv. This probably means that the flickering problem will persist for the reporter even on macOS 13.2. The proposed patch will make the video log more useful in confirming that the video layer is not being recreated during the menu bar animation.

Landing the patch will not resolve this bug, but will lead to easier error reporting for the reporter.

Keywords: leave-open

So I tested this again on Firefox 111.0b2 (Dev. Edition) and on Macos 13.2.1 and cannot replicate. It appears (at least to me) that the problem seems to have been fixed. However, not sure if the problem was on firefox's side or macos.

Flags: needinfo?(anmolsharma010)

(In reply to Anmol from comment #14)

So I tested this again on Firefox 111.0b2 (Dev. Edition) and on Macos 13.2.1 and cannot replicate. It appears (at least to me) that the problem seems to have been fixed. However, not sure if the problem was on firefox's side or macos.

Update :- It still does happen however not everytime like previously. I initially tested using a simple video and couldn't replicate this but after testing this on a youtube live stream, it still does happen.

Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e8e72e2c1097 Part 1: Make NativeLayerCA VIDEO_LOG more useful. r=mac-reviewers,mstange

We've landed some logging in Firefox Nightly that may be useful in diagnosing this problem. Would you please:

  1. Download Firefox Nightly and launch it.
  2. Navigate to "about:config" and set the pref "gfx.core-animation.specialize-video.log" to true.
  3. Quit Nightly.
  4. Launch Nightly from the Terminal. Start by launching Terminal.app. In the Terminal window, presuming that you've installed Firefox Nightly in the Applications directory, type this command /Applications/Firefox\ Nightly.app/Contents/MacOS/firefox.
  5. Firefox Nightly will launch. Reproduce the issue on your setup, then quit Nightly.
  6. Copy the text that appears in your Terminal window after the command you entered above, and paste that text as an attachment to this Bug. You may want to edit this text to remove any personal identifying information that you don't want to share.

Hopefully this will let us see if your video surface is being recreated while the menubar is animating. That is one possible explanation for the screen flickering.

Flags: needinfo?(anmolsharma010)
Attached file log.txt
Flags: needinfo?(anmolsharma010)

The log file is posted above for reference.

This is very helpful. The log does seem to indicate that the video surface is being recreated 3 times between the instances of _ZN15MenuBarInstance22EnsureAutoShowObserverEv and _ZN15MenuBarInstance22RemoveAutoShowObserverEv. Each time, the layer is created as an AVSampleBufferDisplayLayer, which is good, but it should not be getting recreated at all. I'll keep looking and see if I can determine what might be causing the layer to get wiped out.

Most likely we're hitting NativeLayerCA::SetRootWindowIsFullscreen when triggering the menubar animation. That function incorrectly unconditionally sets mMutatedSpecializeVideo, which will cause the video layer to be recreated. Patch inbound.

(In reply to Brad Werth [:bradwerth] from comment #22)

Most likely we're hitting NativeLayerCA::SetRootWindowIsFullscreen when triggering the menubar animation. That function incorrectly unconditionally sets mMutatedSpecializeVideo, which will cause the video layer to be recreated. Patch inbound.

I misspoke. That function doesn't unconditionally set mMutatedSpecializeVideo, but it is possible that it is setting that value when it shouldn't. I'll add some logging at a minimum.

Strangely, I can sometimes get this to replicate in Firefox Nightly on my machine, but I can't ever get it to replicate in a local build. So adding more logging may be a way for me to determine what is causing the layer to rebuild, without having to get a new log from the reporter.

There are only a few things that could cause the video layer to be
rebuilt:

  1. The mMutatedSpecializeVideo flag being set.
  2. A new layer object being created to host for the video.

This patch more thoroughly documents instances of case 1 by adding
VIDEO_LOG message to anything that sets the mMutatedSpecializeVideo flag.

It also makes some attempt to identify case 2 by adding a VIDEO_LOG
message whenever we destroy a layer that has ever displayed a video
texture. If a log message like this is followed by a message that a new
video layer is created, it is a strong signal that we are unexpectedly
throwing away our external surface handles when we'd like to keep using
them.

Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d04c17155a76 Part 2: Make VIDEO_LOG note events that will cause the video layer to rebuild. r=mstange

Thank you for sticking with this while we try to fix this issue. We've landed more detail in the VIDEO_LOG. Would you please again follow the steps in comment 18 (with an updated Nightly build) and post the VIDEO_LOG text?

Flags: needinfo?(anmolsharma010)
Attached file videolog.txt

Updated Log

Flags: needinfo?(anmolsharma010)

Thank you for the updated log. It appears that at least some of video layer disruption is being caused by unnecessary notifications about the fullscreen state. This issue will be fixed by Bug 1631735. After that lands, I'll ask you to test again. The log does seem to show that some of the video layer disruption has another cause, but that could be a false signal.

Bug 1631735 has landed. Would you please download a new Nightly and see if the issue still occurs on your setup? And if so, post another log? Thank you for your patience while we work through solving this Bug.

Flags: needinfo?(anmolsharma010)
Attached file new_log.txt

Unfortunately it is still happening. Here is the updated log file.

Flags: needinfo?(anmolsharma010)

Thank you for the new log. It is showing that our method of handling fullscreen is causing this problem for you. Let's confirm what type of fullscreen you are getting (there are two methods in use in macOS). Would you please:

  1. Navigate to "about:config".
  2. In the search box, paste in "full-screen-api.macos-native-full-screen".
  3. Note the current true/false value you have set for that pref, then set it to the opposite value.
  4. Restart and try to replicate the issue again.
Flags: needinfo?(anmolsharma010)

So the default value as I tested on 112.0b2 (new install) is FALSE and after setting it to true, I could not replicate the issue. Maybe this where the issue was.

Flags: needinfo?(anmolsharma010)

Thank you for trying that out. The outcome you found is not ideal, but it's something. We're transitioning to the default value of true for that pref in Bug 1802193, which hopefully will land in the next few weeks. For now, if it doesn't cause other problems for you, consider leaving that pref set. I'll keep trying to resolve this issue for users using emulated fullscreen (as is currently default).

It looks like the latest update to macOS, 13.3 no longer throws the _ZN15MenuBarInstance22EnsureAutoShowObserverEv exception while animating the menu bar in fullscreen. Would you please update to macOS 13.3 and see if the issue persists for you (with the full-screen-api.macos-native-full-screen pref set to false)?

Flags: needinfo?(anmolsharma010)

Tested on 111.0.1 and still happening with the given config set to false. Do you need a new log?

Flags: needinfo?(anmolsharma010)

(In reply to Anmol from comment #37)

Tested on 111.0.1 and still happening with the given config set to false. Do you need a new log?

No, thank you. We'll keep this open while we try to find a fix for the emulated fullscreen case.

Theory: this could be happening because the RenderCompositorNative is creating external texture surfaces for the same video when it should be reusing an existing one. I have a patch to log that issue, but in the meanwhile it seems like the surface will be recreated in the following situations:

  1. When the compositor changes Kind, which seems very unlikely.
  2. When the TileCacheInstance is destroyed, which seems very unlikely.
  3. When the surface misses used_this_frame for one composite, which seems very very likely.

It looks like the reuse of the external texture surface is dependent on its key being matched in the external_native_surface_cache, but that key is made in part from a calculated size that is affected by the compositor transform. Although it's understandable that we would only want to match the surface if its size matches (we need to fill the destination surface, after all), I wonder if we're failing to match due to some off-by-one rounding or something. If so, perhaps this could even be relaxed. Glenn, what do you think?

Flags: needinfo?(gwatson)

This additional logging message will help to eliminate the false positives
of AttachExternalImage or SetRootWindowIsFullscreen being blamed for
causing the video layer to be rebuilt. If the layer itself is new, a video
layer rebuild is unavoidable.

Maybe external textures can never be resized, not really. I'm not sure we're getting value by including the size in the key. I'll build a patch that removes that part of the key.

Looks like the external image id changes with each frame of the video, making it difficult to associate the already-allocated surface with the new image's id other than by using its size. If the size is changing, then there's no way to find the surface. It would be superior if the external image had an unchanging id associated with its source, that remains unchanged as long as the source stays the same size.

(discussed some of this with Brad on matrix).

I don't think it's likely that the size has a rounding / accuracy issue, as we do round it before returning the value. But it's not clear to me which path we're currently looking to diagnose, as this differs a lot depending on codec / video format.

Flags: needinfo?(gwatson)

The size of the surface is irrelevant for external surfaces, since it is
handled by the native compositor with each call to AttachExternalImage. By
removing the size from the hash key, this patch ensures that real or
imagined changes to surface sizes for external surfaces will no longer
call create_compositor_external_surface, thrashing the resources of the
native compositor.

Depends on D175399

Anmol, would you please try this test build which includes the Part 3 and Part 4 patches? Since I'm still not able to replicate the Bug myself, I'm hoping that this plausible guess will solve the problem for you.

Flags: needinfo?(anmolsharma010)
Pushed by bwerth@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fdafa168fa32 Part 3: Make NativeLayerCA report when RenderCompositorNative is creating new video surfaces. r=mstange https://hg.mozilla.org/integration/autoland/rev/9a41a103bdaf Part 4: Make ExternalNativeSurfaceKey encode the size only for non-external surfaces. r=gw

I tried that test build & couldn't replicate the issue anymore. May it has solved the problem.

Flags: needinfo?(anmolsharma010)

(In reply to Anmol from comment #48)

I tried that test build & couldn't replicate the issue anymore. May it has solved the problem.

Great! The fix is in Nightly, too, so you should have the same outcome there. We'll call this fixed for now. Feel free to re-open if the problem occurs again.

Status: UNCONFIRMED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Duplicate of this bug: 1819442
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: