Open Bug 1888656 Opened 10 months ago Updated 1 months ago

[macOS] Sometimes video flashes when entering or exiting fullscreen through PiP

Categories

(Core :: Widget: Cocoa, defect, P3)

Desktop
macOS
defect

Tracking

()

Tracking Status
firefox-esr115 --- unaffected
firefox124 --- unaffected
firefox125 - wontfix
firefox126 + wontfix
firefox127 + wontfix
firefox128 --- wontfix

People

(Reporter: atrif, Assigned: bradwerth)

References

(Regression)

Details

(Keywords: regression)

Attachments

(4 files)

Found in

  • 125.0b5

Affected versions

  • 126.0a1 (2024-03-28)
  • 125.0b5

Tested platforms

  • Affected platforms: macOS 12 Intel, macOS 14 ARM, 13 intel
  • Unaffected platforms: Windows 10x64, Ubuntu 23.10

Steps to reproduce

  1. Open a random YouTube video and open PiP.
  2. Enter and exit fullscreen from PiP.

Expected result

  • No glitches can be seen when entering or exiting fullscreen.

Actual result

  • Sometimes the video flashes after entering or exiting fullscreen.

Regression range

  • Mozregression points at bug 1880558. Note that the issue may be intermittent. If another regression range is needed please let me know.

Additional notes

  • Attached a screen recording
  • The issue may be intermittent, please repeat step 2 until the issue is reproduced.

:bradwerth, since you are the author of the regressor, bug 1880558, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(bwerth)

I'll try to fix this.

Assignee: nobody → bwerth
Flags: needinfo?(bwerth)
Priority: -- → P3

The bug is marked as tracked for firefox125 (beta) and tracked for firefox126 (nightly). We have limited time to fix this, the soft freeze is in 9 days. However, the bug still has low priority and has low severity.

:gcp, could you please increase the priority and increase the severity for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit BugBot documentation.

Flags: needinfo?(gpascutto)

I disagree that this should be a tracked Bug. A flicker when exiting fullscreen is a one-time cost to the user. This will get fixed in time, but there is no benefit to backing out Bug 1880558 in the meanwhile.

Priority: P3 → P2
Flags: needinfo?(gpascutto)

I think this will be heavily affected by the re-work that will occur in Bug 1839425. That is being backed out now due to other regressions. I'll make it a blocker when the backout is complete.

Bug 1839425 was backed out from Nightly and Beta. Can you please re-test to see what the situation looks like now? Thanks!

Flags: needinfo?(atrif)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #6)

Bug 1839425 was backed out from Nightly and Beta. Can you please re-test to see what the situation looks like now? Thanks!

Sure thing! Unfortunately, I can still reproduce the issue intermittently on macOS 14 arm, 12 and 13 Intel with Firefox 125.0b9 (20240404012813) and 126.0a1 (20240404034404) when entering fullscreen from PiP or when exiting fullscreen.

If more information is needed please let us know! Thank you!

Flags: needinfo?(atrif)
Priority: P2 → P1

(In reply to Brad Werth [:bradwerth] from comment #4)

I disagree that this should be a tracked Bug. A flicker when exiting fullscreen is a one-time cost to the user. This will get fixed in time, but there is no benefit to backing out Bug 1880558 in the meanwhile.

Ryan, must this remain a release tracked Bug?

Flags: needinfo?(ryanvm)

I guess not, but note that I never proposed backing out bug 1880558 either.

Flags: needinfo?(ryanvm)

This is happening because the Firefox macOS layer manager NativeLayerCA is destroying and recreating video surfaces during these transitions. Because the video layer initially appears blank and buffers decode of the first frame, this creates a flicker. There are a few challenging options on how to fix this:

  1. Make WebRender keep the same surface id across changing compositions. This is hard because of the complexity of WR. In general, WR claims new surface ids for every surface that changes size. This is not strictly necessary for "external" surfaces such as video surfaces, but the reuse logic doesn't exist and would have to be created. This is probably the best fix because it would reduce surface churn across all platforms.
  2. Make the macOS layer manager detect when a surface matches an already existing surface and reuse the layer that displays that surface. This would also be a change within WR, but possibly simpler because it would just be adding some source information to the surfaces that WR sends to the layer manager.
  3. Make the macOS video layer display the initial frame while buffering the next frame. We already try to force the first frame, but it apparently doesn't work as intended. Fix this or find a new, working approach. This would paper over the issue and insulate macOS from WR behavior, but we wouldn't get the benefit of the reduced surface churn. We'd also possibly be displaying a duplicated frame which would turn the flicker into a stutter -- less noticeable but something to avoid.

There's no great answer here. I'll try all of them and see which one I can get working.

It appears that the flicker occurs when WebRender sends exactly one screen-filling black surface to the layer manager, quickly followed by another update with the "real" video surface and other surfaces. How could it not flicker under those circumstances? I'll try to understand why this is happening and make it stop.

And sometimes WR instead sends a bunch of all-black tiles. So it's not that the color layer itself is the problem, it's that WebRender thinks nothing else is on the screen for the frame.

Confirmed this can happen on all macOS fullscreen windows, whether or not they are video, whether or not they came from Picture-in-Picture. I don't have reliable Steps to Reproduce, but when switching between fullscreen and desktop spaces I can sometimes get a flash of one frame of the background color on any window. Bumping this to S2.

Severity: S3 → S2
Attached video screen flicker 720p.mov

This is a recording of the page for this Bug in Bugzilla, showing that making the window fullscreen, then switching to desktop space, then back to fullscreen space will flicker the window with an all-background frame.

Reading your comments here the symptoms sort of sound like it's possible they are caused by the same root cause as bug 1750189. I need to land that anyway so I revived those patches and there is a rollup of them here https://hg.mozilla.org/try/rev/4f87a7832168ec88d097ce4f3c73ed90f5c7e22f

I wasn't able to reproduce the flicker, otherwise I would test it myself.

(In reply to Timothy Nikkel (:tnikkel) from comment #15)

Reading your comments here the symptoms sort of sound like it's possible they are caused by the same root cause as bug 1750189. I need to land that anyway so I revived those patches and there is a rollup of them here https://hg.mozilla.org/try/rev/4f87a7832168ec88d097ce4f3c73ed90f5c7e22f

I wasn't able to reproduce the flicker, otherwise I would test it myself.

Thanks for the patch. It sounds relevant for sure. I also can't tell if it solves the problem because when I leave the fullscreen space, the content process crashes with:

[68659] Hit MOZ_CRASH(index out of bounds: the len is 0 but the index is 1023) at gfx/wr/webrender_api/src/display_item_cache.rs:67

Interesting, thanks for testing, I'll look into that.

Barring any more insights into what's going on, I'm going to make Bug 1750189 a blocker for this.

Depends on: 1750189

Do you plan on having anything that can patch this in time for Fx126 since this is an S2?
The depends on bug doesn't have any activity

Flags: needinfo?(bwerth)

(In reply to Donal Meehan [:dmeehan] from comment #19)

Do you plan on having anything that can patch this in time for Fx126 since this is an S2?
The depends on bug doesn't have any activity

No, I don't have a view to solve this. I worked through the issue I was having with the patches for Bug 1750189 and confirmed that that patch will not solve this Bug, nor will the patch for the another candidate Bug 1750189. So I'm back to just trying to root cause this problem directly, and I don't have good insight into it yet.

No longer depends on: 1750189
Flags: needinfo?(bwerth)

Thanks, setting Fx126 to fix-optional. If you do get a fix at some point we could consider uplifting.

Ugh, first breakthrough after a lot of poking around. The blank frame is passing through this call to SetEmptyDisplayList, which seems to clearly indicate that this is an unusual condition where Gecko intentionally generates a blank frame (or a blank display list for the only subframe on the page, which devolves into a blank frame). So this is a known thing that we need to steer around. Conceptually, this sounds like the issue being worked in Bug 1750189, but the WIP patches there didn't fix this issue. I'll keep working on it.

(In reply to Brad Werth [:bradwerth] from comment #22)

Ugh, first breakthrough after a lot of poking around. The blank frame is passing through this call to SetEmptyDisplayList, which seems to clearly indicate that this is an unusual condition where Gecko intentionally generates a blank frame (or a blank display list for the only subframe on the page, which devolves into a blank frame). So this is a known thing that we need to steer around. Conceptually, this sounds like the issue being worked in Bug 1750189, but the WIP patches there didn't fix this issue. I'll keep working on it.

What if you set mReceivedDisplayList to false right after that SetEmptyDisplayList call?

Better understanding of what is going on. The blank display list in comment 22 is intentional and correct in that it helps us have a fast initial paint since we don't wait on any missing display lists. That is what we want for first paint, but it's really not what we want if the OS compositor is holding stale pixels from a previous paint that our new paint will be compared against. The macOS native fullscreen transition definitely has these stale pixels because it uses them as a source for the cross-fade. Options:

  1. Figure out why the pipeline is dropped, keep it from being dropped. This would be the nicest solution and would solve the common case (transitioning to a macOS fullscreen window from an existing window) but not solve every case. We can't hold all pipelines indefinitely.
  2. Make WR aware when there is a possibility of stale pixels (maybe every non-first-paint situation?) and have it hold its next frame emission until all pipelines are ready. Might be hard to identify the correct conditions; might have unwanted side effects.
  3. Make NativeLayerCA (our macOS layer manager) detect the conditions where there might be stale pixels and wait for the WR frame structure to stabilize before displaying it. This adds a delay and would be hard to get right.
  4. Make the nsCocoaWindow fullscreen transition do something different to supply different pixels to the OS compositor so the stale pixels better match the new pixels. Seems like if WR is going to draw a blank frame, there's no "good" stale pixels to match against it, except a blank frame. This would be ugly.

I'll keep thinking about how to fix this. Probably option 2.

Small update. By the time we reach WebRenderBridgeParent::AddPipelineIdForCompositable as noted in comment 22, the blank frame has already been generated. There's some prior display list sent that doesn't even contain the video pipeline. That's what needs to be fixed.

These patches attempt strategy #3 from comment 24. They don't work yet -- they classify too many frame updates as having a missing pipeline -- but I wanted them up here as a sign of progress.

(In reply to Timothy Nikkel (:tnikkel) from comment #23)

What if you set mReceivedDisplayList to false right after that SetEmptyDisplayList call?

Sorry I never responded to this excellent suggestion. Unfortunately this wasn't sufficient to solve the problem, which maybe is an indicator that I still haven't characterized it correctly. This one is so tricky!

Bug 1898436 and Bug 1897821 both seem relevant, but applying those patches does not solve this issue.

Severity: S2 → S3
Priority: P1 → P3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: