Closed Bug 1658392 Opened 4 years ago Closed 2 months ago

Stuttery playback of 1440p video on YouTube with UHD 600 on Windows

Categories

(Core :: Graphics: WebRender, defect, P3)

x86_64
Windows 10
defect

Tracking

RESOLVED INCOMPLETE
Tracking Status
firefox81 --- affected

People

(Reporter: yoasif, Assigned: bradwerth)

References

Details

(Keywords: perf)

Attachments

(4 files)

Attached video 2020-08-10 15-22-15.mp4

Noticed that YouTube playback struggles immensely compared to Edge on the same machine.

Steps to reproduce:

  1. Open https://www.youtube.com/watch?v=LXb3EKWsInQ
  2. Set playback quality to 1440p
  3. Play video

What happens:

Stuttery playback with many dropped frames.

Captured a profile: https://share.firefox.dev/2PFFtRx

Expected result:

Few dropped frames, smooth playback (like Edge).

Attached file about:support
Summary: Slow playback of 1440p video on YouTube → Stuttery playback of 1440p video on YouTube

Is this a regression?

Keywords: perf
OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64

According to this page [1], the graphics card in the report, an Intel UHD Graphics 600, does not seem powerful enough to play 1440p VP9. If I remember correctly, our approach to handling late-arriving decoded frames differs from that of Chrome and Edge (which is built on Chromium).

Our approach is this: if an incoming frame is later than the current audio time, we discard it and wait for the next frame until we get one with the correct timing. That is why the video frame freezes.
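The discard-late-frames policy described above can be modelled roughly as follows. This is a simplified sketch, not Firefox's actual video sink code; the `Frame` type, the `pick_frame` helper, and the use of seconds as the time unit are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Frame:
    pts: float       # presentation timestamp, in seconds (assumed unit)
    duration: float  # how long the frame should stay on screen, in seconds


def pick_frame(queue: List[Frame], audio_clock: float) -> Tuple[Optional[Frame], int]:
    """Discard decoded frames that are already late relative to the audio clock.

    Returns the frame to display (or None if every queued frame was late)
    and the number of frames discarded. The queue is modified in place.
    """
    dropped = 0
    # A frame is "late" when its whole display window ended before the
    # current audio time; such frames are thrown away, never shown.
    while queue and queue[0].pts + queue[0].duration <= audio_clock:
        queue.pop(0)
        dropped += 1
    # If nothing is left, the previously displayed frame stays on screen,
    # which the viewer perceives as a freeze while audio keeps playing.
    frame = queue[0] if queue else None
    return frame, dropped
```

When the decoder falls behind, every call discards whatever arrived too late, so the drop counter climbs while audio continues uninterrupted.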

Are you able to profile the video again with "Media" preset? That could provide more markers to show the frame dropping time and help us diagnose the issue.

Thank you.

[1] https://www.notebookcheck.net/Intel-UHD-Graphics-600-GPU.271820.0.html

Flags: needinfo?(yoasif)

Alastor, I could have sworn I profiled with the media preset in my initial report, but I went ahead and profiled again: https://share.firefox.dev/2PDLD4P

Flags: needinfo?(yoasif)

Hi, when you clicked Publish, did you check "Include hidden threads"? I ask because I can't see the WebContent process and its related threads, on which the major media pipeline runs.

However, the odd thing is that I did see the MediaDecoderStateMachine thread in the thread filter list in your report.

Flags: needinfo?(yoasif)

I didn't uncheck anything and Include hidden threads was checked in this profile run by default (I changed nothing). Here's a new one: https://share.firefox.dev/2CfJ99A

Flags: needinfo?(yoasif)

Different playback strategies.

Chrome/Edge will pause the video/audio until it has content to play.
We favour the audio so that it is never stopped or interrupted.

What did the video sound like when played in Edge?

Few dropped frames, smooth playback (like Edge).
We have a different idea of what smooth is. That video certainly wasn't smooth at all; a third of the frames were dropped continuously.

(In reply to Jean-Yves Avenard [:jya] from comment #7)

What did the video sound like when played in Edge?

They sound the same in Firefox Nightly and Edge.

Few dropped frames, smooth playback (like Edge).
We have a different idea of what smooth is. That video certainly wasn't smooth at all; a third of the frames were dropped continuously.

Agreed that the video I took doesn't look great, but I was also playing back in both browsers and recording from within the machine.

I shut down the other apps on this machine and just played the two videos back to back as the only tabs in each browser -

Edge: 107 dropped out of 18808
Firefox: 11184 dropped out of 18721

I'm attaching video of both browsers playing back the video so you can get an idea of how they sound.

Attached video edge_EDIT.mp4

Can you try setting media.ffvpx.enabled to false?

This will make it use libvpx, which is the same decoder that Edge/Chromium uses.

Normally, performance is much worse than with ffvp9.

Can you try disabling WebRender too?

Did some additional testing:

WR disabled: 4507 dropped out of 18693
WR disabled & media.ffvpx.enabled set to false: 3819 dropped out of 18699
WR enabled & media.ffvpx.enabled set to false: 11821 dropped out of 18587
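To make these counts easier to compare, the sketch below converts each "dropped out of total" pair reported so far into a percentage. The `drop_rate` helper and the labels are illustrative only; the counts are copied from the comments above.

```python
def drop_rate(dropped: int, total: int) -> float:
    """Return the percentage of frames dropped."""
    if total <= 0:
        raise ValueError("total frame count must be positive")
    return 100.0 * dropped / total


# (label, dropped frames, total frames) as reported in this bug.
RESULTS = [
    ("Edge", 107, 18808),
    ("Firefox, WR enabled (default prefs)", 11184, 18721),
    ("Firefox, WR disabled", 4507, 18693),
    ("Firefox, WR disabled, media.ffvpx.enabled=false", 3819, 18699),
    ("Firefox, WR enabled, media.ffvpx.enabled=false", 11821, 18587),
]

if __name__ == "__main__":
    for label, dropped, total in RESULTS:
        print(f"{label}: {drop_rate(dropped, total):.1f}% dropped")
```

Expressed this way, the WebRender-enabled runs drop roughly 60% or more of the frames, while the WebRender-disabled runs drop between 20% and 25%.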

(In reply to Alastor Wu [:alwu] from comment #5)

Hi, when you clicked Publish, did you check "Include hidden threads"? I ask because I can't see the WebContent process and its related threads, on which the major media pipeline runs.

However, the odd thing is that I did see the MediaDecoderStateMachine thread in the thread filter list in your report.

I renamed the Playback media threads to Controller, but the UI wasn't updated to reflect that, so the Media preset no longer captures the playback thread.

With such a loss in performance when WR is enabled, I'm going to reassign this bug there.

Component: Audio/Video: Playback → Graphics: WebRender

Jean-Yves, can we split off a new bug into the playback issue? The WebRender thing can be investigated separately, but as far as I can tell, playback is worse than Edge as it is.

(In reply to Asif Youssuff from comment #14)

Jean-Yves, can we split off a new bug into the playback issue? The WebRender thing can be investigated separately, but as far as I can tell, playback is worse than Edge as it is.

A 63% frame drop rate vs. 20% is likely the only thing that matters.

We are already aware across multiple bugs that Edge performs much better than we do when it comes to playback. They are doing zero-copy decoding, which we are currently unable to achieve (this could be achieved with bug 1539735).
They also use a DXVA decoder directly while we are using a WMF media transform which adds a lot of unnecessary overhead.

Having said that, your machine is clearly incapable of playing such content properly.

Now regarding WebRender, it could be that we are just reporting the frame drops differently when using WebRender; :mstange has made some changes recently with regard to how dropped frames are reported in WebRender, in which case it's just a red herring that WebRender is looking so bad here.

:mstange, what do you think?

Flags: needinfo?(mstange.moz)

(In reply to Asif Youssuff from comment #11)

Did some additional testing:

WR disabled: 4507 dropped out of 18693
WR disabled & media.ffvpx.enabled set to false: 3819 dropped out of 18699
WR enabled & media.ffvpx.enabled set to false: 11821 dropped out of 18587

Could you collect a profile in each of these configurations so that we can compare?

Flags: needinfo?(mstange.moz) → needinfo?(yoasif)

(In reply to Jean-Yves Avenard [:jya] from comment #15)

We are already aware across multiple bugs that Edge performs much better than we do when it comes to playback. They are doing zero-copy decoding, which we are currently unable to achieve (this could be achieved with bug 1539735).
They also use a DXVA decoder directly while we are using a WMF media transform which adds a lot of unnecessary overhead.

This makes it sound like we're resigned to being worse. I'd like to be clear on the fact that, whenever we're worse, that's a valid bug.
Thank you Asif for filing this bug.

(In reply to Jean-Yves Avenard [:jya] from comment #15)

Now regarding WebRender, it could be that we are just reporting the frame drops differently when using WebRender; :mstange has made some changes recently with regard to how dropped frames are reported in WebRender, in which case it's just a red herring that WebRender is looking so bad here.

There should not be a difference in the reporting of dropped frames between WR and non-WR.

The video frame drops seem to all be occurring in the video sink. See the DiscardVideo markers on the MediaDecoderStateMachine thread:
https://share.firefox.dev/30QRDNK

Markus,

Here is what I came up with:

WR enabled, media.ffvpx.enabled = true (default): https://share.firefox.dev/2PMpHED
WR enabled, media.ffvpx.enabled = false: https://share.firefox.dev/2DERQex
WR disabled, media.ffvpx.enabled = true: https://share.firefox.dev/31Gnccw
WR disabled, media.ffvpx.enabled = false: https://share.firefox.dev/3kBbzMC

Flags: needinfo?(yoasif)

@Sotaro: Could you take a look at what's going on here?

Blocks: gfx-82
Severity: -- → S2
Flags: needinfo?(sotaro.ikeda.g)
Priority: -- → P3

WebRender with the native compositor uses more GPU power, which might increase frame drops. If the VP9 video is decoded by the GPU, Bug 1460499 might reduce GPU updates and thereby reduce frame drops.

Now that bug 1640526 has landed, can you still reproduce the perf being significantly worse with WebRender in the latest nightly? Thanks!

Flags: needinfo?(yoasif)

(In reply to Asif Youssuff from comment #19)

Markus,

Here is what I came up with:

WR enabled, media.ffvpx.enabled = true (default): https://share.firefox.dev/2PMpHED
WR enabled, media.ffvpx.enabled = false: https://share.firefox.dev/2DERQex
WR disabled, media.ffvpx.enabled = true: https://share.firefox.dev/31Gnccw
WR disabled, media.ffvpx.enabled = false: https://share.firefox.dev/3kBbzMC

Thanks! As far as I can tell, the difference between the WR on and off profiles is firmly on the decoding side.
WR on: https://share.firefox.dev/3hldq5n
WR off: https://share.firefox.dev/33hapxR
(both with media.ffvpx.enabled = true)

The WR on profile has a lot more DiscardVideo markers. We must be falling behind during decoding. This could either be due to extra GPU load (what Sotaro said), or maybe we're trying to decode for the wrong target timestamp. I don't know enough about how we determine the decoding target timestamps to say why WR would be making a difference here.

(In reply to Andrew Osmond [:aosmond] from comment #22)

Now that bug 1640526 has landed, can you still reproduce the perf being significantly worse with WebRender in the latest nightly? Thanks!

Did another test.

WR enabled: 12663 dropped frames out of 18386 (~69%)
WR disabled: 4875 dropped frames out of 18562 (~26%)

Hope this helps.

Flags: needinfo?(yoasif)

@Sotaro: Do you have an idea what we could do here?

(In reply to Asif Youssuff from comment #19)

WR enabled, media.ffvpx.enabled = true (default): https://share.firefox.dev/2PMpHED
WR enabled, media.ffvpx.enabled = false: https://share.firefox.dev/2DERQex
WR disabled, media.ffvpx.enabled = true: https://share.firefox.dev/31Gnccw
WR disabled, media.ffvpx.enabled = false: https://share.firefox.dev/3kBbzMC

It seems that the pref "media.ffvpx.enabled" did not affect the profile results, since all of them contain the following functions. It seems that the hardware decoder was used for VP9 video decoding.

  • WMFVideoMFTManager::Output()
  • WMFVideoMFTManager::CreateD3DVideoFrame()

Under the MFTDecoder::Output() call, the CDevice::WaitForSynchronizationObjectFromCpuCB() wait lasted a very long time, and it became longer when WebRender was enabled. This suggests that WebRender uses more GPU than non-WebRender, which lengthened the WaitForSynchronizationObjectFromCpuCB() wait and in turn appears to have slowed decoding.

If we want to test software VP9 decoding, we also need to set the pref "media.wmf.vp9.enabled=false" in addition to the pref "media.ffvpx.enabled".

Asif, can you test software VP9 decode performance with the pref "media.wmf.vp9.enabled=false"?
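For anyone repeating this test across several configurations, the prefs can also be set from a `user.js` file in the profile directory instead of through about:config. The sketch below only generates the `user_pref(...)` lines; the `render_user_js` helper is hypothetical, and setting media.ffvpx.enabled to false here is just one of the two software-decoder variants discussed in this bug (libvpx rather than ffvp9).

```python
import json
from typing import Dict, Union

PrefValue = Union[bool, int, str]


def render_user_js(prefs: Dict[str, PrefValue]) -> str:
    """Render prefs as user.js lines, one user_pref(...) call per pref."""
    lines = []
    for name, value in prefs.items():
        # json.dumps yields JavaScript-compatible literals:
        # False -> false, 1 -> 1, "x" -> "x" (with escaping).
        lines.append(f"user_pref({json.dumps(name)}, {json.dumps(value)});")
    return "\n".join(lines) + "\n"


# Assumed test configuration: force software VP9 decoding via libvpx.
SOFTWARE_VP9_PREFS: Dict[str, PrefValue] = {
    "media.wmf.vp9.enabled": False,  # disable the WMF (hardware) VP9 path
    "media.ffvpx.enabled": False,    # use libvpx instead of ffvp9
}

if __name__ == "__main__":
    print(render_user_js(SOFTWARE_VP9_PREFS), end="")
```

The output can be pasted into `user.js` in a throwaway profile; removing the file and resetting the prefs in about:config restores the defaults.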

Flags: needinfo?(yoasif)

Sotaro, I should have realized this even before trying it, but the performance is abysmal. Basically (literally) 100% dropped frames. This is a very slow CPU, so software decode at this resolution is impossible.

Flags: needinfo?(yoasif)
Summary: Stuttery playback of 1440p video on YouTube → Stuttery playback of 1440p video on YouTube with UHD 600

@Nical: Can you reproduce this on the cherryview intel ?

Flags: needinfo?(nical.bugzilla)

(In reply to Asif Youssuff from comment #28)

Sotaro, I should have realized this even before trying it, but the performance is abysmal. Basically (literally) 100% dropped frames. This is a very slow CPU, so software decode at this resolution is impossible.

Thank you for checking! This shows that hardware decoding is necessary and that GPU work needs to be reduced drastically. Bug 1539735 could be one solution.

Flags: needinfo?(sotaro.ikeda.g)

@Nical: Can you reproduce this on the cherryview intel ?

Yeah, with that device, things work well in 720p but in 1440p playback gets stuck on a particular frame for a long time even though the timeline in the video control advances.

Flags: needinfo?(nical.bugzilla)
No longer blocks: gfx-82

Is there any news on this issue? Video playback performance in FF is terrible. Almost all the frames are dropped. I have to use Chrome to watch HQ videos on YT!

Nical, can you please retest this and see if it remains an issue?

Flags: needinfo?(nical.bugzilla)

I don't have the cherryview low end device that could reproduce this anymore. I don't know of any work that would have moved the needle in that area.

Flags: needinfo?(nical.bugzilla)

For me it's reproducible on this video https://www.youtube.com/watch?v=7MIciimLLdM (or any other video on that channel) but not in 1440p. 1440p is ok but 1080p x2 speed is a slideshow. In Chrome 1080p x2 and even 1440p x2 playbacks are ok.

Intel UHD Graphics 617 1536 MB

(In reply to ris58h from comment #35)

For me it's reproducible on this video https://www.youtube.com/watch?v=7MIciimLLdM (or any other video on that channel) but not in 1440p. 1440p is ok but 1080p x2 speed is a slideshow. In Chrome 1080p x2 and even 1440p x2 playbacks are ok.

Intel UHD Graphics 617 1536 MB

This plays fine for me at 1x or 2x at 1440 and 1080. Intel(R) UHD Graphics 620.

Could you help us out by capturing and uploading a Firefox performance profile when this happens? Visit https://profiler.firefox.com/ for more information. When capturing, please select the Graphics profile in the drop down.

Flags: needinfo?(ris58h)

Could you help us out by capturing and uploading a Firefox performance profile when this happens? Visit https://profiler.firefox.com/ for more information. When capturing, please select the Graphics profile in the drop down.

I've recorded some Graphics profiles:

  1. 1080p x2. It hangs sometimes. https://share.firefox.dev/3TSUSgA
  2. 1080p x2 but I've entered and exited fullscreen mode. It hangs sometimes too. https://share.firefox.dev/3st8QKu
  3. 1440p x2 - just a static image but sound is playing. https://share.firefox.dev/3sqAqb3
Flags: needinfo?(ris58h)

Sotaro, can you provide any insights on this?

Flags: needinfo?(sotaro.ikeda.g)

The original bug was filed for a Windows PC. See comment 1.

The profile data from ris58h shows that it comes from macOS. See comment 37.

Windows and macOS use different hardware decoders, so these are different problems.

The profile data in comment 37 has no information about the decoding side. A profile captured with the "Media" settings preset should have more information.

Flags: needinfo?(sotaro.ikeda.g)

ris58h, can you capture a profile with the Firefox Profiler using the "Media" settings? Thank you.

Flags: needinfo?(ris58h)

:mstange, can you comment on the profile data in comment 37?

Flags: needinfo?(mstange.moz)

In the first two profiles, I see some "SkippedComposite" markers on the compositor thread with the reason "Too many pending frames", i.e. the Renderer thread wasn't able to keep up with the requested composites. I think the Renderer thread is CPU starved - there are long delays between the Compositor thread requesting a composite and the Renderer thread starting to work on a composite.
First profile with a view which shows the relevant markers: https://share.firefox.dev/40ix1e0
Second profile with the same view: https://share.firefox.dev/3HmyHKR
It might be a good idea to increase the priority of the Renderer thread on macOS, too.

In the third profile, the compositor never gets any decoded frames. There should be increasing frameIDs in the UpdateCompositedFrame markers, but there's just a single image: https://share.firefox.dev/3WSTQlL

Flags: needinfo?(mstange.moz)

It seems like it's better now. There are some lags but it's not a slideshow.
https://profiler.firefox.com/from-browser/calltree/?globalTrackOrder=0wg&hiddenGlobalTracks=1wd&hiddenLocalTracksByPid=804-0w9~3424-01~1258-0w2~9911-0w2~9247-0w2~11009-0w2~11005-0w2~6427-0w2~11098-0w2~2534-0w2~6432-0w2~2551-0w2~2552-0w2~10463-0w3~3466-02~3423-0wy8yawydyfwynyqyr~9279-0wfiqwxc&thread=Aj&v=8

Video https://www.youtube.com/watch?v=LXb3EKWsInQ 1440@60 HDR x2 in Theater Mode

Display resolution is scaled to larger text (1280x800)

MacBook Air
Retina, 13-inch, 2018
1,6 GHz Dual-Core Intel Core i5
Intel UHD Graphics 617 1536 MB
8 GB 2133 MHz LPDDR3

Flags: needinfo?(ris58h)

You need to upload the profile before you can share a link to it. Use the "Upload Local Profile" button in the top right corner.

(In reply to Markus Stange [:mstange] from comment #44)

You need to upload the profile before you can share a link to it. Use the "Upload Local Profile" button in the top right corner.

Sorry for a wrong link. https://share.firefox.dev/3Y6oXeE

:bradwerth, can you comment on comment 37 and comment 45?

Flags: needinfo?(bwerth)

I don't have any additional insight into the profiles that have been uploaded so far.

I can sort-of reproduce this with a MacBook Pro with Intel UHD Graphics 630 running macOS 10.15, when pushing the quality up to 2160p. In that case I get some micro-stuttering. 1440p plays back fine. I've uploaded a profile of 5 seconds of playback. In the profile, I've tried to highlight only the relevant tracks, including both the media decoding and the compositor and rendering tracks.

I don't see anything actionable here. The only strange thing I noted is that the MediaPDecoder tracks spend most of their time in safeShift10BitBy6. That's fine, as long as the function is being inlined as intended. I assume the profiler is magically displaying the inlined stack, but if it's not, and the function has failed to be inlined, then that's something to fix.

Flags: needinfo?(bwerth)

(In reply to Brad Werth [:bradwerth] from comment #47)

I don't have any additional insight into the profiles that have been uploaded so far.

I can sort-of reproduce this with a MacBook Pro with Intel UHD Graphics 630 running macOS 10.15, when pushing the quality up to 2160p. In that case I get some micro-stuttering. 1440p plays back fine. I've uploaded a profile of 5 seconds of playback. In the profile, I've tried to highlight only the relevant tracks, including both the media decoding and the compositor and rendering tracks.

What video codec was being used in this profile?

Flags: needinfo?(bwerth)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #48)

What video codec was being used in this profile?

It's VP9. Stats for nerds shows it as vp09.02.51.10.01.09.16.09.00 (337) / opus (251).

Flags: needinfo?(bwerth)

I'm surprised it's not decoding in hardware.

I'm going to split off the macOS playback issue into a new bug, leaving this for the originally-reported Windows playback issue. The platforms are different enough that we need to use different approaches for optimization on each, and we have different paths to achieve hardware decode, etc.

Summary: Stuttery playback of 1440p video on YouTube with UHD 600 → Stuttery playback of 1440p video on YouTube with UHD 600 on Windows

I've opened a new Bug 1814535 for the macOS-specific playback issue. ris58h, if the problem is still happening for you on your Intel UHD Graphics 617 mac, it would be helpful if you would upload a profile recorded with "Media" presets and add a link to it in that Bug. Since your earlier profiles were recorded with macOS 13, you should be getting hardware decode and it would be useful to see how hardware decode can cause stuttering.

Flags: needinfo?(ris58h)
See Also: → 1814535
Flags: needinfo?(ris58h)

Asif, can you still reproduce this?

Flags: needinfo?(yoasif)
Assignee: nobody → bwerth

Jeff, I am unfortunately still seeing dropped frames. Here's a new profile: https://share.firefox.dev/3Lj5rbQ

Flags: needinfo?(yoasif)

(In reply to Asif Youssuff from comment #54)

Jeff, I am unfortunately still seeing dropped frames. Here's a new profile: https://share.firefox.dev/3Lj5rbQ

I can't tell from that profile if there is a graphics issue, because it doesn't include the sampling from the Renderer thread. All I've discovered so far is that there's a big JavaScript garbage collection right before the first stutter. Asif, would you please record and post a profile using the "Graphics" Settings preset?

Flags: needinfo?(yoasif)

The profile has a Renderer thread. It is hidden in the GPU process and can be shown by right-clicking the GPU process.

(In reply to Sotaro Ikeda [:sotaro] from comment #56)

The profile has a Renderer thread. It is hidden in the GPU process and can be shown by right-clicking the GPU process.

I was expecting to see one in the parent process. At least, I see a Renderer, CanvasRender, and Compositor thread in the parent process when I record with the "Graphics" setting.

I'll try to reproduce on my slightly-newer Ice Lake system, Intel Iris Plus Graphics.

Flags: needinfo?(yoasif)

Any luck with reproduction, Brad?

Flags: needinfo?(bwerth)

(In reply to Brad Werth [:bradwerth] from comment #58)

I'll try to reproduce on my slightly-newer Ice Lake system, Intel Iris Plus Graphics.

On Ice Lake, it's very smooth. I'm going to mark this Bug as stalled until we have a profile that's showing something actionable. Again, all I can see in the last posted profile from comment 54 is that there's a JavaScript GC before the first frame stutter, but there's no activity on any thread during the stall period.

Severity: S2 → S3
Status: NEW → ASSIGNED
Flags: needinfo?(bwerth)
Keywords: stalled

Would you please record a new profile with "Media" settings? This may show us what is causing resource contention during the frame stuttering.

Flags: needinfo?(yoasif)

(In reply to Brad Werth [:bradwerth] (out until Jan 8) from comment #61)

Would you please record a new profile with "Media" settings? This may show us what is causing resource contention during the frame stuttering.

Whiteboard: [closeme 2024-02-01]

Resolved per whiteboard

Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Flags: needinfo?(yoasif)
Resolution: --- → INCOMPLETE
Whiteboard: [closeme 2024-02-01]

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit BugBot documentation.

Keywords: stalled