Open Bug 1520894 Opened 4 years ago Updated 4 months ago

Low-latency video playback via MediaSource causes extremely high CPU/GPU usage and poor playback performance.

Categories

(Core :: Audio/Video: Playback, defect, P3)

Tracking

()

Webcompat Priority P3
Performance P2

People

(Reporter: andrew, Unassigned, NeedInfo)

References

Details

(Keywords: perf:responsiveness)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36

Steps to reproduce:

Actual results:

Firefox begins to pin the CPU and GPU usage at 100% and video playback becomes a slideshow. This actually becomes even worse if you make Firefox use the dGPU in a laptop instead of the integrated graphics.

Expected results:

Playback should remain smooth, with low latency and reasonable resource usage. Chrome and Safari are able to do this without issue. In fact, on a system with an Intel i7 7770k and GTX 1080Ti, Firefox is also able to play back smoothly: https://twitter.com/Andrewmd5/status/1070562962228707333

We tested on a number of computers running Windows 10:

Lenovo ThinkPad P51 (20HH000TUS)

Lenovo X1 Carbon 6th Generation Ultrabook

Intel Compute Stick CS125

In all our tests, Chrome performs as expected and Firefox does not.

I've also uploaded a saved stream so you can analyze the video data we append to the SourceBuffer.

https://nexus.rainway.com/2019-01-17-14-41_(Radeon+(TM)+RX+480+Graphics).zip

Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
Whiteboard: [qf]

Thanks for filing this bug Andrew.

(In reply to Andrew Sampson from comment #0)

This actually becomes even worse if you make Firefox use the dGPU in a laptop instead of the integrated graphics.

This comment makes me think you're causing us to read back decoded video frames from the GPU. Are you piping the video element through a canvas, by any chance?

Flags: needinfo?(andrew)

Andrew: it would also be super helpful if you could collect and share a performance profile of Firefox behaving badly. Then we can inspect the problem as you observe it happening in your environment.

It's very easy to collect a profile: install the perf-html addon from https://perf-html.io, press CTRL+SHIFT+1 to start profiling, and press CTRL+SHIFT+2 to stop once you've recorded the bad behaviour. Then you can upload the profile to our servers to share it (publicly) with us so we can inspect what's happening on your hardware.

It would also be helpful if you could add "Media" to the "Add custom threads by name" box in the profiling settings, then Firefox will collect samples from the threads involved in media demuxing and decoding. Your perf-html configuration would then look something like this:
https://imgur.com/a/TV3kbFb

For more details, see also:

https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem

Hi Chris,

I've taken a profile from our web application which you can download here

For further context, we do not use a canvas and are just rendering the video via MSE. We have optimizations in place that keep the video synced with the live edge, allowing for real-time interactive streaming.
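To illustrate what keeping an MSE video synced with the live edge can look like, here is a minimal sketch. It is not Rainway's actual code, and the approach (nudging the playback rate rather than seeking) is an assumption. The names `catchUpRate`, `targetLagSeconds`, and `maxRate`, as well as the threshold values, are hypothetical. `TimeRangesLike` only mirrors the shape of the standard `TimeRanges` object exposed by `video.buffered`.

```typescript
// Structural stand-in for the DOM TimeRanges object (e.g. video.buffered),
// so the logic can run and be tested outside a browser.
interface TimeRangesLike {
  readonly length: number;
  start(index: number): number;
  end(index: number): number;
}

// Returns a playback rate that pulls playback back toward the end of the
// last buffered range. The thresholds are illustrative assumptions only.
function catchUpRate(
  currentTime: number,
  buffered: TimeRangesLike,
  targetLagSeconds: number = 0.05,
  maxRate: number = 1.5,
): number {
  if (buffered.length === 0) return 1; // nothing appended yet
  const liveEdge = buffered.end(buffered.length - 1);
  const lag = liveEdge - currentTime;
  if (lag <= targetLagSeconds) return 1; // already close enough to live
  // Speed up in proportion to the excess lag, capped at maxRate.
  return Math.min(maxRate, 1 + (lag - targetLagSeconds));
}
```

In a page this could be applied periodically as `video.playbackRate = catchUpRate(video.currentTime, video.buffered);`. Adjusting the rate instead of seeking avoids needing a keyframe at the target position.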

Please let me know if I can provide any more information.

Flags: needinfo?(andrew)

Looking at this time slice on the GPU process' decode thread: https://perfht.ml/2U83st1 we can see (in the middle of the flame graph) that we're spending about 1/3 of the time draining the video decoder. Which I assume is causing us to re-do a lot of the decoding work?

jya: What causes us to drain on Windows these days?

Flags: needinfo?(jyavenard)

Decoders have latencies. Particularly on Windows, where the decoder is always about 30 frames behind the last input frame (so you must feed around 30 frames to the decoder before one comes out).

So when we reach the end of resource, in order to display those frames we have to drain the decoder, and to resume decoding once more frames are added we seek to the last frame decoded (which requires going to the last keyframe and decoding all frames until the last one).

Site players that want very low latency (such as Twitch) try to stay as close to live as possible, so we only have a few frames ahead of the current time. As such, we hit EOS very often, which causes us to drain/seek/decode repeatedly.

Being able to play all frames buffered is part of the MSE spec, and you can't do so unless you drain the decoder when you reach the end of the buffered range.

For sites like Twitch and other low-latency ones, where very few frames ahead of the current time are buffered, this can lead to very high CPU usage if the machine doesn't support HW decoding.
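The cost of the drain/seek/decode cycle described above can be sketched with a toy model. This is not Firefox's implementation. It assumes, for illustration, that the player underruns after every appended batch and that each resume re-decodes from the last keyframe up to the last frame already decoded. The function name and parameters are hypothetical.

```typescript
// Counts decode operations when playback hits the end of the buffered
// range after every appended batch, forcing a drain and a later resume.
function decodesWithDrainOnUnderrun(
  totalFrames: number,
  batchSize: number,
  keyframeInterval: number,
): number {
  let decodes = 0;
  for (let appended = 0; appended < totalFrames; appended += batchSize) {
    const batch = Math.min(batchSize, totalFrames - appended);
    if (appended > 0) {
      // Resume after a drain: go back to the last keyframe and decode
      // every frame again up to the last frame that was already decoded.
      const lastDecoded = appended - 1;
      const lastKeyframe = lastDecoded - (lastDecoded % keyframeInterval);
      decodes += lastDecoded - lastKeyframe + 1;
    }
    decodes += batch; // the newly appended frames themselves
  }
  return decodes;
}

// 60 frames, a single keyframe at the start, 2 frames appended at a time:
console.log(decodesWithDrainOnUnderrun(60, 2, 60)); // 930 decodes for 60 frames
// The same 60 frames appended in one batch, so no underrun occurs:
console.log(decodesWithDrainOnUnderrun(60, 60, 60)); // 60
```

Under these assumptions the redundant work grows with the distance from the last keyframe, which is why a stream with very few keyframes and a tiny buffer is the worst case.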

Unfortunately, not doing this drain will break sites such as YouTube as they do the following when seeking:

1- They load around 1s of video (on a typical YouTube stream, that's 15 frames).
2- They wait for a frame to be available and the seeked event to be fired.
3- Once the seeked event is fired, they load more data and continue playback.

Now on all platforms other than Windows, not draining will be okay in this case, as none have such a high latency. With FFmpeg, the maximum latency is the number of threads used to decode (so on your typical machine it's between 4 and 8 frames max).

On Windows, with its 35+ frames of latency, the WMF decoder would return nothing here, and as such the seeked event would never be fired, and so YT would never send more data to play, leading to a stall.
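The stall described here can be reproduced with a toy decoder that holds back a fixed number of frames. This is only a model of the behaviour described in this comment, not the WMF or FFmpeg API. `LatentDecoder` and `framesOut` are hypothetical names.

```typescript
// Toy decoder that keeps `latency` frames inside it before emitting output.
class LatentDecoder {
  private pending: number[] = [];
  private readonly latency: number;

  constructor(latency: number) {
    this.latency = latency;
  }

  // Feed one frame; returns the frame that comes out, if any.
  decode(frame: number): number | undefined {
    this.pending.push(frame);
    return this.pending.length > this.latency ? this.pending.shift() : undefined;
  }

  // Flush every frame still held inside the decoder.
  drain(): number[] {
    const out = this.pending;
    this.pending = [];
    return out;
  }
}

// Number of frames that come out after feeding `framesIn` frames, no drain.
function framesOut(latency: number, framesIn: number): number {
  const decoder = new LatentDecoder(latency);
  let out = 0;
  for (let i = 0; i < framesIn; i++) {
    if (decoder.decode(i) !== undefined) out++;
  }
  return out;
}

console.log(framesOut(8, 15));  // 7: a low-latency decoder yields frames
console.log(framesOut(35, 15)); // 0: nothing comes out unless we drain
```

With 15 frames appended and a 35-frame latency, no frame is ever produced without `drain()`, so a player waiting for the first frame before appending more data would wait forever.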

On Windows, YT only uses the WMF decoder if HW decoding is supported; in all other cases they use FFmpeg. As such, you don't see this problem, and they can get away with not draining.

We don't have that freedom of implementation.

Flags: needinfo?(jyavenard)

Now, I would be keen to know why on the OP machine you could hit 100% GPU usage, which indicates that software decoding is in use.

Could you attach a copy of about:support here?

Also, what codecs are you using? The WMF latency issue only occurs with H264.

Flags: needinfo?(andrew)

Our video stream uses fragmented MP4 (fMP4) chunks carrying H.264. You can see a fully saved version of the stream's raw data here.

Below is my about:support text.

Application Basics

Name: Firefox
Version: 64.0.2
Build ID: 20190108160530
Update Channel: release
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0
OS: Windows_NT 10.0
Multiprocess Windows: 1/1 (Enabled by default)
Web Content Processes: 2/4
Enterprise Policies: Inactive
Google Key: Found
Mozilla Location Service Key: Found
Safe Mode: false

Crash Reports for the Last 3 Days

All Crash Reports Firefox Features

Name: Firefox Screenshots
Version: 35.0.0
ID: screenshots@mozilla.org

Name: Form Autofill
Version: 1.0
ID: formautofill@mozilla.org

Name: Web Compat
Version: 3.0.0
ID: webcompat@mozilla.org

Name: WebCompat Reporter
Version: 1.1.0
ID: webcompat-reporter@mozilla.org

Extensions

Name: Gecko Profiler
Version: 0.26
Enabled: true
ID: geckoprofiler@mozilla.com

Name: User-Agent Switcher
Version: 1.2.4
Enabled: true
ID: user-agent-switcher@ninetailed.ninja

Name: Adobe Acrobat
Version: 18.0.9
Enabled: false
ID: web2pdfextension.17@acrobat.adobe.com

Security Software

Type: Windows Defender Antivirus

Type: Windows Defender Antivirus

Type: Windows Firewall

Graphics

Features
Compositing: Direct3D 11 (Advanced Layers)
Asynchronous Pan/Zoom: wheel input enabled; scrollbar drag enabled; keyboard enabled; autoscroll enabled
WebGL 1 Driver WSI Info: EGL_VENDOR: Google Inc. (adapter LUID: 0000000000013f00) EGL_VERSION: 1.4 (ANGLE 2.1.0.790e8e6b4179) EGL_EXTENSIONS: EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_CHROMIUM_sync_control EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled EGL_MOZ_create_context_provoking_vertex_dont_care EGL_EXTENSIONS(nullptr): EGL_EXT_client_extensions EGL_EXT_platform_base EGL_EXT_platform_device EGL_ANGLE_platform_angle EGL_ANGLE_platform_angle_d3d EGL_ANGLE_device_creation EGL_ANGLE_device_creation_d3d11 EGL_ANGLE_experimental_present_path EGL_KHR_client_get_all_proc_addresses EGL_KHR_debug EGL_ANGLE_explicit_context
WebGL 1 Driver Renderer: Google Inc. -- ANGLE (NVIDIA Quadro M1200 Direct3D11 vs_5_0 ps_5_0)
WebGL 1 Driver Version: OpenGL ES 2.0 (ANGLE 2.1.0.790e8e6b4179)
WebGL 1 Driver Extensions: GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_explicit_context GL_ANGLE_explicit_context_gles1 GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_program_cache_control GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_KHR_robust_buffer_access_behavior GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object OES_compressed_EAC_R11_signed_texture OES_compressed_EAC_R11_unsigned_texture OES_compressed_EAC_RG11_signed_texture OES_compressed_EAC_RG11_unsigned_texture OES_compressed_ETC2_RGB8_texture 
OES_compressed_ETC2_RGBA8_texture OES_compressed_ETC2_punchthroughA_RGBA8_texture OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture OES_compressed_ETC2_sRGB8_alpha8_texture OES_compressed_ETC2_sRGB8_texture
WebGL 1 Extensions: ANGLE_instanced_arrays EXT_blend_minmax EXT_color_buffer_half_float EXT_frag_depth EXT_sRGB EXT_shader_texture_lod EXT_texture_filter_anisotropic EXT_disjoint_timer_query OES_element_index_uint OES_standard_derivatives OES_texture_float OES_texture_float_linear OES_texture_half_float OES_texture_half_float_linear OES_vertex_array_object WEBGL_color_buffer_float WEBGL_compressed_texture_s3tc WEBGL_compressed_texture_s3tc_srgb WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_depth_texture WEBGL_draw_buffers WEBGL_lose_context
WebGL 2 Driver WSI Info: EGL_VENDOR: Google Inc. (adapter LUID: 0000000000013f00) EGL_VERSION: 1.4 (ANGLE 2.1.0.790e8e6b4179) EGL_EXTENSIONS: EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_CHROMIUM_sync_control EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled EGL_MOZ_create_context_provoking_vertex_dont_care EGL_EXTENSIONS(nullptr): EGL_EXT_client_extensions EGL_EXT_platform_base EGL_EXT_platform_device EGL_ANGLE_platform_angle EGL_ANGLE_platform_angle_d3d EGL_ANGLE_device_creation EGL_ANGLE_device_creation_d3d11 EGL_ANGLE_experimental_present_path EGL_KHR_client_get_all_proc_addresses EGL_KHR_debug EGL_ANGLE_explicit_context
WebGL 2 Driver Renderer: Google Inc. -- ANGLE (NVIDIA Quadro M1200 Direct3D11 vs_5_0 ps_5_0)
WebGL 2 Driver Version: OpenGL ES 3.0 (ANGLE 2.1.0.790e8e6b4179)
WebGL 2 Driver Extensions: GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_explicit_context GL_ANGLE_explicit_context_gles1 GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_multiview GL_ANGLE_pack_reverse_row_order GL_ANGLE_program_cache_control GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_float GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_norm16 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_KHR_robust_buffer_access_behavior GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_EGL_image_external_essl3 GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object OES_compressed_EAC_R11_signed_texture OES_compressed_EAC_R11_unsigned_texture 
OES_compressed_EAC_RG11_signed_texture OES_compressed_EAC_RG11_unsigned_texture OES_compressed_ETC2_RGB8_texture OES_compressed_ETC2_RGBA8_texture OES_compressed_ETC2_punchthroughA_RGBA8_texture OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture OES_compressed_ETC2_sRGB8_alpha8_texture OES_compressed_ETC2_sRGB8_texture
WebGL 2 Extensions: EXT_color_buffer_float EXT_texture_filter_anisotropic EXT_disjoint_timer_query OES_texture_float_linear WEBGL_compressed_texture_s3tc WEBGL_compressed_texture_s3tc_srgb WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_lose_context
Direct2D: true
Off Main Thread Painting Enabled: true
Off Main Thread Painting Worker Count: 4
DirectWrite: true (10.0.17134.376)
GPU #1
Active: Yes
Description: NVIDIA Quadro M1200
Vendor ID: 0x10de
Device ID: 0x13b6
Driver Version: 24.21.14.1195
Driver Date: 11-8-2018
Drivers: C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumdx.dll C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumd.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumd.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumd.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvltwi.inf_amd64_ff229b059e431d45\nvldumd.dll
Subsys ID: 224d17aa
RAM: 4096
GPU #2
Active: No
Description: Intel(R) HD Graphics 630
Vendor ID: 0x8086
Device ID: 0x591b
Driver Version: 21.20.16.4590
Driver Date: 1-18-2017
Drivers: igdumdim64 igd10iumd64 igd10iumd64 igd12umd64 igdumdim32 igd10iumd32 igd10iumd32 igd12umd32
Subsys ID: 224d17aa
RAM: Unknown
Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: direct2d 1.1
AzureCanvasBackend (UI Process): skia
AzureContentBackend: direct2d 1.1
AzureContentBackend (UI Process): skia
AzureFallbackCanvasBackend (UI Process): cairo
GPUProcessPid: 21008
Decision Log
WEBRENDER:
opt-in by default: WebRender is an opt-in feature
WEBRENDER_QUALIFIED:
blocked by env: Has battery

Media

Audio Backend: wasapi
Max Channels: 2
Preferred Sample Rate: 48000
Output Devices
Name: Group
LG ULTRAWIDE-4 (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
NVIDIA Output (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
DELL P2715Q-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Crestron-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI (NVIDIA High Definition Audio):
Projector-4 (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Projector-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Headset Earphone (Wireless Controller): USB\VID_054C&PID_09CC&MI_00\6&57fb139&0&0000
DELL P2715Q-4 (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Crestron (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Crestron-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Speaker/HP (Realtek High Definition Audio): HDAUDIO\FUNC_01&VEN_10EC&DEV_0298&SUBSYS_17AA224D&REV_1001\4&21df54c2&0&0001
40S305-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI-4 (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
LG ULTRAWIDE-4 (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
NS-40D510NA15-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
SHARP HDMI (NVIDIA High Definition Audio):
HP VH240a-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Headset Earphone (2- Wireless Controller): USB\VID_054C&PID_09CC&MI_00\6&d39ff6c&0&0000
NVIDIA Output (NVIDIA High Definition Audio):
Crestron-C (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
LG ULTRAWIDE (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Input Devices
Name: Group
Microphone Array (Realtek High Definition Audio): HDAUDIO\FUNC_01&VEN_10EC&DEV_0298&SUBSYS_17AA224D&REV_1001\4&21df54c2&0&0001
Internal AUX Jack (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Internal AUX Jack (NVIDIA High Definition Audio): HDAUDIO\FUNC_01&VEN_10DE&DEV_0060&SUBSYS_00000000&REV_1001\5&16550ea1&0&0001
Headset Microphone (Wireless Controller): USB\VID_054C&PID_09CC&MI_00\6&57fb139&0&0000
Headset Microphone (2- Wireless Controller): USB\VID_054C&PID_09CC&MI_00\6&d39ff6c&0&0000
Microphone (C922 Pro Stream Webcam): USB\VID_046D&PID_085C&MI_02\7&2d9108c5&0&0002

Important Modified Preferences

accessibility.typeaheadfind.flashBar: 0
browser.cache.disk.capacity: 1048576
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.smart_size.first_run: false
browser.cache.frecency_experiment: 3
browser.places.smartBookmarksVersion: 8
browser.sessionstore.upgradeBackup.latestBuildID: 20190108160530
browser.startup.homepage_override.buildID: 20190108160530
browser.startup.homepage_override.mstone: 64.0.2
browser.tabs.warnOnClose: false
browser.urlbar.placeholderName: Google
browser.urlbar.timesBeforeHidingSuggestionsHint: 0
dom.forms.autocomplete.formautofill: true
dom.push.userAgentID: 46ac66255bae4bfeae858a95650f9a59
extensions.lastAppVersion: 64.0.2
layers.mlgpu.sanity-test-failed: false
media.gmp-gmpopenh264.abi: x86_64-msvc-x64
media.gmp-gmpopenh264.lastUpdate: 1520028561
media.gmp-gmpopenh264.version: 1.7.1
media.gmp-manager.buildID: 20190108160530
media.gmp-manager.lastCheck: 1548020725
media.gmp-widevinecdm.abi: x86_64-msvc-x64
media.gmp-widevinecdm.lastUpdate: 1546464358
media.gmp-widevinecdm.version: 4.10.1146.0
media.gmp.storage.version.observed: 1
media.hardware-video-decoding.failed: false
network.cookie.prefsMigrated: true
network.predictor.cleaned-up: true
places.database.lastMaintenance: 1547071020
places.history.expiration.transient_current_max_pages: 112348
plugin.disable_full_page_plugin_for_types: application/pdf
privacy.sanitize.pending: [{"id":"newtab-container","itemsToClear":[],"options":{}}]
security.sandbox.content.tempDirSuffix: {81716ec9-2727-44e1-a792-262b9d1b58c1}
security.sandbox.plugin.tempDirSuffix: {9460633a-0954-40e6-bcd7-f5d74ac30695}
services.sync.declinedEngines:
services.sync.engine.addresses.available: true
signon.importedFromSqlite: true
storage.vacuum.last.index: 1
storage.vacuum.last.places.sqlite: 1547071020
ui.osk.debug.keyboardDisplayReason: IKPOS: Touch screen not found.

Important Locked Preferences

Places Database

JavaScript

Incremental GC: true

Accessibility

Activated: false
Prevent Accessibility: 0
Accessible Handler Used: true
Accessibility Instantiator:

Library Versions

NSPR
Expected minimum version: 4.20
Version in use: 4.20

NSS
Expected minimum version: 3.40.1
Version in use: 3.40.1

NSSSMIME
Expected minimum version: 3.40.1
Version in use: 3.40.1

NSSSSL
Expected minimum version: 3.40.1
Version in use: 3.40.1

NSSUTIL
Expected minimum version: 3.40.1
Version in use: 3.40.1

Sandbox

Content Process Sandbox Level: 5
Effective Content Process Sandbox Level: 5

Internationalization & Localization

Application Settings
Requested Locales: ["en-US"]
Available Locales: ["en-US"]
App Locales: ["en-US"]
Regional Preferences: ["en-US"]
Default Locale: "en-US"
Operating System
System Locales: ["en-US"]
Regional Preferences: ["en-US"]

Flags: needinfo?(andrew)
Whiteboard: [qf] → [qf:p2:resource]

As part of the triage process, I am setting needinfo (NI) for jya.

Flags: needinfo?(jyavenard)
Whiteboard: [qf:p2:resource] → [qf:p2:resource] [needinfo jya]

If in about:config you set the preference media.wmf.low-latency.enabled to true and reload the twitch page, does it work better?

What puzzles me, however, is how you can get high CPU usage when your machine should do it all with the GPU.

Flags: needinfo?(jyavenard) → needinfo?(andrew)

Hi @Jean-Yves Avenard

I just tested with the flag you suggested and it worked perfectly fine.

Flags: needinfo?(andrew)

(In reply to Andrew Sampson from comment #13)

Hi @Jean-Yves Avenard

I just tested with the flag you suggested and it worked perfectly fine.

Define fine though?

It just plays well, or it plays well and you also no longer see the extremely high CPU/GPU usage?

Sorry for the vague comment.

I no longer see pinned CPU/GPU usage and playback is low latency and can actually be used. Whereas without the flag it is a slideshow with 99% usage.

That's great to hear.

So this is dependent on bug 1305340.

Depends on: 1305340

Marking P3 as Bug 1305340 is P3.

Status: UNCONFIRMED → NEW
Rank: 25
Ever confirmed: true
Priority: -- → P3

(In reply to Jean-Yves Avenard [:jya] from comment #16)

That's great to hear.

So this is dependent on bug 1305340.

Thanks for the additional info. We have had multiple users and our staff test with this flag on and Firefox works extremely well. I don't know the status of the other bug, however, would it be possible to enable this flag by default for our domain (*.rainway.com) only? This would allow us to enable Firefox support which a lot of people were upset we had to remove.

Hi Andrew,

In comment #18 you said it works fine with the pref enabled. And bug 1305340 landed in Firefox 67. So that means the pref change has been available in Firefox Release for several weeks now. Can you confirm that the problem is resolved now from your point of view?

Flags: needinfo?(andrew)

(In reply to Nils Ohlmeier [:drno] from comment #19)

Hi Andrew,

In comment #18 you said it works fine with the pref enabled. And bug 1305340 landed in Firefox 67. So that means the pref change has been available in Firefox Release for several weeks now. Can you confirm that the problem is resolved now from your point of view?

Thanks for the ping! I've taken some time to compile a detailed overview of how things look from our end.

Some basic information on our stream:

  • The stream begins with a single FMP4 initialization segment, no Intra-Frames are sent after that.
  • The framerate of the video is "faked" to force browsers to decode faster. Our stream content may be 60 FPS, but the FMP4 metadata says it is 78 FPS. This trick has allowed us to keep the video-buffer low to minimize latency.
  • You can download a dump of one of our streams from here to analyze.

To start off, I did a comparison of Chrome 75 vs Firefox 69 on Windows. From my initial findings, Firefox's performance on Windows has drastically improved since I last tested it, even on one of the problem laptops mentioned in the initial report. As you can see from the videos, however, there are still some random stutters/frame skips that occur every now and then when playing through Firefox and that are not present when playing using Chrome. The Windows profile for Firefox can be found here

macOS is a much different story. When using Chrome 75, it works without issue. Firefox 68 immediately freezes as the video buffer is increasing by seconds at a time, with all but a few frames being dropped. I took a profile of the problematic macOS session, which can be found here. The specs of my MacBook are: 15-inch, Touch Bar, 2.6GHz 6-Core Intel Core i7, Radeon Pro 560X, Intel UHD Graphics 630

We have not tested Linux, so I cannot comment on how well that platform works, and the sample size of Windows PCs we've been able to test on is small, but I'm happy to have my team verify more tomorrow. From what we can see, though, the video stream ultimately fails in the same way on every Mac we test on.

Please let me know if I can send over any other information or be helpful. We're looking forward to seeing these last issues resolved so we can give our users the choice to play in Firefox again.

Flags: needinfo?(andrew)

Putting back into the [qf] queue - this one was originally put into resource usage because of the high CPU usage, but it sounds like it might be closer to a responsiveness issue from the user's perspective.

Whiteboard: [qf:p2:resource] [needinfo jya] → [qf] [needinfo jya]

(In reply to Andrew Sampson from comment #20)

Thanks for the ping! I've taken some time to compile a detailed overview of how things look from our end.

Some basic information on our stream:

  • The stream begins with a single FMP4 initialization segment, no Intra-Frames are sent after that.
  • The framerate of the video is "faked" to force browsers to decode faster. Our stream content may be 60 FPS, but the FMP4 metadata says it is 78 FPS. This trick has allowed us to keep the video-buffer low to minimize latency.

I seriously doubt this trick has any effect in any browser. I can't think of any decoder on any platform that actually uses the information found in the MP4 metadata to determine the playback frame rate.

It's all based on the timestamps for each frame.

Firefox in particular uses the moving average of the last 30 frames to determine the average frame rate.

Saying it's 78 fps in the metadata will do absolutely nothing to reduce decoding latency.
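As a rough illustration of deriving a frame rate from per-frame timestamps rather than from container metadata, here is a minimal sketch. It is not Firefox's code. `estimateFps` is a hypothetical name, and the way the 30-frame window is applied here is an assumption.

```typescript
// Estimates the frame rate from presentation timestamps (in seconds),
// averaging the intervals between the most recent `windowSize` frames.
// Returns null when there are not enough frames to measure an interval.
function estimateFps(
  timestampsSeconds: readonly number[],
  windowSize: number = 30,
): number | null {
  // windowSize intervals need windowSize + 1 timestamps.
  const recent = timestampsSeconds.slice(-(windowSize + 1));
  let previous: number | undefined;
  let totalSeconds = 0;
  let intervals = 0;
  for (const t of recent) {
    if (previous !== undefined) {
      totalSeconds += t - previous;
      intervals++;
    }
    previous = t;
  }
  if (intervals === 0 || totalSeconds <= 0) return null;
  return intervals / totalSeconds;
}

// 100 frames stamped 1/60 s apart yield an estimate of about 60 fps.
const stamps = Array.from({ length: 100 }, (_, i) => i / 60);
console.log(estimateFps(stamps));
```

Note that a declared rate from the container is not an input to this function at all; only the timestamps matter.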

Additionally, the latency is entirely driven by the decoder itself.

FFmpeg latency depends on how many cores the machine has. So a 4-core machine with hyper-threading will have an 8-frame latency by default for playback.

On Mac, the latency is driven by the length of the H264 sliding window, which is typically less than 4 frames.

On Windows it can be up to 30 frames, depending on whether decoding is done in hardware or software. On Windows 10 and later, we have enabled the decoder's low-latency mode.

And then the latency depends on the h264 content itself.
The sliding window is defined in the h264 spec as max_num_ref_frames
Details can be found here: https://searchfox.org/mozilla-central/source/dom/media/platforms/agnostic/bytestreams/H264.h#220
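To put the frame counts quoted in this comment into time units, the following sketch converts decoder latency in frames to milliseconds. The 60 fps figure is taken from the stream description in comment #20. `latencyMs` is a hypothetical helper, and the frame counts are the approximate upper bounds mentioned above, not measurements.

```typescript
// Converts a decoder latency expressed in frames into milliseconds.
function latencyMs(frames: number, fps: number): number {
  return (frames / fps) * 1000;
}

const fps = 60;
console.log(Math.round(latencyMs(8, fps)));  // 133 ms: FFmpeg, 8 decoding threads
console.log(Math.round(latencyMs(4, fps)));  // 67 ms: Mac, 4-frame sliding window
console.log(Math.round(latencyMs(30, fps))); // 500 ms: Windows without low-latency mode
```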

(In reply to Jean-Yves Avenard [:jya] from comment #22)

(In reply to Andrew Sampson from comment #20)

(
Thanks for the ping! I've taken some time to compile a detailed overview of how things look from our end.

Some basic information on our stream:

  • The stream begins with a single FMP4 initialization segment, no Intra-Frames are sent after that.
  • The framerate of the video is "faked" to force browsers to decode faster. Our stream content may be 60 FPS, but the FMP4 metadata says it is 78 FPS. This trick has allowed us to keep the video-buffer low to minimize latency.

I seriously doubt this trick has any effects with any browsers. I can't think of any decoders on any platforms actually using the information found in the mp4 metadata to determine the playback frame rate.

Its all based on the timestamps for each frame.

Firefox in particular use the moving average of the last 30 frames to determine the average framerate,

Saying it's 75fps in the metadata will do absolutely nothing to reduce decoding latency.

Additionally, the latency is entirely driven by the decoder itself.

Ffmpeg latency depends on how many cores the machine has. So a 4-cores with hyper- threading will have a 8 frames latency by default for playback.

On Mac, the latency is driven by the length of the H264 sliding window which is typically less than 4 frames.

On Windows it can be up to 30 frames, on Windows 10 and later, we have enabled their low latency mode. Depending on decoding by hardware or software.

And then the latency depends on the h264 content itself.
The sliding window is defined in the H.264 spec as max_num_ref_frames.
Details can be found here: https://searchfox.org/mozilla-central/source/dom/media/platforms/agnostic/bytestreams/H264.h#220
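To put the frame counts quoted above into time units, here is a small worked conversion. The function name `decoderDelayMs` is hypothetical, and the 60 FPS rate is an assumption taken from the stream described in this bug. The frame counts are the upper figures given in the comment, not measurements.

```typescript
// Convert a decoder delay expressed in frames into milliseconds.
function decoderDelayMs(frames: number, fps: number): number {
  return (frames / fps) * 1000;
}

// Assumed stream rate: the 60 FPS content described in this bug.
const fps = 60;

// Frame counts are the upper figures quoted in the comment above.
const cases: Array<[string, number]> = [
  ["macOS sliding window (upper bound)", 4],
  ["FFmpeg on 4 cores with hyper-threading", 8],
  ["Windows worst case", 30],
];

for (const [label, frames] of cases) {
  console.log(`${label}: ${frames} frames = ${decoderDelayMs(frames, fps).toFixed(1)} ms`);
}
// macOS sliding window (upper bound): 4 frames = 66.7 ms
// FFmpeg on 4 cores with hyper-threading: 8 frames = 133.3 ms
// Windows worst case: 30 frames = 500.0 ms
```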

I apologize if "framerate" isn't the correct term here, but yes, marking our feed as faster than it really is does make Chrome decode much faster. Turning off this flag in our encoding pipeline sees the browser video buffer increase to 70ms or higher, while with it on it remains at just a few frames. I haven't validated if that behavior is the same in Firefox.

(In reply to Nils Ohlmeier [:drno] from comment #19)
(In reply to Jean-Yves Avenard [:jya] from comment #22)

I would like to provide more details about original problem and the current state of things.

The re-enablement of low-latency video decoding on Windows has apparently had a positive effect. Low-latency decoding has been working well, and it's good to see it enabled by default in recent Firefox (as opposed to the manual enablement suggested in comment #12).

However, as Andrew mentioned in comment #20, the overall playback experience in Firefox remains slightly worse than in Chrome, for reasons presumably unrelated to hardware-assisted decoding.

We stream real-time video by appending MSE data chunks, and our goal is the lowest-latency decoding and presentation of this data. In most cases we want zero scheduling on the client side (by the browser): if video frame data is available, it is already time to show it, without regard for the time stamps and presentation scheduling that normal video delivery would assume. Still, we have to format the data to look like a traditional fragmented MP4 stream, with a frame rate indication in particular, or otherwise browsers tend to fail to play the content.

We believe our content is okay for low latency overall. Specifically, in non-browser scenarios the same content shows good, stable, ultra-low latency. One example is the Xbox One MediaElement control, which is close to a browser in terms of pipeline structure: it receives fragmented MP4, decodes with the help of DXVA2, and schedules for GPU-enabled presentation, with just a bit more control over the pipeline from our end. We will double-check max_num_ref_frames per Jean-Yves's reference; however, we know that the content is technically decodable fast and that the behavior is basically "one encoded frame in, one decoded frame out".

The problem we believe is taking place is that we stream 60 fps, and due to network fluctuations and other factors the frame data is received and appended via the MSE interface in far less regular increments. Some frames are received on time, others a bit later, and presumably browsers re-schedule presentation favoring a fixed fps and, of course, late frames. It would be normal for other media content that the browser buffers a bit more but presents more smoothly. In our case this is against our preference because what matters to us, as we stream a gaming experience, is real-time video.

As Andrew mentioned, we figured out a trick with the frame rate indication and time stamping. We stream 60 fps but we indicate the properties of a 1.3x 60 fps stream. This is not what we wanted to do in the first place, but it appears to help a lot because we seem to keep the browser's video tag "constantly late" in its presentation, so the scheduling logic ignores network-caused fluctuations and decodes/presents ASAP. With this trick some video frames might end up being "overly late" and browsers would drop them since their time stamps are too far below the current presentation clock edge. This appears more or less acceptable, and the value of 1.3x is a tradeoff between forcing that live edge for most of the video frames and having just a small number of frames dropped. The value was chosen mostly for Chrome because it was more important at the time, and there were other problems with Firefox, including the one mentioned in comment #19 that has since been fixed.

We see the effect of this scaling trick in Firefox too; there is just a higher rate of dropped frames, which visually results in a stuttering effect during a game streaming session.

We would love to have real-time ultra-low-latency video feeds work perfectly in Firefox (and actually without our trick), but we would need a way to tell the browser to present each decoded frame immediately, no matter what, without using a presentation queue. In our attempts to achieve this, the time scaling trick is what has worked best for us so far, and it worked better with Chrome (presumably due to differences in the video presentation scheduler).
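The arithmetic behind the 1.3x scaling can be sketched as follows. This is only an illustration under stated assumptions, not our production muxer code: the function names `declaredPtsSeconds` and `timelineDeficitSeconds` are hypothetical, and the only values taken from this comment are the 60 fps capture rate and the 1.3x factor.

```typescript
// Hypothetical helper: the timestamp written into the fragment for frame `index`
// when a stream captured at `realFps` is declared as `realFps * scale`.
function declaredPtsSeconds(index: number, realFps: number, scale: number): number {
  return index / (realFps * scale);
}

// How far the declared media timeline falls behind real capture time
// after `index` frames have been produced.
function timelineDeficitSeconds(index: number, realFps: number, scale: number): number {
  const captureTimeSeconds = index / realFps;
  return captureTimeSeconds - declaredPtsSeconds(index, realFps, scale);
}

// One second of captured video: 60 frames at 60 fps, declared as 78 fps.
const declared = declaredPtsSeconds(60, 60, 1.3);
const deficit = timelineDeficitSeconds(60, 60, 1.3);
console.log(`declared timeline: ${declared.toFixed(3)} s, deficit: ${deficit.toFixed(3)} s`);
```

Under these assumptions, each frame is declared to last about 12.8 ms instead of 16.7 ms, so after one second of captured video the declared timeline has advanced only about 0.77 s. A playback clock running at normal speed therefore stays ahead of the appended data, which is the "constantly late" state described above.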

In agreement with comment 21, we are triaging this as responsiveness.

Whiteboard: [qf] [needinfo jya] → [qf:p2:responsiveness] [needinfo jya]

Any progress on this? As someone who uses both products, I am really tired of having to keep Chrome installed and constantly launch it to use the Rainway service.

So I'm following up here to add that macOS is still not in a playable state, and Windows performance has degraded. Have there been any updates on triaging this? If it is at all helpful, we can have one of our developers assist or try creating a patch to Firefox to help.

Wanted to check in again and see if this had any progress, or whether it has been superseded by work on the new WebRender. I really would like to start using Firefox as my browser for real-time low-latency game streaming.

Webcompat Priority: --- → ?
Performance: --- → P2
Whiteboard: [qf:p2:responsiveness] [needinfo jya] → [needinfo jya]
Webcompat Priority: ? → P3

It could be worth testing with a UA override (as Chrome) to check the current performance.
Maybe we should record a new performance profile.
If it's fixed with a UA override, we can create a site intervention on the webcompat side.

Flags: needinfo?(dave.hunt)

Andrew or Yulian, could you confirm if this is still an issue? If so, could you generate a new profile?

Flags: needinfo?(yulian)
Flags: needinfo?(dave.hunt)
Flags: needinfo?(andrew)

So there are still considerable latency spikes during streaming using Rainway. I tried recording a new profile but Firefox kept failing saying the data was too large to serialize, even with only 5-10 seconds of recording.

As a note, I no longer use Rainway or the web clients for any service, as the official desktop/mobile clients work sufficiently well for the services I use. Parsec and Rainway still offer web clients for Chrome-based browsers and they work perfectly in a pinch, but not in Firefox.

How I test:

For rainway:
Install Rainway Dashboard, login
Go to https://play.rainway.com , login
Launch the Desktop stream and move windows around, observe latency spikes.

Client and Server are on Windows 11. Server uses AMD, client is Intel/Nvidia

For parsec:
Install parsec on server machine, login
Go to https://web.parsec.com , login
Connect to the server.

Same setup as Rainway.

Flags: needinfo?(yulian)
Blocks: media-triage
Whiteboard: [needinfo jya]

Hey folks --

To answer the question: Firefox is still broken when it comes to our technology.

Our company has evolved since this issue was opened and we are no longer just a consumer game streaming service. Now, we license our technology to other teams and allow them to build services that leverage our app streaming technology. So in a sense this issue is now more likely to impact users of Firefox and force them to switch to another vendor. For example, Microsoft, which uses our SDK for Xbox Cloud Gaming, had to block Firefox due to the issues detailed in this bug report.

We would love to work with relevant folks on the Firefox team to try and find a solution to this problem; we're happy to set you up with developer accounts on our service so you can debug and test against our SDK, which should hopefully make finding a solution easier. Please reach out to me via email to set up some time: andrew[at]rainway.com

Flags: needinfo?(andrew)

(In reply to Yulian Kuncheff from comment #31)

So there are still considerable latency spikes during streaming using Rainway. I tried recording a new profile but Firefox kept failing saying the data was too large to serialize, even with only 5-10 seconds of recording.

Julien, do you have any suggestions for how Yulian can successfully create and share a profile?

Flags: needinfo?(felash)

Hey Yulian,

Thanks for trying :-) I'm not sure where the error comes from, but I'd ensure that you're using one of the default presets. In this case I'd suggest the "Media" preset.

My colleague Paul Adenot recorded a video that you can see in his blogpost at [1], that explains the recording flow from enabling the profiler to sharing the recorded profile. The interface slightly changed since then but the flow is pretty similar.

[1] https://blog.paul.cx/post/profiling-firefox-media-workloads/#the-media-preset

Hey Andrew,

As a first step, it would be good to have access to a public-facing URL that shows the issue. Is that possible, or is a local server always necessary?
If it's necessary, is it available for Windows only?

It would also help if you can look at recording a profile using the preset and flow mentioned above.

Thanks a lot to you both!

Flags: needinfo?(felash)
Performance: P2 → ---
Version: 66 Branch → Trunk

The media team is going to have a look and try to repro using existing instructions, no need to do anything for now. We'll update this if we can't reproduce, but we have a couple leads already. I don't think there is the need for a profile at this time. Thanks!

Performance: --- → P2

(In reply to Julien Wajsberg [:julienw] from comment #34)

Hey Yulian,

Thanks for trying :-) I'm not sure where the error comes from, but I'd ensure that you're using one of the default presets. In this case I'd suggest the "Media" preset.

My colleague Paul Adenot recorded a video that you can see in his blogpost at [1], that explains the recording flow from enabling the profiler to sharing the recorded profile. The interface slightly changed since then but the flow is pretty similar.

[1] https://blog.paul.cx/post/profiling-firefox-media-workloads/#the-media-preset

Hey Andrew,

As a first step, it would be good to have access to a public-facing URL that shows the issue. Is that possible, or is a local server always necessary?
If it's necessary, is it available for Windows only?

It would also help if you can look at recording a profile using the preset and flow mentioned above.

Thanks a lot to you both!

(In reply to Paul Adenot (:padenot) from comment #35)

The media team is going to have a look and try to repro using existing instructions, no need to do anything for now. We'll update this if we can't reproduce, but we have a couple leads already. I don't think there is the need for a profile at this time. Thanks!

As mentioned, don't hesitate to reach out if our team can be of assistance. We're happy to spin up some GPU servers close to folks on the media team so they can debug the stream and test without setting up a local environment.

Hey Matt, could you try to diagnose where the latency is here?

Flags: needinfo?(kinetik)
No longer blocks: media-triage