Closed Bug 1619882 Opened 4 years ago Closed 4 years ago

High CPU usage and Jittery/stuttering video playback when using VA-API

Categories

(Core :: Audio/Video: Playback, defect, P5)

Desktop
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox-esr68 --- unaffected
firefox73 --- unaffected
firefox74 --- unaffected
firefox75 --- disabled
firefox76 --- disabled
firefox77 --- disabled
firefox78 --- fixed

People

(Reporter: yoasif, Assigned: stransky)

References

(Blocks 2 open bugs)

Details

(Keywords: correctness, nightly-community)

Attachments

(9 files)

Steps to reproduce:

  1. Navigate to https://twitter.com/thehill/status/1235022399298400256
  2. Play video

See video for what it looks like.

See attached log from MOZ_LOG="PlatformDecoderModule:5"

I am running GNOME 3.35.91 on Ubuntu Focal Fossa.

Attached file about:support

I see the same behaviour with the same GPU. I noticed that in my case it does not happen if I fullscreen the video.

I can confirm this with GPU Intel 520, Mesa 19.3.4, Manjaro OS

It's a stretch, but is everyone by any chance using Gnome 3.36 or Gnome 3.34 on Arch Linux?

I noticed that the workaround in bug 1609538 (restoring/un-maximizing the window) temporarily fixes the jitter problem, and it didn't seem to happen under Sway.

I'm using Pop!_OS 19.10 with gnome 3.34

I also see this on Ubuntu 19.10 when there is a changing video overlay, like when the player controls appear or when scanning across the seekbar to see thumbnails. If there is no overlay the video plays perfectly.

Arch Linux latest, Gnome Shell 3.36.0, Intel graphics from Skylake i7-6700HQ, Nightly 75.0a1 (2020-03-09) (64-bit): Can confirm that jittery video only occurs while an overlay is displayed and goes away (I think) immediately when the overlay is gone.

Bug 1619258 may help here, as it exports a flag indicating that HW playback is performed and that the compositor should treat it differently.

I also have this issue. Ubuntu 19.10 on Wayland, Skylake HD 515 graphics, GNOME 3.34.2. Not sure which logs, if any, would be helpful.

When I am watching a video in full screen, the flickering only happens when I hover over the controls/on-screen display. I notice it is a lot worse on twitch.tv.

I also see this on Ubuntu 19.10 when there is a changing video overlay, like when the player controls appear or when scanning across the seekbar to see thumbnails. If there is no overlay the video plays perfectly.

For me this issue is reproducible even when the YouTube controls are hidden.

Intel Core i3-6100U with Intel HD 520. Ubuntu 20.04 with Mesa 20.0.4 and Gnome Shell 3.36.0.

By the way, when using picture-in-picture, glitchy videos play normally once the video is floating by itself.

(In reply to Brandon Watkins from comment #21)

In 77 nightly, I no longer see this issue at all on youtube, however, I still see the issue on twitch.tv (but only when mousing over the video to bring up the controls)

Yeah, this is a bit odd. I noticed that when set to auto quality the video was jittery, but when I picked 1080p60 FPS (source) the jittering disappeared.
BUT the "workaround" is not that simple... It seems to depend on the stream a bit: Lirik's stream worked flawlessly on source quality, while Dansgaming's stream had some jitter.

Some followup... strangely, it sometimes seems to work for a while, then it starts to lag or jitter. I haven't really found any obvious reason for it.
When it lags it behaves kind of like a slow CPU doing software decoding. I increased EXTRA_HW_FRAMES to 12 instead of 6; it made no difference from what I could tell. I also tried 0, no difference.
For me, using Nightly and KDE Wayland, the video seems to jitter or lag regardless of whether the video overlay controls are shown or not.
Same behavior on two different distributions and laptops.

Same issue here but no difference between YouTube or not.
@Pedro, maybe you are using VP9 on YT and your hardware doesn't support decoding it?

The issue is present on both F76 release and nightly (today's version).
I have a ThinkPad T440 with i5-4200U on Archlinux.
No specific configuration other than text scaling factor in Gnome Tweaks (set to 1.25), which could be the cause.

I don't know which information could be interesting to help solve this bug.

I did a test with RX 5700 XT under stock Ubuntu 20.04 with Firefox 76 and results are stunning, no stuttering at all with 4k 60 fps videos. Doesn't matter if fullscreen or not. Tried same video with mpv and again no stutters.
```
mpv https://www.youtube.com/watch?v=LXb3EKWsInQ
 (+) Video --vid=1 () (vp9 3840x2160 59.940fps)
 (+) Audio --aid=1 --alang=eng () (opus 2ch 48000Hz)
[vo/gpu/wayland] GNOME's wayland compositor is known to have many serious issues with mpv. Switch to GNOME's xorg session for the best experience.
mesa: for the --simplifycfg-sink-common option: may only occur zero or one times!
mesa: for the --global-isel-abort option: may only occur zero or one times!
AO: [pulse] 48000Hz stereo 2ch float
Using hardware decoding (vaapi).
VO: [gpu] 3840x2160 vaapi[p010]
AV: 00:00:18 / 00:05:13 (6%) A-V: 0.000 Cache: 42s/150MB
```

But this is not the case on an Intel laptop with an i7-7500U. Video stutters in Firefox, while mpv only has some dropped frames (video is much smoother in it):
```
mpv https://www.youtube.com/watch?v=LXb3EKWsInQ
 (+) Video --vid=1 () (vp9 3840x2160 59.940fps)
 (+) Audio --aid=1 --alang=eng () (opus 2ch 48000Hz)
[vo/gpu/wayland] GNOME's wayland compositor is known to have many serious issues with mpv. Switch to GNOME's xorg session for the best experience.
AO: [pulse] 48000Hz stereo 2ch float
Using hardware decoding (vaapi).
VO: [gpu] 3840x2160 vaapi[p010]
AV: 00:00:20 / 00:05:13 (6%) A-V: 0.000 Dropped: 9 Cache: 42s/150MB
```
There are some tickets also mentioning this:
https://gitlab.freedesktop.org/xorg/xserver/-/issues/928
https://gitlab.freedesktop.org/drm/intel/-/issues/629

Mmm, what I think is weird is that it also sometimes works, at least for a while.
I have to mention that Chromium's "hacked in" VA-API support works fine all the time, at least under Xorg, no stuttering whatsoever.

There are so many potential sources of errors here, Intel Wayland drivers, Kernel drivers, Wayland itself, Firefox's implementation... etc.

Also, when playing a 1080p h264 video in VLC (Wayland) the CPU usage is very low... so maybe it isn't an intel video driver issue after all.

I never use 4k video (as I have a 1080p display), but I can definitely say that I never have issues with vaapi playback in mpv

Also, interestingly I've found that in firefox, I have no jittering at all on netflix with vaapi enabled, even when mousing over the controls etc...

(In reply to Brandon Watkins from comment #38)

Also, interestingly I've found that in firefox, I have no jittering at all on netflix with vaapi enabled, even when mousing over the controls etc...

The video stream on Netflix and other paid services is decoded by widevine, there's no hardware decoding available on Linux for these services.

(In reply to kurmikon from comment #39)

(In reply to Brandon Watkins from comment #38)

Also, interestingly I've found that in firefox, I have no jittering at all on netflix with vaapi enabled, even when mousing over the controls etc...

The video stream on Netflix and other paid services is decoded by widevine, there's no hardware decoding available on Linux for these services.

Widevine is just a drm service, I don't think that it has anything to do with either software/hardware decoding on any platform, APIs like vaapi, dxva and others take care of that... or am I wrong here?

@Gnomalex, my Intel Comet Lake can decode VP9. I had to enable it by disabling media.ffvpx.enabled; I see it is loaded in the logs when running MOZ_ENABLE_WAYLAND=1 firefox-trunk.

As @rickard says, it sometimes works normally for a while. In the latest nightlies you see less flickering, but it is still present.
Chromium VA-API and mpv VA-API work very well.

I also see the issue, with these observations:

  • Happens when video is played online from youtube for instance. When it's loaded from disk it works ok.
  • Tends to break when mouse is moved across the video and video control is rendered.
  • It's broken in optimized builds, works ok in debug builds.

All those symptoms look like the vaapi frame holder (Bug 1616892) releases the HW surfaces too early, and they're reused while they're still in the FF rendering queue.
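The suspected failure mode (a HW surface handed back to the decoder while the compositor still references it) can be sketched with shared ownership. The names below are hypothetical models, not the actual Gecko classes:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical model of a hardware surface pool: a surface may only be
// handed back to the decoder when nothing else holds a reference to it.
struct HwSurface {
  int id;
};

class SurfacePool {
 public:
  explicit SurfacePool(int aCount) {
    for (int i = 0; i < aCount; i++) {
      mSurfaces.push_back(std::make_shared<HwSurface>(HwSurface{i}));
    }
  }
  // Hand out a surface that is not referenced outside the pool.
  // Returning nullptr forces the decoder to wait instead of scribbling
  // over a frame that is still queued for compositing.
  std::shared_ptr<HwSurface> Acquire() {
    for (auto& s : mSurfaces) {
      if (s.use_count() == 1) {  // only the pool itself holds it
        return s;
      }
    }
    return nullptr;
  }
 private:
  std::vector<std::shared_ptr<HwSurface>> mSurfaces;
};
```

The bug corresponds to dropping the reference while the frame is still in the rendering queue: the decoder then reacquires and overwrites the surface, and frames appear out of order.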

Assignee: nobody → stransky

Also it affects both vp9 and h264 codecs.

(In reply to Martin Stránský [:stransky] from comment #42)

All those symptoms look like the vaapi frame holder (Bug 1616892) releases the HW surfaces too early, and they're reused while they're still in the FF rendering queue.

It doesn't seem to be related; when vaapi surfaces are wrongly reused, various color artifacts are shown as the YUV planes are mixed together.
This one looks like some problem with the texture cache/timing or so, where the whole frame is held. It is also reproducible with both the GL compositor and WebRender.

Sotaro, do you know if any special flags / config is used for texture clients from hw decoders to upload the textures asap or so?
Thanks.

Flags: needinfo?(sotaro.ikeda.g)
Attached file 4k_costa_rica_error

(In reply to Martin Stránský [:stransky] from comment #42)

I also see the issue, with these observations:

  • Happens when video is played online from youtube for instance. When it's loaded from disk it works ok.

Just for information, I can confirm this too.

Not all videos can be played from disk, attached 4k_costa_rica_error log with this video https://www.youtube.com/watch?v=LXb3EKWsInQ downloaded with youtube-dl.

(In reply to Martin Stránský [:stransky] from comment #44)

Sotaro, do you know if any special flags / config is used for texture clients from hw decoders to upload the textures asap or so?
Thanks.

Sorry, I do not know about it :( But before doing rendering, the texture needs to be ready. Is WaylandDMABufSurface::FenceWait() provided for it? At RenderDXGITextureHostOGL, the wait is done with IDXGIKeyedMutex on Windows. On Android, a Fence is used for it.

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Sotaro Ikeda [:sotaro] from comment #47)

(In reply to Martin Stránský [:stransky] from comment #44)

Sotaro, do you know if any special flags / config is used for texture clients from hw decoders to upload the textures asap or so?
Thanks.

Sorry, I do not know about it :( But before doing rendering, the texture needs to be ready. Is WaylandDMABufSurface::FenceWait() provided for it? At RenderDXGITextureHostOGL, the wait is done with IDXGIKeyedMutex on Windows. On Android, a Fence is used for it.

Thanks, that may be the point. WaylandDMABufSurface::FenceWait() is not implemented.

Actually WaylandDMABufSurface::FenceWait() is implemented. I'll check how it's used.
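The role of the fence Sotaro describes can be illustrated with a CPU-side analogue (a sketch only; the real WaylandDMABufSurface::FenceWait() synchronizes against the GPU, not another thread):

```cpp
#include <cassert>
#include <future>
#include <thread>

// CPU-side analogue of a fence: the consumer (compositor) must not
// sample the surface until the producer (decoder) signals that the
// write has finished.
int DecodeThenSample() {
  int surface = 0;                 // stands in for the dmabuf contents
  std::promise<void> fence;
  std::future<void> waitable = fence.get_future();

  std::thread decoder([&] {
    surface = 42;                  // decoder finishes writing the frame
    fence.set_value();             // fence signaled: contents are valid
  });

  waitable.wait();                 // FenceWait(): block until signaled
  int sampled = surface;           // now safe to read the texture
  decoder.join();
  return sampled;
}
```

Without the wait, the consumer could read a half-written frame; the promise/future pair establishes the same happens-before ordering a GPU fence would.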

(In reply to Eduard Kachur from comment #46)

(In reply to Martin Stránský [:stransky] from comment #42)

I also see the issue, with these observations:

  • Happens when video is played online from youtube for instance. When it's loaded from disk it works ok.

Just for information, I can confirm this too.

Not all videos can be played from disk, attached 4k_costa_rica_error log with this video https://www.youtube.com/watch?v=LXb3EKWsInQ downloaded with youtube-dl.

Eduard, this is Bug 1632456 (a kind of out-of-GPU-memory bug).

Looks like there's also expensive CPU usage when the controls (on youtube for instance) are drawn. When I play video without the controls I see ~8% CPU usage, but with controls I see ~14%. I wonder what causes that; maybe the video is rendered to some temporary buffer or so.

Summary: Jittery/stuttering video playback when using VA-API → High CPU usage and Jittery/stuttering video playback when using VA-API and video control is shown

(In reply to Martin Stránský [:stransky] from comment #50)

Looks like there's also expensive CPU usage when the controls (on youtube for instance) are drawn. When I play video without the controls I see ~8% CPU usage, but with controls I see ~14%. I wonder what causes that; maybe the video is rendered to some temporary buffer or so.

Is bug 1617498 needed to get the improvements of bug 1579235?

This bug affects GL compositor as well.

Gnome Wayland, Debian Testing, Intel Iris 6100 (Broadwell GT3), 2560x1600
My STR for intermittently stuttering video: Repeatedly hover the progress bar to make stuttering more likely.
$ pip3 install --upgrade mozregression
$ MOZ_ENABLE_WAYLAND=1 MOZ_LOG="PlatformDecoderModule:5" mozregression --launch 2020-05-13 --pref gfx.webrender.all:true media.webm.enabled:false widget.wayland-dmabuf-vaapi.enabled:true security.sandbox.content.level:0 --process-output stdout -a https://www.youtube.com/embed/PvKiWRTSAzg

Disabling gfx.webrender.picture-caching does not further reduce the already low CPU usage of 4-8%, and it does not fix the problem.
It can stutter regardless of video resolution or video control visibility.
I have watched 1080p in a loop: "Double right click > Full screen" does not have video controls, but stutters as well if my cursor is not yet hidden.

Glenn, do you have any idea why there's such overhead when the video is played with controls? It happens with both the GL and WebRender compositors.
Thanks.

Flags: needinfo?(gwatson)

When SW video decoding with HW video playback is used, I see the CPU usage go up, but I don't see any jittering when the controls are shown.

When there are overlay controls on a video with WebRender enabled, we do draw the tiles that intersect the video in a (slightly) less efficient way than when the video has nothing on top of it.

The picture cache tiles that intersect the video (if it was promoted to a compositor surface) are switched from opaque blend to alpha blend and drawn on top of the video surface, with an alpha cutout. This doesn't add much overhead, but it is a difference when controls are present.

However, this has no effect on the GL compositor, so it seems unlikely to be the cause here (unless there are two separate causes with similar symptoms).

I suspect in the GL compositor mode, the layer config looks quite different, but I don't know much about how that code works, sorry.

Flags: needinfo?(gwatson)

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #51)

(In reply to Martin Stránský [:stransky] from comment #50)

Looks like there's also expensive CPU usage when the controls (on youtube for instance) are drawn. When I play video without the controls I see ~8% CPU usage, but with controls I see ~14%. I wonder what causes that; maybe the video is rendered to some temporary buffer or so.

Is bug 1617498 needed to get the improvements of bug 1579235?

No, we shouldn't need that bug implemented - the improvements in 1579235 should allow the video to be composited directly into the main framebuffer without being drawn to the picture tiles.

We should add some debug information to the picture caching debug overlay about compositor surfaces.

In the interim, it should be possible to test that the video is being drawn as a compositor surface with WebRender enabled by enabling the picture cache debug overlay (gfx.webrender.debug.picture-caching) - what we would hope is that the picture cache tiles under / over the video are staying green and not constantly invalidating.

I wonder if there are any tools we can use on Linux to see exactly what hardware packets the GPU is scheduling / working on across processes, similar to how GPUView works on Windows. I wonder if the Intel GPA tools allow anything like that? Or maybe the nvidia nsight tools if running the proprietary driver?

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #53)

Gnome Wayland, Debian Testing, Intel Iris 6100 (Broadwell GT3), 2560x1600
My STR for intermittently stuttering video: Repeatedly hover the progress bar to make stuttering more likely.
$ pip3 install --upgrade mozregression
$ MOZ_ENABLE_WAYLAND=1 MOZ_LOG="PlatformDecoderModule:5" mozregression --launch 2020-05-13 --pref gfx.webrender.all:true media.webm.enabled:false widget.wayland-dmabuf-vaapi.enabled:true security.sandbox.content.level:0 --process-output stdout -a https://www.youtube.com/embed/PvKiWRTSAzg

Top row is yellow, middle row is green, bottom is mostly yellow, but the progress bar and preview thumbnail are of course red. In rare cases the whole content is red for a short moment, but the video is smooth then. When the video stutters, the middle-row tiles are still green. The stuttering looks as if frames were shown in the wrong order for a second. The YouTube video stats say no frames were dropped. The WR slow-frame indicator never moves. The only thing I noticed was that Content Send Time is often briefly red around the time stuttering occurs.
https://perfht.ml/2LrYxk3 From hovering profiler screenshots it seems the most severe stuttering is between 7s and 11s.

You see these 3 red markers in "Web Content". Hover over the screenshots; it seems to be exactly the moment it stutters.

Interesting find!

It does seem to correlate with the skips. I don't know too much about the web content process, but if I expand the profile stacks, it looks like there may be some JS running that takes a really long time to complete? Markus might be able to help identify what's happening here, or point us to someone who would know more about that.

Flags: needinfo?(mstange)

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #59)

https://perfht.ml/2LrYxk3 From hovering profiler screenshots it seems the most severe stuttering is between 7s and 11s.

Jan, Thanks a lot for the profile data!

I went through it and compared the pictures, and it looks like the video images are drawn in the wrong order. That points back to the vaapi decoder, as it's reproducible with vaapi only.

There's a great sample between 7.97 s and 7.99 s where the alternation is clearly visible and there isn't much other activity there.

It looks to me that this affects h264 streams only. I can't reproduce it with vp9 HW vaapi streams... can anyone confirm that?

(In reply to Martin Stránský [:stransky] from comment #63)

It looks to me that this affects h264 streams only. I can't reproduce it with vp9 HW vaapi streams... can anyone confirm that?

No, it affects vp9 too, just to a lesser extent. For me it affects vp9 only when the controls are enabled, and the frame skipping/bumped CPU usage is slight compared to H264, but it is there.

Is the problem caused by JS, content process priority, audio, animations or some upload of other textures? It could be interesting to test a try build of bug 1595994. (It looks like it was postponed to improve Jitsi WebRTC compatibility first?)
In comparison, playing https://4kmedia.org/lg-landscape-uhd-4k-demo/ (H264) from file:// is totally fine: https://perfht.ml/2T7Ve5P
VLC (DRM VAAPI) stutters sometimes, but Nightly is buttery smooth the whole video long.

I tested a patch where only decoding is done by VAAPI and then the decoded frames are downloaded/copied as with the SW path, and I see the jittering too.
So the difference is only vaapi decoding, not rendering.

When media is played over youtube, only media fragments are downloaded (not the whole video at once, AFAIK), which may cause the issue. Maybe vaapi needs more time ahead for decoding or so?

Jean, do you have any idea here? Can the media decoding process be measured somehow?
Thanks.

Flags: needinfo?(jyavenard)

There seems to be some variation in symptoms reported here, so I'll report mine to make sure we're all talking about the same bug. My APU is a Pentium G4560, which has 2 cores/4 threads, and the GPU includes support for VP9. I use GNOME on Arch Linux. On most sites, including twitch, but excluding youtube, I experience the jittering, whether or not the video is fullscreen or has controls overlaid. Switching between fullscreen, maximised etc doesn't fix it. I haven't really checked CPU usage, except to confirm there was a big improvement compared to software decoding the first time I tried it on youtube.

Didn't some other people here also report that VP9 was affected, but not H264? That's probably why I'm not seeing this problem on youtube. I'm fairly sure my PC is using VAAPI for VP9 though, so software rendering isn't the reason that's working properly here.

I tested vp9 but I can't say I noticed anything wrong with it. I didn't play it for very long tho'. The CPU usage was higher without MOZ_ENABLE_WAYLAND=1, but it wasn't drastically lowered with it enabled, which almost made me wonder if HW decoding was doing its thing. Is there any way to check that HW decoding is running on the currently played media? Something like Chrome's media-internals.

(In reply to rickard from comment #68)

I tested vp9 but I can't say I noticed anything wrong with it. I didn't play it for very long tho'. The CPU usage was higher without MOZ_ENABLE_WAYLAND=1, but it wasn't drastically lowered with it enabled, which almost made me wonder if HW decoding was doing its thing. Is there any way to check that HW decoding is running on the currently played media? Something like Chrome's media-internals.

Run on the command line with MOZ_LOG="PlatformDecoderModule:5" and examine the log. It should contain info about vaapi usage. The CPU savings aren't so big when you have a fairly powerful CPU and/or play low-res clips.

(In reply to Martin Stránský [:stransky] from comment #69)

Run on the command line with MOZ_LOG="PlatformDecoderModule:5" and examine the log. It should contain info about vaapi usage. The CPU savings aren't so big when you have a fairly powerful CPU and/or play low-res clips.

Mmm, for h264 I see VA-API is being used, but I don't see anything about it when it plays vp9 content.
I noticed the hardware decode support is only partial on Broadwell CPUs; it seems like full HW decoding isn't supported until Apollo Lake and above. vainfo shows the vp9 profile tho', but it might be a bit misleading.

Attached video youtube.mp4

(In reply to Martin Stránský [:stransky] from comment #66)

When media is played over youtube, only media fragments are downloaded (not the whole video at once, AFAIK), which may cause the issue. Maybe vaapi needs more time ahead for decoding or so?

I can reproduce this problem even with a downloaded YouTube video. I switched various prefs and it didn't help.

$ MOZ_ENABLE_WAYLAND=1 MOZ_LOG="PlatformDecoderModule:5" mozregression --launch 2020-05-13 --pref gfx.webrender.all:true media.webm.enabled:false widget.wayland-dmabuf-vaapi.enabled:true security.sandbox.content.level:0 media.mediasource.enabled:false layers.async-pan-zoom.enabled:false javascript.enabled:false widget.wayland-smooth-rendering:true widget.wayland-dmabuf-textures.enabled:true media.decoder.recycle.enabled:true media.decoder.skip-to-next-key-frame.enabled:false media.ffvpx.enabled:false media.media-capabilities.enabled:false media.memory_cache_max_size:65536 media.navigator.enabled:false media.peerconnection.enabled:false media.rdd-opus.enabled:false media.seekToNextFrame.enabled:false media.video-queue.default-size:50 media.video-queue.send-to-compositor-size:99999 media.videocontrols.picture-in-picture.enabled:false svg.disabled:true plugin.disable:true layout.interruptible-reflow.enabled:false layout.display-list.retain:false layout.display-list.retain.chrome:false layout.css.scroll-anchoring.enabled:false layers.omtp.enabled:false image.mem.shared:false gfx.webrender.enable-item-cache:false dom.animations.offscreen-throttling:false canvas.path.enabled:false canvas.focusring.enabled:false canvas.filters.enabled:false canvas.capturestream.enabled:false privacy.resistFingerprinting:true --process-output stdout -a file:///home/darkspirit/Downloads/youtube.mp4

It is more likely to happen if I set the play speed to 2x and continuously show and hide the progress bar by moving the mouse up and down.

(In reply to Marko from comment #40)

Widevine is just a drm service, I don't think that it has anything to do with either software/hardware decoding on any platform, APIs like vaapi, dxva and others take care of that... or am I wrong here?

Widevine does the video and audio decoding as well as decryption.

(In reply to Martin Stránský [:stransky] from comment #66)

I tested a patch where only decoding is done by VAAPI and then the decoded frames are downloaded/copied as with the SW path, and I see the jittering too.
So the difference is only vaapi decoding, not rendering.

When media is played over youtube, only media fragments are downloaded (not the whole video at once, AFAIK), which may cause the issue. Maybe vaapi needs more time ahead for decoding or so?

Jean, do you have any idea here? Can the media decoding process be measured somehow?
Thanks.

Decoding is currently done in the content process. Is that what you mean by process?

We used to have a tool to measure decoding speed, but it wasn't ported to the new MozPromise based API.

If the MediaDataDecoder::IsHardwareAccelerated returns true, then the MediaDecoderStateMachine will keep 3 frames decoded ahead of the current time. If it returns false, 10 frames will be kept in the queue ahead of current time.
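The queue-sizing rule just described can be stated as a tiny helper (the name is illustrative, not the actual Gecko identifier):

```cpp
#include <cassert>

// Frames kept decoded ahead of the current time, per the comment above:
// hardware decoding is assumed fast enough that a short queue suffices,
// while software decoding gets a deeper buffer to absorb slow frames.
constexpr int AmpleDecodedFrames(bool aIsHardwareAccelerated) {
  return aIsHardwareAccelerated ? 3 : 10;
}
```

The shallower hardware queue also means fewer surfaces are in flight at once, which matters when the decoder's surface pool is small.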

I doubt that being YouTube with MSE would lead to increased CPU usage over playing a file from disk. Quite the opposite: the demuxing process is much simpler with MSE, as we get pre-demuxed content.

Do you have a video of what the jittering looks like? Does it stutter, or does it seem that some frames are painted out of order?

If the latter, it could be the same issue we had on macOS when using a global IOSurface.

The decoding is performed in the content process, into a global GPU surface that is then transferred across to the GPU process for rendering.

At one point in time, when the IOSurface was sent across IPC, the refcount was decreased, which caused the decoder to re-use the surface too early. So when rendered, it looked like they were out of order.

Ok, I just looked at the video, and that certainly looks like out of order frames to me, and quite identical to the bug we had with IOSurface.

How is the VAAPI image telling the decoder that it's okay to re-use it once composited?

Flags: needinfo?(jyavenard)

(In reply to Jean-Yves Avenard [:jya] from comment #74)
[...]

Thanks a lot Jean! Yes, the issue is out of order frames.

At one point in time, when the IOSurface was sent across IPC, the refcount was decreased, which caused the decoder to re-use the surface too early. So when rendered, it looked like they were out of order.

So I should not see the bug when e10s is disabled, right?

How is the VAAPI image telling the decoder that it's okay to re-use it once composited?

We allocate the WaylandDMABUFSurfaceImage here:

https://searchfox.org/mozilla-central/rev/09b8072a543c145de2dc9bb76eddddd4a6c09adc/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp#676

and we use the WaylandDMABUFSurfaceImage destructor to tell the va-api decoder it's ok to reuse the surface:

https://searchfox.org/mozilla-central/rev/09b8072a543c145de2dc9bb76eddddd4a6c09adc/gfx/layers/WaylandDMABUFSurfaceImage.h#35

so when the Image we create at the decoder is released, we also release the va-api surface reference and it's reused by the va-api decoder.
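That destructor-driven release is an RAII pattern; a minimal sketch, with a hypothetical callback standing in for the actual va-api release call:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Models WaylandDMABUFSurfaceImage: holds a hardware surface id and,
// on destruction, notifies the decoder that the surface may be reused.
class SurfaceImage {
 public:
  SurfaceImage(int aSurfaceId, std::function<void(int)> aOnRelease)
      : mSurfaceId(aSurfaceId), mOnRelease(std::move(aOnRelease)) {}
  ~SurfaceImage() {
    if (mOnRelease) {
      mOnRelease(mSurfaceId);  // surface handed back to the decoder
    }
  }
  SurfaceImage(const SurfaceImage&) = delete;
  SurfaceImage& operator=(const SurfaceImage&) = delete;

 private:
  int mSurfaceId;
  std::function<void(int)> mOnRelease;
};
```

The hazard in this bug: the destructor runs as soon as the content-process Image is dropped, even if the compositor side is still sampling the surface.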

Flags: needinfo?(jyavenard)

Yes, it's definitely a use-after-free scenario; here's a log from my debug build where the surface UID is used across the processes. It's 100% reproducible when CPU usage is at 100% and a video with an overlay is played (I use a Firefox compilation as the CPU load :-)). When the video overlay is not used, vaapi playback is ok even when 100% CPU is used.

Created dmabuf UID = 63 HW surface 4000014
Creating TextureClient(), UID = 63
Create WaylandDMABUFTextureData UID = 63
Creating WaylandDMABUFTextureHostOGL() UID = 63
WaylandDMABUFTextureHostOGL::CreateRenderTexture() UID = 63
Created RenderWaylandDMABUFTextureHostOGL() UID = 63
WaylandDMABUFTextureHostOGL::PushResourceUpdates() UID = 63
RenderWaylandDMABUFTextureHostOGL::Lock() UID = 63

Releasing WaylandDMABUFSurfaceImage(), UID = 63 <<< released HW surface

WaylandDMABUFTextureData::Forget UID = 63
WaylandDMABufSurfaceNV12 Release UID = 63
RenderWaylandDMABUFTextureHostOGL::Unlock() UID = 63
RenderWaylandDMABUFTextureHostOGL::Lock() UID = 63
RenderWaylandDMABUFTextureHostOGL::Unlock() UID = 63

Flags: needinfo?(mstange)

(In reply to Martin Stránský [:stransky] from comment #75)

(In reply to Jean-Yves Avenard [:jya] from comment #74)
[...]

Thanks a lot Jean! Yes, the issue is out of order frames.

At one point in time, when the IOSurface was sent across IPC, the refcount was decreased, which caused the decoder to re-use the surface too early. So when rendered, it looked like they were out of order.

So I should not see the bug when e10s is disabled, right?

correct

https://searchfox.org/mozilla-central/rev/09b8072a543c145de2dc9bb76eddddd4a6c09adc/gfx/layers/WaylandDMABUFSurfaceImage.h#35

so when the Image we create at the decoder is released, we also release the va-api surface reference and it's reused by the va-api decoder.

The image is allocated in the content process and composited in the GPU process.
How is the ownership being carried across processes?

The example with IOSurface I gave earlier isn't relevant here, because we don't yet have a GPU process on mac, so the situation is easier to manage.

I'm not familiar with WaylandDMABUFSurfaceImage, but ownership of the VAAPI buffer must be kept by the IPC serialiser and carried across to the GPU process.

It doesn't look like that's happening, from what you describe.

Flags: needinfo?(jyavenard)

BTW, this was precisely the question I asked on your original bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1616185#c13

I should have worded it a bit more clearly at the time

The image is allocated in the content process and composited in the GPU process.

Wayland doesn't have a GPU process yet (bug 1481405 comment 15), but X11 does, so it would be good to account for both.

Jean, what we need here is to release the client after the host. Do you think TextureFlags::WAIT_HOST_USAGE_END can solve that?

Flags: needinfo?(jyavenard)

(In reply to Jean-Yves Avenard [:jya] from comment #77)

I'm not familiar with WaylandDMABUFSurfaceImage, but ownership of the VAAPI buffer must be kept by the IPC serialiser and carried across to the GPU process.

It doesn't look like that's happening, from what you describe.

Yes. The situation here is that we can't transfer the ownership of the VAAPI buffer across Firefox processes, as we don't own it. The VAAPI buffer is owned by the ffmpeg library and must stay locked while it's used by any part of gecko. So we need to make sure it's released (unlocked) only when all operations on it are finished.

I may need to implement a kind of IPC monitor which can be serialized and passed as a part of the dmabuf surface, and release the VAAPI buffer only when it's truly unused.

(In reply to Martin Stránský [:stransky] from comment #81)

I may need to implement a kind of IPC monitor which can be serialized and passed as a part of the dmabuf surface, and release the VAAPI buffer only when it's truly unused.

There is a texture recycling concept in the codebase already, see TextureClientRecycleAllocator. It passes a message back when texture hosts are done with a texture. You may also need to change CreateImageVAAPI() to first allocate a recycled texture client, then wrap it with TextureWrapperImage, then make the VideoData from the TextureWrapperImage instance. I believe that overriding TextureClientRecycleAllocator::RecycleTextureClient will give you the point where the AVFrame is to be released.
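The recycling flow described above can be mocked abstractly (this is a hypothetical sketch, not the real TextureClientRecycleAllocator API): a texture returns to the free list, and the wrapped hardware frame is released, only once the host reports it unused.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Mock of a recycle allocator: textures are recycled only when the
// host side reports it no longer uses them (a NotifyNotUsed-style
// message), which is where a wrapped AVFrame could safely be freed.
class RecycleAllocator {
 public:
  using RecycleHook = std::function<void(int)>;
  explicit RecycleAllocator(RecycleHook aHook) : mHook(std::move(aHook)) {}

  int Allocate() {
    if (mFree.empty()) {
      return mNextId++;  // no recycled texture available: grow the pool
    }
    int id = mFree.front();
    mFree.pop();
    return id;
  }

  // Called when the host signals the texture is unused.
  void NotifyNotUsed(int aTextureId) {
    if (mHook) {
      mHook(aTextureId);  // e.g. release the wrapped hardware frame
    }
    mFree.push(aTextureId);
  }

 private:
  std::queue<int> mFree;
  int mNextId = 0;
  RecycleHook mHook;
};
```

The key property is ordering: a texture id never reappears from Allocate() until NotifyNotUsed() ran for it, so the decoder cannot overwrite a frame the host still reads.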

Summary: High CPU usage and Jittery/stuttering video playback when using VA-API and video control is shown → High CPU usage and Jittery/stuttering video playback when using VA-API

Bug 1539735 would provide a solution to this problem, but it's unlikely to be done anytime soon.

Flags: needinfo?(jyavenard)
See Also: → 1539735

(In reply to Martin Stránský [:stransky] from comment #80)

Jean, what we need here is to release client after host. Do you think TextureFlags::WAIT_HOST_USAGE_END can solve that?

I don't know.

Need to ask someone from the gfx team. :nical probably knows best about texture lifetime management.

Flags: needinfo?(nical.bugzilla)

Do you think TextureFlags::WAIT_HOST_USAGE_END can solve that?

This causes the client side to keep the texture data alive until the last reference to the texture is gone on the host side (at which point the host sends a NotifyNotUsed message which informs the client side that there are no more references to it on the other side).

The same system is used with the TextureClientRecycleAllocator mechanism mentioned by Rinat.

If the lifecycle of VAAPI frames works with the texture client recycling stuff, I would suggest using it so that platforms don't diverge too much. If it doesn't fit it would be worth figuring out if we can extend the system to make it work. Shared texture life cycle is quite complicated overall (especially when factoring in (unexpected) shutdown) so it would be nice if at some point it all works with a common system.

Flags: needinfo?(nical.bugzilla)

Thanks Nicolas, I'll investigate the WAIT_HOST_USAGE_END path.

After some investigation I ended up with a global mutex similar to IDXGIKeyedMutex (Bug 1066312) which is part of the dmabuf surface and keeps a global ref count. It's based on shm memory (opened by shm_open()) and Linux semaphores (the sem_* functions).

Unfortunately shm_open() fails in the content process with:

Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=mozilla-shared-E2xfeo for pid=37667
Sandbox: Failed errno -13 op stat flags 0400000 path mozilla-shared-E2xfeo

Jed, I expect it's ok to use shared memory in the content process, right? What's the best way to deal with it? I think ShmMem is a bit of overkill here, as I just need a bit of shm memory which is attached to the surface and shared across processes as a file descriptor with the other dmabuf data.
Thanks.

Flags: needinfo?(jld)
Depends on: 1639855

(In reply to Martin Stránský [:stransky] from comment #88)

Jed, I expect it's ok to use shared memory in the content process, right? What's the best way to deal with it?

In general, yes, but I didn't want a sandboxed process to be able to access shared memory belonging to other processes. The way this works is that base::SharedMemory constructs a prefix that's unique to that process (and conforms to name restrictions imposed by external sandboxes like Snap), which is used in the sandbox policy. Note that when memfd support is added to base::SharedMemory (bug 1440203), no access to /dev/shm will be allowed under the current policy if the kernel supports memfd.

Is it possible for you to use base::SharedMemory? Alternately, the CrossProcessSemaphore class sounds similar to what you've done and is used to manage graphics resources, but I don't know if it would be useful in this context.

Flags: needinfo?(jld)

Ehm... Why are shared memory and mutexes necessary here? There is already a message channel between processes. The act of passing a message is a synchronization itself. Adding another synchronization channel may even introduce deadlocks in unexpected places.

Here is a PoC patch that uses TextureFlags::WAIT_HOST_USAGE_END. Looks like a clear way to a solution to me. Am I missing something?

Flags: needinfo?(stransky)

(In reply to Rinat from comment #90)

Created attachment 9151351 [details] [diff] [review]
PoC-vaapi-frame-lifetime.patch

Ehm... Why are shared memory and mutexes necessary here? There is already a message channel between processes. The act of passing a message is a synchronization itself. Adding another synchronization channel may even introduce deadlocks in unexpected places.

Here is a PoC patch that uses TextureFlags::WAIT_HOST_USAGE_END. Looks like a clear way to a solution to me. Am I missing something?

Rinat, thanks for the patch. Did you test it?

When I tested WAIT_HOST_USAGE_END, the frame release was too slow, which led to OOM on the GPU side, and sometimes the frames were also released too early (before unlock) on the WebRender side, so the flickering was still there. But maybe I did that wrong.

The mutex I implemented here is very fast, which makes it possible to recycle the GPU frames ASAP. That's critical because the number of HW VAAPI surfaces is limited.

Flags: needinfo?(stransky)

(In reply to Martin Stránský [:stransky] from comment #91)

(In reply to Rinat from comment #90)

Created attachment 9151351 [details] [diff] [review]
PoC-vaapi-frame-lifetime.patch

Ehm... Why are shared memory and mutexes necessary here? There is already a message channel between processes. The act of passing a message is a synchronization itself. Adding another synchronization channel may even introduce deadlocks in unexpected places.

Here is a PoC patch that uses TextureFlags::WAIT_HOST_USAGE_END. Looks like a clear way to a solution to me. Am I missing something?

When I tested WAIT_HOST_USAGE_END, the frame release was too slow, which led to OOM on the GPU side, and sometimes the frames were also released too early (before unlock) on the WebRender side, so the flickering was still there. But maybe I did that wrong.

This is actually caused by a missing link between WaylandDMABUFSurfaceImage and the texture client. WaylandDMABUFSurfaceImage is released early, when VideoData is deleted, and the TextureClient is released later. So WAIT_HOST_USAGE_END does not solve the WaylandDMABUFSurfaceImage/TextureClient dependency here.

We can use eventfd() for the IPC dmabuf surface refcount instead of shm memory plus a semaphore, and it does not interfere with the Firefox sandbox.

This patch implements:

  • A global surface UID for better tracking/debugging. The UID is passed across all processes so we can track
    surface usage.
  • A WaylandAllocateShmMemory() function which uses shm_open() instead of files in /tmp for shm memory creation.
  • A DMABufRefcount() class based on eventfd() which is used as a global surface reference count.

VA-API surfaces are owned by the ffmpeg library and we need to keep them locked while any part of the rendering pipeline uses them.
We use a global lock provided by WaylandDMABUFSurface for that, keep an array of active surfaces at FFmpegVideoDecoder,
and release them only when all references are cleared or the FFmpegVideoDecoder is deleted.

Depends on D76689

VAAPIFrameHolder was moved to the FFmpegVideoDecoder class and we use the WaylandDMABufSurface global lock to keep
the surface referenced while it's in use.

Depends on D76690

(In reply to Martin Stránský [:stransky] from comment #91)

Did you test it?

I've only tested the patch on Weston. I haven't seen out-of-order frames on a test video with the patch, while without the patch the issue can be reproduced quite easily.

WAIT_HOST_USAGE_END the frame release was too slow and which led to OOM

WAIT_HOST_USAGE_END behavior is currently broken. It makes the TextureClient go to the waiting list, but doesn't change TextureHost behavior, and that's why NotifyNotUsed is never sent. TextureClients keep accumulating without any limit. This is why the OOM you saw occurs.
In the patch, the TextureHost code is modified so it sends the required NotifyNotUsed messages.

WaylandDMABUFSurfaceImage is released early, when VideoData is deleted, and the TextureClient is released later. So WAIT_HOST_USAGE_END does not solve the WaylandDMABUFSurfaceImage/TextureClient dependency here.

Yeah, and that's why the PoC patch passes responsibility for the VAAPIFrameHolder from WaylandDMABUFSurfaceImage to a newly-created TextureClient backed by WaylandDMABUFTextureData. Since WAIT_HOST_USAGE_END is a TextureClient feature, it only controls the TextureClient lifetime.

The mutex I implemented here is very fast which enables to recycle the GPU frames ASAP.

Yeah, it probably is fast. It just looks deadlock-prone to me, since I can't understand its safety easily. On the other hand, message passing looks simple.

As I thought about it more, I must admit that another locking mechanism may be necessary here. As far as I understand, the video decoder produces frames on its own, without limitations. It's fine if the renderer manages to consume all produced frames in time. But if it doesn't, the video decoder may run out of available hw frames to use. Then it may lock up, and even if the renderer process releases frames, the decoder process can stay locked for good. So some waitable locking may be required here. But it looks so wrong...

As :jya mentioned earlier, compositor-driven decoding would solve this in a clean way. Unfortunately nobody is working on it at the moment, apparently.

May or may not be relevant here but: TextureClient has a "ReadLock" mechanism based on a reference count written in a shmem that is not used by default but can be used when calling TextureClient::EnableReadLock https://searchfox.org/mozilla-central/source/gfx/layers/client/TextureClient.cpp#752

The main idea with this system is that the client side increments the read count before sending and the host side decrements it when nothing references the texture anymore. The client side can query whether the texture is available for reuse but cannot block on it. This all happens under the hood. It uses a single shmem to host the shared reference counts of all textures, as shmems can be a bit expensive. Doing this potentially allows reusing a texture earlier than if we had to wait for a message back from the compositor.

It is currently mostly used for tiled painted layers so it may or may not work out of the box for video frames but I expect it wouldn't be hard to make it work if it doesn't.

(In reply to Rinat from comment #98)

video decoder may run out of available hw frames to use. Then it may lock up and even if renderer process releases frames, decoder process can be locked for good.

Oh, it may be solved by copying frames. When a single hw frame is left in the reserve, code may switch to making copies of frames. This may degrade performance a bit, but will prevent deadlocks. This fallback will probably never run, but would be nice to have this corner covered.

(In reply to Rinat from comment #100)

(In reply to Rinat from comment #98)

video decoder may run out of available hw frames to use. Then it may lock up and even if renderer process releases frames, decoder process can be locked for good.

Oh, it may be solved by copying frames. When a single hw frame is left in the reserve, code may switch to making copies of frames. This may degrade performance a bit, but will prevent deadlocks. This fallback will probably never run, but would be nice to have this corner covered.

No, it won't lock in any case. If that happens you'll see "video is corrupted" message and you can reload the video.

I think the plan for video playback may be:

  • finish the vaapi playback (this bug)
  • implement EGLimage backend for SW decoding (Bug 1629788)
  • implement texture recycling for SW decoding

If texture recycling turns out to be more effective than the direct dmabuf surface tracking, we can remove the latter, but as implemented here it's the most efficient variant AFAIK. It uses a single 64-bit integer created by eventfd(), which is stored in the kernel and shared by file descriptor along with the other dmabuf surface info. It does not use shared memory/semaphores, just simple I/O.

(In reply to Rinat from comment #100)

(In reply to Rinat from comment #98)

video decoder may run out of available hw frames to use. Then it may lock up and even if renderer process releases frames, decoder process can be locked for good.

Oh, it may be solved by copying frames. When a single hw frame is left in the reserve, code may switch to making copies of frames. This may degrade performance a bit, but will prevent deadlocks. This fallback will probably never run, but would be nice to have this corner covered.

Btw., we can't copy frames in this case. When all hw frames are exhausted, the HW decoder just fails to decode any further frames. What can be done here is to postpone decoding until the frames are released. With the patch here we can do that, as we track the usage of all VA-API hw frames used by ffmpeg.

(In reply to Martin Stránský [:stransky] from comment #101)

No, it won't lock in any case. If that happens you'll see "video is corrupted" message and you can reload the video.

What I tried for the test was to remove mLib->av_buffer_unref(&mHWFrame); from VAAPIFrameHolder::~VAAPIFrameHolder. That makes Firefox's part of the decoder keep references to all hw frames. And when I try to play a video on such a build, it plays the first 32 frames, then shows a loading spinner animation and the text "No video with supported format and MIME type found". I can still interact with the video controls, but changing the position on the timeline does nothing. It doesn't offer to reload the video or anything else. The remainder of the page and the browser UI are fine though.
Perhaps we are talking about different aspects.

btw. we can't copy frames in this case. When all hw frames are exhausted the HW decoder just fails to decode any further frames.

But the number of frames held in VAAPIFrameHolder instances can be tracked. And you are explicitly asking FFmpeg to allocate a known number of additional frames. It's known when a VAAPIFrameHolder instance is created and when it's destroyed. So, it should be possible to track the number of frames left?

What can be done here is to postpone decoding until the frames are released. With the patch here we can do that as we track usage of all vappi hw frames used by ffmpeg.

OK.

Pushed by rgurzau@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/594ac84515dc
[Wayland] Implement dmabuf global ref count, r=jhorak
https://hg.mozilla.org/integration/autoland/rev/45a693471a1c
[Wayland] Use VAAPIFrameHolder array at FFmpegVideoDecoder to hold active vaapi hw video surfaces, r=jya
https://hg.mozilla.org/integration/autoland/rev/5c5cdf8961e6
[Wayland] Use WaylandDMABufSurface global lock instead of VAAPIFrameHolder at WaylandDMABUFSurfaceImage, r=sotaro

Seems to work now. Good job.

However, I don't think this is related to the patch, but if I open Konsole in KDE, have it on top of Firefox with an open tab with a paused video, and start typing in Konsole, the Konsole window partly disappears.
But maybe that's a different bug report we need to make.

(In reply to rickard from comment #108)

However, I don't think this is related to the patch, but if I open Konsole in KDE, have it on top of Firefox with an open tab with a paused video, and start typing in Konsole, the Konsole window partly disappears.
But maybe that's a different bug report we need to make.

KDE Plasma has had many issues with Firefox Wayland, mainly https://bugs.kde.org/show_bug.cgi?id=387313 which caused issues like https://bugs.kde.org/show_bug.cgi?id=419797.

That fix will be part of Plasma 5.19. If it still doesn't work there, you should post a new bug report in the wayland-generic component of the kwin product.

Hmm. I thought that the inter-process semaphore approach was chosen to prevent the decoder from blocking in the case where all VASurfaces are in use. But ReleaseUnusedVAAPIFrames() in FFmpegVideoDecoder just peeks at all elements in mFrameHolders and never waits. So all this IPC mechanism is just a re-implementation of the existing texture recycling framework. Is there a reason for such complications that I failed to see?

(In reply to Rinat from comment #110)

So all this IPC mechanism is just a re-implementation of the existing texture recycling framework. Is there a reason for such complications that I failed to see?

According to https://bugzilla.mozilla.org/show_bug.cgi?id=1619882#c86 :

Shared texture life cycle is quite complicated overall (especially when factoring in (unexpected) shutdown) so it would be nice if at some point it all works with a common system.

But if you make it work with the recycling I'm happy to remove the frame holder there. Right now I find the frame holder solution a simplification here.

(In reply to Martin Stránský [:stransky] from comment #111)

But if you make it work with the recycle I'm happy to remove the frame holder there. Right now I find the frameholder solution as simplification here.

Frameholder is necessary. I wasn't arguing about that. Instead I was talking about how frame holders are destroyed. Your current solution is to add a shared flag. So when the decoder issues a frame, you create a WaylandDMABUFSurfaceImage, which sets the flag and creates a TextureClient, which is serialized and sent to the compositor where it becomes a TextureHost (WaylandDMABUFTextureHostOGL) which creates a WaylandDMABufSurface. The TextureClient is destroyed at some point in time, which is not relevant. Then, when the WaylandDMABUFTextureHostOGL is no longer necessary, it is destroyed, which destroys the WaylandDMABufSurface, which resets the shared flag. When another frame is decoded in the decoder process, it checks all backlog frame holders for their corresponding flags and destroys those which have no flag set. So there are the message-channel IPC and an ad-hoc IPC channel, these flags.

There is another way to solve the issue: to use texture recycling. In that scenario WaylandDMABUFSurfaceImage passes frameholder responsibility to a TextureClient, which is created with WAIT_HOST_USAGE_END flag. It's then serialized and passed to the compositor, where it becomes a TextureHost. When TextureHost is no longer needed, a NotifyNotUsed message is sent back to the TextureClient side, where it causes the corresponding TextureClient to be destroyed, which destroys a frameholder. So, the existing message channel IPC is used. No additional channel is necessary. Here is how it looks as a patch: https://bugzilla.mozilla.org/attachment.cgi?id=9151351&action=diff.

Both approaches have no idea what to do if the compositor for some reason decides to hold TextureHosts longer than usual, but I believe WAIT_HOST_USAGE_END approach is clearer, since it already uses what's there in the code.

I probably should submit the part that fixes WAIT_HOST_USAGE_END as a separate bug though.

(In reply to Rinat from comment #112)

Frameholder is necessary. I wasn't arguing about that. Instead I was talking about how frame holders are destroyed. Your current solution is to add a shared flag. So when the decoder issues a frame, you create a WaylandDMABUFSurfaceImage, which sets the flag and creates a TextureClient, which is serialized and sent to the compositor where it becomes a TextureHost (WaylandDMABUFTextureHostOGL) which creates a WaylandDMABufSurface. The TextureClient is destroyed at some point in time, which is not relevant. Then, when the WaylandDMABUFTextureHostOGL is no longer necessary, it is destroyed, which destroys the WaylandDMABufSurface, which resets the shared flag. When another frame is decoded in the decoder process, it checks all backlog frame holders for their corresponding flags and destroys those which have no flag set. So there are the message-channel IPC and an ad-hoc IPC channel, these flags.

Yes, you're right except for the last point - Mozilla IPC is not involved in setting the "busy" flag. WaylandDMABufSurface uses a global lock based on eventfd. So when the WebRender TextureHost releases its WaylandDMABufSurface reference, there's no Mozilla IPC involved.

There is another way to solve the issue: to use texture recycling. In that scenario WaylandDMABUFSurfaceImage passes frameholder responsibility to a TextureClient, which is created with WAIT_HOST_USAGE_END flag. It's then serialized and passed to the compositor, where it becomes a TextureHost. When TextureHost is no longer needed, a NotifyNotUsed message is sent back to the TextureClient side, where it causes the corresponding TextureClient to be destroyed, which destroys a frameholder. So, the existing message channel IPC is used. No additional channel is necessary. Here is how it looks as a patch: https://bugzilla.mozilla.org/attachment.cgi?id=9151351&action=diff.

I'm a bit worried about IPC channel delays here. The current solution (the internal WaylandDMABufSurface global lock) does not involve any IPC communication, so the change is instant. But maybe I'm wrong and this is not an issue at all.

(In reply to Rinat from comment #112)

channel is necessary. Here is how it looks as a patch: https://bugzilla.mozilla.org/attachment.cgi?id=9151351&action=diff.

I feel the VAAPIFrameHolder should not be exported from the media playback scope.

OTOH there's already an effort to use the SW decoder with recycled HW textures (Bug 1629788), so it would be great to have one solution for both cases (vaapi+dmabuf, SW decode + EGLImages/dmabuf) where the textures are recycled in the same way, so I suspect we'd need something like you posted there.

Regressions: 1618914

Everything works, but it feels like Firefox slows down quite a bit if it's running more than one video stream, or even a single 60fps Twitch stream.
Chromium seems to run smoother than Firefox in this regard.

Can things be optimized perhaps?
