Closed Bug 1405589 Opened 7 years ago Closed 7 years ago

[OMTP] Very slow painting when scrolling fast on certain webpages

Categories

(Core :: Graphics: Layers, defect, P1)

58 Branch
All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- fixed

People

(Reporter: lh.bennett, Assigned: bas.schouten)

References

()

Details

(4 keywords, Whiteboard: [gfx-noted])

Attachments

(3 files)

I've seen very slow painting on some webpages with a reasonable amount of scrolling speed. Sometimes painting will not occur for a few seconds after either using the scroll wheel one section up or down or moving faster with the scrollbar.

I attached a page that shows this easier. I also attached a profile that shows an event processing delay toward the end of the timeline that records the delay I've seen.
Graphics

Features
Compositing	Direct3D 11 (Advanced Layers)
Asynchronous Pan/Zoom	wheel input enabled; scrollbar drag enabled; keyboard enabled; autoscroll enabled
WebGL 1 Driver WSI Info	EGL_VENDOR: Google Inc. (adapter LUID: 000000000000913c)
EGL_VERSION: 1.4 (ANGLE 2.1.0.dec065540d5f)
EGL_EXTENSIONS: EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture_nv12 EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource 
EGL_EXTENSIONS(nullptr): EGL_EXT_client_extensions EGL_EXT_platform_base EGL_EXT_platform_device EGL_ANGLE_platform_angle EGL_ANGLE_platform_angle_d3d EGL_ANGLE_device_creation EGL_ANGLE_device_creation_d3d11 EGL_ANGLE_experimental_present_path EGL_KHR_client_get_all_proc_addresses 
WebGL 1 Driver Renderer	Google Inc. -- ANGLE (AMD Radeon HD 5700 Series Direct3D11 vs_5_0 ps_5_0)
WebGL 1 Driver Version	OpenGL ES 2.0 (ANGLE 2.1.0.dec065540d5f)
WebGL 1 Driver Extensions	GL_ANGLE_depth_texture GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object 
WebGL 1 Extensions	ANGLE_instanced_arrays EXT_blend_minmax EXT_color_buffer_half_float EXT_frag_depth EXT_sRGB EXT_shader_texture_lod EXT_texture_filter_anisotropic EXT_disjoint_timer_query OES_element_index_uint OES_standard_derivatives OES_texture_float OES_texture_float_linear OES_texture_half_float OES_texture_half_float_linear OES_vertex_array_object WEBGL_color_buffer_float WEBGL_compressed_texture_s3tc WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_depth_texture WEBGL_draw_buffers WEBGL_lose_context
WebGL 2 Driver WSI Info	EGL_VENDOR: Google Inc. (adapter LUID: 000000000000913c)
EGL_VERSION: 1.4 (ANGLE 2.1.0.dec065540d5f)
EGL_EXTENSIONS: EGL_EXT_create_context_robustness EGL_ANGLE_d3d_share_handle_client_buffer EGL_ANGLE_d3d_texture_client_buffer EGL_ANGLE_surface_d3d_texture_2d_share_handle EGL_ANGLE_query_surface_pointer EGL_ANGLE_window_fixed_size EGL_ANGLE_keyed_mutex EGL_ANGLE_surface_orientation EGL_ANGLE_direct_composition EGL_NV_post_sub_buffer EGL_KHR_create_context EGL_EXT_device_query EGL_KHR_image EGL_KHR_image_base EGL_KHR_gl_texture_2D_image EGL_KHR_gl_texture_cubemap_image EGL_KHR_gl_renderbuffer_image EGL_KHR_get_all_proc_addresses EGL_KHR_stream EGL_KHR_stream_consumer_gltexture EGL_NV_stream_consumer_gltexture_yuv EGL_ANGLE_flexible_surface_compatibility EGL_ANGLE_stream_producer_d3d_texture_nv12 EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource 
EGL_EXTENSIONS(nullptr): EGL_EXT_client_extensions EGL_EXT_platform_base EGL_EXT_platform_device EGL_ANGLE_platform_angle EGL_ANGLE_platform_angle_d3d EGL_ANGLE_device_creation EGL_ANGLE_device_creation_d3d11 EGL_ANGLE_experimental_present_path EGL_KHR_client_get_all_proc_addresses 
WebGL 2 Driver Renderer	Google Inc. -- ANGLE (AMD Radeon HD 5700 Series Direct3D11 vs_5_0 ps_5_0)
WebGL 2 Driver Version	OpenGL ES 3.0 (ANGLE 2.1.0.dec065540d5f)
WebGL 2 Driver Extensions	GL_ANGLE_depth_texture GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_instanced_arrays GL_ANGLE_lossy_etc_decode GL_ANGLE_pack_reverse_row_order GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_usage GL_ANGLE_translated_shader_source GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_copy_compressed_texture GL_CHROMIUM_copy_texture GL_CHROMIUM_sync_query GL_EXT_blend_minmax GL_EXT_color_buffer_float GL_EXT_color_buffer_half_float GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_frag_depth GL_EXT_map_buffer_range GL_EXT_occlusion_query_boolean GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_shader_texture_lod GL_EXT_texture_compression_dxt1 GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_norm16 GL_EXT_texture_rg GL_EXT_texture_storage GL_EXT_unpack_subimage GL_KHR_debug GL_NV_EGL_stream_consumer_external GL_NV_fence GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_OES_EGL_image GL_OES_EGL_image_external GL_OES_EGL_image_external_essl3 GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth32 GL_OES_element_index_uint GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_vertex_array_object 
WebGL 2 Extensions	EXT_color_buffer_float EXT_texture_filter_anisotropic EXT_disjoint_timer_query OES_texture_float_linear WEBGL_compressed_texture_s3tc WEBGL_debug_renderer_info WEBGL_debug_shaders WEBGL_lose_context
Direct2D	true
Off Main Thread Painting Enabled	true
DirectWrite	true (10.0.15063.483)
GPU #1
Active	Yes
Description	AMD Radeon HD 5700 Series
Vendor ID	0x1002
Device ID	0x68be
Driver Version	15.301.1901.0
Driver Date	2-26-2016
Drivers	aticfx64 aticfx64 aticfx64 amdxc64 aticfx32 aticfx32 aticfx32 amdxc32 atiumd64 atidxx64 atidxx64 atiumdag atidxx32 atidxx32 atiumdva atiumd6a atitmm64
Subsys ID	29801682
RAM	1024
GPU #2
Active	No
Description	Intel(R) HD Graphics 4000
Vendor ID	0x8086
Device ID	0x0162
Driver Version	10.18.10.4653
Driver Date	4-7-2017
Drivers	igdumdim64 igd10iumd64 igd10iumd64 igdumdim32 igd10iumd32 igd10iumd32
Subsys ID	0000000c
RAM	Unknown
Diagnostics
ClearType Parameters	DISPLAY1 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 50 Enhanced Contrast: 100 ] DISPLAY2 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50 ] DISPLAY5 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50 ] 
AzureCanvasAccelerated	0
AzureCanvasBackend	Direct2D 1.1
AzureCanvasBackend (UI Process)	skia
AzureContentBackend	Direct2D 1.1
AzureContentBackend (UI Process)	skia
AzureFallbackCanvasBackend (UI Process)	cairo
GPUProcessPid	4912
GPUProcess	
Device Reset	
ClearType Parameters	DISPLAY1 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 50 Enhanced Contrast: 100 ] DISPLAY2 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50 ] DISPLAY5 [ Gamma: 2.2 Pixel Structure: RGB ClearType Level: 100 Enhanced Contrast: 50 ] 
Decision Log
WEBRENDER	
opt-in by default: WebRender is an opt-in feature
Blocks: 1403935
Keywords: perf, regression
I've been noticing this also with OMTP enabled. Even on my powerful desktop machine with an i7 6700k where I never had paint problems like that before.
Sounds similar to the issue I filed yesterday about Google Docs too (bug 1405536).
This is interesting. The main thread is definitely sitting there  waiting for the paint thread to finish. but in this profile I can't see what the paint thread is doing.. Markus, any way I can change that? The 1+ second wait seems excessive and I feel we may have either a scheduling issue or might not be properly doing the paint flushing, but it's impossible to tell for sure without seeing the paint thread.
Flags: needinfo?(mstange)
Flags: needinfo?(dvander)
I can't reproduce this locally, could you add ',PaintThread' to the list of threads in the profiler setting, and then re-capture a profile, that will allow me to see what's going on.
Flags: needinfo?(mstange) → needinfo?(lh.bennett)
Priority: -- → P1
Whiteboard: [gfx-noted]
Flags: needinfo?(lh.bennett)
Added new profile. I can reproduce this quite easily if using the scrollbar to go up and down the page quickly. Its more random using the scroll wheel, but its definitely noticeable. Turning off OMTP reduces or eliminates this effect.
This profile is much better! We can see when it's waiting, the paintthread is doing real work, as in, it's painting, and there's no visible contention (quite likely this is roughly similar to what we've seen in some other perf issues). What's interesting is it's spending most of that time drawing text, but the part that is taking the time is pushing and popping the clips associated with the drawing. This is a known issue we have, but it shouldn't be exacerbated by OMTP. David, since I still don't know all of the OMTP code, is there any reason you could think of why our (complex) clipping behavior would become worse with OMTP?
Flags: needinfo?(bas)
Severity: normal → major
Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
Ever confirmed: true
OS: Windows 10 → Windows
Hardware: x86_64 → All
Version: Trunk → 58 Branch
I can pretty much STR this on every website page, when other tabs are still loading in background.
This is very easy to reproduce; scroll around on https://www.reddit.com/r/PoliticalHumor/comments/743bg5/the_right_wing_over_the_past_24_hours/ by rapidly moving the scrollbar. Happens on my older but top-of-line NVIDIA GPU.

On a profile without OMTP, I see small chunks of rasterizing at 30-40ms each.

With OMTP, that goes down to 3-4ms - exactly what we want! But the PaintThread now spends *150ms* rasterizing instead of 30, which means we miss way more frames. In one profile I saw it spend 500ms rasterizing and 90ms recording. Some of that is due to D2D contention - bug 1397310.

Profile with OMTP: https://perfht.ml/2fVJXFs and https://perfht.ml/2fVJXFs
Profile without OMTP: https://perfht.ml/2fUOic7

No problems with Skia.
Flags: needinfo?(dvander) → needinfo?(bas)
(In reply to David Anderson [:dvander] from comment #10)
> This is very easy to reproduce; scroll around on
> https://www.reddit.com/r/PoliticalHumor/comments/743bg5/
> the_right_wing_over_the_past_24_hours/ by rapidly moving the scrollbar.
> Happens on my older but top-of-line NVIDIA GPU.
> 
> On a profile without OMTP, I see small chunks of rasterizing at 30-40ms each.
> 
> With OMTP, that goes down to 3-4ms - exactly what we want! But the
> PaintThread now spends *150ms* rasterizing instead of 30, which means we
> miss way more frames. In one profile I saw it spend 500ms rasterizing and
> 90ms recording. Some of that is due to D2D contention - bug 1397310.
> 
> Profile with OMTP: https://perfht.ml/2fVJXFs and https://perfht.ml/2fVJXFs
> Profile without OMTP: https://perfht.ml/2fUOic7
> 
> No problems with Skia.

Hrm, I don't see much D2D contention at all in the 500ms replay, which I find the most concerning. In the recording I do see contention which is due to the device context creation which (somewhat unsurprisingly), takes a ton of time, there might be other issues to address this by using D2D a little more how it's meant to be used (a single DC rather than one for each surface), but on the other hand this issue will be fixed by your box-shadow patch.
Flags: needinfo?(bas)
(In reply to David Anderson [:dvander] from comment #10)
> This is very easy to reproduce; scroll around on
> https://www.reddit.com/r/PoliticalHumor/comments/743bg5/
> the_right_wing_over_the_past_24_hours/ by rapidly moving the scrollbar.
> Happens on my older but top-of-line NVIDIA GPU.

Great, I have a machine where I can reproduce the issue on this page as well.
This is still all work pushing and popping clips, I wonder if with omtp we somehow record more complex clip patterns.
So, the problem pretty much reproduces for me when I switch on force-sync. Leading me to believe that rather than a contention issue this is some sort of issue with the recording itself. Can someone confirm this so I can be sure I'm seeing the same thing?

(switch the layers.omtp.force-sync pref)
Found the issue, this also explains why this was reproducible depending on the exact page, screen size, etc. The patch here should fix it.
Assignee: nobody → bas
Status: NEW → ASSIGNED
r+'ed by David via alternative route since he didn't have bugzilla access.
Pushed by bschouten@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a9ddc3f6748c
Have DrawTargetCapture properly forward SupportsRegionClipping. r=dvander
https://hg.mozilla.org/mozilla-central/rev/a9ddc3f6748c
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment on attachment 8915948 [details]
Bug 1405589: Have DrawTargetCapture properly forward SupportsRegionClipping.

https://reviewboard.mozilla.org/r/187226/#review192406

Stealing review, David's on the road.
Attachment #8915948 - Flags: review+
Improvements noticed:

== Change summary for alert #9886 (as of October 06 2017 17:10 UTC) ==

Improvements:

  4%  tsvgx summary windows10-64 opt e10s     237.58 -> 227.81
  4%  tsvgx summary windows10-64 pgo e10s     220.33 -> 211.43

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9886
I'm still getting mini-freezes when scrolling. So far I can only reliably reproduce on http://runescape.wikia.com/wiki/Calculator:Disassembly_by_material/Simple by drag scrolling fast but performance is randomly worse on some other heavy sites.
Hardware: FX8320 @4Ghz, HD5750, 16Gb RAM.

OMTP ON: https://perfht.ml/2kBydJY
OMTP OFF: https://perfht.ml/2kAeA4O

Not sure if I did this right but is there meant to be "event processing delays" on the main thread with OMTP on because I'm not seeing it with it disabled?
David, Ryan, what do you make of the main thread jank in the OMTP enabled case?  Will any of the upcoming changes help?
Flags: needinfo?(rhunt)
Flags: needinfo?(dvander)
Jules, are you still seeing this problem in the latest nightly?
Flags: needinfo?(wot_up_dogg)
(In reply to Milan Sreckovic [:milan] from comment #26)
> Jules, are you still seeing this problem in the latest nightly?

I'm not quite sure how to explain it but it seems a lot better, I only see a few very short pauses but scrolling feels delayed or not as smooth though it seems to be painting faster than off. Looking at the graphs now is confusing, it seems like with it off it should be hitching but it felt more responsive despite the slower painting.

On: https://perfht.ml/2zHHyCH
Off: https://perfht.ml/2zGO50f
Flags: needinfo?(wot_up_dogg)
Jules, in your Gecko Profiler, can you click "Settings", and to "Threads", add ",PaintThread" to the end? Then record a new profiler for with OMTP On. Unfortunately by default it doesn't include the new painting thread.
Flags: needinfo?(dvander) → needinfo?(wot_up_dogg)
sorry, that should read: "new profile with OMTP on"
I misread at first and did on a new profile (so without addons and anything I changed in about:config). Here's that if you find it interesting: https://perfht.ml/2zI6cCT
Otherwise here it is with PaintThread on: https://perfht.ml/2zGJifb
Flags: needinfo?(wot_up_dogg)
(In reply to Jules A from comment #30)
> I misread at first and did on a new profile (so without addons and anything
> I changed in about:config). Here's that if you find it interesting:
> https://perfht.ml/2zI6cCT
> Otherwise here it is with PaintThread on: https://perfht.ml/2zGJifb

It looks like there are a lot of paints happening in the new profile you linked to. The average refresh driver tick time is ~16ms-25ms. Of that rasterization (capture with OMTP) is ~6-8ms, and the rest is display list and layer building.

We don't have great markers for OMTP, so I can't tell how long the actual rasterization is, but I can see that the following refresh ticks aren't being delayed much. Estimating from paint thread samples, I'd say rasterization is ~6-8ms.

I do notice that the content threads sometimes are actually running scripts and doing other things while we rasterize so that's a win for OMTP.

Overall I don't see OMTP misbehaving here with blocking or other async weirdness.

It looks like this page has a lot of work to do with display list building (it's essentially a giant table). I also noticed that as you scroll, images will transition in for table entries. My guess is there is a script doing this, and that's what's causing us to paint so much.
Flags: needinfo?(rhunt)
I have no objective data to support my assumption because I don't know if there's a good way to measure it. However I'm under the impression that OMTP on causes consistently more checkerboarding than OMTP off (with current Nightly 58). 

Sometimes on my Skylake i5 6200u with only Nightly running without any major CPU usage it might happen that checkerboarding is displayed for a few seconds (up to 5!) by just scrolling on a relatively simple Xenoboard. 
I even notice checkerboarding on my Skylake i7 6700k, a CPU where I practically never noticed checkerboarding before OMTP was enabled by default. 

Is OMTP just about removing jank from the main thread or should it automatically improve rendering performance and cause less checkerboarding?
Here is a profile where I noticed a pretty long checkerboarding occurence with Paint thread capturing enabled: https://perfht.ml/2ADd2em

The checkerboarding happened on process 5960 (Content (8 of 8)), on domain ard.de.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: