Open Bug 1527144 Opened 6 years ago Updated 6 months ago

bind_frame_data seems quite slow

Categories

(Core :: Graphics: WebRender, defect, P2)

defect

Tracking

()

People

(Reporter: gw, Unassigned)

References

(Blocks 1 open bug)

Details

STR:

  • Open about:blank
  • Profile the time spent in bind_frame_data

On my machine, I typically see around 0.2 - 0.6ms, for a blank page and 0.3 - 1.0 ms for a real world page.

Although this isn't a huge absolute number, it's a very significant percentage of the total composite CPU time on most pages, and seems higher than expected / ideal.

I added a few logs:

bind_frame_data: 0.60733
prim_header_f_texture: all=0.039532 push=0.013509 up=0.025277
prim_header_i_texture: all=0.053038 push=0.008341 up=0.031914
transforms_texture: all=0.03723 push=0.021304 up=0.015683
render_task_texture: all=0.037387 push=0.007996 up=0.028927

all measures total time in the function, push is the time extending the texture to be a row-length multiple, and up is the time in the texture upload code.

I haven't looked further, but I suspect that a lot of the time is either in memory allocs and/or copying around memory that we could avoid.

Matt, Dzmitry, this might be worth looking into sometime - not high priority, but there might be some easy per-frame wins to be found here.

Adding a ni? to remind to look at this

Flags: needinfo?(dglastonbury)

I'm slightly confused. If we add up all "all" sections of prim_header_f_texture, prim_header_i_texture, transforms_texture, and render_task_texture, we get 0.167187, which is quite different from the stated 0.60733 for the whole bind_frame_data.

We have quite a few redundant GL calls, things like "active_texture" jumping back and forth:

            self.gl.active_texture(gl::TEXTURE0 + slot.0 as gl::GLuint);
            self.gl.bind_texture(target, id);
            self.gl.active_texture(gl::TEXTURE0);

Marking Glenn for ni about the total times mismatch.

Flags: needinfo?(gwatson)

(sorry for the spam of messages!)
Another possible issue for real-world use of VertexDataTexture::update() is that it resizes exactly to needed_height. If we get the content outside of the supported texture size, we should reset our estimates, and get some headway to accommodate for further changes. I.e if needed_height > existing_height, we should either add a constant value (e.g. resize to needed_height + 10) or better - to round it up to a power of 2, given that Intel driver still prefers PoT texture sizes.

Huh, you're quite right - most of the samples I had logged it seemed 90+% of the time was in those child functions. Maybe I copy-pasted an outlier, or made a mistake in copying.

Flags: needinfo?(gwatson)
Depends on: wr-perf
Priority: -- → P2
Flags: needinfo?(dglastonbury)
Severity: normal → S3
No longer depends on: wr-perf
You need to log in before you can comment on or make changes to this bug.