Large percentage of time spent in ScopedResolveTexturesForDraw in Proxx-Tables-Canvas benchmark
Categories
(Core :: Graphics: Canvas2D, enhancement, P3)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox108 | --- | fixed |
People
(Reporter: mstange, Assigned: jgilbert)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [sp3-proxx-tables-canvas])
Attachments
(1 file)
Profile: https://share.firefox.dev/3hbJr5H
WebGLContext::DrawArraysInstanced seems to spend a fair amount of time allocating and freeing things via ScopedResolveTexturesForDraw.
I found this on the Proxx-Tables-Canvas benchmark (bug 1798923): https://grandprixbench.netlify.app/?suites=Proxx-Tables-Canvas
Comment 1•2 years ago
Kelsey, you've been in this code quite a bit. Are these memory allocations avoidable, or can the number of texUnits be known up-front?
ScopedResolveTexturesForDraw::ScopedResolveTexturesForDraw is also profiling in a weird way: the profiler seems to show that the function invokes itself and also can allocate multiple maps in the same call -- perhaps that's just showing that the reserve amount has been exceeded.
Comment 2•2 years ago (Assignee)
We should just reuse the allocations somehow.
The max size is known and small, I'm mostly using map for ergonomics.
Comment 3•2 years ago (Assignee)
I'm tempted to just have this be a static thread_local, but we can just tag it off of the `this` object.
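A minimal sketch of the "reuse the allocations" idea, assuming the scratch storage lives on the context object and is borrowed by the scoped helper on each draw. `FakeContext`, `FakeTexture`, `RebindEntry`, and `ScopedResolveSketch` are hypothetical names for illustration only, not the actual Gecko types or the landed patch. It also swaps the map for a flat vector, which is one possible choice given that the max size is known and small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; none of these names come from the Gecko source.
struct FakeTexture { int id; };

struct RebindEntry {
    uint32_t texUnit;
    const FakeTexture* tex;
};

class FakeContext {
public:
    explicit FakeContext(std::size_t maxTexUnits) {
        // The maximum is known up front and small, so reserve exactly once.
        mRebindScratch.reserve(maxTexUnits);
    }
    // Scratch storage owned by the context and reused by every draw call.
    std::vector<RebindEntry> mRebindScratch;
};

class ScopedResolveSketch {
public:
    ScopedResolveSketch(FakeContext& ctx,
                        const std::vector<const FakeTexture*>& boundTextures)
        : mRebinds(ctx.mRebindScratch) {
        mRebinds.clear();  // clear() keeps capacity, so no alloc/free per draw
        for (uint32_t unit = 0; unit < boundTextures.size(); ++unit) {
            if (boundTextures[unit]) {
                mRebinds.push_back({unit, boundTextures[unit]});
            }
        }
    }
    ~ScopedResolveSketch() { mRebinds.clear(); }

    std::size_t count() const { return mRebinds.size(); }

private:
    std::vector<RebindEntry>& mRebinds;  // borrowed from the context
};
```

Compared with a static thread_local, hanging the buffer off the context ties its lifetime to the object that uses it, at the cost of one extra member.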
Comment 4•2 years ago (Assignee)
Also showing up kinda hot there is overhead around these four related commands:
- SetEnabled
- Clear
- ClearColor
- Scissor
Feels like we're ending up doing something like:
    for (...) {
        Enable(SCISSOR_TEST)
        Scissor(x,y,...)
        ClearColor(r,g,b,a)
        Clear(COLOR)
        Disable(SCISSOR_TEST)
    }
You can imagine us e.g. lazily combining ClearColor+Clear into Clear(flags, r,g,b,a, d,s), and/or eliding unused Enables/Disables.
Deserialization overhead is cheap but not free, and it does show up here weighing things down.
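A rough sketch of the "eliding unused Enables/Disables" idea, assuming a recorder that only remembers the requested state and emits a state command when a `Clear` actually needs it. `LazyStateRecorder` and `Op` are hypothetical names, not Gecko code. The sketch covers elision only; it does not implement the combined `Clear(flags, r,g,b,a, d,s)` command, and a real version would also have to flush pending state before draws and any other command that reads it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical command recorder; names are illustrative only.
enum class Op { Enable, Disable, Scissor, ClearColor, Clear };

class LazyStateRecorder {
public:
    // State setters only record what the caller wants; nothing is emitted yet.
    void SetScissorTest(bool on) { mWantScissor = on; }
    void Scissor(int x, int y, int w, int h) { mWantRect = {x, y, w, h}; }
    void ClearColor(float r, float g, float b, float a) { mWantColor = {r, g, b, a}; }

    // Clear depends on the state above, so pending changes are emitted here.
    void Clear() {
        FlushState();
        mOut.push_back(Op::Clear);
    }

    const std::vector<Op>& Commands() const { return mOut; }

    std::size_t Count(Op op) const {
        return static_cast<std::size_t>(std::count(mOut.begin(), mOut.end(), op));
    }

private:
    void FlushState() {
        if (mWantScissor != mSentScissor) {
            mOut.push_back(mWantScissor ? Op::Enable : Op::Disable);
            mSentScissor = mWantScissor;
        }
        // The rect only matters while the scissor test is enabled.
        if (mWantScissor && (!mSentRect || *mSentRect != mWantRect)) {
            mOut.push_back(Op::Scissor);
            mSentRect = mWantRect;
        }
        if (mWantColor != mSentColor) {
            mOut.push_back(Op::ClearColor);
            mSentColor = mWantColor;
        }
    }

    bool mWantScissor = false;
    bool mSentScissor = false;                    // GL default: disabled
    std::array<int, 4> mWantRect{};
    std::optional<std::array<int, 4>> mSentRect;  // unknown until first sent
    std::array<float, 4> mWantColor{};
    std::array<float, 4> mSentColor{};            // GL default: (0,0,0,0)
    std::vector<Op> mOut;
};
```

With the loop from the comment run three times using the same clear color and different rects, this emits one Enable, one ClearColor, three Scissors, and three Clears, rather than fifteen commands.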
Comment 5•2 years ago (Assignee)
Comment 6•2 years ago (Assignee)
While that profile link does make it look like we're spending 15% in this func, that's because we're dropping samples that are waiting on events, so this is closer to a 2% win, all else equal.
Comment 8•2 years ago
bugherder
Comment 9•2 years ago
== Change summary for alert #36053 (as of Sun, 13 Nov 2022 01:24:52 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
8% | motionmark_webgl 3DGraphics-WebGL | windows10-64-shippable-qr | e10s fission stylo webrender | 10.59 -> 11.46 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=36053