28.51% of total execution time in After the Flood WebGL 2 demo on Very Low Quality setting is spent in ClientCanvasLayer::RenderLayer -> fCreateImage

NEW
Unassigned

Status

()

Core
Canvas: WebGL
P3
normal
11 months ago
2 months ago

People

(Reporter: Jukka Jylänki, Unassigned)

Tracking

55 Branch
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [gfx-noted])

Attachments

(1 attachment)

(Reporter)

Description

11 months ago
Created attachment 8867269 [details]
Screen Shot 2017-05-12 at 6.55.18 PM.png

The recent After the Flood demo has been updated with a more mobile-friendly version in the emunittest suite, with a Very Low quality option (pull the emunittest suite to latest from git to obtain it).

Profiling the performance on a high-end Huawei P10 Plus ARM Mali-G71 MP8 device (http://www.gsmarena.com/huawei_p10_plus-8515.php), it appears that 28.51% of total execution time is spent in compositing-related operations in ClientCanvasLayer::RenderLayer().

This looks suspicious judging by the name alone: fCreateImage suggests that a new GPU-side surface may be getting created each frame. I wonder if that's the case, and if so, whether performance could be improved by creating such a scratch surface only once and reusing it across frames?
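To illustrate the "create once, reuse across frames" idea, here is a minimal sketch. ScratchSurface, ScratchSurfaceCache and CreationsForFrames are hypothetical names made up for this example; they are not Gecko or EGL types, and the struct assignment merely stands in for whatever expensive creation call is actually involved.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a GPU-side surface; not a Gecko or EGL type.
struct ScratchSurface {
  std::uint32_t width;
  std::uint32_t height;
};

// Hypothetical cache: creates the surface on first use and recreates it
// only when the requested size changes.
class ScratchSurfaceCache {
 public:
  const ScratchSurface& Acquire(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    if (!mSurface || mSurface->width != width || mSurface->height != height) {
      // Stands in for the expensive creation call.
      mSurface = ScratchSurface{width, height};
      ++mCreateCount;
    }
    return *mSurface;
  }

  // Number of times a surface was actually (re)created.
  int CreateCount() const { return mCreateCount; }

 private:
  std::optional<ScratchSurface> mSurface;
  int mCreateCount = 0;
};

// Simulates `frames` frames at a fixed size and returns how many
// surface creations were needed.
inline int CreationsForFrames(int frames, std::uint32_t width,
                              std::uint32_t height) {
  ScratchSurfaceCache cache;
  for (int i = 0; i < frames; ++i) {
    cache.Acquire(width, height);
  }
  return cache.CreateCount();
}
```

With a fixed canvas size, this pays the creation cost once instead of once per frame. A real implementation would presumably also have to make sure the compositor has finished reading a surface before it is reused, which likely means a small pool rather than a single surface.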

Overall performance is about 18.65fps. Given that this is a 2017 flagship phone and the demo is running at its lowest quality settings, the expectation is that 60fps would be feasible to attain.
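For a rough sense of scale, the numbers above can be converted into per-frame time. This is back-of-the-envelope arithmetic only: it assumes the 28.51% share of profile time maps directly onto per-frame wall-clock time, which may not hold if the work is spread across threads. The helper names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Milliseconds available per frame at a given frame rate.
inline double FrameTimeMs(double fps) {
  assert(fps > 0.0);
  return 1000.0 / fps;
}

// Frame rate that would result if `fraction` of the current frame time
// were removed entirely (assumption: the rest of the frame is unchanged).
inline double FpsWithoutFraction(double fps, double fraction) {
  assert(fraction >= 0.0 && fraction < 1.0);
  return fps / (1.0 - fraction);
}
```

Under that assumption, FrameTimeMs(18.65) is about 53.6 ms per frame, of which roughly 15.3 ms (28.51%) would fall under RenderLayer(). FpsWithoutFraction(18.65, 0.2851) comes out at about 26 fps, so removing this cost completely would help noticeably but would not by itself reach the 16.7 ms per frame needed for 60fps.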

See the attached screenshot for an illustration.
How's the performance on desktop?
Presumably this would be nullified by bug 1136734
(Reporter)

Comment 3

11 months ago
On a MacBook Pro 2016 laptop with Intel Graphics 550, execution sits at 60fps with the CPU being idle 65% of the time.
It seems like we should be able to do better without needing bug 1136734. Why is fCreateImage being called during ClientCanvasLayer::RenderLayer()?
Flags: needinfo?(snorp)
On Android we use eglCreateImage() against a texture to share WebGL output with the compositor[0]. This shows up under ClientCanvasLayer::RenderLayer() because WebGL does the buffer swap in a pre-transaction callback, which happens in ShareableCanvasLayer::UpdateCompositableClient(), called by RenderLayer(). When we swap, we create the new front buffer, hence eglCreateImage(). Why that call is so slow I have no idea.

[0] https://dxr.mozilla.org/mozilla-central/source/gfx/gl/SharedSurfaceEGL.cpp#43
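
The call chain described above can be modeled with a toy sketch. Every type here (FakeEgl, FakeWebGLContext, FakeCanvasLayer) is a hypothetical stand-in written for illustration, not the actual Gecko classes; only the shape of the chain is taken from the comment: RenderLayer() calls UpdateCompositableClient(), which runs the pre-transaction callback, which swaps buffers, which creates a new image.

```cpp
#include <cassert>
#include <functional>

// Counts calls standing in for eglCreateImage(); no real EGL is used.
struct FakeEgl {
  int createImageCalls = 0;
  void CreateImage() { ++createImageCalls; }
};

// Producer side: each swap creates a new front buffer, hence a new image.
struct FakeWebGLContext {
  FakeEgl* egl;
  void SwapBuffers() {
    assert(egl != nullptr);
    egl->CreateImage();
  }
};

// Layer side: the swap is triggered from a pre-transaction callback that
// runs inside UpdateCompositableClient(), itself called by RenderLayer().
struct FakeCanvasLayer {
  std::function<void()> preTransactionCallback;

  void UpdateCompositableClient() {
    if (preTransactionCallback) {
      preTransactionCallback();
    }
  }

  void RenderLayer() { UpdateCompositableClient(); }
};

// Renders `frames` frames and returns how many image creations occurred.
inline int CreateImageCallsForFrames(int frames) {
  FakeEgl egl;
  FakeWebGLContext gl{&egl};
  FakeCanvasLayer layer;
  layer.preTransactionCallback = [&gl] { gl.SwapBuffers(); };
  for (int i = 0; i < frames; ++i) {
    layer.RenderLayer();
  }
  return egl.createImageCalls;
}
```

In this model every RenderLayer() call ends in exactly one image creation, which is why a profiler would attribute the cost to the layer code even though it originates on the WebGL side.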
Flags: needinfo?(snorp)
Jukka, how are you getting this profile? Can you share it? I didn't think the gecko profiler worked on fennec anymore...
Flags: needinfo?(jujjyl)
(Reporter)

Comment 7

11 months ago
I am using the Firefox built-in performance profiler tool: attach to the phone with WebIDE, then open the tab and access the DevTools Performance panel to profile.
Flags: needinfo?(jujjyl)
Whiteboard: [gfx-noted]
Priority: -- → P3