Closed Bug 572651 Opened 14 years ago Closed 14 years ago

ThebesLayer shader program (GetBGRXLayerProgram) is slow on mobile...

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0+ ---

People

(Reporter: romaxa, Assigned: romaxa)

Attachments

(2 files, 4 obsolete files)

I was testing ThebesLayer scrolling performance on EGL (Maemo), and simple layer texture scrolling is slow there.

I have found (with Bas's help) that sBGRXTextureLayerFS performs a uLayerOpacity calculation.

I've tested on a device with and without uLayerOpacity, and without opacity layer scrolling is ~1.3x faster.
Comment on attachment 451909 [details] [diff] [review]
Patch which is making layer scrolling 1.3x faster

I need feedback and some explanation of how to proceed with this...
Should I create a separate shader program without opacity for each existing program, or should something else be done here?
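
For illustration, the "separate program without opacity" idea could be sketched as below. This is only a sketch under assumptions: the names kBGRXWithOpacityFS, kBGRXNoOpacityFS and SelectBGRXFragmentShader are hypothetical, and the shader bodies are assumed apart from uTexture, vTexCoord and uLayerOpacity, which appear in this bug.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two fragment shaders under discussion.
// Only uTexture, vTexCoord and uLayerOpacity are taken from this bug.
static const char kBGRXWithOpacityFS[] =
    "precision mediump float;\n"
    "uniform sampler2D uTexture;\n"
    "uniform float uLayerOpacity;\n"
    "varying vec2 vTexCoord;\n"
    "void main() {\n"
    "  gl_FragColor = vec4(texture2D(uTexture, vTexCoord).rgb, 1.0) * uLayerOpacity;\n"
    "}\n";

static const char kBGRXNoOpacityFS[] =
    "precision mediump float;\n"
    "uniform sampler2D uTexture;\n"
    "varying vec2 vTexCoord;\n"
    "void main() {\n"
    "  gl_FragColor = vec4(texture2D(uTexture, vTexCoord).rgb, 1.0);\n"
    "}\n";

// Pick the shader without the opacity multiply when the layer is fully opaque.
inline const char* SelectBGRXFragmentShader(float aOpacity) {
  assert(aOpacity >= 0.0f && aOpacity <= 1.0f);
  return aOpacity == 1.0f ? kBGRXNoOpacityFS : kBGRXWithOpacityFS;
}
```

The cost of this approach is one extra program per existing shader, which is the duplication the question above is asking about.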
Hmm, annoying that the extra multiply costs that much, but I guess I can believe that it would dominate. I like this here (though I don't like the "2" name :-) ), though it would be nice if we could autogenerate this... let me think about it for a bit.
Oleg, can you try using the original shader program, but before the draw call for those layers, if opacity == 1.0, can you try disabling GL_BLEND (and reenabling it afterward)?  See what that does to performance.
> for those layers, if opacity == 1.0, can you try disabling GL_BLEND (and

checked, gl()->fDisable(LOCAL_GL_BLEND) does not help at all.
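
The experiment suggested above (blending off around the draw call for fully opaque layers) could look roughly like the sketch below. FakeGL and DrawLayer are hypothetical stand-ins that only record calls so the example runs without a GL context; real code would go through the GL context wrapper, as in the gl()->fDisable(LOCAL_GL_BLEND) call quoted above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GL context: it records calls instead of
// talking to a driver.
struct FakeGL {
  std::vector<std::string> calls;
  void Disable(const std::string& cap) { calls.push_back("disable " + cap); }
  void Enable(const std::string& cap) { calls.push_back("enable " + cap); }
  void DrawQuad() { calls.push_back("draw"); }
};

// Draw one layer, switching blending off around the draw when it is opaque.
inline void DrawLayer(FakeGL& gl, float opacity) {
  assert(opacity >= 0.0f && opacity <= 1.0f);
  const bool opaque = (opacity == 1.0f);
  if (opaque) gl.Disable("GL_BLEND");
  gl.DrawQuad();
  if (opaque) gl.Enable("GL_BLEND");  // restore state for later layers
}
```

As reported above, this change made no measurable difference on the device, which points at the fragment shader arithmetic rather than the blend stage.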
Attachment #451909 - Attachment is obsolete: true
Attachment #452412 - Flags: feedback?(vladimir)
Attached patch Shader version with no opacity (obsolete) — Splinter Review
Attachment #452412 - Attachment is obsolete: true
Attachment #452692 - Flags: review?(vladimir)
Attachment #452412 - Flags: feedback?(vladimir)
Assignee: nobody → romaxa
tracking-fennec: --- → 2.0+
Attached patch Updated patch to upstream (obsolete) — Splinter Review
Attachment #452692 - Attachment is obsolete: true
Attachment #482013 - Flags: review?(vladimir)
Attachment #452692 - Flags: review?(vladimir)
Here are the results of simple scrolling with shadow layers (GL) on Maemo:

With patch        ~ 55 FPS
Without patch m-c ~ 39 FPS
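
To put the two figures in per-frame terms, here is a small sketch of the arithmetic. The helper names FrameTimeMs and Speedup are made up for this example.

```cpp
#include <cassert>

// Convert a frame rate to the time spent per frame, in milliseconds.
inline double FrameTimeMs(double fps) {
  assert(fps > 0.0);
  return 1000.0 / fps;
}

// Speedup of the patched build relative to the baseline.
inline double Speedup(double patchedFps, double baselineFps) {
  assert(patchedFps > 0.0 && baselineFps > 0.0);
  return patchedFps / baselineFps;
}
```

With the numbers above, Speedup(55, 39) is about 1.41, and the frame time drops from roughly 25.6 ms to roughly 18.2 ms, so about 7.5 ms is saved per frame.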

Vlad asked me to try some other options:
>Oleg, for the immediate problem, since you're able to get good performance numbers -- can you try changing all the precision to 'lowp' and see if that makes a difference? 
> Also try changing the expression to be vec4(texture2D(..) * uLayerOpacity, uLayerOpacity), and see what change that makes (and the two together).

mediump -> lowp change test: ~55 FPS

-  gl_FragColor = vec4(texture2D(uTexture, vTexCoord).rgb, 1.0) * uLayerOpacity;
+  gl_FragColor = vec4(texture2D(uTexture, vTexCoord).rgb * uLayerOpacity, uLayerOpacity);
vec4(texture2D(..) * uLayerOpacity, uLayerOpacity) ~ 38 FPS


lowp + vec4(texture2D(..) * uLayerOpacity, uLayerOpacity)  ~ 54.9 FPS
Not sure whether it breaks anything...
Comment on attachment 487467 [details] [diff] [review]
mediump->lowp precision... not sure

mediump->lowp: tested with the CSS property "opacity: 0.5;" on a WebGL layer. The layer is transparent and works fine.
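
As a rough sketch of what the precision change amounts to, the helper below builds the fragment shader source with a chosen default float precision. MakeBGRXFragmentShader is a hypothetical name and the shader layout is assumed; only the gl_FragColor expression is quoted from this bug.

```cpp
#include <cassert>
#include <string>

// Build the BGRX fragment shader source with the given default float
// precision. Layout assumed; only the gl_FragColor line comes from this bug.
inline std::string MakeBGRXFragmentShader(const std::string& precision) {
  assert(precision == "lowp" || precision == "mediump" || precision == "highp");
  return "precision " + precision + " float;\n"
         "uniform sampler2D uTexture;\n"
         "uniform float uLayerOpacity;\n"
         "varying vec2 vTexCoord;\n"
         "void main() {\n"
         "  gl_FragColor = vec4(texture2D(uTexture, vTexCoord).rgb, 1.0) * uLayerOpacity;\n"
         "}\n";
}
```

GLSL ES 1.00 guarantees lowp floats a range of at least (-2, 2) with 2^-8 absolute precision, which covers color and opacity values in [0, 1]. Texture coordinates may need more precision than that on large textures, so that is the part worth checking when judging whether lowp breaks anything.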
Attachment #487467 - Flags: review?(vladimir)
Attachment #482013 - Attachment is obsolete: true
Attachment #482013 - Flags: review?(vladimir)
Attachment #488285 - Flags: review?(vladimir)
Comment on attachment 488285 [details] [diff] [review]
Use right program in canvas and image

This really should have gone into a separate bug with some more detail :-)

Also I'm pretty sure the patch commit message isn't correct here.
Attachment #488285 - Flags: review?(vladimir) → review+
This is the valid part from the previous patch.
And yep, that is the message from a different patch...
+                                      useGLContext != 0 ? PR_TRUE : PR_FALSE)

You can just write "useGLContext != 0"
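
A minimal sketch of why the shorter form is equivalent. PRBool, PR_TRUE and PR_FALSE are redefined locally here as stand-ins for the NSPR definitions so the example compiles on its own; Verbose and Concise are hypothetical names.

```cpp
// Local stand-ins for NSPR's PRBool / PR_TRUE / PR_FALSE.
typedef int PRBool;
#define PR_TRUE 1
#define PR_FALSE 0

// The form used in the patch.
constexpr PRBool Verbose(int useGLContext) {
  return useGLContext != 0 ? PR_TRUE : PR_FALSE;
}

// The suggested form: the comparison already yields 0 or 1.
constexpr PRBool Concise(int useGLContext) {
  return useGLContext != 0;
}
```

Both functions return 1 for any non-zero input and 0 for zero, so the ternary adds nothing.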

Would similar optimizations be at all useful on D3D, I wonder?
> 
> Would similar optimizations be at all useful on D3D, I wonder?

probably yes, but I don't have D3D on mobile ;(
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Hmm, so with low precision enabled, removing the multiply by uLayerOpacity doesn't give any improvement? Weird.
I think at that point it's the same number of vector multiplies, so it ends up being a wash -- the hardware can do (iirc) 4 low precision multiplies in a cycle, so we can either do (RGB,1.0)*opacity or (RGB*opacity,opacity) in the same amount of time.
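
The two expressions compared above produce the same output color, which can be checked with a small CPU-side sketch. Vec4, OpaqueThenScale and ScaleThenPack are hypothetical names used only for this example; they mimic the GLSL arithmetic and say nothing about how the GPU schedules it.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;

// vec4(rgb, 1.0) * opacity: build the opaque color, then scale all four lanes.
inline Vec4 OpaqueThenScale(float r, float g, float b, float opacity) {
  assert(opacity >= 0.0f && opacity <= 1.0f);
  const Vec4 c = {r, g, b, 1.0f};
  return {c[0] * opacity, c[1] * opacity, c[2] * opacity, c[3] * opacity};
}

// vec4(rgb * opacity, opacity): scale three lanes, pass opacity through as alpha.
inline Vec4 ScaleThenPack(float r, float g, float b, float opacity) {
  assert(opacity >= 0.0f && opacity <= 1.0f);
  return {r * opacity, g * opacity, b * opacity, opacity};
}
```

Since 1.0 * opacity is exactly opacity, both forms yield the same premultiplied color, so any difference between them is purely a matter of how the hardware executes the multiplies.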