Last Comment Bug 766366 - Add Texture -> Texture blit function to GLContext
: Add Texture -> Texture blit function to GLContext
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: x86_64 Linux
: -- normal with 1 vote (vote)
: mozilla17
Assigned To: Jeff Gilbert [:jgilbert]
:
Mentors:
Depends on:
Blocks: 781084 784581 716859 728524 776796 785579
  Show dependency treegraph
 
Reported: 2012-06-19 15:55 PDT by Oleg Romashin (:romaxa)
Modified: 2013-09-20 09:31 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
GLContext FBO render to texture with context Store/Restore (10.83 KB, patch)
2012-06-19 16:17 PDT, Oleg Romashin (:romaxa)
jgilbert: review-
Details | Diff | Splinter Review
Draw texture to framebuffer (18.90 KB, patch)
2012-07-26 16:41 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: feedback+
Details | Diff | Splinter Review
blit texture->framebuffer (19.60 KB, patch)
2012-07-31 16:23 PDT, Jeff Gilbert [:jgilbert]
no flags Details | Diff | Splinter Review
blit All The Things (23.95 KB, patch)
2012-07-31 17:24 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: review-
Details | Diff | Splinter Review
test All The Blits (9.32 KB, patch)
2012-07-31 18:43 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: review-
Details | Diff | Splinter Review
DiscardFramebuffer (9.97 KB, patch)
2012-08-01 15:07 PDT, Jeff Gilbert [:jgilbert]
no flags Details | Diff | Splinter Review
blit code (25.20 KB, patch)
2012-08-03 15:54 PDT, Jeff Gilbert [:jgilbert]
no flags Details | Diff | Splinter Review
blit code (25.23 KB, patch)
2012-08-03 15:56 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: review-
Details | Diff | Splinter Review
discard framebuffer (9.88 KB, patch)
2012-08-03 16:01 PDT, Jeff Gilbert [:jgilbert]
no flags Details | Diff | Splinter Review
blit code (30.90 KB, patch)
2012-08-09 16:08 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: review+
Details | Diff | Splinter Review
blit code w/crtp-raii (32.47 KB, patch)
2012-08-11 01:39 PDT, Jeff Gilbert [:jgilbert]
jacob.benoit.1: review+
Details | Diff | Splinter Review
test code (8.89 KB, patch)
2012-08-11 02:08 PDT, Jeff Gilbert [:jgilbert]
no flags Details | Diff | Splinter Review

Description Oleg Romashin (:romaxa) 2012-06-19 15:55:54 PDT
For WebGL context OMTC sharing with one copy we have create Blit function which will do that copy.
using CopyTexImage2D is not very good because it blows away texture, and we should rewrap that texture with EGLImage.
CopyTexSubImage2D does not work very fast (2x slower comparing to simple FBO render)
Comment 1 Oleg Romashin (:romaxa) 2012-06-19 16:17:48 PDT
Created attachment 634642 [details] [diff] [review]
GLContext FBO render to texture with context Store/Restore
Comment 2 Jeff Gilbert [:jgilbert] 2012-06-19 23:20:07 PDT
Comment on attachment 634642 [details] [diff] [review]
GLContext FBO render to texture with context Store/Restore

Review of attachment 634642 [details] [diff] [review]:
-----------------------------------------------------------------

GLContext.cpp is a 4-space indent file also, so select it all and hit tab again. :)

::: gfx/gl/GLContext.cpp
@@ +2186,5 @@
>      fEnable(LOCAL_GL_BLEND);
>  }
>  
> +bool
> +GLContext::BlitTexture(GLuint aSrc, GLuint aDest, const nsIntSize& aSize, GLenum aTexDepth)

Since the type doesn't indicate that they're textures, their names should. Maybe aSrcTex and aDestTex?

@@ +2188,5 @@
>  
> +bool
> +GLContext::BlitTexture(GLuint aSrc, GLuint aDest, const nsIntSize& aSize, GLenum aTexDepth)
> +{
> +  nsIntRect rect(0, 0, aSize.width, aSize.height);

Blitting a texture shouldn't care what size it is.

@@ +2193,5 @@
> +  /* Bind the image to a renderbuffer */
> +  GLuint prevRead = GetUserBoundReadFBO();
> +  GLuint prevDraw = GetUserBoundDrawFBO();
> +  /* Bind the image to a renderbuffer */
> +  GLint oldrb, oldprog, oldarrbuffer, oldtex;

Initialize here too.

@@ +2197,5 @@
> +  GLint oldrb, oldprog, oldarrbuffer, oldtex;
> +  fGetIntegerv(LOCAL_GL_RENDERBUFFER_BINDING, &oldrb);
> +  fGetIntegerv(LOCAL_GL_ARRAY_BUFFER_BINDING, &oldarrbuffer);
> +  fGetIntegerv(LOCAL_GL_CURRENT_PROGRAM, &oldprog);
> +  fGetIntegerv(LOCAL_GL_TEXTURE_BINDING_2D, &oldtex);

This whole section needs more whitespace.

@@ +2202,5 @@
> +  GLint size0, size1, enabled0, enabled1;
> +  GLint type0, type1;
> +  GLint normalized0, normalized1;
> +  GLsizei stride0, stride1;
> +  GLvoid *pointer0, *pointer1;

Declare pointers on their own line, but more importantly, these should all be initialized, since we don't actually have a guarantee that their values get filled.

Also, assert that no error was generated by these calls in DEBUG builds, because that's when we'd be falling back to our wrong (but less dangerous) default values.

@@ +2225,5 @@
> +  }
> +
> +  fBindRenderbuffer(LOCAL_GL_RENDERBUFFER, mBlitRenderbuffer);
> +  fRenderbufferStorage(LOCAL_GL_RENDERBUFFER, aTexDepth, aSize.width, aSize.height);
> +  fFramebufferRenderbuffer(LOCAL_GL_FRAMEBUFFER, LOCAL_GL_DEPTH_ATTACHMENT,

You forgot to bind the framebuffer before calling this.
Also, why are we doing depth-related stuff? Surely we just want to copy the pixel data.

@@ +2228,5 @@
> +  fRenderbufferStorage(LOCAL_GL_RENDERBUFFER, aTexDepth, aSize.width, aSize.height);
> +  fFramebufferRenderbuffer(LOCAL_GL_FRAMEBUFFER, LOCAL_GL_DEPTH_ATTACHMENT,
> +                           LOCAL_GL_RENDERBUFFER, mBlitRenderbuffer);
> +
> +  SetBlitFramebufferForDestTexture(aDest);

While we're using this function, can you fix it so it doesn't check framebuffer completeness when aTexture is 0?

@@ +2231,5 @@
> +
> +  SetBlitFramebufferForDestTexture(aDest);
> +
> +  UseBlitProgram();
> +  fClearColor(0.0f, 0.0f, 0.0f, 1.0f);

If we change this, we're going to need to restore it.
We shouldn't need it, though. We're doing a full-texture blit, so we're going to overwrite whatever we clear to anyways.

@@ +2233,5 @@
> +
> +  UseBlitProgram();
> +  fClearColor(0.0f, 0.0f, 0.0f, 1.0f);
> +  fClear(LOCAL_GL_COLOR_BUFFER_BIT | LOCAL_GL_DEPTH_BUFFER_BIT);
> +  fActiveTexture(LOCAL_GL_TEXTURE0);

Need to save and restore the current active texture.

@@ +2246,5 @@
> +  }
> +  fBindBuffer(LOCAL_GL_ARRAY_BUFFER, 0);
> +
> +  fVertexAttribPointer(0, 2, LOCAL_GL_FLOAT, LOCAL_GL_FALSE, 0, rects.vertexPointer());
> +  fVertexAttribPointer(1, 2, LOCAL_GL_FLOAT, LOCAL_GL_FALSE, 0, rects.texCoordPointer());

Can we not just specify the texCoords, and have it figure out the vert positions itself? Really, just setting the viewport properly should mean we only need to do a trivial conversion between texCoords and position. If no y-flip is needed, these should even be able to be the same.

Since the shader input should only ever need to be [[0,0] [1,0] [1,1] [0,1]], we should be able to cache this and reuse it. (At worst, we'll actually need the two-tri version instead of this tri-fan version)

@@ +2252,5 @@
> +  fEnableVertexAttribArray(1);
> +  fDrawArrays(LOCAL_GL_TRIANGLES, 0, rects.elements());
> +
> +  fEnable(LOCAL_GL_SCISSOR_TEST);
> +  fEnable(LOCAL_GL_BLEND);

Only re-enable these if they were originally enabled.

@@ +2263,5 @@
> +  fVertexAttribPointer(1, size1, type1, normalized1, stride1, pointer1);
> +
> +  fBindBuffer(LOCAL_GL_ARRAY_BUFFER, oldarrbuffer);
> +  fBindTexture(LOCAL_GL_TEXTURE_2D, oldtex);
> +  SetBlitFramebufferForDestTexture(0);

While this is safe here, this should be done before we start manually restoring old state.
Comment 3 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-20 08:51:29 PDT
Comment on attachment 634642 [details] [diff] [review]
GLContext FBO render to texture with context Store/Restore

Gah.  This still terrifies me, and I would really prefer to not use it if we don't need to because it will always be fragile.  We should see if there's some reason why CopyTexSubImage is slow.

This is intended to be able to be called on the WebGL rendering context, right?  If so, it needs to save and restore the clear color, the active texture, the state of scissor and blend modes, and then needs to set explicitly every mode that could affect the glDrawArrays call -- for example, glColorMask state, whether depth test is enabled/disabled, stencil mode, etc.  Getting all that right, and maintaining it in the future, seems like it's going to be a nightmare (e.g. if we add support for an extension that affects rasterization, we'd have to explicitly set the state to something sane in this function if that extension is available).

I would almost rather take the CopyTexSubImage perf hit instead of trying to get that right.  Also, an alternative sharing approach on Android 4.0 (using Surface & SurfaceTexture) shouldn't need any of this stuff, right?
Comment 4 Oleg Romashin (:romaxa) 2012-06-20 09:49:15 PDT
Ok, I can make first version of 728524 fix with copyTexSubImage2d version for fast and safe landing. and leave a comment there that it could be faster by using draw quads
Comment 5 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-20 10:45:27 PDT
That'd be great, thanks!  Then we can iterate on this separately.
Comment 6 Jeff Gilbert [:jgilbert] 2012-07-26 16:41:16 PDT
Created attachment 646405 [details] [diff] [review]
Draw texture to framebuffer

It's nasty, but do you see anything especially bad?

It still has some more MOZ_CRASH's than it will eventually (for testing), but the main bit is the form of the methods, and all the state we save/restore.

Really, we should also add shadowing for all state this touches, as well.
Comment 7 Benoit Jacob [:bjacob] (mostly away) 2012-07-27 11:20:25 PDT
Comment on attachment 646405 [details] [diff] [review]
Draw texture to framebuffer

Review of attachment 646405 [details] [diff] [review]:
-----------------------------------------------------------------

f+ because it's only feedback, but that would be a r- ;-)

::: gfx/gl/GLContext.cpp
@@ +1780,5 @@
> +#ifdef FRAG_SHADER_SOURCE
> +#error FRAG_SHADER_SOURCE already defined!
> +#endif
> +
> +#define VERT_SHADER_SOURCE "\

r- for putting this in macros without a stated really really good reason. if the concern is about wasting relocation, and you really need this to be a global, making it a static const char[] will do. http://glandium.org/blog/?p=2361.

@@ +1809,5 @@
> +{
> +    static bool sExpectSuccess = false;
> +    bool success = false;
> +    // Use while to let us break
> +    while (!mTexBlit_Program) {

If you're using while instead of if to allow break, why not do a do{...}while instead so you wouldn't have this nontrivial while condition here?

do {

  ...

  if (foo)
    break;

  ...
} while (false);

@@ +1851,5 @@
> +        fCompileShader(mTexBlit_FragShader);
> +
> +        if (!sExpectSuccess) {
> +            GLint status = 0;
> +            fGetShaderiv(mTexBlit_VertShader, LOCAL_GL_COMPILE_STATUS, &status);

No need to check the COMPILE_STATUS since you're checking the LINK_STATUS below and the GL is allowed to defer errors until linkProgram so this check is really on a undefined condition.

@@ +1874,5 @@
> +            if (status != LOCAL_GL_TRUE)
> +                break;
> +        }
> +
> +        MOZ_ASSERT(fGetAttribLocation(mTexBlit_Program, "aPosition") == 0);

Oops! This code will vanish #ifndef DEBUG!

Also a side note, since GL getter calls are in principle slow, I'd suppose that BindAttribLocation would give you better performance. But I don't know.
Comment 8 Jeff Gilbert [:jgilbert] 2012-07-27 14:13:45 PDT
Comment on attachment 646405 [details] [diff] [review]
Draw texture to framebuffer

Review of attachment 646405 [details] [diff] [review]:
-----------------------------------------------------------------

Great, thanks for the feedback.

Do you see any state that might be missing from the save/restore pile that could affect rasterization?

I tested, and the framebufferBlit fast path works, but there's some issue with the rendering quad. Looking into it now.

::: gfx/gl/GLContext.cpp
@@ +1780,5 @@
> +#ifdef FRAG_SHADER_SOURCE
> +#error FRAG_SHADER_SOURCE already defined!
> +#endif
> +
> +#define VERT_SHADER_SOURCE "\

Interesting.

@@ +1809,5 @@
> +{
> +    static bool sExpectSuccess = false;
> +    bool success = false;
> +    // Use while to let us break
> +    while (!mTexBlit_Program) {

I suppose, but that block will end up as:

if (!condition) do {
[...]
} while (false)

I suppose that's slightly better, since it guarantees we don't loop.

Alternatively, I could do:
do{
if (condition) break;
[...]
}while(false)

@@ +1851,5 @@
> +        fCompileShader(mTexBlit_FragShader);
> +
> +        if (!sExpectSuccess) {
> +            GLint status = 0;
> +            fGetShaderiv(mTexBlit_VertShader, LOCAL_GL_COMPILE_STATUS, &status);

Fun and/or gross.

@@ +1874,5 @@
> +            if (status != LOCAL_GL_TRUE)
> +                break;
> +        }
> +
> +        MOZ_ASSERT(fGetAttribLocation(mTexBlit_Program, "aPosition") == 0);

It's supposed to, since we do indeed bind it to 0 above. This is just to make sure that the binding went through properly. Since this is DEBUG code, a few extra getters are the least of our worries, performance-wise.
Comment 9 Jeff Gilbert [:jgilbert] 2012-07-27 14:14:25 PDT
Hmm, tragically reviewing a review didn't work as I'd hoped.
Comment 10 Benoit Jacob [:bjacob] (mostly away) 2012-07-30 13:04:58 PDT
Comment on attachment 646405 [details] [diff] [review]
Draw texture to framebuffer

Review of attachment 646405 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +1824,5 @@
> +        };
> +
> +        MOZ_ASSERT(!mTexBlit_Buffer);
> +        fGenBuffers(1, &mTexBlit_Buffer);
> +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);

It doesn't seem that you are saving and restoring this buffer binding.

@@ +1827,5 @@
> +        fGenBuffers(1, &mTexBlit_Buffer);
> +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> +        fBufferData(LOCAL_GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), verts, LOCAL_GL_STATIC_DRAW);
> +
> +        fEnableVertexAttribArray(0);

It doesn't seem that you are saving/restoring this bit either?
Comment 11 Jeff Gilbert [:jgilbert] 2012-07-30 13:17:51 PDT
(In reply to Benoit Jacob [:bjacob] from comment #10)
> Comment on attachment 646405 [details] [diff] [review]
> Draw texture to framebuffer
> 
> Review of attachment 646405 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLContext.cpp
> @@ +1824,5 @@
> > +        };
> > +
> > +        MOZ_ASSERT(!mTexBlit_Buffer);
> > +        fGenBuffers(1, &mTexBlit_Buffer);
> > +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> 
> It doesn't seem that you are saving and restoring this buffer binding.
> 
> @@ +1827,5 @@
> > +        fGenBuffers(1, &mTexBlit_Buffer);
> > +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> > +        fBufferData(LOCAL_GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), verts, LOCAL_GL_STATIC_DRAW);
> > +
> > +        fEnableVertexAttribArray(0);
> 
> It doesn't seem that you are saving/restoring this bit either?

Those two are in UseTexQuadProgram(), which can be destructive for all state that's saved/restored in DrawTextureToFramebuffer, which includes ARRAY_BUFFER and the enable state of vertAttribArr 0.
Comment 12 Vladimir Vukicevic [:vlad] [:vladv] 2012-07-31 12:00:48 PDT
Hrm, this patch implements drawing a texture to a framebuffer -- we need drawing a texture to a texture.  Is the intent that we're supposed to have a FBO handy already for the destination texture?
Comment 13 Jeff Gilbert [:jgilbert] 2012-07-31 15:06:37 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #12)
> Hrm, this patch implements drawing a texture to a framebuffer -- we need
> drawing a texture to a texture.  Is the intent that we're supposed to have a
> FBO handy already for the destination texture?

The core of the problem here is 'taking a texture and drawing it to a destination'.
We can then wrap this with another function which does texture->texture blit, potentially using CopyTexSubImage to do framebuffer->texture blit.

In order of preference:
Framebuffer->framebuffer blit (BlitFramebuffers)
Framebuffer->texture blit (CopyTexSubImage)
texture->framebuffer blit (eldritch horror of state manipulation)

We only really want to do the DrawArrays blit on platforms where BlitFramebuffers is unavailable and CopyTexSubImage is (for whatever reason) slower.
Comment 14 Jeff Gilbert [:jgilbert] 2012-07-31 16:23:34 PDT
Created attachment 647754 [details] [diff] [review]
blit texture->framebuffer
Comment 15 Jeff Gilbert [:jgilbert] 2012-07-31 17:18:10 PDT
Comment on attachment 647754 [details] [diff] [review]
blit texture->framebuffer

Got a better version coming. Next one has all the {tex,fb}->{tex,fb} combos.
Comment 16 Jeff Gilbert [:jgilbert] 2012-07-31 17:24:34 PDT
Created attachment 647777 [details] [diff] [review]
blit All The Things

Alright, here we go.
I have another patch on top of this which is designed to stress-test it. Will fix up and also upload.
Comment 17 Jeff Gilbert [:jgilbert] 2012-07-31 18:43:58 PDT
Created attachment 647808 [details] [diff] [review]
test All The Blits

Aaaand, a way to test.
WFM on spider gl (cube, particles), webgl pasta, webgl conformance tests 1.0.1.
Comment 18 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-01 10:19:11 PDT
Benchmarking the draw path instead of CopyTexSubImage shows that it's categorically slower, at least on a Adreno S3.

Sample timings:

I/Gecko   (32130): Setup: 0.305203 ms, CopyTexSubImage: 2.777354 ms
I/Gecko   (32130): Setup: 0.305204 ms, CopyTexSubImage: 2.746833 ms
I/Gecko   (32130): Setup: 0.305204 ms, CopyTexSubImage: 3.052037 ms
I/Gecko   (32130): Setup: 0.183122 ms, CopyTexSubImage: 1.892263 ms
I/Gecko   (32130): Setup: 2.014345 ms, CopyTexSubImage: 2.472150 ms
I/Gecko   (32130): Setup: 0.122081 ms, CopyTexSubImage: 2.472150 ms
I/Gecko   (32130): Setup: 0.152602 ms, CopyTexSubImage: 2.441630 ms
I/Gecko   (32130): Setup: 1.983824 ms, CopyTexSubImage: 2.441630 ms
I/Gecko   (32130): Setup: 0.122082 ms, CopyTexSubImage: 2.441630 ms
I/Gecko   (32130): Setup: 3.052037 ms, CopyTexSubImage: 2.960476 ms
I/Gecko   (32130): Setup: 3.967648 ms, CopyTexSubImage: 2.533191 ms

I/Gecko   (  811): Setup: 0.091561 ms, blit: 5.676789 ms
I/Gecko   (  811): Setup: 0.122082 ms, blit: 5.859911 ms
I/Gecko   (  811): Setup: 3.967648 ms, blit: 6.226156 ms
I/Gecko   (  811): Setup: 0.122082 ms, blit: 5.798871 ms
I/Gecko   (  811): Setup: 0.091561 ms, blit: 6.165115 ms
I/Gecko   (  811): Setup: 0.305203 ms, blit: 6.317717 ms
I/Gecko   (  811): Setup: 0.122081 ms, blit: 5.707310 ms
I/Gecko   (  811): Setup: 0.122081 ms, blit: 6.043035 ms
I/Gecko   (  811): Setup: 4.089730 ms, blit: 6.195636 ms
I/Gecko   (  811): Setup: 0.122082 ms, blit: 5.676789 ms
I/Gecko   (  811): Setup: 0.122081 ms, blit: 6.043034 ms
I/Gecko   (  811): Setup: 8.087899 ms, blit: 5.615748 ms

setup is the texture bind and EGLTextureTargetLongName2D call, which I think is noise.

The draw takes roughly 2x as long as a CopyTexSubImage here.  Granted, I haven't tried this on a different phone... guess I'll do that shortly.
Comment 19 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-01 11:12:09 PDT
Okay, here's some better data.  Note that without any changes, our offscreen surface FBO is created with a GL_BGRA texture, while our EGLImage texture is GL_RGBA.

Adreno 200 on S3:        CopyTexImage: 1.8-2.6ms            draw:  5.8-7.1ms
Galaxy Nexus SGX540:     CopyTexImage: 56.6-74.5ms  !!!     draw:  0.213-0.396ms !!!
Nexus 7 Tegra 3:         CopyTexImage: 165.93ms-172.237 !!! draw:  2.5-3.3ms

If I change the offscreen surface texture to also be GL_RGBA, by passing ForceRGBA there, I get:

Adreno 200 on S3:        CopyTexImage: 1.8-2.4ms         draw:  5.6-6.3ms
Galaxy Nexus SGX540:     CopyTexImage: 56.15-67.2ms !!!  draw:  0.244-0.350ms !!!
Nexus 7 Tegra 3:         CopyTexImage: 2.0-2.1ms         draw:  2.5-3.3ms

So, Tegra has an issue with going BGRA -> RGBA; that's likely a bug and I'll report it given that the draw is equal speed.  We can fix that by just creating that surface as RGBA.  SGX540 just hates CopyTexImage and is somehow ridiculously fast at doing the draw (so fast that I wonder how they're cheating).  Adreno prefers CopyTexImage.

Note: the timings were done via basically |glFinish();  tstart = Timestamp; <op>; glFinish(); tend = Timestamp;|
Comment 20 Jeff Gilbert [:jgilbert] 2012-08-01 11:27:38 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #18)
> Benchmarking the draw path instead of CopyTexSubImage shows that it's
> categorically slower, at least on a Adreno S3.
> 
> [snip]
> 
> setup is the texture bind and EGLTextureTargetLongName2D call, which I think
> is noise.
> 
> The draw takes roughly 2x as long as a CopyTexSubImage here.  Granted, I
> haven't tried this on a different phone... guess I'll do that shortly.

We should try throwing a DiscardFramebuffer (and maybe a Clear?) in before we draw to the framebuffer, since that's supposed to help.

Other things we can do include shadowing all state we touch in the draw path, and possibly allowing for coalescing framebuffer bindings. (if that's something we're taking a penalty on).

It may also be worth checking how the paths compare when we have a dedicated framebuffer for the texture, instead of the temporary framebuffer wrapping I threw at these functions.
Comment 21 Jeff Gilbert [:jgilbert] 2012-08-01 11:32:24 PDT
Wow, yeah, SGX is really out of whack, it seems. My only thought is that they're being really lazy about drawing, possibly so much as to be out-of-spec in regards to what Finish is supposed to assure.

I recommend trying to run it many times, then do a ReadPixels, since if we can read back the pixels, we really know it's done rendering. Multiple runs per test (just divide the time by the number of runs) is probably necessary because the overhead to ReadPixels is generally so high.
Comment 22 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-01 11:40:11 PDT
(In reply to Jeff Gilbert [:jgilbert] from comment #20)
> We should try throwing a DiscardFramebuffer (and maybe a Clear?) in before
> we draw to the framebuffer, since that's supposed to help.

Yeah -- that might get back the bulk of the difference on the Adreno 200, I think.  I'd guess it won't affect Tegra, because no tiles.

> Other things we can do include shadowing all state we touch in the draw
> path, and possibly allowing for coalescing framebuffer bindings. (if that's
> something we're taking a penalty on).
> 
> It may also be worth checking how the paths compare when we have a dedicated
> framebuffer for the texture, instead of the temporary framebuffer wrapping I
> threw at these functions.

Yup to both -- though I suspect at best it'll get us back to CopyTexImage speed (assuming RGBA) on Adreno/Tegra.  It might help on SGX, assuming I fix the timing :p
Comment 23 Jeff Gilbert [:jgilbert] 2012-08-01 15:07:43 PDT
Created attachment 648103 [details] [diff] [review]
DiscardFramebuffer

DiscardFramebuffer, with helper wrapper, and fallback to Clear.
|gl.ignore-discards=true| will ignore calls to DiscardFramebuffer, so it's easier to see if this actually helps at all.
Comment 24 Benoit Jacob [:bjacob] (mostly away) 2012-08-01 15:39:39 PDT
Comment on attachment 647777 [details] [diff] [review]
blit All The Things

Review of attachment 647777 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, some questions:

::: gfx/gl/GLContext.cpp
@@ +1775,5 @@
>    }
>  }
>  
> +
> +static const char kTexBlit_VertShaderSource[] = "\

Can you remind me why this needs to be global?

::: gfx/gl/GLContext.h
@@ +839,5 @@
> +                                  const gfxIntSize& srcSize,
> +                                  const gfxIntSize& destSize);
> +    void BlitTextureToTexture(GLuint srcTex, GLuint destTex,
> +                              const gfxIntSize& srcSize,
> +                              const gfxIntSize& destSize);

Bah, it sucks that OpenGL objects are untyped so you have to name these functions differently and if we call the wrong one the compiler won't catch that mistake. But that's life.

@@ +1125,5 @@
> +    GLuint GetUserBoundFBO() {
> +#ifdef DEBUG
> +        GLuint draw = GetUserBoundDrawFBO();
> +        GLuint read = GetUserBoundReadFBO();
> +        MOZ_ASSERT(draw == read);

You could replace these 5 lines by a single:

MOZ_ASSERT(GetUserBoundDrawFBO() == GetUserBoundReadFBO());

@@ +1131,5 @@
> +        return mUserBoundDrawFBO;
> +    }
> +
> +    void BindUserFBO(GLuint name) {
> +        fBindFramebuffer(LOCAL_GL_FRAMEBUFFER, name);

Isn't this assuming a identity mapping of user id -> internal id?

For example, if the user calls BindUserFBO(0) he means the default framebuffer, which means a non-0 actual GL id, right?

@@ +2397,5 @@
>      }
>  
> +    void GetUIntegerv(GLenum pname, GLuint *params) {
> +        fGetIntegerv(pname, (GLint*)params);
> +    }

I like this. Fewer useless casts on the caller's side. I'd use reinterpret_cast, but it's no big deal.
Comment 25 Jeff Gilbert [:jgilbert] 2012-08-01 16:07:58 PDT
(In reply to Benoit Jacob [:bjacob] from comment #24)
> Comment on attachment 647777 [details] [diff] [review]
> blit All The Things
> 
> Review of attachment 647777 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry, some questions:
> 
> ::: gfx/gl/GLContext.cpp
> @@ +1775,5 @@
> >    }
> >  }
> >  
> > +
> > +static const char kTexBlit_VertShaderSource[] = "\
> 
> Can you remind me why this needs to be global?
It's only global to that file, and I have it out there because it seems cleaner to have it there than inside the function that uses it. Where would you put it?

> ::: gfx/gl/GLContext.h
> @@ +839,5 @@
> > +                                  const gfxIntSize& srcSize,
> > +                                  const gfxIntSize& destSize);
> > +    void BlitTextureToTexture(GLuint srcTex, GLuint destTex,
> > +                              const gfxIntSize& srcSize,
> > +                              const gfxIntSize& destSize);
> 
> Bah, it sucks that OpenGL objects are untyped so you have to name these
> functions differently and if we call the wrong one the compiler won't catch
> that mistake. But that's life.
Indeed, my testing code originally did this. :| Annoying to debug. We should actually probably add asserts for IsTexture and IsFramebuffer.

> @@ +1125,5 @@
> > +    GLuint GetUserBoundFBO() {
> > +#ifdef DEBUG
> > +        GLuint draw = GetUserBoundDrawFBO();
> > +        GLuint read = GetUserBoundReadFBO();
> > +        MOZ_ASSERT(draw == read);
> 
> You could replace these 5 lines by a single:
> 
> MOZ_ASSERT(GetUserBoundDrawFBO() == GetUserBoundReadFBO());
It's true.

> @@ +1131,5 @@
> > +        return mUserBoundDrawFBO;
> > +    }
> > +
> > +    void BindUserFBO(GLuint name) {
> > +        fBindFramebuffer(LOCAL_GL_FRAMEBUFFER, name);
> 
> Isn't this assuming a identity mapping of user id -> internal id?
> 
> For example, if the user calls BindUserFBO(0) he means the default
> framebuffer, which means a non-0 actual GL id, right?
Since so many people are going to blindly use fBindFramebuffer anyways, our logic for hiding the real '0' buffer is in fBindFramebuffer, which calls raw_BindFramebuffer with the one we actually want.

> @@ +2397,5 @@
> >      }
> >  
> > +    void GetUIntegerv(GLenum pname, GLuint *params) {
> > +        fGetIntegerv(pname, (GLint*)params);
> > +    }
> 
> I like this. Fewer useless casts on the caller's side. I'd use
> reinterpret_cast, but it's no big deal.
Might as well do it correctly, I suppose, now that it's not onerous to do so.
Comment 26 Benoit Jacob [:bjacob] (mostly away) 2012-08-02 15:27:08 PDT
(In reply to Jeff Gilbert [:jgilbert] from comment #25)
> (In reply to Benoit Jacob [:bjacob] from comment #24)
> > Comment on attachment 647777 [details] [diff] [review]
> > blit All The Things
> > 
> > Review of attachment 647777 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Sorry, some questions:
> > 
> > ::: gfx/gl/GLContext.cpp
> > @@ +1775,5 @@
> > >    }
> > >  }
> > >  
> > > +
> > > +static const char kTexBlit_VertShaderSource[] = "\
> > 
> > Can you remind me why this needs to be global?
> It's only global to that file, and I have it out there because it seems
> cleaner to have it there than inside the function that uses it. Where would
> you put it?

If it's only used by one function, I would make it local to that function, no?

Killer argument: this way you get the 'static' behavior by default without having to specify it!
Comment 27 Benoit Jacob [:bjacob] (mostly away) 2012-08-02 15:31:31 PDT
Comment on attachment 647777 [details] [diff] [review]
blit All The Things

Review of attachment 647777 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, some questions:

::: gfx/gl/GLContext.cpp
@@ +1840,5 @@
> +        MOZ_ASSERT(!mTexBlit_VertShader);
> +        MOZ_ASSERT(!mTexBlit_FragShader);
> +
> +        static const char* vertShaderSource = kTexBlit_VertShaderSource;
> +        static const char* fragShaderSource = kTexBlit_FragShaderSource;

1) static here at function scope means something different from the earlier file-scope static, and doesn't seem to be what you want (?) illustrating the cost of using that confusing keyword
2) Why even declare these pointers when you could instead pass the arrays directly? (Maybe with a cast, not sure).

@@ +1843,5 @@
> +        static const char* vertShaderSource = kTexBlit_VertShaderSource;
> +        static const char* fragShaderSource = kTexBlit_FragShaderSource;
> +
> +        mTexBlit_VertShader = fCreateShader(LOCAL_GL_VERTEX_SHADER);
> +        fShaderSource(mTexBlit_VertShader, 1, &vertShaderSource, nsnull);

Does this even compile? This is passing (up to cv) a char** for a char* argument

::: gfx/gl/GLContext.h
@@ +839,5 @@
> +                                  const gfxIntSize& srcSize,
> +                                  const gfxIntSize& destSize);
> +    void BlitTextureToTexture(GLuint srcTex, GLuint destTex,
> +                              const gfxIntSize& srcSize,
> +                              const gfxIntSize& destSize);

Bah, it sucks that OpenGL objects are untyped so you have to name these functions differently and if we call the wrong one the compiler won't catch that mistake. But that's life.

@@ +1125,5 @@
> +    GLuint GetUserBoundFBO() {
> +#ifdef DEBUG
> +        GLuint draw = GetUserBoundDrawFBO();
> +        GLuint read = GetUserBoundReadFBO();
> +        MOZ_ASSERT(draw == read);

You could replace these 5 lines by a single:

MOZ_ASSERT(GetUserBoundDrawFBO() == GetUserBoundReadFBO());

@@ +1131,5 @@
> +        return mUserBoundDrawFBO;
> +    }
> +
> +    void BindUserFBO(GLuint name) {
> +        fBindFramebuffer(LOCAL_GL_FRAMEBUFFER, name);

Isn't this assuming a identity mapping of user id -> internal id?

For example, if the user calls BindUserFBO(0) he means the default framebuffer, which means a non-0 actual GL id, right?

@@ +2397,5 @@
>      }
>  
> +    void GetUIntegerv(GLenum pname, GLuint *params) {
> +        fGetIntegerv(pname, (GLint*)params);
> +    }

I like this. Fewer useless casts on the caller's side. I'd use reinterpret_cast, but it's no big deal.
Comment 28 Benoit Jacob [:bjacob] (mostly away) 2012-08-02 15:41:28 PDT
Argh, the review tool re-adds all the existing review comments :-(

Also I was wrong, shadersource really wants a char**.

But there remains the other comments.
Comment 29 Jeff Gilbert [:jgilbert] 2012-08-03 15:54:13 PDT
Created attachment 648884 [details] [diff] [review]
blit code

Fixed some issues and added asserts to check that framebuffers and textures passed in really are of the correct type.
Comment 30 Jeff Gilbert [:jgilbert] 2012-08-03 15:56:08 PDT
Created attachment 648887 [details] [diff] [review]
blit code

Forgot to qref. :|
Comment 31 Jeff Gilbert [:jgilbert] 2012-08-03 16:01:02 PDT
Created attachment 648890 [details] [diff] [review]
discard framebuffer
Comment 32 Benoit Jacob [:bjacob] (mostly away) 2012-08-03 18:01:30 PDT
Comment on attachment 648887 [details] [diff] [review]
blit code

Review of attachment 648887 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, there is just enough stuff there that saying "r+ with following comments, please address before landing" wouldn't quite make sense.

::: gfx/gl/GLContext.cpp
@@ +1775,5 @@
>    }
>  }
>  
> +
> +static const char kTexBlit_VertShaderSource[] = "\

I wonder about the _ in there, but up to you.

@@ +1805,5 @@
> +    bool success = false;
> +
> +    // Use do-while(false) to let us break on failure
> +    do {
> +        if (mTexBlit_Program) {

Same comment as above about _ . I don't know how I feel about it; up to you. I see the point and wish we had a separator character that didn't look out of place in CamelCase.

@@ +1811,5 @@
> +            success = true;
> +            break;
> +        }
> +
> +        /* CCW tri-strip:

What does CCW refer to here? In a tristip, every second triangle has one orientation, while ever second triangle has the opposite orientation. Like in your diagram, which is correct, the first triangle is CCW and the second one is CW.

@@ +1826,5 @@
> +
> +        MOZ_ASSERT(!mTexBlit_Buffer);
> +        fGenBuffers(1, &mTexBlit_Buffer);
> +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> +        fBufferData(LOCAL_GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), verts, LOCAL_GL_STATIC_DRAW);

Replace 8 * sizeof float by sizeof verts.

@@ +1851,5 @@
> +        fShaderSource(mTexBlit_FragShader, 1, &fragShaderSource, nsnull);
> +        fCompileShader(mTexBlit_FragShader);
> +
> +        GLint status = 0;
> +        fGetShaderiv(mTexBlit_VertShader, LOCAL_GL_COMPILE_STATUS, &status);

As said earlier in this bug, checking compile_status before LinkProgram is futile.

@@ +1930,5 @@
> +    bool mComplete;
> +    GLuint mFB;
> +
> +public:
> +    // Assumes |gl| is current

Could you enforce this assumption by an assertion? a MOZ_ASSERT even, as failure would mean a programming mistake?

@@ +1957,5 @@
> +
> +        mGL->BindUserFBO(boundFB);
> +    }
> +
> +    // Does not assume context is current

:-/ I suppose there was a really good reason for this discrepancy between the ctor and the dtor? I can't see it.

@@ +1966,5 @@
> +        mGL->MakeCurrent();
> +        Delete();
> +    }
> +
> +    // Assumes context is current

Again, guard by assertion.

@@ +2099,5 @@
> +    if (depthTest)    fDisable(LOCAL_GL_DEPTH_TEST);
> +    if (dither)       fDisable(LOCAL_GL_DITHER);
> +    if (polyOffsFill) fDisable(LOCAL_GL_POLYGON_OFFSET_FILL);
> +    if (sampleAToC)   fDisable(LOCAL_GL_SAMPLE_ALPHA_TO_COVERAGE);
> +    if (sampleCover)  fDisable(LOCAL_GL_SAMPLE_COVERAGE);

Please explain the 3 previous ones! Teach me.

@@ +2141,5 @@
> +    if (polyOffsFill) fEnable(LOCAL_GL_POLYGON_OFFSET_FILL);
> +    if (sampleAToC)   fEnable(LOCAL_GL_SAMPLE_ALPHA_TO_COVERAGE);
> +    if (sampleCover)  fEnable(LOCAL_GL_SAMPLE_COVERAGE);
> +    if (scissor)      fEnable(LOCAL_GL_SCISSOR_TEST);
> +    if (stencil)      fEnable(LOCAL_GL_STENCIL_TEST);

I think that it would be worth having a RAII helper for IsEnabled...Disable...Enable. So you could replace these 3 lines by

{
  RaiiGLDisable raiiDisableBlend(LOCAL_GL_BLEND);

  ...
}

Could use that in similar code in GLContext::ClearSafely.
Comment 33 Benoit Jacob [:bjacob] (mostly away) 2012-08-03 18:04:21 PDT
Also: wouldn't it make sense for this stuff to move to a new .cpp file?
Comment 34 Benoit Jacob [:bjacob] (mostly away) 2012-08-03 18:52:42 PDT
Comment on attachment 647808 [details] [diff] [review]
test All The Blits

Review of attachment 647808 [details] [diff] [review]:
-----------------------------------------------------------------

We have a highly nontrivial problem to solve here: how to implement this test in a way that doesn't bloat the libxul that we ship to users?

Bloat takes 2 forms: making these highly central source files, GLContext.*, even bigger and more complex; and adding to the resulting binary size.

The former should be solvable by moving the test code to a separate .cpp file, so the modifications needed in the existing files would be minimal (maybe move this into a new class outside of class GLContext so the only modification you'd need in GLContext.h would be a friend declaration)?

The latter could be solved by making this #if defined(ENABLE_TESTS) && defined(DEBUG). Just ENABLE_TESTS is _probably_ not enough as I think we do enable tests on the builds that we ship - but I could be wrong.

No matter what you decide, please update Kyle's thread "The sad state of C++ unit testing in Mozilla". The idea he proposes there, to link a second libxul with all symbols exported, sounds like it would work; your use case would add more support for it.
Comment 35 Benoit Jacob [:bjacob] (mostly away) 2012-08-03 18:57:55 PDT
So maybe just pushing Kyle's proposal and waiting for it to be implemented is the best thing to do here. Up to you.
Comment 36 Jeff Gilbert [:jgilbert] 2012-08-09 13:43:07 PDT
(In reply to Benoit Jacob [:bjacob] from comment #32)
> Comment on attachment 648887 [details] [diff] [review]
> blit code
> 
> Review of attachment 648887 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry, there is just enough stuff there that saying "r+ with following
> comments, please address before landing" wouldn't quite make sense.
> 
> ::: gfx/gl/GLContext.cpp
> @@ +1775,5 @@
> >    }
> >  }
> >  
> > +
> > +static const char kTexBlit_VertShaderSource[] = "\
> 
> I wonder about the _ in there, but up to you.
I feel the separation makes it clear that this is the Source for the Vert Shader, for use by the Texture-Blitting code. Basically, it's close to namespacing without actually doing it. Since we have other vert shader sources floating around, I think it makes it clear what it's for.

> @@ +1805,5 @@
> > +    bool success = false;
> > +
> > +    // Use do-while(false) to let us break on failure
> > +    do {
> > +        if (mTexBlit_Program) {
> 
> Same comment as above about _ . I don't know how I feel about it; up to you.
> I see the point and wish we had a separator character that didn't look out
> of place in CamelCase.
IMO it doesn't look that out of place, but I'm used to it, I suppose.

> 
> @@ +1811,5 @@
> > +            success = true;
> > +            break;
> > +        }
> > +
> > +        /* CCW tri-strip:
> 
> What does CCW refer to here? In a tristip, every second triangle has one
> orientation, while ever second triangle has the opposite orientation. Like
> in your diagram, which is correct, the first triangle is CCW and the second
> one is CW.
That's one way to think of it, but even though the order of the vert indexes is CW instead of CCW, it still counts as CCW for the purposes of face culling. (Which we disable, but I'd rather be really explicit, and use CCW since it's more common)

Technically, every 2nd tri is still CCW, but the tri is (n-1, n-2, n) instead of (n-2, n-1, n) like normal.
> 
> @@ +1826,5 @@
> > +
> > +        MOZ_ASSERT(!mTexBlit_Buffer);
> > +        fGenBuffers(1, &mTexBlit_Buffer);
> > +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> > +        fBufferData(LOCAL_GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), verts, LOCAL_GL_STATIC_DRAW);
> 
> Replace 8 * sizeof float by sizeof verts.
Ok. I don't really like doing that, since if verts becomes a pointer later (instead of an array), this breaks. I suppose it's better than having |8| floating in there, though.

> @@ +1851,5 @@
> > +        fShaderSource(mTexBlit_FragShader, 1, &fragShaderSource, nsnull);
> > +        fCompileShader(mTexBlit_FragShader);
> > +
> > +        GLint status = 0;
> > +        fGetShaderiv(mTexBlit_VertShader, LOCAL_GL_COMPILE_STATUS, &status);
> 
> As said earlier in this bug, checking compile_status before LinkProgram is
> futile.
How about doing it in DEBUG? I originally had both in case the compiler spit out a useful error. (Which I appear to have forgotten to add the code for) I would imagine that if there was an implementation that did non-deferred compilation, the linker might only give an error about one of the shaders being invalid?

> @@ +1930,5 @@
> > +    bool mComplete;
> > +    GLuint mFB;
> > +
> > +public:
> > +    // Assumes |gl| is current
> 
> Could you enforce this assumption by an assertion? a MOZ_ASSERT even, as
> failure would mean a programming mistake?
Alright, I can add a mechanism for this.

> @@ +1957,5 @@
> > +
> > +        mGL->BindUserFBO(boundFB);
> > +    }
> > +
> > +    // Does not assume context is current
> 
> :-/ I suppose there was a really good reason for this discrepancy between
> the ctor and the dtor? I can't see it.
Since this is RAII, if we're ever playing with multiple contexts, a different context might be current when the object pops out of scope. (OGL Layers, for instance)
I suppose the better way to do this is to assert that the context *is* current, and force users to scope properly.

> @@ +1966,5 @@
> > +        mGL->MakeCurrent();
> > +        Delete();
> > +    }
> > +
> > +    // Assumes context is current
> 
> Again, guard by assertion.
Ok.

> @@ +2099,5 @@
> > +    if (depthTest)    fDisable(LOCAL_GL_DEPTH_TEST);
> > +    if (dither)       fDisable(LOCAL_GL_DITHER);
> > +    if (polyOffsFill) fDisable(LOCAL_GL_POLYGON_OFFSET_FILL);
> > +    if (sampleAToC)   fDisable(LOCAL_GL_SAMPLE_ALPHA_TO_COVERAGE);
> > +    if (sampleCover)  fDisable(LOCAL_GL_SAMPLE_COVERAGE);
> 
> Please explain the 3 previous ones! Teach me.
To be honest, I've never used them. WebGL exposes them, though, and they seem to be able to change how rendering works, which we don't want.

> 
> @@ +2141,5 @@
> > +    if (polyOffsFill) fEnable(LOCAL_GL_POLYGON_OFFSET_FILL);
> > +    if (sampleAToC)   fEnable(LOCAL_GL_SAMPLE_ALPHA_TO_COVERAGE);
> > +    if (sampleCover)  fEnable(LOCAL_GL_SAMPLE_COVERAGE);
> > +    if (scissor)      fEnable(LOCAL_GL_SCISSOR_TEST);
> > +    if (stencil)      fEnable(LOCAL_GL_STENCIL_TEST);
> 
> I think that it would be worth having a RAII helper for
> IsEnabled...Disable...Enable. So you could replace these 3 lines by
> 
> {
>   RaiiGLDisable raiiDisableBlend(LOCAL_GL_BLEND);
> 
>   ...
> }
> 
> Could use that in similar code in GLContext::ClearSafely.
Agreed.
Comment 37 Jeff Gilbert [:jgilbert] 2012-08-09 16:08:15 PDT
Created attachment 650712 [details] [diff] [review]
blit code

Now with 200% more RAII.
Comment 38 Benoit Jacob [:bjacob] (mostly away) 2012-08-10 11:03:36 PDT
(In reply to Jeff Gilbert [:jgilbert] from comment #36)
> (In reply to Benoit Jacob [:bjacob] from comment #32)
> > Comment on attachment 648887 [details] [diff] [review]
> > blit code
> > 
> > Review of attachment 648887 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > Sorry, there is just enough stuff there that saying "r+ with following
> > comments, please address before landing" wouldn't quite make sense.
> > 
> > ::: gfx/gl/GLContext.cpp
> > @@ +1775,5 @@
> > >    }
> > >  }
> > >  
> > > +
> > > +static const char kTexBlit_VertShaderSource[] = "\
> > 
> > I wonder about the _ in there, but up to you.
> I feel the separation makes it clear that this is the Source for the Vert
> Shader, for use by the Texture-Blitting code. Basically, it's close to
> namespacing without actually doing it. Since we have other vert shader
> sources floating around, I think it makes it clear what it's for.

OK

> 
> Technically, every 2nd tri is still CCW, but the tri is (n-1, n-2, n)
> instead of (n-2, n-1, n) like normal.
> > 
> > @@ +1826,5 @@
> > > +
> > > +        MOZ_ASSERT(!mTexBlit_Buffer);
> > > +        fGenBuffers(1, &mTexBlit_Buffer);
> > > +        fBindBuffer(LOCAL_GL_ARRAY_BUFFER, mTexBlit_Buffer);
> > > +        fBufferData(LOCAL_GL_ARRAY_BUFFER, 8 * sizeof(GLfloat), verts, LOCAL_GL_STATIC_DRAW);
> > 
> > Replace 8 * sizeof float by sizeof verts.
> Ok. I don't really like doing that, since if verts becomes a pointer later
> (instead of an array), this breaks. I suppose it's better than having |8|
> floating in there, though.

Indeed, so the safe way to write code like this would be to create an array with an explicit length and refer to it:

const size_t array_size = 8; // technically, const may not be constant enough;
  // you might have to make this an enum value.
float verts[array_size] = { ... ... };
fBufferData ( ... , array_size * sizeof(verts[0]), ... );

We do have a NS_ARRAY_LENGTH macro that claims to be the right thing to use here,
http://mxr.mozilla.org/mozilla-central/source/xpcom/glue/nsMemory.h#124
but it will fall apart on a plain pointer in the same way as you point out about the approach discussed above.

> 
> > @@ +1851,5 @@
> > > +        fShaderSource(mTexBlit_FragShader, 1, &fragShaderSource, nsnull);
> > > +        fCompileShader(mTexBlit_FragShader);
> > > +
> > > +        GLint status = 0;
> > > +        fGetShaderiv(mTexBlit_VertShader, LOCAL_GL_COMPILE_STATUS, &status);
> > 
> > As said earlier in this bug, checking compile_status before LinkProgram is
> > futile.
> How about doing it in DEBUG? I originally had both in case the compiler spit
> out a useful error. (Which I appear to have forgotten to add the code for) I
> would imagine that if there was an implementation that did non-deferred
> compilation, the linker might only give an error about one of the shaders
> being invalid?

I wasn't suggesting avoiding checking the COMPILE_STATUS of individual shaders, I was just saying: first call linkProgram, then check individual shaders' COMPILE_STATUS.

> Since this is RAII, if we're ever playing with multiple contexts, a
> different context might be current when the object pops out of scope. (OGL
> Layers, for instance)
> I suppose the better way to do this is to assert that the context *is*
> current, and force users to scope properly.

I would prefer that, yes.

> 
> > @@ +1966,5 @@
> > > +        mGL->MakeCurrent();
> > > +        Delete();
> > > +    }
> > > +
> > > +    // Assumes context is current
> > 
> > Again, guard by assertion.
> Ok.
> 
> > @@ +2099,5 @@
> > > +    if (depthTest)    fDisable(LOCAL_GL_DEPTH_TEST);
> > > +    if (dither)       fDisable(LOCAL_GL_DITHER);
> > > +    if (polyOffsFill) fDisable(LOCAL_GL_POLYGON_OFFSET_FILL);
> > > +    if (sampleAToC)   fDisable(LOCAL_GL_SAMPLE_ALPHA_TO_COVERAGE);
> > > +    if (sampleCover)  fDisable(LOCAL_GL_SAMPLE_COVERAGE);
> > 
> > Please explain the 3 previous ones! Teach me.
> To be honest, I've never used them. WebGL exposes them, though, and they
> seem to be able to change how rendering works, which we don't want.

OK. If this isenabled/enable/disable business is not too expensive, maybe it's not worth worrying.
Comment 39 Benoit Jacob [:bjacob] (mostly away) 2012-08-10 11:29:41 PDT
Comment on attachment 650712 [details] [diff] [review]
blit code

Review of attachment 650712 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +1852,5 @@
> +        fShaderSource(mTexBlit_FragShader, 1, &fragShaderSource, nsnull);
> +        fCompileShader(mTexBlit_FragShader);
> +
> +        GLint status = 0;
> +#ifdef DEBUG

I'm not a fan of large #ifdef'd blocks in the middle of a function body, but it's OK. Altenatively, you could consider either moving this to a separate function; or not having the #ifdef at all and always doing this check.

@@ +1969,5 @@
> +        mTexBlit_Program = 0;
> +    }
> +}
> +
> +struct GLWrapper

Would it be more explicit to have Scoped or Raii in the name?

This is used below to do two very different things:
 - wrappers for scoped GL objects, like FramebufferTextureWrapper
 - Raii state restorers like StateWrapper

Maybe the best "common denominator" between these would be Raii or Scoped rather than Wrapper?

@@ +1978,5 @@
> +private:
> +    bool mIsUnwrapped;
> +
> +protected:
> +    // Use |enable = false| for to disable.

syntax error in above comment. Also, what's |enable| ?

@@ +1986,5 @@
> +    {
> +        MOZ_ASSERT(mGL->IsCurrent());
> +    }
> +
> +    virtual ~GLWrapper() {

Just a remark: you really don't need this to be virtual, since you'll never have a situation where you have at runtime a GLWrapper base class object without knowing its derived type. Instead, you could have gone for a CRTP here.

But, the cost of the virtual function call is not a problem here, and the convenience is worth it: it's cleaner code than the CRTP version.

@@ +1991,5 @@
> +        if (!mIsUnwrapped)
> +            Unwrap();
> +    }
> +
> +    virtual void Unwrap_Impl() = 0;

Is the _ really needed here?

@@ +1994,5 @@
> +
> +    virtual void Unwrap_Impl() = 0;
> +
> +public:
> +    void Unwrap() {

You don't seem to have any code calling Unwrap directly. So is this only a remnant from an earlier approach? If you could remove it, that would allow to rename Unwrap_Impl to Unwrap and call it directly from the dtor.
Comment 40 Jeff Gilbert [:jgilbert] 2012-08-10 15:35:23 PDT
(In reply to Benoit Jacob [:bjacob] from comment #39)
> Comment on attachment 650712 [details] [diff] [review]
> blit code
> 
> Review of attachment 650712 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLContext.cpp
> @@ +1852,5 @@
> > +        fShaderSource(mTexBlit_FragShader, 1, &fragShaderSource, nsnull);
> > +        fCompileShader(mTexBlit_FragShader);
> > +
> > +        GLint status = 0;
> > +#ifdef DEBUG
> 
> I'm not a fan of large #ifdef'd blocks in the middle of a function body, but
> it's OK. Altenatively, you could consider either moving this to a separate
> function; or not having the #ifdef at all and always doing this check.
How about if (DebugMode())? This is always false on Opt builds, but logic-flow-wise is still normal (non-ifdef) code.

> 
> @@ +1969,5 @@
> > +        mTexBlit_Program = 0;
> > +    }
> > +}
> > +
> > +struct GLWrapper
> 
> Would it be more explicit to have Scoped or Raii in the name?
> 
> This is used below to do two very different things:
>  - wrappers for scoped GL objects, like FramebufferTextureWrapper
>  - Raii state restorers like StateWrapper
> 
> Maybe the best "common denominator" between these would be Raii or Scoped
> rather than Wrapper?
'ScopedGLWrapper'?

> 
> @@ +1978,5 @@
> > +private:
> > +    bool mIsUnwrapped;
> > +
> > +protected:
> > +    // Use |enable = false| for to disable.
> 
> syntax error in above comment. Also, what's |enable| ?
> 
> @@ +1986,5 @@
> > +    {
> > +        MOZ_ASSERT(mGL->IsCurrent());
> > +    }
> > +
> > +    virtual ~GLWrapper() {
> 
> Just a remark: you really don't need this to be virtual, since you'll never
> have a situation where you have at runtime a GLWrapper base class object
> without knowing its derived type. Instead, you could have gone for a CRTP
> here.
> 
> But, the cost of the virtual function call is not a problem here, and the
> convenience is worth it: it's cleaner code than the CRTP version.
This is cleaner. I doubt this will be a perf issue, but if it ends up being such, we can fix it.

> 
> @@ +1991,5 @@
> > +        if (!mIsUnwrapped)
> > +            Unwrap();
> > +    }
> > +
> > +    virtual void Unwrap_Impl() = 0;
> 
> Is the _ really needed here?
> 
> @@ +1994,5 @@
> > +
> > +    virtual void Unwrap_Impl() = 0;
> > +
> > +public:
> > +    void Unwrap() {
> 
> You don't seem to have any code calling Unwrap directly. So is this only a
> remnant from an earlier approach? If you could remove it, that would allow
> to rename Unwrap_Impl to Unwrap and call it directly from the dtor.

The unwrap implementation (here Unwrap_Impl) should only be called once. Normally, we let this happen by letting the object fall out of scope, which runs the dtor, which would call Unwrap. However, if we want to allow for users to call Unwrap 'early' (before it falls out of scope), we don't want to call unwrap in the dtor. This is the easiest way to give this functionality to this class's children without requiring anything of them. We can keep 'mIsUnwrapped' private, because no one else needs to mess with it, and allow the user to trigger unwrap either implicitly (dtor) or explicitly (Unwrap).
Comment 41 Jeff Gilbert [:jgilbert] 2012-08-10 15:40:07 PDT
(In reply to Benoit Jacob [:bjacob] from comment #39)
> Comment on attachment 650712 [details] [diff] [review]
> blit code
> 
> Review of attachment 650712 [details] [diff] [review]:
> -----------------------------------------------------------------
> @@ +1991,5 @@
> > +        if (!mIsUnwrapped)
> > +            Unwrap();
> > +    }
> > +
> > +    virtual void Unwrap_Impl() = 0;
> 
> Is the _ really needed here?

I like it, and I thought MakeCurrentImpl used it, but I just checked, and it's 'MakeCurrentImpl'. In my mind, 'MakeCurrentImpl' seems like it could mean 'make-current an implementation', instead of 'implementation of MakeCurrent'. In my mind, the underscore serves to separate these two parts. Likewise 'UnwrapImpl' would appear to 'unwrap an implementation', instead of being 'an implementation of unwrap'.

MakeCurrentImpl has precedence, though, so I suppose we should go with UnwrapImpl.
Comment 42 Jeff Gilbert [:jgilbert] 2012-08-11 01:39:47 PDT
Created attachment 651109 [details] [diff] [review]
blit code w/crtp-raii

I know you r+'d the previous one, but here's a new one with RAII via CRTP, since RAII via virtuals doesn't work. (oops)
Comment 43 Jeff Gilbert [:jgilbert] 2012-08-11 02:08:57 PDT
Created attachment 651112 [details] [diff] [review]
test code

Here's the update for the blit test code with everything moved to the new file. I don't like using ifdefs to cordon off code that won't run anyways, since if I ever do want to use it, (temporarily) it can be a pain to re-enable.

We do really need this to be a unit test, though, and not an opt-in debug-only test. I'll work on converting it to such, so when (not if!) we get unit tests working, we'll be able to drop this in.
Comment 44 Benoit Jacob [:bjacob] (mostly away) 2012-08-11 08:08:04 PDT
(In reply to Jeff Gilbert [:jgilbert] from comment #42)
> Created attachment 651109 [details] [diff] [review]
> blit code w/crtp-raii
> 
> I know you r+'d the previous one, but here's a new one with RAII via CRTP,
> since RAII via virtuals doesn't work. (oops)

Why didn't it work?
Comment 45 Benoit Jacob [:bjacob] (mostly away) 2012-08-11 08:11:36 PDT
> > I'm not a fan of large #ifdef'd blocks in the middle of a function body, but
> > it's OK. Altenatively, you could consider either moving this to a separate
> > function; or not having the #ifdef at all and always doing this check.
> How about if (DebugMode())? This is always false on Opt builds, but
> logic-flow-wise is still normal (non-ifdef) code.


Why not!

> 
> > 
> > @@ +1969,5 @@
> > > +        mTexBlit_Program = 0;
> > > +    }
> > > +}
> > > +
> > > +struct GLWrapper
> > 
> > Would it be more explicit to have Scoped or Raii in the name?
> > 
> > This is used below to do two very different things:
> >  - wrappers for scoped GL objects, like FramebufferTextureWrapper
> >  - Raii state restorers like StateWrapper
> > 
> > Maybe the best "common denominator" between these would be Raii or Scoped
> > rather than Wrapper?
> 'ScopedGLWrapper'?

Whatever sounds good to you given my suggestions -- you're the native English speaker here.


> The unwrap implementation (here Unwrap_Impl) should only be called once.
> Normally, we let this happen by letting the object fall out of scope, which
> runs the dtor, which would call Unwrap. However, if we want to allow for
> users to call Unwrap 'early' (before it falls out of scope), we don't want
> to call unwrap in the dtor. This is the easiest way to give this
> functionality to this class's children without requiring anything of them.
> We can keep 'mIsUnwrapped' private, because no one else needs to mess with
> it, and allow the user to trigger unwrap either implicitly (dtor) or
> explicitly (Unwrap).

OK. I was just noting that this feature (allowing to unwrap early) didn't seem to be used in the current state of your patch.
Comment 46 Benoit Jacob [:bjacob] (mostly away) 2012-08-11 08:12:28 PDT
(In reply to Jeff Gilbert [:jgilbert] from comment #41)
> (In reply to Benoit Jacob [:bjacob] from comment #39)
> > Comment on attachment 650712 [details] [diff] [review]
> > blit code
> > 
> > Review of attachment 650712 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > @@ +1991,5 @@
> > > +        if (!mIsUnwrapped)
> > > +            Unwrap();
> > > +    }
> > > +
> > > +    virtual void Unwrap_Impl() = 0;
> > 
> > Is the _ really needed here?
> 
> I like it, and I thought MakeCurrentImpl used it, but I just checked, and
> it's 'MakeCurrentImpl'. In my mind, 'MakeCurrentImpl' seems like it could
> mean 'make-current an implementation', instead of 'implementation of
> MakeCurrent'. In my mind, the underscore serves to separate these two parts.
> Likewise 'UnwrapImpl' would appear to 'unwrap an implementation', instead of
> being 'an implementation of unwrap'.

Understood; that makes sense.

> 
> MakeCurrentImpl has precedence, though, so I suppose we should go with
> UnwrapImpl.

Up to you!
Comment 47 Benoit Jacob [:bjacob] (mostly away) 2012-08-15 15:32:05 PDT
Comment on attachment 651109 [details] [diff] [review]
blit code w/crtp-raii

Review of attachment 651109 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.h
@@ +3284,5 @@
> +        MOZ_ASSERT(&Derived::UnwrapImpl);
> +        MOZ_ASSERT(mGL->IsCurrent());
> +    }
> +
> +    virtual ~ScopedGLWrapper() {

Now that this is templatized, there really is no reason for virtual at all anymore, right?
Comment 48 Jeff Gilbert [:jgilbert] 2012-08-15 15:48:13 PDT
(In reply to Benoit Jacob [:bjacob] from comment #47)
> Comment on attachment 651109 [details] [diff] [review]
> blit code w/crtp-raii
> 
> Review of attachment 651109 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLContext.h
> @@ +3284,5 @@
> > +        MOZ_ASSERT(&Derived::UnwrapImpl);
> > +        MOZ_ASSERT(mGL->IsCurrent());
> > +    }
> > +
> > +    virtual ~ScopedGLWrapper() {
> 
> Now that this is templatized, there really is no reason for virtual at all
> anymore, right?

It's true. Virtual dtors are sort of a reflex for me. I'll de-virtualize them, since they're technically not needed here.
Comment 49 Benoit Jacob [:bjacob] (mostly away) 2012-08-21 14:06:36 PDT
Comment on attachment 651109 [details] [diff] [review]
blit code w/crtp-raii

Review of attachment 651109 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.h
@@ +1123,5 @@
>          mInInternalBindingMode_ReadFBO = false;
>  #endif
>      }
>  
> +    GLuint GetUserBoundFBO() {

In the identifier GetUserBoundFBO, User is before Bound. But in BindUserFBO, it is after. Suggest harmonizing this.
Comment 50 Jeff Gilbert [:jgilbert] 2012-08-21 15:40:25 PDT
I'll split the test code and discardFramebuffer stuff to separate bugs.
Comment 52 Ed Morley [:emorley] 2012-08-22 02:44:19 PDT
https://hg.mozilla.org/mozilla-central/rev/d6a5f82cc3fc
Comment 53 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-22 13:19:17 PDT
For the discard framebuffer bit -- I don't think it's being used correctly.  I think you want to discard immediately after doing the read from it as well as where you're doing it, because that saves you from needing to resolve it back into system memory.  But maybe we're not hitting that at all, since we won't be modifying it, just doing a read... and maybe what's here is enough?
Comment 54 Jeff Gilbert [:jgilbert] 2012-08-22 14:14:30 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #53)
> For the discard framebuffer bit -- I don't think it's being used correctly. 
> I think you want to discard immediately after doing the read from it as well
> as where you're doing it, because that saves you from needing to resolve it
> back into system memory.  But maybe we're not hitting that at all, since we
> won't be modifying it, just doing a read... and maybe what's here is enough?

Ah, yeah, doing it before unbinding makes more sense than doing it before next use. Let's take discussion of this to bug 781084, though.
Comment 55 Vladimir Vukicevic [:vlad] [:vladv] 2013-09-20 09:31:21 PDT
Comment on attachment 648890 [details] [diff] [review]
discard framebuffer

Clearing ancient review request (sorry!).. if this is still relevant, r? again pls.

Note You need to log in before you can comment on or make changes to this bug.