can't depend on REPEAT for GL textures without npot texture support

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: vlad, Assigned: vlad)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 2 obsolete attachments)

Turns out other than requiring LINEAR/NEAREST filter modes, CLAMP_TO_EDGE is the only allowed wrap mode for non-power-of-two textures.  The PowerVR chips at least don't support the ES npot extension that relaxes these limits, which is why currently they just render black.  Argh.

We'll have to do some work in the shader I guess, see how expensive a mod() or other type of operation is to normalize our texcoords ourselves.
So I think the right way to do this is to draw 4 quads instead of just 1.  Since we can do so in a single draw call, this should have extremely minimal overhead over drawing just 1; basically just the cost of setting up a client-side texcoord/vertex pointer.
And I forgot, we already do that, because we need custom texcoords.  So we can just replace it and not depend on REPEAT at all.
I agree, that sounds like the right solution.
So, this works, according to my testing; I'll attach my test html file, usable with -chrome.

However, I realized that this approach is basically guaranteed to cause seams when we have fennec's dpi-scaling involved with a rotated buffer.  Maybe that's ok, and it won't be present often enough for it to be an issue?
Assignee: nobody → vladimir
Attachment #488999 - Flags: review?(jmuizelaar)
Comment on attachment 488999 [details] [diff] [review]
remove need for REPEAT when rendering rotated textures

Cancelling review, moving to feedback instead.

This works with my testcase, but fails in fennec -- opening up the prefs pane and scrolling up and down has severe artifacts.

Also, the seam issue means that at the very least we should keep using REPEAT when it's available instead of always decomposing into non-wrapping rects.
Attachment #488999 - Flags: review?(jmuizelaar) → feedback?(jmuizelaar)
Working version.  I'm going to add back support for REPEAT as an additional patch on top of this (could do it the other way around, but I already have this...)
Attachment #488999 - Attachment is obsolete: true
Attachment #489291 - Flags: review?(jmuizelaar)
Attachment #488999 - Flags: feedback?(jmuizelaar)
Blocks: 602523
Why can't we just do the repeat in the pixel shader? Performance?
Yep -- we'd have to do the mod() operation in the fragment shader, thus per pixel, and doing texture2D(tex, mod(v, 1.0)) costs 12 cycles compared to 1 for texture2D(tex, v).
Comment on attachment 489291 [details] [diff] [review]
remove need for REPEAT when rendering rotated textures (updated)

Here's part of a review.

> 
>     fActiveTexture(LOCAL_GL_TEXTURE0);
>     fBindTexture(LOCAL_GL_TEXTURE_2D, aSrc->Texture());
>@@ -1142,7 +1137,7 @@ GLContext::BlitTextureImage(TextureImage
> 
>     DEBUG_GL_ERROR_CHECK(this);
> 
>-    fDrawArrays(LOCAL_GL_TRIANGLE_STRIP, 0, 4);
>+    fDrawArrays(LOCAL_GL_TRIANGLES, 0, rects.numRects * 6);

How about adding a comment like:
/* Using TRIANGLES instead of TRIANGLES_STRIP makes it easier to draw multiple
rectangles and we're lazy */


>+    *v++ = x0; *v++ = y1;
>+    *v++ = x1; *v++ = y0;
>+    *v++ = x1; *v++ = y1;
>+
>+    *t++ = tx0; *t++ = ty0;
>+    *t++ = tx1; *t++ = ty0;
>+    *t++ = tx0; *t++ = ty1;
>+
>+    *t++ = tx0; *t++ = ty1;
>+    *t++ = tx1; *t++ = ty0;
>+    *t++ = tx1; *t++ = ty1;
>+
>+    v = &vertexCoords[numRects*6*2];
>+    t = &texCoords[numRects*6*2];

I don't see a purpose for these assignments to v and t.

>diff --git a/gfx/thebes/GLContext.h b/gfx/thebes/GLContext.h
>--- a/gfx/thebes/GLContext.h
>+++ b/gfx/thebes/GLContext.h
>@@ -690,6 +690,36 @@ public:
>     void BlitTextureImage(TextureImage *aSrc, const nsIntRect& aSrcRect,
>                           TextureImage *aDst, const nsIntRect& aDstRect);
> 
>+    /** Helper for DecomposeIntoNoRepeatTriangles
>+     */
>+    struct RectTriangles {
>+        RectTriangles() : numRects(0) { }
>+
>+        void addRect(GLfloat x0, GLfloat y0, GLfloat x1, GLfloat y1,
>+                     GLfloat tx0, GLfloat ty0, GLfloat tx1, GLfloat ty1);
>+
>+        int numRects;
>+        GLfloat vertexCoords[4*6*2];
>+        GLfloat texCoords[4*6*2];

A comment like:
/* 4 rectangles made of 2 triangles with 3 vertices with 2 coordinates */

Using 3 * 2 instead of 6 is more intuitive to me.

>+    };
>+
>+    /**
>+     * Decompose drawing the possibly-wrapped aTexCoordRect rectangle
>+     * of a texture of aTexSize into one or more rectangles (represented
>+     * as 2 triangles) and associated tex coordinates, such that
>+     * we don't have to use the REPEAT wrap mode.
>+     *
>+     * The resulting triangle vertex coordinates will be in the space of
>+     * (0.0, 0.0) to (1.0, 1.0) -- transform the coordinates appropriately
>+     * if you need a different space.
>+     * The optional aDestRect parameter gives the destination rectangle
>+     * whose vertex coords will be generated.  If not given, it is
>+     * assumed to be the rectangle from (0.0, 0.0) to (1.0, 1.0).
>+     */

Looks like aDestRect doesn't exist.
The comment also doesn't really make it clear that it only supports repeating a texel a single time. i.e you can't use this for tiling.
(In reply to comment #8)
> Yep -- we'd have to do the mod() operation in the fragment shader, thus per
> pixel, and doing texture2D(tex, mod(v, 1.0)) costs 12 cycles compared to 1 for
> texture2D(tex, v).

Couldn't we just do something like frac()? Would that be equally as slow? What if we did bit operations, abusing our knowledge that we're using IEEE 754 float?
bit operations in a fragment shader? :-)

Regardless, doing this math once per draw operation than W*H times per pixel is going to be a win regardless of what we do -- doing nothing in the fragment shader is always going to be faster than doing something.

Jeff, will fix -- some of those bits are leftover from debug code (like the v/t assignments) that I missed when removing it.  Comment fixes are good too, will add.
adds back REPEAT when it's available.  Largely mechanical changes needed to pass the wrap mode through and make it available in the TextureImage struct.
Attachment #489708 - Flags: review?(jmuizelaar)
(Didn't add the lazy comment, instead documented the Decompose function to say that the points should be drawn with GL_TRIANGLES -- to do this using tristrips we'd have to do some degenerate triangles, so I didn't bother; I figure the vertex cache should win in triangles just as well in this case.)
Comment on attachment 489291 [details] [diff] [review]
remove need for REPEAT when rendering rotated textures (updated)


>+    // texture coordinates.
>+
>+    bool xwrap = false, ywrap = false;
>+    if (tcr.x < 0 || tcr.x > aTexSize.width ||
>+        tcr.XMost() < 0 || tcr.XMost() > aTexSize.width)
>+    {
>+        xwrap = true;
>+        tl[0] = fmodf(tl[0], 1.0f);
>+        br[0] = fmodf(br[0], 1.0f);
>+    }
>+
>+    if (tcr.y < 0 || tcr.y > aTexSize.height ||
>+        tcr.YMost() < 0 || tcr.YMost() > aTexSize.height)
>+    {
>+        ywrap = true;
>+        tl[1] = fmodf(tl[1], 1.0f);
>+        br[1] = fmodf(br[1], 1.0f);
>+    }


As I understand it, tl and br should at this point contain numbers between 0..1. However if, for example,
tcr.x is negative this won't be true.

It's probably worth asserting this condition.

>+    // there isn't a 1:1 mapping between tex coords and destination coords;
>+    // when computing midpoints, we have to take that into account.
>+    GLfloat xs = GLfloat(aTexSize.width) / GLfloat(aTexCoordRect.width);
>+    GLfloat ys = GLfloat(aTexSize.height) / GLfloat(aTexCoordRect.height);
>+
>+    // If xwrap is false, the texture will be sampled from tl[0]
>+    // .. br[0].  If xwrap is true, then it will be split into tl[0]
>+    // .. 1.0, and 0.0 .. br[0].  Same for the Y axis.  The
>+    // destination rectangle is also split appropriately, according
>+    // to the calculated xmid/ymid values.
>+
>+    if (!xwrap && !ywrap) {
>+        aRects.addRect(0.0f, 0.0f, 1.0f, 1.0f,
>+                       tl[0], tl[1], br[0], br[1]);
>+    } else if (!xwrap && ywrap) {
>+        GLfloat ymid = (1.0f - tl[1]) * ys;

if aTexSize.width = 1000 and aTexCoordRect.width = 2  won't this produce a ymid that is not between 0..1?
Attachment #489291 - Flags: review?(jmuizelaar) → review-
Attachment #489708 - Flags: review?(jmuizelaar) → review+
Posted patch part 1, fixedSplinter Review
that is a good point, the x/y scale stuff was wrong.  fixed here, with better comments. also fixed fmodf().
Attachment #489291 - Attachment is obsolete: true
Attachment #489908 - Flags: review?(jmuizelaar)
Attachment #489908 - Flags: review?(jmuizelaar) → review+
Duplicate of this bug: 602523
You need to log in before you can comment on or make changes to this bug.