If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Fix TextureImage format handling

NEW
Unassigned

Status

()

Core
Graphics
6 years ago
5 years ago

People

(Reporter: bjacob, Unassigned)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(3 attachments, 6 obsolete attachments)

Running with the current patch queue,
http://hg.mozilla.org/users/florian.haenel_heeen.de/GLES_layers/

I get a GL_INVALID_OPERATION with this trace:

#4  0x84c8273c in mozilla::gl::GLContext::AfterGLCall (this=0x48845c00, 
    glFunction=0x86a93288 "void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei, GLenum, GLenum, const GLvoid*)") at ../../../dist/include/GLContext.h:1110
#5  0x84cb7ec8 in mozilla::gl::GLContext::fTexSubImage2D (this=0x48845c00, target=3553, level=0, 
    xoffset=0, yoffset=0, width=320, height=508, format=6407, type=33635, pixels=0x494a9000)
    at ../../../dist/include/GLContext.h:1775
#6  0x860c3ba0 in mozilla::gl::GLContext::UploadSurfaceToTexture (this=0x48845c00, aSurface=
    0x48e66740, aDstRegion=..., aTexture=@0x48f0a50c, aOverwrite=false, aSrcPoint=..., 
    aPixelBuffer=false) at /home/bjacob/mozilla-central-mobile/gfx/thebes/GLContext.cpp:1738
#7  0x860de2c4 in mozilla::gl::TextureImageEGL::DirectUpdate (this=0x48f0a4c0, aSurf=0x48e66740, 
    aRegion=..., aFrom=...)
    at /home/bjacob/mozilla-central-mobile/gfx/thebes/GLContextProviderEGL.cpp:1264
#8  0x860bff80 in mozilla::gl::TiledTextureImage::DirectUpdate (this=0x4aa90580, 
    aSurf=0x48e66740, aRegion=..., aFrom=...)
    at /home/bjacob/mozilla-central-mobile/gfx/thebes/GLContext.cpp:738
#9  0x8611796c in mozilla::layers::ShadowBufferOGL::Upload (this=0x48df3280, aUpdate=0x48e66740, 
    aUpdated=..., aRect=..., aRotation=...)
    at /home/bjacob/mozilla-central-mobile/gfx/layers/opengl/ThebesLayerOGL.cpp:981



The reason for this error becomes apparent with the patch from bug 654424: this texture was created by a glTexImage2D call with format=GL_RGBA, and is now getting updated by a glTexSubImage2D call with format=GL_RGB:

I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fBindTexture(GLenum, GLuint)
I/Gecko   (16157): parameters:
I/Gecko   (16157):   target = de1
I/Gecko   (16157):   texture = 490007
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fBindTexture(GLenum, GLuint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexImage2D(GLenum, GLint, GLint, GLsizei, GLsizei, GLint, GLenum, GLenum, const GLvoid*)
I/Gecko   (16157): parameters:
I/Gecko   (16157):   target = de1
I/Gecko   (16157):   level = 0
I/Gecko   (16157):   width = 320
I/Gecko   (16157):  height = 508
I/Gecko   (16157):   format = 1908
I/Gecko   (16157): , type = 1401
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexImage2D(GLenum, GLint, GLint, GLsizei, GLsizei, GLint, GLenum, GLenum, const GLvoid*) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint)
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint)
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint)
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint)
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexParameteri(GLenum, GLenum, GLint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fActiveTexture(GLenum)
I/Gecko   (16157): parameters:
I/Gecko   (16157):  textureunit = 33984
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fActiveTexture(GLenum) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fBindTexture(GLenum, GLuint)
I/Gecko   (16157): parameters:
I/Gecko   (16157):   target = de1
I/Gecko   (16157):   texture = 490007
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fBindTexture(GLenum, GLuint) [0x0000]
I/Gecko   (16157): [gl:0x48845c00] > void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei, GLenum, GLenum, const GLvoid*)
I/Gecko   (16157): parameters:
I/Gecko   (16157):   target = de1
I/Gecko   (16157):   level = 0
I/Gecko   (16157):   xoffset = 0
I/Gecko   (16157):  yoffset = 0
I/Gecko   (16157):   width = 320
I/Gecko   (16157):  height = 508
I/Gecko   (16157):   format = 1907
I/Gecko   (16157): , type = 8363
I/Gecko   (16157): [gl:0x48845c00] < void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei, GLenum, GLenum, const GLvoid*) [0x0502]
I/Gecko   (16157): GL ERROR: void mozilla::gl::GLContext::fTexSubImage2D(GLenum, GLint, GLint, GLint, GLsizei, GLsizei, GLenum, GLenum, const GLvoid*) generated GL error GL_INVALID_OPERATION(0x0502)
I/Gecko   (16157): ###!!! ABORT: file ../../../dist/include/GLContext.h, line 1110


We need to figure which of these two format values is wrong, factor out the code that finds the correct format and let both sides use it.
(Reporter)

Comment 1

6 years ago
The first time (in texImage2D), we have format=GL_RGBA, type=GL_UNSIGNED_BYTE; this is computed in GLFormatForImage() and GLTypeForImage() in GLContextProviderEGL.cpp, this means that the image's format is either gfxASurface::ImageFormatARGB32 or gfxASurface::ImageFormatRGB24.

The second time (in texSubImage2D), we have format=GL_RGB, type=GL_UNSIGNED_SHORT_5_6_5; this is computed in GLContext::UploadSurfaceToTexture(); this means that the image's format is gfxASurface::ImageFormatRGB16_565.

So the image's format changed between these two points; the bug is either that fact, or the fact that the GL texture wasn't updated accordingly.
(Reporter)

Comment 2

6 years ago
... or perhaps we're just missing a glBindTexture call somewhere between these two points.
(Reporter)

Comment 3

6 years ago
(In reply to comment #2)
> ... or perhaps we're just missing a glBindTexture call somewhere between
> these two points.

Ah no, forget that, we're specifically calling glBindTexture on the same texture.
(Reporter)

Updated

6 years ago
Summary: Texture update generates GL_INVALID_OPERATION because it uses the wrong format on Android → Texture update generates GL_INVALID_OPERATION on Android because image format changed to RGB16_565 under our feet
(Reporter)

Updated

6 years ago
(Reporter)

Comment 4

6 years ago
Created attachment 536683 [details] [diff] [review]
helper to log nested (function, etc) scopes

With this patch we get a log of the main relevant function calls around here.
(Reporter)

Comment 5

6 years ago
Created attachment 536689 [details]
log with function scopes

Here's the log of relevant function calls, obtained with above patch.

We get:

  ShadowBufferOGL::Upload {
    CreateClampOrRepeatTextureImage {
      GLContextEGL::CreateTextureImage {
         TextureImageEGL::TextureImageEGL {
           TextureImageEGL::Resize {
             glTexImage2D with GL_RGBA, GL_UNSIGNED_BYTE
           }
         }
      }
    }
    TextureImageEGL::DirectUpdate {
      glTexSubImage2D with GL_RGB, GL_UNSIGNED_SHORT_5_6_5
    }
  }

So the format change occurs inside of ShadowBufferOGL::Upload(), after the CreateClampOrRepeatTextureImage() call and before the TextureImageEGL::DirectUpdate() call.
(Reporter)

Comment 6

6 years ago
ShadowBufferOGL::Upload is called here with a surface whose format is ImageFormatRGB16_565.

So the real error is that the glTexImage2D call creates a RGBA32 texture. The subsequent glTexSubImage2D call (where the error occurs) is correct in trying to update the texture as RGB16.
(Reporter)

Updated

6 years ago
Summary: Texture update generates GL_INVALID_OPERATION on Android because image format changed to RGB16_565 under our feet → ShadowBufferOGL::Upload generates GL_INVALID_OPERATION on Android
(Reporter)

Comment 7

6 years ago
In TextureImageEGL::TextureImageEGL(), we initially have mUpdateFormat==ImageFormatRGB24 and we then adjust it to ImageFormatARGB32.

Both are wrong. The correct format, that was passed to ShadowBuffer::Upload(), was never propagated to here.
(Reporter)

Comment 8

6 years ago
Does anyone know this code? It is necessary to rewrite this code in such a way that the texture format is computed once and for all in one place, and use that for both the texture creation (glTexImage2D) and the texture update (glTexSubImage2D).

I don't understand this optimization in TextureImageEGL::TextureImageEGL, where we change the mUpdateFormat. Even though this isn't what's causing this error here, it's going to cause similar errors whenever it results in a different format being picked than what TextureImageEGL::DirectUpdate() picks.
(Reporter)

Comment 9

6 years ago
After IRC discussion with romaxa and mattwoodrow, it's confirmed we should honor the format of the surface passed to ShadowBuffer::Upload(), instead of only passing its contentType which doesn't have enough information.
(Reporter)

Updated

6 years ago
Summary: ShadowBufferOGL::Upload generates GL_INVALID_OPERATION on Android → Fix TextureImage format handling
(Reporter)

Updated

6 years ago
OS: Linux → All
Hardware: x86_64 → All
Created attachment 537245 [details] [diff] [review]
Fix TextureImage format handling

This patch makes thing work nicely without GL errors on my Nexus S.

Not yet asking for review as I would first like to get confirmation that I'm not breaking stuff on other platforms. Please check on Meego. This also needs to be checked on GLX and on CGL.

Here's what the patch is doing. TextureImage used to store two members giving some information about its format: mContentType and mIsRGBFormat. mContentType said whether we had color, alpha, or both. mIsRGBFormat said whether we have RGB or BGR order.

The first problem was that that wasn't enough information, since it didn't tell the difference between RGB16_565 and RGB24 for example. That's the bug I experienced above. This patch replaces mContentType by mImageFormat everywhere, so TextureImage knows its gfxImageFormat.

This made a second problem surface, as mIsRGBFormat was now redundant. Indeed, my understanding is that Cairo surfaces have BGR order, unless the format is RGB565, and then when using hw surfaces on Meego we always have RGB order. So I removed the mIsRGBFormat member, the IsRGB() method is now implementing the above logic (has to be checked by romaxa/heeen), so there's no redundancy anymore, and also I *think* we had bugs where we were not correctly updating mIsRGBFormat and these would now be fixed (in any case some code blocks got removed which is always nice).
I haven't gone through this fully yet, but I can see at least one problem.

GetAsImageSurface won't always succeed (at least for XlibSurfaces on GLX), so we might want to try handle this case too.

Can we avoid allocating the texture until the first BeginUpdate/DirectUpdate and use the surface format at that point. If the surface is one that doesn't support GetAsImageSurface, then we need to allocate a temporary buffer and we control the source format.

Alternatively we could just block GL layers on GLX when we don't have texture_from_pixmap (The only case where this would actually matter), throw lots of assertions in all over the place and hope we never need to support a platform that doesn't handle GetAsImageSurface().
CocoaChildView.mm  change also needed here:
http://mxr.mozilla.org/mozilla-central/source/widget/src/cocoa/nsChildView.mm#2036
otherwise it fail to compile on mac:
http://tinderbox.mozilla.org/showlog.cgi?log=Try/1307216058.1307221722.5492.gz
GetAsImageSurface - will be a huge problem, because you call it just for ImageFormat extraction... but if that is XSurface, then it will cause alloc -> X->image copy -> dealloc
would it be better to tweak gfxASurface::FormatFromContent function and make it return 24 or 16 depends on MOBILE_GFX define?
Or we can do it even this way:
gfxASurface::gfxImageFormat
gfxASurface::FormatFromContent(gfxASurface::gfxContentType type)
{
    switch (type) {
        case CONTENT_COLOR_ALPHA:
            return ImageFormatARGB32;
        case CONTENT_ALPHA:
            return ImageFormatA8;
        case CONTENT_COLOR:
        default:
            return gfxPlatform::GetOffscreenFormat();
    }
}
(In reply to comment #13)
> GetAsImageSurface - will be a huge problem, because you call it just for
> ImageFormat extraction... but if that is XSurface, then it will cause alloc
> -> X->image copy -> dealloc

Ah OK. What is the right way of getting the format of a gfxASurface?
(In reply to comment #14)
> would it be better to tweak gfxASurface::FormatFromContent function and make
> it return 24 or 16 depends on MOBILE_GFX define?

That sure sounds like the right way, but do you know if gfxASurface::FormatFromContent has other users who might be broken by such a change?

MXR search:
http://mxr.mozilla.org/mozilla-central/ident?i=FormatFromContent&tree=mozilla-central&filter=
(In reply to comment #16)
> (In reply to comment #13)
> > GetAsImageSurface - will be a huge problem, because you call it just for
> > ImageFormat extraction... but if that is XSurface, then it will cause alloc
> > -> X->image copy -> dealloc
> 
> Ah OK. What is the right way of getting the format of a gfxASurface?

There doesn't seem to exist a way, is there?

We could query the surface type with GetType() and handle the few cases that we need manually, if we only need a few. How would we get the format of a gfxXlibSurface? Already that seems hard.
There isn't necessarily an answer, which is why there's no defined method to retrieve it.

For surfaces that don't support GetAsImageSurface (gfxXlibSurface will return nsnull), the format isn't relevant as we need to do a manual copy to an image surface to access the pixel data. The internal data format could be (theoretically) something completely different and not even covered by the gfxImageFormat enum.

The only accurate way is to call GetAsImageSurface and use the format of that pointer. If that call returns nsnull, then we need to allocate a temporary image surface and make a copy, and we can get the format of the temporary.

As I said before, we can probably just avoid cases where this will actually happen and rely on GetAsImageSurface succeeding.
Created attachment 537598 [details] [diff] [review]
Fix TextureImage format handling

Here's a new version, see in GLContext.h the TextureImageFormatForSurface checking if GetAsImageSurface returned null, and OptimizedTextureImageFormat doing the 24bpp->16bpp optimization on mobile.
(Reporter)

Updated

6 years ago
Attachment #537245 - Attachment is obsolete: true
Created attachment 542668 [details] [diff] [review]
Rebased patch

Here's the patch rebased on today's mozilla-central, in case it's useful to anyone.
I have a problem with red and blue being flipped occasionally with this patch. It apparently happens when a surface is resized or redrawn from RGB565 to RGBA32.
I found that changing ShaderProgramTypeForImageFormat(gfxASurface::gfxImageFormat aFormat)
from

case gfxASurface::ImageFormatARGB32:
        return BGRALayerProgramType;

to

case gfxASurface::ImageFormatARGB32:
        return RGBALayerProgramType;

fixes this problem, but I'm not sure if this solution also works on other platforms. It was mentioned that Meego hw surfaces have a different order than on android for example, I guess this is the cause.
Does that mean that our ImageFormat is actually wrong and should be something like gfxASurface::ImageFormatABGR32?
(In reply to comment #22)
> case gfxASurface::ImageFormatARGB32:
>         return RGBALayerProgramType;

This is wrong. ImageFormatARGB32 is BGRALayerProgramType. It sounds like you're running into a either an extra swizzel or not enough swizzels.
Here's a description of what happens normally:

cairo's image surfaces are stored in memory as B,G,R,A. We upload these to texture with the format GL_RGBA and type of UNSIGNED_BYTE. These textures are
stored in memory as R,G,B,A. So the data we pass in is technically wrong. However, when we use the BGRALayerProgramType we swap things back into place:
gl_FragColor = color.bgra;

Presumably, when using EGL surfaces this lie is not happening, and when we use the BGRALayerProgramType we do an extra swizzle.
For what it's worth, BGRALayerProgramType should probably be rename to BGRAasRGBALayerProgramType
Created attachment 543759 [details] [diff] [review]
fixed red-blue flip

I fixed the red-blue swizzle by overriding the mShaderType to the non-swizzling version if our Image has been acquired through locking.
Attachment #537598 - Attachment is obsolete: true
Attachment #542668 - Attachment is obsolete: true
Doing a rebase I noticed that there is now support for a luminance channel image in code that this patch ewplaces. Do we also support this?
After fixing it for egl locked surfaces this problem resurfaced on GLX. in the current code we have:

606         if (aSurface->GetContentType() == gfxASurface::CONTENT_COLOR_ALPHA) {  
607             mShaderType = gl::RGBALayerProgramType;                            
608         } else {                                                               
609             mShaderType = gl::RGBXLayerProgramType;                            
610         
}                                                                      

while the patch changes this to:

    case gfxASurface::ImageFormatARGB32:
        return BGRALayerProgramType;
    case gfxASurface::ImageFormatRGB24:
        return BGRXLayerProgramType;

So we return BGRA shaders when we used to use RGBA shaders. Clearly we need to add a flag to the function telling it which order our channels are in.
I can't see why you need to change this much.

GetShaderProgramType should be the definitive check for a given TextureImage. UploadSurfaceForTexture will return the correct value to return here for all normal cairo surfaces. The only change that should be needed is to override the value of mShaderType when using a backing surface instead.

Removing IsRGB() seems worthwhile as it shouldn't actually be needed for anything.
Created attachment 544488 [details] [diff] [review]
fixed red-blue flip (rebased)

Just the most recent patch rebased onto more recent mozilla-central
I still think Comment 11 is the better approach here.

The specific pixel format should stay as implementation detail of each TextureImage subclass, and this can be communicated to shaders through GetShaderType().

The main issue is guessing an appropriate format in the constructor without knowing the format of the upload surfaces. If we delay texture creation until the first surface is available we fix this problem without extra code.
Created attachment 544499 [details] [diff] [review]
fixed red-blue flip (rebased)

Whoops, actually make sure it builds this time.
Attachment #544488 - Attachment is obsolete: true
Does this bug still exist without the patch? The trace log above shows us calling Resize from the TextureImageEGL ctor, but I can't see that in the current code.

We still guess the mUpdateFormat incorrectly on android, but unless we call Resize or {Begin|End}Update this shouldn't ever be used.

With mobile (and a separate content process), we should only ever be doing texture updating via DirectUpdate. Is there ever a case where this isn't true?

We seem to have an inconsistent API for TextureImages:

Resize (used for copying pixel data from one image to another) assumes that the TextureImage will decide it's own pixel format, as was the case when only the {Being|End}Update calls existed.

With DirectUpdate we generally want to copy the pixel format of the passed surface to avoid needing to make an extra copy.

I'll have a think about possible solutions for this.
This bug needs an owner, I will take it for now.
Assignee: nobody → bgirard
(In reply to comment #33)
> Does this bug still exist without the patch? The trace log above shows us
> calling Resize from the TextureImageEGL ctor, but I can't see that in the
> current code.

Looks like this is only added in bug 660090. Might be better to change that patch to defer creation of the texture until we know an appropriate surface type to use.
###!!! ASSERTION: Unknown GL type for Image format: 'Error', file /home/fhae/upstream/mozilla-central/gfx/thebes/GLContext.h, line 2297
###!!! ASSERTION: Unknown GL shader type for Image format: 'Error', file /home/fhae/upstream/mozilla-central/gfx/thebes/GLContext.h, line 2321

on http://camendesign.com/code/video_for_everybody/test.html on meego/harmattan
Created attachment 548464 [details] [diff] [review]
fix luminance-only channel

I added a fix for the the single channel textures used by video frames.
Matt, you said you had bigger plans for this patch... do you have a rough idea so someone else could start with it? Or can we work on this patch?
Attachment #543759 - Attachment is obsolete: true
Attachment #544499 - Attachment is obsolete: true

Comment 38

6 years ago
(In reply to comment #33)
> With mobile (and a separate content process), we should only ever be doing
> texture updating via DirectUpdate. Is there ever a case where this isn't
> true?

Yes, we do still have texture updates occurring in EndUpdate (when Thebes layers are being rendered).
Is this fixed by Bug 675210?
(In reply to comment #39)
> Is this fixed by Bug 675210?

Wait, there's a real problem here which is that TextureImage doesn't know its precise format, despite having a precise format (since it holds a GL texture which has one). As a result we make bogus texImage2D/texSubImage2D calls as we have to guess the format, and we do that guesswork in 2 different ways that disagree with each other at least on mobile. This is a real problem that needs to be fixed, and I'm not sure how it can be fixed just by deferring allocation of textures? Even if that somehow made the bug disappear in practice we shouldn't rely on guesswork for our texImage2D/texSubImage2D calls so TextureImage should really know about its format.

Comment 41

6 years ago
In the case of DirectUpdate, the result of deferring allocation is that the texture is now allocated within GLContext::UploadSurfaceToTexture, where the correct format for the texture is chosen based on the format of the surface (that is, there is no guesswork here).

So the question then is if we also get format mismatches that do not involve DirectUpdate.
OK but there's also a basic design issue here. If TextureImage holds a GL texture that has a precise format, why not store that format info in TextureImage? Why instead store a ContentType and have to rely on a function to deduce the format from it? Even if we get this deduction completely right and safe, that still looks like bad design to me.

Updated

5 years ago
Assignee: bgirard → nobody
You need to log in before you can comment on or make changes to this bug.