Closed Bug 745137 Opened 12 years ago Closed 12 years ago

Make gralloc-based direct texturing work for b2g "Tier Is"

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16
blocking-kilimanjaro +
blocking-basecamp +

People

(Reporter: cjones, Assigned: Daeken)

References

Details

Attachments

(1 file, 14 obsolete files)

8.22 KB, patch
Details | Diff | Splinter Review
Right now, adreno and SGX 540.
Assignee: nobody → cbrocious
Hey Cody whats the status here?
Attached patch Direct texturing initial work (obsolete) — Splinter Review
This patch implements direct texturing for Gonk.  Double buffering is causing issues on Mali and is commented out (working on a fix) the way that the EGLImageKHR is unboxed from Android's libEGL is funky, and it needs a lot of cleanup, but it's functional on the whole.
Oh, and one other strange issue: The power button doesn't seem to turn on the screen, and hitting it while the screen is on flips it immediately back to the lock screen.  Not sure why that would have anything to do with direct texturing, though.
I am not opposed to the unboxing as first step. In parallel you can get help from mwu on linking with our custom libEGL, but lets decouple from that. We desperately need a working version of this patch as soon as possible. This has been dragging on for several months.

Whats the issue with mali?

comment #3 will need debugging
mwu, can you help with two things here:

- we need to link against our a custom libEGL because Android's is borked (Cody/mwu, in the alternative can we patch libEgl at the gonk level?)
- mwu, didn't you do something to make sure the screen is always restored? like force a redraw or something? I wonder whether that somehow interferes with the buffer switch of gralloc something like that and causes comment 3
(In reply to Andreas Gal :gal from comment #5)
> mwu, can you help with two things here:
> 
> - we need to link against our a custom libEGL because Android's is borked
> (Cody/mwu, in the alternative can we patch libEgl at the gonk level?)

We can have a libMozEGL or something like that. Cody, if you post your patch to the android libEGL I can see about forking it and making libMozEGL.

> - mwu, didn't you do something to make sure the screen is always restored?
> like force a redraw or something? I wonder whether that somehow interferes
> with the buffer switch of gralloc something like that and causes comment 3

That was a temporary hack which has been replaced with https://bugzilla.mozilla.org/show_bug.cgi?id=707589 . Now we send a sizemodechange when /sys/power/wait_for_fb_wake notifies us that the screen is ready to be drawn to.
I have my own little libMozEGL (well, libGonkEGL) that I was attempting to get working, just by building it separately and then referencing it from our GL loader in Gecko.  The problem is that something is going wrong in the interaction with the vendor EGL libs; not sure what that is yet.

As for double buffering, the issue seems to be the way I'm using EGLImages; Mali doesn't seem to like it.  I'm playing around with it now to get that working properly so we can turn that on.
Another update: the lockscreen and display power issues seem to have been a bug in my Gaia install.  Updating fixed those.
blocking-basecamp: --- → +
blocking-kilimanjaro: --- → +
How are we doing here?
Roughly the same as the previous patch, but adds a preference (gfx.directtexture.enable) and defaults it to false for landing.  Still needs cleanup and there are still a couple leaks, but hopefully we can hammer this out once it lands.
Attachment #627529 - Attachment is obsolete: true
Attachment #630025 - Flags: review?(gal)
Comment on attachment 630025 [details] [diff] [review]
Adds a preference for direct texturing

Review of attachment 630025 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContextProviderEGL.cpp
@@ +1,2 @@
> +#include "android/log.h"
> +#define LOG(args...)  __android_log_print(ANDROID_LOG_INFO, "Daeken" , ## args)

Please change to "gralloc" or something else meaningful.

@@ +1176,5 @@
>          }
>  
> +#ifdef MOZ_WIDGET_GONK
> +        if (mGraphicBuffer != nsnull) {
> +            /*sp<GraphicBuffer> temp = mGraphicBuffer;

This is really messy. Can you remove all the dead code?
Attachment #630025 - Flags: review?(gal) → review-
Attached patch Fixed previous issues (obsolete) — Splinter Review
This patch reflects the notes in comment #11 and removes the system malloc code since jemalloc is no longer a problem for us.
Attachment #630025 - Attachment is obsolete: true
Attachment #630233 - Flags: review?(gal)
Comment on attachment 630233 [details] [diff] [review]
Fixed previous issues

Review of attachment 630233 [details] [diff] [review]:
-----------------------------------------------------------------

::: b2g/app/b2g.js
@@ +490,5 @@
>  // Enable device storage
>  pref("device.storage.enabled", true);
> +
> +// Turns on direct texturing for Gonk
> +pref("gfx.directtexture.enabled", true);

lets do gfx.gralloc.enabled and please default it to false

::: gfx/gl/GLContextProviderEGL.cpp
@@ +1168,5 @@
> +            mGLContext->fActiveTexture(LOCAL_GL_TEXTURE0);
> +            mGLContext->fBindTexture(LOCAL_GL_TEXTURE_2D, mTexture);
> +            void *img = unboximage(mImageKHR);
> +            itt2d(LOCAL_GL_TEXTURE_2D, img);
> +            if(sEGLLibrary.fGetError() != 0x00003000) {

^ space

@@ +1169,5 @@
> +            mGLContext->fBindTexture(LOCAL_GL_TEXTURE_2D, mTexture);
> +            void *img = unboximage(mImageKHR);
> +            itt2d(LOCAL_GL_TEXTURE_2D, img);
> +            if(sEGLLibrary.fGetError() != 0x00003000) {
> +                LOG("could not do the target thing. ERROR (0x%04x)", sEGLLibrary.fGetError());

This could use a better text

@@ +1177,5 @@
> +            void *vaddr;
> +            if (mGraphicBuffer->lock(GraphicBuffer::USAGE_SW_READ_OFTEN |
> +                                     GraphicBuffer::USAGE_SW_WRITE_OFTEN,
> +                                     &vaddr) != OK) {
> +                printf_stderr("couldn't lock GraphicBuffer\n");

LOG?

@@ +1361,4 @@
>  #endif
>  
> +#ifdef MOZ_WIDGET_GONK
> +        if(gUseBackingSurface) {

^ space

@@ +1372,5 @@
> +            if (mGraphicBuffer != nsnull && mGraphicBackBuffer != nsnull && 
> +                mGraphicBuffer->initCheck() == OK && mGraphicBackBuffer->initCheck() == OK) {
> +                const int eglImageAttributes[] = { EGL_IMAGE_PRESERVED_KHR, LOCAL_EGL_TRUE,
> +                                                   LOCAL_EGL_NONE, LOCAL_EGL_NONE };
> +                if(itt2d == NULL || unboximage == NULL) {

^ space

@@ +1376,5 @@
> +                if(itt2d == NULL || unboximage == NULL) {
> +                    void *sysegl = dlopen("/system/lib/libEGL.so", RTLD_NOW);
> +                    unboximage = reinterpret_cast <unboximage_t>(dlsym(sysegl, "_ZN7android33egl_get_image_for_current_contextEPv"));
> +                    DIR * dir = opendir("/system/lib/egl");
> +                    while(struct dirent *ent = readdir(dir)) {

lots of space issues here, please consistently use foo (

@@ +1387,5 @@
> +                        if(vendor == NULL)
> +                            continue;
> +                        itt2d = reinterpret_cast <itt2d_t>(dlsym(vendor, "glEGLImageTargetTexture2DOES"));
> +                    }
> +                    if(itt2d == NULL || unboximage == NULL) {

Why do we do this hackery here? The code for these libs is open source no? Can we just directly use the code here?

@@ +1411,5 @@
> +
> +                return true;
> +            }
> +
> +            printf_stderr("GraphicBufferAllocator::alloc failed\n");

LOG?
Corrected style and naming issues raised in #13; discussed EGL hacks offline.
Attachment #630233 - Attachment is obsolete: true
Attachment #630233 - Flags: review?(gal)
Attachment #630425 - Flags: review?(gal)
Depends on: 762101
Can you please upload the latest patch? We should take the QC fix and disable this on SGS2 (only enable if Renderer() == adreno200). I really would like to land this tomorrow.
Attached patch Merged patch (obsolete) — Splinter Review
Well, this patch is a half way point.  It includes the fix for the symbol lookup, so we no longer have to dive into the vendor lib, but it still requires unboxing the EGLImage.  This behavior is consistent across devices.  Something funky is going on in libEGL still.
Attachment #630425 - Attachment is obsolete: true
Attachment #630425 - Flags: review?(gal)
Hi, I have a question about why the patch directly call egl_get_image_for_current_context(). Are there a reason? I have read the source code, then I thought it might not be necessary. I might be a dumb.

When, we try to get an address of glEGLImageTargetTexture2DOES() by using eglGetProcAddress(), we get an address of glEGLImageTargetTexture2DOES_wrapper().

implementation of glEGLImageTargetTexture2DOES_wrapper() is following.
egl_get_image_for_current_context() is called within the function.

static void glEGLImageTargetTexture2DOES_wrapper(GLenum target, GLeglImageOES image)
{
    GLeglImageOES implImage =
        (GLeglImageOES)egl_get_image_for_current_context((EGLImageKHR)image);
    glEGLImageTargetTexture2DOES_impl(target, implImage);
}

today, http://androidxref.com/ do not respond, then I just write the file path. Both functions are defined in the following file.
 - \frameworks\base\opengl\libs\EGL\eglApi.cpp
That's one of the problems that we've been running up against.  When we call glEGLImageTargetTexture2DOES as imported by GLLibraryEGL, it ends up calling the vendor module directly, rather than the wrapper, thus necessitating the unboxing.
Hi Cody, thanks for the gracious reply. That's a problem...
I checked it on my hardware. When GLLibraryEGL::fImageTargetTexture2DOES() is called, then glEGLImageTargetTexture2DOES_wrapper() is called within the function call. A call sequence was following.

GLLibraryEGL::fImageTargetTexture2DOES()
->mSymbols.fImageTargetTexture2DOES()
->glEGLImageTargetTexture2DOES_wrapper()//platform independent
->glEGLImageTargetTexture2DOES_impl()
->platform dependent glEGLImageTargetTexture2DOES()

glEGLImageTargetTexture2DOES() is a gl function, platform independent implementation is in libGLESv2.so. Therefore GLLibraryLoader could not get a function pointer of platform independent glEGLImageTargetTexture2DOES() from libEGL.so. then the GLLibraryLoader try to get the function pointer by using eglGetProcAddress() and receive a function pointer of glEGLImageTargetTexture2DOES_wrapper().
Attached patch Removed hacks (obsolete) — Splinter Review
Removed all the symbol hacks.  Outside of the context sensitive symbols, everything works out of the box on ICS.  Added a version check to enable gralloc only on ICS.
Attachment #631328 - Attachment is obsolete: true
Attachment #632105 - Flags: review?(gal)
Comment on attachment 632105 [details] [diff] [review]
Removed hacks

Review of attachment 632105 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContextProviderEGL.cpp
@@ +809,4 @@
>          , mIsLocked(false)
>      {
>          mUpdateFormat = gfxASurface::FormatFromContent(GetContentType());
> +        gUseBackingSurface = Preferences::GetBool("gfx.gralloc.enabled", false);

This is probably pretty slow. We should set this somewhere else. Ditto for the property_get below.

@@ +1169,5 @@
> +            mGLContext->MakeCurrent(true);
> +            mGLContext->fActiveTexture(LOCAL_GL_TEXTURE0);
> +            mGLContext->fBindTexture(LOCAL_GL_TEXTURE_2D, mTexture);
> +            sEGLLibrary.fImageTargetTexture2DOES(LOCAL_GL_TEXTURE_2D, mImageKHR);
> +            if (sEGLLibrary.fGetError() != 0x00003000) {

Should there be a proper constant for this?

::: gfx/gl/GLLibraryEGL.cpp
@@ +251,5 @@
> +void
> +GLLibraryEGL::LoadConfigSensitiveSymbols()
> +{
> +    LOG("loading config sensitive symbols");
> +    //if (mHave_EGL_KHR_image) {

???
Uncommented conditional loading of EGLImage symbols.
Attachment #632105 - Attachment is obsolete: true
Attachment #632105 - Flags: review?(gal)
Attachment #632108 - Flags: review?(jones.chris.g)
Attached patch Fixed issues (obsolete) — Splinter Review
Rolled EGL_SUCCESS constant into place and moved preference/property checks out of the repeated paths.
Attachment #632108 - Attachment is obsolete: true
Attachment #632108 - Flags: review?(jones.chris.g)
Attachment #632114 - Flags: review?(jones.chris.g)
The patch is dying with mozmalloc_abort on otoro if I pan around a bit on the home screen. Active layers don't seem to show up partially. The paginators disappear while panning and come back when the layer is reinserted.
Seems to be working for me for a bit, in a DEBUG build, but I get some errors from the GPU driver and then pretty quickly crash with

I/Gecko   ( 6943): ###!!! ABORT: Framebuffer not complete -- error 0x0: file /home/cjones/mozilla/mozilla-central/gfx/gl/GLContext.cpp, line 2893

My money is on PMEM exhaustion.  Probably need to go talk to our friends again.
Correction: I see us both alloc'ing and dealloc'ing buffers.  When we crash, we're using ~1/10th of available pmem.  The entire history of allocs (not just memory in use when we crash) is ~1/5th available pmem.  So we don't seem to exhausting it or fragmenting it.

Looks like something else is going on.
Comment on attachment 632114 [details] [diff] [review]
Fixed issues

>diff --git a/gfx/gl/GLContextProviderEGL.cpp b/gfx/gl/GLContextProviderEGL.cpp

>+#if defined(MOZ_WIDGET_GONK)
>+#include <dlfcn.h>
>+#include <ui/GraphicBuffer.h>
>+#include "android/log.h"
>+#define LOG(args...)  __android_log_print(ANDROID_LOG_INFO, "gecko_gralloc" , ## args)
>+
>+using namespace android;
>+
>+#define EGL_NATIVE_BUFFER_ANDROID 0x3140
>+#define EGL_IMAGE_PRESERVED_KHR   0x30D2
>+
>+#endif

Nit: please indent the preprocessor directives here so nesting is clearer. 

# if defined(MOZ_WIDGET_GONK)
#  include <dlfcn.h>
#  include <ui/GraphicBuffer.h>
#  include "android/log.h"

etc.

>+#define printf_stderr(args...) LOG(args)

Can you not use the version of printf_stderr() already defined by
nsDebug.h?  It should already go to logcat.

>@@ -102,12 +116,13 @@ public:
> #include "GLLibraryEGL.h"
> #include "nsDebug.h"
> #include "nsThreadUtils.h"
>+#include "cutils/properties.h"
> 

This needs to be included only #ifdef ANDROID.

>+static PixelFormat
>+PixelFormatForImage(gfxASurface::gfxImageFormat aFormat)

>+    default:
>+        NS_WARNING("Unknown gralloc pixel format for Image format");

This should be MOZ_NOT_REACHED();

> static GLenum
> GLTypeForImage(gfxASurface::gfxImageFormat aFormat)
> {

>+            case gfxASurface::ImageFormatRGB16_565:
>+                mShaderType = RGBALayerProgramType;

I see

        if (gUseBackingSurface) {
            if (mUpdateFormat != gfxASurface::ImageFormatARGB32) {
                mShaderType = RGBXLayerProgramType;

and

            if (mUpdateFormat == gfxASurface::ImageFormatRGB16_565) {
                mShaderType = RGBXLayerProgramType;

elsewhere in this file.  Why do you use RGBALayerProgramType here?
Are you sure that's what we want?

>+            default:
>+                NS_WARNING("Unknown update format");

MOZ_NOT_REACHED()

>+            mGraphicBuffer = new GraphicBuffer(aSize.width, aSize.height, format, usage);
>+            mGraphicBackBuffer = new GraphicBuffer(aSize.width, aSize.height, format, usage);
>+            if (mGraphicBuffer != nsnull && mGraphicBackBuffer != nsnull &&

These allocations will never fail.  Drop the null checks.

>diff --git a/gfx/layers/opengl/ThebesLayerOGL.cpp b/gfx/layers/opengl/ThebesLayerOGL.cpp

>-    result.mContext = new gfxContext(mTexImage->BeginUpdate(result.mRegionToDraw));
>+    gfxASurface *surf = mTexImage->BeginUpdate(result.mRegionToDraw);
>+    result.mContext = new gfxContext(surf);

I imagine this was for debugging, right?  Please revert this change,
or make it

  nsRefPtr<gfxASurface> surf = ...;

Mostly nit-type-stuff, r=me with above fixed.
Attachment #632114 - Flags: review?(jones.chris.g) → review+
Hi, There seems to be no necessity to implement GLLibraryEGL::LoadConfigSensitiveSymbols(). And sEGLLibrary.fImageTargetTexture2DOES() is already called within "#ifdef MOZ_X11" in CreateBackingSurface().
(In reply to Andreas Gal :gal from comment #22)
> Comment on attachment 632105 [details] [diff] [review]
> Removed hacks
> 
> Review of attachment 632105 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLContextProviderEGL.cpp
> @@ +809,4 @@
> >          , mIsLocked(false)
> >      {
> >          mUpdateFormat = gfxASurface::FormatFromContent(GetContentType());
> > +        gUseBackingSurface = Preferences::GetBool("gfx.gralloc.enabled", false);
> 
> This is probably pretty slow. We should set this somewhere else. Ditto for
> the property_get below.

FWIW, a pref lookup is quite fast - a hash table lookup.
Took care of the issues raised by cjones.
Attachment #632114 - Attachment is obsolete: true
Attachment #632397 - Flags: review?(jones.chris.g)
Attached patch Fixed bitrot (obsolete) — Splinter Review
This patch fixes conflicts with inbound.
Attachment #632397 - Attachment is obsolete: true
Attachment #632397 - Flags: review?(jones.chris.g)
Comment on attachment 632510 [details] [diff] [review]
Fixed bitrot

Will land with a small whitespace tweak here

+# if defined(MOZ_WIDGET_GONK)
+#  include <ui/GraphicBuffer.h>
+
+using namespace android;
+
+# define EGL_NATIVE_BUFFER_ANDROID 0x3140
+# define EGL_IMAGE_PRESERVED_KHR   0x30D2
+
+# endif

Cody, in the future make you sure you |hg qref -m 'Bug XXX: Commit message. r=foopy' and |hg export tip| to produce landable patches.
Attachment #632510 - Flags: review+
Backed out in 

https://hg.mozilla.org/integration/mozilla-inbound/rev/e3d9bee60f0e

Cody, needs a tryserver run :).  You can watch the results from https://tbpl.mozilla.org/?tree=Mozilla-Inbound to see what's busted.
Attachment #632510 - Attachment is obsolete: true
Attached patch Unbroke Android builds (obsolete) — Splinter Review
Attachment #632545 - Attachment is obsolete: true
Attached patch temporary inter-diff patch (obsolete) — Splinter Review
This is a inter-diff patch to attachment 632397 [details] [diff] [review].

With attachment 632397 [details] [diff] [review], I faced following problems on my mobile phone(qualcomm dual core, ics).
- while mImageKHR is related to a Texture, mGraphicBuffer->lock() call within GetLockSurface() always fail.
- fCheckFramebufferStatus() call within GLContext::SetBlitFramebufferForDestTexture() always fail.

The patch addressed the problems on the phone.
https://hg.mozilla.org/mozilla-central/rev/7ed28c33d727
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Thanks to Sotaro's patch, we seem to be crashless on all devices now.  I've removed the tiny double buffering stubs as well, pending proper support.
Attachment #632549 - Attachment is obsolete: true
Attachment #633064 - Attachment is obsolete: true
Attachment #633204 - Flags: review?(gal)
Attachment #633204 - Flags: review?(gal) → review+
No code changes from previous patch, just fixed the actual patch format.
Attachment #633204 - Attachment is obsolete: true
Hi cody, it's nice that the change works on all devices:) Did you check it by using followign code?

> sEGLLibrary.fImageTargetTexture2DOES(LOCAL_GL_TEXTURE_2D, 0);

Yesterday, I also tried the above code on my phone. But it seemed that it was rejected within "adreno" related code because of invalid argument and mGraphicBuffer->lock() failed. I am going to check it again today.
> > sEGLLibrary.fImageTargetTexture2DOES(LOCAL_GL_TEXTURE_2D, 0);
> 
> Yesterday, I also tried the above code on my phone. But it seemed that it
> was rejected within "adreno" related code because of invalid argument and
> mGraphicBuffer->lock() failed. I am going to check it again today.

I just took a second look at this on my device to confirm.  Sending in EGL_NO_IMAGE_KHR(0) actually does cause the ImageTargetTexture2DOES implementation to bail before it does anything useful, however regardless with the latest patch I am not experiencing the write lock failure.  This line is dead code it seems.
> implementation to bail before it does anything useful, however regardless
> with the latest patch I am not experiencing the write lock failure.  This
> line is dead code it seems.

Yes, it is dead code. For debugging purpose, I commented out the "return" and checked what happened in this case . I forgot to mention about it.
I confirmed again EGL_NO_IMAGE_KHR(0) as argument. It failed on my hardware with "GL_INVALID_OPERATION" log print.
Blocks: 774059
(In reply to Sotaro Ikeda from comment #48)
> I confirmed again EGL_NO_IMAGE_KHR(0) as argument. It failed on my hardware
> with "GL_INVALID_OPERATION" log print.

Would you be able to file a bug about this? It is either a driver bug (to work around) or a bug in this patch. I'll check the spec(s) on Monday to figure out which.

INVALID_OPERATION sounds about right for trying to redirect a texture's data to a non-existent EGLImage.
Hi Jeff, I filed Bug 774530.
This problem is also discussed in Bug 771350
I created bug Bug 774530. But from Andreas' comment in Bug 774530, a silicon vender already suggested to use mini texture with dimensions (1,1).
Blocks: 774530
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: