Closed Bug 925608 Opened 6 years ago Closed 6 years ago

Figure and fix how EGLSurface renewal works on Android

Categories

(Core :: Graphics, defect)

ARM
Android
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla28
Tracking Status
firefox27 - affected
firefox28 --- fixed

People

(Reporter: bjacob, Assigned: bjacob)

References

Details

Attachments

(12 files)

384 bytes, text/plain
Details
1.30 KB, patch
Details | Diff | Splinter Review
126.95 KB, text/plain
Details
768.50 KB, text/plain
Details
6.99 KB, patch
kats
: review+
Details | Diff | Splinter Review
6.31 KB, patch
kats
: review+
Details | Diff | Splinter Review
7.14 KB, patch
kats
: review+
Details | Diff | Splinter Review
13.09 KB, patch
kats
: review+
jgilbert
: review+
Details | Diff | Splinter Review
4.12 KB, patch
kats
: review+
Details | Diff | Splinter Review
3.97 KB, patch
kats
: review+
Details | Diff | Splinter Review
8.15 KB, patch
kats
: review+
Details | Diff | Splinter Review
4.40 KB, patch
kats
: review+
Details | Diff | Splinter Review
It seems that at the moment, our leading cause of gfx crashes on Android is the handling of the renewal of the EGLSurface that our compositor renders to. See bug 884047 and bug 900020.

Surfaces get renewed on any size/orientation change, and when our window gets re-focused. Our Java code is using a SurfaceView. This has 3 callbacks that we need to handle correctly: surfaceCreated, surfaceChanged, surfaceDestroyed.

We need to understand:
 1) When exactly are these callbacks (surfaceCreated, surfaceChanged, surfaceDestroyed) fired by the SurfaceView framework?
 2) When exactly is the underlying EGLSurface destroyed and recreated by the SurfaceView framework? (How does that correlate to 1) ?
 3) How are _we_ currently reacting to these callbacks?
 4) What should we change?
 5) Does any of the above depend on the Android version?
(In reply to Benoit Jacob [:bjacob] from comment #0)
> We need to understand:
>  1) When exactly are these callbacks (surfaceCreated, surfaceChanged,
> surfaceDestroyed) fired by the SurfaceView framework?

Let's do that for Android 4.3 for now, we'll check later if that is different in other Android versions.

The Android 4.3 code for SurfaceView is there:

http://androidxref.com/4.3_r2.1/xref/frameworks/base/core/java/android/view/

We're mostly interested in these two files:

HardwareRenderer.java is where the actual EGL surface creation/destruction calls are made:

http://androidxref.com/4.3_r2.1/xref/frameworks/base/core/java/android/view/HardwareRenderer.java

ViewRootImpl.java is what drives this, and fires the callbacks (surfaceCreated, surfaceChanged, surfaceDestroyed):

http://androidxref.com/4.3_r2.1/xref/frameworks/base/core/java/android/view/ViewRootImpl.java

Some testing with a patch that logs these callbacks shows that:


* On orientation change, the EGLSurface gets destroyed and recreated in GlRenderer.invalidate:

I/dalvikvm( 8834):   at com.google.android.gles_jni.EGLImpl.eglDestroySurface(Native Method)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.invalidate(HardwareRenderer.java:1232)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1647)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1004)

I/dalvikvm( 8834):   at com.google.android.gles_jni.EGLImpl.eglCreateWindowSurface(EGLImpl.java:92)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.createSurface(HardwareRenderer.java:1251)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.invalidate(HardwareRenderer.java:1238)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1647)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1004)


and the only callback that is fired is surfaceChanged (see logs). The surfaceCreated/Destroyed callbacks are not fired, even though the underlying surface was created/destroyed.


* On de-focusing the Firefox window, the EGLSurface gets destroyed by GlRenderer.destroy:

I/dalvikvm( 8834):   at com.google.android.gles_jni.EGLImpl.eglDestroySurface(Native Method)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.destroySurface(HardwareRenderer.java:1220)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.destroy(HardwareRenderer.java:1210)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$Gl20Renderer.destroy(HardwareRenderer.java:1959)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.destroyHardwareResources(ViewRootImpl.java:645)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1229)

and the only callback that is fired is surfaceDestroyed.


* On re-focusing the Firefox window, the EGLSurface gets recreated by GlRenderer.initialize:

I/dalvikvm( 8834):   at com.google.android.gles_jni.EGLImpl.eglCreateWindowSurface(EGLImpl.java:92)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.createSurface(HardwareRenderer.java:1251)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.createEglSurface(HardwareRenderer.java:1156)
I/dalvikvm( 8834):   at android.view.HardwareRenderer$GlRenderer.initialize(HardwareRenderer.java:980)
I/dalvikvm( 8834):   at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1536)

and the only callback that is fired is surfaceCreated.
(In reply to Benoit Jacob [:bjacob] from comment #0)
> We need to understand:
>  2) When exactly is the underlying EGLSurface destroyed and recreated by the
> SurfaceView framework? (How does that correlate to 1) ?

That was basically answered together with 1) above.

In summary:
* surfaceCreated means exactly that
* surfaceDestroyed means exactly that
* surfaceChanged means that the old surface was destroyed and a new surface was created.
(In reply to Benoit Jacob [:bjacob] from comment #6)
> In summary:
> * surfaceCreated means exactly that
> * surfaceDestroyed means exactly that
> * surfaceChanged means that the old surface was destroyed and a new surface
> was created.

According to [1] surfaceChanged is also called once after surfaceCreated. This agrees with my experience and what our code does (i.e. we ignore surfaceCreated entirely and do the setup in surfaceChanged).

[1] http://developer.android.com/reference/android/view/SurfaceHolder.Callback.html#surfaceChanged%28android.view.SurfaceHolder,%20int,%20int,%20int%29
(In reply to Benoit Jacob [:bjacob] from comment #1)
> * On orientation change, the EGLSurface gets destroyed and recreated in
> GlRenderer.invalidate:
> 
> I/dalvikvm( 8834):   at
> com.google.android.gles_jni.EGLImpl.eglDestroySurface(Native Method)
> I/dalvikvm( 8834):   at
> android.view.HardwareRenderer$GlRenderer.invalidate(HardwareRenderer.java:
> 1232)
> I/dalvikvm( 8834):   at
> android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1647)
> I/dalvikvm( 8834):   at
> android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1004)
> 
> I/dalvikvm( 8834):   at
> com.google.android.gles_jni.EGLImpl.eglCreateWindowSurface(EGLImpl.java:92)
> I/dalvikvm( 8834):   at
> android.view.HardwareRenderer$GlRenderer.createSurface(HardwareRenderer.java:
> 1251)
> I/dalvikvm( 8834):   at
> android.view.HardwareRenderer$GlRenderer.invalidate(HardwareRenderer.java:
> 1238)
> I/dalvikvm( 8834):   at
> android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1647)
> I/dalvikvm( 8834):   at
> android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1004)

If eglCreateWindowSurface is being called by the GlRenderer, do we still need to be calling it ourselves at [1]?

[1] http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/gfx/GLController.java#160
FYI, for some android related words.
- Surface: Rendering surface like Layer in gecko. Allocated by SurfaceFlinger via WindowManagerService.
- View: Basic building block for UI component. Application's all ui components are derived from View class.
- ViewRoot: Root of application view hirarchy. It is not derived from view class. It hold one Surface for application's rendering. All application's rendering is drawn to the Surace except 'view derived classes' that has own Surface(like SurfaceView)   
- SurfaceView: View derived class that has own Surface. Draw to SurfaceView is not drawn to ViewRoot's Surface, but is drawn to own Surface. The Surface is allocated by SurfaceFlinger via WindowManagerService.
- EGLSurface: In this context, almost cases are ANativeWindow(Surface) backed EGLSurface. So, Before Surface is destroyed by SurfaceFlinger, EGLSurface needs to be destroyed.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7)
> According to [1] surfaceChanged is also called once after surfaceCreated.
> This agrees with my experience and what our code does (i.e. we ignore
> surfaceCreated entirely and do the setup in surfaceChanged).
> 
> [1]
> http://developer.android.com/reference/android/view/SurfaceHolder.Callback.
> html#surfaceChanged%28android.view.SurfaceHolder,%20int,%20int,%20int%29

You're right, and this is confirmed in the logcat attached here:

W/GeckoLayerView( 8834): XXX SurfaceListener::surfaceCreated
W/GeckoLayerView( 8834): XXX SurfaceListener::surfaceChanged

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> If eglCreateWindowSurface is being called by the GlRenderer, do we still
> need to be calling it ourselves at [1]?
> 
> [1]
> http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/gfx/
> GLController.java#160

That was a big mystery for me, but Sotaro gave me a very good explanation, in addition to the very helpful comment 9 above.

The EGLSurface that is being created and destroyed by the GlRenderer used by the ViewRoot, is the applications "main" surface, while the EGLSurface that _we_ are creating and destroying in our surfaceCreated/Changed/Destroyed callbacks is just the surface for our SurfaceView. Think of it as just one "layer" (View) from the point of view of the Android application compositor (ViewRoot).
Ah, another explanation by Sotaro made me realize that what I just wrote above is incorrect!

The Real Truth is that the EGLSurface created and destroyed by the GlRenderer is where the native Android UI goes, while the EGLSurface that we create and destroy is where Gecko's compositor output goes, and the Android compositor takes care of compositing these two things together.
So anyway, yes it is normal that we create/destroy our EGLSurface. The EGLSurface that Android already creates/destroys is another, unrelated surface (for Android UI). We should keep creating/destroying our surface. It is probably a bug that we don't on orientation change.
Sorry about the long delay getting back to this: B2G bugs and Gfx work week got in the way.

Here's the answer to the other questions here.

As we said above, we do get surfaceChanged callbacks on orientation change. We react to these by calling GLContextEGL::RenewSurface().

But GLContextEGL::RenewSurface() early-exists if it already has a mSurface. The old mSurface is not cleared on orientation change, so in that case we already have a mSurface, and we early-exit without renewing anything (the actual renewal code path calls a method named ProvideEGLSurface in the AndroidBridge).

Once I remove the early-exit in RenewSurface, it actually calls ProvideEGLSurface, which calls ProvideEGLSurfaceWrapper, which calls into Java, and that seems to return always the same surface on orientation change here on Nexus 10 / Android 4.3... I hope that that's normal, that that just means that the system knows that the surface doesn't actually need replacing?
The stack to ProvideEGLSurfaceWrapper is

#0  mozilla::AndroidBridge::ProvideEGLSurfaceWrapper (this=0x73748400, aTarget=0x1d800386) at /hack/mozilla-central/widget/android/GeneratedJNIWrappers.cpp:2500
#1  0x7dc5b95e in mozilla::AndroidBridge::ProvideEGLSurface (this=0x73748400) at /hack/mozilla-central/widget/android/AndroidBridge.cpp:741
#2  0x7e137230 in mozilla::gl::GLContextEGL::RenewSurface (this=0x876b4800) at /hack/mozilla-central/gfx/gl/GLContextProviderEGL.cpp:476
#3  0x7e12315c in mozilla::layers::CompositorOGL::Resume (this=<optimized out>) at /hack/mozilla-central/gfx/layers/opengl/CompositorOGL.cpp:1423
#4  0x7e112b22 in mozilla::layers::CompositorParent::ResumeComposition (this=0x87609400) at /hack/mozilla-central/gfx/layers/ipc/CompositorParent.cpp:372
#5  0x7e114358 in DispatchToMethod<mozilla::layers::CompositorParent, void (mozilla::layers::CompositorParent::*)(int, int), int, int> (arg=..., method=
    (void (mozilla::layers::CompositorParent::*)(mozilla::layers::CompositorParent * const, int, int)) 0x7e112b69 <mozilla::layers::CompositorParent::ResumeCompositionAndResize(int, int)>, obj=<optimized out>)
    at /hack/mozilla-central/ipc/chromium/src/base/tuple.h:400
#6  RunnableMethod<mozilla::layers::CompositorParent, void (mozilla::layers::CompositorParent::*)(int, int), Tuple2<int, int> >::Run (this=<optimized out>) at /hack/mozilla-central/ipc/chromium/src/base/task.h:307
#7  0x7e0a77f8 in MessageLoop::RunTask (this=0x86bffde8, task=0x8782bfd0) at /hack/mozilla-central/ipc/chromium/src/base/message_loop.cc:338
I broke again on eglCreateWindowSurface and logged some more to confirm the following:

When we get a surfaceChanged callback, all it is telling us is that the surface used internally by the Android ViewRoot to composite our application, changed. It doesn't tell us that our own application-side EGLSurfaces were changed for us; they are not. The piece of log pasted below shows that. The conclusion is that when we get a surfaceChanged callback, we need to treat our own surface as invalidated by that event.

I/dalvikvm( 3014):   at com.google.android.gles_jni.EGLImpl.eglCreateWindowSurface(EGLImpl.java:92)
I/dalvikvm( 3014):   at android.view.HardwareRenderer$GlRenderer.createSurface(HardwareRenderer.java:1251)
I/dalvikvm( 3014):   at android.view.HardwareRenderer$GlRenderer.invalidate(HardwareRenderer.java:1238)
I/dalvikvm( 3014):   at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1647)
I/dalvikvm( 3014):   at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1004)
I/dalvikvm( 3014):   at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:5481)
I/dalvikvm( 3014):   at android.view.Choreographer$CallbackRecord.run(Choreographer.java:749)
I/dalvikvm( 3014):   at android.view.Choreographer.doCallbacks(Choreographer.java:562)
I/dalvikvm( 3014):   at android.view.Choreographer.doFrame(Choreographer.java:532)
I/dalvikvm( 3014):   at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:735)
I/dalvikvm( 3014):   at android.os.Handler.handleCallback(Handler.java:730)
I/dalvikvm( 3014):   at android.os.Handler.dispatchMessage(Handler.java:92)
I/dalvikvm( 3014):   at android.os.Looper.loop(Looper.java:137)
I/dalvikvm( 3014):   at android.app.ActivityThread.main(ActivityThread.java:5103)
I/dalvikvm( 3014):   at java.lang.reflect.Method.invokeNative(Native Method)
I/dalvikvm( 3014):   at java.lang.reflect.Method.invoke(Method.java:525)
I/dalvikvm( 3014):   at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:737)
I/dalvikvm( 3014):   at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
I/dalvikvm( 3014):   at dalvik.system.NativeStart.main(Native Method)
I/dalvikvm( 3014): 
D/mali_winsys( 3014): new_window_surface returns 0x3000
W/GeckoLayerView( 3014): SurfaceListener::surfaceChanged
W/GeckoLayerView( 3014): LayerView::onSizeChanged
W/GeckoGLController( 3014): GLController::resumeCompositor(2560, 1454) and mCompositorCreated=true
I/Gecko   ( 3014): RenewSurface called on GLContextEGL 0x86eb6800 with mSurface 0x75f923e0
I/Gecko   ( 3014): RenewSurface calling ProvideEGLSurface...
I/Gecko   ( 3014): ProvideEGLSurface called
I/Gecko   ( 3014): ProvideEGLSurface: returning 0x75f923e0
W/GeckoGLController( 3014): done GLController::resumeCompositor
W/GeckoLayerView( 3014): LayerView::surfaceChanged
W/GeckoGLController( 3014): GLController::surfaceChanged(1600, 2414), mSurfaceValid=true
W/GeckoGLController( 3014): GLController::resumeCompositor(1600, 2414) and mCompositorCreated=true
I/Gecko   ( 3014): RenewSurface called on GLContextEGL 0x86eb6800 with mSurface 0x75f923e0
I/Gecko   ( 3014): RenewSurface calling ProvideEGLSurface...
I/Gecko   ( 3014): ProvideEGLSurface called
I/Gecko   ( 3014): ProvideEGLSurface: returning 0x75f923e0
The other thing that this log tells us, is that GLController::surfaceChanged needs to renew the surface before calling resumeCompositor, because that is what calls RenewSurface which calls ProvideEGLSurface which returns the current EGLSurface.
Blocks: 935910
I have a patch queue of 12 patches that claims to make all this a lot simpler and more sensible, will put up for review ASAP, meanwhile here's a try push:

https://tbpl.mozilla.org/?tree=Try&rev=71c6b1bd0ea8
Btw here are diffstats. Note that this includes in particular the removal of the GfxInfoThread (which was making all this more complicated than otherwise necessary).

 gfx/gl/GLContext.cpp                          |    14 +
 gfx/gl/GLContext.h                            |    13 +-
 gfx/gl/GLContextProviderEGL.cpp               |   172 ++++--------
 gfx/gl/SharedSurfaceANGLE.cpp                 |     2 +-
 mobile/android/base/GeckoAppShell.java        |     6 -
 mobile/android/base/gfx/GLController.java     |   113 +------
 mobile/android/base/gfx/GeckoLayerClient.java |     4 +-
 mobile/android/base/gfx/GfxInfoThread.java    |   219 ---------------
 mobile/android/base/gfx/LayerView.java        |     6 +-
 mobile/android/base/moz.build                 |     1 -
 widget/android/AndroidBridge.cpp              |    25 +-
 widget/android/AndroidBridge.h                |     3 +-
 widget/android/GeneratedJNIWrappers.cpp       |  2418 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------
 widget/android/GeneratedJNIWrappers.h         |     6 +-
 widget/android/GfxInfo.cpp                    |    89 ++---
 widget/android/GfxInfo.h                      |    14 +-
 widget/android/moz.build                      |     3 +-
 widget/xpwidgets/moz.build                    |     2 +
 18 files changed, 1365 insertions(+), 1745 deletions(-)
Previous try didn't build on b2g, this fixes it (and runs well locally):
https://tbpl.mozilla.org/?tree=Try&rev=6899cf00c9f4
All green so far, here's a android+b2g try push with all tests enabled and patch queue reordered in preparation for reviewing:

https://tbpl.mozilla.org/?tree=Try&rev=41f86d8b87fd
All green. Uploading for review in separate bugs that will block this one.
Depends on: 936968
Depends on: 937196
Depends on: 919987
Depends on: 937204
Filed four separate bugs for preliminary work to allow the changes here to be more simple: bug 936968, bug 937196, bug 919987 are cleanup around GLContext, and bug 937204 removes the GfxInfoThread that was seriously getting in the way here.

Uploading the remaining patches here.
Summary: Figure how EGLSurface renewal works on Android → Figure and fix how EGLSurface renewal works on Android
One of the biggest conceptual problems around here is that it was unclear what surfaces we were talking about. GLController had two fields, mSurfaceValid and mSurface, that were supposed to refer to two different surfaces. Indeed, mSurfaceValid effectively tracked whether the Android-side compositor (that composites our application for us) had a valid surface (this is what those surfaceDestroyed/surfaceChanged callbacks notify us about), while mSurface was the surface that we managed ourselves and that we handed off to Gecko to render to.

So the terminology introduced here is "server" for the android compositor's internal output surface, and "client" for our own surface that we hand off to Gecko to composite to. The surfaceChanged/surfaceDestroyed callbacks inform us of changes to the "server" surface, we are free to act on them as we want to, and in our case the right thing to do on these callbacks is to update our "client" surface and notify the gecko compositor.

By the end of this patch queue, both mSurfaceValid and mSurface are gone. But until then, if only to make this review process sane, let's rename mSurfaceValid to mServerSurfaceValid, while the next patch will rename mSurface to mClientSurface.
Attachment #830308 - Flags: review?(bugmail.mozilla)
As explained in previous comment.
Attachment #830309 - Flags: review?(bugmail.mozilla)
We have no choice but to let the GLContext be in charge of managing the surface's lifetime, because nothing can destroy a surface as long as a GL context is current against it:

http://www.khronos.org/registry/egl/sdk/docs/man/xhtml/eglDestroySurface.html

> "If the EGL surface surface is not current to any thread, eglDestroySurface destroys it immediately. Otherwise, surface is destroyed when it becomes not current to any thread."

This patch clarifies this situation. It was already roughly the case, but wasn't very clear. For example, on the Java side, we had a mClientSurface = null; that really was out of place.

R? kats for java side, jgilbert for GLContext side.
Attachment #830317 - Flags: review?(jgilbert)
Attachment #830317 - Flags: review?(bugmail.mozilla)
All what serverSurfaceChanged needs to do, is record the new size, and notify the compositor about that change. The compositor will then take care of renewing its own surface, which will involve calling back into java code to get a new surface.

(Is this round trip a performance problem? If yes, we could easily avoid it by keeping the existing approach to immediately create a new surface and changing the resumeCompositor API to take a EGLSurface parameter; but that would have the downside that we'd be creating the new surface before the old one gets destroyed, increasing transient memory usage by quite a bit... surfaces do get large on hi-res devices).
Attachment #830320 - Flags: review?(bugmail.mozilla)
createCompositor really is "ensure that compositor gets updated to a surface change with mWidth,mHeight; create the compositor if it wasn't already created".

As createCompositor was mostly called to update an existing compositor, its name sounded scarier than necessary.

Do you see a better name alternative?
Attachment #830323 - Flags: review?(bugmail.mozilla)
I want to clarify two things with this name change:

 1) This function is effectively creating a surface everytime it's called. The caller has responsibility to manage and destroy it. This function is not just exposing something that is already managed. 'Create' here is already standard EGL lingo, e.g. eglCreateWindowSurface

 2) The surface here is the surface for the compositor to render to. 'compositor' is consistent with other terminology around here: resumeCompositor(), updateCompositor(). It's the same compositor we're talking about. "SurfaceForCompositor" is as opposed to "ServerSurface".
Attachment #830325 - Flags: review?(bugmail.mozilla)
As said above, mServerSurfaceValid was tracking whether the Android internal code (specifically, the ViewRoot implementation) that is compositing our application, has a valid surface at the moment. It's not clear to me why we would need to know that. This was exposed and used at one place in LayerView.java, so please check that its removal here is reasonable; seems to work fine here.
Attachment #830329 - Flags: review?(bugmail.mozilla)
Attachment #830308 - Flags: review?(bugmail.mozilla) → review+
Attachment #830309 - Flags: review?(bugmail.mozilla) → review+
Attachment #830310 - Flags: review?(bugmail.mozilla) → review+
Attachment #830317 - Flags: review?(bugmail.mozilla) → review+
Attachment #830320 - Flags: review?(bugmail.mozilla) → review+
Attachment #830323 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 830325 [details] [diff] [review]
7/8: rename ProvideEGLSurface to CreateEGLSurfaceForCompositor

Review of attachment 830325 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/gfx/GLController.java
@@ +201,5 @@
>  
>          throw new GLControllerException("No suitable EGL configuration found");
>      }
>  
> +    @GeneratableAndroidBridgeTarget(allowMultithread = true, stubName = "CreateEGLSurfaceForCompositorWrapper")

You can get rid of the stubName here if you want. The JNI generator will then generate a C++ function with the same name as the Java version. The stubName was only there for backwards-compatibility, so that we didn't have to do a mass-rename when going from the hand-written versions to the autogenerated version. Going forward we should try to phase out the need for stubName unless it's absolutely needed.
Attachment #830325 - Flags: review?(bugmail.mozilla) → review+
Attachment #830329 - Flags: review?(bugmail.mozilla) → review+
Thanks for breaking the patches up like that - it made it 1000x easier to review! Overall I think all your code and renaming make good sense. I suspect this will fix a lot of bugs.

The only one concern I have is that AndroidBridge::CreateEGLSurfaceForCompositor can return nullptr in some cases and I'm not sure if the GLContextProviderEGL.cpp code deals with that correctly. I think it might because that used to be the case before but I just want to double-check that is still handled.
Comment on attachment 830317 [details] [diff] [review]
4/8: Clarify that the client surface is managed by the GLContext code. Java only provides the function to create a surface.

Review of attachment 830317 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContextProviderEGL.cpp
@@ +180,5 @@
> +static EGLSurface
> +CreateSurfaceForWindow(nsIWidget* widget, const EGLConfig& config) {
> +    EGLSurface newSurface = EGL_NO_SURFACE;
> +
> +    #ifdef MOZ_ANDROID_OMTC

I see what you're doing here with indenting for ifdefs. I don't like it, but I think it's least-bad.

@@ +192,5 @@
> +        newSurface = sEGLLibrary.fCreateWindowSurface(EGL_DISPLAY(), config, GET_NATIVE_WINDOW(widget), 0);
> +        #ifdef MOZ_WIDGET_GONK
> +            gScreenBounds.x = 0;
> +            gScreenBounds.y = 0;
> +            sEGLLibrary.fQuerySurface(EGL_DISPLAY(), newSurface, LOCAL_EGL_WIDTH, &gScreenBounds.width);

Should we call these even if CreateWindowSurface failed? Is failure here not worth considering?

@@ +304,5 @@
>          printf_stderr("Destroying context %p surface %p on display %p\n", mContext, mSurface, EGL_DISPLAY());
>  #endif
>  
>          sEGLLibrary.fDestroyContext(EGL_DISPLAY(), mContext);
> +        mozilla::gl::DestroySurface(mSurface);

I assume this is namespace-qualified so we don't hit GLContext::DestroySurface? This is sort of goopy, and I wish we didn't have to do this.
Attachment #830317 - Flags: review?(jgilbert) → review+
Previous android try was busted, new one:
https://tbpl.mozilla.org/?tree=Try&rev=464d458a7193
(In reply to Jeff Gilbert [:jgilbert] from comment #33)
> Comment on attachment 830317 [details] [diff] [review]
> 4/8: Clarify that the client surface is managed by the GLContext code. Java
> only provides the function to create a surface.
> 
> Review of attachment 830317 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLContextProviderEGL.cpp
> @@ +180,5 @@
> > +static EGLSurface
> > +CreateSurfaceForWindow(nsIWidget* widget, const EGLConfig& config) {
> > +    EGLSurface newSurface = EGL_NO_SURFACE;
> > +
> > +    #ifdef MOZ_ANDROID_OMTC
> 
> I see what you're doing here with indenting for ifdefs. I don't like it, but
> I think it's least-bad.
> 
> @@ +192,5 @@
> > +        newSurface = sEGLLibrary.fCreateWindowSurface(EGL_DISPLAY(), config, GET_NATIVE_WINDOW(widget), 0);
> > +        #ifdef MOZ_WIDGET_GONK
> > +            gScreenBounds.x = 0;
> > +            gScreenBounds.y = 0;
> > +            sEGLLibrary.fQuerySurface(EGL_DISPLAY(), newSurface, LOCAL_EGL_WIDTH, &gScreenBounds.width);
> 
> Should we call these even if CreateWindowSurface failed? Is failure here not
> worth considering?

I believe that it is worth considering, but I was only moving around existing code here. I'll file a follow-up bug about that, but don't commit to working on it, as the problem preexists (and is possibly not a problem, if QuerySurface just returns zero when called on EGL_NO_SURFACE).

> 
> @@ +304,5 @@
> >          printf_stderr("Destroying context %p surface %p on display %p\n", mContext, mSurface, EGL_DISPLAY());
> >  #endif
> >  
> >          sEGLLibrary.fDestroyContext(EGL_DISPLAY(), mContext);
> > +        mozilla::gl::DestroySurface(mSurface);
> 
> I assume this is namespace-qualified so we don't hit
> GLContext::DestroySurface? This is sort of goopy, and I wish we didn't have
> to do this.

Oh, for DestroySurface, that dates back to when this was called ReleaseSurface. Now that it's called DestroySurface, there is no risk of collision with a GLContextEGL method anymore, so I can remove mozilla::gl::
Inbound is still closed, so the best I can do is a full try run to really make sure that nothing will go wrong when I push these 15 patches as soon as the tree reopens...

https://tbpl.mozilla.org/?tree=Try&rev=f973086e453d
If this is low risk enough, after it's baked on Nightly, let's get this on Aurora and make sure it's in FF27.0b1 if nothing unexpected occurs. It's too late for 26.
(In reply to lsblakk@mozilla.com [:lsblakk] from comment #41)
> If this is low risk enough, after it's baked on Nightly, let's get this on
> Aurora and make sure it's in FF27.0b1 if nothing unexpected occurs. It's too
> late for 26.

In that light, note that there was a quite significant crash regression from this on trunk, see bug 834243 comment #56 and following.
As Robert said, we have had some pretty bad fallout from the present bug's patches, see bug 834243. It's over now, looking at crash-stats, but still, this shows how dangerous it can be to make changes in this area, as this code deals with things that are potentially varying a lot between Android versions and perhaps even devices, and when things go wrong in this area, we can get startup crashes, as we got on bug 834243. For these reasons, I would advise to let this ride all the trains to Firefox 28.
In that case let's not track this for FF27 and let it ride.
Blocks: 900033
You need to log in before you can comment on or make changes to this bug.