Last Comment Bug 670930 - Texture updates happen synchronously and harm interactive performance
: Texture updates happen synchronously and harm interactive performance
Status: RESOLVED FIXED
: mobile, perf
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: ARM Android
: P4 normal (vote)
: Firefox 12
Assigned To: James Willcox (:snorp) (jwillcox@mozilla.com)
:
: Sebastian Kaspari (:sebastian)
Mentors:
Depends on: 708307 711751 711852 712716 713569 714289
Blocks: 598864 709120
  Show dependency treegraph
 
Reported: 2011-07-12 08:01 PDT by Chris Lord [:cwiiis]
Modified: 2012-05-26 12:13 PDT (History)
26 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed
+


Attachments
Plan for using EGLImage/threads on android to speed up GL shadow layer updates (4.26 KB, text/plain)
2011-07-12 08:01 PDT, Chris Lord [:cwiiis]
no flags Details
WIP asynchronous texture-upload patch (63.71 KB, patch)
2011-07-22 04:35 PDT, Chris Lord [:cwiiis]
no flags Details | Diff | Splinter Review
WIP asynchronous texture-upload patch (64.12 KB, patch)
2011-07-22 04:37 PDT, Chris Lord [:cwiiis]
no flags Details | Diff | Splinter Review
Add Android direct texture implementation (24.05 KB, patch)
2011-12-15 07:30 PST, James Willcox (:snorp) (jwillcox@mozilla.com)
no flags Details | Diff | Splinter Review
Use direct texturing on Android when available (15.47 KB, patch)
2011-12-15 07:31 PST, James Willcox (:snorp) (jwillcox@mozilla.com)
no flags Details | Diff | Splinter Review
Use asynchronous direct texturing on Android when available (15.27 KB, patch)
2011-12-15 07:36 PST, James Willcox (:snorp) (jwillcox@mozilla.com)
pwalton: feedback+
Details | Diff | Splinter Review
Add Android direct texture implementation (24.47 KB, patch)
2011-12-15 13:16 PST, James Willcox (:snorp) (jwillcox@mozilla.com)
chrislord.net: review+
pwalton: feedback+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Splinter Review
Use asynchronous direct texturing on Android when available (15.92 KB, patch)
2011-12-15 13:18 PST, James Willcox (:snorp) (jwillcox@mozilla.com)
chrislord.net: review+
pwalton: feedback+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Splinter Review

Description Chris Lord [:cwiiis] 2011-07-12 08:01:50 PDT
Created attachment 545385 [details]
Plan for using EGLImage/threads on android to speed up GL shadow layer updates

When using GLES layers on Android, we synchronously upload the texture every time we do a cache viewport update. This greatly harms interactive performance, e.g. when panning and is especially visible on the high-resolution tablets, where that upload (using the default cache viewport sizes) is very large.

On honeycomb at least, we can use EGLImage to offload the shadow buffer updates to a thread and not block the UI. Attached is a plan to do so, I'd appreciate any comments.

Thoughts/suggestions that have already come up - pbuffers ought to be asynchronous, maybe just using pbuffers and reducing the tiled texture size might be a good boost?

Is it worth doing this when Android will eventually have EGLImage/pixmap support, allowing us to use a very similar path to the one X11 uses with texture-from-pixmap? (Android does already have this in fact, but its 'native buffer' is private and non-trivial)

Is this plan even feasible at all, have I missed out some details/will any of it actually work?

Plan attached.
Comment 1 Mark Finkle (:mfinkle) (use needinfo?) 2011-07-14 20:54:47 PDT
Will we need this for any OpenGL functionality to land in Fennec? Or is this a nice optimization?
Comment 2 Chris Lord [:cwiiis] 2011-07-16 02:17:28 PDT
We won't need this, though we may want to make other changes before switching on GL by default without this (assuming that this even works).

I've made some progress on this, but not enough to know if it'll change anything yet;

So far, I've altered GLContextProviderEGL to, when available, allocate textures using a separate GL context and bind them via EGLImage - so all texture allocation and upload is going through a separate context.

I'm in the middle of adding some API to the GL layers classes to allow for asynchronous swapping and upload, and am concentrating solely on shadow thebes layers at the moment. This is making progress - the API is almost there and it's showing signs of working (at the moment, everything works for a while, before it gradually grinds to a halt - either a leak or a count mismatch somewhere that's stalling the message-passing).

I'll continue working on this, hopefully once the async API works synchronously, adding threads ought to be trivial and we can see if it actually helps or not.

The situation may be that texture allocation/upload stalls the GL pipeline, regardless of what context it's in, in which case, there are other things we can do to help mitigate this problem (delayed upload, small default tile-size, texture re-use, etc.)
Comment 3 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-07-20 02:13:49 PDT
I'm incredibly impressed with your progress so far!  Nice work.

First things first --- this is a hard project to get right.  Do you have good profile data showing that this is the biggest main-thread lag while panning?  I'm not trying to discourage you, just want to make sure you get the most reward for your effort.  (What fps are we hitting nowadays?)  TBH, I also wouldn't recommend tackling this until there aren't major rendering+crasher bugs left to fix.  We can't turn on GL at all until the bugs are sorted, and last I was in this code the bugs were more severe than perf problems (may have changed though).

If this is the biggest perf badness, there might be some simpler things we can do to cut down on upload overhead.  First, are we repainting entire buffers or only doing partial repaints?  Last I saw, the GL shadow-layers code was blindly uploading the entire buffer no matter what, because we didn't have data to tune on.  I would strongly recommend gathering that data before tackling asynchronous texture updates.  If in common cases we're only repainting a small portion of the buffer, we should just implement partial updates and recheck perf.

If data says we need to do async texture upload, then there are of course gotchas to watch for that I think you're on top of already, like making sure there's no per-process GL lock in android, or that upload won't otherwise block the GPU (shouldn't, in theory).

If we get to this point, we still need to take a bit to think this out.  Async texture upload is going to break the current layers model, because it will mean buffer updates are no longer atomic with paint calls for that "version" of the layer tree.  This might be a nice short-term win, and maybe we should take it for that, but the longer-term plan would be to upload from the content process when we have cross-process sharing on Android.  Azure will give us another option.

If we're still heading towards async upload, then we need to watch out for
 - content-process paint of buffer X can't race with texture upload of X.  Or else we'll look like crap.  That means we might hit "upload stalls" unless we triple buffer, but triple buffering is a new memory tradeoff to measure.
 - async upload has to deal with content-process crashes.  This could be pretty hard; might need to discuss this with a patch to reference.

You were mentioning asynchronous layers updates on IRC, but that's a separate can of worms that you probably don't want to open (yet).  It's orthogonal to the texture uploading problem.  There's already a bug on file for that, which I can't find right now, sorry.  Let me know if you're interested.

Let me know if this makes sense.
Comment 4 Chris Lord [:cwiiis] 2011-07-20 07:34:06 PDT
Ok, a lot to think about there!

In terms of data, that's something I need to work on - but I've been using the GLES layers for a while now, and rather than trying to improve general performance, I'm trying to prove the very perceptible jerk when the cache viewport is updated. I'm reasonably certain that this is due to the texture upload (which is quite a huge size on a 1200x800 screen tablet), and this seems like one obvious way to fix that.

I haven't encountered any obvious crashers using GL yet, though there are some rendering bugs that I'd like to try and help with - the frame-rate is pretty good (20+) when texture updates aren't happening and not so great when they are.

Of course, partial updates would be good too, but there are legitimate occasions to update large areas, and if we can make this non-blocking, that seems like a good idea - certainly at least worth experimenting with, I'm prepared for it to not work/not help at all, at least I've learnt something about how this code works :)

I much prefer the long-term plan, and I've been thinking about this too - but is it possible to share GL resources cross-process on Android? Or rather, I know it's possible with EGL pixmap, but the API for this isn't public, which would make it rather cumbersome, especially as their 'native buffer' implementation is non-trivial (though maybe we can just cheat and do manual library loads/symbol look-ups to get what we need..) - if there are people working on this, I can lend a hand if it'd help.

My current plan is to only allow a single asynchronous update to happen at a time, so when another update comes, it'll block on the old one completing - Given that we don't update the cache viewport each frame, the likelihood of these updates bumping into each other is extremely unlikely. This is instead of the async layer updates I was talking about earlier, which you're right about, it is a can of worms I don't want to open :)

I think there's a lot of stuff I've not considered and probably will need to, and yes, tests for all of these things would be great. I'd like to try and get this patch working to see what difference it makes, as I think I'm quite close, but I'll re-examine the situation once that's done.
Comment 5 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-07-21 15:14:43 PDT
(In reply to comment #4)
> In terms of data, that's something I need to work on - but I've been using
> the GLES layers for a while now, and rather than trying to improve general
> performance, I'm trying to prove the very perceptible jerk when the cache
> viewport is updated. I'm reasonably certain that this is due to the texture
> upload (which is quite a huge size on a 1200x800 screen tablet), and this
> seems like one obvious way to fix that.

The upload is certainly a prime suspect.  Simple FunctionTimers placed across the upload calls would be naive measurements, but might give something useful.  You could also try replacing the full-buffer uploads with something much simpler, just enough to allow measure scrolling perf (like a short line of pixels every few hundred rows, each a different color). 

> I haven't encountered any obvious crashers using GL yet, though there are
> some rendering bugs that I'd like to try and help with - the frame-rate is
> pretty good (20+) when texture updates aren't happening and not so great
> when they are.

Hm, 20+ isn't so great, but we'll get there.  The rendering bugs I know of are tracked by bug 598864 or reachable from there.

> Of course, partial updates would be good too, but there are legitimate
> occasions to update large areas, and if we can make this non-blocking, that
> seems like a good idea - certainly at least worth experimenting with, I'm
> prepared for it to not work/not help at all, at least I've learnt something
> about how this code works :)

Sure.  Just to re-emphasize, I'd generally recommend getting the data before attempting something really hard like this (might find a much easier, bigger win), but learning the code is itself a nice payoff :).

> I much prefer the long-term plan, and I've been thinking about this too -
> but is it possible to share GL resources cross-process on Android? Or
> rather, I know it's possible with EGL pixmap, but the API for this isn't
> public, which would make it rather cumbersome, especially as their 'native
> buffer' implementation is non-trivial (though maybe we can just cheat and do
> manual library loads/symbol look-ups to get what we need..) - if there are
> people working on this, I can lend a hand if it'd help.

We were told by some folks in the know essentially to not attempt cross-process sharing with the current android codebase.  We have some well-placed friends who are helping us out here though, ask me privately for details if you're interested.
Comment 6 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-07-21 15:28:03 PDT
(In reply to comment #5)
> (In reply to comment #4)
> > In terms of data, that's something I need to work on - but I've been using
> > the GLES layers for a while now, and rather than trying to improve general
> > performance, I'm trying to prove the very perceptible jerk when the cache
> > viewport is updated. I'm reasonably certain that this is due to the texture
> > upload (which is quite a huge size on a 1200x800 screen tablet), and this
> > seems like one obvious way to fix that.
> 
> The upload is certainly a prime suspect.  Simple FunctionTimers placed
> across the upload calls would be naive measurements, but might give
> something useful.  You could also try replacing the full-buffer uploads with
> something much simpler, just enough to allow measure scrolling perf (like a
> short line of pixels every few hundred rows, each a different color).

Or if we had an fps meter, you could just comment out the uploads and check the meter.  Patrick Walton wrote something like that that worked on mac, I bet it could be ported to fennec.
Comment 8 Chris Lord [:cwiiis] 2011-07-22 04:35:53 PDT
Created attachment 547656 [details] [diff] [review]
WIP asynchronous texture-upload patch

Note, this patch doesn't work as such and just makes updating extremely glitchy, but this is my work-in-progress at the moment. It makes textures go through EGLImage and a separate context instead of allocating through the main context (this happens for all textures, it probably ought to be controlled by a flag).

In addition, it adds an async API to various layer and texture update functions, to allow for asynchronous swapping. I've semi-implemented this for ShadowThebesLayer and enabled double-buffering so we can upload asynchronously and the back-end doesn't have to wait for us (also so I don't have to deal with the can-of-worms of adding async replies to the content process).

This doesn't yet work for threads as (I guess/hope) I'm not allocating the context in a thread yet, so that will take a bit more work - though the threads code is in there and disabled. This patch just dispatches the update to the main thread to test that a delayed update will work (and it does, though of course this is not synchronised with the painting, so texture coordinates are updated before the textures are put in place and it doesn't redraw on upload completion).

Having gotten this far, it's opened a few more questions for me and I think I'd like to work on some simpler methods of improving our updates before this. Rather than having ginormous textures for our layers, I think we'd be much better off having a smaller tile pool, using partial updates, and reshuffling the tiles on update.

I still think we'll want asynchronous texture updates in time, but you're right, it's probably a bit early for that without exhausting our other, easier options first. Not to say I'm giving up on this, but I might shelve it for a little while.
Comment 9 Chris Lord [:cwiiis] 2011-07-22 04:37:35 PDT
Created attachment 547657 [details] [diff] [review]
WIP asynchronous texture-upload patch

Oops, missed a header...
Comment 10 Chris Lord [:cwiiis] 2011-09-07 05:36:05 PDT
Just to note, I'm looking at this again and benchmarking what we can get away with in terms of texture upload on Tegra 2 (on a Motorola Xoom running Honeycomb 3.2 to be exact).

Some preliminary results - If you're doing almost nothing else, you can get away with uploading about 392k of texture data and maintain a 60fps refresh - any higher and it starts dropping. This would equate to a 448x448, 16-bit texture. Of course, we do do other things, so this highlights how important it is not to upload too much texture data in a frame.

If we did tiled or segmented uploads, this would point at 256x256 being a sensible tile size.

Next, I'm going to test if doing texture uploads in a thread using EGLImage blocks the GPU in the main thread.
Comment 11 Chris Lord [:cwiiis] 2011-09-10 03:44:43 PDT
Just to add another note - I was totally wrong about the upload bandwidth - there's actually tonnes. It takes about 10ms (well, 9-13ms variable) to upload a 2048x2048 16-bit texture, which isn't far off from the advertised theoretical bandwidth of 1.33Gb/s (source: http://www.anandtech.com/show/2911)

What is interesting though (and this is where I made a mistake), is that if you force a conversion (say uploading a 32-bit surface to a 16-bit texture), it's roughly 50 times slower. And I found that we are doing this (on android).

Wherever possible, we need to make sure that our GL surface internal format corresponds to our gfx surface format, and we need to make sure that any conversions are done on the content side, not the chrome side.

I'm going to audit this and file the relevant bugs.

It still takes a non-trivial amount of time to do large updates, so I still believe we'll want async updates if we don't manage to find a method of doing this without copying, but I'll come back to this when it's more relevant.
Comment 12 Benjamin Stover (:stechz) 2011-09-11 06:51:22 PDT
> What is interesting though (and this is where I made a mistake), is that if
> you force a conversion (say uploading a 32-bit surface to a 16-bit texture),
> it's roughly 50 times slower. And I found that we are doing this (on
> android).

This might be a very good separate bug, as fixing this would attenuate the delay on upload effect today.
Comment 13 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2011-09-16 14:11:50 PDT
(In reply to Chris Lord [:cwiiis] from comment #10)
> Next, I'm going to test if doing texture uploads in a thread using EGLImage
> blocks the GPU in the main thread.

Have you tried this test yet?  We have the following setup on android ---
 - Java main() thread creates SurfaceView
 - Gecko thread creates an EGLSurface for that window SurfaceView
 - Gecko thread creates a GLContext for that EGLSurface that shares resources with the global GL context
 - Gecko thread creates the global GL context for an offscreen pixmap

There are several points in there where we might be hitting some cross readback, locked, or fallback path from crossing different resource pools or violating threading expectations.  Would be interesting to modify your small testcase to match that setup and see if you can repro the slowdown that way.
Comment 14 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 07:30:34 PST
Created attachment 581961 [details] [diff] [review]
Add Android direct texture implementation
Comment 15 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 07:31:02 PST
Created attachment 581962 [details] [diff] [review]
Use direct texturing on Android when available
Comment 16 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 07:36:15 PST
Created attachment 581965 [details] [diff] [review]
Use asynchronous direct texturing on Android when available
Comment 17 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 07:39:26 PST
The attached patches use the direct texture mechanism on Android to upload textures asynchronously from the Gecko thread.
Comment 18 Patrick Walton (:pcwalton) 2011-12-15 08:13:43 PST
So I had assumed we were doing the double buffering on the Java side -- using two textures (with separate IDs) and swap between them every frame, instead of using one texture and swapping buffers every frame. This would give us a path to a performant fallback for devices in which gralloc isn't supported. On the other hand, I know you were having a lot of trouble getting double buffering to work at all, so I don't know whether this is feasible.
Comment 19 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 08:17:08 PST
(In reply to Patrick Walton (:pcwalton) from comment #18)
> So I had assumed we were doing the double buffering on the Java side --
> using two textures (with separate IDs) and swap between them every frame,
> instead of using one texture and swapping buffers every frame. This would
> give us a path to a performant fallback for devices in which gralloc isn't
> supported. On the other hand, I know you were having a lot of trouble
> getting double buffering to work at all, so I don't know whether this is
> feasible.

We can do that too. Having multiple textures doesn't buy us anything in the direct texture case, but it doesn't hurt either (we'd still need two AndroidGraphicBufferS). I guess you want to create an offscreen context for the Gecko thread and share the textures (as cjones suggests)?
Comment 20 Patrick Walton (:pcwalton) 2011-12-15 08:20:47 PST
Comment on attachment 581965 [details] [diff] [review]
Use asynchronous direct texturing on Android when available

Review of attachment 581965 [details] [diff] [review]:
-----------------------------------------------------------------

This is awesome work.

::: mobile/android/base/GeckoAppShell.java
@@ +430,5 @@
> +        // If we have direct texture available, use it
> +        if (GeckoAppShell.testDirectTexture()) {
> +            Log.i(LOGTAG, "Using direct texture for widget layer");
> +            GeckoApp.mAppContext.getSoftwareLayerClient().installWidgetLayer();
> +        }

Could this be moved into the GeckoSoftwareLayerClient constructor?

::: mobile/android/base/gfx/GeckoSoftwareLayerClient.java
@@ +195,5 @@
>              Rect rect = new Rect(x, y, x + width, y + height);
> +
> +            if (mTileLayer instanceof SingleTileLayer) {
> +                ((SingleTileLayer)mTileLayer).invalidate(rect);
> +            }

nit: no {} in this file

::: mobile/android/base/gfx/WidgetTileLayer.java
@@ +66,5 @@
> +    protected void bindAndSetGLParameters() {
> +        GLES11.glBindTexture(GL10.GL_TEXTURE_2D, mTextureIDs[0]);
> +        GLES11.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST);
> +        GLES11.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR);
> +    }

This duplicates code in the TileLayer (or the SingleTileLayer, I forget which). Could you factor this out, perhaps into a protected function of Layer?
Comment 21 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 08:26:49 PST
(In reply to Patrick Walton (:pcwalton) from comment #20)
> Comment on attachment 581965 [details] [diff] [review]
> Use asynchronous direct texturing on Android when available
> 
> Review of attachment 581965 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is awesome work.
> 
> ::: mobile/android/base/GeckoAppShell.java
> @@ +430,5 @@
> > +        // If we have direct texture available, use it
> > +        if (GeckoAppShell.testDirectTexture()) {
> > +            Log.i(LOGTAG, "Using direct texture for widget layer");
> > +            GeckoApp.mAppContext.getSoftwareLayerClient().installWidgetLayer();
> > +        }
> 
> Could this be moved into the GeckoSoftwareLayerClient constructor?

Sadly not because GeckoSoftwareLayerClient is constructed before we've loaded the native libs, so it would crash. It's a crappy situation.

> 
> ::: mobile/android/base/gfx/GeckoSoftwareLayerClient.java
> @@ +195,5 @@
> >              Rect rect = new Rect(x, y, x + width, y + height);
> > +
> > +            if (mTileLayer instanceof SingleTileLayer) {
> > +                ((SingleTileLayer)mTileLayer).invalidate(rect);
> > +            }
> 
> nit: no {} in this file

Fixed

> 
> ::: mobile/android/base/gfx/WidgetTileLayer.java
> @@ +66,5 @@
> > +    protected void bindAndSetGLParameters() {
> > +        GLES11.glBindTexture(GL10.GL_TEXTURE_2D, mTextureIDs[0]);
> > +        GLES11.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST);
> > +        GLES11.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_LINEAR);
> > +    }
> 
> This duplicates code in the TileLayer (or the SingleTileLayer, I forget
> which). Could you factor this out, perhaps into a protected function of
> Layer?

The problem is that [Tile]Layer doesn't have any texture IDs, only SingleTileLayer. So we'd have to move up the texture generation as well. I don't think the hierarchy is quite right, but I'm not really sure of the correct way either.

I just discovered that the current code crashes on the Droid Pro with Chris' tile size decoupling patch so I'm figuring that out now (sigh).
Comment 22 Patrick Walton (:pcwalton) 2011-12-15 08:30:33 PST
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #19)
> We can do that too. Having multiple textures doesn't buy us anything in the
> direct texture case, but it doesn't hurt either (we'd still need two
> AndroidGraphicBufferS). I guess you want to create an offscreen context for
> the Gecko thread and share the textures (as cjones suggests)?

Oh, I see, you don't need texture sharing this way. Let's just leave it the way it is then -- no need to add additional complexity.
Comment 23 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 13:16:49 PST
Created attachment 582076 [details] [diff] [review]
Add Android direct texture implementation
Comment 24 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-15 13:18:05 PST
Created attachment 582079 [details] [diff] [review]
Use asynchronous direct texturing on Android when available
Comment 25 Patrick Walton (:pcwalton) 2011-12-15 22:42:16 PST
Comment on attachment 582076 [details] [diff] [review]
Add Android direct texture implementation

Review of attachment 582076 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

::: widget/src/android/AndroidGraphicBuffer.cpp
@@ +317,5 @@
> +AndroidGraphicBuffer::Reallocate(PRUint32 aWidth, PRUint32 aHeight, gfxImageFormat aFormat)
> +{
> +  EnsureInitialized();
> +
> +  if (sGLFunctions.fGraphicBufferReallocate(mHandle, aWidth, aHeight, GetAndroidFormat(aFormat)) != 0) {

Might want to add a comment here saying that some devices actually don't have this, so someone doesn't try to remove this later.
Comment 26 Patrick Walton (:pcwalton) 2011-12-15 22:46:07 PST
Comment on attachment 582079 [details] [diff] [review]
Use asynchronous direct texturing on Android when available

Review of attachment 582079 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.

::: mobile/android/base/GeckoAppShell.java
@@ +428,5 @@
>          Log.i(LOGTAG, "post native init");
>  
> +        // If we have direct texture available, use it
> +        if (GeckoAppShell.testDirectTexture()) {
> +            Log.i(LOGTAG, "Using direct texture for widget layer");

It'd be really cool if we could add this to about:support somehow. It would also perhaps shame device manufacturers into supporting this important API. Just a nice-to-have though, doesn't need to (and shouldn't) be part of this patch.
Comment 27 Chris Lord [:cwiiis] 2011-12-16 02:39:11 PST
Comment on attachment 582076 [details] [diff] [review]
Add Android direct texture implementation

Review of attachment 582076 [details] [diff] [review]:
-----------------------------------------------------------------

r+ as long as the EnsureInitialized() return values in AndroidGraphicBuffer.cpp are checked.

::: widget/src/android/AndroidDirectTexture.cpp
@@ +42,5 @@
> +namespace mozilla {
> +
> +AndroidDirectTexture::AndroidDirectTexture(PRUint32 width, PRUint32 height, PRUint32 usage,
> +                                           gfxImageFormat format) :
> +  mLock("AndroidDirectTexture.mLock"), mNeedFlip(false), mWidth(width), mHeight(height), mFormat(format), mPendingReallocBuffer(NULL)

Minor, most files break these up by line and align on the comma, like

  mLock("AndroidDirectTexture.mLock")
, mNeedFlip(false)
, mWidth(width)

etc.

@@ +52,5 @@
> +AndroidDirectTexture::~AndroidDirectTexture()
> +{
> +  if (mFrontBuffer) {
> +    delete mFrontBuffer;
> +    mFrontBuffer = NULL;

Are there issues here if this is deleted while it's bound?

@@ +57,5 @@
> +  }
> +
> +  if (mBackBuffer) {
> +    delete mBackBuffer;
> +    mBackBuffer = NULL;

Are there issues here if this is deleted while it's locked?

::: widget/src/android/AndroidGraphicBuffer.cpp
@@ +220,5 @@
> +namespace mozilla {
> +
> +static bool checkGLError(const char* name)
> +{
> +  bool result = true;

This returning true when there's no error seems backwards to me, given its name.

@@ +234,5 @@
> +
> +AndroidGraphicBuffer::AndroidGraphicBuffer(PRUint32 width, PRUint32 height, PRUint32 usage,
> +                                           gfxImageFormat format) :
> +  mWidth(width), mHeight(height), mUsage(usage), mFormat(format),
> +  mHandle(0), mEGLImage(0)

Same as before re: alignment

@@ +267,5 @@
> +AndroidGraphicBuffer::Create(PRUint32 aWidth, PRUint32 aHeight, PRUint32 aUsage, gfxImageFormat aFormat)
> +{
> +  if (!mHandle) {
> +    mHandle = malloc(GRAPHIC_BUFFER_SIZE);
> +    sGLFunctions.fGraphicBufferCtor(mHandle, mWidth, mHeight, GetAndroidFormat(mFormat), GetAndroidUsage(aUsage));

Should there be an mFormat = aFormat before this? Or should this be aFormat?

@@ +280,5 @@
> +  if (!sGLFunctions.EnsureInitialized()) {
> +    return false;
> +  }
> +
> +  Create(mWidth, mHeight, mUsage, mFormat);

This confused me for a second - it's probably worth commenting that running Create only creates if the buffer hasn't yet been created, or has been destroyed.

@@ +287,5 @@
> +
> +bool
> +AndroidGraphicBuffer::Lock(PRUint32 aUsage, unsigned char **bits)
> +{
> +  EnsureInitialized();

Here, and in other places, EnsureInitialized is called, but the return value isn't checked - these should definitely be replaced with returns if EnsureInitialized returns false, as it'll almost certainly be followed by uninitialised memory access.
Comment 28 Chris Lord [:cwiiis] 2011-12-16 02:56:46 PST
Comment on attachment 582079 [details] [diff] [review]
Use asynchronous direct texturing on Android when available

Review of attachment 582079 [details] [diff] [review]:
-----------------------------------------------------------------

I would prefer not to have a static DirectTexture, and open up enough of the API so that Java can allocate the DirectTexture itself and tell nsWindow about it somehow (new message, or piggy-backing on an existing message like SIZE_CHANGED). This would allow the Java-side changes to be a bit cleaner and I'm worried about possible race-conditions doing it like this too...

On the other hand, this is minor compared to the awesomeness this allows, and I'm not saying this is a worse way either, just different. r+ from me :)

I suspect that we may need to have a black-list if we enable this, but better to commit this, find out, and fix it in-tree imo.

::: widget/src/android/nsWindow.cpp
@@ +97,5 @@
> +#include "mozilla/Mutex.h"
> +#include "nsThreadUtils.h"
> +#include "AndroidDirectTexture.h"
> +
> +static AndroidDirectTexture* sDirectTexture = new AndroidDirectTexture(2048, 2048,

This gets reallocated anyway if the size is different, couldn't we start with something smaller? (or nsnull?)

@@ +835,5 @@
> +  // If we already tested, return early
> +  if (!sHasDirectTexture)
> +    return false;
> +
> +  AndroidGraphicBuffer* buffer = new AndroidGraphicBuffer(512, 512,

Do we really want to test with such a large allocation? Also, couldn't you just do this test on sDirectTexture?

@@ +1197,5 @@
>              ALOG("### Failed to create a valid surface from the bitmap");
>          } else {
> +            if (sHasDirectTexture) {
> +              // XXX: lock only the dirty rect above and pass it in here
> +              DrawTo(targetSurface);

We spoke on IRC about doing the offset - this is fine, but if we commit this, we need to make sure to file a follow-up bug for only locking the dirty rectangle. Shouldn't be hard to add though, maybe we can talk about it later.
Comment 29 Chris Lord [:cwiiis] 2011-12-16 03:17:28 PST
(In reply to Patrick Walton (:pcwalton) from comment #26)
> It'd be really cool if we could add this to about:support somehow. It would
> also perhaps shame device manufacturers into supporting this important API.
> Just a nice-to-have though, doesn't need to (and shouldn't) be part of this
> patch.

+1 on this btw.
Comment 30 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-16 06:31:31 PST
(In reply to Chris Lord [:cwiiis] from comment #27)
> Comment on attachment 582076 [details] [diff] [review]
> Add Android direct texture implementation
> 
> Review of attachment 582076 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r+ as long as the EnsureInitialized() return values in
> AndroidGraphicBuffer.cpp are checked.
> 
> ::: widget/src/android/AndroidDirectTexture.cpp
> @@ +42,5 @@
> > +namespace mozilla {
> > +
> > +AndroidDirectTexture::AndroidDirectTexture(PRUint32 width, PRUint32 height, PRUint32 usage,
> > +                                           gfxImageFormat format) :
> > +  mLock("AndroidDirectTexture.mLock"), mNeedFlip(false), mWidth(width), mHeight(height), mFormat(format), mPendingReallocBuffer(NULL)
> 
> Minor, most files break these up by line and align on the comma, like
> 
>   mLock("AndroidDirectTexture.mLock")
> , mNeedFlip(false)
> , mWidth(width)
> 
> etc.

Thanks, fixed.

> 
> @@ +52,5 @@
> > +AndroidDirectTexture::~AndroidDirectTexture()
> > +{
> > +  if (mFrontBuffer) {
> > +    delete mFrontBuffer;
> > +    mFrontBuffer = NULL;
> 
> Are there issues here if this is deleted while it's bound?

Yes.

> 
> @@ +57,5 @@
> > +  }
> > +
> > +  if (mBackBuffer) {
> > +    delete mBackBuffer;
> > +    mBackBuffer = NULL;
> 
> Are there issues here if this is deleted while it's locked?

Yes.

Deleting one of these will be tricky. You need be sure that it's no longer in use anywhere (including the framebuffer). We can't guarantee that our destructor, though. I'll add a comment.

> 
> ::: widget/src/android/AndroidGraphicBuffer.cpp
> @@ +220,5 @@
> > +namespace mozilla {
> > +
> > +static bool checkGLError(const char* name)
> > +{
> > +  bool result = true;
> 
> This returning true when there's no error seems backwards to me, given its
> name.

Fair enough. I changed it to ensureNoGLError()

> 
> @@ +234,5 @@
> > +
> > +AndroidGraphicBuffer::AndroidGraphicBuffer(PRUint32 width, PRUint32 height, PRUint32 usage,
> > +                                           gfxImageFormat format) :
> > +  mWidth(width), mHeight(height), mUsage(usage), mFormat(format),
> > +  mHandle(0), mEGLImage(0)
> 
> Same as before re: alignment

Fixed

> 
> @@ +267,5 @@
> > +AndroidGraphicBuffer::Create(PRUint32 aWidth, PRUint32 aHeight, PRUint32 aUsage, gfxImageFormat aFormat)
> > +{
> > +  if (!mHandle) {
> > +    mHandle = malloc(GRAPHIC_BUFFER_SIZE);
> > +    sGLFunctions.fGraphicBufferCtor(mHandle, mWidth, mHeight, GetAndroidFormat(mFormat), GetAndroidUsage(aUsage));
> 
> Should there be an mFormat = aFormat before this? Or should this be aFormat?

Nice catch, that should be aFormat. Fixed.

> 
> @@ +280,5 @@
> > +  if (!sGLFunctions.EnsureInitialized()) {
> > +    return false;
> > +  }
> > +
> > +  Create(mWidth, mHeight, mUsage, mFormat);
> 
> This confused me for a second - it's probably worth commenting that running
> Create only creates if the buffer hasn't yet been created, or has been
> destroyed.

Yeah, maybe EnsureBufferCreated is better.

> 
> @@ +287,5 @@
> > +
> > +bool
> > +AndroidGraphicBuffer::Lock(PRUint32 aUsage, unsigned char **bits)
> > +{
> > +  EnsureInitialized();
> 
> Here, and in other places, EnsureInitialized is called, but the return value
> isn't checked - these should definitely be replaced with returns if
> EnsureInitialized returns false, as it'll almost certainly be followed by
> uninitialised memory access.

Quite right; fixed.
Comment 31 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-16 06:41:05 PST
(In reply to Chris Lord [:cwiiis] from comment #28)
> Comment on attachment 582079 [details] [diff] [review]
> Use asynchronous direct texturing on Android when available
> 
> Review of attachment 582079 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I would prefer not to have a static DirectTexture, and open up enough of the
> API so that Java can allocate the DirectTexture itself and tell nsWindow
> about it somehow (new message, or piggy-backing on an existing message like
> SIZE_CHANGED). This would allow the Java-side changes to be a bit cleaner
> and I'm worried about possible race-conditions doing it like this too...

Yeah, I figured doing it this way would be the least error prone, but what you describe sounds nicer for the most part. We should consider it.

> 
> On the other hand, this is minor compared to the awesomeness this allows,
> and I'm not saying this is a worse way either, just different. r+ from me :)
> 
> I suspect that we may need to have a black-list if we enable this, but
> better to commit this, find out, and fix it in-tree imo.

Yeah, I really hope not. The goal is for nsWindow::HasDirectTexture to filter out that sort of nonsense. So far it seems to work, at least.

> 
> ::: widget/src/android/nsWindow.cpp
> @@ +97,5 @@
> > +#include "mozilla/Mutex.h"
> > +#include "nsThreadUtils.h"
> > +#include "AndroidDirectTexture.h"
> > +
> > +static AndroidDirectTexture* sDirectTexture = new AndroidDirectTexture(2048, 2048,
> 
> This gets reallocated anyway if the size is different, couldn't we start
> with something smaller? (or nsnull?)

Starting with nsnull requires us to add a ton of checking for that, since we don't get a valid size before the first paint somehow. I guess we could make it smaller, but I figured it would be better to err on the side of correctness since most of the time it actually is going to be 2048x2048.

> 
> @@ +835,5 @@
> > +  // If we already tested, return early
> > +  if (!sHasDirectTexture)
> > +    return false;
> > +
> > +  AndroidGraphicBuffer* buffer = new AndroidGraphicBuffer(512, 512,
> 
> Do we really want to test with such a large allocation? Also, couldn't you
> just do this test on sDirectTexture?

Well we're going to be doing much larger allocations than that with real usage, so I didn't see a problem with using 512 here. The reason I'm not just doing it on sDirectTexture is because I'm also testing Reallocate() and I didn't want to have to put it back to whatever sDirectTexture started out with. I don't feel strongly about it though.
> 
> @@ +1197,5 @@
> >              ALOG("### Failed to create a valid surface from the bitmap");
> >          } else {
> > +            if (sHasDirectTexture) {
> > +              // XXX: lock only the dirty rect above and pass it in here
> > +              DrawTo(targetSurface);
> 
> We spoke on IRC about doing the offset - this is fine, but if we commit
> this, we need to make sure to file a follow-up bug for only locking the
> dirty rectangle. Shouldn't be hard to add though, maybe we can talk about it
> later.

Yeah, I definitely want to fix that. It should make things like animated gifs much better. Maybe we can knock that out today after landing this initial patch.
Comment 32 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-16 11:07:38 PST
Pushed to mozilla-inbound

http://hg.mozilla.org/integration/mozilla-inbound/rev/ba92c42f126f
http://hg.mozilla.org/integration/mozilla-inbound/rev/8a5cc33141a3
Comment 33 Patrick Walton (:pcwalton) 2011-12-16 18:59:02 PST
Is there any way we can fall back to asynchronous texture upload if the device doesn't support gralloc? This link has an example of how it works (note that the stock Honeycomb browser presumably is using it): http://groups.google.com/group/android-platform/msg/d1457a38055f5996

Looks like the method is quite similar to gralloc, except that we use EGL_GL_TEXTURE_2D_KHR instead of EGL_NATIVE_BUFFER_ANDROID.
Comment 35 Matt Brubeck (:mbrubeck) 2011-12-17 09:17:49 PST
The second part was backed out because it broke XUL fennec:
https://hg.mozilla.org/mozilla-central/rev/f39dab2d2adb
Comment 36 Matt Brubeck (:mbrubeck) 2011-12-17 09:19:37 PST
Oh, but it re-landed.  (Sorry, I'm merging a lot of changesets at once... If you can leave a comment or a whiteboard message when things are landed or backed out on inbound, it helps with the merging process...)

https://hg.mozilla.org/mozilla-central/rev/74277c18e7ce
Comment 37 Doug Turner (:dougt) 2011-12-17 22:09:43 PST
backed out.
Comment 38 Patrick Walton (:pcwalton) 2011-12-18 01:02:33 PST
Why was this backed out? Was it due to the crashes in bug 711751 and bug 711694?

We may want to just whitelist phones for gralloc and focus on async texture upload (bug 711687) as a fallback/safer alternative.
Comment 39 James Willcox (:snorp) (jwillcox@mozilla.com) 2011-12-21 12:50:36 PST
This is back on inbound once more:

http://hg.mozilla.org/integration/mozilla-inbound/rev/7aa1b9862e89
http://hg.mozilla.org/integration/mozilla-inbound/rev/402f8ea2cbd7
Comment 40 Brad Lassey [:blassey] (use needinfo?) 2011-12-21 14:22:49 PST
the bug isn't fixed until this makes it to mozilla-central.

Also, just a friendly reminder to flag these for uplift to aurora
Comment 42 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-06 23:02:04 PST
Comment on attachment 582076 [details] [diff] [review]
Add Android direct texture implementation

[Approval Request Comment]
Needed for performance on aurora. This has baked on m-c for a while
Comment 43 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-06 23:02:28 PST
Comment on attachment 582079 [details] [diff] [review]
Use asynchronous direct texturing on Android when available

[Approval Request Comment]
Needed for performance on aurora. This has baked on m-c for a while
Comment 44 Alex Keybl [:akeybl] 2012-01-09 14:49:31 PST
Comment on attachment 582076 [details] [diff] [review]
Add Android direct texture implementation

[Triage Comment]
Mobile only - approving for Aurora.
Comment 45 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-10 14:06:28 PST
(In reply to Alex Keybl [:akeybl] from comment #44)
> Comment on attachment 582076 [details] [diff] [review]
> Add Android direct texture implementation
> 
> [Triage Comment]
> Mobile only - approving for Aurora.

This bug depends on bug 708307, bug 711426 and bug 713774. We should take them on aurora all at once.
Comment 46 Brad Lassey [:blassey] (use needinfo?) 2012-01-13 08:39:01 PST
added known/identified dependencies. To be clear, this should land on aurora along with all its dependent bugs when they're fixed.

Note You need to log in before you can comment on or make changes to this bug.