Closed Bug 730718 Opened 8 years ago Closed 7 years ago

Use EGLImageKHR to avoid texture upload on the compositor side

Categories

(Firefox for Android :: General, defect, P1)

ARM
Android
defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
blocking-fennec1.0 --- -
fennec + ---

People

(Reporter: pcwalton, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: maple [has proposed patch][needs feedback bgirard])

Attachments

(2 files, 15 obsolete files)

18.46 KB, patch
Details | Diff | Splinter Review
9.45 KB, patch
Details | Diff | Splinter Review
Performing texture sharing using EGLImageKHR linked to an off-main-thread EGL context can be used to do true asynchronous texture upload, at least on some phones. I wrote up a testcase and tested on the Droid RAZR (a Gingerbread phone). I composited as quickly as possible while simultaneously continuously uploading 1024x1024 textures. Textures uploaded at a rate of 15 per second, but compositing stayed at a constant 60 fps, indicating that the texture upload is truly asynchronous.

This approach has a couple of nice advantages over gralloc and SurfaceTexture:
(1) It's a public API available on all platforms.
(2) WebKit on Android uses this. Interestingly, they deprecated SurfaceTexture use on ICS in favor of this approach. Presumably Google Chrome is using this, which means that phone GPU manufacturers will have to test it.

I assume the best way to do this is to perform the image upload on the content side (rather than on the compositor side), using the MOZ_HAVE_PLATFORM_SPECIFIC_LAYER_BUFFERS infrastructure to wrap an EGLImageKHR in a custom SurfaceDescriptor. This would allow the compositor to stay responsive while upload is happening on the content side. We'd have to create a dummy pbuffer GL context on the content side, which we haven't had to do before, but it'd just be so that we can perform texture upload; we wouldn't actually be rendering anything using GL calls. Chris, do you think this will work?
I'm glad the experiment worked. The other alternative is to perform the texture upload on the compositor using a worker thread to process upload asynchronously to reduce main thread contention. The drawback is this solution will require an extra buffer per layer so we would have to do another experiment to confirm if this is better then uploading on the content side.
(In reply to Benoit Girard (:BenWa) from comment #1)
> I'm glad the experiment worked. The other alternative is to perform the
> texture upload on the compositor using a worker thread to process upload
> asynchronously to reduce main thread contention. The drawback is this
> solution will require an extra buffer per layer so we would have to do
> another experiment to confirm if this is better then uploading on the
> content side.

We'd have to be able to call Composite() while processing Thebes swap events in that case, which would require a nested event loop, unless I'm missing something. Seems more complex than just doing the upload content side.
Not quite but actually since we need to complete the transaction atomically the worker would need to be on the content side.

Here's more details:
1) Content prepares the transactions like useful including the gfxSurface.
2) Content adds the gfxSurface's to the upload work queue.
3) Once the uploads are completed they replace the gfxSurface with EGLImageKHR.
4) Transaction is sent to the compositor where the textures are replaced.

The advantage is not blocking either content or the compositor while uploads are taking place.
(In reply to Patrick Walton (:pcwalton) from comment #0)
> Performing texture sharing using EGLImageKHR linked to an off-main-thread
> EGL context can be used to do true asynchronous texture upload, at least on
> some phones. I wrote up a testcase and tested on the Droid RAZR (a
> Gingerbread phone). I composited as quickly as possible while simultaneously
> continuously uploading 1024x1024 textures. Textures uploaded at a rate of 15
> per second, but compositing stayed at a constant 60 fps, indicating that the
> texture upload is truly asynchronous.

Pretty cool. Did you only try on the RAZR? Can you post the APK or a tarball?

> 
> I assume the best way to do this is to perform the image upload on the
> content side (rather than on the compositor side), using the
> MOZ_HAVE_PLATFORM_SPECIFIC_LAYER_BUFFERS infrastructure to wrap an
> EGLImageKHR in a custom SurfaceDescriptor. This would allow the compositor
> to stay responsive while upload is happening on the content side. We'd have
> to create a dummy pbuffer GL context on the content side, which we haven't
> had to do before, but it'd just be so that we can perform texture upload; we
> wouldn't actually be rendering anything using GL calls. Chris, do you think
> this will work?

You should verify that it's actually possible to create a pbuffer context in addition to the on-screen one. IIRC, some drivers only allow one context per process, which would mean we are totally hosed.
Whiteboard: maple
Attached patch WIP patch. (obsolete) — Splinter Review
Here's an early WIP. It's totally untested and not hooked up to the layer manager yet, but should give an idea of what I'm going for...
Attachment #601193 - Flags: feedback?(bgirard)
Blocks: 729082
Comment on attachment 601193 [details] [diff] [review]
WIP patch.

I don't think I can give good feedback yet since there is too much missing code for the overall design to be clear.

From what I can tell:
You're creating a thread to do the upload and you're passing |aChangeset| to them however these objects aren't deep-copied and will be modified when content continue with the next transaction. We should discuss a design and confirm them with a few measures before we implement this.
Attachment #601193 - Flags: feedback?(bgirard)
Attached patch WIP patch, part 2. (obsolete) — Splinter Review
Here's another WIP using the method we discussed on IRC. This time I'm starting from the compositor side and working backward. Consequently, the actual texture upload on the content side isn't implemented yet. I need to merge the code in WIP part 1 that performs the texture upload with this.
Attached patch WIP patch, version 2. (obsolete) — Splinter Review
Here's a version that should be more or less feature-complete. Totally untested, will undoubtedly need tweaking.

On an unrelated note, I think we need to use EGL image fences on Mali GPUs.
Attachment #601193 - Attachment is obsolete: true
Attachment #601543 - Attachment is obsolete: true
blocking-fennec1.0: --- → ?
I need to change the approach a little bit -- looks like the EGLImageKHR should actually be in the surface *descriptor*, not the surface. Not a big deal, some of this is done already.
No longer blocks: land-maple
Attached patch WIP patch, version 3. (obsolete) — Splinter Review
Patch version 3 adds the new surface descriptor logic. Currently crashing.
Attachment #601885 - Attachment is obsolete: true
I filed bug 734164 to track the issue that this is trying to solve
blocking-fennec1.0: ? → ---
blocking-fennec1.0: --- → ?
This was nomed and cleared; made a mistake in renoming.  Clearing nom again.
blocking-fennec1.0: ? → ---
Attached patch WIP patch, version 4. (obsolete) — Splinter Review
Here's a new WIP. Often you see Thebes content rendered generally correctly on the screen now (with the wrong shader), but it crashes frequently in ThebesLayerBuffer::RenderTo. I suspect there's some kind of race causing the texture image to be null.

Additionally, the code is quite ugly and textures aren't yet getting deleted properly.
Attachment #603166 - Attachment is obsolete: true
Attached patch WIP patch, version 5. (obsolete) — Splinter Review
WIP version 5 fixes the crashes and GPU memory OOMs.
Attachment #604593 - Attachment is obsolete: true
Attached patch WIP patch, version 6. (obsolete) — Splinter Review
Here's a significantly cleaned-up version. Logging is still on in the important places, but the amount of logging is less exorbitant.

The biggest issue is that we're uploading the whole texture every frame. This is because I don't quite understand how sub-layer uploads work yet. I'll try some things, but advice from layers experts would be helpful here. (The relevant code is in ShadowLayerUtilsEGL.cpp, TextureUploader::UploadSurfaceToTexture and EGLImageTextureKHR::UploadSurface. The commented out code in the latter results in incorrect rendering.)
Attachment #605529 - Attachment is obsolete: true
Attachment #605839 - Flags: feedback?(bgirard)
Try run for 81ab9aab6d6b is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=81ab9aab6d6b
Results (out of 101 total builds):
    success: 72
    warnings: 29
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-81ab9aab6d6b
Current status: The driver isn't letting me do subtexture uploads. Currently seeing if I can get away with some trick to avoid double buffering...
Attached patch WIP patch, version 7. (obsolete) — Splinter Review
Rebased to tip. Fixed the black screen issue (wasn't setting the mInitialised flag). There's still lots of debugging stuff that I'll remove tomorrow.
Attachment #605839 - Attachment is obsolete: true
Attachment #605839 - Flags: feedback?(bgirard)
Attached patch WIP patch, version 8. (obsolete) — Splinter Review
Here's a new WIP. Debugging stuff is removed, shaders are corrected. Issues, in descending order of severity:

(1) The Java-rendered decorations are usually, but not always, hidden for some reason. (I think this is a known issue even with the current setup.)

(2) We get races, because glTexImage2D() can finish, causing the compositor thread's view of the texture to change, before the compositor thread receives the layers update. The solution, unfortunately, is going to be to double buffer, although when the TiledThebesLayer stuff hits we may be able to get rid of some of that.

(3) More frequently than I'd like, the displayport is too big for the max texture size, so we disable the content-side texture upload. This can be fixed in browser.js or with tiling, but I'd like it to be fixed in browser.js if possible, because we really shouldn't have layer buffers more than 2048 pixels on a side.
Attachment #608179 - Attachment is obsolete: true
(In reply to Patrick Walton (:pcwalton) from comment #19)
> (1) The Java-rendered decorations are usually, but not always, hidden for
> some reason. (I think this is a known issue even with the current setup.)

Yeah, bug 736177.

> (3) More frequently than I'd like, the displayport is too big for the max
> texture size, so we disable the content-side texture upload. This can be
> fixed in browser.js or with tiling, but I'd like it to be fixed in
> browser.js if possible, because we really shouldn't have layer buffers more
> than 2048 pixels on a side.

Yeah, I've been working on different strategies for display port sizing. I agree we need to reduce the size of the display port for the common cases.
(In reply to Kartikaya Gupta (:kats) from comment #20)
> (In reply to Patrick Walton (:pcwalton) from comment #19)
> > (1) The Java-rendered decorations are usually, but not always, hidden for
> > some reason. (I think this is a known issue even with the current setup.)
> 
> Yeah, bug 736177.
> 
> > (3) More frequently than I'd like, the displayport is too big for the max
> > texture size, so we disable the content-side texture upload. This can be
> > fixed in browser.js or with tiling, but I'd like it to be fixed in
> > browser.js if possible, because we really shouldn't have layer buffers more
> > than 2048 pixels on a side.
> 
> Yeah, I've been working on different strategies for display port sizing. I
> agree we need to reduce the size of the display port for the common cases.

It'd be nice to cap the display port at 2048x2048 pixels, which places it within the max texture size on most devices. Having it larger doesn't break anything (as the EGLImageKHR path just disables itself), but it'll make things slow.
Attached patch WIP patch, version 9. (obsolete) — Splinter Review
This version implements double buffering to fix the flicker. There are two main issues:

(1) Mali doesn't work (black screens).

(2) More seriously, double buffering results in a *lot* of checkerboarding when panning, since we have to upload a much larger area each frame (more precisely, the union of the invalid region of this frame with the invalid region of the previous frame). I'm starting to think that, if the region to be uploaded is relatively small, we should just lock the front buffer and update it directly, and only use the back buffer if the whole texture (or near enough to make no difference) is changing.
Attachment #608575 - Attachment is obsolete: true
Are you copying to updated area from new-front->new-back?  Otherwise as you point out the invalid region for each paint converges to the entire window.
Patrick, just got my gralloc stuff to work with mali (Nexus S). Take a look at surfaceflinger. It does a bunch of glEnable(GL_TEXTURE_2D). I do it right now right after a Clear (renable GL_TEXTURE_2D). Do it for GLContextProviderEGL and also the WebGL context.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #23)
> Are you copying to updated area from new-front->new-back?  Otherwise as you
> point out the invalid region for each paint converges to the entire window.

It's just the union of the invalid region of the *last* paint and the invalid region of this paint. Only two invalid regions at any point are unioned, so there's no convergence.
-ing to get off the triage list. This is still a possible solution for bug 734164 which is blocking release.
blocking-fennec1.0: --- → -
Depends on: 739679
Attached patch WIP patch, version 10. (obsolete) — Splinter Review
New WIP is based on top of the work in bug 739679, which makes this a lot easier. It's compiling, but untested.

A lot of the code to double buffer, etc. can be removed once this is working.
Attachment #608949 - Attachment is obsolete: true
Currently refactoring to eliminate display corruption and crashes. It's not working yet, but looks promising. Hard to say what the ETA is at this point.
Re-noming since gralloc is not the best path for us.
blocking-fennec1.0: - → ?
blocking-fennec1.0: ? → beta+
Attached patch WIP patch, part 2. (obsolete) — Splinter Review
Here's a rewrite, based on the newest incarnation of the tiling work, and with a design discussed with BenWa. It works on the Mac desktop, using off-main-thread texture upload. (Doing development on the Mac literally improved my productivity by something like 5x-10x.) Next steps will be to merge the EGLImageKHR stuff from patch part 1 into one big patch.
Priority: -- → P1
Attached patch WIP patch, part 2, version 2. (obsolete) — Splinter Review
Mac part 2 fixes races and rebases on top of BenWa's latest work.
Attachment #614668 - Attachment is obsolete: true
Attached patch WIP patch, part 3. (obsolete) — Splinter Review
WIP patch part 3 ports forward the support for Android, untested at the moment.
Attachment #612776 - Attachment is obsolete: true
Here's a version that works. It decreases jank dramatically on my Droid RAZR. It increases checkerboards slightly, because it double buffers each tile. Thinking about it again, I don't see any reason why this has to be the case (since in a tiled layer buffer world, tiles don't move around, so we should be able to safely do subtexture uploads). Also, it will break on devices in which EGLImageKHR is not supported. So the next steps are (a) disabling EGLImageKHR at runtime on devices where it's not supported and (b) removing double buffering and re-enabling subtexture uploads. The latter can and should be implemented on desktop first.
Attachment #614987 - Attachment is obsolete: true
All right, this thing is ready for feedback. I'm not requesting formal review yet because this is built on top of the unreviewed tiling patches.

I've been testing this myself lately and it feels very smooth. With these patches, we're a constant 55+ fps on cnn.com on my Droid RAZR, much better than stock and better than the situation without these patches. This patch affects only jank and does not affect the amount of checkerboards; there are still too many checkerboards IMO, but we're closer.

The first patch implements content-side texture upload on the Mac via standard OpenGL cross-context texture sharing. It is behind a pref and off by default. On Android, the same pref exists but is turned on by default (although the pref does nothing if the GL driver does not say that it supports EGLImageKHR; this patch therefore does nothing on Adreno 200).
Attachment #614986 - Attachment is obsolete: true
Attachment #615601 - Flags: feedback?(bgirard)
And here's the Android implementation. Instead of direct texture sharing, we use EGLImageKHR.
Attachment #615070 - Attachment is obsolete: true
Attachment #615602 - Flags: feedback?(bgirard)
Apparently this is broken on Mali. Bleh. Will investigate and blacklist if necessary.
Summary: Use EGLImageKHR as a platform-specific layer buffer to avoid texture upload → Use EGLImageKHR to avoid texture upload on the compositor side
Whiteboard: maple → maple [has proposed patch][needs feedback bgirard]
Minor suggestion: instead of GFX_HAVE_PLATFORM_TILE_BACKING_STORE, is it worth just implementing a dummy/stub platform backing store class, and just have ShouldUsePlatformTileBackingStore() return false on the platforms where it's not implemented?  Would get rid of a bunch of #ifdefs.
(In reply to Vladimir Vukicevic (:vlad) from comment #37)
> Minor suggestion: instead of GFX_HAVE_PLATFORM_TILE_BACKING_STORE, is it
> worth just implementing a dummy/stub platform backing store class, and just
> have ShouldUsePlatformTileBackingStore() return false on the platforms where
> it's not implemented?  Would get rid of a bunch of #ifdefs.

I think we're going to be redoing the way this is done anyhow, because we don't want to do the texture upload on the content side. Rather this will be done off both the content and compositor thread.
tracking-fennec: --- → 15+
blocking-fennec1.0: beta+ → -
Blocks: 749062
Sorry this fell of my radar with other work for fennec. If someone is still working on this feel free to ping me for review and I promise I'll take a look in a timely manner this time.
Attachment #615601 - Flags: feedback?(bgirard)
Attachment #615602 - Flags: feedback?(bgirard)
Assignee: pwalton → snorp
tracking-fennec: 15+ → +
Are we still planning on doing this? No, right? Probably ok to close if so.
Assignee: snorp → bgirard
I don't think this is important now that we have tiling. We're only uploading a few tiles per frame.
Assignee: bgirard → nobody
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.