Bug 963073 (b2g-tiling)

Turn on tiled scrollable layers on b2g

RESOLVED FIXED in Firefox 30, Firefox OS v1.4

Status

()

Core
Graphics: Layers
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: cwiiis, Assigned: cwiiis)

Tracking

(Depends on: 3 bugs)

unspecified
mozilla30
All
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:1.4+, firefox28 wontfix, firefox29 wontfix, firefox30 fixed, b2g-v1.4 fixed)

Details

Attachments

(2 attachments, 8 obsolete attachments)

(Assignee)

Description

4 years ago
This is a bug to track the issues that need to be fixed before we turn on tiles everywhere for b2g. Note, there is a similar bug, bug 894333 that only concerns the browser. This may end up obsoleting that bug.
(Assignee)

Updated

4 years ago
Depends on: 963076
(Assignee)

Updated

4 years ago
Depends on: 915234
(Assignee)

Updated

4 years ago
Depends on: 950841
(Assignee)

Comment 1

4 years ago
I'm adding some gralloc bugs as blockers, but this may depend on whether we can show that gralloc could be a significant performance increase for tiles (I would say that this isn't a given).

We may also want to consider some kind of EGLImage-backed tiles implementation to move the texture allocation away from the compositor, if that makes sense...
Depends on: 959089

Comment 2

4 years ago
I don't think we will need that. Content should send the compositor gralloc tiles that content allocated (from the image bridge).
(Assignee)

Updated

4 years ago
Depends on: 939348
We will use this bug to cover the "scrolling" scenario.  If the measurements support us doing tiling everywhere, we'll create another bug.
Summary: [meta] Turn on tiled layers on b2g → [meta] Turn on tiled scrollable layers on b2g
blocking-b2g: --- → 1.4+
(Assignee)

Updated

4 years ago
Depends on: 893303

Updated

4 years ago
Alias: b2g-tiling

Updated

4 years ago
Depends on: 971914
Clearing blocking flag - this is a meta bug, which we will not block on. Per a drivers' decision - please nominate individual actionable dependencies.
blocking-b2g: 1.4+ → ---
Assignee: nobody → chrislord.net
blocking-b2g: --- → 1.4+
Summary: [meta] Turn on tiled scrollable layers on b2g → Turn on tiled scrollable layers on b2g
Blocks: 942750

Updated

4 years ago
Depends on: 978248
(Assignee)

Updated

4 years ago
Depends on: 979973
(Assignee)

Updated

4 years ago
Depends on: 979084
Depends on: 979712
(Assignee)

Comment 5

4 years ago
Created attachment 8387468 [details] [diff] [review]
Tiling branch work

This is the diff of the tiling branch against the large merge point on central. Attaching it to this bug as it seemed the most appropriate, but this fixes bug 893303, bug 915234, bug 963076 and probably a few other related things.
(Assignee)

Comment 6

4 years ago
Comment on attachment 8387468 [details] [diff] [review]
Tiling branch work

Review of attachment 8387468 [details] [diff] [review]:
-----------------------------------------------------------------

Here's my first pass through, I'm going to start fixing anything that's straight-forward.

::: gfx/layers/Compositor.h
@@ +137,5 @@
> + */
> +class CompositorBackendSpecificData : public RefCounted<CompositorBackendSpecificData>
> +{
> +public:
> +  //MOZ_DECLARE_REFCOUNTED_TYPENAME(CompositorBackendSpecificData)

Need to uncomment this now I think.

::: gfx/layers/client/ClientThebesLayer.cpp
@@ +22,5 @@
>  #include "nsCOMPtr.h"                   // for already_AddRefed
>  #include "nsISupportsImpl.h"            // for Layer::AddRef, etc
>  #include "nsRect.h"                     // for nsIntRect
>  #include "gfx2DGlue.h"
> +#include "gfxPrefs.h"

Not sure why this has moved... Merging weirdness?

@@ +174,5 @@
>  #endif
> +      gfxPrefs::LayersTilesEnabled() &&
> +      (AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_OPENGL ||
> +       AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_D3D9 ||
> +       AsShadowForwarder()->GetCompositorBackendType() == LayersBackend::LAYERS_D3D11)) {

I wonder if we even need to check the compositor backend type now...?

::: gfx/layers/client/CompositableClient.h
@@ +142,5 @@
>     * A hook for the when the Compositable is detached from it's layer.
>     */
>    virtual void OnDetach() {}
>  
> +  virtual void ClearCachedResources() {}

Needs documentation.

::: gfx/layers/client/TextureClient.cpp
@@ +213,5 @@
>  }
>  
> +#ifdef MOZ_WIDGET_GONK
> +static bool
> +DisableGralloc(SurfaceFormat aFormat)

Let's be careful when doing the final merge that this file mirrors the removals in CompositableClient.cpp correctly.

::: gfx/layers/client/TextureClient.h
@@ +224,5 @@
>    virtual void Unlock() {}
>  
>    virtual bool IsLocked() const = 0;
>  
> +  virtual bool CopyToTextureClient(TextureClient* aTarget,

Needs documentation (especially with regard to the expected state of locks).

@@ +235,5 @@
>     * use immediate uploads (see TextureFlags in CompositorTypes.h)
>     */
>    virtual bool ImplementsLocking() const { return false; }
>  
> +  virtual bool HasInternalBuffer() const = 0;

Needs documentation.

::: gfx/layers/client/TextureClientPool.h
@@ +17,5 @@
> +namespace layers {
> +
> +class ISurfaceAllocator;
> +
> +class TextureClientPool : public RefCounted<TextureClientPool>

All the function declarations in this class need documentation.

@@ +48,5 @@
> +  static const uint32_t sMinCacheSize = 0;
> +
> +  // This is the number of texture clients we don't want to exceed, including
> +  // outstanding TextureClients (from GetTextureClient()), cached
> +  // TextureClients and deferred-return TextureClients.

This needs updating, as deferred-return TextureClients are now considered outstanding.

::: gfx/layers/client/TiledContentClient.cpp
@@ +34,5 @@
>  #ifdef GFX_TILEDLAYER_DEBUG_OVERLAY
>  #include "cairo.h"
>  #include <sstream>
>  using mozilla::layers::Layer;
> +static void DrawDebugOverlay(mozilla::gfx::DrawTarget* dt, int x, int y, int width, int height)

For the record, I don't think this function fully works anymore - I didn't see the text getting drawn last time I enabled it. We could do with the bitmap font debugging functions that Vlad talked about.

::: gfx/layers/client/TiledContentClient.h
@@ +46,5 @@
> +class ClientLayerManager;
> +
> +
> +// A class to help implement copy-on-write semantics for shared tiles.
> +class gfxSharedReadLock : public AtomicRefCounted<gfxSharedReadLock> {

We should consider splitting these into their own files, they aren't really tiling-specific. The methods also need documentation (especially that on creation, the readcount is 1).

@@ +134,5 @@
>   * each tile keeps a reference to a texture client. The texture client
>   * is backed by a gfxReusableSurfaceWrapper that implements a
>   * copy-on-write mechanism while locked. The tile should be
>   * locked before being sent to the compositor and unlocked
>   * as soon as it is uploaded to prevent a copy.

This comment needs updating.

@@ +334,5 @@
>  /**
>   * Provide an instance of TiledLayerBuffer backed by image surfaces.
>   * This buffer provides an implementation to ValidateTile using a
>   * thebes callback and can support painting using a single paint buffer
>   * which is much faster then painting directly into the tiles.

This comment also needs updating. Tests are showing that the single paint buffer is actually quite a bit slower now too... One thing at a time, I guess :)

::: gfx/layers/composite/CompositableHost.cpp
@@ +174,5 @@
>    default:
>      MOZ_CRASH("Unknown CompositableType");
>    }
> +  // We know that Tiled buffers don't use the compositable backend-specific
> +  // data, so don't bother creating it.

Actually, this isn't used anywhere with the current changes. We should either just get rid of compositable backend specific data, or we should add a patch that restores the old path in GrallocTextureHost. I'm in favour of the former, but really, we should probably do the latter. I have a patch for the latter.

::: gfx/layers/composite/TiledContentHost.cpp
@@ +110,5 @@
> +  if(!IsValid()) {
> +    return;
> +  }
> +  // The TextureClients were created with the TEXTURE_IMMEDIATE_UPLOAD flag,
> +  // so calling Update on all the texture hosts will perform the texture upload.

I wonder if there's a way we can assert this? If we don't set this flag on internally buffered surfaces, we could end up with some weirdness.

@@ +130,5 @@
>    printf_stderr("Upload tile %i, %i\n", aTileOrigin.x, aTileOrigin.y);
>    long start = PR_IntervalNow();
>  #endif
>  
> +  aTile.mTextureHost->Updated(nullptr);

So here we upload the entire texture, like we used to, but we could easily use aDirtyRect if we wanted. Probably worth noting in a comment.

@@ +168,5 @@
> +TiledContentHost::~TiledContentHost()
> +{
> +  MOZ_COUNT_DTOR(TiledContentHost);
> +
> +  if (mPendingUpload) {

Probably wouldn't hurt to comment on what's going on here.

@@ +455,5 @@
>    for (;it != stop; ++it) {
>      fprintf_stderr(aFile, "%s", aPrefix);
>      fprintf_stderr(aFile, aDumpHtml ? "<li> <a href=" : "Tile ");
> +    //TODO: Check if there's a DumpTextureHost, if not write one
> +    //DumpDeprecatedTextureHost(aFile, it->mDeprecatedTextureHost);

Someone should follow up on this.

::: gfx/layers/composite/TiledContentHost.h
@@ +87,5 @@
>  
> +  bool IsPlaceholderTile() const { return mTextureHost == nullptr; }
> +
> +  void ReadUnlock() {
> +    NS_WARN_IF_FALSE(mTextureHost == nullptr || mSharedLock != nullptr,

Should probably document this warning - it's basically saying that if you have a texture host, you must have a lock. Could probably express it in an easier to read way.

@@ +121,5 @@
> +  void ReadUnlock();
> +
> +  void ReleaseTextureHosts();
> +
> +  void Upload();

This method especially could do with documentation I think. It's unlikely to actually do anything for textures with no internal buffer.

@@ +147,4 @@
>  };
>  
>  /**
> + * XXX Sort out this comment.

Yup.

::: gfx/layers/ipc/CompositableTransactionParent.cpp
@@ +67,5 @@
>  
>  bool
> +CompositableParentManager::IsCrossProcess() const
> +{
> +  MOZ_ASSERT(false, "TODO[nical] - Implement me!");

File a bug for this, or implement before merge?

::: gfx/layers/ipc/ShadowLayers.h
@@ +254,5 @@
>    void SetMask(ShadowableLayer* aLayer,
>                 ShadowableLayer* aMaskLayer);
>  
>    /**
> +   * See CompositableForwarder::UseTiledLayerBuffer

Seeing as this is an override, maybe we should just remove this comment(?)

::: gfx/layers/opengl/CompositorOGL.cpp
@@ +49,5 @@
>  #endif
> +
> +#ifdef MOZ_WIDGET_GONK
> +#include "EGLImageHelpers.h"
> +#endif

This hunk can disappear, this was for when the code allocated a 1x1 EGLImage to help with unbinding.

@@ +1373,5 @@
> +#ifdef MOZ_WIDGET_GONK
> +  if (mCompositorBackendSpecificData) {
> +    static_cast<CompositorOGLGonkBackendSpecificData*>(mCompositorBackendSpecificData.get())->EndFrame();
> +  }
> +#endif

We still get genlock failures with some drivers, I wonder if this might be improved by moving this block to *before* the SwapBuffers above?

::: gfx/layers/opengl/GrallocTextureHost.h
@@ +57,5 @@
>    {
>      mGraphicBuffer = nullptr;
>    }
>  
>    TemporaryRef<gfx::DataSourceSurface> GetAsSurface();

Note, we need to audit this function - I'm not sure if with the current locking and the way this function gets used if it won't cause genlock failures.

This function is pretty poorly documented, it's not clear whether a lock should be taken or not when using it and when I last looked at mxr, whether this is the case is unclear. Personally, I think a lock should be taken.

::: gfx/layers/opengl/TextureClientOGL.h
@@ +93,5 @@
>    virtual bool ToSurfaceDescriptor(SurfaceDescriptor& aOutDescriptor) MOZ_OVERRIDE;
>  
>    virtual TextureClientData* DropTextureData() MOZ_OVERRIDE { return nullptr; }
>  
> +  virtual bool HasInternalBuffer() const MOZ_OVERRIDE { return false; }

This is unnecessary, the default already returns false and this doesn't inherit from anything else.

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +53,1 @@
>    return new CompositableDataGonkOGL();

Heh, this would explain why my patch that restores the old path seemed to have no effect... Whoops :)
(Assignee)

Comment 7

4 years ago
Created attachment 8387521 [details] [diff] [review]
Tiling branch work v2

I've addressed everything in my comments, except:
- DrawDebugOverlay
- splitting out gfxSharedReadLock
- the compositable backend data stuff (I have a patch on another machine that I won't get to for another hour or so)*
- DumpTextureHost
- ReadUnlock warning*
- XXX Sort out this comment*
- CompositableParentManager::IsCrossProcess
- Audit GrallocTextureHost::GetAsSurface*

Stuff marked with a star is stuff I intend to fix now, everything else and any issues I missed, take your pick.
Attachment #8387468 - Attachment is obsolete: true
(In reply to Chris Lord [:cwiiis] from comment #7)
> - DumpTextureHost

Done

> - CompositableParentManager::IsCrossProcess

Done

> - Audit GrallocTextureHost::GetAsSurface*

Note that GetAsSurface is only used by debugging helpers
(Assignee)

Comment 9

4 years ago
Created attachment 8387565 [details] [diff] [review]
Restore old gralloc path for non-tiles

Here's a patch that restores the old path for non-tiles...
(Assignee)

Comment 10

4 years ago
(In reply to Chris Lord [:cwiiis] from comment #9)
> Created attachment 8387565 [details] [diff] [review]
> Restore old gralloc path for non-tiles
> 
> Here's a patch that restores the old path for non-tiles...

So was looking at doing the opposite (removing the old path), and it's quite involved. We need to either remove or port GrallocDeprecatedTextureHostOGL and duplicate whatever it is that CompositableParentManager::ReturnTextureDataIfNecessary is doing (which might need some looking at anyway?).

My vote goes for applying this patch and dealing with this later.
(Assignee)

Comment 11

4 years ago
(In reply to Chris Lord [:cwiiis] from comment #9)
> Created attachment 8387565 [details] [diff] [review]
> Restore old gralloc path for non-tiles
> 
> Here's a patch that restores the old path for non-tiles...

After even more inspection, we realised that the new path breaks the fencing stuff satoro was working on, so I don't think not restoring the old path is an option. I've pushed this to the branch.

Also fixed up the other comments I pointed out.

GetAsSurface is only used by debug code, so I think we're ok to sort that out later.
(Assignee)

Comment 12

4 years ago
Created attachment 8387577 [details] [diff] [review]
Tiles branch work v3

I think this is possibly good enough. Obviously, I can't be the one to make that call though.
Attachment #8387521 - Attachment is obsolete: true
Attachment #8387565 - Attachment is obsolete: true
(Assignee)

Comment 13

4 years ago
Created attachment 8387582 [details] [diff] [review]
Tiling branch work v3, rebased on m-c

No changes, couple of small header conflicts resolved.
Attachment #8387577 - Attachment is obsolete: true
(Assignee)

Comment 14

4 years ago
try push of central with bug 979084 (which will have no effect on tests, but will be handy if anyone wants to use the resultant b2g builds) and rolled up patches:

https://tbpl.mozilla.org/?tree=Try&rev=8c9c9fe90171
(Assignee)

Comment 15

4 years ago
Created attachment 8387588 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added.

Whoops, forgot to add files...

https://tbpl.mozilla.org/?tree=Try&rev=3125296c850d
Attachment #8387582 - Attachment is obsolete: true
(Assignee)

Comment 16

4 years ago
And a push with tiles enabled on b2g: https://tbpl.mozilla.org/?tree=Try&rev=a7cb42741777
(Assignee)

Comment 17

4 years ago
Actually, here's a better b2g try push: https://tbpl.mozilla.org/?tree=Try&rev=6d98af34bb95
(Assignee)

Comment 18

4 years ago
Created attachment 8387596 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added, summary text corrected

kats caught that I got the bug number wrong. Updating in case someone jumps the gun and pushes straight from this patch :)
Attachment #8387588 - Attachment is obsolete: true
(Assignee)

Comment 19

4 years ago
Created attachment 8387597 [details] [diff] [review]
Enable tiles by default on b2g

This is just the pref flip. Giving r? to milan, who can have the final decision on when we turn this on after we merge.
Attachment #8387597 - Flags: review?(milan)
Comment on attachment 8387588 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added.

Review of attachment 8387588 [details] [diff] [review]:
-----------------------------------------------------------------

Here are some things I've noticed so far.

::: gfx/layers/client/TiledContentClient.h
@@ +82,5 @@
> +private:
> +  int32_t mReadCount;
> +};
> +
> +class gfxShmSharedReadLock : public gfxSharedReadLock {

It's probably worth noting that each of these locks uses a full 4k page. In the future we might want to consider putting more than one on the same page.

::: gfx/layers/composite/TiledContentHost.cpp
@@ +44,4 @@
>  
> +  // Combine any valid content that wasn't already uploaded
> +  nsIntRegion oldPaintedRegion(aOldPaintedRegion);
> +  oldPaintedRegion.And(oldPaintedRegion, mValidRegion);

Doing:
 nsIntRegion oldPaintedRegion;
 oldPaintedRegion.And(aOldPaintedRegion, mValidRegion);

Will avoid the copy into oldPaintedRegion at the begining.
Attachment #8387588 - Attachment is obsolete: false
(Assignee)

Comment 21

4 years ago
ugh, sorry, juggling queues meant I had a stale patch that broke displayport positioning on b2g... This will likely result in errors, so I've cancelled the b2g push, left the first all push and pushed new tries for both:


All: https://tbpl.mozilla.org/?tree=Try&rev=fc55be65eca6
B2G+tiles: https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e

Comment 22

4 years ago
4k per lock is not going to work on Tarako. We need a pool of looks sharing pages. Please file a follow-up bug to be fixed right after landing.
Let's flip the pref after Martijn has had a chance to look at this, Monday at the latest.

Andreas - agreed - we're aiming for hamachi/nexus4/nexus5 until we get more devices to work with next week.
Flags: needinfo?(martijn.martijn)
(Assignee)

Comment 24

4 years ago
I've just spotted that TextureHostOGL doesn't implement HasInternalBuffer (which defaults to false on TextureHost), so that's negatively impacting Android performance/memory use. Will push a fix after I've tested.

Comment 25

4 years ago
PM Triage: remains 1.4+
(Assignee)

Comment 26

4 years ago
(In reply to Chris Lord [:cwiiis] from comment #24)
> I've just spotted that TextureHostOGL doesn't implement HasInternalBuffer
> (which defaults to false on TextureHost), so that's negatively impacting
> Android performance/memory use. Will push a fix after I've tested.

This wasn't correct, the problem was an uninitialised member variable in TiledLayerBufferComposite (but it had the same effect).

Here's a try push with that fixed: https://tbpl.mozilla.org/?tree=Try&rev=48735e1258b7

Here's an etherpad where we're tracking what we've pushed to try: https://etherpad.mozilla.org/tiling-talos

Playing with this on a Galaxy Nexus, I'm seeing the compositor occasionally block on the client (or at least, it looks that way). I don't think this should stop us from landing, but we need to investigate this.
(Assignee)

Comment 27

4 years ago
Created attachment 8387755 [details]
Tiling branch work v4, rebased on m-c

Here's the latest with the unnecessary nsIntRegion copy and uninitialised mHasDoubleBufferedTiles fixed.
Attachment #8387588 - Attachment is obsolete: true
Attachment #8387596 - Attachment is obsolete: true
(Assignee)

Comment 28

4 years ago
Created attachment 8387763 [details] [diff] [review]
Tiling branch work v5, rebased on m-c

Removed the nsIntRegion optimisation - it didn't build, presumable because aOldPaintedRegion is const. I'm getting tired, I'll let someone else fix that.

New Android push: https://tbpl.mozilla.org/?tree=Try&rev=438f9f0e1035
Attachment #8387755 - Attachment is obsolete: true
Comment on attachment 8387763 [details] [diff] [review]
Tiling branch work v5, rebased on m-c

Review of attachment 8387763 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/client/TiledContentClient.cpp
@@ +28,5 @@
>  
> +// This is the minimum area that we deem reasonable to copy from the front buffer to the
> +// back buffer on tile updates. If the valid region is smaller than this, we just
> +// redraw it and save on the copy (and requisite surface-locking involved).
> +#define MINIMUM_TILE_COPY_AREA ((TILEDLAYERBUFFER_TILE_SIZE * TILEDLAYERBUFFER_TILE_SIZE)/16)

This could be a constant, not a define.

@@ +314,5 @@
> +  MOZ_COUNT_CTOR(gfxShmSharedReadLock);
> +
> +  if (mAllocator) {
> +#define MOZ_ALIGN_WORD(x) (((x) + 3) & ~3)
> +    if (mAllocator->AllocUnsafeShmem(MOZ_ALIGN_WORD(sizeof(ShmReadLockInfo)),

Since ShmReadLockInfo contains just a int32_t, better static_assert(sizeof==4) and then you dont need to do crazy computations and can remove MOZ_ALIGN_WORD  (which btw should be an inline function, not a macro).

::: gfx/layers/client/TiledContentClient.h
@@ +121,5 @@
> +
> +  ShmReadLockInfo* GetShmReadLockInfoPtr()
> +  {
> +    return reinterpret_cast<ShmReadLockInfo*>
> +      (mShmem.get<char>() + mShmem.Size<char>() - sizeof(ShmReadLockInfo));

What assumption are you making here about alignof(ShmReadLockInfo) ? Is that assumption safe? Can you cover it by an assertion?

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +130,5 @@
> +  if (mCompositableBackendData) {
> +    // There are two paths for locking/unlocking - if mCompositableBackendData is
> +    // set, we use the texture on there, otherwise we use
> +    // CompositorBackendSpecificData from the compositor and bind the EGLImage
> +    // only in Lock().

Hang on, are we going back to using a single texture compositor-wide for all gralloc buffers? There should at least be a comment here explaining why this is wanted. Also, if you still have these bugs with client processes hanging because the compositor doesn't release a lock on a graphicbuffer, then this would be a place where to look suspisciously!

@@ +324,5 @@
>  
>  void
>  GrallocTextureHostOGL::Unlock()
>  {
>    // Unlock is done internally by binding the texture to another gralloc buffer

I realize that that was there before, but maybe this is still suspicious and could explain these client process hangs you were mentioning... i.e. what if no new compositing is taking place, so this locks remains, and prevents a client process from getting a write lock.
(Assignee)

Comment 30

4 years ago
(In reply to Benoit Jacob [:bjacob] from comment #29)
> Comment on attachment 8387763 [details] [diff] [review]
> Tiling branch work v5, rebased on m-c
> 
> Review of attachment 8387763 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +28,5 @@
> >  
> > +// This is the minimum area that we deem reasonable to copy from the front buffer to the
> > +// back buffer on tile updates. If the valid region is smaller than this, we just
> > +// redraw it and save on the copy (and requisite surface-locking involved).
> > +#define MINIMUM_TILE_COPY_AREA ((TILEDLAYERBUFFER_TILE_SIZE * TILEDLAYERBUFFER_TILE_SIZE)/16)
> 
> This could be a constant, not a define.
> 
> @@ +314,5 @@
> > +  MOZ_COUNT_CTOR(gfxShmSharedReadLock);
> > +
> > +  if (mAllocator) {
> > +#define MOZ_ALIGN_WORD(x) (((x) + 3) & ~3)
> > +    if (mAllocator->AllocUnsafeShmem(MOZ_ALIGN_WORD(sizeof(ShmReadLockInfo)),
> 
> Since ShmReadLockInfo contains just a int32_t, better
> static_assert(sizeof==4) and then you dont need to do crazy computations and
> can remove MOZ_ALIGN_WORD  (which btw should be an inline function, not a
> macro).
> 
> ::: gfx/layers/client/TiledContentClient.h
> @@ +121,5 @@
> > +
> > +  ShmReadLockInfo* GetShmReadLockInfoPtr()
> > +  {
> > +    return reinterpret_cast<ShmReadLockInfo*>
> > +      (mShmem.get<char>() + mShmem.Size<char>() - sizeof(ShmReadLockInfo));
> 
> What assumption are you making here about alignof(ShmReadLockInfo) ? Is that
> assumption safe? Can you cover it by an assertion?

All of this was just blindly derived from gfxSharedImageSurface, so I'm afraid I don't really know off-hand (and am struggling to think about it right now)... If you think this can be simplified, great :)

> ::: gfx/layers/opengl/GrallocTextureHost.cpp
> @@ +130,5 @@
> > +  if (mCompositableBackendData) {
> > +    // There are two paths for locking/unlocking - if mCompositableBackendData is
> > +    // set, we use the texture on there, otherwise we use
> > +    // CompositorBackendSpecificData from the compositor and bind the EGLImage
> > +    // only in Lock().
> 
> Hang on, are we going back to using a single texture compositor-wide for all
> gralloc buffers? There should at least be a comment here explaining why this
> is wanted. Also, if you still have these bugs with client processes hanging
> because the compositor doesn't release a lock on a graphicbuffer, then this
> would be a place where to look suspisciously!

No, the two paths are one texture per compositable or one texture per host locked in a frame. The former pretty much equates to the latter in all cases except for tiles, where we have many textures in one compositable.

It's worth restoring the old path because satoro's fence work relies on it, but also the non-tiles gralloc paths are not well-tested with the new locking mechanism.

> @@ +324,5 @@
> >  
> >  void
> >  GrallocTextureHostOGL::Unlock()
> >  {
> >    // Unlock is done internally by binding the texture to another gralloc buffer
> 
> I realize that that was there before, but maybe this is still suspicious and
> could explain these client process hangs you were mentioning... i.e. what if
> no new compositing is taking place, so this locks remains, and prevents a
> client process from getting a write lock.

The hangs I'm seeing are on Android only, b2g seems fine - I would more suspect something in the TextureClient/Host (maybe a contending lock that shouldn't be?)

gralloc is unfortunately a black box (and more irritatingly, a different black box per driver revision per device) - I think we can debug these issues as follow-up and hopefully think up some more robust plan.
Comment on attachment 8387596 [details] [diff] [review]
Tiling branch work v3, rebased on m-c, TextureClientPool added, summary text corrected

Review of attachment 8387596 [details] [diff] [review]:
-----------------------------------------------------------------

part 1 of my review:

::: browser/devtools/profiler/cleopatra/js/parserWorker.js
@@ +1225,5 @@
>      bugNumber: "772916",
>      check: function(frames, symbols, meta) {
>  
>        return stepContains('PaintGradient', frames, symbols)
> +          && stepContains('ClientTiledLayerBuffer::PaintThebesSingleBufferDraw', frames, symbols)

I say don't bother patching this.

::: gfx/layers/TiledLayerBuffer.h
@@ +199,3 @@
>     * has completed uploading.
>     */
> +  virtual void UseTiledLayerBuffer(ISurfaceAllocator* aAllocator,

UseTiledLayerBuffer is a poor name for this. I like PaintedTiledLayerbuffer much better. Otherwise we should find a better name.

@@ +369,5 @@
> +    if (oldTileCount >= tilesMissing) {
> +      oldRetainedTiles[i] = AsDerived().GetPlaceholderTile();
> +      AsDerived().ReleaseTile(oldTile);
> +    } else {
> +      oldTileCount ++;

nit: no space here.

::: gfx/layers/client/ClientLayerManager.cpp
@@ +61,5 @@
>    mRoot = nullptr;
>  
>    MOZ_COUNT_DTOR(ClientLayerManager);
> +
> +  mTexturePools.clear();

Is this clear needed? The destructor for this object is about to run.

::: gfx/layers/client/TextureClientPool.h
@@ +17,5 @@
> +namespace layers {
> +
> +class ISurfaceAllocator;
> +
> +class TextureClientPool : public RefCounted<TextureClientPool>

This class needs a comment to describe its functionality.

@@ +30,5 @@
> +   * a cached client that was returned to the pool, or a newly allocated
> +   * client if one isn't available.
> +   *
> +   * All clients retrieved by this method should be returned using the return
> +   * functions, or reported lost so that the pool can manage its size correctly.

I'm not a fan of the feature that a texture client can become loss. This complicates the lifetime quite a bit. Is this *really* needed?

@@ +44,5 @@
> +  /**
> +   * Return a TextureClient that is not yet ready to be reused, but will be
> +   * imminently.
> +   */
> +  void ReturnTextureClientDeferred(TextureClient *aClient);

Then it shouldn't be returned to the poll. If you have a good reason to do this then it should be tracked outside of the pool. If you return something here but it's not ready to be re-used then you're putting the responsibility on the caller to ReturnDeferredClients to know when it is. Now what happens if there's two types of lifetimes for tiles returned here and one of the calls to ReturnDeferredClients happens while some tiles still aren't ready.

The user of the texture client are responsible for tracking this.

@@ +56,5 @@
> +  /**
> +   * Attempt to shrink the pool so that there are no more than sMinCacheSize
> +   * unused clients.
> +   */
> +  void ShrinkToMinimumSize();

Ideally these Shrin* would be private (use friend).

@@ +68,5 @@
> +  /**
> +   * Report that a client retrieved via GetTextureClient() has become
> +   * unusable, so that it will no longer be tracked.
> +   */
> +  void ReportClientLost() { mOutstandingClients--; }

This call is gross too. What you're saying here:
- This pool will allocate the texture client
- You have to return the texture client
- ... but if you don't then you need to tell me about it

I don't think this complexity is justified for a pool. The pool should accept objects that are returned and either make it guaranteed to return or optional. We shouldn't track out much outstanding clients are made.

@@ +87,5 @@
> +  static const uint32_t sMinCacheSize = 0;
> +
> +  // The maximum number of texture clients managed by this pool that we want
> +  // to remain active.
> +  static const uint32_t sMaxTextureClients = 50;

I'm not a fan of this max. This wont work for a 4k display for example. In fact I really don't like most of these values. We seems to have done well so far this arbitrary limits so I think we should push back or have a great reason for them and why these numbers are chosen.

::: gfx/layers/composite/TextureHost.h
@@ +430,5 @@
>  
>    /**
> +   * Indicates whether the TextureHost implementation is backed by an
> +   * in-memory buffer. The consequence of this is that locking the
> +   * TextureHost does not contend with locking the texture on the client side.

I don't understand this comment. This leaves me wondering:
If there's an in-memory buffer then both sides can still access it so locking is important.

I'm sure this works so please clean-up the comment/API to make it more obvious what is happening here.

::: gfx/layers/ipc/CompositableForwarder.h
@@ +222,5 @@
>  
>    /**
>     * Returns the type of backend that is used off the main thread.
>     * We only don't allow changing the backend type at runtime so this value can
>     * be queried once and will not change until Gecko is restarted.

This comment is wrong now, see the other GetCompositorBackendType.

::: gfx/layers/ipc/ISurfaceAllocator.h
@@ +85,5 @@
> +   * We only don't allow changing the backend type at runtime so this value can
> +   * be queried once and will not change until Gecko is restarted.
> +   *
> +   * XXX - With e10s this may not be true anymore. we can have accelerated widgets
> +   * and non-accelerated widgets (small popups, etc.)

Shouldn't we fix this by having a different surface allocator or something? It's not clear what assumptions are made on that which will now be broken.
(Assignee)

Comment 32

4 years ago
Surprisingly, enabling tiles is green on b2g :) https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e
(In reply to Chris Lord [:cwiiis] from comment #32)
> Surprisingly, enabling tiles is green on b2g :)
> https://tbpl.mozilla.org/?tree=Try&rev=8d1ded8e586e

I'd be surprised if the b2g tests managed to catch a blatant abort() ;)
http://hg.mozilla.org/integration/mozilla-inbound/rev/8394fed3332e
Comment on attachment 8387597 [details] [diff] [review]
Enable tiles by default on b2g

Review of attachment 8387597 [details] [diff] [review]:
-----------------------------------------------------------------

Milan's on PTO, it's clear we want to do this now, let's back out if it causes issues.
Attachment #8387597 - Flags: review?(milan) → review+
The landing stuck, turned it on by default here:

http://hg.mozilla.org/integration/mozilla-inbound/rev/101152af314f

There seemed to be consensus since it was all green that turning it on my default right now was a good idea. If I interpreted this wrongly I apologize and do backout, but as far as I can see right now the more testing, the earlier, the better.
https://hg.mozilla.org/mozilla-central/rev/8394fed3332e
https://hg.mozilla.org/mozilla-central/rev/101152af314f
https://hg.mozilla.org/mozilla-central/rev/6e816e44735d
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30

Updated

4 years ago
Depends on: 981817

Updated

4 years ago
No longer depends on: 981817

Updated

4 years ago
Depends on: 981794

Updated

4 years ago
Depends on: 982128
status-b2g-v1.4: --- → fixed
status-firefox28: --- → wontfix
status-firefox29: --- → wontfix
status-firefox30: --- → fixed

Updated

4 years ago
Depends on: 984531

Updated

4 years ago
Depends on: 983883

Updated

4 years ago
Depends on: 984618

Updated

4 years ago
Depends on: 984673

Updated

4 years ago
Depends on: 984577

Updated

4 years ago
Depends on: 985593

Updated

4 years ago
Depends on: 985162
Depends on: 982821
Depends on: 985779

Updated

4 years ago
Depends on: 977880

Updated

4 years ago
Depends on: 985170

Updated

4 years ago
Depends on: 984482
Flags: needinfo?(martijn.martijn)

Updated

3 years ago
No longer depends on: 983883

Updated

3 years ago
No longer depends on: 985162

Updated

3 years ago
No longer depends on: 984531

Updated

3 years ago
No longer depends on: 984577

Updated

3 years ago
No longer depends on: 985170

Updated

3 years ago
No longer depends on: 985779

Updated

3 years ago
No longer depends on: 985593

Updated

3 years ago
No longer depends on: 984482
Can we follow up on the review here as promised?

Updated

3 years ago
Depends on: 992985

Updated

3 years ago
Depends on: 992218
Depends on: 995129
Depends on: 1007119
(Assignee)

Updated

3 years ago
Duplicate of this bug: 894333
Depends on: 1014333

Updated

3 years ago
Depends on: 1009306
You need to log in before you can comment on or make changes to this bug.