Closed Bug 895358 Opened 11 years ago Closed 11 years ago

Enable progressive tile rendering in B2G

Categories

(Core :: Graphics: Layers, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29

People

(Reporter: cwiiis, Assigned: rbarker)

References

Details

Attachments

(2 files, 31 obsolete files)

6.38 KB, patch
Details | Diff | Splinter Review
53.34 KB, patch
Details | Diff | Splinter Review
Tiled rendering needs to be progressive to provide a big win, we should aim to enable this on b2g after we get the tiled layers backend working so that we can hit the FPS we're aiming for.
We're going to start sorting out tiled rendering vs. buffer rotation on B2G shortly, and we'll keep an eye on this.
Depends on: 895905
No longer depends on: 859929
(In reply to Milan Sreckovic [:milan] from comment #1)
> We're going to start sorting out tiled rendering vs. buffer rotation on B2G
> shortly, and we'll keep an eye on this.

Oh, you are? This is something I was looking at too, so if you could let me know what your plans are, that'd be great :) Feel free to/please mail me if this or another bug isn't an appropriate place.
Jeff will be looking at this, and you're both on this bug, so we can stay in here.  Should know more by mid next week, will keep you posted.
Chris, I was planing on starting to think about this next week. I don't have any plan yet. What plan do you have?
Most of the work is in unrotting my patch to make tiling IPC safe which is almost a redoing the patch since it was done per layers refactoring.
(In reply to Benoit Girard (:BenWa) from comment #5)
> Most of the work is in unrotting my patch to make tiling IPC safe which is
> almost a redoing the patch since it was done per layers refactoring.

This is what I planned to do this week (and to address the comments in bug 747811 if it gets that far).
Blocks: 897996
Working on this.
Assignee: nobody → chrislord.net
Status: NEW → ASSIGNED
Unfortunately, I've not been able to finish this before I leave - 10:20pm on a Friday is a reasonable time to give up what with my last train being in about an hour I think :)

I'll attach my (very limited) WIP patch that someone can use as a starting point or inspiration. My plan was thus:

- Implement an IPC call for ProgressiveUpdateCallback - this would be handled I guess by the Shadow layer manager (or whatever that's called in the new world order, I don't recall) - this takes in basically the same parameters as the existing one for Android, but would also include the mScrollID of the layer in question.

- Compositor-side, this callback would be implemented by AsyncPanZoomController, which has all the information to make the same decisions as GeckoLayerClient. I assume the shadow layer manager would look up the correct one given the mScrollID.

- AsyncPanZoomController would have a reimplementation of GeckoLayerClient::progressiveUpdateCallback and would do the identical thing. Note though, that where this Java implementation returns the viewport and the scale, I think we need only return the composition bounds in CSS units. Gecko-side has enough context to transform CSS units into layer space and this would be a lot clearer than what is currently being done.

- Not many changes need to happen layers-side - The function RoundedTransformViewportBounds would be replaced in this situation by a function that would transform the CSS units composition bounds into local layer space. What that's doing only works I think because that viewport is on the root scroll frame, I very much doubt it works in any other situation. I apologise for writing that code.


Once this has been implemented, progressive rendering ought to just work (after enabling the pref), and low precision rendering ought to be easy to enable too, though will require more work (to set the critical display port as well as the display port).

Personally, I think we should tackle low precision rendering after we have all of this as well as nested APZCs solid.
Assignee: chrislord.net → nobody
Attached patch Beginnings (obsolete) — Splinter Review
Here was the beginning of the implementation of my plan. Not much here, but might be useful as a pointer to where things could go.
Attached patch progressive_tiles.patch (obsolete) — Splinter Review
more plumbing. In this patch it always replies with abortDrawing = false. If you set it to true all the time we only draw quick flickers, so it seems to be working.
Attachment #798031 - Attachment is obsolete: true
Attached patch Posix Cross Process Mutex (obsolete) — Splinter Review
Here is my first cut at a posix based cross process mutex. I'm concerned about the usage of SharedMemoryBasic. Is there a better way to handle this. Also, I would really like to write a test for this but am unsure how proceed with that.
(In reply to Randall Barker from comment #11)
> Created attachment 812683 [details] [diff] [review]
> Posix Cross Process Mutex
> 
> Here is my first cut at a posix based cross process mutex. I'm concerned
> about the usage of SharedMemoryBasic. Is there a better way to handle this.
> Also, I would really like to write a test for this but am unsure how proceed
> with that.

Is this a good candidate for splitting into a separate bug, which would be blocking this one?
Attached patch Work in progress. (obsolete) — Splinter Review
Here is what I have so far. It is still a work in progress. I'm still trying to get BasicTiledLayerBufferHelper::UpdateFromCompositorFrameMetrics working correctly.
Assignee: nobody → rbarker
I find this page is good for testing checker boarding:

http://en.wikipedia.org/wiki/List_of_United_States_counties_and_county-equivalents

It is currently the longest article on Wikipedia.
Attached patch Work in progress 2 (obsolete) — Splinter Review
Current state of development for bug deep dive.
Attachment #824380 - Attachment is obsolete: true
This is the back trace I get when the shared memory is deleted in the compositor process.
(In reply to Randall Barker from comment #13)
> Created attachment 824380 [details] [diff] [review]
> Work in progress.
> 
> Here is what I have so far. It is still a work in progress. I'm still trying
> to get BasicTiledLayerBufferHelper::UpdateFromCompositorFrameMetrics working
> correctly.

Make sure to talk to Cwiiis and kats about that.
Attached patch Work in progress 3 (obsolete) — Splinter Review
Current state of work. No longer crashes. When tiling enabled, there are visual flashes and visual anomalies in the UI caused by the progressive tiling. Going to need help to track this down as I do not know where to even start. Also will need help verifying it is working as expected.
Attachment #812683 - Attachment is obsolete: true
Attachment #828706 - Attachment is obsolete: true
Attached patch Work in progress 4 (obsolete) — Splinter Review
Fixed flickering issue. Still need to verify the new UpdateFromCompositorFrameMetrics function.
Attachment #831211 - Attachment is obsolete: true
Attachment #831910 - Flags: review?(chrislord.net)
Attachment #831910 - Flags: review?(bugmail.mozilla)
Attachment #831910 - Flags: review?(bgirard)
Attached patch wip5.patch (obsolete) — Splinter Review
Cleaned up the case where the layer did not have a composition bounds.
Attachment #831910 - Attachment is obsolete: true
Attachment #831910 - Flags: review?(chrislord.net)
Attachment #831910 - Flags: review?(bugmail.mozilla)
Attachment #831910 - Flags: review?(bgirard)
Attachment #832621 - Flags: review?(chrislord.net)
Attachment #832621 - Flags: review?(bugmail.mozilla)
Attachment #832621 - Flags: review?(bgirard)
Comment on attachment 831910 [details] [diff] [review]
Work in progress 4

Review of attachment 831910 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry he's the review I had pending against the previous patch

::: b2g/app/b2g.js
@@ +75,4 @@
>  pref("mozilla.widget.use-buffer-pixmap", true);
>  pref("mozilla.widget.disable-native-theme", true);
>  pref("layout.reflow.synthMouseMove", false);
> +pref("layers.force-tiles", true);

We can't enable tiling on b2g until we have compared the performance and power usage. We will likely need to make gralloc based tiling.

::: gfx/layers/ipc/CompositorChild.h
@@ +75,5 @@
> +
> +    FrameMetricsData() :
> +        mBuffer(0),
> +        mMutex(0),
> +        mId(0) { }

count ctor/dtor

::: gfx/layers/ipc/CompositorParent.cpp
@@ +66,2 @@
>  {
>  }

Not your fault but this could really use MOZ_CTOR_COUNT.

::: gfx/layers/ipc/CompositorParent.h
@@ +199,4 @@
>      nsRefPtr<Layer> mRoot;
>      nsRefPtr<GeckoContentController> mController;
>      CompositorParent* mParent;
> +    PCompositorParent* mCrossProcessParent;

This could use a comment. What's the difference between mParent and mCrossProcessParent. mParent is also of type PCompositorParent. This isn't easy to see from glancing at this code.

::: ipc/glue/CrossProcessMutex_posix.cpp
@@ +17,5 @@
> +}
> +
> +namespace mozilla {
> +
> +CrossProcessMutex::CrossProcessMutex(const char*)

Mutex change should live in a separate patch.

@@ +25,5 @@
> +    , mCalledMutexInit(true)
> +{
> +  mSharedMutex = new ipc::SharedMemoryBasic;
> +  if (!mSharedMutex->Create(sizeof(MutexData))) {
> +    NS_RUNTIMEABORT("Failed to create shared memory for storage of shared mutex handle!");

nit: IMO knowing the exact cause of the failure for something that is unlikely to happen isn't very important. We can reuse the same thing for the runtime abort and not bloat our text section. Small increase but it's something we're going to carry in the future for a long time.

@@ +41,5 @@
> +
> +  mMutex = &(data->mMutex);
> +
> +  if (!mMutex) {
> +    NS_RUNTIMEABORT("This shouldn't happen - failed to create mutex!");

Actually how would this ever happen. data is a non negative pointer and we know here that it's not null. So the pointer to the fields mMutex and mDestroy are some offset from that pointer which will also be a positive number.

@@ +89,5 @@
> +  MutexData* data = static_cast<MutexData*>(mSharedMutex->memory());
> +  mMutex = &(data->mMutex);
> +  mDestroyed = &(data->mDestroyed);
> +
> +  if (!mMutex) {

We should check that data is not null but the pointer to mutex or destroyed will never be null.

@@ +99,5 @@
> +
> +CrossProcessMutex::~CrossProcessMutex()
> +{
> +  // Need to release mutex here to prevent leaking
> +  if (mCalledMutexInit) {

Managing this object across process seems complicated. If you delete the side that allocate the object the mutex is no longer valid. This doesn't seem like it's well documented. Typically we either want an object that only alive on one side at a time or use something alike a reference count or messages to verify that the other side is done before the deallocation.

@@ +101,5 @@
> +{
> +  // Need to release mutex here to prevent leaking
> +  if (mCalledMutexInit) {
> +    Lock();
> +    *mDestroyed = 1;

And using mDestroy isn't safe because your stores may not be ordered and even if they were you're risking a time of check, time of use problem where the other side checks mDestroy, then the current side stores the destroyed state and destroy the mutex and now the other side is reschedule and thinks the buffer is still alive and uses it.

@@ +113,5 @@
> +
> +void
> +CrossProcessMutex::Lock()
> +{
> +  if (!*mDestroyed) {

Also if mDestroyed is set that means that the shared memory it's pointing to will also be release so this isn't safe or useful.
(In reply to Benoit Girard (:BenWa) from comment #21)
> Comment on attachment 831910 [details] [diff] [review]
> Work in progress 4
> 
> Review of attachment 831910 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry he's the review I had pending against the previous patch
> 

No problem, I didn't change very much.

> ::: b2g/app/b2g.js
> @@ +75,4 @@
> >  pref("mozilla.widget.use-buffer-pixmap", true);
> >  pref("mozilla.widget.disable-native-theme", true);
> >  pref("layout.reflow.synthMouseMove", false);
> > +pref("layers.force-tiles", true);
> 
> We can't enable tiling on b2g until we have compared the performance and
> power usage. We will likely need to make gralloc based tiling.
> 

Okay, but if tiling is turned off then there isn't much use for the patch until it can be enabled since the progressive tile rendering isn't used with out it.

> ::: gfx/layers/ipc/CompositorChild.h
> @@ +75,5 @@
> > +
> > +    FrameMetricsData() :
> > +        mBuffer(0),
> > +        mMutex(0),
> > +        mId(0) { }
> 
> count ctor/dtor
> 
> ::: gfx/layers/ipc/CompositorParent.cpp
> @@ +66,2 @@
> >  {
> >  }
> 
> Not your fault but this could really use MOZ_CTOR_COUNT.
> 

Sure, I'll add it.

> ::: gfx/layers/ipc/CompositorParent.h
> @@ +199,4 @@
> >      nsRefPtr<Layer> mRoot;
> >      nsRefPtr<GeckoContentController> mController;
> >      CompositorParent* mParent;
> > +    PCompositorParent* mCrossProcessParent;
> 
> This could use a comment. What's the difference between mParent and
> mCrossProcessParent. mParent is also of type PCompositorParent. This isn't
> easy to see from glancing at this code.
> 

Done, not sure I made things clearer but I tried.

> ::: ipc/glue/CrossProcessMutex_posix.cpp
> @@ +17,5 @@
> > +}
> > +
> > +namespace mozilla {
> > +
> > +CrossProcessMutex::CrossProcessMutex(const char*)
> 
> Mutex change should live in a separate patch.
> 

Okay, it is actually a separate commit, I have just been uploading it as a single patch.

> @@ +25,5 @@
> > +    , mCalledMutexInit(true)
> > +{
> > +  mSharedMutex = new ipc::SharedMemoryBasic;
> > +  if (!mSharedMutex->Create(sizeof(MutexData))) {
> > +    NS_RUNTIMEABORT("Failed to create shared memory for storage of shared mutex handle!");
> 
> nit: IMO knowing the exact cause of the failure for something that is
> unlikely to happen isn't very important. We can reuse the same thing for the
> runtime abort and not bloat our text section. Small increase but it's
> something we're going to carry in the future for a long time.
> 

I was just copying what was done in CrossProcessMutes_windows.cpp. But I understand what you are saying.

> @@ +41,5 @@
> > +
> > +  mMutex = &(data->mMutex);
> > +
> > +  if (!mMutex) {
> > +    NS_RUNTIMEABORT("This shouldn't happen - failed to create mutex!");
> 
> Actually how would this ever happen. data is a non negative pointer and we
> know here that it's not null. So the pointer to the fields mMutex and
> mDestroy are some offset from that pointer which will also be a positive
> number.
> 

Yes, you will see the same abort in the windows version, I was just trying to make them match, I can update.

> @@ +89,5 @@
> > +  MutexData* data = static_cast<MutexData*>(mSharedMutex->memory());
> > +  mMutex = &(data->mMutex);
> > +  mDestroyed = &(data->mDestroyed);
> > +
> > +  if (!mMutex) {
> 
> We should check that data is not null but the pointer to mutex or destroyed
> will never be null.
> 

Done.

> @@ +99,5 @@
> > +
> > +CrossProcessMutex::~CrossProcessMutex()
> > +{
> > +  // Need to release mutex here to prevent leaking
> > +  if (mCalledMutexInit) {
> 
> Managing this object across process seems complicated. If you delete the
> side that allocate the object the mutex is no longer valid. This doesn't
> seem like it's well documented. Typically we either want an object that only
> alive on one side at a time or use something alike a reference count or
> messages to verify that the other side is done before the deallocation.
> 

I agree, it will cause problems if the mutex is shared between more than two process. I couldn't figure out a way to run a multi-process unit test on B2G devices so it made trying different approaches challenging. The problem I was seeing was if the mutex was destroyed in the process that did not create it, the app would crash when the CrossProcessMutex was deleted in the process that initially created it. There may have been other factors causing this but I was not able to figure out what. By just ensuring that the mutex was destroyed in the same process it was created in prevented the crash. I looked into the pthread_mutex implementation on the device and could not see any reason for it to matter what process the destroy was called on. I can go back and see if I can come up with a ref counted version, it would be nice if it could be done with out all of Gecko getting in the way of testing it. 

> @@ +101,5 @@
> > +{
> > +  // Need to release mutex here to prevent leaking
> > +  if (mCalledMutexInit) {
> > +    Lock();
> > +    *mDestroyed = 1;
> 
> And using mDestroy isn't safe because your stores may not be ordered and
> even if they were you're risking a time of check, time of use problem where
> the other side checks mDestroy, then the current side stores the destroyed
> state and destroy the mutex and now the other side is reschedule and thinks
> the buffer is still alive and uses it.
> 

Well, the shared memory buffer is still alive. The destructor only unmaps it from the process that the destructor is called in. The shared memory is still mapped in the other process and the value of mDestroyed is  1 in the other process.

> @@ +113,5 @@
> > +
> > +void
> > +CrossProcessMutex::Lock()
> > +{
> > +  if (!*mDestroyed) {
> 
> Also if mDestroyed is set that means that the shared memory it's pointing to
> will also be release so this isn't safe or useful.

The mDestroyed flag only indicates that pthread_mutex_destroy has been called on the pthread_mutex_t so trying to lock on the other process will cause undefined behavior (Actually it causes the lock to hang due to the value that pthread_mutex_t is set to after the destroy). The memory will still be mapped in the other process even after it has been unmapped in the process that destroyed it. Having the creator also destroy was the only way I could ensure that destroy is only called once. Calling destroy more than once on a pthread mutex is undefined. With the mutex only shared between two process, this method should work, if there are more than two process sharing the mutext then problems may arise if two of the three process try to use the mutex after it has been destroyed.
Attached patch Work in progress 6 (obsolete) — Splinter Review
Updated based on feedback.
Attachment #832621 - Attachment is obsolete: true
Attachment #832621 - Flags: review?(chrislord.net)
Attachment #832621 - Flags: review?(bugmail.mozilla)
Attachment #832621 - Flags: review?(bgirard)
Attachment #832682 - Flags: review?(chrislord.net)
Attachment #832682 - Flags: review?(bugmail.mozilla)
Attachment #832682 - Flags: review?(bgirard)
Comment on attachment 832682 [details] [diff] [review]
Work in progress 6

Review of attachment 832682 [details] [diff] [review]:
-----------------------------------------------------------------

The cross-process mutex stuff should definitely be in another patch. I also think someone else should review this for the IPC stuff and general types usage (:blassey perhaps?), as I'm not that well versed in those areas.

This otherwise looks ok to me, but I have some questions below. r- for too many open questions and lack of documentation, but it looks sound.

::: b2g/app/b2g.js
@@ +75,4 @@
>  pref("mozilla.widget.use-buffer-pixmap", true);
>  pref("mozilla.widget.disable-native-theme", true);
>  pref("layout.reflow.synthMouseMove", false);
> +pref("layers.force-tiles", true);

I think we want to cut down tiled layer usage before enabling it on b2g - perhaps only for large scrollable layers (where the scroll port is more than twice the size of the viewport, perhaps? (i.e. when it would otherwise cause a buffer unrotate)) - see bug 915673

::: gfx/layers/client/TiledContentClient.cpp
@@ +135,5 @@
> +  FrameMetrics contentMetrics = aLayer->GetFrameMetrics();
> +  ContainerLayer* layer = aLayer;
> +
> +  // Walk up the layer tree until a valid composition bounds is found
> +  while (layer && contentMetrics.mCompositionBounds.IsEmpty()) {

This doesn't seem like a great way of doing this - perhaps instead of checking if mCompositionBounds is empty, you could check if mScrollId != NULL_SCROLL_ID ? (this is what botond suggested to me for doing something similar)

@@ +140,5 @@
> +    contentMetrics = layer->GetFrameMetrics();
> +    layer = layer->GetParent();
> +  }
> +
> +  // Use the local values in case a shared FrameMetrics is not found

I don't understand this comment

@@ +179,5 @@
> +    return true;
> +  }
> +
> +/*
> +  // Need to implement this for low-resolution tiling?

Commented out code - Let's just write this when it's needed, to avoid any weirdness due to bitrot.

@@ +214,5 @@
> +    return false;
> +  }
> +
> +  // If we're not doing low precision draws and we're about to
> +  // checkerboard, enable low precision drawing.

It's confusing that this block of code is here, but the related block above is commented out.

::: gfx/layers/client/TiledContentClient.h
@@ +122,5 @@
> +{
> +public:
> +  BasicTiledLayerBufferHelper(ClientLayerManager* aManager);
> +  ~BasicTiledLayerBufferHelper();
> +  bool UpdateFromCompositorFrameMetrics(ContainerLayer* aLayer,

These methods need documentation.

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +303,5 @@
>  AsyncPanZoomController::~AsyncPanZoomController() {
> +
> +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);
> +    (unused); // Suppress warning. Nothing to to be done if it does fail.

Why assign this variable if it gets ignored? Wouldn't a comment suffice?

@@ +1597,5 @@
> +
> +  if (gfxPlatform::UseProgressiveTilePainting() && frame) {
> +    mSharedLock.Lock();
> +    *frame = mFrameMetrics;
> +    mSharedLock.Unlock();

As we discussed in the gfx meeting, I think you could triple-buffer for this instead of locking - I especially don't like the idea of possible deadlock... But that doesn't need to hold this patch up. This is still probably better than what we do in fennec.

@@ +1601,5 @@
> +    mSharedLock.Unlock();
> +  }
> +}
> +
> +void AsyncPanZoomController::ShareCompositorFrameMetrics() {

This method seriously needs some commenting, it's not entirely obvious what's going on.

@@ +1611,5 @@
> +      if (frame) {
> +        ReentrantMonitorAutoEnter lock(mMonitor);
> +        *frame = mFrameMetrics;
> +      }
> +    }

If the Create/Map fail here, do we want to continue? And should we at least warn about it?

::: gfx/layers/ipc/AsyncPanZoomController.h
@@ +180,4 @@
>     */
>    void SetCompositorParent(CompositorParent* aCompositorParent);
>  
> +  void SetCrossProcessCompositorParent(PCompositorParent* aCrossProcessCompositorParent);

This method needs documentation.

::: gfx/layers/ipc/CompositorChild.cpp
@@ +59,5 @@
> +  if (data) {
> +    FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> +    if (frame) {
> +      data->mMutex->Lock();
> +      aFrame = *(frame);

nit, unnecessary parenthesis.

@@ +145,5 @@
> +    const uint32_t& aAPZCId)
> +{
> +  FrameMetricsData* data = new FrameMetricsData();
> +  data->mBuffer = new ipc::SharedMemoryBasic(metrics);
> +  data->mBuffer->Map(sizeof(FrameMetrics));

Elsewhere you check if the map succeeds, could this possibly fail?

@@ +149,5 @@
> +  data->mBuffer->Map(sizeof(FrameMetrics));
> +  data->mMutex = new CrossProcessMutex(handle);
> +  data->mId = aAPZCId;
> +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> +  if (frame) {

should we warn if frame == nullptr?

@@ +161,5 @@
> +    const ViewID& aId,
> +    const uint32_t& aAPZCId)
> +{
> +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);
> +  if (data && (data->mId == aAPZCId)) {

Should we warn if release is called and either of the IDs isn't found?

::: gfx/layers/ipc/CompositorChild.h
@@ +36,4 @@
>  
>    void Destroy();
>  
> +  bool LookupCompositorFrameMetrics(const FrameMetrics::ViewID aId, FrameMetrics&);

This method needs documentation.
Attachment #832682 - Flags: review?(chrislord.net) → review-
Comment on attachment 832682 [details] [diff] [review]
Work in progress 6

Review of attachment 832682 [details] [diff] [review]:
-----------------------------------------------------------------

Overall this looks good to me. I didn't look at the changes to TiledContent* in any detail, but I have some comments on the rest of the patch. Somebody more familiar with IPC stuff (maybe ben turner?) should probably review the CrossProcessMutex code.

::: gfx/layers/client/TiledContentClient.cpp
@@ +103,5 @@
>    buffer->ClearPaintedRegion();
>  }
>  
> +BasicTiledLayerBufferHelper::BasicTiledLayerBufferHelper(ClientLayerManager *aManager) :
> +    mLayerManager(aManager),

style nit: copy the same style as the other constructors. that is, move the colon and commas down to the start of the next line.

@@ +605,4 @@
>    // caused by there being an incoming, more relevant paint.
>    gfx::Rect viewport;
>    float scaleX, scaleY;
> +#ifdef MOS_WIDGET_ANDROID

s/MOS/MOZ/. You might want to also test this on Android to ensure nothing broke there.

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +215,4 @@
>  
>  static TimeStamp sFrameTime;
>  
> +uint32_t AsyncPanZoomController::sAsyncPanZoomControllerCount = 0;

Can we just make this static to this file, rather than a static member of AsyncPanZoomController?

@@ +302,5 @@
>  
>  AsyncPanZoomController::~AsyncPanZoomController() {
> +
> +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);

Note that a new AsyncPanZoomController with the same mScrollId might get created before the old one is freed. This might result in the case where you call SendSharedCompositorFrameMetrics twice in a row before the first release. i.e. the flow would be like this:

APZC 1 is created with mScrollId = 1.
APZC 1 gets NotifyLayersUpdated, which calls SendSharedCompositorFrameMetrics
... time passes, a relayout happens
APZC 2 is created with mScrollId = 1.
APZC 2 gets NotifyLayersUpdated, which calls SendSharedCompositorFrameMetrics
APZC 1 is deleted, which calls SendReleaseSharedCompositorFrameMetrics
... time passes, a relayout happens
APZC 2 is deleted, which calls SnedReleaseSharedCompositorFrameMetrics

I think the code in CompositorChild that uses the ViewID as the lookup key in the hashtable might not deal with this scenario properly.

@@ +303,5 @@
>  AsyncPanZoomController::~AsyncPanZoomController() {
> +
> +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);
> +    (unused); // Suppress warning. Nothing to to be done if it does fail.

If you can get rid of storing the return value at all that would be good. If not, #include mozilla/unused.h and do "unused << func();" See http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameLoader.cpp?rev=6e221af28de2#2077 for example.

@@ +1591,5 @@
>  }
>  
> +void AsyncPanZoomController::UpdateSharedCompositorFrameMetrics()
> +{
> +  FrameMetrics* frame = mSharedFrameMetricsBuffer ?

This function can be called from multiple threads, although it looks like you only call it from inside the re-entrant lock which makes it safe. Still it would be good to add an assert or some documentation to that effect. e.g. mMonitor.AssertCurrentThreadIn().

@@ +1594,5 @@
> +{
> +  FrameMetrics* frame = mSharedFrameMetricsBuffer ?
> +      (FrameMetrics*)mSharedFrameMetricsBuffer->memory() : nullptr;
> +
> +  if (gfxPlatform::UseProgressiveTilePainting() && frame) {

I don't know if we can call UseProgressiveTilePainting from the compositor thread. I don't think so, because it calls into the preferences service.

@@ +1602,5 @@
> +  }
> +}
> +
> +void AsyncPanZoomController::ShareCompositorFrameMetrics() {
> +  if (!mSharedFrameMetricsBuffer && gfxPlatform::UseProgressiveTilePainting() && mCrossProcessCompositorParent) {

Ditto on the UseProgressiveTilePainting. This function is definitely run on the compositor thread.

@@ +1606,5 @@
> +  if (!mSharedFrameMetricsBuffer && gfxPlatform::UseProgressiveTilePainting() && mCrossProcessCompositorParent) {
> +    mSharedFrameMetricsBuffer = new ipc::SharedMemoryBasic;
> +    FrameMetrics* frame = nullptr;
> +    if (mSharedFrameMetricsBuffer->Create(sizeof(FrameMetrics)) && mSharedFrameMetricsBuffer->Map(sizeof(FrameMetrics))) {
> +      frame = (FrameMetrics*)mSharedFrameMetricsBuffer->memory();

static_cast is preferred.

::: gfx/layers/ipc/AsyncPanZoomController.h
@@ +13,5 @@
>  #include "mozilla/EventForwards.h"
>  #include "mozilla/Monitor.h"
>  #include "mozilla/ReentrantMonitor.h"
>  #include "mozilla/RefPtr.h"
> +#include "mozilla/layers/PCompositorParent.h"

Do you need to include this? You're forward-declaring the class below anyway.

@@ +516,4 @@
>  
>    uint64_t mLayersId;
>    nsRefPtr<CompositorParent> mCompositorParent;
> +  PCompositorParent* mCrossProcessCompositorParent;

Does this need to be refcounted? When is the PCompositorParent deleted? Is it ever possible for it to get deleted while an APZC instance still has a pointer to it?

@@ +685,5 @@
>    }
>  
>  private:
> +  static uint32_t sAsyncPanZoomControllerCount;
> +  const uint32_t mId;

These need comments.

@@ +700,5 @@
> +  //ipc::Shmem *mSharedFrameMetricsBuffer;
> +  ipc::SharedMemoryBasic* mSharedFrameMetricsBuffer;
> +  CrossProcessMutex mSharedLock;
> +  void UpdateSharedCompositorFrameMetrics();
> +  void ShareCompositorFrameMetrics();

These need comments.

::: gfx/layers/ipc/CompositorChild.cpp
@@ +54,5 @@
> +bool
> +CompositorChild::LookupCompositorFrameMetrics(const FrameMetrics::ViewID aId,
> +                                              FrameMetrics& aFrame)
> +{
> +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);

The ViewID is not a perfectly unique identifier for a FrameMetrics, so I'm a little uneasy about using this as the key here. In practice I guess you're unlikely to hit a scenario where it matters but maybe a comment in the .h file would be good.

@@ +143,5 @@
> +    const ipc::SharedMemoryBasic::Handle& metrics,
> +    const CrossProcessMutexHandle& handle,
> +    const uint32_t& aAPZCId)
> +{
> +  FrameMetricsData* data = new FrameMetricsData();

Does the mFrameMetricsTable handle deletion of the old instances when you update them with new ones? What about on .Remove()?

@@ +150,5 @@
> +  data->mMutex = new CrossProcessMutex(handle);
> +  data->mId = aAPZCId;
> +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> +  if (frame) {
> +    mFrameMetricsTable.Put(frame->mScrollId, data);

Does reading/writing from mFrameMetricsTable need locking? What thread is the Lookup function called from?

::: gfx/layers/ipc/PCompositor.ipdl
@@ +72,5 @@
> +child:
> +  // Send back Compositor Frame Metrics from APZCs so tiled layers can
> +  // update progressively.
> +  async SharedCompositorFrameMetrics(Handle metrics, CrossProcessMutexHandle mutex, uint32_t aAPZCId);
> +  async ReleaseSharedCompositorFrameMetrics(ViewID aId, uint32_t aAPZCId);

These seem kinda verbose. I don't think "Compositor" is needed in the names of these functions.

::: ipc/glue/CrossProcessMutex_posix.cpp
@@ +97,5 @@
> +    Lock();
> +    *mDestroyed = 1;
> +    Unlock();
> +    if (pthread_mutex_destroy(mMutex)) {
> +    }

Empty if statement?
Attachment #832682 - Flags: review?(chrislord.net)
Attachment #832682 - Flags: review?(bugmail.mozilla)
Attachment #832682 - Flags: review-
Comment on attachment 832682 [details] [diff] [review]
Work in progress 6

Review of attachment 832682 [details] [diff] [review]:
-----------------------------------------------------------------

Whoops, mid-air collision.
Attachment #832682 - Flags: review?(chrislord.net) → review-
I've made the indicated changes but will hold off uploading a new patch until I had time to address :kats comments. I'm posting the response now so I don't loose it.

(In reply to Chris Lord [:cwiiis] from comment #24)
> Comment on attachment 832682 [details] [diff] [review]
> Work in progress 6
> 
> Review of attachment 832682 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The cross-process mutex stuff should definitely be in another patch. I also
> think someone else should review this for the IPC stuff and general types
> usage (:blassey perhaps?), as I'm not that well versed in those areas.
> 
> This otherwise looks ok to me, but I have some questions below. r- for too
> many open questions and lack of documentation, but it looks sound.
> 
> ::: b2g/app/b2g.js
> @@ +75,4 @@
> >  pref("mozilla.widget.use-buffer-pixmap", true);
> >  pref("mozilla.widget.disable-native-theme", true);
> >  pref("layout.reflow.synthMouseMove", false);
> > +pref("layers.force-tiles", true);
> 
> I think we want to cut down tiled layer usage before enabling it on b2g -
> perhaps only for large scrollable layers (where the scroll port is more than
> twice the size of the viewport, perhaps? (i.e. when it would otherwise cause
> a buffer unrotate)) - see bug 915673
>

Okay, any tips on where to add that logic?

> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +135,5 @@
> > +  FrameMetrics contentMetrics = aLayer->GetFrameMetrics();
> > +  ContainerLayer* layer = aLayer;
> > +
> > +  // Walk up the layer tree until a valid composition bounds is found
> > +  while (layer && contentMetrics.mCompositionBounds.IsEmpty()) {
> 
> This doesn't seem like a great way of doing this - perhaps instead of
> checking if mCompositionBounds is empty, you could check if mScrollId !=
> NULL_SCROLL_ID ? (this is what botond suggested to me for doing something
> similar)
> 

Hmm, I had that at first but changed it. Not sure why, as it makes more sense. I tired it again and it is working so I'll make the change.

> @@ +140,5 @@
> > +    contentMetrics = layer->GetFrameMetrics();
> > +    layer = layer->GetParent();
> > +  }
> > +
> > +  // Use the local values in case a shared FrameMetrics is not found
> 
> I don't understand this comment
> 

Updated comment.

> @@ +179,5 @@
> > +    return true;
> > +  }
> > +
> > +/*
> > +  // Need to implement this for low-resolution tiling?
> 
> Commented out code - Let's just write this when it's needed, to avoid any
> weirdness due to bitrot.
> 

Done. I left it in since I wasn't sure if it was needed but you answered that.

> @@ +214,5 @@
> > +    return false;
> > +  }
> > +
> > +  // If we're not doing low precision draws and we're about to
> > +  // checkerboard, enable low precision drawing.
> 
> It's confusing that this block of code is here, but the related block above
> is commented out.
> 

So should I remove all the references to low precision? What I did was take the Java GeckoLayerClient::progressiveUpdateCallback function and attempted to port it to C++. I tried to leave as many of the comments intact as I could. Since at the time I thought the low precision work would follow on immediately after, I left the code in place. I can remove it if you feel it is better that way.

> ::: gfx/layers/client/TiledContentClient.h
> @@ +122,5 @@
> > +{
> > +public:
> > +  BasicTiledLayerBufferHelper(ClientLayerManager* aManager);
> > +  ~BasicTiledLayerBufferHelper();
> > +  bool UpdateFromCompositorFrameMetrics(ContainerLayer* aLayer,
> 
> These methods need documentation.
> 

Done. I'm not certain that my implementation of aboutToCheckerboard is sufficient. It seems that it should detect an approaching checker board sooner but I was unsure how to do that with only the data in the FrameMetrics. Any ideas?

> ::: gfx/layers/ipc/AsyncPanZoomController.cpp
> @@ +303,5 @@
> >  AsyncPanZoomController::~AsyncPanZoomController() {
> > +
> > +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> > +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);
> > +    (unused); // Suppress warning. Nothing to to be done if it does fail.
> 
> Why assign this variable if it gets ignored? Wouldn't a comment suffice?
> 

I was trying to get rid of a warning since the -Wunused-value flag is set. I was unaware of unused.h. Fixed.

> @@ +1597,5 @@
> > +
> > +  if (gfxPlatform::UseProgressiveTilePainting() && frame) {
> > +    mSharedLock.Lock();
> > +    *frame = mFrameMetrics;
> > +    mSharedLock.Unlock();
> 
> As we discussed in the gfx meeting, I think you could triple-buffer for this
> instead of locking - I especially don't like the idea of possible
> deadlock... But that doesn't need to hold this patch up. This is still
> probably better than what we do in fennec.
> 

I talked about this with :blassey a little on Thursday and I don't believe triple buffering will work. I have seen cases where the compositor can execute 10 or more frames to the content process's one. With out locking, you could not ensure that the content process was not reading from the same buffer that the compositor process was writing to. I think buffering would only work if the two process ran in lock step. If you know of a scheme that will work I'm open to alternatives as I would prefer not to use locks if possible. I'm also not sure how you think a deadlock is really possible. One side is producing and the other side is consuming. The lock is used only to protect reads and writes so barring a crash in the copy operation, I don't see how a dead lock is likely or even possible.

> @@ +1601,5 @@
> > +    mSharedLock.Unlock();
> > +  }
> > +}
> > +
> > +void AsyncPanZoomController::ShareCompositorFrameMetrics() {
> 
> This method seriously needs some commenting, it's not entirely obvious
> what's going on.
> 

Okay, added.

> @@ +1611,5 @@
> > +      if (frame) {
> > +        ReentrantMonitorAutoEnter lock(mMonitor);
> > +        *frame = mFrameMetrics;
> > +      }
> > +    }
> 
> If the Create/Map fail here, do we want to continue? And should we at least
> warn about it?
> 

I'm not sure of the value of a warning. I expand on this a little more bellow. If you feel strongly that I need a warning, I can add one.

> ::: gfx/layers/ipc/AsyncPanZoomController.h
> @@ +180,4 @@
> >     */
> >    void SetCompositorParent(CompositorParent* aCompositorParent);
> >  
> > +  void SetCrossProcessCompositorParent(PCompositorParent* aCrossProcessCompositorParent);
> 
> This method needs documentation.
> 

Done.

> ::: gfx/layers/ipc/CompositorChild.cpp
> @@ +59,5 @@
> > +  if (data) {
> > +    FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > +    if (frame) {
> > +      data->mMutex->Lock();
> > +      aFrame = *(frame);
> 
> nit, unnecessary parenthesis.
> 

Fixed.

> @@ +145,5 @@
> > +    const uint32_t& aAPZCId)
> > +{
> > +  FrameMetricsData* data = new FrameMetricsData();
> > +  data->mBuffer = new ipc::SharedMemoryBasic(metrics);
> > +  data->mBuffer->Map(sizeof(FrameMetrics));
> 
> Elsewhere you check if the map succeeds, could this possibly fail?
> 

Are you referring to the new ipc::SharedFrameMemoryBasic(metrics)? Should I be checking heap allocations? I'm not sure what I could do if we are out of heap space. The part more likely to fail is when the ashmem is mapped. If that fails then the mBuffer->memroy() call will return null. I'm not sure of the value of warning on exhaustion of ashmem as that generally indicates that the process has max file handles open so I assume things will probably start going off the rails rather rapidly.

> @@ +149,5 @@
> > +  data->mBuffer->Map(sizeof(FrameMetrics));
> > +  data->mMutex = new CrossProcessMutex(handle);
> > +  data->mId = aAPZCId;
> > +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > +  if (frame) {
> 
> should we warn if frame == nullptr?
> 

See comment above, I'm fine with adding a warning, I guess I'm am uncertain of the value.

> @@ +161,5 @@
> > +    const ViewID& aId,
> > +    const uint32_t& aAPZCId)
> > +{
> > +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);
> > +  if (data && (data->mId == aAPZCId)) {
> 
> Should we warn if release is called and either of the IDs isn't found?
> 

No, this is an expected case. Here is how it can happen:

1) Compositor process creates APZC with ViewID 1 and APZCId 1
2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
3) Compositor process creates new APZC with ViewID 1 and APZCId 2
4) Content process registers FrameMetrics with ViewID 1 and APZCId 2 causing the FrameMetrics with ViewID 1 and APZCId 1 to be destroyed
5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a message to Content process to remove associated FrameMetrics
6) Content process receives message to remove FrameMetrics with ViewID 1 and aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do not match so the FrameMetrics is not removed.

I had to add the APZCId specifically for this case.

> ::: gfx/layers/ipc/CompositorChild.h
> @@ +36,4 @@
> >  
> >    void Destroy();
> >  
> > +  bool LookupCompositorFrameMetrics(const FrameMetrics::ViewID aId, FrameMetrics&);
> 
> This method needs documentation.

Done.
Attached patch Cross Process Mutex for POSIX (obsolete) — Splinter Review
Not sure who else should review this.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #25)
> Comment on attachment 832682 [details] [diff] [review]
> Work in progress 6
> 
> Review of attachment 832682 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Overall this looks good to me. I didn't look at the changes to TiledContent*
> in any detail, but I have some comments on the rest of the patch. Somebody
> more familiar with IPC stuff (maybe ben turner?) should probably review the
> CrossProcessMutex code.
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +103,5 @@
> >    buffer->ClearPaintedRegion();
> >  }
> >  
> > +BasicTiledLayerBufferHelper::BasicTiledLayerBufferHelper(ClientLayerManager *aManager) :
> > +    mLayerManager(aManager),
> 
> style nit: copy the same style as the other constructors. that is, move the
> colon and commas down to the start of the next line.
> 

Done.

> @@ +605,4 @@
> >    // caused by there being an incoming, more relevant paint.
> >    gfx::Rect viewport;
> >    float scaleX, scaleY;
> > +#ifdef MOS_WIDGET_ANDROID
> 
> s/MOS/MOZ/. You might want to also test this on Android to ensure nothing
> broke there.
>

I had cut and pasted it, but then it got dropped by mistake on a merge conflict, probably shouldn't have tried to save time by typing it from memory.

> ::: gfx/layers/ipc/AsyncPanZoomController.cpp
> @@ +215,4 @@
> >  
> >  static TimeStamp sFrameTime;
> >  
> > +uint32_t AsyncPanZoomController::sAsyncPanZoomControllerCount = 0;
> 
> Can we just make this static to this file, rather than a static member of
> AsyncPanZoomController?
> 

Sure, is there a reason why a file local variable is preferred to a static member variable?

> @@ +302,5 @@
> >  
> >  AsyncPanZoomController::~AsyncPanZoomController() {
> > +
> > +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> > +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);
> 
> Note that a new AsyncPanZoomController with the same mScrollId might get
> created before the old one is freed. This might result in the case where you
> call SendSharedCompositorFrameMetrics twice in a row before the first
> release. i.e. the flow would be like this:
> 
> APZC 1 is created with mScrollId = 1.
> APZC 1 gets NotifyLayersUpdated, which calls SendSharedCompositorFrameMetrics
> ... time passes, a relayout happens
> APZC 2 is created with mScrollId = 1.
> APZC 2 gets NotifyLayersUpdated, which calls SendSharedCompositorFrameMetrics
> APZC 1 is deleted, which calls SendReleaseSharedCompositorFrameMetrics
> ... time passes, a relayout happens
> APZC 2 is deleted, which calls SnedReleaseSharedCompositorFrameMetrics
> 
> I think the code in CompositorChild that uses the ViewID as the lookup key
> in the hashtable might not deal with this scenario properly.
> 

That is why I added sAsyncPanZoomControllerCount so each APZC is created with a unique number. I use that to verify that I'm not removing a shared FrameMetrics that is still in use. I discuss this in more detail in my response to :cwiiis above.

> @@ +303,5 @@
> >  AsyncPanZoomController::~AsyncPanZoomController() {
> > +
> > +  if (mCrossProcessCompositorParent && mSharedFrameMetricsBuffer) {
> > +    bool unused = mCrossProcessCompositorParent->SendReleaseSharedCompositorFrameMetrics(mFrameMetrics.mScrollId, mId);
> > +    (unused); // Suppress warning. Nothing to to be done if it does fail.
> 
> If you can get rid of storing the return value at all that would be good. If
> not, #include mozilla/unused.h and do "unused << func();" See
> http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameLoader.
> cpp?rev=6e221af28de2#2077 for example.
> 

I didn't see a way to get rid of the return code from the IPDL send function so I used unused.h. I didn't know about that.

> @@ +1591,5 @@
> >  }
> >  
> > +void AsyncPanZoomController::UpdateSharedCompositorFrameMetrics()
> > +{
> > +  FrameMetrics* frame = mSharedFrameMetricsBuffer ?
> 
> This function can be called from multiple threads, although it looks like
> you only call it from inside the re-entrant lock which makes it safe. Still
> it would be good to add an assert or some documentation to that effect. e.g.
> mMonitor.AssertCurrentThreadIn().
> 

Done.

> @@ +1594,5 @@
> > +{
> > +  FrameMetrics* frame = mSharedFrameMetricsBuffer ?
> > +      (FrameMetrics*)mSharedFrameMetricsBuffer->memory() : nullptr;
> > +
> > +  if (gfxPlatform::UseProgressiveTilePainting() && frame) {
> 
> I don't know if we can call UseProgressiveTilePainting from the compositor
> thread. I don't think so, because it calls into the preferences service.
> 

Okay, that explains AsyncPanZoomController::InitializeGlobalState(), updated to match.

> @@ +1602,5 @@
> > +  }
> > +}
> > +
> > +void AsyncPanZoomController::ShareCompositorFrameMetrics() {
> > +  if (!mSharedFrameMetricsBuffer && gfxPlatform::UseProgressiveTilePainting() && mCrossProcessCompositorParent) {
> 
> Ditto on the UseProgressiveTilePainting. This function is definitely run on
> the compositor thread.
> 

Fixed.

> @@ +1606,5 @@
> > +  if (!mSharedFrameMetricsBuffer && gfxPlatform::UseProgressiveTilePainting() && mCrossProcessCompositorParent) {
> > +    mSharedFrameMetricsBuffer = new ipc::SharedMemoryBasic;
> > +    FrameMetrics* frame = nullptr;
> > +    if (mSharedFrameMetricsBuffer->Create(sizeof(FrameMetrics)) && mSharedFrameMetricsBuffer->Map(sizeof(FrameMetrics))) {
> > +      frame = (FrameMetrics*)mSharedFrameMetricsBuffer->memory();
> 
> static_cast is preferred.
> 

Fixed.

> ::: gfx/layers/ipc/AsyncPanZoomController.h
> @@ +13,5 @@
> >  #include "mozilla/EventForwards.h"
> >  #include "mozilla/Monitor.h"
> >  #include "mozilla/ReentrantMonitor.h"
> >  #include "mozilla/RefPtr.h"
> > +#include "mozilla/layers/PCompositorParent.h"
> 
> Do you need to include this? You're forward-declaring the class below anyway.
> 

Fixed.

> @@ +516,4 @@
> >  
> >    uint64_t mLayersId;
> >    nsRefPtr<CompositorParent> mCompositorParent;
> > +  PCompositorParent* mCrossProcessCompositorParent;
> 
> Does this need to be refcounted? When is the PCompositorParent deleted? Is
> it ever possible for it to get deleted while an APZC instance still has a
> pointer to it?
> 

Can an IPDL generated interface be refcounted? I thought it couldn't. It should live as long as the content process it is associated. I did look into at the time and is seem okay but that doesn't mean things can't change.

> @@ +685,5 @@
> >    }
> >  
> >  private:
> > +  static uint32_t sAsyncPanZoomControllerCount;
> > +  const uint32_t mId;
> 
> These need comments.
> 

Done.

> @@ +700,5 @@
> > +  //ipc::Shmem *mSharedFrameMetricsBuffer;
> > +  ipc::SharedMemoryBasic* mSharedFrameMetricsBuffer;
> > +  CrossProcessMutex mSharedLock;
> > +  void UpdateSharedCompositorFrameMetrics();
> > +  void ShareCompositorFrameMetrics();
> 
> These need comments.
> 

Done.

> ::: gfx/layers/ipc/CompositorChild.cpp
> @@ +54,5 @@
> > +bool
> > +CompositorChild::LookupCompositorFrameMetrics(const FrameMetrics::ViewID aId,
> > +                                              FrameMetrics& aFrame)
> > +{
> > +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);
> 
> The ViewID is not a perfectly unique identifier for a FrameMetrics, so I'm a
> little uneasy about using this as the key here. In practice I guess you're
> unlikely to hit a scenario where it matters but maybe a comment in the .h
> file would be good.
> 

My understanding is that ViewID is unique within a given instance of a layer tree. I believe this usage is okay but I will add a comment. Please take a look and make sure it addresses your concerns, I wasn't totally certain what to say.

> @@ +143,5 @@
> > +    const ipc::SharedMemoryBasic::Handle& metrics,
> > +    const CrossProcessMutexHandle& handle,
> > +    const uint32_t& aAPZCId)
> > +{
> > +  FrameMetricsData* data = new FrameMetricsData();
> 
> Does the mFrameMetricsTable handle deletion of the old instances when you
> update them with new ones? What about on .Remove()?
> 

Yes, the hash table will delete the the existing FrameMetricsData on a collision. .Remove() also deletes the object associated with the key. I did some logging in dtors to be certain. The fact that Put() doesn't have a return value and Remove() does not return the stored object also indicate to me that this is the case.

> @@ +150,5 @@
> > +  data->mMutex = new CrossProcessMutex(handle);
> > +  data->mId = aAPZCId;
> > +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > +  if (frame) {
> > +    mFrameMetricsTable.Put(frame->mScrollId, data);
> 
> Does reading/writing from mFrameMetricsTable need locking? What thread is
> the Lookup function called from?
> 

I was under the impression it was all running is the same thread. It didn't occur to me this may not be the case. Do you know of a good way to verify this?

> ::: gfx/layers/ipc/PCompositor.ipdl
> @@ +72,5 @@
> > +child:
> > +  // Send back Compositor Frame Metrics from APZCs so tiled layers can
> > +  // update progressively.
> > +  async SharedCompositorFrameMetrics(Handle metrics, CrossProcessMutexHandle mutex, uint32_t aAPZCId);
> > +  async ReleaseSharedCompositorFrameMetrics(ViewID aId, uint32_t aAPZCId);
> 
> These seem kinda verbose. I don't think "Compositor" is needed in the names
> of these functions.
> 

Hmm, while they are verbose, I did have them with out Compositor at first but latter added it as I felt it made the code easier to read. Since these are only called in one place, I don't see the harm in a little extra verbosity.

> ::: ipc/glue/CrossProcessMutex_posix.cpp
> @@ +97,5 @@
> > +    Lock();
> > +    *mDestroyed = 1;
> > +    Unlock();
> > +    if (pthread_mutex_destroy(mMutex)) {
> > +    }
> 
> Empty if statement?

Fixed, was working around compiler warning. Used unused.h.
Attached patch Work in progress 7 (obsolete) — Splinter Review
Updated patch to address most comments from :kats and :cwiiis
Attachment #832682 - Attachment is obsolete: true
Attachment #832682 - Flags: review?(bgirard)
Attachment #833269 - Flags: review?(chrislord.net)
Attachment #833269 - Flags: review?(bugmail.mozilla)
Attachment #833269 - Flags: review?(bgirard)
Why do we need a cross-process mutex on top of a transactional layers system? I would like roc to review that part of the design. Thanks.
We might be able to simply disable tiling for layers that don't scroll and enable it for anything that scrolls and then lose hw composition on scrolling layers (which might be ok).
(In reply to Andreas Gal :gal from comment #32)
> We might be able to simply disable tiling for layers that don't scroll and
> enable it for anything that scrolls and then lose hw composition on
> scrolling layers (which might be ok).

Yes, that's bug 915673.
(In reply to Andreas Gal :gal from comment #31)
> Why do we need a cross-process mutex on top of a transactional layers
> system? I would like roc to review that part of the design. Thanks.

There's a reason (we need this data outside of transactions -- it's outside to help us make decisions).  Cross-process lock sounds terrifying, but allocating a mutex in shared memory is a normal supported thing on both Unix and Windows.  So let's carry on and see how it goes..
Attached patch Cross Process Mutex for POSIX (obsolete) — Splinter Review
Fixed to allow usage in multiple processes and remove need for tracking creator.
Attachment #833251 - Attachment is obsolete: true
Comment on attachment 833269 [details] [diff] [review]
Work in progress 7

Review of attachment 833269 [details] [diff] [review]:
-----------------------------------------------------------------

Pretty much all my comments are just nits, so assuming they're addressed/explained, this looks good to me.

::: b2g/app/b2g.js
@@ +75,4 @@
>  pref("mozilla.widget.use-buffer-pixmap", true);
>  pref("mozilla.widget.disable-native-theme", true);
>  pref("layout.reflow.synthMouseMove", false);
> +pref("layers.force-tiles", true);

I'm still assuming that this won't be committed as part of this patch.

::: gfx/layers/client/TiledContentClient.cpp
@@ +24,5 @@
>  #include "nsMathUtils.h"               // for NS_roundf
>  #include "gfx2DGlue.h"
>  
> +#include <android/log.h>
> +#define RLOG(format, ...) __android_log_print(ANDROID_LOG_INFO, "reb", format, ##__VA_ARGS__);

reb? Most other uses of this have 'GeckoXXX' - perhaps GeckoTiledContentClient?

@@ +118,5 @@
> +{
> +
> +}
> +
> +static inline bool fuzzyEquals(float a, float b) {

nit, function name/proto should be on a new line and start with a capital letter.

@@ +134,5 @@
> +  if (!aLayer) {
> +    return false;
> +  }
> +
> +  FrameMetrics contentMetrics = aLayer->GetFrameMetrics();

Make this a const FrameMetrics&, or the loop below becomes more expensive than necessary.

@@ +141,5 @@
> +  // Walk up the layer tree until a valid composition bounds is found
> +  while (layer && (contentMetrics.mScrollId == FrameMetrics::NULL_SCROLL_ID)) {
> +    contentMetrics = layer->GetFrameMetrics();
> +    layer = layer->GetParent();
> +  }

This loop can fail to find a scroll container and this function will continue with possibly invalid bounds. There should be some kind of check and warning for this if it's not expected.

@@ +146,5 @@
> +
> +  // On B2G, the compositor is part of the b2g process so there is no
> +  // CrossProcessCompositorParent created for sharing the FrameMetrics.
> +  // In this case, LookupCompositorFrameMetrics will fail so we use
> +  // the content version of the FrameMetrics for calculations.

I don't really understand this comment - so this function won't fully work on b2g ever? Or is it that it doesn't work in a specific circumstance?

@@ +159,5 @@
> +
> +  FrameMetrics compositorMetrics;
> +
> +  if (!compositor->LookupCompositorFrameMetrics(contentMetrics.mScrollId,
> +                                                compositorMetrics)){

nit, space before opening brace.

@@ +186,5 @@
> +  }
> +
> +  // XXX All sorts of rounding happens inside Gecko that becomes hard to
> +  //     account exactly for. Given we align the display-port to tile
> +  //     boundaries (and so they rarely vary by sub-pixel amounts), just

This isn't true for APZC until bug 907743 is fixed.

@@ +203,5 @@
> +      fabsf(contentMetrics.mDisplayPort.height - compositorMetrics.mDisplayPort.height)) {
> +    return false;
> +  }
> +
> +  if (!aLowPrecision && !mProgressiveUpdateWasInDanger) {

Add a comment for this block. Apologies if one doesn't exist in GeckoLayerClient already.

::: gfx/layers/client/TiledContentClient.h
@@ +138,5 @@
> +                                        CSSToScreenScale& aZoom);
> +  /**
> +   * Determines if the compositor's upcoming composition bounds has fallen
> +   * outside of the contents display port. If it has then the compositor
> +   * will start to checker board.

Might be worth adding a short description of what we mean by checkerboarding, given it's been a really long time since we drew a checkerboard pattern underneath unrendered content :)

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +308,5 @@
>       mHandlingTouchQueue(false),
> +     mTreeManager(aTreeManager),
> +     mAPZCId(sAsyncPanZoomControllerCount++),
> +     mSharedFrameMetricsBuffer(nullptr),
> +     mSharedLock("")

The string in here should be something helpful to identify errors, right?

@@ +1629,5 @@
> +
> +  if (gUseProgressiveTilePainting && frame) {
> +    mSharedLock.Lock();
> +    *frame = mFrameMetrics;
> +    mSharedLock.Unlock();

Not a blocker (pun not initially intended), but I still think you could write this without locking. Let's tackle it if it ever becomes a problem though.

@@ +1638,5 @@
> +  // Only create the shared memory buffer if it hasn't already been created,
> +  // we are using progressive tile painting, and we have cross process
> +  // compositor to pass the shared memory back to the content process.
> +  if (!mSharedFrameMetricsBuffer && gUseProgressiveTilePainting && mCrossProcessCompositorParent) {
> +    // Create shared memory and initialized it with current FrameMetrics value

nit, s/initialized/initialize/ (or indeed, initialise, but I'm not gonna win that battle :)), s/with/with the/ and s/ value//

@@ +1649,5 @@
> +      {
> +        ReentrantMonitorAutoEnter lock(mMonitor);
> +        *frame = mFrameMetrics;
> +      }
> +      // Get the process id of the content process

nit, let's add some newlines here, it's getting a bit crowded.

::: gfx/layers/ipc/AsyncPanZoomController.h
@@ +183,5 @@
> +  /**
> +   * The platform implementation must set the cross process compositor if
> +   * there is one associated with layer tree so that APZC may share its
> +   * FrameMetrics with the content process so it may be used in progressive
> +   * tile updates.

I can't parse this comment - split it up and add a few commas, as necessary.

@@ +695,5 @@
>    }
>  
>  private:
> +  // Unique id assigned to each APZC. Used with ViewID to uniquely identify
> +  // shared FrameMeterics used in progressive tile painting.

nit, use C-style comment here to match up with the other documented variables.

@@ +711,5 @@
> +  ipc::SharedMemoryBasic* mSharedFrameMetricsBuffer;
> +  CrossProcessMutex mSharedLock;
> +  /*
> +   * Called when ever mFrameMetric is updated so that if it is being
> +   * shared with the content process the shared FrameMetric may be updated.

s/FrameMetric/FrameMetrics/ (for both)

@@ +714,5 @@
> +   * Called when ever mFrameMetric is updated so that if it is being
> +   * shared with the content process the shared FrameMetric may be updated.
> +   */
> +  void UpdateSharedCompositorFrameMetrics();
> +  /**

nit, you use two '*'s here, but only one in the previous comment. The file itself seems to use one everywhere and start the comment on the same line that opens it, so let's be consistent (this applies further up the file too).

::: gfx/layers/ipc/CompositorChild.cpp
@@ +58,5 @@
> +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);
> +  if (data) {
> +    FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> +    if (frame) {
> +      data->mMutex->Lock();

Is there a chance that data or frame could disappear or change between the assignment and this lock?

@@ +148,5 @@
> +  data->mBuffer = new ipc::SharedMemoryBasic(metrics);
> +  data->mBuffer->Map(sizeof(FrameMetrics));
> +  data->mMutex = new CrossProcessMutex(handle);
> +  data->mId = aAPZCId;
> +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();

nit, use a static_cast instead of a C-style cast, unless there's some reason not to?

::: gfx/layers/ipc/CompositorChild.h
@@ +38,5 @@
>  
>    /**
> +   * Lookup the FrameMetrics shared by the compositor process with the
> +   * associated FrameMetrics::ViewID. The returned FrameMetrics is used
> +   * in progressive tile update calculations.

nit, s/tile/tiled/

@@ +75,2 @@
>  private:
> +  struct FrameMetricsData {

Let's document this struct too.

@@ +97,4 @@
>    nsRefPtr<LayerManager> mLayerManager;
>    nsCOMPtr<nsIObserver> mMemoryPressureObserver;
>  
> +  // The ViewID of the FrameMetric is used as the key for this hash table.

nit, s/FrameMetric/FrameMetrics/
Attachment #833269 - Flags: review?(chrislord.net) → review+
I'll do my next round of review on the cleaned up patch since there's already a lot of comments in flight.
(In reply to Chris Lord [:cwiiis] from comment #36)
> Comment on attachment 833269 [details] [diff] [review]
> Work in progress 7
> 
> Review of attachment 833269 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Pretty much all my comments are just nits, so assuming they're
> addressed/explained, this looks good to me.
> 
> ::: b2g/app/b2g.js
> @@ +75,4 @@
> >  pref("mozilla.widget.use-buffer-pixmap", true);
> >  pref("mozilla.widget.disable-native-theme", true);
> >  pref("layout.reflow.synthMouseMove", false);
> > +pref("layers.force-tiles", true);
> 
> I'm still assuming that this won't be committed as part of this patch.
> 

Do I leave pref("layers.progressive-paint", true); ?

> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +24,5 @@
> >  #include "nsMathUtils.h"               // for NS_roundf
> >  #include "gfx2DGlue.h"
> >  
> > +#include <android/log.h>
> > +#define RLOG(format, ...) __android_log_print(ANDROID_LOG_INFO, "reb", format, ##__VA_ARGS__);
> 
> reb? Most other uses of this have 'GeckoXXX' - perhaps
> GeckoTiledContentClient?
> 

Sorry, debug code that got left behind. I use reb so I can: "adb logcat | grep reb" and only see my debug statements. Removed.

> @@ +118,5 @@
> > +{
> > +
> > +}
> > +
> > +static inline bool fuzzyEquals(float a, float b) {
> 
> nit, function name/proto should be on a new line and start with a capital
> letter.
> 

Done.

> @@ +134,5 @@
> > +  if (!aLayer) {
> > +    return false;
> > +  }
> > +
> > +  FrameMetrics contentMetrics = aLayer->GetFrameMetrics();
> 
> Make this a const FrameMetrics&, or the loop below becomes more expensive
> than necessary.
> 

I'm not sure I follow. If I change it then the loop bellow doesn't work since I won't be able to change the value of contentMetrics. I broke it out into a separate function and do this:

const FrameMetrics* contentMetrics = &(aLayer->GetFrameMetrics());

> @@ +141,5 @@
> > +  // Walk up the layer tree until a valid composition bounds is found
> > +  while (layer && (contentMetrics.mScrollId == FrameMetrics::NULL_SCROLL_ID)) {
> > +    contentMetrics = layer->GetFrameMetrics();
> > +    layer = layer->GetParent();
> > +  }
> 
> This loop can fail to find a scroll container and this function will
> continue with possibly invalid bounds. There should be some kind of check
> and warning for this if it's not expected.
> 

Added a warning. Not sure what I could do if no composition bounds is found. Also, it turns out I can't use NULL_SCROLL_ID since it is possible to not have any valid ViewIDs all the way up to the root node. I have gone back to looking for non-empty composition bounds.

> @@ +146,5 @@
> > +
> > +  // On B2G, the compositor is part of the b2g process so there is no
> > +  // CrossProcessCompositorParent created for sharing the FrameMetrics.
> > +  // In this case, LookupCompositorFrameMetrics will fail so we use
> > +  // the content version of the FrameMetrics for calculations.
> 
> I don't really understand this comment - so this function won't fully work
> on b2g ever? Or is it that it doesn't work in a specific circumstance?
> 

There two potential issues at play here. Since the the content process for the main B2G process is in the same process as the compositor, there is no CrossProcessCompositorParent created. With out a CrossProcessCompositorParent, there is nothing to pass the current FrameMetrics back to the content process. The other issue is some layers are not scrollable, and none of their parents are either so there is no APZC created for sharing its frame metrics. Once progressive paint is only enabled for suitably scrollable layers, only the main B2G process will potentially have issues and then only if something it is rendering is scrollable. I reworked the function so that the layer tree only gets walked if the shared FrameMetrics is not found. I'm not sure what to do about scrollable layers in the main b2g process.

> @@ +159,5 @@
> > +
> > +  FrameMetrics compositorMetrics;
> > +
> > +  if (!compositor->LookupCompositorFrameMetrics(contentMetrics.mScrollId,
> > +                                                compositorMetrics)){
> 
> nit, space before opening brace.
>

Fixed.
 
> @@ +186,5 @@
> > +  }
> > +
> > +  // XXX All sorts of rounding happens inside Gecko that becomes hard to
> > +  //     account exactly for. Given we align the display-port to tile
> > +  //     boundaries (and so they rarely vary by sub-pixel amounts), just
> 
> This isn't true for APZC until bug 907743 is fixed.
> 

Okay, I will remove the comment. Should I change the fabs to FuzzyEqual or just assume it will be true eventually and leave as is?

> @@ +203,5 @@
> > +      fabsf(contentMetrics.mDisplayPort.height - compositorMetrics.mDisplayPort.height)) {
> > +    return false;
> > +  }
> > +
> > +  if (!aLowPrecision && !mProgressiveUpdateWasInDanger) {
> 
> Add a comment for this block. Apologies if one doesn't exist in
> GeckoLayerClient already.
> 

Done.

> ::: gfx/layers/client/TiledContentClient.h
> @@ +138,5 @@
> > +                                        CSSToScreenScale& aZoom);
> > +  /**
> > +   * Determines if the compositor's upcoming composition bounds has fallen
> > +   * outside of the contents display port. If it has then the compositor
> > +   * will start to checker board.
> 
> Might be worth adding a short description of what we mean by
> checkerboarding, given it's been a really long time since we drew a
> checkerboard pattern underneath unrendered content :)
> 

Okay, added a historical note. Do you know if the blank tiles use the layer's background color or are they always white? Using the background color looks a lot better than a checker board pattern or white in my opinion.

> ::: gfx/layers/ipc/AsyncPanZoomController.cpp
> @@ +308,5 @@
> >       mHandlingTouchQueue(false),
> > +     mTreeManager(aTreeManager),
> > +     mAPZCId(sAsyncPanZoomControllerCount++),
> > +     mSharedFrameMetricsBuffer(nullptr),
> > +     mSharedLock("")
> 
> The string in here should be something helpful to identify errors, right?
> 

It is actually ignored in the constructor but I will change it in case it is used in the future.

> @@ +1629,5 @@
> > +
> > +  if (gUseProgressiveTilePainting && frame) {
> > +    mSharedLock.Lock();
> > +    *frame = mFrameMetrics;
> > +    mSharedLock.Unlock();
> 
> Not a blocker (pun not initially intended), but I still think you could
> write this without locking. Let's tackle it if it ever becomes a problem
> though.
> 

I agree that it could be done with out a lock but it would not be a trivial change. I still do not believe that a triple buffer would work with out a lock (see my comments previously).


> @@ +1638,5 @@
> > +  // Only create the shared memory buffer if it hasn't already been created,
> > +  // we are using progressive tile painting, and we have cross process
> > +  // compositor to pass the shared memory back to the content process.
> > +  if (!mSharedFrameMetricsBuffer && gUseProgressiveTilePainting && mCrossProcessCompositorParent) {
> > +    // Create shared memory and initialized it with current FrameMetrics value
> 
> nit, s/initialized/initialize/ (or indeed, initialise, but I'm not gonna win
> that battle :)), s/with/with the/ and s/ value//
>

Done.
 
> @@ +1649,5 @@
> > +      {
> > +        ReentrantMonitorAutoEnter lock(mMonitor);
> > +        *frame = mFrameMetrics;
> > +      }
> > +      // Get the process id of the content process
> 
> nit, let's add some newlines here, it's getting a bit crowded.
> 

Done.

> ::: gfx/layers/ipc/AsyncPanZoomController.h
> @@ +183,5 @@
> > +  /**
> > +   * The platform implementation must set the cross process compositor if
> > +   * there is one associated with layer tree so that APZC may share its
> > +   * FrameMetrics with the content process so it may be used in progressive
> > +   * tile updates.
> 
> I can't parse this comment - split it up and add a few commas, as necessary.
> 

Okay, broke it up into multiple sentences.
 
> @@ +695,5 @@
> >    }
> >  
> >  private:
> > +  // Unique id assigned to each APZC. Used with ViewID to uniquely identify
> > +  // shared FrameMeterics used in progressive tile painting.
> 
> nit, use C-style comment here to match up with the other documented
> variables.
> 

Done.

> @@ +711,5 @@
> > +  ipc::SharedMemoryBasic* mSharedFrameMetricsBuffer;
> > +  CrossProcessMutex mSharedLock;
> > +  /*
> > +   * Called when ever mFrameMetric is updated so that if it is being
> > +   * shared with the content process the shared FrameMetric may be updated.
> 
> s/FrameMetric/FrameMetrics/ (for both)
> 

Done.

> @@ +714,5 @@
> > +   * Called when ever mFrameMetric is updated so that if it is being
> > +   * shared with the content process the shared FrameMetric may be updated.
> > +   */
> > +  void UpdateSharedCompositorFrameMetrics();
> > +  /**
> 
> nit, you use two '*'s here, but only one in the previous comment. The file
> itself seems to use one everywhere and start the comment on the same line
> that opens it, so let's be consistent (this applies further up the file too).
>

The file isn't very consistent. Most function comments start with two '*' so I think I should use two for both of the functions.
 
> ::: gfx/layers/ipc/CompositorChild.cpp
> @@ +58,5 @@
> > +  FrameMetricsData* data = mFrameMetricsTable.Get(aId);
> > +  if (data) {
> > +    FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > +    if (frame) {
> > +      data->mMutex->Lock();
> 
> Is there a chance that data or frame could disappear or change between the
> assignment and this lock?
> 

I don't believe so. I verified that the hash table is only accessed in a single thread.

> @@ +148,5 @@
> > +  data->mBuffer = new ipc::SharedMemoryBasic(metrics);
> > +  data->mBuffer->Map(sizeof(FrameMetrics));
> > +  data->mMutex = new CrossProcessMutex(handle);
> > +  data->mId = aAPZCId;
> > +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> 
> nit, use a static_cast instead of a C-style cast, unless there's some reason
> not to?
> 

Done.

> ::: gfx/layers/ipc/CompositorChild.h
> @@ +38,5 @@
> >  
> >    /**
> > +   * Lookup the FrameMetrics shared by the compositor process with the
> > +   * associated FrameMetrics::ViewID. The returned FrameMetrics is used
> > +   * in progressive tile update calculations.
> 
> nit, s/tile/tiled/
> 

Changed to progressive paint since that is what the preference is called.

> @@ +75,2 @@
> >  private:
> > +  struct FrameMetricsData {
> 
> Let's document this struct too.
> 

Done.

> @@ +97,4 @@
> >    nsRefPtr<LayerManager> mLayerManager;
> >    nsCOMPtr<nsIObserver> mMemoryPressureObserver;
> >  
> > +  // The ViewID of the FrameMetric is used as the key for this hash table.
> 
> nit, s/FrameMetric/FrameMetrics/

Done.
Attached patch Work in progress 8 (obsolete) — Splinter Review
Updated to address comments.
Attachment #8334955 - Flags: review?(chrislord.net)
Attachment #8334955 - Flags: review?(bugmail.mozilla)
Attachment #8334955 - Flags: review?(bgirard)
Attachment #833269 - Attachment is obsolete: true
Attachment #833269 - Flags: review?(bugmail.mozilla)
Attachment #833269 - Flags: review?(bgirard)
Attached patch Cross Process Mutex for POSIX 3 (obsolete) — Splinter Review
Small change to compile after rebase.
Attachment #8334172 - Attachment is obsolete: true
Comment on attachment 8334955 [details] [diff] [review]
Work in progress 8

Review of attachment 8334955 [details] [diff] [review]:
-----------------------------------------------------------------

I again mostly skipped over the TiledContentClient.*. The rest of the patch looks fine to me, modulo previous comments.

(In reply to Randall Barker from comment #27)
> No, this is an expected case. Here is how it can happen:
> 
> 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> 3) Compositor process creates new APZC with ViewID 1 and APZCId 2
> 4) Content process registers FrameMetrics with ViewID 1 and APZCId 2 causing
> the FrameMetrics with ViewID 1 and APZCId 1 to be destroyed
> 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> message to Content process to remove associated FrameMetrics
> 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> not match so the FrameMetrics is not removed.
> 
> I had to add the APZCId specifically for this case.

Cool. Note that once bug 900092 lands the ViewID *will* be unique per layers id, and you may be able to remove APZCId.

(In reply to Randall Barker from comment #29)
> > > +uint32_t AsyncPanZoomController::sAsyncPanZoomControllerCount = 0;
> > 
> > Can we just make this static to this file, rather than a static member of
> > AsyncPanZoomController?
> > 
> 
> Sure, is there a reason why a file local variable is preferred to a static
> member variable?

I prefer to avoid putting things in .h files if it can be avoided just for faster compile times if you touch the line. It's not a big deal either way, and in the end you may be able to get rid of this entirely as mentioned above.

> > @@ +516,4 @@
> > >  
> > >    uint64_t mLayersId;
> > >    nsRefPtr<CompositorParent> mCompositorParent;
> > > +  PCompositorParent* mCrossProcessCompositorParent;
> > 
> > Does this need to be refcounted? When is the PCompositorParent deleted? Is
> > it ever possible for it to get deleted while an APZC instance still has a
> > pointer to it?
> > 
> 
> Can an IPDL generated interface be refcounted? I thought it couldn't. It
> should live as long as the content process it is associated. I did look into
> at the time and is seem okay but that doesn't mean things can't change.

I'm not sure about this. You might want to check with :nical or :bjacob on IRC quickly, they generally know a lot about lifetimes of stuff and what should be refcounted or not.

> > The ViewID is not a perfectly unique identifier for a FrameMetrics, so I'm a
> > little uneasy about using this as the key here. In practice I guess you're
> > unlikely to hit a scenario where it matters but maybe a comment in the .h
> > file would be good.
> > 
> 
> My understanding is that ViewID is unique within a given instance of a layer
> tree. I believe this usage is okay but I will add a comment. Please take a
> look and make sure it addresses your concerns, I wasn't totally certain what
> to say.
> 

Looks fine on the new patch. I'm not so concerned about this anymore now that ROOT_SCROLL_ID is going away.

> > @@ +150,5 @@
> > > +  data->mMutex = new CrossProcessMutex(handle);
> > > +  data->mId = aAPZCId;
> > > +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > > +  if (frame) {
> > > +    mFrameMetricsTable.Put(frame->mScrollId, data);
> > 
> > Does reading/writing from mFrameMetricsTable need locking? What thread is
> > the Lookup function called from?
> > 
> 
> I was under the impression it was all running is the same thread. It didn't
> occur to me this may not be the case. Do you know of a good way to verify
> this?
> 

You should just be able to print out the current thread id and compare. I believe those pieces of code will never be called from multiple threads so if the thread IDs are same once they will always be the same.
Attachment #8334955 - Flags: review?(bugmail.mozilla) → review+
(In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from comment #41)
> (In reply to Randall Barker from comment #27)
> > No, this is an expected case. Here is how it can happen:
> > 
> > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > 3) Compositor process creates new APZC with ViewID 1 and APZCId 2
> > 4) Content process registers FrameMetrics with ViewID 1 and APZCId 2 causing
> > the FrameMetrics with ViewID 1 and APZCId 1 to be destroyed
> > 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> > message to Content process to remove associated FrameMetrics
> > 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> > aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> > not match so the FrameMetrics is not removed.
> > 
> > I had to add the APZCId specifically for this case.
> 
> Cool. Note that once bug 900092 lands the ViewID *will* be unique per layers
> id, and you may be able to remove APZCId.

Bug 900092 has now landed.
(In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from comment #41)
> Comment on attachment 8334955 [details] [diff] [review]
> Work in progress 8
> 
> Review of attachment 8334955 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I again mostly skipped over the TiledContentClient.*. The rest of the patch
> looks fine to me, modulo previous comments.
> 
> (In reply to Randall Barker from comment #27)
> > No, this is an expected case. Here is how it can happen:
> > 
> > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > 3) Compositor process creates new APZC with ViewID 1 and APZCId 2
> > 4) Content process registers FrameMetrics with ViewID 1 and APZCId 2 causing
> > the FrameMetrics with ViewID 1 and APZCId 1 to be destroyed
> > 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> > message to Content process to remove associated FrameMetrics
> > 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> > aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> > not match so the FrameMetrics is not removed.
> > 
> > I had to add the APZCId specifically for this case.
> 
> Cool. Note that once bug 900092 lands the ViewID *will* be unique per layers
> id, and you may be able to remove APZCId.
> 

I'll rebase and see if I can get rid of the APZCId value.
   
> > > @@ +150,5 @@
> > > > +  data->mMutex = new CrossProcessMutex(handle);
> > > > +  data->mId = aAPZCId;
> > > > +  FrameMetrics* frame = (FrameMetrics*)data->mBuffer->memory();
> > > > +  if (frame) {
> > > > +    mFrameMetricsTable.Put(frame->mScrollId, data);
> > > 
> > > Does reading/writing from mFrameMetricsTable need locking? What thread is
> > > the Lookup function called from?
> > > 
> > 
> > I was under the impression it was all running is the same thread. It didn't
> > occur to me this may not be the case. Do you know of a good way to verify
> > this?
> > 
> 
> You should just be able to print out the current thread id and compare. I
> believe those pieces of code will never be called from multiple threads so
> if the thread IDs are same once they will always be the same.

Yeah, after I posted that I realized it was a pretty silly question. I printed out the thread ids and verified that the hash table is only accessed in the same thread.
(In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from comment #41)
> Comment on attachment 8334955 [details] [diff] [review]
> Work in progress 8
> 
> Review of attachment 8334955 [details] [diff] [review]:
> ----------------------------------------------------------------- 
> (In reply to Randall Barker from comment #27)
> > No, this is an expected case. Here is how it can happen:
> > 
> > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > 3) Compositor process creates new APZC with ViewID 1 and APZCId 2
> > 4) Content process registers FrameMetrics with ViewID 1 and APZCId 2 causing
> > the FrameMetrics with ViewID 1 and APZCId 1 to be destroyed
> > 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> > message to Content process to remove associated FrameMetrics
> > 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> > aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> > not match so the FrameMetrics is not removed.
> > 
> > I had to add the APZCId specifically for this case.
> 
> Cool. Note that once bug 900092 lands the ViewID *will* be unique per layers
> id, and you may be able to remove APZCId.
> 

I rebased to pick up bug 900092. Unfortunately it appears that the APZCId is still necessary. Here is the scenario:

1) Compositor process creates APZC with ViewID 1 and APZCId 1
2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
3) A tab is opened with a new page. (new ViewID 1 APZCId 2)
4) Switch back to original page. Page layout cause new APCZ to be created with ViewID 1 and APZCId 3. Creating new FrameMetrics with same ViewID causes ViewID 1 and APZCId 1 FrameMetrics to be deleted from hash table in content process.
5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a message to Content process to remove associated FrameMetrics
6) Content process receives message to remove FrameMetrics with ViewID 1 and aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do not match so the FrameMetrics is not removed.

I don't see a way around this. I think the APCZId is still needed.
(In reply to Randall Barker from comment #44)
> 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> 3) A tab is opened with a new page. (new ViewID 1 APZCId 2)

At this point, is APZCId 1 still around? Or has it been destroyed? If it has been destroyed then step 5 really happens (in particular, before APZCId 3 is created) and so there should be no problem. If APZCId 1 is still around, see my next comment.

> 4) Switch back to original page. Page layout cause new APCZ to be created
> with ViewID 1 and APZCId 3.

If APZCId 1 is still around, then this part is unexpected to me. I would expect the original page to pick up the same APZC as it was using before, because of the code at [1] (this was added recently but you should have it if rebased to pick up 900092).

> Creating new FrameMetrics with same ViewID
> causes ViewID 1 and APZCId 1 FrameMetrics to be deleted from hash table in
> content process.
> 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> message to Content process to remove associated FrameMetrics
> 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> not match so the FrameMetrics is not removed.
> 
> I don't see a way around this. I think the APCZId is still needed.

Even if for some reason my expectations above are wrong, you might still be able to get rid of APZCId by moving the call to ShareCompositorFrameMetrics. That is, instead of doing it in NotifyLayersUpdated (which runs on the new APZC before the old one is destroyed), you can call it directly from APZCTreeManager after running through the destroy list, somewhere around [2] (you'd have to walk the full APZC tree at that point and call Share on each one). That would ensure the old one is always deleted before the new one is shared, which should make the APZCId guard unnecessary. I'd be fine with trying this in a follow-up patch or bug so that it doesn't hold up this patch from landing.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/APZCTreeManager.cpp?rev=1e1b514a0f4b#124
[2] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/APZCTreeManager.cpp?rev=1e1b514a0f4b#98
Comment on attachment 8334960 [details] [diff] [review]
Cross Process Mutex for POSIX 3

Review of attachment 8334960 [details] [diff] [review]:
-----------------------------------------------------------------

Kyle, are you a good review for this?
Attachment #8334960 - Flags: review?(khuey)
Comment on attachment 8334955 [details] [diff] [review]
Work in progress 8

Review of attachment 8334955 [details] [diff] [review]:
-----------------------------------------------------------------

I did a review of the header changes but I'm not done.

::: gfx/layers/client/TiledContentClient.cpp
@@ +214,5 @@
> +    layer = layer->GetParent();
> +    contentMetrics = layer ? &(layer->GetFrameMetrics()) : contentMetrics;
> +  }
> +
> +  NS_WARN_IF_FALSE(!contentMetrics->mCompositionBounds.IsEmpty(),

Is this appropriate as a warning? Sounds from the message that it should be an assert.

::: gfx/layers/client/TiledContentClient.h
@@ +118,4 @@
>  class ClientTiledThebesLayer;
>  class ClientLayerManager;
>  
> +class BasicTiledLayerBufferHelper

This doesn't need 'basic' in the name. In GFX land we use basic to mean software/CPU based rasterization. Helper isn't a descriptive suffix but I'm not sure what to suggest.

As I understand this class is just to encapsulate the complex FrameMetrics calculations. Maybe TiledLayerFrameMetricsHelper?

@@ +120,5 @@
>  
> +class BasicTiledLayerBufferHelper
> +{
> +public:
> +  BasicTiledLayerBufferHelper(ClientLayerManager* aManager);

Is this always constructed with null? Does it need a parameter to the constructor then?

@@ +139,5 @@
> +
> +  /**
> +   * When a shared FrameMetrics can not be found for a given layer,
> +   * this function is used to find the first non-empty composition bounds
> +   * by traversing up the layer tree.

Can you document the return code? I don't know what it does except that it's always 0.

@@ +151,5 @@
> +   * will start to checker board. Checker boarding is when the compositor
> +   * tries to composite a tile and it is not available. Historically
> +   * a tile with a checker board pattern was used. Now a blank tile is used.
> +   */
> +  bool aboutToCheckerboard(const FrameMetrics& aContentMetrics,

AboutToCheckerboard*

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +320,4 @@
>  }
>  
>  AsyncPanZoomController::~AsyncPanZoomController() {
> +

Is MOZ_ASSERT(!mSharedFrameMetricsBuffer || mCrossProcessCompositorParent); appropriate here?

::: gfx/layers/ipc/AsyncPanZoomController.h
@@ +522,4 @@
>  
>    uint64_t mLayersId;
>    nsRefPtr<CompositorParent> mCompositorParent;
> +  PCompositorParent* mCrossProcessCompositorParent;

and how does this differ from mCompositorParent?
Also you don't keep a reference here. Are you sure this object will always outlive this reference?

::: gfx/layers/ipc/CompositorChild.h
@@ +75,2 @@
>  private:
> +  // Struct used to store the shared data in a hash table

I don't think this comment or this name is very descriptive. Perhaps |SharedFrameMetrics|?

@@ +75,4 @@
>  private:
> +  // Struct used to store the shared data in a hash table
> +  struct FrameMetricsData {
> +    // Pointer the class that allows access to the shared memory that contains

nit: grammar

@@ +104,5 @@
>    nsCOMPtr<nsIObserver> mMemoryPressureObserver;
>  
> +  // The ViewID of the FrameMetrics is used as the key for this hash table.
> +  // While this generally should be safe to use since the ViewID is unique
> +  // for a instance of a layer tree, It may be possible for a stale

an* it*

@@ +105,5 @@
>  
> +  // The ViewID of the FrameMetrics is used as the key for this hash table.
> +  // While this generally should be safe to use since the ViewID is unique
> +  // for a instance of a layer tree, It may be possible for a stale
> +  // shared FrameMetrics to be used for a draw update.

You mention a problem but not how it's resolved. If this isn't solved then well need to file a bug and link in the code base. Is it possible to fix, assert or mitigate this problem? I know in some places we have the concept of a generation as well we add to the id.

::: gfx/layers/ipc/CompositorParent.h
@@ +201,5 @@
>      CompositorParent* mParent;
> +    // Pointer to the CrossProcessCompositorParent. Used by APZCs to share
> +    // their FrameMetrics with the corresponding child process that holds
> +    // the PCompositorChild
> +    PCompositorParent* mCrossProcessParent;

How does mCrossProcessParent differ from mParent?
Comment on attachment 8334955 [details] [diff] [review]
Work in progress 8

Review of attachment 8334955 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/client/TiledContentClient.cpp
@@ +107,5 @@
> +  : mLayerManager(aManager)
> +  , mLastProgressiveUpdateWasLowPrecision(false)
> +  , mProgressiveUpdateWasInDanger(false)
> +{
> +

MOZ_COUNT_CTOR

@@ +129,5 @@
> +    ScreenRect& aCompositionBounds,
> +    CSSToScreenScale& aZoom)
> +{
> +  if (!aLayer) {
> +    return false;

Is this code not being merged with the old code to calculate this?

@@ +325,5 @@
>  
>  /* static */ BasicTiledLayerBuffer
>  BasicTiledLayerBuffer::OpenDescriptor(ISurfaceAllocator *aAllocator,
> +                                      const SurfaceDescriptorTiles& aDescriptor,
> +                                      BasicTiledLayerBufferHelper* aHelper)

BTLBH be merged into SDT? Do we ever have a SurfaceDescriptorTiles without a BTLBH?

@@ +597,4 @@
>    // caused by there being an incoming, more relevant paint.
>    gfx::Rect viewport;
>    float scaleX, scaleY;
> +#if defined(MOZ_WIDGET_ANDROID)

These two block should be able to share a lot more code. Ideally they should call a different method but everything else, or nearly, should be shared.

::: gfx/layers/client/TiledContentClient.h
@@ +120,5 @@
>  
> +class BasicTiledLayerBufferHelper
> +{
> +public:
> +  BasicTiledLayerBufferHelper(ClientLayerManager* aManager);

Nevermind, initializers make it hard to grep.
(In reply to Benoit Girard (:BenWa) from comment #47)
> Comment on attachment 8334955 [details] [diff] [review]
> Work in progress 8
> 
> Review of attachment 8334955 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I did a review of the header changes but I'm not done.
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +214,5 @@
> > +    layer = layer->GetParent();
> > +    contentMetrics = layer ? &(layer->GetFrameMetrics()) : contentMetrics;
> > +  }
> > +
> > +  NS_WARN_IF_FALSE(!contentMetrics->mCompositionBounds.IsEmpty(),
> 
> Is this appropriate as a warning? Sounds from the message that it should be
> an assert.
> 

I don't know what qualifies as an assert vs. a warning. I had nothing before, an earlier review asked that I add a warning. I can make it an assert but I'm not confident that an empty composition bounds is not a valid possibility.

> ::: gfx/layers/client/TiledContentClient.h
> @@ +118,4 @@
> >  class ClientTiledThebesLayer;
> >  class ClientLayerManager;
> >  
> > +class BasicTiledLayerBufferHelper
> 
> This doesn't need 'basic' in the name. In GFX land we use basic to mean
> software/CPU based rasterization. Helper isn't a descriptive suffix but I'm
> not sure what to suggest.
> 
> As I understand this class is just to encapsulate the complex FrameMetrics
> calculations. Maybe TiledLayerFrameMetricsHelper?
> 

Okay, I just meant for the name to reflect the class it was helping, I will change it to SharedFrameMetricsHelper.

> @@ +120,5 @@
> >  
> > +class BasicTiledLayerBufferHelper
> > +{
> > +public:
> > +  BasicTiledLayerBufferHelper(ClientLayerManager* aManager);
> 
> Is this always constructed with null? Does it need a parameter to the
> constructor then?
> 

Good catch, the ClientLayerManager was left over cruft that isn't needed any more. Removed.

> @@ +139,5 @@
> > +
> > +  /**
> > +   * When a shared FrameMetrics can not be found for a given layer,
> > +   * this function is used to find the first non-empty composition bounds
> > +   * by traversing up the layer tree.
> 
> Can you document the return code? I don't know what it does except that it's
> always 0.
>

Actually, I'll remove it since it is always false, originally there was a case that returned true.
 
> @@ +151,5 @@
> > +   * will start to checker board. Checker boarding is when the compositor
> > +   * tries to composite a tile and it is not available. Historically
> > +   * a tile with a checker board pattern was used. Now a blank tile is used.
> > +   */
> > +  bool aboutToCheckerboard(const FrameMetrics& aContentMetrics,
> 
> AboutToCheckerboard*
> 

Fixed.

> ::: gfx/layers/ipc/AsyncPanZoomController.cpp
> @@ +320,4 @@
> >  }
> >  
> >  AsyncPanZoomController::~AsyncPanZoomController() {
> > +
> 
> Is MOZ_ASSERT(!mSharedFrameMetricsBuffer || mCrossProcessCompositorParent);
> appropriate here?
> 

No, if the layer is not doing progressive painting then the shared frame metrics is never created since it isn't needed. I'll add a comment.

> ::: gfx/layers/ipc/AsyncPanZoomController.h
> @@ +522,4 @@
> >  
> >    uint64_t mLayersId;
> >    nsRefPtr<CompositorParent> mCompositorParent;
> > +  PCompositorParent* mCrossProcessCompositorParent;
> 
> and how does this differ from mCompositorParent?
> Also you don't keep a reference here. Are you sure this object will always
> outlive this reference?
> 

Since it is a PCompositorParent and not a CompositorParent, it can not be ref counted. It should live as long as the content process it is paired with. I can look again as see if there is a way to insure it is being accessed properly.

> ::: gfx/layers/ipc/CompositorChild.h
> @@ +75,2 @@
> >  private:
> > +  // Struct used to store the shared data in a hash table
> 
> I don't think this comment or this name is very descriptive. Perhaps
> |SharedFrameMetrics|?

Updated comment and class name.

> 
> @@ +75,4 @@
> >  private:
> > +  // Struct used to store the shared data in a hash table
> > +  struct FrameMetricsData {
> > +    // Pointer the class that allows access to the shared memory that contains
> 
> nit: grammar
> 

Fixed.

> @@ +104,5 @@
> >    nsCOMPtr<nsIObserver> mMemoryPressureObserver;
> >  
> > +  // The ViewID of the FrameMetrics is used as the key for this hash table.
> > +  // While this generally should be safe to use since the ViewID is unique
> > +  // for a instance of a layer tree, It may be possible for a stale
> 
> an* it*
>

Fixed.
 
> @@ +105,5 @@
> >  
> > +  // The ViewID of the FrameMetrics is used as the key for this hash table.
> > +  // While this generally should be safe to use since the ViewID is unique
> > +  // for a instance of a layer tree, It may be possible for a stale
> > +  // shared FrameMetrics to be used for a draw update.
> 
> You mention a problem but not how it's resolved. If this isn't solved then
> well need to file a bug and link in the code base. Is it possible to fix,
> assert or mitigate this problem? I know in some places we have the concept
> of a generation as well we add to the id.
> 

Actually, since https://bugzilla.mozilla.org/show_bug.cgi?id=900092 has landed it isn't a problem. I'll remove the last part of the comment that is no longer true.

> ::: gfx/layers/ipc/CompositorParent.h
> @@ +201,5 @@
> >      CompositorParent* mParent;
> > +    // Pointer to the CrossProcessCompositorParent. Used by APZCs to share
> > +    // their FrameMetrics with the corresponding child process that holds
> > +    // the PCompositorChild
> > +    PCompositorParent* mCrossProcessParent;
> 
> How does mCrossProcessParent differ from mParent?

The child for mCrossProcessParent lives in a content process. in B2G, the child for mParent is in the b2g process which is not the process the APZC needs to share its FrameMetrics with. I actually just realized that something got lost in the numerous rewrites that the APZC should fall back to use the mParent if there isn't a mCrossProcessParent since that means that content is in a different thread instead of a separate process.
(In reply to Benoit Girard (:BenWa) from comment #48)
> Comment on attachment 8334955 [details] [diff] [review]
> Work in progress 8
> 
> Review of attachment 8334955 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +107,5 @@
> > +  : mLayerManager(aManager)
> > +  , mLastProgressiveUpdateWasLowPrecision(false)
> > +  , mProgressiveUpdateWasInDanger(false)
> > +{
> > +
> 
> MOZ_COUNT_CTOR
> 

Fixed.

> @@ +129,5 @@
> > +    ScreenRect& aCompositionBounds,
> > +    CSSToScreenScale& aZoom)
> > +{
> > +  if (!aLayer) {
> > +    return false;
> 
> Is this code not being merged with the old code to calculate this?
> 

The code that does this on fennec is written in Java. I was under the impression that when/if fennec moved to using APZCs, it would move to using this path as well.

> @@ +325,5 @@
> >  
> >  /* static */ BasicTiledLayerBuffer
> >  BasicTiledLayerBuffer::OpenDescriptor(ISurfaceAllocator *aAllocator,
> > +                                      const SurfaceDescriptorTiles& aDescriptor,
> > +                                      BasicTiledLayerBufferHelper* aHelper)
> 
> BTLBH be merged into SDT? Do we ever have a SurfaceDescriptorTiles without a
> BTLBH?
> 

I don't follow. Are you referring to future/in progress work related to BasicTiledLayerBuffer and SurfaceDescriptorTiles?

> @@ +597,4 @@
> >    // caused by there being an incoming, more relevant paint.
> >    gfx::Rect viewport;
> >    float scaleX, scaleY;
> > +#if defined(MOZ_WIDGET_ANDROID)
> 
> These two block should be able to share a lot more code. Ideally they should
> call a different method but everything else, or nearly, should be shared.
>

I wrote this with the assumption that https://bugzilla.mozilla.org/show_bug.cgi?id=912148 will land first. Once it lands, I was going to collapse it. I'm still assuming it will land first but maybe not.
Attached patch Work in progress 9 (obsolete) — Splinter Review
Updated to address comments.
Attachment #8334955 - Attachment is obsolete: true
Attachment #8334955 - Flags: review?(chrislord.net)
Attachment #8334955 - Flags: review?(bgirard)
Attachment #8336486 - Flags: review?(bgirard)
(In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from comment #45)
> (In reply to Randall Barker from comment #44)
> > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > 3) A tab is opened with a new page. (new ViewID 1 APZCId 2)
> 
> At this point, is APZCId 1 still around? Or has it been destroyed? If it has
> been destroyed then step 5 really happens (in particular, before APZCId 3 is
> created) and so there should be no problem. If APZCId 1 is still around, see
> my next comment.
> 
> > 4) Switch back to original page. Page layout cause new APCZ to be created
> > with ViewID 1 and APZCId 3.
> 
> If APZCId 1 is still around, then this part is unexpected to me. I would
> expect the original page to pick up the same APZC as it was using before,
> because of the code at [1] (this was added recently but you should have it
> if rebased to pick up 900092).
> 

A new APZC is definitely being created as seen in the increase in the APZCId but not the ViewID. I logged the content process and saw:

1) ViewID 1 APZCId 1 FrameMetrics received. 
2) ViewID 1 APZCId 3 FrameMetrics received (previous version deleted)
3) Remove ViewID APZCId 1 message received, ignored because it has already been replaced.


> > Creating new FrameMetrics with same ViewID
> > causes ViewID 1 and APZCId 1 FrameMetrics to be deleted from hash table in
> > content process.
> > 5) Compositor process destroys APZC with ViewID 1 and APZCId 1 sending a
> > message to Content process to remove associated FrameMetrics
> > 6) Content process receives message to remove FrameMetrics with ViewID 1 and
> > aAPZCId 1, looks up FrameMetrics with ViewID 1, but sees that the APZCIds do
> > not match so the FrameMetrics is not removed.
> > 
> > I don't see a way around this. I think the APCZId is still needed.
> 
> Even if for some reason my expectations above are wrong, you might still be
> able to get rid of APZCId by moving the call to ShareCompositorFrameMetrics.
> That is, instead of doing it in NotifyLayersUpdated (which runs on the new
> APZC before the old one is destroyed), you can call it directly from
> APZCTreeManager after running through the destroy list, somewhere around [2]
> (you'd have to walk the full APZC tree at that point and call Share on each
> one). That would ensure the old one is always deleted before the new one is
> shared, which should make the APZCId guard unnecessary. I'd be fine with
> trying this in a follow-up patch or bug so that it doesn't hold up this
> patch from landing.
> 
> [1]
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> APZCTreeManager.cpp?rev=1e1b514a0f4b#124
> [2]
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> APZCTreeManager.cpp?rev=1e1b514a0f4b#98

While I completely understand the desire to get rid of unnecessary IDs, delaying when the FrameMetrics is shared feels very fragile to me. I'm okay with going with this approach but I'm afraid it would be one refactor away from breaking. My experience is that it is good to avoid APIs that break when the call order is changed.
Sorry for not providing better review times. I can imagine this patch is fairly painful to work on as-is. :(

(In reply to Randall Barker from comment #49)
> Since it is a PCompositorParent and not a CompositorParent, it can not be
> ref counted. It should live as long as the content process it is paired
> with. I can look again as see if there is a way to insure it is being
> accessed properly.

I'm not sure why we have PCompositorParent. Typically we only have one implementation of P<X> which is <X>. The distinction is a bit confusing/useless but I noticed that there's also other code that does this. I guess PCompositorParent says 'I don't need any of the methods that aren't mention in the protocol'. Alright fine :)
(In reply to Randall Barker from comment #49)
> > ::: gfx/layers/client/TiledContentClient.cpp
> > @@ +214,5 @@
> > > +    layer = layer->GetParent();
> > > +    contentMetrics = layer ? &(layer->GetFrameMetrics()) : contentMetrics;
> > > +  }
> > > +
> > > +  NS_WARN_IF_FALSE(!contentMetrics->mCompositionBounds.IsEmpty(),
> > 
> > Is this appropriate as a warning? Sounds from the message that it should be
> > an assert.
> > 
> 
> I don't know what qualifies as an assert vs. a warning. I had nothing
> before, an earlier review asked that I add a warning. I can make it an
> assert but I'm not confident that an empty composition bounds is not a valid
> possibility.
> 

IMO if our code breaks we need to make that critical. We need to decide that either it can't happen and assert (and ideally explain why we think this properly always holds), or have a good fallback in place. Feel free to convince me otherwise.
Comment on attachment 8336486 [details] [diff] [review]
Work in progress 9

Review of attachment 8336486 [details] [diff] [review]:
-----------------------------------------------------------------

Some more questions for you as I try to understand the code and make sure it's clear. Sorry this code is pretty bad and I'm hoping we don't make it worse here.

::: gfx/layers/client/TiledContentClient.cpp
@@ +126,5 @@
> +    bool aLowPrecision,
> +    ScreenRect& aCompositionBounds,
> +    CSSToScreenScale& aZoom)
> +{
> +  if (!aLayer) {

Does it make sense to pass aLayer as null? IMO this should never be null in which case we should MOZ_ASSERT.

@@ +600,1 @@
>    if (mManager->ProgressiveUpdateCallback(!staleRegion.Contains(aInvalidRegion),

// TODO remove java progressive callback and use b2g codepath.

@@ +600,4 @@
>    if (mManager->ProgressiveUpdateCallback(!staleRegion.Contains(aInvalidRegion),
>                                            viewport,
>                                            scaleX, scaleY, !drawingLowPrecision)) {
> +#else

Sorry I think the ifdef here is too ugly to land as it. Here's a suggestion.

Introduce a |bool abortPaint = false;| above before ifdef. Then this should become:

#ifdef android
  abortPaint = mManager->ProgressiveUpdateCallback(...)
#else
  ScreenRect compositionBounds;
  CSSToScreenScale zoom;

  if (mSharedFrameMetricsHelper) {
    abortPaint = mSharedFrameMetricsHelper->UpdateFromCompositorFrameMetrics(...)
  } else {
    abortPaint = true;
  }
  viewport = compositionBounds.ToUnknownRect();
  scaleX = scaleY = zoom.scale;
#endif

  if (abortPaint) {
    // label etc...

::: gfx/layers/ipc/CompositorChild.cpp
@@ +151,5 @@
> +  data->mAPZCId = aAPZCId;
> +  FrameMetrics* frame = static_cast<FrameMetrics*>(data->mBuffer->memory());
> +  if (frame) {
> +    mFrameMetricsTable.Put(frame->mScrollId, data);
> +  }

Why would frame be null here? We don't need to lock to read frame here?

@@ +161,5 @@
> +    const ViewID& aId,
> +    const uint32_t& aAPZCId)
> +{
> +  SharedFrameMetricsData* data = mFrameMetricsTable.Get(aId);
> +  if (data && (data->mAPZCId == aAPZCId)) {

What happens if we get asked to remove something but we don't find it? Sounds like an error condition to me.

::: gfx/layers/ipc/CompositorChild.h
@@ +77,5 @@
> +  struct SharedFrameMetricsData {
> +    // Pointer to the class that allows access to the shared memory that contains
> +    // the shared FrameMetrics
> +    mozilla::ipc::SharedMemoryBasic* mBuffer;
> +    CrossProcessMutex* mMutex;

IMO we can encapsulate this better. Turning this into a class (private access) and allow a caller to pass in a pointer/reference to a mutable FrameMetrics. This class will make sure we always lock, copy, unlock.
Attachment #8336486 - Flags: review?(bgirard) → review-
(In reply to Benoit Girard (:BenWa) from comment #55)
> Comment on attachment 8336486 [details] [diff] [review]
> Sorry this code is pretty bad and I'm hoping we don't make it
> worse here.

That might come across badly. I meant the existing code, not your patch :).
(In reply to Benoit Girard (:BenWa) from comment #53)
> Sorry for not providing better review times. I can imagine this patch is
> fairly painful to work on as-is. :(
> 
> (In reply to Randall Barker from comment #49)
> > Since it is a PCompositorParent and not a CompositorParent, it can not be
> > ref counted. It should live as long as the content process it is paired
> > with. I can look again as see if there is a way to insure it is being
> > accessed properly.
> 
> I'm not sure why we have PCompositorParent. Typically we only have one
> implementation of P<X> which is <X>. The distinction is a bit
> confusing/useless but I noticed that there's also other code that does this.
> I guess PCompositorParent says 'I don't need any of the methods that aren't
> mention in the protocol'. Alright fine :)

We will have one CompositorParent which will connect to the main b2g process, but then we will have one CrossProcessCompositorParent for each content process. Unfortunately CrossProcessCompositorParent is not a public API (see gfx/layers/ipc/CompositorParent.cpp). I had to use the PCompositorParent interface because that was the only part of the CrossProcessCompositorParent that was public. I used the PCompositorParent interface to access IPC and send the shared FrameMetrics back to the associated content process. If you know of a better route for getting the shared data back to the content process I would like to know about it.
No that's fine
(In reply to Benoit Girard (:BenWa) from comment #55)
> Comment on attachment 8336486 [details] [diff] [review]
> Work in progress 9
> 
> Review of attachment 8336486 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Some more questions for you as I try to understand the code and make sure
> it's clear. Sorry this code is pretty bad and I'm hoping we don't make it
> worse here.
> 
> ::: gfx/layers/client/TiledContentClient.cpp
> @@ +126,5 @@
> > +    bool aLowPrecision,
> > +    ScreenRect& aCompositionBounds,
> > +    CSSToScreenScale& aZoom)
> > +{
> > +  if (!aLayer) {
> 
> Does it make sense to pass aLayer as null? IMO this should never be null in
> which case we should MOZ_ASSERT.
> 

Old habits die hard, fixed.

> @@ +600,1 @@
> >    if (mManager->ProgressiveUpdateCallback(!staleRegion.Contains(aInvalidRegion),
> 
> // TODO remove java progressive callback and use b2g codepath.
> 
> @@ +600,4 @@
> >    if (mManager->ProgressiveUpdateCallback(!staleRegion.Contains(aInvalidRegion),
> >                                            viewport,
> >                                            scaleX, scaleY, !drawingLowPrecision)) {
> > +#else
> 
> Sorry I think the ifdef here is too ugly to land as it. Here's a suggestion.
> 
> Introduce a |bool abortPaint = false;| above before ifdef. Then this should
> become:
> 
> #ifdef android
>   abortPaint = mManager->ProgressiveUpdateCallback(...)
> #else
>   ScreenRect compositionBounds;
>   CSSToScreenScale zoom;
> 
>   if (mSharedFrameMetricsHelper) {
>     abortPaint =
> mSharedFrameMetricsHelper->UpdateFromCompositorFrameMetrics(...)
>   } else {
>     abortPaint = true;
>   }
>   viewport = compositionBounds.ToUnknownRect();
>   scaleX = scaleY = zoom.scale;
> #endif
> 
>   if (abortPaint) {
>     // label etc...
> 

Done.

> ::: gfx/layers/ipc/CompositorChild.cpp
> @@ +151,5 @@
> > +  data->mAPZCId = aAPZCId;
> > +  FrameMetrics* frame = static_cast<FrameMetrics*>(data->mBuffer->memory());
> > +  if (frame) {
> > +    mFrameMetricsTable.Put(frame->mScrollId, data);
> > +  }
> 
> Why would frame be null here? We don't need to lock to read frame here?
> 

If the buffer failed to map it will return null. I'll change it to an assert. 

I could lock, but the value of mScrollId doesn't change once it has been initially set so there should be no danger of reading it in a partially set state.

> @@ +161,5 @@
> > +    const ViewID& aId,
> > +    const uint32_t& aAPZCId)
> > +{
> > +  SharedFrameMetricsData* data = mFrameMetricsTable.Get(aId);
> > +  if (data && (data->mAPZCId == aAPZCId)) {
> 
> What happens if we get asked to remove something but we don't find it?
> Sounds like an error condition to me.
> 

This is not an error. See end of comment 27 and comment 52 for more detail. It is possible for two APZCs to exist with the same ViewID for a short period of time. The creation of the second APZC cause the shared FrameMetrics of the first APZC to be automatically removed from the hash table. When the first APZC gets deleted and sends the release FrameMetrics message, the target FrameMetrics is no longer in the table.

> ::: gfx/layers/ipc/CompositorChild.h
> @@ +77,5 @@
> > +  struct SharedFrameMetricsData {
> > +    // Pointer to the class that allows access to the shared memory that contains
> > +    // the shared FrameMetrics
> > +    mozilla::ipc::SharedMemoryBasic* mBuffer;
> > +    CrossProcessMutex* mMutex;
> 
> IMO we can encapsulate this better. Turning this into a class (private
> access) and allow a caller to pass in a pointer/reference to a mutable
> FrameMetrics. This class will make sure we always lock, copy, unlock.

Fixed.
Attached patch Work in progress 10 (obsolete) — Splinter Review
Updated to address comments.
Attachment #8336486 - Attachment is obsolete: true
Attachment #8338762 - Flags: review?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #54)
> (In reply to Randall Barker from comment #49)
> > > ::: gfx/layers/client/TiledContentClient.cpp
> > > @@ +214,5 @@
> > > > +    layer = layer->GetParent();
> > > > +    contentMetrics = layer ? &(layer->GetFrameMetrics()) : contentMetrics;
> > > > +  }
> > > > +
> > > > +  NS_WARN_IF_FALSE(!contentMetrics->mCompositionBounds.IsEmpty(),
> > > 
> > > Is this appropriate as a warning? Sounds from the message that it should be
> > > an assert.
> > > 
> > 
> > I don't know what qualifies as an assert vs. a warning. I had nothing
> > before, an earlier review asked that I add a warning. I can make it an
> > assert but I'm not confident that an empty composition bounds is not a valid
> > possibility.
> > 
> 
> IMO if our code breaks we need to make that critical. We need to decide that
> either it can't happen and assert (and ideally explain why we think this
> properly always holds), or have a good fallback in place. Feel free to
> convince me otherwise.

Updated to be a MOZ_ASSERT.
Comment on attachment 8338762 [details] [diff] [review]
Work in progress 10

Yay!
Attachment #8338762 - Flags: review?(bgirard) → review+
Attached patch progressive.patch (obsolete) — Splinter Review
Rebased patched. carry forward reviews from :BenWa, :kats, and :CWiiis
Attachment #802762 - Attachment is obsolete: true
Attachment #828710 - Attachment is obsolete: true
Attachment #8338762 - Attachment is obsolete: true
Attached patch progressive.patch (obsolete) — Splinter Review
Previous patch still had pref("layers.force-tiles", true); incorrectly set.
Attachment #8339536 - Attachment is obsolete: true
Note that I asked Cwiiis to rename that preference to layers.enable-tiles in bug 915673.
(In reply to Randall Barker from comment #52)
> (In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from
> comment #45)
> > (In reply to Randall Barker from comment #44)
> > > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > > 3) A tab is opened with a new page. (new ViewID 1 APZCId 2)
> > 
> > At this point, is APZCId 1 still around? Or has it been destroyed? If it has
> > been destroyed then step 5 really happens (in particular, before APZCId 3 is
> > created) and so there should be no problem. If APZCId 1 is still around, see
> > my next comment.
> > 
> > > 4) Switch back to original page. Page layout cause new APCZ to be created
> > > with ViewID 1 and APZCId 3.
> > 
> > If APZCId 1 is still around, then this part is unexpected to me. I would
> > expect the original page to pick up the same APZC as it was using before,
> > because of the code at [1] (this was added recently but you should have it
> > if rebased to pick up 900092).
> > 
> 
> A new APZC is definitely being created as seen in the increase in the APZCId
> but not the ViewID. I logged the content process and saw:
> 
> 1) ViewID 1 APZCId 1 FrameMetrics received. 
> 2) ViewID 1 APZCId 3 FrameMetrics received (previous version deleted)
> 3) Remove ViewID APZCId 1 message received, ignored because it has already
> been replaced.
> 

Please file a separate bug with STR on how to reproduce the scenario you describe here and in comment 44. I would like to investigate why a new APZC is being created here when it should be reusing the old one.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #66)
> (In reply to Randall Barker from comment #52)
> > (In reply to (PTO Nov21-Dec01) Kartikaya Gupta (email:kats@mozilla.com) from
> > comment #45)
> > > (In reply to Randall Barker from comment #44)
> > > > 1) Compositor process creates APZC with ViewID 1 and APZCId 1
> > > > 2) Content process registers FrameMetrics with ViewID 1 and APZCId 1
> > > > 3) A tab is opened with a new page. (new ViewID 1 APZCId 2)
> > > 
> > > At this point, is APZCId 1 still around? Or has it been destroyed? If it has
> > > been destroyed then step 5 really happens (in particular, before APZCId 3 is
> > > created) and so there should be no problem. If APZCId 1 is still around, see
> > > my next comment.
> > > 
> > > > 4) Switch back to original page. Page layout cause new APCZ to be created
> > > > with ViewID 1 and APZCId 3.
> > > 
> > > If APZCId 1 is still around, then this part is unexpected to me. I would
> > > expect the original page to pick up the same APZC as it was using before,
> > > because of the code at [1] (this was added recently but you should have it
> > > if rebased to pick up 900092).
> > > 
> > 
> > A new APZC is definitely being created as seen in the increase in the APZCId
> > but not the ViewID. I logged the content process and saw:
> > 
> > 1) ViewID 1 APZCId 1 FrameMetrics received. 
> > 2) ViewID 1 APZCId 3 FrameMetrics received (previous version deleted)
> > 3) Remove ViewID APZCId 1 message received, ignored because it has already
> > been replaced.
> > 
> 
> Please file a separate bug with STR on how to reproduce the scenario you
> describe here and in comment 44. I would like to investigate why a new APZC
> is being created here when it should be reusing the old one.

https://bugzilla.mozilla.org/show_bug.cgi?id=945342
Attached patch Progressive Paint rebase 1 (obsolete) — Splinter Review
Rebased patch.
Attachment #8339538 - Attachment is obsolete: true
Updated mutex patch to be in correct format.
Attachment #8334960 - Attachment is obsolete: true
Attachment #8334960 - Flags: review?(khuey)
Updated patch to be correctly formatted.
Attachment #8341180 - Attachment is obsolete: true
Add in missing file.
Attachment #8345036 - Attachment is obsolete: true
Attachment #8345388 - Flags: review?(khuey)
Un-bitrot.
Attachment #8345038 - Attachment is obsolete: true
Actually include patch this time.
Attachment #8345545 - Attachment is obsolete: true
Comment on attachment 8345388 [details] [diff] [review]
Part 1: Linux cross process mutex implementation;

Review of attachment 8345388 [details] [diff] [review]:
-----------------------------------------------------------------

Please replace all the NS_RUNTIMEABORTs with MOZ_CRASH.  There's no need to put in a string, the line numbers will tell us where things are going wrong.

Similarly, replace NS_ASSERTION with MOZ_ASSERT (but keep the strings in this case).

This will use a page of memory per mutex, but changing that would be hard, so lets not worry about that unless it becomes a problem.

r=me
Attachment #8345388 - Flags: review?(khuey) → review+
Addressed comments from :khuey. Cary r+ forward.
Attachment #8345388 - Attachment is obsolete: true
Unbitrot carry r+s forward.
Attachment #8345547 - Attachment is obsolete: true
Keywords: checkin-needed
Rebased to tip of inbound.
Attachment #8346832 - Attachment is obsolete: true
Rebased to tip of inbound.
Attachment #8346833 - Attachment is obsolete: true
Ran try: -b o -p emulator,emulator-jb,unagi,linux_gecko,linux64_gecko,macosx64_gecko -u all -t none
Keywords: checkin-needed
Sorry, but this is bitrotted. Please rebase :(
Keywords: checkin-needed
Keywords: checkin-needed
Rebased to tip.
Attachment #8349015 - Attachment is obsolete: true
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/8a47074c7af5
https://hg.mozilla.org/mozilla-central/rev/4e75d3da1ed3
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment on attachment 8350253 [details] [diff] [review]
Part 2, Enable progressive tile rendering in B2G; r=BenWa,kats,Cwiiis

Review of attachment 8350253 [details] [diff] [review]:
-----------------------------------------------------------------

I just noticed something weird in the code, and traced it back to this bug.

::: b2g/app/b2g.js
@@ +92,5 @@
> +#ifdef MOZ_WIDGET_COCOA
> +pref("layers.progressive-paint", false);
> +#else
> +pref("layers.progressive-paint", false);
> +#endif

I'm not sure this #ifdef is very useful, shouldn't you just set the pref to false here?
Flags: needinfo?(rbarker)
The comment above the ifdef explains why it is split into two parts. When we do eventually turn on progressive painting (bug 994293) it will flip one half of the ifdef to true and leave the other half false.
Flags: needinfo?(rbarker)
(In reply to Jan Keromnes [:janx] from comment #92)
> Comment on attachment 8350253 [details] [diff] [review]
> Part 2, Enable progressive tile rendering in B2G; r=BenWa,kats,Cwiiis
> 
> Review of attachment 8350253 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I just noticed something weird in the code, and traced it back to this bug.
> 
> ::: b2g/app/b2g.js
> @@ +92,5 @@
> > +#ifdef MOZ_WIDGET_COCOA
> > +pref("layers.progressive-paint", false);
> > +#else
> > +pref("layers.progressive-paint", false);
> > +#endif
> 
> I'm not sure this #ifdef is very useful, shouldn't you just set the pref to
> false here?

The issue is that progressive-paint can not be enabled for MacOS X until there is a CrossProcessMutex implementation. Since the the code was merged turned off on all platforms, I left that as a reminder not to enable on MacOS X. Of course only FFOS really needs the CrossProcessMutex right now so the code could be fixed to work on MacOS X as well as long as the content thread and compositor run in the same process. That was not the point of this bug however.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: