Closed Bug 982888 Opened 6 years ago Closed 6 years ago

[Buri] Email content cannot be scrolled in some HTML format with embedded images (note: iframe with transform: scale() applied)

Categories

(Core :: Panning and Zooming, defect)

28 Branch
ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla31
tracking-b2g backlog
Tracking Status
b2g-v1.2 --- unaffected
b2g-v1.3 --- affected
b2g-v1.4 --- affected
b2g-v2.0 --- fixed

People

(Reporter: njpark, Assigned: kats)

References

Details

(Keywords: regression)

Attachments

(5 files, 10 obsolete files)

6.08 KB, message/rfc822
Details
7.67 KB, patch
kats
: review+
Details | Diff | Splinter Review
18.91 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
9.06 KB, patch
kats
: review+
Details | Diff | Splinter Review
7.26 KB, patch
kats
: review+
Details | Diff | Splinter Review
Steps to reproduce
1. Open the email app
2. Select the email from the list
3. When the content opens, DO NOT scroll first, first click the "Show external Images" banner
4. Scroll down

Expected:
message scrolls normally

Actual:
message does not scroll

Note:
If the message is scrolled before clicking the 'show external images' banner, then the message will scroll after images are loaded.  Attached is the email that caused the issue.


Gaia      3005269d4dcabcc7d27eaf72bda44a969873af8c
Gecko     https://hg.mozilla.org/mozilla-central/rev/23005b395ae8
BuildID   20140312040203
Version   30.0a1
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Might be a regression.

QA Wanted to find out if this reproduces on 1.3.
Keywords: qawanted
QA Contact: astole
This issue also occurs on the latest 1.3.

Environmental Variables:
Device: Buri v1.3 Mozilla RIL
BuildID: 20140313004002
Gaia: 6194def5ceed3f4b9bc9de0ea2c11661cd439a27
Gecko: 9368fd13bfa6
Version: 28.0
Firmware Version: V1.2-device.cfg

This issue does not occur on the latest 1.2.

Environmental Variables:
Device: Buri v1.2 Mozilla RIL
BuildID: 20140313004001
Gaia: 539a25e1887b902b8b25038c547048e691bd97f6
Gecko: 9008f899e5af
Version: 26.0
Firmware Version: V1.2-device.cfg
blocking-b2g: --- → 1.3?
Switching to qawanted to find out if this reproduces with APZC disabled.
QA Contact: astole → jschmitt
With APZ off, the bug does not reproduce, I am able to scroll after loading the External Images.
Keywords: qawanted
Alright - let's do a window within the timeframe of the latest build to the time when APZC was first enabled.
Blocks: gaia-apzc
Component: Gaia::E-Mail → Panning and Zooming
Product: Firefox OS → Core
Version: unspecified → 28 Branch
Note: in my repro on v1.4, if I manage to get the reader to scroll by triggering the scroll in the portion of the display above the message body iframe, then everything scrolls okay.

To restate:
- Initially, the only thing dragging my finger around in the iframe does is cause horizontal scrolling of the iframe.
- If I initiate the scroll outside the iframe, things scroll and then scrolling works in general.

The relevant thing is probably that our iframe has a transform: scale() on it because in a pre-APZ world that's the only way we could do pinch-to-zoom stuff on the email with it initially zoomed out.  Maybe now we can do something more clever.

In any event, the DOM situation is this:
- clipping container: clientHeight: 1409px (actually, we had set it to 1408.98 or something because of math and we're jerks), scrollHeight: 3107px
  - iframe: clientHeight: 3107px, scrollHeight: 3107px

The situation there is that since we have used scale() to shrink the apparent size of the iframe we obviously don't need all that leftover whitespace, so we clip it off with the containing DOM.
Summary: [Buri] Email content cannot be scrolled in some HTML format with embedded images → [Buri] Email content cannot be scrolled in some HTML format with embedded images (note: iframe with transform: scale() applied)
(In reply to Jason Smith [:jsmith] from comment #6)
> Alright - let's do a window within the timeframe of the latest build to the
> time when APZC was first enabled.

APZ was first enabled by default on the 01/10/14 1.3 build. Therefore that is when this issue started reproducing without the user having to do anything except run through the STR.

- Last Working -
Device: Buri v1.3 MOZ RIL
BuildID: 20140109004002
Gaia: 22bc6be5b76cdc6d4e9667ff070979041a20ce2f
Gecko: 2c8f8683bd0d
Version: 28.0a2
Firmware Version: V1.2-device.cfg

- First Broken -
Device: Buri v1.3 MOZ RIL
BuildID: 20140110004009
Gaia: c606b129a2d1647c0fc7bfb197555026e9b27f9e
Gecko: c5109884ae3a
Version: 28.0a2
Firmware Version: V1.2-device.cfg

Push log for gaia: https://github.com/mozilla-b2g/gaia/compare/22bc6be5b76cdc6d4e9667ff070979041a20ce2f...c606b129a2d1647c0fc7bfb197555026e9b27f9e
QA Contact: jschmitt → mvaughan
Milan

Please review and help reassign/ fix.
blocking-b2g: 1.3? → 1.3+
Flags: needinfo?(milan)
Providing regression window prior to January 10, 2014 per irc chat with Botond B.

This issue started reproducing on the 12/07/13 1.3 build.

- Last Working -
Device: Buri v1.3 MOZ RIL
BuildID: 20131206040203
Gaia: 8fca2ca67e8a6022fe6ed8cb576e5d59dfb5237f
Gecko: 1401e4b394ad
Version: 28.0a1
Firmware Version: V1.2-device.cfg

- First Broken -
Device: Buri v1.3 MOZ RIL
BuildID: 20131207040203
Gaia: 1abda08e450cb66a61a31bdcfd3352e2df9d9ace
Gecko: 42b2a2adda8f
Version: 28.0a1
Firmware Version: V1.2-device.cfg


**This looks to be a gecko issue**
last working gaia/first broken gecko = REPRO
Gaia:  8fca2ca67e8a6022fe6ed8cb576e5d59dfb5237f
Gecko: 42b2a2adda8f

first broken gaia/last working gecko = NO REPRO
Gaia:  1abda08e450cb66a61a31bdcfd3352e2df9d9ace
Gecko: 1401e4b394ad

Push log: http://hg.mozilla.org//mozilla-central/pushloghtml?fromchange=1401e4b394ad&tochange=42b2a2adda8f
Thanks!

APZ-related bugs that landed in that range: bug 940706, bug 944047, bug 943846.

Given that iframe has a transform applied, and bug 940706 concerns coordinate transforms, I'd say that's the best place to start looking.
I looked into this. Interesting problem!

  - Prior to showing the external images, we have a scrollable
    <div> containing the email contents. Since it's a subframe,
    it starts of with a scroll info layer, let's call it A.

  - The email contents are in an <iframe>, but that does not
    get its own layer, probably because the iframe is sized
    to fit the document it contains.

  - When you show the external images, the contents of the
    iframe become slightly larger than its scroll port (*) 
    and it also gets a scroll info layer, let's call it B.

       (*) This may not be intentional. If it's not, it can
           probably be fixed in the app.

  - Since scroll info layers do not have children, B is not
    a descendant of A in the layer tree.

  - However, since the element to which B corresponds is a
    child of the element to which A corresponds in the DOM,
    B is above A in the layer tree.

  - When you pan on the email contents, APZ hit testing
    finds B. You can pan it by the very small amount by
    which the iframe contents are larger than its scroll
    port. After that, overscroll handoff kicks in.

  - However, the overscroll handoff chain only walks 
    child -> parent links. Since A is not an ancestor of B,
    A is not an element of B's handoff chain, so A does
    not get the overscroll. As a result, A never scrolls.

  - If you scroll A prior to showing the external images,
    A gets a scrollable layer and B becomes its child,
    and later when B gets its own layer, overscroll handoff
    now works.

Amusingly, the reason this appeared to work prior to the
regression range from comment 10 is that due to a race 
condition (bug 944047), the synchronous scrolling code 
from BrowserElementPanning.js was being enabled in spite
of APZ being turned on, and the <div> was being scrolled
synchronously!

It's not clear to me what is the best way to fix this in the
1.3 timeframe. 

  - The proper solution is to get overscroll handoff to 
    consider non-ancestor-descendant relationships in the 
    layer tree where appropriate. I don't think we want
    to do that until bug 918288 is fixed, and we almost 
    certainly don't want to uplift that to 1.3. 

  - Bug 982141 will fix the problem in this case by giving
    A a scrollable layer as soon as the page loads (it won't
    solve the problem in general, though). However, I don't
    imagine we want to uplift bug 982141 to 1.3 either.

  - Assuming that (*) is not intentional, a Gaia fix should
    be possible by ensuring that the iframe sizes itself to
    accommodate its enlarged contents. Then we'd never get a
    scrollable or scroll info layer B for the iframe, and
    there would be no problem. Andrew, do you think this is
    possible?
Flags: needinfo?(bugmail)
Sounds like the APZ changes uncovered an existing problem.  I'd like to hear from Andrew about the proposed (*) solution from comment 12 before proceeding.
Flags: needinfo?(milan)
Assigning to me while we're sorting this out.
Assignee: nobody → milan
Duplicate of this bug: 982088
(In reply to Botond Ballo [:botond] from comment #12)
>   - Assuming that (*) is not intentional, a Gaia fix should
>     be possible by ensuring that the iframe sizes itself to
>     accommodate its enlarged contents. Then we'd never get a
>     scrollable or scroll info layer B for the iframe, and
>     there would be no problem. Andrew, do you think this is
>     possible?

Maybe?  I've tried to simply change our logic so that we ensure that the iframe's clientHeight is the same as its scrollHeight in resizeFrame() in https://github.com/mozilla-b2g/gaia/blob/master/apps/email/js/iframe_shims.js, but it seems like by the time we do that, the damage is done.  We currently perform our fix-up when we get an onload from the images.  We do not know a priori what the size of the images are.  (If we did, the document would already be laid out correctly.)  We obviously could try and load the images before they get loaded into the DOM, affecting its layout.

*However*, on documents that are already tall, even before the user causes the fetch of external/embedded images, we are already broken.  Probably because of our heuristics explicitly clobber the height to estimate the correct size in the first place.  Honestly,  the logic in iframe_shims is a big evolved mess with probably quite a lot of vestigial pieces that are shooting us in the foot.  But it historically worked enough for us to declare victory.

What we really want for that specific goal is seamless iframes, which the many-times-abandoned bug 80713 has bitrotted patches for.  This is what our hack does.


I just tried to see if we could maybe just lose all of e-mail's efforts to implement a fake scrollable APZ thing.  The answers appears to be no.  Specifically, I ripped out all of our sizing and scrolling logic and had us write a meta viewport tag like so:
  <meta name="viewport" content="width=600, minimum-scale=0.5, maximum-scale=2, initial-scale=0.5, user-scalable=1">

I also tried letting the document have a different origin and using srcdoc instead of us sharing the same origin and using contentDocument.write().  While we do get nice, fast panning inside the iframe, we do not get zooming.  The scroll behaviour is also not what we want.  We want the browser-like situation where scrolling up to the top lets us reveal the envelope information and they move as if they were a single continuous document; instead we get the thing where the document only scrolls after the sub-frame is done scrolling.


In a worst-case scenario we could potentially inject our envelope DOM inside the iframe so that would provide us with the coherent scroll-region, but that's going to require an incredible amount of work on the e-mail end and probably cause a ton of regressions too.  Not to mention weirdness however we manage to get the zooming to work.
Flags: needinfo?(bugmail)
(In reply to Andrew Sutherland (:asuth) from comment #16)
> While we do get nice, fast panning inside the iframe, we do not get zooming.

Currently only the root frame inside a process is zoomable. In the browser, each tab is a separate process so that's why tabs are zoomable. In the email app I assume that the email content is in an iframe in the same process as the email app itself, so it is not zoomable.

> The scroll behaviour is also not what we want.  We want the browser-like
> situation where scrolling up to the top lets us reveal the envelope
> information and they move as if they were a single continuous document;
> instead we get the thing where the document only scrolls after the sub-frame
> is done scrolling.

We have support for this inverted-scrolling behaviour in the APZC code; you can enable it by setting a mozscrollgrab="true" attribute on the outer scrollable, and it will scroll before the inner scrollable.
(In reply to Botond Ballo [:botond] from comment #12)
>   - The proper solution is to get overscroll handoff to 
>     consider non-ancestor-descendant relationships in the 
>     layer tree where appropriate. I don't think we want
>     to do that until bug 918288 is fixed, and we almost 
>     certainly don't want to uplift that to 1.3. 

Why the dependency on bug 918288 here?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #17)
> > The scroll behaviour is also not what we want.  We want the browser-like
> > situation where scrolling up to the top lets us reveal the envelope
> > information and they move as if they were a single continuous document;
> > instead we get the thing where the document only scrolls after the sub-frame
> > is done scrolling.
> 
> We have support for this inverted-scrolling behaviour in the APZC code; you
> can enable it by setting a mozscrollgrab="true" attribute on the outer
> scrollable, and it will scroll before the inner scrollable.

s/mozscrollgrab/scrollgrab (see https://bugzilla.mozilla.org/show_bug.cgi?id=912666#c62).

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> (In reply to Botond Ballo [:botond] from comment #12)
> >   - The proper solution is to get overscroll handoff to 
> >     consider non-ancestor-descendant relationships in the 
> >     layer tree where appropriate. I don't think we want
> >     to do that until bug 918288 is fixed, and we almost 
> >     certainly don't want to uplift that to 1.3. 
> 
> Why the dependency on bug 918288 here?

I guess I had in mind that bug 918288 would change hit-testing to do something more sophisticated than the way we currently loop over the APZC tree, and that we could do something similar to that for scroll handoff, but I haven't really thought it through. There may not necesarily be a dependency.
(In reply to Botond Ballo [:botond] from comment #19)
> > Why the dependency on bug 918288 here?
> 
> I guess I had in mind that bug 918288 would change hit-testing to do
> something more sophisticated than the way we currently loop over the APZC
> tree, and that we could do something similar to that for scroll handoff, but
> I haven't really thought it through. There may not necesarily be a
> dependency.

918288 would make some changes to hit testing, but I'm not sure yet that it would change the way we loop over the APZC tree. I think it probably won't, because hit-testing requires knowledge of which layer z-ordering and the current method of looping the tree provides that. Scroll handoff doesn't really care about z-ordering; it cares about the DOM hierarchy which is independent. So I think it makes sense to annotate scrollable container layers with their DOM-parent layer somehow and use that when building the handoff chain.

It might still be too risky to uplift but I think the solution makes sense as a long-term fix.
Timothy, Kats, and I worked out a plan for fixing this on IRC, which Kats will implement.

The idea is to annotate, at display list building / frame building time, a scrollable layer with the scroll id to which it should hand off scroll to, and use these when building the handoff chain rather than the child -> parent links in the layer tree / APZC tree.
Assignee: milan → bugmail.mozilla
The above patches seem to do what I want on my isolation test case at https://people.mozilla.org/~kgupta/bug/982888.html but I want to rebase them on master and test the HTML email content before requesting review. Early feedback is welcome.
So the above patches don't fix the problem in the email app. I think this is happening because my code sets the active scroll id in the builder before we decide whether or not to create a layer for the nsGfxScrollFrame. So in some cases we end up not creating a layer, but the parent scroll id for the children is still pointing to that content element (which didn't get a layer). I'll try doing this a different way.
Attached patch Additional hack (obsolete) — Splinter Review
Rearranging things in nsGfxScrollFrame so that we have access to the shouldBuildLayer flag before calling BuildDisplayListForChild makes things work. However I don't think we can land this because computing shouldBuildLayer (I believe) requires building the child display list first to get things like the scroll range. Will continue exploring other ways to get the right scroll id down to the children.
I discussed this a bit with Botond yesterday and he suggested walking up the frame tree at the time we call RFM, and finding the nearest ancestor scroll frame that was layerized. I tried to implement this but it doesn't seem to be working as expected. I'm not 100% sure why but it seems to be that walking up the frame tree using GetParent() calls provides a different chain of frames than when we walk down calling BuildDisplayListForChild. I will continue exploring other options.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #28)
> I discussed this a bit with Botond yesterday and he suggested walking up the
> frame tree at the time we call RFM, and finding the nearest ancestor scroll
> frame that was layerized. I tried to implement this but it doesn't seem to
> be working as expected. I'm not 100% sure why but it seems to be that
> walking up the frame tree using GetParent() calls provides a different chain
> of frames than when we walk down calling BuildDisplayListForChild. I will
> continue exploring other options.

Walking the frame tree using GetParent() is not what I had in mind. I was thinking about introducing a new 'scroll parent' pointer field to frames, which is populated only for scroll frames, and which is populated using the same 'Auto'/active approach that the existing patches use to set the parent scroll id. Sorry if I wasn't clear about that.
I went back to my original approach and added an extra layer of indirection, which worked.

(In reply to Botond Ballo [:botond] from comment #29)
> Walking the frame tree using GetParent() is not what I had in mind. I was
> thinking about introducing a new 'scroll parent' pointer field to frames,
> which is populated only for scroll frames, and which is populated using the
> same 'Auto'/active approach that the existing patches use to set the parent
> scroll id. Sorry if I wasn't clear about that.

Ah, I misunderstood. I can try this approach to - it will probably be cleaner than this patch I just uploaded.
Attachment #8395905 - Attachment is obsolete: true
Attachment #8396613 - Attachment is obsolete: true
Attached patch Part 2 alternate (obsolete) — Splinter Review
Here is Botond's suggestion for part 2. It's about the same amount of code as the other implementation but I think this one will be less able to deal with changes due to flattening and merging, if any are needed (I'm not sure any changes are needed, but I don't know that code too well). For example, we might return true from IsLayerized() even though the items got merged or flattened into something else, and then we might end up putting the wrong parent id in the layer tree. With my previous patch we can just point the mDelegate field in the provider to a different thing when we do the flatten/merge and that will take care of it.

This approach also adds more stuff to the nsIScrollableFrame interface. In particular exposing the layerization flag doesn't seem like such a great idea to me but I could be convinced otherwise.
Comment on attachment 8395907 [details] [diff] [review]
Part 3 - Use the handoff field when building the handoff chain

Review of attachment 8395907 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed (other than [suggestion]s, which are just that)

::: gfx/layers/composite/APZCTreeManager.cpp
@@ +214,5 @@
>          APZC_LOG("Using APZC %p for layer %p with identifiers %lld %lld\n", apzc, aLayer, aLayersId, container->GetFrameMetrics().mScrollId);
>  
>          apzc->NotifyLayersUpdated(metrics,
>                                    aIsFirstPaint && (aLayersId == aFirstPaintLayersId));
> +        apzc->SetScrollHandoffParent(container->GetScrollHandoffParent());

[suggestion] Have you considered making the parent id a field of FrameMetrics?

@@ +895,5 @@
>    MonitorAutoLock lock(mTreeLock);
>  
>    mOverscrollHandoffChain.clear();
>  
>    // Start with the child -> parent chain.

Let's add to this comment something similar to "Note that to correctly handle scroll info layers, we use the 'scroll handoff parent id' to get to the parent rather than the parent link in the APZC tree."

It might be worth adding a similar note to the comment above APZCTreeManager::DispatchScroll(), perhaps after the "Consider three nested APZCs..." line.

@@ +904,5 @@
>        mOverscrollHandoffChain.clear();
>        return;
>      }
> +    if (apzc->GetScrollHandoffParent() == FrameMetrics::NULL_SCROLL_ID) {
> +      apzc = apzc->GetParent();

Can we warn if an APZC doesn't have a scroll handoff parent and it's not the root for its layers id? I think that would indicate a bug in the layout code for setting scroll handoff parents, which we probably want to know about.

@@ +916,5 @@
> +    AsyncPanZoomController* rootParent = layersIdRoot->GetParent();
> +    while (rootParent && rootParent->GetGuid().mLayersId == layersId) {
> +      layersIdRoot = rootParent;
> +      rootParent = layersIdRoot->GetParent();
> +    }

[suggestion] This loop can be written

while (!layersIdRoot->IsRootForLayersId()) {
    layersIdRoot = layersIdRoot->GetParent();
}

@@ +938,5 @@
> +{
> +  mTreeLock.AssertCurrentThreadOwns();
> +
> +  if (aApzc->GetGuid().mLayersId != aLayersId) {
> +    return nullptr;

If the intention is for this overload of FindTargetAPZC to only ever be called (other than recursively) with a starting APZC whose layers id matches aLayersId, then please document this (or even better, enforce it with an assertion, though I guess that would require splitting this up into a starting function and a recursive helper).

Otherwise this return is wrong, as aApzc could have a parent layers id while aLayersId is a child layers id.
Attachment #8395907 - Flags: review?(botond) → review+
Comment on attachment 8397417 [details] [diff] [review]
Part 4 - Some additional cleanup in AsyncPanZoomController.h

Review of attachment 8397417 [details] [diff] [review]:
-----------------------------------------------------------------

It's too bad we don't have #regions [1] in C++. They'd make a nice fit for this file.

[1] http://msdn.microsoft.com/en-us/library/9a1ybwek.aspx
Attachment #8397417 - Flags: review?(botond) → review+
(In reply to Botond Ballo [:botond] from comment #34)
> [suggestion] Have you considered making the parent id a field of
> FrameMetrics?

I did, and it would probably be less code, but it also feels less correct. It's a property of the layer and isn't really "metrics" related. I also don't want to keep bloating FrameMetrics :)

The rest of your comments all makes sense, I will address them once tn does his review.
(In reply to Botond Ballo [:botond] from comment #35)
> Comment on attachment 8397417 [details] [diff] [review]
> Part 4 - Some additional cleanup in AsyncPanZoomController.h
> 
> Review of attachment 8397417 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It's too bad we don't have #regions [1] in C++. They'd make a nice fit for
> this file.
> 
> [1] http://msdn.microsoft.com/en-us/library/9a1ybwek.aspx

We have to make due with #pragma mark but even then its not that well supported.
Attachment #8395904 - Flags: review?(bgirard) → review+
Comment on attachment 8397207 [details] [diff] [review]
Part 2 - populate the field from layout

We should just need to add the mScrollParentProvider field to nsDisplaySubdocument, no?

Don't we need to do the same thing that we do in ScrollFrameHelper also in sub document frame's BuildDisplayList and nsDisplayList::PaintForFrame (for when we hit the early exit in the scroll frame BuildDisplayList)?
Technically that's not required to fix this bug because this only happens when you have a scroll info layer as the parent and a scroll layer as the child. If you have a nsDisplaySubdocument as the parent, then any scrollable children will actually be children in the layer tree, and so by not setting the scrollparent pointer it will fall back to finding the parent by walking up in the tree.

But yeah, maybe it would be better to set it explicitly so that it's consistent. I'll update the patches.
Talked to tn on IRC and he said it should be ok to rearrange the code in ScrollFrameHelper::BuildDisplayList to get the should build layer flag before building the children. This is basically the original patch I had but cleaned up a little bit.
Attachment #8397207 - Attachment is obsolete: true
Attachment #8397412 - Attachment is obsolete: true
Attachment #8397207 - Flags: review?(tnikkel)
Attachment #8401962 - Flags: review?(tnikkel)
Updated but I changed a little bit so rerequesting review.

(In reply to Botond Ballo [:botond] from comment #34)
> >    // Start with the child -> parent chain.
> 
> Let's add to this comment something similar to "Note that to correctly
> handle scroll info layers, we use the 'scroll handoff parent id' to get to
> the parent rather than the parent link in the APZC tree."

I updated this comment.

> It might be worth adding a similar note to the comment above
> APZCTreeManager::DispatchScroll(), perhaps after the "Consider three nested
> APZCs..." line.

This comment was in AsyncPanZoomController.cpp, and I didn't really know what to add there. The code/comment there doesn't really need to know about how the handoff chain is built, it just needs to know that there is a parent to handoff stuff to.

> Can we warn if an APZC doesn't have a scroll handoff parent and it's not the
> root for its layers id? I think that would indicate a bug in the layout code
> for setting scroll handoff parents, which we probably want to know about.

Done

> [suggestion] This loop can be written
> 
> while (!layersIdRoot->IsRootForLayersId()) {
>     layersIdRoot = layersIdRoot->GetParent();
> }

Ah, good catch. Done. I also augmented the loop a little bit so that in the (common) case where the apzc parent is the same as the handoff parent, we don't do the full tree search. This should make things a little more efficient.

> If the intention is for this overload of FindTargetAPZC to only ever be
> called (other than recursively) with a starting APZC whose layers id matches
> aLayersId, then please document this (or even better, enforce it with an
> assertion, though I guess that would require splitting this up into a
> starting function and a recursive helper).

I added a sentence of documentation above the function.

> Otherwise this return is wrong, as aApzc could have a parent layers id while
> aLayersId is a child layers id.

Not sure what this means exactly after looking at the code. The first check in the function is to see if aApzc's layers id matches aLayersId. Maybe we had different default interpretations of the function, hopefully the documentation clears it up.
Attachment #8395907 - Attachment is obsolete: true
Attachment #8401979 - Flags: review?(botond)
Rebased, carrying r+
Attachment #8397417 - Attachment is obsolete: true
Attachment #8401980 - Flags: review+
Comment on attachment 8401979 [details] [diff] [review]
Part 3 - Use the handoff field when building the handoff chain

Review of attachment 8401979 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comment addressed

> > Otherwise this return is wrong, as aApzc could have a parent layers id while
> > aLayersId is a child layers id.
> 
> Not sure what this means exactly after looking at the code. The first check
> in the function is to see if aApzc's layers id matches aLayersId. Maybe we
> had different default interpretations of the function, hopefully the
> documentation clears it up.

Sorry, I wasn't very clear. I'll illustrate with a concrete example.

Suppose we turn on APZ in the parent process. Consider the following APZC tree:

APZC     process    layers id      scroll id

 A       parent          3             27
 |
 B       child           7             42

Now let's say someone calls FindTargetAPZC(A, 7, 42). The current implementation would return nullptr instead of B.

I think FindTargetAPZC should either be documented as not handling this scenario, or revised to handle this scenario.

::: gfx/layers/composite/APZCTreeManager.cpp
@@ +879,5 @@
>        NS_WARNING("Vector::append failed");
>        mOverscrollHandoffChain.clear();
>        return;
>      }
> +    if (apzc->GetScrollHandoffParent() == FrameMetrics::NULL_SCROLL_ID) {

[suggestion] I think this patch would read slightly better if the accessors were called GetScrollHandoffParentId and SetScrollHandoffParentId. GetScrollHandoffParent sounds like it returns the parent thing (APZC or layer) itself. However, I don't feel strongly about this.
Attachment #8401979 - Flags: review?(botond) → review+
Per a drivers' discussion - we're no longer blocking on any 1.3 specific bugs for Mozilla-specific needs unless there's a special exception for it, so I'm bouncing this to the 1.4 list.
blocking-b2g: 1.3+ → 1.4?
Vance - Can you find out if this is a blocker for TCL or not?
Flags: needinfo?(vchen)
(In reply to Jason Smith [:jsmith] from comment #46)
> Vance - Can you find out if this is a blocker for TCL or not?

Hi Jason -

This one is not reported by Mozilla, I think it is a Moz found bug
Flags: needinfo?(vchen)
(In reply to Vance Chen [:vchen][vchen@mozilla.com] from comment #47)
> (In reply to Jason Smith [:jsmith] from comment #46)
> > Vance - Can you find out if this is a blocker for TCL or not?
> 
> Hi Jason -
> 
> This one is not reported by Mozilla, I think it is a Moz found bug

I know that, but that's not what I'm asking here. I'm asking if TCL sees this as a blocker or not.
Flags: needinfo?(vchen)
If TCL doesn't see this as a blocker, I would rather hold it for post 1.4, as the solution is definitely in the "new feature" land.
blocking-b2g: 1.4? → backlog
Clarified offline - not a blocker by TCL.
Flags: needinfo?(vchen)
Comment on attachment 8401962 [details] [diff] [review]
Part 2 - populate the field from layout (v3)

Can we rename ActiveScrollId to CurrentScrollParent or something? We already have the notion of actively scrolling scroll frames, I don't want this to get confused.

Can you also note somewhere that this won't get the proper scroll parent like one would expect for positioned content but that we are doing that on purpose to be consistent?

I think you only want to use the AutoActiveScrollIdSetter in the subdocument case if we are actually going to be making a scrollable layer (ignoreViewportScrolling controls this). But you also said this is unnecessary (I see why now).
Updated to call the field mScrollHandoffParentId instead of mScrollHandoffParent, and updated all the method names to match. Carrying r+
Attachment #8395904 - Attachment is obsolete: true
Attachment #8404220 - Flags: review+
Addressed review comments
Attachment #8401962 - Attachment is obsolete: true
Attachment #8401962 - Flags: review?(tnikkel)
Attachment #8404221 - Flags: review?(tnikkel)
Addressed review comments. For the FindTargetAPZC function I ended up just removing the separate layersId argument and documenting that it will find the apzc with the same layers id as the aApzc argument. That should reduce the footgun possibilities.

Also updated the fields and methods so that they have Id appended to them.
Attachment #8401979 - Attachment is obsolete: true
Attachment #8404224 - Flags: review+
Attachment #8404221 - Flags: review?(tnikkel) → review+
Included these patches in the try push at https://tbpl.mozilla.org/?tree=Try&rev=d762b0802460 since I was pushing anyway. The last try was green so I don't expect problems.
Depends on: 1039623
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.