Closed Bug 780692 Opened 12 years ago Closed 12 years ago

Throttle interpolation of animations on the main thread when they are being performed on the compositor

Categories

(Core :: Layout, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20
blocking-basecamp +
Tracking Status
firefox19 --- fixed
firefox20 --- fixed
b2g18 --- fixed

People

(Reporter: dzbarsky, Assigned: nrc)

References

(Blocks 2 open bugs)

Details

(Keywords: perf, Whiteboard: [fast:200-400ms], UX-P1)

Attachments

(16 files, 36 obsolete files)

5.02 KB, patch
roc
: review+
dzbarsky
: checkin+
Details | Diff | Splinter Review
2.66 KB, patch
smaug
: review+
dzbarsky
: checkin+
Details | Diff | Splinter Review
7.68 KB, text/plain; charset=UTF-8
Details
828 bytes, text/html
Details
1011 bytes, text/html
Details
16.10 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
35.73 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
15.39 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
28.13 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
4.46 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
16.85 KB, patch
bzbarsky
: review+
dbaron
: review+
Details | Diff | Splinter Review
12.21 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
72.43 KB, patch
nrc
: review+
Details | Diff | Splinter Review
3.91 KB, patch
nrc
: review+
Details | Diff | Splinter Review
101.57 KB, patch
nrc
: review+
Details | Diff | Splinter Review
102.90 KB, patch
Details | Diff | Splinter Review
      No description provided.
Interpolating async animations in parallel on the main thread is chewing a lot of CPU.  In bug 781726, it's causing scheduling contention during app startup, which is leading to the biggest remaining single overhead (executing a JS file).  The timings look like

 animations disabled entirely --- execution takes ~110ms
 main-thread interpolation disabled --- ~140ms
 no changes --- ~340ms

There are of course things we can fix, but I strongly suspect this problem is going to come up again and again.  We'd like to do things in other threads while async animations are running.

I'd like to suggest we reconsider the interpolate-at-full-speed-on-main-thread approach, and indeed also this interpolate-but-at-a-lower-rate "compromise".

What problems is this intended to solve?  My understanding is that there are three
 1. Reflow so that input events that target the animated frame are targeted "correctly"
 2. Return "correct" values from getComputedStyle()
 3. Recompute overflow so we know the "correct" time to draw scrollbars

Fundamentally, when we're drawing pixels to screen that aren't fully described by the main-thread frame tree / display list, problems (1)-(3) above can never be solved fully correctly.  There will always be inconsistencies.

So how can we draw the line between inconsistency and wasted CPU?  I don't really see any reasonable way.  Problems like bug 781726 will continue to make us not want to waste CPU.  And no matter what Hz we run the main-thread interpolation, in parallel, we'll always have inconsistency jank when the main thread gets delayed by content, and the severity of the jank will be proportional to how much content delays us.

Can we find better solutions?  Here's an attempt
 (S1) Set some flag on the frame tree that forces reflow on an incoming input event when there's async-interpolated content.  This will work strictly better than reflow on fixed Hz.
 (S2) Ensure that the first frame is interpolated on the main thread and that the last frame is.  Make no guarantees about getComputedStyle() otherwise.
 (S3) Respecify transform frames to not create overflow.  Or, do what webkit does.

This isn't necessarily the best place to discuss all this, but I'd prefer to start with a smaller group.
Blocks: 781726
(In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
>  (S1) Set some flag on the frame tree that forces reflow on an incoming
> input event when there's async-interpolated content.  This will work
> strictly better than reflow on fixed Hz.

It means we'll constantly be doing main-thread frame-tree updates during mouse/touch events while there are animations. (Not even throttled by refresh driver ... although I suppose we could arrange some throttling.) I guess that's OK though.

>  (S2) Ensure that the first frame is interpolated on the main thread and
> that the last frame is.  Make no guarantees about getComputedStyle()
> otherwise.

I think that's OK, personally.

>  (S3) Respecify transform frames to not create overflow.  Or, do what webkit
> does.

We've discussed this before and on www-style and no-one had a reasonable solution yet. Webkit is efficient, but will show/hide scrollbars at random times depending on when it needs to flush its frame tree.

What about avoiding flushing on the main thread when we know the animation can't possibly induce scrollbars? This would include overflow:hidden, and we could also compute the geometric bounds of the animation and see if that can overflow.

Are there important animations that do overflow and cause scrollbars to be present?
I would prefer to implement a solution where flushing that's triggered by script will force style and layout information on the main thread to be up-to-date (to the most recent refresh driver tick), but where we otherwise don't update for the animation.  This probably requires changing our concept of a flush type to be a little more complex than a sequence of ranked values.

(I'd also note that the patch to animate visibility on the compositor thread is incorrect with either the throttling solution or this solution, as I said in bug 780340.)
Why can't we just add a new flush type?
As part of implementing this, we can add Visibility to Animatable which will allow us to actually interpolate visibility on the compositor thread instead of relying on the main thread to do it for us.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #2)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
> >  (S1) Set some flag on the frame tree that forces reflow on an incoming
> > input event when there's async-interpolated content.  This will work
> > strictly better than reflow on fixed Hz.
> 
> It means we'll constantly be doing main-thread frame-tree updates during
> mouse/touch events while there are animations. (Not even throttled by
> refresh driver ... although I suppose we could arrange some throttling.) I
> guess that's OK though.
> 

I guess this could be a problem with mice, with which we could get a deluge of mousemove even if no listeners care about mousemove.  (I idly move my mouse around all the time, but I never idly drag my finger around on touch screens.)  If mousemove is throttled that should be fine, or we can do something like our mayHaveTouchEventListeners hack.

> >  (S3) Respecify transform frames to not create overflow.  Or, do what webkit
> > does.
> 
> We've discussed this before and on www-style and no-one had a reasonable
> solution yet. Webkit is efficient, but will show/hide scrollbars at random
> times depending on when it needs to flush its frame tree.
> 

I saw the discussion go by but I wasn't capable of responding, sorry :(.  Personally I would prefer to make transformations a render property, not layout.

> What about avoiding flushing on the main thread when we know the animation
> can't possibly induce scrollbars? This would include overflow:hidden, and we
> could also compute the geometric bounds of the animation and see if that can
> overflow.
> 

I thought about that but it seemed fantastically hard to implement to me.
(In reply to David Baron [:dbaron] from comment #3)
> I would prefer to implement a solution where flushing that's triggered by
> script will force style and layout information on the main thread to be
> up-to-date (to the most recent refresh driver tick), but where we otherwise
> don't update for the animation.

Oh --- I naively thought that would happen already, but now I see why it wouldn't.  Yeah, that definitely makes sense to me too.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #5)
> I guess this could be a problem with mice, with which we could get a deluge
> of mousemove even if no listeners care about mousemove.  (I idly move my
> mouse around all the time, but I never idly drag my finger around on touch
> screens.)  If mousemove is throttled that should be fine, or we can do
> something like our mayHaveTouchEventListeners hack.
> 

Or throttle these forced-reflows to the resolution of the refresh driver, which is what we would do with non-async animations anyway.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #5)
> I saw the discussion go by but I wasn't capable of responding, sorry :(. 
> Personally I would prefer to make transformations a render property, not
> layout.

I suspect Web compatibility prevents that now.

> > What about avoiding flushing on the main thread when we know the animation
> > can't possibly induce scrollbars? This would include overflow:hidden, and we
> > could also compute the geometric bounds of the animation and see if that can
> > overflow.
> 
> I thought about that but it seemed fantastically hard to implement to me.

It doesn't seem terribly hard to detect when the transformed element's nearest scrollable ancestor is overflow:hidden. I expect you could use that to make most B2G animations efficient.
dz, do you understand what's required here?  Want to take a stab at a patch?
I do understand, but I won't have time to work on this until the 20th.
In our "touch" interfaces, we don't show persistent scrollbars, only "scroll indicators", and only when the user is actively panning around.  So even synchronously-interpolated animations would never cause the indicators to be shown, on their own.  So that's a somewhat weaselly way to skip the ancestor-might-draw-scrollbars check.

With that, I think the new required feature is the <waving-hands> frame-tree dirty bit to trigger artificial refresh driver tick on style flush when there's async-interpolated content.  Can you sketch out what that implementation might look like?
1. You would want to add a new flush type to mozFlushType.h, something like Flush_Animations.  It should be right after Flush_Style.  Also add it to PresShell::FlushPendingNotifications.  In FlushPendingNotifications, add calls to mPresContext->Animation/TransitionManager()->FlushAnimations() when we have an animation flush.
2. Then, in nsComputedDOMStyle.cpp flush animations instead of just flushing style.
3. In AddAnimationsForProperty, set this frame-needs-animations bit when adding an animation of a frame.
4. In CommonAnimationManager, add a FlushAnimations function that traverses the list of animations, and interpolates animations/calls RestyleElementForAnimation on any elements whose frames have the dirty bit set.
5. Flush animations when we have touch/mouse events, perhaps throttling to the refresh driver rate (which is lower in background tabs).  I'm not quite sure how this part works.
5. In nsAnimationManager/nsTransitionManager, suppress the RestyleElementForAnimation calls under WillNotify.

There may be other places that will require flushing animations, I'm not too sure.
Depends on: 783893
Attachment #653913 - Flags: review?(roc)
Attached patch Part 2: flush for touch events (obsolete) — Splinter Review
These are the same events that nsFrame::HandleEvent is checking for.
Attachment #653947 - Flags: review?(bugs)
Comment on attachment 653913 [details] [diff] [review]
Part 1: Add an animation flush type

Review of attachment 653913 [details] [diff] [review]:
-----------------------------------------------------------------

This approach looks reasonable. But, it doesn't actually fix this bug right? I assume you are going to create an extra patch to throttle or eliminate refresh driver ticks. (And at that point, I'd like to see something to address the scrollbars issue.)

::: content/base/public/mozFlushType.h
@@ +28,2 @@
>                                   completion */
> +  Flush_Display          = 7  /* As above, plus flush painting */

How about just removing the "= N" annotations? They're not serving any purpose, other than maybe Flush_Content = 1.

::: content/base/src/nsDocument.cpp
@@ +6328,5 @@
>    // layout flush on our parent, since we need our container to be the
>    // correct size to determine the correct style.
>    if (mParentDocument && IsSafeToFlush()) {
>      mozFlushType parentType = aType;
> +    if (aType >= Flush_Animations)

This change doesn't look right. I think we still need to Flush_Style here. If not, please explain why.

@@ +6369,5 @@
>  
>  void
>  nsDocument::FlushExternalResources(mozFlushType aType)
>  {
> +  NS_ASSERTION(aType >= Flush_Animations,

This change looks wrong too. If it's right, please explain why.

::: layout/style/nsTransitionManager.cpp
@@ +751,5 @@
>          if (pt.IsRemovedSentinel()) {
>            // Actually remove transitions one cycle after their
>            // completion.  See comment below.
>            et->mPropertyTransitions.RemoveElementAt(i);
> +        } else if (pt.mStartTime + pt.mDuration <= mPresContext->RefreshDriver()->MostRecentRefresh()) {

Does this change need to be part of this patch? If not, it should be its own patch.
I really don't think a new flush type is sufficient here; you need to conceptually convert flush types to be a pair (current flush type, boolean for flushing animations), though perhaps (??) packed into a single value, since right now the refresh driver causes flushes on every tick at a level higher than Flush_Style (in particular, Flush_InterruptibleLayout).  There may also be other FlushPendingNotifications callers (e.g., are there other places we flush leading up to painting) that would need to pass false for whether to flush animations.
Comment on attachment 653947 [details] [diff] [review]
Part 2: flush for touch events

I think this is not what you want. nsPresShellEventCB happens after the
event has been targeted.
Attachment #653947 - Flags: review?(bugs) → review-
Comment on attachment 653913 [details] [diff] [review]
Part 1: Add an animation flush type

Review of attachment 653913 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/src/nsDocument.cpp
@@ +6328,5 @@
>    // layout flush on our parent, since we need our container to be the
>    // correct size to determine the correct style.
>    if (mParentDocument && IsSafeToFlush()) {
>      mozFlushType parentType = aType;
> +    if (aType >= Flush_Animations)

Ah, you are correct. I misread what the code was doing.

@@ +6369,5 @@
>  
>  void
>  nsDocument::FlushExternalResources(mozFlushType aType)
>  {
> +  NS_ASSERTION(aType >= Flush_Animations,

Yep, this is wrong.

::: layout/style/nsTransitionManager.cpp
@@ +751,5 @@
>          if (pt.IsRemovedSentinel()) {
>            // Actually remove transitions one cycle after their
>            // completion.  See comment below.
>            et->mPropertyTransitions.RemoveElementAt(i);
> +        } else if (pt.mStartTime + pt.mDuration <= mPresContext->RefreshDriver()->MostRecentRefresh()) {

I could pass the timestamp from WillRefresh into FlushTransitions, but then I would have to call
FlushTransitions(mPresContext->RefreshDriver()->MostRecentRefresh()) from FlushPendingNotifications.  This is simpler, and this is what nsAnimationManager does.
Comment on attachment 653947 [details] [diff] [review]
Part 2: flush for touch events

I think you may want to flush layout before even passing the event
to a presshell (if there are async animations).
Attached patch Part 2: flush for touch events (obsolete) — Splinter Review
How about this?
Attachment #653947 - Attachment is obsolete: true
Attachment #654028 - Flags: review?(bugs)
Comment on attachment 654028 [details] [diff] [review]
Part 2: flush for touch events

I *think* you want to flush even before the event goes to presshell,
or at the beginning of PresShell::HandleEvent.
Presshell then calls ESM.
(I'm not 100% sure about that since I don't know how compositor animations work)
Attachment #654028 - Flags: review?(bugs) → review-
Attachment #653913 - Attachment is obsolete: true
Attachment #653913 - Flags: review?(roc)
Attachment #654057 - Flags: review?(roc)
Attachment #654057 - Flags: feedback?(dbaron)
Attached patch Part 2: flush for touch events (obsolete) — Splinter Review
Do we also need to flush on touchstart and touchend, or is touchmove sent along with those?
Attachment #654028 - Attachment is obsolete: true
Attachment #654061 - Flags: review?(bugs)
Attached patch Part 2: flush for touch events (obsolete) — Splinter Review
I guess touchstart is fired before touchmove, so we do need to flush on it
Attachment #654061 - Attachment is obsolete: true
Attachment #654061 - Flags: review?(bugs)
Attachment #654065 - Flags: review?(bugs)
Comment on attachment 654057 [details] [diff] [review]
Part 1: Create a mozilla::FlushType to wrap mozFlushType and allow flushing animations

Review of attachment 654057 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/public/mozFlushType.h
@@ +32,5 @@
> +struct FlushType {
> +  mozFlushType mFlushType;
> +  bool mFlushAnimations;
> +};
> +}

blank line between }s.

To avoid confusion between FlushType and mozFlushType, call this ChangesToFlush?

::: layout/style/nsTransitionManager.cpp
@@ +751,5 @@
>          if (pt.IsRemovedSentinel()) {
>            // Actually remove transitions one cycle after their
>            // completion.  See comment below.
>            et->mPropertyTransitions.RemoveElementAt(i);
> +        } else if (pt.mStartTime + pt.mDuration <= mPresContext->RefreshDriver()->MostRecentRefresh()) {

I'm still confused. what's the difference between aTime and mPresContext->RefreshDriver()->MostRecentRefresh()?
There is no difference, see http://dxr.lanedo.com/mozilla-central/layout/base/nsRefreshDriver.cpp.html#l339.

I could pass in aTime, but then I would need to set aTime in nsPresShell::FlushPendingNotifications, where I would set it from mPresContext->RefreshDriver()->MostRecentRefresh().  It's all the same, I can change it if you prefer.
No, I get it now. Sorry for being stupid.
Comment on attachment 654065 [details] [diff] [review]
Part 2: flush for touch events

Still too late :) 

As far as I understand this bug, we should flush
when view manager passes the event to
PresShell::HandleEvent(nsIFrame        *aFrame,
                       nsGUIEvent*     aEvent,
                       bool            aDontRetargetEvents,
                       nsEventStatus*  aEventStatus)
That is where things like nsLayoutUtils::GetFrameForPoint gets called.
Attachment #654065 - Flags: review?(bugs) → review-
Attached patch Part 2: flush for touch events (obsolete) — Splinter Review
I don't think we care about the event capturing stuff, so we just want to flush right before calling GetFrameForPoint.
Attachment #654065 - Attachment is obsolete: true
Attachment #654292 - Flags: review?(bugs)
I found one more place in nsPresShell that shouldn't flush animations.
Attachment #654057 - Attachment is obsolete: true
Attachment #654057 - Flags: review?(roc)
Attachment #654057 - Flags: feedback?(dbaron)
Attachment #654375 - Flags: review?(roc)
Depends on: 784846
This prevents constant flushing when moving the mouse around during an animation
Attachment #654413 - Flags: review?(bugs)
Comment on attachment 654375 [details] [diff] [review]
Part 1: Create a mozilla::FlushType to wrap mozFlushType and allow flushing animations

Review of attachment 654375 [details] [diff] [review]:
-----------------------------------------------------------------

So, given this doesn't flush animations in the refresh driver or on painting, what causes non-async-animations to animate?
The refresh driver calls WillRefresh on nsAnimationManager, which calls FlushAnimations.
Attachment #654375 - Flags: review?(roc)
Attachment #654375 - Flags: review?(dbaron)
Attachment #654375 - Flags: review+
Attached patch Part 3: Throttle animations (obsolete) — Splinter Review
Ignore the changes to nsTransitionManager.cpp; transitions aren't quite working yet.
Attachment #654721 - Flags: review?(roc)
Attachment #654721 - Flags: review?(dbaron)
Attached patch Part 3: Throttle animations (obsolete) — Splinter Review
Now with transitions working.  Ignore the printfs, I will remove those before landing.
Attachment #654721 - Attachment is obsolete: true
Attachment #654721 - Flags: review?(roc)
Attachment #654721 - Flags: review?(dbaron)
Attachment #654755 - Flags: review?(roc)
Attachment #654755 - Flags: review?(dbaron)
Comment on attachment 654755 [details] [diff] [review]
Part 3: Throttle animations

Review of attachment 654755 [details] [diff] [review]:
-----------------------------------------------------------------

Seems to me we need to stop the frame activity timer from timing out the activity flag while you're throttling refreshes.

::: layout/base/nsDisplayList.cpp
@@ +273,5 @@
>  AddAnimationsForProperty(nsIFrame* aFrame, nsCSSProperty aProperty,
>                           ElementAnimation* ea, Layer* aLayer,
>                           AnimationData& aData)
>  {
> +  printf("\n Rebuilding animations on layer!\n");

remove

@@ +361,5 @@
> +      ea->mAddedToLayer = false;
> +    }
> +    if (et) {
> +      et->mAddedToLayer = false;
> +    }

If content becomes invisible and then is made visible again, it seems to me that mAddedToLayer might not be cleared even though a layer has been recreated with no animations on it.

It might be better to attach some userdata to the layer to indicate which animations/transitions have been set on the layer, so you can check here that the layer you've got has the right animations/transitions set on it.

::: layout/style/AnimationCommon.cpp
@@ +284,4 @@
>  }
>  
> +bool
> +CommonElementAnimationData::CanThrottleCurrentTick(TimeStamp aTime)

Change the name to indicate that this is specifically about transforms. E.g. CanThrottleTransformChanges.

@@ +292,5 @@
>  
> +  // If we know that the animation cannot cause overflow,
> +  // we can just disable flushes for this animation.
> +
> +  // B2G does now show scrollbars, so we don't care about overflow.

"does not"

@@ +294,5 @@
> +  // we can just disable flushes for this animation.
> +
> +  // B2G does now show scrollbars, so we don't care about overflow.
> +#ifdef MOZ_GONK
> +  return true;

Instead of this, I think you could just check nsLookAndFeel's eIntID_ShowHideScrollbars.

@@ +298,5 @@
> +  return true;
> +#endif
> +
> +  // If the nearest scrollable ancestor has overflow:hidden,
> +  // we don't care about overflow.

This scrollbar stuff only matters for transforms.

@@ +304,5 @@
> +  nsIScrollableFrame* scrollableAncestor = nullptr;
> +  while (frame && !scrollableAncestor) {
> +    scrollableAncestor = do_QueryFrame(frame);
> +    frame = frame->GetParent();
> +  }

nsLayoutUtils::GetNearestScrollableFrame

@@ +305,5 @@
> +  while (frame && !scrollableAncestor) {
> +    scrollableAncestor = do_QueryFrame(frame);
> +    frame = frame->GetParent();
> +  }
> +  if (scrollableAncestor) {

If scrollableAncestor is null, you can just throttle.

@@ +315,5 @@
> +  }
> +
> +  // If this animation can cause overflow, we can throttle some of the ticks.
> +  if ((aTime - mStyleRuleRefreshTime) < TimeDuration::FromMilliseconds(200)) {
> +    printf_stderr("\nCauses overflow, but we can still throttle\n");

remove printf

::: layout/style/AnimationCommon.h
@@ +157,5 @@
>    static bool
>    CanAnimatePropertyOnCompositor(const dom::Element *aElement,
>                                   nsCSSProperty aProperty,
> +                                 bool aHasGeometricProperties,
> +                                 bool aAllowPartial);

Use a flags word; boolean parameters suck

::: layout/style/nsAnimationManager.cpp
@@ +942,5 @@
> +        canThrottleTick = true;
> +      }
> +    }
> +    if (!canThrottleTick) {
> +      printf_stderr("\n Not throttling animation! %p\n", ea);

remove

::: layout/style/nsAnimationManager.h
@@ +154,5 @@
>  
>    // True if this animation can be performed on the compositor thread.
> +  // Set aAllowPartial to false to make sure that all properties of this
> +  // animation are supported by the compositor.
> +  bool CanPerformOnCompositorThread(bool aAllowPartial = true) const;

Use an enum or flags word.

@@ +204,5 @@
>  
>    // nsARefreshObserver
>    virtual void WillRefresh(mozilla::TimeStamp aTime);
>  
> +  void FlushAnimations(bool aCanThrottle);

Ditto.

::: layout/style/nsTransitionManager.cpp
@@ +198,5 @@
>                        aNewStyleContext->HasPseudoElementData(),
>                    "pseudo type mismatch");
>  
>    // NOTE: Things in this function (and ConsiderStartingTransition)
> +  // nsIFrame::INVALIDATE_NO_THEBES_LAYERS);uuuuuonsiderStartingTransition)

remove!

@@ +574,5 @@
>      eRestyle_Self : eRestyle_Subtree;
>    presContext->PresShell()->RestyleForAnimation(aElement, hint);
> +
> +  // Invalidate the frame to cancel previous transitions if this is a reversal
> +  nsIFrame* frame = aElement->OwnerDoc()->GetShell()->GetPresContext()->

why not just use presContext?

@@ +575,5 @@
>    presContext->PresShell()->RestyleForAnimation(aElement, hint);
> +
> +  // Invalidate the frame to cancel previous transitions if this is a reversal
> +  nsIFrame* frame = aElement->OwnerDoc()->GetShell()->GetPresContext()->
> +    GetRootPresContext()->PresShell()->GetRootFrame();

need to test GetRootPresContext() for null

@@ +830,5 @@
> +      } else if (!et->mAddedToLayer) {
> +        // If the transition is throttled, make sure it is added to the layer
> +        nsIFrame* frame = et->mElement->OwnerDoc()->GetShell()->GetPresContext()->
> +          GetRootPresContext()->PresShell()->GetRootFrame();
> +        frame->InvalidateWithFlags(frame->GetRect(), nsIFrame::INVALIDATE_NO_THEBES_LAYERS);

You can't use GetRect() here (returns a rect relative to the parent). Use GetVisualOverflowArea() instead.

::: layout/style/nsTransitionManager.h
@@ +79,5 @@
>    bool HasTransitionOfProperty(nsCSSProperty aProperty) const;
>    // True if this animation can be performed on the compositor thread.
> +  // Set aAllowPartial to false to make sure that all properties of this
> +  // animation are supported by the compositor.
> +  bool CanPerformOnCompositorThread(bool aAllowPartial = true) const;

boolean parameters are bad. Pass enums or a flags word.

@@ +150,5 @@
>  
>    // nsARefreshObserver
>    virtual void WillRefresh(mozilla::TimeStamp aTime);
>  
> +  void FlushTransitions(bool aCanThrottle);

ditto
Comment on attachment 654292 [details] [diff] [review]
Part 2: flush for touch events

Flush may delete frame, so you need nsWeakFrame to check whether frame is still
alive. Other option is to flush higher up in the stack, even before
nsIView is used.
Attachment #654292 - Flags: review?(bugs) → review-
Comment on attachment 654413 [details] [diff] [review]
Part 4: Don't flush animations on mouse move when there are no listeners

I have no idea what the HasMouseEnterLeaveEventListeners usage is about.
Attachment #654413 - Flags: review?(bugs) → review-
Comment on attachment 654413 [details] [diff] [review]
Part 4: Don't flush animations on mouse move when there are no listeners

This isn't quite correct because we should be checking for mouseover, mousemove, mousein, mouseout.  However, we don't need this until we are close to enabling OMTA on platforms with a mouse.
Attachment #654413 - Attachment is obsolete: true
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #36)

> Seems to me we need to stop the frame activity timer from timing out the
> activity flag while you're throttling refreshes.

Is that the MarkLayersActive thing?  I'm currently resetting that in CanPerformOnCompositorThread, which is always called when we have a throttling FlushAnimations.  I think it would be more complex to teach the activity timer not to expire.

> 
> If content becomes invisible and then is made visible again, it seems to me
> that mAddedToLayer might not be cleared even though a layer has been
> recreated with no animations on it.
> 
> It might be better to attach some userdata to the layer to indicate which
> animations/transitions have been set on the layer, so you can check here
> that the layer you've got has the right animations/transitions set on it.

I check mAddedToLayer in nsTransitionManager.  I don't think there's a way to get the layer for an element.

> 
> @@ +574,5 @@
> >      eRestyle_Self : eRestyle_Subtree;
> >    presContext->PresShell()->RestyleForAnimation(aElement, hint);
> > +
> > +  // Invalidate the frame to cancel previous transitions if this is a reversal
> > +  nsIFrame* frame = aElement->OwnerDoc()->GetShell()->GetPresContext()->
> 
> why not just use presContext?

Yes, I can do that.

> 
> @@ +830,5 @@
> > +      } else if (!et->mAddedToLayer) {
> > +        // If the transition is throttled, make sure it is added to the layer
> > +        nsIFrame* frame = et->mElement->OwnerDoc()->GetShell()->GetPresContext()->
> > +          GetRootPresContext()->PresShell()->GetRootFrame();
> > +        frame->InvalidateWithFlags(frame->GetRect(), nsIFrame::INVALIDATE_NO_THEBES_LAYERS);
> 
> You can't use GetRect() here (returns a rect relative to the parent). Use
> GetVisualOverflowArea() instead.

If this is the root frame, does that actually matter?
In any case, can we just invalidate the right frame instead of invalidating the parent, or do we need to wait for DLBI?
Currently we sometimes print a warning that a frame is not prerendered when first starting an animation, but then we mark the layer active on the next refresh tick.  Might as well mark it active immediately.
Attachment #655270 - Flags: review?(roc)
(In reply to David Zbarsky (:dzbarsky) from comment #40)
> Is that the MarkLayersActive thing?  I'm currently resetting that in
> CanPerformOnCompositorThread, which is always called when we have a
> throttling FlushAnimations.  I think it would be more complex to teach the
> activity timer not to expire.

Do you actually call CanPerformOnCompositorThread 60 times a second? I thought the goal here was to not have to run the refresh driver all the time.

> > If content becomes invisible and then is made visible again, it seems to me
> > that mAddedToLayer might not be cleared even though a layer has been
> > recreated with no animations on it.
> > 
> > It might be better to attach some userdata to the layer to indicate which
> > animations/transitions have been set on the layer, so you can check here
> > that the layer you've got has the right animations/transitions set on it.
> 
> I check mAddedToLayer in nsTransitionManager.  I don't think there's a way
> to get the layer for an element.

Then you need to make sure all mAddedToLayer flags are cleared when they're not explicitly set during a layer construction pass.

> > > +      } else if (!et->mAddedToLayer) {
> > > +        // If the transition is throttled, make sure it is added to the layer
> > > +        nsIFrame* frame = et->mElement->OwnerDoc()->GetShell()->GetPresContext()->
> > > +          GetRootPresContext()->PresShell()->GetRootFrame();
> > > +        frame->InvalidateWithFlags(frame->GetRect(), nsIFrame::INVALIDATE_NO_THEBES_LAYERS);
> > 
> > You can't use GetRect() here (returns a rect relative to the parent). Use
> > GetVisualOverflowArea() instead.
> 
> If this is the root frame, does that actually matter?

Not much, but it's the principle of the thing.

> In any case, can we just invalidate the right frame instead of invalidating
> the parent, or do we need to wait for DLBI?

Better to wait for DLBI. With DLBI you can explicitly schedule a layer tree update instead of doing what you're doing.
Landed part 0.
https://hg.mozilla.org/integration/mozilla-inbound/rev/39da15b8b6aa
Status: NEW → ASSIGNED
Whiteboard: [leave open]
Attachment #655270 - Flags: checkin+
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #42)
> (In reply to David Zbarsky (:dzbarsky) from comment #40)

> Do you actually call CanPerformOnCompositorThread 60 times a second? I
> thought the goal here was to not have to run the refresh driver all the time.

Yes, the refresh driver is still running.  The expensive part is the style interpolation and flushing.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #42)
> (In reply to David Zbarsky (:dzbarsky) from comment #40)

> 
> > > If content becomes invisible and then is made visible again, it seems to me
> > > that mAddedToLayer might not be cleared even though a layer has been
> > > recreated with no animations on it.
> > > 
> > > It might be better to attach some userdata to the layer to indicate which
> > > animations/transitions have been set on the layer, so you can check here
> > > that the layer you've got has the right animations/transitions set on it.
> > 
> > I check mAddedToLayer in nsTransitionManager.  I don't think there's a way
> > to get the layer for an element.
> 
> Then you need to make sure all mAddedToLayer flags are cleared when they're
> not explicitly set during a layer construction pass.


I think that already happens.  When we are recreating the layers, if the animation is still running, we will create the nsDisplay{Transform|Opaacity} for it.  Then we will call AddAnimationsAndTransitionsToLayer from BuildLayer.  I guess nsAnimationManager::GetAnimationsForCompositor could return null if the animation has changed and no longer animates the requested property, so I will reset that flag when the style rule on an animation changes and change how I set the flag in AddAnimationAndTransitionsToLayer to make things clearer.
Don't you need an intermediate patch before Part 3 that either backs out bug 780340 or fixes it to actually do something to animate visibility?
That fix was undone in bug 784239.
The way I'm doing this is that the compositor must be able to handle all properties of an animation in order to throttle it on the main thread.  This means that we will need to actually do something for visibility if we want visibility animations to be throttled.  I already have a patch doing that, but I don't think we want it quite yet.
Comment on attachment 654755 [details] [diff] [review]
Part 3: Throttle animations

Review of attachment 654755 [details] [diff] [review]:
-----------------------------------------------------------------

Expecting a new patch here
Attachment #654755 - Flags: review?(roc) → review-
This does what we discussed.
Attachment #654292 - Attachment is obsolete: true
Attachment #656552 - Flags: review?(bugs)
Attachment #656552 - Flags: review?(bugs) → review+
Pushed part 2 to help isolate any possible breakage.
https://hg.mozilla.org/integration/mozilla-inbound/rev/1bc40cb5b27c
Attachment #656552 - Flags: checkin+
Comment on attachment 656552 [details] [diff] [review]
Part 2: flush for touch events

>+nsIFrame* GetRootFrameThroughViews(nsIPresShell* aPresShell)

This is a confusing name for this function. It's more like "GetNearestFrameContainingPresShell".
Ok, I wasn't sure what that function was actually doing.  Want me to change it?
(In reply to David Zbarsky (:dzbarsky) from comment #53)
> Ok, I wasn't sure what that function was actually doing.  Want me to change
> it?

Sure, it'll make the code less confusing.
Comment on attachment 654755 [details] [diff] [review]
Part 3: Throttle animations

Looking into some transition issues.
Attachment #654755 - Attachment is obsolete: true
Attachment #654755 - Flags: review?(dbaron)
No longer depends on: 788436
(In reply to David Zbarsky (:dzbarsky) from comment #50)
> Created attachment 656552 [details] [diff] [review]
> Part 2: flush for touch events
> 
> This does what we discussed.

Is there a reason we can't flush here *only if* the refresh driver is interpolating asyncified animations?  (Or if there's some dirty bit set.)  Otherwise, we end up invalidating unnecessarily, like in bug 790505 comment 8.
I guess another option is to fix whatever seems to always be invalidating here, when there haven't been any style or layout changes to flush.  That's happening from the stack in bug 790505 comment 2.
That stack just shows that there was a restyle event processed, which came up with a change hint that includes a repaint...
I'm not exactly sure what that means, but why would removing that new code make those invalidations go away?
I'm not sure what you're asking.

As far as comment 60 goes, there _are_ style changes.  That's the whole point.  The animation is changing styles as it goes, and we process those style changes.  Since the animation is moving things around, those style changes say we need to repaint the place things used to be and the new place they're at, no?
Right, let me back up a bit.

What bug 790505 is trying to optimize is the case where a finger is dragging around onscreen, and in response to drags (mousemove) we update a style.MozTransform property.  The sequence of events looks like
 1. start dispatching touchmove; there's no listener
   - Flush(Layout) anyway
     . invalidate
 2. start dispatch synthesized mousemove; there is a listener
    - listener does style.MozTransform = ...
    - (poke refresh driver?)
[eventually]
 3. repaint off refresh driver notification

 - Repeat -

There are no animations, async or otherwise, running.

The new flush in (1) added here always ends up in nsIFrame::InvalidateRoot(), and it always cancels empty transactions (sets a flag that says the layer tree needs to be rebuilt).  This destroys the (huge) potential optimization in bug 790505, which is to skip full display-list + layerbuilder repaints of transformed-and-prerendered frames that have only been moved around, by just doing empty (non-repainting) layer transactions that update the layers' transforms.

I thought maybe we were doing work for the Flush(Layout) in (1) that we would have ended up doing anyway for the mousemove dispatch in (2), but if I remove the forced Flush(Layout) in (1) that was added by the part 2 patch here, then the invalidations from (1) don't happen in (2) and we can successfully attempt our optimization.

So something in the part 2 patch here is causing us to invalidate in a way that seems unnecessary in the testcase in bug 790505, and that we weren't doing before this patch.

In general, you're absolutely right that we're going to need to flush here when dispatching events when async animations are running.  I didn't mean to imply otherwise :).  What I want to do is get rid of the new, seemingly unnecessary invalidations introduced here when async animations *aren't* running.

Sorry if the explanation here is unclear, this is pretty far out of my front yard.
That's quite odd.  Flushing should just flush out pending style changes; it shouldn't affect what the effects of those style changes are....

Do we always set the "layer tree needs rebuild" flag when invalidating from DoApplyRenderingChangeToTree?  If not, when do we not set it?

If you remove the flush from patch 2 in this bug, do we still end up hitting DoApplyRenderingChangeToTree during the dragging you describe?
(In reply to Boris Zbarsky (:bz) from comment #64)
> Do we always set the "layer tree needs rebuild" flag when invalidating from
> DoApplyRenderingChangeToTree?  If not, when do we not set it?
> 

We always set it when we reach InvalidateRoot(), and the layer-tree rebuild hasn't been explicitly canceled.  I couldn't tell you if we always hit InvalidateRoot() from DoApplyRenderingChangeToTree().

Maybe roc or dbaron knows?

> If you remove the flush from patch 2 in this bug, do we still end up hitting
> DoApplyRenderingChangeToTree during the dragging you describe?

I don't know, but I think that's a question I can answer with gdb.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #65)
> (In reply to Boris Zbarsky (:bz) from comment #64)
> > Do we always set the "layer tree needs rebuild" flag when invalidating from
> > DoApplyRenderingChangeToTree?  If not, when do we not set it?
> 
> We always set it when we reach InvalidateRoot(), and the layer-tree rebuild
> hasn't been explicitly canceled.  I couldn't tell you if we always hit
> InvalidateRoot() from DoApplyRenderingChangeToTree().
> 
> Maybe roc or dbaron knows?

I expect we will unless the stuff we're invalidating is completely clipped out.

A layout flush when there's a pending transform change to something that's at least partially visible should inevitably lead to a layer tree rebuild. I don't know how you'd be avoiding that even if you remove the flush here. PresShell::HandleEvent seems to flush layout for all touch events, at least, so I don't know how you'd dodge those flushes under any circumstances.
OK, so I retested with the backout of part 2 removed, and I don't see invalidations (any more?).  I think this was the second-to-last empty-txn-canceling invalidation that I removed, so I guess it was the last one that was the "real problem".

Sorry for the noise.  I wish I understood all this code better.
dz is going to update with the latest here and what work remains to be done.
Updated to tip
Attachment #654375 - Attachment is obsolete: true
Attachment #654375 - Flags: review?(dbaron)
With parts 1 and 3 applied, things mostly work.  The one remaining issue is that when a transition is started from another transition's transitionend event, the "old" style context passed into ConsiderStartingTransition has not been updated with the new style rule resulting from the previous transition finishing.
This can be seen easily by applying a translate(500px) transition, and then a translate(200px) transition on transitionend - the element being transitioned should move from 500px back to 200 px, but it will instead jump back to 0px and transition to 200px.

The way to fix this is to add a flag on nsPresShell (or maybe nsPresContext) for whether we have throttled any animations or transitions.  In nsTransitionManager::StyleContextChanged, we should check if the flag is set, and if it is, reset it and update all style contexts with the animations and transitions that are being throttled (because animation rules can affect transition rules, and transitions of parents can affect transition of children, so we should just do them all at once to make sure that it happens at most once per refresh driver tick rather than degenerating to O(n^2))
Fix compile error from rebasing.
Attachment #666742 - Attachment is obsolete: true
Depends on: 796182
These patches save us 300-400ms on app startup on otoro.  (As measured by first-paint on the settings and calculator apps, which is slightly bogus but close enough for a back-of-the-envelope check.)  I'm not 100% sure that we're hitting the totally-disabled-main-thread-interpolation case and not just throttling, but this is a huge win!
I had this working over the weekend (module a nullptr crash on Try), but bug 790505 broke it :-( Trying to re-fix now, but we'll probably miss the uplift now.
Depends on: 790505
Fixed it, but still need to fix the nullptr errors from Try.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #76)
> How did bug 790505 affect this, just ooc?

At the end of a transition, we need to do a full invalidation and flush that style change to the element. The changes in part 4 of 790505 made that turned that invalidation into a non-invalidation paint, which meant that the style flush didn't happen properly (I'm not 100% sure of the mechanism of the last bit). This is only important if a new transition is starting, so my fix is to force a proper invalidation if we are doing the 'mini-flush' on a transition start with a still-active transition. This is a bit hackey though.
Something sounds a little fishy about that, but I don't understand the code well enough to comment more.

Was the problem that we weren't clearing async transitions/animations from layers at the end of their duration?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #78)
> Something sounds a little fishy about that, but I don't understand the code
> well enough to comment more.
> 
> Was the problem that we weren't clearing async transitions/animations from
> layers at the end of their duration?

Maybe. The behaviour I observed was exactly the same as before my first fix, that is, it looked like the flush of the current state of the transition was failing. I don't really know the code well enough to comment more either, sorry. Maybe dzbarsky could shed more light on what's going on here?
Blocks: 796254
Whiteboard: [leave open] → [leave open][fast:200-400ms]
Attachment #666749 - Attachment is obsolete: true
These are dzbarsky's patches rebased, tidied up, and with a few bug fixes. Still WIP.
Attachment #666743 - Attachment is obsolete: true
Attached patch part 4: flush styles when needed (obsolete) — Splinter Review
WIP patch for flushing styles back to layout when needed (e.g., when a new transition is started). This works and passes tests, but breaks an invariant of the style system which will presumably lead to horrible pain further down the line.
Attached patch part 4: flush styles when needed (obsolete) — Splinter Review
because a missing 'not' can make a lot of difference!
Attachment #672139 - Attachment is obsolete: true
(In reply to David Zbarsky (:dzbarsky) from comment #71)
> In nsTransitionManager::StyleContextChanged, we should check if the flag is
> set, and if it is, reset it and update all style contexts with the
> animations and transitions that are being throttled

That doesn't work because we reenter the style system in a bad way.

> The one remaining issue is that when a transition is started from another
> transition's transitionend event, the "old" style context passed into
> ConsiderStartingTransition has not been updated with the new style rule resulting
> from the previous transition finishing.

That particular case could be fixed safely by deferring execution of transitionend events until later in the refresh driver and ensuring animations are flushed before they run.

But it seems to me the underlying problem here is much broader. When a style change happens that triggers a transition on a property, and animations have been throttled, we need the "old value" of the property to be the value after flushing animations, not whatever it happens to be ... and this could happen during any style change.
In terms of a list of where/when/why we need a mini-flush, it is only really a single situation - where we have a transition in progress and we start a new transition, if the new transition is on the same element as the old one and is the same property, then we need to have the current value of the property for the element so that the new transition starts in the right place. We flush all the transitions because the old transition might be on an ancestor and it is less hassle than finding all related transitions. The problem I guess is that this is triggered by a restyling flush in the first place, so when we do our mini-flush, we are already inside a restyle.
So the obvious thing that comes to mind for me is that in the case when mPresContext->ExistThrottledUpdates() in nsTransitionManager::StyleContextChanged  we should do nothing, add the info about which style context and whatnot was involved to some list, and return.  Then when we finish restyle processing, go ahead and process that list we built up.

Does that sound reasonable?
When we process that list, we won't have any way to access "what would have been the old property value if we had flushed animations" (obtaining that value is the goal of the current mini-flushes).
Hmm.  Is that because you need to figure out what aOldStyleContext would have looked like before the style change that's triggering the transition if the animation/transition rules that it depends on had been up to date?

I guess that's really what you need here.

So _really_ what you want, I guess, is a sort of mini-style-reresolve where you take your aOldStyleContext, determine the correct parent for it, then if aOldStyleContext depends on any transition/animation rules tick those to get up-to-date rules and then compute a "new" aOldStyleContext?  In that process, "determine the correct parent" would basically recursively mini-reresolve the style parent of the relevant frame, I guess.
(In reply to Nick Cameron [:nrc] from comment #85)
> In terms of a list of where/when/why we need a mini-flush, it is only really
> a single situation - where we have a transition in progress and we start a
> new transition, if the new transition is on the same element as the old one
> and is the same property, then we need to have the current value of the
> property for the element so that the new transition starts in the right
> place. We flush all the transitions because the old transition might be on
> an ancestor and it is less hassle than finding all related transitions. The
> problem I guess is that this is triggered by a restyling flush in the first
> place, so when we do our mini-flush, we are already inside a restyle.

I don't think this ("only really a single situation") is quite right.

First of all, the case where we're figuring out whether to start transitions, we're doing a restyle-without-animation.  (This is the whole reason for the separate restyle-without-animation and restyle-with-animation; see nsPresContext::IsProcessingAnimationStyleChange.)  So we're comparing the new style without any animation data to:
  (a) if there's no transition running on the element, the old style
  (b) if there is a transition running on the element, the data about the running transition that says what the transitioning style would have been.  There's a bit of complexity here related to transitions running on ancestors and the way we cover them up to avoid starting new transitions on descendants, though; it might be a problem making that bit correct with the lazy updating.

Second, I think there was something else, but I'm still working on remembering it.  It's encoded somewhere in this pair of IRC conversations I had with dzbarsky that I'll attach in a second, but I'm still having trouble (re)wrapping my head around this.
(In reply to Boris Zbarsky (:bz) from comment #88)
> Hmm.  Is that because you need to figure out what aOldStyleContext would
> have looked like before the style change that's triggering the transition if
> the animation/transition rules that it depends on had been up to date?
> 
Yes, I think so. I don't 100% understand what you are trying to say here, so in my own words: the list you create would store aOldStyleContext, but that might be an incorrect value, without doing a mini-flush, you will not know what the right value ought to be.

When we get to processing the list, that should have been updated (right?), but by then we only have access to the new style context, not the old one anymore.

> I guess that's really what you need here.
> 
> So _really_ what you want, I guess, is a sort of mini-style-reresolve where
> you take your aOldStyleContext, determine the correct parent for it, then if
> aOldStyleContext depends on any transition/animation rules tick those to get
> up-to-date rules and then compute a "new" aOldStyleContext?  In that
> process, "determine the correct parent" would basically recursively
> mini-reresolve the style parent of the relevant frame, I guess.

I think this is what my mini-flush is trying to do, except rather than trace ancestors, it is conservative and does it for all currently transitioning elements. I don't understand the last sentence about the recursive mini-re-resolve. But wouldn't we have the same problem about re-entering the re-styling code? Or are you proposing to do this when we process the list you proposed earlier? If so, I don't see how it is possible, since the old style contexts will have been overwritten by the new contexts.
(In reply to David Baron [:dbaron] from comment #89)
> I don't think this ("only really a single situation") is quite right.
> 
> First of all, the case where we're figuring out whether to start
> transitions, we're doing a restyle-without-animation.  (This is the whole
> reason for the separate restyle-without-animation and
> restyle-with-animation; see
> nsPresContext::IsProcessingAnimationStyleChange.)  So we're comparing the
> new style without any animation data to:
>   (a) if there's no transition running on the element, the old style

So far, I understand (I think).

>   (b) if there is a transition running on the element, the data about the
> running transition that says what the transitioning style would have been.

What does "would have been" mean?

> There's a bit of complexity here related to transitions running on ancestors
> and the way we cover them up to avoid starting new transitions on
> descendants, though; it might be a problem making that bit correct with the
> lazy updating.

Does this affect the mini-flush idea? I think I understand what is going on, but I don't see the link to the mini-flush/re-entry issue, sorry.

> 
> Second, I think there was something else, but I'm still working on
> remembering it.  It's encoded somewhere in this pair of IRC conversations I
> had with dzbarsky that I'll attach in a second, but I'm still having trouble
> (re)wrapping my head around this.

Nothing jumps out at me from the logs, did you have any more thoughts?
> but by then we only have access to the new style context, not the old one anymore.

You could hold on to the old one, but the real issue is that it doesn't have the info you need.

> But wouldn't we have the same problem about re-entering the re-styling code? 

The only code that would run would be inside the transition/animation stuff.  In particular, it would not pull pending restyles out of the restyle tracker, unlike the attached patches.

So basically, instead of relying on the normal restyling codepath, which has various side effects, the mini-flush I'm proposing would explicitly only update the animation/transition bits.  And not change any frames' style contexts....  The caveat is performance worries.

Again, that's assuming we actually need the "old style context with all animation stuff ticked".

> since the old style contexts will have been overwritten by the new contexts.

Style context are (mostly) immutable, for what it's worth.  When the style changes you get a new style context object... and someone else can still hold on to the old one.  That's a bit of a side issue.
I think I get the idea of this mini-reresolve. I don't know how you would find any transition/animation rules that we depend on though? I guess it is any of the currently transitions which are on the current element and property, but then we need to do a search for each element (in the set of the element in question and its ancestors) of all transitions. Doesn't this get into the nasty performance (O(n^2)) which dzbarsky/dbaron were trying to avoid by updating all active transitions? If we write our own custom restyle path here, could we just do it on all elements of all active transitions, or would we have to walk their parents anyway?

Also, what exactly do you mean by "tick"?
> I don't know how you would find any transition/animation rules that we depend on though?

Given a stylecontext, you can just walk its rulenodes and check for the right cascade levels.  Assuming we really want the nsIStyleRules.  If we want the transition/animation data, then that would come from the relevant hashtables, right?

> Doesn't this get into the nasty performance (O(n^2)) which dzbarsky/dbaron were trying
> to avoid by updating all active transitions?

Yes, it might.  Hence performance worries.

> If we write our own custom restyle path here, could we just do it on all elements of
> all active transitions, or would we have to walk their parents anyway?

You could just do all the things with active transitions, as long as you do ancestors before descendants, I suspect.

> Also, what exactly do you mean by "tick"?

Update the transition/animation nsIStyleRules to reflect the current time.
Attachment #673335 - Attachment is obsolete: true
Attachment #673335 - Flags: review?(bugs)
Attachment #673338 - Flags: review?(bugs)
Attachment #673338 - Attachment is obsolete: true
Attachment #673338 - Flags: review?(bugs)
posting an email thread here for posterity:

bz:

So basically our situation is the following.  When we're doing a non-animation restyle, we need the old style data, _including_ animations, to properly do things like transition reversal.

So we think we should do the following.  When we decide we need to process restyles (whether due to a refresh tick or due to an explicit page flush) and the following two conditions hold:

1)  We have pending non-animation restyles.
2)  We have throttled animations/transitions.

then before we do any restyling we need to go and update the animation/transition style data on all the throttled elements so it's current.

In particular, we should probably just keep tracks of two timestamps on the prescontext: T1, which is the timestamp of when we last did an update as described above and T2, which is when we last did a full style flush that updates all animation/transition things.  Then we can skip doing things as needed based on how T1 and T2 compare to each other and to the last refresh driver tick time.

So now the actual mechanics of the "update the animation/transition style data" process.  Since we really only need this for transition reversals, it's enough to do this on primary frames.  The proposed algorithm is as follows:

1)  Start with the set S of throttled elements.
2)  Given an element E1 in S, walk up the parent chain and find the
    root-most ancestor E2 that's in S.  This can be implemented by
    creating a hash set from S, as a first cut.
3)  Update the animation style for E2.  More on how to do this below.
4)  Start at E2 and go down the element tree (though we may need to
    include ::before and ::after here somehow).  For each element,
    check whether it's in S.  If it isn't, just reparent its primary
    frame's style context to the new parent style context we have.  If
    it is, then update the animation style for it using the new parent
    style context and remove it from S.
5)  If S is nonempty, return to step 2.

OK, this leaves the mechanics of "update the animation style".   The way to do this, I think, is to start with the rule node of the style context and go up the rule tree recording (in an array, say) the rules corresponding to those rulenodes and which cascade level they come from.  For the rules that come from the animation and transition cascade levels, instead of using the original rule from the rulenode, we want to get an updated rule for the current refresh driver timestamp.  The result of this process should be a list of rules that apply to the element, updated to the current refresh driver timestamp, and information about which of them come from which cascade level.

Then we do something like nsStyleSet::ResolveStyleForRules except with correct treatment for the cascade levels.

A simple way to do this might be an array of structs that hold an nsCOMPtr<nsIStyleRule> and a cascade level enum, and then just doing the SetLevel calls inside the loop that calls ForwardOnPossiblyCSSRule.
dbaron:


I think it's worth a little bit of preface here, though.

Part of the work to defer updating style, layout, etc. for
animations that are being run on the compositor involves changing
what a mozFlushType is.

Before, these were just a series of enumerated values saying how far
through the process we required things to be flushed, e.g.:

=======> Content
===================> Style/Frames
==========================> InterruptibleLayout
===================================> Layout

Now we've added a boolean, so that mozFlushType is now an (enum,
boolean) pair.  The boolean says whether we need to update the data
for animations that are running on the compositor thread.

We need to make sure this boolean is false for all the things that
normally happen during a refresh driver tick (in order to make the
optimization worth anything), but any flush that's happening for a
DOM API needs to make sure the boolean is true so that API returns
correct results.

> So basically our situation is the following.  When we're doing a
> non-animation restyle, we need the old style data, _including_
> animations, to properly do things like transition reversal.
>
> So we think we should do the following.  When we decide we need to
> process restyles (whether due to a refresh tick or due to an
> explicit page flush) and the following two conditions hold:
>
> 1)  We have pending non-animation restyles.
> 2)  We have throttled animations/transitions.
>
> then before we do any restyling we need to go and update the
> animation/transition style data on all the throttled elements so
> it's current.
>
> In particular, we should probably just keep tracks of two timestamps
> on the prescontext: T1, which is the timestamp of when we last did
> an update as described above and T2, which is when we last did a
> full style flush that updates all animation/transition things.  Then
> we can skip doing things as needed based on how T1 and T2 compare to
> each other and to the last refresh driver tick time.

To be clear, these are refresh tick timestamps.  They could also be
booleans, but then we'd have to deal with updating them every
refresh driver tick (which requires being careful about what order
refresh observers are notified, etc.).


So to step back a drop, the underlying algorithm for the whole bug is
basically three pieces:

 * Deferring the work that we do need to do: when the transition and
   animation managers get their refresh driver WillRefresh notification
   they no longer post animation restyles for elements that are being
   handled on the compositor thread.  This leads to T2 no longer
   matching the current refresh driver time (if it did before).

 * Doing the work we've deferred when we need to: however, when we get a
   flush that has the boolean saying we need up-to-date animations being
   true, and we find that T2 is not the current refresh driver time,
   then we notify the animation and transition managers to explicitly
   post these notifications (and we update T2 and T1) at some point
   during PresShell::FlushPendingNotifications that happens before the
   ProcessPendingRestyles call (but not too far before)

 * Doing this special thing that does work we previously deferred but if
   we did it the normal way it would no longer give us the old values
   anymore:  then, when the 2 conditions Boris lists above are true, we
   check if T1 is not the current refresh-driver time, and if it isn't,
   do the element-only (optional restriction) animation-style-only
   (critical restriction, see below) update and update T1 to the current
   refresh driver time.

> So now the actual mechanics of the "update the animation/transition
> style data" process.  Since we really only need this for transition
> reversals, it's enough to do this on primary frames.  The proposed
> algorithm is as follows:
>
> 1)  Start with the set S of throttled elements.
> 2)  Given an element E1 in S, walk up the parent chain and find the
>     root-most ancestor E2 that's in S.  This can be implemented by
>     creating a hash set from S, as a first cut.
> 3)  Update the animation style for E2.  More on how to do this below.
> 4)  Start at E2 and go down the element tree (though we may need to
>     include ::before and ::after here somehow).  For each element,
>     check whether it's in S.  If it isn't, just reparent its primary
>     frame's style context to the new parent style context we have.  If
>     it is, then update the animation style for it using the new parent
>     style context and remove it from S.
> 5)  If S is nonempty, return to step 2.
>
> OK, this leaves the mechanics of "update the animation style".   The
> way to do this, I think, is to start with the rule node of the style
> context and go up the rule tree recording (in an array, say) the
> rules corresponding to those rulenodes and which cascade level they
> come from.  For the rules that come from the animation and
> transition cascade levels, instead of using the original rule from
> the rulenode, we want to get an updated rule for the current refresh
> driver timestamp.  The result of this process should be a list of
> rules that apply to the element, updated to the current refresh
> driver timestamp, and information about which of them come from
> which cascade level.

This probably requires a way of asking the animation manager and
transition manager for such a rule.  (If we ever implemented compositor
animations for SMIL animation, we'd need to do the same for it.)

> Then we do something like nsStyleSet::ResolveStyleForRules except
> with correct treatment for the cascade levels.

I *think* one of the existing callers of ResolveStyleForRules does
something somewhat similar (replacing a particular rule); that may be
useful.  (I'd look now if I didn't want to get to dinner really soon.)
Then again, now that you mention it, I'm not sure it does the SetLevel
stuff right.
dbaron:

> So to step back a drop, the underlying algorithm for the whole bug is
> basically three pieces:
>
>  * Deferring the work that we do need to do: when the transition and

Er, "do need to do" -> "don't need to do".

>    animation managers get their refresh driver WillRefresh notification
>    they no longer post animation restyles for elements that are being
>    handled on the compositor thread.  This leads to T2 no longer
>    matching the current refresh driver time (if it did before).
>
>  * Doing the work we've deferred when we need to: however, when we get a
>    flush that has the boolean saying we need up-to-date animations being
>    true, and we find that T2 is not the current refresh driver time,
>    then we notify the animation and transition managers to explicitly
>    post these notifications (and we update T2 and T1) at some point
>    during PresShell::FlushPendingNotifications that happens before the
>    ProcessPendingRestyles call (but not too far before)
>
>  * Doing this special thing that does work we previously deferred but if
>    we did it the normal way it would no longer give us the old values
>    anymore:  then, when the 2 conditions Boris lists above are true, we
>    check if T1 is not the current refresh-driver time, and if it isn't,
>    do the element-only (optional restriction) animation-style-only
>    (critical restriction, see below) update and update T1 to the current
>    refresh driver time.

And, to be clear, we might need to do the second and third pieces as
part of the same flush, and when that happens, we need to both, and
not have the T2 update from the second piece cause the third piece
not to happen.  This probably means the third piece should happen
outside of ProcessPendingRestyles, in
PresShell::FlushPendingNotifications, even though it might be
tempting to put it inside ProcessPendingRestyles.  (But what about
that pesky ProcessPendingRestyles call in
PresShell::ResizeReflowIgnoreOverride?  Should we just remove it?
The one in PresShell::Initialize clearly isn't a problem, though.)
(In reply to Nick Cameron [:nrc] from comment #100)
> dbaron:

> And, to be clear, we might need to do the second and third pieces as
> part of the same flush, and when that happens, we need to both, and
> not have the T2 update from the second piece cause the third piece
> not to happen.  This probably means the third piece should happen
> outside of ProcessPendingRestyles, in
> PresShell::FlushPendingNotifications, even though it might be
> tempting to put it inside ProcessPendingRestyles.  (But what about
> that pesky ProcessPendingRestyles call in
> PresShell::ResizeReflowIgnoreOverride?  Should we just remove it?
> The one in PresShell::Initialize clearly isn't a problem, though.)

Do you mean T1 update here?
(In reply to Nick Cameron [:nrc] from comment #98)
> posting an email thread here for posterity:
> 
> bz:
> 
> In particular, we should probably just keep tracks of two timestamps on the
> prescontext: T1, which is the timestamp of when we last did an update as
> described above and T2, which is when we last did a full style flush that
> updates all animation/transition things.  Then we can skip doing things as
> needed based on how T1 and T2 compare to each other and to the last refresh
> driver tick time.
>

I'm not sure about these timestamps, looking at dbaron's elaboration, it looks like we never need T2. I see that we would want to compare T1 to now, but I don't see why we would compare T1 to T2 or T2 to now.
 
> So now the actual mechanics of the "update the animation/transition style
> data" process.  Since we really only need this for transition reversals,
> it's enough to do this on primary frames.  The proposed algorithm is as
> follows:
> 
> 1)  Start with the set S of throttled elements.
> 2)  Given an element E1 in S, walk up the parent chain and find the
>     root-most ancestor E2 that's in S.  This can be implemented by
>     creating a hash set from S, as a first cut.

By parent chain, do you mean parent of the element, primary frame, or style context? If style context, how do we go from style context back to its element to find E2? Is this what the hash set is for? If so, what about elements which do not have transitions/animations?
> but I don't see why we would compare T1 to T2 or T2 to now.

We need to compare T2 to the last refresh tick to find out whether we should be doing our miniflush stuff, no?

Or does that only happen from refresh ticks, so we never have to worry about T2 being new enough that we can skip work?

> By parent chain, do you mean parent of the element, primary frame, or style context?

Of the element.
OK, making some progress with this, I'm looking at the re-resolve part for now, I'll come back to the T1/T2 bit next.

One more question:

> Then we do something like nsStyleSet::ResolveStyleForRules except with correct treatment > for the cascade levels.

What does "correct treatment' mean here? I see that this happens:

  ruleWalker.SetLevel(nsStyleSet::eAnimationSheet, false, false);

and I assume we need a different level and/or more stuff, but I have no idea what is going on, so not sure what to do.
Nick, each rule has a cascade level associated with it.  When walking rules, what we do is call SetLevel() on the rulewalker, then call Forward() on the rules in that level.

In your case, you already have a list of rules+levels that led to the existing style context.  Or more precisely, the style context has a GetRuleNode() that hands back an nsRuleNode*.  And each rulenode that's not the root of the rulenode tree has a rule (GetRule()) and a level (GetLevel()).

ResolveStyleForRules() just takes an array of rules and pretends like they're all in the UA level.  What you want to do instead is take an array of rules+levels and just call SetLevel() with the right thing before each Forward() call.  The first argument should be the same as the level the rule used to have (which you get off the rulenode).  The second and third arguments should just be "false".
Attached patch very rough WIP patch (obsolete) — Splinter Review
This patch is extremely rough and is full of TODOs and there is stuff missing (I've only done transitions, not animations), but I want to check I'm doing at least close to the right thing. It does work (or look like its working), transitions are starting properly from the right places and it's much smoother than the un-throttled version.

It sit on top of the other patches above, but I hope you shouldn't have to look at them to work out what it is going on.
Assignee: dzbarsky → ncameron
Attachment #676511 - Flags: feedback?(bzbarsky)
You'll probably want to pass the link booleans from the original style context to GetContext(), and if we allow off-main-thread color animation you'll want to pass in the element as the last argument.  And in general, it's probably best to push most of the code in UpdateStyle after we've built up the array into the nsStyleSet (which for example has direct access to the root rulenode).

The element-walking in FlushThrottled doesn't seem quite right.  In particular I don't understand what it's doing in the "reparent" block.  Seems like it will only do that on ancestors of things which have throttled transitions, which is not what we want.  What we want to do is that every time we update the style on some element because it had non-current animations/transitions we reparent (using nsStyleSet::ReparentStyleContext) the style contexts for all its descendants to the right things.

But yeah, this is definitely looking like the right general direction modulo all the TODOs.  ;)
Attachment #676511 - Flags: feedback?(bzbarsky) → feedback+
Thanks for the feedback, the rest is clear, but this bit I am not sure about. 

(In reply to Boris Zbarsky (:bz) from comment #107)
> What we want to do is that every
> time we update the style on some element because it had non-current
> animations/transitions we reparent (using nsStyleSet::ReparentStyleContext)
> the style contexts for all its descendants to the right things.

As I understood from your original comments we start with an element which we want to make up to date (i.e., it had non-current transitions or animations). We then find all its ancestors and make them all up to date. For each ancestor, if it is a transitioning element we do our update thing. If it is a non-transitioning element we just need to reparent; we don't need to reparent if no ancestors of the current ancestor changed, i.e., no ancestors were transitioning. Is that correct?

Are you saying we should 'update' the descendants of that first element as well as its ancestors or instead of its ancestors? If none of the descendants are transitioning, then does it make a difference? And if there is a transitioning element, it will get 'updated' by the outer loop.
The part this is implementing is step 4 of the second list in comment 98.  So yes, we need to update style on the descendants.

Basically, if you ignore performance, the thing that should give us correct behavior is to just go through all our transitioning elements, update the style for each one, and update its descendants.  This has a failure mode where we do work multiple times if an element E1 is transitioning and has an ancestor E2 that's also transitioning: we'll update the style on the descendants of E1 twice if we process E1 and then E2.  Hence the 5-step thing in comment 98.

I guess the problem is that "down the element tree" is ambiguous.  In my head the root of the element tree is "at the top" so down means towards the leaves.  Sorry about the lack of clarity there.  :(

> Are you saying we should 'update' the descendants of that first element as well as its
> ancestors or instead of its ancestors?

Instead of.

> If none of the descendants are transitioning, then does it make a difference?

Yes. Say you have an element which has "opacity" transitioning on it.  And say it has a child which has "opacity: inherit".  That means the opacity is effectively _also_ transitioning on the child, even though there is no transition running directly on the child.   So if we need to know our style-including-transitions then we need to update the child's style too, I believe.
Attached patch working, but still WIP patch (obsolete) — Splinter Review
This works but it shouldn't, see nsTransitionManager.cpp:351. From here and via a few calls, we are passing nullptr as the parent context to nsStyleSet::GetContext. According to bz, we should be passing parent->GetPrimaryFrame()->GetStyleContext()->GetParent(), as I understand it, we should only pass nullptr if there is no style parent, e.g., for the body element.

BUT, passing nullptr works (we get buttery smooth transitions), and passing the actual parent style does not work (we get the same awful stuttery transitions as if we were doing no style update at all). I have not been able to debug this very successfully because I don't understand the style system code so well. But when passing in the 'correct' we are getting back a new style rule, not the old one.

dbaron: do you have any insight as to what is happening here? Thanks!
Attachment #676511 - Attachment is obsolete: true
Attachment #678612 - Flags: feedback?(dbaron)
Attached file Testcase
Chris,

This is simplified test case from the work of bug 803447. I can confirm that the animation of this testcase is correct on Nightly but incorrect on B2G Desktop. The step2 transition appears jumpy. Is this this bug?
Donovan suggested this might help the h/w volume control lag I see playing Cut the Rope.

/be
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) from comment #111)
> Created attachment 678684 [details]
> Testcase
> 
> Chris,
> 
> This is simplified test case from the work of bug 803447. I can confirm that
> the animation of this testcase is correct on Nightly but incorrect on B2G
> Desktop. The step2 transition appears jumpy. Is this this bug?

Yes, it is this bug. The WIP patches fix this, except not really for two reasons: the second transition is on both a translation and a scale and this gets turned into an interpolatematrix, which the style system then chokes on, I hope this just needs another case in a switch statement, but I'm not sure what needs to go in there. The second problem is that the patch only seems to work when the kind of transition is 'all' rather than 'transform' I have no idea why that is, but it shouldn't be too bad to fix (given it took me all afternoon to even find this out, that might be optimistic, but hey). Thanks for the test case!
(In reply to Nick Cameron [:nrc] from comment #113)
> (In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) from comment #111)
> > Created attachment 678684 [details]
> > Testcase
> > 
> > Chris,
> > 
> > This is simplified test case from the work of bug 803447. I can confirm that
> > the animation of this testcase is correct on Nightly but incorrect on B2G
> > Desktop. The step2 transition appears jumpy. Is this this bug?
> 
> Yes, it is this bug. The WIP patches fix this, except not really for two
> reasons: the second transition is on both a translation and a scale and this
> gets turned into an interpolatematrix, which the style system then chokes
> on, I hope this just needs another case in a switch statement, but I'm not
> sure what needs to go in there. The second problem is that the patch only
> seems to work when the kind of transition is 'all' rather than 'transform' I
> have no idea why that is, but it shouldn't be too bad to fix (given it took
> me all afternoon to even find this out, that might be optimistic, but hey).
> Thanks for the test case!

I fixed the first of these problems, a little stumped on the second one though, will try some more tomorrow.
Attached patch rollup WIP patch (obsolete) — Splinter Review
That patch crashes on the gaia lockscreen "arrow" animation.  Will grab a bt in a minute.
In an opt build, I get a crash on

"Trying to allocate an infallible array that's too big"

In debug, crash at

Program received signal SIGSEGV, Segmentation fault.
0x40bc1fe4 in nsINode::GetBoolFlag (this=0x5a5a5a5a, name=nsINode::IsInDocument) at ../../dist/include/nsINode.h:1323
(gdb) bt
#0  0x40bc1fe4 in nsINode::GetBoolFlag (this=0x5a5a5a5a, name=nsINode::IsInDocument) at ../../dist/include/nsINode.h:1323
#1  0x40bc1fca in nsINode::IsInDoc (this=0x5a5a5a5a) at ../../dist/include/nsINode.h:482
#2  0x40bc2090 in nsIContent::GetPrimaryFrame (this=0x5a5a5a5a) at ../../dist/include/nsIContent.h:795
#3  0x40e6d810 in ElementTransitions::CanPerformOnCompositorThread (this=0x4bd09a60, aFlags=mozilla::css::CommonElementAnimationData::Flags_None) at /home/cjones/mozilla/inbound/layout/style/nsTransitionManager.cpp:139
#4  0x40e6f71a in FlushTransition (aEntry=0x49074f38, aUserArg=0xbe898350) at /home/cjones/mozilla/inbound/layout/style/nsTransitionManager.cpp:988
#5  0x40dc3ae2 in nsTHashtable<mozilla::css::CommonAnimationManager::DataEntry>::s_EnumStub (table=0x4911c854, entry=0x49074f38, number=0, arg=0xbe898318) at ../../dist/include/nsTHashtable.h:486
#6  0x41f3f2e4 in PL_DHashTableEnumerate (table=0x4911c854, etor=0x40dc3ac5 <nsTHashtable<mozilla::css::CommonAnimationManager::DataEntry>::s_EnumStub(PLDHashTable*, PLDHashEntryHdr*, uint32_t, void*)>, arg=0xbe898318) at /home/cjones/mozilla/new-b2g/objdir-gecko/xpcom/build/pldhash.cpp:717
#7  0x40dc2d48 in nsTHashtable<mozilla::css::CommonAnimationManager::DataEntry>::EnumerateEntries (this=0x4911c854, enumFunc=0x40e6f6ed <FlushTransition(mozilla::css::CommonAnimationManager::DataEntry*, void*)>, userArg=0xbe898350) at ../../dist/include/nsTHashtable.h:237
#8  0x40e6fae4 in nsTransitionManager::FlushTransitions (this=0x4911c840, aFlags=mozilla::css::CommonAnimationManager::Can_Throttle) at /home/cjones/mozilla/inbound/layout/style/nsTransitionManager.cpp:1084
#9  0x40e6f5c4 in nsTransitionManager::WillRefresh (this=0x4911c840, aTime=...) at /home/cjones/mozilla/inbound/layout/style/nsTransitionManager.cpp:938
#10 0x40cbdf04 in nsRefreshDriver::Notify (this=0x494896b0, aTimer=0x4bd0aa10) at /home/cjones/mozilla/inbound/layout/base/nsRefreshDriver.cpp:344
The pointer to freed memory comes from

PLDHashOperator
FlushTransition(CommonAnimationManager::DataEntry* aEntry, void* aUserArg)
{
  ElementTransitions *et = static_cast<ElementTransitions*>(aEntry->mKey->mData);
  FlushTransitionArgs* args = static_cast<FlushTransitionArgs*>(aUserArg);

  bool canThrottleTransition = et->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
                               CanThrottleTransition(et, args->mNow);

|et| --> 0x5a5a
Attached patch rollup WIP patch (obsolete) — Splinter Review
Attachment #679913 - Attachment is obsolete: true
(In reply to Chris Jones [:cjones] [:warhammer] from comment #118)
> The pointer to freed memory comes from
> 
> PLDHashOperator
> FlushTransition(CommonAnimationManager::DataEntry* aEntry, void* aUserArg)
> {
>   ElementTransitions *et =
> static_cast<ElementTransitions*>(aEntry->mKey->mData);
>   FlushTransitionArgs* args = static_cast<FlushTransitionArgs*>(aUserArg);
> 
>   bool canThrottleTransition =
> et->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
>                                CanThrottleTransition(et, args->mNow);
> 
> |et| --> 0x5a5a

Thanks for the heads up, the updated patch shouldn't crash, but that is just hiding the problem, I don't have a mac at the moment, so I can't look at this right now (en route to Vancouver), but will check it out later. This really shouldn't be happening.
Here's what valgrind says

==13498== Invalid read of size 8
==13498==    at 0x898E93F: ElementTransitions::CanPerformOnCompositorThread(mozilla::css::CommonElementAnimationData::Flags) const (nsTransitionManager.cpp:139)
==13498==    by 0x8990EF1: FlushTransition(mozilla::css::CommonAnimationManager::DataEntry*, void*) (nsTransitionManager.cpp:987)
==13498==    by 0x88C1AC1: nsTHashtable<mozilla::css::CommonAnimationManager::DataEntry>::s_EnumStub(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*) (nsTHashtable.h:486)
==13498==    by 0x9F21C0A: PL_DHashTableEnumerate (pldhash.cpp:717)
==13498==    by 0x88C095A: nsTHashtable<mozilla::css::CommonAnimationManager::DataEntry>::EnumerateEntries(PLDHashOperator (*)(mozilla::css::CommonAnimationManager::DataEntry*, void*), void*) (nsTHashtable.h:237)
==13498==    by 0x8991359: nsTransitionManager::FlushTransitions(mozilla::css::CommonAnimationManager::FlushFlags) (nsTransitionManager.cpp:1084)
==13498==    by 0x8990D80: nsTransitionManager::WillRefresh(mozilla::TimeStamp) (nsTransitionManager.cpp:938)
==13498==    by 0x8779CD0: nsRefreshDriver::Notify(nsITimer*) (nsRefreshDriver.cpp:344)
==13498==    by 0x9F99E87: nsTimerImpl::Fire() (nsTimerImpl.cpp:485)
==13498==    by 0x9F9A26B: nsTimerEvent::Run() (nsTimerImpl.cpp:565)
==13498==    by 0x9F92046: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:627)
==13498==    by 0x9F1F6D9: NS_ProcessNextEvent_P(nsIThread*, bool) (nsThreadUtils.cpp:221)
==13498==    by 0x9BB27BD: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (MessagePump.cpp:82)
==13498==    by 0x9FE7DD0: MessageLoop::RunInternal() (message_loop.cc:215)
==13498==    by 0x9FE7D61: MessageLoop::RunHandler() (message_loop.cc:208)
==13498==    by 0x9FE7D3A: MessageLoop::Run() (message_loop.cc:182)
==13498==    by 0x9A26417: nsBaseAppShell::Run() (nsBaseAppShell.cpp:163)
==13498==    by 0x976B22F: nsAppStartup::Run() (nsAppStartup.cpp:290)
==13498==    by 0x82CB213: XREMain::XRE_mainRun() (nsAppRunner.cpp:3822)
==13498==    by 0x82CB4F2: XREMain::XRE_main(int, char**, nsXREAppData const*) (nsAppRunner.cpp:3889)
==13498==    by 0x82CB743: XRE_main (nsAppRunner.cpp:4083)
==13498==    by 0x402099: do_main(int, char**) (nsBrowserApp.cpp:154)
==13498==    by 0x402322: main (nsBrowserApp.cpp:239)
==13498==  Address 0x1fe5ce40 is 0 bytes inside a block of size 64 free'd
==13498==    at 0x4C2A4BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13498==    by 0x898E6AA: ElementTransitionsPropertyDtor(void*, nsIAtom*, void*, void*) (nsTransitionManager.cpp:90)
==13498==    by 0x8B4CFE2: nsPropertyTable::PropertyList::DeletePropertyFor(nsPropertyOwner) (nsPropertyTable.cpp:337)
==13498==    by 0x8B4CD5C: nsPropertyTable::DeleteProperty(nsPropertyOwner, nsIAtom*) (nsPropertyTable.cpp:260)
==13498==    by 0x8B216EE: nsINode::DeleteProperty(unsigned short, nsIAtom*) (nsINode.cpp:174)
==13498==    by 0x86A676D: nsINode::DeleteProperty(nsIAtom*) (nsINode.h:675)
==13498==    by 0x8B15BF6: nsGenericElement::UnbindFromTree(bool, bool) (nsGenericElement.cpp:1527)
==13498==    by 0x8CD1766: nsGenericHTMLElement::UnbindFromTree(bool, bool) (nsGenericHTMLElement.cpp:1781)
==13498==    by 0x8B25448: nsINode::doRemoveChildAt(unsigned int, bool, nsIContent*, nsAttrAndChildArray&) (nsINode.cpp:1371)
==13498==    by 0x8B9F416: mozilla::dom::FragmentOrElement::RemoveChildAt(unsigned int, bool) (FragmentOrElement.cpp:888)
==13498==    by 0x8B222B8: nsINode::RemoveChild(nsINode&, mozilla::ErrorResult&) (nsINode.cpp:446)
==13498==    by 0x95C3FDE: nsIDOMNode_RemoveChild(JSContext*, unsigned int, JS::Value*) (dom_quickstubs.cpp:3440)
==13498==    by 0xABFF299: js::CallJSNative(JSContext*, int (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (jscntxtinlines.h:364)
==13498==    by 0xAC0875A: js::InvokeKernel(JSContext*, JS::CallArgs, js::MaybeConstruct) (jsinterp.cpp:369)
==13498==    by 0xAC10026: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2327)
==13498==    by 0xAC08268: js::RunScript(JSContext*, JS::Handle<JSScript*>, js::StackFrame*) (jsinterp.cpp:326)
==13498==    by 0xAC08879: js::InvokeKernel(JSContext*, JS::CallArgs, js::MaybeConstruct) (jsinterp.cpp:381)
==13498==    by 0xAB40099: js::Invoke(JSContext*, js::InvokeArgsGuard&, js::MaybeConstruct) (jsinterp.h:109)
==13498==    by 0xAC08B42: js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value*, JS::Value*) (jsinterp.cpp:414)
==13498==    by 0xAB2E6A2: JS_CallFunctionValue(JSContext*, JSObject*, JS::Value, unsigned int, JS::Value*, JS::Value*) (jsapi.cpp:5790)
==13498==    by 0x9590498: nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) (XPCWrappedJSClass.cpp:1420)
==13498==    by 0x958587D: nsXPCWrappedJS::CallMethod(unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) (XPCWrappedJS.cpp:580)
==13498==    by 0x9FBE3AC: PrepareAndDispatch (xptcstubs_x86_64_linux.cpp:121)
==13498==    by 0x9FBD576: SharedStub (in /home/cjones/mozilla/igaia-dbg/toolkit/library/libxul.so)
==13498==    by 0x8C6D501: nsEventListenerManager::HandleEventSubType(nsListenerStruct*, nsIDOMEventListener*, nsIDOMEvent*, nsIDOMEventTarget*, unsigned int, nsCxPusher*) (nsEventListenerManager.cpp:882)
==13498==    by 0x8C6D7B7: nsEventListenerManager::HandleEventInternal(nsPresContext*, nsEvent*, nsIDOMEvent**, nsIDOMEventTarget*, unsigned int, nsEventStatus*, nsCxPusher*) (nsEventListenerManager.cpp:955)
==13498==    by 0x8C9F089: nsEventListenerManager::HandleEvent(nsPresContext*, nsEvent*, nsIDOMEvent**, nsIDOMEventTarget*, unsigned int, nsEventStatus*, nsCxPusher*) (nsEventListenerManager.h:153)
==13498==    by 0x8C9F5B9: nsEventTargetChainItem::HandleEvent(nsEventChainPostVisitor&, unsigned int, bool, nsCxPusher*) (nsEventDispatcher.cpp:184)
==13498==    by 0x8C9FB0B: nsEventTargetChainItem::HandleEventTargetChain(nsEventChainPostVisitor&, unsigned int, nsDispatchingCallback*, bool, nsCxPusher*) (nsEventDispatcher.cpp:316)
==13498==    by 0x8CA0C01: nsEventDispatcher::Dispatch(nsISupports*, nsPresContext*, nsEvent*, nsIDOMEvent*, nsEventStatus*, nsDispatchingCallback*, nsCOMArray<nsIDOMEventTarget>*) (nsEventDispatcher.cpp:634)
(In reply to Nick Cameron [:nrc] from comment #120)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #118)
> > The pointer to freed memory comes from
> > 
> > PLDHashOperator
> > FlushTransition(CommonAnimationManager::DataEntry* aEntry, void* aUserArg)
> > {
> >   ElementTransitions *et =
> > static_cast<ElementTransitions*>(aEntry->mKey->mData);
> >   FlushTransitionArgs* args = static_cast<FlushTransitionArgs*>(aUserArg);
> > 
> >   bool canThrottleTransition =
> > et->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
> >                                CanThrottleTransition(et, args->mNow);
> > 
> > |et| --> 0x5a5a
> 
> Thanks for the heads up, the updated patch shouldn't crash, but that is just
> hiding the problem, I don't have a mac at the moment, so I can't look at
> this right now (en route to Vancouver), but will check it out later. This
> really shouldn't be happening.

I see the same crash with the updated patch.
> ==13498==    by 0x9F1F6D9: NS_ProcessNextEvent_P(nsIThread*, bool)
> (nsThreadUtils.cpp:221)
> ==13498==    by 0x9BB27BD:
> mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)
> (MessagePump.cpp:82)
> ==13498==    by 0x9FE7DD0: MessageLoop::RunInternal() (message_loop.cc:215)
> ==13498==    by 0x9FE7D61: MessageLoop::RunHandler() (message_loop.cc:208)
> ==13498==    by 0x9FE7D3A: MessageLoop::Run() (message_loop.cc:182)
> ==13498==    by 0x9A26417: nsBaseAppShell::Run() (nsBaseAppShell.cpp:163)
> ==13498==    by 0x976B22F: nsAppStartup::Run() (nsAppStartup.cpp:290)
> ==13498==    by 0x82CB213: XREMain::XRE_mainRun() (nsAppRunner.cpp:3822)
> ==13498==    by 0x82CB4F2: XREMain::XRE_main(int, char**, nsXREAppData
> const*) (nsAppRunner.cpp:3889)
> ==13498==    by 0x82CB743: XRE_main (nsAppRunner.cpp:4083)
> ==13498==    by 0x402099: do_main(int, char**) (nsBrowserApp.cpp:154)
> ==13498==    by 0x402322: main (nsBrowserApp.cpp:239)
> ==13498==  Address 0x1fe5ce40 is 0 bytes inside a block of size 64 free'd
> ==13498==    at 0x4C2A4BC: operator delete(void*) (in
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==13498==    by 0x898E6AA: ElementTransitionsPropertyDtor(void*, nsIAtom*,
> void*, void*) (nsTransitionManager.cpp:90)
> ==13498==    by 0x8B4CFE2:
> nsPropertyTable::PropertyList::DeletePropertyFor(nsPropertyOwner)
> (nsPropertyTable.cpp:337)
> ==13498==    by 0x8B4CD5C: nsPropertyTable::DeleteProperty(nsPropertyOwner,
> nsIAtom*) (nsPropertyTable.cpp:260)
> ==13498==    by 0x8B216EE: nsINode::DeleteProperty(unsigned short, nsIAtom*)
> (nsINode.cpp:174)
> ==13498==    by 0x86A676D: nsINode::DeleteProperty(nsIAtom*) (nsINode.h:675)
> ==13498==    by 0x8B15BF6: nsGenericElement::UnbindFromTree(bool, bool)
> (nsGenericElement.cpp:1527)
> ==13498==    by 0x8CD1766: nsGenericHTMLElement::UnbindFromTree(bool, bool)
> (nsGenericHTMLElement.cpp:1781)

This is really bad. We must not spin event loop at unsafe times.
(In reply to Olli Pettay [:smaug] from comment #123)
> This is really bad. We must not spin event loop at unsafe times.

Did you miss the "Address 0x1fe5ce40 is 0 bytes inside a block of size 64 free'd" line in the middle of what you quoted?
Apparently so. sorry.
But I still wonder why we're calling nsINode::DeleteProperty inside UnbindFromTree.
DeleteProperty isn't too fast.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #122)
> (In reply to Nick Cameron [:nrc] from comment #120)
> > (In reply to Chris Jones [:cjones] [:warhammer] from comment #118)
> > > The pointer to freed memory comes from
> > > 
> > > PLDHashOperator
> > > FlushTransition(CommonAnimationManager::DataEntry* aEntry, void* aUserArg)
> > > {
> > >   ElementTransitions *et =
> > > static_cast<ElementTransitions*>(aEntry->mKey->mData);
> > >   FlushTransitionArgs* args = static_cast<FlushTransitionArgs*>(aUserArg);
> > > 
> > >   bool canThrottleTransition =
> > > et->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
> > >                                CanThrottleTransition(et, args->mNow);
> > > 
> > > |et| --> 0x5a5a
> > 
> > Thanks for the heads up, the updated patch shouldn't crash, but that is just
> > hiding the problem, I don't have a mac at the moment, so I can't look at
> > this right now (en route to Vancouver), but will check it out later. This
> > really shouldn't be happening.
> 
> I see the same crash with the updated patch.

Sorry, I totally misread your posts (I blame trying to read bugmail on my phone at the airport). I can't see a fix from the code, but hopefully I'll be able to fix this when I have a mac to play on next week.
Attached patch rollup WIP patch (obsolete) — Splinter Review
Attachment #679973 - Attachment is obsolete: true
Attached file simple test case
(In reply to Nick Cameron [:nrc] from comment #128)
> Created attachment 680893 [details] [diff] [review]
> rollup WIP patch

This patch still crashes very quickly on loading the gaia lockscreen.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #130)
> (In reply to Nick Cameron [:nrc] from comment #128)
> > Created attachment 680893 [details] [diff] [review]
> > rollup WIP patch
> 
> This patch still crashes very quickly on loading the gaia lockscreen.

Sorry cjones, I did not mean for this one to be tested, I haven't addressed the crash yet. Just wip with bz and dbaron. There will probably be a few more flying around too, I'll let you know when I think I've fixed the crash.
Attachment #680893 - Attachment is obsolete: true
How is this coming along?

Now that we've passed the aurora uplift, the bar to landing this will be higher.  If dbaron/bz/roc think this is too risky, we're going to need to start finding other turnips to squeeze startup win blood out of.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #133)
> How is this coming along?
> 
> Now that we've passed the aurora uplift, the bar to landing this will be
> higher.  If dbaron/bz/roc think this is too risky, we're going to need to
> start finding other turnips to squeeze startup win blood out of.

We made a lot of progress at the work week in Vancouver (we thought we had a working solution), but I've hit another bug, not sure how serious it is, I'll know better in an hour or so, I think. I'll keep you posted.
Hmm, so the problem (we're only throttling the first transition in a series) seems non-trivial, I'm afraid I don't have much more in terms of time estimation. I'll carry on tomorrow and post back when I have a better idea what is going on.
I'm hopeful we can come up with a final patch that's either clearly not going to affect non-OMTC builds, or else #ifdefed out for non-B2G on beta.
Attached patch rollup patch for testing (obsolete) — Splinter Review
I think this works, would be grateful if people could test on b2g and if bz could try on his mac and see if we get that weird incorrect offset bug that I can't reproduce.
Attachment #678612 - Attachment is obsolete: true
Attachment #681097 - Attachment is obsolete: true
Attachment #678612 - Flags: feedback?(dbaron)
Attachment #672136 - Attachment is obsolete: true
Attachment #683877 - Flags: review?(bzbarsky)
Attachment #672137 - Attachment is obsolete: true
Attachment #683878 - Flags: review?(bzbarsky)
Attachment #672148 - Attachment is obsolete: true
Attachment #683879 - Flags: review?(bzbarsky)
Attachment #683880 - Flags: review?(bzbarsky)
Attachment #683881 - Flags: review?(bzbarsky)
I imagine these patches will need a bit of work still, but hopefully they do the right thing and address all the points we've discussed in Vancouver etc. I'm afraid prefixes don't compile, you need to apply parts 4-7 at once, but the only real alternative is one mega-patch, which I thought is better avoided.

I need to work out how to land this without affecting other code, but it should definitely be possible.
Attachment #683882 - Flags: review?(bzbarsky)
Attachment #683877 - Flags: review?(dbaron)
Attachment #683878 - Flags: review?(dbaron)
Attachment #683879 - Flags: review?(dbaron)
Attachment #683880 - Flags: review?(dbaron)
Attachment #683881 - Flags: review?(dbaron)
Attachment #683882 - Flags: review?(dbaron)
(In reply to Nick Cameron [:nrc] from comment #143)
> I imagine these patches will need a bit of work still, but hopefully they do
> the right thing and address all the points we've discussed in Vancouver etc.

What additional work do you expect them to need?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #144)
> (In reply to Nick Cameron [:nrc] from comment #143)
> > I imagine these patches will need a bit of work still, but hopefully they do
> > the right thing and address all the points we've discussed in Vancouver etc.
> 
> What additional work do you expect them to need?

Sorry, I didn't mean that I thought there was anything more to do, but that because it is a large bit of work and I don't understand the related work that well, that I would expect the reviewers to find more than the average number of problems.
(In reply to Nick Cameron [:nrc] from comment #137)
> Created attachment 683876 [details] [diff] [review]
> rollup patch for testing
> 
> I think this works, would be grateful if people could test on b2g and if bz
> could try on his mac and see if we get that weird incorrect offset bug that
> I can't reproduce.

This rollup patch still crashes on the b2g lock screen, within about a second of it being shown.  Let me double check that it reproduces in desktop builds too (it should).
Yes, crashes in desktop.

  aLayer->SetAnimationFlushCount(et->mFlushCount);

(gdb) f 5
#5  0x00007fd2822dea92 in AddAnimationsAndTransitionsToLayer (aLayer=0x40ea2e0, aBuilder=0x7fff5f479dd0, aItem=0x2ba9a50, aProperty=eCSSProperty_transform) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:431
(gdb) p et
$3 = (ElementTransitions *) 0x0

#5  0x00007fd2822dea92 in AddAnimationsAndTransitionsToLayer (aLayer=0x40ea2e0, aBuilder=0x7fff5f479dd0, aItem=0x2ba9a50, aProperty=eCSSProperty_transform) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:431
#6  0x00007fd2822eb617 in nsDisplayTransform::BuildLayer (this=0x2ba9a50, aBuilder=0x7fff5f479dd0, aManager=0x2e0bd20, aContainerParameters=...) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:3875
#7  0x00007fd2822770d8 in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4790e0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2083
#8  0x00007fd282276cea in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4790e0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2013
#9  0x00007fd282276cea in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4790e0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2013
#10 0x00007fd282276cea in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4790e0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2013
#11 0x00007fd28227a18a in mozilla::FrameLayerBuilder::BuildContainerLayerFor (this=0x33fd9b0, aBuilder=0x7fff5f479dd0, aManager=0x2e0bd20, aContainerFrame=0x35d5098, aContainerItem=0x40a1690, aChildren=..., aParameters=..., aTransform=0x0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2869
#12 0x00007fd2822e797d in nsDisplayOwnLayer::BuildLayer (this=0x40a1690, aBuilder=0x7fff5f479dd0, aManager=0x2e0bd20, aContainerParameters=...) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:2799
#13 0x00007fd2822770d8 in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4797b0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2083
#14 0x00007fd282276cea in mozilla::(anonymous namespace)::ContainerState::ProcessDisplayItems (this=0x7fff5f4797b0, aList=..., aClip=..., aFlags=0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2013
#15 0x00007fd28227a18a in mozilla::FrameLayerBuilder::BuildContainerLayerFor (this=0x33fd9b0, aBuilder=0x7fff5f479dd0, aManager=0x2e0bd20, aContainerFrame=0x2b8c9d0, aContainerItem=0x0, aChildren=..., aParameters=..., aTransform=0x0) at /home/cjones/mozilla/inbound/layout/base/FrameLayerBuilder.cpp:2869
#16 0x00007fd2822e161d in nsDisplayList::PaintForFrame (this=0x7fff5f479c40, aBuilder=0x7fff5f479dd0, aCtx=0x0, aForFrame=0x2b8c9d0, aFlags=13) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:1084
#17 0x00007fd2822e11ae in nsDisplayList::PaintRoot (this=0x7fff5f479c40, aBuilder=0x7fff5f479dd0, aCtx=0x0, aFlags=13) at /home/cjones/mozilla/inbound/layout/base/nsDisplayList.cpp:1004
#18 0x00007fd28231bbed in nsLayoutUtils::PaintFrame (aRenderingContext=0x0, aFrame=0x2b8c9d0, aDirtyRegion=..., aBackstop=4294967295, aFlags=772) at /home/cjones/mozilla/inbound/layout/base/nsLayoutUtils.cpp:1861
#19 0x00007fd282349608 in PresShell::Paint (this=0x2b87b00, aViewToPaint=0x2b87780, aDirtyRegion=..., aFlags=129) at /home/cjones/mozilla/inbound/layout/base/nsPresShell.cpp:5356
#20 0x00007fd282aea476 in nsViewManager::ProcessPendingUpdatesForView (this=0x2b87730, aView=0x2b87780, aFlushDirtyRegion=true) at /home/cjones/mozilla/inbound/view/src/nsViewManager.cpp:435
#21 0x00007fd282aeca5f in nsViewManager::ProcessPendingUpdates (this=0x2b87730) at /home/cjones/mozilla/inbound/view/src/nsViewManager.cpp:1208
#22 0x00007fd28235dce5 in nsRefreshDriver::Notify (this=0x2b7bce0, aTimer=0x2876870) at /home/cjones/mozilla/inbound/layout/base/nsRefreshDriver.cpp:433
(In reply to Chris Jones [:cjones] [:warhammer] from comment #147)
> Yes, crashes in desktop.
> 
>   aLayer->SetAnimationFlushCount(et->mFlushCount);

Moving this into the |if (et) {| block above fixes the crash.  Not sure if that's correct.

With that fix I don't seem to see the app-open transition being eliminated from the main thread.  The layer txn counter still stays around ~40-60tps during the transition.  Is there any logging that will let me know if we've had to unexpectedly flush during an async transition/animation?

(Not unexpectedly given above, I don't measure any difference in startup time with the rollup patch.)
Attached patch rollup patch (obsolete) — Splinter Review
bug fix for the rollup patch, if it is animations and not transitions that are being incorrectly throttled then this may make a difference. I'll fix the individual patch with this a bit later.
Attachment #683876 - Attachment is obsolete: true
Attached patch basic logging for transitions (obsolete) — Splinter Review
This is some very basic printf logging for transitions (so if the problem is with animations it won't help, sorry). If you see lots of 'non-throttled flush's then we are not throttling enough. If those lines have 'could throttle: yes' then there is a bug in my patches. If they have 'could throttle: no' then we are getting slowed down by flushes requested by (potential) DOM changes. You should only see a mini-flush log when a transition changes.

I will get my b2g setup working and investigate this once I get in to the office.
Attachment #683881 - Attachment is obsolete: true
Attachment #683881 - Flags: review?(dbaron)
Attachment #683881 - Flags: review?(bzbarsky)
Attachment #684139 - Flags: review?(bzbarsky)
Attachment #684139 - Flags: review?(dbaron)
I see http://pastebin.mozilla.org/1950584 when I launch an application.  If you don't beat me to it, I'll see why this code thinks we can't throttle.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #152)
> I see http://pastebin.mozilla.org/1950584 when I launch an application.  If
> you don't beat me to it, I'll see why this code thinks we can't throttle.

I can't get b2g desktop working yet, so I doubt I'll beat you too it, sorry. These Cannot_throttle flushes are in response to flushes from the DOM which might need the transition info. I see a lot of them on desktop when the mouse moves, I would guess testing for mouseover events. Not sure if there is something similar on b2g.
I tested on the phone by
 - tapping icon
 - letting app load finish, no other interaction
With an older gaia, these patches work quite well.  I see only a handful of flushes during the app-open animation and only one isn't throttle-able.

However, the biggest thing that's changed between the older gaia and the newer gaia is the addition of SVG content on the lockscreen.  Are there any emergency bail-out paths here where we punt when SVG or SVG animations are being used?
And indeed, I'm seeing a 300-400ms win in perceived startup for the calculator app (measured by stopwatch).
(In reply to Chris Jones [:cjones] [:warhammer] from comment #155)
> With an older gaia, these patches work quite well.  I see only a handful of
> flushes during the app-open animation and only one isn't throttle-able.
> 

Do you see the icons etc.?

> However, the biggest thing that's changed between the older gaia and the
> newer gaia is the addition of SVG content on the lockscreen.  Are there any
> emergency bail-out paths here where we punt when SVG or SVG animations are
> being used?

I am seeing a lot of flushes starting from SMIL - we don't think these need to do an animation flush and I have a patch building right now to cut these out, I'll see if that helps.

My worry right now is that all the flushes that could be throttled, but aren't, are due to no layer existing for the frame, which is kind of bizarre (I hadn't seen this with any of my test cases, or more precisely I saw this for one tick and then the layer is created, whereas with b2g I see this for hundreds of ticks). I'm hoping that when the icons magically come back these layers will magically come back too. But otherwise, I have to work out what is going on and fix it.
(In reply to Nick Cameron [:nrc] from comment #157)
> My worry right now is that all the flushes that could be throttled, but
> aren't, are due to no layer existing for the frame, which is kind of bizarre

It might be possible for this to happen with an nsSubDocumentFrame for remote content that hasn't painted yet.  I think we could fix that.
Nick, I've started reading through these, but didn't get far enough yet to have useful feedback.  :(  I'm going to finish up on Monday.
(In reply to Boris Zbarsky (:bz) from comment #159)
> Nick, I've started reading through these, but didn't get far enough yet to
> have useful feedback.  :(  I'm going to finish up on Monday.

No problem, I'm still changing the patches a bit, so it might not be such a bad thing to hold off till Monday. Happy Thanksgiving!
Attached patch rough patch 8 with logging (obsolete) — Splinter Review
Very rough WIP patch. Adds a bunch of printf logging to identify some non-throttled flushing and a 'fix' to SMIL flushing of animations (to see if that brings us back to the 'old' Gaia performance). If we see lots of "common out of date ... -1" then that is a problem I need to fix.
Brian, can SMIL animation processing depend on the current state of CSS transitions and animations? I.e., does SMIL animation sampling require CSS transitions and animations to be brought up to date?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #162)
> Brian, can SMIL animation processing depend on the current state of CSS
> transitions and animations? I.e., does SMIL animation sampling require CSS
> transitions and animations to be brought up to date?

dholbert wrote this part so CC'ing him. As far as I know, when targetting a CSS property SMIL uses the CSS computed value as a base value[1] when (a) it has an additive animation (i.e. gets added to the underlying value), (b) it has no start value and so begins animating from the underlying value.

So if CSS transitions and animations are affecting the computed value then yes, I guess so. Daniel, is that right?

[1] http://mxr.mozilla.org/mozilla-central/source/content/smil/nsSMILCSSProperty.cpp#60
(In reply to Brian Birtles (:birtles) from comment #163)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment
> #162)
> > Brian, can SMIL animation processing depend on the current state of CSS
> > transitions and animations?
[...]
> So if CSS transitions and animations are affecting the computed value then
> yes, I guess so. Daniel, is that right?

Nominally, yes, for the reasons that Brian mentioned.

However -- in such cases, the SMIL animation's interpolated value won't actually be user/author-visible at all, AFAIK, because the transitions/animations will "win". (They stack on top of SMIL in our animation sandwich, and override any effects it'd have.)

So -- answering the second part of roc's question:
> does SMIL animation sampling require CSS transitions
> and animations to be brought up to date?

I'd say "no".  It may make us generate incorrect intermediate values for SMIL, but those values will be clobbered by the transition/animation anyway.
Attachment #684347 - Attachment description: part 8: rough WIP patch → rough patch 8 with logging
Mostly changes SMIL flushes to not flush animations, plus a few minor changes to warnings etc.
Attachment #684535 - Flags: review?(bzbarsky)
Attachment #684535 - Flags: feedback?(dholbert)
Attachment #684535 - Flags: review?(dbaron)
d'oh... So I just remembered some caveats to comment 164 that make this trickier. (Sorry, it's been a little while since I thought in depth about animations, so it took a bit to fully page the complex bits back in.)

I think we technically *do* need to flush CSS animations during SMIL samples, for correctness, after all.

If you have a CSS animation that targets one property, and then a SMIL animation targeting a *different* property who is influenced by the CSS-animated property, then we need CSS animations to be up-to-date in order for the SMIL animation to be correct.

One example of this would be a CSS animation on "font-size", while a SMIL animation changes some length from "5em" to "10em". (The meanings of those "em" units would be dependent on the CSS font-size animation.)  Another example would be a CSS animation on "color" while a SMIL animation takes "fill" to the currentColor keyword. (which computes to the current value of the "color" property)  You can get similar interactions w/ inheritance, too. (if you have a CSS animation on a parent and a SMIL animation w/ "inherit" on a child)

Sorry for not remembering that earlier. Does that complicate things here?
Yes, but I think we can deal with it.

I suggest we take the SMIL-related changes out of part 8, and try to land the remaining patches to fix this bug. We should file a new bug about avoiding flushing CSS transitions/animations when SMIL animations are running.
Attachment #684535 - Attachment is obsolete: true
Attachment #684535 - Flags: review?(dbaron)
Attachment #684535 - Flags: review?(bzbarsky)
Attachment #684535 - Flags: feedback?(dholbert)
Attachment #684910 - Flags: review?(bzbarsky)
Depends on: 814921
With a pref to turn off throttling completely and a few bug fixes.
Attachment #684910 - Attachment is obsolete: true
Attachment #684910 - Flags: review?(bzbarsky)
Attachment #685906 - Flags: review?(bzbarsky)
So there is a bug with this on b2g, when an app expands into view there is a stutter at the end of the transition, which does not exist without these patches (this is with the bug 788409 fix). I can't recreate this on desktop FF :-(

cjones: are the open app transitions just a regular transition on the scale of the dive from ~0.5 to 1.0?
Flags: needinfo?(jones.chris.g)
The app transitions are defined around here

https://github.com/mozilla-b2g/gaia/blob/master/apps/system/style/system/system.css#L272

The app-opening transition is, I believe

 scale 0.6 -> 1
 opacity 0 -> 1

There's an additional transition from lockscreen->homescreen which goes

 scale   1 -> 2
 opacity 1 -> 0

I believe.  This is defined around

https://github.com/mozilla-b2g/gaia/blob/master/apps/system/style/lockscreen/lockscreen.css#L1

We also animate |visibility| in a way that I don't fully understand.
Flags: needinfo?(jones.chris.g)
Comment on attachment 683877 [details] [diff] [review]
Part 1: Add an animation flush type

> void
> PresShell::FlushPendingNotifications(mozFlushType aType)
> {
>+  mozilla::ChangesToFlush flush(aType, true);
>+  switch (aType) {
>+    case Flush_Content:
>+    case Flush_ContentAndNotify:
>+      flush.mFlushAnimations = false;
>+    default:
>+      break;
>+  }

This should be simplified to:
  // by default, flush animations if aType >= Flush_Style
  mozilla::ChangesToFlush flush(aType, aType >= Flush_Style);

>+  FlushPendingNotifications(flush);
>+}
>+
>+void
>+PresShell::FlushPendingNotifications(mozilla::ChangesToFlush aFlush)
>+{
>   /**
>    * VERY IMPORTANT: If you add some sort of new flushing to this
>    * method, make sure to add the relevant SetNeedLayoutFlush or
>    * SetNeedStyleFlush calls on the document.
>    */

We need to make sure something in this patch series does this for this addition:

>+      if (aFlush.mFlushAnimations) {
>+        mPresContext->AnimationManager()->FlushAnimations();
>+        mPresContext->TransitionManager()->FlushTransitions();
>+      }

though it doesn't have to be in this patch.  (I'll try to remember to look for it as I go through... but if it's not in any of the others and you need to add it, it's probably best to add it in this one.)
Attachment #683877 - Flags: review?(dbaron) → review+
Comment on attachment 683878 [details] [diff] [review]
Part 3: Throttle animations and transitions

>+  enum FlushFlags {
>+    Can_Throttle,
>+    Cannot_Throttle
>+  };

I'm not crazy about these names, but I don't have anything better
right now.

>+  enum Flags {
>+    Flags_None = 0,
>+    Flags_HasGeometricProperty = 1,
>+    Flags_AllowPartial = 2
>+  };

Please:
  * don't have a Flags_None name; people tend to | the names together
    once they realize it's a bitfield, so I'd prefer all the names inside
    the enum to be things that can be usefully |ed together.  Instead use
    Flags(0) at the callers (but see next point).

  * Rename this to something more useful; how about CanAnimateFlags?

  * add comments explaining what these things are.  HasGeometricProperty
    says whether the animation as a whole involves any animation of
    width, height, top, right, bottom, or left.  AllowPartial means
    that, when set, it will return true when compositor animations are
    allowed in general, and the property being asked about is not a
    property that we animate on the compositor.
    (I don't think either of these things is remotely reasonable,
    though; see my comments in bug 784239.  I'm fine with the addition
    here, though, since it adds back the ability to have reasonable
    behavior.)



>+bool
>+CommonElementAnimationData::CanThrottleTransformChanges(TimeStamp aTime)
>+{


>+  // If the nearest scrollable ancestor has overflow:hidden,
>+  // we don't care about overflow.
>+  if (!nsLayoutUtils::GetNearestScrollableFrame(mElement->GetPrimaryFrame())) {
>+    return true;
>+  }

The comment and the code here don't match.  I suspect the code here is
safe but does nothing (I don't think this should ever return false).  If
you want to do what the comment says, you should really check that it's
overflow:hidden *and* that it's scrolled to its initial position.  This
is because overflow:hidden elements are scrollable from script, and
changes to transforms can change the allowed scrollable region.  So you
should look at the return value of GetNearestScrollableFrame, and also
return true if its GetScrollbarStyles() returns NS_STYLE_OVERFLOW_HIDDEN
for both dimensions, and its GetScrollPosition is 0,0.  (I don't think
that optimizes correctly for RTL, though.)


You're adding an extra } after the end of this function.  I suspect
a later patch removes it, but you shouldn't add it here.


In ElementAnimations::EnsureStyleRuleFor:

>+      // This animation is being handled on the compositor thread,
>+      // so we shouldn't interpolate here.
>+      if (aIsThrottled) {
>+        continue;
>+      }

So this is an overly complicated way of doing something much simpler.
When aIsThrottled is true, instead of starting each iteration of the
loop, getting the position, and then continuing, you should just return
before the loop (but after setting mStyleRuleRefreshTime, mStyleRule,
and mNeedsRefreshes).

I think there's something a little subtle broken here, though, which is
that you're setting mNeedsRefreshes to false the first time through when
you're throttled, which means that the second time through when you're
throttled you won't even get to this code because mNeedsRefreshes is
false.  That suggests that maybe you should be doing even less work, and
just taking the early return at the very top of the function when you're
throttled.  That, in turn, affects whether you set canThrottleTick
in FlushAnimations.

My gut feeling is that you want:

 * to take the earliest return in EnsureStyleRuleFor, updating
   mStyleRuleRefreshTime and not touching mStyleRule or mNeedsRefreshes

 * to remove the ea->mNeedsRefreshes part of this condition in
   nsAnimationManager::FlushAnimations
   >+    bool canThrottleTick = ea->mNeedsRefreshes && aFlags == Can_Throttle &&
   >+      ea->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
   >+      CanThrottleAnimation(ea, now);

 * to remove the "&& !canThrottleTick" addition a few lines below

But if this messes things up that work now, we should discuss further.


ElementTransitions::CanPerformOnCompositorThread:

>+    if (pt.mStartValue != pt.mStartForReversingTest) {
>+      return false;
>+    }

I'm assuming these lines are here accidentally.  Please remove them.

(They disable running any previously-reversed transition on the
compositor thread.  I don't think there's a good reason to do that.
If there is, it needs extensive comments and a bug on fixing it.)


>   nsRestyleHint hint =
>     aNewStyleContext->GetPseudoType() ==
>       nsCSSPseudoElements::ePseudo_NotPseudoElement ?
>     eRestyle_Self : eRestyle_Subtree;
>   presContext->PresShell()->RestyleForAnimation(aElement, hint);
>-  // XXXdz: invalidate the frame here, once animations are throttled.
>+
> 
>   *aStartedAny = true;

Don't add an extra blank line here; just leave one line of gap.


>+bool
>+CanThrottleTransition(ElementTransitions* aEt, TimeStamp aTime)

This code appears twice, once for animations and once for transitions.
It's pretty easy to share -- you could have a method on ElementTransitions
and ElementAnimations that just makes the two HasTransitionOfProperty /
HasAnimationOfProperty calls, and then calls a method on
CommonElementAnimationData that has all the rest of the code.

>+      bool transitionStartedOrEnded = false;

This is never set to any other value.  Remove it or make it meaningful.
(You might need to do the latter if removing those accidental lines
above breaks something.)

r=dbaron with those things addressed
Attachment #683878 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] from comment #174)
> You're adding an extra } after the end of this function.  I suspect
> a later patch removes it, but you shouldn't add it here.

Er, never mind this.  I was just missing how silly diff was being.
or, really, r=dbaron on part 3 with those things addressed, assuming that parts 4 and 6 land at the same time, since they both fix known problems in part 3.
Comment on attachment 683879 [details] [diff] [review]
part 4: some refactoring

>+  CommonElementAnimationData* animations =
>+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));

Technically this isn't kosher C++, since you can't assume that
the CommonElementAnimationData base class is at offset 0 in both
an ElementAnimations and ElementTransitions.  I think it's probably
ok, but you should at least leave a comment explaining what it's
depending on.



I'm not especially happy about adding virtual functions to
ElementTransitions and ElementAnimations.  Please at the very least
annotate both ElementTransitions and ElementAnimations as MOZ_FINAL, and
annotate the implementations of the virtual methods as MOZ_OVERRIDE.
(Note that some of this is in part 5, since the nsTransitionManager
changes for the part 4 refactoring actually live in part 5, which is
rather annoying.)

But this does address one of my review comments on part 3 in a slightly
different (but satisfactory, I guess) way.

The nsStyleTransformMatrix.{h,cpp} changes don't seem to be needed
anywhere; drop them.

I'm inclined to think this patch should get rolled into patch 3 rather
than landing as a separate changeset (and two authors listed for patch
3).
Attachment #683879 - Flags: review?(dbaron) → review+
> but you should at least leave a comment explaining what it's depending on.

You should have MOZ_ASSERTS in the ElementAnimations and ElementTransitions constructors that static_cast and reinterpret_cast to CommonElementAnimationData give the same pointer.  (Can't make it a static assert becaust of static_cast.)

Sorry for the lag here, by the way; I really am working on reading through these!
(In reply to David Baron [:dbaron] from comment #177)
> Comment on attachment 683879 [details] [diff] [review]
> part 4: some refactoring
> 
> >+  CommonElementAnimationData* animations =
> >+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));
> 
> Technically this isn't kosher C++, since you can't assume that
> the CommonElementAnimationData base class is at offset 0 in both
> an ElementAnimations and ElementTransitions.  I think it's probably
> ok, but you should at least leave a comment explaining what it's
> depending on.
> 

I don't understand why this is not right, sorry. Why can't I assume CommonElementAnimationData is at offset 0?

> 
> 
> I'm not especially happy about adding virtual functions to
> ElementTransitions and ElementAnimations.

Why not? Because of the vtable overhead on a small class?

>  Please at the very least
> annotate both ElementTransitions and ElementAnimations as MOZ_FINAL, and
> annotate the implementations of the virtual methods as MOZ_OVERRIDE.
> (Note that some of this is in part 5, since the nsTransitionManager
> changes for the part 4 refactoring actually live in part 5, which is
> rather annoying.)
> 

Sorry about that, I found this bug hard to break up into coherent patches.

> But this does address one of my review comments on part 3 in a slightly
> different (but satisfactory, I guess) way.
> 
> The nsStyleTransformMatrix.{h,cpp} changes don't seem to be needed
> anywhere; drop them.
> 

They are needed (I think) in the new addition (case eCSSKeyword_interpolatematrix) to AddTransformFunctions in nsDisplayList.cpp.

> I'm inclined to think this patch should get rolled into patch 3 rather
> than landing as a separate changeset (and two authors listed for patch
> 3).

Sure, I will add changes in a delta patch, but I'll roll up them back out like this once reviewing is done and before landing.
(In reply to Nick Cameron [:nrc] from comment #179)
> (In reply to David Baron [:dbaron] from comment #177)
> > Comment on attachment 683879 [details] [diff] [review]
> > part 4: some refactoring
> > 
> > >+  CommonElementAnimationData* animations =
> > >+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));
> > 
> > Technically this isn't kosher C++, since you can't assume that
> > the CommonElementAnimationData base class is at offset 0 in both
> > an ElementAnimations and ElementTransitions.  I think it's probably
> > ok, but you should at least leave a comment explaining what it's
> > depending on.
> > 
> 
> I don't understand why this is not right, sorry. Why can't I assume
> CommonElementAnimationData is at offset 0?

Why should you be able to assume it?

> > I'm not especially happy about adding virtual functions to
> > ElementTransitions and ElementAnimations.
> 
> Why not? Because of the vtable overhead on a small class?

vtable overhead, virtual call overhead, and just wanting to keep the class more of a simple data structure.  But it's ok.

> > But this does address one of my review comments on part 3 in a slightly
> > different (but satisfactory, I guess) way.
> > 
> > The nsStyleTransformMatrix.{h,cpp} changes don't seem to be needed
> > anywhere; drop them.
> > 
> 
> They are needed (I think) in the new addition (case
> eCSSKeyword_interpolatematrix) to AddTransformFunctions in nsDisplayList.cpp.

Which patch is that in?
(In reply to David Baron [:dbaron] from comment #180)
> (In reply to Nick Cameron [:nrc] from comment #179)
> > (In reply to David Baron [:dbaron] from comment #177)
> > > Comment on attachment 683879 [details] [diff] [review]
> > > part 4: some refactoring
> > > 
> > > >+  CommonElementAnimationData* animations =
> > > >+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));
> > > 
> > > Technically this isn't kosher C++, since you can't assume that
> > > the CommonElementAnimationData base class is at offset 0 in both
> > > an ElementAnimations and ElementTransitions.  I think it's probably
> > > ok, but you should at least leave a comment explaining what it's
> > > depending on.
> > > 
> > 
> > I don't understand why this is not right, sorry. Why can't I assume
> > CommonElementAnimationData is at offset 0?
> 
> Why should you be able to assume it?
> 

Because they both inherit from only CommonElementAnimationData. (I feel like I'm missing something very obvious here).

> > > But this does address one of my review comments on part 3 in a slightly
> > > different (but satisfactory, I guess) way.
> > > 
> > > The nsStyleTransformMatrix.{h,cpp} changes don't seem to be needed
> > > anywhere; drop them.
> > > 
> > 
> > They are needed (I think) in the new addition (case
> > eCSSKeyword_interpolatematrix) to AddTransformFunctions in nsDisplayList.cpp.
> 
> Which patch is that in?

patch 7 (https://bugzilla.mozilla.org/attachment.cgi?id=683882)
(In reply to Nick Cameron [:nrc] from comment #181)
> (In reply to David Baron [:dbaron] from comment #180)
> > (In reply to Nick Cameron [:nrc] from comment #179)
> > > (In reply to David Baron [:dbaron] from comment #177)
> > > > Comment on attachment 683879 [details] [diff] [review]
> > > > part 4: some refactoring
> > > > 
> > > > >+  CommonElementAnimationData* animations =
> > > > >+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));
> > > > 
> > > > Technically this isn't kosher C++, since you can't assume that
> > > > the CommonElementAnimationData base class is at offset 0 in both
> > > > an ElementAnimations and ElementTransitions.  I think it's probably
> > > > ok, but you should at least leave a comment explaining what it's
> > > > depending on.
> > > > 
> > > 
> > > I don't understand why this is not right, sorry. Why can't I assume
> > > CommonElementAnimationData is at offset 0?
> > 
> > Why should you be able to assume it?
> > 
> 
> Because they both inherit from only CommonElementAnimationData. (I feel like
> I'm missing something very obvious here).

Could you cite something in the C++ standard that requires what you state?  I don't know of anything, and I suspect that while it's true in most if not all cases where virtual base classes aren't involved, I'm not sure if it's portable if, say, the base class doesn't have a vtable pointer and the derived class does.
(In reply to David Baron [:dbaron] from comment #182)
> (In reply to Nick Cameron [:nrc] from comment #181)
> > (In reply to David Baron [:dbaron] from comment #180)
> > > (In reply to Nick Cameron [:nrc] from comment #179)
> > > > (In reply to David Baron [:dbaron] from comment #177)
> > > > > Comment on attachment 683879 [details] [diff] [review]
> > > > > part 4: some refactoring
> > > > > 
> > > > > >+  CommonElementAnimationData* animations =
> > > > > >+    static_cast<CommonElementAnimationData*>(aContent->GetProperty(aAnimationProperty));
> > > > > 
> > > > > Technically this isn't kosher C++, since you can't assume that
> > > > > the CommonElementAnimationData base class is at offset 0 in both
> > > > > an ElementAnimations and ElementTransitions.  I think it's probably
> > > > > ok, but you should at least leave a comment explaining what it's
> > > > > depending on.
> > > > > 
> > > > 
> > > > I don't understand why this is not right, sorry. Why can't I assume
> > > > CommonElementAnimationData is at offset 0?
> > > 
> > > Why should you be able to assume it?
> > > 
> > 
> > Because they both inherit from only CommonElementAnimationData. (I feel like
> > I'm missing something very obvious here).
> 
> Could you cite something in the C++ standard that requires what you state? 
> I don't know of anything, and I suspect that while it's true in most if not
> all cases where virtual base classes aren't involved, I'm not sure if it's
> portable if, say, the base class doesn't have a vtable pointer and the
> derived class does.

Oh, I finally get why this is not OK, I was not distinguishing between void* and a pointer to a base class. Sorry it took so long. I will do something else, this is yuck, thanks for noticing it.
(In reply to David Baron [:dbaron] from comment #174)

> 
> >+bool
> >+CommonElementAnimationData::CanThrottleTransformChanges(TimeStamp aTime)
> >+{
> 
> 
> >+  // If the nearest scrollable ancestor has overflow:hidden,
> >+  // we don't care about overflow.
> >+  if (!nsLayoutUtils::GetNearestScrollableFrame(mElement->GetPrimaryFrame())) {
> >+    return true;
> >+  }
> 
> The comment and the code here don't match.  I suspect the code here is
> safe but does nothing (I don't think this should ever return false).  If
> you want to do what the comment says, you should really check that it's
> overflow:hidden *and* that it's scrolled to its initial position.  This
> is because overflow:hidden elements are scrollable from script, and
> changes to transforms can change the allowed scrollable region.  So you
> should look at the return value of GetNearestScrollableFrame, and also
> return true if its GetScrollbarStyles() returns NS_STYLE_OVERFLOW_HIDDEN
> for both dimensions, and its GetScrollPosition is 0,0.  (I don't think
> that optimizes correctly for RTL, though.)
> 
> 

According to the comment at least, GetNearestScrollableFrame can return null if there is no nearest scrollable frame. I'm not exactly sure what you are asking for here, is this along the right lines:

  // If the nearest scrollable ancestor has overflow:hidden,
  // we don't care about overflow.
  nsIScrollableFrame* scrollable = nsLayoutUtils::GetNearestScrollableFrame(mElement->GetPrimaryFrame());
  if (!scrollable) {
    return true;
  }

  nsPresContext::ScrollbarStyles ss = scrollable->GetScrollbarStyles();
  if (ss.mVertical == NS_STYLE_OVERFLOW_HIDDEN &&
      ss.mHorizontal == NS_STYLE_OVERFLOW_HIDDEN &&
      scrollable->GetScrollPosition() == nsPoint(0, 0)) {
    return true;
  }

  return false;
}


Also, do you mean that we should make this work for RTL in some way, or that we should not care about RTL here?
Yes, it is what I mean... except I now see that it's pretty easy to handle RTL, so I think you should just do it.  In particular, all you should need is:

 * add a GetLogicalScrollPosition method to nsIScrollableFrame, nsHTMLScrollFrame, and nsXULScrollFrame that look analogous to the GetScrollPosition method, except that they expose the existing nsGfxScrollFrameInner::GetLogicalScrollPosition (just like GetScrollPosition exposes GetScrollPosition on nsGfxScrollFrameInner).
 * change the above code to call GetLogicalScrollPosition.
Comment on attachment 683880 [details] [diff] [review]
part 5: changes to nsTransitionManager

So some of the changes in this patch really belong in other patches, but
others really should be separate.  In particular, the rename of
HasTransitionOfProperty to HasAnimationOfProperty, and the
virtualization of that and CanPerformOnCompositorThread should be in
part 4 (which in turn belongs in part 3 as I said in comment 177), and
the mLastUpdateFlushCount should be in part 6.  And the changes to
ElementTransitions::CanPerformOnCompositorThread belong where that code
was added.  I think you should either separate these things properly or
squash the whole set together.

I don't understand why you are:
 * changing the handling of removed sentinels in ValuePortionFor
 * changing the < to <= in IsRunningAt
Were these workarounds for the issue eventually fixed in bug 788049?
Are they still needed?

Does ForceLayerRerendering need to handle the case where both opacity
and transform are animating?  If not, it should have a code comment
explaining why not.

In ForceLayerRerendering it would be clearer to just have:
  else {
    return
  }
at the end of the if-else chain than to assume that neither TYPE_OPACITY
nor TYPE_TRANSFORM has the value 0 (though neither does).

>+  // Performs a 'mini-flush' of styles on transitions and animations
>+  // back to their primary frames
>+  void UpdateStylesToPrimaryFrames();

This needs a way more substantial code comment.  I'd suggest something
like:
  // Performs a 'mini-flush' to make styles from throttled transitions
  // up-to-date prior to processing an unrelated style change, so that
  // any transitions triggered by that style change produce correct
  // results.
  //
  // In more detail:  when we're able to run animations on the
  // compositor, we sometimes "throttle" these animations by skipping
  // updating style data on the main thread.  However, whenever we
  // process a normal (non-animation) style change, any changes in
  // computed style on elements that have transition-* properties set
  // may need to trigger new transitions; this process requires knowing
  // both the old and new values of the property.  To do this correctly,
  // we need to have an up-to-date *old* value of the property on the
  // primary frame.  So the purpose of the mini-flush is to update the
  // style for all throttled transitions and animations to the current
  // animation state without making any other updates, so that when we
  // process the queued style updates we'll have correct old data to
  // compare against.  When we do this, we don't bother touching frames
  // other than primary frames.


The way UpdateStylesToPrimaryFrames and DoUpdateElementAndDescendants
interact with each other isn't going to behave correctly when you have
transitions on a descendant of a display:none element that has an
ancestor that also has transitions, since UpdateStylesToPrimaryFrames
will decide to call DoUpdateElementAndDescendants on the ancestor, which
will find the lack of a primary frame on the display:none element and
not bother crawling the descendants.  I think the easiest fix is to
change the if (!primaryFrame) return in DoUpdateElementAndDescendants
into an if (primaryFrame) { do next 3 lines }.  However, it would be
slightly faster to instead fix it by modifying
UpdateStylesToPrimaryFrames by moving the DoUpdateElementAndDescendants
call up into the for-loop over ancestors, but guarded by a check of
et->mLastUpdateFlushCount (on the et resulting from the call to
GetElementTransitions inside that for loop).

In DoUpdateElementAndDescendants:

>+    nsIContent* child = aContent->GetFirstChild();
>+    while (child) {
>+      DoUpdateElementAndDescendants(child, newStyle);
>+      child = child->GetNextSibling();
>+    }

Please write this as a for-loop:

  for (nsIContent *child = aContent->GetFirstChild; child;
       child = child->GetNextSibling()) {
    DoUpdateElementAndDescendants(child, newStyle);
  }

Could you rename the functions:
  UpdateStyleToPrimaryFrames -> UpdateAllThrottledStyles
  DoUpdateElementAndDescendants -> UpdateThrottledStylesForSubtree
  UpdateStyle -> UpdateThrottledStyle

It's not clear to me why ForceLayerRerendering is needed.  Could you
explain?

In UpdateStyle, the null-ea and null-et cases should both have
assertions; that shouldn't happen.  (Maybe it should even just crash.)

>+  //XXX[nrc] is this right? Or should I use SetStyleContextWithoutNotification?
>+  primaryFrame->SetStyleContext(newStyle);

I think you should (use ...WithoutNotification).


>-  NS_PRECONDITION(aOldStyleContext->GetPseudo() ==
>-                      aNewStyleContext->GetPseudo(),
>+  NS_PRECONDITION(aOldStyleContext->GetPseudo() == aNewStyleContext->GetPseudo(),

Leave this; you're changing it to violate the 80th column.

>-    // http://lists.w3.org/Archives/Public/www-style/2009Aug/0109.html .
>+    // http://lists.w3.org/Archives/Public/www-style/2009Aug/0109.html

Leave this.

>-    ExtractComputedValueForTransition(aProperty, aOldStyleContext,
>-                                      pt.mStartValue) &&
>-    ExtractComputedValueForTransition(aProperty, aNewStyleContext,
>-                                      pt.mEndValue);
>+    ExtractComputedValueForTransition(aProperty, aOldStyleContext, pt.mStartValue) &&
>+    ExtractComputedValueForTransition(aProperty, aNewStyleContext, pt.mEndValue);

Leave this.

>-      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(),
>-                            true);
>+      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(), true);

Leave this.


>+      bool canThrottleTick = aFlags == Can_Throttle &&
>+                             et->CanPerformOnCompositorThread(CommonElementAnimationData::Flags_None) &&
>+                             et->CanThrottleAnimation(now);

Wrap the long line under 80 columns.

>+      bool transitionStartedOrEnded = false;

Call this |transitionEnded|; it has nothing to do with starting.

>+  if (!events.IsEmpty()) {
>+    mPresContext->Document()->SetNeedStyleFlush();
>+  }

!events.IsEmpty() is the wrong test here.  I suspect what you should really
be doing is setting a |didThrottle| boolean in exactly the case where you
skip the restyling, and make this call only if didThrottle is true.


I did not review the parts of this patch relating to:

 NoAnimationStyleChangeIsUpToDate
 TickLastNoAnimationStyleChange
 *FlushCount*

I will review those when I review patch 6, since that's where they should have been.

r=dbaron with that, on all parts except those that I'll review with patch 6.
Attachment #683880 - Flags: review?(dbaron) → review+
Comment on attachment 684139 [details] [diff] [review]
part 6: various counters for keeping track of flushes

In nsCSSFrameConstructor.h, you should have a comment describing
GetAnimationFlushCounter().  Perhaps:
  // Get a counter that increments on every style change, that we use to
  // track whether off-main-thread animations are up-to-date.

Could you globally rename s/AnimationFlushCount/AnimationGeneration/g
(except on ElementAnimations/ElementTransitions, where it would be
mFlushCount -> mAnimationGeneration, and
UpdateFlushCount->UpdateAnimationGeneration).  Some of this applies to
patch 5.

In ProcessPendingRestyles in nsCSSFrameConstructor.cpp, you should have
at least a simple comment before the added code, perhaps:
  // Before we process any restyles, we need to ensure that style
  // resulting from any throttled animations (animations that we're
  // running entirely on the compositor thread) is up-to-date, so that
  // if any style changes we cause trigger transitions, we have the
  // correct old style for starting the transition.

Could you globally rename s/NoAnimationStyleChange/UpdateThrottledStyle/
(which matches a renaming I proposed for patch 5).  Some of this also
applies to patch 5.

In patch 5, in UpdateStylesToPrimaryFrames (rename to
UpdateThrottledStyle), could you consolidate the checking of
NoAnimationStyleChange (rename to UpdateThrottledStyle) by moving this
line:
>+  mPresContext->TickLastNoAnimationStyleChange();
up to near the start of the function, right after the two early returns.
I don't see any reason to wait to update it, and it's clearer together.


>   }
>+  // Initialise refresh tick counters for OMTA
>+  mLastStyleUpdateForAllAnimations =
>+    mLastNoAnimationStyleChange = mRefreshDriver->MostRecentRefresh();
> 
>   mLangService = do_GetService(NS_LANGUAGEATOMSERVICE_CONTRACTID);

put a blank line before this insertion

>+  bool NoAnimationStyleChangeIsUpToDate()
>+  bool StyleUpdateForAllAnimationsIsUpToDate()
>+  bool ExistThrottledUpdates() {

These 3 methods should be const.

And the first two (and the two after them) can have the { pulled up onto
the same line, which is local style for inline functions.


In part 5, mLastUpdateFlushCount should not be a flush count.  It should
be a refresh driver time stamp.  This is probably the source of some
bugs, since it probably means that the mini-flush code essentially
doesn't work right the second time.  This is because the flush count
(renaming to animation generation) only updates when there's a style
change -- something that would require sending new information to the
layer tree and the compositor thread.  However, the mini-flush of
throttled styles might need to happen more than once -- it's needed to
get the style up-to-date prior to a style change that might or might not
affect the particular element.  So please rename mLastUpdateFlushCount
to mLastUpdateThrottledStyle, and assign to it from |now| and compare it
against |now| (|now| being a variable holding the result of
RefreshDriver()->MostRecentRefresh()) rather than from
|et->mFlushCount|.

Finally, I think patch 5 is missing one occurrence of an
UpdateFlushCount call (renaming to UpdateAnimationGeneration).  This is
in nsTransitionManager::FlushTransitions, exactly where you set
|transitionStartedOrEnded| (renaming to transitionEnded) to true.  I
think you should add an UpdateAnimationGeneration call on the line
either before or after that line.

r=dbaron with that (though a bunch of the changes, as I predicted
earlier, are in patch 5)
Attachment #684139 - Flags: review?(dbaron) → review+
Comment on attachment 683882 [details] [diff] [review]
part 7: odds and ends

ResolveStyleForElementRules:
 * doesn't need to take an element argument at all (it can just pass
   null)
 * should just be called ResolveStyleForRules
 * should carefully document that it takes the rule array in *reverse*
   order, unlike the other version of ResolveStyleForRules
 * should be moved in the series prior to patch 5, which depends on it
   (and if those patches end up being merged, then prior to patch 3; but
   this piece is easy to keep separate)

We need to file a followup bug on switching existing ResolveStyleForRules
callers to the new version (which needs to take care about the order
difference).

The nsDisplayList.cpp changes should be separated into a different patch,
but that patch should also have the nsStyleTransformMatrix changes from
patch 4 (separated out from that patch).

>+        //XXX[nrc] is it OK to add this?

I'm probably not the right person to ask this question, so maybe just
leave the comment?

r=dbaron with that
Attachment #683882 - Flags: review?(dbaron) → review+
(In reply to David Baron [:dbaron] from comment #186)
> Comment on attachment 683880 [details] [diff] [review]
> part 5: changes to nsTransitionManager
> 
> The way UpdateStylesToPrimaryFrames and DoUpdateElementAndDescendants
> interact with each other isn't going to behave correctly when you have
> transitions on a descendant of a display:none element that has an
> ancestor that also has transitions, since UpdateStylesToPrimaryFrames
> will decide to call DoUpdateElementAndDescendants on the ancestor, which
> will find the lack of a primary frame on the display:none element and
> not bother crawling the descendants.  I think the easiest fix is to
> change the if (!primaryFrame) return in DoUpdateElementAndDescendants
> into an if (primaryFrame) { do next 3 lines }.  However, it would be
> slightly faster to instead fix it by modifying
> UpdateStylesToPrimaryFrames by moving the DoUpdateElementAndDescendants
> call up into the for-loop over ancestors, but guarded by a check of
> et->mLastUpdateFlushCount (on the et resulting from the call to
> GetElementTransitions inside that for loop).
> 

I'm not sure I understand the faster approach - don't we then miss a null check on primaryFrame in the recursive call of DoUpdateElementAndDescendants?


> It's not clear to me why ForceLayerRerendering is needed.  Could you
> explain?
> 

I've added the following comment to ForceLayerRerendering:

// Ensure that the next repaint rebuilds the layer tree for aFrame. That
// means that changes to animations on aFrame's layer are propogated to
// the compositor, which is needed for correct behaviour of new
// transitions.

Is that OK?

> In UpdateStyle, the null-ea and null-et cases should both have
> assertions; that shouldn't happen.  (Maybe it should even just crash.)
> 

I don't understand why ea/et can't be null. Does the fact that the ruleNode level is eAnimatoinSheet imply that there should be an ElementAnimations for aElement? (And similarly for transitions).

> >-      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(),
> >-                            true);
> >+      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(), true);
> 
> Leave this.
> 
> 

Sorry to question this nit, but why do you prefer true on a separate line? The one line is exactly 80 cols. Happy to leave the change, but I'm curious.
(In reply to David Baron [:dbaron] from comment #173)
> Comment on attachment 683877 [details] [diff] [review]
> Part 1: Add an animation flush type
> 
> >+  FlushPendingNotifications(flush);
> >+}
> >+
> >+void
> >+PresShell::FlushPendingNotifications(mozilla::ChangesToFlush aFlush)
> >+{
> >   /**
> >    * VERY IMPORTANT: If you add some sort of new flushing to this
> >    * method, make sure to add the relevant SetNeedLayoutFlush or
> >    * SetNeedStyleFlush calls on the document.
> >    */
> 
> We need to make sure something in this patch series does this for this
> addition:
> 
> >+      if (aFlush.mFlushAnimations) {
> >+        mPresContext->AnimationManager()->FlushAnimations();
> >+        mPresContext->TransitionManager()->FlushTransitions();
> >+      }
> 
> though it doesn't have to be in this patch.  (I'll try to remember to look
> for it as I go through... but if it's not in any of the others and you need
> to add it, it's probably best to add it in this one.)

I'm not sure exactly what is required here. I think the only time I call SetNeedStyleFlush is in nsTransitionManager::FlushTransitions if we throttle a transition. Do I need more calls? What is the purpose of these calls?
(In reply to David Baron [:dbaron] from comment #186)
> Comment on attachment 683880 [details] [diff] [review]
> part 5: changes to nsTransitionManager
> 
> I don't understand why you are:
>  * changing the handling of removed sentinels in ValuePortionFor

When the transition is a sentinel, mStartTime is null, and so without this we get a null assertion in the expression |timePortion = (aRefreshTime - mStartTime) ...| a few lines below.

>  * changing the < to <= in IsRunningAt

This is not redundant after bug 788049. I forget exactly why it was needed, but I have a test case which is visibly wrong without the change (when a new transition starts as an old one ends, the new transition does not start properly). It makes logical sense to me - if the current time is equal to the starting time of the transition, then I would consider the transition to be running.

Looking at the code I think the change affects things by not properly adding the animation to the layer when the layer is reconstructed at the start of a transition. That is, if the animation is not added when it is brand new, then it misses it chance, and never gets added to the layer.

> Were these workarounds for the issue eventually fixed in bug 788049?
> Are they still needed?
>
(In reply to David Baron [:dbaron] from comment #187)
> In part 5, mLastUpdateFlushCount should not be a flush count.  It should
> be a refresh driver time stamp.  This is probably the source of some
> bugs, since it probably means that the mini-flush code essentially
> doesn't work right the second time.  This is because the flush count
> (renaming to animation generation) only updates when there's a style
> change -- something that would require sending new information to the
> layer tree and the compositor thread.  However, the mini-flush of
> throttled styles might need to happen more than once -- it's needed to
> get the style up-to-date prior to a style change that might or might not
> affect the particular element.  So please rename mLastUpdateFlushCount
> to mLastUpdateThrottledStyle, and assign to it from |now| and compare it
> against |now| (|now| being a variable holding the result of
> RefreshDriver()->MostRecentRefresh()) rather than from
> |et->mFlushCount|.
> 

The idea with mLastUpdateFlushCount is that it is only used within a mini-flush and never outside of one to determine whether a mini-flush should take place. It is used to tick transitions once they have been updated in the current mini-flush, i.e., removing them from the working set of transitions to update. If there is somewhere that mLastUpdateFlushCount is used other than this, then it is a mistake and I should fix it, but I couldn't find anywhere. Did you mean a different counter? I guess I should add a better comment.
(In reply to Nick Cameron [:nrc] from comment #189)
> (In reply to David Baron [:dbaron] from comment #186)
> > Comment on attachment 683880 [details] [diff] [review]
> > part 5: changes to nsTransitionManager
> > 
> > The way UpdateStylesToPrimaryFrames and DoUpdateElementAndDescendants
> > interact with each other isn't going to behave correctly when you have
> > transitions on a descendant of a display:none element that has an
> > ancestor that also has transitions, since UpdateStylesToPrimaryFrames
> > will decide to call DoUpdateElementAndDescendants on the ancestor, which
> > will find the lack of a primary frame on the display:none element and
> > not bother crawling the descendants.  I think the easiest fix is to
> > change the if (!primaryFrame) return in DoUpdateElementAndDescendants
> > into an if (primaryFrame) { do next 3 lines }.  However, it would be
> > slightly faster to instead fix it by modifying
> > UpdateStylesToPrimaryFrames by moving the DoUpdateElementAndDescendants
> > call up into the for-loop over ancestors, but guarded by a check of
> > et->mLastUpdateFlushCount (on the et resulting from the call to
> > GetElementTransitions inside that for loop).
> > 
> 
> I'm not sure I understand the faster approach - don't we then miss a null
> check on primaryFrame in the recursive call of DoUpdateElementAndDescendants?

In the faster approach, I meant you'd leave the primaryFrame null-check the way it is now.

> > It's not clear to me why ForceLayerRerendering is needed.  Could you
> > explain?
> > 
> 
> I've added the following comment to ForceLayerRerendering:
> 
> // Ensure that the next repaint rebuilds the layer tree for aFrame. That
> // means that changes to animations on aFrame's layer are propogated to
> // the compositor, which is needed for correct behaviour of new
> // transitions.
> 
> Is that OK?

Other than the spelling of propagated, perhaps.  But I still don't understand why we'd need to rebuild the layer tree in this case.  The point of the calling code is to update the style data to match what's already happening on the compositor thread (with style data changes throttled), so I still don't see why we'd need to propagate a change into the layer tree in this code.

> > In UpdateStyle, the null-ea and null-et cases should both have
> > assertions; that shouldn't happen.  (Maybe it should even just crash.)
> > 
> 
> I don't understand why ea/et can't be null. Does the fact that the ruleNode
> level is eAnimatoinSheet imply that there should be an ElementAnimations for
> aElement? (And similarly for transitions).

I think it should.  The presence of a rule node with an eAnimationSheet level means that we have an animation style rule, which means there should be a currently-active animation.  Similar for transitions.

> > >-      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(),
> > >-                            true);
> > >+      GetElementTransitions(aElement, aNewStyleContext->GetPseudoType(), true);
> > 
> > Leave this.
> > 
> > 
> 
> Sorry to question this nit, but why do you prefer true on a separate line?
> The one line is exactly 80 cols. Happy to leave the change, but I'm curious.

Exactly 80 isn't great; it then goes over 80 in diffs.

And in general it's better not to make formatting changes that aren't clear improvements, or to mix formatting cleanups into other patches.

(In reply to Nick Cameron [:nrc] from comment #190)
> (In reply to David Baron [:dbaron] from comment #173)
> > Comment on attachment 683877 [details] [diff] [review]
> > Part 1: Add an animation flush type
> > 
> > >+  FlushPendingNotifications(flush);
> > >+}
> > >+
> > >+void
> > >+PresShell::FlushPendingNotifications(mozilla::ChangesToFlush aFlush)
> > >+{
> > >   /**
> > >    * VERY IMPORTANT: If you add some sort of new flushing to this
> > >    * method, make sure to add the relevant SetNeedLayoutFlush or
> > >    * SetNeedStyleFlush calls on the document.
> > >    */
> > 
> > We need to make sure something in this patch series does this for this
> > addition:
> > 
> > >+      if (aFlush.mFlushAnimations) {
> > >+        mPresContext->AnimationManager()->FlushAnimations();
> > >+        mPresContext->TransitionManager()->FlushTransitions();
> > >+      }
> > 
> > though it doesn't have to be in this patch.  (I'll try to remember to look
> > for it as I go through... but if it's not in any of the others and you need
> > to add it, it's probably best to add it in this one.)
> 
> I'm not sure exactly what is required here. I think the only time I call
> SetNeedStyleFlush is in nsTransitionManager::FlushTransitions if we throttle
> a transition. Do I need more calls? What is the purpose of these calls?

That (doing it when we throttle a transition, per comment 186) sounds exactly right for transitions.  But you also need the same thing in nsAnimationManager::FlushAnimations.

The purpose is the optimization added in bug 709256.  Basically, nsPresShell::FlushPendingNotifictaions is long enough and complicated enough that it's valuable to track whether we need to call it at all.

(In reply to Nick Cameron [:nrc] from comment #191)
I'm going to go back to comment 191 in a bit.

(In reply to Nick Cameron [:nrc] from comment #192)
> (In reply to David Baron [:dbaron] from comment #187)
> > In part 5, mLastUpdateFlushCount should not be a flush count.  It should
> > be a refresh driver time stamp.  This is probably the source of some
> > bugs, since it probably means that the mini-flush code essentially
> > doesn't work right the second time.  This is because the flush count
> > (renaming to animation generation) only updates when there's a style
> > change -- something that would require sending new information to the
> > layer tree and the compositor thread.  However, the mini-flush of
> > throttled styles might need to happen more than once -- it's needed to
> > get the style up-to-date prior to a style change that might or might not
> > affect the particular element.  So please rename mLastUpdateFlushCount
> > to mLastUpdateThrottledStyle, and assign to it from |now| and compare it
> > against |now| (|now| being a variable holding the result of
> > RefreshDriver()->MostRecentRefresh()) rather than from
> > |et->mFlushCount|.
> > 
> 
> The idea with mLastUpdateFlushCount is that it is only used within a
> mini-flush and never outside of one to determine whether a mini-flush should
> take place. It is used to tick transitions once they have been updated in
> the current mini-flush, i.e., removing them from the working set of
> transitions to update. If there is somewhere that mLastUpdateFlushCount is
> used other than this, then it is a mistake and I should fix it, but I
> couldn't find anywhere. Did you mean a different counter? I guess I should
> add a better comment.

No, I stand by what I said.

Basically, the problem I'm worried about is the following sequence of events.
 - transition is started on an element A, ticking animation generation to 20 and setting A's et->mFlushCount to 20
 - refresh driver tick to time=101
 - element.style set on an element B in another part of the tree
 - animation generation ticks to 21 when we process that style change
 - offsetLeft getter on B triggers a mini-flush, which sets A's et->mLastUpdateFlushCount to 20 (to match its mLastUpdate, which is and should still be 20, since it hasn't changed)
 - refresh driver tick to time=102
 - style change on A that involves reversing of transitions (no flush)
 - refresh driver tick to time=103
 - mini-flush triggered by refresh driver tick --> we skip doing anything to A because et->mLastUpdateFlushCount and et->mFlushCount are both 20.

The point is that the animation generation on et is a function of the last actual change to that transition (the animation progressing doesn't count as an actual change), but the mini-flush actually requires things be up-to-date to the progress of the animation.
(In reply to Nick Cameron [:nrc] from comment #191)
> (In reply to David Baron [:dbaron] from comment #186)
> > Comment on attachment 683880 [details] [diff] [review]
> > part 5: changes to nsTransitionManager
> > 
> > I don't understand why you are:
> >  * changing the handling of removed sentinels in ValuePortionFor
> 
> When the transition is a sentinel, mStartTime is null, and so without this
> we get a null assertion in the expression |timePortion = (aRefreshTime -
> mStartTime) ...| a few lines below.

I'd rather have things set up so we don't call ValuePortionFor on removed sentinels.  Which I guess moves us to the question of why you did:
  >-      if (pt.IsRemovedSentinel()) {
  >-        continue;
  >-      }
in EnsureStyleRuleFor.  I don't think that should be needed either... you could then hopefully drop both changes.

> >  * changing the < to <= in IsRunningAt
> 
> This is not redundant after bug 788049. I forget exactly why it was needed,
> but I have a test case which is visibly wrong without the change (when a new
> transition starts as an old one ends, the new transition does not start
> properly). It makes logical sense to me - if the current time is equal to
> the starting time of the transition, then I would consider the transition to
> be running.
> 
> Looking at the code I think the change affects things by not properly adding
> the animation to the layer when the layer is reconstructed at the start of a
> transition. That is, if the animation is not added when it is brand new,
> then it misses it chance, and never gets added to the layer.

The only thing we use IsRunningAt for is the geometric property stuff in bug 784239, which I consider pretty broken to begin with.  So I'm fine with the change, I guess, but you should make the same change in nsAnimationManager.
Blocks: 819191
(In reply to David Baron [:dbaron] from comment #194)
> (In reply to Nick Cameron [:nrc] from comment #191)
> > (In reply to David Baron [:dbaron] from comment #186)
> > > Comment on attachment 683880 [details] [diff] [review]
> > > part 5: changes to nsTransitionManager
> > > 
> > > I don't understand why you are:
> > >  * changing the handling of removed sentinels in ValuePortionFor
> > 
> > When the transition is a sentinel, mStartTime is null, and so without this
> > we get a null assertion in the expression |timePortion = (aRefreshTime -
> > mStartTime) ...| a few lines below.
> 
> I'd rather have things set up so we don't call ValuePortionFor on removed
> sentinels.  Which I guess moves us to the question of why you did:
>   >-      if (pt.IsRemovedSentinel()) {
>   >-        continue;
>   >-      }
> in EnsureStyleRuleFor.  I don't think that should be needed either... you
> could then hopefully drop both changes.
> 

Yes you are right, I can undo both changes and will do that. I assume that is due to the changes in bug 788049.

> > >  * changing the < to <= in IsRunningAt
> > 
> > This is not redundant after bug 788049. I forget exactly why it was needed,
> > but I have a test case which is visibly wrong without the change (when a new
> > transition starts as an old one ends, the new transition does not start
> > properly). It makes logical sense to me - if the current time is equal to
> > the starting time of the transition, then I would consider the transition to
> > be running.
> > 
> > Looking at the code I think the change affects things by not properly adding
> > the animation to the layer when the layer is reconstructed at the start of a
> > transition. That is, if the animation is not added when it is brand new,
> > then it misses it chance, and never gets added to the layer.
> 
> The only thing we use IsRunningAt for is the geometric property stuff in bug
> 784239, which I consider pretty broken to begin with.  So I'm fine with the
> change, I guess, but you should make the same change in nsAnimationManager.

We use it also in AddAnimationsAndTransitionsToLayer in nsDisplayList.cpp, and I think it is for this use where we require the change.
(In reply to David Baron [:dbaron] from comment #188)
> Comment on attachment 683882 [details] [diff] [review]
> part 7: odds and ends
> 
> ResolveStyleForElementRules:
>  * should be moved in the series prior to patch 5, which depends on it
>    (and if those patches end up being merged, then prior to patch 3; but
>    this piece is easy to keep separate)
> 

Is it OK to land all of this as one big patch? I fear getting all the dependencies right is now more effort than it is worth.

> 
> The nsDisplayList.cpp changes should be separated into a different patch,
> but that patch should also have the nsStyleTransformMatrix changes from
> patch 4 (separated out from that patch).
> 
> >+        //XXX[nrc] is it OK to add this?
> 
> I'm probably not the right person to ask this question, so maybe just
> leave the comment?

I checked with mattwoodrow and roc, and we think this is a sensible thing to do.
(In reply to Nick Cameron [:nrc] from comment #195)
> > The only thing we use IsRunningAt for is the geometric property stuff in bug
> > 784239, which I consider pretty broken to begin with.  So I'm fine with the
> > change, I guess, but you should make the same change in nsAnimationManager.
> 
> We use it also in AddAnimationsAndTransitionsToLayer in nsDisplayList.cpp,
> and I think it is for this use where we require the change.

ok, but still keep the animations one in sync with the transitions one

(In reply to Nick Cameron [:nrc] from comment #196)
> (In reply to David Baron [:dbaron] from comment #188)
> > Comment on attachment 683882 [details] [diff] [review]
> > part 7: odds and ends
> > 
> > ResolveStyleForElementRules:
> >  * should be moved in the series prior to patch 5, which depends on it
> >    (and if those patches end up being merged, then prior to patch 3; but
> >    this piece is easy to keep separate)
> > 
> 
> Is it OK to land all of this as one big patch? I fear getting all the
> dependencies right is now more effort than it is worth.

I suppose so.
(In reply to David Baron [:dbaron] from comment #193)
> (In reply to Nick Cameron [:nrc] from comment #189)
> > (In reply to David Baron [:dbaron] from comment #186)
> > > Comment on attachment 683880 [details] [diff] [review]
> > > part 5: changes to nsTransitionManager
> > > 
> > > The way UpdateStylesToPrimaryFrames and DoUpdateElementAndDescendants
> > > interact with each other isn't going to behave correctly when you have
> > > transitions on a descendant of a display:none element that has an
> > > ancestor that also has transitions, since UpdateStylesToPrimaryFrames
> > > will decide to call DoUpdateElementAndDescendants on the ancestor, which
> > > will find the lack of a primary frame on the display:none element and
> > > not bother crawling the descendants.  I think the easiest fix is to
> > > change the if (!primaryFrame) return in DoUpdateElementAndDescendants
> > > into an if (primaryFrame) { do next 3 lines }.  However, it would be
> > > slightly faster to instead fix it by modifying
> > > UpdateStylesToPrimaryFrames by moving the DoUpdateElementAndDescendants
> > > call up into the for-loop over ancestors, but guarded by a check of
> > > et->mLastUpdateFlushCount (on the et resulting from the call to
> > > GetElementTransitions inside that for loop).
> > > 
> > 
> > I'm not sure I understand the faster approach - don't we then miss a null
> > check on primaryFrame in the recursive call of DoUpdateElementAndDescendants?
> 
> In the faster approach, I meant you'd leave the primaryFrame null-check the
> way it is now.
> 

Hmm, so after some more testing, I can't see why I independently implemented the slow approach. In any test cases, whenever I have a null primary frame, I have no children, so it makes no difference. I also sometimes see a null primary frame in the call from UpdateStylesToPrimaryFrames, but doing nothing seems to do no harm.
(In reply to David Baron [:dbaron] from comment #193)
> (In reply to Nick Cameron [:nrc] from comment #189)
> > (In reply to David Baron [:dbaron] from comment #186)
> > > Comment on attachment 683880 [details] [diff] [review]
> > > part 5: changes to nsTransitionManager
> > > 
> > > It's not clear to me why ForceLayerRerendering is needed.  Could you
> > > explain?
> > > 
> > 
> > I've added the following comment to ForceLayerRerendering:
> > 
> > // Ensure that the next repaint rebuilds the layer tree for aFrame. That
> > // means that changes to animations on aFrame's layer are propogated to
> > // the compositor, which is needed for correct behaviour of new
> > // transitions.
> > 
> > Is that OK?
> 
> Other than the spelling of propagated, perhaps.  But I still don't
> understand why we'd need to rebuild the layer tree in this case.  The point
> of the calling code is to update the style data to match what's already
> happening on the compositor thread (with style data changes throttled), so I
> still don't see why we'd need to propagate a change into the layer tree in
> this code.
> 

It is in the case where we end a transition and start a new one that it is required - in that case we need to change the animation on the Layer, and that does not happen anywhere else. In fact this is currently broken on trunk with OMTA turned on :-( So, even if we do no throttling, then we need to do this to make the compositor up to date with the main thread.

I guess we could optimise this to only do it when a transition ends, but that seems to be the reason why we mini-flush most of the time anyway.
Re comment 199:  why doesn't the animation generation check take care of that?
Comment on attachment 683877 [details] [diff] [review]
Part 1: Add an animation flush type

In addition to dbaron's comments, please rev the nsIPresShell IID.

I don't see SetNeedsStyleFlush calls for throttled animations in the later patches, only for throttled transitions.  Am I just missing something?

r=me with that fixed.
Attachment #683877 - Flags: review?(bzbarsky) → review+
Comment on attachment 683878 [details] [diff] [review]
Part 3: Throttle animations and transitions

Nothing jumping out at me here that dbaron didn't cover...
Attachment #683878 - Flags: review?(bzbarsky) → review+
Comment on attachment 683879 [details] [diff] [review]
part 4: some refactoring

r=me
Attachment #683879 - Flags: review?(bzbarsky) → review+
Comment on attachment 683880 [details] [diff] [review]
part 5: changes to nsTransitionManager

ReparentBeforeAndAfter should probably use SetStyleContextWithoutNotification, right?  Same for the reparenting case in DoUpdateElementAndDescendants?

Why does ForceLayerRerendering need to only do the opacity layer if it has both opacity and transform?  If that's actually correct, it needs a comment explaining why.

>+    NS_WARNING("No primary frame for element");

I don't think you should warn here.  Plenty of elements without primary frames.

>+      NS_WARNING("Element has no primary frame, cannot reparent style");

Should we be reparenting undisplayed style contexts here as needed?

The reparenting case in DoUpdateElementAndDescendants should probably also not notify about the new style context.

>+    // that is, the element where we being updates.

s/being/begin/ ?

Should UpdateStylesToPrimaryFrames be called UpdateStylesOfPrimaryFrames?

r=me with that and dbaron's comments, I think.
Attachment #683880 - Flags: review?(bzbarsky) → review+
Comment on attachment 683882 [details] [diff] [review]
part 7: odds and ends

With dbaron's comments, r=me
Attachment #683882 - Flags: review?(bzbarsky) → review+
(In reply to David Baron [:dbaron] from comment #200)
> Re comment 199:  why doesn't the animation generation check take care of
> that?

That only causes us to not throttle the flush, but in the case of a non-throttled flush, we still need to update the layer, and without this change, we don't do that.
Comment on attachment 685906 [details] [diff] [review]
changes patch (no SMIL)

r=me
Attachment #685906 - Flags: review?(bzbarsky) → review+
Comment on attachment 684139 [details] [diff] [review]
part 6: various counters for keeping track of flushes

The various prescontext getters should be const.

r=me with that.  I really hope I've pieced the jigsaw pieces together correctly in my head....
Attachment #684139 - Flags: review?(bzbarsky) → review+
(In reply to David Baron [:dbaron] from comment #193)
> (In reply to Nick Cameron [:nrc] from comment #192)
> > (In reply to David Baron [:dbaron] from comment #187)
> > > In part 5, mLastUpdateFlushCount should not be a flush count.  It should
> > > be a refresh driver time stamp.  This is probably the source of some
> > > bugs, since it probably means that the mini-flush code essentially
> > > doesn't work right the second time.  This is because the flush count
> > > (renaming to animation generation) only updates when there's a style
> > > change -- something that would require sending new information to the
> > > layer tree and the compositor thread.  However, the mini-flush of
> > > throttled styles might need to happen more than once -- it's needed to
> > > get the style up-to-date prior to a style change that might or might not
> > > affect the particular element.  So please rename mLastUpdateFlushCount
> > > to mLastUpdateThrottledStyle, and assign to it from |now| and compare it
> > > against |now| (|now| being a variable holding the result of
> > > RefreshDriver()->MostRecentRefresh()) rather than from
> > > |et->mFlushCount|.
> > > 
> > 
> > The idea with mLastUpdateFlushCount is that it is only used within a
> > mini-flush and never outside of one to determine whether a mini-flush should
> > take place. It is used to tick transitions once they have been updated in
> > the current mini-flush, i.e., removing them from the working set of
> > transitions to update. If there is somewhere that mLastUpdateFlushCount is
> > used other than this, then it is a mistake and I should fix it, but I
> > couldn't find anywhere. Did you mean a different counter? I guess I should
> > add a better comment.
> 
> No, I stand by what I said.
> 
> Basically, the problem I'm worried about is the following sequence of events.
>  - transition is started on an element A, ticking animation generation to 20
> and setting A's et->mFlushCount to 20
>  - refresh driver tick to time=101
>  - element.style set on an element B in another part of the tree
>  - animation generation ticks to 21 when we process that style change
>  - offsetLeft getter on B triggers a mini-flush, which sets A's
> et->mLastUpdateFlushCount to 20 (to match its mLastUpdate, which is and
> should still be 20, since it hasn't changed)
>  - refresh driver tick to time=102
>  - style change on A that involves reversing of transitions (no flush)
>  - refresh driver tick to time=103
>  - mini-flush triggered by refresh driver tick --> we skip doing anything to
> A because et->mLastUpdateFlushCount and et->mFlushCount are both 20.
> 
> The point is that the animation generation on et is a function of the last
> actual change to that transition (the animation progressing doesn't count as
> an actual change), but the mini-flush actually requires things be up-to-date
> to the progress of the animation.

OK, so I think I understand the issue now, and I agree that we are currently broken. How do you feel about adding another generation count which gets ticked every time we do a mini flush and updating mLastUpdateFlushCount to that rather than mFlushCount? I would prefer to do that rather than use a TimeStamp because I think it makes the intent (maintaining a working set) clearer (a TimeStamp makes me think that the time is important, where I don't think it is).
Try run: https://tbpl.mozilla.org/?tree=Try&rev=fadeec59cae6
and without patches for comparison: https://tbpl.mozilla.org/?tree=Try&rev=1a980eb6246b

Despite appearances, we are looking good. The only problem I see is the crashtest for bug 813372 where we assert four times (I can reproduce locally). This is because the debug verification of the style tree in nsFrameManager is off. Due to the mini-flush, we get a different style via nsFrame::DoGetParentStyleContextFrame than is set as the parent style (nsFrameManager.cpp:576). I'm not sure whether we should be setting something extra in the mini-flush, so nsFrame::DoGetParentStyleContextFrame returns the 'right' thing, or changing the verification to take account of the mini-flush.
Most of these changes are simple and requested by the reviewers, but some are large enough I thought I should get re-review, and there are some other changes for bug fixes.
Attachment #690158 - Flags: review?(dbaron)
Comment on attachment 690158 [details] [diff] [review]
reviewer's comments and more changes

I went through this pretty quickly and it looks like what I expected; I'm stamping r=dbaron, but let me know if there's something you wanted me to look at more closely.


>-  wantNextFrame |= SampleAnimations(root, mLastCompose);
>+  //XXX[nrc] is this right? (mLastCompose -> aCurrentFrame)
>+  wantNextFrame |= SampleAnimations(root, aCurrentFrame);

I don't know.
Attachment #690158 - Flags: review?(dbaron) → review+
dbaron: could you look at comment 210 and comment 209 please? From the last patch, I am a little concerned with the change to the removed sentinel removal logic - nsTransitionManager.cpp:1021; does that look sensible?
Flags: needinfo?(dbaron)
(In reply to Nick Cameron [:nrc] from comment #209)
> OK, so I think I understand the issue now, and I agree that we are currently
> broken. How do you feel about adding another generation count which gets
> ticked every time we do a mini flush and updating mLastUpdateFlushCount to
> that rather than mFlushCount? I would prefer to do that rather than use a
> TimeStamp because I think it makes the intent (maintaining a working set)
> clearer (a TimeStamp makes me think that the time is important, where I
> don't think it is).

I'd rather just use a time stamp than have yet-another-generation-counter.


(In reply to Nick Cameron [:nrc] from comment #210)
> Despite appearances, we are looking good. The only problem I see is the
> crashtest for bug 813372 where we assert four times (I can reproduce
> locally). This is because the debug verification of the style tree in
> nsFrameManager is off. Due to the mini-flush, we get a different style via
> nsFrame::DoGetParentStyleContextFrame than is set as the parent style
> (nsFrameManager.cpp:576). I'm not sure whether we should be setting
> something extra in the mini-flush, so nsFrame::DoGetParentStyleContextFrame
> returns the 'right' thing, or changing the verification to take account of
> the mini-flush.

I'd like bz's opinion on this one.
Flags: needinfo?(dbaron) → needinfo?(bzbarsky)
> This is because the debug verification of the style tree in nsFrameManager is off.

Hmm.  What runs that debug verification?  Ideally we just wouldn't be running it while in our slightly-inconsistent state; I'd somewhat hoped that was already the case.  Can you attach a copy of the stack to the assert?
Flags: needinfo?(bzbarsky)
stack trace:

#0  VerifyContextParent (aPresContext=0x7fffcab0f000, aFrame=0x7fffdb5ac760, 
    aContext=0x7fffdb5ade20, aParentContext=0x7fffdb5ad258)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsFrameManager.cpp:596
#1  0x00007ffff0f5e6b3 in VerifyStyleTree (aPresContext=0x7fffcab0f000, aFrame=0x7fffdb5ac760, 
    aParentContext=0x7fffdb5ad528)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsFrameManager.cpp:637
#2  0x00007ffff0f5e964 in nsFrameManager::DebugVerifyStyleTree (this=0x7fffcab10400, 
    aFrame=0x7fffdb5ac760) at /home/ncameron/hdd2/firefox/src/layout/base/nsFrameManager.cpp:687
#3  0x00007ffff0ef389e in nsCSSFrameConstructor::ProcessRestyledFrames (this=0x7fffcab10400, 
    aChangeList=...) at /home/ncameron/hdd2/firefox/src/layout/base/nsCSSFrameConstructor.cpp:8265
#4  0x00007ffff0ed9212 in mozilla::css::RestyleTracker::ProcessOneRestyle (this=0x7fffcab104f0, 
    aElement=0x7fffcaecf7d0, aRestyleHint=0, aChangeHint=nsChangeHint_RepaintFrame)
    at /home/ncameron/hdd2/firefox/src/layout/base/RestyleTracker.cpp:131
#5  0x00007ffff0ed97c6 in mozilla::css::RestyleTracker::DoProcessRestyles (this=0x7fffcab104f0)
    at /home/ncameron/hdd2/firefox/src/layout/base/RestyleTracker.cpp:238
#6  0x00007ffff0ede737 in mozilla::css::RestyleTracker::ProcessRestyles (this=0x7fffcab104f0)
    at /home/ncameron/hdd2/firefox/src/layout/base/RestyleTracker.h:68
#7  0x00007ffff0efcfca in nsCSSFrameConstructor::ProcessPendingRestyles (this=0x7fffcab10400)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsCSSFrameConstructor.cpp:12108
#8  0x00007ffff0f9758c in PresShell::FlushPendingNotifications (this=0x7fffc85aaca0, aFlush=...)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:3866
#9  0x00007ffff0f970c4 in PresShell::FlushPendingNotifications (this=0x7fffc85aaca0, 
    aType=Flush_InterruptibleLayout)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:3755
#10 0x00007ffff0f96fdf in PresShell::HandlePostedReflowCallbacks (this=0x7fffc85aaca0, 
    aInterruptible=true) at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:3724
#11 0x00007ffff0fa3143 in PresShell::DidDoReflow (this=0x7fffc85aaca0, aInterruptible=true)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:7407
#12 0x00007ffff0fa42be in PresShell::ProcessReflowCommands (this=0x7fffc85aaca0, aInterruptible=true)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:7705
#13 0x00007ffff0f97741 in PresShell::FlushPendingNotifications (this=0x7fffc85aaca0, aFlush=...)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsPresShell.cpp:3905
#14 0x00007ffff0fb0480 in nsRefreshDriver::Notify (this=0x7fffcab0fc00, aTimer=0x7fffcaef8400)
    at /home/ncameron/hdd2/firefox/src/layout/base/nsRefreshDriver.cpp:404
#15 0x00007ffff27939cc in nsTimerImpl::Fire (this=0x7fffcaef8400)
    at /home/ncameron/hdd2/firefox/src/xpcom/threads/nsTimerImpl.cpp:485
#16 0x00007ffff2793db0 in nsTimerEvent::Run (this=0x7fffe7596188)
    at /home/ncameron/hdd2/firefox/src/xpcom/threads/nsTimerImpl.cpp:565
#17 0x00007ffff278bb89 in nsThread::ProcessNextEvent (this=0x7ffff6c6f640, mayWait=false, 
    result=0x7fffffffb88f) at /home/ncameron/hdd2/firefox/src/xpcom/threads/nsThread.cpp:627
#18 0x00007ffff27191c8 in NS_ProcessNextEvent_P (thread=0x7ffff6c6f640, mayWait=false)
    at /home/ncameron/hdd2/firefox/obj-debug/xpcom/build/nsThreadUtils.cpp:221
#19 0x00007ffff245fdb2 in mozilla::ipc::MessagePump::Run (this=0x7fffe75291c0, 
    aDelegate=0x7ffff6cb72b0) at /home/ncameron/hdd2/firefox/src/ipc/glue/MessagePump.cpp:82
#20 0x00007ffff27e5e65 in MessageLoop::RunInternal (this=0x7ffff6cb72b0)
    at /home/ncameron/hdd2/firefox/src/ipc/chromium/src/base/message_loop.cc:215
#21 0x00007ffff27e5df6 in MessageLoop::RunHandler (this=0x7ffff6cb72b0)
    at /home/ncameron/hdd2/firefox/src/ipc/chromium/src/base/message_loop.cc:208
#22 0x00007ffff27e5dcf in MessageLoop::Run (this=0x7ffff6cb72b0)
    at /home/ncameron/hdd2/firefox/src/ipc/chromium/src/base/message_loop.cc:182
#23 0x00007ffff22d70c6 in nsBaseAppShell::Run (this=0x7fffe17f7a20)
    at /home/ncameron/hdd2/firefox/src/widget/xpwidgets/nsBaseAppShell.cpp:163
#24 0x00007ffff1fec50c in nsAppStartup::Run (this=0x7fffe17ebb00)
    at /home/ncameron/hdd2/firefox/src/toolkit/components/startup/nsAppStartup.cpp:291
#25 0x00007ffff0af2508 in XREMain::XRE_mainRun (this=0x7fffffffbd40)
    at /home/ncameron/hdd2/firefox/src/toolkit/xre/nsAppRunner.cpp:3824
#26 0x00007ffff0af27f6 in XREMain::XRE_main (this=0x7fffffffbd40, argc=4, argv=0x7fffffffe1a8, 
    aAppData=0x637c40) at /home/ncameron/hdd2/firefox/src/toolkit/xre/nsAppRunner.cpp:3891
#27 0x00007ffff0af2a51 in XRE_main (argc=4, argv=0x7fffffffe1a8, aAppData=0x637c40, aFlags=0)
    at /home/ncameron/hdd2/firefox/src/toolkit/xre/nsAppRunner.cpp:4089
#28 0x0000000000402b8a in do_main (argc=4, argv=0x7fffffffe1a8)
    at /home/ncameron/hdd2/firefox/src/browser/app/nsBrowserApp.cpp:174
#29 0x0000000000402e5b in main (argc=4, argv=0x7fffffffe1a8)
    at /home/ncameron/hdd2/firefox/src/browser/app/nsBrowserApp.cpp:279
Flags: needinfo?(bzbarsky)
Oh, hmm.  That's a style flush that did a miniflush, yeah...

I think the only sane thing to do is to not do the DebugVerifyStyleTree if our styles aren't really up to date.  David?
Flags: needinfo?(bzbarsky)
Flags: needinfo?(dbaron)
Attached patch updated changes 2 (obsolete) — Splinter Review
updated the last set of changes to use a time stamp rather than a generation count corrected the grammar in a comment. Carrying r=dbaron
Attachment #690158 - Attachment is obsolete: true
Attachment #690714 - Flags: review+
Attachment #690715 - Flags: review?(bzbarsky)
Blocks: 820258
and a version which actually builds, carrying r=dbaron
Attachment #690714 - Attachment is obsolete: true
Attachment #690741 - Flags: review+
Attachment #684347 - Attachment is obsolete: true
Comment on attachment 690715 [details] [diff] [review]
part 10: don't verify animated frames

r=me
Attachment #690715 - Flags: review?(bzbarsky) → review+
Looking good in Try: https://tbpl.mozilla.org/?tree=Try&rev=99afcea8990b
(For comparison, OMTC, no throttling: https://tbpl.mozilla.org/?tree=Try&rev=1a980eb6246b)
Attachment #684112 - Attachment is obsolete: true
Attachment #684114 - Attachment is obsolete: true
Apparently we need double parenthesis to get around the warnings as errors things for bracketing assignment as bool :-( Carrying r=bz
Attachment #690715 - Attachment is obsolete: true
Attachment #691007 - Flags: review+
Flags: needinfo?(dbaron)
Attached patch rollup patchSplinter Review
rollup of parts 1 and 3-10. Carrying r=dbaron,bz
Attachment #691026 - Flags: review+
Comment on attachment 691026 [details] [diff] [review]
rollup patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
User impact if declined: slower app startup time on b2g
Testing completed (on m-c, etc.): Landed today, but I propose leaving it for a couple of days before landing on Aurora, getting a+ ahead of time.
Risk to taking this patch (and alternatives if risky): Kind of risky - style system and layout changes, but mostly behind a pref, off everywhere except b2g. OMTA animations/transitions are broken on Aurora, this makes them no worse (and fixes them with the pref turned on).
String or UUID changes made by this patch: nsIPresShell
Attachment #691026 - Flags: approval-mozilla-aurora?
We should consider a bb+ for this from the startup and UX side.  (Approval works too, of course.)

This is the biggest "platform" startup known right now, which is our #1 perf issue.  It's a startup win for all apps, generically.

There's not really an alternative to this patch except changing the UX spec for app opening.  The UX spec and the code being fixed here are in direct competition.

We definitely shouldn't land this on current beta, but if we bb+ or approve it will need to go on aurora.
blocking-basecamp: --- → ?
Whiteboard: [leave open][fast:200-400ms] → [leave open][fast:200-400ms], UX-P1
Keywords: perf
Depends on: 820864
(In reply to Chris Jones [:cjones] [:warhammer] from comment #227)
> We should consider a bb+ for this from the startup and UX side.  (Approval
> works too, of course.)

We discussed this during triage today.  While we don't think it needs to block, and seems risky at this time, the benefits probably justify an uplift.  Does it have to go to aurora or can we just uplift to mozilla-b2g18?
It has to land on Aurora too.
Comment on attachment 691026 [details] [diff] [review]
rollup patch

See above.
Attachment #691026 - Flags: approval-mozilla-b2g18?
Let's take this on aurora and mozilla-b2g18 ASAP to get as much exposure as possible.
blocking-basecamp: ? → +
Attachment #691026 - Flags: approval-mozilla-b2g18?
Attachment #691026 - Flags: approval-mozilla-b2g18+
Attachment #691026 - Flags: approval-mozilla-aurora?
Attachment #691026 - Flags: approval-mozilla-aurora+
I rebased this to b2g18, but it was slightly nontrivial.  I'm going to run some tests and then post the patch for your quick scan.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #234)
> I rebased this to b2g18, but it was slightly nontrivial.  I'm going to run
> some tests and then post the patch for your quick scan.

Ah sorry, I had already done this yesterday. I guess I can just compare our patches - hopefully they'll look the same.
Attached patch Rebased to b2g18Splinter Review
No worries at all.  Here's mine.
Is there more work to be done here to justify the [leave open] on the whiteboard, or can this be resolved?
(In reply to Chris Jones [:cjones] [:warhammer] from comment #236)
> Created attachment 692546 [details] [diff] [review]
> Rebased to b2g18
> 
> No worries at all.  Here's mine.

reassuringly we have come up with almost exactly the same patch :-) I'll land in a few minutes...
(In reply to Ryan VanderMeulen from comment #237)
> Is there more work to be done here to justify the [leave open] on the
> whiteboard, or can this be resolved?

No, done now (might want to check I got the tracking flags right though, not 100% sure how they should get set).
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
tracking-b2g18: --- → ?
Resolution: --- → FIXED
Whiteboard: [leave open][fast:200-400ms], UX-P1 → [fast:200-400ms], UX-P1
Target Milestone: --- → mozilla20
Depends on: 822231
Depends on: 822721
Depends on: 825808
Depends on: 832998
Whiteboard: [fast:200-400ms], UX-P1 → [fast:200-400ms], UX-P1, [perf_tsk]
Whiteboard: [fast:200-400ms], UX-P1, [perf_tsk] → [fast:200-400ms], UX-P1, [FFOS_perf]
No longer blocks: 796254
Whiteboard: [fast:200-400ms], UX-P1, [FFOS_perf] → [fast:200-400ms], UX-P1
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: