Last Comment Bug 758505 - Avoid invalidating when setting identical animation values.
: Avoid invalidating when setting identical animation values.
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: SVG (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla15
Assigned To: Nobody; OK to take it and work on it
:
:
Mentors:
: 755209 (view as bug list)
Depends on:
Blocks: dlbi
  Show dependency treegraph
 
Reported: 2012-05-24 21:37 PDT by Matt Woodrow (:mattwoodrow)
Modified: 2012-06-03 20:38 PDT (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Part 1 - lengths only (1.06 KB, patch)
2012-05-24 21:37 PDT, Matt Woodrow (:mattwoodrow)
dholbert: review+
Details | Diff | Splinter Review
reftest tweak to make the animation longer & repeating (1.51 KB, patch)
2012-05-29 13:14 PDT, Daniel Holbert [:dholbert]
no flags Details | Diff | Splinter Review
Part 2 - other non-list types (9.70 KB, patch)
2012-05-31 08:53 PDT, Robert Longson
dholbert: review+
Details | Diff | Splinter Review
Part 3 - what's left (3.92 KB, patch)
2012-06-01 03:40 PDT, Robert Longson
dholbert: review+
joe: review+
Details | Diff | Splinter Review

Description Matt Woodrow (:mattwoodrow) 2012-05-24 21:37:39 PDT
Created attachment 627099 [details] [diff] [review]
Part 1 - lengths only

With my DLBI patches (http://hg.mozilla.org/users/mwoodrow_mozilla.com/dlbi), I get a reftest timed out error in layout/reftests/svg/smil/anim-filter-primitive-size-01.svg

It appears that we're getting endless calls nsSVGLength2::SetAnimValueInSpecifiedUnits(), always with the same value.

Each one of these triggers a reflow, and then invalidation. Because of this isMozAfterPaintPending is always true, and the reftest harness waits on it indefinitely.

Not exactly sure why this only happens with DLBI.

Attached patch fixes the issue for me, entirely guessing here though.
Comment 1 Robert Longson 2012-05-24 23:43:21 PDT
If we need to do this then we need it for all types e.g.  nsSVGAngle::SetAnimValue,  nsSVGBoolean::SetAnimValue etc. http://mxr.mozilla.org/mozilla-central/search?string=-%3EDidAnimate (but not the ClearAnimVal cases).
Comment 2 Daniel Holbert [:dholbert] 2012-05-25 11:17:27 PDT
> Not exactly sure why this only happens with DLBI.

I think I want to understand this more before r+'ing...  How much of your DLBI patch-queue should I apply to get a build that reproduces this issue?  I tried applying all of it to up-to-date m-c, but I got conflicts when qpushing the patch "debugging-code".
Comment 3 Matt Woodrow (:mattwoodrow) 2012-05-25 13:05:17 PDT
(In reply to Daniel Holbert [:dholbert] from comment #2)
> > Not exactly sure why this only happens with DLBI.
> 
> I think I want to understand this more before r+'ing...  How much of your
> DLBI patch-queue should I apply to get a build that reproduces this issue? 
> I tried applying all of it to up-to-date m-c, but I got conflicts when
> qpushing the patch "debugging-code".

That patch (and the ones after it) probably won't apply, you don't need them. The commit comment on the patch queue repo lists the patch to stop at, and the m-c revision is was based on.
Comment 4 Matt Woodrow (:mattwoodrow) 2012-05-25 17:37:48 PDT
As a test, you can just load the test file in a trunk build (without any of my patches).

I'm still seeing a steady 80% CPU usage, after the animation is finished. Paint flashing shows the entire content area being repainted every time.

Even if my fix isn't the right way to go about this, we really need to fix this. Repainting the whole page of identical content at 60fps is sadmaking.
Comment 5 Daniel Holbert [:dholbert] 2012-05-29 12:41:25 PDT
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> As a test, you can just load the test file in a trunk build (without any of
> my patches).
> 
> I'm still seeing a steady 80% CPU usage, after the animation is finished.

That's known/expected at the moment -- see bug 755209 comment 1.

> Even if my fix isn't the right way to go about this, we really need to fix
> this. Repainting the whole page of identical content at 60fps is sadmaking.

Agreed (though note that this is a somewhat pathological testcase -- e.g. it only repaints the whole page because we've got an animation that affects the whole page).  But yeah, we definitely should fix that issue.  I suggest we use bug 755209 to track that.

As for this bug here -- focusing on the test timeout -- I believe the issue here is more fundamental.  If we have a SMIL animation running in a reftest (and let's suppose it's an animation that continues to animate, unlike anim-filter-primitive-size-01.svg), then it looks like (?) DLBI can prevent us from ever firing MozReftestInvalidate, because we'll always have isMozAfterPaintPending set (as noted in comment 0).

That's bad, I think.

IIUC, MozReftestInvalidate is supposed to fire after our _initial_ rendering is done (catching all the reflows/paints that are pending when pageload fires).  Any animations that are running from that point on need to _not_ hold up MozReftestInvalidate.  If my understanding is correct on that (is it?), then I think this is a bug in DLBI interacting with MozReftestInvalidate.

It might be that we're currently checking isMozAfterPaintPending at the end of a refresh-driver sample, but we want to actually check at the _beginning_, for determining whether to fire MozReftestInvalidate. (so that we aren't blocked by any animation-triggered paints/reflows)  Or something like that.
Comment 6 Daniel Holbert [:dholbert] 2012-05-29 13:14:19 PDT
Created attachment 628080 [details] [diff] [review]
reftest tweak to make the animation longer & repeating

(In reply to Daniel Holbert [:dholbert] from comment #5)
> If we have a SMIL animation running in a reftest
> (and let's suppose it's an animation that continues to animate, unlike
> anim-filter-primitive-size-01.svg), then it looks like (?) DLBI can prevent
> us from ever firing MozReftestInvalidate, because we'll always have
> isMozAfterPaintPending set (as noted in comment 0).

To illustrate this: here's a patch to tweak this reftest such that it still fails with DLBI + the first-pass patch here. (but passes in a clean m-c build)  

I tweaked the animations to make them last longer (so that we'll have invalidations coming for at least 20s), and then I made those animations repeat indefinitely, for good measure.

Some substantial fraction of the time, the failure is because this reftest never ever gets a MozReftestInvalidate event fired.  I can tell because I see the animation play out continuously, over and over, in the reftest window. (If we fired MozReftestInvalidate, then the test's JS would run, and it'd pause the animation -- so that's how I know MozReftestInvalidate isn't firing.)

The rest of the time, it looks like we're firing MozReftestInvalidate, but then we're delaying the reftest snapshot, probably for a similar reason.

The first part (MozReftestInvalidate events being held up indefinitely) is definitely a bug that we should fix in the MozReftestInvalidate-triggering mechanics.

The second part (take-reftest-snapshot events being held up indefinitely) is maybe slightly-less-obviously a bug, since we normally expect documents to be frozen when we take the reftest snapshot -- but it could be that we have e.g. a lime rect dancing around a lime document, and we might want to take a snapshot of that, while it's still animating.  So I think that's probably a situation that reftests should allow (as they do currently).
Comment 7 Matt Woodrow (:mattwoodrow) 2012-05-29 14:46:32 PDT
(In reply to Daniel Holbert [:dholbert] from comment #5)

> IIUC, MozReftestInvalidate is supposed to fire after our _initial_ rendering
> is done (catching all the reflows/paints that are pending when pageload
> fires).  Any animations that are running from that point on need to _not_
> hold up MozReftestInvalidate.  If my understanding is correct on that (is
> it?), then I think this is a bug in DLBI interacting with
> MozReftestInvalidate.

I'm not sure that I agree here, my understanding was that the test page was required to reach a steady state before MozReftestInvalidate was fired.

I can't see any real way to distinguish between paints that are pending because of pageload, and paints that are triggered from running animations.

CC'ing roc for clarification here.
Comment 8 Daniel Holbert [:dholbert] 2012-05-29 14:49:31 PDT
(In reply to Matt Woodrow (:mattwoodrow) from comment #7)
> I can't see any real way to distinguish between paints that are pending
> because of pageload, and paints that are triggered from running animations.

(Don't we currently make this distinction, somehow? If we didn't, I'd expect we would have run up against this issue already.)
Comment 9 Matt Woodrow (:mattwoodrow) 2012-05-29 14:51:58 PDT
Note that the reftest harness doesn't just do a single IsMozPaintEventPending check after pageload, it will check again after MozAfterPaint is fired until there are no paints pending.

My best guess as to why this passes on trunk is a timing issue, in that we manage to get a MozAfterPaint callback and then finish the test before the SMIL animation controller invalidates again.
Comment 10 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-29 16:57:27 PDT
Right. There is no attempt to distinguish "paints before MozReftestInvalidate fires" from "paints after MozReftestInvalidate fires". And I don't think we should try to do that, it would be too difficult and confusing. Reftests should just reach a steady state where we're not invalidating. That shouldn't be hard.
Comment 11 Daniel Holbert [:dholbert] 2012-05-29 17:38:58 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #10)
> Reftests should just reach a steady state where we're not
> invalidating. That shouldn't be hard.

For the actual reftest snapshot, sure -- it's reasonable to expect that reftests reach a steady state when they want to be photographed. That's the "second part" that I mentioned in comment 6.

But is it really reasonable to expect that reftests have to _begin_ in a steady state -- that MozReftestInvalidate won't ever be fired if we have animations running?

If that's our expectation, then we probably shouldn't use MozReftestInvalidate at all with testcases that have animations anywhere near the beginning of the document start time.  (or, we'll have to be much more careful about such uses -- e.g. intentionally doing <svg onload="this.pauseAnimations()" ... > to prevent animations from delaying the MozReftestInvalidate firing)

That's fine if we want to do it -- it's just a change from our existing rules-of-thumb for animation tests.
Comment 12 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-05-29 17:43:45 PDT
(In reply to Daniel Holbert [:dholbert] from comment #11)
> But is it really reasonable to expect that reftests have to _begin_ in a
> steady state -- that MozReftestInvalidate won't ever be fired if we have
> animations running?

True, that's less reasonable, but I still think it's better than the alternative.

> If that's our expectation, then we probably shouldn't use
> MozReftestInvalidate at all with testcases that have animations anywhere
> near the beginning of the document start time.  (or, we'll have to be much
> more careful about such uses -- e.g. intentionally doing <svg
> onload="this.pauseAnimations()" ... > to prevent animations from delaying
> the MozReftestInvalidate firing)

Yes.
Comment 13 Daniel Holbert [:dholbert] 2012-05-30 10:37:19 PDT
Comment on attachment 627099 [details] [diff] [review]
Part 1 - lengths only

OK, I'm on-board with the attached patch then.

We really should make the same change across all 19 nsISMILAttr impls for SetAnimValue (or in the function that those impls call into, e.g. nsSVGLength2::SetAnimValueInSpecifiedUnits in this bug's case):
http://mxr.mozilla.org/mozilla-central/search?string=%3A%3ASetAnimValue%28const+nsSMILValue%26+aValue%29&find=&findi=&filter=^[^\0]*%24&hitlimit=&tree=mozilla-central

I feel bad putting mattwoodrow on-the-hook for fixing all of those, though, just to make this one reftest pass.  Matt: if you want to fix those other instances , that would be awesome, but if you're backed-up on DLBI, then I'm OK with taking the attached patch as-is so that existing reftests pass, and then fixing the remaining instances in bug 755209.
Comment 14 Robert Longson 2012-05-31 08:53:23 PDT
Created attachment 628764 [details] [diff] [review]
Part 2 - other non-list types
Comment 15 Robert Longson 2012-05-31 09:02:53 PDT
I don't think nsSMILCSSProperty.cpp, nsSMILMappedAttribute.cpp, SVGMotionSMILAttr need to be changed as they work differently so that leaves

SVGAnimatedLengthList, SVGAnimatedNumberList, SVGAnimatedPathSegList, SVGAnimatedPointList
Comment 16 Robert Longson 2012-05-31 09:29:29 PDT
After looking at the length list types, they don't need doing either as Brian's change in bug 629200 fixed them in the animated case too as their animated and base cases share the "same value detection" code. So as far as I can tell all you need is the patch I've just submitted (together with the already reviewed one for lengths).
Comment 17 Daniel Holbert [:dholbert] 2012-05-31 10:14:56 PDT
Comment on attachment 628764 [details] [diff] [review]
Part 2 - other non-list types

This looks great -- thanks, Robert!
Comment 18 Robert Longson 2012-05-31 10:18:21 PDT
These probably need landing then. I won't be able to do that before Firefox 15 closes.
Comment 19 Daniel Holbert [:dholbert] 2012-05-31 10:28:36 PDT
(In reply to Robert Longson from comment #15)
> I don't think nsSMILCSSProperty.cpp, nsSMILMappedAttribute.cpp,
> SVGMotionSMILAttr need to be changed as they work differently so that leaves

I think we do need to fix those as well, actually.

For nsSMILCSSProperty::SetAnimValue, I forget exactly what triggers invalidation, but I think we'll need to check overrideDecl->GetPropertyValue() just before the SetPropertyValue call.

For nsSMILMappedAttribute, we want to call GetProperty just before we call SetProperty.

And for nsSMILMappedAttribute, we want to tweak nsSVGGraphicElement::SetAnimateMotionTransform() to compare the passed-in matrix against the existing matrix.

(In reply to Robert Longson from comment #18)
> These probably need landing then. I won't be able to do that before Firefox
> 15 closes.

I think the DLBI work isn't quite ready to land, so there's no pressing need for this to land before the cutoff.  It might be best to just wait until after the cutoff, so we can fix all the animated types in one release cycle (including the ones above) and not have a bug straddling releases.
Comment 20 Robert Longson 2012-06-01 03:40:58 PDT
Created attachment 629130 [details] [diff] [review]
Part 3 - what's left
Comment 21 Daniel Holbert [:dholbert] 2012-06-01 11:32:51 PDT
Comment on attachment 629130 [details] [diff] [review]
Part 3 - what's left

>diff --git a/content/smil/nsSMILMappedAttribute.cpp b/content/smil/nsSMILMappedAttribute.cpp
>+  nsRefPtr<nsIAtom> attrName = GetAttrNameAtom();
>+  nsStringBuffer* oldValStrBuf = static_cast<nsStringBuffer*>
>+    (mElement->GetProperty(SMIL_MAPPED_ATTR_ANIMVAL, attrName));
>+  if (oldValStrBuf) {
>+    nsAutoString oldValue;
>+    PRUint32 len = NS_strlen(static_cast<PRUnichar*>(oldValStrBuf->Data()));
>+    oldValStrBuf->ToString(len, oldValue);
>+    if (oldValue.Equals(valStr)) {
>+      return NS_OK;
>+    }

I think this can be shortened to 
  if (oldValStrBuf && valStr.Equals(nsCheapString(oldValStrBuf)) {
    return NS_OK;
  }

(nsCheapString is defined in nsAttrValue.h, and is just syntactic sugar for nsString aka nsAutoString)

>diff --git a/gfx/thebes/gfxMatrix.h b/gfx/thebes/gfxMatrix.h
>+    /* Returns true if the other matrix is fuzzy-equal to this matrix.
>+     * Note that this isn't a cheap comparison!
>+     */
>+    bool operator==(const gfxMatrix& other) const
>+    {
>+      return FuzzyEqual(xx, other.xx) && FuzzyEqual(yx, other.yx) &&
>+             FuzzyEqual(xy, other.xy) && FuzzyEqual(yy, other.yy) &&
>+             FuzzyEqual(x0, other.x0) && FuzzyEqual(y0, other.y0);
>+    }
>+
>+    bool operator!=(const gfxMatrix& other) const
>+    {
>+      return !(*this == other);
>+    }
>+

I'm not sure we actually need the fuzziness of FuzzyEqual for this particular purpose -- after an animation is frozen, I'd expect we'd get the _exact_ same transform every sample -- so I'd imagine that a normal equality-check should be fine.

However, FuzzyEqual is probably what other consumers would expect from gfxMatrix::operator==, and it matches what we do in gfx/2d/Matrix.h (from Bug 705204), so this seems fine. You should get a GFX peer to sign off on it, though.

I'm tagging r?joe for just the gfxMatrix.h chunk, since he wrote the equivalent code in Matrix.h and can hopefully sign off on this quickly & easily.

So, r=me with the nsCheapString thing fixed. (I might just make that change & land this myself, once joe gives the thumbs-up.)
Comment 22 Joe Drew (not getting mail) 2012-06-01 12:20:14 PDT
Comment on attachment 629130 [details] [diff] [review]
Part 3 - what's left

matrix bits look fine, as long as you're sure you need to use it
Comment 23 Daniel Holbert [:dholbert] 2012-06-01 16:55:56 PDT
(In reply to Joe Drew (:JOEDREW!) from comment #22)
> matrix bits look fine, as long as you're sure you need to use it

Yup. Can't compare for equality without it.  Thanks!

(In reply to Daniel Holbert [:dholbert] from comment #21)
> So, r=me with the nsCheapString thing fixed. (I might just make that change
> & land this myself, once joe gives the thumbs-up.)

I ended up making that nsCheapString tweak & landing:
  https://hg.mozilla.org/integration/mozilla-inbound/rev/819174abbf12
  https://hg.mozilla.org/integration/mozilla-inbound/rev/0865e37d8cad
  https://hg.mozilla.org/integration/mozilla-inbound/rev/123be6501757
Comment 25 Daniel Holbert [:dholbert] 2012-06-03 20:38:30 PDT
*** Bug 755209 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.