Intermittent test_animSVGImage.html | timing out after 120000ms. Animated image still doesn't look correct, after call (#2831|#1919|#1890) to onStopFrame

RESOLVED FIXED

Status

()

RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: philor, Assigned: seth)

Tracking

({intermittent-failure})

Trunk
x86
Windows 7
intermittent-failure
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=19220261&tree=Mozilla-Inbound
Rev3 WINNT 6.1 mozilla-inbound pgo test mochitest-other on 2013-01-28 19:03:54 PST for push 4e0b83a23621
slave: talos-r3-w7-020
...
...
...
14129 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
14130 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | timing out after 120000ms.  Animated image still doesn't look correct, after call #2831 to onStopFrame
14131 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
...
...
...

Noisy bugger, innit?
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
This got a lot more frequent on Linux32 in the last day. Can somebody please take a look at this?
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
So, the spammy (multiple-failures-per-day) issues on m-i seem to have started with comment 7, for revision 9f044f9e981a.

Just to be on the safe side, let's assume that one *might've* been random, and the spamminess could have started with the next inbound cset (comment 9, cset ea8b11062c07)  So, here's a pushlog for a day leading up to that cset:
 https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=b093ba2c7ff9&tochange=ea8b11062c07

The most relevant changes there seem to be Seth's patches for bug 704059 (changing how SVG images behave).
(Assignee)

Comment 20

6 years ago
I am taking a look at this now.
Comment hidden (Treeherder Robot)
(Assignee)

Comment 22

6 years ago
Argh. I am having a tough time reproducing this so far. Maybe I'm just being quite unlucky, or maybe my logging code is slowly things down enough that things always work.

The try job where I'm doing this is here:

https://tbpl.mozilla.org/?tree=Try&rev=95e4c4eed660

There are lots of failures due to the logging code I added (timeouts/"too much output", with one crash in test_733533 due to a bug in the logging code that isn't worth a rebuild to fix), but 'oth' on Fedora debug remains solid green.

In case the issue is that timing has changing due to logging, I am preparing a second try job with much more minimal logging that will hopefully still at least give me a clue.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 35

6 years ago
So I've pushed a new try job here to look a bit more into this problem:

https://tbpl.mozilla.org/?tree=Try&rev=788e3c28084a

I've reduced the logging to some simple sanity checking for now to make sure we're not in a totally weird and unexpected state.

The other addition is that we now report a snapshot of the failure, which will hopefully help clarify what's happening here.
Comment hidden (Treeherder Robot)
(Assignee)

Comment 37

6 years ago
Drat, missing return statement burnt that try job. New one: https://tbpl.mozilla.org/?tree=Try&rev=8ee1e8d9b3df
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Summary: Intermittent test_animSVGImage.html | timing out after 120000ms. Animated image still doesn't look correct, after call #2831 to onStopFrame → Intermittent test_animSVGImage.html | timing out after 120000ms. Animated image still doesn't look correct, after call (#2831|#1919) to onStopFrame
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 146

6 years ago
@Ed: Actually I've pushed several more testcases to try, and I haven't posted all of them here. My apologies.

The newest one is here: https://tbpl.mozilla.org/?tree=Try&rev=ff6257a70c18

I think this is the first one that actually _does_ tell us something useful. In reading the following, the important thing to pay attention to is how currentTime (the internal SVG document's perception of where we are in the animation) is changing. Here's an (edited) portion of the log for the run of the test that succeeds (Fedora debug):

> 11459 INFO TEST-START | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html
> 11463 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
> *** VectorImage::Draw for image 0x98841d0 (currentTime? 0.005000)
> ( ... more myOnStopFrame calls, but we never see any currentTime values other than 0.005 and 0.24 ... )
> 11468 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
> *** VectorImage::Draw for image 0x98841d0 (currentTime? 0.240000)
> 11469 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | Animated image looks correct, at call #7 to onStopFrame
> 11470 INFO TEST-END | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | finished in 595ms

Contrast this with the run that fails (Fedora opt):

> 11449 INFO TEST-START | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html
> 11453 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
> *** VectorImage::Draw for image 0xa3ddd160 (currentTime? 0.001000)
> ( ... WAY more myOnStopFrame calls, with many teeny tiny increments to currentTime .. )
> 13286 INFO TEST-PASS | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | myOnStopFrame called
> *** VectorImage::Draw for image 0xa3ddd160 (currentTime? 0.043000)
> 13287 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | timing out after 120000ms.  Animated image still doesn't look correct, after call #1835 to onStopFrame
> 13482 INFO TEST-END | chrome://mochitests/content/chrome/image/test/mochitest/test_animSVGImage.html | finished in 143845ms

Notice that despite running for _way_ longer and passing through many more values of currentTime, the failing run has barely progressed at all through the animation. After 144 seconds it's at 0.043, while in the passing run we're at 0.24 after only 0.6 seconds. The difference in the speed at which we are advancing through the animation is 3 orders of magnitude. I think this is the source of the problem, and if we fix this, this intermittent orange will go away.

As an aside, in a separate bug it'd be good to understand why we redraw so many times when nothing has changed (same currentTime, etc). This isn't exposed in the edited logs above, but if you look at the full logs, we redraw a _lot_ and I don't understand why.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(In reply to Seth Fowler [:seth] from comment #146)
> Notice that despite running for _way_ longer and passing through many more
> values of currentTime, the failing run has barely progressed at all through
> the animation.

Very interesting.  So, that time is supposed to be updated in nsSMILTimeContainer::UpdateCurrentTime(), which is called every sample (driven by the refresh driver).

Probably worthwhile to add some logging there to print out the time we set there, and in nsSMILAnimationController::WillRefresh to print out the time that we're passed by the refresh driver.

> As an aside, in a separate bug it'd be good to understand why we redraw so
> many times when nothing has changed (same currentTime, etc).

That implies we're getting lots of calls to SVGRootRenderingObserver::DoUpdate, in VectorImage.cpp (That's how the internal SVG document tells us something has changed,  invalidates the image-clients and gets itself repainted.)

It's normal for that to be called on each animation sample, if something has changed that requires a re-render.  But I'm not sure why it'd be getting called in this case.
(Assignee)

Comment 153

6 years ago
(In reply to Daniel Holbert [:dholbert] from comment #152)
> Probably worthwhile to add some logging there to print out the time we set
> there, and in nsSMILAnimationController::WillRefresh to print out the time
> that we're passed by the refresh driver.

On it.

> It's normal for that to be called on each animation sample, if something has
> changed that requires a re-render.  But I'm not sure why it'd be getting
> called in this case.

Yeah, I'm investigating this as well, because I'm a bit concerned about it. In my logging patch I've also added some code to differentiate the different callers of DoUpdate, so we can try to figure out what's going on.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 166

6 years ago
The extra calls to Draw are traced to their root cause and addressed in bug 846028.
(Assignee)

Comment 167

6 years ago
A new try job with additional refresh-related logging is up here: https://tbpl.mozilla.org/?tree=Try&rev=140014a8c764
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 238

6 years ago
An edited IRC transcript about the latest try run's results:

> seth: basically the problem is that the first time nsSMILAnimationController::WillRefresh gets called, elapsedTime=1, and so that's what mAvgTimeBetweenSamples gets initialized to
> seth: after that initial call, elapsedTime is consistently much worse
> seth: the result is that every time, we trigger the "Detected really long delay between samples" code
> seth: mAvgTimeBetweenSamples stays at 1, since elapsedTime never gets <= 200 again, and so every time we advance by the smallest possible tick
> seth: i think though that the code that handles the "really long delay" situations is wrong
> seth: it'd probably be appropriate to still update the moving average in that situation. maybe with some damping factor, but it should get updated
> seth: that would prevent time essentially freezing for the SVG image in a situation like this

Based on this, I've pushed an experimental patch to try that _may_ fix this orange: https://tbpl.mozilla.org/?tree=Try&rev=462ee25eb0db
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 278

6 years ago
Preliminary results are good! We got green! It's possible that we just got lucky, though, so I queued up a bunch more runs. This version of the patch is sitting on top of a big logging patch, so right now I'm rebasing against m-c and preparing to create a new bug about the fix.
Comment hidden (Treeherder Robot)
(Assignee)

Updated

6 years ago
Depends on: 846861
(Assignee)

Comment 280

6 years ago
We'll take care of this orange in bug 846861.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Assigning to Seth, to take this off of the list of high-frequency-oranges-that-are-currently unowned.

As he noted in comment 280, he's got a fix for this over in bug 846861 that should hopefully take care of this for good.
Assignee: nobody → seth
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
<