Closed Bug 628888 Opened 13 years ago Closed 13 years ago

SVG SMIL: Animations in external document sometimes don't run

Categories

(Core :: SVG, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla2.0b12

People

(Reporter: birtles, Assigned: birtles)

References

()

Details

Attachments

(3 files, 2 obsolete files)

For the test URL the animation runs for me about 1 time in 10.

If I run the test locally it runs 1 time in 2.

I haven't debugged it properly but it seems that the external doc is being sent a page hide event but only sometimes getting a page show event.

(If you load the test URL, navigate to another page and then go back the animation will play regardless of whether it played the first time or not.)

This is likely a regression from bug 608295, changeset http://hg.mozilla.org/mozilla-central/rev/04311a70fa33.

Bug 610419 may also be related.
I was able to reproduce this once after shift-reloading for about 7-8 minutes.

(I'm not sure what factors go into the race condition that we're running up against, but it doesn't seem to be bandwidth-related -- I tried slowing my bandwidth to a trickle using 'wondershaper' and that didn't help)
I'll take this since I can reproduce it most reliably.
Assignee: nobody → birtles
Status: NEW → ASSIGNED
Attached patch Patch v1aSplinter Review
Possible patch.

We're getting a situation where the external doc is getting loaded after PageShow. Adding some debugging statements I see this sort of pattern:

Success case:

++DOMWINDOW...
nsExternalResourceMap for nsDocument(A)::PendingLoad::OnStartRequest
nsDocument(B)::GetAnimationController Hiding newly created animation controller since we're hidden
  nsSMILAnimationController::OnPageHide
nsDocument(A)::OnPageShow Telling external resources about page show
nsDocument(B)::OnPageShow Telling external resources about page show
nsDocument(B)::OnPageShow Telling animation controller to show
  nsSMILAnimationController::OnPageShow

Failure case:

++DOMWINDOW...
nsDocument(A)::OnPageShow Telling external resources about page show
nsExternalResourceMap for nsDocument(A)::PendingLoad::OnStartRequest
nsDocument(B)::GetAnimationController Hiding newly created animation controller since we're hidden
  nsSMILAnimationController::OnPageHide

As discussed on IRC with dholbert, this patch mimics what we already do with text zoom levels, syncing the external doc with the referencing doc.

I can't yet think of a good way to test this. One possibility is a similar setup to the test URL with a <set> animation due to run at begin="0.0001s" and then wait for a few seconds for a beginEvent from the animation (to prove we're not paused) and fail if it never comes. But even then we might only fail once in 100 times meaning that if we regressed the orange might not show up until a long time after the offending patch.
Attachment #508641 - Flags: review?(dholbert)
Comment on attachment 508641 [details] [diff] [review]
Patch v1a

Looks good to me, but I'd like to get bz's opinion, too.

Brian, RE test-crafting -- you might see if putting the filtered content in an iframe that gets reloaded helps at all.  That ended up being useful for (eventually) creating a reliable reftest for the bug that added TransferZoomLevels.  See bug 627776 comment 19 and the improved reftests added in that bug as examples.
Attachment #508641 - Flags: review?(dholbert)
Attachment #508641 - Flags: review?(bzbarsky)
Attachment #508641 - Flags: review+
For a test, can you just add the content that needs the external resource dynamically, after onload has fired?  I guess then a problem is telling when the external resource loaded...
Comment on attachment 508641 [details] [diff] [review]
Patch v1a

r=me
Attachment #508641 - Flags: review?(bzbarsky) → review+
Attached patch Test case (obsolete) — Splinter Review
I've been wrestling with test cases for this for some time now and this is as far as I've got.

I've managed to create a test case that reliably reproduces this but the real trouble is detecting when the animation has started so we can take a snapshot.

Normally we can detect if animation has started by listening for beginEvents (and simply fail if the event isn't received after some timeout), but unfortunately we don't get beginEvents from animations in external documents.

I've tried various arrangements such as referencing different parts of the SAME external document via both a 'filter' attribute and a <use> element and then listening for the beginEvent from the use element content but unfortunately we seem to treat these cases differently. The animation plays for the fragment referenced by <use> but not the fragment referenced by filter suggesting we have different nsDocuments.

So the attached test case is a fairly simple arrangement that simply waits 3 seconds and then takes a snapshot. I wonder, does TinderBox ever get so bogged down that this 3 seconds will be too short?
Attachment #509655 - Flags: review?(dholbert)
I'll have a bit more of a think about this since I'm a bit uncomfortable with extending the test suite time by 3 seconds for this bug. (Maybe we can render the test into a canvas, periodically test some pixel that we're expecting to go green, and finish the test early if we detect a success condition?)
Attached patch Test case v2 (obsolete) — Splinter Review
I've rewritten the test case as a mochitest that polls the pixels to see if the animation has taken effect or not. This should take much less time for a successful test run whilst being more tolerant of slow tinderbox performance (the timeout for a failure is now 10s).
Attachment #509655 - Attachment is obsolete: true
Attachment #510477 - Flags: review?(dholbert)
Attachment #509655 - Flags: review?(dholbert)
Comment on attachment 510477 [details] [diff] [review]
Test case v2

Looks good! No functional changes - just nits.

>+++ b/content/smil/test/smilExtDoc_helper.xml
>@@ -0,0 +1,7 @@
>+<svg xmlns="http://www.w3.org/2000/svg">

Nit: Since this file only contains SVG markup, we might as well just give it a ".svg" extension, right?

>diff --git a/content/smil/test/test_smilExtDoc.xhtml b/content/smil/test/test_smilExtDoc.xhtml
[...]
>+<div id="content" style="background: red; width: 50px; height: 50px">
>+&#160;
>+</div>

I haven't encountered "&#160;" before, but a quick google-search suggests it's a nonbreaking space character. Assuming that we actually need this character (do we?), can we call it "&nbsp;" for a readibility win?

>+/* Test for Bug 628888 - Animations in external document sometimes don't run
>+ *
>+ * This bug concerns a condition where an external document is loaded after the
>+ * page show event is dispatched leaving the external document paused.

Nit: add comma after "dispatched".

>+ * This should mean the test succeeds quickly but fails slowly.

Ah, very nice! :)

>+  var green = (imgd.data[0] == 0) &&
>+              (imgd.data[1] == 255) &&
>+              (imgd.data[2] == 0);
>+  if (green) {

s/green/isGreen/

>+  content.style.display = 'none';
>+  SimpleTest.finish();

It looks like the  "content.style.display = 'none';" line there isn't functional, but is rather to follow the semi-convention of mochitests being as visually "quiet" as possible -- is that right?  If so, could you add a comment stating something like that?

r=dholbert with the above addressed.
Attachment #510477 - Flags: review?(dholbert) → review+
Thanks for the quick review Daniel!

(In reply to comment #10)
> I haven't encountered "&#160;" before, but a quick google-search suggests it's
> a nonbreaking space character. Assuming that we actually need this character
> (do we?), can we call it "&nbsp;" for a readibility win?

The "&#160;" was because the file is XHTML where "&nbsp;" isn't recognised as an entity but it turns out we don't need it now that I've added explicit width/height to the div.

All other feedback should be addressed in this patch. Thanks again Daniel!
Attachment #510477 - Attachment is obsolete: true
Requesting approval to land.

Fixes regression in beta range, includes test, ensures users aren't randomly denied animated filter eye candy.
Attachment #510827 - Flags: approval2.0?
Pushed:
http://hg.mozilla.org/mozilla-central/rev/e3a645b56973
http://hg.mozilla.org/mozilla-central/rev/4327761b6350
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b12
Backed out: http://hg.mozilla.org/mozilla-central/rev/71d5d43ff9ef
Tests were failing (although they were fine on Try a couple of days ago). Will investigate.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla2.0b12 → ---
Pushed again:
http://hg.mozilla.org/mozilla-central/rev/65938282acdb
http://hg.mozilla.org/mozilla-central/rev/1ed3464aaa92

The problem was a last minute renaming of the test helper file from .xml to .svg without updating the reference to that file.

I tested it locally but because of the way mochitests are just copied to the destination without clearing out old files there first the test continued to pass because the old .xml file was still there.
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b12
Brian, bug 869285 is an intermittent mochitest failure of test_SmilExtDoc.xhtml (for this bug) on windows.

It started failing (rarely, but still) after landing bug 866561, which increased timer resolution on windows to 1ms while playing video (using timeBegin[/end]Period(1)).

Bug 866561 was a followup to bug 590422, which removed the timers averaging filter (which could introduce big skews depending on context).

Could you see how any of this possibly leads to rare test failures?
Ping.
Flags: needinfo?(birtles)
I'll look into it on Friday.
(In reply to Avi Halachmi (:avih) from comment #16)
> Brian, bug 869285 is an intermittent mochitest failure of
> test_SmilExtDoc.xhtml (for this bug) on windows.

This doesn't appear to be anything specific to this bug. There are a bunch of other tests failing there due to what appear to be timeouts.

It *may* be that we're getting some significant load that means that 10s is not a long enough timeout for this test. If that's the case increasing the timeout would solve it but only for this test case. The fact that a bunch of other tests are also consistently failing at the same time suggests something else is amiss.

(For what it's worth, basically this test operates by polling the value of a pixel every 100ms up to 10s and the test failures are occurring when the pixel value hasn't changed by the time we reach 10s.)
Flags: needinfo?(birtles)
(In reply to Brian Birtles (:birtles) from comment #19)
> ... There are a bunch of other tests failing there due to what appear
> to be timeouts.

Which tests? Did they also start failing around the same time that bug 866561 landed?


> It *may* be that we're getting some significant load that means that 10s is
> not a long enough timeout for this test.

This is interesting. Do think that 10s might not be enough under any load which you expect this test to generate?

I'm asking because according to bug 880036, it's possible that under some relatively light but constant loads (e.g. where each refresh driver iteration takes more than 17ms), the paint event handler could get starved by the refresh driver, thus painting to screen much less frequently than it theoretically can. However, 10s starvation does sound quite long even on that case.

Does this test generate this kind of load (where each animation frame could take ~17ms or more)?
(In reply to Avi Halachmi (:avih) from comment #20)
> (In reply to Brian Birtles (:birtles) from comment #19)
> > ... There are a bunch of other tests failing there due to what appear
> > to be timeouts.
> 
> Which tests? Did they also start failing around the same time that bug
> 866561 landed?

Most of the tests listed in bug 869285.

> > It *may* be that we're getting some significant load that means that 10s is
> > not a long enough timeout for this test.
> 
> This is interesting. Do think that 10s might not be enough under any load
> which you expect this test to generate?

The interesting part of this test is it relies on loading an external resource so if I/O was blocked you might expect some delays. However I *don't* think that's what's happening given the other tests that fail (which don't rely on I/O).

> Does this test generate this kind of load (where each animation frame could
> take ~17ms or more)?

This test ought not to generate much load. The animation consists of a single frame.

Most of the other SMIL-related test failures in bug 869285 have timeouts. We might be getting into some starvation situation. Increasing the timeout on all of them might help but it sounds like there's some underlying problem that ought to be fixed.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: