Closed Bug 703806 Opened 8 years ago Closed 3 years ago

Intermittent as-image/img-blobURI-2.html | image comparison (==), max difference: 255, number of differing pixels: 10000

Categories

(Core :: SVG, defect, P3)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: mbrubeck, Assigned: u459114)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Image 1 (test) is all-white; image 2 (reference) shows a green square.

https://tbpl.mozilla.org/php/getParsedLog.php?id=7475579&tree=Build-System
Rev3 MacOSX Leopard 10.5.8 build-system opt test reftest on 2011-11-18 14:03:52 PST for push 98908e7e477e

REFTEST INFO | Loading a blank page
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-and-image-1.html | 5294 / 7179 (73%)
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-and-image-1-ref.svg | 5294 / 7179 (73%)
REFTEST TEST-KNOWN-FAIL(EXPECTED RANDOM) | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-and-image-1.html | image comparison (==)
REFTEST INFO | Loading a blank page
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-1.html | 5295 / 7179 (73%)
REFTEST TEST-PASS | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-1.html | image comparison (==)
REFTEST INFO | Loading a blank page
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html | 5296 / 7179 (73%)
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html | image comparison (==)

REFTEST number of differing pixels: 10000
REFTEST INFO | Saved log: START file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html
REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering WaitForTestEnd
REFTEST INFO | Saved log: [CONTENT] WaitForTestEnd: Adding listeners
REFTEST INFO | Saved log: Initializing canvas snapshot
REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
REFTEST INFO | Saved log: [CONTENT] WaitForTestEnd: Adding listeners
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT
REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
REFTEST INFO | Saved log: [CONTENT] AttrModifiedListener fired
REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html
REFTEST INFO | Saved log: Updating canvas for invalidation
REFTEST INFO | Saved log: DoDrawWindow 8,-4,4,12
REFTEST INFO | Saved log: DoDrawWindow 0,0,8,8
REFTEST INFO | Saved log: DoDrawWindow 12,0,788,8
REFTEST INFO | Saved log: DoDrawWindow 0,8,800,992
REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT
REFTEST INFO | Saved log: [CONTENT] MakeProgress: dispatching MozReftestInvalidate
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_FOR_REFTEST_WAIT_REMOVAL
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
REFTEST INFO | Saved log: [CONTENT] MakeProgress: Completed
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_COMPLETED
REFTEST INFO | Saved log: [CONTENT] RecordResult fired
REFTEST INFO | Saved log: RecordResult fired
REFTEST INFO | Saved log: RecordResult fired
REFTEST INFO | Loading a blank page
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-content-outside-viewBox-1.html | 5297 / 7179 (73%)
REFTEST TEST-START | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-content-outside-viewBox-1-ref.html | 5297 / 7179 (73%)
REFTEST TEST-PASS | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-content-outside-viewBox-1.html | image comparison (==)
Hmm.... So this testcase has a SVG image, which itself contains a moz-filedata-encoded image.

We flush parsing & layout inside the SVG image before firing onload, but I'm not sure we mandate that internal images themselves have finished decoding/rendering.

In particular, because the moz-filedata URI itself is very short ("moz-filedata:{short numeric id}"), it may be that we're able to parse & flush layout inside the (relatively consise) outer image before the inner image has actually finished loading...

That's my best guess at what's going on here. Taking, to be sure this doesn't fall off the radar.
Assignee: nobody → dholbert
Version: 11 Branch → Trunk
Bug 704059 is really the "right" way to fix this.  That may not happen right away, though.  So in the meantime, we could probably fix this randomorange by either of the following methods:
 (a) add a delay to the test, to give the internal image some time to load
  Upsides: easy, should fix randomorange; stopgap until the "right" fix comes in (bug 704059)
  Downsides: hacky, won't fix actual bug.

OR:
 (b) somehow flush image loads synchronously inside of SVGDocumentWrapper::OnStopRequest
  Upsides: will fix bug described in comment 1
  Downsides: not the right fix, not sure it's possible, could introduce new potential hangs

(To elaborate on why I'm not sure (b) is possible and why it could introduce hangs -- if there's a moz-filedata URI inside of an image to represent an internal image, that internal image could have gigabytes of data that we haven't received yet because it's coming in off of disk or something,  Even if it's possible to synchronously flush all of that data out of its nsIRequest (?), it'd probably take a long time, and we're blocking the event loop that whole time, which is bad.)

So I lean towards (a), with the hope that a fix for bug 704059 isn't too pie-in-the-sky.
Blocks: 738086
img-blobBuilder-* were renamed to img-blobURI-* in Bug 736687.

Updating bug summary to include both the old & new name, so that TBPL will still match this bug on branches before & after the rename.
Summary: Intermittent REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/img-blobBuilder-2.html | image comparison (==) → Intermittent REFTEST TEST-UNEXPECTED-FAIL | layout/reftests/svg/as-image/img-blobBuilder-2.html, layout/reftests/svg/as-image/img-blobURI-2.html | image comparison (==)
Duplicate of this bug: 738086
Daniel, note bug 789719 comment 5 - deferring the PresShell::FrameNeedsReflow() call turns this test permaorange, which may give some insight (confirmation?) regarding what's going on here.
Thanks -- looked into it a bit, but I think it's a different kind of orange.

That patch gives failures of the form
"load failed: timed out waiting for pending paint count to reach zero (after reftest-wait removed and switch to print mode) (waiting for MozAfterPaint)"
whereas the orange here is a rendering failure, not a timeout.

(I tried to debug a bit locally, but I can't reproduce (except for the test taking slightly longer every once in a while).  I tried to interrupt it when it was taking slightly-longer, but that was turning into too much of a time-suck, for a hypothetical test-failure caused by a not-yet-landed-and-maybe-not-going-to-land patch.  If we end up landing or wanting to land the patch from bug 789719 comment 5 and needing a bug for the timeouts that it triggers in this test, let's file a new bug for that.)

Comment 1 and Comment 5 describe what's going on in this instance of the orange -- the right fix for this is bug 704059, which is non-trivial, but it looks like related-bug 505385 is getting some traction, so there might be progress on this in the not-too-distant-future.
Summary: Intermittent REFTEST TEST-UNEXPECTED-FAIL | layout/reftests/svg/as-image/img-blobBuilder-2.html, layout/reftests/svg/as-image/img-blobURI-2.html | image comparison (==) → Intermittent as-image/img-blobURI-2.html | image comparison (==), max difference: 255, number of differing pixels: 10000
Whiteboard: [orange]
Too many intermittent failures - test marked as random-if on WinXP:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e5089872c060
Whiteboard: [test marked as random-if on WinXP][leave open]
Adding dependency on bug 704059, per comment 1 & comment 5.
Depends on: 704059
I spent some time looking at this today. Currently I'm not sure exactly what the problem is; there's an anomaly somewhere, for sure, but I haven't tracked down the exact cause. I can tell you that this will still be random orange even with bug 704059 fixed.

There are 3 documents in this reftest:

1. The outer document, |img-blobURI-2.html|.
2. The SVG image, which is a data URI.
3. The SVG image _inside_ #2, which is a blob URI.

The thing that I find puzzling, and that seems to be at the root of the incorrectly ordered load notifications that causes this random orange, is that if you watch nsDocument::BlockOnload and nsDocument::UnblockOnload, you see that the documents get unblocked in a strange order. The patch in bug 704059 makes an SVG call UnblockOnload on the document that contains it when SVGLoad first on the SVG document. That is, [SVGLoad on inner document] causes [UnblockOnload on outer document]. This means that #1 is actually complete unblocked FIRST (i.e., mBlockCount == 0), followed by #3, and then finally #2. If we don't use the patch in bug 704059, it wouldn't change the fact that #1 will be unblocked first; without that patch, it's unblocked even earlier, as soon as OnStopRequest is called on the SVG image.

I tried as an experiment having SVGs wait for the |load| event on their inner documents before calling UnblockOnload. That is, [Load on inner document] causes [UnblockOnload on outer document]. This means that the SVGs wait longer before firing UnblockOnload. With this change, I saw even worse behavior: #1 unblocks, followed by #3, but #2 NEVER unblocks. This makes this test a permaorange. As far as I could tell, the root cause of this is that #3's |load| event simply never fires (meaning #3 never calls UnblockOnload on #2). I don't understand why not, though.

Daniel, any insight here?
(In reply to Seth Fowler [:seth] from comment #183)
> I spent some time looking at this today. Currently I'm not sure exactly what
> the problem is; there's an anomaly somewhere, for sure, but I haven't
> tracked down the exact cause. I can tell you that this will still be random
> orange even with bug 704059 fixed.
> 
> There are 3 documents in this reftest:
> 
> 1. The outer document, |img-blobURI-2.html|.
> 2. The SVG image, which is a data URI.
> 3. The SVG image _inside_ #2, which is a blob URI.
> 
> The thing that I find puzzling, and that seems to be at the root of the
> incorrectly ordered load notifications that causes this random orange, is
> that if you watch nsDocument::BlockOnload and nsDocument::UnblockOnload, you
> see that the documents get unblocked in a strange order. The patch in bug
> 704059 makes an SVG call UnblockOnload on the document that contains it when
> SVGLoad first on the SVG document. That is, [SVGLoad on inner document]
> causes [UnblockOnload on outer document]. This means that #1 is actually
> complete unblocked FIRST (i.e., mBlockCount == 0), followed by #3, and then
> finally #2. If we don't use the patch in bug 704059, it wouldn't change the
> fact that #1 will be unblocked first; without that patch, it's unblocked
> even earlier, as soon as OnStopRequest is called on the SVG image.
> 
> I tried as an experiment having SVGs wait for the |load| event on their
> inner documents before calling UnblockOnload. That is, [Load on inner
> document] causes [UnblockOnload on outer document]. This means that the SVGs
> wait longer before firing UnblockOnload. With this change, I saw even worse
> behavior: #1 unblocks, followed by #3, but #2 NEVER unblocks. This makes
> this test a permaorange. As far as I could tell, the root cause of this is
> that #3's |load| event simply never fires (meaning #3 never calls
> UnblockOnload on #2). I don't understand why not, though.
> 
> Daniel, any insight here?

Daniel ^

(Bug 704059 undid the random-if for WinXP; adjusting summary accordingly)
Flags: needinfo?(dholbert)
Whiteboard: [test marked as random-if on WinXP][leave open] → [disable-me 2013-04-01]
A lot has changed RE how SVG images are loaded, since Comment 183 was posted.

I believe bug 837315 was at least partly intended to address the weird unblocking order there.  I think someone (me or Seth probably) should do a bit more investigation on this, to see how it's working now.

I don't have cycles to dig into this at the moment, but I hopefully will after the next train switchover, in a little over a week.  For now, I'm definitely OK with this being re-marked random-if(winxp), given the recent (continued) intermittent-failures there.
Flags: needinfo?(dholbert)
Yup, I agree. I restored the random-if annotation here:

https://hg.mozilla.org/integration/mozilla-inbound/rev/a5c47f1edc12

I'm trying to debug some other stuff right now, but I'll give this a look when I get a chance if Daniel hasn't addressed it already.
That's great - thank you :-)