Closed Bug 772443 Opened 7 years ago Closed 6 years ago

Intermittent svg/as-image/svg-stylesheet-datauri-1.html | image comparison (==), max difference: 255, number of differing pixels: 10000

Categories

(Core :: Layout, defect, P5)

defect

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: emorley, Assigned: seth)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test marked as random][leave open])

Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound debug test reftest on 2012-07-09 15:29:26 PDT for push 2cbf12886c89

slave: talos-r4-snow-054

https://tbpl.mozilla.org/php/getParsedLog.php?id=13367206&tree=Mozilla-Inbound

{
REFTEST TEST-START | http://localhost:4444/1341873094523/391/lime100x100.svg | 5929 / 8051 (73%)
++DOMWINDOW == 149 (0x14b952160) [serial = 17352] [outer = 0x1087e4950]
REFTEST TEST-UNEXPECTED-FAIL | file:///Users/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/svg/as-image/svg-stylesheet-datauri-1.html | image comparison (==)
REFTEST number of differing pixels: 10000 max difference: 255
}
Summary: Intermittent → Intermittent svg/as-image/svg-stylesheet-datauri-1.html | image comparison (==)
Entirely white testcase, vs. lime reference case.

The testcase is basically just
  <img src="svg-stylesheet-datauri.svg">
and that image is a simple SVG with a blue rect, with a data URI stylesheet that loads to style it as green.

So -- it's _not_ a case of "the image's stylesheet didn't load yet" -- if that happened, the testcase would be blue, not white.
Whiteboard: [orange]
Summary: Intermittent svg/as-image/svg-stylesheet-datauri-1.html | image comparison (==) → Intermittent svg/as-image/svg-stylesheet-datauri-1.html | image comparison (==), max difference: 255, number of differing pixels: 10000
I'd be willing to bet this is an issue with the notifications (started, decoded, etc.) for SVG images. Probably they are arriving in the wrong order, causing a race. I'd love to fix this as I'm hitting it quite a bit.
(In reply to Seth Fowler [:seth] from comment #40)
> I'd be willing to bet this is an issue with the notifications (started,
> decoded, etc.) for SVG images. Probably they are arriving in the wrong
> order, causing a race.

Yup, this definitely smells like a race condition.

Note: I initially suspected we might be a race where we were displaying the SVG testcase w/out the stylesheet applied (because we were still parsing it), but I don't think that's the case, as described in comment 1.

> I'd love to fix this as I'm hitting it quite a bit.

You are a gentleman! Feel free to take it, and let me know if I can help.
This seems to have gotten significantly worse today. (perhaps related to bug 811391 landing?)  That's probably good, because it means we're more likely to be able to reproduce it locally & see what's going on. :)

If it stays frequent-orange, I'll see if I can reproduce this locally in the next few days and see what I can find. (Seth, let me know if you're already looking into it, per comment 40 / 42)
OS: Mac OS X → All
Hardware: x86_64 → All
(In reply to Daniel Holbert [:dholbert] from comment #75)
> This seems to have gotten significantly worse today. (perhaps related to bug
> 811391 landing?)  That's probably good, because it means we're more likely
> to be able to reproduce it locally & see what's going on. :)
> 
> If it stays frequent-orange, I'll see if I can reproduce this locally in the
> next few days and see what I can find. (Seth, let me know if you're already
> looking into it, per comment 40 / 42)

I'm not already looking into this, but I'll try to take a look either later tonight or tomorrow and see if I can figure out what's going on. If you have time before I get to it feel free, of course. I'll post in here when I get a chance to take a look and report my findings.
Backing out bug 825720 made this less frequent again.
(In reply to Ryan VanderMeulen from comment #105)
> Backing out bug 825720 made this less frequent again.

Thanks!

FWIW, in a debug build from before that backout (from rev ec36f0d72b88), I can reproduce this in local reftest runs, about 50% of the time, with a custom reftest.list that just has this line (or, better, multiple copies of this line):
> HTTP == svg-stylesheet-datauri-1.html  lime100x100.svg
(The HTTP annotation seems to make this more likely to reproduce.)

And I can reproduce it when I load the testcase normally, too, about 50% of the time.
When this fails, it looks like it's because we've got no root frame in the SVG document when we draw -- we haven't yet had a reflow.

Unfortunately, I think this is a version of bug 704059. Quoting the first comment there:
(In reply to Daniel Holbert [:dholbert] from comment #0)
> So -- imgRequest::OnStopRequest() fires when we're finished receiving data,
> and it tells image consumers that we're finished loading/ready to paint. (I
> think via imgStatusTracker::RecordStopRequest)
> 
> However, for SVG images, these events (finished receiving data & finished
> loading) aren't intrinsically linked.  We may need more time to finish
> parsing, constructing the DOM, doing layout, loading internal images, etc.

Add "parsing external stylesheets" to that list. (external in the sense of being stored in a separate document, even if that document is encoded in a local data URI)...

However, maybe we're at a point where fixing bug 704059 wouldn't be too bad, given the refactoring that happened over in bug 505385...
Depends on: 704059
In the meantime, we might be able to get this down to a reasonably low failure-rate (even with bug 825720 re-landed) if we comment out the HTTP-annotated line for this test in reftest.list (assuming my observation about that line being the spammier one holds up).
Thank you for looking at this Daniel :-)
There's a proposed patch to fix this up in bug 704059 right now. Should have it cleared up fairly soon, and then we can reland bug 825720.
Marked as random for too many failures:
https://hg.mozilla.org/integration/mozilla-inbound/rev/973bb5c4d782
Whiteboard: [test marked as random][leave open]
Seth's got a fix for this almost ready, in bug 704059, FWIW. Once that lands, we can re-enable this.
Ironically what's going to get that fix in is probably to disable _another_ test. Sigh.