Closed Bug 1126091 Opened 6 years ago Closed 4 years ago

Intermittent webcam.html | image comparison (==), max difference: 255, number of differing pixels: 1

Categories

(Core :: ImageLib, defect, P3)

ARM
Android
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox51 --- wontfix
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: KWierso, Assigned: aosmond)

References

(Depends on 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: gfx-noted, [stockwell fixed])

Attachments

(3 files)

Attached image TEST
15:43:57 INFO - REFTEST TEST-START | http://10.26.131.18:30363/tests/image/test/reftest/gif/webcam.html
15:43:57 INFO - REFTEST TEST-LOAD | http://10.26.131.18:30363/tests/image/test/reftest/gif/webcam.html | 510 / 1781 (28%)
15:43:57 INFO - REFTEST TEST-LOAD | http://10.26.131.18:30363/tests/image/test/reftest/gif/blue.gif | 510 / 1781 (28%)
15:43:57 INFO - REFTEST TEST-UNEXPECTED-FAIL | http://10.26.131.18:30363/tests/image/test/reftest/gif/webcam.html | image comparison (==), max difference: 255, number of differing pixels: 1
15:43:57 INFO - REFTEST IMAGE 1 (TEST): 
15:43:57 INFO - REFTEST IMAGE 2 (REFERENCE): 
15:43:57 INFO - REFTEST INFO | Saved log: START http://10.26.131.18:30363/tests/image/test/reftest/gif/webcam.html
15:43:57 INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
15:43:57 INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
15:43:57 INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
15:43:57 INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
15:43:57 INFO - REFTEST INFO | Saved log: RecordResult fired
15:43:57 INFO - REFTEST INFO | Saved log: START http://10.26.131.18:30363/tests/image/test/reftest/gif/blue.gif
15:43:57 INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
15:43:57 INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
15:43:57 INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
15:43:57 INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
15:43:57 INFO - REFTEST INFO | Saved log: RecordResult fired
15:43:57 INFO - REFTEST INFO | Loading a blank page
15:43:57 INFO - REFTEST TEST-END | http://10.26.131.18:30363/tests/image/test/reftest/gif/webcam.html
15:43:57 INFO - REFTEST TEST-START | http://10.26.131.18:30363/tests/image/test/reftest/apng/delaytest.html?bug411852-1.png
Can we just fuzz this on Android and get on with life?
Component: Layout → ImageLib
Flags: needinfo?(seth)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #19)
> Can we just fuzz this on Android and get on with life?

This isn't really a case of fuzz. The entire test only involves one pixel. This is actually a total failure. And it's not an easy fix - this test actually can't be a reftest without being intermittent, as things are currently.

The good news is that bug 1113465 will remove this test and make the issue more or less moot. It's not ready to land yet, though. But I'm aiming to land it in the 39 timeframe.

Depending on how much trouble this intermittent orange is causing you, we could just disable the test before that. If bug 1113465 ends up not landing or not sticking, we can always reevaluate at that point.
Flags: needinfo?(seth)
Depends on: 1162308
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Duplicate of this bug: 1297207
Recent failures are all e10s, almost all linux64, often pgo.
In light of :seth's comment 20, and since there's been no activity in bug 1113465 for over a year, I wonder if we should disable this test. 

:milan - Can you make that call, or redirect if appropriate?
Flags: needinfo?(milan)
Let's just disable it.  It's doing more harm than good right now.
Assignee: nobody → aosmond
Flags: needinfo?(milan)
(In reply to Seth Fowler [:seth] [:s2h] from comment #20)
> And it's not an easy fix - this test
> actually can't be a reftest without being intermittent, as things are
> currently.

That's what I thought at first, because multipart mixed replaced images generally load forever. But not in this test, it is aa finite stream of two images. I see no reason why this retreat cannot work reliably.

> The good news is that bug 1113465 will remove this test and make the issue
> more or less moot. It's not ready to land yet, though. But I'm aiming to
> land it in the 39 timeframe.

Bug 1113465 is based on the false assumption that gif isn't used multipart mixed replace. The very existence of this test explains why. If you check the test history it points to bug 641748, which contains a test case from the wild of this.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8042c7e8575f95aae66611bc0868af584dcab6a1

I think the problem with the reftest is that it can fire the comparison before the multipart image has finished decoding, i.e. it is showing the red pixel instead of the blue pixel due to a race. Since this has similar properties to an animation, we can rewrite the test with the same framework animations use.
The reftest harness uses DrawWindow which by default will sync decode images (unless you pass a flag asking for async):

https://dxr.mozilla.org/mozilla-central/rev/8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/layout/base/PresShell.cpp#4680

So if the multipart image is fully loaded (it should be because the reftest harness waits for the page to be loaded) then the blue image should be present, and when the reftest harness takes a snapshot it should ask to sync decode, so the blue image will get decoded if it isn't already.
(In reply to Timothy Nikkel (:tnikkel) from comment #192)
> The reftest harness uses DrawWindow which by default will sync decode images
> (unless you pass a flag asking for async):
> 
> https://dxr.mozilla.org/mozilla-central/rev/
> 8103c612b79c2587ea4ca1b0a9f9f82db4b185b8/layout/base/PresShell.cpp#4680
> 
> So if the multipart image is fully loaded (it should be because the reftest
> harness waits for the page to be loaded) then the blue image should be
> present, and when the reftest harness takes a snapshot it should ask to sync
> decode, so the blue image will get decoded if it isn't already.

FLAG_SYNC_DECODE is not a strong promise though. If we are still getting data from the network it won't block. For multipart images, it uses ImageWrapper::GetImageContainer which passes through directly to mInnerImage -- thus passing in FLAG_SYNC_DECODE only works on sync decoding *that* particular part. We only change mInnerImage once we get a frame complete from mNextPart. If there is a dispatch anywhere in the code path (progress notifications, MIME handler, etc), I could see it using the old image before it has swapped in the next one.
(In reply to Andrew Osmond [:aosmond] from comment #193)
> FLAG_SYNC_DECODE is not a strong promise though. If we are still getting
> data from the network it won't block.

Reftest takes it's snapshot after the page is loaded, so all network data is there.

> For multipart images, it uses
> ImageWrapper::GetImageContainer which passes through directly to mInnerImage
> -- thus passing in FLAG_SYNC_DECODE only works on sync decoding *that*
> particular part. We only change mInnerImage once we get a frame complete
> from mNextPart. If there is a dispatch anywhere in the code path (progress
> notifications, MIME handler, etc), I could see it using the old image before
> it has swapped in the next one.

Yeah, this is the reason. I pushed to try last night and was able to figure it out with only one push!

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8fd8f25410bd3320e1b6cdde1b79bcb9e0e94775

The decode of the second part is what causes it to become the current part, and there is nothing to force it to be decoded. A mochitest is a good approach to fix this. But if we need to test multipart images in reftests for some reason another way to solve it would be to have the multipart image have at least three parts, and have the last two parts continue the final image. The reason this is sufficient is that when we get the OnDataAvailable for the final part it will force a sync decode of the second last part, so we'd have to be on the second last part or later if all the network data for the whole stream has arrived.
circling back here, this is frequent push, especially now that we have a more normal volume of pushes this week.  It looks like we understand the failure, is there a timeline to work on this?
Comment on attachment 8818341 [details] [diff] [review]
Rewrite webcam reftest as mochitest, v1

My bad, the patch should work fine, just needs to be reviewed :).
Attachment #8818341 - Flags: review?(tnikkel)
Comment on attachment 8818341 [details] [diff] [review]
Rewrite webcam reftest as mochitest, v1

Could you add a simple comment to the test saying what it is testing and how it does that? Like the one in reftest.list.

Also reftest-stylo.list says it is auto generated and shouldn't be edited.
Attachment #8818341 - Flags: review?(tnikkel) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c0593a4dc201
Rewrite GIF webcam reftest as mochitest due to intermittents. r=tnikkel
https://hg.mozilla.org/mozilla-central/rev/c0593a4dc201
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Whiteboard: gfx-noted → gfx-noted, [stockwell fixed]
You need to log in before you can comment on or make changes to this bug.