Closed Bug 1768650 Opened 3 years ago Closed 3 years ago

Intermittent svg/smil/anim-marker-orient-02.svg == svg/smil/lime.svg | image comparison, max difference: 255, number of differing pixels: 380

Categories

(Core :: Graphics: WebRender, defect)

Tracking

RESOLVED FIXED
103 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox100 --- unaffected
firefox101 --- unaffected
firefox102 --- fixed
firefox103 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: nical)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell disable-recommended])

Attachments

(1 file)

Filed by: ccozmuta [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=377536848&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Z4Ku1TfYQ6qu0Av-7mmSIg/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Z4Ku1TfYQ6qu0Av-7mmSIg/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2022-05-10T11:57:08.822Z] 11:57:08     INFO - REFTEST TEST-START | layout/reftests/svg/smil/anim-marker-orient-02.svg == layout/reftests/svg/smil/lime.svg
[task 2022-05-10T11:57:08.824Z] 11:57:08     INFO - REFTEST TEST-LOAD | file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/svg/smil/anim-marker-orient-02.svg | 46 / 161 (28%)
[task 2022-05-10T11:57:09.300Z] 11:57:09     INFO - REFTEST TEST-UNEXPECTED-FAIL | layout/reftests/svg/smil/anim-marker-orient-02.svg == layout/reftests/svg/smil/lime.svg | image comparison, max difference: 255, number of differing pixels: 380
<...>
[task 2022-05-10T11:57:09.312Z] 11:57:09     INFO - REFTEST INFO | Saved log: START file:///builds/worker/workspace/build/tests/reftest/tests/layout/reftests/svg/smil/anim-marker-orient-02.svg
[task 2022-05-10T11:57:09.314Z] 11:57:09     INFO - REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering AfterOnLoadScripts
[task 2022-05-10T11:57:09.316Z] 11:57:09     INFO - REFTEST INFO | Saved log: Initializing canvas snapshot
[task 2022-05-10T11:57:09.319Z] 11:57:09     INFO - REFTEST INFO | Saved log: DoDrawWindow 0,0,800,1000
[task 2022-05-10T11:57:09.320Z] 11:57:09     INFO - REFTEST INFO | Saved log: [CONTENT] RecordResult fired
[task 2022-05-10T11:57:09.321Z] 11:57:09     INFO - REFTEST INFO | Saved log: RecordResult fired
[task 2022-05-10T11:57:09.322Z] 11:57:09     INFO - REFTEST INFO | Saved log: RecordResult fired
[task 2022-05-10T11:57:09.323Z] 11:57:09     INFO - REFTEST TEST-END | layout/reftests/svg/smil/anim-marker-orient-02.svg == layout/reftests/svg/smil/lime.svg
[task 2022-05-10T11:57:09.324Z] 11:57:09     INFO - REFTEST TEST-START | layout/reftests/svg/smil/anim-polygon-points-01.svg == layout/reftests/svg/smil/anim-polygon-points-01-ref.svg

'''INFO'''
There is a 'red square' that appears on the first image - the test one, and is missing from the second image.
'''INFO'''
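
To make the two numbers in the failure line concrete, here is a minimal sketch of how a pixel-by-pixel comparison could produce a "max difference" and a "number of differing pixels". It is not the reftest harness code: the image sizes, the 5x4 square, and the helper names (`make_image`, `paint_rect`, `compare`) are all assumptions chosen for illustration. It only shows why a leftover red area on a lime reference reports a max difference of 255.

```python
# Hypothetical illustration of a reftest-style image comparison.
# Image sizes and the square's dimensions are made up; only the two
# reported metrics mirror what the failure line in the log prints.

LIME = (0, 255, 0)
RED = (255, 0, 0)


def make_image(width, height, fill):
    """Build a height x width grid of RGB tuples filled with one color."""
    return [[fill for _ in range(width)] for _ in range(height)]


def paint_rect(image, x, y, w, h, color):
    """Overwrite a w x h rectangle whose top-left corner is at (x, y)."""
    for row in range(y, y + h):
        for col in range(x, x + w):
            image[row][col] = color


def compare(test, ref):
    """Return (largest per-channel difference, count of differing pixels)."""
    max_diff = 0
    differing = 0
    for test_row, ref_row in zip(test, ref):
        for test_px, ref_px in zip(test_row, ref_row):
            if test_px != ref_px:
                differing += 1
                channel_diff = max(abs(a - b) for a, b in zip(test_px, ref_px))
                max_diff = max(max_diff, channel_diff)
    return max_diff, differing


reference = make_image(40, 30, LIME)
test = make_image(40, 30, LIME)
paint_rect(test, 20, 0, 5, 4, RED)  # a stale red rectangle covering 20 pixels

max_diff, differing = compare(test, reference)
print(f"max difference: {max_diff}, number of differing pixels: {differing}")
# prints "max difference: 255, number of differing pixels: 20"
```

Red against lime differs by the full 255 in both the red and green channels, so any fully opaque red remnant yields the maximum possible difference, and the pixel count is simply the area of the remnant.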

Could be, sort of. This is an animation test and is only intermittent with an asan build. I suspect we have a timing/race issue that we were unlikely to hit before, and that bug 1686654 changed the timing in a way that is exacerbated by asan overhead and somehow made the race more likely to happen.

There shouldn't be non-determinism introduced in bug 1686654 at least. We should probably just mark it as random in that configuration since it's only intermittent with asan.

Flags: needinfo?(nical.bugzilla)

Update:
There have been 35 failures within the last 3 days:
• 3 failures on Windows 10 x64 2004 WebRender debug
• 7 failures on Windows 10 x64 2004 asan WebRender opt
• 1 failure on Windows 10 x86 2004 WebRender debug
• 1 failure on Linux 18.04 x64 WebRender tsan opt
• 3 failures on Linux 18.04 x64 WebRender Shippable opt
• 2 failures on Linux 18.04 x64 WebRender opt
• 8 failures on Linux 18.04 x64 WebRender debug
• 7 failures on Linux 18.04 x64 WebRender asan opt
• 1 failure on Android 7.0 x86-64 WebRender opt
• 2 failures on Android 7.0 x86-64 Lite WebRender opt
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=378019030&repo=autoland&lineNumber=5843
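
As a quick check on how the failures above split between sanitizer builds and everything else, the counts can be tallied with a short sketch. The platform names and numbers are copied from the list above. Treating "asan" and "tsan" as the sanitizer configurations, and the helper name `is_sanitizer`, are assumptions made for this illustration.

```python
# Failure counts copied from the update above (platform -> failures).
failures = {
    "Windows 10 x64 2004 WebRender debug": 3,
    "Windows 10 x64 2004 asan WebRender opt": 7,
    "Windows 10 x86 2004 WebRender debug": 1,
    "Linux 18.04 x64 WebRender tsan opt": 1,
    "Linux 18.04 x64 WebRender Shippable opt": 3,
    "Linux 18.04 x64 WebRender opt": 2,
    "Linux 18.04 x64 WebRender debug": 8,
    "Linux 18.04 x64 WebRender asan opt": 7,
    "Android 7.0 x86-64 WebRender opt": 1,
    "Android 7.0 x86-64 Lite WebRender opt": 2,
}


def is_sanitizer(platform):
    """Assumption: a platform is a sanitizer build if it names asan or tsan."""
    words = platform.lower().split()
    return "asan" in words or "tsan" in words


total = sum(failures.values())
sanitizer = sum(n for name, n in failures.items() if is_sanitizer(name))
regular = total - sanitizer

print(f"total: {total}, sanitizer builds: {sanitizer}, other builds: {regular}")
# prints "total: 35, sanitizer builds: 15, other builds: 20"
```

The total matches the 35 reported, and more than half of the failures land on builds without a sanitizer.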

Whiteboard: [stockwell needswork:owner]

Hi Jonathan! As the owner of this component, could you help us assign this to someone?
Thank you!

Flags: needinfo?(jwatt)

Just dropping a few notes here, since I happened to run across this on a Try run and took a quick look...

  • Direct link to test: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/reftests/svg/smil/anim-marker-orient-02.svg

  • The initial rendering of the test has a small red square, 20px to the right of the top-left corner. The test has logic that advances the animation timeline to a later point in the animation (after the red square is supposed to disappear) and then takes a snapshot. (Note that this logic doesn't run in the above-linked hg.m.o-served version of the test, because the CSP blocks the script load.)

  • When the test fails (at least in the failure log that I looked at), it looks like that red square is still there, which means we probably failed to invalidate/repaint properly.

  • This test was at one point annotated as random-if(webrender), but that annotation was removed in bug 1436084 (4 years ago) when the test was observed to be no longer random.

So: my tentative guess would be that this might be some version of that same random-if-webrender issue coming back.

Aha, now I see comment 1 - 2. It looks indeed like a good bet that this would be a regression from that change. Marking as such, and updating component.

(In reply to Nicolas Silva [:nical] from comment #2)

Could be, sort of. This is an animation test and is only intermittent with an asan build

Unfortunately it's intermittent in more sorts of builds now; see comment 3 and comment 5. So I'm not sure we can just mark this as random on certain platforms, since it looks like it's getting hit on at least some "regular" builds ("plain" opt or debug) on every platform.

nical, what do you think we should do here? If this pref is live, maybe it'd be worth preffing-it-off, just for this test, as a stopgap and to validate the theory that this is indeed what's causing the issue? (Hopefully we wouldn't just leave things in that state indefinitely; maybe we could then try to capture in rr or something and see about fixing?)

Component: SVG → Graphics: WebRender
Flags: needinfo?(jwatt) → needinfo?(nical.bugzilla)
Regressed by: 1686654

Set release status flags based on info from the regressing bug 1686654

Assignee: nobody → nical.bugzilla
Status: NEW → ASSIGNED

nical, what do you think we should do here? If this pref is live, maybe it'd be worth preffing-it-off, just for this test, as a stopgap

Sounds good to me.

Flags: needinfo?(nical.bugzilla)
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1686654

Nicolas, can the patch here get landed, or are there any changes planned?

Flags: needinfo?(nical.bugzilla)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cce9e3eb2cc1
Turn off gfx.webrender.svg-images for a test. r=dholbert

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch
Flags: needinfo?(nical.bugzilla)

The patch landed in nightly and beta is affected.
:nical, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(nical.bugzilla)

RE the beta102 wontfix: I disagree -- we should just uplift the (trivial) patch to beta. Otherwise sheriffs will have to waste time starring instances of this intermittent failure on beta (and eventually release). (They are already doing so; comment 15 shows this cropping up on beta.)

nical, if you've got a beta tree around, would you mind uplifting? Otherwise feel free to ni me back and I can uplift.

(We don't need to bother with beta approval, since this is just a test manifest change and has zero user impact.)

Flags: needinfo?(nical.bugzilla)

Agreed, we don't want to live with this on ESR102 for the next year+. I'll take care of it.

Flags: needinfo?(nical.bugzilla)
See Also: → 1817212