Intermittent gfx/layers/apz/test/mochitest/test_bug982141.html | found the RCD node

RESOLVED FIXED in Firefox 65

Status

()

defect
P5
normal
RESOLVED FIXED
8 months ago
2 months ago

People

(Reporter: intermittent-bug-filer, Assigned: kats)

Tracking

({intermittent-failure})

unspecified
mozilla66
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox65 fixed, firefox66 fixed)

Details

Attachments

(2 attachments)

Reporter

Description

8 months ago
treeherder
Filed by: rgurzau [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=206030463&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/BOkEpMWTTPSpjTm8sQlC_Q/runs/0/artifacts/public/logs/live_backing.log

[task 2018-10-17T08:21:43.701Z] 08:21:43     INFO - TEST-START | gfx/layers/apz/test/mochitest/test_bug982141.html
[task 2018-10-17T08:21:44.316Z] 08:21:44     INFO - GECKO(6815) | Parent process: flushed APZ repaints, waiting for callback...
[task 2018-10-17T08:21:44.319Z] 08:21:44     INFO - GECKO(6815) | Flushed APZ repaints, waiting for callback...
[task 2018-10-17T08:21:44.336Z] 08:21:44     INFO - TEST-INFO | started process screentopng
[task 2018-10-17T08:21:44.729Z] 08:21:44     INFO - TEST-INFO | screentopng: exit 0
[task 2018-10-17T08:21:44.730Z] 08:21:44     INFO - Buffered messages logged at 08:21:44
[task 2018-10-17T08:21:44.733Z] 08:21:44     INFO - TEST-PASS | gfx/layers/apz/test/mochitest/test_bug982141.html | expected at least one paint in compositor test data 
[task 2018-10-17T08:21:44.734Z] 08:21:44     INFO - Buffered messages finished
[task 2018-10-17T08:21:44.735Z] 08:21:44     INFO - TEST-UNEXPECTED-FAIL | gfx/layers/apz/test/mochitest/test_bug982141.html | found the RCD node 
[task 2018-10-17T08:21:44.736Z] 08:21:44     INFO -     testBug982141@gfx/layers/apz/test/mochitest/helper_bug982141.html:54:7
[task 2018-10-17T08:21:44.737Z] 08:21:44     INFO -     setTimeout handler*repaintDone@gfx/layers/apz/test/mochitest/apz_test_utils.js:131:7
[task 2018-10-17T08:27:06.930Z] 08:27:06     INFO - Not taking screenshot here: see the one that was previously logged
[task 2018-10-17T08:27:06.931Z] 08:27:06     INFO - TEST-UNEXPECTED-FAIL | gfx/layers/apz/test/mochitest/test_bug982141.html | Test timed out. 
[task 2018-10-17T08:27:06.932Z] 08:27:06     INFO -     reportError@SimpleTest/TestRunner.js:121:7
[task 2018-10-17T08:27:06.933Z] 08:27:06     INFO -     TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7
[task 2018-10-17T08:27:06.933Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.934Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.934Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.935Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.936Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.937Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.937Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.938Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.939Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.939Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.942Z] 08:27:06     INFO -     setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5
[task 2018-10-17T08:27:06.942Z] 08:27:06     INFO -     TestRunner.runTests/<@SimpleTest/TestRunner.js:381:9
[task 2018-10-17T08:27:06.943Z] 08:27:06     INFO -     promise callback*TestRunner.runTests@SimpleTest/TestRunner.js:368:50
[task 2018-10-17T08:27:06.944Z] 08:27:06     INFO -     RunSet.runtests@SimpleTest/setup.js:194:3
[task 2018-10-17T08:27:06.944Z] 08:27:06     INFO -     RunSet.runall@SimpleTest/setup.js:173:5
[task 2018-10-17T08:27:06.944Z] 08:27:06     INFO -     hookupTests@SimpleTest/setup.js:266:5
Comment hidden (Intermittent Failures Robot)
Assignee: nobody → kats
Haven't been able to repro this locally.
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
This bug has failed 33 times in the last 7 days. Occurs linux64-qr and windows10-64 on opt build types.

Recent log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=215132664&repo=autoland&lineNumber=6250

kats: Can you please take a look at this bug?
Flags: needinfo?(kats)
I'll take another look, sure.
Flags: needinfo?(kats)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
I did some try pushes with logging to figure out what's happening here. Most useful log is at around [1], and reveals the following:

- In order for the compositor-side APZ test data logging to work, the following conditions must be satisfied:
  (1) the root layer tree must have a reflayer pointing to the desired child layer tree, and
  (2) after (1) is satisfied, we must get a child layer tree update/transaction
- When both of the above conditions are satisfied, the transaction with the updated child layer tree will record the desired information in the test data log.
- In the case of WebRender, we get the child layer tree update/transaction *before* the root layer tree has the relevant ref layer. That is, the root layer tree is still pointing at some old content (or no content at all) while we are getting the child layer tree updates, and then after the root layer tree gets updated, we don't get any more child layer tree updates.
- So we end up without the expected compositor-side log data.

Note that the issue is test-specific (as in, it's not a bug in the production code, but it's about an assumption the test is making that doesn't always hold)

Possible fixes:
- Force a child layer tree update from the test after everything is flushed but before we try and read the log. That's probably the simplest approach
- Modify APZCTM to also walk through "detached" content layer trees as part of UpdateHitTestingTree. This seems like it would be ugly and probably impact production behaviour or perf, even if we put it behind a APZTestLoggingEnabled() guard.
- Other?

[1] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=216361068&repo=try&lineNumber=6486
Oh. We already have a forceLayerTreeToCompositor function for this exact thing...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7c7bcde50e699d052b73644dc16ae225faf25c0f looks ok. I had a previous one where I tried just doing the flushLayout thing if there were no paints pending but that failed still.
These two bits of code conceptually belong together, since the
flushLayout code is intended to work around a case where
forceLayerTreeToCompositor wasn't doing anything. So it makes sense to
just put them into the same function.

Comment 17

6 months ago
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/14a53c3f5f7f
Move the flushLayout code into forceLayerTreeToCompositor. r=botond
https://hg.mozilla.org/integration/autoland/rev/ff5e6ef2f756
Ensure we forceLayerTreeToComposite in tests where we read the compositor APZ test data at the start of the test. r=botond

Comment 18

6 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/14a53c3f5f7f
https://hg.mozilla.org/mozilla-central/rev/ff5e6ef2f756
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Comment hidden (Intermittent Failures Robot)
Test-only fixes for uplifting to beta.

Recent failure https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&selectedJob=235537946&revision=a8c448b929137723938b1e4da97a9693f0937048&searchStr=91ddb0687fb4a801c534568d2ec3668a26e6c3f4

Failure log https://treeherder.mozilla.org/logviewer.html#?job_id=235537946&repo=autoland

Part of that log

[task 2019-03-22T20:15:50.053Z] 20:15:50 INFO - 148 INFO TEST-START | gfx/layers/apz/test/mochitest/test_bug982141.html
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - Buffered messages logged at 20:15:52
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - 149 INFO must wait for load
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - Buffered messages logged at 20:15:57
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - 150 INFO TEST-PASS | gfx/layers/apz/test/mochitest/test_bug982141.html | expected at least one paint in compositor test data
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - Buffered messages finished
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - 151 INFO TEST-UNEXPECTED-FAIL | gfx/layers/apz/test/mochitest/test_bug982141.html | found the RCD node
[task 2019-03-22T20:16:10.503Z] 20:16:10 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:275:18
[task 2019-03-22T20:16:10.504Z] 20:16:10 INFO - testBug982141@gfx/layers/apz/test/mochitest/helper_bug982141.html:54:18
[task 2019-03-22T20:21:06.409Z] 20:21:06 INFO - 152 INFO TEST-UNEXPECTED-FAIL | gfx/layers/apz/test/mochitest/test_bug982141.html | Test timed out.
[task 2019-03-22T20:21:06.409Z] 20:21:06 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:275:18
[task 2019-03-22T20:21:06.409Z] 20:21:06 INFO - reportError@SimpleTest/TestRunner.js:121:22
[task 2019-03-22T20:21:06.409Z] 20:21:06 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7
[task 2019-03-22T20:21:37.680Z] 20:21:37 INFO - 153 INFO TEST-OK | gfx/layers/apz/test/mochitest/test_bug982141.html | took 348798ms
[task 2019-03-22T20:21:47.899Z] 20:21:47 INFO - 154 INFO TEST-UNEXPECTED-FAIL | gfx/layers/apz/test/mochitest/test_bug982141.html | Test timed out.
[task 2019-03-22T20:21:47.899Z] 20:21:47 INFO - SimpleTest.ok@SimpleTest/SimpleTest.js:275:18
[task 2019-03-22T20:21:47.900Z] 20:21:47 INFO - reportError@SimpleTest/TestRunner.js:121:22
[task 2019-03-22T20:21:47.900Z] 20:21:47 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7
[task 2019-03-22T20:21:58.113Z] 20:21:58 INFO - 155 ERROR [SimpleTest.finish()] this test already called finish!
[task 2019-03-22T20:21:58.114Z] 20:21:58 INFO - 156 INFO TEST-UNEXPECTED-ERROR | gfx/layers/apz/test/mochitest/test_bug982141.html | called finish() multiple times
[task 2019-03-22T20:21:58.114Z] 20:21:58 INFO - TEST-INFO
[task 2019-03-22T20:22:08.337Z] 20:22:08 INFO - 157 INFO Error: Unable to restore focus, expect failures and timeouts.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Please file a new bug; patches landed here and the original issue was fixed.

Status: REOPENED → RESOLVED
Closed: 6 months ago3 months ago
Resolution: --- → FIXED
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
You need to log in before you can comment on or make changes to this bug.