Closed Bug 1697345 Opened 4 years ago Closed 4 years ago

Perma Android 8.0 jsreftest failures, e.g. js/src/tests/non262/<test> | load failed: timed out waiting for reftest-wait to be removed | test must provide a function getTestCases().

Categories

(Core :: JavaScript Engine, defect, P5)

RESOLVED WONTFIX

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: intermittent-failure)

Filed by: malexandru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=332610251&repo=mozilla-beta
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JBqLC-2yQ9CHGwwrO7Ys2w/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/JBqLC-2yQ9CHGwwrO7Ys2w/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


```
[task 2021-03-09T23:12:45.654Z] 23:12:45     INFO -  REFTEST TEST-PASS | js/src/tests/non262/Array/regress-451906.js | Index array by numeric string  item 1
[task 2021-03-09T23:12:45.654Z] 23:12:45     INFO -  REFTEST TEST-END | js/src/tests/non262/Array/regress-451906.js
[task 2021-03-09T23:12:45.654Z] 23:12:45     INFO -  REFTEST TEST-START | js/src/tests/non262/Array/regress-456845.js
[taskcluster 2021-03-09T23:17:38.360Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/JBqLC-2yQ9CHGwwrO7Ys2w/0/on/bitbar/pixel2-25/until/1615333058.336
[task 2021-03-09T23:17:52.659Z] 23:12:45     INFO -  REFTEST TEST-LOAD | http://10.7.205.225:8854/jsreftest/tests/js/src/tests/jsreftest.html?test=non262/Array/regress-456845.js | 94 / 7216 (1%)
[task 2021-03-09T23:17:52.659Z] 23:17:52  WARNING -  REFTEST TEST-UNEXPECTED-FAIL | js/src/tests/non262/Array/regress-456845.js | load failed: timed out waiting for reftest-wait to be removed
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: START http://10.7.205.225:8854/jsreftest/tests/js/src/tests/jsreftest.html?test=non262/Array/regress-456845.js
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] OnDocumentLoad triggering WaitForTestEnd
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] WaitForTestEnd: Adding listeners
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: Initializing canvas snapshot
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FIRE_INVALIDATE_EVENT
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: dispatching MozReftestInvalidate
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_FOR_REFTEST_WAIT_REMOVAL
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for reftest-wait to be removed
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST TEST-END | js/src/tests/non262/Array/regress-456845.js
[task 2021-03-09T23:17:52.659Z] 23:17:52     INFO -  REFTEST TEST-START | js/src/tests/non262/Array/regress-465980-01.js```
See Also: → 1698138
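
For anyone triaging similar logs, here is a minimal sketch (not part of the reftest harness) that pulls the unexpected failures out of a log like the one above and reports how long each test waited, based on the harness-side clock. The function name `find_unexpected_failures` and the regular expression are assumptions derived only from the line format shown in this excerpt; other log formats may need adjustments, and a run that crosses midnight is not handled.

```python
import re
from datetime import datetime

# Matches the harness part of a line as it appears in the excerpt above, e.g.
# "] 23:17:52  WARNING -  REFTEST TEST-UNEXPECTED-FAIL | <test> | <message>"
LINE_RE = re.compile(
    r"\] (?P<clock>\d{2}:\d{2}:\d{2})\s+(?:INFO|WARNING|ERROR) -\s+"
    r"REFTEST (?P<status>TEST-[A-Z-]+) \| (?P<test>[^|]+?)(?: \| (?P<msg>.*))?$"
)


def find_unexpected_failures(lines):
    """Return (test, seconds_since_start, message) for each unexpected failure."""
    started = {}   # test path -> harness clock at TEST-START
    failures = []
    for line in lines:
        match = LINE_RE.search(line.rstrip())
        if match is None:
            continue  # taskcluster lines, "REFTEST INFO" lines, etc.
        clock = datetime.strptime(match["clock"], "%H:%M:%S")
        test = match["test"].strip()
        if match["status"] == "TEST-START":
            started[test] = clock
        elif match["status"] == "TEST-UNEXPECTED-FAIL":
            start = started.get(test)
            # Assumes start and failure fall on the same day.
            waited = (clock - start).total_seconds() if start else None
            failures.append((test, waited, match["msg"]))
    return failures
```

Applied to the excerpt above, this would report regress-456845.js with a wait of 307 seconds between TEST-START (23:12:45) and the failure (23:17:52).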

The frequent failures also affect earlier pushes whose initial jsreftest runs succeeded. The tasks that succeeded and the tasks that failed share the same dependencies and artifacts. Some of these dependencies and artifacts were updated by the changes in bug 1695773, the day before the failures started.

These Android 8.0 arm7 tests continue to fail very frequently. This can also be observed on mozilla-release and hence shouldn't be related to a check-in. Can you check with Bitbar again regarding differences compared to the gecko-t-bitbar-gw-unit-p2 pool?

Flags: needinfo?(aerickson)

Yes, I'll check with Bitbar again.

It's confusing that the test executor continues to successfully execute tests after seeing a few failures... could the test framework just be firing jobs too quickly?

example: https://treeherder.mozilla.org/logviewer?job_id=333423674&repo=mozilla-central&lineNumber=14529

  • "TEST-PASS" instances happen after the error "test must provide a function getTestCases()." occurs
Flags: needinfo?(aerickson)

The Bitbar issues were due to networking. Sakari is relatively confident all issues are resolved.

These jobs are running in the unit-p2 pool (which is the same as the other pools). Bitbar doesn't actually know or manage which pools our devices are in; that's strictly something we control in our provisioner (devicepool) layer. Bitbar's infrastructure connects 6-8 phones to a Docker host. If we saw a concentrated subset of devices having issues (e.g. pixel2-490 could well be on the same Docker host as pixel2-492), it could be due to a bad Docker host. I didn't see any particular hot spots for this bug (some hosts have fewer failures, but I'm not sure we have a statistically significant number of runs).
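
The hot-spot check described above can be sketched roughly as follows. This is only an illustration: the function name `failure_rate_by_host` is hypothetical, and so is the device-to-host mapping, which as far as this thread shows is not exposed by Bitbar or devicepool in any particular form and would have to be supplied by hand.

```python
from collections import defaultdict


def failure_rate_by_host(runs, device_to_host):
    """Group runs by Docker host and return {host: (failures, runs, rate)}.

    runs: iterable of (device_name, failed) pairs, one per job.
    device_to_host: hypothetical mapping from device name to Docker host.
    """
    totals = defaultdict(lambda: [0, 0])  # host -> [failures, total runs]
    for device, failed in runs:
        host = device_to_host.get(device, "unknown")
        totals[host][1] += 1
        if failed:
            totals[host][0] += 1
    return {
        host: (failures, count, failures / count)
        for host, (failures, count) in totals.items()
    }


# Made-up sample data for illustration only; these are not real results.
sample_runs = [
    ("pixel2-490", True),
    ("pixel2-492", True),
    ("pixel2-490", False),
    ("pixel2-25", False),
]
sample_hosts = {"pixel2-490": "host-a", "pixel2-492": "host-a", "pixel2-25": "host-b"}
rates = failure_rate_by_host(sample_runs, sample_hosts)
```

With only a handful of runs per host, the rates are too noisy to single out a bad host, which matches the caveat about statistical significance.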

How do the jsreftests work? It seems from the logs that the tests push everything to the device and then use the local filesystem to load the test cases. That would seem to rule out any networking-related issues.

> This can also be observed on mozilla-release and hence shouldn't be related to a check-in.

The linked mozilla-release failures are due to the task exceeding 1 hour (Bug 1697835), not "test must provide a function getTestCases()." I'm not sure why the timeout is happening; I didn't see any artifact fetching failures.

Have we seen this particular error in jobs run on other trees?

(In reply to Andrew Erickson [:aerickson] from comment #10)

> The linked mozilla-release failures are due to the task exceeding 1 hour (Bug 1697835), not "test must provide a function getTestCases()." I'm not sure why the timeout is happening; I didn't see any artifact fetching failures.
>
> Have we seen this particular error in jobs run on other trees?

Aah, I do see that "test must provide a function getTestCases()." is mentioned in two of the failures. I was looking at the ultimate reason for failure (the mozilla-release jobs fail due to a timeout, while the central jobs are reaching the end of the run and failing due to this bug). It seems the networking issues related to fetching mentioned in Bug 1697835 are better, but this issue persists.

See Also: → 1589796
Summary: Perma Android 8.0 [tier 2] js/src/tests/non262/<test> | load failed: timed out waiting for reftest-wait to be removed | test must provide a function getTestCases(). → Perma Android 8.0 jsreftest failures, e.g. js/src/tests/non262/<test> | load failed: timed out waiting for reftest-wait to be removed | test must provide a function getTestCases().

jsreftests on AArch64 got moved to macOS in bug 1692091.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(jdemooij)
Resolution: --- → WONTFIX
See Also: → 1692091