[jittest] Better handle intermittent adb errors
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
Release | Tracking | Status |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: bc, Assigned: bc)
Attachments
(1 file)
7.32 KB, patch | nbp: review+
While I believe bug 1524352 will help in situations such as bug 1518650, where the expected-error test cases were polluting the suggested bugs in Treeherder, it has failed to eliminate the problem with two adb errors: error: closed [1] and error: device <serial> not found [2].
[1] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=226113784&repo=autoland&lineNumber=8165
[2] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=225927129&repo=autoland&lineNumber=9928
Comment 1•5 years ago
One question I have about ignoring tests: how do we ensure that we do not ignore the entire test suite by accident?
Comment 2•5 years ago (Assignee)
(In reply to Nicolas B. Pierron [:nbp] from comment #1)
One question I have about ignoring tests: how do we ensure that we do not ignore the entire test suite by accident?
The error: closed and error: device <serial> not found errors do not appear to affect subsequent tests, so as long as we localize our changes to the specific test affected, the other tests in the suite are not affected.
After several attempts, I do not think my initial approach is workable. That approach flagged the intermittent error in the except ADBProcessError as e: block [1] in run_test_remote, then handled it as one of the special cases in the if rc != test.expect_status: block [2] in check_output.
Instead, I am now attempting to treat the affected test as if it were skipped, since there does not appear to be any means of reliably determining the test's pass/fail status once the adb communication error has occurred for that individual test. "Skipping" it seems the most reasonable fallback.
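To illustrate the "treat it as skipped" idea, here is a minimal sketch and not the actual patch. FakeADBProcessError is a hypothetical stand-in for mozdevice's ADBProcessError so the example runs without a device, and ignorable_adb_error and run_one_test are names invented for illustration. Only the two error strings discussed above are treated as ignorable; any other adb failure is re-raised.

```python
import re


class FakeADBProcessError(Exception):
    """Hypothetical stand-in for mozdevice's ADBProcessError."""


# The two intermittent errors discussed in this bug. The serial may or may
# not be quoted in the adb output, so match any run of non-space characters.
IGNORABLE_ADB_ERRORS = (
    re.compile(r"error: closed"),
    re.compile(r"error: device \S+ not found"),
)


def ignorable_adb_error(message):
    """Return the matched error text, or None if the error is not ignorable."""
    for pattern in IGNORABLE_ADB_ERRORS:
        match = pattern.search(message)
        if match:
            return match.group(0)
    return None


def run_one_test(path, run_on_device):
    """Run one test; report it as skipped on an ignorable adb error.

    The handling is local to this test, so later tests still run.
    """
    try:
        return run_on_device(path)
    except FakeADBProcessError as e:
        reason = ignorable_adb_error(str(e))
        if reason is None:
            raise  # unknown adb failure: do not hide it
        print("Skipping {} due to ignorable adb error {}".format(path, reason))
        return "skipped"
```

Because the check runs per test and re-raises anything it does not recognize, a device that fails persistently in some other way would still surface as an error rather than silently skipping the suite.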
However, this is insufficient to deal with all of the device-error-related failures. For example, we have several failures per day when pushing the libraries and tests to the device. Since the error messages for these failures tend to be unique, it is difficult for the sheriffs to classify them. I am going to experiment with a patch to make these errors more easily identifiable. I've adjusted the bug summary to match.
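One way to make setup failures easier to classify is to always lead with the same summary line and put the unique detail on a second line. The sketch below assumes that approach. init_device and its (name, callable) step list are hypothetical, and the summary string is copied from the try-run output quoted later in this bug.

```python
# Fixed summary line, copied from the try-run output quoted in this bug.
DEVICE_INIT_FAILED = (
    "TEST-UNEXPECTED-FAIL | jit_test.py : Device initialization failed"
)


def init_device(setup_steps, log=print):
    """Run each setup step; report any failure with one uniform summary.

    setup_steps is a hypothetical list of (name, callable) pairs, for
    example pushing the libraries and then pushing the tests.
    """
    for name, step in setup_steps:
        try:
            step()
        except Exception as e:  # sketch only: real code would catch adb errors
            log(DEVICE_INIT_FAILED)  # stable line for classification
            log("{} failed: {}".format(name, e))  # unique detail for debugging
            return False
    return True
```

With a stable first line, every setup failure matches a single known message, while the second line keeps the original detail for whoever investigates.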
As an example where these errors are causing problems with triaging jittest failures, see [3] where an expected ReferenceError failed due to a zero return code but was misclassified as an example of bug 1518628 error: closed.
nbp: Would you be a good person to review the patches?
[1] https://searchfox.org/mozilla-central/source/js/src/tests/lib/jittests.py#436
[2] https://searchfox.org/mozilla-central/source/js/src/tests/lib/jittests.py#507
[3] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&tier=1%2C2%2C3&searchStr=android-hw%2Cjit&selectedJob=226725792
Comment 3•5 years ago
(In reply to Bob Clary [:bc:] from comment #2)
nbp: Would you be a good person to review the patches?
sfink or myself are good people to review the jit-tests harness.
Comment 4•5 years ago (Assignee)
Try run with --rebuild 20
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f9a3adfb81dcf4815d131e808c369015511b663
We see one device failure at bitbar before the test begins running:
TEST-UNEXPECTED-FAIL | bitbar | ADBDevice.init: ls could not be found attempting to clean up device
One device failure attempting to set up the device for the test:
TEST-UNEXPECTED-FAIL | jit_test.py : Device initialization failed
One device failure in mozharness to connect to the device:
ADBError: ADBDevice.init: ls could not be found
And 5 intermittent adb connection errors out of the 400 test runs:
https://taskcluster-artifacts.net/fWfXgF-0TkmY2wMn5MCtkQ/0/public/logs/live_backing.log
Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/tests/basic/testDivModWithIntMin.js due to ignorable adb error error: device 'HT83K1A02572' not found
https://taskcluster-artifacts.net/ZieBqwweT--n6k2qNeWYfw/0/public/logs/live_backing.log
Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/tests/ion/bug1264948.js due to ignorable adb error error: device 'FA84C1A00154' not found
https://taskcluster-artifacts.net/XfTRkvo4QcS1l1oJxRJHgQ/0/public/logs/live_backing.log
Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/tests/structured-clone/Map-Set-cross-compartment.js due to ignorable adb error error: device 'HT83K1A02597' not found
https://taskcluster-artifacts.net/JMBdYdt9QDu8Ouh4aaZ0nw/0/public/logs/live_backing.log
Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/tests/basic/bug820124-1.js due to ignorable adb error error: device 'FA83V1A02389' not found
https://taskcluster-artifacts.net/BlITfWFCTwmpOLyCsKbW5w/0/public/logs/live_backing.log
Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/tests/ion/bug1365769-2.js due to ignorable adb error error: device 'FA84C1A00167' not found
These all have log messages of the form:
TEST-PASS | tests/jit-test/jit-test/tests/ion/bug1365769-2.js | Success (code 59, args "")
showing the test was skipped. The logs show that the remaining tests continued to run. It may be helpful for you in the future to include the number of skipped tests in the job details.
We didn't see any examples of the adb error: closed error or of an uncaught ADB error during the test runs.
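As a rough illustration of how the number of skipped tests could be derived from a log, the sketch below scans for the "Skipping ... due to ignorable adb error ..." lines shown above. find_adb_skips is a hypothetical helper and not part of the harness. The pattern is not anchored to the start of the line because real log lines may carry a timestamp or other prefix.

```python
import re

# Line format taken from the log excerpts quoted above.
SKIP_LINE = re.compile(
    r"Skipping (?P<test>\S+) due to ignorable adb error (?P<error>.+)"
)


def find_adb_skips(log_lines):
    """Return a list of (test path, adb error) pairs for skipped tests."""
    skips = []
    for line in log_lines:
        match = SKIP_LINE.search(line.strip())
        if match:
            skips.append((match.group("test"), match.group("error")))
    return skips


if __name__ == "__main__":
    sample = [
        "Skipping /builds/worker/workspace/build/tests/jit-test/jit-test/"
        "tests/ion/bug1365769-2.js due to ignorable adb error "
        "error: device 'FA84C1A00167' not found",
        'TEST-PASS | tests/jit-test/jit-test/tests/ion/bug1365769-2.js | '
        'Success (code 59, args "")',
    ]
    skips = find_adb_skips(sample)
    print("{} test(s) skipped due to adb errors".format(len(skips)))
```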
Comment 5•5 years ago
Comment on attachment 9042957
bug-1525288-jittest-intermittent-adb-errors.patch
Review of attachment 9042957:
-----------------------------------------------------------------
This sounds good to me, but I am not an expert in ADB/mozdevice bindings. Feel free to get some mozdevice peer feedback if you feel that this might be needed.
Pushed by bclary@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2563d7cfc1d2
[jittest] Better handle intermittent adb errors, r=nbp.