Closed
Bug 1335501
Opened 8 years ago
Closed 8 years ago
Provide better error handling for xpcshell tests when the Android emulator becomes unresponsive
Categories
(Firefox for Android Graveyard :: Testing, defect, P2)
Firefox for Android Graveyard
Testing
Tracking
(firefox52 fixed, firefox53 fixed, firefox54 fixed)
RESOLVED
FIXED
Firefox 54
People
(Reporter: gbrown, Assigned: gbrown)
References
Details
Attachments
(1 file)
1.28 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
Sometimes the Android emulator becomes unresponsive during a job. For most test jobs, that triggers an automatic retry, but for xpcshell tests, that is usually reported as a test failure, which causes confusion and wastes developer time.
![]() |
Assignee | |
Updated•8 years ago
|
Priority: -- → P2
![]() |
Assignee | |
Comment 24•8 years ago
|
||
Some examples:
[task 2016-11-29T23:58:17.338984Z] 23:58:17 INFO - TEST-START | netwerk/test/unit/test_progress.js
[task 2016-11-30T00:03:17.339826Z] 00:03:17 WARNING - TEST-UNEXPECTED-TIMEOUT | netwerk/test/unit/test_progress.js | Test timed out
[task 2016-11-30T00:03:17.339990Z] 00:03:17 INFO - TEST-INFO took 300000ms
[task 2016-11-30T00:03:27.379045Z] 00:03:27 INFO - mozdevice Timeout exceeded for shell call 'adb shell /data/local/xpcb/xpcw /storage/sdcard/tests/xpc/netwerk/test/unit -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/target.apk -m -s -e 'const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js";' -e 'const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json";' -e 'const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m";' -f /storage/sdcard/tests/xpc/head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"];' -e 'const _TAIL_FILES = [];' -e 'const _JSDEBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["test_progress.js"];' -e 'const _TEST_NAME = "netwerk/test/unit/test_progress.js"' -e '_execute_test(); quit(0);'; echo $?'
[task 2016-11-30T00:03:47.379151Z] 00:03:47 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
[task 2016-11-30T00:03:47.382686Z] 00:03:47 INFO - xpcshell return code: 93
[task 2016-11-30T00:03:57.425446Z] 00:03:57 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
[task 2016-11-30T00:04:27.468833Z] 00:04:27 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
[task 2016-11-30T00:04:27.473445Z] 00:04:27 WARNING - TEST-UNEXPECTED-FAIL | Received SIGINT (control-C), so stopped run. (Use --keep-going to keep running tests after killing one with SIGINT)
05:40:49 INFO - TEST-START | netwerk/test/unit/test_bug767025.js
05:45:49 WARNING - TEST-UNEXPECTED-TIMEOUT | netwerk/test/unit/test_bug767025.js | Test timed out
05:45:49 INFO - TEST-INFO took 300096ms
05:45:59 INFO - mozdevice Timeout exceeded for shell call 'adb shell /data/local/xpcb/xpcw /storage/sdcard/tests/xpc/netwerk/test/unit -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/fennec-53.0a1.en-US.android-arm.apk -m -s -e 'const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js";' -e 'const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json";' -e 'const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m";' -f /storage/sdcard/tests/xpc/head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"];' -e 'const _TAIL_FILES = [];' -e 'const _JSDEBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["test_bug767025.js"];' -e 'const _TEST_NAME = "netwerk/test/unit/test_bug767025.js"' -e '_execute_test(); quit(0);'; echo $?'
05:46:20 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
05:46:20 INFO - xpcshell return code: 1483450857825
05:46:30 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
05:47:00 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
05:47:00 WARNING - TEST-UNEXPECTED-FAIL | Received SIGINT (control-C), so stopped run. (Use --keep-going to keep running tests after killing one with SIGINT)
22:27:19 INFO - TEST-START | netwerk/test/unit/test_redirect_failure.js
22:32:19 WARNING - TEST-UNEXPECTED-TIMEOUT | netwerk/test/unit/test_redirect_failure.js | Test timed out
22:32:19 INFO - TEST-INFO expected FAIL | took 300076ms
22:32:29 INFO - mozdevice Timeout exceeded for shell call 'adb shell /data/local/xpcb/xpcw /storage/sdcard/tests/xpc/netwerk/test/unit -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/fennec-53.0a1.en-US.android-arm.apk -m -s -e 'const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js";' -e 'const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json";' -e 'const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m";' -f /storage/sdcard/tests/xpc/head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"];' -e 'const _TAIL_FILES = [];' -e 'const _JSDEBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["test_redirect_failure.js"];' -e 'const _TEST_NAME = "netwerk/test/unit/test_redirect_failure.js"' -e '_execute_test(); quit(0);'; echo $?'
22:32:49 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
22:32:49 INFO - xpcshell return code: 53
22:33:00 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
22:33:30 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
22:33:30 WARNING - TEST-UNEXPECTED-FAIL | Received SIGINT (control-C), so stopped run. (Use --keep-going to keep running tests after killing one with SIGINT)
19:00:58 INFO - TEST-START | netwerk/test/unit/test_range_requests.js
19:05:59 WARNING - TEST-UNEXPECTED-TIMEOUT | netwerk/test/unit/test_range_requests.js | Test timed out
19:05:59 INFO - TEST-INFO took 300077ms
19:06:09 INFO - mozdevice Timeout exceeded for shell call 'adb shell /data/local/xpcb/xpcw /storage/sdcard/tests/xpc/netwerk/test/unit -r /storage/sdcard/tests/xpc/c/httpd.manifest --greomni /data/local/xpcb/fennec-53.0a1.en-US.android-arm.apk -m -s -e 'const _HEAD_JS_PATH = "/storage/sdcard/tests/xpc/head.js";' -e 'const _MOZINFO_JS_PATH = "/storage/sdcard/tests/xpc/p/mozinfo.json";' -e 'const _TESTING_MODULES_DIR = "/storage/sdcard/tests/xpc/m";' -f /storage/sdcard/tests/xpc/head.js -e 'const _SERVER_ADDR = "localhost"' -e 'const _HEAD_FILES = ["/storage/sdcard/tests/xpc/netwerk/test/unit/head_channels.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache.js", "/storage/sdcard/tests/xpc/netwerk/test/unit/head_cache2.js"];' -e 'const _TAIL_FILES = [];' -e 'const _JSDEBUGGER_PORT = 0;' -e 'const _TEST_FILE = ["test_range_requests.js"];' -e 'const _TEST_NAME = "netwerk/test/unit/test_range_requests.js"' -e '_execute_test(); quit(0);'; echo $?'
19:06:29 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
19:06:29 INFO - xpcshell return code: 1480388468118
19:06:39 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
19:07:09 INFO - mozdevice Timeout exceeded for _runCmd call 'adb shell ps'
19:07:09 WARNING - TEST-UNEXPECTED-FAIL | Received SIGINT (control-C), so stopped run. (Use --keep-going to keep running tests after killing one with SIGINT)
![]() |
Assignee | |
Comment 25•8 years ago
|
||
"DMError" in a test log triggers a retry: https://dxr.mozilla.org/mozilla-central/rev/1d025ac534a6333a8170a59a95a8a3673d4028ee/testing/mozharness/mozharness/mozilla/testing/errors.py#101.
That DMError check works for mochitests and reftests because those tests use remoteautomation.py for process monitoring of the remote browser, and the remoteautomation.py code usually fails on a devicemanagerADB _checkCmd() call, which raises DMError.
In contrast, xpcshell tests don't use remoteautomation.py (having different, arguably simpler process management requirements). The xpcshell wrapper is launched with shell(), and the xpcshell timeout mechanism triggers proc.kill() and proc.poll() calls that ultimately use devicemanager.getProcessList(). DM.getProcessList() call _runCmd('adb shell ps') -- not _checkCmd().
It wouldn't be appropriate to throw DMError in _runCmd: it is intended as a non-throwing alternative to _checkCmd().
It might be OK to change DM.getProcessList() to use _checkCmd() instead of _runCmd(), but I worry that might have a de-stabilizing effect on our various DeviceManager clients.
Perhaps the least dangerous path forward is to retry on the existing message, "Timeout exceeded for _runCmd call 'adb shell ps'", or perhaps "Timeout exceeded for _runCmd call". If an adb command times out, I think that is likely an indication of a test environment problem, rather than a test problem.
![]() |
Assignee | |
Comment 26•8 years ago
|
||
See my reasoning in comment 25.
I hope it will be this simple. It seems to do no harm:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6af77bea86e1bcece4fc7e902d943c807c6eed07
Attachment #8833100 -
Flags: review?(jmaher)
![]() |
Assignee | |
Comment 27•8 years ago
|
||
What luck! We encountered the xpcshell-ps-timeout problem and retried successfully: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6af77bea86e1bcece4fc7e902d943c807c6eed07&selectedJob=74071719
Comment 28•8 years ago
|
||
Comment on attachment 8833100 [details] [diff] [review]
retry on _runCmd() timeout
Review of attachment 8833100 [details] [diff] [review]:
-----------------------------------------------------------------
oh great! I am curious why we have the twisted error in the regex, isn't that specific to buildbot?
Attachment #8833100 -
Flags: review?(jmaher) → review+
![]() |
Assignee | |
Comment 29•8 years ago
|
||
I think the twisted error is specific to buildbot, and probably no longer needed, but I'm not sure - I won't mess with it, today.
Comment 30•8 years ago
|
||
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/06c3be84254c
Trigger test job retry when Android devicemanager _runCmd() times out; r=jmaher
Comment 34•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox54:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
![]() |
Assignee | |
Updated•8 years ago
|
Whiteboard: [checkin-needed-aurora][checkin-needed-beta]
Comment 35•8 years ago
|
||
bugherder uplift |
status-firefox53:
--- → fixed
Whiteboard: [checkin-needed-aurora][checkin-needed-beta] → [checkin-needed-beta]
Comment 36•8 years ago
|
||
bugherder uplift |
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•