Closed Bug 820739 Opened 12 years ago Closed 11 years ago

Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" or "command timed out: 3600 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step

Categories

(Testing :: Marionette Client and Harness, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: emorley, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [mozharness])

b2g_ics_armv7a_gecko_emulator mozilla-inbound opt test marionette-webapi on 2012-12-09 19:38:06 PST for push 0770315e3538 slave: talos-r3-fed-046 https://tbpl.mozilla.org/php/getParsedLog.php?id=17769242&tree=Mozilla-Inbound { 19:40:44 INFO - ##### 19:40:44 INFO - ##### Running install step. 19:40:44 INFO - ##### 19:40:44 INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/pip', 'freeze'] 19:40:44 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/pip freeze 19:40:44 INFO - Reading from file tmpfile_stdout 19:40:44 INFO - Emulator tests; skipping. 19:40:44 INFO - ##### 19:40:44 INFO - ##### Running run-marionette step. 19:40:44 INFO - ##### 19:40:44 INFO - Running command: ['/home/cltbld/talos-slave/test/build/venv/bin/python', '-u', '/home/cltbld/talos-slave/test/build/tests/marionette/marionette/runtests.py', '--logcat-dir', '/home/cltbld/talos-slave/test/build', '--emulator', 'arm', '--gecko-path', '/home/cltbld/talos-slave/test/build/gecko/b2g', '--load-early', '--homedir', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro', '--type', 'b2g', '/home/cltbld/talos-slave/test/build/tests/marionette/tests/testing/marionette/client/marionette/tests/unit-tests.ini'] 19:40:44 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/python -u /home/cltbld/talos-slave/test/build/tests/marionette/marionette/runtests.py --logcat-dir /home/cltbld/talos-slave/test/build --emulator arm --gecko-path /home/cltbld/talos-slave/test/build/gecko/b2g --load-early --homedir /home/cltbld/talos-slave/test/build/emulator/b2g-distro --type b2g /home/cltbld/talos-slave/test/build/tests/marionette/tests/testing/marionette/client/marionette/tests/unit-tests.ini 19:40:45 INFO - starting httpd 19:40:45 INFO - running webserver on http://10.12.49.173:42342/ 19:41:00 INFO - Android Debug Bridge version 1.0.29 command timed out: 1200 seconds without output, attempting to kill }
Blocks: 823076
https://tbpl.mozilla.org/php/getParsedLog.php?id=19084122&tree=Firefox 01:10:17 INFO - ##### 01:10:17 INFO - ##### Running install step. 01:10:17 INFO - ##### 01:10:17 INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/pip', 'freeze'] 01:10:17 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/pip freeze 01:10:17 INFO - Reading from file tmpfile_stdout 01:10:17 INFO - Detecting whether we're running mozinstall >=1.0... 01:10:17 INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/mozinstall', '--app', 'b2g', '-h'] 01:10:17 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/mozinstall --app b2g -h 01:10:18 INFO - Reading from file tmpfile_stdout 01:10:18 INFO - Output received: 01:10:18 INFO - Usage: mozinstall [options] installer 01:10:18 INFO - Options: 01:10:18 INFO - -h, --help show this help message and exit 01:10:18 INFO - -d DEST, --destination=DEST 01:10:18 INFO - Directory to install application into. [default: 01:10:18 INFO - "/home/cltbld/talos-slave/test"] 01:10:18 INFO - --app=APP Application being installed. [default: firefox] 01:10:18 INFO - mkdir: /home/cltbld/talos-slave/test/build/application 01:10:18 INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/mozinstall', '--app', 'b2g', '/home/cltbld/talos-slave/test/build/b2g-21.0a1.en-US.android-arm.tar.gz', '--destination', '/home/cltbld/talos-slave/test/build/application'] 01:10:18 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/mozinstall --app b2g /home/cltbld/talos-slave/test/build/b2g-21.0a1.en-US.android-arm.tar.gz --destination /home/cltbld/talos-slave/test/build/application 01:10:19 INFO - Reading from file tmpfile_stdout 01:10:19 INFO - Output received: 01:10:19 INFO - /home/cltbld/talos-slave/test/build/application/b2g/b2g 01:10:19 INFO - ##### 01:10:19 INFO - ##### Running run-tests step. 01:10:19 INFO - ##### 01:10:19 INFO - Running command: ['/home/cltbld/talos-slave/test/build/venv/bin/python', 'runreftestb2g.py', '--adbpath', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro/out/host/linux-x86/bin/adb', '--b2gpath', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro', '--emulator', 'arm', '--emulator-res', '800x1000', '--ignore-window-size', '--logcat-dir', '/home/cltbld/talos-slave/test/build', '--remote-webserver', '10.0.2.2', '--xre-path', '/home/cltbld/talos-slave/test/build/xre/bin', '--total-chunks', '10', '--this-chunk', '2', '--gecko-path', '/home/cltbld/talos-slave/test/build/application/b2g', '--busybox', '/home/cltbld/talos-slave/test/build/busybox', 'tests/layout/reftests/reftest.list'] in /home/cltbld/talos-slave/test/build/tests/reftest 01:10:19 INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/python runreftestb2g.py --adbpath /home/cltbld/talos-slave/test/build/emulator/b2g-distro/out/host/linux-x86/bin/adb --b2gpath /home/cltbld/talos-slave/test/build/emulator/b2g-distro --emulator arm --emulator-res 800x1000 --ignore-window-size --logcat-dir /home/cltbld/talos-slave/test/build --remote-webserver 10.0.2.2 --xre-path /home/cltbld/talos-slave/test/build/xre/bin --total-chunks 10 --this-chunk 2 --gecko-path /home/cltbld/talos-slave/test/build/application/b2g --busybox /home/cltbld/talos-slave/test/build/busybox tests/layout/reftests/reftest.list 01:10:35 INFO - Android Debug Bridge version 1.0.29 command timed out: 1200 seconds without output, attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=1364.127121 ========= Finished '/tools/buildbot/bin/python scripts/scripts/b2g_emulator_unittest.py ...' failed (results: 2, elapsed: 22 mins, 46 secs) (at 2013-01-24 01:30:37.918921) =========
Summary: Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette step → Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step
That last one timed out while downloading tests.zip. In that case the emulator hasn't even been launched yet. Filed bug 848442 for it.
In the most recent cases, it looks like the emulator isn't starting up successfully, but it's a little hard to tell. Mozharness might be eating the last bit of log (bug 791406), and we don't have a process list or logcat output. It will be hard to add additional logging here, since the mozharness script is getting killed by buildbot due to lack of output.
I have a timeout patch in bug 840305.
Oh cool, that will make it much easier to debug this.
Whiteboard: [mozharness]
Summary: Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step → Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" or "command timed out: 3600 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step
This is now one of our top oranges - could you take another look or find someone who has the cycles to? :-)
Flags: needinfo?(jgriffin)
Yes, I'll try. Looking at the logs, it looks like we could be dealing with several distinct problems; some may be test related, some B2G related, others harness related. Unfortunately the 1200s timeout kill prevents us from dumping logcat, so the information we have to go on is limited. Ideally, we could pass a timeout parameter to run_command so that it can timeout within the mozharness script, and still allow us to dump logcat. I'll open a separate bug to track that.
Flags: needinfo?(jgriffin)
Sounds like a good plan, thank you :-)
Depends on: 892563
Spun off the last 3 failures to bug 899074.
The last many instances of this failure are a hang in one specific crashtest: 12:56:10 INFO - REFTEST TEST-PASS | http://10.0.2.2:8888/tests/content/media/test/crashtests/878014.html | (LOAD ONLY)
Depends on: 904839
878014.html and 880724.html disabled for frequent timeouts. https://hg.mozilla.org/integration/mozilla-inbound/rev/9df04b16a655