Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" or "command timed out: 3600 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step

RESOLVED INCOMPLETE

Status

RESOLVED INCOMPLETE
6 years ago
5 years ago

People

(Reporter: emorley, Unassigned)

Tracking

({intermittent-failure})

Trunk
ARM
Gonk (Firefox OS)
intermittent-failure
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [mozharness])

b2g_ics_armv7a_gecko_emulator mozilla-inbound opt test marionette-webapi on 2012-12-09 19:38:06 PST for push 0770315e3538

slave: talos-r3-fed-046

https://tbpl.mozilla.org/php/getParsedLog.php?id=17769242&tree=Mozilla-Inbound

{
19:40:44     INFO - #####
19:40:44     INFO - ##### Running install step.
19:40:44     INFO - #####
19:40:44     INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/pip', 'freeze']
19:40:44     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/pip freeze
19:40:44     INFO - Reading from file tmpfile_stdout
19:40:44     INFO - Emulator tests; skipping.
19:40:44     INFO - #####
19:40:44     INFO - ##### Running run-marionette step.
19:40:44     INFO - #####
19:40:44     INFO - Running command: ['/home/cltbld/talos-slave/test/build/venv/bin/python', '-u', '/home/cltbld/talos-slave/test/build/tests/marionette/marionette/runtests.py', '--logcat-dir', '/home/cltbld/talos-slave/test/build', '--emulator', 'arm', '--gecko-path', '/home/cltbld/talos-slave/test/build/gecko/b2g', '--load-early', '--homedir', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro', '--type', 'b2g', '/home/cltbld/talos-slave/test/build/tests/marionette/tests/testing/marionette/client/marionette/tests/unit-tests.ini']
19:40:44     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/python -u /home/cltbld/talos-slave/test/build/tests/marionette/marionette/runtests.py --logcat-dir /home/cltbld/talos-slave/test/build --emulator arm --gecko-path /home/cltbld/talos-slave/test/build/gecko/b2g --load-early --homedir /home/cltbld/talos-slave/test/build/emulator/b2g-distro --type b2g /home/cltbld/talos-slave/test/build/tests/marionette/tests/testing/marionette/client/marionette/tests/unit-tests.ini
19:40:45     INFO -  starting httpd
19:40:45     INFO -  running webserver on http://10.12.49.173:42342/
19:41:00     INFO -  Android Debug Bridge version 1.0.29

command timed out: 1200 seconds without output, attempting to kill
}
Comment hidden (Treeherder Robot)
https://tbpl.mozilla.org/php/getParsedLog.php?id=19084122&tree=Firefox

01:10:17     INFO - #####
01:10:17     INFO - ##### Running install step.
01:10:17     INFO - #####
01:10:17     INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/pip', 'freeze']
01:10:17     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/pip freeze
01:10:17     INFO - Reading from file tmpfile_stdout
01:10:17     INFO - Detecting whether we're running mozinstall >=1.0...
01:10:17     INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/mozinstall', '--app', 'b2g', '-h']
01:10:17     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/mozinstall --app b2g -h
01:10:18     INFO - Reading from file tmpfile_stdout
01:10:18     INFO - Output received:
01:10:18     INFO -  Usage: mozinstall [options] installer
01:10:18     INFO -  Options:
01:10:18     INFO -    -h, --help            show this help message and exit
01:10:18     INFO -    -d DEST, --destination=DEST
01:10:18     INFO -                          Directory to install application into. [default:
01:10:18     INFO -                          "/home/cltbld/talos-slave/test"]
01:10:18     INFO -    --app=APP             Application being installed. [default: firefox]
01:10:18     INFO - mkdir: /home/cltbld/talos-slave/test/build/application
01:10:18     INFO - Getting output from command: ['/home/cltbld/talos-slave/test/build/venv/bin/mozinstall', '--app', 'b2g', '/home/cltbld/talos-slave/test/build/b2g-21.0a1.en-US.android-arm.tar.gz', '--destination', '/home/cltbld/talos-slave/test/build/application']
01:10:18     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/mozinstall --app b2g /home/cltbld/talos-slave/test/build/b2g-21.0a1.en-US.android-arm.tar.gz --destination /home/cltbld/talos-slave/test/build/application
01:10:19     INFO - Reading from file tmpfile_stdout
01:10:19     INFO - Output received:
01:10:19     INFO -  /home/cltbld/talos-slave/test/build/application/b2g/b2g
01:10:19     INFO - #####
01:10:19     INFO - ##### Running run-tests step.
01:10:19     INFO - #####
01:10:19     INFO - Running command: ['/home/cltbld/talos-slave/test/build/venv/bin/python', 'runreftestb2g.py', '--adbpath', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro/out/host/linux-x86/bin/adb', '--b2gpath', '/home/cltbld/talos-slave/test/build/emulator/b2g-distro', '--emulator', 'arm', '--emulator-res', '800x1000', '--ignore-window-size', '--logcat-dir', '/home/cltbld/talos-slave/test/build', '--remote-webserver', '10.0.2.2', '--xre-path', '/home/cltbld/talos-slave/test/build/xre/bin', '--total-chunks', '10', '--this-chunk', '2', '--gecko-path', '/home/cltbld/talos-slave/test/build/application/b2g', '--busybox', '/home/cltbld/talos-slave/test/build/busybox', 'tests/layout/reftests/reftest.list'] in /home/cltbld/talos-slave/test/build/tests/reftest
01:10:19     INFO - Copy/paste: /home/cltbld/talos-slave/test/build/venv/bin/python runreftestb2g.py --adbpath /home/cltbld/talos-slave/test/build/emulator/b2g-distro/out/host/linux-x86/bin/adb --b2gpath /home/cltbld/talos-slave/test/build/emulator/b2g-distro --emulator arm --emulator-res 800x1000 --ignore-window-size --logcat-dir /home/cltbld/talos-slave/test/build --remote-webserver 10.0.2.2 --xre-path /home/cltbld/talos-slave/test/build/xre/bin --total-chunks 10 --this-chunk 2 --gecko-path /home/cltbld/talos-slave/test/build/application/b2g --busybox /home/cltbld/talos-slave/test/build/busybox tests/layout/reftests/reftest.list
01:10:35     INFO -  Android Debug Bridge version 1.0.29

command timed out: 1200 seconds without output, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1364.127121
========= Finished '/tools/buildbot/bin/python scripts/scripts/b2g_emulator_unittest.py ...' failed (results: 2, elapsed: 22 mins, 46 secs) (at 2013-01-24 01:30:37.918921) =========
Summary: Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette step → Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
That last one timed out while downloading tests.zip. In that case the emulator hasn't even been launched yet. Filed bug 848442 for it.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
In the most recent cases, it looks like the emulator isn't starting up successfully, but it's a little hard to tell.  Mozharness might be eating the last bit of log (bug 791406), and we don't have a process list or logcat output.  It will be hard to add additional logging here, since the mozharness script is getting killed by buildbot due to lack of output.
I have a timeout patch in bug 840305.
Oh cool, that will make it much easier to debug this.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Whiteboard: [mozharness]
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://tbpl.mozilla.org/php/getParsedLog.php?id=24326084&tree=Mozilla-Inbound
Summary: Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step → Intermittent B2G emulator "command timed out: 1200 seconds without output, attempting to kill" or "command timed out: 3600 seconds without output, attempting to kill" during run-marionette or b2g_emulator_unittest step
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
This is now one of our top oranges - could you take another look or find someone who has the cycles to? :-)
Flags: needinfo?(jgriffin)
Yes, I'll try.  Looking at the logs, it looks like we could be dealing with several distinct problems; some may be test related, some B2G related, others harness related.

Unfortunately the 1200s timeout kill prevents us from dumping logcat, so the information we have to go on is limited.

Ideally, we could pass a timeout parameter to run_command so that it can timeout within the mozharness script, and still allow us to dump logcat.  I'll open a separate bug to track that.
Flags: needinfo?(jgriffin)
Sounds like a good plan, thank you :-)
Depends on: 892563
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Spun off the last 3 failures to bug 899074.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
The last many instances of this failure are a hang in one specific crashtest:

12:56:10     INFO -  REFTEST TEST-PASS | http://10.0.2.2:8888/tests/content/media/test/crashtests/878014.html | (LOAD ONLY)
Depends on: 904839
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)