833765 - Intermittent DMError: Automation Error: DeviceManager isdir returned null | Exception caught while running tests

Reporter

Description

•

13 years ago

https://tbpl.mozilla.org/php/getParsedLog.php?id=19042299&tree=Mozilla-Inbound Android 4.0 Panda mozilla-inbound opt test mochitest-8 on 2013-01-22 23:56:52 PST for push 2d5b54f00e2e slave: panda-0580 INFO | runtests.py | Installing extension at /builds/panda-0580/test/build/tests/mochitest/extensions/roboextender@mozilla.org to /tmp/tmpuhFNpb. INFO | runtests.py | Installing extension at /builds/panda-0580/test/build/tests/mochitest/extensions/specialpowers to /tmp/tmpuhFNpb. INFO | runtests.py | Installing extension at /builds/panda-0580/test/build/tests/mochitest/extensions/workerbootstrap to /tmp/tmpuhFNpb. INFO | runtests.py | Installing extension at /builds/panda-0580/test/build/tests/mochitest/extensions/worker to /tmp/tmpuhFNpb. Traceback (most recent call last): File "mochitest/runtestsremote.py", line 577, in main File "/builds/panda-0580/test/build/tests/mochitest/runtests.py", line 705, in runTests manifest = self.buildProfile(options) File "mochitest/runtestsremote.py", line 299, in buildProfile File "/builds/panda-0580/test/build/tests/mochitest/devicemanagerSUT.py", line 442, in removeDir if self.dirExists(remoteDir): File "/builds/panda-0580/test/build/tests/mochitest/devicemanagerSUT.py", line 398, in dirExists raise DMError('Automation Error: DeviceManager isdir returned null') DMError: Automation Error: DeviceManager isdir returned null Traceback (most recent call last): File "mochitest/runtestsremote.py", line 595, in <module> Automation Error: Exception caught while running tests main() File "mochitest/runtestsremote.py", line 581, in main mochitest.stopWebServer(options) File "mochitest/runtestsremote.py", line 292, in stopWebServer self.server.stop() AttributeError: 'MochiRemote' object has no attribute 'server' program finished with exit code 1 elapsedTime=6.779581 TinderboxPrint: mochitest-plain<br/><em class="testfail">T-FAIL</em> Unknown Error: command finished with exit code: 1

Comment hidden (Legacy TBPL/Treeherder Robot)

RyanVM https://tbpl.mozilla.org/php/getParsedLog.php?id=20450520&tree=Try Android 4.0 Panda try opt test robocop-2 on 2013-03-07 23:07:17 slave: panda-0551 DMError: Automation Error: Timeout in command exec "MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_HIDE_RESULTS_TABLE=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpDAFBBdpidlog,XPCOM_MEM_BLOAT_LOG=/mnt/sdcard/tests/profile/runtests_leaks.log" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec.tests.testSystemPages org.mozilla.roboexample.test/org.mozilla.fennec.FennecInstrumentationTestRunner raise DMError('Automation Error: DeviceManager isdir returned null') DMError: Automation Error: DeviceManager isdir returned null Automation Error: Exception caught while running tests Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 110] Connection timed out Only displaying first 20 of 24 failures - .

Comment hidden (Legacy TBPL/Treeherder Robot)

RyanVM https://tbpl.mozilla.org/php/getParsedLog.php?id=20570957&tree=Mozilla-Inbound Android 4.0 Panda mozilla-inbound opt test robocop-2 on 2013-03-12 09:22:54 slave: panda-0798 DMError: Automation Error: Timeout in command exec "MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_HIDE_RESULTS_TABLE=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpQvD03dpidlog,XPCOM_MEM_BLOAT_LOG=/mnt/sdcard/tests/profile/runtests_leaks.log" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec.tests.testSystemPages org.mozilla.roboexample.test/org.mozilla.fennec.FennecInstrumentationTestRunner raise DMError('Automation Error: DeviceManager isdir returned null') DMError: Automation Error: DeviceManager isdir returned null Automation Error: Exception caught while running tests Remote Device Error: unable to connect socket: [Errno 110] Connection timed out Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Only displaying first 20 of 24 failures - .

Comment hidden (Legacy TBPL/Treeherder Robot)

philor https://tbpl.mozilla.org/php/getParsedLog.php?id=20570957&tree=Mozilla-Inbound Android 4.0 Panda mozilla-inbound opt test robocop-2 on 2013-03-12 09:22:54 slave: panda-0798 DMError: Automation Error: Timeout in command exec "MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_HIDE_RESULTS_TABLE=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpQvD03dpidlog,XPCOM_MEM_BLOAT_LOG=/mnt/sdcard/tests/profile/runtests_leaks.log" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec.tests.testSystemPages org.mozilla.roboexample.test/org.mozilla.fennec.FennecInstrumentationTestRunner raise DMError('Automation Error: DeviceManager isdir returned null') DMError: Automation Error: DeviceManager isdir returned null Automation Error: Exception caught while running tests Remote Device Error: unable to connect socket: [Errno 110] Connection timed out Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Only displaying first 20 of 24 failures - .

Comment hidden (Legacy TBPL/Treeherder Robot)

Geoff Brown [:gbrown]

Assignee

Comment 17

•

12 years ago

Almost all of the recent failures occur in robocop's testSystemPages and show a reboot. Many logs have: W/Watchdog( 1403): *** WATCHDOG KILLING SYSTEM PROCESS: null just before the reboot.

Comment hidden (Legacy TBPL/Treeherder Robot)

philor https://tbpl.mozilla.org/php/getParsedLog.php?id=20802906&tree=Try Android 4.0 Panda try opt test robocop-2 on 2013-03-18 17:24:28 slave: panda-0806 DMError: Automation Error: Timeout in command exec "MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_HIDE_RESULTS_TABLE=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpTCEI5Ypidlog,XPCOM_MEM_BLOAT_LOG=/mnt/sdcard/tests/profile/runtests_leaks.log" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec.tests.testSystemPages org.mozilla.roboexample.test/org.mozilla.fennec.FennecInstrumentationTestRunner raise DMError('Automation Error: DeviceManager isdir returned null') DMError: Automation Error: DeviceManager isdir returned null Automation Error: Exception caught while running tests Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 113] No route to host Remote Device Error: unable to connect socket: [Errno 110] Connection timed out Only displaying first 20 of 54 failures - .

Geoff Brown [:gbrown]

Assignee

Comment 22

•

12 years ago

Also (Comments 18, 21): *** WATCHDOG KILLING SYSTEM PROCESS: com.android.server.PowerManagerService

Geoff Brown [:gbrown]

Assignee

Comment 23

•

12 years ago

I am not sure about the cause of the WATCHDOG messages, but.... Notice that in most of these logs, the first thing that goes wrong is not "isdir returned null" but "Timeout in command exec". Here's a try push that shows the same thing happening during a special long-running test with devicemanagerSUT debug level 4: https://tbpl.mozilla.org/?tree=Try&rev=f4960773a683. Here we see that we issue a sut exec call and wait for response: sent cmd: exec "MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_HIDE_RESULTS_TABLE=1,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmp0AILfZpidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmpDYoKnO/runtests_leaks.log" am instrument -w -e deviceroot /mnt/sdcard/tests -e class org.mozilla.fennec.tests.testLong2 org.mozilla.roboexample.test/org.mozilla.fennec.FennecInstrumentationTestRunner recv'ing... ... (about 300 lines) recv'ing... recv'ing... INFO | runtests.py | Received unexpected exception while running application Traceback (most recent call last): File "/builds/panda-0559/test/build/tests/mochitest/runtests.py", line 768, in runTests onLaunch=onLaunch) File "/builds/panda-0559/test/build/tests/mochitest/automation.py", line 1207, in runApp stderr = subprocess.STDOUT) File "/builds/panda-0559/test/build/tests/mochitest/remoteautomation.py", line 175, in Process return self.RProcess(self._devicemanager, cmd, stdout, stderr, env, cwd) File "/builds/panda-0559/test/build/tests/mochitest/remoteautomation.py", line 185, in __init__ self.proc = dm.launchProcess(cmd, stdout, cwd, env, True) File "/builds/panda-0559/test/build/tests/mochitest/devicemanagerSUT.py", line 519, in launchProcess self.fireProcess(cmdline, failIfRunning) File "/builds/panda-0559/test/build/tests/mochitest/devicemanagerSUT.py", line 472, in fireProcess self._runCmds([{ 'cmd': 'exec ' + appname }]) File "/builds/panda-0559/test/build/tests/mochitest/devicemanagerSUT.py", line 151, in _runCmds self._sendCmds(cmdlist, outputfile, timeout, retryLimit=retryLimit) File "/builds/panda-0559/test/build/tests/mochitest/devicemanagerSUT.py", line 132, in _sendCmds raise err DMError: Automation Error: Timeout in command exec This does not always fail this way. In particular, it usually succeeds on tegras, and often fails on pandas. devicemanagerSUT is waiting for the $> prompt response for 300 seconds. sutagent is also waiting 300 s for program completion: https://hg.mozilla.org/mozilla-central/file/40a228f74389/build/mobile/sutagent/android/DoCommand.java#l3734 I'm thinking that increasing the devicemanagerSUT timeout by a few seconds will help us on long-running (more than 5 minutes) tests.

Assignee: nobody → gbrown

Geoff Brown [:gbrown]

Assignee

Comment 24

•

12 years ago

Special long-running test, before: https://tbpl.mozilla.org/?tree=Try&rev=f4960773a683 And with devicemanagerSUT timeout bumped up from 300 to 320 seconds: https://tbpl.mozilla.org/?tree=Try&rev=165cb260ff89 And here we check it against the full set of tests: https://tbpl.mozilla.org/?tree=Try&rev=995bc18baa6d There are still panda rc failures here, but no "isdir returned null" or "Timeout in command exec" (so far, but I'll retry some more).

Geoff Brown [:gbrown]

Assignee

Comment 25

•

12 years ago

Attached patch increase dmSUT default timeout from 300 to 320 s — Details — Splinter Review

Why 320? Since sutagent is waiting in 10s increments, I want an extra 10s in case there is an extra increment somehow. And then I added another 10s for network delay, etc. It's more than needed, but still only a 7% increase.

Attachment #734077 - Flags: review?(jmaher)

Attachment #734077 - Flags: feedback?(wlachance)

Joel Maher ( :jmaher ) (UTC -8)

Comment 26

•

12 years ago

Comment on attachment 734077 [details] [diff] [review] increase dmSUT default timeout from 300 to 320 s Review of attachment 734077 [details] [diff] [review]: ----------------------------------------------------------------- this is only a robocop issue as mochitest/talos/reftest all fire the process and then poll for the process. Robocop (due to the instrumentation interface) does not return from the fireCommand until it is completely done. So this patch makes sense. Could we instead do this inside of runtestsremote.py: if robocop: dm.default_timeout = 320 If we could go that route I would be happier. If it doesn't work, I am fine with this patch. r+ as the solution is valid.

Attachment #734077 - Flags: review?(jmaher) → review+

Comment hidden (Legacy TBPL/Treeherder Robot)

Geoff Brown [:gbrown]

Assignee

Comment 28

•

12 years ago

Attached patch increase dmSUT default timeout during robocop tests only — Details — Splinter Review

As suggested, here is a patch that limits the change to robocop tests. It seems to work fine. I am a little concerned that there might be some way to hit the 300 s timeout during other test types, but I think either approach is acceptable.

Attachment #734702 - Flags: review?(jmaher)

Attachment #734702 - Flags: feedback?(wlachance)

Joel Maher ( :jmaher ) (UTC -8)

Comment 29

•

12 years ago

Comment on attachment 734702 [details] [diff] [review] increase dmSUT default timeout during robocop tests only Review of attachment 734702 [details] [diff] [review]: ----------------------------------------------------------------- I like this better, something to keep in mind for other harnesses, but I would rather hack it from the harness level than DM.

Attachment #734702 - Flags: review?(jmaher) → review+

Geoff Brown [:gbrown]

Assignee

Comment 30

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/98b3cdc05f18

Comment hidden (Legacy TBPL/Treeherder Robot)

Ryan VanderMeulen [:RyanVM]

Reporter

Comment 32

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/98b3cdc05f18

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla23

William Lachance (:wlach)

Comment 33

•

12 years ago

Comment on attachment 734077 [details] [diff] [review] increase dmSUT default timeout from 300 to 320 s Makes sense

Attachment #734077 - Flags: feedback?(wlachance) → feedback+

William Lachance (:wlach)

Comment 34

•

12 years ago

Comment on attachment 734702 [details] [diff] [review] increase dmSUT default timeout during robocop tests only Makes sense

Attachment #734702 - Flags: feedback?(wlachance) → feedback+

Comment hidden (Legacy TBPL/Treeherder Robot)

increase dmSUT default timeout from 300 to 320 s 12 years ago Geoff Brown [:gbrown] 1.15 KB, patch	jmaher : review+ wlach : feedback+	Details \| Diff \| Splinter Review
increase dmSUT default timeout during robocop tests only 12 years ago Geoff Brown [:gbrown] 990 bytes, patch	jmaher : review+ wlach : feedback+	Details \| Diff \| Splinter Review