Closed Bug 686378 Opened 14 years ago Closed 12 years ago

Intermittent Android exception "unable to launch process"

Categories

(Release Engineering :: General, defect, P3)

ARM
Android
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: philor, Unassigned)

Details

(Keywords: intermittent-failure, Whiteboard: [android_tier_1])

https://tbpl.mozilla.org/php/getParsedLog.php?id=6381107
Android Tegra 250 mozilla-inbound opt test jsreftest-2 on 2011-09-12 12:15:02 PDT for push 88e23391bc2c
REFTEST INFO | runreftest.py | Performing extension manager registration: start.
FIRE PROC: '"MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpm15G2Apidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmpXKPRHM/runreftest_leaks.log" org.mozilla.fennec -no-remote -profile /mnt/sdcard/tests/reftest/profile/ -silent'
DeviceManager: error pulling file: No such file or directory
Traceback (most recent call last):
  File "reftest/remotereftest.py", line 447, in <module>
    main()
  File "reftest/remotereftest.py", line 443, in main
    reftest.runTests(manifest, options)
  File "/builds/tegra-040/test/build/tests/reftest/runreftest.py", line 154, in runTests
    self.registerExtension(browserEnv, options, profileDir)
  File "reftest/remotereftest.py", line 361, in registerExtension
    maxTime = 20)
  File "/builds/tegra-040/test/build/tests/reftest/automation.py", line 857, in runApp
    stderr = subprocess.STDOUT)
  File "/builds/tegra-040/test/build/tests/reftest/remoteautomation.py", line 131, in Process
    return self.RProcess(self._devicemanager, cmd, stdout, stderr, env, cwd)
  File "/builds/tegra-040/test/build/tests/reftest/remoteautomation.py", line 142, in __init__
    raise Exception("unable to launch process")
Exception: unable to launch process
command timed out: 2400 seconds without output
remoteFailed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion. ]
[Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion. ]
This one might actually be the P1, because the timeout is failing and it's why bear has to constantly manually poke things:
https://tbpl.mozilla.org/php/getParsedLog.php?id=6384507&tree=Mozilla-Inbound
INFO | runtests.py | Running tests: start.
FIRE PROC: '"MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmp405q_9pidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmpJcwxj_/runtests_leaks.log" org.mozilla.fennec -no-remote -profile /mnt/sdcard/tests/profile/ http://mochi.test:8888/tests/content/smil/test?autorun=1&closeWhenDone=1&logFile=%2Fmnt%2Fsdcard%2Ftests%2Flogs%2Fmochitest.log&fileLevel=INFO&consoleLevel=INFO'
Traceback (most recent call last):
  File "mochitest/runtestsremote.py", line 351, in <module>
    main()
  File "mochitest/runtestsremote.py", line 348, in main
    sys.exit(mochitest.runTests(options))
  File "/builds/tegra-061/test/build/tests/mochitest/runtests.py", line 669, in runTests
    timeout = timeout)
  File "/builds/tegra-061/test/build/tests/mochitest/automation.py", line 857, in runApp
    stderr = subprocess.STDOUT)
  File "/builds/tegra-061/test/build/tests/mochitest/remoteautomation.py", line 131, in Process
    return self.RProcess(self._devicemanager, cmd, stdout, stderr, env, cwd)
  File "/builds/tegra-061/test/build/tests/mochitest/remoteautomation.py", line 142, in __init__
    raise Exception("unable to launch process")
Exception: unable to launch process
command timed out: 2400 seconds without output
remoteFailed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion. ]
[Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion. ]
======== Finished 'python mochitest/runtestsremote.py ...' interrupted (results: 4, elapsed: 9 hrs, 17 mins, 5 secs) ========
I'm pretty sure that the "command timed out: 2400 seconds without output" is supposed to be followed by something other than sitting around for 9 hours waiting to be killed.
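For illustration only, the behaviour expected after a "seconds without output" timeout can be sketched as a small watchdog that kills the child once it has been silent too long. This is not the buildbot implementation and not code from this bug; `run_with_output_timeout` and `max_silence` are hypothetical names, and only the Python standard library is assumed.

```python
import subprocess
import threading
import time


def run_with_output_timeout(cmd, max_silence):
    """Run cmd; kill it if it prints nothing for max_silence seconds.

    Hypothetical helper. Returns (timed_out, returncode, output_lines).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    last_output = [time.monotonic()]  # one-element list so the thread can update it
    lines = []

    def pump():
        # Record every output line and the time it arrived.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            lines.append(line)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    timed_out = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > max_silence:
            # The step the log above appears to be missing: actually kill the job.
            timed_out = True
            proc.kill()
            break
        time.sleep(0.05)

    proc.wait()              # reap the child so it does not linger
    reader.join(timeout=1)   # the pipe closes once the child is gone
    return timed_out, proc.returncode, lines
```

With something like this in place, a silent job would end shortly after the timeout fires instead of sitting until someone kills it by hand.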
And if my theory that these are why bear has to wake up and kill stuck Tegras every two hours seven days a week is right, and if my theory that https://tbpl.mozilla.org/php/getParsedLog.php?id=6406159&tree=Mozilla-Inbound is one of these, then I think maybe the patch for bug 650535 means that now bear gets to actually sleep through the night, and maybe even take a whole day off.
The new flavor being https://tbpl.mozilla.org/php/getParsedLog.php?id=6405989&tree=Mozilla-Inbound
INFO | runtests.py | Running tests: start.
FIRE PROC: '"MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpaGEdHYpidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmpCRAj8B/runtests_leaks.log" org.mozilla.fennec -no-remote -profile /mnt/sdcard/tests/profile/ http://mochi.test:8888/tests/dom/tests/mochitest/dom-level2-html?autorun=1&closeWhenDone=1&logFile=%2Fmnt%2Fsdcard%2Ftests%2Flogs%2Fmochitest.log&fileLevel=INFO&consoleLevel=INFO'
INFO | runtests.py | Received unexpected exception while running application 'unable to launch process'
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!
INFO | runtests.py | Running tests: end.
DeviceManager: error pulling file: No such file or directory
removing file: /mnt/sdcard/tests/logs/mochitest.log
program finished with exit code 1
elapsedTime=43.648868
ok, this is good. The idea here is that we catch this exception and clean up, so we don't leave processes and tmp directories around. As a result we should see fewer "timed out waiting for server to startup" messages and other random problems.
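A minimal sketch of that catch-and-clean-up pattern, for illustration only. It is not the patch from bug 650535: `LaunchError`, `run_with_cleanup` and the `launch` callable are hypothetical names, and the process object is assumed to behave like `subprocess.Popen` (`wait`, `poll`, `kill`).

```python
import shutil
import tempfile


class LaunchError(Exception):
    """Hypothetical stand-in for Exception("unable to launch process")."""


def run_with_cleanup(launch, log=print):
    """Call launch(tmp_dir) and always clean up afterwards.

    Returns (status, tmp_dir); status is 1 when the launch fails,
    otherwise the exit status of the launched process.
    """
    tmp_dir = tempfile.mkdtemp()
    proc = None
    status = 1
    try:
        proc = launch(tmp_dir)  # assumed to return a Popen-like object
        status = proc.wait()
    except LaunchError as e:
        # Report the failure and fall through to cleanup instead of
        # letting the traceback end the run with everything left behind.
        log("INFO | Received unexpected exception while running application '%s'" % e)
    finally:
        # Do not leave a stray process or tmp directory around.
        if proc is not None and proc.poll() is None:
            proc.kill()
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return status, tmp_dir
```

The point of the `finally` block is that cleanup runs on both the success and the failure path, so a failed launch ends quickly with exit status 1 rather than hanging.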
Priority: -- → P3
can we resolve this now?
I'm seeing it on try builds still - is this something that we just need to get devs to update their try repo m-c clones?
no reports since 9/20, resolving as fixed; reopen if you see it again
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
This seems to have occurred again, on slave tegra-086: https://tbpl.mozilla.org/php/getParsedLog.php?id=9301891&tree=Try&full=1
INFO | runtests.py | Received unexpected exception while running application 'unable to launch process'
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Resolving WFM keyword:intermittent-failure bugs last modified >3 months ago, whose whiteboard contains none of: {random,disabled,marked,fuzzy,todo,fails,failing,annotated,time-bomb,leave open} There will inevitably be some false positives; for that (and the bugspam) I apologise. Filter on orangewfm.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 12 years ago
Resolution: --- → WORKSFORME
Product: mozilla.org → Release Engineering