Bug 763527
Opened 12 years ago
Closed 12 years ago
Investigate failure of mochitest chunks on B2G to start
Categories
(Testing :: Mochitest, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: jgriffin, Unassigned)
We now have mochitest-plain running on B2G CI (bug 759887). However, we're seeing an issue in which one of the 8 chunks sometimes fails to start, and the test job is then killed by a timeout. For the chunks that failed to start, we're successfully copying the test profile to the emulator and restarting B2G. I suspect the failure occurs because the request from Marionette to navigate to the mochitest URL arrives too early, before Gecko is able to act on it.
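If the navigation request really is arriving before Gecko is ready, one option is to poll a readiness check before sending it rather than relying on a fixed delay. A minimal sketch of that idea follows; `wait_until` and the readiness predicate are hypothetical helpers for illustration and are not part of the mochitest harness or Marionette.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Call predicate() until it returns a truthy value or timeout elapses.

    Returns True if the predicate succeeded, False on timeout.
    `sleep` is injectable so the helper can be tested without real delays.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

The harness would call something like `wait_until(gecko_is_ready)` before navigating, where `gecko_is_ready` is whatever check turns out to be reliable on the emulator. What that check should be is still an open question in this bug.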
Comment 1 (Reporter) • 12 years ago
It looks like chunk 8 is often being interrupted by a 90-minute timeout I had set up on the mochitest-plain job. I've just increased this to 120 minutes to see if that resolves the problem. The VM that runs both the builds and the mochitests doesn't really have the capacity to do both. We're going to have to expand our VM capacity to handle mochitests and reftests; I'll handle this in a separate bug.
Comment 2 (Reporter) • 12 years ago
I caught this problem occurring on the VM. When I used adb to look at the file system of the running emulator, I saw:
- a marionette.log in /data/b2g/mozilla/{profile}
- no marionette.log in /data/local/tests/profile
From this I infer that we're either not successfully restarting B2G (so that it starts up with the test profile), or the automation is getting stuck before ever getting to that point.
Comment 3 (Reporter) • 12 years ago
One more data point: on the server, I see these processes:
jenkins 8449 8183 0 21:12 ? 00:00:00 /data/jenkins/jobs/mochitest-plain/workspace/b2g-distro/out/host/linux-x86/bin/adb logcat
jenkins 8450 8183 0 21:12 ? 00:00:00 [adb] <defunct>
The <defunct> adb makes me think adb has crashed; I've seen this on my own machine occasionally.
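A check for defunct adb processes could be automated by scanning `ps -ef` output. The sketch below assumes the standard eight-column `ps -ef` layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD); `find_defunct` is a hypothetical helper, not something the CI currently runs.

```python
def find_defunct(ps_output, name="adb"):
    """Return PIDs of defunct (zombie) processes whose command mentions `name`.

    Expects text in `ps -ef` format; lines with fewer than eight fields
    and the header line are skipped.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # keep the full command in the last field
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        command = fields[7]
        if command.rstrip().endswith("<defunct>") and name in command:
            pids.append(int(fields[1]))
    return pids
```

Fed the two lines above, this reports only PID 8450. On the server the input could come from `subprocess.check_output(["ps", "-ef"], text=True)`.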
Comment 4 (Reporter) • 12 years ago
I've changed the way mochitests are run in the CI to try and debug what's going on. The last two failures produced this output:

INFO | runtests.py | Received unexpected exception while running application
Traceback (most recent call last):
  File "/data/jenkins/workspace/mochitest/objdir-gecko/_tests/testing/mochitest/runtests.py", line 677, in runTests
    timeout = timeout)
  File "/data/jenkins/workspace/mochitest/objdir-gecko/_tests/testing/mochitest/automation.py", line 900, in runApp
    stderr = subprocess.STDOUT)
  File "/data/jenkins/workspace/mochitest/objdir-gecko/_tests/testing/mochitest/b2gautomation.py", line 205, in Process
    session = self.marionette.start_session()
  File "/data/jenkins/workspace/mochitest/venv/src/marionette/marionette/marionette.py", line 218, in start_session
    self.session = self._send_message('newSession', 'value')
  File "/data/jenkins/workspace/mochitest/venv/src/marionette/marionette/marionette.py", line 140, in _send_message
    raise TimeoutException(message='socket.timeout', status=ErrorCodes.TIMEOUT, stacktrace=None)
TimeoutException: socket.timeout
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!

I think this may be another case of bug 753273. In any case, I'm going to add a sleep to try and resolve this.
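An alternative to a single fixed sleep would be to retry the session start a few times with a growing delay. The following is only a sketch of that approach, not what was landed for this bug. `retry` is a hypothetical helper, and it catches `socket.timeout` as a stand-in; a real caller would pass the Marionette client's `TimeoutException` class seen in the traceback.

```python
import socket
import time


def retry(func, attempts=5, delay=2.0, backoff=2.0,
          exceptions=(socket.timeout,), sleep=time.sleep):
    """Call func() until it succeeds, sleeping between failed attempts.

    The delay is multiplied by `backoff` after each failure. If every
    attempt fails, the last exception is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
```

In b2gautomation.py the call would then look roughly like `retry(self.marionette.start_session, exceptions=(TimeoutException,))`. Retrying only helps if the timeout is a startup race; it will not help if adb itself has died.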
Comment 5 (Reporter) • 12 years ago
http://hg.mozilla.org/mozilla-central/rev/515c5d751c5e - increase some sleeps to see if it resolves the chunk timeout problems
Comment 6 (Reporter) • 12 years ago
The above doesn't appear to have helped. I'm going to have to manually run mochitests on the CI VM and try to catch it in the act.
Comment 7 • 12 years ago
I'm going to dupe bug 778249 to bug 777714 rather than having two releng tracking bugs.
Comment 8 • 12 years ago
Is this bug for pandas, or for B2G testing on VMs?
Comment 9 (Reporter) • 12 years ago
This is for the old testing on Amazon AWS VMs and isn't relevant any longer.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME