Closed
Bug 895966
Opened 12 years ago
Closed 10 years ago
tbpl shows green for Android 4.0 rc2 job that failed with DMError
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: gbrown, Assigned: gbrown)
References
(Blocks 1 open bug)
Details
https://tbpl.mozilla.org/?tree=Try&rev=580e1d322dbe has a failure at https://tbpl.mozilla.org/php/getParsedLog.php?id=25458214&tree=Try&full=1 (the middle job in the last group of 3 consecutive greens), but it shows as passed.
15:53:51 INFO - Traceback (most recent call last):
15:53:51 INFO - File "/builds/panda-0783/test/build/tests/mochitest/runtestsremote.py", line 636, in main
15:53:51 INFO - dm.removeDir("/mnt/sdcard/Robotium-Screenshots")
15:53:51 INFO - File "/builds/panda-0783/test/build/tests/mochitest/devicemanagerSUT.py", line 422, in removeDir
15:53:51 INFO - if self.dirExists(remoteDir):
15:53:51 INFO - File "/builds/panda-0783/test/build/tests/mochitest/devicemanagerSUT.py", line 391, in dirExists
15:53:51 INFO - ret = self._runCmds([{ 'cmd': 'isdir ' + remotePath }]).strip()
15:53:51 INFO - File "/builds/panda-0783/test/build/tests/mochitest/devicemanagerSUT.py", line 152, in _runCmds
15:53:51 INFO - self._sendCmds(cmdlist, outputfile, timeout, retryLimit=retryLimit)
15:53:51 INFO - File "/builds/panda-0783/test/build/tests/mochitest/devicemanagerSUT.py", line 134, in _sendCmds
15:53:51 INFO - raise err
15:53:51 INFO - DMError: Automation Error: Timeout in command isdir /mnt/sdcard/Robotium-Screenshots
16:04:33 INFO - Automation Error: Exception caught while running tests
I assume this is fallout from bug 829211...but I am not sure.
Comment 1•12 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=25439981&tree=Try - same try run; this one is from somewhere in the middle of the panda restarts
Comment 2•12 years ago
OK, there are a couple of things wrong here.
First, the test harness seems to be exiting with code 1:
https://tbpl.mozilla.org/php/getParsedLog.php?id=25458214&tree=Try#error2
This block isn't setting tbpl_status to TBPL_WARNING (or even TBPL_FAILURE):
http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/scripts/b2g_panda.py#l141
I think the check for code == 10 is wrong; we need to figure out the exit codes for mochitest/runtestsremote.py and adjust accordingly. If the exit codes match up to http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/mozharness/mozilla/buildbot.py#l40, then we can set self.return_code directly.
Once we get self.return_code set properly, either by directly setting |self.return_code = ___| or via self.buildbot_status():
http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/mozharness/mozilla/buildbot.py#l69
the test run should go orange or red as needed.
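To make that concrete, here is a minimal sketch of what the block in b2g_panda.py might look like once it maps the harness exit code to a buildbot status. This is illustration only: the meaning of code == 10 still needs to be verified against runtestsremote.py, and cmd/env stand in for whatever the real call passes.

from mozharness.mozilla.buildbot import TBPL_SUCCESS, TBPL_WARNING, TBPL_FAILURE

# Run the harness and translate its exit code into a tbpl status.
code = self.run_command(cmd, env=env)
if code == 0:
    self.buildbot_status(TBPL_SUCCESS)
elif code == 10:  # assumed "tests failed" exit code; needs verifying
    self.buildbot_status(TBPL_WARNING)  # orange
else:
    self.buildbot_status(TBPL_FAILURE)  # red: harness crash, DMError, etc.

Since buildbot_status() updates self.return_code for us (per the link above), the run's final color should then follow automatically.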
Second, the Automation Error lines in
https://tbpl.mozilla.org/php/getParsedLog.php?id=25458214&tree=Try&full=1#error2
are logged at INFO, not ERROR. This isn't a terrible bug, but it could be remedied by passing an error_list to this run_command:
http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/scripts/b2g_panda.py#l141
error_lists look like this: an ordered list of dicts, each with a substring that matches certain output lines and a level to log them at:
http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/mozharness/base/errors.py#l37
or with re.compile()d regexes: http://hg.mozilla.org/build/mozharness/file/e7e6e4dbcbe7/mozharness/base/errors.py#l73
So the error_list in this case might look like
[{'substr': r'''Automation Error: ''', 'level': ERROR}]
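Wiring that up might look something like the following sketch; again, cmd and env are placeholders for whatever arguments the real run_command call already passes.

import re
from mozharness.base.log import ERROR

automation_error_list = [
    {'substr': 'Automation Error: ', 'level': ERROR},
    # compiled-regex entries work too, per mozharness/base/errors.py:
    {'regex': re.compile('^DMError'), 'level': ERROR},
]
code = self.run_command(cmd, env=env, error_list=automation_error_list)

With that in place, the "Automation Error:" lines should show up at ERROR level and be picked up by the log parser.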
Updated•12 years ago
Hardware: x86 → ARM
Assignee
Comment 3•12 years ago
Updated•12 years ago
Product: mozilla.org → Release Engineering
Comment 4•12 years ago
Aki: did we do anything to address this on our side?
Has there really been no recurrence since 2013-08-01, or is this one of those things that happens so often we don't report it any more?
Flags: needinfo?(aki)
Comment 5•12 years ago
https://github.com/mozilla/mozbase/commit/73f160b63b9232d0afe347fe9ed6a7b079b09ef8 may have helped here.
Flags: needinfo?(aki)
Assignee
Comment 6•12 years ago
(In reply to Chris Cooper [:coop] from comment #4)
> Has there really been no recurrence since 2013-08-01, or is this one of
> those things that's happens so often we don't report it any more?
I think I have seen similar problems recently, but failed to report them (sorry). Also, I think this is a case where we are unlikely to notice a problem, since we often don't check logs for green jobs.
Assignee
Comment 7•10 years ago
We no longer run robocop on 4.0. In general, it feels like we are doing better at "coloring" jobs appropriately...perhaps since the switch to treeherder? In any case, I don't see much remaining value in this bug.
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard