Closed Bug 895966 Opened 6 years ago Closed 5 years ago

tbpl shows green for Android 4.0 rc2 job that failed with DMError


(Infrastructure & Operations :: CIDuty, task)

Not set


(Not tracked)



(Reporter: gbrown, Assigned: gbrown)


(Blocks 1 open bug)

Details has a failure at (middle job in the last group of 3 consecutive greens) but it shows as passed.

15:53:51     INFO -  Traceback (most recent call last):
15:53:51     INFO -    File "/builds/panda-0783/test/build/tests/mochitest/", line 636, in main
15:53:51     INFO -      dm.removeDir("/mnt/sdcard/Robotium-Screenshots")
15:53:51     INFO -    File "/builds/panda-0783/test/build/tests/mochitest/", line 422, in removeDir
15:53:51     INFO -      if self.dirExists(remoteDir):
15:53:51     INFO -    File "/builds/panda-0783/test/build/tests/mochitest/", line 391, in dirExists
15:53:51     INFO -      ret = self._runCmds([{ 'cmd': 'isdir ' + remotePath }]).strip()
15:53:51     INFO -    File "/builds/panda-0783/test/build/tests/mochitest/", line 152, in _runCmds
15:53:51     INFO -      self._sendCmds(cmdlist, outputfile, timeout, retryLimit=retryLimit)
15:53:51     INFO -    File "/builds/panda-0783/test/build/tests/mochitest/", line 134, in _sendCmds
15:53:51     INFO -      raise err
15:53:51     INFO -  DMError: Automation Error: Timeout in command isdir /mnt/sdcard/Robotium-Screenshots
16:04:33     INFO -  Automation Error: Exception caught while running tests

I assume this is fall-out from bug 829211...but I am not sure. - same tryrun, a run from somewhere in the middle of the panda restarts
OK, there are a couple of things wrong here.

First, the test harness seems to be exiting 1;

This block isn't setting tbpl_status to TBPL_WARNING (or even TBPL_FAILURE):

I think the check for code == 10 is wrong; we need to figure out the exit codes for mochitest/ and adjust accordingly.  If the exit codes match up to , then we can set self.return_code directly.

Once we get self.return_code set properly, either by directly setting |self.return_code = ___| or via self.buildbot_status(), the test run should go orange or red as needed.
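
A minimal sketch of the kind of mapping described above, assuming the harness exits 0 on success and nonzero otherwise. The function name, the `warning_codes` parameter, and the string constants are hypothetical stand-ins, not the real script's code, and the actual exit codes still need to be confirmed. The point is that a check for a single code such as 10 lets an exit of 1 fall through as green.

```python
# Hypothetical stand-ins for the TBPL_* status constants.
TBPL_SUCCESS = "SUCCESS"
TBPL_WARNING = "WARNING"
TBPL_FAILURE = "FAILURE"


def status_for_exit_code(code, warning_codes=(1,)):
    """Map a harness exit code to a job status.

    Assumption: 0 means success, codes listed in warning_codes mean
    test failures (orange), and any other code is treated as a
    harness failure (red). Nothing nonzero is ever reported green.
    """
    if code == 0:
        return TBPL_SUCCESS
    if code in warning_codes:
        return TBPL_WARNING
    return TBPL_FAILURE
```

With this shape, an unexpected exit code defaults to red instead of silently passing.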

Second, the Automation Error lines in
are INFO, not ERROR.  This isn't a terrible bug, but could be remedied by passing an error_list to this run_command:

error_lists look like this: an ordered list with substrings that match certain lines, and a level:
or with re.compile()d regexes:

So the error_list in this case might look like

    [{'substr': r'''Automation Error: ''', 'level': ERROR}]
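
To illustrate how such a list could be applied to log output, here is a self-contained sketch. It is not the mozharness implementation: `level_for_line`, the `regex` key name, and the string level constants are assumptions made for the example. Entries are checked in order and the first match decides the level.

```python
import re

# Hypothetical level names; the real constants come from the logging code in use.
INFO = "info"
ERROR = "error"

# Ordered list: each entry has either a substring or a compiled regex, plus a level.
error_list = [
    {"substr": "Automation Error: ", "level": ERROR},
    {"regex": re.compile(r"^DMError: "), "level": ERROR},
]


def level_for_line(line, error_list, default=INFO):
    """Return the level of the first entry in error_list that matches line."""
    for entry in error_list:
        if "substr" in entry and entry["substr"] in line:
            return entry["level"]
        if "regex" in entry and entry["regex"].search(line):
            return entry["level"]
    # No entry matched, so the line keeps the default level.
    return default
```

Under these assumptions, a line like "Automation Error: Exception caught while running tests" would be logged at ERROR rather than INFO.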
Hardware: x86 → ARM
Product: → Release Engineering
See Also: → 917578
Aki: did we do anything to address this on our side? 

Has there really been no recurrence since 2013-08-01, or is this one of those things that happens so often we don't report it any more?
Flags: needinfo?(aki)
(In reply to Chris Cooper [:coop] from comment #4)
> Has there really been no recurrence since 2013-08-01, or is this one of
> those things that happens so often we don't report it any more?

I think I have seen similar problems recently, but failed to report them (sorry). Also, I think this is a case where we are unlikely to notice a problem, since we often don't check logs for green jobs.
Blocks: 1048775
We no longer run robocop on 4.0. In general, it feels like we are doing better at "coloring" jobs appropriately...perhaps since the switch to treeherder? In any case, I don't see much remaining value in this bug.
Assignee: nobody → gbrown
Closed: 5 years ago
Resolution: --- → WORKSFORME
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations