Closed Bug 898165 Opened 12 years ago Closed 10 years ago

Panda Android jobs show green on TBPL even if they were aborted due to Python errors

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jgriffin, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

Case in point: Android 4.0 rc1 rc2 jobs at https://tbpl.mozilla.org/?rev=a22f54b18a51&tree=Try are green, but the log shows: 13:22:41 INFO - IOError: [Errno 2] No such file or directory: u'/tmp/tmpmiMW9B/extensions/staged/worker-test@mozilla.org/chrome.manifest' 13:22:42 INFO - Automation Error: Exception caught while running tests log: https://tbpl.mozilla.org/php/getParsedLog.php?id=25731692&tree=Try&full=1#error0 Presumably, this is because we're not catching Python errors in the parser used in the mozharness script.
Is there any reason not to do this?
Attachment #781256 - Flags: review?(aki)
Assignee: nobody → jgriffin
Comment on attachment 781256 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, This will work. Whether it creates a bunch of unwanted noise is unknown. I do know we explicitly shifted from PythonErrorList to BaseErrorList somewhere before, due to unwanted noise and false positives; I suggest we bake this on Cedar or elsewhere to make sure this isn't the case. Also, are you more interested in marking these log lines as ERROR - or changing the status of the job to TBPL WARNING / TBPL FAILURE ? This will probably do the former (though we don't catch Automation Error: explicitly), but may not do what you want in the latter.
Attachment #781256 - Flags: review?(aki) → review+
Hmm, yes, I actually want to make TBPL go orange or red. It looks to me like that should work, but maybe I'm missing something. OutputParser.parse_single_line should increment self.num_errors when any of those errors are matched; DesktopUnittestOutputParser.evaluate_parser should return at least TBPL_FAILURE when self.num_errors > 0; and android_panda.py passes that result to self.buildbot_status. I'm happy to check it in to cedar (or ash) and see what happens!
I landed this on ash-mozharness and kicked off a run on ash. This should tell us whether or not this causes any unexpected problem. To test that it turns panda jobs orange, I'll have to check in a change on ash that causes the runner to throw an error.
Ah, this definitely doesn't work. It causes us to error on lots of TEST-KNOWN-FAILs that include Python-esque error statements, like: 12:33:56 ERROR - 14508 INFO TEST-KNOWN-FAIL | /tests/dom/tests/mochitest/dom-level1-core/test_attrdefaultvalue.html | [failure as todo] Test threw exception: TypeError: streetAttr is null Since TypeError is a Python error, it gets caught by the parser. :( We probably want to match Python error strings at the beginning of the string, which would require a little more work. I'll revert this patch on ash-mozharness for now.
Since error_lists are ordered, and we bail out after the first match, we could explicitly match for TEST-KNOWN-FAIL as an INFO near the top of the error_list, and then append the PythonErrorList stuff afterwards. But yeah, this was the type of thing I was alluding to in comment 2.
This version is currently running on ash and doesn't show any false positives. It just anchors SyntaxError and TypeError to the beginning of a string, to prevent it from being caught in either TEST-KNOWN-FAIL lines or in JavascriptError lines.
Attachment #783324 - Flags: review?(aki)
Attachment #781256 - Attachment is obsolete: true
Comment on attachment 783324 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, Except this will fail in cases where we have logging set timestamps ahead of output (e.g. some android code we use).
And in this case we want to match SyntaxError and/or TypeError? Can you give an example of this kind of output?
Nothing comes to mind as currently failing, but we've seen those errors in the past with things like verify.py (search for"Running command: ['/tools/buildbot/bin/python', '/builds/sut_tools/verify.py']") (notice the timestamps, those are directly from verify.py's output) Also installApp.py has had similar errors in the past (just a bit lower in that file). Again notice the timestamps. I don't say there are normal/expected issues _now_ but the point is that we explicitly want to be sure we catch these issues _later_ if code blows up and we don't, then we may not know there is a problem for hours or days
Comment on attachment 783324 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, I don't think this will affect anything else negatively, so sure.
Attachment #783324 - Flags: review?(aki) → review+
(In reply to Justin Wood (:Callek) from comment #10) > Nothing comes to mind as currently failing, but we've seen those errors in > the past with things like verify.py > > (search for"Running command: ['/tools/buildbot/bin/python', > '/builds/sut_tools/verify.py']") > > (notice the timestamps, those are directly from verify.py's output) > > Also installApp.py has had similar errors in the past (just a bit lower in > that file). Again notice the timestamps. > > > I don't say there are normal/expected issues _now_ but the point is that we > explicitly want to be sure we catch these issues _later_ if code blows up > and we don't, then we may not know there is a problem for hours or days So to address this, I think implementing aki's suggestion from comment #6 is probably the way to go. That is, instead of anchoring these errors to the beginning of the string, to explicitly match cases we don't care about, with level INFO (i.e., anything with Javascript Error or TEST-KNOWN-FAIL). Do you agree, Callek?
(In reply to Jonathan Griffin (:jgriffin) from comment #12) > So to address this, I think implementing aki's suggestion from comment #6 is > probably the way to go. That is, instead of anchoring these errors to the > beginning of the string, to explicitly match cases we don't care about, with > level INFO (i.e., anything with Javascript Error or TEST-KNOWN-FAIL). Do > you agree, Callek? I think that makes sense, I'm not personally sure how to do that reliably but I'm willing to let someone try it for sure!
Product: mozilla.org → Release Engineering
Unassigning, since I don't have time to wrap this up atm.
Assignee: jgriffin → nobody
closing since we are decomming pandas in bug 1186615
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: