Closed Bug 898165 Opened 12 years ago Closed 10 years ago

Panda Android jobs show green on TBPL even if they were aborted due to Python errors

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

People

(Reporter: jgriffin, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

Use PythonErrorList instead of BaseErrorList, 12 years ago Jonathan Griffin (:jgriffin) 1.35 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review
Use PythonErrorList instead of BaseErrorList, 12 years ago Jonathan Griffin (:jgriffin) 2.22 KB, patch	mozilla : review+	Details \| Diff \| Splinter Review

Jonathan Griffin (:jgriffin)

Reporter

Description

•

12 years ago

Case in point: Android 4.0 rc1 rc2 jobs at https://tbpl.mozilla.org/?rev=a22f54b18a51&tree=Try are green, but the log shows: 13:22:41 INFO - IOError: [Errno 2] No such file or directory: u'/tmp/tmpmiMW9B/extensions/staged/worker-test@mozilla.org/chrome.manifest' 13:22:42 INFO - Automation Error: Exception caught while running tests log: https://tbpl.mozilla.org/php/getParsedLog.php?id=25731692&tree=Try&full=1#error0 Presumably, this is because we're not catching Python errors in the parser used in the mozharness script.

Jonathan Griffin (:jgriffin)

Reporter

Comment 1

•

12 years ago

Attached patch Use PythonErrorList instead of BaseErrorList, (obsolete) — Details — Splinter Review

Is there any reason not to do this?

Attachment #781256 - Flags: review?(aki)

Jonathan Griffin (:jgriffin)

Reporter

Updated

•

12 years ago

Assignee: nobody → jgriffin

Aki Sasaki (not active)

Comment 2

•

12 years ago

Comment on attachment 781256 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, This will work. Whether it creates a bunch of unwanted noise is unknown. I do know we explicitly shifted from PythonErrorList to BaseErrorList somewhere before, due to unwanted noise and false positives; I suggest we bake this on Cedar or elsewhere to make sure this isn't the case. Also, are you more interested in marking these log lines as ERROR - or changing the status of the job to TBPL WARNING / TBPL FAILURE ? This will probably do the former (though we don't catch Automation Error: explicitly), but may not do what you want in the latter.

Attachment #781256 - Flags: review?(aki) → review+

Jonathan Griffin (:jgriffin)

Reporter

Comment 3

•

12 years ago

Hmm, yes, I actually want to make TBPL go orange or red. It looks to me like that should work, but maybe I'm missing something. OutputParser.parse_single_line should increment self.num_errors when any of those errors are matched; DesktopUnittestOutputParser.evaluate_parser should return at least TBPL_FAILURE when self.num_errors > 0; and android_panda.py passes that result to self.buildbot_status. I'm happy to check it in to cedar (or ash) and see what happens!

Jonathan Griffin (:jgriffin)

Reporter

Comment 4

•

12 years ago

I landed this on ash-mozharness and kicked off a run on ash. This should tell us whether or not this causes any unexpected problem. To test that it turns panda jobs orange, I'll have to check in a change on ash that causes the runner to throw an error.

Jonathan Griffin (:jgriffin)

Reporter

Comment 5

•

12 years ago

Ah, this definitely doesn't work. It causes us to error on lots of TEST-KNOWN-FAILs that include Python-esque error statements, like: 12:33:56 ERROR - 14508 INFO TEST-KNOWN-FAIL | /tests/dom/tests/mochitest/dom-level1-core/test_attrdefaultvalue.html | [failure as todo] Test threw exception: TypeError: streetAttr is null Since TypeError is a Python error, it gets caught by the parser. :( We probably want to match Python error strings at the beginning of the string, which would require a little more work. I'll revert this patch on ash-mozharness for now.

Aki Sasaki (not active)

Comment 6

•

12 years ago

Since error_lists are ordered, and we bail out after the first match, we could explicitly match for TEST-KNOWN-FAIL as an INFO near the top of the error_list, and then append the PythonErrorList stuff afterwards. But yeah, this was the type of thing I was alluding to in comment 2.

Jonathan Griffin (:jgriffin)

Reporter

Comment 7

•

12 years ago

Attached patch Use PythonErrorList instead of BaseErrorList, — Details — Splinter Review

This version is currently running on ash and doesn't show any false positives. It just anchors SyntaxError and TypeError to the beginning of a string, to prevent it from being caught in either TEST-KNOWN-FAIL lines or in JavascriptError lines.

Attachment #783324 - Flags: review?(aki)

Jonathan Griffin (:jgriffin)

Reporter

Updated

•

12 years ago

Attachment #781256 - Attachment is obsolete: true

Justin Wood (:Callek)

Comment 8

•

12 years ago

Comment on attachment 783324 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, Except this will fail in cases where we have logging set timestamps ahead of output (e.g. some android code we use).

Jonathan Griffin (:jgriffin)

Reporter

Comment 9

•

12 years ago

And in this case we want to match SyntaxError and/or TypeError? Can you give an example of this kind of output?

Justin Wood (:Callek)

Comment 10

•

12 years ago

Nothing comes to mind as currently failing, but we've seen those errors in the past with things like verify.py (search for"Running command: ['/tools/buildbot/bin/python', '/builds/sut_tools/verify.py']") (notice the timestamps, those are directly from verify.py's output) Also installApp.py has had similar errors in the past (just a bit lower in that file). Again notice the timestamps. I don't say there are normal/expected issues _now_ but the point is that we explicitly want to be sure we catch these issues _later_ if code blows up and we don't, then we may not know there is a problem for hours or days

Aki Sasaki (not active)

Comment 11

•

12 years ago

Comment on attachment 783324 [details] [diff] [review] Use PythonErrorList instead of BaseErrorList, I don't think this will affect anything else negatively, so sure.

Attachment #783324 - Flags: review?(aki) → review+

Jonathan Griffin (:jgriffin)

Reporter

Comment 12

•

12 years ago

(In reply to Justin Wood (:Callek) from comment #10) > Nothing comes to mind as currently failing, but we've seen those errors in > the past with things like verify.py > > (search for"Running command: ['/tools/buildbot/bin/python', > '/builds/sut_tools/verify.py']") > > (notice the timestamps, those are directly from verify.py's output) > > Also installApp.py has had similar errors in the past (just a bit lower in > that file). Again notice the timestamps. > > > I don't say there are normal/expected issues _now_ but the point is that we > explicitly want to be sure we catch these issues _later_ if code blows up > and we don't, then we may not know there is a problem for hours or days So to address this, I think implementing aki's suggestion from comment #6 is probably the way to go. That is, instead of anchoring these errors to the beginning of the string, to explicitly match cases we don't care about, with level INFO (i.e., anything with Javascript Error or TEST-KNOWN-FAIL). Do you agree, Callek?

Justin Wood (:Callek)

Comment 13

•

12 years ago

(In reply to Jonathan Griffin (:jgriffin) from comment #12) > So to address this, I think implementing aki's suggestion from comment #6 is > probably the way to go. That is, instead of anchoring these errors to the > beginning of the string, to explicitly match cases we don't care about, with > level INFO (i.e., anything with Javascript Error or TEST-KNOWN-FAIL). Do > you agree, Callek? I think that makes sense, I'm not personally sure how to do that reliably but I'm willing to let someone try it for sure!

Nobody; OK to take it and work on it

Assignee

Updated

•

12 years ago

Product: mozilla.org → Release Engineering

Jonathan Griffin (:jgriffin)

Reporter

Comment 14

•

12 years ago

Unassigning, since I don't have time to wrap this up atm.

Assignee: jgriffin → nobody

Kim Moir [:kmoir] ET

Comment 15

•

10 years ago

closing since we are decomming pandas in bug 1186615

Status: NEW → RESOLVED

Closed: 10 years ago

Resolution: --- → WONTFIX

Nobody; OK to take it and work on it

Assignee

Updated

•

7 years ago

Component: General Automation → General

You need to log in before you can comment on or make changes to this bug.