Closed
Bug 928887
Opened 12 years ago
Closed 7 years ago
mozharness DesktopUnittestOutputParser.evaluate_parser() should record a failure if the return code was non-zero
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: emorley, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [mozharness])
Attachments
(1 file)
2.10 KB,
patch
|
Details | Diff | Splinter Review |
As part of testing bug 923513 it became apparent that we could have an exception in remoteautomation.py::checkForCrashes() but yet the panda test run not be reported as a failure.
> > 3) (But not an issue with this patch in particular) The panda mochitest-1
> > run had an uncaught exception whilst trying to process the java-exception
>
> 16:57:15 INFO - INFO | zombiecheck | Reading PID log:
> /tmp/tmp1PoLeMpidlog
> 16:57:16 INFO - Traceback (most recent call last):
> 16:57:16 INFO - File
> "/builds/panda-0854/test/build/tests/mochitest/runtests.py", line 1005, in
> runTests
> 16:57:16 INFO - onLaunch=onLaunch
> 16:57:16 INFO - File
> "/builds/panda-0854/test/build/tests/mochitest/runtestsremote.py", line 542,
> in runApp
> 16:57:16 INFO - return self._automation.runApp(*args, **kwargs)
> 16:57:16 INFO - File
> "/builds/panda-0854/test/build/tests/mochitest/automation.py", line 899, in
> runApp
> 16:57:16 INFO - crashed =
> self.checkForCrashes(os.path.join(profileDir, "minidumps"), symbolsPath)
> 16:57:16 INFO - File
> "/builds/panda-0854/test/build/tests/mochitest/remoteautomation.py", line
> 118, in checkForCrashes
> 16:57:16 INFO - javaException =
> mozcrash.check_for_java_exception(logcat)
> 16:57:16 INFO - AttributeError: 'module' object has no attribute
> 'check_for_java_exception'
> 16:57:16 INFO - /data/anr/traces.txt not found
> 16:57:16 INFO - Automation Error: Received unexpected exception while
> running application
> 16:57:16 INFO - WARNING | leakcheck | refcount logging is off, so leaks
> can't be detected!
> 16:57:16 INFO - runtests.py | Running tests: end.
Reporter | ||
Comment 1•12 years ago
|
||
Full log for that example:
https://tbpl.mozilla.org/php/getParsedLog.php?id=29108241&tree=Try
Ok, so the exception is bubbling up until here:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1012
{
1010 except:
1011 traceback.print_exc()
1012 log.error("Automation Error: Received unexpected exception while running application\n")
1013 status = 1
...
1026 return status
}
Which should mean we get '1' returned here:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtestsremote.py#727
{
725 try:
726 dm.recordLogcat()
727 retVal = mochitest.runTests(options)
728 except:
729 log.error("Automation Error: Exception caught while running tests")
730 traceback.print_exc()
731 mochitest.stopWebServer(options)
732 mochitest.stopWebSocketServer(options)
733 try:
734 mochitest.cleanup(None, options)
735 except devicemanager.DMError:
736 # device error cleaning up... oh well!
737 pass
738 retVal = 1
739
740 mochitest.printDeviceInfo(printLogcat=True)
741
742 sys.exit(retVal)
}
So I'm guessing we do sys.exit(1) correctly (I can't see anything else that goes wrong), and that we're just not handling this in mozharness?
Reporter | ||
Comment 2•12 years ago
|
||
Ha, https://hg.mozilla.org/build/mozharness/file/28399b92470f/scripts/android_panda.py does:
234 test_summary_parser = DesktopUnittestOutputParser(suite_category,
235 config=self.config,
236 error_list=error_list,
237 log_obj=self.log_obj)
238
239 return_code = self.run_command(cmd, cwd=dirs['abs_test_install_dir'], env=env, output_parser=test_summary_parser)
240
241 tbpl_status, log_level = test_summary_parser.evaluate_parser(return_code)
DesktopUnittestOutputParser.evaluate_parser():
https://hg.mozilla.org/build/mozharness/file/28399b92470f/mozharness/mozilla/testing/unittest.py#l125
{
125 def evaluate_parser(self, return_code):
126 if self.num_errors: # mozharness ran into a script error
127 self.tbpl_status = self.worst_level(TBPL_FAILURE, self.tbpl_status,
128 levels=TBPL_WORST_LEVEL_TUPLE)
129
130 # I have to put this outside of parse_single_line because this checks not
131 # only if fail_count was more then 0 but also if fail_count is still -1
132 # (no fail summary line was found)
133 if self.fail_count != 0:
134 self.worst_log_level = self.worst_level(WARNING, self.worst_log_level)
135 self.tbpl_status = self.worst_level(TBPL_WARNING, self.tbpl_status,
136 levels=TBPL_WORST_LEVEL_TUPLE)
137 # we can trust in parser.worst_log_level in either case
138 return (self.tbpl_status, self.worst_log_level)
}
...so basically we just don't use return_code at all :-/
Reporter | ||
Updated•12 years ago
|
Summary: AttributeError exception in remoteautomation.py::checkForCrashes() did not turn the run orange/red on panda jobs → mozharness DesktopUnittestOutputParser.evaluate_parser() doesn't use the return code it is passed to determine success/fail
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Component: General → General Automation
Product: Testing → Release Engineering
QA Contact: catlee
Whiteboard: [mozharness]
Version: Trunk → other
Reporter | ||
Comment 3•12 years ago
|
||
I believe things like crashes result in a non-zero return code, so we have to make this a WARNING rather than a FAILURE, since otherwise crashes will appear red on TBPL along with automation exceptions etc.
Attachment #819696 -
Flags: review?(aki)
Reporter | ||
Comment 4•12 years ago
|
||
Also note that whilst this is 'DesktopUnittestOutputParser', we actually use this for Android and B2G as well.q
Comment 5•12 years ago
|
||
Comment on attachment 819696 [details] [diff] [review]
WIP
This may have a larger-than-expected change in behavior.
Are you signing sheriffs up to detect that in production, or are you planning on testing this on cedar or ash first?
Attachment #819696 -
Flags: review?(aki) → review+
Reporter | ||
Comment 6•12 years ago
|
||
(In reply to Aki Sasaki [:aki] from comment #5)
> This may have a larger-than-expected change in behavior.
> Are you signing sheriffs up to detect that in production, or are you
> planning on testing this on cedar or ash first?
I'll test on cedar/ash first. Remind me how I do that again? (I looked on https://wiki.mozilla.org/ReleaseEngineering/Mozharness but didn't see any steps). Does cedar pull mozharness default and ash use a user repo? (My memory is fuzzy).
Reporter | ||
Updated•12 years ago
|
Flags: needinfo?(aki)
Comment 7•12 years ago
|
||
(In reply to Ed Morley [:edmorley UTC+1] from comment #6)
> (In reply to Aki Sasaki [:aki] from comment #5)
> > This may have a larger-than-expected change in behavior.
> > Are you signing sheriffs up to detect that in production, or are you
> > planning on testing this on cedar or ash first?
>
> I'll test on cedar/ash first. Remind me how I do that again? (I looked on
> https://wiki.mozilla.org/ReleaseEngineering/Mozharness but didn't see any
> steps). Does cedar pull mozharness default and ash use a user repo? (My
> memory is fuzzy).
Correct. The user repo for ash is http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness .
Flags: needinfo?(aki)
Reporter | ||
Comment 8•12 years ago
|
||
I've pushed to ash-mozharness and then to ash to trigger builds, results should be at:
https://tbpl.mozilla.org/?tree=Ash&rev=17c88762ec86
Reporter | ||
Comment 9•12 years ago
|
||
Ok so Windows tests always return 1 on all trunk trees (I'd noticed this before but forgotten to file), which means this patch (correctly) turns them orange - so we'll need to fix that first before landing this.
eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=31174977&tree=Ash
but also on m-c:
https://tbpl.mozilla.org/php/getParsedLog.php?id=31217020&tree=Mozilla-Central
{
04:21:16 INFO - 202502 INFO SimpleTest FINISHED
04:21:17 INFO - INFO | runtests.py | Application ran for: 0:18:16.105000
04:21:17 INFO - INFO | zombiecheck | Reading PID log: c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpgs9owppidlog
04:21:17 INFO - ==> process 3648 launched child process 2356 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3648.bd9fbd0.2087353558 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpftmjrw\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3648 "\\.\pipe\gecko-crash-server-pipe.3648" plugin)
04:21:17 INFO - ==> process 3648 launched child process 2516 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3648.bebe100.1936084576 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3648 "\\.\pipe\gecko-crash-server-pipe.3648" tab)
04:21:17 INFO - ==> process 3648 launched child process 4080 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3648.bd9d9d0.394117182 "c:\docume~1\cltbld~1.t-x\locals~1\temp\tmpftmjrw\plugins\nptest.dll" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3648 "\\.\pipe\gecko-crash-server-pipe.3648" plugin)
04:21:17 INFO - ==> process 3648 launched child process 2840 ("C:\slave\test\build\application\firefox\plugin-container.exe" --channel=3648.975b6b0.1021023147 -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser" - 3648 "\\.\pipe\gecko-crash-server-pipe.3648" tab)
04:21:17 INFO - WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
04:21:17 INFO - runtests.py | Running tests: end.
04:21:17 INFO - 0
04:21:17 ERROR - Return code: 1
}
However I'm not sure what is going on with B2G - the exception seems unrelated to my patch (perhaps we need to reset ash-mozharness, since it's a bit of a mess of merges at the moment). eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=31174708&tree=Ash#error0
Updated•11 years ago
|
Component: General Automation → Mozharness
Reporter | ||
Updated•11 years ago
|
Blocks: 1048775
Summary: mozharness DesktopUnittestOutputParser.evaluate_parser() doesn't use the return code it is passed to determine success/fail → mozharness DesktopUnittestOutputParser.evaluate_parser() should record a failure if the return code was non-zero
Reporter | ||
Comment 10•11 years ago
|
||
Comment on attachment 819696 [details] [diff] [review]
WIP
Clearing r+ since this needs the dependant bug first, and this will get it off my to-check-in bugzilla todos list.
Attachment #819696 -
Attachment description: Patch v1 → WIP
Attachment #819696 -
Flags: review+
Reporter | ||
Comment 11•11 years ago
|
||
This needs bug 944364 fixed before this can land (presuming it's not bitrotted anyway).
Unassigning to clear this from bugzilla todos.
Assignee: emorley → nobody
Status: ASSIGNED → NEW
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•