Closed
Bug 988603
Opened 11 years ago
Closed 11 years ago
mozharness reports success when build fails in some cases
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mshal, Assigned: mozilla)
Details
Attachments
(1 file, 2 obsolete files)
40.75 KB,
patch
|
jlund
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
With the JS minifier enabled (bug 903149), sometimes we'll display a warning message that contains a string like "SyntaxError: blah". The SyntaxError message comes from a subprocess, and the minifier is still returning success (ie: this isn't a failure case, it just is a warning that has the magic word "SyntaxError" in it). Mozharness picks this up and ends up halting the build:
00:19:53 WARNING - Warning: JS minification verification failed for ../../dist/bin/modules/KeyValueParser.jsm:
00:19:53 ERROR - 10:26 SyntaxError: missing ) after argument list:
00:19:53 INFO - 10:26 is.init(fstream, "UTF-8"
00:19:53 INFO - 10:26 ..........................^
...
00:20:44 INFO - Return code: 0
00:20:44 FATAL - Halting on failure while running make package AB_CD=multi
00:20:44 FATAL - Running post_fatal callback...
00:20:44 FATAL - Exiting 0
However, since the return code is still 0, the build shows up as green in tbpl. I think in this case we probably want it to be red.
Assignee | ||
Comment 1•11 years ago
|
||
I was going to put in an |if self.run_command(...):| in b2g_desktop_multilocale.py, but then I looked at run_command()'s logic, and this seems wrong:
http://hg.mozilla.org/build/mozharness/file/8521152eb1f7/mozharness/base/script.py#l704
(namely, if returncode is 0, we will exit 0 on fatal(), which seems wrong).
This is one way to fix it. I know you were thinking about failure_config; I'm open to that, but this patch should fix this specific issue.
I think I've caught the various 'exception' bits (fatal_exit_code=3, in tests where we want to turn an 'unzip' failure purple rather than red). Otherwise we'll default to 2 (failure). We probably want a clear run on Cypress after this lands.
Assignee: nobody → aki
Attachment #8397515 -
Flags: review?(jlund)
Comment 2•11 years ago
|
||
Comment on attachment 8397515 [details] [diff] [review]
fatal_exit_code
Review of attachment 8397515 [details] [diff] [review]:
-----------------------------------------------------------------
So overall looks like an improvement. I think the best thing about this is that we set the return code to a sane number and we add 'exception' purple levels to places that would otherwise be red. I have some questions/comments:
1) This bug is due to parser.num_errors being > 0 while returncode is in success_codes. Maybe it be useful to explicitly say why we fail? Something like my code snippet below.
2) I'm not 100% sure about using parser.num_errors inside the halt_on_failure condition and then ignoring parser. What if we parse_single_line() that results in a level worse than fatal_exit_code? Won't levels from parser get ignored? Maybe, if there are parser.num_errors we should use parser.worst_level somehow? I don't know and it's probably scope creep for this bug.
3) This still doesn't touch self.return_code. Which is only an issue if (a) you implement things like a PostScript decorator/listener or (b) we error/fail in run_command but we don't set halt_on_failure. I think both of those go outside the scope of solving this bug and, like you mentioned, there is still room for something like failure_config.
4) I trust you over me to know which run_commands should have fatal_exit_code=3 so your choices there lgtm. Either way, I think this will be an improvement.
r+ once (1) and (5) are addressed by either commenting on or actually implementing it if you agree with them.
::: mozharness/base/log.py
@@ +227,2 @@
> append_to_log=False,
> ):
I guess halt_on_failure was legacy code here?
::: mozharness/base/script.py
@@ +703,5 @@
> self.log("Return code: %d" % returncode, level=return_level)
> if halt_on_failure:
> if parser.num_errors or returncode not in success_codes:
> + if fatal_exit_code is None:
> + fatal_exit_code = returncode
5) I am not sure about this. We set a default for fatal_exit_code=2 and the only time we explicitly set fatal_exit_code outside of that is when we set it to 3. But then here we 'if fatal_exit_code is None:'?
Are we trying to allow for someone to explicitly pass fatal_exit_code=None? Which effectively is a way of getting run_command to use returncode? If that is the case, maybe we should assert returncode is not also 0 or we will be right back to this weird edge case in this bug IIUC.
But I would argue it is messy to allow fatal_exit_code=None. Maybe doing something like what I mentioned in comment (1) at the top of this review and then also:
if halt_on_failure:
if parser.num_errors or returncode not in success_codes:
if returncode not in success_codes:
self.error("%s not in success codes: %s" % (returncode,
success_codes))
if parser.num_errors:
self.error("failures found while parsing output")
if not fatal_exit_code:
self.warning("'fatal_exit_code' was not > 0. Defaulting to 2")
fatal_exit_code = 2
Assignee | ||
Comment 3•11 years ago
|
||
(In reply to Jordan Lund (:jlund) from comment #2)
> So overall looks like an improvement. I think the best thing about this is
> that we set the return code to a sane number and we add 'exception' purple
> levels to places that would otherwise be red.
Yeah. I don't think this is ideal still, but I wanted to improve it, at least.
> 1) This bug is due to parser.num_errors being > 0 while returncode is in
> success_codes. Maybe it be useful to explicitly say why we fail? Something
> like my code snippet below.
Sure, added.
> 2) I'm not 100% sure about using parser.num_errors inside the
> halt_on_failure condition and then ignoring parser. What if we
> parse_single_line() that results in a level worse than fatal_exit_code?
> Won't levels from parser get ignored? Maybe, if there are parser.num_errors
> we should use parser.worst_level somehow? I don't know and it's probably
> scope creep for this bug.
> 3) This still doesn't touch self.return_code. Which is only an issue if (a)
> you implement things like a PostScript decorator/listener or (b) we
> error/fail in run_command but we don't set halt_on_failure. I think both of
> those go outside the scope of solving this bug and, like you mentioned,
> there is still room for something like failure_config.
Touched return_code, and removed the ability to set fatal_exit_code to None.
> ::: mozharness/base/log.py
> @@ +227,2 @@
> > append_to_log=False,
> > ):
>
> I guess halt_on_failure was legacy code here?
Yup. It was from when fatal() was optionally-fatal.
> 5) I am not sure about this. We set a default for fatal_exit_code=2 and the
> only time we explicitly set fatal_exit_code outside of that is when we set
> it to 3. But then here we 'if fatal_exit_code is None:'?
>
> Are we trying to allow for someone to explicitly pass fatal_exit_code=None?
Yup. But I can't think of a real life use case, so I might as well strip it out.
> if halt_on_failure:
> if parser.num_errors or returncode not in success_codes:
> if returncode not in success_codes:
> self.error("%s not in success codes: %s" % (returncode,
> success_codes))
> if parser.num_errors:
> self.error("failures found while parsing output")
> if not fatal_exit_code:
> self.warning("'fatal_exit_code' was not > 0. Defaulting to 2")
> fatal_exit_code = 2
Done, but slightly differently. This is the only difference between the two patches.
Attachment #8397515 -
Attachment is obsolete: true
Attachment #8397515 -
Flags: review?(jlund)
Attachment #8398079 -
Flags: review?(jlund)
Assignee | ||
Comment 4•11 years ago
|
||
Also a get_output_from_command() fix.
Attachment #8398079 -
Attachment is obsolete: true
Attachment #8398079 -
Flags: review?(jlund)
Attachment #8398081 -
Flags: review?(jlund)
Comment 5•11 years ago
|
||
Comment on attachment 8398081 [details] [diff] [review]
fatal_exit_code3
Review of attachment 8398081 [details] [diff] [review]:
-----------------------------------------------------------------
r+ lgtm
I guess we could still assign 0 or None to fatal_exit_code but that seems like it would be the fault of the dev using run_command().
Attachment #8398081 -
Flags: review?(jlund) → review+
Assignee | ||
Comment 6•11 years ago
|
||
Comment on attachment 8398081 [details] [diff] [review]
fatal_exit_code3
https://hg.mozilla.org/build/mozharness/rev/c1505e7e1fa0
Going to kick off cypress builds+tests.
Attachment #8398081 -
Flags: checked-in+
Comment 7•11 years ago
|
||
in production.
Assignee | ||
Updated•11 years ago
|
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Component: General Automation → Mozharness
You need to log in
before you can comment on or make changes to this bug.
Description
•