Closed
Bug 988603
Opened 10 years ago
Closed 10 years ago
mozharness reports success when build fails in some cases
Categories
(Release Engineering :: Applications: MozharnessCore, defect)
Release Engineering
Applications: MozharnessCore
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mshal, Assigned: mozilla)
Details
Attachments
(1 file, 2 obsolete files)
40.75 KB,
patch
|
jlund
:
review+
mozilla
:
checked-in+
|
Details | Diff | Splinter Review |
With the JS minifier enabled (bug 903149), sometimes we'll display a warning message that contains a string like "SyntaxError: blah". The SyntaxError message comes from a subprocess, and the minifier is still returning success (ie: this isn't a failure case, it just is a warning that has the magic word "SyntaxError" in it). Mozharness picks this up and ends up halting the build: 00:19:53 WARNING - Warning: JS minification verification failed for ../../dist/bin/modules/KeyValueParser.jsm: 00:19:53 ERROR - 10:26 SyntaxError: missing ) after argument list: 00:19:53 INFO - 10:26 is.init(fstream, "UTF-8" 00:19:53 INFO - 10:26 ..........................^ ... 00:20:44 INFO - Return code: 0 00:20:44 FATAL - Halting on failure while running make package AB_CD=multi 00:20:44 FATAL - Running post_fatal callback... 00:20:44 FATAL - Exiting 0 However, since the return code is still 0, the build shows up as green in tbpl. I think in this case we probably want it to be red.
Assignee | ||
Comment 1•10 years ago
|
||
I was going to put in an |if self.run_command(...):| in b2g_desktop_multilocale.py, but then I looked at run_command()'s logic, and this seems wrong: http://hg.mozilla.org/build/mozharness/file/8521152eb1f7/mozharness/base/script.py#l704 (namely, if returncode is 0, we will exit 0 on fatal(), which seems wrong). This is one way to fix it. I know you were thinking about failure_config; I'm open to that, but this patch should fix this specific issue. I think I've caught the various 'exception' bits (fatal_exit_code=3, in tests where we want to turn an 'unzip' failure purple rather than red). Otherwise we'll default to 2 (failure). We probably want a clear run on Cypress after this lands.
Assignee: nobody → aki
Attachment #8397515 -
Flags: review?(jlund)
Comment 2•10 years ago
|
||
Comment on attachment 8397515 [details] [diff] [review] fatal_exit_code Review of attachment 8397515 [details] [diff] [review]: ----------------------------------------------------------------- So overall looks like an improvement. I think the best thing about this is that we set the return code to a sane number and we add 'exception' purple levels to places that would otherwise be red. I have some questions/comments: 1) This bug is due to parser.num_errors being > 0 while returncode is in success_codes. Maybe it be useful to explicitly say why we fail? Something like my code snippet below. 2) I'm not 100% sure about using parser.num_errors inside the halt_on_failure condition and then ignoring parser. What if we parse_single_line() that results in a level worse than fatal_exit_code? Won't levels from parser get ignored? Maybe, if there are parser.num_errors we should use parser.worst_level somehow? I don't know and it's probably scope creep for this bug. 3) This still doesn't touch self.return_code. Which is only an issue if (a) you implement things like a PostScript decorator/listener or (b) we error/fail in run_command but we don't set halt_on_failure. I think both of those go outside the scope of solving this bug and, like you mentioned, there is still room for something like failure_config. 4) I trust you over me to know which run_commands should have fatal_exit_code=3 so your choices there lgtm. Either way, I think this will be an improvement. r+ once (1) and (5) are addressed by either commenting on or actually implementing it if you agree with them. ::: mozharness/base/log.py @@ +227,2 @@ > append_to_log=False, > ): I guess halt_on_failure was legacy code here? ::: mozharness/base/script.py @@ +703,5 @@ > self.log("Return code: %d" % returncode, level=return_level) > if halt_on_failure: > if parser.num_errors or returncode not in success_codes: > + if fatal_exit_code is None: > + fatal_exit_code = returncode 5) I am not sure about this. We set a default for fatal_exit_code=2 and the only time we explicitly set fatal_exit_code outside of that is when we set it to 3. But then here we 'if fatal_exit_code is None:'? Are we trying to allow for someone to explicitly pass fatal_exit_code=None? Which effectively is a way of getting run_command to use returncode? If that is the case, maybe we should assert returncode is not also 0 or we will be right back to this weird edge case in this bug IIUC. But I would argue it is messy to allow fatal_exit_code=None. Maybe doing something like what I mentioned in comment (1) at the top of this review and then also: if halt_on_failure: if parser.num_errors or returncode not in success_codes: if returncode not in success_codes: self.error("%s not in success codes: %s" % (returncode, success_codes)) if parser.num_errors: self.error("failures found while parsing output") if not fatal_exit_code: self.warning("'fatal_exit_code' was not > 0. Defaulting to 2") fatal_exit_code = 2
Assignee | ||
Comment 3•10 years ago
|
||
(In reply to Jordan Lund (:jlund) from comment #2) > So overall looks like an improvement. I think the best thing about this is > that we set the return code to a sane number and we add 'exception' purple > levels to places that would otherwise be red. Yeah. I don't think this is ideal still, but I wanted to improve it, at least. > 1) This bug is due to parser.num_errors being > 0 while returncode is in > success_codes. Maybe it be useful to explicitly say why we fail? Something > like my code snippet below. Sure, added. > 2) I'm not 100% sure about using parser.num_errors inside the > halt_on_failure condition and then ignoring parser. What if we > parse_single_line() that results in a level worse than fatal_exit_code? > Won't levels from parser get ignored? Maybe, if there are parser.num_errors > we should use parser.worst_level somehow? I don't know and it's probably > scope creep for this bug. > 3) This still doesn't touch self.return_code. Which is only an issue if (a) > you implement things like a PostScript decorator/listener or (b) we > error/fail in run_command but we don't set halt_on_failure. I think both of > those go outside the scope of solving this bug and, like you mentioned, > there is still room for something like failure_config. Touched return_code, and removed the ability to set fatal_exit_code to None. > ::: mozharness/base/log.py > @@ +227,2 @@ > > append_to_log=False, > > ): > > I guess halt_on_failure was legacy code here? Yup. It was from when fatal() was optionally-fatal. > 5) I am not sure about this. We set a default for fatal_exit_code=2 and the > only time we explicitly set fatal_exit_code outside of that is when we set > it to 3. But then here we 'if fatal_exit_code is None:'? > > Are we trying to allow for someone to explicitly pass fatal_exit_code=None? Yup. But I can't think of a real life use case, so I might as well strip it out. > if halt_on_failure: > if parser.num_errors or returncode not in success_codes: > if returncode not in success_codes: > self.error("%s not in success codes: %s" % (returncode, > success_codes)) > if parser.num_errors: > self.error("failures found while parsing output") > if not fatal_exit_code: > self.warning("'fatal_exit_code' was not > 0. Defaulting to 2") > fatal_exit_code = 2 Done, but slightly differently. This is the only difference between the two patches.
Attachment #8397515 -
Attachment is obsolete: true
Attachment #8397515 -
Flags: review?(jlund)
Attachment #8398079 -
Flags: review?(jlund)
Assignee | ||
Comment 4•10 years ago
|
||
Also a get_output_from_command() fix.
Attachment #8398079 -
Attachment is obsolete: true
Attachment #8398079 -
Flags: review?(jlund)
Attachment #8398081 -
Flags: review?(jlund)
Comment 5•10 years ago
|
||
Comment on attachment 8398081 [details] [diff] [review] fatal_exit_code3 Review of attachment 8398081 [details] [diff] [review]: ----------------------------------------------------------------- r+ lgtm I guess we could still assign 0 or None to fatal_exit_code but that seems like it would be the fault of the dev using run_command().
Attachment #8398081 -
Flags: review?(jlund) → review+
Assignee | ||
Comment 6•10 years ago
|
||
Comment on attachment 8398081 [details] [diff] [review] fatal_exit_code3 https://hg.mozilla.org/build/mozharness/rev/c1505e7e1fa0 Going to kick off cypress builds+tests.
Attachment #8398081 -
Flags: checked-in+
Comment 7•10 years ago
|
||
in production.
Assignee | ||
Updated•10 years ago
|
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Component: General Automation → Mozharness
You need to log in
before you can comment on or make changes to this bug.
Description
•