Closed Bug 908945 Opened 6 years ago Closed 6 years ago

Fix automation.py's exit code handling

Categories

(Testing :: General, defect)

x86_64
All
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla31

People

(Reporter: ted, Assigned: vaibhav1994)

Details

(Whiteboard: [good first bug][mentor=jmaher][lang=python])

Attachments

(1 file, 3 obsolete files)

From bug 884471 comment 73:

(In reply to Zack Weinberg (:zwol) from comment #73)
> Oh, also, the Python code that is printing exit statuses really ought to be
> changed to read something like this:
> 
>     # 'status' is the exit status
>     if os.name != 'posix':
>         # Windows error codes are easier to look up if printed in hexadecimal
>         if status < 0: status += 2**32
>         print "exit status %x\n" % status
>     elif os.WIFEXITED(status):
>         print "exit %d\n" % os.WEXITSTATUS(status)
>     elif os.WIFSIGNALED(status):
>         # the python stdlib doesn't appear to have strsignal(), alas
>         print "killed by signal %d\n" % os.WTERMSIG(status)
>     else:
>         # this is probably a can't-happen condition on Unix, but let's be
> defensive
>         print "undecodable exit status %04x\n" % status
(From bug 884471 comment #77)

Sounds like a good idea.

If we did change the Python code, we'd presumably need to do it here:
https://hg.mozilla.org/mozilla-central/annotate/17143a9a0d83/build/automation.py.in#l748
OS: Windows 7 → All
Whiteboard: [good first bug][mentor=jmaher][lang=python]
Can I take up this bug?
Assignee: nobody → vaibhavmagarwal
Hi,
I want to work on this bug. As far as I understand, I need to make an exit code function (similar to what ted has specified) and print out the exit code every time sys.exit() and proc.wait() is called in automation.py.in (currently it is done 2 times at line 357 and line 730 in http://dxr.mozilla.org/mozilla-central/source/build/automation.py.in ).
Please tell me if I am interpreting it right and guide me how to go about this bug. Thanks!
You do not need to change anything around uses of `sys.exit` (e.g. line 357).  That is the automation itself exiting; its caller is responsible for interpreting its exit code.

Uses of `proc.wait` and `os.waitpid` may need to be changed, but keep in mind that in a bunch of cases we don't care (e.g. line 237) -- we just want to make sure the called process has exited *somehow* before continuing.  It does look like line 730 is the most important place to change; it looks like you might also want to insert a call to the decoding-exit-status function around lines 451, 463 and 467 (exit status is not being checked here, but *should* be) and 736 (the "stack fixer").
While working on this bug, I have a dilemma about how to print the status of the exit code.
Right now, what I have done is this:

certutil: exit status 0 (this is a case when everything is normal)

certutil: exit status ff (this is a case when there is an error and the test fails, the value is in hexadecimal).

Kindly tell me if printing such statuses makes sense and whether I should go ahead with this format.
You should follow the logic shown in comment 0; you don't have to use exactly that code but you should have all four of the cases shown.  Exit statuses have structure and the goal here is to decode them properly.
Hm, actually, if you're asking about the WIFEXITED case (not on Windows), please print those in decimal, not hex.  Only print hex on Windows or if neither WIFEXITED nor WIFSIGNALED is true.
One more refinement, if you don't mind: signal numbers really should be mapped back to their symbolic names, because the numbers are not always consistent across operating systems.  Unfortunately Python has no strsignal().  So please put this exact code somewhere

    # Python does not provide strsignal() even in the very latest 3.x.
    # This is a reasonable fake.
    import signal
    _sigtbl = []
    def strsig(n):
        global _sigtbl
        if not _sigtbl:
            # signal numbers run 0 through NSIG-1; an array with NSIG members
            # has exactly that many slots
            _sigtbl = [None]*signal.NSIG
            for k in dir(signal):
                if (k.startswith("SIG") and not k.startswith("SIG_")
                    # exclude obsolete aliases
                    and k != "SIGCLD" and k != "SIGPOLL"):
                  _sigtbl[getattr(signal, k)] = k
            # realtime signals mostly have no names
            if hasattr(signal, "SIGRTMIN") and hasattr(signal, "SIGRTMAX"):
                for r in range(signal.SIGRTMIN+1, signal.SIGRTMAX+1):
                    _sigtbl[r] = "SIGRTMIN+" + str(r - signal.SIGRTMIN)
            # fill in any remaining gaps
            for i in range(signal.NSIG):
                if _sigtbl[i] is None:
                    _sigtbl[i] = "unrecognized signal, number " + str(i)
        if n < 0 or n >= signal.NSIG:
            return "out-of-range signal, number "+str(n)
        return _sigtbl[n]

and then in the WIFSIGNALED case, print "killed by {}".format(strsig(os.WTERMSIG(status))).
Attached patch exit code printing (obsolete) — Splinter Review
The patch incorporates the fix for printing the exit code values.
Attachment #8394219 - Flags: review?(jmaher)
Comment on attachment 8394219 [details] [diff] [review]
exit code printing

This looks like the right general idea, but please try to provide a meaningful second argument everywhere you call printstatus().  Also, I think there are some places where the printstatus() message should *replace* an existing message, and I wonder if it ought to be formatted as a "TEST-INFO | thingy" message.
Attached patch updated exit code printing (obsolete) — Splinter Review
Updated the patch according to :zwol's comments.
Attachment #8394219 - Attachment is obsolete: true
Attachment #8394219 - Flags: review?(jmaher)
Attachment #8394319 - Flags: review?(jmaher)
Attached patch exitcode.patch (obsolete) — Splinter Review
After talking to jmaher, I found some more issues that I have fixed in this patch.
Attachment #8394319 - Attachment is obsolete: true
Attachment #8394319 - Flags: review?(jmaher)
Attachment #8394362 - Flags: review?(jmaher)
Comment on attachment 8394362 [details] [diff] [review]
exitcode.patch

Review of attachment 8394362 [details] [diff] [review]:
-----------------------------------------------------------------

this is shaping up well.

::: build/automation.py.in
@@ +660,5 @@
> +        if os.path.exists(crashinject):
> +		  status = subprocess.Popen([crashinject, str(processPID)]).wait()
> +		  automationutils.printstatus(status, "crashinject")
> +		  if status == 0:
> +            return

please fix the indentation here.  It looks like you have tabs instead of spaces.

@@ +738,2 @@
>      if status == 0:
>        self.lastTestSeen = "Main app process exited normally"

to be consistent here, we identify the process as 'Main app process', could we do that in the printstatus instead of 'firefox' ?

@@ +745,2 @@
>        if fixerStatus != 0 and not didTimeout and not hitMaxTime:
>          self.log.info("TEST-UNEXPECTED-FAIL | automation.py | Stack fixer process exited with code %d during test run", fixerStatus)

here we are printing out the status again, but in this case we are marking it as a failure.  In this case I think we are fine double printing.

::: build/automationutils.py
@@ +159,5 @@
>  
> +# Python does not provide strsignal() even in the very latest 3.x.
> +# This is a reasonable fake.
> +def strsig(n):
> +  # signal numbers run 0 through NSIG-1; an array with NSIG members

nit: capitalize Signal

@@ +163,5 @@
> +  # signal numbers run 0 through NSIG-1; an array with NSIG members
> +  # has exactly that many slots
> +  _sigtbl = [None]*signal.NSIG
> +  for k in dir(signal):
> +    if (k.startswith("SIG") and not k.startswith("SIG_") and k != "SIGCLD" and k != "SIGPOLL"):

you do not need () around the if condition.

@@ +165,5 @@
> +  _sigtbl = [None]*signal.NSIG
> +  for k in dir(signal):
> +    if (k.startswith("SIG") and not k.startswith("SIG_") and k != "SIGCLD" and k != "SIGPOLL"):
> +      _sigtbl[getattr(signal, k)] = k
> +  # realtime signals mostly have no names

nit: capitalize Realtime

@@ +169,5 @@
> +  # realtime signals mostly have no names
> +  if hasattr(signal, "SIGRTMIN") and hasattr(signal, "SIGRTMAX"):
> +    for r in range(signal.SIGRTMIN+1, signal.SIGRTMAX+1):
> +      _sigtbl[r] = "SIGRTMIN+" + str(r - signal.SIGRTMIN)
> +  # fill in any remaining gaps

nit: capitalize Fill

@@ +187,5 @@
> +    print "TEST-INFO | %s: exit status %x\n" % (name, status)
> +  elif os.WIFEXITED(status):
> +    print "TEST-INFO | %s: exit %d\n" % (name, os.WEXITSTATUS(status))
> +  elif os.WIFSIGNALED(status):
> +    # the python stdlib doesn't appear to have strsignal(), alas

nit: capitalize The

@@ +188,5 @@
> +  elif os.WIFEXITED(status):
> +    print "TEST-INFO | %s: exit %d\n" % (name, os.WEXITSTATUS(status))
> +  elif os.WIFSIGNALED(status):
> +    # the python stdlib doesn't appear to have strsignal(), alas
> +    print "TEST-INFO | killed by {}".format(strsig(os.WTERMSIG(status)))

this doesn't say what was killed, please use the name variable.

@@ +190,5 @@
> +  elif os.WIFSIGNALED(status):
> +    # the python stdlib doesn't appear to have strsignal(), alas
> +    print "TEST-INFO | killed by {}".format(strsig(os.WTERMSIG(status)))
> +  else:
> +    # this is probably a can't-happen condition on Unix, but let's be defensive

nit: capitalize This

::: testing/mochitest/runtests.py
@@ +930,5 @@
>      # https://github.com/mozilla/mozbase/blob/master/mozrunner/mozrunner/runner.py#L61
>      # until bug 913970 is fixed regarding mozrunner `wait` not returning status
>      # see https://bugzilla.mozilla.org/show_bug.cgi?id=913970
>      status = proc.wait()
> +    printstatus(status, "firefox")

lets use 'Main app process'
Attachment #8394362 - Flags: review?(jmaher) → review-
Attached patch exitcode.patchSplinter Review
Removed all the nits from the patch as pointed out by jmaher.
Attachment #8394362 - Attachment is obsolete: true
Attachment #8394683 - Flags: review?(jmaher)
Comment on attachment 8394683 [details] [diff] [review]
exitcode.patch

Review of attachment 8394683 [details] [diff] [review]:
-----------------------------------------------------------------

I have tested this locally and on try, thanks for making this patch!
Attachment #8394683 - Flags: review?(jmaher) → review+
https://hg.mozilla.org/mozilla-central/rev/bc2e311097d4
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
You need to log in before you can comment on or make changes to this bug.