Closed Bug 892140 Opened 11 years ago Closed 11 years ago

Don't call sys.exit() outside of main program entry point

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

References

Details

Attachments

(1 file, 1 obsolete file)

As I mentioned in bug 859573, having a sys.exit() in BaseScript.run() is constraining. It makes it difficult to do things like unit test run() and to chain multiple BaseScript instances together. It's a best practice in practically every programming language to only call exit() from the program's main() entry point. There are exceptions to this, sure. But I don't think mozharness is one of them.
Patch is straight forward. If granted review, I'm not sure what the
landing process for mozharness is. Please enlighten me as part of the
review.
Attachment #773587 - Flags: review?(aki)
Assignee: nobody → gps
I get around this in unit tests via try/except SystemExit, a la http://hg.mozilla.org/build/mozharness/file/4ab48af35790/test/test_base_script.py#l84

This will get rid of the run() sys.exit(), but won't deal with the fatal() equivalent, which is called as "raise SystemExit": http://hg.mozilla.org/build/mozharness/file/4ab48af35790/mozharness/base/log.py#l63

I'm not sure how to get rid of the fatal() SystemExit since it's written to exit, by design.  How is |try/except SystemExit| as a solution here?
An unhandled SystemExit results in exit() being called by the Python interpreter. If you absolutely must exit the current process, you can raise SystemExit. However, SystemExit should be used sparingly. Sure, you can catch SystemExit. But, you don't know whether that was raised legitimately (say due to the system being in a bad state) or whether some logger wanted to abort termination.

I didn't realize log.py raised SystemExit. Perhaps we should implement run_and_exit() instead. Then, we can put additional exception handling there, including uncaught exceptions from run().

Thoughts?
I think that sounds ok.  I mainly am concerned about being able to keep fatal() for normal use, while being able to explicitly bypass it when needed, e.g. testing.  In other non-testing cases, I've found that launching a second script from the same process often leads to double-logging and other issues tied to the global nature of the python logging module.

If differentiating a fatal() or run_and_exit() SystemExit from other cases is problematic, we could potentially create a custom exception that inherits it, and use the custom exception in those two locations.
Comment on attachment 773587 [details] [diff] [review]
Only call sys.exit() from script entry points

Minusing for not fully covering all SystemExit's; fixes discussed in comments.

Landing on mozharness default will go live immediately on Cedar, but no other branches.  Jenkins (releng-ci) will run tests against default and announce results in #releng.  Someone will merge to the production mozharness branch at some point afterwards, at which point it will be live on all other branches.
Attachment #773587 - Flags: review?(aki) → review-
OK. So I moved the sys.exit from run() into run_and_exit(). I didn't
touch other uses of sys.exit() and SystemExit because, well, I'm lazy.
I'm making this change mostly to facilitate work in the resource
monitoring bug. Fixing premature exits everywhere would at least double
the effort required for the patch. This patch gets us partially there
and is a step in the right direction. If there is will to remove all
sys.exit and SystemExit usage, I think that can be done in a follow-up
bug.
Attachment #773676 - Flags: review?(aki)
Attachment #773587 - Attachment is obsolete: true
Comment on attachment 773676 [details] [diff] [review]
Don't use sys.exit() in BaseScript.run()

Ok.  This sets us up to be able to do that work later.
Thanks!
Attachment #773676 - Flags: review?(aki) → review+
For posterity, should we choose to go this route:

>>> class Fatal(SystemExit):
...     pass
...
>>> raise Fatal, 1
deathduck:/src/gps/mozharness/scripts [16:01:47] (default)
517$ echo $?
1
https://hg.mozilla.org/build/mozharness/rev/e26b48378f6b

I'm not sure how you resolve mozharness bugs once they've been pushed :/
Status: NEW → ASSIGNED
We usually do the same with other releng bugs: keep open until merged to production branch or reconfiged, depending on repo.  Mozharness is the merge-to-production variety.
Comment on attachment 773676 [details] [diff] [review]
Don't use sys.exit() in BaseScript.run()

in production
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: