Closed Bug 965714 Opened 10 years ago Closed 10 years ago

[mozrunner] Invalid RunnerNotStartedError exception raised when wait() gets called and process already finished

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Keywords: regression, Whiteboard: [mozmill-2.0.4+])

Attachments

(1 file, 1 obsolete file)

Patch v1 10 years ago Henrik Skupin [:whimboo][⌚️UTC+1] 10.03 KB, patch	ahal : review+	Details \| Diff \| Splinter Review
Patch v2 10 years ago Henrik Skupin [:whimboo][⌚️UTC+1] 9.85 KB, patch	ahal : review+	Details \| Diff \| Splinter Review

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Description

•

10 years ago

I tested our restart architecture tests with the latest version of mozmill on hotfix-2.0 and it fails on both places in handle_disconnect where we wait for the application to exit with:

TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
RESULTS | Passed: 5
RESULTS | Failed: 0
RESULTS | Skipped: 0
Traceback (most recent call last):
  File "/Volumes/data/code/mozmill/mozmill/mozmill/__init__.py", line 799, in run
    mozmill.run(tests, self.options.restart)
  File "/Volumes/data/code/mozmill/mozmill/mozmill/__init__.py", line 424, in run
    self.handle_disconnect()
  File "/Volumes/data/code/mozmill/mozmill/mozmill/__init__.py", line 499, in handle_disconnect
    self.runner.wait(timeout=self.jsbridge_timeout)
  File "/Volumes/data/envs/m2d/lib/python2.7/site-packages/mozrunner-5.32-py2.7.egg/mozrunner/base.py", line 113, in wait
    raise RunnerNotStartedError("Wait() called before process started")
RunnerNotStartedError: Wait() called before process started

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 1

•

10 years ago

This is actually a regression from bug 962495, because it directly polls the process_handler for the exit state now. With that we will leave mozrunner in a weird state because under those conditions the self.process_handler is not set back to None.

I think that we should stop resetting process_handler to none, but leave it existent as long as no other process has been started. Also we should only raise RunnerNotStartedError when no process has been started so far.

The only thing I'm not that sure about yet, is the initial None value of returncode. It could cause confusion if retrieved before a process has been started. Maybe we should make it a property and raise the same exception.

Blocks: 962495

Component: Mozmill → Mozbase

Keywords: regression

Summary: Mozmill can fail in waiting for the application on shutdown/restart if it quits too fast → [mozrunner] Invalid RunnerNotStartedError exception raised when wait() gets called and process already finished

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 2

•

10 years ago

Before I can create tests for such a scenario I have to write two extensions, which restart and shutdown the application after a couple of seconds. Reason is that this is a very specific regression which only happens when the application quits itself. None of our existent tests could catch this.

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 3

•

10 years ago

Actually I can circumvent it by calling runner.process_handler.kill() directly, so the Runner doesn't know about it.

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 4

•

10 years ago

Attached patch Patch v1 (obsolete) — Details — Splinter Review

Ok, this patch makes kinda lot of changes for a better handling of the process handler and return code. I hope that's fine. I'm not sure if that causes any conflicts with projects outside of mozbase. At least all tests on mozbase are passing on OS X (linux and windows tests upcoming), and also Mozmill is fine. But if you want a try server run, let me know and I can get it started.

Attachment #8367885 - Flags: review?(ahalberstadt)

Andrew Halberstadt [:ahal]

Comment 5

•

10 years ago

Comment on attachment 8367885 [details] [diff] [review]
Patch v1

Review of attachment 8367885 [details] [diff] [review]:
-----------------------------------------------------------------

I really like these changes. I have some comments, but nothing major.

::: mozrunner/mozrunner/base.py
@@ +52,5 @@
> +    def returncode(self):
> +        if isinstance(self.process_handler, subprocess.Popen):
> +            return self.process_handler.poll()
> +        elif isinstance(self.process_handler, ProcessHandler):
> +            return self.process_handler.proc.poll()

Didn't we add a poll method to mozprocess? Or not yet..

@@ +137,5 @@
>          """
> +        try:
> +            if not self.is_running():
> +                return
> +        except RunnerNotStartedError:

I would remove the try: except here. I think calling stop() before calling start() is a valid reason to raise an exception.

::: mozrunner/tests/test_interactive.py
@@ +44,5 @@
> +
> +    def test_wait_after_process_finished(self):
> +        """Wait after the process has been stopped should not raise an error"""
> +        self.runner.start(interactive=True)
> +        sleep(5)

Anyway we can make this any smaller? This is going to add about 1 minute per push to our infrastructure load, which may not be too bad in comparison to other tests, but it adds up. I'd almost be inclined to say the cost/benefit would swing in favour of not having this test at all.

Attachment #8367885 - Flags: review?(ahalberstadt) → review+

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 6

•

10 years ago

(In reply to Andrew Halberstadt [:ahal] from comment #5)
> > +    def returncode(self):
> > +        if isinstance(self.process_handler, subprocess.Popen):
> > +            return self.process_handler.poll()
> > +        elif isinstance(self.process_handler, ProcessHandler):
> > +            return self.process_handler.proc.poll()
> 
> Didn't we add a poll method to mozprocess? Or not yet..

Not yet. I filed the bug yesterday.

> @@ +137,5 @@
> >          """
> > +        try:
> > +            if not self.is_running():
> > +                return
> > +        except RunnerNotStartedError:
> 
> I would remove the try: except here. I think calling stop() before calling
> start() is a valid reason to raise an exception.

We are calling stop() inside of start(). So at one position we have to add this check. Tell me what you would prefer. Same for clean-up, where we would also raise this error if runner has never been started.

> ::: mozrunner/tests/test_interactive.py
> @@ +44,5 @@
> > +
> > +    def test_wait_after_process_finished(self):
> > +        """Wait after the process has been stopped should not raise an error"""
> > +        self.runner.start(interactive=True)
> > +        sleep(5)
> 
> Anyway we can make this any smaller? This is going to add about 1 minute per
> push to our infrastructure load, which may not be too bad in comparison to
> other tests, but it adds up. I'd almost be inclined to say the cost/benefit
> would swing in favour of not having this test at all.

Why 1 minute per push? Across all the platforms? But yeah, I can make it smaller and even remove it completely.

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 7

•

10 years ago

(In reply to Henrik Skupin (:whimboo) from comment #6)
> > I would remove the try: except here. I think calling stop() before calling
> > start() is a valid reason to raise an exception.
> 
> We are calling stop() inside of start(). So at one position we have to add
> this check. Tell me what you would prefer. Same for clean-up, where we would
> also raise this error if runner has never been started.

Gna, is_running() should actually never raise an exception. Instead it should catch it and return False. That would fix it. Working on an update.

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 8

•

10 years ago

Attached patch Patch v2 — Details — Splinter Review

Updated as mentioned in the previous comments.

Attachment #8367885 - Attachment is obsolete: true

Attachment #8367976 - Flags: review?(ahalberstadt)

Andrew Halberstadt [:ahal]

Comment 9

•

10 years ago

Comment on attachment 8367976 [details] [diff] [review]
Patch v2

Review of attachment 8367976 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, it would be 5 seconds per build job of which there are over 20. Granted not all of them run on every push.. but still. Anyway, thanks for updating it. Lgtm! I wouldn't be surprised if this cause some sort of bustage on m-c though, so maybe a try run would be good.

Attachment #8367976 - Flags: review?(ahalberstadt) → review+

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 10

•

10 years ago

Pushed this to try:
https://tbpl.mozilla.org/?tree=Try&rev=133689bec6ab

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 11

•

10 years ago

Had to push again because I missed to remove the mozinstall patch which introduced the pefile package:
https://tbpl.mozilla.org/?tree=Try&rev=0ebd5746ca59

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 12

•

10 years ago

Well, the package dependencies are getting more complex on m-c those days given that even mozversion is not mirrored yet. So no results from the unit tests, but all other types of tests are passing. I will do a full unit test run on Windows, and if all passes we are good to get this patch landed!

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 13

•

10 years ago

All green so I went ahead and got it merged:
https://github.com/mozilla/mozbase/commit/eafc6ef9164b63c7ff15a22e6e9240e646765192

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Updated

•

10 years ago

Blocks: 967058

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

[mozrunner] Invalid RunnerNotStartedError exception raised when wait() gets called and process already finished

Categories

(Testing :: Mozbase, defect)

Tracking

(Not tracked)

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

(Keywords: regression, Whiteboard: [mozmill-2.0.4+])

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Updated

Attachment

General

Description

File Name

Content Type