Improve exception handling and message details in Marionette

RESOLVED FIXED in Firefox 50

Status

defect
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: whimboo, Assigned: whimboo)

Tracking

unspecified
mozilla51
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 fixed, firefox51 fixed)

Details

Attachments

(1 attachment)

Right now when the application under test dies Marionette seems to come up with a general failure like:

AssertionError: Timed out waiting for port!

That is not pretty helpful given that this hides the real cause and would need someone to actually dive into the logs. One example of that can be seen on bug 1202375.

In Mozmill we had this feature and directly reported a crash instead of a specific method in the runner which failed due to that.

https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/__init__.py#L543
I do some refactoring on bug 1257476. Maybe that would already be helpful.
Depends on: 1257476
I have seen a couple of instances in the Marionette Python code where we re-raise exceptions. In most of the cases we don't keep the original traceback and value (message) so that it is hard to figure out what's wrong. We even miss more detailed failure messages for diagnosing errors. We should get this fixed with this bug.
With my latest changes on bug 1283906 some failure messages have been changed. Here an example:

https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=mn&selectedJob=1015300

Before this was filed as bug 1262705 with the summary:

> "UnknownException: Error loading page".

Now we have:

"TimeoutException: Process killed because the connection was lost (Reason: TimeoutException: Connection timed out)"

So we lack the real underlying issue.
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
I did some more investigation on bug 1262705 and as it turned out we were masking another issue which was more critical. The test affected calls find_element() and keeps hanging. So the error message as shown until now was only mentioning a side-effect but not the real issue. So I think that part we don't have to cover here anymore.
I think the summary of the bug doesn't match anymore. Given that most of the process specific information I already put into my landed patch on bug 1257476, I would cover general exception handling and message improvements here.
Summary: Marionette should better indicate when the application quits due to a crash or other system interrupts → Improve exception handling and message details in Marionette
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/67858/diff/1-2/
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/67858/diff/2-3/
Due to the regression fixes we definitely need this bug for 50.0.
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/67858/diff/3-4/
Attachment #8775727 - Attachment description: Bug 1202392 - Improve exception handling and message details in Marionette → Bug 1202392 - Improve exception handling and message details in Marionette.
Attachment #8775727 - Flags: review?(dburns)
David, the patch was iterative growing while working on it. So it is hard to split up. If you still want that for a review I can do it next week.
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/67858/diff/4-5/
https://reviewboard.mozilla.org/r/67858/#review65084

::: testing/marionette/client/marionette_driver/decorators.py:47
(Diff revision 4)
>              return func(*args, **kwargs)
> -        except (MarionetteException, socket.error, IOError) as e:
> +        except (MarionetteException, IOError) as e:
>              exc, val, tb = sys.exc_info()
> +
> +            # In case of socket failures force a shutdown of the application
> +            if type(e) in (socket.error, socket.timeout):

Using TimeoutException before here was a big mistake because any timeout reported by Marionette server will cause a force_shutdown! So we should only care about socket issues.

Also I add this before the crash report so its better visible in the logs and on Treeherder.

::: testing/marionette/client/marionette_driver/marionette.py:762
(Diff revision 5)
>              exc, val, tb = sys.exc_info()
>  
> -            returncode = self.instance.runner.returncode
> +            # Give the application some time to shutdown
> +            returncode = self.instance.runner.wait(timeout=self.DEFAULT_STARTUP_TIMEOUT)
>              if returncode is None:
> -                self.instance.runner.stop()
> +                self.cleanup()

We have to call `cleanup()` to also destroy the session. Otherwise `gather_debug()` will try to get information and horribly fail.
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/67858/diff/5-6/
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

https://reviewboard.mozilla.org/r/67858/#review65186

::: testing/marionette/client/marionette_driver/marionette.py:760
(Diff revisions 2 - 3)
>          """
>          if self.instance:
>              exc, val, tb = sys.exc_info()
>  
> -            # Give the application some seconds to shutdown
> -            returncode = self.instance.runner.wait(timeout=self.DEFAULT_STARTUP_TIMEOUT)
> +            # Give the application some time to shutdown
> +            returncode = self.instance.runner.wait(timeout=4) # self.DEFAULT_STARTUP_TIMEOUT)

debug code?
Attachment #8775727 - Flags: review?(dburns) → review-
(In reply to David Burns :automatedtester from comment #15)
> > -            # Give the application some seconds to shutdown
> > -            returncode = self.instance.runner.wait(timeout=self.DEFAULT_STARTUP_TIMEOUT)
> > +            # Give the application some time to shutdown
> > +            returncode = self.instance.runner.wait(timeout=4) # self.DEFAULT_STARTUP_TIMEOUT)
> 
> debug code?

David, is that the only reason why you gave r-? If yes it would be faster to open an issue for it and set r+. I would have to fix it anyway before being able to push to autoland.
Flags: needinfo?(dburns)
https://reviewboard.mozilla.org/r/67858/#review65186

> debug code?

Did you look at an interdiff? One of the last diffs clearly removed it.
Attachment #8775727 - Flags: review- → review?(dburns)
Comment on attachment 8775727 [details]
Bug 1202392 - Improve exception handling and message details in Marionette.

https://reviewboard.mozilla.org/r/67858/#review65982
Attachment #8775727 - Flags: review?(dburns) → review+
Flags: needinfo?(dburns)
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/186bab1ed189
Improve exception handling and message details in Marionette. r=automatedtester
https://hg.mozilla.org/mozilla-central/rev/186bab1ed189
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
A test-only change which we need to be landed on mozilla-aurora.
Whiteboard: [checkin-needed-aurora]
You need to log in before you can comment on or make changes to this bug.