Closed Bug 1388251 Opened 7 years ago Closed 7 years ago

Don't continue trying to connect if the application doesn't run anymore

Categories

(Testing :: geckodriver, enhancement)

enhancement
Not set
normal

Tracking

(firefox57 fixed)

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

Details

Attachments

(5 files)

Attached file Rust traceback
In the case when the application gets closed unexpectedly geckodriver tries to re-connect to it for 600 times. The time spend here (62s) is useless and can be shortened when we simply check for the exit code, and if it is not None just raise an appropriate error. Right now the failure is pretty useless and it's not clear why no connection can be made:

1502174988173	geckodriver::marionette	TRACE	  connection attempt 0/600
[..]
Maybe startup crash...
[..]
1502175050615	geckodriver::marionette	TRACE	  connection attempt 600/600
1502175050719	geckodriver::marionette	TRACE	  connection attempt 601/600

Current failure message:
OS: Unspecified → All
Hardware: Unspecified → All
So this should happen in MarionetteConnection:connect():

https://dxr.mozilla.org/mozilla-central/rev/a921bfb8a2cf3db4d9edebe9b35799a3f9d035da/testing/geckodriver/src/marionette.rs#1314-1337

Given that this method / class doesn't know about the current process we would have to pass it in at least for this method as argument.

Another option would be to take out the retry logic for connection attempts to MarionetteHandler:create_connection() which would make more sense.

https://dxr.mozilla.org/mozilla-central/rev/a921bfb8a2cf3db4d9edebe9b35799a3f9d035da/testing/geckodriver/src/marionette.rs#431

Any feedback?
It's getting more important once my patch on bug 1388249 lands. Then Firefox will be shutdown each time a content crash happens. As such we could earlier escape this loop.

Andreas and James, any feedback from you regarding comment 1? If you agree I would start with the latter option and do a bit of refactoring to move the retry logic out of MarionetteConnection.
Flags: needinfo?(james)
Flags: needinfo?(ato)
Passing the runner instance into connection.connect() seems reasonable.
Flags: needinfo?(james)
Flags: needinfo?(ato)
Ok, so there is a problem here with mozrunner, which doesn't let us check if a process is still running. This is because `Command` and `Child` from the standard library don't allow you to retrieve the status of the currently running child process. You only get the exit code when calling `wait` or `stop`. But in both cases the process will/has to close which is not what we want.

Looks like there is no other crate currently available beside subprocess which would offer such an API. Subprocess is modeled like the Python subprocess module:

https://crates.io/crates/subprocess

Would it be ok for you to change our process handling to this crate?
Assignee: nobody → hskupin
Status: NEW → ASSIGNED
James, one question in regards how we should encapsulate the process status code. What I currently have is the following:

Before started and while running:
* Ok(None)

After process has been quit:
Ok(Some(ExitStatus(ExitStatus(256))))

Would that be ok? I'm not sure how to manage to have different values for the status before the process has been started and while it is running. Maybe that's not possible for ExitStatus, or we just use Err() in case it's unknown.
Flags: needinfo?(james)
Yes, that seems fine. One could design the API so you couldn't request this until the process had started, which would make more sense, but I guess that's not worthwhile.
Flags: needinfo?(james)
Blocks: 1391605
Blocks: 1398057
Comment on attachment 8906741 [details]
Bug 1388251 - Bump mozrunner crate to version 0.5.0.

https://reviewboard.mozilla.org/r/178470/#review183456
Attachment #8906741 - Flags: review?(james) → review+
Comment on attachment 8906743 [details]
Bug 1388251 - Cancel connection attempts if process is not running.

https://reviewboard.mozilla.org/r/178474/#review183474

::: testing/geckodriver/src/marionette.rs:1335
(Diff revision 1)
>          let poll_attempts = timeout / poll_interval;
>          let mut poll_attempt = 0;
>  
>          loop {
> +            if let &mut Some(ref mut runner) = browser {
> +                if let Ok(Some(status)) = runner.status() {

Probably want to exit if it returns `Err` too. So this condition should probably be

```
if runner.status() != Ok(None) {
    return Err(WebDriverError::new(
               ErrorStatus::UnknownError,
               format!("Process ended with status {}", runner.status()
               .ok()
               .and_then(|x| x)
               .and_then(|x| x.code())
               .map(|x| x.to_string())
               .unwrap_or("{unknown}".into()));
    
}
```

Which also can't panic.
Attachment #8906743 - Flags: review?(james) → review-
Comment on attachment 8906743 [details]
Bug 1388251 - Cancel connection attempts if process is not running.

https://reviewboard.mozilla.org/r/178474/#review183474

> Probably want to exit if it returns `Err` too. So this condition should probably be
> 
> ```
> if runner.status() != Ok(None) {
>     return Err(WebDriverError::new(
>                ErrorStatus::UnknownError,
>                format!("Process ended with status {}", runner.status()
>                .ok()
>                .and_then(|x| x)
>                .and_then(|x| x.code())
>                .map(|x| x.to_string())
>                .unwrap_or("{unknown}".into()));
>     
> }
> ```
> 
> Which also can't panic.

With slight updates of that it's working now.
Comment on attachment 8906743 [details]
Bug 1388251 - Cancel connection attempts if process is not running.

https://reviewboard.mozilla.org/r/178474/#review183856
Attachment #8906743 - Flags: review?(james) → review+
Comment on attachment 8906742 [details]
Bug 1388251 - Vendor in mozrunner 0.5.0.

https://reviewboard.mozilla.org/r/178472/#review183862
Attachment #8906742 - Flags: review?(james) → review+
Comment on attachment 8907164 [details]
Bug 1388251 - Updated geckodriver changelog for process handling changes.

https://reviewboard.mozilla.org/r/178850/#review183910
Attachment #8907164 - Flags: review?(james) → review+
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c550a576fb95
Bump mozrunner crate to version 0.5.0. r=jgraham
https://hg.mozilla.org/integration/autoland/rev/b77eae409e17
Vendor in mozrunner 0.5.0. r=jgraham
https://hg.mozilla.org/integration/autoland/rev/f8673bffc09a
Cancel connection attempts if process is not running. r=jgraham
https://hg.mozilla.org/integration/autoland/rev/557720a6eb31
Updated geckodriver changelog for process handling changes. r=jgraham
You need to log in before you can comment on or make changes to this bug.