Closed Bug 1460651 Opened 2 years ago Closed 2 years ago

Add additional debug logging in Marionette server for Firefox startup

Categories

(Testing :: Marionette, enhancement, P2)

Version 3
enhancement

Tracking

(firefox62 fixed)

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: whimboo, Assigned: whimboo)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

We see hangs and delays during startup of Firefox which cause timeouts of various tests. As such I would like to add some more debug logging, which could give us a better idea where exactly the hang might happen.
hg error in cmd: hg pull gecko -r 7606454bb62057379acff4889251c5a42b7dc985: pulling from https://reviewboard-hg.mozilla.org/gecko
abort: HTTP Error 504: Gateway Timeout
Attachment #8974771 - Flags: review?(ato)
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249120

I am not a great fan of adding all this logging for debugging the
slow startup problem to the official Firefox that we ship to users.
Do I misunderstand something about why you can’t add this to a local
build and run it on try?  Why do we need this in official builds?
Attachment #8974771 - Flags: review?(ato) → review-
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249120

As other bugs have shown we never hit this problem on try, nevertheless how many jobs we retriggered. So we never made progress here.

Also about which specific logging you are concerned about?

The GFX window log lines basically shouldn't be visible in nearly all the cases, but only if the window really doesn't get closed immediately.

The initialization aborted one we could remove, or maybe make it a trace? I would like to see in which cases we try to initialize Marionette but don't do anything because conditions haven't been met, and which finally could end up in an infinite hang.

The delayed startup log lines we could make trace level, and only leave in until we figured out the problem?
Flags: needinfo?(ato)
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249120

> As other bugs have shown we never hit this problem on try,
> nevertheless how many jobs we retriggered. So we never made
> progress here.

I was under the impression we started seeing this on try in
https://bugzilla.mozilla.org/show_bug.cgi?id=1460193?

> Also about which specific logging you are concerned about?
> 
> The GFX window log lines basically shouldn't be visible in nearly
> all the cases, but only if the window really doesn't get closed
> immediately.
> 
> The initialization aborted one we could remove, or maybe make it
> a trace? I would like to see in which cases we try to initialize
> Marionette but don't do anything because conditions haven't been
> met, and which finally could end up in an infinite hang.
> 
> The delayed startup log lines we could make trace level, and only
> leave in until we figured out the problem?

I have a more general concern that too much logging will drown the
things that are useful when debugging the most common bugs we face.
Marionette is already rather chatty at trace level.  When we add
more logging we need to be sure that what we add actually adds
value.

The new log entries added by this patch appears to be useful for
the task of helping to track down the five second delay at startup,
but once the problem has been identified and fixed they wouldn’t
serve any particular purpose any longer.

I wonder therefore if we can reach a compromise where we add these
logs now, then commit to removing them later when the bug has been
fixed?
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

::: commit-message-76aad:1
(Diff revision 1)
> +Bug 1460651 - Add additional debug logging in Marionette server for Firefox startup.

s/Firefox// since it will also be visible on Fennec.

::: testing/marionette/components/marionette.js:387
(Diff revision 1)
>              break;
>            }
>          }
>  
>          if (this.gfxWindow) {
> +          log.debug("GFX sanity window detected. Waiting until it has been closed...");

If you log this you also want to log when it is closed.

::: testing/marionette/components/marionette.js:387
(Diff revision 1)
>              break;
>            }
>          }
>  
>          if (this.gfxWindow) {
> +          log.debug("GFX sanity window detected. Waiting until it has been closed...");

Can we make the dot a comma?

::: testing/marionette/components/marionette.js:418
(Diff revision 1)
> +      log.debug(`Initialization aborted: running: ${this.running}, ` +
> +                `enabled: ${this.enabled}, final startup: ${this.finalUIStartup}.`);

This is nitpicking, but can we format this differently?

I think this looks better:

> Init aborted (running=false, enabled=true, finalUIStartup=true)

::: testing/marionette/components/marionette.js:431
(Diff revision 1)
>        let startupRecorder = Promise.resolve();
>        if ("@mozilla.org/test/startuprecorder;1" in Cc) {
>          startupRecorder = Cc["@mozilla.org/test/startuprecorder;1"]
>              .getService().wrappedJSObject.done;
>        }
>        await startupRecorder;

You also want to log waiting for the startup recorder.

::: testing/marionette/components/marionette.js:456
(Diff revision 1)
>          Services.startup.quit(Ci.nsIAppStartup.eForceQuit);
>        }
>  
>        env.set(ENV_ENABLED, "1");
>        Services.obs.notifyObservers(this, NOTIFY_RUNNING, true);
> -      log.info(`Listening on port ${this.server.port}`);
> +      log.debug("Remote service is active.");

Drop .

::: testing/marionette/components/marionette.js:470
(Diff revision 1)
>          Preferences.reset(k);
>        }
>        this.alteredPrefs.clear();
> +
>        Services.obs.notifyObservers(this, NOTIFY_RUNNING);
> +      log.debug("Remote service is inactive.");

Drop .

::: testing/marionette/server.js:94
(Diff revision 1)
>  
>          this.port = this.socket.port;
>  
>          this.socket.asyncListen(this);
> -        logger.debug("New connections are accepted");
> +        logger.info(`Listening on port ${this.port} ` +
> +                    "(new connections are accepted)");

I think we should drop the “(new connections are accepted)” part
of this.

::: testing/marionette/server.js:100
(Diff revision 1)
> -      logger.debug("New connections will no longer be accepted");
> +      logger.info(`Stopped listening on port ${this.port} ` +
> +                  "(new connections will no longer be accepted)");

Is this.port not reset?  That seems like a bug.

::: testing/marionette/server.js:101
(Diff revision 1)
>  
>      } else if (this.socket) {
>        this.socket.close();
>        this.socket = null;
> -      logger.debug("New connections will no longer be accepted");
> +      logger.info(`Stopped listening on port ${this.port} ` +
> +                  "(new connections will no longer be accepted)");

I think we should drop the “(new connections will no longer be
accepted)” part.
Flags: needinfo?(ato)
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> If you log this you also want to log when it is closed.

I don't think we need such a line given that we already log the received `domwindowclosed` observer notification.

> This is nitpicking, but can we format this differently?
> 
> I think this looks better:
> 
> > Init aborted (running=false, enabled=true, finalUIStartup=true)

Sure.

> You also want to log waiting for the startup recorder.

See line 425, or what do I miss?

> Is this.port not reset?  That seems like a bug.

Why should we invalidate the port? It should never change.

> I think we should drop the “(new connections will no longer be
> accepted)” part.

Note that an existing connection can still start the server again, but we do not allow new connections from clients. So I think it's ok to keep that part. Or should we hide it?
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249120

I agree, but please note that those new log lines only apply to startup and shutdown. So they are not occurring that often as our message trace log entries.

And yes, we can remove them later if they are not needed anymore and are annoying.
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> See line 425, or what do I miss?

OK, the actual wait doesn’t happen before :428, but this is fine,
I agree.
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> Why should we invalidate the port? It should never change.

The port value will change the next time you start the server again,
which you can do by flipping the marionette.enabled pref.

But in any case, when the TCP listener is shut down you don’t want
to give the impression it is still listening at that port.

BTW I don’t necessarily mean you have to change this as part of
this patch: we could file a bug for it.

> Note that an existing connection can still start the server again, but we do not allow new connections from clients. So I think it's ok to keep that part. Or should we hide it?

You mean, existing connections won’t be disconnected and can still
issue commands: not that they can start the server again.

I think “Stopped listening on port” sort of implies that new new
connections will no longer be accepted.  Though I do still question
whether or not we shouldn’t just disconnect existing connections
here… maybe that’s not possible, I don’t know.
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249600

::: testing/marionette/components/marionette.js:418
(Diff revision 2)
> +      log.debug(`Init aborted (running=${this.running}, ` +
> +                `enabled=${this.enabled}, finalUIStartup=${this.finalUIStartup}`);

You’re missing a final ")" here.
Attachment #8974771 - Flags: review?(ato) → review+
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> OK, the actual wait doesn’t happen before :428, but this is fine,
> I agree.

Actually, the startup recorder doesn’t always run.  Maybe that’s
what we should be recording?  If it isn’t available startupRecorder
gets set to an empty promise.
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> The port value will change the next time you start the server again,
> which you can do by flipping the marionette.enabled pref.
> 
> But in any case, when the TCP listener is shut down you don’t want
> to give the impression it is still listening at that port.
> 
> BTW I don’t necessarily mean you have to change this as part of
> this patch: we could file a bug for it.

Ok, but lets clarify a bit more. The method we are talking about here is stopping/starting the TCP listener based on:

1) Before shutdown of Firfox
2) Switching the enabled state of Marionette
3) As result of the Marionette:AcceptConnection command

The last doesn't change the port and shouldn't because then it won't be possible for the client to connect again. 1) Is not relevant as this time. So only 2) might want to reset it, but completely disabling Marionette also removes all Marionette related code from memory, or not? If yes, it shouldn't matter.

> You mean, existing connections won’t be disconnected and can still
> issue commands: not that they can start the server again.
> 
> I think “Stopped listening on port” sort of implies that new new
> connections will no longer be accepted.  Though I do still question
> whether or not we shouldn’t just disconnect existing connections
> here… maybe that’s not possible, I don’t know.

> You mean, existing connections won’t be disconnected and can still
issue commands: not that they can start the server again.

Correct. We implemented that method to not allow new connections being accepted while eg Firefox is shutting down. The call to `socket.close()` does not affect existing connections. Those will still be alive: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/nsIServerSocket#close(). Maybe we should make that clear by adding a comment in the code.

AFAIR we kept this possibility as last resort, and to not completely loose control of the browser process. I would like to keep that.

Thinking more about it, you may be correct that we can remove the content from inside the brackets.
Comment on attachment 8974771 [details]
Bug 1460651 - [marionette] Add additional debug logging for browser startup.

https://reviewboard.mozilla.org/r/243174/#review249460

> Actually, the startup recorder doesn’t always run.  Maybe that’s
> what we should be recording?  If it isn’t available startupRecorder
> gets set to an empty promise.

Oh, I can move the debug line into the if block then. That makes sense.
Attachment #8975578 - Flags: review?(ato)
Comment on attachment 8975578 [details]
Bug 1460651 - [marionette] Getter for "running" has to always return a boolean.

https://reviewboard.mozilla.org/r/243818/#review249910

::: testing/marionette/components/marionette.js:294
(Diff revision 1)
> -    return this.server && this.server.alive;
> +    if (this.server) {
> +      return this.server.alive;
> +    }
> +
> +    return false;

The previous code is missing a boolean coercion (using not-not).  Your code is equivalent to:

> return !!this.server && this.server.alive;
Attachment #8975578 - Flags: review?(ato) → review+
Pushed by hskupin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f6bea415a984
[marionette] Add additional debug logging for browser startup. r=ato
https://hg.mozilla.org/integration/autoland/rev/407cb30f4200
[marionette] Getter for "running" has to always return a boolean. r=ato
https://hg.mozilla.org/mozilla-central/rev/f6bea415a984
https://hg.mozilla.org/mozilla-central/rev/407cb30f4200
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
You need to log in before you can comment on or make changes to this bug.