[remote-dbg-next] Blank devtools when leaving This Firefox page

VERIFIED FIXED in Firefox 65

Status

defect
P1
normal
VERIFIED FIXED
9 months ago
5 months ago

People

(Reporter: jdescottes, Assigned: jdescottes)

Tracking

(Depends on 1 bug, Blocks 2 bugs)

unspecified
Firefox 65
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox65 fixed, firefox66 verified, firefox67 verified, firefox68 verified)

Details

(Whiteboard: old-remote-debugging-ng-m3)

Attachments

(4 attachments)

Posted image blank_devtools.gif
STRs:
- enable new about:debugging
- go to "about:debugging" (it should open on This Firefox)
- open DevTools on the Console
- navigate away from This Firefox (either go to Connect page in about:debugging or navigate to another URL)
=> DevTools become blank. If you trigger the DevTools shortcut it opens a new DevTools toolbox, but doesn't remove the blank area left by the previous one.
Priority: -- → P2
filter on remote-debugging-next-move-m3-to-m2
filter on remote-debugging-next-move-m3-to-m2
filter on remote-debugging-next-move-m3-to-m2
No longer blocks: remote-debugging-ng-m3
Whiteboard: old-remote-debugging-ng-m3
Assignee: nobody → jdescottes
Status: NEW → ASSIGNED
Priority: P2 → P1
When destroying the Debugger server we are destroying the fronts and panels simultaneously in client/framework/target.js:

    this._destroyer = (async () => {
      // Before taking any action, notify listeners that destruction is imminent.
      this.emit("close");

      for (let [, front] of this.fronts) {
        front = await front;
        await front.destroy();
      }

      [...]

      this._cleanup();
    })();

The toolbox listens to `close` events from the target, so the `this.emit("close")` will initiate the destruction of the UI. However we are not waiting for the destroy() of the toolbox to terminate before moving on and destroying the fronts.

Once we reach the destroy() of the console, the front for the tab has already been destroyed. However the destroy() of the console is still calling a method on this front:

  await this.target.activeTab.focus();
  
(https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/devtools/client/webconsole/webconsole.js#323)

The BrowsingContextTargetFront corresponding to `activeTab` has already been destroyed, which means that this.actorID is null on this front. Sending the request corresponding to focus() will never resume because we won't find a matching actor to send the request to. 

But on top that, the local transports have also already been destroyed. When destroying the server, we close all the DebuggerServerConnection managed by the server, which then close the LocalTransport instances. In protocol.js, there is an attempt to guard against that:

      // The connection might be closed during the promise resolution
      if (this.conn._transport) {
        this.conn._transport.send(packet);
      }

(https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/devtools/shared/protocol.js#1366-1369)

But in this case, this.conn is the DebuggerClient. DebuggerClient is not destroyed yet (it will be the last thing done by the target cleanup) so _transport is still available there. Maybe we should also check that the transport itself is still alive. And we should throw instead of doing nothing.

On about:debugging side, I think we should stop destroying the DebuggerServer in about:debugging.

On the protocol side, we should improve our tests to be more resilient when the server is destroyed:
- check the front is not destroyed before sending requests, and reject/throw if it is
- check the transport is not destroyed before sending requests, and reject/throw if it is

Regarding the order of the destruction, I don't know if we should attempt to fix races here. A lot of the destruction sequence is based on events, so it's difficult to force event emitters to wait for listeners to finish their job. Even more difficult to make sure that different listeners of a single event are executed in a correct order. We might want to review the events involved to see if we should replace them with direct calls in some situations.
Whaa, such a detailed trace in this world of races, thanks for figuring this out in detail!

(In reply to Julian Descottes [:jdescottes][:julian] from comment #4)
> Once we reach the destroy() of the console, the front for the tab has
> already been destroyed. However the destroy() of the console is still
> calling a method on this front:
> 
>   await this.target.activeTab.focus();
>   
> (https://searchfox.org/mozilla-central/rev/
> 0859e6b10fb901875c80de8f8fc33cbb77b2505e/devtools/client/webconsole/
> webconsole.js#323)

Actually, that the source of your troubles, and that is a bad implementation of such feature.
In case of connection loss, this feature won't work, whereas you kind of expect to focus whenever devtools are disconnected, no matter how.
So, any such "cleanup/state reset" requests should not exists. Instead the actor should implement such action on its own from its destroy method. At some point I chased this kind of requests we were doing from the the toolbox, like reconfigure. Ideally we should remove all of them.

Then target, toolbox and panels destruction code should not produce any request. Which hopefully reduces the issues you are seeing here.

> On about:debugging side, I think we should stop destroying the
> DebuggerServer in about:debugging.

I saw that recently while refactoring many tests at once. Some tests are destroying the server, some others aren't.
I wish we could have shared test helper that would do the same thing everywhere.
Actually I started trying to share much more code around setup and teardown in our tests.

> On the protocol side, we should improve our tests to be more resilient when
> the server is destroyed:
> - check the front is not destroyed before sending requests, and reject/throw
> if it is
> - check the transport is not destroyed before sending requests, and
> reject/throw if it is
> 
> Regarding the order of the destruction, I don't know if we should attempt to
> fix races here.

+1

> A lot of the destruction sequence is based on events, so
> it's difficult to force event emitters to wait for listeners to finish their
> job.

I believe we should track and remove all destruction requests, that's the root cause of issues here.
And improving rejection as you just said may help identifying the callsites of these requests.
Depends on D13137. I could use help to write the test in a better.
I believe there is a cleaner way to create the front here?
I also had other suggestions for making the fronts more robust in the bug.
Let me know if you think I should try to investigate them more.
See Also: → 1510690
Pushed by jdescottes@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/641de22d6480
Stop destroying DebuggerServer when moving from This Firefox runtime page;r=daisuke
https://hg.mozilla.org/integration/autoland/rev/2b9aaf2f3b58
Front should throw when sending packet without actorID or destination;r=ochameau
Several issues to fix, here is a new try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c7071ec81d355f0ffb86caaed55b9916a074eb7
Flags: needinfo?(jdescottes)
Attachment #9028064 - Attachment description: Bug 1507125 - Front should throw when sending packet without actorID or destination;r=ochameau → Bug 1507125 - Protocol Front should throw when called after destroy;r=ochameau
Pushed by jdescottes@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f93bf2d53eb
Stop destroying DebuggerServer when moving from This Firefox runtime page;r=daisuke
https://hg.mozilla.org/integration/autoland/rev/dcec04522fb6
Protocol Front should throw when called after destroy;r=ochameau
Here is another try push to rename send/request to _send/_request, will open a bug to follow up on this.
https://hg.mozilla.org/mozilla-central/rev/0f93bf2d53eb
https://hg.mozilla.org/mozilla-central/rev/dcec04522fb6
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 65
Posted image 65.0b7.png
With 65.0b7, I got the attached result(s), but not with 100% reproduction rate;
- both where caused by steps in comment 0;
- navigation using the back-forward arrows did not seem to cause issues anymore;
- for the navigation part, I manually added the links for Youtube and the debug page.

Should we file a new bug for this, or it might not have ended up in the current beta?
Flags: needinfo?(jdescottes)
Thanks!

Your screenshot shows the old about:debugging. This bug is about the new version of about:debugging that you enable by setting devtools.aboutdebugging.new-enabled to true in about:config. Can you please try again with this preference turned on?

I don't think a bug is needed for the old about:debugging, as we are looking at removing this UI altogether.
Flags: needinfo?(jdescottes)
With the new UI:
- while navigating back and forth between the about:config, about:newtab and about:debugging the inspector manifested the issue once again after a few "rotations". (tested just on 65.0b7 on win10 and macOS10.11);
- with 66.0a1(2019-01-03) on win10;  the same scenario, caused the inspector to vanish;

If it's of any help the navigation was done by the prev/next buttons.
Thanks I managed to reproduce as well by really quickly going back and forth between about:newtab and about:debugging. 
Can you file a new bug for this?
Flags: needinfo?(cristian.fogel)
Depends on: 1518095
Sure, marked in the bug above.
Flags: needinfo?(cristian.fogel)

I've verified this issue on several machines: Windows 10, MAC OS X, Ubuntu 16.04 and on following Firefox versions: Nightly 68.0a1, Firefox Developer Edition 67.0, release 66.0.1 - on all machines/Firefox versions the issue won't occur.

Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.