Periodically unable to connect Browser Toolbox, despite numerous process restarts, or toolbox split with error message shown on the left
Categories
(DevTools :: General, defect, P1)
Tracking
(firefox-esr68 unaffected, firefox75 wontfix, firefox76 wontfix, firefox77 fixed)
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox75 | --- | wontfix |
firefox76 | --- | wontfix |
firefox77 | --- | fixed |
People
(Reporter: mconley, Assigned: ochameau)
References
(Blocks 1 open bug)
Details
(Keywords: dogfood, regression)
Attachments
(3 files)
STR:
- Launch the Browser Toolbox
ER:
The Browser Toolbox should attach.
AR:
The Browser Toolbox shows a "Browser Toolbox connection status" error message. See screenshot (I can't seem to cut and paste the error text, frustratingly).
Sometimes when I see this error, I can restart the Browser Toolbox, and it attaches properly. This time, however, no matter how many times I restart the Browser Toolbox, it shows me the same error message.
Reporter | ||
Comment 1•5 years ago
|
||
I'll also note that restarting the Firefox instance doesn't appear to help. The only thing I've ever found that helps is restarting the entire OS, which seems suboptimal.
Comment 2•5 years ago
|
||
Comment 3•5 years ago
•
|
||
Same problem, probably for 2 - 3 months now. Only OS restart fixes it. No Firefox's processes are otherwise running.
Re Morgan's screenshot: If toolbox fails to connect, in most cases it will have a full page error dump. Only very rarely I get Morgan's split error toolbox state and recently it only shows up for a moment and then recovers.
Comment 4•5 years ago
•
|
||
I'm running into a (related?) similar issue, which I'll document on this bug too, though it might be unrelated. Note restarting firefox and restarting my OS had no effect.
STR:
- Launch https://www.mozilla.org/media/contentcards/img/home-2019/card_2/guide-high-res.a12bff1c92ab.jpg
- Open browser toolbox with CMD + Option + Shift + I
Expected:
Browser toolbox launches, attached to the current active window
Actual:
See screenshot browser_toolbox_split.png [Window titled "Multiprocess Browser Toolbox". The left half of the window is white, with JSON text titled "Browser Toolbox connection status". The JSON below the heading indicates the connection is successful. The right half of the window shows the normal browser toolbox. The left "pane" is not collapsible and does not vanish with use of the toolbox.]
My current nightly is from here: https://hg.mozilla.org/mozilla-central/rev/2f6870dd1b99edaba1de4d2aa97f3910a640c5bf
I'm running OSX 10.15.3
I also initially thought this might be a fission thing, but I disabled fission.autostart and got the same results.
Comment 5•5 years ago
|
||
Alex, any idea if Multiprocess Toolbox work could have regressed this or any other leads?
Updated•5 years ago
|
Assignee | ||
Comment 7•5 years ago
|
||
(In reply to :Harald Kirschner :digitarald from comment #5)
Alex, any idea if Multiprocess Toolbox work could have regressed this or any other leads?
I couldn't think about anything in particular, but Nicolas just hit an exception and was able to capture it.
And it is related to fission work.
console.error: "Error while calling actor 'root's method 'listRemoteFrames'" "currentContext.getChildren is not a function"
console.error: "listRemoteFrames@resource://devtools/server/actors/root.js:650:45\nasync*handler@resource://devtools/shared/protocol/Actor.js:154:37\nonPacket@resource://devtools/server/devtools-server-connection.js:380:58\n_onJSONObjectReady/<@resource://devtools/shared/transport/transport.js:464:22\nexports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22\nexports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22\n"
console.error: "Exception while opening the toolbox" "Error: Protocol error (unknownError): currentContext.getChildren is not a function" (new Error("Protocol error (unknownError): currentContext.getChildren is not a function", "resource://devtools/shared/protocol/Front.js", 294))
onPacket/<@resource://devtools/shared/protocol/Front.js:294:31
DevTools RDP*request@resource://devtools/shared/protocol/Front.js:232:14
generateRequestMethods/</frontProto[name]@resource://devtools/shared/protocol/Front/FrontClassWithSpec.js:49:19
listRemoteFrames@resource://devtools/shared/fronts/targets/browsing-context.js:132:33
listen@resource://devtools/shared/resources/target-list.js:90:42
startListening@resource://devtools/shared/resources/target-list.js:324:47
async*open/<@resource://devtools/client/framework/toolbox.js:754:29
open@resource://devtools/client/framework/toolbox.js:899:18
createToolbox@resource://devtools/client/framework/devtools.js:622:19
async*showToolbox@resource://devtools/client/framework/devtools.js:509:35
openToolbox@chrome://devtools/content/framework/browser-toolbox/window.js:205:30
connect@chrome://devtools/content/framework/browser-toolbox/window.js:112:9
async*@chrome://devtools/content/framework/browser-toolbox/window.js:162:13
EventListener.handleEvent*@chrome://devtools/content/framework/browser-toolbox/window.js:140:8
It looks like in this new code:
https://searchfox.org/mozilla-central/rev/49ed791eec93335abfe6c2880f84c324e73e47e6/devtools/server/actors/root.js#649
currentContext
is sometimes undefined.
Reporter | ||
Comment 8•5 years ago
|
||
I'll note that I haven't been able to properly use the Browser Toolbox for several weeks now. Even after restarts, I seem to enter this state pretty regularly.
Comment 9•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #8)
I'll note that I haven't been able to properly use the Browser Toolbox for several weeks now. Even after restarts, I seem to enter this state pretty regularly.
+1 I haven't been able to use either (since reporting here) :(
Comment 10•5 years ago
|
||
(In reply to Alexandre Poirot [:ochameau] from comment #7)
(In reply to :Harald Kirschner :digitarald from comment #5)
Alex, any idea if Multiprocess Toolbox work could have regressed this or any other leads?
I couldn't think about anything in particular, but Nicolas just hit an exception and was able to capture it.
And it is related to fission work.console.error: "Error while calling actor 'root's method 'listRemoteFrames'" "currentContext.getChildren is not a function" console.error: "listRemoteFrames@resource://devtools/server/actors/root.js:650:45\nasync*handler@resource://devtools/shared/protocol/Actor.js:154:37\nonPacket@resource://devtools/server/devtools-server-connection.js:380:58\n_onJSONObjectReady/<@resource://devtools/shared/transport/transport.js:464:22\nexports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22\nexports.makeInfallible/<@resource://devtools/shared/ThreadSafeDevToolsUtils.js:111:22\n" console.error: "Exception while opening the toolbox" "Error: Protocol error (unknownError): currentContext.getChildren is not a function" (new Error("Protocol error (unknownError): currentContext.getChildren is not a function", "resource://devtools/shared/protocol/Front.js", 294)) onPacket/<@resource://devtools/shared/protocol/Front.js:294:31 DevTools RDP*request@resource://devtools/shared/protocol/Front.js:232:14 generateRequestMethods/</frontProto[name]@resource://devtools/shared/protocol/Front/FrontClassWithSpec.js:49:19 listRemoteFrames@resource://devtools/shared/fronts/targets/browsing-context.js:132:33 listen@resource://devtools/shared/resources/target-list.js:90:42 startListening@resource://devtools/shared/resources/target-list.js:324:47 async*open/<@resource://devtools/client/framework/toolbox.js:754:29 open@resource://devtools/client/framework/toolbox.js:899:18 createToolbox@resource://devtools/client/framework/devtools.js:622:19 async*showToolbox@resource://devtools/client/framework/devtools.js:509:35 openToolbox@chrome://devtools/content/framework/browser-toolbox/window.js:205:30 connect@chrome://devtools/content/framework/browser-toolbox/window.js:112:9 async*@chrome://devtools/content/framework/browser-toolbox/window.js:162:13 EventListener.handleEvent*@chrome://devtools/content/framework/browser-toolbox/window.js:140:8
It looks like in this new code:
https://searchfox.org/mozilla-central/rev/49ed791eec93335abfe6c2880f84c324e73e47e6/devtools/server/actors/root.js#649
currentContext
is sometimes undefined.
Maybe we could more gracefully fail in that case in the meantime? Maybe filtering out null browsingContexts before pushing them onto contextsToWalk in https://searchfox.org/mozilla-central/rev/d69ec052bed8700af7a283e37b60b4af22734930/devtools/server/actors/root.js#604 would work? Given Comments 8 and 9 that would be helpful.
Also Mike, do you have any idea why we might be getting a null browsingContext on that line above? Like, in what case a browser element in in the DOM browser.xhtml wouldn't have a browsingContext.
Comment 11•5 years ago
|
||
Mike or Morgan, if you don't mind testing locally and you could add .filter(browsingContext => browsingContext)
right after .map(browser => browser.browsingContext)
at https://searchfox.org/mozilla-central/rev/d69ec052bed8700af7a283e37b60b4af22734930/devtools/server/actors/root.js#604 to see if that helps based on Comment 10.
Reporter | ||
Comment 12•5 years ago
|
||
(In reply to Brian Grinstead [:bgrins] from comment #11)
Mike or Morgan, if you don't mind testing locally and you could add
.filter(browsingContext => browsingContext)
right after.map(browser => browser.browsingContext)
at https://searchfox.org/mozilla-central/rev/d69ec052bed8700af7a283e37b60b4af22734930/devtools/server/actors/root.js#604 to see if that helps based on Comment 10.
I'm afraid that didn't change the behaviour.
(In reply to Brian Grinstead [:bgrins] from comment #10)
Also Mike, do you have any idea why we might be getting a null browsingContext on that line above? Like, in what case a browser element in in the DOM browser.xhtml wouldn't have a browsingContext.
The only situation I can think of where a <browser> might not have a BrowsingContext is if it's not yet been assigned a FrameLoader / been attached to the DOM. Nika, are there other cases I haven't thought of?
Assignee | ||
Comment 13•5 years ago
•
|
||
Sorry, this fell through the cracks...
I'm activity working on bug 1593937 which will get rid of this code.
Similar issue might happen in the new code, so it would be handy to better understand what is wrong here.
Now that I read the exception more carefully "currentContext.getChildren is not a function"
currentContext
isn't undefined, but it is rather the getChildren
attribute which is not defined.
I'm wondering if Nicolas only hit a rebase issue, when testing Firefox in between two builds when getChildren
has been renamed to children
.
If that is that, the issue may not be related to this code after all.
Mike, Morgan, are you seeing any error in your terminal? Or in the Browser Console?
Assignee | ||
Comment 14•5 years ago
|
||
Please correct me if I'm wrong, but I think the issue isn't about an issue around BrowsingContext.
At least we can't say that from comment 7 exception, which might just be a rebase issue.
So, I'm reseting the unecessary neeinfo on nika.
Assignee | ||
Comment 15•5 years ago
|
||
I tried really hard reproducing but can't find any way to make it fail.
I'm wondering if this is specific to OSX and might be about processes. i.e. how we execute the browser toolbox process.
Are you all on OSX? Any particular preference, page or window opened?
Comment 16•5 years ago
|
||
I do see it from time to time too, I often only have https://nchevobbe.github.io/demo/console-test-app.html open on my main Firefox window.
Bug 1591590 is a test intermittent about the browser toolbox, and the screenshots indicate we get into that weird state there too.
Reporter | ||
Comment 17•5 years ago
|
||
Yes, I'm on macOS. I can reproduce this even with a new profile, and no other Firefox instance running.
Comment 18•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #8)
I'll note that I haven't been able to properly use the Browser Toolbox for several weeks now. Even after restarts, I seem to enter this state pretty regularly.
Same here.
Reporter | ||
Comment 19•5 years ago
|
||
Comment 20•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #19)
(In reply to Dão Gottwald [::dao] from comment #18)
Same here.
Is this on your Linux machine?
Yeah, Ubuntu 19.10.
Reporter | ||
Comment 21•5 years ago
|
||
Okay, I decided to roll up my sleeves and look at this, because I really miss using this tool.
I think I've made some headway. Here's what I know:
- The server seems to launch correctly.
- The problem appears to be in the toolbox profile that launches - the connection attempt fails, despite the server running
- If I Telnet to the server port locally, I get the following:
mconleys-MBP:~ mikeconley$ telnet
telnet> open localhost 54166
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
304:{"from":"root","applicationType":"browser","testConnectionPrefix":"server1.conn1.","traits":{"sources":true,"networkMonitor":true,"storageInspector":true,"wasmBinarySource":true,"bulk":true,"webConsoleCommands":true,"allowChromeProcess":true,"heapSnapshots":true,"perfActorVersion":1,"watchpoints":true}}Connection closed by foreign host.
Interestingly, the first attempt that Telnet makes is refused - presumably, because it's trying to connect via an ipv6 address? And then it falls back to ipv4... I think.
So I was able to get things working locally by monkeypatching this line here:
from:
const host = Prefs.chromeDebuggingHost;
to
const host = "127.0.0.1";
and now it works?
I'm not sure why this would kind of thing would only rear itself after some time after the OS has been running. Maybe someone from the Necko team could comment, because this looks more like a local networking issue that our DevTools is stumbling on.
Reporter | ||
Comment 22•5 years ago
|
||
FWIW, I have the following in my /etc/hosts
file:
127.0.0.1 localhost
127.0.0.1 localhost2
127.0.0.1 localhost3
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
Comment 23•5 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #22)
FWIW, I have the following in my
/etc/hosts
file:127.0.0.1 localhost 127.0.0.1 localhost2 127.0.0.1 localhost3 255.255.255.255 broadcasthost ::1 localhost fe80::1%lo0 localhost
I normally expect only ipv4 addresses for localhost, an ipv6 for ip6-localhost.
But in any case, I don't know where we create the server socket, but I suspect we only bind it to the IPv4 address?
Normally if we get an IPv6 address for localhost we should fallback to IPv4 if there's an error, but maybe that doesn't work?
There's a lot of stuff in the devtools socket implementation, so I'm not sure how that all works, but I assume we could add a
if (useIPv4Only) {
s.connectionFlags |= nsISocketTransport.DISABLE_IPV6;
}
around here somewhere.
Reporter | ||
Comment 25•5 years ago
|
||
I'll note that commenting this line out from my /etc/hosts/
file seems to do the job:
::1 localhost
now the Browser Toolbox works without me needing to patch it. \o/
I have no idea when or why I added that /etc/hosts
line, tbh.
Comment 26•5 years ago
|
||
Just for reference, this is my hosts file for macOS, but I didn't have the connection issue in the past 4+ weeks.
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
Comment 27•5 years ago
|
||
(In reply to :Harald Kirschner :digitarald from comment #26)
Just for reference, this is my hosts file for macOS, but I didn't have the connection issue in the past 4+ weeks.
127.0.0.1 localhost 255.255.255.255 broadcasthost ::1 localhost fe80::1%lo0 localhost
Seems like this is a mac thing? Either way, I think adding the DISABLE_IPV6 flag should fix this.
Updated•5 years ago
|
Assignee | ||
Comment 28•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 29•5 years ago
|
||
It would be super nice if someone able to reproduce this issue could apply this patch and confirm that it actually address it!
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 30•5 years ago
|
||
Hi Mike, could you apply this patch whenever you reproduce the issue so that we can confirm this patch is actually effectice?
Reporter | ||
Comment 31•5 years ago
|
||
I reverted my /etc/hosts fix, and was able to reproduce the issue. I then applied the patch, and the issue appears to be solved with it! Thanks!
Comment 32•5 years ago
|
||
Assignee | ||
Comment 33•5 years ago
|
||
Thanks a lot for testing!
Dão, Morgan, if you still see any issue after this bug lands, please let us know!
Comment 34•5 years ago
|
||
bugherder |
Comment 35•5 years ago
|
||
The patch didn't help over here on Linux. My hosts
file has ::1
for ip6-localhost
but not for localhost
.
Updated•5 years ago
|
Comment 36•5 years ago
|
||
The patch landed in nightly and beta is affected.
:ochameau, is this bug important enough to require an uplift?
If not please set status_beta
to wontfix
.
For more information, please visit auto_nag documentation.
Comment 37•5 years ago
|
||
I just ran into this again on OSX, with 77.0a1 (2020-04-21) (64-bit)
.
Curiously I was able to open another toolbox on the failed toolbox to copy the error out:
Exception { name: "NS_ERROR_CONNECTION_REFUSED", message: "Component returned failure code: 0x804b000d (NS_ERROR_CONNECTION_REFUSED) [nsIInputStream.available]", result: 2152398861, filename: "resource://devtools/shared/security/socket.js", lineNumber: 363, columnNumber: 0, data: null, stack: "onInputStreamReady@resource://devtools/shared/security/socket.js:363:18\n", location: XPCWrappedNative_NoHelper }
window.js:173:15
Updated•5 years ago
|
Updated•5 years ago
|
Comment 38•5 years ago
|
||
Alex, there are reports indicating that the following bug isn't fully fixed (comment #35 and comment #37), so I am reopening it.
Honza
Comment 39•5 years ago
|
||
I just hit this error on Nightly due to a version mismatch from a Nightly update but I believe that should be handled in a separate bug (maybe even to just give a better error message). Restarting Firefox fixes it in that case.
Otherwise I haven't seen the error on macOS since this was fixed, definitely not the case where I need to restart my whole computer to fix it.
Assignee | ||
Comment 40•4 years ago
|
||
I might have found another exception leading to a broken browser toolbox, bug 1644772.
I'm closing this bug as the ipv6 issue is fixed. Let's reopen new bugs from now as this bug has sailed on 77.
Description
•