Write a TCP socket class that handles DNS queries and web-proxy
Categories
(Core :: Networking, enhancement, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox71 | --- | fixed |
People
(Reporter: bwc, Assigned: bwc)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged])
Attachments
(17 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
To simplify the webrtc networking code, and provide a tool for fixing bug 1385030, bug 1421240, and bug 1189055, it would make sense to have a TCP socket implementation that takes an FQDN/port/IP protocol (the destination), and an SSL flag. This implementation would determine whether it should connect via a web-proxy (based on proxy config), do DNS A or AAAA lookups if needed, and then connect to the destination.
dom::TCPSocket could be modified to do this stuff, or we could create a new class (perhaps wrapping TCPSocket) that does this stuff.
It would also be nice for this API to be usable from any process (and have it automatically do IPC when needed), but that is not strictly necessary. It would need to have an IPC wrapper, at least.
Honza, which approach do you think would be more fruitful; augmenting dom::TCPSocket, or creating something new? On first glance, I am leaning toward augmenting dom::TCPSocket.
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 1•6 years ago
|
||
I think that if we were to use TCPSocket, all we would need to add is the web proxy support. It looks like it already handles FQDN, and upon further thought, we probably do not need any control over whether we resolve A or AAAA. The w3c spec for TCPSocket doesn't mention web proxies, so I'm guessing we would not want to use web proxy support by default.
Assignee | ||
Comment 2•6 years ago
|
||
One potential sticking point; TCPSocket works on the socket process currently, but it is unclear whether things like nsIProtocolProxyService work on the socket process. If they do not, we will likely need to use the PProxyLookup.ipdl that baku is putting together over in bug 1567892.
Assignee | ||
Comment 3•6 years ago
|
||
Taking this, since I am working on a strawman implementation.
![]() |
||
Comment 4•6 years ago
|
||
Hi, sorry for late answer.
The terminal API for WebRTC is something more up to you to decide, TCPSocket is probably fine.
My old idea for the necko changes is somewhat vague. I wanted to get rid of the 15+ years old and obsolete scheduling internal mess and rather just async-provide a stream-like or socket-like interface to dispatch any application level protocols on and not care about what it is - an end to end socket, proxy tunnel... it would clean things up and allow us to do optimizations better and easily support way more use cases, like this one.
To build this is a project and not small. So let's go incrementally.
What you need is similar to how websockets are created, except that any HTTP preflight is only needed for http(2) proxies (as you say in comment 1). Right now, all the preparations for a connection happen in nsHttpChannel, parent process. We resolve the proxy and host names somewhat in parallel, then pre-connect. If there is a proxy requiring authentication, the channel on the parent process will handle it (may also trigger the authentication dialog.) With the socket process, we will do proxy resolution still on the parent process, proxy auth handling as well; socket connections will be actually happening there including recognition if proxies and servers talk h2 or not (if yes, h2 stack is inserted, going to also live on socket process).
Good is that the data connection itself will fully live in the socket process, so, after all the establishing steps are done (like nsHttpChannels gave up the created socket/tunnel and went away) we can IPC from the socket process to the originally requesting process.
So, the process you will ask for a connection will be the parent process. When established we can then use PBackground or end-points to link the socket process and the requesting content process directly. The TCPSocket on the parent will look for a proxy for the URL in question. What URL you will use is up to you - tcp://example.com:12345
?. If an 'http' or 'https' proxy is setup for that url, you will have to use http channel, created as 'proxied channel' [1] with the already resolved proxy info assigned directly on it and with 'connect only' flag set (introduced in bug 1203503, bug 1416220, I believe you know this well enough). Then, it's up to necko internals to provide (subject to figure out how exactly) an object implementing some suitable API to easily plug it in to TCPSocket.
If you are fine with the actual socket live on the parent process (full socket process is still in dev) then it may be easy, the result of 'connect only' is an actual socket. If there is an h2 proxy we may use the same code we already have for ws-over-h2, but I don't know that code that well.
Updated•6 years ago
|
Assignee | ||
Comment 5•6 years ago
|
||
Ok. So what I am putting together is mostly a refactoring of dom::TCPSocket into a class that provides the core functionality of dom::TCPSocket (including any necessary IPC), and a subclass that has the DOM hooks to satisfy the webidl interface for TCPSocket. Webrtc and TCPSocketParent will use the base-class with its core functionality. There will also be some cleanup and simplification that I did on the way.
Given that I am creating a base-class that isn't involved with DOM, should that live in netwerk? Maybe call it nsTCPSocket? RIght now it is named TCPSocketBase and lives in TCPSocket.h/cpp, but I want to ask before moving it somewhere else.
![]() |
||
Comment 6•6 years ago
|
||
Yeah, the base class should probably live in netwerk/base, yes.
Assignee | ||
Comment 7•6 years ago
|
||
One of the larger snags I've hit is the fact that the socket process does not seem to have a cycle-collector, which causes assertions when I try to use things like dom::TCPSocket. dom::TCPSocket is cycle-collected because it registers itself as a listener on lots of its members, creating lots of cycles. The nsTCPSocket class will have this property also, and I don't see any easy way to avoid it. We could try to add some Destroy() API to nsTCPSocket that would deregister these callbacks (thus breaking the cycles), but it is unclear when dom::TCPSocket would call this API. Maybe dom::TCPSocket could hold a weak reference to the nsTCPSocket, but that seems awkward.
Is there a way we could enable the cycle collector on the socket process? Or is there some way around this mess that I am missing? Are there other things that we will eventually need the cycle collector for as we migrate necko stuff to the socket process?
Comment 8•6 years ago
|
||
I am afraid that there is no easy way to enable cycle collector on the socket process, since we only initialize minimal XPCOM on socket process. As far as I can tell, there is no stuff in necko that needs the cycle collector.
Maybe the best way is to let dom::TCPSocket hold a strong reference to nsTCPSocket. We just have to break the cycle manually.
![]() |
||
Updated•6 years ago
|
Assignee | ||
Comment 9•6 years ago
|
||
So, there's another snag; the API that TCPSocket uses internally is not usable on the socket process yet, and the IPC protocol that TCPSocket uses is not usable on the socket process either. I guess it wouldn't be too hard to allow PTCPSocket to be used between the socket/parent process, and that's something I can do. But for future reference, does anyone have an idea how hard would it be to get the necko stuff that TCPSocket uses working on the socket process?
Comment 10•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #9)
So, there's another snag; the API that TCPSocket uses internally is not usable on the socket process yet, and the IPC protocol that TCPSocket uses is not usable on the socket process either. I guess it wouldn't be too hard to allow PTCPSocket to be used between the socket/parent process, and that's something I can do. But for future reference, does anyone have an idea how hard would it be to get the necko stuff that TCPSocket uses working on the socket process?
Could you be more specific about which API not working?
I guess TCPSocket only uses these three objects to manipulate the tcp socket.
![]() |
||
Comment 11•6 years ago
|
||
I think all have been answered during a f2f meeting.
- the code will live on the parent process for now, sufficient for fulfilling the goal and simple to implement
- we still need to use an http channel in the 'connect only' mode to tunnel via 'http' and 'https' proxies
Assignee | ||
Comment 12•5 years ago
|
||
Patches are basically cleaned up now, and now it is clear that there's yet another missing piece; an IPC channel between the socket and parent processes. This is necessary if we're going to use the TCP socket class on the socket process, because the socket process does not support a lot of the API it uses. Has anyone started an effort like this already that I could pick up and finish?
Comment 13•5 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #12)
Patches are basically cleaned up now, and now it is clear that there's yet another missing piece; an IPC channel between the socket and parent processes. This is necessary if we're going to use the TCP socket class on the socket process, because the socket process does not support a lot of the API it uses. Has anyone started an effort like this already that I could pick up and finish?
No, no one in necko team has enough cycle to do this.
Would allowing PTCPSocket to be used between parent and socket process be enough?
Assignee | ||
Comment 14•5 years ago
|
||
(In reply to Kershaw Chang [:kershaw] from comment #13)
(In reply to Byron Campen [:bwc] from comment #12)
Patches are basically cleaned up now, and now it is clear that there's yet another missing piece; an IPC channel between the socket and parent processes. This is necessary if we're going to use the TCP socket class on the socket process, because the socket process does not support a lot of the API it uses. Has anyone started an effort like this already that I could pick up and finish?
No, no one in necko team has enough cycle to do this.
Would allowing PTCPSocket to be used between parent and socket process be enough?
Implementing something like PSocketProcessBridge for a socket/parent IPC link would be enough for my purposes, yes. I can implement that, I think, but was just wondering if anyone had started on something like that on a project branch or something.
Comment 15•5 years ago
|
||
We have implemented a PBackground ipc channel [1] between parent and socket process.
Not sure if this is what you need.
[1] https://hg.mozilla.org/projects/larch/rev/af35c413ac4ae3eb165399fc210a391095e00be9
Assignee | ||
Comment 16•5 years ago
|
||
So I got this mostly working, but I was really unhappy with some of the things I needed to do in the refactoring to make it usable without the cycle collector. I have tried an alternate approach that is much cleaner, that involves enhancing WebrtcProxyChannel to handle plain TCP connections.
Assignee | ||
Comment 17•5 years ago
|
||
Assignee | ||
Comment 18•5 years ago
|
||
Assignee | ||
Comment 19•5 years ago
|
||
Assignee | ||
Comment 20•5 years ago
|
||
Depends on D45096
Assignee | ||
Comment 21•5 years ago
|
||
Depends on D45097
Assignee | ||
Comment 22•5 years ago
|
||
Depends on D45098
Assignee | ||
Comment 23•5 years ago
|
||
Depends on D45099
Assignee | ||
Comment 24•5 years ago
|
||
Depends on D45100
Assignee | ||
Comment 25•5 years ago
|
||
Depends on D45101
Assignee | ||
Comment 26•5 years ago
|
||
Depends on D45102
Assignee | ||
Comment 27•5 years ago
|
||
Depends on D45103
Assignee | ||
Comment 28•5 years ago
|
||
Depends on D45104
Assignee | ||
Comment 29•5 years ago
|
||
Assignee | ||
Comment 30•5 years ago
|
||
Depends on D45105
Assignee | ||
Comment 31•5 years ago
|
||
Depends on D45111
Assignee | ||
Comment 32•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 33•5 years ago
|
||
Depends on D45112
Assignee | ||
Comment 34•5 years ago
|
||
Assignee | ||
Comment 35•5 years ago
|
||
Depends on D45186
Assignee | ||
Comment 36•5 years ago
|
||
Updated•5 years ago
|
Assignee | ||
Comment 37•5 years ago
|
||
Assignee | ||
Comment 38•5 years ago
|
||
Depends on D45289
Assignee | ||
Comment 39•5 years ago
|
||
Assignee | ||
Comment 40•5 years ago
|
||
Assignee | ||
Comment 41•5 years ago
|
||
Try looks good. I am going to run a -u all try push now, because I've touched more than just webrtc stuff here.
Assignee | ||
Comment 42•5 years ago
|
||
Assignee | ||
Comment 43•5 years ago
|
||
Assignee | ||
Comment 44•5 years ago
|
||
Assignee | ||
Comment 45•5 years ago
|
||
Assignee | ||
Comment 46•5 years ago
|
||
Assignee | ||
Comment 47•5 years ago
|
||
Depends on D45562
Assignee | ||
Comment 48•5 years ago
|
||
Assignee | ||
Comment 49•5 years ago
|
||
Assignee | ||
Comment 50•5 years ago
|
||
Depends on D46032
Assignee | ||
Comment 51•5 years ago
|
||
Assignee | ||
Comment 52•5 years ago
|
||
Assignee | ||
Comment 53•5 years ago
|
||
Assignee | ||
Comment 54•5 years ago
|
||
Try run focused on webrtc looks good. Here's a -u all try run, since I've touched more than just webrtc code.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b03c3c7ab9f8c5040104735f65fcf1e178042b36
Assignee | ||
Comment 55•5 years ago
|
||
No unusual failures in that -u all try push.
Comment 56•5 years ago
|
||
Comment 57•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/2b7dc87046da
https://hg.mozilla.org/mozilla-central/rev/82af7b115f60
https://hg.mozilla.org/mozilla-central/rev/c684f79d8c67
https://hg.mozilla.org/mozilla-central/rev/6059845767eb
https://hg.mozilla.org/mozilla-central/rev/ed1ae6a85255
https://hg.mozilla.org/mozilla-central/rev/f5353ea9f1df
https://hg.mozilla.org/mozilla-central/rev/b879a0044cf3
https://hg.mozilla.org/mozilla-central/rev/b83124cf3a89
https://hg.mozilla.org/mozilla-central/rev/da41f0fab427
https://hg.mozilla.org/mozilla-central/rev/bd833f5afa30
https://hg.mozilla.org/mozilla-central/rev/bf207998251a
https://hg.mozilla.org/mozilla-central/rev/940d7f787bea
https://hg.mozilla.org/mozilla-central/rev/c43589fd0f90
https://hg.mozilla.org/mozilla-central/rev/6e1eba5c88ad
https://hg.mozilla.org/mozilla-central/rev/8196045d38b5
https://hg.mozilla.org/mozilla-central/rev/cf23522758d5
https://hg.mozilla.org/mozilla-central/rev/cd286d53c121
Comment 58•5 years ago
|
||
Hi, I am wondering if it can also be inserted in "Blocks:" for
https://bugzilla.mozilla.org/show_bug.cgi?id=1239006
?
Comment 62•5 years ago
|
||
Because of the number of patches here, I'm unclear on how big this change actually is. We have a customer that is experiencing
https://bugzilla.mozilla.org/show_bug.cgi?id=1505329
and this fixes it and they were wondering the likelihood of getting this on the ESR.
Is this an extremely large change? Is it scoped well?
Assignee | ||
Comment 63•5 years ago
|
||
It is a big change, yes. I'm not comfortable uplifting this to ESR.
Description
•