Closed
Bug 264268
Opened 20 years ago
Closed 9 years ago
Implement alternative to PR_NewPollableEvent + PR_Poll for the socket transport service on Win32
Categories
(Core :: Networking, enhancement, P5)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: darin.moz, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: helpwanted)
Implement alternative to PR_NewPollableEvent + PR_Poll for the socket transport service on Win32. this code is notorious for triggering firewall alerts about mozilla opening server sockets. clearly, these are "false positives" since the server socket is really just a loopback, connected socket pair that cannot be abused by an attacker. nonetheless, this warning surely makes users feel uneasy. moreover, there are better ways to implement this socket + 'pollable event' approach under Win32. instead of changing the implementation of PR_NewPollableEvent and PR_Poll in NSPR, it might be easier to simply fork nsSocketTransportService2.cpp. it's less than 600 lines of code to consider. we can probably refactor nsSocketTransportService2.cpp to further reduce the amount of #ifdef'd code.
| Reporter | ||
Updated•20 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.8beta
Comment 1•20 years ago
|
||
changing NSPR would allow other users of this function to share the improvements, though
| Reporter | ||
Comment 2•20 years ago
|
||
WTC and I spoke about doing this in NSPR, and the problem is that there are limits on the number of socket events that can be waited on simultaneously, at least under older versions of Windows. That means that using WaitForMultipleEvents may not be an option in general. Since we already limit the number of sockets created by Necko, Necko seems like the place where we could implement this with the least amount of pain. Someone correct me if I'm wrong.
Comment 3•20 years ago
|
||
WaitForMultipleObjects can only wait for MAXIMUM_WAIT_OBJECTS handles at a time. MAXIMUM_WAIT_OBJECTS is defined to be 64. This value doesn't seem to depend on the Windows version.
Comment 4•20 years ago
|
||
Would IOCP work better for us here? We could fall back to the current mechanism if the system is too old (pre-NT 3.5?); I'm not nearly as worried about the false alarms on older systems, myself. IOCP also scales better than WFMO, which would make it a more appropriate NSPR-level solution, possibly.
| Reporter | ||
Comment 5•20 years ago
|
||
According to MSDN:
> I/O completion ports are the mechanism by which an application uses a pool of
> threads that was created when the application was started to process
> asynchronous I/O requests.
So, IOCP implies the existance of one or more background threads. What we're
looking for here is not the addition of new threads, because we already having a
threading model that we are happy with.
All we really need is a way to wait for any one of the following conditions to
occur:
(1) a socket is ready for us to read, write, etc.
(2) a task (PLEvent) is pending to be run on the socket transport thread
(3) a timeout occured
We implement #2 today by using PR_NewPollableEvent, and then we call PR_Poll on
our sockets plus the event. When we post a PLEvent to our queue, we tickle the
pollable event thus unblocking the call to PR_Poll. That wakes up the socket
thread and it goes an processes the event queue. Based on my testing, this I/O
mechanism is plenty fast. On fast networks, it is not uncommon for necko to
stall with full buffers waiting for gecko to consume incoming data. So, I think
we are doing a fine job of pulling data off the wire.
The idea behind WFMO is that it would allow us to replace the pollable event
with a Win32 Kernel Event object and hence avoid the problems with the pollable
event and firewalls. I would be surprised if WFMO posed a performance problem
for us, given our usage pattern.
My main concern is the call to WSAEventSelect because there is not a clear
mapping between the various PR_POLL flags and the values for the lNetworkEvents
parameter in the 'connect' and 'accept' cases. We'd need a way to express
FD_CONNECT and FD_ACCEPT, I think. Hmm...| Reporter | ||
Comment 6•20 years ago
|
||
no chance that i will get to this in time for gecko 1.8. instead, i plan to delay initialization of the socket transport service. that'll probably happen as part of bug 283049.
| Reporter | ||
Comment 7•20 years ago
|
||
It occured to me that we might be able to eliminate the need for PR_NewPollableEvent by making use of PR_Interrupt.
| Reporter | ||
Comment 8•20 years ago
|
||
According to wtc: There is no way to unblock select() on Windows. So we call select with at most a 5 second timeout, and check the interrupt flag in between. So the latency of PR_Poll's response to PR_Interrupt is 2.5 seconds on average. In other words, using PR_Interrupt is not an option for us in this context.
Comment 9•20 years ago
|
||
can you use a pipe (http://www.mozilla.org/projects/nspr/reference/html/priofnc.html#19605) for this purpose? (Maybe that's what the pollable event impl should use?)
| Reporter | ||
Comment 10•20 years ago
|
||
> can you use a pipe
> ... for this purpose?
No, you cannot because Win32's select() function only works with sockets.
Comment 11•19 years ago
|
||
Following comment #6, and the comment here (about loading Socket and DNS services lazily): http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsIOService.cpp#190 I implemented a mechanism of such lazy loading in nsIOService. This helped, but the SocketTransportService was inited anyway a bit further down the road (with the ZoneAlarm alert, of course): http://lxr.mozilla.org/seamonkey/source/netwerk/protocol/http/src/nsHttpConnectionMgr.cpp#109 (which was trigered by a call to 'nsNavigator::GetPlatform'). So, it seems that the lazy loading of nsSocketTransportService in nsIOService is not the way to go. By the way - the ZoneAlarm alerts are especially suspicious to users, if the user starts a stand alone XULRunner application that by its functionality should never try and connect to the net. Should we maybe CC bsmedberg on this?
| Reporter | ||
Comment 12•19 years ago
|
||
Alex: then we should also lazily load nsHttpConnectionMgr.
Comment 13•19 years ago
|
||
(In reply to comment #12) > Alex: then we should also lazily load nsHttpConnectionMgr. I'm a bit concerned that there are several places out there that get the nsSocketTransportService as a side effect, so fixing one might just bring out the next one. I guess we can try this approach and see where it takes us. I was looking at possible solutions of the original PR_NewPollableEvent problem. I believe that instead of creating a socket and binding it to the loopback, we can create a dummy, unbound socket. Closing it should cause 'select' to return, and it can be recreated after that. What do you think?
Comment 14•19 years ago
|
||
Perhaps this bug should not be tackled by clever coding, but by talking to Zone Labs. We should ask Zone Labs why they warn about a listening socket bound to the loopback address. Because of this bug, I bought and installed the software firewall products of two other vendors (Symantec Norton and McAfee), and neither of them warns about the listening socket created by PR Pollable Event.
Comment 15•19 years ago
|
||
(In reply to comment #14) > Perhaps this bug should not be tackled by clever coding, > but by talking to Zone Labs. We should ask Zone Labs why > they warn about a listening socket bound to the loopback > address. Actually, my Zone Alarm (4.5.538.000) only complains about the local connect request and not about the listening socket. I gueass we could ask Zone Labs, but even if they agree, there will still be many users with older versions. Also, who knows what other firewalls are out there, and what are their policies...
| Reporter | ||
Updated•18 years ago
|
Assignee: darin → nobody
Status: ASSIGNED → NEW
QA Contact: benc → networking
Target Milestone: Future → ---
Comment 16•18 years ago
|
||
Just a note that we are seeing this come up often with people having firewalls warn of opening a server socket. Nom for 1.9 so we keep this on the Radar.
Flags: blocking1.9a2?
Updated•18 years ago
|
Flags: blocking1.9a2? → blocking1.9+
Comment 17•18 years ago
|
||
Ere or Robstrong, are you interested in this? Looking for an owner.
Comment 18•18 years ago
|
||
The chances of my finding the time to do this are very slim.
Comment 19•18 years ago
|
||
I'm interested though I'll need to dig into the sockets stuff to get familiar with them. But how about trying WaitForMultipleObjects in a loop for MAXIMUM_WAIT_OBJECTS objects with 0 timeout?
| Reporter | ||
Comment 20•18 years ago
|
||
Ere, yes that is pretty much the right solution, but the problem is that it is very difficult to implement PR_Poll in terms of WaitForMultipleObjects because WFMO only supports waiting on at most 64 handles, whereas PR_Poll has no such restriction. Two possibilities: 1) make NSPR have a limit on the number of PRPollDesc objects that can be passed to PR_Poll, 2) make nsSocketTransport use Windows directly. Option #1 is fine for Firefox, but probably not fine in general for NSPR. Option #2 is complicated by nsISocketProvider (which is how NSS is plugged in for TLS/SSL support).
Comment 21•18 years ago
|
||
It seems I asked something else I wanted to ask. What I meant is calling WFMO in a loop for MAXIMUM_WAIT_OBJECTS objects at a time with timeout of 0 and checking the return value. In practice: 1. ret=WFMO(MAXIMUM_WAIT_OBJECTS, <first MAXIMUM_WAIT_OBJECTS handles>, FALSE, 0) 2. break out if ret!=WAIT_TIMEOUT 3. ret=WFMO(MAXIMUM_WAIT_OBJECTS, <next MAXIMUM_WAIT_OBJECTS handles>, FALSE, 0) 4. break out if ret!=WAIT_TIMEOUT 5. repeat.. This would be done inside PR_Poll or such to make it transparent. I don't know of any downsides of doing this but it might be worth a shot.
Comment 22•18 years ago
|
||
This is called busy polling. The downside of busy polling is that it wastes CPU cycles. Because of the 0 timeout, the thread that executes this loop will consume all the cycles of the CPU it runs on.
Comment 23•18 years ago
|
||
Yeah, that might cause trouble. I assume adding a small sleep is also out of the question. I know this might be overkill, but WFMO in multiple threads?
Comment 24•18 years ago
|
||
Given Windows' lame limit, I vote for NSPR growing hair to allow an embedding to use WFMO under PR_Poll. Not sure exactly how ugly that hair should be (nose hair? ear hair? yuck! :-P) but it would enable a new, incompatible mode that limits the number of "fd"s to the WFMO limit, and explicitly uses WFMO. /be
Comment 25•18 years ago
|
||
Do the Mozilla clients pass less than 64 file descriptors to PR_Poll? We should also make sure we can support full duplex I/O in this new implementation -- we need to be able to wait for "ready for reading" "ready for writing" events on a socket simultaneously.
Comment 26•18 years ago
|
||
Another shot in the dark: Would it be possible to have a global "something happened" event and then the WFMO loop to check which socket it was?
Comment 27•18 years ago
|
||
Reassigning to jmathies per mconnor. Jim, let me know if you want to talk about this.
Assignee: nobody → jmathies
Comment 28•17 years ago
|
||
Does this really need to block 1.9? From reading the comments it sounds like this is more of a networking enhancement. I understand the issues with firewalls but considering the level of chatter on this bug I wonder if this is really a major issue?
Comment 29•17 years ago
|
||
This is a consistent pain for users who update Firefox and have a firewall... fixing it would be a major usability win because it reduces the number of prompts that a new user sees who is running Firefox for the first time. Thus the blocking marker.
Updated•17 years ago
|
Priority: -- → P5
Updated•17 years ago
|
Flags: tracking1.9+ → wanted-next+
Updated•16 years ago
|
Assignee: jmathies → nobody
Blocks: 511573
Comment 30•15 years ago
|
||
I was about to say, not to interrupt, and then I see the dates. With the amount of ignorance, skepticism, and paranoia, amongst 'new users' I would figure that such an issue would have been hidden. Warnings, Alerts and other indicators form varied 'anti-X' software would put even those who are aware, on alert, at it's first appearance. I understand that the loopback is being used on Windows, and is not needed for *nix. This has me wondering what in Windows requires ..something else? Where 'something else' usually allows for holes of one kind or another. I bet, found or not, there is a way to exploit this (A software version of an inductor - is one such thought). In short, it puts me off. And I can understand. Imagine the user who does not understand or can't - at all. *shudder the eXPeriance*
Comment 31•14 years ago
|
||
Actually, the fact that TB 3.1.6 (WinXP Pro SP3) often uses port 3689, thus stealing the port for streaming music from iTunes to my Airport Express streaming client, is a pain in the a**. As TB usually is one of the first apps I start up after booting the system and iTunes is only used once in a while, iTunes often does not see Aiport Express, because it cannot get the listening port. I just noticed by chance when I watched port usage in my personal firewall's TCP/UDP monitor. I have no idea if there is a way for you to fix this in TB, but obviously if TB would not use listening sockets for internal communication, the probability of 3689 being free for iTunes would be considerably higher.
Updated•13 years ago
|
Comment 32•9 years ago
|
||
this is effectively not an issue afaict in the support channels anymore.. and the alternatives are not good.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•