<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Updated

•

20 years ago

Keywords: helpwanted

mesektet

Comment 3

•

19 years ago

The patch doesn't seem like a good idea, because that timeout blocks further network activity. (Thankfully the UI is still responsive.) A SOCKS proxy doesn't return a SOCKS connect response until the proxy successfully establishes a connection to the remote site. That can take up to the tcp connect timeout on the SOCKS machine, and Firefox should let it take that long without blocking. One way to deal with it would be to return success from that function (nsSOCKSIOLayer.cpp 361, iirc) after the connect request is sent. Then there could be a flag at a higher layer to indicate whether the SOCKS connect response has been received. When mozilla receives data from a tcp connection, if Firefox uses a SOCKS proxy and the flag is unset, then Firefox would verify that the buffer is the expected SOCKS connect response, and would set the flag. Alternatively, is there a way to keep the entire SOCKS connect process from blocking as it does? Non-SOCKS tcp connections don't block (you can click "Stop" and immediately connect to another site)... so why do SOCKS connections block (you can click "Stop", but attempting to connect to another site is delayed while the SOCKS timeout expires... extremely annoying if the timeout is increased to 60 seconds)? I ran into this issue using Tor (which behaves like a socks proxy). 10 seconds is not nearly enough for connections to many .onion sites. Standard practice is to use Firefox->Privoxy->Tor. Unfortunately, I don't think privoxy supports pipelining, or else it has some other problem that makes it rather slow even for Tor network standards. Direct Firefox->Tor is faster, at least when it doesn't timeout, so it would be nice if this issue could be resolved.

Darin Fisher

Updated

•

19 years ago

Assignee: darin → nobody

QA Contact: benc → networking

silivrenion

Comment 4

•

19 years ago

I am experiencing difficulties with this bug also. At this point, a badly thought out patch is better than no connectivity, and this bug is one of the sole reasons why I currently don't advertise Firefox for corporate use. Not only does it prohibit me personally from being able to work with firefox, but in other corporate and SOCKS environments, many other people could be having this problem. I know for a fact that this problem is one of the sole reasons that Tor cannot operate well through Firefox. I'd like to ask anyone that can to at least give us something so we can use SOCKS proxies with high latency with Firefox. I tried creating a patch myself, but I ended up failing miserably. Hopefully this can be resolved, and Firefox will rule once again! :D Thanks in advance for anyone who helps!

Comment 5

•

16 years ago

I think the best way to get rid of the hard coded 10 second timeout is to rewrite the SOCKS code to work with non-blocking I/O, so the event loop isn't halted by blocking I/O while the SOCKS handshake completes. NSPR already supports non-blocking connections with plain sockets with PR_Connect() and PR_ConnectContinue(). The SOCKS I/O layer could use the socket in non-blocking mode, and keep track of the state of the handshake in the nsSOCKSSocketInfo class. PR_Connect() would be used to start the handshake and PR_ConnectContinue() to advance the state of the handshake when I/O events are discovered down the line. The only thing that's missing is some way to notify the transport what sort of events to poll for. For PR_Connect() and PR_ConnectContinue(), it is assumed only write events should be polled for, which is fine for plain non-blocking connections, but SOCKS also needs to signal that it wants to read from the socket. Perhaps we could take a hint from OpenSSL and add PR_WANT_READ_ERROR and PR_WANT_WRITE_ERROR error codes and change the definition of PR_Connect() and PR_ConnectContinue() accordingly?

Comment 6

•

16 years ago

Wan-Teh Chang, Do the proposed changes to NSPR in comment #5 sound OK? Chris, The non-blocking approach for SOCKS sounds promising (though I'll confess I don't know SOCKS well enough to know definitively). Let's see if the NSPR folks agree to the change, and if they do, we can certainly try this out if someone's willing to contribute a patch.

Wan-Teh Chang

Comment 7

•

16 years ago

Jason, This is already handled by the "poll" method of a PRFileDesc. It's equivalent to OpenSSL's use of WANT_READ_ERROR and WANT_WRITE_ERROR. The SOCKS I/O layer can define the "poll" method to tell the caller what I/O events we should poll for at the lowest level during connection setup.

Comment 8

•

16 years ago

(In reply to comment #7) > The SOCKS I/O layer can define the "poll" method to tell the > caller what I/O events we should poll for at the lowest level > during connection setup. Thanks for mentioning this. I didn't notice this before. So while normally the poll flags are set to PR_POLL_EXCEPT | PR_POLL_WRITE during non-blocking connection, the SOCKS layer's implementation of the poll method may override this so read or write events are polled instead?

Wan-Teh Chang

Comment 9

•

16 years ago

Yes, that's exactly what the poll method is intended for.

Comment 10

•

16 years ago

Ok, I'd like to write a patch for this. I'll do it pretty much as mentioned in comment 5, although with the poll method taken into account and no additional error codes.

Comment 11

•

16 years ago

Chris--great, go for it.

Comment 12

•

16 years ago

Attached patch Draft of non-blocking SOCKS handshake (obsolete) — Details — Splinter Review

This patch is a draft of of the non-blocking SOCKS handshake. I've tested it on Linux with Tor and SSH with dynamic port forwarding, and it seems to work OK. The only problem I've run into so far is with error handling. There's trouble distinguishing between errors connecting to the proxy and errors that the proxy reports back when it rejects a connection request. In both cases, the error is set to NS_ERROR_PROXY_CONNECTION_REFUSED in nsSocketTransport2.cpp, and the user gets "The proxy server is refusing connections", even if the proxy server is online and correctly reporting that it had trouble connecting to a site. Any thoughts about the patch or the error handling issue?

Comment 13

•

16 years ago

Attached patch non-blocking SOCKS handshake with improved error handling. (obsolete) — Details — Splinter Review

This patch improves error handling. When a SOCKS connection request is rejected by the proxy, the error is recorded in the SOCKS layer (rather than globally), and it is reported with an extra call to PR_ConnectContinue(). A slight modification is made to nsSocketTransport2.cpp to extract the handshake error. This change helps to prevent Firefox from falsely reporting to the user that the proxy server is rejecting connections when the proxy server is actually online. Running the netwerk unit tests reports no regressions on Linux (for me), and connecting through Tor and SSH seems to work.

Attachment #408453 - Attachment is obsolete: true

Updated

•

16 years ago

Attachment #410350 - Flags: review?(cbiesinger)

Updated

•

15 years ago

Attachment #410350 - Flags: review?(cbiesinger)

Updated

•

15 years ago

Attachment #410350 - Flags: review?(cbiesinger)

Comment 14

•

15 years ago

Can we get someone to review this latest patch? It's been quite a while. With this patch, the Tor project could greatly simplify its bundles, possibly even shipping an all-in-one xpi with just Torbutton+tor.

mark stevens

Comment 15

•

15 years ago

Tested the patch on Linux and it works. Thanks chrisd

Bee

Comment 16

•

15 years ago

Hi all!!!!!! Yeah, maybe i had to say this before, but i'm using this patch in my custom build of firefox i made for my Tor Browser Bundle for Linux!!!!!!!! I also suggest you to merge this patch with the main source trunk!!!!!!!!! It's very useful and it works very well!!!!!!! It has been tested for a whole month so far!!!!! Chris did a very good work!!!!!!!!!!!!!!! bye!!!!!!!!! ~bee!!!!!! http://honeybeenet.altervista.org/factorbee/

Thomas Mencken

Comment 17

•

15 years ago

The latest patch works fine on FreeBSD, I've been using it for over a week now and I didn't encounter any issue whatsoever. It took 5 years, but I can finally use tor directly as a socks proxy on firefox. Great job guys! :)

Comment 18

•

15 years ago

Comment on attachment 410350 [details] [diff] [review] non-blocking SOCKS handshake with improved error handling. Jason, can you review this, or do we have someone else who can? Christian is ... laggy.

Attachment #410350 - Flags: review?(jduell.mcbugs)

Comment 19

•

15 years ago

Glad to hear from the testers... Yes, it would be good to get this reviewed.

baiton82

Comment 20

•

15 years ago

Using OpenBSD 4.7-ish and applied patch to Firefox 3.5.9: it works flawlessy. It would be very helpful if someone could review the patch and commit it to trunk... -- baiton

Comment 21

•

15 years ago

Comment on attachment 410350 [details] [diff] [review] non-blocking SOCKS handshake with improved error handling. I don't know SOCKS yet, and this isn't a small patch, so there's no way I'll get to this before e10s ships. Biesi is also likely to be busy with other things. Do we have anyone who knows SOCKS well enough to do a +r on this?

Attachment #410350 - Flags: review?(jduell.mcbugs)

Comment 22

•

15 years ago

Honza?

Kai Engert [:KaiE:]

Comment 23

•

15 years ago

I think it greatly increases chances of quick acceptance of patches if you can demonstrate that your patch is free of regressions. Are there any tests in our test suites that test the correctness of SOCKS proxy operation? If we currently don't have any automated tests for SOCKS operation, it would be a great contribution to add such tests, and demonstrate the behaviour with and without your new patch is as expected. Having automated tests probably relies on having an appropriate test proxy available. Is there a public one that could be used, maybe a proxy that gets configured to accept requests from inside the Mozilla network? Or could a proxy automatically be started as part of the test suite procedure?

Comment 24

•

15 years ago

The last time I checked, there wasn't a test for SOCKS. I'd be willing to write one if we can work out what server to use... I suppose that a SOCKS server could be written specifically for testing, if we really needed that. OpenSSH client also has a SOCKS server for dynamic port forwarding.

Comment 25

•

15 years ago

It would be really great to have some SOCKS test coverage. It would be best to have a server as part of our source tree (ideally in a scripting language), which we could reliably count on to run. But failing that, even some tests that we could run against a well-known or configurable server would be progress. I'm wondering if there might be some way to make this work so that such tests could be run in our automated tryserver/tinderbox builds. (Can we set up an "always available" socks server somewhere?). Cc-ing some people who are more likely to know than me. But even if full automation isn't possible, manual testing would be a vast improvement, so feel free to get started on writing some tests, while we sort out whether and how we could provide the server end.

Comment 26

•

15 years ago

Presumably you could write a SOCKS server in JS, and write xpcshell tests that go xpcshell -> SOCKS -> httpd.js. I don't know how fully-featured your server would have to be in order for it to be usable by our network stack.

Comment 27

•

15 years ago

That should work(In reply to comment #26) > Presumably you could write a SOCKS server in JS, and write xpcshell tests that > go xpcshell -> SOCKS -> httpd.js. > > I don't know how fully-featured your server would have to be in order for it to > be usable by our network stack. That should work. The SOCKS server for testing shouldn't need to be too complicated.

Comment 28

•

15 years ago

I've started off writing a JS "SOCKS echo server" that simply accepts client connections, handles the SOCKS handshake, and then relays any data the client sends back to the client. That's a little easier than having the SOCKS server make an outgoing connection to httpd. It might be a bit hard to get the unpatched SOCKS client to run in the same thread as the server, because the client doesn't play nice with the event loop. The SOCKS server may need to be in a separate thread or process. The patched client works OK here, though.

Comment 29

•

15 years ago

If need be, you can run a new xpcshell as a subprocess from an xpcshell test. I do that for some crashreporter tests (because I want to crash the subprocess). You can see that code here: http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/test/unit/head_crashreporter.js#29 You'd want to run the process without blocking on it, though, see the docs here: https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIProcess and obviously you'll need to kill the process before your test exits.

Updated

•

15 years ago

Attachment #410350 - Attachment is obsolete: true

Attachment #410350 - Flags: review?(cbiesinger)

Comment 30

•

15 years ago

Attached patch Patch for non-blocking SOCKS I/O layer updated to reflect new source locations. (obsolete) — Details — Splinter Review

Some of the modified files where moved around, so the patch has been updated with the new locations.

Comment 31

•

15 years ago

Attached patch SOCKS xpcshell unit test (obsolete) — Details — Splinter Review

Tests the SOCKS I/O layer; works with new and old versions.

Comment 32

•

15 years ago

Took Ted's advice for creating a subprocess in the unit test; it seems to work alright with both the old version of the SOCKS layer and the yet-to-be-committed non-blocking version. Anyone have comments on the unit test? And is there anything else I can do to encourage Mozilla folks to get this stuff committed?

Bee

Comment 33

•

15 years ago

YEAH!!!!! thats super!!!!!!!!!!!!!!!!!!! Chris is a very good coder!!!!! He's the only one doing something of useful for this bug report!!!!!!!!!! I believe nobody at googlezilla is helping him!!! is it?!!!!! why?!!!!! After all his efforts, you should at least push his patch into the main source!!!!!!!!! I wonder about what funny excuse you'll seek out to say him no again!!! hah!!!! bye!!!!!!!!! ~bee!!!!!!

Comment 34

•

15 years ago

Comment on attachment 450585 [details] [diff] [review] SOCKS xpcshell unit test I'm putting this in my review queue so that I don't forget to take a look at it. Thanks for taking the time to write a test (even though you had to go out of your way and implement your own test proxy!) Having test coverage of your patch should make this a lot easier to land.

Attachment #450585 - Flags: review?(ted.mielczarek)

Comment 35

•

15 years ago

Bee: We don't have an infinite supply of manpower, which is why we rely so heavily on contributions from people like Chris. We do have a process for accepting patches, which is that they need to be reviewed by a module peer. Chris' patch has not yet been reviewed, so it can't be landed. Hopefully we'll get someone to review it soon and get it landed, since Chris has indeed done some great work here. Calling people names and casting aspersions is not a great way to convince people to work with you, however.

Assignee: nobody → chrisd.mang

Comment 36

•

15 years ago

Comment on attachment 450584 [details] [diff] [review] Patch for non-blocking SOCKS I/O layer updated to reflect new source locations. Jason, if you don't have time for this, could you redirect to someone else? I'll review the test changes Chris made.

Attachment #450584 - Flags: review?(jduell.mcbugs)

Updated

•

15 years ago

Status: NEW → ASSIGNED

eduardo

Comment 37

•

15 years ago

It's working on NetBSD 5.0.2/Firefox 3.6.3 as well. Any hope to have the patch reviewed and committed to trunk?

Updated

•

15 years ago

Attachment #450584 - Flags: review?(jduell.mcbugs) → review?(michal.novotny)

Comment 38

•

15 years ago

The problem is that there aren't any SOCKS experts active in the project, so reviewing the patch requires someone to get up to speed on the SOCKS stuff first... Michal, are you actively working on this? If so, do you have anything like an ETA? If not, do we need to find someone else to do it?

Assignee

Comment 39

•

15 years ago

I think I can do it on Tuesday/Wednesday.

odakim

Comment 40

•

15 years ago

When will this non-blocking socket patch be applied to official release?

timeless

Comment 41

•

15 years ago

odakim: after it's reviewed, approved, committed, and released. we don't read tea leaves. just be patient. it could take 5 years, or it could be present in the next release, whose date we don't even know.

Bee

Comment 42

•

15 years ago

> odakim: after it's reviewed, approved, committed, and released. > just be patient After five years, in which step are we in?!!!!!! are we before the first one?!!!! lolol!!!!!!!!!!!! Hopefully this patch will be applied within the current century!!!!!!!!!!!! but i ain't going to hold my breath!!!! Perhaps somebody has to turn this patch into a "blocking" thing!!!!!! YEAH!! because atm, nobody at mozilla seems to care enough about it!!! bye!!!!!!!! ~bee!!!!

soget

Comment 43

•

15 years ago

A BUMP a day, keeps Mozilla awake!!

Tyler Downer [He/Him]

Comment 46

•

15 years ago

soget, please do not bump bugs. Bugzila is not a forum, and doing so will only get you banned for spamming. Also, do not post this bug on other, unrelated bugs.

Assignee

Comment 47

•

15 years ago

Comment on attachment 450584 [details] [diff] [review] Patch for non-blocking SOCKS I/O layer updated to reflect new source locations. Chris, the patch looks good in general. Most of my comments are just about the coding style. > + rv = dns->Resolve(mProxyHost, 0, getter_AddRefs(mDnsRec)); > + if (NS_FAILED(rv)) { > + LOGERROR(("socks: DNS lookup for SOCKS proxy %s failed", Since our goal is to make connecting non-blocking it makes sense to use AsyncResolve() instead of blocking Resolve(). > + PRInt32 IsConnected() const { return mState == SOCKS_CONNECTED; } PRInt32 -> PRBool > +const char * > +nsSOCKSSocketInfo::StateToStr() const > +{ > + switch (mState) { > + case SOCKS_INITIAL: > + return "SOCKS_INITIAL"; Log the state only as a number and remove the StateToStr() method. > const nsCString &DestinationHost() { return mDestinationHost; } > const nsCString &ProxyHost() { return mProxyHost; } > PRInt32 ProxyPort() { return mProxyPort; } > PRInt32 Version() { return mVersion; } > PRUint32 Flags() { return mFlags; } These methods are no longer used, remove them. > +PRStatus > +nsSOCKSSocketInfo::ContinueConnectingToProxy(PRFileDesc *fd, PRInt16 oflags) > +{ > + PRStatus status; > + > + NS_ABORT_IF_FALSE(mState == SOCKS_CONNECTING_TO_PROXY, > + "Continuing connection in wrong state!"); Checking the state here (and in other similar cases) doesn't make much sense, because the method is called only in the correct state. Please keep only meaningful checks and change them to NS_ASSERTION. Also note that these macros do nothing in release build. So any case that should be handled as error should really throw some error. > + PRInt16 HowToPoll() const { return mPollFlags; } Please rename it to GetPollFlags() or just PollFlags() > + void PushUint8(PRUint8 d); > + void PushUint16(PRUint16 d); > + void PushUint32(PRUint32 d); All PushUintXXX/PopUintXXX should be probably made as macros. Btw. all this Push/Pop concept makes the code hard to read (especially to verify the protocol correctness). Wouldn't simple direct buffer addressing be better in our case? > + PRStatus Fill(PRFileDesc *fd); > + PRStatus Flush(PRFileDesc *fd); The names aren't self-explanatory, please rename it. > + // A connection failure occured, try another address > + mState = SOCKS_INITIAL; > + return ConnectToProxy(fd, PR_INTERVAL_NO_TIMEOUT); Shouldn't we honor the timeout? > + PRStatus SendGreeting(PRFileDesc *fd); > + PRStatus ConnectToProxy(PRFileDesc *fd, PRIntervalTime to); > + PRStatus ContinueConnectingToProxy(PRFileDesc *fd, PRInt16 oflags); > + PRStatus HandleSocks4Connect(PRFileDesc *fd); > + PRStatus HandleSocks5Auth(PRFileDesc *fd); > + PRStatus HandleSocks5Connect(PRFileDesc *fd); The state and method names are a little bit confusing. E.g. SendGreeting() sends connect request in case of SOCKS4 but method selection in case of SOCKS5. I'd prefer something like: SendMethodRequest() RecvMethodResponse() SendConnectRequest() RecvConnectResponse() ... And the same for the states: SOCKS_SEND_METHOD_REQUEST ... > + PRStatus HandleSocks5Connect(PRFileDesc *fd); I'd prefer to split this method into 2. The first would read first 5 bytes and the second the rest. > + nsCString mInBuf; > + nsCString mOutBuf; Using nsCString for this purpose isn't the best idea. E.g. following memory allocations happen in case of HandleSocks5Auth() PushUint8(0x05); // alloc 2 bytes PushUint8(0x01); // realloc to 3 bytes PushUint8(0x00); // realloc to 5 bytes PushUint8(0x01); PushNetAddr(addr); // realloc to 9 bytes PushNetPort(addr); // realloc to 17 bytes Flush(); // Truncate() releases the buffer It should be possible to minimize memory allocations by calling GetMutableData on mOutBuf and using mOutBuf == "" instead of mOutBuf.Truncate(), but this would be an ugly hack, so I would prefer to use a char[] array. > + // Flush all pending data for a SOCKS request, then read the > + // response. If we're still connecting to the SOCKS proxy and > + // there's no I/O to do yet, Flush() and Fill() will simply return > + // PR_SUCCESS Just a suggestion: In each state we know whether if we want to read or write data. IMO it would be cleaner to either read or write according to the state instead of always trying to write and read all data. And then you could use just one buffer for both reading and writing. > + // A SOCKS request was rejected, so call PR_ConnectContinue() to instruct > + // the SOCKS I/O layer to set the actual error code. > + // > + else if (PR_INVALID_STATE_ERROR == code && > + mProxyTransparent && > + !mProxyHost.IsEmpty()) { > + PR_ConnectContinue(fd, 0); > + code = PR_GetError(); > + rv = ErrorAccordingToNSPR(code); > + } I'm afraid that this change to improve the error handling isn't acceptable, but I'll delegate this decision to the necko peer.

Attachment #450584 - Flags: review?(michal.novotny) → review-

Comment 48

•

15 years ago

Michal, thanks for looking through the patch! I have a few questions/comments: > PRInt32 -> PRBool Or just bool. > Checking the state here (and in other similar cases) doesn't make much sense, > because the method is called only in the correct state. Please keep only > meaningful checks and change them to NS_ASSERTION. I'd say that checking assumed invariants that currently _aren't_ violated with NS_ABORT_IF_FALSE is exactly the right thing to do.... why wouldn't it be? That's what NS_ABORT_IF_FALSE is for. > so I would prefer to use a char[] array How does that help the realloc situation? If we just want a growable thing with some nonzero initial size, we have string classes for that... is that what we need here? > I'm afraid that this change to improve the error handling isn't acceptable, What problems does/could it cause? (Genuine question; if a necko peer is supposed to make a call about this, it would be good to have all the relevant information.)

Assignee

Comment 49

•

15 years ago

(In reply to comment #48) > > Checking the state here (and in other similar cases) doesn't make much sense, > > because the method is called only in the correct state. Please keep only > > meaningful checks and change them to NS_ASSERTION. > > I'd say that checking assumed invariants that currently _aren't_ violated with > NS_ABORT_IF_FALSE is exactly the right thing to do.... why wouldn't it be? > That's what NS_ABORT_IF_FALSE is for. OK, I just think that it isn't necessary to check the state at least in case of HandleSocks4Connect(), HandleSocks5Auth(), HandleSocks5Auth() and HandleSocks5Connect() which are called only from switch(mState) statement in DoHandshake(). > > so I would prefer to use a char[] array > > How does that help the realloc situation? If we just want a growable thing > with some nonzero initial size, we have string classes for that... is that what > we need here? Of course I meant char[SOME_MAX_SIZE] with sufficient capacity. Maximum length of each request/reply is known. > > I'm afraid that this change to improve the error handling isn't acceptable, > > What problems does/could it cause? (Genuine question; if a necko peer is > supposed to make a call about this, it would be good to have all the relevant > information.) It isn't actually a problem. I just don't like too much using PR_INVALID_STATE_ERROR in the sense "ask me once again for the real error". I'm not sure what this error is normally used for and if there can be some conflict. But if it is OK to throw this error, then it can be done without the additional call to PR_ConnectContinue() and the SOCKS_REPORT_ERROR state. SOCKS code should throw a real error code and PR_INVALID_STATE_ERROR in case of fd->lower->methods->connect/connectcontinue failure. nsSocketTransport can then interpret this error code differently.

Comment 50

•

15 years ago

> OK, I just think that it isn't necessary to check the state It's not a check. It's an assertion, for in case someone changes the caller code... > Maximum length of each request/reply is known. Ah, ok. Then yes, just using a fixed buffer of max length is great, esp if it's not too big. For the error, the NSPR docs say: 254 /* Object state improper for request */ 255 #define PR_INVALID_STATE_ERROR (-5931L) I haven't read the patch in enough detail yet to see whether this makes sense with the patch's usage of the error code...

Comment 51

•

15 years ago

Comment on attachment 450585 [details] [diff] [review] SOCKS xpcshell unit test Looks good, just some minor comments. >diff -r f21132993dc2 netwerk/test/unit/socks_client_subprocess.js >--- /dev/null Thu Jan 01 00:00:00 1970 +0000 >+++ b/netwerk/test/unit/socks_client_subprocess.js Thu Jun 10 18:23:36 2010 -0700 >+ var data = bin.readBytes(5); >+ if (data == 'PING!') { >+ print('client: got ping, sending pong.'); >+ output.write('PONG!', 5); >+ } else { >+ print('client: wrong data from server:', data); I think you should write some data back to the client even in case of error. Maybe something like "Error: wrong data received". >+for each (var arg in arguments) { >+ print('client: running test', arg); >+ test = arg.split(':'); var test = ... >diff -r f21132993dc2 netwerk/test/unit/test_socks.js >--- /dev/null Thu Jan 01 00:00:00 1970 +0000 >+++ b/netwerk/test/unit/test_socks.js Thu Jun 10 18:23:36 2010 -0700 >@@ -0,0 +1,486 @@ >+const SOCKS_LISTEN_PORT = 9080; It'd be nice if you could figure out a way to make this dynamic, so that the test won't fail on systems where this port is in use, but I wouldn't r- you just for that.

Attachment #450585 - Flags: review?(ted.mielczarek) → review+

Tod

Comment 52

•

15 years ago

Any progress? I've been testing this patch on Ubuntu for months, using custom builds of Firefox. But it's getting annoying having to rebuild Firefox from scratch every time a new minor version is out... :(

grarpamp

Comment 53

•

15 years ago

Will this apply cleanly against 3.6.10? As it is something that will be quite useful when done, I'm interested in giving this a try on top of a local ports build as it is nearing formal release, I hope :)

Scott H.

Comment 54

•

15 years ago

Tested on Firefox 3.5.13, it's working. Can we speed up things a bit? It's taking ages to have this bug fixed and committed to the main source tree.

Comment 55

•

15 years ago

This patch is now coming up on its first birthday, and has been around for at least an entire Firefox major release cycle. Can we please get someone to review this and get it into the Firefox 4 and Fennec trees? I really don't see what the holdup is. The Tor Project is using it in some of our custom Firefox builds, and others in the wider open source community are obviously testing and using it as well.

Flags: wanted-fennec1.0?

Kai Engert [:KaiE:]

Comment 56

•

15 years ago

If I understand correctly, this is now waiting for someone to cleanup the patch as requested by reviewers in comment 47, 48, 49, 50, 51.

Kai Engert [:KaiE:]

Comment 57

•

15 years ago

(In reply to comment #56) > If I understand correctly, this is now waiting for someone to cleanup the patch > as requested by reviewers in comment 47, 48, 49, 50, 51. Usually the reviewers expect that the patch creator (Chris) cleans up the patch according to review comments. Chris, would you able to get this done, or are you looking for help from someone else to do it?

Comment 58

•

15 years ago

(In reply to comment #57) > (In reply to comment #56) > > If I understand correctly, this is now waiting for someone to cleanup the patch > > as requested by reviewers in comment 47, 48, 49, 50, 51. > > Usually the reviewers expect that the patch creator (Chris) cleans up the patch > according to review comments. > > Chris, would you able to get this done, or are you looking for help from > someone else to do it? Yes, I'm working on it.

fernando

Comment 59

•

15 years ago

(In reply to comment #58) > (In reply to comment #57) > > (In reply to comment #56) > > > If I understand correctly, this is now waiting for someone to cleanup the patch > > > as requested by reviewers in comment 47, 48, 49, 50, 51. > > > > Usually the reviewers expect that the patch creator (Chris) cleans up the patch > > according to review comments. > > > > Chris, would you able to get this done, or are you looking for help from > > someone else to do it? > > Yes, I'm working on it. How is it going, Chris?

Justin Dolske [:Dolske]

Comment 60

•

15 years ago

Do note that FF4 Beta 8 may freeze as soon as this week, and we are rapidly approaching the point where this patch won't be taken for Firefox 4 (if we're not already there). Perhaps bz or biesi should weigh in on if the risk here is worthwhile? Good to know Tor folks have been using and testing it, though!

Comment 61

•

15 years ago

If I fix up the issues from comment #47-51, can we land this in 4.0 for desktop and mobile? It seems Chris has disappeared, but I think I have time to do the final cleanup for this right now, if it will definitely get in. Otherwise, I'll just continue updating Torbutton for FF 4.0 and mobile...

fernando

Comment 62

•

15 years ago

Please Mike, fix those issues. I think we are still in time for having non-blocking socks in FF4: after all FF4 Beta 8 hasn't been released yet and Chrisd's patch has been extensively tested by Tor users for seven months or so.

jorel

Comment 63

•

15 years ago

Mike, I'm willing to test the patch once you fix those issues and I do really hope we can have this almost-6-years-old bug fixed in Firefox 4.0.

Comment 64

•

15 years ago

Chris reappeared on our IRC channel yesterday is is working on the patch right now. He says a new version should be ready in a few hours.

Justin Dolske [:Dolske]

Comment 65

•

15 years ago

(In reply to comment #61) > If I fix up the issues from comment #47-51, can we land this in 4.0 for desktop > and mobile? Even if it doesn't land for 4.0 (worst case), the fixes would still need to be made to land for after 4.0; so better to get it done and waiting. :) You've mentioned that this patch is already in use by Tor users, I presume that's with custom builds? Ballpark guess on number of users? Is Tor's SOCKS usage special in any way that might mean SOCKS users in non-TOR environments might hit problems with this patch? [These are the kinds of questions that can help drivers understand potential risk of accepting this for FF4.]

Comment 66

•

15 years ago

No idea how many people may have compiled their own firefox with this patch. I'm pretty sure we began shipping experimental Linux Tor Browser bundles with it enabled though. As far if Tor is special, it uses SOCKS4A/SOCKS5, so it would be sending hostnames instead of IP addresses. But this exercises the more complicated parts of the code. A quick test with an SSH socks proxy should cover the rest. And then there is Chris's SOCKS unit tests that are attached that simulate various functionality. chrisd, anything to add about how well your tests and Tor vs plain Socks4 (ie SSH) socks proxy use exercise the code?

Comment 67

•

15 years ago

The unit tests cover most of the SOCKS request types including IP and hostname combinations, so most possible uses should be accounted for (both Tor and non-Tor). The only request type that hasn't gotten much coverage is SOCKS5/IPv6.

Comment 68

•

15 years ago

Attached patch SOCKS patch updated to address some review comments (obsolete) — Details — Splinter Review

Hi, thanks for reviewing, and sorry it's taken me long to respond. This updated patch should hopefully address most of what was suggested in previous comments. The SOCKS layer now uses a pre-allocated buffer instead of two nsCStrings. This seems to work pretty well. The Push*/Pop* methods have been renamed to Read*/Write*, with the hope that this will make things clearer. Leaving these buffer access methods around in some form is at least handy for the debugging assertions, and having a file-like api is nicer than using direct addressing everywhere, IMO. However, we can go with direct addressing if it's really preferable. I don't know if making the methods macros would give us any performance benefit, or if tuning the SOCKS handshake for performance really matters in general. SOCKS is out of the way pretty quickly and not much data is handled. The other debugging assertions are left in; they shouldn't get in the way, and better to be safe than sorry, etc. The HandleSocks* methods for transferring SOCKS data have been split apart for reading and writing, and the states have been updated as well. The complicated code to read the variable-length SOCKS5 connection response has been split, also. The call to Resolve() has been kept in for now, because going with AsyncResolve might be a bit difficult to get right. Since we're typically looking up the same name over and over, we should be able to take advantage of the DNS cache(?), so hopefully it would only spend a trival amount of time blocking in most cases. But, if we really should go with AsyncResolve, I'll try to find a way to get it in. Also, I'm uncertain what to do with the code for propagating SOCKS errors using PR_INVALID_STATE_ERROR. Can this part stay, or should it be removed for now in lieu of a better strategy? There seems to be a fair amount of C code between the SOCKS layer and the socket transport, so I don't see how C++ exceptions could be used to report errors.

Attachment #450584 - Attachment is obsolete: true

Comment 69

•

15 years ago

Attached patch SOCKS xpshell test with automatic port discovery — Details — Splinter Review

Attachment #450585 - Attachment is obsolete: true

fernando

Comment 70

•

15 years ago

Now that the patches have been cleaned up, can we find someone to review (and hopefully approve) them asap?

jorel

Comment 71

•

15 years ago

(In reply to comment #70) > can we find someone to review (and hopefully approve) them asap? Can Michael and Ted do that? Or are they busy with something else?

jorel

Comment 72

•

15 years ago

Where are the reviewers?

Assignee

Comment 73

•

15 years ago

(In reply to comment #71) > Can Michael and Ted do that? Or are they busy with something else? I'll do the review ASAP.

Assignee

Comment 74

•

15 years ago

(In reply to comment #68) > The SOCKS layer now uses a pre-allocated buffer instead of two nsCStrings. This > seems to work pretty well. The Push*/Pop* methods have been renamed to > Read*/Write*, with the hope that this will make things clearer. Leaving these > buffer access methods around in some form is at least handy for the debugging > assertions, and having a file-like api is nicer than using direct addressing > everywhere, IMO. However, we can go with direct addressing if it's really > preferable. I don't know if making the methods macros would give us any > performance benefit, or if tuning the SOCKS handshake for performance really Maybe they could be inlined. > The call to Resolve() has been kept in for now, because going with AsyncResolve > might be a bit difficult to get right. Since we're typically looking up the > same name over and over, we should be able to take advantage of the DNS > cache(?), so hopefully it would only spend a trival amount of time blocking in > most cases. But, if we really should go with AsyncResolve, I'll try to find a > way to get it in. I think we can land it without AsyncResolve(). This isn't critical and can be fixed later. > Also, I'm uncertain what to do with the code for propagating SOCKS errors using > PR_INVALID_STATE_ERROR. Can this part stay, or should it be removed for now in > lieu of a better strategy? There seems to be a fair amount of C code between > the SOCKS layer and the socket transport, so I don't see how C++ exceptions > could be used to report errors. My fault, I should be more clear. I didn't mean throwing C++ exception. By "throw a real error" I just meant to return correct error via PR_SetError() instead of using PR_INVALID_STATE_ERROR. We could use PR_INVALID_STATE_ERROR directly to indicate that underlaying connect or connectcontinue failed and handle this state in nsSocketTransport2.cpp. Then there would be no need to call PR_ConnectContinue() again to find out the real error code. > + // A buffer of 1024 bytes is more than enough to hold a 255-byte hostname > + // alongside additional fields. > + static const PRUint32 BUFFER_SIZE = 1024; > + static const PRUint32 MAX_HOSTNAME_LEN = 255; If I count correctly then the largest SOCKS4 request is 265 bytes and the largest SOCKS5 connect request/response is 262 bytes. So BUFFER_SIZE should be 265. > + case SOCKS5_WRITE_AUTH_REQUEST: > + if (WriteToSocket(fd) != PR_SUCCESS) > + return PR_FAILURE; > + WantRead(2); > + mState = SOCKS5_READ_AUTH_RESPONSE; > + return PR_SUCCESS; > + case SOCKS5_READ_AUTH_RESPONSE: You could advance immediately to the next state instead of returning PR_SUCCESS. > + // Advance read offset to ignore version number > + mReadOffset = 1; Shouldn't we check the version? (Also on other places where it is skipped.) > +nsSOCKSSocketInfo::WriteString(const nsCString &str) Use nsACString > + if (mDestinationHost.Length() > MAX_HOSTNAME_LEN) { > + LOGERROR(("socks4: destination host name is too long!")); > + HandshakeFinished(PR_BAD_ADDRESS_ERROR); > + } "return PR_FAILURE;" is missing here > + switch (*type) { > + case 0x01: // ipv4 > + *len = 4 - 1; > + break; > + case 0x04: // ipv6 > + *len = 16 - 1; > + break; > + case 0x03: // fqdn > + *len = ReadUint8(); > + break; > + } We need to fail in case of an unknown type. > + NS_ABORT_IF_FALSE(mVersion == 5, > + "SOCKS version must be 4 or 5!"); Incorrect message. > + NS_ABORT_IF_FALSE(mDataLength == 5, > + "SOCKS 5 connection reply must be at least 5 bytes!"); Incorrect message. > + NS_ABORT_IF_FALSE(mDataLength > 5, > + "SOCKS 5 connection reply must be at least 5 bytes!"); Again incorrect message in ReadV5ConnectResponseBottom(). And I think it would make more sense to move it after the ReadV5AddrTypeAndLength() and check if mDataLength == (7+len). > + // The port is the last field in the response. If data remains, something's > + // off. > + LOGDEBUG(("socks5: remaining data %u", > + unsigned(mDataLength - mReadOffset))); Is there some reason to allow this? We are asking for some exact amount and we should get it. > + do { > + status = info->DoHandshake(fd, oflags); > + } while (status == PR_SUCCESS && !info->IsConnected()); No timeout is passed to DoHandShake() so PR_INTERVAL_NO_TIMEOUT is used, which is wrong. We should remember the timeout passed to nsSOCKSIOLayerConnect(). E.g. set the timeout via new method info->SetConnectTimeout(to) before the do-while loop. Anyway, to handle the timeout correctly we should limit the whole process (connecting + handshake), but now we limit just the connecting, right? > + if (!info->IsConnected()) { > + *out_flags = 0; > + //LOGDEBUG(("socks: polling %hd", info->GetPollFlags())); > + return info->GetPollFlags(); > + } Remove the commented out LOGDEBUG().

Comment 75

•

14 years ago

Michal - It looks like Chris is MIA again. I think he may be getting weary of the back-and-forth on this.. I believe he's also a student, so his time is probably available only randomly and periodically depending on classes. Since this is your second review, and some of your comments look either new this time, trivial, and/or still a little unclear, can you possibly take care of them? Otherwise, I can try to go through and fix them myself sometime this week..

Assignee

Comment 76

•

14 years ago

I'll update the patch.

Assignee: chrisd.mang → michal.novotny

Assignee

Comment 77

•

14 years ago

Attached patch updated patch (obsolete) — Details — Splinter Review

Attachment #494975 - Attachment is obsolete: true

Attachment #510446 - Flags: review?(bzbarsky)

Comment 78

•

14 years ago

I'm probabably the wrong reviewer for this: I don't know anything about SOCKS proxies or really this part of the necko code. If there's absolutely no one else to do the review I can try to learn this code well enough to review, but that will take time. :( That said, I thought Michal reviewed this already. So what part am I supposed to review and why?

Assignee

Comment 79

•

14 years ago

(In reply to comment #78) > That said, I thought Michal reviewed this already. So what part am I supposed > to review and why? I think someone else should look at the changes between attachment 494975 [details] [diff] [review] and 510446. Otherwise, I would in fact review my own code.

Comment 80

•

14 years ago

OK, can you post an interdiff between those two attachments? That would make it a lot easier to see what was changed...

Assignee

Comment 81

•

14 years ago

Attached patch interdiff 494975 - 510446 (obsolete) — Details — Splinter Review

Comment 82

•

14 years ago

Comment on attachment 510446 [details] [diff] [review] updated patch >+ // A SOCKS request was rejected, get the actual error code from >+ // the OS error s/,/;/ The fallthrough in nsSOCKSSocketInfo::DoHandshake looks purposeful, but why do we want that? I mean the fallthrough when going from the various REQUEST states to the corresponding RESPONSE states. If we didn't do that, would the caller just end up calling right back into DoHandshake, so this is basically an optimization or something? If so, why do we want to make the optimization? Seems to make the code more confusing to me.... r=me with the typo fixed and the DoHandshake stuff explained. And thank you for posting the interdiff! It was very useful.

Attachment #510446 - Flags: review?(bzbarsky) → review+

Comment 83

•

14 years ago

And to be clear, I only read the interdiff and the comments on the bug....

Assignee

Comment 84

•

14 years ago

(In reply to comment #82) > The fallthrough in nsSOCKSSocketInfo::DoHandshake looks purposeful, but why do > we want that? I mean the fallthrough when going from the various REQUEST > states to the corresponding RESPONSE states. If we didn't do that, would the > caller just end up calling right back into DoHandshake, so this is basically an > optimization or something? If so, why do we want to make the optimization? You understand it correctly. It is just an optimization to not to go back to nsSOCKSIOLayerConnect() or nsSOCKSIOLayerConnectContinue() and pass to the RESPONSE state directly. If we didn't do that the while loop in nsSOCKSIOLayerConnect() or nsSOCKSIOLayerConnectContinue() would call DoHandshake() again immediately. I can remove it if you think that it makes the code hard to read.

Comment 85

•

14 years ago

Unless we hit this code a whole lot, I'd rather not do the optimization, so the control flow is clearer, yeah. Of course if we have profile data showing this matters we should keep the optimization!

Assignee

Comment 86

•

14 years ago

Attached patch removed optimization in DoHandshake() (obsolete) — Details — Splinter Review

Attachment #510446 - Attachment is obsolete: true

Attachment #512930 - Attachment is obsolete: true

Attachment #514770 - Flags: approval2.0?

Johnny Stenback (:jst)

Comment 87

•

14 years ago

Comment on attachment 514770 [details] [diff] [review] removed optimization in DoHandshake() I want this to land as soon as possible, but *not* before 2.0. It's just too late now, and the reward isn't high enough IMO.

Attachment #514770 - Flags: approval2.0? → approval2.0-

Updated

•

14 years ago

Depends on: post2.0

Assignee

Comment 88

•

14 years ago

Attached patch updated patch — Details — Splinter Review

> >+ // A SOCKS request was rejected, get the actual error code from > >+ // the OS error > > s/,/;/ I forgot to do this change in the previous patch.

Attachment #514770 - Attachment is obsolete: true

http://hg.mozilla.org/projects/cedar/rev/739bfb41cd01 http://hg.mozilla.org/projects/cedar/rev/2def74c0f277

Comment 89

•

14 years ago

Keywords: helpwanted

Whiteboard: fixed-in-cedar

Karl Tomlinson (:karlt)

Comment 90

•

14 years ago

Sorry, I backed this out of cedar as test_socks.js was failing on NT. http://hg.mozilla.org/projects/cedar/rev/f17c1a0a7ca3 e.g. http://tinderbox.mozilla.org/showlog.cgi?log=Cedar/1300948416.1300950704.20889.gz#err0 server: using test case socks4:50743:127.0.0.1:8000:local server: using test case socks4:50743:12345.xxx:8001:remote server: using test case socks:50743:127.0.0.1:8002:local server: using test case socks:50743:abcdefg.xxx:8003:remote TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_socks.js | false == true - See following stack: JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 439 JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_check_eq :: line 491 JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_check_true :: line 503 JS frame :: c:/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_socks.js :: runScriptSubprocess :: line 44 JS frame :: c:/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_socks.js :: <TOP_LEVEL> :: line 431 JS frame :: c:/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_socks.js :: run_test :: line 483 JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: _execute_test :: line 322

Karl Tomlinson (:karlt)

Updated

•

14 years ago

Whiteboard: fixed-in-cedar

Updated

•

14 years ago

No longer depends on: post2.0

Whiteboard: not-ready

Justin Dolske [:Dolske]

Comment 91

•

14 years ago

(In reply to comment #90) > JS frame :: > c:/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_socks.js :: > runScriptSubprocess :: line 44 Michal, are you planning to investigate this? (Just checking since this bug has had lurchy progress in the past.) From a quick skim it seems the test is having problems finding xpcshell.exe. Not sure offhand why that would be.

bug_info

Comment 92

•

14 years ago

Possibly dumb question: is the patch from Michal included in the FF 4.0 release?

Comment 93

•

14 years ago

No. This patch is not included anywhere yet, since it failed tests when it was checked in.

paltu

Comment 94

•

14 years ago

Dumb question, in the patch "SOCKS xpshell test with automatic port discovery": + bin.append("xpcshell"); + if (!bin.exists()) { + bin.leafName = "xpshell.exe"; + do_check_true(bin.exists()); + if (!bin.exists()) + do_throw("Can't find xpshell binary"); Shouldn't it be xpcshell.exe instead of xpshell.exe?

Comment 95

•

14 years ago

Uh, yeah, good catch. Somehow we missed that.

Comment 96

•

14 years ago

For what it's worth, I pushed this patch with the correction from comment 94 to try for a Windows xpcshell run; if that passes, I'll land it.

Comment 97

•

14 years ago

Yeah, that went green. I pushed to cedar: http://hg.mozilla.org/mozilla-central/rev/31dce5e73ca4 http://hg.mozilla.org/mozilla-central/rev/1fe0a6fa91ea

Flags: in-testsuite+

Whiteboard: not-ready → fixed-in-cedar

http://hg.mozilla.org/mozilla-central/rev/31dce5e73ca4 http://hg.mozilla.org/mozilla-central/rev/1fe0a6fa91ea

Comment 98

•

14 years ago

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Resolution: --- → FIXED

Whiteboard: fixed-in-cedar

Target Milestone: --- → mozilla2.2

Comment 99

•

14 years ago

Which branch is cedar? If this isn't making it into a 4.0 dot release, the Tor Project is going to have to fork Firefox and remove Torbutton from addons.mozilla.org, and encourage our users to only use our Tor Browser Bundle instead. The reason for this is that the existing HTTP to SOCKS conversion proxies we need to work around this SOCKS issue (Polipo/Privoxy) are unmaintained and have a number of serious stability and security bugs that make it simpler for us to use our own Firefox builds instead of maintaining the same fork for Polipo/Privoxy...