Add support for named pipe connection to proxy.

RESOLVED FIXED in Firefox 52

Status

()

defect
P2
normal
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: xeonchen, Assigned: xeonchen)

Tracking

(Depends on 1 bug, Blocks 2 bugs)

unspecified
mozilla52
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox52 fixed)

Details

(Whiteboard: [tor][proxy][necko-active])

Attachments

(4 attachments, 27 obsolete attachments)

9.29 KB, patch
xeonchen
: review+
Details | Diff | Splinter Review
14.67 KB, patch
xeonchen
: review+
Details | Diff | Splinter Review
19.35 KB, patch
xeonchen
: review+
Details | Diff | Splinter Review
39.93 KB, patch
xeonchen
: review+
Details | Diff | Splinter Review
In bug 1211567 we implemented a feature to connect to proxy via domain socket.
This bug is going to support similar function but the medium is named pipe on Windows.
No longer depends on: 1287994
Depends on: 1287994
Whiteboard: [tor][necko-backlog] → [tor][necko-next]
Whiteboard: [tor][necko-next] → [tor][necko-next][proxy]
Whiteboard: [tor][necko-next][proxy] → [tor][proxy][necko-active]
Blocks: meta_tor
Priority: -- → P2
Posted patch WIP (obsolete) — Splinter Review
Posted patch WIP (obsolete) — Splinter Review
Has problems on HTTPS sites
Attachment #8790752 - Attachment is obsolete: true
Posted patch WIP (obsolete) — Splinter Review
Attachment #8791226 - Attachment is obsolete: true

Updated

3 years ago
Target Milestone: --- → mozilla52
Attachment #8792213 - Attachment is obsolete: true
The main purpose of this patch is to enable the ability to pass the name of the pipe to the corresponding |PRFileDesc|.

Some ugly magic number is introduced temporarily for not changing too many existing code, but I'm also thinking a better way to achieve this.
Attachment #8794131 - Flags: feedback?(daniel)
In this patch an IO layer for named pipe is implemented.

Daniel and Honza, would you please feedback this patch especially for the following parts:

* PR_NSPR_IO_LAYER object is replaced by the named pipe handle
* Not pretty sure how PR_Read/PR_Write handle nonblocking I/O? I ask this because read/written bytes are not available until the operation is done.
* Is what I've done in |GetPollFlags| reasonable?

Thank you!
Attachment #8794139 - Flags: feedback?(honzab.moz)
Attachment #8794139 - Flags: feedback?(daniel)
I will take a look, but I need to read the named pipes docs first.  I'm no expert to overlapped IO too.

Anyway, by quickly looking at the patch, one question arise: why don't you use a dedicated thread to poll the pipe?  (If there is a way to poll it, but I assume it is, the IPC code does it.)  Your use of CanRead() in Read() suggests you block for aTimeout time when there is nothing on the pipe.  The mNonblocking flag (whatever its purpose is - there is no comment on it ;)) doesn't seem to be used.  But I didn't go that carefully through the patch.

BTW, the patch is generally missing comments.  Some architectural overview of the patch put to bugzilla would be useful too.  Even when you ask just for a feedback ;)

One thing that's not clear is how you wake up the STS thread when something appears on the pipe, but again, I just went through the patch quickly.
Comment on attachment 8794131 [details] [diff] [review]
0001-Bug-1288308-Part-0-add-Named-Pipe-type-on-Windows-pl.patch

Review of attachment 8794131 [details] [diff] [review]:
-----------------------------------------------------------------

I think it is all pretty clear and straight-forward: no objects from me. The '42' should of course use a defined name instead but you already said so.
Attachment #8794131 - Flags: feedback?(daniel) → feedback+
(In reply to Honza Bambas (:mayhemer) from comment #6)
> I will take a look, but I need to read the named pipes docs first.  I'm no
> expert to overlapped IO too.
> 
> Anyway, by quickly looking at the patch, one question arise: why don't you
> use a dedicated thread to poll the pipe?  (If there is a way to poll it, but
> I assume it is, the IPC code does it.) 

The overlapped IO can be used for non-blocking IO, so I thought it is okay to use within the STS thread?
Not sure which IPC code do you mean, would you give me the link to check?

> Your use of CanRead() in Read()
> suggests you block for aTimeout time when there is nothing on the pipe.  The
> mNonblocking flag (whatever its purpose is - there is no comment on it ;))
> doesn't seem to be used.  But I didn't go that carefully through the patch.

No, I didn't use it yet.
I'm not sure the behavior for reading/writing should be blocking for non-blocking.
Based on [1][2] for |PR_Read| and |PR_Write|, it is supposed to be blocking, right?
I'm using non-blocking IO plus a wait function to retrieve return value.

> BTW, the patch is generally missing comments.  Some architectural overview
> of the patch put to bugzilla would be useful too.  Even when you ask just
> for a feedback ;)

The SOCKS IO layer creates the named pipe layer when the server address is a local named pipe address.
All operations are called through |PRIOMethods|, which creates |NamedPipeInfo| when the upper layer is trying to connect.
Basically main operations can be done by calling |nsNamedPipeSend|, |nsNamedPipeRecv|, and |nsNamedPipePoll|.

> One thing that's not clear is how you wake up the STS thread when something
> appears on the pipe, but again, I just went through the patch quickly.

Only by returning some values from |GetPollFlags|.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSPR/Reference/PR_Read
[2] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSPR/Reference/PR_Write
Comment on attachment 8794139 [details] [diff] [review]
0002-Bug-1288308-Part-1-implement-Named-Pipe-IO-layer-r-b.patch

Review of attachment 8794139 [details] [diff] [review]:
-----------------------------------------------------------------

If this is intended as a blocking implementation only (which appears to me), then you don't need overlapping and async operations at all.
If this is a base for an implementation that will be turned to a non-blocking (with optional blocking capabilities as well) then I think you will need to do some changes.

To answer the "* Not pretty sure how PR_Read/PR_Write handle nonblocking I/O? I ask this because read/written bytes are not available until the operation is done" question:

let's start with a non-blocking read.  The goal is to call Read() only when there are data.  If there are not, the function must immediately return with WOULD_BLOCK error and never block the thread.  For a TCP socket it works like this: note that all of the following is happening only on the socket transport thread (STS thread).  When the consumer of nsSocketInputStream wants to read from the socket (expects data) it calls AsyncWait on the input stream.  That sets the poll flags on the socket to PR_POLL_READ.  Based on that, we add that socket to the list of socket to poll and it's than passed to select().  select() doesn't exit until an event on some socket it polls occurs or an event to the STS thread is posted (from another thread) - this is important.  OK, some data has arrived on our socket we want to read from.  select() returns, and we signal PR_POLL_READ to the socket transport (and the appropriate input stream) that calls the callback passed to AsyncWait before.  Implementation of that callback will then call Read() on the input stream.  This ends down in TCP socket NSPR read (or is it recv?) NSPR layer function.  It tries to read from the socket, it is likely it returns some data.  But, if you call Read() on the socket's input stream before you get the notification about readability, you will probably get PR_WOULD_BLOCK_ERROR from the NSPR TCP socket recv() function call, because there are probably not any data to read.  In other words, non-blocking socket will simply throw an error when you try reading from it while there are no data.  A blocking socket will block (recv() will not return) as long as timeout has not passed or data has not arrived.  Generally, we only use non-blocking IO for sockets.  And the pipe should do the same.

But in this patch I don't see how the pipe is being polled for something.  You cannot add it to the list of sockets to be polled with select(), that is clear.  Your poll method should always return 0 unless you want to signal an existing event (like data availability) on the pipe yourself.

I'd go a different way to implement the non-blocking fashion of your pipe layer.  To explain what I have in mind, I'll stick with reading from the pipe.  It means, we've already connected to the proxy's pipe and wrote something to it (whichever way).  Now, we expect a data to arrive back to us, the socket input stream has been called AsyncWait().  It sets the poll flags, which you can then see in your Poll() method, with the PR_POLL_READ bit.  When you see that flag, you should call ReadFile with an overlapped event and reference to the internal buffer (you already have one, good!)  

If that succeeds, data fro the pipe are now in your internal buffer and you can immediately return PR_POLL_READ as an _out flag_ from you Poll().  That will immediately tell the upper layers that there is something to read on the pipe and will end up calling socket's input stream Read() method that will go down to your Read() method.  You can now copy the internal buffer to the consumer and return OK.

If ReadFile fails with ERROR_IO_PENDING, remember there is a read pending and return 0 from your Poll().  You unfortunately need a separate thread (or threads) that will wait for all the overlapped events.  When that thread is woken up (an event has been signaled), it will simply post a dummy runnable to the STS thread to wake it up.  It will then call your Poll() method again.  You remembered there was a read pending before.  So, you call GetOverlappedResult (don't wait!) to see if the operation is completed.  If it has completed, signal from your Poll method that there are data to read with PR_POLL_READ _out flag_.  If it's not completed, just continue waiting - no state change and return 0 from Poll().

When someone calls your Read() and there is nothing in your read buffer, you simply return PR_WOULD_BLOCK_ERROR.

Side note: you unfortunately cannot use a completion routine to wake the STS thread (through ReadFileEx) since select(), AFAIK, is not an alertable wait.

Note #2: if you don't wait wake the STS thread after an overlapped event signaled, you may get to a situation that the browser will hang for noticeable amounts of time (several seconds if not even way longer).

::: netwerk/socket/nsNamedPipeIOLayer.cpp
@@ +29,5 @@
> +  NS_INLINE_DECL_REFCOUNTING(NamedPipeInfo)
> +
> +public:
> +  explicit NamedPipeInfo();
> +  virtual ~NamedPipeInfo();

nit: make dtor private

@@ +111,5 @@
> +      LOG_ERROR("[%p] open pipe %s failed (%d)", this, path.get(), GetLastError());
> +      return NS_ERROR_FAILURE;
> +    }
> +
> +    if (!WaitNamedPipeA(path.get(), aTimeout)) {

why exactly are you waiting here?  first, this is a blocking operation, second, I don't see a reason (unless you know more than me) to do it.  If CreateFile fails, it means there is no pipe server accepting connections at the time (no pending ConnectNamedPipe on the socks proxy).  You expect it to be available in some amount of time (the timeout?)  IMO, when CreateFile fails, we should bail as there is no proxy listening.  But again, you may know more than me ;)

@@ +166,5 @@
> +    PR_SetError(PR_NOT_CONNECTED_ERROR, 0);
> +    return -1;
> +  }
> +
> +  if (!CanRead(aTimeout)) {

what is the purpose of the CanRead method and its usage here exactly?  you simply want to disallow reading when you already read (asynchronously) from the pipe?  if so, should you rather use GetOverlappedResult function instead of WaitForSingleObject(timeout) ?

@@ +168,5 @@
> +  }
> +
> +  if (!CanRead(aTimeout)) {
> +    // previous async I/O is not finished
> +    PR_SetError(PR_IO_PENDING_ERROR, 0);

as I understand how PR_IO_PENDING_ERROR is used, it's only returned when the current thread is doing an overlapped io that is controlled via a thread specific and NSPR internal structure held in the PRThread.md.overlapped.  I don't think you need to use this error code, since you control the overlapped IO yourself.

if you can't read here, you should return PR_WOULD_BLOCK_ERROR.  But, the way you implement the API, you do it fully blocking way.

@@ +179,5 @@
> +      PR_SetError(PR_WOULD_BLOCK_ERROR, 0);
> +      return -1;
> +    case -1: // disconnected
> +      LOG_ERROR("%s: disconnected\n", __func__);
> +      PR_SetError(PR_IO_ERROR, 0);

why not PR_NOT_CONNECTED_ERROR?  (I don't want to say that PR_IO_ERROR is wrong)

@@ +202,5 @@
> +        PR_SetError(PR_IO_ERROR, 0);
> +        return -1;
> +    }
> +
> +    success = GetOverlappedResult(mPipe, &mReadOverlapped, &bytesRead, TRUE);

The last arg being TRUE means you block.  From how you use it I don't think you'd need overlapping at all...

Also, when you wait here for the operation to finish, then CanRead (which waits for the overlapped event) will never say |false|.

@@ +248,5 @@
> +    WaitForSingleObject(mWriteOverlapped.hEvent, aTimeout);
> +    success = GetOverlappedResult(mPipe,
> +                                  &mWriteOverlapped,
> +                                  &bytesWritten,
> +                                  TRUE);

WaitForSingleObject will wait up to the timeout.  Regardless the result, you then block in GetOverlappedResult (again, wait = TRUE).
Attachment #8794139 - Flags: feedback?(honzab.moz) → feedback-
(In reply to Gary Chen [:xeonchen] from comment #8)
> (In reply to Honza Bambas (:mayhemer) from comment #6)
> > I will take a look, but I need to read the named pipes docs first.  I'm no
> > expert to overlapped IO too.
> > 
> > Anyway, by quickly looking at the patch, one question arise: why don't you
> > use a dedicated thread to poll the pipe?  (If there is a way to poll it, but
> > I assume it is, the IPC code does it.) 
> 
> The overlapped IO can be used for non-blocking IO, so I thought it is okay
> to use within the STS thread?

Hopefully this will be more clear from my feedback (a comment before).

> Not sure which IPC code do you mean, would you give me the link to check?

They are using completion ports (GetQueuedCompletionStatus).  See https://dxr.mozilla.org/mozilla-central/rev/29beaebdfaccbdaeb4c1ee5a43a9795ab015ef49/ipc/chromium/src/base/message_pump_win.cc#505 and around.

> 
> > Your use of CanRead() in Read()
> > suggests you block for aTimeout time when there is nothing on the pipe.  The
> > mNonblocking flag (whatever its purpose is - there is no comment on it ;))
> > doesn't seem to be used.  But I didn't go that carefully through the patch.
> 
> No, I didn't use it yet.
> I'm not sure the behavior for reading/writing should be blocking for
> non-blocking.
> Based on [1][2] for |PR_Read| and |PR_Write|, it is supposed to be blocking,
> right?

No.  When the socket (or in general NSPR terms - a FD) is set as non-blocking then it's expected that none of the function called on it never blocks.

> I'm using non-blocking IO plus a wait function to retrieve return value.
> 
> > BTW, the patch is generally missing comments.  Some architectural overview
> > of the patch put to bugzilla would be useful too.  Even when you ask just
> > for a feedback ;)
> 
> The SOCKS IO layer creates the named pipe layer when the server address is a
> local named pipe address.
> All operations are called through |PRIOMethods|, which creates
> |NamedPipeInfo| when the upper layer is trying to connect.
> Basically main operations can be done by calling |nsNamedPipeSend|,
> |nsNamedPipeRecv|, and |nsNamedPipePoll|.

That is not an architectural overview ;)

> 
> > One thing that's not clear is how you wake up the STS thread when something
> > appears on the pipe, but again, I just went through the patch quickly.
> 
> Only by returning some values from |GetPollFlags|.

If that may happen that it's not clear from my previous comment, this is not how you WAKE the STS thread when it is inside select().

> 
> [1]
> https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSPR/Reference/
> PR_Read
> [2]
> https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSPR/Reference/
> PR_Write
Attachment #8794139 - Attachment is obsolete: true
Attachment #8794139 - Flags: feedback?(daniel)
This patch introduces a nsNamedPipeService that notifies observers when registered overlapped I/O operations are completed.
Attachment #8797654 - Flags: feedback?(honzab.moz)
Re-implement the model based on the comment 9.
Attachment #8797655 - Flags: feedback?(honzab.moz)
Comment on attachment 8797655 [details] [diff] [review]
0003-Bug-1288308-Part-2-implement-Named-Pipe-IO-layer-r-b.patch

Review of attachment 8797655 [details] [diff] [review]:
-----------------------------------------------------------------

note: only works on non-e10s environment, checking

::: netwerk/socket/nsNamedPipeIOLayer.cpp
@@ +31,5 @@
> +static PRDescIdentity nsNamedPipeLayerIdentity;
> +static PRIOMethods nsNamedPipeLayerMethods;
> +
> +class NamedPipeInfo final : public nsINamedPipeDataObserver
> +                          , public nsIEventTarget

|nsIEventTarget| is no longer necessary, will be removed.

@@ +153,5 @@
> +    LOG_DEBUG("[%s] write %d bytes", __func__, aBytesTransferred);
> +  } else {
> +    LOG_ERROR("invalid callback");
> +    return NS_ERROR_FAILURE;
> +  }

Should post an event to |mOwningThread| here, I'll fix this in next patch.
Comment on attachment 8797654 [details] [diff] [review]
0002-Bug-1288308-Part-1-implement-nsNamedPipeService-r-ba.patch

Gary, I think these patches will update, right?  We did a lot F2F discussions last week in Vienna, so f given there.
Attachment #8797654 - Flags: feedback?(honzab.moz)
Attachment #8797655 - Flags: feedback?(honzab.moz)
Refined per Honza's comment during the meeting in Vienna.
Attachment #8797655 - Attachment is obsolete: true
Attachment #8800179 - Flags: feedback?(honzab.moz)
Attachment #8800179 - Flags: feedback?(daniel)
Attachment #8797660 - Attachment description: 0001-Bug-1288308-Part-0-add-Named-Pipe-type-on-Windows-pl.patch → Part 0: add Named Pipe type on Windows platform
Attachment #8797660 - Flags: review?(daniel)
Attachment #8797654 - Attachment is obsolete: true
Attachment #8801030 - Flags: review?(honzab.moz)
Attachment #8801030 - Flags: review?(daniel)
Attachment #8797656 - Attachment is obsolete: true
Attachment #8801031 - Flags: review?(daniel)
the difference between v2 and v3 is about comment only.
Attachment #8800179 - Attachment is obsolete: true
Attachment #8800179 - Flags: feedback?(honzab.moz)
Attachment #8800179 - Flags: feedback?(daniel)
Attachment #8801035 - Flags: review?(honzab.moz)
Attachment #8801035 - Flags: review?(daniel)
Comment on attachment 8797660 [details] [diff] [review]
Part 0: add Named Pipe type on Windows platform

Review of attachment 8797660 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine! Just a few questions/notes that aren't stopping anything.

A little related note for the future: on Linux there's this little variation of unix domain sockets called "abstract sockets" (see "man unix") that I wouldn't be surprised if proxies on Linux would like to be able to take advantage of, and then we might need to offer some additional tweaks.

::: netwerk/base/nsProtocolProxyService.cpp
@@ +401,3 @@
>      return Substring(aHost, 0, 5) == "file:";
> +#elif defined(XP_WIN)
> +    return Substring(aHost, 0, 9) == "\\\\.\\pipe\\";

Curious about this prefix. What's the motivation for using a separate one for windows? Is this a commonly used prefix or how come this particular one is chosen?

Also, you check for this prefix in two places (the second place is in nsSOCKSIOLayer.cpp). Wouldn't it make sense to use a const somewhere to refer to the same string in both places?

Or perhaps the entire "IsHostLocalTarget" function could be reused as the checks done in both places are very similar.

::: netwerk/dns/DNS.cpp
@@ +94,5 @@
>    }
> +#elif defined(XP_WIN)
> +  else if (addr->raw.family == AF_LOCAL) {
> +    prAddr->local.family = PR_AF_LOCAL;
> +    memcpy(prAddr->local.path, addr->local.path, sizeof(addr->local.path));

I realize here I didn't remark anything on this for the unix version, but...

is there a good reason to copy anything more than the path itself? This always copies the full "sizeof(addr->local.path)" instad of just the length of the path + 1 ?

::: netwerk/dns/DNS.h
@@ +21,5 @@
>  #ifdef XP_WIN
>  #include "winsock2.h"
>  #endif
>  
> +#if defined(XP_WIN) && !defined(AF_LOCAL)

This could possibly be simplified to just

#ifndef AF_LOCAL

... since only win systems should end up like that and it won't harm other systems to have that extra check.

::: netwerk/socket/nsSOCKSIOLayer.cpp
@@ +121,5 @@
>          return Substring(proxyHost, 0, 5) == "file:";
> +#elif defined(XP_WIN)
> +        nsAutoCString proxyHost;
> +        mProxy->GetHost(proxyHost);
> +        return Substring(proxyHost, 0, 9) == "\\\\.\\pipe\\";

Here's the second use of this prefix, as mentioned above.
Attachment #8797660 - Flags: review?(daniel) → review+
Comment on attachment 8797660 [details] [diff] [review]
Part 0: add Named Pipe type on Windows platform

Review of attachment 8797660 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you Daniel, I'll fix them in the next patch.

::: netwerk/base/nsProtocolProxyService.cpp
@@ +401,3 @@
>      return Substring(aHost, 0, 5) == "file:";
> +#elif defined(XP_WIN)
> +    return Substring(aHost, 0, 9) == "\\\\.\\pipe\\";

The prefix is based on [1], and specialized for local-only names.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/aa365783(v=vs.85).aspx

::: netwerk/dns/DNS.cpp
@@ +94,5 @@
>    }
> +#elif defined(XP_WIN)
> +  else if (addr->raw.family == AF_LOCAL) {
> +    prAddr->local.family = PR_AF_LOCAL;
> +    memcpy(prAddr->local.path, addr->local.path, sizeof(addr->local.path));

I know it's not a good reason, but I just follow the original style above.
improvements, carry r+
Attachment #8797660 - Attachment is obsolete: true
Attachment #8802386 - Flags: review+
rebase, carry r+
Attachment #8802386 - Attachment is obsolete: true
Attachment #8802403 - Flags: review+
Comment on attachment 8801030 [details] [diff] [review]
[v2] Part 1: implement nsNamedPipeService

Review of attachment 8801030 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!

I'm not familiar with windows APIs in general and therefore not I/O completion ports either so I can't tell if we're using them correctly in this code. Although with your additional tests for this, both manual and the test cases, I think it feels safe to proceed.

Honza is also up for reviewing this and I suspect he has more knowledge in the windowsy side of this.

::: netwerk/socket/nsINamedPipeService.idl
@@ +22,5 @@
> +     *        Available bytes during last transmission.
> +     * @param aOverlapped
> +     *        Corresponding overlapped structure used by the async I/O
> +     */
> +    void onDataAvailable(in unsigned long aBytesTransferred,

I assume this is the standard type to use for bytes transferred, it just looks weird when we know this is for windows and windows uses 32 bit longs even on 64 bit windows...
Attachment #8801030 - Flags: review?(daniel) → review+
Comment on attachment 8801035 [details] [diff] [review]
[v3] Part 2: implement Named Pipe IO layer

Review of attachment 8801035 [details] [diff] [review]:
-----------------------------------------------------------------

I only have a few questions I think need to get clarified before r+'ing this.

::: netwerk/socket/nsNamedPipeIOLayer.cpp
@@ +83,5 @@
> +  int32_t DoReadContinue();
> +  int32_t DoWrite();
> +  int32_t DoWriteContinue();
> +
> +  static const uint32_t kBufferSize = 65536; // size of internal buffer

Is there any science behind this number, have you tested different numbers or is this just arbitrary set to something we believe is fine?

@@ +151,5 @@
> +    LOG_ERROR("invalid callback");
> +    return NS_ERROR_FAILURE;
> +  }
> +
> +  mErrorCode = ERROR_SUCCESS;

So what if the return path above is followed, what's the mErrorCode value supposed to be then? Just untouched? If so, I think it could warrant a comment since to me it looks like a mistake.

@@ +208,5 @@
> +
> +  nsresult rv = mNamedPipeService->AddDataObserver(pipe, this);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    LOG_ERROR("AddDataObserver failed");
> +    return rv;

missing a CloseHandle(pipe) in this error path?

@@ +212,5 @@
> +    return rv;
> +  }
> +
> +  mReadOverlapped.hEvent = CreateEventA(nullptr, TRUE, TRUE, "NamedPipeRead");
> +  mWriteOverlapped.hEvent = CreateEventA(nullptr, TRUE, TRUE, "NamedPipeWrite");

Not checking that the return values are OK?

@@ +695,5 @@
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +
> +  NamedPipeInfo* info = GetNamedPipeInfo(aFd);
> +  return static_cast<PRInt32>(info->Available());

And this is guaranteed to never be larger than an int32, right? Maybe add that as a comment then. And an assert within the Available method?

::: nsprpub/pr/src/md/windows/w32poll.c
@@ +157,5 @@
>                  bottom = PR_GetIdentitiesLayer(pd->fd, PR_NSPR_IO_LAYER);
> +
> +                // Bug 1288308: bottom would be NULL if named pipe proxy is
> +                // being used because there's no PR_NSPR_IO_LAYER.
> +                // PR_ASSERT(NULL != bottom);  /* what to do about that? */

I think you should cut out the "/* what to do about that? */" part at least. I know it comes from the original line that you've commented here but that part only makes the whole comment very confusing after this change is applied.
Attachment #8801035 - Flags: review?(daniel) → review-
Comment on attachment 8801031 [details] [diff] [review]
[v2] Part 3: test for nsINamedPipeService

Review of attachment 8801031 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM! Just a question, on two places.

::: netwerk/test/TestNamedPipeService.cpp
@@ +47,5 @@
> +  , mOverlapped()
> +  , mBytesTransferred(0)
> +  , mMonitor("named-pipe")
> +{
> +  mOverlapped.hEvent = CreateEventA(nullptr, TRUE, TRUE, "named-pipe");

No checking return code?

@@ +236,5 @@
> +static nsresult
> +CreateNamedPipe(LPHANDLE aServer, LPHANDLE aClient)
> +{
> +  OVERLAPPED overlapped;
> +  overlapped.hEvent = CreateEvent(NULL, TRUE, TRUE, NULL);

No checking return code?
Attachment #8801031 - Flags: review?(daniel) → review+
(In reply to Daniel Stenberg [:bagder] from comment #27)
> ::: netwerk/socket/nsNamedPipeIOLayer.cpp
> @@ +83,5 @@
> > +  int32_t DoReadContinue();
> > +  int32_t DoWrite();
> > +  int32_t DoWriteContinue();
> > +
> > +  static const uint32_t kBufferSize = 65536; // size of internal buffer
> 
> Is there any science behind this number, have you tested different numbers
> or is this just arbitrary set to something we believe is fine?

There was a limitation [1] that max write is 64k, and the API [2] I used at first also mentions the limitation.
Seems the limitation no longer exists, but I'd leave the value as is.

[1] https://support.microsoft.com/en-us/kb/119218
[2] https://msdn.microsoft.com/en-us/library/windows/desktop/aa365790%28v=vs.85%29.aspx
modification based on comment 27
Attachment #8801035 - Attachment is obsolete: true
Attachment #8801035 - Flags: review?(honzab.moz)
Attachment #8802811 - Flags: review?(honzab.moz)
Attachment #8802811 - Flags: review?(daniel)
Comment on attachment 8802811 [details] [diff] [review]
[v4] Part 2: implement Named Pipe IO layer

Review of attachment 8802811 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/socket/nsNamedPipeIOLayer.cpp
@@ +840,5 @@
> +
> +  PRFileDesc* layer = PR_CreateIOLayerStub(nsNamedPipeLayerIdentity,
> +                                           &nsNamedPipeLayerMethods);
> +  if (NS_WARN_IF(!layer))
> +  {

move open brace to the if() line!
Attachment #8802811 - Flags: review?(daniel) → review+
Comment on attachment 8801030 [details] [diff] [review]
[v2] Part 1: implement nsNamedPipeService

Review of attachment 8801030 [details] [diff] [review]:
-----------------------------------------------------------------

I'm very sorry for later review here!  Few potential lifetime and locking issues I can see in the patch (can be really critical).  Generally we are getting where we need to.

::: netwerk/build/nsNetModule.cpp
@@ +295,5 @@
>  } // namespace net
>  } // namespace mozilla
>  
> +#ifdef XP_WIN
> +#include "mozilla/net/nsNamedPipeService.h"

note that instead of export (if this is the only consumer) you can also do #include "../socket/nsNamed....h"

::: netwerk/socket/nsINamedPipeService.idl
@@ +18,5 @@
> +    /**
> +     * onDataAvailable
> +     *
> +     * @param aBytesTransferred
> +     *        Available bytes during last transmission.

s/Available/Transfered/

@@ +50,5 @@
> +     * @param aHandle
> +     *        The handle that is going to be monitored for read/write operations.
> +     *        Only handles that are opened with overlapped IO are supported.
> +     * @param aObserver
> +     *        The observer of the handle

say that the service strong-refs the observer

::: netwerk/socket/nsNamedPipeService.cpp
@@ +16,5 @@
> +static mozilla::LazyLogModule gNamedPipeServiceLog("NamedPipe");
> +#define LOG_DEBUG(...) \
> +  MOZ_LOG(gNamedPipeServiceLog, mozilla::LogLevel::Debug, (__VA_ARGS__))
> +#define LOG_ERROR(...) \
> +  MOZ_LOG(gNamedPipeServiceLog, mozilla::LogLevel::Error, (__VA_ARGS__))

new rule: please remove the MOZ_ prefix and add NPS_ (aka Named Pipe Service) prefix to your log macros.  we want not to repeat bug 1311682 in the future.

@@ +42,5 @@
> +
> +  // nsIObserverService must be accessed in main thread.
> +  // register shutdown event to stop NamedPipeSrv thread.
> +  nsIObserver* self = nsCOMPtr<nsIObserver>(this).forget().take();
> +  nsCOMPtr<nsIRunnable> r = NS_NewRunnableFunction([self] () -> void {

nit:
nsCOMPtr<nsIObserver> self(this);
nsCOMPtr<nsIRunnable> r = NS_NewRunnableFunction([self = Move(self)]() -> void {
  and then use self as a usual comptr

because otherwise when NS_DispatchToMainThread fails you leak |self|

@@ +70,5 @@
> +    return NS_ERROR_FAILURE;
> +  }
> +
> +  if (NS_WARN_IF(NS_FAILED(rv = NS_NewNamedThread("NamedPipeSrv",
> +                                                  getter_AddRefs(mThread))))) {

split this please:
rv = NS_New...
if (NS_WARN_IF(NS_FAILED(rv)) { ...

@@ +92,5 @@
> +
> +  // stop thread
> +  if (mThread && !mIsShutdown) {
> +    mIsShutdown = true;
> +    mThread->Shutdown();

here you mostly count on the timeout of GetQueuedCompletionStatus, right?  I don't see any way you would wake the thread up.  but that is probably alright for the first version.

@@ +93,5 @@
> +  // stop thread
> +  if (mThread && !mIsShutdown) {
> +    mIsShutdown = true;
> +    mThread->Shutdown();
> +    mThread = nullptr;

this is gonna make nsNamedPipeService::IsOnCurrentThread crash

@@ +123,5 @@
> +    return NS_ERROR_FAILURE;
> +  }
> +  if (NS_WARN_IF(h != mIocp)) {
> +    CloseHandle(h);
> +    return NS_ERROR_FAILURE;

maybe log this too

@@ +127,5 @@
> +    return NS_ERROR_FAILURE;
> +  }
> +
> +  MutexAutoLock lock(mLock);
> +  if (!mObservers.Contains(aObserver)) {

how often do you expect the observer already be present?  Contains() is quite expensive and also it smells odd to duplicate this (it also means you may call CreateIoCompletionPort twice for the same observer?)

@@ +133,5 @@
> +    if (mObservers.Length() == 1) {
> +      LOG_DEBUG("first observer, start loop");
> +      rv = mThread->Dispatch(this, NS_DISPATCH_NORMAL);
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        mObservers.Clear();

hmm.. so they never get any message anymore?  iterate with onerror would be better than just forgetting them (leaks)

@@ +193,5 @@
> +    BOOL success = GetQueuedCompletionStatus(mIocp,
> +                                             &bytesTransferred,
> +                                             &key,
> +                                             &overlapped,
> +                                             1000); // timeout, 1s

not very battery friendly when holding long lasting connections.  but leave it, we can handle that in a followup if needed.

@@ +204,5 @@
> +      LOG_ERROR("invalid key");
> +      continue;
> +    }
> +
> +    // don't use nsCOMPtr here since it might have been destroyed.

s/it/key/

say why it's legal the instance can already be gone (despite how much I don't like the premise)

btw, how do you handle duplicates?  keep in mind that specially windows mem allocator (when jemalloc is off) likes to recycle pointers on same instances.

and finally, can you somehow change the code easily to not work with invalid handles?  this is something that smells a lot to be not super stable.

@@ +210,5 @@
> +      reinterpret_cast<nsINamedPipeDataObserver*>(key);
> +
> +    {
> +      // keep this lock until callback function is called to avoid
> +      // changing reference count in another thread.

ref count of what?  please be more concrete in your comments (in general, don't use pronouns in comments to avoid non-clear comments)

@@ +219,5 @@
> +        if (observer.get() == obs) {
> +          found = true;
> +          break;
> +        }
> +      }

you have nsTArray::IndexOf()
https://dxr.mozilla.org/mozilla-central/rev/215f9686117673a2c914ed207bc7da9bb8d741ad/xpcom/glue/nsTArray.h#1143

@@ +231,5 @@
> +        LOG_DEBUG("OnDataAvailable: obs=%p, bytes=%d", obs, bytesTransferred);
> +        obs->OnDataAvailable(bytesTransferred, overlapped);
> +      } else {
> +        LOG_ERROR("GetQueuedCompletionStatus %p failed, error=%d", obs, err);
> +        obs->OnError(err, overlapped);

|obs| should be addrefed/released around the call.  this is not very healthy.

also, it's big mistake to call observers (an implementation you don't control) from inside a lock.  in case of reentrancy you will deadlock or crash.

::: netwerk/socket/nsNamedPipeService.h
@@ +5,5 @@
> +
> +#ifndef mozilla_netwerk_socket_nsNamedPipeService_h
> +#define mozilla_netwerk_socket_nsNamedPipeService_h
> +
> +#include <set>

do you need this?

@@ +6,5 @@
> +#ifndef mozilla_netwerk_socket_nsNamedPipeService_h
> +#define mozilla_netwerk_socket_nsNamedPipeService_h
> +
> +#include <set>
> +#include <windows.h>

Windows.h

@@ +36,5 @@
> +  virtual ~nsNamedPipeService() = default;
> +  void Shutdown();
> +
> +  HANDLE mIocp;
> +  Atomic<bool> mIsShutdown;

please add comments how this is used
Attachment #8801030 - Flags: review?(honzab.moz) → feedback+
Comment on attachment 8802811 [details] [diff] [review]
[v4] Part 2: implement Named Pipe IO layer

Review of attachment 8802811 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed.

I could not test this time.  The patches (0-4 parts) don't apply cleanly to m-c or provide instructions how to apply them (probably my fault, I was postponing this review too much....)

I think this is closing to be ready for landing.  Shake few small details and we can go.

Generally I very much lack comments on members and method declarations!  that makes the review way harder and is also something we have to do anyway.

::: netwerk/socket/nsNamedPipeIOLayer.cpp
@@ +27,5 @@
> +namespace net {
> +
> +static mozilla::LazyLogModule gNamedPipeLog("NamedPipe");
> +#define LOG_DEBUG(...) MOZ_LOG(gNamedPipeLog, mozilla::LogLevel::Debug, (__VA_ARGS__))
> +#define LOG_ERROR(...) MOZ_LOG(gNamedPipeLog, mozilla::LogLevel::Error, (__VA_ARGS__))

maybe use the same log module as for the service?  maybe just use NamedPipe module for all?  consider "NamedPipeWin" name of the module.

@@ +35,5 @@
> +
> +class NamedPipeInfo final : public nsINamedPipeDataObserver
> +{
> +public:
> +  NS_DECL_ISUPPORTS

not thread safe?  then I would not understand why the observer array in the service is protected with a mutex

@@ +43,5 @@
> +
> +  nsresult Connect(const nsACString& aPath);
> +  nsresult Disconnect();
> +
> +  // Read data from internal buffer.

some more words about how this behaves in various states of the info object would be good.

@@ +65,5 @@
> +  bool IsConnected() const;
> +  bool IsNonblocking() const;
> +  HANDLE GetHandle() const;
> +
> +  int16_t GetPollFlags(int16_t aInFlags, int16_t* aOutFlags);

this one definitely should have docs.

@@ +76,5 @@
> +   * depending on |mNonblocking|. In blocking mode, they return when the action
> +   * has been done and in non-blocking mode it returns the number of bytes that
> +   * were read/written if the operation is done immediately. If it takes some
> +   * time to finish the operation, zero is returned and
> +   * DoReadContinue/DoWriteContinue must be called to get async I/O result.

this is a good comment! :)  thanks.

@@ +99,5 @@
> +
> +  uint8_t mReadBuffer[kBufferSize];
> +  DWORD mReadSize;
> +  DWORD mReadOffset;
> +  bool mHasPendingRead;

all these need good comments too.

@@ +104,5 @@
> +
> +  uint8_t mWriteBuffer[kBufferSize];
> +  DWORD mWriteSize;
> +  DWORD mWriteOffset;
> +  bool mHasPendingWrite;

as well here.

@@ +106,5 @@
> +  DWORD mWriteSize;
> +  DWORD mWriteOffset;
> +  bool mHasPendingWrite;
> +
> +  bool mNonblocking;

add comment

@@ +143,5 @@
> +NS_IMETHODIMP
> +NamedPipeInfo::OnDataAvailable(uint32_t aBytesTransferred,
> +                               void* aOverlapped)
> +{
> +  bool isOnPipeServiceThread;

DebugOnly<bool>
#include "mozilla/DebugOnly.h"

@@ +169,5 @@
> +NS_IMETHODIMP
> +NamedPipeInfo::OnError(uint32_t aError,
> +                       void* aOverlapped)
> +{
> +  bool isOnPipeServiceThread;

debugonly too

@@ +201,5 @@
> +                     FILE_FLAG_OVERLAPPED,
> +                     nullptr);
> +
> +  if (pipe == INVALID_HANDLE_VALUE) {
> +    return NS_ERROR_FAILURE;

log GetLastError()

@@ +277,5 @@
> +  }
> +
> +  /**
> +   * If there's nothing in the read buffer, it triggers internal read operation
> +   * by calling |GetPollFlags|. This is for callers that uses blocking IO.

This is for callers that uses blocking IO.

s/uses/use/

I don't follow ..  If !Available() you may end up returning WOULD_BLOCK.  that is not something expected on a blocking file desc.  the code is probably correct, just the comment seems misplaced.

@@ +290,5 @@
> +    }
> +  }
> +
> +  int32_t bytesRead = std::min<int32_t>(aSize, Available());
> +  memcpy(aBuffer, &mReadBuffer[mReadOffset], bytesRead);

add a comment that Available() can't return more than what fits to the buffer at the read offset!  otherwise this looks like a potential for buffer overflow

@@ +314,5 @@
> +
> +  int32_t bytesToWrite = std::min<int32_t>(aSize,
> +                                           sizeof(mWriteBuffer) - mWriteSize);
> +  if (bytesToWrite == 0) {
> +    PR_SetError(PR_WOULD_BLOCK_ERROR, 0);

even when blocking?  this is not good.

@@ +322,5 @@
> +  memcpy(&mWriteBuffer[mWriteSize], aBuffer, bytesToWrite);
> +  mWriteSize += bytesToWrite;
> +
> +  /**
> +   * Triggers internal write operation. This is for blocking IO mode.

this comment is insufficient, I don't much follow what happens with the getpollflags call here

@@ +353,5 @@
> +  }
> +
> +  int32_t bytesRead = std::min<int32_t>(Available(), aSize);
> +  memcpy(aBuffer, &mReadBuffer[mReadOffset], bytesRead);
> +  return bytesRead;

do you think we could turn Read() to call Peek() and on success just move the read cursor?  it would save some code duplication.

@@ +361,5 @@
> +NamedPipeInfo::Available() const
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +  MOZ_ASSERT(mReadOffset <= mReadSize);
> +  MOZ_ASSERT(mReadSize - mReadOffset < 0x7FFFFFFF);

just for curiosity, what's this check for ? :)

@@ +362,5 @@
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +  MOZ_ASSERT(mReadOffset <= mReadSize);
> +  MOZ_ASSERT(mReadSize - mReadOffset < 0x7FFFFFFF);
> +  return mReadSize - mReadOffset;

the names of the mRead* members could be better chosen ;)

@@ +379,5 @@
> +void
> +NamedPipeInfo::SetNonblocking(bool nonblocking)
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +  mNonblocking = nonblocking;

so, is mNonblocking only accessed on the socket thread?  if so, state it in the header (as part of the rich comment about this member ;))

@@ +462,5 @@
> +  mReadOffset = 0;
> +  mReadSize = 0;
> +
> +  BOOL success = ReadFile(mPipe,
> +                          mReadBuffer,

should you offset to the buffer with mReadOffset?  or mReadSize, I don't know which is one...

@@ +475,5 @@
> +
> +  switch (GetLastError()) {
> +    case ERROR_MORE_DATA:   // has more data to read
> +      mHasPendingRead = true;
> +      return DoReadContinue();

please state in a comment why you do this call here.

@@ +615,5 @@
> +                   PRIntervalTime aTimeout)
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +
> +  NamedPipeInfo* info = GetNamedPipeInfo(aFd);

check nonnull? on more places

@@ +646,5 @@
> +  }
> +
> +  if (aFd->lower) {
> +    return aFd->lower->methods->close(aFd->lower);
> +  }

shouldn't aFd be somehow freed here?

@@ +675,5 @@
> +                PRIntervalTime aTimeout)
> +{
> +  MOZ_ASSERT(PR_GetCurrentThread() == gSocketThread);
> +
> +  Unused << aTimeout;

do you plan to support timeout?  if so, followup is OK

@@ +828,5 @@
> +
> +bool
> +IsNamedPipePath(const nsACString& aPath)
> +{
> +  return Substring(aPath, 0, 9) == "\\\\.\\pipe\\";

There is also StringBeginsWith,  use NS_LITERAL_CSTRING("\\\\.\\pipe\\");

::: netwerk/socket/nsNamedPipeIOLayer.h
@@ +14,5 @@
> +
> +bool IsNamedPipePath(const nsACString& aPath);
> +PRFileDesc* CreateNamedPipeLayer();
> +
> +extern PRDescIdentity nsNamedPipeLayerIdentity;

check if there actually is a consumer for this.

::: netwerk/socket/nsSOCKSIOLayer.cpp
@@ +90,3 @@
>  
>  private:
> +    virtual ~nsSOCKSSocketInfo() { SetNamedPipeFD(nullptr); HandshakeFinished(); }

split as:

virtual ~nsS...() {
  Set..();
  Hand..();
}

@@ +1557,5 @@
> +    PRDescIdentity fdIdentity = PR_GetLayersIdentity(fd);
> +#if defined(XP_WIN)
> +    if (fdIdentity == mozilla::net::nsNamedPipeLayerIdentity) {
> +        // Set named pipe FD to non-nullptr to enable blocking
> +        // mode read/write on named pipe. This is required since SOCKSIOLayer

"remember named pipe fd on the info object so that we can switch blocking and non-blocking mode on the pipe later."

OTOH, when I see now how this is used, could you just search for your layer in mFD instead?  maybe not, I just may not remember if it's possible or not.

::: nsprpub/pr/src/md/windows/w32poll.c
@@ +157,5 @@
>                  bottom = PR_GetIdentitiesLayer(pd->fd, PR_NSPR_IO_LAYER);
> +
> +                // Bug 1288308: bottom would be NULL if named pipe proxy is
> +                // being used because there's no PR_NSPR_IO_LAYER.
> +                // PR_ASSERT(NULL != bottom);

change this to:

PR_ASSERT(NULL != bottom || (in_flags_read | in_flags_write | pd->in_flags) == 0);

also, this is an NSPR change, but I think we can take a "private" (=landed on m-c) patch here for awhile.  but correctly this has to land in the NSPR repository and has to be tracked in a separate bug under NSPR product.
Attachment #8802811 - Flags: review?(honzab.moz) → review+
rebase, carry r+
Attachment #8802403 - Attachment is obsolete: true
Attachment #8804267 - Flags: review+
(In reply to Honza Bambas (:mayhemer) from comment #32)
> Comment on attachment 8801030 [details] [diff] [review]
> [v2] Part 1: implement nsNamedPipeService
> 
> Review of attachment 8801030 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm very sorry for later review here!  Few potential lifetime and locking
> issues I can see in the patch (can be really critical).  Generally we are
> getting where we need to.
> ::: netwerk/socket/nsNamedPipeService.cpp
> @@ +204,5 @@
> > +      LOG_ERROR("invalid key");
> > +      continue;
> > +    }
> > +
> > +    // don't use nsCOMPtr here since it might have been destroyed.
> 
> s/it/key/
> 
> say why it's legal the instance can already be gone (despite how much I
> don't like the premise)
> 
> btw, how do you handle duplicates?  keep in mind that specially windows mem
> allocator (when jemalloc is off) likes to recycle pointers on same instances.

What does the "duplicates" mean? Handles or observers?
Are you talking about adding duplicate |HANDLE| values to |CreateIoCompletionPort|?

> and finally, can you somehow change the code easily to not work with invalid
> handles?  this is something that smells a lot to be not super stable.

Does "invalid handles" mean |key|s not in the |mObservers|?

We cannot control the life-cycle of instances set to |CreateIoCompletionPort|
and get from |GetQueuedCompletionStatus|, so the solution here is using a raw pointer
and a table to simulate what weak pointers do.
Flags: needinfo?(honzab.moz)
(In reply to Gary Chen [:xeonchen] from comment #35)
> (In reply to Honza Bambas (:mayhemer) from comment #32)
> > Comment on attachment 8801030 [details] [diff] [review]
> > [v2] Part 1: implement nsNamedPipeService
> > 
> > Review of attachment 8801030 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > I'm very sorry for later review here!  Few potential lifetime and locking
> > issues I can see in the patch (can be really critical).  Generally we are
> > getting where we need to.
> > ::: netwerk/socket/nsNamedPipeService.cpp
> > @@ +204,5 @@
> > > +      LOG_ERROR("invalid key");
> > > +      continue;
> > > +    }
> > > +
> > > +    // don't use nsCOMPtr here since it might have been destroyed.
> > 
> > s/it/key/
> > 
> > say why it's legal the instance can already be gone (despite how much I
> > don't like the premise)
> > 
> > btw, how do you handle duplicates?  keep in mind that specially windows mem
> > allocator (when jemalloc is off) likes to recycle pointers on same instances.
> 
> What does the "duplicates" mean? Handles or observers?

Here I mean observers, sorry probably wasn't clear enough.  The problem I smell is that the "key" argument, which is a context value sticking to the handle via the MS pipe API, may already point to an invalid memory (the observer is released and removed from the observers array sooner, than you stop "polling" the handle bound to it.  If this is not the case then there is no problem, but I think this is possible (persuade me otherwise if not :)).

According the duplicity problem, the following may happen: the IO polling thread you introduced is inside the GetQueuedCompletionStatus, waiting for a handle H1 bound to an observer O1 ("key" for H1 is O1's address).  Now, the observer O1 is removed from the array (we no longer want to poll it) and released from memory, also H1 is closed (I think this is wrong too, I'll get to it later).  Immediately after that a new observer O2 and new handle H2 are created and added to the observer array.  It may very easily happen that O2's address is the same as O1's - this is the duplication I talk about.  Now, your polling thread exits GetQueuedCompletionStatus and I think it potentially may indicate completion for H1 (already closed) it gives you "key" which is pointing at the same address as O2 is currently at.  You are searching the observer array, find O2 and call on it, wrongly expecting the event is for H2.  But it's actually for H1.

Other thing is that you are closing handles while they're still being polled for by GetQueuedCompletionStatus.  I'm not sure it's legal, maybe it is.  You have check this in the API documentation.  Anyway, I think that nsNamedPipeService::RemoveDataObserver should make the removal itself happen on the polling thread to avoid these potentials for duplicates and polling of closed handles.  Maybe there is a simpler solution, it's up to you to decide.

Generally, a good production quality patch must be have object lifetime well thought through.  Here I sense a problem, hence the r-.

> Are you talking about adding duplicate |HANDLE| values to
> |CreateIoCompletionPort|?

No.

> 
> > and finally, can you somehow change the code easily to not work with invalid
> > handles?  this is something that smells a lot to be not super stable.
> 
> Does "invalid handles" mean |key|s not in the |mObservers|?

Sorry for using words 'invalid handles', I wanted to say 'invalid observer pointers'.  If it's not there, it's not a good design, but itself would probably work.  But there is the duplicity problem outlined above.

> 
> We cannot control the life-cycle of instances set to |CreateIoCompletionPort|
> and get from |GetQueuedCompletionStatus|, so the solution here is using a
> raw pointer
> and a table to simulate what weak pointers do.

It's correct to use an array and search in it (hash table would be more efficient, but we don't have a key type for this purpose).  OTOH, you may change the code to make sure that the pointer to an observer in "key" is always valid.  Actually, you should.

Hope it's clear now, feel free to ask more questions.  This is a bit more complicated patch.
Flags: needinfo?(honzab.moz)
rebase, carry r+
Attachment #8804267 - Attachment is obsolete: true
Attachment #8804987 - Flags: review+
refinements based on comment 32
Attachment #8801030 - Attachment is obsolete: true
Attachment #8804988 - Flags: review?(honzab.moz)
refinements based on comment 33, carry r+
Attachment #8800180 - Attachment is obsolete: true
Attachment #8802811 - Attachment is obsolete: true
Attachment #8804989 - Flags: review+
rebase, carry r+
Attachment #8801031 - Attachment is obsolete: true
Attachment #8804990 - Flags: review+
Comment on attachment 8804988 [details] [diff] [review]
[v3] Part 1: implement nsNamedPipeService

Review of attachment 8804988 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments

please read them all before you start modifying the patch, I let my thoughts flow a but :)

::: netwerk/socket/nsNamedPipeService.cpp
@@ +14,5 @@
> +namespace mozilla {
> +namespace net {
> +
> +static mozilla::LazyLogModule gNamedPipeServiceLog("NamedPipeWin");
> +#define NPS_DEBUG(...) \

oh, sorry, meant: LOG_NPS_DEBUG or some like that. "LOG" should be there to make clear what's going on.

@@ +31,5 @@
> +  , mThread()
> +  , mLock("NamedPipeServiceLock")
> +  , mObservers()
> +  , mRetiredObservers()
> +  , mRetiredHandles()

no need to init the arrays or comptrs.

@@ +60,5 @@
> +                                              false)))) {
> +      return;
> +    }
> +  });
> +  rv = NS_DispatchToMainThread(r);

so, here you post to the main thread to add this service as an observer but few lines below you call Shutdown() directly which just removes it again.  so, which thread you are on here?  this should be cleaned up.

@@ +121,5 @@
> +    }
> +    mRetiredHandles.Clear();
> +  }
> +
> +  mRetiredObservers.Clear();

keep lock over this method too!

@@ +174,5 @@
> +        observers.SwapElements(mObservers);
> +      }
> +
> +      for (auto& obs : observers) {
> +        obs->OnError(ERROR_INTERNAL_ERROR, nullptr);

question is if the observer will be ready to handler call to its OnError from inside of AddDataObserver

you should check all your error paths manually (just instead of rv = mThread->Dispatch(this, NS_DISPATCH_NORMAL) do rv = NS_ERROR_FAILURE and see what happens)

@@ +191,5 @@
> +  MutexAutoLock lock(mLock);
> +  mObservers.RemoveElement(aObserver);
> +
> +  mRetiredHandles.AppendElement(aHandle);
> +  mRetiredObservers.AppendElement(aObserver);

good, just first append to the retired arrays and only then remove from mObserver.  just to be safe ;)

@@ +200,5 @@
> +NS_IMETHODIMP
> +NamedPipeService::IsOnCurrentThread(bool* aRetVal)
> +{
> +  MOZ_ASSERT(mThread);
> +  return mThread->IsOnCurrentThread(aRetVal);

since socks proxies are not very much used, I think, if we crash here on null ptr in release builds, we will do so rarely, so probably not until this arrives to the release channel.  I'd rather see a full null check here for mThread.  leave the assert, tho.

@@ +240,5 @@
> +        break;
> +      }
> +    }
> +
> +    RemoveRetiredObjects();

maybe instead of locking inside this method you may want to call it when holding the lock to save the lock acquiring.  inside RemoveRetiredObjects() use mLock.AssertCurrentThreadOwns();

@@ +287,5 @@
> +      MutexAutoLock lock(mLock);
> +
> +      using Comparator = nsDefaultComparator<nsCOMPtr<nsINamedPipeDataObserver>,
> +                                             nsINamedPipeDataObserver*>;
> +      auto idx = mObservers.IndexOf(target, 0, Comparator());

sorry for misleading link, but there are overloads for IndexOf that don't need comparator (use default).  just omit your own Comparator class completely.

::: netwerk/socket/nsNamedPipeService.h
@@ +18,5 @@
> +namespace net {
> +
> +class NamedPipeService final : public nsINamedPipeService
> +                               , public nsIObserver
> +                               , public nsIRunnable

nit: indent:

: and , at the same column
Attachment #8804988 - Flags: review?(honzab.moz) → review+
Attachment #8804988 - Attachment is obsolete: true
Comment on attachment 8805478 [details] [diff] [review]
[v4] Part 1: implement nsNamedPipeService

Review of attachment 8805478 [details] [diff] [review]:
-----------------------------------------------------------------

How do you think about the modification of error handling?

::: netwerk/socket/nsNamedPipeService.cpp
@@ +168,5 @@
> +      rv = mThread->Dispatch(this, NS_DISPATCH_NORMAL);
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        LOG_NPS_ERROR("Dispatch to thread failed (%08x)", rv);
> +        mObservers.Clear();
> +        return rv;

It's a dilemma that I cannot callback from an undispatchable thread nor dispatch to socket thread.
I finally chose to use return value only for the error and make sure no other observer tries to add itself to the queue concurrently.
The scope of the lock extends, but no callbacks, time-consuming jobs, I think it seems an acceptable solution.
Attachment #8805478 - Flags: feedback?(honzab.moz)
rebase again
Attachment #8804987 - Attachment is obsolete: true
Attachment #8805484 - Flags: review+
previous patch is incorrect, fix it.
Attachment #8805484 - Attachment is obsolete: true
Attachment #8805485 - Flags: review+
See Also: → 1313612
Comment on attachment 8805478 [details] [diff] [review]
[v4] Part 1: implement nsNamedPipeService

Review of attachment 8805478 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/socket/nsNamedPipeService.cpp
@@ +168,5 @@
> +      rv = mThread->Dispatch(this, NS_DISPATCH_NORMAL);
> +      if (NS_WARN_IF(NS_FAILED(rv))) {
> +        LOG_NPS_ERROR("Dispatch to thread failed (%08x)", rv);
> +        mObservers.Clear();
> +        return rv;

yeah, this is good too.
Attachment #8805478 - Flags: feedback?(honzab.moz) → feedback+
refinements, carry r+
Attachment #8804989 - Attachment is obsolete: true
Attachment #8806202 - Flags: review+
rebase, carry r+
Attachment #8806205 - Flags: review+
Attachment #8805478 - Attachment is obsolete: true
rebase
Attachment #8806205 - Attachment is obsolete: true
Attachment #8806216 - Flags: review+

Comment 53

3 years ago
Pushed by gachen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/41aca5073e9d
Part 0: add Named Pipe type on Windows platform; r=bagder
https://hg.mozilla.org/integration/mozilla-inbound/rev/2c52d5c8fe22
Part 1: implement nsNamedPipeService; r=bagder,mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/e81bf7852eb2
Part 2: implement Named Pipe IO layer; r=bagder,mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb5a061acfda
Part 3: test for nsINamedPipeService; r=bagder
Keywords: checkin-needed
backed this out since this seems to have caused https://treeherder.mozilla.org/logviewer.html#?job_id=38550450&repo=mozilla-inbound
Flags: needinfo?(xeonchen)
fix broken tests, carry r+
Attachment #8806202 - Attachment is obsolete: true
Flags: needinfo?(xeonchen)
Attachment #8807055 - Flags: review+

Comment 58

3 years ago
Pushed by gachen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/da9997b39928
Part 0: add Named Pipe type on Windows platform; r=bagder
https://hg.mozilla.org/integration/mozilla-inbound/rev/7b74a18794d6
Part 1: implement nsNamedPipeService; r=bagder,mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/b70e2870aced
Part 2: implement Named Pipe IO layer; r=bagder,mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/2f00f2194b90
Part 3: test for nsINamedPipeService; r=bagder
Keywords: checkin-needed
Gary, when you are pushing to try, you should use "try: -b do -p all -u all -t none" to run the same set of tests that m-i/m-c run.
(In reply to Honza Bambas (:mayhemer) from comment #59)
> Gary, when you are pushing to try, you should use "try: -b do -p all -u all
> -t none" to run the same set of tests that m-i/m-c run.

You're right. I thought only minimum tests are supposed to run.
But, yeah, I'll do that next time :(
Gary, Honza and Daniel,
Thanks for fixing this bug!  It's a big step for us.  Cheers!
Depends on: 1316116
You did it wrong.

NSPR is a separate product, and Firefox should only use released snapshots of NSPR.

Please don't make local changes to the copy of NSPR in directory /nsprpub (except when you make a try build).

Instead, please suggest your changes to the NSPR component, and ask for a new NSPR snapshot to get uplifted.
this has to get fixed - kai is basically right. (the released snapshot thing isn't a blocking problem - we can make one whenever we want.. but the code goes in a separate repo than gecko.. you can duplicate that code in the gecko tree to help with landings during nightly/aurora if you want, but its always going to get overwritten - so you need to land it in nspr too.)

 let me know if you can't get nspr reviews and we can decide what to do.
Flags: needinfo?(xeonchen)
(In reply to Kai Engert (:kaie) from comment #65)
> You did it wrong.
> 
> NSPR is a separate product, and Firefox should only use released snapshots
> of NSPR.
> 

I'm sorry I didn't aware the process to land on NSPR.

> Please don't make local changes to the copy of NSPR in directory /nsprpub
> (except when you make a try build).
> 

So sorry again :(

> Instead, please suggest your changes to the NSPR component, and ask for a
> new NSPR snapshot to get uplifted.

I already saw what you've done in bug 1313612 and bug 1347932, really appreciate.
Is there any follow-up I have to do?
Flags: needinfo?(xeonchen) → needinfo?(kaie)
gary - I only looked at this briefly. iirc nspr is inlined on osx and windows but generally not linux - so linux is where we will see the mismatch in practice. You changed one of the structure sizes (which could lead to memory errors) but it seems that's only for windows which is going to have this code inlined. Is that correct?

If that's not correct then we've got to back it out of 53 for practical reasons.. otherwise the 2 new bugs will probably take care of it.
Are you referring to union PRNetAddr, the change at the bottom of this patch?
  https://bugzilla.mozilla.org/attachment.cgi?id=8806215&action=diff

If yes, it seems the size was increased on windows, only?

I haven't checked if PRNetAddr is part of the API and should remain constant size for NSPR's forward compatibility promised, is it?

I also don't know if anyone on Windows is shipping NSPR binaries, and relies on that forward compatibility promise.
Flags: needinfo?(kaie)
(In reply to Patrick McManus [:mcmanus] from comment #68)
> gary - I only looked at this briefly. iirc nspr is inlined on osx and
> windows but generally not linux - so linux is where we will see the mismatch
> in practice. You changed one of the structure sizes (which could lead to
> memory errors) but it seems that's only for windows which is going to have
> this code inlined. Is that correct?
> 

Yes, the modification of this is for the named pipe path and only changed Windows platforms.

> If that's not correct then we've got to back it out of 53 for practical
> reasons.. otherwise the 2 new bugs will probably take care of it.

Comment 71

2 years ago
I've bisected a regression in Firefox 52 to this commit series.

Firefox 52 broke SOCKS5 proxies for me on FreeBSD 12.0-CURRENT revision 314972. I create a proxy with "ssh -D 2001 example.com" and put "127.0.0.1", "2001" into the SOCKS proxy configuration. SOCKS Remote DNS is checked.

Sites would start to load but then a (seemingly random) connection would hang indefinitely.

When I remove the toggling between non-blocking and blocking mode SOCKS5 works again:


diff --git a/netwerk/socket/nsSOCKSIOLayer.cpp b/netwerk/socket/nsSOCKSIOLayer.cpp
index de82d61..6779502 100644
--- a/netwerk/socket/nsSOCKSIOLayer.cpp
+++ b/netwerk/socket/nsSOCKSIOLayer.cpp
@@ -425,14 +425,6 @@ nsSOCKSSocketInfo::HandshakeFinished(PRErrorCode err)
 {
     if (err == 0) {
         mState = SOCKS_CONNECTED;
-        // Switch back to nonblocking mode after finishing handshaking.
-        if (mFD) {
-            PRSocketOptionData opt_nonblock;
-            opt_nonblock.option = PR_SockOpt_Nonblocking;
-            opt_nonblock.value.non_blocking = PR_TRUE;
-            PR_SetSocketOption(mFD, &opt_nonblock);
-            mFD = nullptr;
-        }
     } else {
         mState = SOCKS_FAILED;
         PR_SetError(PR_UNKNOWN_ERROR, err);
@@ -573,14 +565,6 @@ nsSOCKSSocketInfo::ConnectToProxy(PRFileDesc *fd)
         }
     } while (status != PR_SUCCESS);

-    // Switch to blocking mode during handshaking
-    if (mFD) {
-        PRSocketOptionData opt_nonblock;
-        opt_nonblock.option = PR_SockOpt_Nonblocking;
-        opt_nonblock.value.non_blocking = PR_FALSE;
-        PR_SetSocketOption(mFD, &opt_nonblock);
-    }
-
     // Connected now, start SOCKS
     if (mVersion == 4)
         return WriteV4ConnectRequest();
(In reply to Jan Kokemüller from comment #71)
> 
> When I remove the toggling between non-blocking and blocking mode SOCKS5
> works again:

Can you please verify, if you are using NSPR 4.13, or 4.13.1 which fixed bug 1309681 related to blocking/non-blocking?
Flags: needinfo?(jan.kokemueller)

Comment 73

2 years ago
I am using Firefox 52 and NSPR 4.13.1 packages from FreeBSD:

> $ pkg info firefox nspr
> firefox-52.0_3,1
> nspr-4.13.1

I'll try to reproduce the hang on a clean VM later today to rule out any possible weird issues on my machine.

The manual build ("./mach build") I've used for bisecting seems to use its own copy of NSPR. I could force remove the nspr package ("pkg remove -f nspr") and still reproduce the issue. It also happens on Aurora.

Maybe there is a bug in the SOCKS connection state machine related to non-blocking sockets? I can try to narrow it down further.
So you forced it to use the NSPR snapshot that is included with Firefox.

I don't know if Firefox 52 should still work if you revert the patches from this bug.
It might be worth testing, if you still see your problem, if you remove your change from comment 71, and revert this patch.

If revert the commits from this bug fixes it for you, then we probably have a BSD-specific regression (if we haven't seen a similar bug report on other platforms yet).

If it's indeed a BSD-specific regression, then please file a new, separate bug report (and make it refer to this bug).

I'll attach my attempt to revert the patches from this bug (excluding the test, which has changed afterwards).
Experimental full backout (including test).
Flags: needinfo?(jan.kokemueller)

Updated

2 years ago
Flags: needinfo?(jan.kokemueller)

Updated

2 years ago
Blocks: 1348841
Jan, let's move the discussion over to bug 1348841, until we know if it's really related to this bug, or not.

Updated

2 years ago
Flags: needinfo?(jan.kokemueller)

Updated

2 years ago
Attachment #8849105 - Attachment description: experimental patch to backout on ff 52 branch → experimental patch to backout on ff 52 branch (doesn't build)
Attachment #8849105 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.