Closed Bug 631801 Opened 14 years ago Closed 13 years ago

Available() does not work on SSL Stream

Categories

(Core :: Security: PSM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla12

People

(Reporter: mcmanus, Assigned: mcmanus)

References

Details

Attachments

(1 file, 3 obsolete files)

These stubs are used to handle un-implemeted methods. Those methods can be invoked via normal IDL methods and should, imo, return errors that can be handled instead of asserting and aborting.

My example: I call ->Available() on a nsIInputStream.. under SSL that is not implemented (and I understand why) and .available is mapped to invalidInt in nsNSSIOLayer.cpp. It then abort()s. The error code they return should be enough.
Attached patch patch v1 (obsolete) — Splinter Review
Attachment #510021 - Flags: review?(honzab.moz)
Summary: invalidInt, invalidStatus, and invalidDesc ASSERT on valid error paths → nsNSSIOLayer.cpp: invalidInt, invalidStatus, and invalidDesc ASSERT on valid error paths
Blocks: 623948
Comment on attachment 510021 [details] [diff] [review]
patch v1

These functions are called "InvalidXXX".  If those are assigned to a socket method then that method must never be called.

According to [1], you cannot let PR_Available of SSL socket throw an error, as it would indicate an error state on the socket.


Can you please list all cases, show patches, that don't work for you and why you need to change the socket implementation?  We should go case by case, introduce new methods (functions) for available and other as you need.

For available (and -64), we can go either the way of returning PR_WOULD_BLOCK_ERROR or the way of return 0 bytes and SUCCESS result as a start.  It also must reflect the state of the socket, i.e. throw the error state the socket is in.


[1] http://hg.mozilla.org/mozilla-central/annotate/280e978fc6fb/xpcom/io/nsIInputStream.idl#l103
Attachment #510021 - Flags: review?(honzab.moz) → review-
(In reply to comment #2)
> Comment on attachment 510021 [details] [diff] [review]
> patch v1
> 
> These functions are called "InvalidXXX".  If those are assigned to a socket
> method then that method must never be called.
> 

I think we need a better story than this, because those functions are called from well defined IDL interfaces.. potentially extensions, etc.. they should not assert. Are we in agreement on that?

> According to [1], you cannot let PR_Available of SSL socket throw an error, as
> it would indicate an error state on the socket.

I see that now, thanks. That's kind of a short sighted definition ("all other errors mean the stream is closed"). oh well.

> Can you please list all cases, show patches, that don't work for you and why
> you need to change the socket implementation?  We should go case by case,
> introduce new methods (functions) for available and other as you need.

as an optimization, I'd like nsHttpConnection::CanReuse() to test mSocketIn->Available().. if there is response data sitting on the stream without an outstanding request, then we shouldn't reuse that connection. In practice, this often turns out to be 408 timeouts followed by a fin. The IsAlive() check doesn't detect the fin because there are data bytes queued in front of it.

I think we do deal with the 408 ok in the state machine (going through the restart() logic), but doing so involves processing the transaction twice when the issue is often forseeable in CanReuse() (modulo a race condition of when the 408 arrives).

The syn-retry stuff makes this show up a little more often because the timeout for the first request is often shorter than the timeout between reuses on a connection. p.ic.tynt.com is apparently set at 4 with nginx (which is a non standard nginx config). I think sending the 408 is wrong on its part (it should just close the connection instead of sending a response to a non existent request), but that's what it does.

Its just an optimization so its ok if SSL just returns 0 (or an error if appropos).  We can't test it at the socket level as we do for the fin because read data there might be part of the SSL wrapper, not http.

 the way of return 0 bytes and SUCCESS result as a
> start.  It also must reflect the state of the socket, i.e. throw the error
> state the socket is in.
> 

I'll try and reform the patch this way.
(In reply to comment #3)
> I think we need a better story than this, because those functions are called
> from well defined IDL interfaces.. potentially extensions, etc.. they should
> not assert. Are we in agreement on that?

I wanted to point out not to modify exiting function, but rather introduce new ones if you want a different behavior.  I.e. - finish their implementation.
Attached patch patch v2 (obsolete) — Splinter Review
Is this more what you had in mind?
Attachment #510021 - Attachment is obsolete: true
Attachment #511114 - Flags: review?(honzab.moz)
Blocks: 603503
Assignee: nobody → mcmanus
Just an idea:

We need to fix this because you have added a check for mSocketIn->Available to nsHttpConnection::CanReuse().  However how is this check different from check in nsSocketTransport::IsAlive?  It returns TRUE when there are data on the wire or we just wait while no error is indicated.

What about to rather change nsSocketTransport::IsAlive (or even have a new method) to also return information about how much or just if some data is waiting?  We could then avoid using Available, what I would like to anyway.
> 
> What about to rather change nsSocketTransport::IsAlive (or even have a new
> method) to also return information about how much or just if some data is
> waiting?  We could then avoid using Available, what I would like to anyway.

I'm not sure I follow what you're suggesting. The difference between IsAlive() and Available() is that available() tells you how much information is pending on the socket because you can surely have a valid socket with 0 bytes pending, just as you can have a dead socket with N bytes pending (which arrived before it was closed).

I agree this patch doesn't make Available() totally work for SSL. It just makes it a little less broken. Do you have thoughts on how to correctly implement the API for SSL?
(In reply to comment #7)
> I'm not sure I follow what you're suggesting. The difference between
> IsAlive() and Available() is that available() tells you how much information
> is pending on the socket because you can surely have a valid socket with 0
> bytes pending, just as you can have a dead socket with N bytes pending
> (which arrived before it was closed).

Have a socket that is in one of the 4 states:
1 - the TCP session is open and alive, but there are no data to read
2 - the TCP session is open and alive and there are pending data to read
3 - the TCP session has been closed (RST or FIN) and there are no data to read
4 - the TCP session has been closed (RST or FIN), but there are pending data to read

Maybe I'm wrong, but AFAIK, this is how PR_Read(PR_MSG_PEEK) and PR_Available behaves in those 4 states:

1 - PR_Read returns WOULDBLOCK, PR_Available returns OK and 0
2 - PR_Read returns OK and the number of bytes available in the limit we gave it to read, PR_Available returns OK and the full number of bytes to read
3 - PR_Read returns NETRESET (for RST) or OK and 0 for FIN, PR_Available return NETRESET (for RST) or OK and 0 for FIN
4 - PR_Read returns OK and the number of bytes available, PR_Available returns OK and the number of bytes available

My goal was to use PR_Read(PR_MSG_PEEK) to get the same information that you are later trying to extract, ones again, using PR_Available.

As described above (needs to be confirmed, but I believe it is correct) we really have all the necessary info from PR_Read already.  We just need to expose it through IsAlive or a new method of nsSocketTransport.
your primary motivation is to get rid of PR_Available, yes? I didn't really understand that.

That's fine.. we can keep the Available() API and just implement it with pr_read(peek) instead of pr_available, yes?
(In reply to comment #9)
> your primary motivation is to get rid of PR_Available, yes? I didn't really
> understand that.
> 

Yes, sorry to not be precise on this.

> That's fine.. we can keep the Available() API and just implement it with
> pr_read(peek) instead of pr_available, yes?

No.  I want to avoid call to PR_Available at all.  We can use PR_Read to get the necessary info.  We already do that, so no worry about any regressions.  

We just need to tweak the following code to expose the 3-state info: alive / alive+data / dead, change this... [1]

if ((rval > 0) || (rval < 0 && PR_GetError() == PR_WOULD_BLOCK_ERROR))
   *result = PR_TRUE;


...to something like:

if (rval > 0) {
    *hasData = true;
    *result = true;
}
else if (rval < 0)
    if (PR_GetError() == PR_WOULD_BLOCK_ERROR) {
       *result = true;
       *hasData = false;
    }
    else {
       *result = false;
       *hasData = false;
    }
}
else { // rval == 0 
    *hasData = false;
    *result = false;
}


[1] http://hg.mozilla.org/mozilla-central/annotate/3fd770ef6a65/netwerk/base/src/nsSocketTransport2.cpp#l1851
(In reply to comment #10)
> (In reply to comment #9)
> > your primary motivation is to get rid of PR_Available, yes? I didn't really
> > understand that.
> > 
> 
> Yes, sorry to not be precise on this.
> 
> > That's fine.. we can keep the Available() API and just implement it with
> > pr_read(peek) instead of pr_available, yes?
> 
> No.  I want to avoid call to PR_Available at all. 

Why can't we keep the nsSocketInputStream::Available() API, but just change its implmentation to not use PR_Available ?
We probably still not understand each other. 

nsHttpConnection::CanReuse() calls (between else) mSocketTransport->IsAlive() that calls PR_Read on the socket, then CanReuse() calls mSocketIn->Available that calls PR_Available() on the same socket to just get information if there are pending data.  We can already get this information from mSocketTransport->IsAlive()'s call to PR_Read, right?

I want to avoid the call to mSocketIn->Available from nsHttpConnection::CanReuse().

I don't want to change nsSocketInputStream::Available() method implementation, it is IMHO a bit unwise.
(In reply to comment #12)
> We probably still not understand each other. 

indeed - but this comment cleared things up. Thanks!

>
> I want to avoid the call to mSocketIn->Available from
> nsHttpConnection::CanReuse().
>

[by using the data already accumulated in the IsAlive() call, using an interface change to IsAlive() or another IsAlive-Like() function]

I can change this patch to do that, sure.

I still think we ought to make the existing Available() API work with nss.
(In reply to comment #13)
> [by using the data already accumulated in the IsAlive() call, using an 
> interface change to IsAlive() or another IsAlive-Like() function]

Exactly!

> I still think we ought to make the existing Available() API work with nss.

One day yes, but not necessarily now.
Assignee: mcmanus → nobody
Component: Security: UI → Security: PSM
QA Contact: ui → psm
Didn't mean to un-assign this when changing component.
Assignee: nobody → mcmanus
Attachment #511114 - Flags: review?(honzab.moz)
Summary: nsNSSIOLayer.cpp: invalidInt, invalidStatus, and invalidDesc ASSERT on valid error paths → Available() does not work on SSL Stream
Attached patch patch 3 (obsolete) — Splinter Review
As per comment 14, for the moment we'll ignore the SSL problem with Available() because the code in HttpConnection just wants to know if any data bytes are available not necessarily how many.

That can be done with MSG_PEEK which is working just fine in socketTransport::isAlive() for SSL now.

This patch adds a new transport method that returns both isAlive and a isdatapending indicator. It manages to reuse the old code path, so no new sys calls are added.

I added a nsISocketTransportInternal idl as an unfrozen bucket of new interfaces instead of changing the vernerable socketTransport in hopes of not causing any other gecko code problems with a core interface.
Attachment #511114 - Attachment is obsolete: true
Attachment #579899 - Flags: superreview?(cbiesinger)
Attachment #579899 - Flags: review?(honzab.moz)
(In reply to Patrick McManus from comment #16)
> As per comment 14, for the moment we'll ignore the SSL problem with
> Available() because the code in HttpConnection just wants to know if any
> data bytes are available not necessarily how many.
> 
> That can be done with MSG_PEEK which is working just fine in
> socketTransport::isAlive() for SSL now.

So wait, if you can determine whether any bytes are available (using MSG_PEEK), can't you use that knowledge to implement available() in a way that it returns 0/1/error, at least?
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #17)
> (In reply to Patrick McManus from comment #16)
> > As per comment 14, for the moment we'll ignore the SSL problem with
> > Available() because the code in HttpConnection just wants to know if any
> > data bytes are available not necessarily how many.
> > 
> > That can be done with MSG_PEEK which is working just fine in
> > socketTransport::isAlive() for SSL now.
> 
> So wait, if you can determine whether any bytes are available (using
> MSG_PEEK), can't you use that knowledge to implement available() in a way
> that it returns 0/1/error, at least?

Honza is pretty clear in the history of this bug that he does not want changes to psm available(). don't much care myself, but want that check in nshttpconnection to have parity with ssl - its potentially an impt sign that something has gone wrong in pipelining.
Hrm, I'm really not happy adding this workaround for a broken available() implementation. mayhemer, maybe we can discuss this on IRC tomorrow?
To make my opinion clear:

- it was not about not adding available to PSM, it was about not calling available from Necko since I was afraid that could cause unexpected regressions from PSM SSL stack
- PSM SSL stack and its interaction with UI has changed after Brian's work on it (removed proxies, UI is now contacted in a blocking manner, no SSL thread, PSM SSL sockets are handled the same way as Necko TCP sockets are)
- based on that I'm not that much against implementing PR_Available in PSM, during SSL thread removal review I also noticed there used to be an implementation for it
- I'm just still concern about making Necko do call to PR_Read(MSG_PEEK) and then another call to PR_Available() when the first call is enough to get the information that we are trying to obtain with the latter call immediately after (performance suboptimal and still potentially a threat of regressions, deps on changes to PSM)
Comment on attachment 579899 [details] [diff] [review]
patch 3

Review of attachment 579899 [details] [diff] [review]:
-----------------------------------------------------------------

On the first look I like this.  Waiting for Christian's feedback.
Well, this code is also performance suboptimal, since it adds a QueryInterface call here :)

I guess it's mainly the additional interface that I don't like, but even so, do we know that making the additional call is slow? And could we just replace isAlive() with available()? It just seems ugly to me to add a function that is basically just the combination of two existing functions.
I'm not real bothered by the efficiency of either a QI or a system call in this path (which is pretty much just when you're interacting with the connection manager). A slushy internal interface for sockets could be very useful to have going forward if we want to have access to some lower level tcp'ish things.  In the rapid release model I like the concept of a rock solid basic interface like nsISocketTransport and having a second one that allows for more rapid iteration and experimentation - so to me that's a good separation.

If it became more of an issue as things are added to internal we could of course derive nsisockettransport from nsisockettransportinternal and just qi the thing one time when we store the original reference that comes back from the transport service.. but that seems like putting the cart before the horse to me.

So I guess I'd just argue for the current patch 3.

but whatever we decide, let's just do it so I can write the code and move on. The differences don't seem huge. In the end I'm happy with it being biesi's call.
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #22)
> do we know that making the additional call is slow? 

We don't.  We just know it is not necessary.

> And could we just
> replace isAlive() with available()? 

= remove or stop using nsISocketTransport.isAlive() and in nsHttpConnection::IsAlive rather use nsISocketTransport.available() instead?

I'm not against that.  It actually makes a lot of sense, we can see whether the socket is still alive (no error except would block, if |available| can even return would block) and also indication of data waiting for read, if any.

We would have to think of SSL implementation for it.  Before ssl thread had been introduced SSL_DataPending was used to report any pending data.  However, its implementation just looks into the SSL buffer, not at the socket it self.

For us the more proper implementation would be to look into the buffer and if empty look at the socket by call of lower level avail method.  Actually, when there is something on the socket waiting, next call to PR_Read on the ssl socket will with high probability return decrypted data.

Available on socket transport is not used anywhere in the Gecko code.  So, changing PSM implementation for PR_Available seems to be safe...

> It just seems ugly to me to add a
> function that is basically just the combination of two existing functions.

We don't have interface to peek a message and get the amount of data.  We cannot change implementation of nsSocketTransport::IsAlive, since its meaning is "is the socket up?" and dropping the flag when there are data is not what we want.
(In reply to Honza Bambas (:mayhemer) from comment #24)
> For us the more proper implementation would be to look into the buffer and
> if empty look at the socket by call of lower level avail method.  Actually,
> when there is something on the socket waiting, next call to PR_Read on the
> ssl socket will with high probability return decrypted data.

Hmmm.. but the number of bytes actually available will be different when looking what number of bytes is on the TCP socket then the number of bytes later decrypted for the application level consumer.

However, necko sockets are always non-blocking, so I don't think anyone would do socketInputStream.read(buffer, socketInputStream.available()) and would deadlock like that..  If the stream is set to block then it is also buffered with a pipe what filters the inconsistency.
And one argument against using Available: we don't know if the implementation will work on all platforms well, we never used that function before for this decision.  So, tests are a must.
(In reply to Honza Bambas (:mayhemer) from comment #24)
> = remove or stop using nsISocketTransport.isAlive() and in
> nsHttpConnection::IsAlive rather use nsISocketTransport.available() instead?

Yep, that's what I meant.

> We would have to think of SSL implementation for it.  Before ssl thread had
> been introduced SSL_DataPending was used to report any pending data. 
> However, its implementation just looks into the SSL buffer, not at the
> socket it self.

Can't you use PR_Recv + MSG_PEEK for that? That may return less data than is available on the socket, but that is fine for this function.
(In reply to Honza Bambas (:mayhemer) from comment #26)
> And one argument against using Available: we don't know if the
> implementation will work on all platforms well, we never used that function
> before for this decision.  So, tests are a must.

I think my suggestion solves that problem because then available() will do basically the same as what isAlive does, right?
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #27)
> Can't you use PR_Recv + MSG_PEEK for that? That may return less data than is
> available on the socket, but that is fine for this function.

Something like:
nsSocketInputStream::Available(..)
{
  return PR_Read(MSG_PEEK, 1);
}

That would work well.  It is better to return less then what we can actually return from Read call.

I say try it.  Good idea.
If we only need to know whether there is any data ending, and not *how much* data is pending, then we should avoid using an interface that provides anything other/more than that single bit of information. So, I like that part of the proposed patch.

However, in another part of the patch, it is mentioned that an idle connection should never have any data pending to be read. This is not true, because the server can send 1xx responses at any time, and I have seen various proposals for servers to do so for various reasons.
(In reply to Brian Smith (:bsmith) from comment #30)
> However, in another part of the patch, it is mentioned that an idle
> connection should never have any data pending to be read. This is not true,
> because the server can send 1xx responses at any time, and I have seen
> various proposals for servers to do so for various reasons.

Please erase this from your mind. I know it isn't true; 1xx can be sent after a request and before a non-1xx response, but not after a non-1xx response or before a request.
Attached patch patch 4Splinter Review
Happy to defer to :bisei's here and do something like comment 29.

This patch:
 * changes PSM so that calls to PR_Available() no longer assert and now return NOT_IMPLEMENTED
 * changes the Available() implementation to call PR_Available() but fallback when it sees NOT_IMPLEMENTED to running MSG_PEEK instead
 * removes the don't-check-available() restriction in nsHttpConnection::CanReuse()
Attachment #579899 - Attachment is obsolete: true
Attachment #580467 - Flags: superreview?(cbiesinger)
Attachment #580467 - Flags: review?(honzab.moz)
Attachment #579899 - Flags: superreview?(cbiesinger)
Attachment #579899 - Flags: review?(honzab.moz)
Comment on attachment 580467 [details] [diff] [review]
patch 4

brian, r? re psm bits.
Attachment #580467 - Flags: review?(bsmith)
Comment on attachment 580467 [details] [diff] [review]
patch 4

oh, not quite what I was thinking (my thought was to do this inside of PSMAvailable) but this works. sr=biesi
Attachment #580467 - Flags: superreview?(cbiesinger) → superreview+
Comment on attachment 580467 [details] [diff] [review]
patch 4

Review of attachment 580467 [details] [diff] [review]:
-----------------------------------------------------------------

This patch calls (may call) PR_Recv(1, MSG_PEEK) twice on SSL sockets, not an inexpensive operation on live sockets.  This isn't called that often, though.

We could potentially call just PR_Available to check whether a socket is alive (meaning to also implement PSM's available as recv(1, MSG_PEEK)).  But that would be a riskier change, we never tried how well PR_Available behaves on all platforms for TCP sockets.

Except that I think it is not a bad idea to move the 'fake' available code to the PSM's methods directly anyway - remember we do this change just because of SSL sockets, so the fix should be actually there.  But I'm OK with this too, just that when someone implements a new socket and PR_Available will be unimplemented we may end up with something potentially unexpected for PR_Recv(MSG_PEEK).

One detail: what will PSM (SSL) read() return when there is something on the TCP socket, but is not a complete SSL data record?  Won't that return WOULD_BLOCK while there are actually data to read, even though later?


r=honzab anyway for the current version of the patch.
Attachment #580467 - Flags: review?(honzab.moz) → review+
thanks honza.

> One detail: what will PSM (SSL) read() return when there is something on the
> TCP socket, but is not a complete SSL data record?  Won't that return
> WOULD_BLOCK while there are actually data to read, even though later?
> 
> 

That's true, but its true in some sense at all levels for different time spans. A TCP read will return would_block even though an ethernet frame is half way received, for instance. Data just doesn't exist for those interfaces until the quanta is filled - SSL seems the same to me.
Comment on attachment 580467 [details] [diff] [review]
patch 4

Review of attachment 580467 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Patrick McManus [:mcmanus] from comment #36)
> That's true, but its true in some sense at all levels for different time
> spans. A TCP read will return would_block even though an ethernet frame is
> half way received, for instance. Data just doesn't exist for those
> interfaces until the quanta is filled - SSL seems the same to me.

I think what you are saying is that this check is unreliable/imperfect even in the non-SSL case, and that is OK because this is a heuristic, right?

I reviewed the PSM change only. The PSM change is fine by me. But, I don't understand why we call PR_Available() first, instead of just always calling PR_Recv(1, MSG_PEEK). I would avoid calling PR_Available at all if PR_Recv(1, MSG_PEEK) works in all cases and doesn't cause performance problems. Also, we already use PR_Recv(1, MSG_PEEK) elsewhere to test for errors on the socket, IIRC.
Comment on attachment 580467 [details] [diff] [review]
patch 4

Review of attachment 580467 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Patrick McManus [:mcmanus] from comment #36)
> That's true, but its true in some sense at all levels for different time
> spans. A TCP read will return would_block even though an ethernet frame is
> half way received, for instance. Data just doesn't exist for those
> interfaces until the quanta is filled - SSL seems the same to me.

I think what you are saying is that this check is unreliable/imperfect even in the non-SSL case, and that is OK because this is a heuristic, right?

I reviewed the PSM change only. The PSM change is fine by me. But, I don't understand why we call PR_Available() first, instead of just always calling PR_Recv(1, MSG_PEEK). I would avoid calling PR_Available at all if PR_Recv(1, MSG_PEEK) works in all cases and doesn't cause performance problems. Also, we already use PR_Recv(1, MSG_PEEK) elsewhere to test for errors on the socket, IIRC.
Attachment #580467 - Flags: review?(bsmith) → review+
(In reply to Brian Smith (:bsmith) from comment #38)
> I don't
> understand why we call PR_Available() first, instead of just always calling
> PR_Recv(1, MSG_PEEK). I would avoid calling PR_Available at all if
> PR_Recv(1, MSG_PEEK) works in all cases and doesn't cause performance
> problems. Also, we already use PR_Recv(1, MSG_PEEK) elsewhere to test for
> errors on the socket, IIRC.

That's exactly what I wanted to do first, call PR_Recv(1, MSG_PEEK) just ones.  But that means to introduce a new interface, see the patch 3, Christian didn't like that.

We don't have an interface/method that would call PR_Read(1, MSG_PEEK) and give the all state information: failed socket, gracefully closed socket, live socket with data, live socket w/o data.  We can only get the live or dead socket information.

Not sure how expensive is to call PR_Read(1, MSG_PEEK) on ssl sockets, but seems to be expensive.  However, this code doesn't get called that often.
https://hg.mozilla.org/integration/mozilla-inbound/rev/2675a1cc6792
Target Milestone: --- → mozilla12
https://hg.mozilla.org/mozilla-central/rev/2675a1cc6792
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: