Closed Bug 486769 Opened 15 years ago Closed 12 years ago

Apply a detect-and-recover strategy to http pipelining

Categories: Core :: Networking: HTTP (enhancement)
Platform: All (Linux)
Type: enhancement; Priority: Not set; Severity: normal
Status: RESOLVED FIXED
People: Reporter: bjarne; Assignee: Unassigned
Attachments: 1 file, 2 obsolete files

This is filed to investigate how far we can get with a different approach to http-pipelining. The observation is that there have been several attempts to enable http pipelining, all failing due to complex, uncontrollable, and maybe not completely understood reasons. The idea is to attack the problem from a different angle and see how far it takes us. If this angle has been tried before, please point to a defect, report or similar.

A way to make any software more robust is to implement a combination of 1) detecting bad states in the system, and 2) appropriate actions to recover and bring the system back to a stable state. I like to refer to this as "detect-and-recover" - there are probably other terms.

This approach is a little different from what we normally do since code is (knowingly) written in a way that allows the system to get in a bad state (which can simplify the code and maybe algorithms greatly!). The premise is that there are detection-mechanisms and recovery-actions to keep the system stable. The "normal" approach is to catch single failures and build error-correction in all parts of the system to avoid getting into bad states. However, this sometimes just doesn't work, and it may also make code rather complex and inefficient.
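
To make the detect-and-recover idea concrete, here is a minimal Python sketch of the control flow (all names are hypothetical and none of this is Necko code): dispatch optimistically, detect a bad state, then recover by dropping the pipeline and re-dispatching whatever is safe to re-dispatch.

    # Hypothetical sketch of a detect-and-recover loop around a pipeline.
    # "connection" is assumed to expose dispatch()/close()/reopen(); these do
    # not correspond to any real Necko API.
    def run_pipeline(connection, requests, detect_bad_state, max_retries=2):
        pending = list(requests)
        for attempt in range(max_retries + 1):
            responses = connection.dispatch(pending)          # may come back garbled
            if not detect_bad_state(pending, responses):      # detection step
                return responses                              # stable state, done
            connection.close()                                # recovery step: drop the
            connection = connection.reopen(pipelining=False)  # pipeline, fall back
            pending = [r for r in pending if r.idempotent]    # re-dispatch only what's safe
        raise RuntimeError("could not bring the pipeline back to a sound state")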

So, this thread is to brainstorm 1) how can we detect a bad state in a pipeline and 2) how can we recover to a sound state. It may or may not lead somewhere, but at least some understanding should come out of it.

I'd like to start by identifying events that may make a pipeline fail. Intuitively, I see these

- server drops connection
- read from network times out (similar to the above, or not..?)
- missing content-length header in response
- wrong content-length header in response
- mismatched request/response
- blocked by one very long response (degrades performance)

Have any of these typically caused pipelining to fail in the past? Are any of them irrelevant or impossible? Are there any other relevant events which should be mentioned?

Transparent proxies have often been blamed in the past. Which of the above events, if any, do they typically cause?
> - mismatched request/response

This has been a major source of issues that were actually tracked down, as I recall. It has manifested as responses being sent on the wrong connection, the same response being sent partially on one connection and partially on another, bogus responses being sent, and responses arriving in the wrong order on a single connection (including being interleaved).

> - server drops connection

This one too, I think.

Both have been caused by proxies...
I would add to your list
 * server drops request (but not connection).

(In reply to comment #0)

> - server drops connection

As long as it happens between responses, this is a legitimate state that needs to be handled in the "good path" code (and I think it is, FWIW). The server is entitled by spec to do this.

> - blocked by one very long response (degrades performance)

comet is the pathological case:

http://cometdaily.com/2008/02/22/comet-and-http-pipelining/

> Transparent proxies have often been blamed in the past. Which of the above
> events, if any, do they typically cause?

To me, the "transparent" thing is a bit of a misnomer. These are just server-side bugs. I cannot think of one interaction problem where the server side was doing it because it believed it was the right thing to do. I think we see them with transparent proxies more because there is a lot more diversity in HTTP intermediaries than there is in straight-up origin server software. And when they are transparent (which is true at least as often as not) the Server: blacklist code doesn't work.

FWIW, it's not just proxies... L4 and L7 switches have these bugs too.
(In reply to comment #0)
> The idea is to attack the problem from a
> different angle and see how far it takes us. If this angle has been tried
> before, please point to a defect, report or similar.

Enabling and using a safe pipelining approach is an old dream that has been around for several years (let's say at least for 5-6 years).

I guess you already know about bug 264354, bug 329977 and bug 395838; there are many other bugs with the "pipelining" keyword, but they are mainly about symptoms of a broken world (servers, proxies - especially transparent proxies -, HTTP accelerators with custom / non-standard behaviours, HTTP filters including anti-virus programs, etc.).

> So, this thread is to brainstorm 1) how can we detect a bad state in a pipeline
> and 2) how can we recover to a sound state. It may or may not lead somewhere,
> but at least some understanding should come out of it.

Good luck ;-)

> I'd like to start by identifying events that may make a pipeline fail.
> Intuitively, I see these
> 
> - server drops connection
> - read from network times out (similar to the above, or not..?)
> - missing content-length header in response
> - wrong content-length header in response
> - mismatched request/response
> - blocked by one very long response (degrades performance)
> 
> Have any of these typically caused pipelining to fail in the
> past? Are any of them irrelevant or impossible? Are there any other relevant
> events which should be mentioned?
> 
> Transparent proxies have often been blamed in the past. Which of the above
> events, if any, do they typically cause?

2 and 6, maybe also 3 and 5; ask UK users.

It looks like you want to restart the whole thinking from scratch, so you should be aware that the whole process won't be an easy task.

I suggest you start by trying to reconstruct two main scenarios:

1) the ideal scenario, or "how things should work if everything complied with RFC 2616";

2) the real scenario, in which you have to take into account the various bad actions / behaviours of lots of software programs (web servers, proxies, HTTP-aware load balancers, HTTP accelerators, etc.) and network conditions (available bandwidth, latencies, quality of service, etc.).

So, if I were you I would try to:

1) write state diagrams of how things should work / behave as specified by RFC 2616, also looking at its proposed revisions (http://www.w3.org/Protocols/), in order to reconstruct the ideal scenario with state machines, UML use cases, etc.

2) construct the real possible scenario(s) by:
   A) adding the logic of the many workarounds implemented in the current network / pipelining code (added to fix some popular misbehaviours);
   B) introducing the effects of other possible behaviours of bad software (*);
   C) thinking about symptoms due to bad software and how to work around them.

NOTE: part of the pipelining problem also lies in the fact that an HTTP response doesn't include a reference to its request; it is assumed that HTTP responses come back in the same order as their requests, but this is not always the case (because of buggy software), as mentioned in other comments in the various pipelining bugs.

Of course the easy path would be to start by enabling pipelining only on connections where all actors really support pipelining as specified by RFC 2616; this could be detected by using a "pipelining" field / token that would be propagated only by compliant software (servers, proxies, etc.) and discarded by all others.

An interesting solution could be to add a "pipelining" keyword to the Connection: field, i.e.:

Connection: Keep-Alive, pipelining

that would be discarded by HTTP/1.1 programs that don't understand it; new software versions could then add it to explicitly confirm that they support pipelining (and pipelining would be enabled by browsers only if all other programs in the HTTP chain - including transparent proxies - really support pipelining, after the first request/response on a new connection).

Unfortunately many web servers, proxies, etc. do not even properly parse the value(s) in the Connection field (in theory there can be more than one comma-separated token; in practice only one is used), so the effects of this possible solution should be studied and verified carefully before adopting it.
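
As a rough illustration of the proposal (note that the "pipelining" token is hypothetical and not part of RFC 2616), a client could parse the Connection header of the first response on a connection and enable pipelining only if the token survived the whole chain:

    # Sketch only: the "pipelining" Connection token is a hypothetical extension.
    def connection_tokens(headers):
        """Split a Connection header value into its comma-separated tokens."""
        value = headers.get("connection", "")
        return {token.strip().lower() for token in value.split(",") if token.strip()}

    def peer_advertises_pipelining(first_response_headers):
        # Enable pipelining only if every hop propagated the token back to us.
        return "pipelining" in connection_tokens(first_response_headers)

    print(peer_advertises_pipelining({"connection": "Keep-Alive, pipelining"}))  # True
    print(peer_advertises_pipelining({"connection": "keep-alive"}))              # False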

Sorry for this partially off-topic reply, but the real problem in pipelining is the usage of broken transparent proxies (which cannot be detected), and when there are many (2, 3) transparent proxies in an HTTP chain the effects can be really unpredictable (things might work or not work randomly depending on timings, network latencies, request length, etc.).
Thanks for lots of useful and constructive info! :)

It wasn't emphasized in comment #0, but I really plan to use the current code as is (to the extent possible), and I have full respect for the excellent work done earlier in this area. The idea is to view the issue from a different perspective/angle in the hope that this might bring us a step forward.

I agree with Patrick from comment #2 in that it doesn't matter whether errors are caused by a proxy, transparent proxy, switch/router or the base http-server itself. The point is that we receive something wrong, and the fact that this may be caused by any entity along the path just means that we cannot rely on blacklisting the base server. (Not suggesting that blacklisting is useless - just that it is not sufficient.) An additional strategy seems necessary.

Summing up the thinking so far, I'll argue that our current list of events can be divided into two groups, each group representing a bad state in the pipeline:

State A) Pipeline appears functional but produces mismatched requests/responses. This can be caused by any combination of
  1) server dropping request (but not the connection)
  2) responses returned out of sequence
  3) server sending bogus response(s)
  4) server sending response(s) partly or completely over wrong connection

State B) Pipeline appears dysfunctional or doesn't respond. This can be caused by any one of
  1) some request hangs or takes a very long time to respond
  2) server drops connection in the middle of a response
  3) read from network times out (same problem as 2?)
  4) content-length header for response missing
  5) content-length header wrong (leads to problems reading responses, producing responses with bad data, eventually crashing the pipeline)

All input/comments/corrections/arguments etc are very welcome at this point! :) Especially, it would be useful to identify other bad events and group them.

Pushing this a little further, there are now four tasks:

T1 : How to detect state A
T2 : How to recover from state A
T3 : How to detect state B
T4 : How to recover from state B

Recovery tasks (T2 and T4) at first glance seem to be very similar: drop the pipeline, re-dispatch pending and/or mismatched requests. It *may* be possible to match requests and responses in T2, but this can be an optimization for later. In the case of T4/B-1, the hanging response could be left alone and outstanding requests could be preempted and re-dispatched.

Detecting state B (T3) could be done by monitoring socket state and the delay since receiving the last response. (I.e., define connection state and timeout at the pipeline level.) T3/B-4 is particularly easy to detect.
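
A minimal sketch of what such pipeline-level monitoring could look like (names and the timeout value are invented for illustration; this is not Necko code):

    import time

    # Illustrative "state B" detector: the pipeline has outstanding requests but
    # hasn't produced any response data for too long.
    class PipelineStallDetector:
        def __init__(self, timeout_seconds=30.0):
            self.timeout = timeout_seconds
            self.last_progress = time.monotonic()

        def note_response_bytes(self):
            self.last_progress = time.monotonic()   # any received data counts as progress

        def is_stalled(self, outstanding_requests):
            return (outstanding_requests > 0 and
                    time.monotonic() - self.last_progress > self.timeout)

If it trips, the recovery sketched above for T4 applies: drop the pipeline and re-dispatch the outstanding requests.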

Detecting state A (T1) is surely the toughest task, but it seems to boil down to deciding whether a response belongs to a given request or not. At the time of writing, I'm playing with these ideas:

- Is there *anything* we could attach to a request which can be identified in the response (probably not, but it must be asked... :) ) ? I.e. is there something we can pass with a request which forces the server to return something distinguishable?

- Can we utilize unique combinations of content-type, request method, headers or similar to cross-check that a response matches a given request? E.g. a response to a HEAD request cannot have a body, a response to an OPTIONS request is likely to have Allow headers, etc.

To clarify the last idea: let's say a pipeline only has one GET-img request and one GET-text/xml request outstanding at the same time. It should be possible to verify that responses match in this case, no? How many such combinations of content-type/method/other-stuff can be identified? Are they sufficiently unique? Can we determine or hint about the expected content-type when dispatching requests? If so, how?
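
A sketch of how such a cross-check might look (the heuristics and names are purely illustrative, not an exhaustive or authoritative list):

    # Illustrative heuristics for deciding whether a response can plausibly
    # belong to the request it was paired with.
    def response_plausibly_matches(method, expected_type_prefix,
                                   response_headers, body_length):
        if method == "HEAD" and body_length > 0:
            return False                     # HEAD responses must not carry a body
        if method == "OPTIONS" and "allow" not in response_headers:
            return False                     # weak signal: OPTIONS usually has Allow
        content_type = response_headers.get("content-type", "")
        if expected_type_prefix and not content_type.startswith(expected_type_prefix):
            return False                     # e.g. text/xml where an image was expected
        return True

    # One image request and one text/xml request outstanding: a swap is detectable.
    print(response_plausibly_matches("GET", "image/", {"content-type": "text/xml"}, 120))  # False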

Constructive ideas, input and other comments are very welcome.

It has been pointed out that Opera now ships with http-pipelining enabled by default. This indicates that they found some way to deal with this. Has anyone tried Opera on sites which typically fail with FF & pipelining? Anyone wishing to share their experience and/or network-logs? :)
We need to make sure not to send POST requests over pipelined connections (since we don't want to retry them), right?
(In reply to comment #5)
> We need to make sure not to send POST requests over pipelined connections
> (since we don't want to retry them), right?

Yes, IMHO. Even more broadly: non-idempotent methods should not be sent over persistent connections, due to race conditions in the detection of connection close. Pipelines are a subset of persistent connections.

This is true even if everything is operating bug free.
We don't disagree here...  :)  But out of curiosity : what kind of race-condition do you foresee here? Or rephrased : which threads would race?

Btw, would you consider OPTIONS to be an idempotent method (nsHttpChannel::SetupTransaction() does not think so)?
(In reply to comment #7)
> We don't disagree here...  :)  But out of curiosity : what kind of
> race-condition do you foresee here? Or rephrased : which threads would race?
> 

The race is between client and server, not between threads of the client. If the server initiates a close, that might happen while the request is in flight, or the request might even be sent after the close but before the client receives it. Neither scenario is really an error - it's inherent in HTTP. But it is indistinguishable from an error, so idempotence is important.

> Btw, would you consider OPTIONS to be an idempotent method
> (nsHttpChannel::SetupTransaction() does not think so)?

IMO OPTIONS is pconn-safe, but that doesn't mean the existing implementation is the wrong choice. OPTIONS should be relatively rare, so going the safe route is no big deal.
Ahh - that kind of race condition. :) Is this really likely to happen for persistent connections wo/ pipelining? Or do you mean the server initiates a close on the socket-level (as opposed to on http-level)?

OPTIONS is used during XHR preflight (for cross-site XHR), thus usage may be increasing. The client uses the preflight-response to determine whether the actual XHR-request should be sent, so there is a synchronization-point here (although I don't currently see how or whether this fact can be used).

Actually, if a client sends an XHR-preflight request, it should avoid all requests to the server which may change the result of OPTIONS until it has received the preflight-response and sent/not sent the actual XHR-request. This means that the client must block all (non-idempotent?) requests while waiting for the preflight-response, as well as wait for all pending (non-idempotent?) requests to complete *before* sending the preflight-request. Phew...  hope I'm wrong... :\
(In reply to comment #9)
> Ahh - that kind of race condition. :) Is this really likely to happen for
> persistent connections wo/ pipelining? Or do you mean the server initiates a
> close on the socket-level (as opposed to on http-level)?
> 

I do mean on the socket level. It's not clear to me what an http-level close even is... you mean Connection: close? That's not a required element; the server can close the connection perfectly legitimately without ever sending a header.

It happens all the time in a racy way, and the current code base deals with the necessary retries.

> OPTIONS is used during XHR preflight (for cross-site XHR), thus usage may be
> increasing. The client uses the preflight-response to determine whether the
> actual XHR-request should be sent, so there is a synchronization-point here
> (although I don't currently see how or whether this fact can be used).
> 

As I understand it the OPTIONS precedes the actual cross-site request, but there is no protocol requirement that they be on the same socket connection - right? (That would be a pretty big HTTP no-no.) So this is an implementation decision..

> Actually, if a client sends an XHR-preflight request, it should avoid all
> requests to the server which may change the result of OPTIONS until it has
> received the preflight-response and sent/not sent the actual XHR-request. 


That seems overly conservative to me and doesn't achieve very much, given that there certainly can be other unrelated clients out there that can't operate within that "lock"... are you reacting to anything in particular in the cross-origin spec? I've only read it once.
Looks like we should keep non-idempotent methods out of the equation at the moment, rather focusing on tasks and ideas mentioned in comment #4. However, obviously, non-idempotent methods must be considered carefully later.

About OPTIONS: Those thoughts just occurred to me and have not matured. I'm not aware of anything related in the specs - I guess it depends on how "atomic" the preflight and the request are considered to be, as well as how "volatile" results of OPTIONS-requests are. You are certainly right about unrelated clients not being able to relate to this.

Still things I don't get about the race-condition: Do you say that a server is likely to initiate close on a connection, *then* execute a POST-request which arrived over the same connection it just closed?
(In reply to comment #11)

> Still things I don't get about the race-condition: Do you say that a server is
> likely to initiate close on a connection, *then* execute a POST-request which
> arrived over the same connection it just closed?

No. I am saying a client cannot distinguish between these two cases:

case 1] Server sends TCP close. Client sends POST/HTTP-Req. Client then receives the close without an HTTP response.

case 2] Client sends POST/HTTP-Req. Server begins to process it. Server crashes and sends TCP close. Client receives the close without an HTTP response.

They are indistinguishable from the client's POV and it should definitely not silently retry the POST in case 2. (there are many variations on case 2). The server isn't doing anything out of spec in case 1 by closing the connection..

Section 8.1.4 of RFC 2616 is the relevant part:

   A client, server, or proxy MAY close the transport connection at any
   time. For example, a client might have started to send a new request
   at the same time that the server has decided to close the "idle"
   connection. From the server's point of view, the connection is being
   closed while it was idle, but from the client's point of view, a
   request is in progress.

   This means that clients, servers, and proxies MUST be able to recover
   from asynchronous close events. Client software SHOULD reopen the
   transport connection and retransmit the aborted sequence of requests
   without user interaction so long as the request sequence is
   idempotent (see section 9.1.2). Non-idempotent methods or sequences
   MUST NOT be automatically retried, although user agents MAY offer a
   human operator the choice of retrying the request(s). Confirmation by
   user-agent software with semantic understanding of the application
   MAY substitute for user confirmation. The automatic retry SHOULD NOT
   be repeated if the second sequence of requests fails.
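
In code form, the rule boils down to something like this sketch (the idempotent-method set follows RFC 2616 section 9.1.2; as noted earlier in the thread, Necko is in practice more conservative about OPTIONS):

    # Sketch of the RFC 2616 8.1.4 retry rule: idempotent requests may be replayed
    # automatically after an asynchronous close, non-idempotent ones must not be,
    # and the automatic retry should not be repeated.
    IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

    def may_auto_retry(method, already_retried_once):
        if already_retried_once:
            return False                      # "SHOULD NOT be repeated"
        return method.upper() in IDEMPOTENT_METHODS

    print(may_auto_retry("GET", False))       # True
    print(may_auto_retry("POST", False))      # False - prompt the user instead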
Ok - I think I see what you mean. In [case 1] a non-idempotent request can safely be re-transmitted, whereas in [case 2] it may or may not be safe to re-transmit. The problem is that the client cannot distinguish between these cases, and thus cannot know whether it is safe to re-transmit or not.

Furthermore, if I understand your cases correctly, with non-persistent connections [case 1] is less likely to occur (only when the server violates the first sentence of the 5th paragraph of section 8.1.4 in RFC 2616). However, it may still occur, and clients would not be able to distinguish between these cases (IMO).

The conclusion from all this (IMO) is that we must make sure to follow the last part of 4th paragraph in RFC 2616 section 8.1.4 (never automatically re-transmit non-idempotent requests if the connection is unexpectedly closed). Also, in my understanding, this applies when using both persistent and non-persistent connections, as argued above.

Whether to use persistent or non-persistent connections for non-idempotent requests is therefore IMO unrelated to this race-condition, as it seems like we cannot safely re-transmit them on unexpected connection-close in any case.

Are our views aligned?
(In reply to comment #13)

> The conclusion from all this (IMO) is that we must make sure to follow the last
> part of 4th paragraph in RFC 2616 section 8.1.4 (never automatically
> re-transmit non-idempotent requests if the connection is unexpectedly closed).

Yes, that is the important thing. The "could this order 2 pizzas by accident" rule, as it was known.

> Also, in my understanding, this applies when using both persistent and
> non-persistent connections, as argued above.

yes.

> 
> Whether to use persistent or non-persistent connections for non-idempotent
> requests is therefore IMO unrelated to this race-condition, as it seems like we
> cannot safely re-transmit them on unexpected connection-close in any case.

You cannot retry - I agree.

But I disagree that it is unrelated because, in practice, this occurs more commonly on transaction > 1 of a persistent connection but uncommonly on transaction #1. This is because the timeouts for those two conditions are often not equal (servers trying to implement paragraph 5 that you cited), and there is often idle time on the client before the reuse of a persistent connection which eats into that timeout period and makes things racier. (I.e. the first transaction follows more or less immediately after opening the connection, but transaction > 1 does not necessarily occur immediately after the preceding one.)

You need the same logical error handling for each case, but you will reduce your error rate (and this is a user visible error to prompt the retry) if you keep non-idempotent requests on fresh connections.
Yeah, I have some pretty serious doubts that this is possible. As others have pointed out, you generally can't recover from a failed response since we don't know if making the request had side effects on the server side.

Technically a GET request *should* be possible to redo, according to the HTTP spec. However, lots of sites have server-side side effects for GET requests, which means that if we redo a GET request because its response was jumbled, those side effects may be triggered twice.

I'm also not sure that we could detect jumbled responses. But I don't have any information one way or another on that.
By the way, has anyone made any measurements with other browsers that support pipelining of how big a load improvement we could get? Fixing this bug seems to be a big effort, so it's good to know what we get for that effort.
(In reply to comment #16)
> By the way, has anyone made any measurements with other browsers that support
> pipelining of how big a load improvement we could get? Fixing this bug seems to be a
> big effort, so it's good to know what we get for that effort.

it's a partial answer to your question - look at page two of https://bugzilla.mozilla.org/attachment.cgi?id=334617

The white space in those graphs is the latency that pipelining helps attack. It's a big deal at just a glance.

People are pretty aware of the inter-transaction latency, but also consider the intra-transaction pauses, which are generally a result of TCP congestion-control ramp-up time... pipelining gives you comparatively longer, denser flows which are much more TCP-efficient on this front.

So there is a lot of potential.

Also note that the latency-to-bandwidth ratio is getting worse (i.e. bandwidth is improving faster than RTT), and that's going to make the opportunity bigger. This is most obvious as mobile apps become more common, but also consider how the world is getting more international (driving up average RTT) and how emerging services like fiber-to-the-home have worse ratios (although better absolute performance) than the cable or DSL they might be replacing.

My opinion anyhow - this is an important technology. If you really want to get pie in the sky, look at how latency was the bottleneck on CPU evolution... they had to move to massively pipelined and speculative architectures to make progress. Network applications can learn a lot from that.
Attached file Initial version of pserver (obsolete) —
I'm attaching a simple (Python-based) HTTP server which lets me simulate different types of errors on the server side. See PServer/README.txt for a more detailed description. I observe that there has been some recent development on httpd.js, but I'm not sure if it can handle the kind of things necessary. If I'm wrong, please let me know and I'll use it instead.

I'm working on a set of Mochitests accessing this server and the handlers provided, and I'll attach these as well when they're ready. In the meantime, feel free to point out faults or issues in the server, or suggest useful handlers.
Another thing : If a GET-request fails because connection dies in the middle of the response - should it be re-requested? What if it carries "parameters" (i.e. is a query-url as described in rfc2616 section 13.9)?

(Note that this is slightly different from a pipelined request since in this case we *know* that the server has received the request and has even started to respond.)

Refer also to rfc2616 section 9.1.
(In reply to comment #19)
> Another thing : If a GET-request fails because connection dies in the middle of
> the response - should it be re-requested? What if it carries "parameters" (i.e.
> is a query-url as described in rfc2616 section 13.9)?
>

Yes, it should be retried, IMHO. The query-url isn't especially more interesting than any cookie data. It is all probably being used for dynamic purposes, but that doesn't mean it isn't safe; the method is what drives safeness. (This is a little different from cacheability... but that's another story.)
According to the HTTP spec a GET request *should* always be safe to retry. However, this isn't always the case, and re-requesting could result in a server-side action being taken twice.

This might be a risk we'll simply have to live with though.

But I agree with Patrick. The existence of query parameters isn't a good indicator one way or another about the safeness of a request.
Ok - thanks. Sounds to me like we should allow necko to retry a GET request if it fails in the middle of the response due to network-error, regardless of query-params.

What about this one : If a response-msg is "0123456789" (length=10) but content-length header is 5, should the "correct"/expected response be "01234" or "0123456789"? :)
Attached file V1.1 pserver w/ handlers (obsolete) —
Initial version had a nasty bug in echo.py which is fixed here. Also added a handler which returns wrong content-length.
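
For reference, the kind of misbehaviour that handler simulates can be reproduced with a few lines of standalone Python (this is not the attached server code, just an illustration of a response whose Content-Length disagrees with its body):

    import socket

    # Standalone illustration (not pserver itself): serve one response whose
    # Content-Length header claims 5 bytes while the body is 10 bytes long.
    def serve_once(port=8080):
        body = b"0123456789"
        response = (b"HTTP/1.1 200 OK\r\n"
                    b"Content-Type: text/plain\r\n"
                    b"Content-Length: 5\r\n"          # deliberately wrong
                    b"Connection: keep-alive\r\n"
                    b"\r\n" + body)
        with socket.socket() as listener:
            listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            listener.bind(("127.0.0.1", port))
            listener.listen(1)
            conn, _ = listener.accept()
            conn.recv(65536)                           # read and ignore the request
            conn.sendall(response)
            conn.close()

    if __name__ == "__main__":
        serve_once()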
Attachment #373846 - Attachment is obsolete: true
(In reply to comment #22)

> What about this one : If a response-msg is "0123456789" (length=10) but
> content-length header is 5, should the "correct"/expected response be "01234"
> or "0123456789"? :)

Technically the response body is 01234... bytes 56789 are the first 5 bytes of the next response header - which is going to be an error.

Practically, 56789 gets ignored without an error if it is whitespace (ignored, not considered part of the response). Having it be whitespace is fairly common and probably shouldn't be considered a sign of pipeline corruption.

If it isn't whitespace and pipelines are in use, I would definitely turn them off.
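
A small sketch of that heuristic (illustrative only): read exactly Content-Length octets of body, then look at whatever is already buffered beyond them.

    # Illustrative classification of bytes left in the buffer after reading
    # Content-Length octets of the body, per the heuristic described above.
    def classify_leftover(buffered, content_length):
        body, leftover = buffered[:content_length], buffered[content_length:]
        if not leftover:
            return body, "clean"
        if leftover.strip() == b"":
            return body, "whitespace"      # common; not treated as corruption
        return body, "corrupt"             # turn pipelining off, retry if idempotent

    print(classify_leftover(b"0123456789", 5))   # (b'01234', 'corrupt')
    print(classify_leftover(b"01234 \r\n", 5))   # (b'01234', 'whitespace')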
(In reply to comment #24)
> (In reply to comment #22)
> 
> > What about this one : If a response-msg is "0123456789" (length=10) but
> > content-length header is 5, should the "correct"/expected response be "01234"
> > or "0123456789"? :)
> 
> technically the response body is 01234 .. bytes 56789 are the first 5 bytes of
> the next response header - which is going to be an error. 

With persistent connections, yes, I agree. My intuitive problem with this is that if we use HTTP/1.0 or Connection: close we get the whole sequence of chars, ignoring content-length (at least necko does this). And imo it is confusing if we get different results depending on the version of protocol used.

> If it isn't whitespace and pipelines are in use, I would definitely turn them
> off.

Yup - agreed (also for persistent conn without pipeline).

Recovery in this situation (after discovering the fact that there is rubbish on the connection after reading the response) may be to re-request the resource over a non-persistent conn. Or what do you think?
Latest version of the server with some mochitests, this time packaged as a patch. I renamed it to "badserver" since it is supposed to be exactly that. :)

I have not integrated badserver with runtests.py (although this should be simple), so it must be run separately. It's located in "testing/badserver/" in the source-dir (not copied to obj-dir). Cd to "tests/badserver" and do "python badserver.py" to run it (it's quite verbose at the moment). It has only been tested on Linux...

The mochitests are located in the "netwerk" module and can be selected by adding "--test-path=netwerk" when running mochitest.

I think these mochitests cover cases B1 (test_preempting), B2 (test_retry_request, test_unexpected_close) and B5 (test_bad-content_length) from comment #4. There are others in progress and some extra in the patch. Note that necko has not been changed to fix any issues. However, I had to patch nsHttpChannel in order to 1) allow OPTIONS-requests to be pipelined (since I'm using XHR for the tests), and 2) get more logging.

Corrections, comments and/or input to individual tests and/or to the approach in general is appreciated.
Attachment #374046 - Attachment is obsolete: true
> With persistent connections, yes, I agree. My intuitive problem with this is
> that if we use HTTP/1.0 or Connection: close we get the whole sequence of
> chars, ignoring content-length (at least necko does this). And imo it is
> confusing if we get different results depending on the version of protocol
> used.
> 

Sure, it's confusing - this is definitely a server error if they don't match :)

But if there is C-L it takes precedence over "the number of bytes read at connection close" according to rfc 2616 4.4 (#3).

> Recovery in this situation (after discovering the fact that there is rubbish on
> the connection after reading the response) may be to re-request the resource
> over a non-persistent conn. Or what do you think?

Yeah, I think it is generally true that if we get a response over a pconn (whether or not pipelined) that we can detect as garbled, and the request was idempotent, then it makes sense to retry it on a fresh non-pipelined connection. Nothing to lose.
(In reply to comment #27)
> But if there is C-L it takes precedence over "the number of bytes read at
> connection close" according to rfc 2616 4.4 (#3).

Uhh...  do you have a link? :} I cannot read this from

http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4

(You're probably right in what you're saying, but it's always nice to read the original text.)
(In reply to comment #28)
> (In reply to comment #27)
> > But if there is C-L it takes precedence over "the number of bytes read at
> > connection close" according to rfc 2616 4.4 (#3).
> 
> Uhh...  do you have a link? :} I cannot read this from
> 
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4
> 
> (You're probably right in what you're saying, but it's always nice to read the
> original text.)

your link is the text I am talking about.

There are 5 ways to delimit a message body. They are described in order of precedence in section 4.4 as noted in the first paragraph.

method #3 is via content length header, if present.

method #5 is via the server closing the connection.

...

even though there is an order of precedence, the last para also notes that 
"When a Content-Length is given in a message where a message-body is allowed, its field value MUST exactly match the number of OCTETs in the message-body."  - so it is an error if methods 3 and 5 are in conflict.
Right... I see what you mean. Btw, any idea how FF handles the last part of the last para: "HTTP/1.1 user agents MUST notify the user when an invalid length is received and detected."? :)

(Re-dispatching using HTTP/1.0 feels increasingly tempting...)
Blocks: 264354
Depends on: 603503
-> default owner
Assignee: bjarne → nobody
we've landed a scoreboard for this
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED