Closed
Bug 92224
Opened 23 years ago
Closed 23 years ago
8.2.4 Client Behavior if Server Prematurely Closes Connection
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla0.9.4
People
(Reporter: ian, Assigned: darin.moz)
References
()
Details
(Keywords: testcase, Whiteboard: [Hixie-P2])
Attachments
(5 files)
4.90 KB,
patch
|
Details | Diff | Splinter Review | |
3.97 KB,
patch
|
Details | Diff | Splinter Review | |
27.31 KB,
patch
|
Details | Diff | Splinter Review | |
26.07 KB,
patch
|
Details | Diff | Splinter Review | |
28.01 KB,
patch
|
Details | Diff | Splinter Review |
If an HTTP/1.1 client sends a request which includes a request body, but which does not include an Expect request-header field with the "100-continue" expectation, and if the client is not directly connected to an HTTP/1.1 origin server, and if the client sees the connection close before receiving any status from the server, the client SHOULD retry the request. If the client does retry this request, it MAY use the following "binary exponential backoff" algorithm to be assured of obtaining a reliable response: 1. Initiate a new connection to the server 2. Transmit the request-headers 3. Initialize a variable R to the estimated round-trip time to the server (e.g., based on the time it took to establish the connection), or to a constant value of 5 seconds if the round- trip time is not available. 4. Compute T = R * (2**N), where N is the number of previous retries of this request. 5. Wait either for an error response from the server, or for T seconds (whichever comes first) 6. If no error response is received, after T seconds transmit the body of the request. 7. If client sees that the connection is closed prematurely, repeat from step 1 until the request is accepted, an error response is received, or the user becomes impatient and terminates the retry process. If at any point an error status is received, the client - SHOULD NOT continue and - SHOULD close the connection if it has not completed sending the request message. Currently, we try to rerequest the resource about a gazillion times a second, resulting in my server's error log growing to an unearthly (and expensive) size. STEPS TO REPRODUCE See http://www.hixie.ch/tests/adhoc/http/error/nph-001.cgi This test script simulates a failure on the server side: it doesn't return anything (not even headers). This kind of failure occurs, for example, when an nph script has a bug. (This is how I found the bug in the first place.) ACTUAL RESULTS The test script aborts after 100 requests are sent in less than 25 seconds. EXPECTED RESULTS See above for a quote from the HTTP spec. Other browsers don't handle this in a particularly stellar way either, but they at least don't overload the server like we do: WinIE6-pre: Shows a "cannot find server or DNS error" message after sending 2 requests back to back. Opera5: Sends 5 requests over a period of 3 seconds then shows an alert dialog saying "Connection closed by remote server". MacIE5: Displays an error dialog after sending 2 reqests back to back. iCab2.51-pre: Sends 1 request and then does nothing (no user feedback). Escape4.7: Displays a blank page after sending 3 requests back to back. Lynx2.8.3-pre: Displays 2 status line alerts after sending 1 request. This is a particularly important bug because it is only likely to manifest itself when the server is in trouble, and it results in the server being in even more trouble. Basically, it makes the server-side "drop all incoming connections in an effort to save ourselves" tactic backfire horribly. (Note: If you are trying this in other browsers, make sure to clear their cache between each attempt. Some browsers do not follow the spec's recommendation of not caching query-based GETs.)
Comment 1•23 years ago
|
||
-> darin, -> major darin, we could use mReuseCount to determine if this is a case where wewant to detect this for timed out keep-alives, if we incremented that variable at some point.
Assignee: neeti → darin
Severity: normal → major
Assignee | ||
Comment 2•23 years ago
|
||
bbaetz: as we talked about that won't work since NSS is using this "feature" for handling TLS intolerant servers. i'd like to start by simply limiting the number of consecutive restarts. what's a good number? perhaps no more than 2 consecutive restart attempts? -> 0.9.4
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.4
Comment 3•23 years ago
|
||
The spec says: ... "and if the client is not directly connected to an HTTP/1.1 origin server," What does this mean? If we haven't got a connection, we can't know if it was 1.1 or not. Is this section only meant to apply if we're talking to proxies. Does it imply that if we were to do fallback to different servers, we SHOULD retry the original proxy at least once? (That makes sense, I guess) I'd say 2 would be fine, if we're not going to do exponential backoff. If we are, maybe 5? Or maybe try for n seconds (however many retries that involves, following this algorithm), up to a maximum number of requests. From Ian's findings, 2-3 would be enough though. We should check if the behaviour is different for proxy servers.
Assignee | ||
Updated•23 years ago
|
Priority: -- → P2
Assignee | ||
Comment 5•23 years ago
|
||
btw: the exponential backoff algorithm described in section 8.2.4 applies to requests which contain a message body. however, this bug deals with GET requests. the "restart feature" currently implemented doesn't really have anything to do with section 8.2.4. instead, it simply has to do with the way TCP/IP works... that is, we may be able to successfully write to a socket which the server has closed, but will only discover this when we try to read. therefore, the "restart feature" is necessary, especially when using persistent connections. we have to be careful when limiting the number of restart attempts b/c we have to allow for each idle connection to fail in this manner. since we currently allow up to 8 persistent connections per server, and because PSM is utilizing this "feature", i'm going to submit a patch which limits the number of restarts to be no more than 10 (i'd like to allow an additional restart just for good measure). this patch will include a preference to configure this number.
Assignee | ||
Comment 6•23 years ago
|
||
Comment 7•23 years ago
|
||
+ PRUint16 MaxRequestAttempts() { return (PRUint16) mMaxRequestAttempts; } Why the cast? Yes, this will never overflow, but still, store it with the same type that we'll be using it as. Or just return it as a 32 bit number. Ditto for mHttpVersion (Although that was there before)
Assignee | ||
Comment 8•23 years ago
|
||
yeah.. just be'in a bit sloppy. wanted to avoid adding a local variable when calling GetIntPref, but of course that's really not the right trade-off ;-)
Assignee | ||
Comment 9•23 years ago
|
||
Assignee | ||
Comment 10•23 years ago
|
||
Comment 11•23 years ago
|
||
I'm not too sure about the change to use bitfields - wouldn't this be slower? I'll apply the patch tomorrow, but apart from that it looks good.
Assignee | ||
Comment 12•23 years ago
|
||
yeah.. actually, using bitfields is more costly (an extra instruction on x86). perhaps it not worth the savings of a couple bytes.
Assignee | ||
Comment 13•23 years ago
|
||
Comment 14•23 years ago
|
||
r=bbaetz if you run the patch through john taylor's tests.
Assignee | ||
Comment 15•23 years ago
|
||
results from jtaylor's page loading test shows minor to no improvement. http://g.mcom.com/~jtaylor/necko/bin/userData.pl?user=darin@netscape.com
Assignee | ||
Comment 16•23 years ago
|
||
Comment 17•23 years ago
|
||
sr=mscott
Assignee | ||
Comment 18•23 years ago
|
||
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 19•23 years ago
|
||
*** Bug 90976 has been marked as a duplicate of this bug. ***
You need to log in
before you can comment on or make changes to this bug.
Description
•