Closed Bug 92224 Opened 23 years ago Closed 23 years ago

8.2.4 Client Behavior if Server Prematurely Closes Connection

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla0.9.4

People

(Reporter: ian, Assigned: darin.moz)

References

(
URL
)

Details

(Keywords: testcase, Whiteboard: [Hixie-P2])

Attachments

(5 files)

v1.0 fixes the problem 23 years ago Darin Fisher 4.90 KB, patch		Details \| Diff \| Splinter Review
partial revision... more to come 23 years ago Darin Fisher 3.97 KB, patch		Details \| Diff \| Splinter Review
v1.1 "it's a fat trimming extravaganza!" 23 years ago Darin Fisher 27.31 KB, patch		Details \| Diff \| Splinter Review
v1.2 removed bitfields 23 years ago Darin Fisher 26.07 KB, patch		Details \| Diff \| Splinter Review
v1.3 revised patch to build against today's tree 23 years ago Darin Fisher 28.01 KB, patch		Details \| Diff \| Splinter Review

Hixie (not reading bugmail)

Reporter

Description

•

23 years ago

If an HTTP/1.1 client sends a request which includes a request body, but which 
does not include an Expect request-header field with the "100-continue" 
expectation, and if the client is not directly connected to an HTTP/1.1 origin 
server, and if the client sees the connection close before receiving any status 
from the server, the client SHOULD retry the request. If the client does retry 
this request, it MAY use the following "binary exponential backoff" algorithm 
to be assured of obtaining a reliable response:

      1. Initiate a new connection to the server

      2. Transmit the request-headers

      3. Initialize a variable R to the estimated round-trip time to the
         server (e.g., based on the time it took to establish the
         connection), or to a constant value of 5 seconds if the round-
         trip time is not available.

      4. Compute T = R * (2**N), where N is the number of previous
         retries of this request.

      5. Wait either for an error response from the server, or for T
         seconds (whichever comes first)

      6. If no error response is received, after T seconds transmit the
         body of the request.

      7. If client sees that the connection is closed prematurely,
         repeat from step 1 until the request is accepted, an error
         response is received, or the user becomes impatient and
         terminates the retry process.

If at any point an error status is received, the client

      - SHOULD NOT continue and

      - SHOULD close the connection if it has not completed sending the
        request message.

Currently, we try to rerequest the resource about a gazillion times a second, 
resulting in my server's error log growing to an unearthly (and expensive) size.

STEPS TO REPRODUCE
   See http://www.hixie.ch/tests/adhoc/http/error/nph-001.cgi
   This test script simulates a failure on the server side: it doesn't return
   anything (not even headers). This kind of failure occurs, for example, when
   an nph script has a bug. (This is how I found the bug in the first place.)

ACTUAL RESULTS
   The test script aborts after 100 requests are sent in less than 25 seconds.

EXPECTED RESULTS
   See above for a quote from the HTTP spec. Other browsers don't handle this
   in a particularly stellar way either, but they at least don't overload the
   server like we do:

     WinIE6-pre: Shows a "cannot find server or DNS error" message after sending
       2 requests back to back.
     Opera5: Sends 5 requests over a period of 3 seconds then shows an alert 
       dialog saying "Connection closed by remote server".
     MacIE5: Displays an error dialog after sending 2 reqests back to back.
     iCab2.51-pre: Sends 1 request and then does nothing (no user feedback).
     Escape4.7: Displays a blank page after sending 3 requests back to back.
     Lynx2.8.3-pre: Displays 2 status line alerts after sending 1 request.

This is a particularly important bug because it is only likely to manifest 
itself when the server is in trouble, and it results in the server being in even
more trouble. Basically, it makes the server-side "drop all incoming connections
in an effort to save ourselves" tactic backfire horribly.

(Note: If you are trying this in other browsers, make sure to clear their cache
between each attempt. Some browsers do not follow the spec's recommendation of
not caching query-based GETs.)

Hixie (not reading bugmail)

Reporter

Updated

•

23 years ago

Keywords: testcase

Whiteboard: [Hixie-P2]

Bradley Baetz (:bbaetz)

Comment 1

•

23 years ago

-> darin, -> major

darin, we could use mReuseCount to determine if this is a case where wewant to
detect this for timed out keep-alives, if we incremented that variable at some
point.

Assignee: neeti → darin

Severity: normal → major

Darin Fisher

Assignee

Comment 2

•

23 years ago

bbaetz: as we talked about that won't work since NSS is using this "feature" for
handling TLS intolerant servers.

i'd like to start by simply limiting the number of consecutive restarts.  what's
a good number?  perhaps no more than 2 consecutive restart attempts?

-> 0.9.4

Status: NEW → ASSIGNED

Target Milestone: --- → mozilla0.9.4

Bradley Baetz (:bbaetz)

Comment 3

•

23 years ago

The spec says:

... "and if the client is not directly connected to an HTTP/1.1 origin server,"
What does this mean? If we haven't got a connection, we can't know if it was 1.1
or not. Is this section only meant to apply if we're talking to proxies. Does it
imply that if we were to do fallback to different servers, we SHOULD retry the
original proxy at least once? (That makes sense, I guess)

I'd say 2 would be fine, if we're not going to do exponential backoff. If we
are, maybe 5? Or maybe try for n seconds (however many retries that involves,
following this algorithm), up to a maximum number of requests.

From Ian's findings, 2-3 would be enough though. We should check if the
behaviour is different for proxy servers.

Jeroen Ruigrok van der Werven

Comment 4

•

23 years ago

*** Bug 92685 has been marked as a duplicate of this bug. ***

Darin Fisher

Assignee

Updated

•

23 years ago

Priority: -- → P2

Darin Fisher

Assignee

Comment 5

•

23 years ago

btw: the exponential backoff algorithm described in section 8.2.4 applies to
requests which contain a message body.  however, this bug deals with GET
requests.  the "restart feature" currently implemented doesn't really have
anything to do with section 8.2.4.  instead, it simply has to do with the way
TCP/IP works... that is, we may be able to successfully write to a socket which
the server has closed, but will only discover this when we try to read. 
therefore, the "restart feature" is necessary, especially when using persistent
connections.

we have to be careful when limiting the number of restart attempts b/c we have
to allow for each idle connection to fail in this manner.  since we currently
allow up to 8 persistent connections per server, and because PSM is utilizing
this "feature", i'm going to submit a patch which limits the number of restarts
to be no more than 10 (i'd like to allow an additional restart just for good
measure).  this patch will include a preference to configure this number.

Darin Fisher

Assignee

Comment 6

•

23 years ago

Attached patch v1.0 fixes the problem — Details — Splinter Review

Darin Fisher

Assignee

Updated

•

23 years ago

Keywords: patch

Bradley Baetz (:bbaetz)

Comment 7

•

23 years ago

+    PRUint16      MaxRequestAttempts() { return (PRUint16) mMaxRequestAttempts; }

Why the cast? Yes, this will never overflow, but still, store it with the same
type that we'll be using it as. Or just return it as a 32 bit number.
Ditto for mHttpVersion (Although that was there before)

Darin Fisher

Assignee

Comment 8

•

23 years ago

yeah.. just be'in a bit sloppy.  wanted to avoid adding a local variable when
calling GetIntPref, but of course that's really not the right trade-off ;-)

Darin Fisher

Assignee

Comment 9

•

23 years ago

Attached patch partial revision... more to come — Details — Splinter Review

Darin Fisher

Assignee

Comment 10

•

23 years ago

Attached patch v1.1 "it's a fat trimming extravaganza!" — Details — Splinter Review

Bradley Baetz (:bbaetz)

Comment 11

•

23 years ago

I'm not too sure about the change to use bitfields - wouldn't this be slower?

I'll apply the patch tomorrow, but apart from that it looks good.

Darin Fisher

Assignee

Comment 12

•

23 years ago

yeah.. actually, using bitfields is more costly (an extra instruction on x86). 
perhaps it not worth the savings of a couple bytes.

Darin Fisher

Assignee

Comment 13

•

23 years ago

Attached patch v1.2 removed bitfields — Details — Splinter Review

Bradley Baetz (:bbaetz)

Comment 14

•

23 years ago

r=bbaetz if you run the patch through john taylor's tests.

Darin Fisher

Assignee

Comment 15

•

23 years ago

results from jtaylor's page loading test shows minor to no improvement.
http://g.mcom.com/~jtaylor/necko/bin/userData.pl?user=darin@netscape.com

Darin Fisher

Assignee

Comment 16

•

23 years ago

Attached patch v1.3 revised patch to build against today's tree — Details — Splinter Review

Scott MacGregor

Comment 17

•

23 years ago

sr=mscott

Darin Fisher

Assignee

Comment 18

•

23 years ago

fixed-on-trunk

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Darin Fisher

Assignee

Comment 19

•

23 years ago

*** Bug 90976 has been marked as a duplicate of this bug. ***

benc

Updated

•

22 years ago

QA Contact: benc → tever

You need to log in before you can comment on or make changes to this bug.