Closed Bug 448575 Opened 16 years ago Closed 8 years ago

nsHttpChannel silently eats socket errors that occur after the response body is being received

Categories

(Core :: Networking: HTTP, defect)

Platform: x86, Windows XP
Type: defect
Priority: Not set
Severity: normal

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: michaeln, Unassigned)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0

The listener's OnStopRequest method is called when a connection is dropped mid-stream, but the 'status' argument says NS_OK.

Not sure if this happens in all cases, but certainly in some.

Reproducible: Always

Steps to Reproduce:
1. Initiate a request expected to take some time to receive (fetch a large file from a slow server).

2. While the response is being received, disable the "Local Area Connection" in Windows.

3. Observe OnStopRequest is called with 'status' == NS_OK, however only a portion of the response data has been received.
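
For anyone who wants a truncated transfer without disabling a network adapter, here is a minimal sketch of a local stand-in, written in Python rather than against Gecko. It assumes a server that advertises a Content-Length and then closes the connection after sending only part of the body. The names serve_truncated and fetch_once and the byte counts are made up for illustration. A clean close like this is not identical to a socket reset, so it only approximates the steps above.

```python
import http.client
import socket
import threading
from typing import Tuple

BODY_ADVERTISED = 1000  # bytes promised in Content-Length (illustrative value)
BODY_SENT = 100         # bytes actually written before the connection is dropped


def serve_truncated(listener: socket.socket) -> None:
    """Accept one connection, send a partial body, then close the socket."""
    conn, _ = listener.accept()
    with conn:
        # Read the request headers fully so that closing does not discard unread data.
        request = b""
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        head = (
            "HTTP/1.1 200 OK\r\n"
            f"Content-Length: {BODY_ADVERTISED}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + b"x" * BODY_SENT)
    # Leaving the with-block closes the socket in the middle of the body.


def fetch_once() -> Tuple[int, int]:
    """Return (bytes received, bytes advertised) for one truncated transfer."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_truncated, args=(listener,))
    server.start()
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/")
        response = client.getresponse()
        try:
            body = response.read()
            return len(body), BODY_ADVERTISED
        except http.client.IncompleteRead as err:
            # http.client reports the short body instead of returning it silently.
            return len(err.partial), BODY_ADVERTISED
    finally:
        client.close()
        server.join()
        listener.close()


if __name__ == "__main__":
    received, advertised = fetch_once()
    print(f"received {received} of {advertised} bytes")
```

Python's http.client raises IncompleteRead in this situation, which is the kind of signal a listener would need in order to tell a complete response from a truncated one.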
Component: General → Networking: HTTP
Product: Firefox → Core
QA Contact: general → networking.http
Version: unspecified → Trunk
Does this happen in the browser? What happens? No error? Throbber keeps running?
When loading content into a browser frame, the request does terminate, so the throbber does not keep running, and the partially received content (HTML, script, CSS, whatever) is loaded. The page may or may not work properly, depending on what was cut off.

This isn't a terrible problem for page loading (hit refresh and maybe the page will work, the way of the web). The socket was reset so the data isn't available, what more could be done (perhaps indicate some resources didn't fully load in a gui?).

But for other clients of nsHttpChannel trying to use HTTP as a reliable transport mechanism (Gears.LocalServer trying to cache resources for example), not being able to detect errors is a problem.
 
Component: Networking: HTTP → Networking: FTP
QA Contact: networking.http → networking.ftp
rnewman, I remember, a long time ago, your team talking about our HTTP stack not finishing requests for some unknown reason, and/or silently discarding headers, and that you guys worked around these issues by automatically retrying the transactions. Did you ever find an explanation about what was going on? Do you think this has any bearing on that problem?
Component: Networking: FTP → Networking: HTTP
QA Contact: networking.ftp → networking.http
(In reply to Brian Smith (:bsmith) from comment #3)
> rnewman, I remember, a long time ago, your team talking about our HTTP stack
> not finishing requests for some unknown reason, and/or silently discarding
> headers, and that you guys worked around these issues by automatically
> retrying the transactions. Did you ever find an explanation about what was
> going on?

Bug 696137. No, we didn't get to the bottom of it; we investigated a bunch of possibilities (e.g., the server terminating the connection due to a query timeout), but never got any solid leads.

We don't even retry: we just error when accessing the response body because it doesn't parse as valid JSON. We have an open bug to verify that the content matches the Content-Length header, which I hope would make truncated responses more obvious: Bug 685944.
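
The Content-Length check mentioned above could look roughly like the following. This is a sketch in Python, not the Sync code and not what Bug 685944 implements; the function name is_truncated and its three-way return value are assumptions for illustration. It treats a missing or malformed header as "cannot tell", since chunked responses carry no Content-Length. The byte count must be the number of body bytes on the wire, before any Content-Encoding is undone.

```python
from typing import Mapping, Optional


def is_truncated(headers: Mapping[str, str], bytes_received: int) -> Optional[bool]:
    """Compare the received body size against Content-Length.

    Returns True if fewer bytes arrived than advertised, False if at least
    the advertised number arrived, and None if the header is absent or
    unparseable (for example with chunked transfer encoding).
    """
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("content-length")
    if raw is None:
        return None
    try:
        expected = int(raw.strip())
    except ValueError:
        return None
    if expected < 0:
        return None
    return bytes_received < expected


if __name__ == "__main__":
    print(is_truncated({"Content-Length": "1000"}, 100))
    print(is_truncated({"content-length": "1000"}, 1000))
    print(is_truncated({"Transfer-Encoding": "chunked"}, 100))
```

A caller that gets True back can fail the request explicitly instead of waiting for the JSON parser to reject the body.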

> Do you think this has any bearing on that problem?

This bug sounds very promising indeed, and I would welcome a fix… and I'd settle for a way to provoke it in xpcshell so that we can verify that it's a root cause.

Flagging as qawanted, because this really shouldn't still be sitting in UNCONFIRMED. And I'll go so far as to block Bug 696137 on this; I don't want it to drop off our radar.

Thanks for digging up this nugget of gold, bsmith!
Blocks: 696137
Keywords: qawanted
Tried to test this with the 01/14 Nightly build and the 01/13 debug Nightly on Windows XP.

I started fetching this image https://support.liferay.com/secure/attachment/31703/e%20large.png, then disabled the local area connection. The Network dev tool was used for watching requests/responses and the Browser Console for all other info.

As soon as the fetching starts, the status for the response with the whole image is OK. Nothing changes here when I disable the network, although very little of the image is actually loaded. I do get a "The connection was reset" error on the page though, and the Browser Console shows "Image corrupt or truncated: https://support.liferay.com/secure/attachment/31703/e%20large.png". I see nothing about OnStopRequest in any of these tools, nor in the console for the debug build. Where could I follow the details about how OnStopRequest is called (when, with what status)?
Flags: needinfo?(rnewman)
Keywords: qawanted
Ioana: it's possible that 

https://developer.mozilla.org/en-US/docs/HTTP_Logging

will give you enough info. If not, you'll need to test via the Browser Console by manually poking at RESTRequest or Resource.
Flags: needinfo?(rnewman)
(In reply to Richard Newman [:rnewman] from comment #6)
> Ioana: it's possible that 
> 
> https://developer.mozilla.org/en-US/docs/HTTP_Logging
> 
> will give you enough info. If not, you'll need to test via the Browser
> Console by manually poking at RESTRequest or Resource.

Thanks! I did some HTTP logging and checked the Browser Console a bit.

When loading the image in comment 5 and disabling the network connection once:
* The console says that everything is ok and loaded. 
* HTTP logs show 6 calls to OnStopRequest, the first 5 with status=0 and the last one with status=804b0014.
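
To read the status values in the log, an nsresult can be split into its severity bit, module and code. The sketch below is in Python and assumes the usual Gecko layout (severity in the top bit, module stored with a base offset of 0x45, code in the low 16 bits); the function name decode_nsresult is made up for illustration. Under that assumption, 804b0014 is a failure in module 6 (network) with code 20, which to my knowledge corresponds to NS_ERROR_NET_RESET and would match the "connection was reset" page, while status=0 is NS_OK.

```python
NS_ERROR_MODULE_BASE_OFFSET = 0x45  # assumed offset added to module numbers


def decode_nsresult(value: int) -> dict:
    """Split a 32-bit nsresult into severity, module and code fields."""
    failed = bool(value & 0x80000000)
    # Same arithmetic as the module-extraction macro: shift, subtract, mask.
    module = ((value >> 16) - NS_ERROR_MODULE_BASE_OFFSET) & 0x1FFF
    code = value & 0xFFFF
    # For a plain success value such as 0 the module field carries no meaning.
    return {"failed": failed, "module": module, "code": code}


if __name__ == "__main__":
    for status in (0x0, 0x804B0014):
        print(f"{status:08x} -> {decode_nsresult(status)}")
```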
I don't think this is actionable at this point, but we do treat silent truncations as OK for webcompat reasons. We've tried otherwise :(
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE