Closed Bug 177326 Opened 22 years ago Closed 22 years ago

Mozilla rapidly repeats requests multiple times

Categories

(Core :: Networking: HTTP, defect, P1)

x86
Linux
defect

Tracking

()

CLOSED FIXED
mozilla1.2final

People

(Reporter: sharding, Assigned: darin.moz)

References

Details

(Keywords: topembed)

Attachments

(3 files, 1 obsolete file)

I've run across a situation in which Mozilla sometimes requests an URL multiple times in response to one request from the user. This is happening in an environment with a Squid proxy server. While I'm not sure the proxy server is the cause of the problem, I haven't noticed it happen without the proxy. I'm still testing to confirm that. To the user, this problem appears as an error saying "the document contains no data." In trying to troubleshoot this problem, I looked in the proxy logs and saw that the URL in question was being requested multiple times in rapid succession. For example, this proxy log entry was the result of clicking a link *once* in Mozilla. Mozilla popped the "no data" error instead of displaying the page: 1035486220.362 148 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.148 text/html 1035486220.471 126 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.135 text/html 1035486220.566 113 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.148 text/html 1035486220.666 117 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.135 text/html 1035486220.769 118 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.148 text/html 1035486220.868 118 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.135 text/html 1035486221.065 216 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.148 text/html 1035486221.169 121 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.135 text/html 1035486221.311 159 <client ip> TCP_MISS/200 2996 GET http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=2064056329 - DIRECT/66.135.192.148 text/html That's nine GETs in response to a single click. I've also had this happen with POSTs. In one case, Mozilla submitted a form ten times in response to a single click of the submit button. Thankfully, it was a feedback form instead of a form to buy a TV or sell 1000 shares of stock. After seeing this odd pattern in the logs, I set up tcpdump to watch the requests from Mozilla to the proxy to see if Mozilla was actually sending multiple requests and what the responses from the proxy are. In the instance I caught with tcpdump, Mozilla sent the GET ten times in response to me typing the URL in and hitting enter once. * The first GET was answered by the proxy with an immediate RST. * The second GET is more complicated. The initial GET packet is responded to by the proxy server with an "HTTP/1.0 200 OK" and about 1K of data (not the complete page). The HTTP headers on that response are: HTTP/1.0 200 OK Date: Thu, 24 Oct 2002 15:25:54 GMT Server: Apache/1.3.19 (Unix) FrontPage/4.0.4.3 Content-Type: text/html X-Cache: MISS from <proxy server name> Proxy-Connection: close The only TCP flag set is ACK. The client (Mozilla end) replies to that packet with a FIN ACK packet, even though the transfer is not complete. Then we receive the second packet of the response from the proxy (still not the complete page). Then the client sends an RST packet to the server. Then we receive a second copy of the HTTP/1.0 200 OK packet from the proxy (same sequence and ack numbers). Client sends another RST, server sends another OK. This repeats six times until a final RST from the client and nothing else is heard from the server in response to this request. * The third GET is just like the first -- an immediate RST from the server. * The rest are essentially like the second -- some data is sent by the server and Mozilla closes the connection before it has all arrived. So, on the first and third requests, the proxy replied with an RST before sending any data. In all the others, the proxy replied OK and started sending data, but Mozilla closed the connection before the data had all arrived. Remember that this all happened in a span of 1.45 seconds. I have no idea what's causing this, and I don't really know where to look next. Analysis of the logs on the proxy server indicates that this isn't happening to other clients, so I don't think it's a problem with the proxy server by itself. Any ideas?
Changed summary -- it's not specific to the GET method.
Summary: Mozilla rapidly repeats GETs multiple times → Mozilla rapidly repeats requests multiple times
sean: thanks for the detailed information. this is a known problem, but i have not been able to reproduce it myself. mozilla responds to a RST connection by resending the request on a new socket connection. can you please upload a copy of your tcpdump. that would be very very helpful to me in analyzing this problem. if possible, can you build a debug version of mozilla and capture some additional mozilla logs? in particular, it would be very very interesting to see the log generated from these environment variable settings: export NSPR_LOG_MODULES=nsSocketTransport:5,nsHttp:5 export NSPR_LOG_FILE=http.log see also bug 135182 and bug 168138. thx!!
Severity: normal → major
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Keywords: nsbeta1, topembed
Priority: -- → P2
Target Milestone: --- → mozilla1.3alpha
I think bug 162875 is a dupe of this one ... but this is the first time we got a tcpdump.
Attached file HTTP log of this error happening (obsolete) —
Here's the debug log output from this error happening. The whole log is a large file, so I trimmed it to a few dozen lines before the request for the problem URL. At the end of the log, I quit Mozilla. Let me know if you need more context before the error. The URL that had a problem in this case is http://www.alaskausa.org/Products_and_Services/Fees/fees.html. It was requested 10 times from the proxy server in less than 1 second.
sean: your log only contains HTTP logging, which suggests that you used a mozilla build off of ftp.mozilla.org (i.e., nsSocketTransport logging is not enabled in mozilla nightly builds). is it at all possible for you to compile/build mozilla yourself? that would give you the ability to capture a nsSocketTransport log. unfortunately, the HTTP log by itself is not very helpful. it only indicates that the socket connection is being reset / closed prematurely. it doesn't indicate why, which is something the nsSocketTransport log would perhaps tell us. thx!!
I did build that Mozilla myself. I thought I had debugging enabled, but apparently not. I'll double-check and rebuild.
Ok, here's another shot. I forgot I had --disable-debug set in my ~/.mozconfig This is from getting the error on http://boards.fool.com/?ref=topnav I've again trimmed it significantly, and again there's a Mozilla exit at the end of the file. Let me know if this still isn't what you're looking for.
Attachment #104645 - Attachment is obsolete: true
sean: thank you so much for providing this log file!! it does indicate a serious problem. after detecting the TCP RST, we never call the low-level PR_Read function again, yet we call PR_Write each time after a failure. it seems that we have a bug in our RST handling code. this probably explains all of the "document contains no data" bugs we've been seeing. thanks again!! :-) i think this should be considered a mozilla 1.2 blocker if it isn't already too late.
Severity: major → critical
Priority: P2 → P1
Target Milestone: mozilla1.3alpha → mozilla1.2final
Attached patch v1 patchSplinter Review
sean: can you please test out this patch and let me know if it solves the problem? thx!
Rebuilding now. I'll let you know. Because of the nature of this error -- it occurs somewhat randomly -- it's much easier to know that it still is happening than to be sure that it *isn't*. But I'll do my best.
thanks. for kicks, i made the socket transport randomly fail with NS_ERROR_NET_RESET in my build, and we seem to do OK. so, i'm thinking that this fixed the problem. but, of course... we need real world testing, since i may have the wrong cause in mind.
this patch is just for testing. the "v1 patch" makes moz properly loads pages even with this sadistic patch in the tree.
Keywords: adt1.0.2
Comment on attachment 104803 [details] [diff] [review] v1 patch r=dougt
Attachment #104803 - Flags: review+
QA Contact: httpqa → tever
So far, the patch seems to be working for me. I haven't seen any "no data" errors, and it looks like there have been some rsts (at least, there are four instances of "mapping to NS_ERROR_NET_RESET" in my log).
sean: excellent news!
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
adt wants this please verify on trunk asap.
Keywords: adt1.0.2adt1.0.2+
Need verification now. This is now urgent.
Tom has been looking at this bug. Apparently, it is not trivial to reproduce. He is assessing to determine course of action to take to verify. Thanks for your patience and understanding.
FWIW, I had been seeing this problem between 10 and 50 times per day. After building with the patch on 10/31, I haven't seen it a single time (neither the "document contains no data" error nor the pattern of multiple requests in the proxy server logs). The bug that I reported is effectively gone for me.
ok thanks, so I think this can be marked verified according to Sean's comments and Darin's hard coded testing mentioned in comment# 12. The patch appears to be working. I have not been able to reproduce this so far. I can confirm with LXR that the patch is checked in on the trunk.
Status: RESOLVED → VERIFIED
agreed. this bug requires a flaky network or http server to duplicate (it only occurs when a connection is abruptly terminated, which is not supposed to happen). the fact that this is so difficult to repro (except for the folks that it effects) is the reason why we have had this bug for so long and have not been able to fix it until now. we owe a big thank you to sean for helping us track down this bug. IMO the fact that he has confirmed the bug fix should be sufficient verification.
Status: VERIFIED → CLOSED
Comment on attachment 104803 [details] [diff] [review] v1 patch a=blizzard on behalf of drivers for the 1.0 branch. Please mark with 'fixed1.0.2' when you have fixed it and I imagine 'verified1.0.2' once it's been verified.
fixed1.0.2
Keywords: mozilla1.0.2fixed1.0.2
confirmed that patch is checked in on branch - verified1.0.2
*** Bug 168138 has been marked as a duplicate of this bug. ***
Blocks: 181536
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: