So, I filed a bug today. The bug submission took pretty long, and I got suspicious... well, when I clicked "Stop" and checked the list of new bugs, I found bug 184115 - bug 184119, all the same, all my bug. Contrary to what darin once said to timeless, this does not seem to be fixed. Win98, 2002120709 here's hoping that this bug will only be created once...
related: bug 183053
"The bug submission took pretty long" is a somewhat different symptom. in the past, we've seen multiple form submissions happen very quickly, but maybe bugzilla was simply slow to allow a new connection. hmm...
so, today it happened again - 10 bugs were created altogether. (bug 185877 and duplicates of it)
ok, biesi's log file appears to reveal a new way in which this problem can occur. i'm fairly certain that bugzilla is closing/reseting/dropping the connection after receiving the POST data. it is strange that it would do this 10 times, and the log file unfortunately does not give us enough information to say so conclusively. i'm in the middle of reworking the transport layer for necko (bug 176919), which will have the effect of greatly simplifying the code which deals with TCP/IP RST detection. i'm expecting to land those changes soon, so i'm going to defer working on this bug until after those changes go in. as a result of those changes it will be very easy to implement support for HTTP/1.1 Expect 100 Continue, which will almost completely wipe out any possibility of this sort of problem even if the server is totally wacked.
Severity: normal → critical
Status: NEW → ASSIGNED
Depends on: 176919
Priority: -- → P1
Target Milestone: --- → mozilla1.3beta
Hi All, I just posted bug 186339 and got 186340 and 186341 as duplicates. I am adding myself to the Cc: list. I am running 126.96.36.1992122008 on w2k-p, sp3. --Tony
assuming darin meant to nominate. only drivers set the + on blocking and approval flags.
Flags: blocking1.3b+ → blocking1.3b?
Darin, it doesn't look like there's much activity over at 176919. Is this something we should hold the beta for? Can you look into it independent of progress on 176919. If 176919 doesn't happen pretty soon it's not gonna make 1.3. We're less than two weeks from the freeze and it looks too large to take after beta.
Flags: blocking1.3b? → blocking1.3b+
asa: the patch for bug 176919 is being reviewed. this bug will not be fixed by that patch alone, but that patch will make it possible to fix this bug. trying to fix this bug w/o the patch for 176919 is going to be difficult.
If useful, you might see what livehttpheaders from livehttpheaders.mozdev.org is showing as you post bugs.
*** Bug 191586 has been marked as a duplicate of this bug. ***
a possible short-term solution for this bug might be to disable using a keep-alive connection with POST requests (and disable the transaction restart logic for POST requests). long-term, we should issue an "Expect: 100-continue" request header for POST requests to a HTTP/1.1 server. this plan would break POST requests to TLS-intolerant servers; however, since PSM simulates a TCP RST to force necko to retransmit the request. in which case the user would see a "The document contains no data" alert. Resubmitting the form would work since PSM remembers TLS-intolerant servers, but the alert is probably something we should avoid showing the user upon clicking "OK" to purchase something. perhaps the only solution is to implement "Expect: 100-continue" for HTTP/1.1 and add a delay before POSTing for HTTP/1.0 (see bug 135182). thoughts?
actually, on second thought.. TLS-intolerance is detected on first write, not on first read, and we should be able to restart a POST request if we have not yet written out any of the request data. so, we should be able to solve this bug by preventing POST requests from reusing existing connections and by only allowing a POST request restart if no request data has been written to the socket. patch in hand... (note: this is still only a short-term fix... we really need the long-term solution i mentioned above, but for 1.3 this short-term solution is probably the right thing to do.)
Can we get the patch out of your hand and into reviews? ;-)
sorry.. i've been delayed. i should have a patch by tomorrow or this evening.
so, i really don't like the idea of disabling keep-alive connections for POST requests. i spent some time reading over the classic codebase to see how it handled this problem. seems that they have a check to only resend the POST data if the connection that failed (w/ premature EOF) was being reused. mozilla currently resends even if a new connection fails (w/ premature EOF). so, perhaps instead of trying to do something so drastic like disabling keep-alive, we should instead try to follow the classic codebase.
Created attachment 113663 [details] [diff] [review] v1 patch makes our transaction restart behavior mimic that of nav4x. ultimately, we should still try to support HTTP/1.1 Expect 100-continue, but i think this patch is the right thing for the time being.
I'm not going to get to this review till at least Friday afternoon.
Comment on attachment 113663 [details] [diff] [review] v1 patch Whats the timeout change for? This looks fine apart from that.
bbaetz: hmm... i was thinking the 15 second timeout could easily be 1 minute instead, so as to lessen the burden on laptops when mozilla is effectively idle. but, that problem really needs a better solution, like... not running the timeout at all when there are no idle connections. hmm... and after re-reading my comments in bug 97833 (in which i argued in favor of 15 seconds instead of 1 minute), i think i will not include the timeout change in this patch afterall. better to file a bug about disabling the timer when no idle connections exist ;) (thanks for being vigilant btw!)
Created attachment 113723 [details] [diff] [review] v1.1 patch same patch, just keeps the idle connection pruning timer at every 15 seconds.
Attachment #113663 - Attachment is obsolete: true
Comment on attachment 113663 [details] [diff] [review] v1 patch (clearing review requests on obsolete patch)
Comment on attachment 113723 [details] [diff] [review] v1.1 patch Looks fine. r=bbaetz
Attachment #113723 - Flags: review?(bbaetz) → review+
Comment on attachment 113723 [details] [diff] [review] v1.1 patch sr=me... but I have to admit that I don't fully understand the new world of HTTP... ;)
Attachment #113723 - Flags: superreview?(bzbarsky) → superreview+
Comment on attachment 113723 [details] [diff] [review] v1.1 patch requesting drivers approval for 1.3 beta...
Attachment #113723 - Flags: approval1.3b?
Attachment #113723 - Flags: approval1.3b? → approval1.3b+
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.