Closed Bug 1277522 Opened 9 years ago Closed 9 years ago

Incorrect handling of closed connections by server

Categories

(Core :: Networking: HTTP, defect)

46 Branch
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla49
Tracking Status
firefox46 --- wontfix
firefox47 + verified
firefox48 + verified
firefox49 + fixed

People

(Reporter: azurit, Assigned: dragana)

References

()

Details

(Keywords: regression, Whiteboard: [necko-active])

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0 Build ID: 20160511224154 Steps to reproduce: 1.) Go to www.elbianic.sk and wait until the page is fully loaded. 2.) Press CTRL + F5 to fully load a page again. 3.) Quickly, within 5 seconds (=KeepAlive timeout on server), press CTRL + R to refresh the page. Actual results: You will see the error 'The connection was reset' because server closed the connection - this is completely ok server behavior according to RFC https://tools.ietf.org/html/rfc2616#section-8.1.4 and MUST be handled correctly by clients. In fact, this was handled correctly by Firefox until version 46 (it simply reconnected to the server without showing any error, like all other browsers). What EXACTLY happned: We are using Apache MPM ITK ( http://mpm-itk.sesse.net/ ) and resources within one virtualhost are available to different system users (scripts vs. static data). When Firefox loads a page for the second time (step 3, see above), it uses the connection which is still opened but was previously used for loading static data (images for example) and is running under user who don't have access to scripts. Apache MPM ITK will try to switch to different user but, as it is already switched to non-root user, this operation fails and connection is correctly closed (see http://mpm-itk.sesse.net/ section Quirks and warnings). Firefox 46 and above displays incorrect error instead of doing reconnect. Expected results: Firefox should reconnect to the server in the background without showing any error to the user. It was doing it until version 46.
Regression range: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=39e699922ca7763e81c690e706a451f0d849059f&tochange=e74d457b8b1100479d6e011b7658ef0baa31c3d2 Regressed by: Dragana Damjanovic — Bug 1236277 - Retry all connections not just the reused one. r=mcmanus It should be fixed in 47+ by bug 1269055, but that's not the case, the issue is still present. Dragana, any thoughts?
Blocks: 1236277
Status: UNCONFIRMED → NEW
Component: Untriaged → Networking: HTTP
Ever confirmed: true
Flags: needinfo?(dd.mozilla)
Keywords: regression
OS: Unspecified → All
Product: Firefox → Core
Hardware: Unspecified → All
Assignee: nobody → dd.mozilla
Whiteboard: [necko-active]
(In reply to Loic from comment #1) > Regression range: > https://hg.mozilla.org/integration/mozilla-inbound/ > pushloghtml?fromchange=39e699922ca7763e81c690e706a451f0d849059f&tochange=e74d > 457b8b1100479d6e011b7658ef0baa31c3d2 > > Regressed by: > Dragana Damjanovic — Bug 1236277 - Retry all connections not just the reused > one. r=mcmanus > > It should be fixed in 47+ by bug 1269055, but that's not the case, the issue > is still present. > > Dragana, any thoughts? This is not a regression from bug 1236277, I have tested that (if it was it would be an easy fix :), but it is not). I will look further.
Flags: needinfo?(dd.mozilla)
Given that this bug has regressed in Fx46, it might already be too late to do anything about it in Fx47. Neither do I consider this as release blocking. I will discuss with Andrew Overholt and see if there are any concerns.
My mistake it is bug 1236277 :) Before that bug we use to retry 10 times before we fail. And fixing that bug we said: 10 times is too much, 3 is enough, shortly after we changed that to 6. So if we change pref: network.http.request.max-attempts from 6 to 10 this works fine. To fix this we need just to change pref. I am ok with uplifting this to current beta as well.
(In reply to Ritu Kothari (:ritu) from comment #3) > Given that this bug has regressed in Fx46, it might already be too late to > do anything about it in Fx47. Neither do I consider this as release > blocking. I will discuss with Andrew Overholt and see if there are any > concerns. Sorry we were typing at the same time. The fix for this would be only a pref change.
Flags: needinfo?(rkothari)
Attached patch bug_1277522_v1.patch — — Splinter Review
Attachment #8759391 - Flags: review?(mcmanus)
Thak you. If it's only such a simple fix, could it go to the next minor release?
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Review of attachment 8759391 [details] [diff] [review]: ----------------------------------------------------------------- I need to understand why this matters to this use case
(In reply to Patrick McManus [:mcmanus] from comment #8) > Comment on attachment 8759391 [details] [diff] [review] > bug_1277522_v1.patch > > Review of attachment 8759391 [details] [diff] [review]: > ----------------------------------------------------------------- > > I need to understand why this matters to this use case This was a good point asking that question: We have 7 parallel connections open to that host. When refresh we try one of them and the server closes the connection, than we try the next and again the same ... Only when we open a new tcp connection it succeeds :)
The comment above: it is 6 connection (I was wondering about 7 connection, but it only me not counting well :) )
(In reply to Dragana Damjanovic [:dragana] from comment #9) > (In reply to Patrick McManus [:mcmanus] from comment #8) > > Comment on attachment 8759391 [details] [diff] [review] > > bug_1277522_v1.patch > > > > Review of attachment 8759391 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > I need to understand why this matters to this use case > > > This was a good point asking that question: > We have 7 parallel connections open to that host. When refresh we try one of > them and the server closes the connection, than we try the next and again > the same ... > Only when we open a new tcp connection it succeeds :) Whoa :)
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Review of attachment 8759391 [details] [diff] [review]: ----------------------------------------------------------------- I need to understand why this matters to this use case
Attachment #8759391 - Flags: review?(mcmanus) → review+
Keywords: checkin-needed
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Approval Request Comment [Feature/regressing bug #]: Bug 1236277 [User impact if declined]: On reload it can happened that we show an error "The connection was reset". It is a rare case. [Describe test coverage new/current, TreeHerder]: It is only a pref change therefore no risk at all. And this change changes the behavior as it was before 46 [Risks and why]: None [String/UUID change made/needed]:none
Attachment #8759391 - Flags: approval-mozilla-beta?
Attachment #8759391 - Flags: approval-mozilla-aurora?
Hi Dragana, Patrick: We pushed RC2 to Beta channel this morning. I have no other release blockers driving an RC3. At this point, an RC3 will delay Fx47 go-live date. To me, this issue is not a release blocker. I can add this to my 47 dot-release potential ride-along list but I do not feel the need to spin another RC build just for this issue. Are you ok with that plan? Please let me know.
Flags: needinfo?(rkothari)
Flags: needinfo?(mcmanus)
Flags: needinfo?(dd.mozilla)
ride-along is appropriate. I don't expect this would impact many sites at all.
Flags: needinfo?(mcmanus)
Well, i'm not very ok with that :( This is a regression which causes lots of our sites to simply not work properly without any change on our side and we are forced to do ugly workarounds like completely disabling KeepAlive. Even more, this will affect all users of Firefox 46 FOREVER and if you push it to Firefox 47, it will be even worse (not everyone is doing updates).
azurit - the statement of impact is based on the fact that you're not really doing persistent connections consistently even though you are negotiating that you are. That's not widespread. I'm not complaining about it or anything, I'm just saying that's not going to impact that many sites and that's a factor in the uplift decision. Disabling keepalive is probably the right behavior for your site if you are deliberating closing connections based on factors other than timeout - or perhaps dividing scripts and static data into different origins with different keep-alive configurations. The retry behavior you are relying on is going to be much slower than a fresh connection anyhow. as a practical matter, there is very likely to be a 47.0.1
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Changing the flag to m-r nom so this shows up as dot release potential ride-along candidates.
Attachment #8759391 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Pushed by cbook@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/7e9f77858528 Increase max connection retries to 10.r=mcmanus
Keywords: checkin-needed
(In reply to Ritu Kothari (:ritu) from comment #14) > Hi Dragana, Patrick: We pushed RC2 to Beta channel this morning. I have no > other release blockers driving an RC3. At this point, an RC3 will delay Fx47 > go-live date. To me, this issue is not a release blocker. I can add this to > my 47 dot-release potential ride-along list but I do not feel the need to > spin another RC build just for this issue. Are you ok with that plan? Please > let me know. it is ok.
Flags: needinfo?(dd.mozilla)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Minor pref change, may be a regression from 46, ok to uplift to aurora.
Attachment #8759391 - Flags: approval-mozilla-beta?
Attachment #8759391 - Flags: approval-mozilla-aurora?
Attachment #8759391 - Flags: approval-mozilla-aurora+
Hi Dragana, This patch changes the pref from 6 back to 10. Will this affect bug 1236277?
Flags: needinfo?(dd.mozilla)
(In reply to Gerry Chang [:gchang] from comment #24) > Hi Dragana, > This patch changes the pref from 6 back to 10. Will this affect bug 1236277? It should not affect it, it only increases number of retries. Why, is there a problem?
Flags: needinfo?(dd.mozilla)
Because from the patch of bug 1236277, the number is changed from 10 to 3. I was wondering if the change will affect the bug 1236277.
(In reply to Gerry Chang [:gchang] from comment #26) > Because from the patch of bug 1236277, the number is changed from 10 to 3. I > was wondering if the change will affect the bug 1236277. It will not affect it. The main change in bug 1236277 is to do retry in even more cases (not only if keepalive was used or pipelining)
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch Review of attachment 8759391 [details] [diff] [review]: ----------------------------------------------------------------- This patch fixes a regression. Take it in 48 beta 3.
Attachment #8759391 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Sorry I did not know why you are asking. This is definitely not going to affect 1236277.
comment 30 marked this fixed 47.. but I think that's mozilla-release now. so it's fixed 48, 49, 50.. and affected 47.
(In reply to Patrick McManus [:mcmanus] from comment #31) > comment 30 marked this fixed 47.. but I think that's mozilla-release now. so > it's fixed 48, 49, 50.. and affected 47. It is not in 47 Comment 18: decided to put it only as ride along. And it is push to aurora and beta just yesterday.
Comment on attachment 8759391 [details] [diff] [review] bug_1277522_v1.patch This is a very low risk fix for a recent regression, it seems to be a good candidate for a ride-along in 47.0.1
Attachment #8759391 - Flags: approval-mozilla-release? → approval-mozilla-release+
Flags: qe-verify+
Reproduced "The connection was reset" error message with Firefox 47. Using Firefox 47.0.1 and 48 beta 4, the page is loaded at every hard refresh, marking as verified.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: