Connection is reset message appears intermittently after landing of bug 592284

RESOLVED FIXED

Status

Core
Networking: HTTP
7 years ago
5 years ago

People

(Reporter: d.a., Assigned: mcmanus)

Tracking

({regression})

Trunk
x86
All
regression

Firefox Tracking Flags

(blocking2.0 final+)

Details

(Whiteboard: [http-conn])

Attachments

(1 attachment)

(Reporter)

Description

7 years ago
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b8pre) Gecko/20101124 Firefox/4.0b8pre
Build Identifier: 

Ever since bug 592284 landed on trunk I've been getting the "The Connection Was Reset" message, but only intermittently. It doesn't seem specific to any site, but it appears to be more frequent on sites with a high ping, perhaps around 250 ms and above.

I've not been able to reproduce it at will.


Reproducible: Sometimes
Confirmed on Win7 as well. Lots of people are seeing it on mozillazine: http://forums.mozillazine.org/viewtopic.php?p=10161971#p10161971

I'm seeing the problem on the mozillazine site itself.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac OS X → All
Version: unspecified → Trunk
I have not seen this at all. Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101124 Firefox/4.0b8pre (Windows 7 Pro x64).

Updated

7 years ago
Status: NEW → UNCONFIRMED
Ever confirmed: false

Comment 3

7 years ago
Confirmed. Win 7 x64, latest nightly.
(Assignee)

Comment 4

7 years ago
I'm traveling for the holiday and won't be able to look at it thoroughly until Monday. I haven't seen it myself, but I'll spend some time with mozillazine.
(Assignee)

Comment 5

7 years ago
Would the reporter try this build:

http://ftp.mozilla.org/pub/mozilla.org/firefox/tryserver-builds/mcmanus@ducksong.com-2be20026e175/

That build removes just the reclaim logic associated with surplus extra connections. If that does successfully isolate the problem we can build something a little more nuanced for merge.
(Assignee)

Comment 6

7 years ago
The root cause here seems to be that nsSocketTransport::IsAlive(), which is called before a persistent connection is used, does not detect a server-generated FIN as I had thought it did.

I need to figure out if that is a bug in IsAlive() or not.

I have a patch that works around the issue by setting the isReused flag on the connection. That lets the HTTP transaction send the request a second time if it gets a RST the first time. This makes sense as a way to deal with the inherent reuse race condition, so I'll keep it no matter what, but it would be better to avoid that path if we know the FIN has already arrived.
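The retry-on-reset behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the Gecko code: `FakeConnection`, `ConnectionReset`, `send_with_restart`, and `open_fresh_connection` are hypothetical names, and the only idea taken from the comment is that a request is resent once when a connection marked as reused answers with a reset.

```python
class ConnectionReset(Exception):
    """Stands in for a TCP RST seen while sending a request (hypothetical)."""


class FakeConnection:
    """Toy connection; is_reused mirrors the flag mentioned in the comment."""

    def __init__(self, is_reused, fail_first_send):
        self.is_reused = is_reused
        self._fail_next_send = fail_first_send

    def send(self, request):
        # Simulate a server that already closed an idle persistent connection.
        if self._fail_next_send:
            self._fail_next_send = False
            raise ConnectionReset()
        return f"response to {request}"


def send_with_restart(request, first_conn, open_fresh_connection):
    """Send once; on a reset, resend only if the connection was reused."""
    try:
        return first_conn.send(request)
    except ConnectionReset:
        if not first_conn.is_reused:
            # A reset on a brand-new connection is a real error: surface it.
            raise
        # Reuse race: the server closed the idle connection first, so the
        # request is sent a second time on a fresh connection.
        return open_fresh_connection().send(request)
```

In this sketch a reset on a connection that is not flagged as reused still propagates to the caller, which matches the "Connection Was Reset" page users saw before the flag was set.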
(Assignee)

Comment 7

7 years ago
(In reply to comment #6)
> The root cause here seems to be that nsSocketTransport::IsAlive(), which is
> called before a persistent connection is used, does not detect a
> server-generated FIN as I had thought it did.
>

You can scratch this comment. I mistakenly drew this conclusion because of how I was using my debugger. My error. I think the patch I will post shortly addresses the problem.
(Assignee)

Comment 8

7 years ago
track the patch for this in 613977
Depends on: 613977

Comment 9

7 years ago
(In reply to comment #8)
> track the patch for this in 613977

Does this mean the try build you posted in the bug 613977 comments should solve bug 614677? Because it doesn't for me.
(Assignee)

Comment 10

7 years ago
(In reply to comment #9)
> (In reply to comment #8)
> > track the patch for this in 613977
> 
> Does this mean the try build you posted in the bug 613977 comments should
> solve bug 614677? Because it doesn't for me.

Yes, it should. Sorry to hear you're still having problems.

Does the build in Comment 5 of this bug solve the problem for you? (that's not a real fix, but it will help identify what might be causing the issue).

Do you have any advice on steps to reproduce? For instance, an earlier commenter mentioned forums.mozillazine.org and that was helpful to me even though it wasn't an exact recipe.

Comment 11

7 years ago
Yes, the build in Comment 5 helps a lot. (I have just reinstalled this build so that, after some testing, I can tell you whether it solves the problem totally or partially. I will let you know.)

I can't give exact steps to reproduce the bug, but I have noticed it often happens when I open several tabs at almost the same time (for example, when opening different links in new tabs to read).

And as mentioned before, when one tab is hanging, all the others are "waiting".
I'm sorry to be so imprecise.

Hope it helps (a little ;-)

Comment 12

7 years ago
Well, unfortunately, some bad news: even with the Comment 5 build, it's still happening a lot after some browsing time. Maybe a clue?

Comment 13

7 years ago
To be more precise, it mainly gets stuck at "Connecting..." every time, and not necessarily when opening a new tab (but often).
(Assignee)

Comment 14

7 years ago
(In reply to comment #13)
> To be more precise, it mainly gets stuck at "Connecting..." every time, and
> not necessarily when opening a new tab (but often).

Other than the retry timeout, do you have any non-default networking preferences? I'm thinking specifically about the various connection maximums, but any non-standard ones would be interesting.
(Assignee)

Comment 15

7 years ago
I've identified a case where the backup socket can exceed global connection limits in a way that will lock up some of the normal socket allocations (at least until the connection manager times out some idle ones).

The fix means getting an nsHttpConnection object from the server manager when the backup socket is created, subject to its limits, and not just when it is being recycled.

That would certainly explain the lingering weirdness.
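A minimal sketch of the accounting idea behind that fix, assuming a simple global counter: the backup socket has to obtain a slot from the same limiter as normal connections at the moment it is created, so it can never push the total over the limit. `ConnectionLimiter` and `open_backup_socket` are hypothetical names for illustration, not Gecko APIs.

```python
class ConnectionLimiter:
    """Toy stand-in for a global connection limit (hypothetical)."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.active = 0

    def try_acquire(self):
        """Reserve one slot; return False when the limit is already reached."""
        if self.active >= self.max_connections:
            return False
        self.active += 1
        return True

    def release(self):
        """Give a slot back when a connection is closed."""
        if self.active > 0:
            self.active -= 1


def open_backup_socket(limiter):
    """Create a backup socket only if a slot can be reserved up front.

    Returns a placeholder handle, or None when no slot is free. Reserving
    the slot here, rather than later when the socket is recycled, keeps
    backup sockets inside the same limit as normal ones.
    """
    if not limiter.try_acquire():
        return None
    return object()  # placeholder for a real socket
```

With a limit of two and one normal connection open, the first backup socket gets a slot and a second request for one is refused instead of silently exceeding the limit.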
(Assignee)

Updated

7 years ago
Duplicate of this bug: 614950
(Assignee)

Comment 17

7 years ago
In: https://bugzilla.mozilla.org/show_bug.cgi?id=613977#c31

There is a new patch and a URL for a try server build (which is just getting started as I write this) that I hope resolves the lingering problems. Please give it a try when it's ready.

Updated

7 years ago
Blocks: 592284
Keywords: regression

Comment 18

7 years ago
Hope I'm posting in the right thread this time.

This tryserver build seems to be an improvement over its predecessor. After several hours of use, I haven't seen any connection reset messages. I also haven't seen any of the noticeable tab hangs I was getting before.
(Assignee)

Comment 19

7 years ago
(In reply to comment #18)
> Hope I'm posting in the right thread this time.
> 
> This tryserver build seems to be an improvement over its predecessor. After
> several hours of use, I haven't seen any connection reset messages. I also
> haven't seen any of the noticeable tab hangs I was getting before.

great!

zouk?

Comment 20

7 years ago
Sorry for the late reply, Patrick. I wanted to be sure before giving you my impressions from testing, and they are also good so far!
I no longer see any "stuck" tabs or other connection problems (so far, again).
For me too, this tryserver build is a BIG improvement. Thank you very much for your continuous work! :-)
If anything changes, I will let you know.

Comment 21

7 years ago
Just wanted to add that this build is still working nicely, and I'm not going to update until the patch is included in an official nightly build ;-)

Comment 22

7 years ago
Sorry to bother you, Patrick. I have a question because I'm not very familiar with the patch inclusion process.
When can we expect to see your work included in an official nightly release?
(Assignee)

Comment 23

7 years ago
(In reply to comment #22)
> Sorry to bother you, Patrick. I have a question because I'm not very familiar
> with the patch inclusion process.
> When can we expect to see your work included in an official nightly
> release?

The patch in bug 613977 needs a review and also needs to be approved for Gecko 2.0. It is flagged as needing both, so folks will get there as soon as they can. I think I read that the list of candidate blockers for 2.0 is being re-triaged today.

So I think the answer is "soonish".

Comment 24

7 years ago
I'm not sure this is the same problem, but since https://bugzilla.mozilla.org/show_bug.cgi?id=592284 landed, I can't use the browser for more than 4 or 5 hours. I have to restart it because it becomes unable to connect to anything; it gets stuck in the "looking up" or "connecting" phase. It looks like it's out of sockets.

Updated

7 years ago
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
(Assignee)

Comment 25

7 years ago
Created attachment 496553 [details] [diff] [review]
disable syn retry accel

Disables underlying feature by setting default of 
pref("network.http.connection-retry-timeout", 0);

If you have a local value for this pref the feature will still be enabled.
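The interaction between the new default and a locally set value can be pictured with a small sketch. It assumes only what the comment states: a default of 0 disables the feature, and a user-set value takes precedence over the default. The helper names and the example value 250 are hypothetical; this is not how the preference service is implemented.

```python
PREF_NAME = "network.http.connection-retry-timeout"

# Shipped default after this patch: 0 turns the feature off.
DEFAULT_PREFS = {PREF_NAME: 0}


def effective_pref(name, user_prefs, defaults=DEFAULT_PREFS):
    """Return the user's value if one is set, otherwise the default."""
    return user_prefs.get(name, defaults[name])


def retry_accel_enabled(user_prefs):
    """The feature counts as enabled only when the effective value is nonzero."""
    return effective_pref(PREF_NAME, user_prefs) > 0
```

A profile with no local value resolves to 0 and the feature is off, while a profile that previously set the pref (for example to 250) keeps the feature on, which is why the thread notes that the remaining work lives in bug 613977.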

a=shaver
(Assignee)

Updated

7 years ago
Keywords: checkin-needed
(Assignee)

Updated

7 years ago
Assignee: nobody → mcmanus
Pref turned off for beta8:

  http://hg.mozilla.org/mozilla-central/rev/0a9e64523c06

Should this bug be closed now?  Or is there work here beyond what bug 613977 will fix?
(Assignee)

Comment 27

7 years ago
Thanks, Jason. That should fix it for default configs; the rest can live in bug 613977. Relieved to see it make it into b8.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Keywords: checkin-needed
Resolution: --- → FIXED

Updated

7 years ago
blocking2.0: ? → final+
Whiteboard: [http-conn]