From time to time, all URLs time out. Only solution is to restart.

RESOLVED WORKSFORME

Status

Core
Networking
4 years ago
2 years ago

People

(Reporter: sole, Unassigned)

Tracking

Trunk
x86
Mac OS X

Firefox Tracking Flags

(firefox25-)

Details

(Whiteboard: lame-network)

Attachments

(1 attachment)

At some point no URL will load, not even reloading the page or pressing COMMAND+SHIFT+R, and I have to restart the browser. Once I do that, everything is back to normal.

It started happening fairly recently (maybe a week ago?), but it has reached the point where I have to restart several times a day.

When this happens, other browsers can connect to the internet normally, so it's not a connectivity issue.

I'm not sure how to pinpoint when this happens, as my activity can vary from developing WebRTC stuff to just browsing. In both cases the URLs will start to time out after some amount of time.
(Reporter)

Comment 1

4 years ago
Second reboot today...
(Reporter)

Comment 2

4 years ago
Third...
(Reporter)

Updated

4 years ago
tracking-firefox25: --- → ?
I've seen this occasionally too.
I have most definitely never seen this on OS X. What addons do you have installed?
Flags: needinfo?(sole)
Flags: needinfo?(jwatt)
I just happened to see (ancient) bug 475106 get closed, which described a similar problem.
Enabled, I have:

Adblock Plus 2.2.4
Add-on Builder Helper 1.7
Certificate Patrol 2.0.14
Collusion 0.27
DOM Inspector 2.0.14
Firebug 1.11.4
Flashblock 1.5.17
Gecko Profiler 1.12.4
Mass Password Reset 1.05
Tab Counter 1.9.8
Toggle Paint Flashing initial.rev?

(In reply to Justin Dolske [:Dolske] from comment #5)
> I just happened to see (ancient) bug 475106 get closed, which described a
> similar problem.

I don't believe the occurrences that I've observed were correlated with my Mac waking up from having been put to sleep.
Flags: needinfo?(jwatt)
(Reporter)

Comment 7

4 years ago
- AdBlock
- DuckDuckGo
- Feedback
- Firefox OS Simulator
- Pinboard Extension
Flags: needinfo?(sole)
(Reporter)

Comment 8

4 years ago
Also as JWatt says, this doesn't happen when getting back from sleep. I could be working with the browser, not touch it for a minute and then find the next tab I open to be timing out.

That said... I haven't seen this for the past two or three days, but it might be because I've been restarting the browser for testing different versions now and then.

Comment 9

4 years ago
We'll wait for reports from others before tracking. It's not yet clear that this is a recent regression or an especially common issue.
tracking-firefox25: ? → -
Ah! This just hit an hour ago with a current UX Nightly, and I spent some time looking at it...

Symptom: Open a tab, enter a URL, just get a blue throbber spinning forever. Thought it was just a crappy Wifi connection, but Chrome worked just fine.

I saved the data in case it turns out to be interesting, but I didn't see anything unusual with the various tools (netstat, lsof, tcpdump, OS X activity sampler, dtruss, etc). I saw DNS requests getting answered for URLs I entered, but Firefox wasn't trying to open any TCP connections. Opening chrome pages worked fine, and I could save data from about:memory.

I noticed the first stalled tab was a Google Maps page; I got the "try the new Google Maps!" offer but couldn't get it to do anything. I wondered if that was the cause -- closed the other stalled tabs, confirmed that loading a site was still broken, then closed the Maps tab. It instantly unclogged -- I saw a flurry of activity for stuff like Twitter in the browser console / network activity tab I had open.

So, still not sure exactly _how_ this is happening, but the (new?) Google Maps may be one way to trigger it.
Trying to load Maps again is working just fine now, alas.
Component: General → Networking
Product: Firefox → Core

Updated

4 years ago
Whiteboard: lame-network
We don't have much data here, but the suspicion :mcmanus and I have is that folks are somehow accumulating stale TCP connections that never receive a server-side FIN/RST (there are various ways this can happen: home routers that do NAT can forget mappings if they get busy, servers can misbehave or disappear). We're going to start using TCP keepalive pings to help time these out (see bug 444328).
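For readers unfamiliar with the mechanism, here is a minimal sketch of enabling TCP keepalive on a socket. It is not the Gecko implementation from bug 444328; the helper name `enable_keepalive` and the timer values are assumptions chosen for illustration, and the tuning options are applied only where the platform exposes them.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and tune its timers where the platform allows.

    The default values are illustrative assumptions, not Firefox's settings.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux names the idle timer TCP_KEEPIDLE; macOS names it TCP_KEEPALIVE.
    idle_opt = getattr(socket, "TCP_KEEPIDLE", None) or getattr(socket, "TCP_KEEPALIVE", None)
    if idle_opt is not None:
        sock.setsockopt(socket.IPPROTO_TCP, idle_opt, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock

sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With keepalive on, a peer that has silently vanished stops answering the probes, so the OS eventually reports an error on the socket instead of leaving it open indefinitely.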
Keep-alives, something we have been talking about for a long time ;)  And I still don't think it's a good idea to have frequent keep-alives on idle connections.

Why don't we just close longer hanging connections?  With a threshold of, say, 3 seconds or so.  I think we may even have a pref to change for it.  It's IMO better to have to reestablish a connection than wait until we discover an existing one is dead...
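As a rough illustration of the alternative proposed here, the sketch below drops pooled connections that have sat idle past a threshold. The class `IdlePool` and its methods are hypothetical and do not correspond to Gecko's connection manager; the 3-second default comes from the comment above, and the injectable clock exists only to make the behavior easy to check.

```python
import time

class IdlePool:
    """Hypothetical pool that forgets connections idle for too long."""

    def __init__(self, max_idle_s=3.0, clock=time.monotonic):
        self.max_idle_s = max_idle_s
        self.clock = clock
        self._idle = {}  # connection id -> time it became idle

    def release(self, conn_id):
        """Record that a connection has just gone idle."""
        self._idle[conn_id] = self.clock()

    def prune(self):
        """Drop and return the ids of connections idle longer than the threshold."""
        now = self.clock()
        stale = [c for c, since in self._idle.items() if now - since > self.max_idle_s]
        for conn_id in stale:
            del self._idle[conn_id]
        return stale

    def idle_ids(self):
        return sorted(self._idle)
```

The trade-off is the one debated in the following comments: pruning aggressively avoids reusing a dead connection, but every pruned connection costs a fresh handshake on the next request.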
(In reply to Honza Bambas (:mayhemer) from comment #13)

> Why don't we just close longer hanging connections?  

reused persistent connections save a RTT and basically cost us nothing. Costs in RTT units are the enemy.

> better to have to reestablish a connection than wait until we discover an
> existing one is dead...

in the normal case we don't wait to discover they are dead - we actively detect that based on the close from the other end of the connection.

It's also possible that sockets having this problem aren't http-idle, so shortening the http-idle timeout wouldn't have enough scope.

One other possibility is that the sockets are in the nsSocketTransportService idle list somehow with nothing listening for their error or hangup events.

I'm not aware of anyone actively working on the TCP-KA code at this point - jduell, is someone?
(In reply to Patrick McManus [:mcmanus] from comment #14)
> (In reply to Honza Bambas (:mayhemer) from comment #13)
> 
> > Why don't we just close longer hanging connections?  
> 
> reused persistent connections save a RTT and basically cost us nothing.
> Costs in RTT units are the enemy.
> 
> > better to have to reestablish a connection than wait until we discover an
> > existing one is dead...
> 
> in the normal case we don't wait to discover they are dead - we actively
> detect that based on the close from the other end of the connection.

But sometimes we don't, and that is what this bug is about and what I'm trying to address.  It may take much longer than one RTT to detect that a connection is dead because a home router or whatever killed it.

> It's also possible that sockets having this problem aren't http-idle, so
> shortening the http-idle timeout wouldn't have enough scope.

You mean that a connection can be killed when we are actively waiting for a response?  Right, and I've suggested using very short keep-alives ONLY for this: while waiting for a reply, but not when idle.

> One other possibility is that the sockets are in the
> nsSocketTransportService idle list somehow with nothing listening for their
> error or hangup events.

This is an interesting point.  We should have a watchdog or some mechanism to catch them.

> I'm not aware of anyone actively working on the TCP-KA code at this point -
> jduell is someone?
We may extend bug 897124: when a user clicks a link (and the request is idempotent), we will open a new connection (regardless of limits) AND also send the request on an existing idle one (if available, preferably the one that has been idle for the shortest time).  If the response doesn't arrive before the new connection is open (+ some threshold), we'll use the new connection to send the request again.  Then we deliver the response from whichever connection returns it first.

The point here is to do this for a request directly invoked by a user action and not for everything.
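The racing idea described above can be sketched as follows. This is a simplified model, not code from bug 897124: `fetch_with_backup`, `dead_idle`, `open_new` and the `grace_s` threshold are hypothetical names, and the two callables stand in for real network operations. Resending is only safe because the request is assumed to be idempotent.

```python
import concurrent.futures as cf
import time

def fetch_with_backup(send_on_idle, open_new, grace_s=0.05):
    """Send on an idle connection while opening a new one; resend if needed.

    send_on_idle() returns a response or raises.
    open_new() performs the handshake and returns a callable that sends the request.
    """
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        idle_attempt = pool.submit(send_on_idle)
        send_on_new = pool.submit(open_new).result()  # handshake runs in parallel
        done, _ = cf.wait([idle_attempt], timeout=grace_s)
        if done and idle_attempt.exception() is None:
            return idle_attempt.result()
        # No answer within the grace period: resend on the fresh connection
        # and take whichever attempt succeeds first.
        attempts = [idle_attempt, pool.submit(send_on_new)]
        for attempt in cf.as_completed(attempts):
            if attempt.exception() is None:
                return attempt.result()
        raise ConnectionError("both attempts failed")
    finally:
        pool.shutdown(wait=False)

def dead_idle():
    time.sleep(0.5)  # stands in for a connection a router silently dropped
    raise ConnectionError("idle connection was dead")

def open_new():
    time.sleep(0.01)  # stands in for the TCP handshake
    return lambda: "response via new connection"

result = fetch_with_backup(dead_idle, open_new)
```

When the idle connection is healthy, its response normally arrives within the grace period and the new connection is never used for the request, so the common case keeps the RTT saving of connection reuse.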
BTW, it would be great to get an about:timeline trace for this when it happens...
Created attachment 781211 [details]
network.txt

At least in my case, I don't think it was due to connection limits. Here's the netstat and lsof output from when my browser was hung: the process had 24 network-related file handles open, and the whole system only had ~30 tcp connections in various states.
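A quick way to get this kind of summary from saved output is to count connections per TCP state. The sketch below assumes `netstat -an` style lines whose last column is the state; the `SAMPLE` text is invented for illustration and is not taken from the attachment.

```python
from collections import Counter

# Invented sample in the style of `netstat -an` output (not the attached data).
SAMPLE = """\
tcp4       0      0  192.168.1.5.52344      203.0.113.10.443       ESTABLISHED
tcp4       0      0  192.168.1.5.52345      203.0.113.10.443       ESTABLISHED
tcp4       0      0  192.168.1.5.52346      198.51.100.7.80        CLOSE_WAIT
tcp4       0      0  *.631                  *.*                    LISTEN
"""

def count_tcp_states(netstat_text):
    """Count TCP sockets per state, assuming the state is the last column."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts

states = count_tcp_states(SAMPLE)
```

A low total alongside a wedged browser, as reported here, points away from connection-limit exhaustion.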
jduell mentioned this in email; posting it here in case someone else is able to reproduce this first. (You'll probably have to read this comment in another browser ;-)

---
You can try this hack if you've still got the browser wedged:

Go to about:config and find the network.http.diagnostics pref. Open
the js console (ctrl-shift-j) and clear it. Then flip the pref from
false to true (setting it to false first if necessary). That should
make a crude dump of the http connection manager show up on the console.
---
(Reporter)

Comment 20

2 years ago
I don't remember having seen this in a very long time, so closing.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME