Closed Bug 1005808 Opened 6 years ago Closed 6 years ago

New response timeout is affecting request requiring significant processing time

Categories

(Core :: Networking: HTTP, defect, major)

29 Branch
defect
Not set
major

RESOLVED WONTFIX

People

(Reporter: xpoinsard, Unassigned)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:29.0) Gecko/20100101 Firefox/29.0 (Beta/Release)
Build ID: 20140428193813

Steps to reproduce:

We maintain a classical web application (developed using Java, Tomcat, Struts 1 and JSP pages). Some pages require significant processing time. For example, the user fills in a form and submits it, and in some cases preparing the response takes more than five minutes.
In relation to my comment on bug #947391, I should clarify that these are classical HTTP requests (not XHR).


Actual results:

With version 29 of Firefox we now have a timeout.


Expected results:

With previous versions or other browsers, the browser waits until the response is sent by Tomcat.
Severity: normal → major
Component: Untriaged → Networking: HTTP
OS: Linux → All
Product: Firefox → Core
Hardware: x86_64 → All
Needinfo'ing Steve per bug 947391 comment 24. :-)
Blocks: 947391
Flags: needinfo?(sworkman)
Hi Xavier,

After discussing with :jduell, we'd really like to keep the timeout as it is. We have no other way of knowing if an HTTP server is unresponsive without some kind of timeout, and we consider 5 mins to be a significant time to wait without even a partial response from the server.

A way forward for you would be to let Firefox know that your server is still alive by sending the first few bytes of the response. If you have a consistent header or even partial header, this should work fine. Or convert to a self-refreshing intermediary page that will effectively poll the server until the resource is ready. Or use XHR and long-polling to wait for a notification that the resource is ready to be loaded ... there are several possibilities here.
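
One of the options above, polling until the resource is ready, can be sketched roughly as follows. This is a minimal, framework-free illustration in Python rather than the reporter's Java/Tomcat stack; `JobRegistry`, `submit`, `poll` and `slow_report` are hypothetical names introduced here, and the 0.2 second sleep stands in for minutes of server-side processing.

```python
import threading
import time
import uuid


class JobRegistry:
    """Tracks long-running jobs so that every HTTP request can return quickly."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}  # job_id -> (finished, result)

    def submit(self, work, *args):
        """Start `work` in a background thread and return a job id immediately."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = (False, None)

        def run():
            result = work(*args)
            with self._lock:
                self._jobs[job_id] = (True, result)

        threading.Thread(target=run, daemon=True).start()
        return job_id

    def poll(self, job_id):
        """Return ("pending", None) or ("done", result) without blocking."""
        with self._lock:
            finished, result = self._jobs[job_id]
        return ("done", result) if finished else ("pending", None)


def slow_report(n):
    time.sleep(0.2)  # placeholder for the expensive processing
    return sum(range(n))


registry = JobRegistry()
job = registry.submit(slow_report, 1000)   # what the form submission would trigger
status, value = registry.poll(job)
while status == "pending":
    time.sleep(0.05)  # a real page would refresh itself or re-issue an XHR here
    status, value = registry.poll(job)
print(status, value)
```

With this shape, no single request stays open long enough to hit the response timeout: the submission returns at once and each poll is answered immediately.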

For now, we're going to mark this as wontfix.
Status: UNCONFIRMED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(sworkman)
Resolution: --- → WONTFIX
I understand your point for an initial request.
But couldn't you take into account the fact that the server already responded successfully to the previous requests in the session (including the login request), and apply the timeout only to the first request in the session?
(In reply to Xavier Poinsard from comment #3)
> I understand your point for an initial request.
> But couldn't you take into account the fact that the server already answered
> well for the previous requests (including login request) in the session and
> apply the timeout only for the first request in the session ?

Xavier, we're actually looking for TCP sessions that have been timed out by NATs and firewalls while idle between transactions. The bad ones don't give us any indication, such as RSTs or closes, and they really gum things up.
Duplicate of this bug: 1008950
See also: Bug #1024015
Duplicate of this bug: 1024015
Hi, Thanks for the reply.  But I don't see my question answered here.

Why is it necessary for the browser to forcibly close the connection?  Why can't the browser tell the user that there may be a problem so that the user can decide whether or not to continue?

Other questions come to mind.  If the improper killing of connections by a firewall (or whatever) "really gums things up", then why don't you tell the user about it, or do something about it, when things are "gummed up" rather than at some arbitrary point?

The user seems to be the one who's in the position to best judge whether the connection is taking too long.
Unless you explain the problem that network.http.response.timeout is supposed to solve, it's very hard to talk about the appropriateness of the solution.

So far, all I see is this:  The browser establishes a TCP connection.  The evil firewall stops transmitting packets without sending the browser a TCP RST.  The browser waits forever for a response to one of its requests.  The user waits forever too, watching the little spinning ball in the tab spin forever.

Where's the problem?  Clearly there's a problem for the user.  So it's reasonable to ask the user if he wants to resolve the problem.  But only after "too long", and only the user can know what that is.

Is there a problem anywhere else, something bad happening that somehow degrades the user's computer experience?  Perhaps something consuming limited resources?  Nobody's said.
Karl,

There's an OS limit on how many file descriptors the browser can open.  Every "hung" TCP connection that we don't time out will wind up clogging up that count, and eventually we can't open up new sockets and the browser looks "hung".  Before that point it may limit parallelism (i.e. fewer HTTP connections are able to be open), so that perf is reduced in a way that makes the cause hard to pin down.
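
To make the descriptor limit concrete, here is a small sketch in Python (not Firefox code) that reads the per-process limit and shows that every socket left open holds one descriptor until it is closed. It assumes a Unix-like system, since the standard `resource` module is not available on Windows; `descriptor_budget` and `hold_idle_sockets` are hypothetical helper names.

```python
import resource  # Unix-only standard library module
import socket


def descriptor_budget():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def hold_idle_sockets(count):
    """Open `count` connected socket pairs and keep them open, like hung connections."""
    return [socket.socketpair() for _ in range(count)]


soft_limit, hard_limit = descriptor_budget()
pairs = hold_idle_sockets(5)
held = [s.fileno() for pair in pairs for s in pair]
print(f"soft limit: {soft_limit}, descriptors held by idle sockets: {len(held)}")

# Closing the sockets is what returns the descriptors to the process.
for pair in pairs:
    for s in pair:
        s.close()
```

Each idle socket counts against the soft limit whether or not any data ever flows on it, which is why connections that never close eventually block new ones.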

In our experience trying to add UI and let the user figure things out for something like this is simply confusing for a plurality of users.  It's also extremely rare for valid HTTP responses to take >5 minutes.  We've made the decision that the tradeoff of breaking a tiny handful of sites is worth getting recovery from bad network conditions.  Yes, we could have tried to engineer this so that we only close connections if they seem to be degrading performance, but that's a much more complex solution that would take a lot of time and code to get right, and we have limited engineering resources.  Supporting HTTP responses that take >5 min is just not enough of a priority to justify that work.

That said, now that we also use TCP keepalives on our sockets, I might personally be persuaded that the 5-minute timeout is less important for detecting errant sockets than it used to be.  OTOH we may not always get keepalive support turned on for all platforms (especially phones), and those often have lower file descriptor limits.
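
For readers unfamiliar with TCP keepalives, the following Python sketch shows how an application can enable them on a socket. It is illustrative only and does not reflect how Firefox configures its sockets; the timing values are arbitrary assumptions, `enable_keepalive` is a hypothetical helper, and the tuning options are guarded because they are not defined on every platform.

```python
import socket


def enable_keepalive(sock, idle_seconds=60, interval_seconds=10, probes=5):
    """Turn on TCP keepalive probes so a dead peer is eventually detected."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-specific tuning; the values here are illustrative assumptions.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print("keepalive enabled:", keepalive_on)
sock.close()
```

With keepalives on, the operating system probes an idle connection and reports an error if the peer or an intermediate device has silently dropped it, which is the failure mode discussed earlier in this thread.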

The superreview process at Mozilla has frankly gotten a little fuzzy--it was primarily for changes to IDLs (for addon compatibility).  Now that we're willing to break addons when needed, it's used a lot less.  I've never seen it used for something like this.  HTTP behavior isn't really part of a "contract" the same way (otherwise we wouldn't have been able to do things like open speculative "extra" connections to web sites).  This decision really belongs in the hands of the networking module owner, so I'm needinfo'ing :mcmanus to make that call.  I very much doubt :bz will disapprove but if he does he can jump in.

Karl: thanks for the eloquent appeal for your cause--I'm not sure I agree that this is important enough to change course, but you've made your case well and politely and with a good understanding of how to move things through the "appeal" process :)
and... my previous comment really belonged in bug 947391 :)
Regarding the above (hidden) comments that disapprove of the HTTP request timeout:

First, you will probably be pleased to note that the issue is being addressed.  See bug #1024015.

Second, **** people off only makes them stupid.  We do not want to make the Mozilla developers stupid.  Ranting may feel good for a moment, but you will pay the price later.

Third (although the point is moot), it is (IMO) very important to inform Mozilla of the impact and scope of the problems created by a change like this.  The more concrete, objective information you give about why you made the engineering choices you did, the circumstances surrounding those choices, and the scope of your operations, the better.  HTTP, like the Internet, is an agreement.  (http://www.worldofends.com/#BM2)  Like any agreement it is subject to change.  If you think things are going in the wrong direction and want to change people's minds, please do something effective.

Apologies for being preachy.  Due to the energy I've invested in this issue I feel a sense of stewardship.

(In an excess of zeal regarding this issue, I'll take this opportunity to refer interested Mozilla developers to point 4 of World of Ends, "Adding value to the Internet lowers its value": http://www.worldofends.com/#BM4)
Sorry for my worries.

Such changes are not transparent to the user and are a no go.

All applications with a runtime above 301 seconds are now broken unless the user tweaks about:config.

What is 5 minutes for an "old-style" DB import script? Exactly: nothing. Some (commercial) scripts are even encoded, so you have no chance to tweak them...